Chapter 3
The sequence window
Displaying protein sequences in GPMAW. The sequence window forms the basis for most other windows and operations.
The protein sequence window is the parent window for most other windows (peptide windows, graphs, results of searches, etc.). These windows will be daughter windows derived from the sequence window, and when the sequence window (the parent) is closed, all derived daughter windows will also close. Exceptions from this rule are the database mass search window (Chapter 8), fragment windows (which are parent sequence windows themselves, see below), and all dialog boxes, most of which are terminal windows, meaning that they have to be closed in order for the workflow to continue.
When a sequence is read from disk or clipboard (Chapter 2) or has been entered as a new sequence in the sequence editor (Chapter 4.1) it will be displayed in a sequence window:

Depending on the default values in Setup (Chapter 5.1), the sequence will be displayed in either 1- or 3-letter code. The sequence wraps around the right edge of the window, and each line starts with the number of the first residue of that line. If the sequence is too long to be displayed completely, a scrollbar will appear along the right edge, allowing the remainder of the sequence to be brought into view.
The number of residues on each line depends on the width of the sequence window and whether the ‘Display modula 5’ option has been set in the setup (Chapter 5.6). With the option set, the number of residues on each line is limited to multiples of 5 (e.g. 20, 25, 30, etc.), and the rest of the right-hand side of the display will be empty. If the ‘Display modula 5’ option has not been set, the window will be filled to the maximum number of residues that can be fully displayed.
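The line-length rule can be illustrated with a minimal Python sketch. The function names and the `max_fit` input (the largest number of residues that fit fully in the window) are hypothetical, and the behavior when fewer than five residues fit is an assumption, not something the manual specifies.

```python
def residues_per_line(max_fit: int, modula5: bool) -> int:
    """Return how many residues are drawn on each line.

    max_fit: largest number of residues that fit fully in the window
             (hypothetical input; the program derives it from window width).
    modula5: state of the 'Display modula 5' option.
    """
    if max_fit < 1:
        raise ValueError("the window must fit at least one residue")
    if not modula5:
        return max_fit
    rounded = max_fit - max_fit % 5  # round down to a multiple of 5
    # Assumption: fall back to max_fit when fewer than 5 residues fit.
    return rounded if rounded > 0 else max_fit


def line_start_numbers(length: int, per_line: int, first_number: int = 1) -> list[int]:
    """Residue number shown at the start of each line."""
    return [first_number + i for i in range(0, length, per_line)]
```

For example, a window that fits 23 residues would show 20 per line with the option set, so a 45-residue sequence would have lines starting at residues 1, 21 and 41.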
The first level of navigation is the numbering of the first residue on each line. In addition, it is possible to label every 10th residue by setting the ‘Number 10th residue’ option in the setup (Chapter 5.6).
If you are viewing the sequence in 3-letter code, the numbering will show as a subscript with the number divided by 10 (e.g. residue 120 will have the subscript ‘12’).
If you are viewing the sequence in 1-letter code, the subscript will only be a small vertical line due to limited space.
In addition to direct numbering of residues, GPMAW has a number of tools to help you analyze your sequence and you should read each of the following paragraphs to make certain you understand the differences.
The primary tool is the mouse. When you move the mouse pointer across the sequence, the mass and position of the residue pointed at will be shown in the toolbar.
Often you are interested in specific residues, residue types, or short sequences (motifs). GPMAW enables you to color these in three different background colors.
· Individual residues can be colored by double-clicking on the relevant residue followed by selecting a color at the bottom of the dialog box (see Highlight residues below).
· Residue types and short sequences are colored through the Search|Highlight residues (motifs) command (see below).
This background coloring of residues is persistent and is carried along to peptide windows as well. Please read the detailed description on the limitations below.
Highlighting (inverting) sequences. This is a quick way of drawing attention to and getting the mass of a partial sequence. Highlighting is also the fastest shortcut when making Fragment windows (see below). Highlights made in this way are not persistent, and will disappear when you next click the mouse inside the sequence window. See also the discussion in the ‘Highlight sequence’ section 3.2 below.
Underlining is the third way of drawing attention to a peptide. The normal use of this function is to underline peptide sequences found when making a mass search (Chapter 6.1) or digest mass search (Chapter 8), but highlights can also be converted to underlines. This function is particularly useful when you want to calculate the coverage of a given number of peptides. Underlining is persistent (has to be deleted explicitly) but is not carried on to daughter windows. You can activate underlining either from the mass search results window or directly in the sequence from the menu (Edit|Underline|) or pop-up mouse menu (Underline|).

| | Persistent | Sub-windows | Main function |
| --- | --- | --- | --- |
| Color | Yes | Yes | Navigating sequence |
| Highlight | No | No | Peptide mass |
| Underline | Yes | No | Calculating coverage |

Each sequence window has its own status bar above the sequence containing five panels and five buttons.
The first panel shows the total molecular mass of the protein. The button to the right of this panel toggles between average and monoisotopic mass display (for mass types, see Appendix C). The symbol on the button is blue when average masses are shown and red when the mass is monoisotopic. If you right-click on the panel, a pop-up window will show the mass of cluster ions (2MH+, 3MH+ … 6MH+) and multiply charged ions (MH2+, MH3+ … MH6+). These values are also displayed in the Info|Sequence info dialog box.
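As a rough illustration of the values listed in that pop-up window, the sketch below computes m/z values for cluster ions and multiply charged ions from a neutral mass. It assumes that ‘nMH+’ means n molecules carrying one proton and that ‘MHz+’ means one molecule carrying z protons; the function names are hypothetical, and the program's own calculation may differ in detail (for example in the proton or hydrogen mass used).

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def cluster_ion_mz(neutral_mass: float, n: int) -> float:
    """m/z of a singly charged cluster ion of n molecules (nM + H)+."""
    return n * neutral_mass + PROTON_MASS


def multiply_charged_mz(neutral_mass: float, z: int) -> float:
    """m/z of one molecule carrying z protons, (M + zH)z+."""
    return (neutral_mass + z * PROTON_MASS) / z


if __name__ == "__main__":
    mass = 1000.0  # example neutral mass in Da
    for n in range(2, 7):
        print(f"{n}MH+  : {cluster_ion_mz(mass, n):.4f}")
    for z in range(2, 7):
        print(f"MH{z}+ : {multiply_charged_mz(mass, z):.4f}")
```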
The second panel shows the mass of the residue pointed to by the mouse cursor. The third panel shows the number of the residue pointed at. When highlighting part of the sequence (see below), the content of the second and third panel changes to reflect the selected peptide (i.e. mass and range of selection respectively).
Note: If the current sequence has an offset number (i.e. the first residue is not counted as residue 1; see Chapter 4.1), the numbering in the third panel will be shown in red.
The next four buttons have the following functions:
· Toggle between 1- and 3-letter code.
· Cleave protein function (automatic digest, see Chapter 10).
· Open a drop-down menu enabling the fast selection of a cleavage (Quick cleave, see Chapter 10).
· Search protein for peptide masses (Chapter 6).
· Open/close the information frame (see below).
· Extended protein sequence information (see section later).
The last button opens the annotation window. The button will be colored according to the status of the annotation:
Gray: No information (i.e. annotation is blank).
Red: There is information in the annotation.
Green: Information in Swiss-Prot format (i.e. the feature table can be imported into the sequence).
The next two panels show the status of the N- and C-terminals, respectively. The defaults are ‘Hydrogen’ and ‘Free acid’ for the unmodified polypeptide. The terminals can be edited in the ‘Edit sequence’ dialog box. Resting the mouse cursor on either field for a few seconds will allow the fly-by hint to show the composition of each terminus. When the mouse cursor passes over a modified residue in the protein sequence, the two panels will turn yellow and show the name and composition respectively of the modification.
Two toolbars in the main toolbar (chapter 1.4) are linked to the sequence window (i.e. the buttons will be grayed when the focus is on a non-sequence window):
SEQUENCE
· Highlight residues (motifs – chapter 3.3).
· Search for mass (chapter 6.1).
· Automatic cleavage (digest - chapter 9.1).
· Ms/ms fragmentation (chapter 10.1).
The arrows next to the ‘Highlight residues’ and ‘Automatic cleavage’ buttons indicate that both buttons have drop-down menus.
The ‘Highlight residues’ drop-down menu contains the ‘Quickcolor’ menu that enables you to color a number of specific residues (e.g. basic, acidic, cysteine etc.) by a single mouse click (chapter 3.3).
The ‘Automatic cleavage’ drop-down menu contains the top-most 10 enzymes defined in the enzyme list (chapter 9.1). Note that when you use the drop-down enzyme list, you simulate a ‘straight’ cleavage without overlaps (missed cleavages) and other options you can specify in the cleavage dialog.
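To make the idea of a ‘straight’ cleavage concrete, here is a minimal Python sketch that cuts after every occurrence of the given residues and returns no overlapping (missed-cleavage) peptides. The function name is hypothetical, and real enzyme definitions may contain further rules (for example exceptions for neighboring residues) that are not modelled here.

```python
def straight_cleave(sequence: str, cleave_after: str) -> list[tuple[int, int, str]]:
    """Cut after every residue listed in cleave_after.

    Returns (first, last, peptide) tuples with 1-based residue numbers.
    No missed cleavages are generated.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            peptides.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in a cleavage residue
        peptides.append((start + 1, len(sequence), sequence[start:]))
    return peptides


if __name__ == "__main__":
    # Trypsin-like cleavage after Lys (K) and Arg (R)
    for first, last, peptide in straight_cleave("AKGRCDEKLM", "KR"):
        print(first, last, peptide)
```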
GRAPH
· Hydrophobicity (chapter 11.3).
· Secondary structure prediction (GOR - chapter 11.2).
· Charge vs. pI (titration - chapter 9.4).
· Dot-plot graph (chapter 11.6).
The commands in the ‘Sequence’ toolbar are available in the ‘Search’ menu while the ‘Graph’ toolbar commands are in the ‘Graph’ menu.
The information frame appears when you click on the frame button in the sequence window toolbar. Alternatively, you can specify in the setup (Chapter 5.1) that the frame should open when the sequence window opens.
The frame contains information on the status of the protein terminals, (singly) modified residues and cross-links. In addition the theoretical charge of the protein is reported at two values, pH 2.0 and pH 7.0.
The molar extinction coefficient and absorption are also reported.
The information frame can be resized by grabbing the right edge of the frame with the mouse cursor and moving it left or right.
The pop-up menu, which can be accessed by right-clicking in the sequence window, contains the following commands:
· Edit (submenus: Edit sequence, Edit cross-links).
· Modify [followed by the residue pointed at by the mouse cursor], with a submenu of simple modification choices (see chapter 3.6).
· Search for mass.
· Highlight motif (submenus: Digest mass search, Mass difference, Mass X-links, Local BLAST).
· Underline (submenus: Underline range, Underline highlight, Clear underline, Clear highlighted underline; see Chapter 3.4).
· Automatic digest.
· Ms/ms fragmentation.
· Create fragment window.
· Peptide info (only enabled when part of the sequence is highlighted).
· Print.
· Help.
The functions, except Peptide info, are replicated from the Edit, Search and Cleave options in the main menu and are explained in detail below.
The peptide info works only if you have part of the sequence highlighted. GPMAW then extracts the highlighted part and treats it as a separate peptide. All the parameters that are calculated for peptides (mass, pI, HPLC index etc.) are then displayed in a peptide info window. For details see the peptide info window in the section on Automatic digest (Chapter 9.1). If multiple sequences are highlighted, only the peptide information for the last highlight is displayed.
You can copy the sequence to the clipboard (ready for pasting into another application) by selecting Edit|Copy (<Ctrl+C>). This places a copy of the sequence in the displayed format (1- or 3-letter code) on the clipboard. If part of a sequence is highlighted, only this region will be copied to the clipboard.
If you press <Ctrl+F> the complete sequence will be copied in FastA format (appendix B.3). This command corresponds to File|Export sequence|to clipboard as FastA.
Note: The name of the sequence is not copied. You can copy the sequence name as well and get much finer control of the copying process by selecting File|Export sequence (see Chapter 2.8).
You can change the display font and size by selecting the Info|Change sequence font menu option. You are only able to select monospaced fonts (the default is Courier New), but you can select any font size and display type like bold, italic etc.
The selection of sequence display font is transient, as the font name and size are not saved. Furthermore, it is limited to each individual sequence window.
Printing can be selected from the toolbar, the main menu (File|Print) or the pop-up menu, or by pressing F5.
The standard printout includes in the header: The name of the protein, file origin and position in the file, sequence range, mass file, total mass, state of N- and C-terminus and cysteine. Then follows the sequence in 1- or 3-letter code.
When you print the contents of a sequence window you always have to fill out a dialog box giving you a number of choices for the hardcopy, both for the header, the sequence and for additional information (most of the options can be preset in Setup, Chapter 5.2):
Omit header: Do not include the standard header (file origin, size and mass of the sequence, mass file, state of terminals and Cys).
Omit modified info: Do not include information about modified amino acid residues. If this box is not checked the information will only be printed if at least one modification is present in the protein.
Extended info: Include amino acid composition, cross-linked residues, elemental composition, and mass of residues.
Mark 10: Every tenth residue will be marked with ‘=’ instead of ‘-’. This option is only effective when printing in 3-letter code.
2x-line spacing: Print the sequence with double line spacing.
Print in color: Include color information (background color, modified residues, cross-links). When printing on a monochrome printer (e.g. laser printer) most of the colors will print as shades of gray. You will have to experiment as to how the colors translate.
Sequence font: Select between small (8 point), normal (10 point - standard) and large (12 point) characters. The sequence will be printed in the same font as the sequence display (default is Courier New). See ‘Display font’ above for information on how to change the display font.
Residue: Select 1- or 3-letter code. The selection will default to the sequence window display.
You can highlight part of a sequence by moving the mouse cursor to the first (or last) residue of the fragment you want to analyze, then pressing and holding the left mouse button while moving the mouse cursor to the last (or first) residue you want highlighted. The status panel of the current sequence window will change to reflect the highlight status: the second panel will show the mass of the highlighted peptide (residue mass + water) and the highlight number in square brackets, and the third panel will show the number of the first and last residue highlighted.
Note: Be careful not to confuse the 'Highlight sequence' (a transient operation, this section) with the 'Highlight residues (motifs)' command (persistent operation, see section 3.3 below).
The highlighted area will remain displayed until the mouse button is clicked again inside the sequence window. Up to three regions can be highlighted simultaneously by holding down the Shift key while highlighting regions two and three with the mouse. The status panels will show the combined mass of the individual peptides (not their residue mass), and the residue numbers reported will be those of the last highlighted region (the number of highlights will be shown in square brackets).
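The mass shown for a highlight (sum of residue masses plus one water per peptide) can be sketched as follows. The residue masses are standard monoisotopic values for unmodified residues, not values read from a GPMAW mass file, and the function names are hypothetical; modifications, cross-links and terminal groups other than H and OH are ignored.

```python
# Standard monoisotopic residue masses in Da (unmodified residues)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # monoisotopic mass of H2O in Da


def peptide_mass(peptide: str) -> float:
    """Mass of a free peptide: residue masses plus one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def combined_highlight_mass(sequence: str, ranges: list[tuple[int, int]]) -> float:
    """Combined mass of several highlighted regions (1-based, inclusive).

    Each region is treated as a separate peptide, so each gets its own water.
    """
    return sum(peptide_mass(sequence[first - 1:last]) for first, last in ranges)
```

With two highlighted regions, the combined value is therefore the sum of two complete peptide masses rather than the mass of the joined residues.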
When you have a highlighted sequence you can extend, contract and move the highlighted region by using the keyboard arrow keys:
Left/right key: Extends and contracts the C-terminal position.
Ctrl + left/right key: Extends and contracts the N-terminal position.
Shift + left/right key: Moves the selected region towards the N-/C-terminus of the protein.

Hint: If you highlight relevant residues (e.g. Lys and Arg when working with trypsin) it is much faster to locate relevant peptides.
When a region has been highlighted, you can use this part of the protein as the default input for a number of other functions. These are most easily accessed through the pop-up menu (right-click the mouse in the sequence window), but several of them can also be accessed through the menu.
If more than one region is highlighted, the input will be the most recently defined highlight region.
Peptide info: This is similar to the peptide info for peptides in the ‘Automatic digest’. This window contains physical/chemical information on the highlighted part of the protein, as if it existed as a peptide. For more information, please see ‘Peptide window’, chapter 9.3.
Ms/ms fragmentation: The input for the ms/ms fragmentation will be the highlighted region, if any. If the total sequence length of the protein is less than 50 residues and no region is highlighted, you will get ms/ms of the whole sequence (see chapter 10.1 for more information on ms/ms analysis).
Edit | Copy to clipboard: If part of the protein is highlighted, this region will be copied, if no part is highlighted the complete sequence will be copied.
Make fragment window: The default selection range for the fragment window will be the currently active highlighted area. The range can be modified before creating the fragment window (see chapter 3.7 below).
Underline residues: A highlighted sequence can be used as the input for underlining a range in the sequence (use the pop-up menu). See chapter 3.4 below.
The menu command 'Highlight residues (motifs)' is not to be confused with the mouse-driven highlight function (see 'Highlight sequence' above).

The ‘Highlight residues’ command enables you to color short sequence stretches that can be up to 10 residues in length. As you can also include 'wildcards' (e.g. any residue) you can also use the function as a motif search function.
In each of the 4 x 3 cells of the table you can enter any sequence motif (up to 10 residues) using 1-letter code. The highlight that will color each motif is shown in the left-most column. The highlight colors are defined in 'Setup color' (Chapter 5.3).
A question mark (‘?’) may substitute for any residue. In the above example, all sequences are searched for all occurrences of the typical N-glycosylation motif: Asn followed by any residue followed by Ser, Thr or Cys. The basic residues Lys and Arg (tryptic cleavage sites) are colored in a different color (yellow).
The Highlight residues (motifs) command is typically used to get a quick overview of the distribution of specific residues, occurrence of a specific sequence or to search for sequence motifs. As the coloring is persistent, that is it ‘follows’ the sequence into the peptide window (Chapter 9.4), it can also help you get a quick overview in daughter windows.
You may select the ‘Highlight residues’ command from any sequence or daughter window. This will color all windows or just the selected sequence and related daughter windows depending on the state of the ‘Highlight global’ option.
Isobaric residues: When either of these options is checked, the corresponding isobaric residues are counted as identical. You may, typically through ms/ms analysis (Ch. 10.1), acquire a sequence tag where you have the mass difference of 113. This can be either Isoleucine or Leucine (with the same chemical composition their masses are identical). Instead of entering all possible combinations of L and I you can just check the L/I box and both residues will count as one.
Invert sequence: If checked all sequences will be searched in both directions. I.e. the sequence DVTL above will also highlight LTVD.
Highlight global: All sequence windows on the desktop are searched for motifs to highlight even if they are not selected.
Keep highlight: The highlight motifs are saved between each highlight call. If not checked, the highlight dialog will be cleared upon exit.
Highlight profiles: The contents of the highlight table can be saved to disk as a highlight profile (in .PRF files, Appendix A).
Clear: Clears the table.
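The motif matching described above (the ‘?’ wildcard, the L/I isobaric option and ‘Invert sequence’) can be approximated with regular expressions. This is a sketch under those stated assumptions, with hypothetical function names; it is not GPMAW's own search routine and it covers only the L/I isobaric pair named in the text.

```python
import re


def motif_to_regex(motif: str, li_isobaric: bool = False) -> str:
    """Translate a 1-letter motif into a regular expression."""
    parts = []
    for letter in motif.upper():
        if letter == "?":
            parts.append(".")          # '?' substitutes for any residue
        elif li_isobaric and letter in "LI":
            parts.append("[LI]")       # Leu and Ile count as identical
        else:
            parts.append(re.escape(letter))
    return "".join(parts)


def find_motif(sequence: str, motif: str,
               li_isobaric: bool = False, invert: bool = False) -> list[tuple[int, int]]:
    """Return sorted 1-based (first, last) ranges of all motif occurrences."""
    motifs = [motif]
    if invert:
        motifs.append(motif[::-1])     # also search the reversed motif
    hits = set()
    for m in motifs:
        # A lookahead is used so that overlapping occurrences are all found.
        pattern = re.compile("(?=(" + motif_to_regex(m, li_isobaric) + "))")
        for match in pattern.finditer(sequence.upper()):
            start = match.start(1)
            hits.add((start + 1, start + len(m)))
    return sorted(hits)
```

For the N-glycosylation example, the three motifs `N?S`, `N?T` and `N?C` would each be searched separately, mirroring three cells of the table.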
The ‘Quickcolor’ command in the main menu is a fast way of coloring the most common residues or combinations (each entry is followed by the enzyme, cleavage or feature it relates to):
Basic (R/K): Trypsin
Acidic (E/D): Endo Glu-C (wide range)
Aromatic (W/F/Y): Chymotrypsin
N-glycosylation: NxT, NxS, NxC
Cysteine (C): Disulphide bridges
Methionine (M): CNBr cleavage
Lysine (K): Endo Lys-C
Arginine (R): Endo Arg-C
Glutamic acid (E): Endo Glu-C (narrow range)
Aspartic acid (D): Endo Asp-N
This command is also available as a drop-down menu next to the ‘Highlight residues’ button in the main toolbar (chapter 1.4).
The quickcolor command inserts the appropriate residue in the ‘Highlight residue’ table (see above). The program looks for the first free ‘row’ in the table to use for coloring. If all rows are full, the last row is replaced by the selection.
Clear coloring. Clears all color highlighting, both ‘quickcolor’ and ‘highlight residues’, and redraws the sequence window.
If you double-click on a residue, the ‘Insert modification’ dialog box opens (see Ch. 3.6). At the bottom of the dialog box there are four colored panes, one white and three colored ones for highlighting residues.
Clicking on the leftmost panel clears any coloring for the selected residue. Clicking on any of the colors will change the background color of the selected residue.
Note: When you select Highlight motif or QuickColor, the coloring of individual residues (chapter 3.6) will be removed.
The underlining of residues is a way of emphasizing part of the sequence in relation to the rest. Underlining is a permanent feature, meaning that it remains as part of the display when changing between different display modes, but it is not saved along with the sequence. Underlined residues are displayed in red in addition to being underlined.
Note: Underlining residues may obscure the coloring of modified residues that are also colored red.
The most common use of underlining is to calculate the coverage of a digest or mass search, but the feature should be flexible enough for other purposes.
From the digest mass search result window (Chapter 8.5) you can retrieve sequence ‘hits’ from the database and display the protein sequence in a new sequence window. The peptides identified in the digest mass search will be underlined in this window.
When performing mass searches (Chapter 6.1) you can emphasize peptide ‘hits’ by double clicking on the relevant line, or select underlining from the local menu of a selected peptide. This will redraw the line in bold letters in the search results and send a message to the parent sequence window resulting in underlining of the corresponding peptide(s). A special command in the pop-up menu will bold all peptide ‘hits’ and underline the corresponding peptides in the parent window. This command is most useful when the ‘Fit to enzyme’ option has been turned on and there are only relatively few hits.
From the sequence window you can modify underlining from the above child windows or you can manually set underlines through the Edit|Underline or the corresponding item in the pop-up menu. Four sub-menu items are available:
Underline range: Through a dialog box you enter the first and last residue to be underlined.
Underline highlight: If you have highlighted part of a sequence, you can turn the highlighted part into underline through this command.
Clear underline: Removes all underlines from the sequence.
Clear highlighted ul: Clears all underlines within the highlighted area.
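Since underlining is mainly used to calculate coverage, the calculation can be sketched as the fraction of residues that fall inside at least one underlined range. The function name and the representation of underlines as 1-based (first, last) pairs are assumptions made for this illustration.

```python
def sequence_coverage(length: int, underlined: list[tuple[int, int]]) -> float:
    """Fraction of the sequence covered by underlined ranges.

    length:     number of residues in the sequence
    underlined: 1-based, inclusive (first, last) ranges; ranges may overlap
    """
    if length < 1:
        raise ValueError("sequence must contain at least one residue")
    covered = set()
    for first, last in underlined:
        # Clip each range to the sequence and count every residue only once.
        covered.update(range(max(first, 1), min(last, length) + 1))
    return len(covered) / length


if __name__ == "__main__":
    # Two overlapping peptides covering residues 1-20 of a 100-residue protein
    print(f"{sequence_coverage(100, [(1, 10), (5, 20)]):.0%}")
```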

Cross-links are displayed in the sequence window as red lines going from one residue to another. Up to 30 cross-links can be defined for each sequence. In order to differentiate between different cross-links, the lines are drawn in three different shades of red. Furthermore, each color has a different vertical offset.
The button in the status bar controls the display of cross-link lines and how the mass of Cys residues is calculated. When the button is depressed, the cross-links are shown as gray lines and the mass of Cys residues is calculated for the reduced form. When the button is in the up state, the mass of Cys residues is calculated for the oxidized form and the cross-links are shown in red.
Note: Cross-links are not restricted to Cys residues (see below). When cross-linking residues other than Cys, the display color is still controlled by the button.
Cross-links are edited by selecting Edit|Edit cross-links from the menu or by right clicking on the sequence window and selecting from the local menu.
The left-hand list-box shows cross-links that are already defined. You can define new cross-links as follows:
· Select a residue from the right-hand list-box
· Press one of the up-arrows
· Select another residue
· Press the other up-arrow
· Press the button
If you select a link in the left-hand list-box you can replace it by pressing the button.
By default, Cys residues are shown in the right-hand list-box, but other residues can be selected from the drop-down list below the list-box.
Cross-links are deleted by selecting the relevant link in the left-hand box and pressing the button.

The value of the residues can also be entered directly in the edit-boxes above the up-arrows. The residue selected will be shown atop the edit-boxes.
Any residue can be selected for a cross-link. If the chemical composition of the cross-link alters the mass of the cross-linked residues, you should combine the cross-link with the 'Amino acid modifications' option discussed below. The mass difference of Cysteine vs. Cystine cross-links (i.e. 2 Da.) is catered for automatically through the button (Chapter 1.4). The button also determines whether drawing of the red cross-links in the sequence window takes place.
In printouts the linked residues are listed when the ‘Extended info’ option has been selected. The links are shown as red lines when the ‘Color print’ option is selected.
In GPMAW you can specify chemical modifications for individual amino acid residues. This function works in addition to the modified residues that are defined in the residue mass files (chapter 4.2), which work on all residues of a given type. You may define up to twenty individual modifications, while there are 11 ‘extra’ residues that you can define in the mass files. In addition you can define changes to the N- and C-terminus (Edit sequence, chapter 4.1). You may also define cross-links, but only for disulfide bridges do they by themselves contain other information like mass and composition. Cross-links can be combined with individual modifications in order to cover all kinds of cross-linking.
An amino acid modification (individual, N- and C-terminal) is defined by a name and an elemental composition. The name is not obligatory, but entering a name makes it easier to navigate the sequence. When you define an amino acid modification in the modification file, you can optionally specify that the modification is restricted to specific residues. For the mass files you need to define 1- and 3-letter code in addition to mass and elemental composition.
Modifications are reported on most printouts, and can be seen in the sequence as:
Single residue modification: Residue is painted red and when the mouse points to the residue, the name and composition is shown in the toolbar of the sequence window (right-hand panels with a light yellow background).
N- and C-terminal modifications: Name is shown in the sequence window toolbar by default. The fly-by help shows the elemental composition.
Residue mass file: The name of the currently loaded residue mass file is shown in the main window toolbar. Note: The residue mass file is global for all sequence windows (i.e. when you change residue mass file, all mass and composition values are re-calculated in all windows).
When you have a few residues modified or many different modifications, you should choose single residue modification. If on the other hand you have a large number of identical modifications (e.g. hydroxylated proline in collagen) it makes more sense to specify a ‘new’ residue to replace the ones that are modified.
If you have residues that ‘change’ during the course of your experiments you can make two mass files both with the ‘extra’ residues, but one file without modifications and the other with. This is the idea behind cysteine modifications where you have a mass file for each type of cysteine modification (e.g. aa_mass for the default values, pe_cys for pyridylethylated Cys, ae_cys for amino ethylated Cys etc.).
Insert simple modification
The ‘Simple modifications’ are a number of modifications that are hard-coded into GPMAW. When you right-click the mouse in a sequence window, you bring up the pop-up menu that presents you with a number of context-sensitive menu options. The second option on the list is to modify residues. The actual residue that is to be modified is shown after the ‘Modify’ command, e.g. Modify –Lys228-. Selecting this option opens a submenu listing the options for this particular residue type. Selecting one of these options will insert the selected modification into the chosen residue.
You can enable all modifications on the list for all residues by un-checking ‘Strict residue check’, the last option on the submenu. When you next access the ‘Modify’ command, all modification options are available for any residue.

If you double click on a residue, you will bring up the ‘Insert modification’ dialog box. The same dialog can be accessed by selecting Edit|Edit modification from the menu. Here you can either specify individual modifications or color a residue. If the residue selected is already modified, the modifications will be entered in the ‘Name’ and ‘Elemental composition’ edit boxes:
Residue: The number of the residue to be modified will be shown in the ‘Residue number’ field at the top. To the right of this number, the full name of the residue will be displayed. If the ‘Edit modification box’ was activated by a double-click on a residue, the number displayed will be the residue clicked upon, otherwise it will be number 1 (the first residue). The residue numbers can be changed directly by editing the number or by clicking the up/down arrows next to the number. The residue name will change to show the sequence residue.
Replace residue: This is a drop-down menu that enables you to replace the currently selected residue with any of the 20 standard residues. The insert modification dialog box closes immediately upon selection.
Modification: Modifications can be entered directly into the 'Elemental composition' box or, if a modification file has been opened, you can select from the modifications available in the drop-down box at the bottom of the dialog box. The 'Name of modification' is optional, but useful for reference. Clicking on the calculator button opens the ‘Elemental composition calculator’ (see Chapter 4.4 for how to enter elemental compositions).
Note: When entering an elemental composition, be sure that all the atoms are defined in the currently active mass file (chapter 4.2). H, C, N, O, P, S, Br and Na are in the default file.
Below the edit boxes a button can activate a drop-down menu for selecting simple modifications. This menu is identical to the ‘Insert simple modification’ menu described above.
By using modification databases (see Chapter 4.3 - Edit|Edit modifications) you can have any kind of modification readily available for changing a sequence.
Start by loading a modification database file. The modifications available in the file will be listed in the selection list. Here you can select a modification either by double-clicking on your choice or by making a selection followed by ‘Select modif’. The selection will be entered in the two edit boxes above, where it can be modified before being entered into the sequence.
Modified residues are displayed in the Highlight 4 color (Chapter 5.3).
Note: Be sure to define the first three highlight colors different from highlight 4 (in Setup|Setup system|Colors), as you will otherwise be unable to see the modified residue if it is part of a colored motif (Chapter 3.3).
Tip: For the first and last residue it is better to modify the N- or C-terminal in Edit|Edit sequence, as these modifications do not count as one of the limited twenty individual modifications.
Clicking on one of the bottom colored panels enables you to color the background of the selected residue in the indicated color. Note: This action results in closure of the ‘Modification’ dialog box and any modifications entered will not be executed.
Clicking on the white left-most field will clear background coloring for the selected residue.
The color single residue command can be useful for drawing attention to a single residue. The coloring is persistent and will be carried on to peptide windows.
The 'Fragment window' option is a fast way of creating a new sub-sequence based on an existing sequence. A common usage of this function is when a pre-sequence has been loaded from a database and you want to work with the active protein. Alternatively you may need to work with a smaller peptide, but find it inconvenient to work through a sequence window.
Select Cleavage|Create fragment window (or right-click and select from the local menu) and the following dialog box opens:

If an area is highlighted, the first and last residue of the highlighted area will be displayed as default. If no highlight exists, the values will be 1 and the last residue of the sequence. The values can be edited directly or you can click the up/down arrows. When selecting , a new sequence window will open containing the selected part of the original sequence. The name of the new window will be 'Fragment 51-90 of ' + the original sequence name.
If you check the ‘Retain offset number’ box, the first residue of the sequence fragment will start with the number in the ‘First residue’ box + the original start value - one. The offset number can be edited in the usual Edit sequence dialog (Chapter 4.1). When an offset number has been specified, the color of the number in the residue number label of the sequence window will change to red.
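The naming and numbering rules above can be sketched in a few lines of Python. This is an illustration only, not GPMAW code: the function `make_fragment` and its parameters are hypothetical, and it assumes that the values in the dialog are positions counted from 1 in the parent sequence and that a fragment without a retained offset is numbered from 1.

```python
def make_fragment(name, sequence, first, last, start_number=1, retain_offset=False):
    """Return (window_name, fragment, number_of_first_residue).

    first/last: 1-based, inclusive positions in the parent sequence (assumption).
    start_number: number shown for the first residue of the parent sequence.
    """
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("fragment limits lie outside the sequence")
    fragment = sequence[first - 1:last]
    window_name = f"Fragment {first}-{last} of {name}"
    # 'Retain offset number': First residue + original start value - 1.
    # Without the option the fragment is assumed to be numbered from 1.
    first_number = first + start_number - 1 if retain_offset else 1
    return window_name, fragment, first_number
```

For a parent sequence numbered from 1, a fragment covering residues 51 to 90 keeps 51 as the number of its first residue when the offset is retained.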
The new window will be an independent parent window that can be saved as a sequence, searched for mass, etc. When the original sequence window is closed, the fragment windows will remain on the desktop.
You have to edit the sequence name to remove the automatic addition of ‘Fragment…’.
The ‘Sequence information’ dialog box contains three pages labeled ‘Sequence info’, ‘Composition’ and ‘Masses’, respectively. You switch between the different pages by clicking on the tabs. The first two pages have their own menu entries (under Info), while the last page can only be selected from the open dialog box.
Some of the information in this dialog is also present in the sequence window information frame (Chapter 3.1).
The Info|Sequence info menu entry opens the multipage ‘Sequence information’ dialog box on the ‘Sequence info’ page.

The page shows statistics for the currently selected protein sequence:
· The full name of the sequence.
· The origin file and position in the file.
· Average and monoisotopic mass (four decimals).
· Molar extinction coefficient at 280 nm [S.C. Gill & P.H. von Hippel, Anal. Biochem. 182, 319-326 (1989)].
· Molar absorbance based on the molar extinction coefficient calculated above.
· Theoretical pI. Values for both oxidized (SS - disulfide bonded) and reduced (SH) cysteine are shown. Three different values are reported, each based on a different table (1 – Skoog & Wichmann; 2 – Free amino acids; 3 – Rickard, Strohl & Nielsen). In the Setup (Chapter 5.6) you can set which tables to use in general calculations of the pI. See also Appendix C.7.
· Number of residues.
· Number of chains.
· Number of modified residues.
· Number of cross-links.
· Number of residues and percentage of highlighted residues.
· Number of residues and percentage of underlined residues.
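As an illustration of the extinction coefficient entry in the list above, the following sketch sums per-residue contributions at 280 nm. The coefficients (5690 for Trp, 1280 for Tyr, 120 per cystine) are the values commonly quoted for this type of calculation and are an assumption here; GPMAW's internal values and handling of cysteine may differ, and the function name is hypothetical.

```python
# Assumed molar extinction coefficients at 280 nm (M^-1 cm^-1);
# not taken from GPMAW itself.
EPS_TRP = 5690
EPS_TYR = 1280
EPS_CYSTINE = 120  # per disulfide bond, not per cysteine residue


def extinction_280(sequence, disulfides=0):
    """Estimate the molar extinction coefficient of a protein at 280 nm."""
    sequence = sequence.upper()
    if 2 * disulfides > sequence.count("C"):
        raise ValueError("more disulfide bonds than cysteine pairs")
    return (sequence.count("W") * EPS_TRP
            + sequence.count("Y") * EPS_TYR
            + disulfides * EPS_CYSTINE)
```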

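The theoretical pI listed above is the pH at which the calculated net charge of the sequence is zero. A minimal sketch of such a calculation is shown below. The pK values are illustrative placeholders only and are not any of the three tables named above; the function names are hypothetical, and the oxidized case simply assumes that every cysteine is disulfide bonded and therefore uncharged.

```python
# Illustrative pK values only (NOT one of the three tables used by GPMAW).
PK_POSITIVE = {"K": 10.0, "R": 12.0, "H": 6.5}
PK_NEGATIVE = {"D": 4.0, "E": 4.5, "C": 9.0, "Y": 10.0}
PK_N_TERM = 8.0
PK_C_TERM = 3.5


def net_charge(sequence, ph, reduced=True):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    charge = 1.0 / (1.0 + 10 ** (ph - PK_N_TERM))    # N-terminal amino group
    charge -= 1.0 / (1.0 + 10 ** (PK_C_TERM - ph))   # C-terminal carboxyl group
    for residue in sequence.upper():
        if residue in PK_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PK_POSITIVE[residue]))
        elif residue in PK_NEGATIVE:
            if residue == "C" and not reduced:
                continue  # oxidized (SS) cysteine carries no charge
            charge -= 1.0 / (1.0 + 10 ** (PK_NEGATIVE[residue] - ph))
    return charge


def theoretical_pi(sequence, reduced=True, tolerance=1e-4):
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid, reduced) > 0:
            low = mid    # still positive: the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0
```

Bisection works here because the net charge decreases steadily as the pH rises, so there is exactly one crossing point.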
The Info|Composition menu entry displays the amino acid and atomic composition of the currently selected protein sequence.
The table at the top lists all amino acid residues with 3-letter code, number of residues (#), and percent (%) counted as number of residues. All x's and 'extra' defined residues (see Edit mass file, Chapter 4.2) are grouped under the Xxx entry.
The atomic composition lists the composition of the atoms defined in the current mass list (Chapter 4.2) as number of atoms (#) and percent (%).
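The amino acid table described above amounts to a count and a percentage per residue type. The sketch below shows one way to produce it; the function name is hypothetical, and anything outside the twenty standard one-letter codes is grouped under Xxx, mirroring the description above.

```python
from collections import Counter

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def composition(sequence):
    """Return {three_letter_code: (count, percent)} for a sequence."""
    counts = Counter(THREE_LETTER.get(residue, "Xxx") for residue in sequence.upper())
    total = sum(counts.values())
    # Percent is counted by number of residues, not by mass.
    return {code: (n, 100.0 * n / total) for code, n in sorted(counts.items())}
```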

The ‘Masses’ section of the ‘Sequence information’ dialog box shows a table of the multiply charged ion species for both monoisotopic and average masses up to 30 charges. Although the monoisotopic mass will most probably not be very useful, it is included for completeness.
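A table of this kind can be reproduced from the neutral mass alone. The sketch below assumes that each charge is carried by an added proton (positive ion mode), which the text does not state explicitly; the function name is hypothetical.

```python
PROTON_MASS = 1.007276  # Da


def charge_series(neutral_mass, max_charge=30):
    """Return [(z, m/z)] for protonated ions with 1..max_charge charges."""
    return [(z, (neutral_mass + z * PROTON_MASS) / z)
            for z in range(1, max_charge + 1)]
```

Calling the function once with the average mass and once with the monoisotopic mass gives the two columns of the table.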
Print will make a hardcopy of both the ‘Sequence info’ page and the ‘Composition’ page when the focus is on either of the first two pages. When the focus is on the last page, ‘Masses’, you will get a hardcopy of the multiply charged ion species.
Pressing the ‘Copy to clip’ button will copy the information in the displayed page onto the clipboard.

The annotation window is an editable text window that can contain any kind of text. The annotation page will be saved along with the sequence. You can view the annotation by selecting Info|Annotation… or by pressing the button in the sequence window toolbar.
The color of the ‘Annotation’ button changes with the content:
· Gray: There is no content on the ‘Annotation’ page.
· Red: There is text on the ‘Annotation’ page.
· Green: The ‘Annotation’ page contains a Swiss-Prot entry (with an accompanying ‘Feature table’).
Although you can put any kind of text on the annotation page, it is particularly useful when you read a sequence in Swiss-Prot or GenPept format. When you import a sequence via the File|Import ASCII (from file or from clipboard) command (Chapter 2.5), you are given the choice of saving the intact record to the annotation window. If you read a sequence from the indexed Swiss-Prot database (Chapter 2.6), the complete entry will automatically be placed in the annotation (if the full database is present).
Records in Swiss-Prot format are parsed onto the ‘Feature table’, see below. Records in GenPept format (e.g. Entrez) will be recognized in the near future.
Note: When changes have been made to the annotation page, you have to save the sequence in order to save the annotation. You are not warned about losing information on the annotation page when you close the window!

The feature table is a translation of the FT section of the Swiss-Prot record. The main function of the ‘Feature table’ is to allow easy import of posttranslational modifications into GPMAW sequences. Most of the well-defined modifications in Swiss-Prot can be imported.
To import modifications, check the appropriate boxes and then press the button . This action transfers the modifications to the sequence record and closes the ‘Annotation’ window. You can check all recognized features by pressing the button .
The following features are recognized:
SIGNAL, PROPEP – this part of the sequence is deleted.
DISULFID – cross-links are created.
ACETYLATION, AMIDATION, FORMYLATION, HYDROXYLATION, PHOSPHORYLATION, SULFATATION – the appropriate modifications are carried out. No check is made for correct assignment of amino acid residue (e.g. that phosphorylation is on Ser, Thr or Tyr).
Features like chain, domain, site, glycosylation, and lipid are not recognized, as they either do not have a counterpart in GPMAW or do not represent a specific chemical modification.
Important: As GPMAW does not check the ‘correctness’ of the assignment of imported modifications, it is important that the import is carried out on the intact protein. When importing signal and propeptides (i.e. removing the peptides) together with secondary modifications, the removal of residues is carried out last, so the chemical modifications go to the ‘correct’ residues.
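Why the order matters can be seen in a small sketch: positions in the record refer to the intact protein, so modifications must be attached before any residues are deleted. The data structures and the function name below are hypothetical and only illustrate the principle, not how GPMAW stores sequences.

```python
def import_features(sequence, modifications, removed_ranges):
    """Apply modifications first, then delete signal/propeptide ranges.

    modifications: {position: label}, 1-based positions in the intact protein.
    removed_ranges: list of (first, last) inclusive ranges, e.g. SIGNAL, PROPEP.
    Returns a list of (residue, label_or_None) for the remaining residues.
    """
    # Step 1: attach modifications while the numbering still matches the record.
    residues = [(aa, modifications.get(i)) for i, aa in enumerate(sequence, start=1)]
    # Step 2: remove the signal/propeptide residues last.
    removed = set()
    for first, last in removed_ranges:
        removed.update(range(first, last + 1))
    return [res for i, res in enumerate(residues, start=1) if i not in removed]
```

If the two steps were swapped, a modification recorded at position 5 would land on what was originally residue 7 after a two-residue signal peptide had been removed.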
The button reloads the sequence from the annotation page (i.e. removes all changes to the sequence).
When the ‘Annotation’ window opens, the sequence from the sequence window is checked against the sequence in the annotation window. If there are discrepancies between the sequence lengths, first or last residues, you are given a warning in the top part of the annotation notebook.
In this case you should reset the sequence before importing modifications.
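The check described above compares only the length and the two terminal residues. A minimal sketch of such a comparison follows; the function name is hypothetical.

```python
def sequence_discrepancies(window_sequence, annotation_sequence):
    """List the differences that would trigger the warning described above."""
    problems = []
    if len(window_sequence) != len(annotation_sequence):
        problems.append("length")
    if window_sequence[:1] != annotation_sequence[:1]:
        problems.append("first residue")
    if window_sequence[-1:] != annotation_sequence[-1:]:
        problems.append("last residue")
    return problems
```

A check of this kind does not detect an internal substitution that leaves the length and both ends unchanged, so an empty result does not guarantee identical sequences.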
Tip: As the ‘Feature table’ is calculated from the annotation, you can enter your own modifications on the annotation page. Start a new line with FT followed by three spaces before entering the location, a space, MOD_RES, and the modification. Remember to save the sequence. When you next open the annotation window, the new modification will be present on the ‘Feature table’ page.
If you save the sequence ‘intact’, the Feature table can thus be a quick way of looking at your sequence with and without modifications.