Chapter

6

Mass search.

How to determine whether a given mass is present in the protein.

The main concern in this chapter is to search a protein for a given mass (Search|Mass search..). However, at the end three small utilities are presented that may help you identify mass differences (Search|Mass difference..) as well as a function to help you identify cross-linked peptides.

Search for masses                                                                            6.1

The search for mass function enables you to search a protein sequence for a list of masses. The result shows all peptides in the protein that fit within the given mass window. You can only perform one search at a time. If you perform a new search on a sequence window that already has a mass search result window, that window will close and the new mass search results will be displayed instead.
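The principle of the search can be illustrated with a short Python sketch. It is not GPMAW's actual implementation: the function name `search_mass`, the percentage-based window and the use of monoisotopic residue masses are assumptions made for this example. The sketch enumerates every contiguous peptide in a sequence and reports those whose mass falls within the window around the search mass.

```python
# Monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def search_mass(sequence, target, precision_percent):
    """Return (deviation, first, last, peptide) hits, closest fit first.

    Residue numbers are 1-based and inclusive.
    """
    window = target * precision_percent / 100.0
    hits = []
    for start in range(len(sequence)):
        mass = WATER
        for end in range(start, len(sequence)):
            mass += RESIDUE_MASS[sequence[end]]
            if mass > target + window:
                break  # extending the peptide only makes it heavier
            if abs(mass - target) <= window:
                hits.append((mass - target, start + 1, end + 1,
                             sequence[start:end + 1]))
    hits.sort(key=lambda hit: abs(hit[0]))
    return hits
```

For example, `search_mass("GAG", 146.06913, 0.01)` reports the two dipeptides at residues 1-2 and 2-3, since both have the same composition.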

Mass list: The list of masses can be entered manually, read from a disk file or pasted from the clipboard (Ctrl+V). The disk file can be in GPMAW peptide mass format (Appendix A) or a peak file saved by one of a number of mass spectrometry acquisition programs. The number of supported peak formats will continue to increase. If your current software is not supported, please contact Lighthouse data for information.

Mass values can be entered with as many decimals as needed (only four decimals will be displayed), but you should take the working mass precision that has been set into account. When a mass has been entered or read from disk it will be enabled (checked in the ‘OK’ column). You can disable masses by un-checking the checkbox in the right-hand column. You can also enable/disable, delete or screen the mass table quickly by selecting the ‘Edit’ button, see below. The last column in the table (labeled ‘Ch’) shows the charge state of each mass. The default charge state is set in the ‘Ion type’ box (top middle of the dialog). Changing the ‘Ion type’ resets all charge states in the table. To change a charge state, either enter the new charge state or double-click in the appropriate cell. The charge will increase by one for every double-click, but will cycle to –2 when it reaches +3.

Right-clicking on the mass list opens a local menu with the following options: Load table, Save table, Copy to clipboard (Ctrl+C), Paste from clipboard (Ctrl+V), OK all, Invert OK, Insert line, Delete line, Clear table. ‘Load table’, ‘Save table’ and ‘Clear table’ duplicate the corresponding 'Mass table' buttons. ‘Copy to’ and ‘Paste from’ are standard clipboard routines that will copy a mass list to or from the clipboard. 'OK all' enables all mass values. 'Invert OK' will disable all enabled values and enable all disabled values, thus enabling you to carry out complementary searches on a mass list. 'Insert line' and 'Delete line' will insert an empty line or delete the current line, respectively.

The up/down arrows above the mass list will sort the list in increasing/decreasing order.

Options

Ion type: Select the ion type that fits the input. M+H subtracts and M-H adds the mass of a proton before carrying out the search, while M uses the mass input list as given.
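The ion type correction amounts to a single addition or subtraction. The following Python sketch shows the arithmetic for singly charged values; the function name `to_neutral_mass` is hypothetical, and the proton mass is the standard value rather than one taken from GPMAW.

```python
PROTON = 1.007276  # mass of a proton in Da


def to_neutral_mass(value, ion_type):
    """Convert an entered mass to the neutral mass M used for searching."""
    if ion_type == "M+H":
        return value - PROTON  # protonated ion: remove the proton
    if ion_type == "M-H":
        return value + PROTON  # deprotonated ion: restore the proton
    if ion_type == "M":
        return value           # already a neutral mass
    raise ValueError(f"unknown ion type: {ion_type!r}")
```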

Mass type: Select average or monoisotopic as fits your data (Appendix C).

Mass table: ‘Load..’ loads and ‘Save..’ saves the mass list to a disk file. ‘Edit..’ opens the Enable/disable mass dialog box, see below. ‘Clear’ clears the mass list. If you load a mass table from disk, the file name will be entered in the ‘Mass list info’ edit line. ‘Copy’ and ‘Paste’ will write/read a mass list from the clipboard.

Precision: The mass window calculated around each mass value. The default value is taken from Setup (Chapter 5.1). The up/down buttons shift the precision in 0.01% units. When you get to 0.01% the unit change is 0.001%.

Multicharged: If checked, multicharged ions (M2H2+, M3H3+ etc.) will also be considered during the search (up to +5).
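As a rough illustration of what considering multicharged ions involves, the Python sketch below interprets an entered value as m/z and lists the neutral mass it would correspond to for each charge state from +1 to +5. How GPMAW performs this internally is not described here, so the function name `neutral_mass_candidates` and the m/z interpretation are assumptions.

```python
PROTON = 1.007276  # mass of a proton in Da


def neutral_mass_candidates(mz, max_charge=5):
    """Return {charge: neutral mass} for charge states +1..max_charge.

    An ion carrying z protons is observed at m/z = (M + z * PROTON) / z,
    so the neutral mass is M = z * (m/z - PROTON).
    """
    return {z: z * (mz - PROTON) for z in range(1, max_charge + 1)}
```

Each candidate mass can then be searched in the same way as a singly charged value.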

Mass list info: Here you can enter any text you want printed along with your mass search. The mass list info will also be shown in the title of the search result window. If the mass list has been read from disk, the filename of the data will be entered here.

Modifications: The drop-down list box enables you to select a modification file to add to your search. If a modification file is added to the search, all masses in the list will first be used for a search as they are listed, and then the mass of each enabled modification will be added and the search will be repeated. Only peptides containing residues that are specified with the given modification will be considered and only up to the value specified in the 'Max. modif.' field. Selecting a modification will automatically set the ‘Max. modif.’ field to 1, if it is not already set at a higher value. The ‘Modifications’ drop-down list will show all modification files present in the ‘system’ directory.

Selecting the  button opens the 'Edit modification file' dialog box with the currently selected file (see Chapter 4.3).

Fit to enzyme: Selecting an enzyme in the 'Fit to enzyme' drop-down box will compare the result list to the specificity of the given enzyme (as specified in the enzyme cleavage list, see Chapter 9.1). If 'Exact fits only' is selected, only peptides that fit the specificity will be displayed. If 'Check fits' is enabled, all matching peptide terminals will be marked.

MS diff.: The mass difference button will open a dialog box displaying the mass table in an x/y difference table. By highlighting specific differences (i.e. amino acid residues, carbohydrate residues, modifications) you can quickly make a visual inspection for sequence tags, double basic residues in tryptic digests, identify modified residues, oxidations etc. For a more detailed description of the table please see Chapter 12.1.

Enable/disable masses

The dialog, accessed through the ‘Edit’ button  of the ‘Mass table’, enables you to quickly enable and disable individual mass values in a mass list.

Masses are moved from the enabled to the disabled list (and vice versa) by highlighting the relevant masses and then pressing ‘>’ or ‘<’. Alternatively, you can double-click on a mass value to move it to the other list. Pressing ‘>>’ will move all mass values to the other list.

Checking the 'Delete disabled items' check-box will delete all disabled masses when accepting the dialog box, otherwise the masses will be disabled (‘un-checked’ in the mass list).


Pre-screen mass list: Selecting a mass list from the Background drop-down list and clicking the 'Background' button will compare all masses in the given mass list against the current mass list and move all mass values that fit within the given mass precision into the disabled list. This facility is used to quickly screen for background and/or automatic digest mass values. Pressing the 'Adducts' button works in a similar manner, but takes a modification file as input (also chosen in a drop-down list box), compares all mass differences in the current mass list and moves any possible adduct ions to the disabled list. The file called ADDUCTS.MOD will be selected as default if present.
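The background screen can be pictured as the following Python sketch. The function name `screen_background` and the percentage window are assumptions for illustration, and the mass values in the usage note are purely illustrative.

```python
def screen_background(masses, background, precision_percent):
    """Split masses into (enabled, disabled) lists.

    A mass is disabled when any background mass lies within the
    precision window around it.
    """
    enabled, disabled = [], []
    for mass in masses:
        window = mass * precision_percent / 100.0
        if any(abs(mass - b) <= window for b in background):
            disabled.append(mass)
        else:
            enabled.append(mass)
    return enabled, disabled
```

With a precision of 0.01%, screening the list 842.51, 1045.56, 2211.10 against the background values 842.5094 and 2211.1040 leaves only 1045.56 enabled.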

RESULTS

The results of the mass search are displayed on two pages in a notebook-like window. You can change between the two views by selecting the appropriate tab at the bottom of the display.

i       Note: The two-page notebook will be implemented in GPMAW from version 5.03. As the complete implementation was not finished at the time of writing this manual, the actual implementation may vary from this description. In version 5.02, only the ‘Analyze’ section is available.

The analyze page gives you a complete view of all peptides that are potential ‘hits’ for the given mass list with the chosen parameters. The results page summarizes the results of your search and includes a view of the protein sequence. You can switch between the two views at any point and any changes made in the ‘Analyze’ section will be reflected in the ‘Report’ section.

The view of the ‘Analyze’ page can be either in the expanded form, where search masses and modifications are listed on individual lines, or in the compact form, where each peptide/‘hit’ has only a single line.

i       Note: The ‘expanded’ and the ‘compact’ views are likely to be combined into a single format.

Peptide hits resulting from the mass search are displayed on the ‘Analyze’ page in order of mass, with the closest fit to each mass first. If a modification file has been selected for the search, each active modification will be listed immediately after each peptide hit.

Expanded view: The first line of each input mass will show the search parameters. Each hit will display: Number (#), actual mass, [charge], deviation from the search value (Da.), first and last residue number, and sequence. When a peptide is selected, the precision in % and ppm (parts per million) will be displayed in the panel of the local toolbar.
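The three deviation units are related by simple scaling. A minimal Python sketch, assuming the deviation is taken as the found mass minus the search mass (the sign convention and the name `deviation` are assumptions):

```python
def deviation(search_mass, found_mass):
    """Return the deviation of found_mass from search_mass as (Da, %, ppm)."""
    delta = found_mass - search_mass
    return delta, delta / search_mass * 100.0, delta / search_mass * 1e6
```

A peptide of 1000.5 Da found for a search mass of 1000.0 Da thus deviates by 0.5 Da, which is 0.05% or 500 ppm.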

The sequence will be displayed with the two residues before and after the actual peptide grayed (color defined in Setup, Chapter 5.3). There is also a space to separate the pre- and post-residues from the hit peptide. If the 'Fit to enzyme' option was enabled with the 'Check fits' option in the search dialog box, all peptide terminals that fit the enzyme specificity will be indicated by '>' or '<'. If both the N- and the C-terminal residues fit the given enzyme specifications, the peptide sequence will be underlined.

Sequences that are too long to be displayed completely will be shown with three dots (...) in the middle where the peptide is truncated. When you resize the window, the sequences will be redisplayed.

After the search for a given mass has been performed, the mass of the specified modifications will be added and the search rerun. Positive hits will be listed along with number of modifications, modification type and valid residues (as defined in the modification file, Chapter 4.3). E.g. “Checking modification [*1]: Oxygen (15.9949) [M]” means that the search mass is increased by 15.9949 Da and the protein is searched again. If a peptide fitting this mass is found and the peptide contains a methionine, it will be reported in the list. [*1] means that a total of 1 modification will be searched and reported.

Check-boxes:

 The check-boxes to the left of each peptide enable you to select individual peptides for transfer to the clipboard (Edit|Copy) or highlight the corresponding sequences in the parent sequence. Clicking the ‘Check box button’  in the header line will check/uncheck all check-boxes. See also ‘Pop-up menu’ below. The function of this button will change into the check-box sub-menu of the pop-up menu (see below).

All lines that contain a ‘perfect fit’, i.e. where both the N-terminus and the C-terminus fit the selected enzyme cleavage specificity, will be checked by default.

The lines that are checked are the ones displayed on the ‘Report’ page.

Toolbar:

The buttons in the local toolbar enable you to:

   Toggle between 2 and 4 decimals

   Toggle between 1- and 3-residue display

   Redo the search using the same parameters. The results from the next search will open in a separate window (until 3 mass search windows are open).

  Toggle between expanded display (see above) and compact display (see below).

  Open/close the navigation frame (graphical representation of the mass hits). See below.

   Close the mass search window.

 The precision can be changed by entering a new value in the edit box and clicking on the V button (or just moving the focus by clicking or pressing enter or tab). The box will be in either % or ppm as determined in the System setup (Chapter 5.1).

 The mass precision type can be changed by selecting among the Da (Dalton), % (percent) and ppm (parts per million) radio buttons. Any change will redraw the result list box.

 The enzyme cleavage specifications to match against the mass hits can be selected in the drop-down list. The button to the right of the list toggles between ‘Exact fits’ (a blue ‘E’) and ‘Check fits’ (a blue ‘C’). When exact fits are chosen, only peptides where both the N-terminus and the C-terminus fit the cleavage specifications will be shown in the result list. When check fits is selected, matching cleavage specifications will be marked with a green ‘>’. If both terminals match, the sequence will furthermore be underlined.

 The right hand panel shows additional information (precision etc.) about the selected peptide.

The compact view shows the search mass, the mass found, deviation (in Da, % or ppm as set in the toolbar), modification (if any), position in the sequence and the sequence of the ‘hit’ peptide with the preceding and following two residues in red.

If a peptide is selected, the peptide residue positions and deviation in % and ppm will be shown in the toolbar panel. Like the normal peptide display, the sequence indicates enzyme cleavage positions (‘Fit to enzyme’). N-terminal hits are indicated with ‘>’ and C-terminal hits with ‘<’. If both the N- and the C-terminus fit the cleavage specifications, the sequence will be underlined. For further details please see ‘normal display’ above and ‘Pop-up menu’ below.

Navigation frame

The navigation frame is a horizontal graphical band that opens between the toolbar and the result list, when the frame button  is depressed (the default behavior can be set in Setup, 5.1). The state of the frame button will depend on whether it was depressed when the mass search window was last closed.

The navigation frame shows a horizontal line representing the search mass range (rounded down and up to the closest hundred Da). A short black line indicates every 50 Da and every 500 Da is labeled.

The lines indicate:

Gray lines: Upwards are theoretical peptide masses without missed cleavages. Downwards indicate one missed cleavage (i.e. contains 1 overlap).

Red lines (upwards) indicate search masses that match with a theoretical mass. The colored dots are the ‘hit’ precision.

Blue lines (downwards) indicate non-matching search peptides.

Green lines (upwards) indicate search matches that fit theoretical peptide masses including a modification.

You may zoom the mass range of the navigation display by clicking on the left and right mass limit. After the first click, the clicked mass value will be displayed in the lower right corner of the frame.

A double-click in the frame will reset it to the default mass range.

Pop-up menu

The pop-up menu (right-click in the window) contains the following commands: 1/3 letter, Redo, Selected peptides, Export, Print, Select font, Help.

1/3-letter: Toggles between 1- and 3-letter peptide display.

MS/MS of current peptide: The currently selected peptide in the mass hit list will be transferred to an MS/MS cleavage window for further analysis. Please see chapter 10.1 for details.

Redo: Closes the result window and opens the ‘Mass input’ dialog box with all input parameters intact.

Checked peptides: The following commands work, in general, only on those lines that have been selected by checking them in the left-hand column of the peptide ‘hit’ list.

1.       Highlight parent sequence: The sequences corresponding to the selected peptides will be underlined in the intact protein sequence. If no peptides have been checked, the currently selected line will be highlighted.

2.       Highlight parent sequence (toggle): This underlines the checked peptides if they are not underlined, and removes the underline if they are already underlined.

3.       Check perfect fits: All peptides that have N- and C-terminal sequences that fit the specified enzyme cleavage parameters are checked. If no enzyme has been specified, no action takes place.

4.       Check all hits: All peptides in the list are checked.

5.       Clear all hits: All check-boxes are cleared.

6.       Toggle selections: All peptide selections will be inverted, i.e. checked peptides will be unchecked and vice versa.

Export: Three options are available under Export:

1.       ASCII: Saves the complete ‘hit’ list to a file on disk. The file will be in text format (ASCII) with the individual columns separated by tab characters (#9). The file will get the extension .DAT.

2.       GRAMS: If the search data loaded are from a GRAMS peak file (.pkm) you can export the results of the search back into the .pkm file. This means that when you reload the mass spectrum you can have the result shown in the spectrum. The data that you want to export has to be highlighted (double click on the relevant lines, or use the ‘Underline all’ command). The exported data can be edited and formatted in various ways before being saved. For more information please see the on-line help.

3.       Copy to clipboard: The complete result list is copied to the clipboard. This works like the standard ‘Copy to clipboard’ function (Edit|Copy or Ctrl-C). If peptides are checked, you will be asked whether you want the complete list or just the checked ones. The exact format for copying to the clipboard is determined by the setting in the ‘Setup – Peptide’ page (Chapter 5.2). In particular you should specify space or tab characters for output (text and spreadsheet analysis respectively).

Print: Prints the ‘results’ list, see below.

Select font: Change display font. This will only change the font for the current window.

Report

The report page shows a two-panel view of the results that are checked on the ‘Analyze’ page.

The top panel shows the protein sequence (60 residues/line) where all enzyme cleavage points are marked in blue and all ‘peptide hits’ are shown as horizontal lines below, with the mass of the peptide to the right of each line. The lines are ‘layered’ so that overlapping peptides are displayed on different levels.
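One simple way to obtain such a layering is a greedy assignment, sketched below in Python. This is only an illustration of the idea and not necessarily how GPMAW draws the lines; the function name `assign_levels` is hypothetical and each residue range is assumed to occur only once.

```python
def assign_levels(peptides):
    """Map each (first, last) residue range to a display level (0 = top).

    Peptides are placed greedily on the first level whose most recent
    peptide ends before the new one starts.
    """
    level_ends = []  # last residue drawn on each level so far
    levels = {}
    for first, last in sorted(peptides):
        for level, end in enumerate(level_ends):
            if first > end:
                level_ends[level] = last
                levels[(first, last)] = level
                break
        else:
            # No existing level is free, so open a new one.
            level_ends.append(last)
            levels[(first, last)] = len(level_ends) - 1
    return levels
```

For the peptides 1-10, 5-20, 11-15 and 21-30, only 5-20 overlaps with others and is moved to the second level.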

The bottom panel shows statistics on the peptide hits (e.g. residue coverage, number of hits/misses etc.), followed by listings of matching peptides, matching modifications and a list of masses that do not fit anything. The modifications searched for are listed at the bottom of the table.
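Residue coverage is the share of the sequence that is covered by at least one hit. A minimal Python sketch of this statistic, with the hypothetical function name `residue_coverage`:

```python
def residue_coverage(sequence_length, peptides):
    """Return the percentage of residues covered by at least one hit.

    peptides is an iterable of 1-based, inclusive (first, last) ranges.
    """
    covered = set()
    for first, last in peptides:
        covered.update(range(first, last + 1))
    return 100.0 * len(covered) / sequence_length
```

Overlapping hits are counted only once: in a 100-residue protein, the hits 1-10, 5-20 and 91-100 cover 30 residues, i.e. 30%.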

The green divider between the two panels can be shifted with the mouse cursor to change the ratio of the two panels.

Print

Printing the result list gives you the choice of selecting 1- or 3-letter residue printing with the default as displayed, all other options will be as displayed. Additionally, you can set the print as ‘Normal’ (10 point character size) or ‘Compact’ (8 point).

Sequences that are too long to be printed completely will be truncated in the middle (the truncated part will be shown as three dots...).

i       Hint: If you want to print long sequences you can turn the paper orientation to landscape mode (select File|Printer setup).

 

Mass difference                                                                                 6.2

The mass difference command covers three slightly different mass difference searches that are accessed through a multipage dialog box. As these functions are not linked to a sequence window, they are always available except when a dialog box has the focus.

Mass difference

The 'Mass difference' enables you to search for mass differences between residues in the currently defined mass table (Chapter 4.2).

You enter the difference to search for in the edit box and press the ‘Search’ button. The differences will be shown in the central table with the closest fit first. By pressing the ‘Invert’ button you can invert the search (e.g. search for 33 Da instead of -33 Da).
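The search can be pictured as a comparison of every ordered pair of residues in the mass table. The Python sketch below uses a small subset of monoisotopic residue masses; the function name `find_differences` and the absolute tolerance in Da are assumptions for this example.

```python
from itertools import permutations

# Monoisotopic residue masses (Da); a small subset for illustration.
RESIDUE_MASS = {
    "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259,
}


def find_differences(target, tolerance, table=RESIDUE_MASS):
    """Return (error, from, to) tuples, closest fit first.

    A pair is reported when mass[to] - mass[from] lies within
    tolerance of the target difference.
    """
    hits = []
    for a, b in permutations(table, 2):
        diff = table[b] - table[a]
        if abs(diff - target) <= tolerance:
            hits.append((abs(diff - target), a, b))
    return sorted(hits)
```

Searching for 0.984 Da finds the deamidations N to D and Q to E; searching for -0.984 Da, as the ‘Invert’ button would, returns the same pairs in the opposite direction.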

Difference table

The difference table is similar to the mass difference function above, except that amino acid residue differences are pre-computed and placed in a table.

The differences are shown with two decimals as default, but by checking the '4 decimals' check-box, the table is redrawn with four decimals.

Printing the above table gives you a handy difference table.

Fit mass to sequence

Occasionally you can end up with the mass of a peptide and only a partial sequence. If you need to know how many sequences could possibly account for the remaining mass, use the 'Fit mass to sequence' option.

Start by determining whether you are looking for a peptide or a fragment (residues only, i.e. an internal fragment where you have subtracted water for the termini).

Then establish if the mass is average or monoisotopic.

You enter the search mass in the 'Fit to mass' box. Check that the precision is suitable. Enter the maximum number of residues to check.

If you know that certain residues are present in the peptide, you can enter them in the ‘Known residues’ box (1-letter code). These residues will then be omitted from the mass search.

The ‘Compositional residues’ contains all the residues that may make up the peptide/fragment. If you know that a few residues (e.g. C or W) are not part of the peptide, you should remove them as the number of possibilities decreases drastically even with only a few residues removed.

When the search options are as stringent as possible, press the ‘Do search’ button.

A counter above this button will count down, and upon reaching zero, the table will show all possible sequence combinations that fit within the given mass window, closest fit first.


You can perform a new search by entering new values and pressing the ‘Do search’ button.

i       Note: The results of the search are compositions, not sequences. The actual sequence may be any permutation of the residues in the result.
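The composition search described above can be sketched in Python as follows. This is an illustration under stated assumptions, not GPMAW's algorithm: the function name `fit_compositions` and its parameters are hypothetical, masses are monoisotopic, the tolerance is in Da, and `max_residues` is taken to limit the unknown residues only.

```python
from itertools import combinations_with_replacement

# Monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def fit_compositions(target, tolerance, max_residues,
                     residues="GASPVTCLINDQKEMHFRYW", known="",
                     is_fragment=False):
    """Return residue compositions whose mass fits target, closest first.

    known residues are always part of the result; is_fragment omits
    the water of the free termini (residues only).
    """
    base = sum(RESIDUE_MASS[r] for r in known)
    if not is_fragment:
        base += WATER
    hits = []
    for n in range(max_residues + 1):
        for combo in combinations_with_replacement(sorted(set(residues)), n):
            mass = base + sum(RESIDUE_MASS[r] for r in combo)
            if abs(mass - target) <= tolerance:
                hits.append((abs(mass - target), known + "".join(combo)))
    hits.sort()
    return [composition for _, composition in hits]
```

The number of combinations grows very quickly with the number of residues, which is why removing residues from the compositional list and entering known residues shortens the search so markedly.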

Search for cross-linked peptides

The basis for this function is the cross-linking of proteins using chemical cross-linkers that result in cross-links with a specific molecular mass. The two proteins to be cross-linked can either be different or identical. In the case of identical proteins, you can either search for interchain links (linking two molecules together) or intralinks (links internal to a protein). If you search for intralinks, you should first perform a purification (gel filtration, gel electrophoresis) to make certain that you are not looking at multimers.

The operation is started by opening the relevant proteins in the GPMAW desktop. Then you select the Search | MS X-link menu option and you are greeted with the ‘select proteins’ dialog box. Note that the protein in the left-hand box will be denoted ‘Protein A’ and the one in the right-hand box ‘Protein B’.

When you have selected the protein(s) to cross-link, you have to select the cross-linking reagent from the next dialog box.

This is done in three steps:

1)       Select cross-linker type.

2)       Select specific cross-linking reagent.

3)       Select cleavage enzyme and upper mass limit of reporting peptides.

1) Start by selecting the cross-linker type. A zero length cross-linker does not introduce new atoms into the molecules, but typically removes water. Homobifunctional linkers introduce a linker molecule, but link to the same chemical group at both ends (e.g. amine, SH or carboxylic acid). Heterobifunctional linkers (not implemented at present, ver. 4.22) link two different chemical groups (e.g. SH and amine).

Selecting a new cross-linker type will fill the cross-linker drop-down box below with the cross-linkers known to the system.

2) Select a suitable cross-linker in the drop-down box. Whenever a new item is selected, the three composition boxes are filled with the appropriate compositions for a) cross-linker; b) non-linked cross-linker (e.g. a cross-linker that is linked at one end but is ‘free’ at the other end, usually hydrolyzed); c) reduced cross-linker (if applicable). The analysis of reduced cross-linkers is not supported at present (ver. 4.22).
The cross-link boxes below the compositions show what kind of chemical groups react with the selected cross-linker. If the box shows ‘Amine’, it means that lysine and the N-terminus can react, if the box shows ‘Carb. Acid’, it means that glutamic acid, aspartic acid or the C-terminus may react.

3) Select enzyme to use for cleavage analysis (the list is identical to the ‘automatic digest list’, see Chapter 9.1) and number of overlaps (missed cleavages). The number of overlaps can be 2, 3 or 4. Remember that if your cross-linking reagent modifies lysine residues and you use trypsin or Lys-C for cleavage you will need at least an overlap of 1 in all cross-linked peptides (except for the N-terminal peptide). Finally choose the upper mass limit to report. As the list generated is usually quite long, you should choose the upper mass limit that corresponds to your mass spectrometer.

i       Note: The first time you run this function, you will get an error message telling you that the file containing the cross-linking reagents has not been found, but default values will be inserted. Upon completion of the dialog, the file will be saved to disk (as ‘xlinker.rea’ in the gpmaw\system directory).

Edit cross-linking reagents

From the cross-linking parameters dialog box above, you can press the ‘Setup’ button to get to the ‘Edit cross-linking reagents’ dialog. Here you can edit the various parameters for the cross-linkers supported by GPMAW.

Each reagent needs to have a unique name and the ‘Link type’ has to be specified to either ‘Zero length’, ‘Homobifunctional’ or ‘Heterobifunctional’ from the drop-down selection box. If either of these requirements is not met, the entry will be deleted upon exit. The heterobifunctional linkers are not supported by GPMAW at present, but will be in the near future.

The ‘Link from’ and ‘Link to’ fields are the residues that participate in the cross-linking. The choices are ‘Carb. ac.’, ‘Amide’, ‘Cysteinyl’ and ‘Tyrosine’. For carboxylic acids, the C-terminus of the protein is expected to be able to participate in cross-linking, and likewise for the N-terminus when amide has been selected.

The ‘X-linker‘ field contains the chemical composition that the cross-linker adds to the mass of the cross-linked peptides. I.e. this is not the mass of the cross-linking reagent!

The ‘Non-linker’ field is the chemical composition of a cross-linker linked at one end, but not at the other end. This will usually be a hydrolyzed reagent (i.e. X-linker + H2O1).

The ‘Red-linker’ is a reducible reagent linked to a residue, after reduction. This will typically be half of ‘X-linker’ + H1.

Whenever the focus is on a composition field (the last three columns) the ‘Composition’ button will be enabled. Pressing this will open the ‘Edit composition’ dialog box enabling you to safely edit chemical compositions in the GPMAW format (see also Ch. 4.4). Once a composition field has the focus, you may double-click on it to activate the ‘Edit composition’ dialog.

When you exit the dialog, the new settings will be saved on disk and entered into the ‘Cross-linking parameters’ dialog.

Results

The results are presented in a list box showing all the peptide masses that may be generated.

The list box shows four columns that are, from left to right (depending on the state of the check-boxes below the list):

1)       Mass of peptide in Da. The mass type is determined by the Average/Monoiso. button at the bottom of the dialog.

2)       Residues from protein A that participate in the peptide.

3)       Residues from protein B that participate in the peptide.

4)       Type of crosslinked peptide(s):
Peptide: Single non cross-linked peptide from either protein A or protein B.
X-link: Two cross-linked peptides with one peptide originating from protein A, the other from protein B.
X-link + linker [1]: Same as above but with an additional non-linked (hydrolyzed) cross-linker. The square brackets indicate the number of non-linked cross-linkers attached.
Peptide + linker [1]: A single peptide from protein A or B with an attached non-linked (hydrolyzed) cross-linker.
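The mass of each type follows from the peptide masses and the masses of the ‘X-linker’ and ‘Non-linker’ compositions. The Python sketch below shows the arithmetic for one pair of peptides and one free linker; the function name `crosslink_masses` is hypothetical, and the ‘Peptide’ and ‘Peptide + linker [1]’ entries are calculated for peptide A only.

```python
def crosslink_masses(peptide_a, peptide_b, x_linker, non_linker):
    """Return the mass of each reported species type, keyed by its label.

    peptide_a, peptide_b: masses of the two peptides (Da).
    x_linker: mass added by a completed cross-link.
    non_linker: mass added by a linker attached at one end only.
    """
    return {
        "Peptide": peptide_a,
        "X-link": peptide_a + peptide_b + x_linker,
        "X-link + linker [1]": peptide_a + peptide_b + x_linker + non_linker,
        "Peptide + linker [1]": peptide_a + non_linker,
    }
```

With illustrative values of 1000 Da and 2000 Da for the peptides, 100 Da for the cross-link and 118.01056 Da for the hydrolyzed linker, the ‘X-link’ entry is 3100 Da and the ‘X-link + linker [1]’ entry is 3218.01056 Da.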

What peptides are calculated?

If the cross-linker links to amine residues, both the N-terminus and lysine residues are marked as potential linked residues. If the N-terminus is blocked you have to disregard N-terminal peptides.

If the cross-linker links to carboxylic acids, the C-terminus, glutamic acid and aspartic acid are potential linker residues.

If trypsin is specified as the cleaving enzyme together with a linker that links to amine residues, a peptide that is reported as having either an X-link or a (hydrolyzed) linker has an ‘internal’ lysine in addition to a terminal lysine/arginine. An alternative to the ‘internal’ lysine is the N-terminus.

i       Note: Due to the fact that in many cases you need ‘internal’ cross-linking residues in the peptides, all peptides are calculated with an overlap level of 0, 1, 2 etc. up to the level specified in the parameters dialog.

If you cross-link identical proteins (i.e. name of protein A is equal to name of protein B) ‘reverse’ links are removed from the list. I.e. if peptide 25-48 is linked to peptide 78-101 the reverse link peptide 78-101 to peptide 25-48 is not listed.
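Removing reverse links amounts to treating each link as an unordered pair of peptides. A minimal Python sketch, using the hypothetical function name `remove_reverse_links`:

```python
def remove_reverse_links(links):
    """Drop links that repeat an earlier link with the peptides swapped.

    Each link is a pair of (first, last) residue ranges.
    """
    seen = set()
    unique = []
    for a, b in links:
        key = frozenset((a, b))  # the same key for (a, b) and (b, a)
        if key not in seen:
            seen.add(key)
            unique.append((a, b))
    return unique
```

Given the links 25-48 to 78-101 and 78-101 to 25-48, only the first is kept.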

Options

In order to limit the number of reported peptides you can turn off the display of Peptide (non-linked), Peptide + Cross-links, Peptide + Free linker and/or Peptide + Cross-link + free linker by ‘un-checking’ the corresponding check boxes below the peptide list.

The total number of entries in the peptide list is displayed below this line.

If the MH2+ box is checked, the peptide list is displayed with doubly charged mass values in addition to the singly charged mass.

In the bottom right corner, you can choose between displaying average and monoisotopic masses, and molecular and singly charged ion species.

Print: Prints the peptide list in a two-column layout to conserve space.

Copy to clipboard: Copies the complete peptide list to the clipboard unless part of the list has been selected. In this case only the selected part of the list is copied to the clipboard (in the pop-up menu you can choose to copy the complete list even if part of it has been highlighted).

Compare to mass list

As the peptide list quickly gets to be very large it is very convenient to be able to compare the list to a mass table (peak table).

Start by selecting the appropriate mass precision for the comparison. This value has to be in ppm (parts per million; 1000 ppm = 0.1%). Then press the ‘Compare to ms list’ button and enter the mass list in the input table dialog box (see chapter 12.1 for details on file and clipboard lists).

If a mass list has been entered, the peptide list will be split, with the original list at the top and the mass ‘hits’ listed below. The search mass and the deviation in Dalton and ppm (parts per million) are shown in the right-hand part of the list.

When the peptide list is printed or copied to clipboard, the search results are included at the end of the list.