Chapter

9

Cleavages

Cleaving a protein into peptides using specific or nonspecific cleavage methods.

This chapter describes how to use GPMAW to simulate cleavages of a protein using specific proteases or chemical cleavage methods. It also describes how the resulting peptide list can be viewed in various modes and can be used for generating simulated HPLC chromatograms and mass spectra.

If you need to mass analyze a protein digest, see the ‘Cleavage analysis’ section (9.3). The results of the cleavage (the peptide window) are described in section 9.4.

Automatic digest                                                                               9.1

The Cleavage|Automatic digest command opens the 'Select cleavage parameters' dialog on the ‘Automatic digest’ page. The other page of the dialog, ‘Other cleavage’, is covered in the next section.

The ‘Automatic digest’ page can also be opened from the sequence window by clicking the ‘Digest’ button. The ‘QuickDigest’ button next to it (the down arrow) opens a menu listing all enzymes in the enzyme list for a quick one-click digest. Please note that ‘QuickDigest’ performs a ‘straight’ digest with no partials or extended options.

On the automatic digest page, you can select a pre-defined enzyme to cleave your protein. By default, 10 enzymes are defined in the list of enzymes. All of the ‘enzymes’ can be edited, and there is room for five additional enzymes. Note that the CNBr entry (/M) does not yield the correct masses for cyanogen bromide cleavage; use ‘Other cleavage’ for this purpose (see below).

i       Tip: As the most commonly used enzyme is trypsin, this is placed at the top of the list. If you preferentially use another enzyme, you should move this to the first entry in the list.

Each enzyme is listed with name and specificity (see below).

As enzymes do not always cleave at all potential cleavage sites, it is possible to specify 'Partials' (missed cleavages). A partials level of 1 means that the resulting peptide list, in addition to all completely cleaved peptides, will contain all peptides that contain a single potential cleavage site (e.g. in addition to HITLK and SEAQR the list will also contain HITLKSEAQR). A partials level of 2 means that all peptides containing at most two potential cleavage sites will be shown etc. A peptide containing potential cleavage points will be indicated in the peptide list with a ‘*’ after the peptide number.
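The effect of the partials level can be sketched in a few lines of Python. This is only an illustration of the rule described above, not GPMAW's own implementation; the function name `peptides_with_partials` and the convention of passing cleavage positions as 1-based residue numbers are assumptions made for the example.

```python
def peptides_with_partials(sequence, cut_after, partials=0):
    """Return (missed_cleavages, peptide) pairs for a digest.

    cut_after holds 1-based residue numbers after which cleavage occurs
    (a hypothetical input format chosen for this sketch).
    """
    inner = sorted(p for p in set(cut_after) if 0 < p < len(sequence))
    bounds = [0] + inner + [len(sequence)]
    result = []
    for i in range(len(bounds) - 1):
        # Extend each fully cleaved peptide across 0..partials missed sites.
        for missed in range(partials + 1):
            j = i + 1 + missed
            if j >= len(bounds):
                break
            result.append((missed, sequence[bounds[i]:bounds[j]]))
    return result


# The example from the text: one tryptic site after the lysine at position 5.
for missed, peptide in peptides_with_partials("HITLKSEAQR", [5], partials=1):
    print(missed, peptide)
```

With a partials level of 0 only HITLK and SEAQR are returned; with a level of 1 the list also contains HITLKSEAQR, flagged with one missed cleavage.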

The ‘Edit’ button enables you to edit the currently selected enzyme line. For each line in the enzyme list, you specify a name (e.g. chymotrypsin) and cleavage specifications (see ‘Specifying enzyme cleavage definitions’ below).

Results of automatic digest are shown in the peptide daughter window (below).

 

Extended options

The peptides resulting from a cleavage can be modified through the ‘Extended options’. The different options can be combined.

Do not cleave if modified: If checked, automatic cleavage will not take place if one of the residues that are part of the cleavage specification is modified. E.g. if you have specified a lysine residue to be hydroxylated, trypsin will not cleave at that residue.

Limit mass range: If checked, only peptides having a mass in the range specified by the two edit number boxes will be displayed in the peptide window. If this function is activated, the number of peptides below, in, and above the range will be shown in an extra information pane below the toolbar in the peptide window.

Manual w. pre-cleavage: Enables you to modify the automatic cleavage. After clicking ‘OK’, all cleavage points will be shown inverted in the sequence window. Clicking a residue with the mouse now toggles its cleavage point, so you can add extra cleavage points as well as remove existing ones. Right-clicking the mouse in the window terminates data input.

i       Note: As it is not possible to scroll the display when specifying ‘manual’ cleavage points you should make certain that the whole sequence is displayed in the sequence window before selecting this option.

Exchange hydrogens: The mass of all exchangeable hydrogens will be changed to the mass of deuterium.

Terminals: The ‘Terminals’ button enables you to specify the N- and C- terminals of the generated peptides.

Pressing the button opens a dual drop-down dialog box. The selections are the same as for specifying the termini of the intact protein (see ‘Edit’ Ch. 4.1).

The default is H (mass 1 Da) for the N-terminus and OH (mass 17 Da) for the C-terminus.

Specifying enzyme cleavage definitions

Enzyme specificity is defined by the following symbols:

/    following residue is necessary for cleavage

\    following residue prohibits cleavage

-    cleavage position

,    separates multiple residues in the same position

;     separator for individual cleavage specifications. This enables you to combine two or more enzymes within the same specification (e.g. if you digest with trypsin and Endoproteinase Asp-N the definition will be /R/K-\P;-/D). If you want to specify overlapping peptides for one enzyme and not another, you specify the required overlap level for the combined specification. When you have the peptide list you can remove the overlapping peptides for the ‘clean’ cleaving enzyme(s) through the local menu command ‘Remove partials’ (see ‘Pop-up menu’ under ‘Peptide window’ below)

If no dash ('-') is present in the specifications, cleavage takes place after the last residue. Up to 6 positions can be specified.

Examples:

Trypsin:                                 Cleavage takes place after Arg or Lys, but not if the following residue is Pro.

                                               /R,K-\P

Chymotrypsin:                     Cleavage takes place after Trp, Phe, or Tyr, but not if the following residue is Pro.

                                               /W,F,Y-\P

Endoproteinase Asp-N:     Cleavage takes place before Asp.

                                               -/D

Acidic cleavage:                  Cleavage after Asp or before Ser or Thr.

                                               /D;-/S/T
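As a rough illustration of how such a definition can be interpreted, the following Python sketch parses a specification and returns the cleavage positions in a sequence. It is not GPMAW's parser. The function names `parse_spec` and `cut_sites` are hypothetical, and the sketch assumes that every '/' or '\' starts a new position, so alternatives at one position must be written with commas (as in /R,K-\P).

```python
MAX_POSITIONS = 6  # the text states that up to 6 positions can be specified


def parse_spec(spec):
    """Parse a cleavage definition into a list of (positions, cut) rules."""
    rules = []
    for part in spec.split(";"):
        part = part.strip()
        if not part:
            continue
        positions = []  # one (required, residues) pair per sequence position
        cut = None      # number of positions that precede the cleaved bond
        i = 0
        while i < len(part):
            ch = part[i]
            if ch == "-":
                cut = len(positions)
                i += 1
            elif ch in "/\\":
                j = i + 1
                while j < len(part) and part[j] not in "/\\-":
                    j += 1
                residues = {r for r in part[i + 1:j].upper() if r.isalpha()}
                if not residues:
                    raise ValueError(f"no residues after {ch!r} in {part!r}")
                positions.append((ch == "/", residues))
                i = j
            else:
                raise ValueError(f"unexpected character {ch!r} in {part!r}")
        if len(positions) > MAX_POSITIONS:
            raise ValueError(f"more than {MAX_POSITIONS} positions in {part!r}")
        if cut is None:
            cut = len(positions)  # no dash: cleave after the last residue
        rules.append((positions, cut))
    return rules


def cut_sites(sequence, spec):
    """Return sorted 1-based residue numbers after which cleavage occurs."""
    sequence = sequence.upper()
    sites = set()
    for positions, cut in parse_spec(spec):
        for site in range(1, len(sequence)):
            start = site - cut
            matches = True
            for offset, (required, residues) in enumerate(positions):
                index = start + offset
                present = 0 <= index < len(sequence) and sequence[index] in residues
                if present != required:
                    matches = False
                    break
            if matches:
                sites.add(site)
    return sorted(sites)


print(cut_sites("HITLKSEAQR", r"/R,K-\P"))  # trypsin
print(cut_sites("AADAA", "-/D"))             # Endoproteinase Asp-N
```

The end of the sequence is not reported as a cleavage site here; whether a tool counts it as one is a presentation choice.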

Other cleavage                                                                                  9.2

Manual digest

The manual digest option functions identically to the automatic digest, except that you enter the enzyme name and cleavage specificity manually (see how to specify cleavages above).

The result of manual digest is, like automatic digest, displayed in a peptide window (see below).

Other digest

The other digest page lists a couple of cleavage options that cannot easily be defined using the standard cleavage nomenclature.

CNBr: The cyanogen bromide cleavage is a chemical cleavage that cleaves after Met, but modifies the C-terminal methionine residue into either homoserine or homoserine lactone depending on cleavage conditions. Apart from the homoserine/lactone formation the cleavage is similar to automatic digest and manual digest described above.

i       Hint: If you need to obtain information on overlapping peptides in a CNBr digest you have to perform an automatic digest (using /M- as cleavage parameter) and modify the C-terminus according to homoserine or homoserine lactone formation.
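The mass consequence of the homoserine or homoserine lactone formation can be sketched as follows. This is a minimal example, not GPMAW's code: the function names are hypothetical, the masses are standard monoisotopic residue masses, the termini are the default H and OH, and the shifts of about -29.9928 Da (Met to homoserine) and -48.0034 Da (Met to homoserine lactone) follow from the elemental compositions of the residues.

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # H on the N-terminus plus OH on the C-terminus

# Mass change of the C-terminal Met after cyanogen bromide cleavage.
MET_SHIFT = {"homoserine": -29.99281, "homoserine lactone": -48.00337}


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide with H/OH termini."""
    return sum(MONO[r] for r in peptide) + WATER


def cnbr_peptides(sequence, form="homoserine"):
    """Cleave after every Met and adjust the mass of the new C-terminus."""
    if not sequence:
        return []
    shift = MET_SHIFT[form]
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        # A Met at the very end of the protein is not cleaved and stays Met.
        if residue == "M" and i < len(sequence) - 1:
            fragment = sequence[start:i + 1]
            peptides.append((fragment, peptide_mass(fragment) + shift))
            start = i + 1
    tail = sequence[start:]
    peptides.append((tail, peptide_mass(tail)))
    return peptides


for fragment, mass in cnbr_peptides("GAMKLMR", form="homoserine lactone"):
    print(f"{fragment:6s} {mass:10.4f}")
```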

Manual cleavage: The manual cleavage enables you to specify all cleavage points yourself. After clicking ‘OK’ you have to click on all residues in the sequence window preceding the bonds to be cleaved. The residue clicked on will be shown in inverted color to indicate that cleavage will take place.

Cleavage positions can be toggled both on and off.

You terminate cleavage definition input by clicking the right mouse button inside the sequence window. The resulting peptide window will then open.

i       Hint: You can modify an automatic digest by checking the ‘Manual with pre-cleavage’ option in the extended section of automatic digest (see above).

Cleavage analysis                                                                             9.3

The Cleavage analysis command will give you a quick overview of the results of enzymatic or chemical cleavages.

The window is a daughter window bounded by the main GPMAW window and is controlled through the right-hand command bar.

The command bar displays, from the top, the first eight cleavages listed in the automatic cleavage dialog (see section 9.1 above). Selecting one with the left mouse button displays various peptide parameters in the top left box. The enzyme name and cleavage parameters are shown in red, followed by the number of peptides (assuming complete cleavage) and the number of double cleavage sites (e.g. two consecutive basic residues in a tryptic digest). The number of double cleavage sites is listed because most endoproteases are less active towards terminal residues, which leads to incomplete digests and more complex peptide mixtures. Below this comes the distribution of peptide masses divided into 500 Da ranges up to 3500 Da. Finally, the masses of the smallest and largest peptides are listed.
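The summary values described above are simple to reproduce from a list of peptide masses and cleavage positions. The sketch below is an illustration only; the function names and the choice of half-open ranges (a mass of exactly 500 Da falls in the 500-1000 range) are assumptions, not documented GPMAW behavior.

```python
def mass_distribution(masses, width=500, top=3500):
    """Count peptide masses in ranges of `width` Da up to `top` Da."""
    counts = [0] * (top // width + 1)  # the last slot collects masses >= top
    for mass in masses:
        index = min(int(mass // width), len(counts) - 1)
        counts[index] += 1
    labels = [f"{lo}-{lo + width}" for lo in range(0, top, width)]
    labels.append(f"{top}+")
    return list(zip(labels, counts))


def double_cleavage_sites(sites):
    """Count pairs of cleavage sites on directly neighbouring residues."""
    ordered = sorted(sites)
    return sum(1 for a, b in zip(ordered, ordered[1:]) if b - a == 1)


masses = [450.2, 980.5, 1200.0, 4100.7]
for label, count in mass_distribution(masses):
    print(f"{label:>10s}  {count}")
print("smallest:", min(masses), "largest:", max(masses))
print("double sites:", double_cleavage_sites([5, 6, 12, 20, 21]))
```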

The bottom box displays the protein sequence with all cleavage points highlighted. Cleavage takes place after each highlight. The last residue will always be highlighted as the termination of the sequence is counted as a cleavage point.

Before printing you have to select all the cleavage methods you want to analyze by checking the appropriate check boxes above the ‘Print’ button.  The printout will print the peptide parameters on the left-hand side of the page with the protein sequence to the right of the appropriate cleavage parameters. This makes for a quite compact analysis of all relevant cleavages.

i          Note: Only the topmost eight ‘enzymes’ from the automatic digest list can be selected. You may have to edit your digest list to reflect your cleavage preferences (see section 9.1).

 

i       Note: The cleavages performed are ‘straight’ cleavages. This means that you cannot specify overlapping sequences, modified terminals etc. For this you have to perform a ‘normal’ automatic cleavage as detailed in chapter 9.1.

Peptide window                                                                                 9.4

The peptide window shows the list of peptides that is the result of one of the enzymatic or chemical cleavages described above. The window is a daughter window that is linked to its parent sequence window so the daughter window will close when the parent window is closed.

i       Note: Up to three peptide windows derived from the same sequence window can be open simultaneously. Subsequent peptide windows will all replace window number one. The number of the peptide window is displayed as the first item in the title bar in square brackets (e.g. [1]).

The initial display of the peptide list is determined by the 'Setup peptide parameters', Chapter 5.2. Several of the display parameters can be turned on and off, either through the tool bar or by right-clicking the mouse to get to the pop-up menu.

i       Hint: The amino acid residues are shown colored if they are colored in the parent sequence window. This means that if you want to color residues after you have created your peptide window, go back to the parent window and create the colored residues you want before going back to the peptide window. If the colors are not displayed immediately, force a repaint by minimizing and restoring the window.

The peptide list shows the peptides generated in a tabular fashion, typically with a number of physicochemical properties as shown above. The ‘Alt’ button switches to an alternate table form that can be set up differently from the primary one. Typically one is set up with various physicochemical properties and the other with multiply charged ions (below) (see chapter 5.2). In all cases the first column lists the peptide number (in the linear sequence) and the last column lists the amino acid sequence (in either 1- or 3-letter code). The actual parameters shown will depend on the setup. Up to six columns (parameters) can be shown simultaneously in addition to the peptide number (always first) and the peptide sequence (always last).
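For readers who want to check the multiply charged ion columns by hand, the usual relation between neutral mass and m/z can be written as a short function. This is a generic sketch under the assumption that ions are formed by adding protons (positive mode) or removing them (negative mode); the function name `mz` is hypothetical and the manual does not state which exact proton mass GPMAW uses.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz(mass, charge):
    """m/z of an ion formed by adding (charge > 0) or removing protons."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    return (mass + charge * PROTON) / abs(charge)


neutral = 1570.6774  # example neutral monoisotopic mass, chosen arbitrarily
for z in (1, 2, 3, 4, -1, -2):
    print(f"{z:+d}  {mz(neutral, z):.4f}")
```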

The header of each column acts as a button. Pressing it sorts the corresponding column, first in ascending order and, when pressed again, in descending order. The header of the sorted column is displayed in red.

If you have specified overlapping peptides in the ‘Select cleavage parameters’ box (chapter 9.1) overlapping peptides will be shown with a superscript after the number indicating the number of overlaps (= number of missed cleavages) present in the peptide.

The commands available for peptide windows are accessed through the toolbar, the Peptide list menu, or the pop-up menu. The commands are listed below with a short description first, followed by a more detailed description of the individual functions.

i       Note: The commands have different scopes. Some commands are for the display only (e.g. 1/3 code toggle), some functions work on the peptide list as a whole (e.g. HPLC chromatogram) and some commands are for the individual peptide (e.g. ms/ms fragmentation).

The toolbar

   Average/mono mass toggle. Default is set in the setup dialog (chapter 5.2).

    Setup: Opens the ‘System setup’ on the Setup peptide parameters page, see Chapter 5.2.

     1/3: Toggle between 1- and 3-letter code.

    Alternate display: Toggles between normal peptide list and alternate list. The actual columns displayed are set in the Setup peptide parameters (Chapter 5.2). The available columns are mass, singly to quadruply charged positive ion, singly and doubly charged negative ion,  from-to, pI, HPLC index, Bull & Breese index and charge.

    Info: Detailed peptide information window, see below.

    Low mass filter: When pressed, the low mass peptides are hidden in the display. The low mass filter limit can be set in System setup (chapter 5.2). This option is very handy when viewing mass spectrometric peptide maps where you very often do not see low mass ions.

    Ms/ms: Displays ms/ms fragmentation pattern of the selected peptide, see Chapter 10.1.

    HPLC chromatogram: Simulated HPLC chromatogram, see below.

    Mass spectrum: Simulated MS spectrum, see below.

    Charge vs pH: Displays a graph of the charge of the peptide at all pH between 0 and 14. See below.

   

             The status panel shows the cleavage agent, the cleavage parameters and the partials level or number of missed cleavages (e.g. p0 = no overlapping peptides; p1 = fully cleaved peptides + peptides containing 1 potential cleavage site).

    Close: Close current peptide window. Also closes derived daughter windows like simulated HPLC chromatogram. Does not close the sequence window.

 Synchronize windows: When this box is checked the peptide window will synchronize with the sequence window, so selecting a peptide in the peptide window will underline the corresponding sequence in the sequence window. Other windows like ms/ms fragmentation and charge vs. pH are also updated whenever the focus changes between peptides (in ‘normal’ mode you have to select the corresponding command in order to update the windows).

i       Hint: If you have a sequence, a peptide, an ms/ms fragmentation and a charge vs. pH window open in GPMAW you can select the Window|Tile command to have all related windows tiled optimally in the main GPMAW window. If you then check the ‘Synchronize windows’ in the peptide window, all windows will be updated whenever the focus changes in the peptide window.

Peptide list commands in the main menu:

The first three menu items (1/3 letter residue, Multicharged, Info) correspond to the toolbar buttons mentioned above.

Predict SS cross-links: Lists the masses of all combinations of peptides that contain cysteine residues.

In order to limit the number of potentially linked peptides you are asked to limit the number of peptides to combine to 2, 3 or 4. The list can be constrained to show only combinations of peptides having an even number of Cys residues, i.e. there will be no free cysteines.

The disulfide cross-links are more fully discussed below.

N-glycosylation: Displays the masses of peptides with potential N-glycosylation sites with the most common combinations of glycosylations (e.g. high mannose, complex and hybrid type), see discussion below.

The remaining items of the Peptide list menu (MS/MS, HPLC chromatogram, Mass spectrum, Charge vs. pH) are identical to the toolbar buttons listed above and are all discussed below.

Pop-up menu:

The pop-up menu (opened by clicking the right mouse button in the window) contains the same menu items as the Peptide list menu above with the addition of Remove partials, Copy/Export [Copy to clipboard, Copy columns to clipboard, Selected peptide as new protein, Export to GRAMS], Print and Select font.

Remove partials: This is a special command for the case where you have specified multiple enzymes as cleavage parameter (e.g. trypsin combined with chymotrypsin as /K/R-\P;/Y/F/W-\P) and have overlapping peptides. If you then want to remove the overlaps from one of the definitions (e.g. if your experience tells you that trypsin cleaves completely while chymotrypsin generates overlapping peptides) you select ‘Remove partials’ and from the list of enzymes you click on the cleavage definition from which to remove overlapping peptides. The peptide list will be redrawn reflecting the changes to the definitions.

This command will only be available when the peptide list has been generated with multiple enzymes.

Print

Printing the peptide list essentially gives an output that matches the display. The main difference is that it will be in monochrome with colors shown in bold. The sequence printed will be extended to the right margin – if you select to print in landscape mode (File|Printer setup) you will get more of the sequence printed.

Print options:

When you select print you will be presented with a dialog with the following options:

1-letter code/3-letter code. The default will be what has been selected in the peptide window.

Print size: Normal (10 point) or Compact (8 point). This option refers to the peptide table only; the header and the (optional) composition will be printed in normal size.

Include composition: The amino acid composition will be printed in a table for every 10 peptides.

Print in color: Highlights and colored residues will be printed in color. On a monochrome printer (e.g. laser printer) the colors will in most cases be simulated in gray tones.

Export and Copy to clipboard

The Copy/Export pop-up menu option contains several sub-menus:

Copy to clipboard: The peptide list is copied to the clipboard. The format of the copied list is defined in the system setup (Setup|Setup system).

i       Note: If you want to copy to a spreadsheet you should select ‘Tab delimited’, if you copy to a report select ‘Copy as text’. With ‘Tab delimited’ each column will be in a spreadsheet column by itself.
You also have the choice of copying the sequence with a limited length or full length. For more information see Chapter 5.2 ‘Setup peptide parameters’.

You can highlight and then copy part of the peptide list by using the usual Windows selection keyboard shortcuts:
Copy a continuous list: Click with the mouse on the first entry, hold down the <Shift> key and click on the last entry. All entries between the two will now be selected. Select ‘Copy to clipboard’.
Copy a discontinuous list: Hold down the <Ctrl> key while clicking on the different entries. You can combine the two selection methods by first making a continuous list and then de-selecting individual items by holding down <Ctrl> while clicking on the items to de-select. Select ‘Copy to clipboard’.

Copy columns to clipboard: Lets you select which columns of the peptide list to copy to the clipboard. The complete column will be copied, you will not be able to select a range.

Select peptide as new protein: A new sequence window will open on the GPMAW desktop containing the currently selected peptide. This command is particularly effective if you want to carry out additional cleavages and experiments on an isolated peptide.

Peptide info

 The peptide info window can be accessed either by double-clicking on a line in the peptide list or by selecting a line followed by ‘Peptide info’ from the pop-up menu, the ‘Peptide list’ in the main menu or the ‘Info’ button in the toolbar. The peptide info opens a dialog box showing physical/chemical information on the selected peptide.

The peptide information window can also be called directly from the sequence window (after highlighting a peptide – see chapter 3.2).

The peptide information window is divided into four panels with the following content:

Top left:         The top blue line represents the protein sequence and the green bar shows the relative position and coverage of the selected peptide.
Below is shown the sequence position, length of peptide (with percentage of total sequence) and the elemental composition of the peptide.
Then follow various physicochemical characteristics of the peptide: monoisotopic and average mass, charge (at the pH selected in System setup, chapter 5.2), Bull & Breese index, theoretical pI and HPLC retention index. The charge and the pI labels have fly-by help showing the pH and the method used for the respective characteristic.

Bottom left:   The sequence of the selected peptide.

Top right:       The isotopic distribution of the peptide is shown as a stick diagram. The first 15 isotopes are displayed and the graph is always scaled to the largest isotope. The mass of each isotope is shown beneath each stick. Dragging the edge of the window will expand the isotopic distribution.

Bottom right: The isotopic distribution of the peptide in table form. First column shows the mass. Second column shows the abundance of each isotope and the last column shows the relative abundance with the most abundant isotope as 100%.

The ‘Previous’ and ‘Next’ arrows replace the content of the window with the characteristics of the previous and next peptide in the peptide window. If the ‘Peptide info’ window is called from the sequence window (chapter 3.2) these two buttons will be grayed (non-active).

Selecting ‘Print’ will make a hardcopy of all the information in the window except the blue/green peptide location line.

Selecting ‘Copy’ will copy all the text in the window to the clipboard. No graphics (peptide location line or isotope graph) will be copied to the clipboard.

Simulated HPLC chromatogram

The simulated HPLC chromatogram [Y. Sakamoto, N. Kawakami & T. Sasagawa, J. Chrom. 442, 69-79 (1988)] is based on the separation taking place on a C18 column running a 0.1% TFA/water/acetonitrile gradient. The retention values are the ones displayed in the peptide list and are relative; you cannot translate them directly to minutes on your own separation system.

Each peptide is labeled with the number from the peptide list (i.e. the linear order in the polypeptide chain).

The peak heights are based on absorption at 214 nm. The peptide bond and certain side chains (particularly the aromatic ones) are given relative absorption values:

Peptide bond = 1

Cys and Met  = 1

Tyr, Phe and His  = 5

Trp  = 33.
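A relative peak height can be estimated from these values as sketched below. The assumption made here, which the manual does not spell out, is that the contributions are simply added: one unit per peptide bond plus the listed side-chain values. The function name `relative_peak_height` is hypothetical.

```python
# Relative absorption values at 214 nm as listed above.
SIDE_CHAIN_ABSORPTION = {"C": 1, "M": 1, "Y": 5, "F": 5, "H": 5, "W": 33}
PEPTIDE_BOND_ABSORPTION = 1


def relative_peak_height(peptide):
    """Sum the peptide-bond and side-chain contributions (assumed additive)."""
    peptide = peptide.upper()
    bonds = max(len(peptide) - 1, 0)  # an n-residue peptide has n-1 bonds
    side_chains = sum(SIDE_CHAIN_ABSORPTION.get(r, 0) for r in peptide)
    return bonds * PEPTIDE_BOND_ABSORPTION + side_chains


for peptide in ("HITLK", "SEAQR", "WAC"):
    print(peptide, relative_peak_height(peptide))
```

Under this assumption a single Trp residue contributes more than thirty peptide bonds, which is why short Trp-containing peptides can dominate the simulated trace.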

This graph is intended as a guide only, as the real chromatogram will probably never look like this. The yield on the HPLC column will never be identical for all peptides, you will have partial cleavages and autodigest products, the exact position in the elution order will not be precise, the column ages with time, etc. Taken with these precautions, the simulated chromatogram has turned out to be a reasonably good guide.

The edit line in the toolbar can be used for manual input of a peptide. This peptide will have an id number of 0 (zero). Each time a new residue is entered in the edit line, the graph will be redrawn. This will give the user an indication on how changes in single residues change retention time behavior.

The graph can be scaled, zoomed etc. like all graphs; please see Chapter 11.1.

Predict SS cross-links

This command combines all Cys containing peptides in the digest and sorts them by mass. The ‘SS’ button in the main toolbar has to be in the oxidized state (SS), not the reduced (SH) state.

  The first dialog box asks you how many peptides to combine. The options are two, three, or four peptides. If you check the ‘Show only even number of Cys’ option, only linked peptides where the total number of Cys is even will be displayed (that is, no free Cys is likely to be present).

The final peptide list includes average and monoisotopic mass, number of peptides combined and numbers of Cys residues present in the combined peptides. Finally, the numbers (from the master peptide list) of the peptides combined are shown last.
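The combination step can be sketched as follows. This is a hedged illustration rather than GPMAW's algorithm: the function name and input format are hypothetical, the sketch lists combinations of two up to the chosen number of peptides, and it assumes that every pair of Cys residues in a combination forms one disulfide bond, each removing two hydrogens from the summed reduced masses.

```python
from itertools import combinations

HYDROGEN = 1.00783  # monoisotopic mass of a hydrogen atom in Da


def predict_crosslinks(peptides, max_combined=2, even_cys_only=False):
    """Combine Cys-containing peptides and sort the results by mass.

    peptides: list of (number, sequence, reduced_mass) tuples
    (a hypothetical input format for this sketch).
    """
    with_cys = [p for p in peptides if "C" in p[1]]
    results = []
    for size in range(2, max_combined + 1):
        for combo in combinations(with_cys, size):
            n_cys = sum(seq.count("C") for _, seq, _ in combo)
            if even_cys_only and n_cys % 2:
                continue  # an odd count would leave a free cysteine
            # ASSUMPTION: all Cys pair up, one disulfide per two residues.
            bridges = n_cys // 2
            mass = sum(m for _, _, m in combo) - 2 * HYDROGEN * bridges
            numbers = tuple(num for num, _, _ in combo)
            results.append((round(mass, 4), size, n_cys, numbers))
    return sorted(results)


digest = [(1, "ACK", 300.0), (2, "GCR", 400.0), (3, "AAK", 200.0), (4, "CCR", 500.0)]
for row in predict_crosslinks(digest, max_combined=3):
    print(row)
```

The masses in the example digest are placeholders, not real peptide masses.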

N-glycosylation

The N-glycosylation command will display the most common N-glycosylations for the peptides in your digest that contain the N-glycosylation motif (NxS, NxT, NxC).
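Finding the motif in a peptide is a simple pattern search, sketched below with Python's `re` module. The function name `n_glyco_sites` is hypothetical. The pattern follows the motif exactly as given above, with any residue allowed at position x; many definitions of the sequon additionally exclude Pro at x, which is not applied here because the manual does not mention it.

```python
import re

# Lookahead so that overlapping motifs (e.g. NNTS) are all reported.
MOTIF = re.compile(r"(?=N[A-Z][STC])")


def n_glyco_sites(peptide):
    """Return 1-based positions of Asn residues in an NxS/NxT/NxC motif."""
    return [m.start() + 1 for m in MOTIF.finditer(peptide.upper())]


print(n_glyco_sites("AANGSK"))
print(n_glyco_sites("NNTS"))
print(n_glyco_sites("HITLK"))
```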

The window shows a toolbar at top and the masses of the potential glycosylation below.

The drop-down edit box in the toolbar lists all the peptides in your digest that contain an N-glycosylation site. When you select a new peptide, the mass lists below will change to reflect the new masses.

The main display shows the mass of the peptide at the top. Below the type of glycosylation is listed, the mass of the core unit (see appendix C), and the different chains that make up the outer arms of the glycosylation (each arm is a Gal-GlcNAc disaccharide). None means no arms, mono to penta means one to five arms. Each arm can terminate in a sialic acid residue (Sia), but as the stability of this sialic acid is not very high, you will often experience a very heterogeneous population.

The complex table is reproduced below with the inclusion of a pentose unit attached to the core.

The ‘Bisecting’ check-box adds an N-acetylglucosamine (GlcNAc) unit to the core unit, and the ‘Extra fucose’ check-box adds an extra fucose (deoxyhexose) unit.

Clicking the ‘Glyco type’ button toggles to a list of high mannose (0 to 6 mannose units) and hybrid type glycosylations (mono and di- glycosylation arms +/- sialic acid and mannose).

The hybrid table is reproduced including a pentose unit.

i       Hint: The drop-down edit box can be edited to any peptide (cut and paste is also supported). This means that you can modify residues and are not limited to the peptides present in your digest. You may even enter a sequence without the N-glycosylation motif, although this will not have any biological relevance.

Please read Appendix C.6 for information on the core unit and individual monosaccharide unit masses.

Simulated mass spectrum

The simulated 'Theoretical mass spectrum' draws the masses of the current peptide list as a stick spectrum with a default mass range of 500 to 4500.

All the 'sticks' of the peptide list are drawn to the same height, 20% of the window height. The 'sticks' are labeled with the peptide number from the list, and have a small error bar across them.

By pressing the 'Load peak table' button you can load an experimental mass list (either a GPMAW .PKS, a PerSeptive GRAMS peak list, a Bruker peak list, or a Hewlett Packard MALDI-TOF peak list) into the spectrum. The loaded spectrum will be drawn in a different color and will usually have both a mass and an intensity defined for each peak. The experimental spectrum will be drawn on a relative scale.

The error bars on the theoretical spectrum allow for easy comparison of the two spectra.
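The visual comparison amounts to asking which experimental peaks fall within the error bar of a theoretical mass. A minimal sketch of that check follows; the function name `match_peaks` and the 0.5 Da default tolerance are assumptions for the example, not values taken from GPMAW.

```python
def match_peaks(theoretical, experimental, tolerance=0.5):
    """Pair each theoretical peptide mass with the closest experimental peak.

    theoretical: list of (peptide_number, mass) tuples
    experimental: list of measured masses
    tolerance: half-width of the error bar in Da (assumed value)
    """
    matches = []
    for number, t_mass in theoretical:
        hits = [e for e in experimental if abs(e - t_mass) <= tolerance]
        if hits:
            best = min(hits, key=lambda e: abs(e - t_mass))
            matches.append((number, t_mass, best))
    return matches


theoretical = [(1, 1000.5), (2, 1500.0), (3, 2210.1)]
experimental = [1000.7, 1800.0, 2210.0, 2210.4]
for number, t_mass, e_mass in match_peaks(theoretical, experimental):
    print(f"peptide {number}: theoretical {t_mass}, measured {e_mass}")
```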

The graph can be scaled, zoomed etc. like all graphs; please see Chapter 11.1 for details.

Charge vs. pH graph

The Charge vs. pH graph plots the charge of the selected peptide (shown in the window title) versus pH.

The x-scale shows the pH times 10 (e.g. 70 is pH 7.0) and the y-scale shows the charge times 100 (e.g. 250 is +2.5 charges).

The point where the graph crosses zero charge corresponds to the pI of the peptide. This value can vary slightly from the value reported in ‘Peptide info’ (see above) as the algorithms used for the calculations are slightly different. The steepness of the graph as it crosses zero also indicates how much confidence to place in the theoretical calculations. A peptide with a shallow crossover point is much more sensitive to the surroundings of the individual charges than a peptide with a steep crossover point.
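A curve of this kind can be approximated with the Henderson-Hasselbalch equation, as in the sketch below. It is a generic textbook calculation, not the algorithm used by GPMAW: the pKa values are illustrative assumptions, the function names are hypothetical, and oxidized cysteine is modeled simply as carrying no charge.

```python
# Illustrative pKa values (assumptions; GPMAW's own values may differ).
POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM, C_TERM = 8.0, 3.1


def net_charge(peptide, ph, reduced_cys=True):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1 / (1 + 10 ** (ph - N_TERM))   # free amino terminus
    charge -= 1 / (1 + 10 ** (C_TERM - ph))  # free carboxyl terminus
    for residue in peptide.upper():
        if residue in POSITIVE:
            charge += 1 / (1 + 10 ** (ph - POSITIVE[residue]))
        elif residue in NEGATIVE:
            if residue == "C" and not reduced_cys:
                continue  # oxidized Cys is treated as uncharged
            charge -= 1 / (1 + 10 ** (NEGATIVE[residue] - ph))
    return charge


def estimate_pi(peptide, reduced_cys=True):
    """Find the zero crossing of the charge curve by bisection."""
    lo, hi = 0.0, 14.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if net_charge(peptide, mid, reduced_cys) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def scaled_curve(peptide, reduced_cys=True):
    """Points using the graph's scaling: pH x 10 and charge x 100."""
    return [(p, round(100 * net_charge(peptide, p / 10, reduced_cys)))
            for p in range(0, 141)]


print(f"estimated pI of HITLK: {estimate_pi('HITLK'):.2f}")
print(scaled_curve("HITLK")[70])  # the point at pH 7.0
```

Because the charge decreases monotonically with pH, bisection between pH 0 and 14 is sufficient to locate the crossover.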

Two graphs are displayed, one for reduced and one for oxidized cysteine (there will only be a difference above the pKa of Cys).

By pressing the ‘Multiple graphs’ button you change the display to show all peptides in the digest. In the right-hand pane you get a legend for the peptides and can turn the individual peptides on and off.

The graphs can be scaled, zoomed etc. like all graphs; please see Chapter 11.1.