Chapter 11

Graphs

Graphs are used in GPMAW to display a number of parameters from secondary structure prediction to user-defined graphs.

Several graphs have been used as daughter windows of other windows. They behave in a manner similar to the windows described in this chapter. This concerns composition search (Chapter 7), simulated HPLC chromatogram, simulated mass spectrum, and charge / pH graph (Chapter 9).

Common graph commands                                                             11.1

All the graphs in GPMAW, whether they are line or stick graphs, share a number of common commands:

Toolbar

The layout of the toolbar varies slightly from window to window, depending on the needs and capabilities of the current graph.

·       The first button resets the graph to full scale.

·       The second button opens a dialog box enabling you to set the x- and y- scales to specific values.

·       The third button prints the current graph.

·       The fourth button saves the current graph to disk. The graph is saved in Windows metafile format (.WMF). This is a vector drawing format that enables scaling of the graph without loss of quality, even after it has been imported into a word processor.

·       Extra buttons may be present for loading files from disk (simulated mass spectrum).

·       The first panel shows the current position of the mouse cursor.

Mouse commands

You can zoom (expand) part of the graph by pointing to one corner of the area to be zoomed, then pressing and holding the left mouse button while dragging the mouse cursor to the opposite corner of the area.

Pop-up menu: Right-clicking the mouse on the graph opens the local menu enabling you to choose between the following commands:

Full scale, Set scale, Save to file, Print, Exit (close current window).

Copy

The menu command Edit|Copy copies the current graph to the clipboard (in WMF format). From here you can paste the graph into any program that accepts graphics as input (most word processors, vector graphics programs etc.). When you paste into Word, you sometimes have to double-click on the miniature graph shown to invoke the built-in graphics editor. When this opens with the graph, just close the graphics editor, and the graph will be displayed correctly in Word.

Print

Graphs are always printed in a pre-determined ratio of width to height, but utilizing the full width of the paper. This means that you can get a considerably larger graph by setting the printer to print in landscape format (File|Print setup).

Graphs are always printed in monochrome (black and white), but copying to a graphics program (see above) is done in color, which enables you to print colored graphs from that program.

Secondary structure prediction                                                      11.2

The secondary structure prediction is based on the values of Garnier [J. Garnier, D.J. Osguthorpe & B. Robson, J. Mol. Biol. 120, 97-120 (1978)] – also known as the GOR secondary predictions.

When the command is selected the graph daughter window opens immediately showing the calculated propensities for alpha helix, beta sheet, beta turn and random coil. Each graph has its own color, which can be set in Setup colors (Chapter 5.3).

Values well above zero, isolated from other curves, give a good indication of a particular structure. Other things to look out for are that alpha helices are usually 10-15 residues long and that beta sheets are never single, but always occur in multiples, often separated by beta turns.

When printing the secondary structure propensities, the different curves will be plotted as different line structures (solid, dotted, stippled and dot-dash).

Please note that the predictions are not accurate, but only give an indication of a structure. The proposed structure should always be checked by other means: alpha helical wheels, homologous structures, circular dichroism measurements etc.

Hydrophobicity                                                                                11.3

Selecting Graph|Hydrophobicity opens the hydrophobicity graph based on the values of [J. Kyte & R.F. Doolittle, J. Mol. Biol. 157, 105-132 (1982)].

The graph shows hydrophobic regions above the X-axis and hydrophilic regions below. The graph is particularly good at detecting transmembrane regions of a protein (around residue 500 in the above picture) and you can also often locate the activation peptide (left side of the graph). Also when generating peptide maps, the hydrophobicity graph indicates areas that may generate peptides that can be difficult to handle.

The toolbar shows from left to right:

·          Full scale;

·          Set scale;

·          Save graph to file (as a Windows metafile);

·          Show grid lines in graph;

·          Draw selection bar (see below);

·          Close hydrophobicity graph;

·          X and Y position of the mouse cursor;

·          Select window size (i.e. the number of residues over which the hydrophobicity values are averaged). The default window size is 7, but values in the 7-13 range may be biologically relevant.

 

The selection bar enables you to draw a horizontal bar across important regions of the graph and dynamically see the same region underlined in the protein sequence. The same region will be shown in the toolbar in sharp brackets.
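The averaging controlled by the window size can be sketched as follows. This is a minimal illustration, not GPMAW's own code: the function name `hydrophobicity_profile` is hypothetical, and it assumes that only complete windows are averaged and that each average is plotted at the centre residue of its window. The per-residue values are the published Kyte & Doolittle hydropathy indices.

```python
# Kyte-Doolittle hydropathy index (J. Mol. Biol. 157, 105-132, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(sequence, window=7):
    """Return (centre_position, mean_hydropathy) pairs; positions are 1-based.

    Assumption: only complete windows are averaged, so the first and last
    window // 2 residues get no value. GPMAW may treat the ends differently.
    """
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    half = window // 2
    profile = []
    for start in range(len(values) - window + 1):
        mean = sum(values[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile


if __name__ == "__main__":
    # Arbitrary example sequence; positive means are hydrophobic stretches.
    for position, mean in hydrophobicity_profile("MKTLLLTLVVVTIVCLDLGYT"):
        print(position, round(mean, 2))
```

A larger window smooths the curve more, which is why values toward the upper end of the 7-13 range tend to make long hydrophobic stretches stand out at the cost of local detail.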

Primer multiplicity                                                                           11.4

If you want to generate a primer from a protein sequence, the Graph|Primer multiplicity command generates a graph showing various multimers based on the protein sequence.

Low points in the graph show the lowest multiplicity, that is, the best positions for generating a primer. Point the mouse cursor at the points of interest and read the cursor position in the toolbar (see general graph information above). Please note that all graphs go to zero at the ends. This does not mean that the termini are the best sequences on which to base a primer.
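As a rough illustration of what a multiplicity value means, the sketch below multiplies the number of codons that can encode each residue in a window, using the standard genetic code. This is an assumption about the calculation, not a description of GPMAW's implementation; the function name `primer_multiplicity` and the parameter `n_residues` are hypothetical.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def primer_multiplicity(sequence, n_residues=6):
    """Return (start_position, multiplicity) pairs; positions are 1-based.

    Assumption: the multiplicity of a window is the product of the codon
    counts of its residues, i.e. the number of distinct DNA sequences
    that could encode that stretch of protein.
    """
    sequence = sequence.upper()
    results = []
    for start in range(len(sequence) - n_residues + 1):
        multiplicity = 1
        for residue in sequence[start:start + n_residues]:
            multiplicity *= CODON_COUNT[residue]
        results.append((start + 1, multiplicity))
    return results


if __name__ == "__main__":
    # Arbitrary example sequence; report the window with the lowest value.
    windows = primer_multiplicity("MKWVTFISLLFLFSSAYS", n_residues=6)
    print(min(windows, key=lambda item: item[1]))
```

Stretches rich in Met and Trp (one codon each) give the lowest values, whereas Leu, Ser and Arg (six codons each) raise the multiplicity quickly.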

User graph                                                                                       11.5

Using the Graph|User graph option it is possible to customize a graphical display of various protein features.

In the Residue graph parameters, you enter values for each residue depending on the feature you would like to emphasize. The example above displays the distribution of positive, negative, hydrophobic, and small residues along the polypeptide chain by assigning each of the residues in the group a value of 10. You do not have to assign the same value to every residue, but you should keep the average value in the range of 10-20, as the height of the graph otherwise will be too small or too large to show any details.

Parameters:

File name: If the data are read from a file, this field shows the file name (8 characters).

Graph name: Title of the graph.

Graph info: Supplemental information.

Graph 1..Graph 4: Values can be specified for each amino acid residue for up to four graphs (remember to enable the graph in the 'Number of graphs' box).

ID: Label for each graph.

Number of graphs: Number of graphs to display.

Scaling factor: Not used at present.

Scale: Initial y-scale of the graph.

Smoothing: Number of values (x-axis) that are averaged in the display.

Reset: Clears the table.

Load / Save: Loads and saves a table to a disk file.

The common graph commands are also available for user graphs; please see Section 11.1 above.
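The relationship between the residue table and the 'Smoothing' parameter can be sketched as below. The residue groupings, the function names `residue_table` and `user_graph`, and the handling of the sequence ends are illustrative assumptions and are not taken from GPMAW.

```python
def residue_table(residues, value=10):
    """Build a per-residue table: listed residues get `value`, all others 0."""
    return {residue: value for residue in residues}


def user_graph(sequence, table, smoothing=1):
    """Return one y-value per residue, averaged over `smoothing` neighbours.

    Assumption: the averaging window is centred on each residue and is
    simply truncated at the sequence ends.
    """
    values = [table.get(residue, 0) for residue in sequence.upper()]
    if smoothing <= 1:
        return values
    half = smoothing // 2
    smoothed = []
    for i in range(len(values)):
        low = max(0, i - half)
        high = min(len(values), i - half + smoothing)
        window = values[low:high]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # Hypothetical groupings, one table per graph (up to four in the dialog).
    positive = residue_table("KRH")
    negative = residue_table("DE")
    sequence = "MKRDEELLKKAG"  # arbitrary example
    print(user_graph(sequence, positive, smoothing=3))
    print(user_graph(sequence, negative, smoothing=3))
```

With a value of 10 per residue and a small smoothing window, the curves stay within the 10-20 range recommended above wherever residues of the group cluster.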

Dot-plot                                                                                            11.6

The dot-plot graph is used to compare two protein sequences by plotting identical or homologous residues in a two-dimensional pattern. Unlike the other graphs discussed, the dot-plot graph cannot be zoomed with the mouse, and the local menu is also slightly different. The function can also be used for comparing a sequence against itself in order to check for internal repeats (select the same protein for both axes).

When selecting Graph|Dot-plot you open a dual list dialog box. Each of the list-boxes presents a list of all sequences opened on the desktop. You select one sequence from each list-box and press 'OK'. You can select the same sequence in both list-boxes if you want to make a dot-plot of a sequence against itself.

Note: Only sequences opened on the desktop can be selected for display in the Dot-plot window.

The dot-plot shows one sequence along the top and one down the left side. A sliding window, whose length is set by the 'compare' value, is moved along both sequences. For each comparison, a value of 12 is given to each pair of identical residues. If the sum for the window is more than the 'threshold' value, a dot is placed in the position corresponding to the first residue in the sliding window. If the two sequences contain areas of homology, these will show up as lines running from top left to bottom right with a slope of 45°. If you have selected the 'homology' (Hm) option, residues are compared based on the Dayhoff PAM250 matrix.

As each identity in the identity mode (‘ID’) is given a value of 12, you should change the ‘threshold’ value in steps of 12.

The human eye is very sensitive to patterns, making it quite easy to locate areas of homology between protein sequences. A ‘compare’ value of 1 (score = 12) will often give a very 'noisy' picture with too many dots. Increasing the 'compare' value to 3 (or 4) and the ‘threshold’ to 20-30 will decrease the 'noise', making diagonals easier to see. The same rules apply when you view the graph in 'homology' mode, except that the ‘threshold’ can be changed in steps of 1. In 'homology' mode a good starting value for the ‘threshold’ is 8-10 times the 'compare' value.
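The identity-mode comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than GPMAW's own code: the function name `dot_plot` is hypothetical, positions are 0-based, and only the identity mode is shown (the 'homology' mode would replace the fixed score with a PAM250 lookup).

```python
IDENTITY_SCORE = 12  # value given to each identical residue pair (from the text)


def dot_plot(seq_x, seq_y, compare=3, threshold=24):
    """Return (x, y) positions where a dot would be placed.

    seq_x runs along the top, seq_y down the left side. A dot is recorded
    at the first residue of each window pair whose summed score is more
    than `threshold`.
    """
    dots = []
    for y in range(len(seq_y) - compare + 1):
        for x in range(len(seq_x) - compare + 1):
            score = sum(
                IDENTITY_SCORE
                for k in range(compare)
                if seq_x[x + k] == seq_y[y + k]
            )
            if score > threshold:
                dots.append((x, y))
    return dots


if __name__ == "__main__":
    # Arbitrary example sequences; print the plot as text, one row per
    # window start position in the 'vertical' sequence.
    top, left = "MKTAYIAKQR", "GGTAYIAKHH"
    compare = 3
    dots = set(dot_plot(top, left, compare=compare, threshold=24))
    for y in range(len(left) - compare + 1):
        print("".join("*" if (x, y) in dots else "." for x in range(len(top) - compare + 1)))
```

With 'compare' set to 3 and a threshold of 24, a dot requires all three residues in the window to match (36 > 24), which is why raising both values removes isolated dots and leaves the diagonals.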

Residue numbers are shown along the right and bottom edges of the dot-plot. As even a medium-sized protein is too large to be shown in full, scroll-bars at the right and bottom edges enable you to scroll to the area of interest.

Toolbar

Unlike the other graph windows the dot-plot window does not have a toolbar on top of the graph, but has a separate control window floating on top of all windows. The ‘Dot-plot control’ has a toolbar, a line showing the compare/threshold limits and two lines displaying the alignment of the two sequences currently pointed to by the mouse cursor.

The first two buttons toggle between normal (x1) and enlarged view (x2).

The next two buttons toggle between plotting identical (ID) and homologous (Hm) residue comparisons (the latter based on the PAM250 matrix).

Whenever you click on one of the first four buttons, you will be asked to state how many residues to compare and the threshold limit.

·       Save graphic image to disk (Windows metafile format).

·       Move control. By clicking one of the arrows, you will move the view of the dot-plot in the corresponding direction.

·       The home button moves the top left corner of the graph to fit with the top left corner of the window.

·       Close dot-plot window and control window.

The information panel shows the current values of 'compare' and 'threshold'.

The bottom two lines show the sequence alignment of the two sequences at the position pointed to by the mouse cursor. ‘X’ is the ‘horizontal’ and ‘Y’ is the ‘vertical’ sequence. Then follows the position in the sequence of the first residue listed (= the position pointed to by the mouse cursor) and finally the actual sequence. Identical residues are displayed in red while non-identical residues are shown in black.

Hint: In the normal mode (x1) it can be difficult to point exactly to the first residue in a diagonal line, but it is usually easier to get an overview of the alignment. Switch to enlarged mode (x2) to make it easier to position the mouse.

You may copy the graphic image to the clipboard by selecting Edit|Copy. To print, click the print button in the main toolbar or select File|Print.

The pop-up menu (right-click on the dot-plot window) contains four commands to move the graph: Top edge, Bottom edge, Right edge and Left edge move the dot-plot so that the edge of the graph fits the corresponding edge of the window. The Homology matrix command lets you change between the PAM250 and a Similarity matrix (for most purposes PAM250 is the best choice). The final commands, Save to disk, Print and Exit, are described above.

Dot-plot chart

The dot-plot chart is a variant of the dot-plot discussed above. The algorithm is the same, but the display does not have a 1:1 ratio to the screen pixels. This means that you can show the complete sequence in a small graph.

However, the calculations for the bit-map are much slower, and for large sequences on a slow computer it may be unrealistic to perform a thorough analysis. In this case you should find the right parameters in the dot-plot above and transfer them to the dot-plot chart afterwards.