DBindex
Copyright Lighthouse data 2002

Contents

The DBindex program is intended for use with the GPMAW program, but anyone who is using FastA formatted databases will probably find some of the functions useful.

When the program opens, you will be presented with four options that each take you to a specific part of the program. This is only to guide you in the right direction, as you can switch between any part of the program at a later time.

Start functions:

Click on a button to go to its help topic.

Quick conversion: Enables you in a few simple steps to convert from VMS format, convert Swiss-Prot, create a name index, trim name lines and create an index for BLAST. Most of these functions can also be carried out individually on the pages below.

Index database: Create search index files to enable GPMAW to search by words in the name field. Combine several files into a single file.

Convert file: Convert ASCII (text) files from VMS to DOS format. Reduce database complexity (e.g. convert from VMS and limit name lines in FastA databases to less than 255 characters).
Convert the Swiss-Prot database to a FastA formatted database.

Filter database: Create new FastA files (databases) that are filtered by amino acid composition.

Download database: Download FastA formatted databases from the world wide web.

The Index and the Filter buttons require you to open a database file before proceeding.

Click on the 'mailto' label to open your e-mail program with the address of Lighthouse data in the 'to' field. Clicking on the 'http' label will open your web browser on the GPMAW web site. From here you can download upgrades and demos of GPMAW and other related programs.

The indexing program is freeware and may be used by anyone. If you want to distribute the program, please contact Lighthouse data.

Quick conversion

You should use this option if you need to make a database (in-house or downloaded) compatible with GPMAW. At the same time you can make indices for text searching from inside GPMAW and for BLAST searching called from GPMAW. Most of the functions performed here can also be done individually on the other pages (Index database and Convert file), but on this page the operations are performed in sequence with a single command.

You perform the conversion in four steps:

1) Select database type:

Other FastA: This option is for converting all FastA formatted databases other than the big NCBI nr, EMBL nr, Swiss-Prot and TREMBL databases.

EMBL nr: This is the non-redundant database compiled at the EMBL Heidelberg site. The database can currently (Oct. 2002) be found at ftp://ftp.embl-heidelberg.de/pub/databases/nrdb/nrdb.dat. The file is huge (~600 MB).

NCBI nr: A non-redundant (or rather non-identical) database maintained by NCBI. It can be downloaded from ftp://ftp.embl-heidelberg.de/pub/databases/nrdb/nr.z. The file is compressed and needs to be unzipped before use.
The EMBL and NCBI databases are roughly equivalent, but the EMBL database is a bit easier to parse.

Swiss-Prot: This database is the most detailed and verified protein database available (see note). You can download it from ftp://ftp.ebi.ac.uk/pub/databases/swissprot/. Note that you should have both the release (currently sprot40) and the update (new_seq.dat). This database is not in FastA format and has to be converted before indexing (carried out by the quick conversion - see below). It is advantageous to combine the release and the update before indexing (you do this on the Index database page). When the conversion and indexing are done by DBindex, GPMAW will link the original full database to the FastA version and load the whole Swiss-Prot record when accessing the database (not just the FastA version).

You should also select this option when indexing the TREMBL database (GPMAW does not at present link the FastA version to the original database).

2) Select input file:

This button will display an 'Open file' dialog where you select the database to be converted/indexed.

3) Operation options:

Convert to DOS file format: This option will always be checked and cannot be de-selected. The program will run through the database and convert single carriage return characters (#13) to the DOS format of carriage return + line feed (#13#10). At the same time line length will be corrected (name <255 characters or the length set below) and sequence lines will be set at 60 characters.

Create FastA file: This option will be checked when 'Swiss-Prot' has been selected in option 1. A new file will be created based on the SP format by combining the accession number, name and species line into a single FastA name line. The sequence will be extracted and saved as 60 characters/line. This file will be used in the following indexing.
A cross-reference file .IDX will also be created to enable GPMAW to access the original Swiss-Prot records from the FastA file.

Ignore ID in name line: If checked, the indexing of the database will not try to extract the accession number from the FastA name line. This option is mostly used for internally generated databases where accession numbers are not used (e.g. name lines can be >Contig SP25 gene 235). Indexing with accession number extraction will in these cases lead to confusing name reports.

Reduce name length to xx characters: Name lines in the EMBL and NCBI nr databases are often very long. This cannot be handled by GPMAW, and the line will be truncated at 250 characters (actually at the first punctuation mark before the 250 limit). Using this option you can further reduce the name length and thus reduce the size of the file.

Create GPMAW text search indices: Creates indices that enable you to search the database by words in the name line. These usually include the protein name and the species. The accession numbers will also be extracted.

Create BLAST indices: The BLAST homology search program is delivered as part of the GPMAW package. The BLAST program needs a specially formatted version of the FastA database, which is produced by a program called 'formatdb.exe' that you have to locate before using this option (press the 'Setup BLAST' button and navigate to the location of formatdb.exe - usually this will be the \gpmaw\bin\ directory).

Update GPMAW: If you create BLAST indices you can let DBindex update GPMAW to be aware of the new database for BLAST. Note that GPMAW must not be active for this option to work.

4) Go:

Starts the indexing. Note that the operations are likely to take 5-10 minutes. When the BLAST indices are generated, a console application (DOS-like program) will be running.
Do not close DBindex until this program has finished running (you will be informed by a dialog box).

The status lines and status bar below will show the progress of the conversions.

When the conversions are done, you will be presented with a dialog box presenting the results of the conversions along with some statistics. This report can be printed for later reference.

Note: Please note that the Swiss-Prot database is not public domain. If you are associated with a commercial company you need a license, please check out http://www.genebio.com/interne.asp?pagename=19980708sib

Index database

Create index files

The Index database page will generate index files that can be used by GPMAW for searching the database by accession number or by search words taken from the title line of each entry.

Start by opening a database by pressing the button and selecting a new database.

If you press the button, you can get a view of the record structure of the database.

Press the button to create indices. The indices comprise the target file (.TRG) containing search words, the index file (.NDX) with the index into the main sequence database, and the accession number file (.ACC). The fact file (.fct) is not necessary, but contains information on the proteins in the original database.

The progress of the database indexing can be followed in the yellow information panel and the green progress bar at the bottom of the window.

Combine files

Use the 'Add database' function to combine several files, typically databases, into a single file.

Start by opening one database using the 'Load database' function.
Then press the 'Add database' button to add additional files to the first one.

Note: The additional files will be appended to the first file, which means that the final file will have the same name as the first file loaded (if you need this file intact, you have to make a copy before starting).

Convert file

Convert Swiss-Prot to FastA

The Swiss-Prot database is currently the best annotated protein sequence database available. In some cases it would be convenient to have the database available in FastA format.

Converting the database: Press the button. Select the annotated database, at present called sprot39.dat or sprot39.seq, in the 'Open file' dialog box. The Swiss-Prot database will now be converted to a FastA format database called 'swiss.seq'. Along with the 'swiss.seq' file an index file called 'swiss.idx' will be generated. This file contains the offset into sprot39.seq for each sequence entry in swiss.seq. This file is needed for retrieval of complete annotations into GPMAW (needs additional index files).

If you generate index files (based on 'swiss.seq') and use GPMAW you will be able to retrieve the full annotated entry if you keep all files (sprot40.seq, swiss.seq + the swiss index files) in the same directory.

You should also use this function for the TREMBL database as it uses the same format as Swiss-Prot. However, GPMAW cannot retrieve the TREMBL record based on the derived FastA database.

NB: Please note that Swiss-Prot is not public domain. If you are associated with a commercial company you need a license, please check out http://www.genebio.com/interne.asp?pagename=19980708sib

Converting to standard FastA format

Records in non-redundant databases like NCBI-nr and EMBL-nr are combined from a large number of databases. The name lines from each database are then often just added together, creating very long lines.
Due to its internal data formats, GPMAW is unable to handle name lines longer than 255 characters.

The function enables you to select the maximum name line length. A short name line can considerably reduce the size of the database.

In a few cases, databases occur with sequences presented in lower case. If this occurs, you can check the function. However, this will slow the conversion.

The 'Convert' function asks for a FastA formatted database (typically a non-redundant database downloaded from the Internet) and then reduces the name lines and at the same time converts it from VMS format, if necessary.

Note: The input file is converted, which means that the converted file replaces the original file.

If you have problems indexing a database, you can always try to run it through the 'Convert' filter, as several discrepancies in the FastA format are rectified in this filter.

Converting from VMS to DOS

On UNIX/VMS systems, text files are stored without the carriage return character at the end of each line. This has to be put in for the files to be properly processed under DOS/Windows systems.

Like the conversion of non-standard FastA databases above, the 'VMS to DOS' function takes an ASCII file as input. Unlike the other routines, it can convert any kind of text file, not just FastA databases.

If you use the 'Convert' function above, you do not need to use the 'VMS to DOS' function.

Filter database

Database filtering enables you to create new FastA databases based on existing ones, but having specific amino acid compositions and/or molecular masses. The new databases can then be indexed and searched like normal FastA databases.

The default is that all compositions should be between 0 and 20% and the mass between 5 and 100 kDa.

Operations:

1. Start by loading the database to be filtered.
2. Enter the name of the result database (default DBSORT.SEQ).
3. Change the composition for the relevant residues.
Select a new residue either in the residue dropdown list or by double clicking on the residue in the amino acid residue list on the left side of the window. The low and high percent can be changed either by clicking on the up/down arrows or by entering values directly in the relevant boxes. Repeat for all residues where the default 0-20 % is inappropriate. You may opt to Ignore composition for any residue.
4. Enter the minimum and maximum mass of the filtered sequences. By checking Ignore mass, all mass values will be accepted in the filtered database.
5. Click on the button to create the filtered database.

http://www.ebi.ac.uk

Lighthouse data

Contact information for Lighthouse data

Postal address:

Lighthouse data
Engvej 35
DK-5230 Odense M
Denmark

Electronic contact:

Mail: php@bmb.sdu.dk
Home: http://welcome.to/gpmaw/
Fax: +45 6619 3396

GPMAW

The GPMAW (General/Protein Mass Analysis for Windows) programme is a collection of functions for retrieving, storing and analysing protein sequences.

The program is able to retrieve sequences from the net, disk, manual input and CD-ROM. The sequences can be annotated and you can enter almost any kind of modification on individual residues. Mass calculations can be performed on sequences, peptides, fractions thereof etc. Proteins can be cleaved and sequenced (ms/ms, ladder), and HPLC elution, hydrophobicity, 2D-gel layout etc. can be predicted.
For full information on the program, please see the complete manual on http://welcome.to/gpmaw/

The program is available from Lighthouse data.
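The clean-up described under 'Convert to DOS file format' and 'Reduce name length to xx characters' can be illustrated with a short Python sketch. This is not the DBindex implementation, only a minimal reading of the help text. The function names, the set of characters treated as punctuation, and the choice to cut at the last punctuation mark found before the limit are all assumptions made for the example.

```python
import re

NAME_LIMIT = 250      # truncation limit quoted in the help text
SEQ_WIDTH = 60        # sequence line width quoted in the help text
PUNCTUATION = ".,;:"  # assumption: which characters count as punctuation


def truncate_name(name: str, limit: int = NAME_LIMIT) -> str:
    """Cut a FastA name line at the last punctuation mark before `limit`."""
    if len(name) <= limit:
        return name
    head = name[:limit]
    cut = max(head.rfind(ch) for ch in PUNCTUATION)
    # Fall back to a hard cut when no punctuation mark is found.
    return head[:cut] if cut > 0 else head


def normalize_fasta(text: str, limit: int = NAME_LIMIT) -> str:
    """Return text with CR+LF line ends, short name lines and 60-column sequences."""
    # Assumption: CR+LF, a lone CR, or a lone LF each mark one line end.
    lines = re.split(r"\r\n|\r|\n", text)
    out, seq = [], []

    def flush():
        # Re-wrap the sequence collected so far into SEQ_WIDTH columns.
        joined = "".join(seq)
        out.extend(joined[i:i + SEQ_WIDTH] for i in range(0, len(joined), SEQ_WIDTH))
        seq.clear()

    for line in lines:
        if line.startswith(">"):
            flush()
            out.append(truncate_name(line, limit))
        elif line.strip():
            seq.append(line.strip())
    flush()
    return "".join(item + "\r\n" for item in out)
```

Calling `normalize_fasta(raw_text)` on the contents of a downloaded FastA file returns the cleaned text, and passing a smaller `limit` mimics the 'Reduce name length' option. Unlike the 'Convert' function described above, the sketch does not overwrite its input.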