Parse.pm is a simple interface to D.G. Gilbert's ReadSeq program; it is not meant to be particularly elegant or efficient. The interface should be abstract enough that future versions can access other sequence conversion programs besides ReadSeq.
At this time the interface methods have not been fully thought out or implemented. Suggestions are welcome.
If ReadSeq is not on the local system, or this package is not properly configured, Seq.pm will (hopefully) realize this and not attempt to use this code.
ReadSeq is freely distributed and is available in shell archive (.shar) form via FTP from ftp.bio.indiana.edu (129.79.224.25) in the molbio/readseq directory: ftp://ftp.bio.indiana.edu/molbio/readseq/
use Parse;
The correct path to the ReadSeq executable is configured into this module during the 'perl Makefile.PL' phase of installation.
Manual edits needed in Parse.pm if auto-configuration does not happen:
- Change the value of $READSEQ_PATH so that it defines a path to the ReadSeq executable on your system.
- Uncomment the line(s) containing $OK = "Y"
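As a rough illustration of the two manual edits, the fragment below assumes that $READSEQ_PATH and $OK are ordinary package variables in Parse.pm. The path shown is a placeholder, not a default shipped with the module, and the exact declaration style in your copy may differ.

```perl
# Sketch only: assumes both names are package variables in Parse.pm.

# Hypothetical location; substitute the real path to ReadSeq on your system.
our $READSEQ_PATH = '/usr/local/bin/readseq';

# The line that the instructions above say to uncomment.
our $OK = "Y";
```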
The convert_from_raw() method has been written. The following code
will return the sequence "GAATTCGTT" as a GCG-formatted string.
$reply = &Parse::convert_from_raw(-sequence=>'GAATTCGTT',
-fmt=>'gcg');
The "fmt" named-parameter field can be set for the following formats:
  IG (or 'Stanford')
  GenBank (or 'GB')
  NBRF
  EMBL
  GCG
  Strider
  Fitch
  Fasta
  Zuker
  Phylip3.2 (use 'Phylip3')
  Phylip
  Plain (or 'Raw')
  PIR (or 'CODATA')
  MSF
  ASN.1 (use 'ASN1')
  PAUP
  Pretty
The "options" named-parameter field can be used to pass switches directly to the ReadSeq executable. It should only be used by people familiar with running ReadSeq on the command line. Use at your own risk, as this has not been fully tested.
As an example, the ReadSeq switch '-c' will cause all of the characters in the formatted sequence to be returned in lowercase.
$reply = &Parse::convert_from_raw(-sequence=>$seq_string,
-options=>'-c',
-fmt=>'gcg');
Title : _rearrange
Usage : n/a (internal function)
Function : Rearranges named parameters to requested order.
Example : &_rearrange([SEQUENCE,ID,DESC],@p);
Returns : @params - an array of parameters in the requested order.
Argument : $order : a reference to an array which describes the desired
order of the named parameters.
@param : an array of parameters, either as a list (in
which case the function simply returns the list),
or as an associative array (in which case the
function sorts the values according to @{$order}
and returns that new array).
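The behaviour described for _rearrange can be sketched roughly as follows. This is an illustrative stand-in, not the module's actual code: the name rearrange_sketch is hypothetical, and it assumes that named parameters are recognised by a leading dash on the first argument and matched without regard to case.

```perl
use strict;
use warnings;

# Illustrative stand-in for _rearrange; not the module's actual code.
sub rearrange_sketch {
    my ($order, @param) = @_;

    # A plain positional list is returned untouched.
    return @param unless @param && defined $param[0] && $param[0] =~ /^-/;

    # Treat the arguments as name => value pairs, normalising
    # "-fmt" to "FMT" so that lookups ignore case.
    my %named;
    while (@param) {
        my ($key, $value) = splice(@param, 0, 2);
        $key =~ s/^-//;
        $named{uc $key} = $value;
    }

    # Emit the values in the requested order; absent names yield undef.
    return map { $named{uc $_} } @$order;
}

my ($seq, $fmt, $options) = rearrange_sketch(
    [qw(SEQUENCE FMT OPTIONS)],
    -fmt => 'gcg', -sequence => 'GAATTCGTT',
);
print "$seq -> $fmt\n";    # prints "GAATTCGTT -> gcg"
```

One limitation of this heuristic: a positional list whose first value happens to begin with a dash would be mistaken for named parameters.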
Title : _write_tmp_file
Usage : n/a (internal function)
Function : Writes a temporary file to disk. Uses
: the POSIX tmpnam() call to get path &
: filename. Should be more portable than
: just writing to /tmp.
:
Example : &_write_tmp_file("$formatted_sequence");
Returns : string containing the temp file path
Argument : string that is to be written to disk
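For readers on a current Perl, where POSIX::tmpnam() is no longer available, the same idea can be sketched with the core File::Temp module instead. This is a swapped-in technique, not what Parse.pm itself does; the function name and the 'parseXXXXXX' template are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Illustrative stand-in for _write_tmp_file; uses File::Temp, not tmpnam().
sub write_tmp_file_sketch {
    my ($content) = @_;

    # tempfile() creates and opens the file in one step in the system
    # temp directory; UNLINK => 0 leaves cleanup to the caller.
    my ($fh, $path) = tempfile('parseXXXXXX', TMPDIR => 1, UNLINK => 0);
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";

    return $path;
}

my $path = write_tmp_file_sketch(">seq1\nGAATTCGTT\n");
print "wrote $path\n";
unlink $path;    # the caller removes the file when done
```

Creating and opening the file in a single call avoids the window between choosing a name and opening it that a tmpnam()-style approach leaves open.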
Title : version
Usage : &Parse::version;
Function : Prints current package version
Example : &Parse::version;
Returns : none
Argument : none
:
Title : convert_from_raw()
Usage : print &Parse::convert_from_raw(-sequence=>$raw_seq,
: -fmt=>'asn1');
:
: $reply = &Parse::convert_from_raw(-sequence=>'GAATTCGTT',
: -options=>'-c',
: -fmt=>'gcg');
:
Function : ReadSeq does not function well when called upon
: to read or convert "raw" or unformatted sequence
: strings or files. This code will take a given
: raw sequence and manipulate it into FASTA
: format before invoking ReadSeq.
:
: The following named parameters may be used as
: arguments:
:
: -sequence=> Sequence string.
: -fmt=> Format sequence will be converted to.
: -options=> String containing command-line
: switches for ReadSeq. Passed
: directly.
:
Example : see usage
Returns : Formatted sequence string
Argument : named parameters, see function
:
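The FASTA-wrapping step that convert_from_raw() performs before invoking ReadSeq might look something like the sketch below. The helper name, the default identifier 'raw_sequence', the whitespace and digit stripping, and the 60-column line width are all assumptions made for illustration; the module may do this differently.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a raw sequence as a FASTA record.
sub raw_to_fasta_sketch {
    my ($raw, $id) = @_;
    $id = 'raw_sequence' unless defined $id;    # placeholder identifier

    # Drop whitespace and digits that often ride along with raw text.
    (my $seq = $raw) =~ s/[\s\d]+//g;

    # Break the residues into 60-character lines (a common convention).
    my @lines = unpack '(A60)*', $seq;

    # Header line, sequence lines, and a trailing newline.
    return join "\n", ">$id", @lines, '';
}

print raw_to_fasta_sketch('GAATTCGTT');
# prints:
# >raw_sequence
# GAATTCGTT
```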
Title : convert
:
Usage : print &Parse::convert(-sequence=>$raw_seq,
: -fmt=>'asn1');
:
: $reply = &Parse::convert(-sequence=>'GAATTCGTT',
: -options=>'-c',
: -fmt=>'gcg');
:
: $reply = &Parse::convert(-location=>'/tmp/a.seq',
: -fmt=>'raw');
:
Note : ReadSeq does not function well when called upon
: to read or convert "raw" or unformatted sequence
: strings or files. User beware.
:
Function : Will read/parse a given sequence string *OR* a given
: sequence file.
:
: If a sequence string AND a sequence file path are
: both passed in, the file path will be used with no
: complaint.
:
: The following named parameters may be used as
: arguments:
:
: -sequence=> Sequence string.
: -location=> Sequence file path.
: -fmt=> Format sequence will be converted to.
: -options=> String containing command-line
: switches for ReadSeq. Passed
: directly.
:
Example : see usage
Returns : Formatted sequence string
Argument : named parameters, see function
:
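The precedence rule documented for convert (a file path wins over a sequence string, with no complaint) can be illustrated with a small sketch. The helper name and its return values are hypothetical, and dying when neither argument is supplied is an assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper showing the documented precedence only.
sub choose_input_sketch {
    my (%arg) = @_;

    # A file path, when present, silently overrides a sequence string.
    return (file   => $arg{-location}) if defined $arg{-location};
    return (string => $arg{-sequence}) if defined $arg{-sequence};

    # Assumption: treat a call with neither argument as an error.
    die "Need -sequence or -location\n";
}

my ($kind, $value) = choose_input_sketch(
    -sequence => 'GAATTCGTT',
    -location => '/tmp/a.seq',
);
print "$kind: $value\n";    # prints "file: /tmp/a.seq"
```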
Seq.pm - The biosequence object