PRABI-Doua: ACNUC remote access protocol

DESCRIPTION OF THE SOCKET COMMUNICATION PROTOCOL WITH ACNUC

Sorted list of functions : acnucopen, acnucclose, alllistranks, bcount, bit1, bit0, btest, clientid, copylist, countfreelists, countsubseqs, crelistfromclientdata, fcode, extractseqs, followshrt2, getannots, getattributes, getemptylist, getliststate, getlistrank, gfrag, ghelp, iknum, isenum, knowndbs, loadtaxonomy, modifylist, next_annots, nexteltinlist, nextmatchkey, prep_getannots, prep_requete, prettyseq, proc_query, proc_requete, quit, read_annots, readacc, readaut, readbib, readext, readfirstrec, readkey, readlng, readloc, readshrt, readsmj, readspec, readsub, releaselist, residuecount, savelist, selseqs1node, seq_to_annots, setlistname, setliststate, zerolist, zlibloadtaxonomy

Functions by topic
start	query the database	get annotations & sequences	manage lists	special purpose	low level
knowndbs	proc_query	isenum, getattributes	alllistranks, nexteltinlist	extractseqs	readacc, readaut, readbib
acnucopen	modifylist	read_annots	bcount, bit1, bit0, btest	(zlib)loadtaxonomy	readkey, readlng, readext
acnucclose	fcode	next_annots	getemptylist, getlistrank	nextmatchkey	readloc, readshrt, followshrt2
clientid	iknum	gfrag	zerolist, releaselist, countfreelists, setlistname, setliststate	getannots, prep_getannots	readsmj, readspec, readsub, readfirstrec
	selseqs1node	prettyseq	getliststate, copylist	ghelp, residuecount	bit1, bit0, btest, bcount
quit		seq_to_annots	savelist, crelistfromclientdata		countsubseqs

Introduction

Remote access to acnuc databases works by opening a socket on port 5558 of pbil.univ-lyon1.fr and communicating over this socket according to the protocol described here.

Example

client opens a socket to server, typically port 5558 of pbil.univ-lyon1.fr
client receives: OK acnuc socket started\n on socket from server (\n indicates end-of-line)
client sends to server on socket: clientid&id="client_name"\n and receives code=0\n
client sends to server on socket: acnucopen&db=embl\n
client receives from server on socket: code=0&type=EMBL&totseqs=31722973&totspecs=224976&totkeys=1148875\n
client sends to server on socket: gfrag&name=J01714&start=1&length=50\n
client receives from server on socket: length=50&aacctttccggtcgcggagataaagacatcttcaccgttcacgatatttt\n
send / receive pairs are repeated
client sends to server on socket: quit\n, receives OK acnuc socket closed\n and closes the socket.
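
The exchange above can be sketched in Python as follows. This is a minimal illustration, not a reference client: the function names run_session and read_line are hypothetical, the replies are returned as raw text, and no status checking is done. The socket is passed in already connected, so the same code can be exercised against a test double.

```python
import socket

def read_line(sock_file):
    """Read one newline-terminated reply line and strip the end-of-line."""
    line = sock_file.readline()
    if not line:
        raise ConnectionError("server closed the socket")
    return line.rstrip("\n")

def run_session(sock, client_name, db, seq_name):
    """Replay the example exchange on an already connected socket.

    Returns the list of raw reply lines, starting with the greeting.
    """
    f = sock.makefile("rw", encoding="ascii", newline="\n")
    transcript = [read_line(f)]          # greeting sent by the server
    commands = (
        f'clientid&id="{client_name}"',
        f"acnucopen&db={db}",
        f"gfrag&name={seq_name}&start=1&length=50",
        "quit",
    )
    for command in commands:
        f.write(command + "\n")          # one command per line
        f.flush()
        transcript.append(read_line(f))  # one reply line per command here
    return transcript

# Possible use against the server named in the text (requires network access):
#   sock = socket.create_connection(("pbil.univ-lyon1.fr", 5558))
#   for line in run_session(sock, "my_program", "embl", "J01714"):
#       print(line)
#   sock.close()
```

Only commands whose reply fits on one line are used here; multi-line replies need a reader that knows the terminator of the command concerned.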

Syntax

==> command& args <newline> command + arguments + end-of-line sent to server
<== string <newline> one or more lines of text received from server
[ arg1 | arg2 ] alternative arguments;
{ arg } optional argument
All arguments (e.g. name="Homo sapiens") can be bracketed by double quotes when useful, but should have internal " escaped with \ (\"); this escape rule is expected from client and is applied by server.

Status codes

The status of the reply to a command is generally returned as
<== code=stat_number{ & optional arguments }
Absence of "code=stat_number" in the reply to a command implies success.
stat_number values are

0: no error
> 0: some kind of error, among which
1: unrecognized command
2: incorrect or missing arguments for this command
3 or more: command-specific error
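
A client will usually want to turn a reply line into key/value pairs and check the status in one place. The sketch below is one way to do that under the rules stated above; split_fields, parse_reply, check_status and AcnucError are hypothetical names, and fields without an "=" sign (such as the sequence text in a gfrag reply) are simply ignored by this parser.

```python
class AcnucError(Exception):
    """Raised when a reply carries a non-zero status code."""

GENERIC_MESSAGES = {1: "unrecognized command",
                    2: "incorrect or missing arguments for this command"}

def split_fields(line):
    """Split a reply on '&', ignoring '&' inside double-quoted values."""
    fields, current, in_quotes, i = [], [], False, 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and in_quotes and line[i + 1:i + 2] == '"':
            current.append('"')          # \" inside quotes is a literal quote
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes    # the quotes themselves are dropped
        elif ch == "&" and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields

def parse_reply(line):
    """Return the key=value fields of a reply line as a dict of strings."""
    reply = {}
    for field in split_fields(line.rstrip("\n")):
        key, sep, value = field.partition("=")
        if sep:
            reply[key] = value
    return reply

def check_status(reply):
    """Raise AcnucError unless the status is 0; a missing code means success."""
    code = int(reply.get("code", 0))
    if code != 0:
        raise AcnucError(code, GENERIC_MESSAGES.get(code, "command-specific error"))
    return reply
```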

Functions

==> acnucopen&db=xxxxx{&maxlists=xxxxx}
<== code=2	missing db= argument
code=3	if no database with that name is known by the server
code=4	if database is currently unavailable
code=5	if a database is currently opened and has not been closed
code=6&challenge=xx	if the database requires password authorization, server sends a challenge to client.
	==> reply=xx	authorization data sent by client to server that must be the MD5 digest of
		the string "challenge:dbname:md5-pw" where md5-pw is the MD5 digest of the password.
	<== code=6	when authorization failed
code=0&type=[GENBANK|EMBL|SWISSPROT|NBRF]&totseqs=xx&totspecs=xx&totkeys=xx
&ACC_LENGTH=xx&L_MNEMO=xx&WIDTH_KW=xx&WIDTH_SP=xx&WIDTH_SMJ=xx&WIDTH_AUT=xx&WIDTH_BIB=xx&
lrtxt=xx&SUBINLNG=xx&VALINSHRT2=xx{&version="xxx"}
Initiates remote access to an acnuc database.
The db= argument identifies the target database by a logical name 
that can be any dbname returned by the knowndbs command, or taken from the 1st column of this table,
or the name of a database requiring password authorization.
The optional maxlists= argument indicates the maximum number of lists the client wishes to be able to create in the server.
The countfreelists command can be used afterwards to obtain the actual number of free lists available to the client.

type : the type of database that was opened.
totseqs, totspecs, totkeys : total number of seqs, species, keywords in opened database.
ACC_LENGTH, L_MNEMO, WIDTH_KW, WIDTH_SP, WIDTH_SMJ, WIDTH_AUT, WIDTH_BIB, lrtxt, SUBINLNG: max lengths of 
record keys in database
VALINSHRT2 : value of the VALINSHRT2 parameter of the opened database
version : if present, a string containing database version information.
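
For password-protected databases, the reply to the challenge could be computed as sketched below. The text above does not say how the digests are encoded; this sketch assumes lowercase hexadecimal digests and UTF-8 encoded strings, and the function names are hypothetical.

```python
import hashlib

def md5_hex(text):
    """MD5 digest of a string, as lowercase hexadecimal (assumed encoding)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def challenge_reply(challenge, dbname, password):
    """Build the 'reply=xx' line answering a code=6 challenge.

    The digest is taken over "challenge:dbname:md5-pw", where md5-pw is
    the MD5 digest of the password.
    """
    md5_pw = md5_hex(password)
    return "reply=" + md5_hex(f"{challenge}:{dbname}:{md5_pw}") + "\n"
```

The password itself is never sent on the socket; only the digest that combines it with the server's challenge is.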

==> clientid&id="xxxxx"
<==  code=0
Sends the server an identification of the client, typically a program name.


==> acnucclose
<==  code=xx
To close the currently opened acnuc db.
code : 0 if OK 
       3 if no database was opened by the server


==> quit
<==  OK acnuc socket closed
To close the socket and stop communication over it.


==>   gfrag&[number=xx|name=xx]&start=xx&length=xx
<==  length=xx&....sequence...
Get length characters from sequence identified by name or by number 
starting from position start (counted from 1).
Reply gives the length read (may be shorter than asked for) and then the characters; 
length is 0 if any error occurs.
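
Because the reply may be shorter than requested, a long sequence is most safely read in a loop that advances by the length actually returned. In the sketch below, send is a hypothetical callable that writes one command line to the socket and returns the reply line; the chunk size of 1000 is an arbitrary choice.

```python
def parse_gfrag(line):
    """Split a 'length=xx&sequence' reply into (length, sequence)."""
    head, _, seq = line.rstrip("\n").partition("&")
    length = int(head.partition("=")[2])
    return length, seq[:length]

def fetch_sequence(send, name, total, chunk=1000):
    """Fetch up to `total` residues of sequence `name` in chunks.

    Stops early if the server returns length=0 (error or end of data).
    """
    pieces, start = [], 1                # positions are counted from 1
    while start <= total:
        want = min(chunk, total - start + 1)
        reply = send(f"gfrag&name={name}&start={start}&length={want}\n")
        length, seq = parse_gfrag(reply)
        if length == 0:
            break
        pieces.append(seq)
        start += length                  # advance by what was really read
    return "".join(pieces)
```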


==>   read_annots&[number=xx|name=xx|offset=xx&div=xx]{&nl=xx}
<==  nl=xx&...1 or several lines...
Reads nl (1 by default) consecutive lines of annotations identified
by offset and div(ision) or by seq number or by seq name.
Reading of lines stops when nl lines have been transmitted or at the last annotation 
line of the (sub-)sequence (SQ or ORIGIN line; end of feature table entry for a subsequence).
Reply gives the number of lines sent and then these lines.


==>   next_annots{&nl=xx}
<==  nl=xx&offset=xx&...1 or several lines...
Reads nl (1 by default) consecutive lines of annotations following
the previously read annotation lines.
Reading of lines stops when nl lines have been transmitted or at the last annotation 
line of the sequence (SQ or ORIGIN line).
Reply gives the number and data read and the offset of the first line read.

==>   seq_to_annots&[number=xx|name=xx]
<==  code=xx&offset=xx&div=xx
Returns the information useful for reading annotations (offset + div) for
a sequence identified by name or by number.
Reply has code != 0 iff error.


==>   countfreelists
<==  code=xx&free=xx&annotlines="xx"
Returns the number of free lists available.
code: 0 iff OK
free: number of free lists available
annotlines: list of names of annotation lines in the opened database separated by |


==>   prep_requete
This is another, equivalent name for the countfreelists function.


==>   proc_query&query="......."&name="xx"
	in case of error :
<==  code=xx&message="xx"
	in case of success :
<==  code=0&lrank=xx&count=xx&type=[SQ|KW|SP]{&locus=[T|F]}
Processes an acnuc query and puts result in list with specified name, overwriting
the list if one with same name already exists.
The query must follow the ACNUC query language.
Reply gives code = 0 if OK and then gives lrank, the rank of the resulting list,
the count of elements in list, the type of the list (SQ, sequences; SP, species;  
KW, keywords), and for sequence lists, whether the list contains only 
parent sequences (locus=T).
In case of error, code is != 0 and message is a text describing error.


==>   proc_requete&query="......."&name="xx"
This is another, equivalent name for the proc_query function.


==>   nexteltinlist&lrank=xx&first=xx{&count=xx}
<==  next=xx&name="xx"{&length=xx&offset=xx&div=xx&frame=xx&ncbigc=xx}
Finds the next element(s) in list identified by rank (lrank argument) after the elt given in 
argument first (possibly several elements on successive lines if &count=xx is used;
set first=1 to start running through list).
Reply returns in next the element value, or 0 if no more elements exist; it 
gives also the name of this element that can be a sequence, a species or a 
keyword, and, for a sequence, its length, division and offset for annotations,
reading frame and ncbi genetic code id.
If &count=xx is used, at most count lines are returned (next=0 indicates < count lines);
if not used, exactly one line is returned.


==>   getliststate&lrank=xx
<==  code=xx&type=[SQ|KW|SP]&name="xx"&count=xx{&locus=[T|F]}
Asks for information about the list of specified rank.
Reply gives the type of list, its name, the number of elements it contains,
and, for sequence lists, says whether the list contains only parent seqs (locus=T).
Reply gives code != 0 if error.


==>   setliststate&lrank=xx{&locus=[T|F]}{&type=[SQ|KW|SP]}
<==  code=xx
To set the type and/or the "locus value" of the specified list.
Reply gives code != 0 if error.

==>   getlistrank&name="xx"
<==  lrank=xx
Returns the rank of the list, or 0 if no list with that name exists.

==>   setlistname&lrank=xx&name="xx"
<==  code=xx
Sets the name of a list identified by its rank.
Returned code : 0 if OK, 
                3 if another list with that name already existed and was deleted
                4 if no list of that rank exists


==>   getemptylist&name="xx"
<==  code=xx&lrank=xx
Creates a new, empty list, sets its name, and returns its rank and a status code.
code : 0 OK
       3 if name is already used for another list with given rank (no change done)
       4 no empty list exists (no lrank value returned)


==>   releaselist&lrank=xx
<==  code=xx
Releases the resources associated with the list of specified rank, which no longer exists afterwards.
Code != 0 indicates error.


==>   residuecount&lrank=xx
<==  code=xx&count=xx
Computes the total number of residues (nucleotides or amino acids) in all sequences of the list of specified rank.
Code != 0 indicates error.


==>   selseqs1node&num=xx&kind=[SP|HO|KW]
<==  lrank=xx{&count=xx}
Creates a list of seqs attached to species (if kind=SP), host (HO) or keyword (KW) 
of number num.
Returns rank of created list (0 if error) and count of seqs therein.


==>   alllistranks
<==  count=xx&n1,n2,...
Returns the count of existing lists and all their ranks separated by commas.


==>  bcount&lrank=xx
<==  code=xx&count=xx
Counts the number of elements in list given by lrank.
Code != 0 indicates error.



==>  bit1&lrank=xx&num=xx
<==  code=xx
Adds element num to list of rank lrank. 
Code != 0 indicates error.


==>  bit0&lrank=xx&num=xx
<==  code=xx
Removes element num from list of rank lrank.
Code != 0 indicates error.


==>  btest&lrank=xx&num=xx
<==  code=0&[on|off]
Tests for presence of element num in list of rank lrank: on means present, off means absent.
Code != 0 indicates error.


==>   copylist&lfrom=xx&lto=xx
<==  code=xx
Copies list of rank lfrom to list of rank lto that must have been previously allocated by e.g., getemptylist or proc_query.
Code != 0 indicates error.


==>   zerolist&lrank=xx
<==  code=xx
Empties the list of specified rank that must have been previously allocated by e.g., getemptylist or proc_query.
Code != 0 indicates error.


==>   countsubseqs&lrank=xx
<==  code=xx&count=xx
Returns the number of subsequences in list of rank lrank.
Code != 0 indicates error.


==>   isenum&[name=xx|access=xx]
<==  number=xx{&length=xx&frame=xx&gencode=xx&ncbigc=xx}{&otheraccessmatches}
Finds the acnuc number of a sequence from its name (name= argument) or its accession number (access= argument).
The name= and access= arguments are case-insensitive.
Reply gives number (or 0 if does not exist), length, reading frame (0, 1, or 2), and 
genetic code ids of the corresponding sequence (gencode= gives acnuc's genetic code, 0 means universal; 
ncbigc= gives ncbi's genetic code id, 1 means universal).
When &otheraccessmatches appears in reply, it means that several sequences are attached to the given accession no., 
 and that only the acnuc number of the first attached sequence is given in the number= argument.

==>   iknum&name="xx"&type=[SP|KW]
<==  num=xx
Finds the acnuc number of a species (type=SP) or a keyword (type=KW).
Returns 0 if does not exist.

==>   fcode&name="xx"&type=[AUT|BIB|ACC|SMJ|SUB]
<==  num=xx
Finds the acnuc number of an author (type=AUT), a reference (BIB), an accession number (ACC),
a record of the SMJYT index file (SMJ) or a sequence (type=SUB).
Returns 0 if does not exist.


==>   readsub&num=xx
<==  code=xx&name="xx"&length=xx&type=xx&is_sub=xx&toext=xx&plkey=xx&frame=xx&genet=xx&ncbigc=xx
Returns data for sequence of number num: name, length, the number of the sequence type; 
the returned value of is_sub is 0 for a subsequence, the rank 
of the corresponding LOCUS record for a parent seq; 
toext is the (positive) value of the pext field.
plkey is the start of the short list of attached keywords.
frame is the reading frame (0, 1, or 2; meaningful only for CDSs).
genet is the acnuc genetic code id (0 means universal genetic code).
ncbigc is the ncbi genetic code id (1 means universal genetic code).
Code != 0 indicates error.


==>   readloc&num=xx
<==  code=xx&sub=xx&pnuc=xx&pinf=xx&spec=xx&host=xx&plref=xx&molec=xx&placc=xx&org=xx&date=xx
Returns data from the record of rank num of file LOCUS.
Code != 0 indicates error.


==>   readspec&num=xx
<==  code=xx&name="xx"&plsub=xx&desc=xx&syno=xx&host=xx{&libel="xx"}
Returns data from the record of rank num of file SPECIES including label if not empty.
Code != 0 indicates error.


==>   readkey&num=xx
<==  code=xx&name="xx"&plsub=xx&desc=xx&syno=xx{&libel="xx"}
Returns data from the record of rank num of file KEYWORDS including label if not empty.
Code != 0 indicates error.


==>   readsmj&num=xx&nl=xx
<==  code=xx&nl=xx
     recnum=xx&name="xx"&plong=xx{&libel="xx"}
  ... a series of nl lines like that ...
Returns data from nl consecutive records starting from rank num of file SMJYT including label 
if not empty.
Code != 0 indicates error.


==>   readext&num=xx
<==  code=xx&mere=xx&debut=xx&fin=xx&next=xx
Returns data from the record of rank num of file EXTRACT.
Code != 0 indicates error.


==>   readlng&num=xx
<==  code=xx&n=xx&xx,xx,...{&next=xx}
Reads part of a long list starting at record number num.
Returns the number of elements read, then these elements separated by commas.
The list may not be fully read by one call; if it is not finished, the next value gives 
the information needed to continue reading the rest of the chain.
Code != 0 indicates error.


==>   readacc&num=xx
<==  code=xx&name="xx"&plsub=xx
Returns data from the record of rank num of file ACCESS.
Code != 0 indicates error.


==>   readaut&num=xx
<==  code=xx&name="xx"&plref=xx
Returns data from the record of rank num of file AUTHOR.
Code != 0 indicates error.


==>   readbib&num=xx
<==  code=xx&name="xx"&j=xx&y=xx{&jname="xx"}{&yname="xx"}&plsub=xx&plaut=xx
Returns data from the record of rank num of file BIBLIO.
Code != 0 indicates error.


==>  readshrt&num=xx{&max=xx}
<==  code=xx&n=xx&xx,xx,...
Returns up to max pairs (default 50) [val,next] of the short list starting at 
record number num, n says how many, then these pairs.
Code != 0 indicates error.


==>  followshrt2&num=xx&kind=name{&rank=xx&max=xx}
<==  code=xx&num=xx&rank=xx&n=xx&xx,...
Asks for up to max (default 500) values of the short list starting at record number num and offset rank (default 0).
kind indicates what kind of short list is considered. 
It is one among: sub_of_bib, spc_of_loc, bib_of_loc, aut_of_bib, bib_of_aut, sub_of_acc, key_of_sub, acc_of_loc.
Return text: n says how many values, then these values. num and rank are set to ask for further values of the same list,
num=0 indicates there are no more values in the list.
Code != 0 indicates error.
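
Reading a whole short list therefore means repeating the command with the returned num and rank until num=0. A minimal sketch of that loop follows; send is a hypothetical callable that sends one command line and returns the reply line, and the values are assumed to be integers.

```python
def parse_followshrt2(line):
    """Return (dict of key=value fields, list of integer values)."""
    fields = line.rstrip("\n").split("&")
    reply = dict(f.split("=", 1) for f in fields if "=" in f)
    values = [int(v) for f in fields if "=" not in f
              for v in f.split(",") if v]
    return reply, values

def read_short_list(send, num, kind, max_values=500):
    """Collect all values of the short list starting at record `num`."""
    values, rank = [], 0                 # rank 0 is the documented default
    while num != 0:                      # num=0 means no more values
        line = send(f"followshrt2&num={num}&kind={kind}"
                    f"&rank={rank}&max={max_values}\n")
        reply, chunk = parse_followshrt2(line)
        if int(reply.get("code", 0)) != 0:
            raise RuntimeError(f"followshrt2 failed with code {reply['code']}")
        values.extend(chunk)
        num, rank = int(reply["num"]), int(reply["rank"])
    return values
```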


==>   readfirstrec&type=[AUT|BIB|ACC|SMJ|SUB|LOC|KEY|SPEC|SHRT|LNG|EXT|TXT]
<==  code=xx&count=xx
Returns the record count of the specified ACNUC index file.
Code != 0 indicates error.


==>   ghelp&file=xx&item=xx
<==  nl=xx&...1 or several lines...
Reads one item of information from specified help file.
File can be HELP or HELP_WIN; item is the name of the desired help item.
Reply :	nl is 0 if there was any problem; otherwise it gives the number of help lines returned.


==>  nextmatchkey&num=xx{&pattern="xx"}{&count=xx}
With the &count=xx argument:
<==  code=0&count=xx
num=xx&name="xx"              count such lines
Without the &count=xx argument:
<==  code=xx&num=xx{&name="xx"}
Pattern matching in index file KEYWORDS.
Returns the number and name of the next count keywords (one if the &count= argument is not used) matching pattern 
after the given number.
Call it the first time with num=2 and a pattern, then call it again without specifying the pattern 
until it returns num=0.
A pattern is a character string where @ matches any string (e.g. @polymerase@).
Error code: 3: Not enough memory.


==>  loadtaxonomy or zlibloadtaxonomy
<==  code=0&total=xx\n
rank&parent&count&"...name..."{&"...label..."}\n        one such line for each taxon
loadtaxonomy END.\n
Sends to client the complete sequence taxonomy of the ACNUC database, compressed using zlib if 
zlibloadtaxonomy command is used.
This command can be interrupted by client sending the escape ASCII character to server on socket; 
client should keep reading socket until "loadtaxonomy END.\n" arrives.

total is (slightly) larger than the number of lines that follow
rank: the rank of a taxon (the root has rank 2)
parent: the rank of its parent (the root's parent is 0, arbitrarily)
        synonyms are indicated by a < 0 value of parent
count: the number of seqs directly attached to this taxon
name: the taxon name
label: optionally, a taxon label


==>  crelistfromclientdata{&type=[SQ|AC|SP|KW]}&nl=xx
0 or more lines of data sent by client to server
<==  code=0&name="xx"&lrank=xx&count=xx\n
To create on server a bitlist from data lines sent by client. Each such line contains either a sequence name,
an accession number, a taxon name, or a keyword.
type: the type of data sent to server (SQ=seqs, AC=acc nos, SP=species, KW=keywords)
      SQ by default
nl: announces the number of data lines that follow (0 is OK)
code: 0 iff OK
      3 no list creation is possible
      4 EOF while reading the nl lines from client
name: name of bitlist created from this data
lrank: rank of this bitlist
count: count of elements in bitlist
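
The request consists of the command line followed by exactly nl data lines, which can be assembled as in this sketch. The function name is hypothetical, and the type= argument is always written out even though it is optional.

```python
VALID_TYPES = {"SQ", "AC", "SP", "KW"}

def build_crelist_request(names, data_type="SQ"):
    """Return the full text to send: command line plus one line per name."""
    if data_type not in VALID_TYPES:
        raise ValueError(f"unknown type {data_type!r}")
    names = list(names)
    lines = [f"crelistfromclientdata&type={data_type}&nl={len(names)}"]
    lines.extend(names)                  # nl must match the lines that follow
    return "".join(line + "\n" for line in lines)
```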


==>  savelist&lrank=xx{&type=[N|A]}
<==  code=0\n
list element names or acc nos on successive lines
savelist END.\n
To obtain names of all elements of a bit list sent on socket on successive lines;
for sequence lists, option &type=A, will give accession numbers instead of seq names;
end of series of lines is when savelist END.\n appears
lrank : rank of bitlist
type: A gives accession numbers, N (default) gives seq names; useful for seq lists only
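
A reader for this reply has to consume lines until the terminator arrives. The sketch below works on any iterable of lines (for instance a file object made from the socket); the function name is hypothetical, and a first line other than code=0 is treated as an error.

```python
def read_savelist(lines):
    """Return the element names sent between 'code=0' and 'savelist END.'."""
    it = iter(lines)
    status = next(it).rstrip("\n")
    if status != "code=0":
        raise RuntimeError(f"savelist failed: {status}")
    names = []
    for line in it:
        line = line.rstrip("\n")
        if line == "savelist END.":      # terminator line, not an element
            return names
        names.append(line)
    raise ConnectionError("stream ended before 'savelist END.'")
```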


==>  modifylist&lrank=..&type=[length|date|scan]&operation=".."
<== code=0&lrank=..&name=".."&count=..{&processed=..}
code=3   if impossible to create a new list
code=2   if incorrect syntax, possibly in operation
lrank:   (input) rank of bitlist to be modified
         (output) rank of created bitlist containing result of modify operation
type:    indicates what kind of modification is to be performed.
operation: for length, as in  "> 10000"    or    "< 500"
         for date, as in   "> 1/jul/2001"   or   "< 30/AUG/98"
         for scan, specify the string to be searched for
                   prep_getannots must be used before using modifylist&type=scan
                   the client can interrupt the scan operation by sending the escape character on the socket
name: name of created bitlist
count: number of elements in created bitlist
processed: only for scan operation, number of list elements scanned until completion or interruption


==>  knowndbs{&tag=xx}
<== nl=.. \n
dbname | on/off | db description \n      nl such lines
Returns, for each database known by the server, its name (a valid value for the db= argument 
of the acnucopen command), availability (off means temporarily unavailable), and description.
When the optional tag= argument is used, only databases tagged with the given string are listed;
without this argument, only untagged databases are listed.
The tag argument thus makes it possible to identify series of special purpose (tagged) databases, 
in addition to default (untagged) ones. The full list of untagged and tagged databases is here.
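
The knowndbs reply can be parsed as sketched below, splitting each line on the first two | separators so that a description containing | stays intact. The function name and the dictionary keys are illustrative choices.

```python
def parse_knowndbs(lines):
    """Parse 'nl=xx' followed by nl lines 'dbname | on/off | description'."""
    it = iter(lines)
    nl = int(next(it).strip().partition("=")[2])
    databases = []
    for _ in range(nl):
        name, state, description = (
            part.strip() for part in next(it).split("|", 2))
        databases.append({
            "name": name,                # valid value for acnucopen's db=
            "available": state == "on",  # off: temporarily unavailable
            "description": description,
        })
    return databases
```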


==>  prep_getannots&nl=xx
key_name{|subkey_name} \n       nl such lines sent to server
....     \n
<== code=0 \n
This command must be used before using the getannots or the modifylist&type=scan commands to specify 
what sorts of annotation records will be returned by the getannots command or will be scanned.
nl: announces the number of key names that follow.
key_name: an annotation key name. 
subkey_name: optionally, an annotation sub-item name (e.g., CDS when key_name = FT)
For the EMBL/SWISSPROT format, keys are: ALL, AC, PR, DT, GN, KW, OS, OC, OG, OX, OH, 
RN, RC, RP, RX, RA, RG, RT, RL, DR, AH, AS, CC, PE, FH, FT, CO, SQ, SEQ.
For GenBank: ALL, ACCESSION, VERSION, PROJECT, KEYWORDS, SOURCE, ORGANISM, REFERENCE, AUTHORS, 
CONSRTM, TITLE, JOURNAL, PUBMED, REMARK, COMMENT, FEATURES, ORIGIN, SEQUENCE. 
For FT(embl,swissprot) and FEATURES(GenBank), one or more specific feature keys can be specified
using uppercase-only lines such as
FEATURES|CDS
FT|TRNA
Keys ALL and SEQ/SEQUENCE stand for all annotation and sequence lines, respectively.
For the scan operation, key ALL stands for the DE/DEFINITION lines, 
and SEQ/SEQUENCE cannot be used (annotations but not sequence are scanned).
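
A prep_getannots request could be assembled as follows. The function name is hypothetical; keys are given either as a plain name or as a (key, subkey) pair, and everything is uppercased to match the feature-key rule above.

```python
def build_prep_getannots(keys):
    """Return the command line plus one 'key' or 'key|subkey' line per entry."""
    lines = []
    for key in keys:
        if isinstance(key, tuple):       # (key_name, subkey_name)
            key = "|".join(key)
        lines.append(key.upper())
    header = f"prep_getannots&nl={len(lines)}\n"
    return header + "".join(line + "\n" for line in lines)

# Example for an EMBL-format database: accession lines and CDS features only.
#   build_prep_getannots(["AC", ("FT", "CDS")])
```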


==>  getannots&[number=xx|lrank=xx]
<== code=0 \n
plain text line
...                   (a series of consecutive lines)
\\\                   (a line with exactly three \ announces the end of the line series)
To get the annotations of the sequence of rank number or of all sequences belonging to the sequence list 
with rank given in the lrank= argument.
Use prep_getannots beforehand to specify what types of annotation lines are desired.
Annotation lines from ID+DE (EMBL/SWISSPROT) or LOCUS+DEFINITION (GenBank) are always transferred.
For a subsequence, the same information is always transferred, whatever line kinds are asked.
Use the lrank= option with caution with large sequence lists because the command cannot be interrupted.


==>  prettyseq&num=xx{&bpl=xx}{&translate=[T|F]}
<==   code=0\n
line1
line2
...
prettyseq END.\n
To get a text representation of sequence of rank num and of its subsequences,
with bpl bases per line (default = 60), and with optional translation of 
protein-coding subsequences.


==>  extractseqs&[lrank=xx|seqnum=xx]&format=xx&operation=xx
      {&feature="xx"}{&bounds="xx"}{&minbounds="xx"}{&zlib=[T|F]}\n
            command output for format != coordinates
<==   code=xx{&message="xx"}\n
line1  \n
line2  \n
...    \n
<esc>count=xx\n        (one such line at the end of output related to each member of list)
... all of this for each sequence of the list if lrank=xx was used ...
extractseqs END.\n
            command output for format=coordinates
<==   code=xx\n
rank=xx&start=xx&end=xx| ...  \n        (rank=parent seq rank in DB; start,end=coordinates in this seq)
... for each series of coordinates ...
extractseqs END.\n

To extract a list of sequences (lrank argument) or a single sequence (seqnum argument)
using different output formats and types of extraction.
All formats except "coordinates" extract sequence data.
Format "coordinates" extracts coordinate data; start > end indicates the complementary strand.
lrank : rank of list of sequences.
seqnum : rank of sequence in the acnuc database.
format : "acnuc", "fasta", "flat" or "coordinates"
operation : "simple", "translate", "fragment", "feature" or "region"
simple: each sequence or subsequence is extracted.
translate: meaningful only for protein-coding (sub)sequences that are 
extracted as protein sequences. Nothing is extracted for non-protein coding sequences.
fragment: allows extraction of any part of the sequence(s) in list.
Such part is specified by the bounds and minbounds arguments according to the
syntax suggested by these examples:
132,1600        to extract from nucl. 132 to nucl 1600 of the sequence. 
                If applied to a subsequence, coordinates are in the parent seq 
                relatively to the subsequence start point.
-10,10          to extract from 10 nucl. BEFORE the 5' end of the sequence
                to nucl. 10 of it. Useful only for subsequences, and produces
                a fragment extracted from its parent sequence.
e-20,e+10       to extract from 20 nucl. BEFORE the 3' end of the sequence
                to 10 nucl. AFTER its 3' end. Useful only for subsequences, and 
                produces a fragment extracted from its parent sequence.
-20,e+5         to extract from 20 nucl. BEFORE the 5' end of the sequence
                to 5 nucl. AFTER its 3' end.
feature: the feature tables of sequences in list are scanned for a given kind
of entries specified in the feature argument, and corresponding sequence data are extracted. 
Meaningful only for parent sequences (subsequences have no feature table)
and to access those features that do not correspond to a subsequence.
(e.g., EXON, mRNA, PRIM_TRANSCRIPT, REP_ORIGIN).
region: the fragment operation is applied to all entries of the specified kind
in all feature tables of the sequence list. The bounds and minbounds arguments specify
what part of feature data are extracted.
feature : (for operations "feature" or "region") a feature table item (CDS, mRNA,...).
bounds : (for operations "fragment" or "region") see syntax above.
minbounds : same syntax as bounds. When the sequence data is too short for this quantity
to be extracted, nothing is extracted. When the sequence data is between minbounds and bounds,
extracted sequence data is extended by N's to the desired length.
zlib : (not for coordinates) turns on and off zlib-compression of server's reply. 
       Value T (True) is the default.
This command can be interrupted by client sending the escape ASCII character to server on socket; 
client should keep reading socket until "extractseqs END.\n" arrives.
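
To make the bounds syntax concrete, the sketch below decomposes a bounds string into an anchor ("start" for the 5' end, "end" for the 3' end, written e) and a signed offset. It only covers the forms shown in the examples above; the function names and the returned tuple layout are illustrative, and no claim is made about other forms the server may accept.

```python
import re

# Optional leading 'e' (3' end anchor) followed by a signed or unsigned integer.
_BOUND = re.compile(r"^(e)?([+-]?\d+)$")

def parse_bound(text):
    """Return (anchor, offset) for one bound such as '132', '-10' or 'e+10'."""
    match = _BOUND.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported bound: {text!r}")
    anchor = "end" if match.group(1) else "start"
    return anchor, int(match.group(2))

def parse_bounds(text):
    """Return the two parsed bounds of a 'left,right' bounds string."""
    left, right = text.split(",")
    return parse_bound(left), parse_bound(right)
```

For instance, "e-20,e+10" is read as 20 before the 3' end to 10 after it, and "-20,e+5" as 20 before the 5' end to 5 after the 3' end.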


==>  getattributes&[id=xx|rank=xx]{&seq=[T|F]}{&prot=[T|F]}
<==  code=0&rank=xx&name=xx&length=xx{&fr=xx&gc=xx}&acc=xx&descr="xx"&spec="xx"\n
{seq=xxxxxx{&prot=xxxxx}\n}
From a sequence name or an accession number (id= argument), or from a sequence rank (rank= argument), 
returns its rank, name, length, reading frame (0, 1, 2), genetic code (acnuc's), 
primary accession number, first DE/DEFINITION line, and species name. 
If seq=T and/or prot=T is given, the command also returns, on a second line, the full DNA sequence,
and, if prot=T was given and the sequence is protein-coding, the protein sequence.
Reading frame and genetic code are not returned for SwissProt.