C API for remote access
Download the source code, run make, and use the resulting libraa.a library. A programming example appears after the thematic function list.
API functions in alphabetical order : codaa, raa_acnucclose, raa_acnucopen, raa_acnucopen_alt, raa_alllistranks, raa_bcount, raa_bit0, raa_bit1, raa_btest, raa_copylist, raa_countfilles, raa_decode_address, raa_extract_1_seq, raa_extract_interrupt, raa_fcode, raa_followshrt2, raa_getattributes, raa_getemptylist, raa_getlistrank, raa_getliststate, raa_get_taxon_info, raa_gfrag, raa_ghelp, raa_iknum, raa_isenum, raa_knowndbs, raa_loadtaxonomy, raa_modifylist, raa_next_annots, raa_nexteltinlist, raa_nexteltinlist_annots, raa_nextmatchkey, print_raa_long, raa_open_socket, raa_opendb, raa_opendb_pw, raa_prep_acnuc_query, raa_prep_extract, raa_proc_query, raa_read_annots, raa_readacc, raa_readaut, raa_readbib, raa_readext, raa_readkey, raa_readlng, raa_readloc, raa_readshrt, raa_readsmj, raa_readspec, raa_readsub, raa_read_first_rec, raa_releaselist, raa_residuecount, raa_savelist, raa_seq_to_annots, raa_seqrank_attributes, raa_setlistname, raa_setliststate, raa_showannots, raa_showannots_list, raa_translate_cds, raa_translate_init_codon, raa_zerolist, scan_raa_long
API functions by theme :
#include "raa_acnuc.h"
- Find, open, close database(s).
  - raa_acnucopen_alt, raa_acnucopen, opens an acnuc database from its name or from environment variable racnuc
  - raa_open_socket, opens connection to remote acnuc server (needed before calling raa_knowndbs).
  - raa_knowndbs, gives list of names of known dbs
  - raa_opendb, raa_opendb_pw, opens db from name after raa_open_socket, without or with password
  - raa_acnucclose, closes the current acnuc db.
- Query database.
  - raa_proc_query, processes a query expressed with the acnuc query language and creates the list of matching sequences or species or keywords;
  - raa_nexteltinlist or raa_nexteltinlist_annots, loops through elements of list.
  - raa_nextmatchkey, finds next keyword matching a template.
  - raa_modifylist, modifies a sequence list by various criteria.
  - raa_savelist, saves names of list elements in local file.
  - raa_getlistrank, raa_alllistranks, get list rank from list name, ranks of all currently defined lists.
  - raa_releaselist, raa_zerolist, deletes or empties a list.
- Read sequence data and annotations.
  - raa_getattributes, raa_seqrank_attributes, get sequence and/or attributes from a sequence name or accession number or ACNUC rank
  - raa_gfrag, reads sequence data
  - raa_seq_to_annots, gains access to annotations from sequence rank.
  - raa_read_annots and raa_next_annots, read successive annotation lines
  - raa_showannots and raa_showannots_list, process selected annotation lines
  - raa_prep_extract, raa_extract_1_seq, raa_extract_interrupt, extract all seqs from a list to a local file
  - raa_prep_coordinates, raa_1_coordinate_set, get coordinates in their parent seq of various subsequences and feature items
  - raa_translate_cds, raa_translate_init_codon, codaa, translate a protein coding sequence or a codon
- Use species, keywords, accession nos.
  - raa_get_taxon_info, gets information about a taxon specified by name, acnuc rank or ncbi ID.
  - raa_iknum, gets db rank of species or keyword.
  - raa_fcode, gets db rank of accession no., author, reference, type.
  - raa_isenum, gets db rank of sequence name.
- Utility functions.
  - sock_fputs, sends a character string to server
  - sock_flush, flushes output to server
  - read_sock, reads a character line received from server
  - raa_error_mess_proc, pointer to an optional user-written function called when connection gets lost
  - scan_raa_long, print_raa_long, read/write a raa_long from/to a decimal string
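Programming example :
#include <stdio.h>
#include <stdlib.h>
#include "raa_acnuc.h"

int main(int argc, char **argv)
{
    raa_db_access *raa;
    int err, list, count, num, length;
    char *seq, *name;

    /* open the "embl" database on the remote acnuc server */
    err = raa_acnucopen_alt("pbil.univ-lyon1.fr", 5558, "embl", "myprog", &raa);
    if (err != 0) exit(1);
    /* create list "mylist" from a query written in the acnuc query language */
    err = raa_proc_query(raa, "j=nar and y=2000", NULL, "mylist", &list, &count, NULL, NULL);
    if (err != 0) {
        raa_acnucclose(raa);
        exit(1);
    }
    /* loop over the list elements; raa_nexteltinlist returns 0 after the last one */
    num = 0;
    while ((num = raa_nexteltinlist(raa, num, list, &name, &length)) != 0) {
        seq = (char *)malloc((length + 1) * sizeof(char));
        if (seq == NULL) break;
        raa_gfrag(raa, num, 1, length, seq);
        printf("Name:%s Sequence:%s\n", name, seq);
        free(seq);
    }
    raa_acnucclose(raa);
    return 0;
}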
Link this toy program named prog.c with:
setenv RAADIR name-of-dir-containing-libraa.a
gcc -o prog prog.c -I$RAADIR -L$RAADIR -lraa -lz
Typedefs :
raa_db_access: a structure containing all information related to a connection with a remote acnuc database.
typedef long long raa_long; /* scan_raa_long/print_raa_long convert from/to string decimal form */
typedef enum { raa_sub = 0, raa_loc, raa_key, raa_spec, raa_shrt, raa_lng, raa_ext, raa_smj, raa_aut, raa_bib, raa_txt, raa_acc } raa_file;
typedef enum { raa_sub_of_bib = 0, raa_spec_of_loc, raa_bib_of_loc, raa_aut_of_bib, raa_bib_of_aut, raa_sub_of_acc, raa_key_of_sub, raa_acc_of_loc } raa_shortl2_kind;
Public fields of the raa_db_access structure :
typedef struct _raa_db_access { /* all information related to a connection with a remote acnuc database */
    char *dbname; /* name of connected acnuc database */
    FILE *raa_sockfdr, *raa_sockfdw; /* variables for read/write from/to connection socket */
    int genbank, embl, swissprot, nbrf; /* one is true according to format of connected db */
    int nseq; /* total number of sequences (and subseqs) in db */
    int longa;
    int maxa;
    /* max widths of several db textual fields */
    int L_MNEMO, WIDTH_SP, WIDTH_KW, WIDTH_SMJ, WIDTH_AUT, WIDTH_BIB, ACC_LENGTH, lrtxt;
    /* number of elements in a SHORTL2 record (0 when the SHORTL2 index file is not used) */
    int VALINSHRT2;
    char *version_string; /* NULL, or a string containing version information */
    int maxlists; /* max # of possible lists */
    raa_node **sp_tree; /* NULL or the full taxonomy tree */
    int max_tid; /* largest correct taxon ID value */
    int *tid_to_rank; /* NULL or ncbi taxon ID to acnuc rank table */
    int SUBINLNG; /* true number of sequence numbers in a struct rlng record */
    struct rlng {
        int sub[SUBINLNG];
        int next;
    } *rlng_buffer;
    /* supports working with selected parts of sequence annotations */
    int tot_key_annots; /* number of elements of each of next three arrays */
    /* uppercase names of annotation records in connected database; key_annots[0] is "ALL" */
    char **key_annots;
    char **key_annots_min; /* same in lowercase */
    /* each element is true if annotation record is wanted; want_key_annots[0]=TRUE means all records wanted */
    unsigned char *want_key_annots;
} raa_db_access;
- raa_acnucopen opens access to a remote acnuc database using the racnuc, or, if undefined, acnuc, environment variable.
int raa_acnucopen(char *clientid, raa_db_access **praa);
- clientid: the calling program name (freely chosen by the programmer)
- praa: points to a value that, upon return, defines the newly created acnuc connection
- return value : those of raa_open_socket and raa_opendb, or 8 if environment variables racnuc and acnuc are undefined or inadequately defined.
- raa_acnucopen_alt opens access to a remote acnuc database using explicit address information.
int raa_acnucopen_alt(char *server_ip, int s_num, char *db_name, char *clientid, raa_db_access **praa);
- server_ip : ip name of the acnuc server (e.g., "pbil.univ-lyon1.fr")
- s_num : socket number, normally 5558
- db_name : name of the database (e.g. "embl")
- clientid: the calling program name (freely chosen by the programmer)
- praa: points to a value that, upon return, defines the newly created acnuc connection
- return value : those of raa_open_socket and raa_opendb.
- raa_open_socket opens access to the remote acnuc server
int raa_open_socket(char *serverName, int port, char *clientid, raa_db_access **praa);
- serverName : ip name of the acnuc server (e.g., "pbil.univ-lyon1.fr")
- port: port number (e.g. 5558)
- clientid: the calling program name (freely chosen by the programmer)
- praa: points to a value that, upon return, defines the newly created acnuc connection
- return value: 0 if OK; 1 if problem with remote host name; 2 if cannot create connection with remote host; 7 if not enough memory.
- raa_opendb opens an acnuc database after raa_open_socket call
int raa_opendb(raa_db_access *raa, char *dbname);
- raa: value of the remote acnuc connection
- dbname : database name (e.g., "embl")
- return value: 0 iff OK; 3 if database is unknown by remote host; 4 if database is currently unavailable on remote host; 5 if a database was previously opened and was not closed; 9 if no socket was previously opened by raa_open_socket.
- raa_opendb_pw opens a password-protected database after raa_open_socket call
int raa_opendb_pw(raa_db_access *raa, char *db_name, void *ptr, char *(*getpasswordf)(void *));
- raa: value of the remote acnuc connection
- db_name : database name (e.g., "embl")
- ptr : NULL or pointer to data transmitted to the getpasswordf function
- getpasswordf : pointer to password-providing function that returns the password as a writable static string
- return value: as raa_opendb, or 6 to indicate failed password-based authorization.
- raa_decode_address decodes an ACNUC-specific URL
int raa_decode_address(char *address, char **server_ip, int *s_num, char **db_name);
- address : same form as "pbil.univ-lyon1.fr:5558/swissprot" or "pbil.univ-lyon1.fr:5558"
- server_ip : upon return, the ip name part of the address (pbil.univ-lyon1.fr)
- s_num : upon return, the socket number part (5558)
- db_name : upon return, the db name (swissprot) or NULL if absent
- return value : 0 iff OK
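To make the address format concrete, here is a minimal stand-alone sketch that splits a string of the form host:port/db into its three parts. It does not use libraa: the function name split_acnuc_address, its buffer-based interface and its error codes are hypothetical and chosen only for illustration, and raa_decode_address itself may behave differently, in particular on malformed input.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of libraa): splits "host:port/db" or
   "host:port" into its parts. Returns 0 if OK, non-zero otherwise.
   db is set to "" when the address carries no database name. */
int split_acnuc_address(const char *address, char *host, size_t host_size,
                        int *port, char *db, size_t db_size)
{
    const char *colon = strchr(address, ':');
    char *end;
    long value;
    size_t host_len;

    if (db_size == 0) return 5;                       /* db buffer unusable */
    if (colon == NULL || colon == address) return 1;  /* no host or no port */
    host_len = (size_t)(colon - address);
    if (host_len >= host_size) return 2;              /* host buffer too small */
    memcpy(host, address, host_len);
    host[host_len] = '\0';

    value = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || value <= 0 || value > 65535) return 3; /* bad port */
    *port = (int)value;

    if (*end == '\0') {          /* "host:port" form: no database name */
        db[0] = '\0';
        return 0;
    }
    if (*end != '/') return 4;   /* unexpected character after the port */
    if (strlen(end + 1) >= db_size) return 5;         /* db buffer too small */
    strcpy(db, end + 1);
    return 0;
}
```

With "pbil.univ-lyon1.fr:5558/swissprot" this sketch yields the host pbil.univ-lyon1.fr, the port 5558 and the database name swissprot; with "pbil.univ-lyon1.fr:5558" the database name is left empty.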
- raa_acnucclose closes access to the db
void raa_acnucclose(raa_db_access *raa);
- raa: value of the remote acnuc connection
- raa_getattributes gets sequence and/or attributes from a sequence name or accession number.
char *raa_getattributes(raa_db_access *raa, const char *id, int *prank, int *plength, int *pframe, int *pgc, char **pacc, char **pdesc, char **pspecies, char **pseq);
- raa: value of the remote acnuc connection
- id: a name or accession number
- prank: NULL or, upon return, pointer to ACNUC rank of sequence
- plength: NULL or, upon return, pointer to length of sequence
- pframe: NULL or, upon return, pointer to reading frame (0,1,2) of sequence
- pgc: NULL or, upon return, pointer to genetic code (ACNUC's) of sequence
- pacc: NULL or, upon return, pointer to primary accession no. of sequence in private memory
- pdesc: NULL or, upon return, pointer to one-line description of sequence in private memory
- pspecies: NULL or, upon return, pointer to species name of sequence in private memory
- pseq: NULL or, upon return, pointer to complete sequence in private memory
- return value: NULL if id not found, or sequence name
- raa_seqrank_attributes gets sequence and/or attributes from a sequence rank.
char *raa_seqrank_attributes(raa_db_access *raa, int rank, int *plength, int *pframe, int *pgc, char **pacc, char **pdesc, char **pspecies, char **pseq);
- raa: value of the remote acnuc connection
- rank: the ACNUC rank of sequence
- plength: NULL or, upon return, pointer to length of sequence
- pframe: NULL or, upon return, pointer to reading frame (0,1,2) of sequence
- pgc: NULL or, upon return, pointer to genetic code (ACNUC's) of sequence
- pacc: NULL or, upon return, pointer to primary accession no. of sequence in private memory
- pdesc: NULL or, upon return, pointer to one-line description of sequence in private memory
- pspecies: NULL or, upon return, pointer to species name of sequence in private memory
- pseq: NULL or, upon return, pointer to complete sequence in private memory
- return value: NULL if id not found, or sequence name
- raa_gfrag reads a sequence fragment
int raa_gfrag(raa_db_access *raa, int nsub, int first, int lfrag, char *dseq);
- raa: value of the remote acnuc connection
- nsub : rank of sequence
- first : first residue to read (counting from 1)
- lfrag : number of residues to read
- dseq : character array, allocated by caller, to be filled with residues
- return value : number of residues read (can be 0)
- raa_seq_to_annots gets the address of the start of annotations for a sequence
void raa_seq_to_annots(raa_db_access *raa, int numseq, raa_long *faddr, int *div);
- raa: value of the remote acnuc connection
- numseq : rank of sequence
- *faddr : returned filled with the offset within flat file of beginning of annotations
- div : returned filled with rank of division containing the sequence
- raa_read_annots reads the first line of annotations of a sequence
char *raa_read_annots(raa_db_access *raa, raa_long faddr, int div);
- raa: value of the remote acnuc connection
- faddr : offset within flat file of beginning of annotations (typically from raa_seq_to_annots)
- div : rank of division containing the sequence
- return value : pointer to line read in static memory (NULL if error)
- raa_next_annots reads the next line of annotations of a sequence
char *raa_next_annots(raa_db_access *raa, NULL);
or
char *raa_next_annots(raa_db_access *raa, raa_long *faddr);
- raa: value of the remote acnuc connection
- *faddr : returned filled with offset within flat file of beginning of line read (can be used later by raa_read_annots)
- return value : pointer to line read in static memory (NULL if error)
- raa_iknum gets the rank of a species or a keyword
int raa_iknum(raa_db_access *raa, char *name, raa_file cas);
- raa: value of the remote acnuc connection
- name : a species or a keyword (case is not significant)
- cas : raa_spec for a species or raa_key for a keyword
- return value : rank of given name (0 if absent)
- raa_isenum gets the rank of a sequence from its name
int raa_isenum(raa_db_access *raa, char *name);
- raa: value of the remote acnuc connection
- name : a sequence name (case is not significant)
- return value : rank of given name (0 if absent)
- raa_proc_query processes a query and creates the list of matching seqs, species or keywords
int raa_proc_query(raa_db_access *raa, char *query, char **message, char *listname, int *numlist, int *count, int *locus, int *type);
- raa: value of the remote acnuc connection
- query : a string containing a query following the acnuc query language
- message : if != NULL and an error occurs, returned filled with an error message in malloc'ed memory
- listname : the name to be given to the list (case is not significant)
- numlist : upon return, if no error, the rank of the created list
- count : if != NULL, returned filled with the number of elements in the created list
- locus : if != NULL, returned filled with TRUE if list contains parent sequences only
- type : if != NULL, returned filled with 'S', 'K', or 'E' for a list of seqs, keywords, or species
- return value : 0 if OK, or an error number
- raa_nexteltinlist returns the next element of a list
int raa_nexteltinlist(raa_db_access *raa, int first, int lrank, char **name, int *length);
- raa: value of the remote acnuc connection
- first : elements of list are searched after this position (initialize this to 0)
- lrank : rank of the list
- name : if != NULL, returned filled with the name of the element in static memory
- length : if != NULL, returned filled with the element length (for seq list only)
- return value : the rank of the next element in the list, or 0 if none
- raa_nexteltinlist_annots returns the next element of a sequence list and related information
int raa_nexteltinlist_annots(raa_db_access *raa, int first, int lrank, char **name, int *length, raa_long *offset, int *div);
- raa: value of the remote acnuc connection
- first : elements of list are searched after this position (initialize this to 0)
- lrank : rank of the sequence list
- name : if != NULL, returned filled with the name of the sequence in static memory
- length : if != NULL, returned filled with the sequence length
- offset : if != NULL, returned filled with the annotation offset of the seq
- div : if != NULL, returned filled with the division rank of the seq
- return value : the rank of the next sequence in the list, or 0 if none
- raa_nextmatchkey returns the next keyword matching a given pattern
int raa_nextmatchkey(raa_db_access *raa, int num, char *pattern, char **matching);
- raa: value of the remote acnuc connection
- num : rank beyond which the next matching keyword is sought (set num=2 the first time)
- pattern : must contain at least once the wild card character @ (used only if num = 2)
- matching : if not NULL, filled with matching keyword in private memory
- return value : rank of next matching keyword, or 0 if no more matching keyword.
- raa_bcount counts the number of elements in a list
int raa_bcount(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank : rank of the list
- return value : the number of elements in the list
- raa_bit1 adds an element to a list
void raa_bit1(raa_db_access *raa, int lrank, int num);
- raa: value of the remote acnuc connection
- lrank : rank of the list
- num: rank of the element to add
- raa_bit0 removes an element from a list
void raa_bit0(raa_db_access *raa, int lrank, int num);
- raa: value of the remote acnuc connection
- lrank : rank of the list
- num: rank of the element to remove
- raa_btest tests presence of element in a list
int raa_btest(raa_db_access *raa, int lrank, int num);
- raa: value of the remote acnuc connection
- lrank : rank of the list
- num: rank of the element to test
- return value: TRUE iff element num is in list lrank
- raa_copylist duplicates a list
void raa_copylist(raa_db_access *raa, int rank_from, int rank_to);
- raa: value of the remote acnuc connection
- rank_from : rank of the list to copy
- rank_to : rank of the destination list that must have been previously allocated by e.g., raa_getemptylist
- raa_zerolist empties a list
void raa_zerolist(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank : rank of the list to empty, that must have been previously allocated by e.g., raa_getemptylist.
- raa_setliststate sets the state of a list
void raa_setliststate(raa_db_access *raa, int lrank, int locus, int type);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- locus : TRUE iff list contains only parent sequences
- type : 'S', 'K', or 'E' for list of seqs, keywords, or species.
- raa_getliststate gets the state of a list
char *raa_getliststate(raa_db_access *raa, int lrank, int *locus, int *type, int *count);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- locus : if != NULL, returned filled with TRUE iff list contains only parent sequences
- type : if != NULL, returned filled with 'S', 'K', or 'E' for list of seqs, keywords, or species.
- count : if != NULL, returned filled with the number of elements in list
- return value : the list name in static memory, or NULL if error
- raa_getemptylist finds an empty list and sets its name
int raa_getemptylist(raa_db_access *raa, char *lname);
- raa: value of the remote acnuc connection
- lname: name to give to the list
- return value : the list rank, or 0 if none is available
  - if no list named lname existed, a new, empty one is created;
  - if a list named lname already existed, its rank is returned and no change is made to that list.
- raa_setlistname sets the name of a list
int raa_setlistname(raa_db_access *raa, int lrank, char *name);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- name : name to give to the list
- return value :
  - 0 : OK
  - 1 : a list with that name already existed and was deleted
  - -1 : no list with that rank exists
- raa_getlistrank gets the rank of a list from its name
int raa_getlistrank(raa_db_access *raa, char *name);
- raa: value of the remote acnuc connection
- name: name of the list
- return value : > 0 if OK, 0 if no list with that name exists
- raa_releaselist releases a list
int raa_releaselist(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- return value : 0 if OK, != 0 if no list with that rank exists
- raa_residuecount counts residues in a list
char *raa_residuecount(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- return value : total number of residues (nucleotides or aminoacids) in all sequences of the list as a char string in static memory (caution: may require a 64-bit integer to be sscanf'ed).
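Because the residue total is returned as text and may exceed 32 bits, a caller will usually want to convert it to a 64-bit integer. The following stand-alone sketch shows one way to do so with strtoll; the helper name parse_residue_count is hypothetical (it is not part of libraa), and the sketch assumes the string is a plain non-negative decimal number, possibly followed by white space.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not part of libraa): converts a decimal string, such as
   the one returned by raa_residuecount, into a 64-bit integer.
   Returns 0 if OK, -1 if text is not a valid non-negative decimal number. */
int parse_residue_count(const char *text, long long *count)
{
    char *end;
    long long value;

    if (text == NULL || *text == '\0') return -1;
    errno = 0;
    value = strtoll(text, &end, 10);
    if (errno == ERANGE || end == text || value < 0) return -1;
    while (isspace((unsigned char)*end)) end++;   /* tolerate trailing white space */
    if (*end != '\0') return -1;                  /* reject trailing garbage */
    *count = value;
    return 0;
}
```

A typical call would pass the string returned by raa_residuecount as text; since that string lives in static memory, it should be converted or copied before the next library call.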
- raa_countfilles counts the number of subsequences present in a list
int raa_countfilles(raa_db_access *raa, int lrank);
- raa: value of the remote acnuc connection
- lrank: rank of the list
- return value : the number of subsequences present
- raa_alllistranks gets ranks of all currently defined lists
int raa_alllistranks(raa_db_access *raa, int **ranks);
- raa: value of the remote acnuc connection
- ranks: upon return, an array of ranks of used lists in malloc'ed memory
- return value : the number of used lists
- raa_fcode returns the rank of a record of an index file from its key
int raa_fcode(raa_db_access *raa, raa_file cas, char *name);
- raa: value of the remote acnuc connection
- cas: one of raa_aut, raa_bib, raa_acc, raa_smj, raa_sub
- name: the record key
- return value : the rank of the corresponding key
- raa_read_first_rec returns the total number of records in an index file
int raa_read_first_rec(raa_db_access *raa, raa_file cas);
- raa: value of the remote acnuc connection
- cas: an index file expressed using the raa_file enumeration
- return value : the total number of records in the index file
- atoi_u decodes a string as an unsigned decimal integer
int atoi_u(const char *p);
- raa_readsub reads a SUBSEQ record
char *raa_readsub(raa_db_access *raa, int num, int *plength, int *ptype, int *pext, int *plkey, int *plocus, int *pframe, int *pgencode);
- raa: value of the remote acnuc connection
- num: seq rank
- plength: upon return, filled, if != NULL, with seq length
- ptype : upon return, filled, if != NULL, with rank of seq type
- pext : upon return, filled, if != NULL, with a value whose sign tells the kind of sequence:
  - > 0 indicates a subsequence and the value is a record # in EXTRACT
  - ≤ 0 indicates a parent sequence and the negated value is the start of the long list of subsequences
- plkey : upon return, filled, if != NULL, with start of short list of keywords
- plocus : upon return, filled, if != NULL, with LOCUS rank for a parent sequence or 0 for a subsequence
- pframe : upon return, filled, if != NULL, with reading frame (0,1, or 2)
- pgencode : upon return, filled, if != NULL, with genetic code (0 is standard)
- return value : sequence name in static memory or NULL if error
- raa_readloc reads a LOCUS record
char *raa_readloc(raa_db_access *raa, int num, int *sub, int *pnuc, int *spec, int *host, int *plref, int *molec, int *placc, int *org);
- raa: value of the remote acnuc connection
- num: rank of LOCUS record
- sub, pnuc, spec, host, plref, molec, placc, org: fields of the record (any pointer can be NULL for no value returned)
- return value: the date as a private character string
- raa_readspec reads a SPECIES record
char *raa_readspec(raa_db_access *raa, int num, char **plibel, int *plsub, int *desc, int *syno, int *plhost);
- raa: value of the remote acnuc connection
- num: rank of SPECIES record
- plibel: if plibel != NULL, *plibel is returned as pointer to label in private memory, or as NULL if no label exists
- plsub, desc, syno, plhost: returned with data read from the record
- return value: pointer to name of species in private memory
- raa_readkey reads a KEYWORDS record
char *raa_readkey(raa_db_access *raa, int num, char **plibel, int *plsub, int *desc, int *syno);
- raa: value of the remote acnuc connection
- num: rank of KEYWORDS record
- plibel: if plibel != NULL, *plibel is returned as pointer to label in private memory, or as NULL if no label exists
- plsub, desc, syno: if != NULL, returned with data read from the record
- return value: pointer to name of keyword in private memory
- raa_readsmj reads an SMJYT record
char *raa_readsmj(raa_db_access *raa, int num, char **plibel, int *plong);
- raa: value of the remote acnuc connection
- num: rank of SMJYT record
- plibel: if plibel != NULL, *plibel is returned as pointer to label in private memory, or as NULL if no label exists
- plong: if != NULL, upon return points to data read from record
- return value: name as a private string
- raa_readacc reads an ACCESS record
char *raa_readacc(raa_db_access *raa, int num, int *plsub);
- raa: value of the remote acnuc connection
- num: rank of ACCESS record
- plsub: if != NULL, upon return points to start of short list of attached sequences
- return value: name as a private string
- raa_readaut reads an AUTHOR record
char *raa_readaut(raa_db_access *raa, int num, int *plref);
- raa: value of the remote acnuc connection
- num: rank of AUTHOR record
- plref: if != NULL, upon return points to start of short list of attached references
- return value: author name as a private string
- raa_readbib reads a BIBLIO record
char *raa_readbib(raa_db_access *raa, int num, int *plsub, int *plaut, int *pj, int *py);
- raa: value of the remote acnuc connection
- num: rank of BIBLIO record
- plsub: if != NULL, upon return points to start of short list of attached sequences
- plaut: if != NULL, upon return points to start of short list of attached authors
- pj: if != NULL, upon return points to rank in SMJYT of journal
- py: if != NULL, upon return points to rank in SMJYT of year of publication
- return value: reference name as a private string
- raa_readext reads an EXTRACT record
int raa_readext(raa_db_access *raa, int num, int *mere, int *deb, int *fin);
- raa: value of the remote acnuc connection
- num: rank of record
- mere, deb, fin: if != NULL, upon return point to data read from record
- return value: 0 or rank of next chained record
- raa_readlng reads a LONGL record
int raa_readlng(raa_db_access *raa, int point);
- raa: value of the remote acnuc connection
- point: rank of record
- return value: 0 or rank of next chained record
- raa_readshrt reads a short list element
unsigned raa_readshrt(raa_db_access *raa, unsigned point, int *val);
- raa: value of the remote acnuc connection
- point: rank of element
- val: upon return, the element value
- return value: 0 or the rank of the next short list element
- raa_followshrt2 reads an element of a short list
unsigned raa_followshrt2(raa_db_access *raa, unsigned *p_point, int *p_rank, raa_shortl2_kind kind);
- raa: value of the remote acnuc connection
- p_point: pointer to start of short list
- p_rank: pointer to rank within list (begin with 0)
- kind: indicates what kind of short list is considered
- return value: one list element. Upon return from this function, p_point and p_rank have values that give access to the next element of the list. When the last list element has been accessed, *p_point = 0.
- scan_raa_long decodes a string as a number of type raa_long (capable of storing large file offset)
raa_long scan_raa_long(char *txt);
- print_raa_long encodes a number of type raa_long (capable of storing large file offset) as a string in buffer argument. Returns buffer.
char *print_raa_long(raa_long val, char *buffer);
- raa_ghelp gets the text of a help topic
char *raa_ghelp(raa_db_access *raa, char *hname, char *topic);
- raa: value of the remote acnuc connection
- hname: one of "HELP" or "HELP_WIN"
- topic: name of a help topic
- raa_savelist saves in a local file names or acc. nos of members of a list
int raa_savelist(raa_db_access *raa, int lrank, FILE *out, int use_acc, char *prefix);
- raa: value of the remote acnuc connection
- lrank: rank of list to be saved in a file (can be seq, species or keyw list)
- out: opened file where to save list members
- use_acc: if TRUE, save accession numbers, if FALSE, save names of list members
- prefix: if != NULL, write prefix before each name of each member in file out
- return value: 0 iff ok
- raa_modifylist modifies list according to length or date or by scanning the annotation of its elements
int raa_modifylist(raa_db_access *raa, int lrank, char *type, char *operation, int *pnewlist, int (*check_interrupt)(void), int *p_processed);
- raa: value of the remote acnuc connection
- lrank: rank of list to be modified (must be a seq list)
- type: "length" or "date" or "scan"
- operation: (for length) ">10000" or "<500"
  (for date) ">1/jan/2003" or " < 29/FEB/96"
  (for scan) "string-to-be-searched-for". The prep_getannots command must be sent to the server with functions sock_fputs and read_sock before using the scan operation
- pnewlist: upon return, points to rank of newly created list containing result of operation
- check_interrupt: NULL or pointer to a function that will be called iteratively by the function and that should return TRUE iff caller wants to interrupt the modification operation
- p_processed: NULL, or pointer to value that will be set, upon return, to the number of list elements processed by the function until interruption or completion (for scan operation only)
- return value: 0 if ok; 2 syntax error in operation; 3 creation of new list is impossible
- raa_knowndbs gets names and descriptions of all known databases
int raa_knowndbs(raa_db_access *raa, char ***pnames, char ***pdescriptions);
- raa: value of the remote acnuc connection
- pnames: points to array of strings loaded with database names (in malloc'ed memory)
- pdescriptions: points to array of strings loaded with database descriptions (in malloc'ed memory)
- return value: number of elements in tables pnames and pdescriptions
- raa_prep_extract prepares for extraction of all members of a sequence list to a local file
void *raa_prep_extract(raa_db_access *raa, char *format, FILE *out, char *operation, char *feature_name, char *bounds, char *min_bounds, char **pmessage, int lrank);
- raa: value of the remote acnuc connection
- format: "fasta" or "flat" (e.g., genbank, embl) or "acnuc"
- out: FILE * variable to which extracted data should be sent. Left open after end of extraction operation.
- operation: "simple", "translation" (translates on the fly CDS sequences), "feature" (extracts fragment corresponding to given feature name), "fragment", or "region"
- feature_name: NULL or name of desired feature
- bounds, min_bounds: NULL unless operation is "fragment" or "region"
- pmessage: returned set to NULL or to error message
- lrank: rank of sequence list
- return value: NULL iff error
- raa_extract_1_seq successively extracts one sequence from list.
int raa_extract_1_seq(void *opaque);
- opaque: value returned by the previous call to raa_prep_extract
- return value: number of extracted sequences (0 is possible), or -1 when all of list was processed.
- raa_extract_interrupt cleanly interrupts an extraction before its full completion.
int raa_extract_interrupt(raa_db_access *raa, void *opaque);
- raa: value of the remote acnuc connection
- opaque: value returned by the previous call to raa_prep_extract
- return value: number of extracted sequences (0 is possible).
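Since raa_extract_1_seq has to be called repeatedly until it returns -1, the extraction is naturally written as a loop. The sketch below shows that loop in a stand-alone form: the driver extract_all takes the step function as a parameter so that it can be tried without a server, and both extract_all and the stand-in countdown_step are hypothetical names that are not part of libraa. In a real program the step function would be raa_extract_1_seq and opaque the value returned by raa_prep_extract.

```c
#include <stddef.h>

/* Type of a one-step extraction function; raa_extract_1_seq has this shape. */
typedef int (*extract_step_fn)(void *opaque);

/* Hypothetical driver (not part of libraa): calls step(opaque) until it
   returns a negative value (-1 marks the end of the list) and returns the
   total number of sequences reported as extracted. */
long extract_all(extract_step_fn step, void *opaque)
{
    long total = 0;
    int n;

    while ((n = step(opaque)) >= 0) {
        total += n;              /* n can be 0 for some calls */
    }
    return total;
}

/* Stand-in for raa_extract_1_seq, for trying the loop without a server:
   opaque points to an int holding the number of calls left before -1. */
int countdown_step(void *opaque)
{
    int *remaining = (int *)opaque;

    if (*remaining <= 0) return -1;
    (*remaining)--;
    return 1;                    /* pretend one sequence was extracted */
}
```

A check_interrupt-style test could be added inside the loop; in that case raa_extract_interrupt would be called instead of continuing to -1.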
- sock_fputs sends a character string to server
int sock_fputs(raa_db_access *raa, char *line);
- raa: value of the remote acnuc connection
- return value: 0 iff success
- sock_flush flushes output to server
int sock_flush(raa_db_access *raa);
- raa: value of the remote acnuc connection
- return value: 0 iff success
- read_sock reads a character line received from server
char *read_sock(raa_db_access *raa);
- raa: value of the remote acnuc connection
- return value: a full line of data received from server in private memory, or NULL if communication with server is lost.
- raa_error_mess_proc is a global variable that points to a function called when connection gets lost
void (*raa_error_mess_proc)(raa_db_access *raa, char *message);
This function should call raa_acnucclose.
Additional notes on functions described above :
- raa_acnucopen: the value of the racnuc (or acnuc) environment variable should have the form pbil.univ-lyon1.fr:5558/embl
- raa_maxlist: before opening a database, it is possible to change the value of the raa_maxlist integer global variable from its default value of 50, to set the maximal number of lists the client wants to be able to create.
- raa_ghelp: returns the full text of the topic from HELP or HELP_WIN as one string in private memory.
- raa_extract_1_seq: this function must be called until -1 is returned, unless raa_extract_interrupt was called.
- sock_flush: very rarely needed, because read_sock calls sock_flush.
- raa_error_mess_proc: usage example :
void my_error_proc(raa_db_access *raa, char *message)
{
    fprintf(stderr, "%s from database %s\n", message, raa->dbname);
    raa_acnucclose(raa);
}
/* inside a function, e.g., main() */
raa_error_mess_proc = my_error_proc;
When no such function is assigned to raa_error_mess_proc, exit(0) is called after connection loss.
char *raa_translate_cds(raa_db_access *raa, int numseq);
- raa: value of the remote acnuc connection
- numseq: rank in db of a protein coding sequence (a CDS feature entry)
- return value: the resulting protein sequence, using the adequate genetic code and initiation codon translation, in private memory, or NULL if error.
char raa_translate_init_codon(raa_db_access *raa, int numseq);
- raa: value of the remote acnuc connection
- numseq: rank in db of a protein coding sequence (a CDS feature entry)
- return value: the resulting amino acid
char codaa(char *codon, int gc);
- codon: a trinucleotide
- gc: the genetic code to be used (typically returned by raa_readsub)
- return value: the resulting amino acid
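As an illustration of how a function shaped like codaa can be applied codon by codon, here is a minimal sketch. translate_frame and toy_codaa are hypothetical names: toy_codaa is only a stand-in that knows a handful of codons so that the sketch runs without libraa.a, and its use of '*' for stop codons is its own choice, not a documented behavior of codaa. Passing each codon as a NUL-terminated 3-character copy is likewise a cautious assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as codaa(): a trinucleotide and a genetic code number in,
   one amino acid letter out. */
typedef char (*codon_fn)(char *codon, int gc);

/* Hypothetical helper: translates the complete codons of dna into out
   (capacity out_size, terminating NUL included); returns the number of
   residues written. */
size_t translate_frame(const char *dna, int gc, codon_fn translate,
                       char *out, size_t out_size)
{
    size_t len, n = 0, i;
    char codon[4];

    assert(dna != NULL && translate != NULL && out != NULL && out_size > 0);
    len = strlen(dna);
    for (i = 0; i + 3 <= len && n + 1 < out_size; i += 3) {
        memcpy(codon, dna + i, 3);   /* private copy of one codon */
        codon[3] = '\0';
        out[n++] = translate(codon, gc);
    }
    out[n] = '\0';
    return n;
}

/* Stand-in for codaa() (assumption): not a real genetic code table. */
char toy_codaa(char *codon, int gc)
{
    (void)gc;   /* the toy table ignores the genetic code number */
    if (strcmp(codon, "ATG") == 0) return 'M';
    if (strcmp(codon, "AAA") == 0) return 'K';
    if (strcmp(codon, "TAA") == 0 || strcmp(codon, "TAG") == 0 ||
        strcmp(codon, "TGA") == 0) return '*';
    return 'X';
}
```

With the library, codaa would presumably be passed in place of toy_codaa, with gc obtained from raa_readsub. For a CDS stored in the database, raa_translate_cds already does the whole job.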
void *raa_prep_coordinates(raa_db_access *raa, int lrank, int seqnum, char *operation, char *feature_name, char *bounds, char *min_bounds);
See extractseqs for the semantics of this function.
- raa: value of the remote acnuc connection
- lrank: the rank of a sequence list
- seqnum: the acnuc number of a (sub)sequence
Only one of lrank and seqnum is non-zero.
- operation: "simple", "fragment", "feature", "region"
- feature_name: the name of a feature key (e.g.: "cds", "tRNA")
- bounds: syntax by examples: "10,40" "-10, 40" "-10,e+10" "e-10,E+100" where e/E means sequence end
- min_bounds: minimum extension of required fragment, same syntax as bounds argument
- return value: NULL if error, or an opaque pointer to be transmitted to raa_1_coordinate_set
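A client may want to check a bounds or min_bounds string before sending it. The sketch below parses only the forms shown in the examples above (an optional e/E prefix followed by a signed integer, two such items separated by a comma). The names struct bound, parse_one_bound and parse_bounds are hypothetical, rejecting a bare "e" without a number is an assumption, and the meaning of the numbers is left to extractseqs.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

struct bound {
    int from_end;   /* 1 when the bound was written with e/E */
    long offset;    /* the signed number that follows */
};

static const char *skip_spaces(const char *p)
{
    while (isspace((unsigned char)*p)) p++;
    return p;
}

/* Parses one bound; returns the position after it, or NULL if no
   number was found. */
static const char *parse_one_bound(const char *p, struct bound *b)
{
    char *end;

    p = skip_spaces(p);
    b->from_end = (*p == 'e' || *p == 'E');
    if (b->from_end) p++;
    b->offset = strtol(p, &end, 10);
    return end == p ? NULL : end;
}

/* Returns 0 if text has the form "bound,bound", -1 otherwise. */
int parse_bounds(const char *text, struct bound *first, struct bound *last)
{
    const char *p;

    assert(text != NULL && first != NULL && last != NULL);
    p = parse_one_bound(text, first);
    if (p == NULL) return -1;
    p = skip_spaces(p);
    if (*p != ',') return -1;
    p = parse_one_bound(p + 1, last);
    if (p == NULL) return -1;
    return *skip_spaces(p) == '\0' ? 0 : -1;
}
```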
int *raa_1_coordinate_set(void *v);
- v: the opaque pointer returned by a previous call to raa_prep_coordinates
- return value: NULL or an int array containing 3*C + 1 values, where C is the 1st array element and other triples of elements are sequence-number, first-coordinate, last-coordinate.
This function must be called repeatedly until it returns NULL.
first-coordinate > last-coordinate indicates the complementary strand of the parent sequence.
Usage example:
int mylist, *table, i, j, count;
void *v;
raa_db_access *raa; /* connection opened beforehand, e.g. with raa_acnucopen_alt */

raa_proc_query(raa, "sp=bos taurus", NULL, "bos", &mylist, NULL, NULL, NULL);
v = raa_prep_coordinates(raa, mylist, 0, "region", "CDS", "-10000,-1", "-2000,-1");
if(v == NULL) exit(1);
while( (table = raa_1_coordinate_set(v) ) != NULL) {
    count = table[0];
    j = 0;
    for(i = 0; i < count; i++) {
        /* table[j+1] is the acnuc number of the sequence */
        /* table[j+2] is the start position in this sequence */
        /* table[j+3] is the end position in this sequence */
        j += 3;
    }
}
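The array layout described above (a count C followed by C triples) and the strand convention (first-coordinate > last-coordinate means complementary strand) can be unpacked as in this minimal sketch. struct coord_span and decode_coordinate_set are hypothetical names, and the length computation assumes that both coordinates are inclusive.

```c
#include <assert.h>
#include <stddef.h>

struct coord_span {
    int seqnum;       /* acnuc number of the sequence */
    int first;        /* first coordinate as returned */
    int last;         /* last coordinate as returned */
    int complement;   /* 1 when first > last (complementary strand) */
    int length;       /* span length, assuming inclusive coordinates */
};

/* Hypothetical helper: copies at most max_out triples of table (laid out
   as returned by raa_1_coordinate_set) into out; returns how many. */
int decode_coordinate_set(const int *table, struct coord_span *out, int max_out)
{
    int count, n, i;

    assert(table != NULL && out != NULL && max_out >= 0);
    count = table[0] < 0 ? 0 : table[0];
    n = count < max_out ? count : max_out;
    for (i = 0; i < n; i++) {
        const int *t = table + 1 + 3 * i;   /* start of the i-th triple */
        out[i].seqnum = t[0];
        out[i].first = t[1];
        out[i].last = t[2];
        out[i].complement = t[1] > t[2];
        out[i].length = out[i].complement ? t[1] - t[2] + 1 : t[2] - t[1] + 1;
    }
    return n;
}
```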
char *raa_get_taxon_info(raa_db_access *raa, char *name, int rank, int tid, int *p_rank, int *p_tid, int *p_parent, struct raa_pair **p_desc_list);
- raa: value of the remote acnuc connection
- name: NULL or a taxon name (case is not significant)
- rank: used only if name==NULL, the acnuc rank of a taxon
- tid: used only if name==NULL && rank==0, an ncbi taxon ID
- p_rank: if p_rank != NULL, *p_rank returned with the taxon acnuc rank
- p_tid: if p_tid != NULL, *p_tid returned with the taxon ncbi ID
- p_parent: if p_parent != NULL, *p_parent returned with acnuc rank of taxon's parent in species tree
- p_desc_list: if p_desc_list != NULL, *p_desc_list returned with first element of chain of taxon's descendants in tree.
All descendants of taxon Escherichia can be found with:
struct raa_pair *pair;
raa_get_taxon_info(raa, "Escherichia", 0, 0, NULL, NULL, NULL, &pair);
while(pair != NULL) {
    printf("Name: %s Rank:%d TID:%d\n", pair->value->name, pair->value->rank, pair->value->tid);
    pair = pair->next;
}
This function may take a few seconds on its first call, but is fast for all subsequent calls.
int raa_loadtaxonomy(raa_db_access *raa, char *rootname, int (*progress_function)(int percent, void *data), void *progress_arg, int (*need_interrupt_function)(void *data), void *interrupt_arg);
- raa: value of the remote acnuc connection
- rootname: (read only) name to be given to the species tree root
- progress_function: NULL or a function that gets called every time tree loading progresses by 1% and that should return TRUE when the calling program wants an opportunity to ask for interruption
- progress_arg: transmitted as 2nd argument of progress_function
- need_interrupt_function: NULL or function that gets called after progress_function returned TRUE and that should return TRUE when interruption of tree loading is desired
- interrupt_arg: argument transmitted to need_interrupt_function
- return value: 0 iff no error
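The two callbacks described above could look like the following minimal sketch. struct load_state, report_progress and want_interrupt are hypothetical names, and the choice of offering an interruption every few percent is only an illustration. TRUE and FALSE are defined locally in case no header provides them.

```c
#include <assert.h>
#include <stdio.h>

#ifndef TRUE
#define TRUE 1
#define FALSE 0
#endif

struct load_state {
    int step;                       /* offer an interruption every step percent */
    volatile int cancel_requested;  /* set elsewhere, e.g. by a user action */
};

/* Matches the progress_function shape: int (*)(int percent, void *data). */
int report_progress(int percent, void *data)
{
    struct load_state *st = data;

    assert(st != NULL && st->step > 0);
    fprintf(stderr, "\rloading taxonomy: %d%%", percent);
    return (percent % st->step == 0) ? TRUE : FALSE;
}

/* Matches the need_interrupt_function shape: int (*)(void *data). */
int want_interrupt(void *data)
{
    struct load_state *st = data;

    assert(st != NULL);
    return st->cancel_requested ? TRUE : FALSE;
}
```

A call would then pass report_progress with a pointer to a struct load_state as progress_arg, and want_interrupt with the same pointer as interrupt_arg.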
This call initializes the sp_tree, tid_to_rank and max_tid fields of the raa_db_access structure.
sp_tree[i] is the species tree node representing taxon of acnuc rank i. sp_tree[2] is the root of this tree.
tid_to_rank[i] is the acnuc rank of ncbi taxon ID i, for 0 ≤ i ≤ max_tid, or 0 if no such acnuc taxon exists.
typedef struct raa_node {
    char *name;          /* taxon name */
    char *libel;         /* taxon libel */
    char *libel_upcase;  /* taxon libel converted to upper case */
    int rank;            /* taxon acnuc rank */
    int tid;             /* taxon ncbi ID */
    int count;           /* number of seqs attached to taxon or below in database */
    struct raa_node *parent; /* taxon's parent in species tree, or NULL if node is species tree root */
    struct raa_pair {
        raa_node *value;        /* one descendant */
        struct raa_pair *next;  /* NULL or points to next descendant */
    } *list_desc; /* to chained list of taxon's descendants in species tree */
    /* taxon's next synonym, as a closed loop where a single member has parent != NULL */
    struct raa_node *syno;
} raa_node; /* one species tree node */
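The parent, syno and tid_to_rank conventions described above can be exercised with the following minimal sketch. attached_synonym, taxon_depth and rank_for_tid are hypothetical helpers, and the node type here is a trimmed local copy made only so that the sketch compiles on its own; with the library, raa_acnuc.h would provide the real type. Treating a NULL syno as "no synonym" is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed local copy of the node type (assumption, for illustration). */
typedef struct raa_node raa_node;
struct raa_pair {
    raa_node *value;
    struct raa_pair *next;
};
struct raa_node {
    char *name;
    int rank;
    int tid;
    struct raa_node *parent;
    struct raa_pair *list_desc;
    struct raa_node *syno;
};

/* Returns the member of node's synonym loop that has parent != NULL,
   or node itself when there is none (e.g. the tree root). */
raa_node *attached_synonym(raa_node *node)
{
    raa_node *p;

    assert(node != NULL);
    if (node->parent != NULL) return node;
    for (p = node->syno; p != NULL && p != node; p = p->syno) {
        if (p->parent != NULL) return p;
    }
    return node;
}

/* Number of parent links between a taxon (or its synonym) and the root. */
int taxon_depth(raa_node *node)
{
    int depth = 0;

    node = attached_synonym(node);
    while (node->parent != NULL) {
        node = node->parent;
        depth++;
    }
    return depth;
}

/* Bounds-checked lookup in a tid_to_rank style array; 0 means no taxon. */
int rank_for_tid(const int *tid_to_rank, int max_tid, int tid)
{
    assert(tid_to_rank != NULL);
    if (tid < 0 || tid > max_tid) return 0;
    return tid_to_rank[tid];
}
```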
int raa_prep_acnuc_query(raa_db_access *raa);
- raa: value of the remote acnuc connection
- return value: number of available lists on server or -1 if error
This call initializes the tot_key_annots, key_annots, key_annots_min and want_key_annots fields of the raa_db_access structure.
void raa_showannots(raa_db_access *raa, int seqnum, char **featurekey_name, int featurekey_count, int *new_choice, void (*outoneline)(char *, void *), void *pdata);
- raa: value of the remote acnuc connection
- seqnum: rank of target sequence
- featurekey_name: NULL or array of names of desired feature keys (e.g. CDS, TRNA) if only part of the feature table is targeted
- featurekey_count: 0 or number of elements in array featurekey_name
- *new_choice: (input) TRUE iff choice of desired annotation lines has changed since previous raa_showannots call; (output) FALSE
- outoneline: function that gets called for each matching annotation line with 2 arguments: the matching line and a pointer to some data
- pdata: NULL or data pointer transmitted as 2nd argument to outoneline function calls
Targeted annotation records are specified by the want_key_annots field of the raa_db_access structure: set want_key_annots[i] to TRUE iff annotation record named key_annots[i] is targeted. Alternatively, set want_key_annots[0] to TRUE to target all annotation records.
The featurekey_name array further restricts which parts of the feature table are targeted.
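An outoneline callback receives each matching annotation line together with the pdata pointer. The minimal sketch below only gathers statistics on the lines it is given. struct annot_stats and count_annot_line are hypothetical names, and the sketch does not assume anything about whether lines end with a newline.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct annot_stats {
    int lines;        /* number of annotation lines received */
    size_t chars;     /* total number of characters received */
    size_t longest;   /* length of the longest line */
};

/* Matches the outoneline shape: void (*)(char *, void *). */
void count_annot_line(char *line, void *pdata)
{
    struct annot_stats *st = pdata;
    size_t len;

    assert(line != NULL && st != NULL);
    len = strlen(line);
    st->lines++;
    st->chars += len;
    if (len > st->longest) st->longest = len;
}
```

Such a callback would be passed as outoneline, with the address of a zero-initialized struct annot_stats as pdata.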
void raa_showannots_list(raa_db_access *raa, int lrank, char **featurekey_name, int featurekey_count, void (*outoneline)(char *, void *), void *pdata);
- raa: value of the remote acnuc connection
- lrank: rank of target sequence list
- featurekey_name: NULL or array of names of desired feature keys (e.g. CDS, TRNA) if only part of the feature table is targeted
- featurekey_count: 0 or number of elements in array featurekey_name
- outoneline: function that gets called for each matching annotation line with 2 arguments: the matching line and a pointer to some data
- pdata: NULL or data pointer transmitted as 2nd argument to outoneline function calls
Be careful not to use this function with very large sequence lists because it cannot be interrupted.