dna_search v0.0.1 DNASearch.NCBI

Provides functions for querying the NCBI Nucleotide database.

Summary

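get_fasta(organism_name, options \\ [])
Returns a raw FASTA string containing DNA data associated with the organism

get_fasta_for_sequence_ids(id_strings, options \\ [])
Returns a raw FASTA string containing the sequences for the specified record IDs

get_sequence_ids(organism_name, options \\ [])
Returns a list of NCBI IDs (strings) for sequences associated with the organism

get_sequence_records(organism_name, options \\ [])
Queries NCBI for sequence records and returns a map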

get_fasta(organism_name, options \\ [])

Returns a raw FASTA string containing DNA data associated with the organism.

organism_name: name of the organism you’re interested in. works best as a species name, e.g. "Homo sapiens" rather than "human".
options (optional):
- limit (optional): number of records to include in the FASTA. default: 10, max: 50.
- start_at_record_index (optional): the index of the first record to return. default: 0 to return the first set of records.
- properties (optional): string specifying special properties to filter by. default: biomol_genomic, which restricts results to genomic sequences. see the NCBI documentation for the possible values of this field.
- timeout (optional): request timeout in milliseconds. default: 10_000 (10 seconds).
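
Because the result is one raw string, callers usually need to split it into records before doing anything else. A minimal sketch of that step, assuming the standard FASTA layout (a header line starting with ">" followed by one or more sequence lines). FastaSketch and parse/1 are hypothetical names and not part of dna_search.

```elixir
defmodule FastaSketch do
  @moduledoc "Hypothetical helper, not part of dna_search."

  @doc "Splits a raw FASTA string into a list of {header, sequence} tuples."
  def parse(raw) when is_binary(raw) do
    raw
    # Split into lines and drop empty ones.
    |> String.split(~r/\r?\n/, trim: true)
    |> Enum.reduce([], fn
      # A header line starts a new record.
      ">" <> header, acc -> [{String.trim(header), []} | acc]
      # Sequence lines that appear before any header are ignored.
      _line, [] -> []
      # Any other line belongs to the most recent record.
      line, [{header, chunks} | rest] -> [{header, [String.trim(line) | chunks]} | rest]
    end)
    |> Enum.reverse()
    |> Enum.map(fn {header, chunks} ->
      {header, chunks |> Enum.reverse() |> Enum.join()}
    end)
  end
end
```

If get_fasta returns the bare string described above rather than a tagged tuple, the result of DNASearch.NCBI.get_fasta("Homo sapiens", limit: 5) could be passed straight to FastaSketch.parse/1.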

get_fasta_for_sequence_ids(id_strings, options \\ [])

Returns a raw FASTA string containing the sequences for the specified record IDs.

id_strings: list of NCBI ID strings corresponding to sequence records
options (optional):
- timeout (optional): request timeout in milliseconds. default: 10_000 (10 seconds).
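
The IDs accepted here have the same shape as those returned by get_sequence_ids, so the two calls can be chained. A rough sketch of that flow, assuming both functions return bare values (a list and a string) as described on this page rather than tagged tuples. FetchByIdsSketch and its client argument are hypothetical; the argument exists only so the sketch can run against a stand-in module.

```elixir
defmodule FetchByIdsSketch do
  @moduledoc "Hypothetical helper, not part of dna_search."

  # `client` is any module that exposes get_sequence_ids/2 and
  # get_fasta_for_sequence_ids/2 with the shapes documented on this page.
  def fasta_for_organism(organism_name, client \\ DNASearch.NCBI, options \\ []) do
    ids = client.get_sequence_ids(organism_name, options)

    # Only :timeout is documented for the ID-based call, so pass just that.
    client.get_fasta_for_sequence_ids(ids, Keyword.take(options, [:timeout]))
  end
end
```

Splitting the lookup this way leaves room to inspect or filter the ID list before requesting the sequences.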

get_sequence_ids(organism_name, options \\ [])

Returns a list of NCBI IDs (strings) for sequences associated with the organism.

organism_name: name of the organism you’re interested in. works best as a species name, e.g. "Homo sapiens" rather than "human".
options (optional):
- limit (optional): number of records to include in the results. default: 10, max: 50.
- start_at_record_index (optional): the index of the first record to return. default: 0 to return the first set of records.
- properties (optional): string specifying special properties to filter by. default: biomol_genomic, which restricts results to genomic sequences. see the NCBI documentation for the possible values of this field.
- timeout (optional): request timeout in milliseconds. default: 10_000 (10 seconds).
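
Since limit is capped at 50, fetching more records means issuing several calls and advancing start_at_record_index each time. A minimal sketch that builds the option lists for such a sequence of calls. PagingSketch is a hypothetical name, and the total number of records wanted is supplied by the caller; this page does not say how to obtain a total from NCBI.

```elixir
defmodule PagingSketch do
  @moduledoc "Hypothetical helper, not part of dna_search."

  # Documented maximum for the :limit option.
  @max_limit 50

  @doc "Returns one keyword list of options per request needed to cover total_records."
  def page_options(total_records, limit \\ @max_limit)
      when is_integer(total_records) and total_records >= 0 and
             is_integer(limit) and limit > 0 do
    # Clamp the page size to the documented maximum.
    limit = min(limit, @max_limit)

    0
    |> Stream.iterate(&(&1 + limit))
    |> Enum.take_while(&(&1 < total_records))
    |> Enum.map(fn start ->
      # The last page may be shorter than the others.
      [limit: min(limit, total_records - start), start_at_record_index: start]
    end)
  end
end
```

Each returned keyword list can be passed as the options argument, for example by mapping over PagingSketch.page_options(120) and calling get_sequence_ids once per entry.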

get_sequence_records(organism_name, options \\ [])

Queries NCBI for sequence records and returns a map containing the following keys:

organism_name: name of the organism you’re interested in. works best as a species name, e.g. "Homo sapiens" rather than "human".
options (optional):
- limit (optional): number of records to include in the result set. default: 10, max: 50.
- start_at_record_index (optional): the index of the first record to return. default: 0 to return the first set of records.
- properties (optional): string specifying special properties to filter by. default: biomol_genomic, which restricts results to genomic sequences. see the NCBI documentation for the possible values of this field.
- timeout (optional): request timeout in milliseconds. default: 10_000 (10 seconds).