This page contains documentation and examples for our API, which can be used to query the POGO database directly.
A key feature of POGO is that users can query our database in bulk. Our web interface serves one particular use case, and we recognize that users may have needs it does not meet: they might have a different workflow, need to feed our data through a pipeline, or want to download large sets of data.
To address this, we let users not only download our entire database, but also query it directly for the information they are interested in.
POGO's database is internally represented by two tables. The data table contains comparison data between two genomes, and the taxonomy table contains taxonomic information about the genomes.
Our database's API returns JSON-formatted arrays or CSV files, and is queried with HTTP GET requests in a REST style. The API is loosely based on SQL SELECT statements, since we use MySQL as our database backend.
Query Basics
Users can query the website through the query URL, which the examples later in this document show as http://pogo.ece.drexel.edu/query.php.
Queries are made by specifying certain GET variables in the URL. For example, a query can ask the taxonomy table for all rows, returning only the columns species, genome, and family.
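As a rough sketch of how such a URL might be assembled in a script, the Python below builds a query from GET variables. The base URL and the `type` variable come from the example queries in this document; the spelling `select` and its comma-separated value format are assumptions, and `build_query_url` is a hypothetical helper, not part of POGO.

```python
from urllib.parse import urlencode, quote

# Base URL taken from the example queries later in this document.
BASE_URL = "http://pogo.ece.drexel.edu/query.php"


def build_query_url(table, **arguments):
    """Return a query URL for the given table ("data" or "taxonomy")."""
    params = {"type": table}
    # Leave out any argument the caller did not set.
    params.update({k: v for k, v in arguments.items() if v is not None})
    # quote_via=quote encodes spaces as %20; safe="," keeps commas readable.
    return BASE_URL + "?" + urlencode(params, safe=",", quote_via=quote)


# Assumption: the Select argument is spelled "select" and takes a
# comma-separated list of column names.
url = build_query_url("taxonomy", select="species,genome,family")
print(url)
```

Keeping the URL construction in one helper means every argument passes through the same encoding step, which matters once where statements with spaces and quotes are involved.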
To get more specific results, we need to tell the database to return only the rows that fit what we are interested in. As you saw above, we can tell the database which columns we are interested in, but we now need to tell it which rows we are interested in.
The Methods section below explains how to do just that.
Methods
There are three main methods that our API accepts: Type, Select, and Where. There are also other arguments, including Array, Limit, and Output.
The Type argument tells the API which table you are querying, and is always required when using the API. There are only two options, "data" and "taxonomy".
The taxonomy table contains information about the different genomes that were compared.
The data table contains data from the comparisons.
The Select argument allows you to choose which columns you are interested in. To see which columns are available, please refer to the Properties section of this document.
If no columns are selected, then all columns in the selected table are returned.
For example, a query can return the columns genus, species, ord, and superkingdom from the taxonomy table, or only the column id from the data table.
The Where argument allows you to filter the rows based upon a statement. These operators and statements should be familiar to anyone with rudimentary knowledge of logic or programming.
Examples of different where statements appear at the bottom of this document.
The operators we support are listed below:
|Operator||Description|
|!||NOT operator. It precedes other operators, as in !=|
|and||AND operator, used instead of &&|
|xor||Exclusive OR operator|
We also support other statements that allow users to do string comparisons.
|Statement||Description||Example|
|like(string)||Wrapper for MySQL LIKE||genus like('Chlamy')|
Select all columns from rows where the genus is Bacillus
Select all taxonomy where the genus contains 'Actino'
Select all data where the Genomic_Fluidity is over 90%
Select all data where the Genomic_Fluidity is over 90% or less than 20%
http://pogo.ece.drexel.edu/query.php?type=data&where=Genomic_Fluidity>.90 OR Genomic_Fluidity<.20
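Where statements contain spaces, quotes, and comparison signs, which are safer to percent-encode before they go into a URL. The sketch below does this with Python's standard library for the four examples above. `where_url` is a hypothetical helper, and mapping "contains" onto `like('Actino')` is an assumption based on the like(string) row above.

```python
from urllib.parse import urlencode, quote

# Base URL taken from the example queries in this document.
BASE_URL = "http://pogo.ece.drexel.edu/query.php"


def where_url(table, where):
    """Build a query URL whose where statement is percent-encoded."""
    # quote_via=quote turns spaces into %20 and > or < into %3E or %3C.
    query = urlencode({"type": table, "where": where}, quote_via=quote)
    return f"{BASE_URL}?{query}"


examples = {
    "genus is Bacillus": where_url("taxonomy", "genus='Bacillus'"),
    # Assumption: like('Actino') is how "contains" is expressed.
    "genus contains Actino": where_url("taxonomy", "genus like('Actino')"),
    "fluidity over 90%": where_url("data", "Genomic_Fluidity>.90"),
    "fluidity over 90% or under 20%": where_url(
        "data", "Genomic_Fluidity>.90 OR Genomic_Fluidity<.20"
    ),
}

for label, url in examples.items():
    print(f"{label}: {url}")
```

Browsers usually perform this encoding silently when a URL is pasted into the address bar, but scripts and command-line tools often do not.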
Warning: Order of Operations
Consider that the following statement could have multiple meanings: select all data where the Genomic_Fluidity is over 90% or less than 20% and the Average_Amino_Acid_Identity is over 90%.
Using parentheses, we can control the order of evaluation in a statement. As in math, evaluation proceeds from the innermost parentheses outward, which is also how most programming languages behave.
Here we select rows where Genomic_Fluidity is either over 90%, or is less than 20% with an Average_Amino_Acid_Identity (AAI) over 90%.
http://pogo.ece.drexel.edu/query.php?type=data&where=(Genomic_Fluidity > .90) or (Genomic_Fluidity < .20 AND Average_Amino_Acid_Identity > .90 )
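To see why the grouping matters, the two readings of the ambiguous statement can be written as ordinary Python predicates and applied to the same row. The row values here are made up purely for illustration.

```python
def fluidity_or_grouped_pair(fluidity, aai):
    """(fluidity > .90) or (fluidity < .20 and aai > .90)"""
    return (fluidity > 0.90) or (fluidity < 0.20 and aai > 0.90)


def either_fluidity_and_aai(fluidity, aai):
    """(fluidity > .90 or fluidity < .20) and aai > .90"""
    return (fluidity > 0.90 or fluidity < 0.20) and aai > 0.90


# Hypothetical row: high fluidity, low amino acid identity.
row = {"Genomic_Fluidity": 0.95, "Average_Amino_Acid_Identity": 0.50}

first = fluidity_or_grouped_pair(
    row["Genomic_Fluidity"], row["Average_Amino_Acid_Identity"]
)
second = either_fluidity_and_aai(
    row["Genomic_Fluidity"], row["Average_Amino_Acid_Identity"]
)

# The same row is kept by one reading and dropped by the other.
print(first, second)
```

Writing the parentheses explicitly in the where statement removes any dependence on how the backend resolves the ambiguity.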
The Limit argument allows you to specify the maximum number of results to return.
The Output argument allows you to specify whether you want CSV or JSON output. By default, a JSON array is returned.
The Array argument allows you to specify whether you want a JSON associative array or an indexed array, if you are using JSON as your output type.
This argument is optional, and the POGO database returns indexed (numerical) arrays by default.
For example, a query against the data table can request an associative array instead.
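The difference between the two array styles is easiest to see when parsing them. The sketch below uses two made-up response bodies; the actual shape of POGO's output is not shown in this document, so treat both strings and the column order as assumptions.

```python
import json

# Hypothetical response bodies; the real API output may be shaped differently.
indexed_body = '[[1, "Bacillus"], [2, "Chlamydia"]]'
associative_body = (
    '[{"id": 1, "genus": "Bacillus"}, {"id": 2, "genus": "Chlamydia"}]'
)

indexed_rows = json.loads(indexed_body)
associative_rows = json.loads(associative_body)

# Indexed rows are read by position, so the column order must be known.
first_genus_by_position = indexed_rows[0][1]

# Associative rows are read by column name.
first_genus_by_name = associative_rows[0]["genus"]

# With a known column order, indexed rows can be turned into named rows.
columns = ["id", "genus"]  # assumed to match the order of selected columns
rebuilt_rows = [dict(zip(columns, row)) for row in indexed_rows]

print(first_genus_by_position, first_genus_by_name, rebuilt_rows)
```

Indexed arrays are more compact for large downloads, while associative arrays are less error-prone when the set of selected columns changes.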
Properties
This section details the columns available in our data and taxonomy tables. Each column can be used in where statements and in the select argument.
Our taxonomy table is collected from NCBI with some small changes.
|Column||Description||Type|
|id||This is a unique identifier for the genome. genome_id1 and genome_id2 in the data table correspond to these values.||integer|
|genome||The name of the genome, which is also a unique identifier.||string|
|phylum||Phylum of genome.||string|
|class||Class of genome.||string|
|ord||Order of genome.||string|
|family||Family of genome.||string|
|genus||Genus of genome.||string|
|species||Species of genome.||string|
|superkingdom||Superkingdom of genome.||string|
The data (comparison) table contains all the information you see on the regular webpage, such as orthologs, 16S_rRNA, and other marker genes.
|Column||Description||Type|
|id||This is a unique identifier for the genome comparison.||integer|
|genome_id1, genome_id2||The id of one of the two genomes in the comparison.||string|
|number_of_genes1, number_of_genes2||Number of genes from the respective genome in the comparison.||integer|
|orthologs_criterion1, orthologs_criterion2||See the about page for more about ortholog criteria.||integer|
|Average_Amino_Acid_Identity||The Average Amino Acid Identity. See the about page for more.||float|
|Genomic_Fluidity||See the about page for more about Genomic Fluidity.||float|
|ArgS, CdsA, CoaE, etc.||Identities of other marker genes (besides 16S rRNA).||float|
|genome1_name, genome2_name||The name of the genome.||string|
|genome1_phylum, genome2_phylum||The phylum of the genome.||string|
|genome1_class, genome2_class||The class of the genome.||string|
|genome1_genus, genome2_genus||The genus of the genome.||string|
|genome1_species, genome2_species||The species of the genome.||string|
|genome1_superkingdom, genome2_superkingdom||The superkingdom of the genome.||string|
To get a tarball of BLAST files from our database, you need to query our download URL. This is done in the same way as a regular query.
The ids variable corresponds to the "id" column in the comparison table.
For example, a request can ask for a tarball containing the BLAST files from the comparisons with the ids 2354, 19201, and 623719.
Taxonomy Comparisons
Comparing genera and other taxonomic ranks is slightly more complicated, because there are two genomes in each comparison and we can never be sure which one is recorded as genome1 and which as genome2. You therefore need slightly more complex statements to select properly based on taxonomy. Here is a pseudo-code where statement showing how to correctly ask for all comparisons of A vs B:
(genome1_genus is A and genome2_genus is B) or (genome1_genus is B and genome2_genus is A)
One Genus vs Another
http://pogo.ece.drexel.edu/query.php?type=data&where=(genome1_genus='Bacillus' and genome2_genus='Chlamydia') or (genome1_genus='Chlamydia' and genome2_genus='Bacillus')
One Genus vs Itself
http://pogo.ece.drexel.edu/query.php?type=data&where=genome1_genus='Bacillus' and genome2_genus='Bacillus'
One Species vs All Others
http://pogo.ece.drexel.edu/query.php?type=data&where=genome1_species='Haemophilus influenzae' or genome2_species='Haemophilus influenzae'
One Species vs Itself
http://pogo.ece.drexel.edu/query.php?type=data&where=genome1_species='Haemophilus influenzae' and genome2_species='Haemophilus influenzae'
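The symmetric where statements in the examples above follow a fixed pattern, so they can be generated rather than typed by hand. The helper below is a minimal sketch of that idea; `versus_clause` is a hypothetical name, and it assumes the column pairs follow the genome1_/genome2_ naming listed in the data table.

```python
def versus_clause(rank, a, b):
    """Build a where statement matching comparisons of a vs b at one rank.

    rank is a column suffix such as "genus" or "species". Values are
    inserted as-is, so they must not contain single quotes.
    """
    forward = f"genome1_{rank}='{a}' and genome2_{rank}='{b}'"
    if a == b:
        # Comparing a group with itself needs only one ordering.
        return forward
    reverse = f"genome1_{rank}='{b}' and genome2_{rank}='{a}'"
    # Cover both orderings, since either genome may be listed first.
    return f"({forward}) or ({reverse})"


print(versus_clause("genus", "Bacillus", "Chlamydia"))
print(versus_clause("genus", "Bacillus", "Bacillus"))
```

The resulting string is the value of the where variable; it still needs to be percent-encoded before being placed in a URL.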