API Documentation

This page contains documentation and examples for our API, which can be used to query the POGO database directly.

Introduction
Organization
Query basics
Methods

Type
Select
Where
Limit
Output
Array

Properties

Taxonomy
Comparison Data
Blast Files

Examples

Introduction

A key feature of POGO is that users can query our database in bulk. Our web interface serves one particular use case, and we recognize that users may have needs that the website does not meet. They might have a different workflow, need to feed our data through a pipeline, or want to download large sets of data.

In order to remedy this, we provide users not only with the ability to download our entire database, but also with the ability to directly query the database for information they are interested in.

Organization

POGO's database is internally represented by two tables. The data table contains comparison data between two genomes, and the taxonomy table contains taxonomic information about the genomes.

Our database's API returns JSON-formatted arrays or CSV files and works through HTTP GET requests. The API is loosely modeled on SQL SELECT statements, since we use MySQL as our database backend.

Query Basics

Users can query the database using this URL:


http://pogo.ece.drexel.edu/query.php

Queries are made by specifying certain GET variables in the URL. The example below queries the taxonomy table for the columns species, genome, and family, returning at most 10 rows.


http://pogo.ece.drexel.edu/query.php?type=taxonomy&select=species,genome,family&limit=10
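For scripted use, the same URL can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_query_url and its parameters are hypothetical, and only the endpoint and the GET variable names (type, select, where, limit) come from this page.

```python
from urllib.parse import urlencode

# Endpoint taken from this page.
BASE_URL = "http://pogo.ece.drexel.edu/query.php"


def build_query_url(table, select=None, where=None, limit=None):
    """Assemble a query URL from the GET variables described on this page.

    Hypothetical helper: the function name and signature are illustrative.
    """
    params = {"type": table}
    if select:
        params["select"] = ",".join(select)
    if where:
        params["where"] = where
    if limit is not None:
        params["limit"] = str(limit)
    # safe="," leaves the comma-separated column list unescaped.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_query_url("taxonomy", select=["species", "genome", "family"], limit=10)
print(url)
```

The printed URL is the same as the example above, so it can be passed to any HTTP client.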

To get more specific results, we need to tell the database to return only the rows that fit what we are interested in. As you saw above, we can tell the database which columns we are interested in, but we now need to tell it which rows we are interested in.

The Methods section below explains how to do just that.

Methods

There are three main methods that our API accepts: Type, Select, and Where. There are also other arguments, including Limit, Output, and Array.

Type

The Type argument tells the API which table you are querying, and is always required when using the API. There are only two options, "data" and "taxonomy".

The taxonomy table contains information about the different genomes that were compared.

Taxonomy Example: http://pogo.ece.drexel.edu/query.php?type=taxonomy&limit=10

The data table contains data from the comparisons.

Data Example: http://pogo.ece.drexel.edu/query.php?type=data&limit=10

Select

The Select argument allows you to choose which columns you are interested in. To know what columns are available please refer to the Properties section of this document.

If no columns are specified, then all columns in the selected table are returned.

This example returns the columns genus, species, ord, and superkingdom from the taxonomy table: http://pogo.ece.drexel.edu/query.php?type=taxonomy&select=genus,species,ord,superkingdom&limit=10

This example returns the id column from the data table: http://pogo.ece.drexel.edu/query.php?type=data&select=id&limit=10

Where

The Where argument allows you to filter the rows based upon a statement. These operators and statements should be familiar to anyone with rudimentary knowledge of logic or programming.

At the bottom of this document are examples of different where statements.

The operators we support are listed below.

Operator	Explanation
=	equal
<	less than
>	greater than
!	not; this operator precedes others, as in !=
and	AND operator (use instead of &&)
or	OR operator
xor	Exclusive OR operator

We also support other statements that allow users to do string comparisons.

String Comparison	Explanation	Usage
like(string)	wrapper for MySQL LIKE	genus like('Chlamy')

Examples

Select all columns from rows where the genus is Bacillus: http://pogo.ece.drexel.edu/query.php?type=taxonomy&where=genus='Bacillus'

Select all taxonomy rows where the genus contains 'Actino': http://pogo.ece.drexel.edu/query.php?type=taxonomy&where=genus like('Actino')

Select all data where the Genomic_Fluidity is over 90%: http://pogo.ece.drexel.edu/query.php?type=data&where=Genomic_Fluidity>.90

Select all data where the Genomic_Fluidity is over 90% or less than 20%: http://pogo.ece.drexel.edu/query.php?type=data&where=Genomic_Fluidity>.90 OR Genomic_Fluidity<.20

Warning: Order of Operations

Consider that the following request could have multiple meanings: select all data where Genomic_Fluidity is over 90% or less than 20% and Average_Amino_Acid_Identity is over 90%.

Using parentheses, we can control the order of evaluation in a statement. As in mathematics, expressions are evaluated from the innermost parentheses outward, which is also the convention in most programming languages.

Here we select rows where Genomic_Fluidity is either over 90%, or is less than 20% with an Average_Amino_Acid_Identity over 90%: http://pogo.ece.drexel.edu/query.php?type=data&where=(Genomic_Fluidity > .90) or (Genomic_Fluidity < .20 AND Average_Amino_Acid_Identity > .90 )
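Where statements such as the one above contain spaces, parentheses, and comparison signs, which many HTTP clients expect to be percent-encoded before the request is sent. The Python sketch below shows one way to do that with the standard library. It assumes the server decodes standard percent-encoding, which this page does not state explicitly.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint taken from this page.
BASE_URL = "http://pogo.ece.drexel.edu/query.php"

# The parenthesized statement from the order-of-operations example.
where = ("(Genomic_Fluidity > .90) or "
         "(Genomic_Fluidity < .20 and Average_Amino_Acid_Identity > .90)")

# quote_via=quote encodes spaces as %20 instead of "+".
url = BASE_URL + "?" + urlencode({"type": "data", "where": where}, quote_via=quote)
print(url)

# Decoding the query string again recovers the original statement unchanged.
decoded = parse_qs(urlsplit(url).query)
```

Because the round trip through parse_qs returns the original statement, the encoding step does not alter the meaning of the query.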

Limit

The Limit argument allows you to specify the maximum number of results to return.


    http://pogo.ece.drexel.edu/query.php?type=data&limit=1000

Output

The Output argument allows you to specify if you want CSV or JSON output. By default a JSON array will be returned.


    http://pogo.ece.drexel.edu/query.php?type=data&limit=1000&output=csv
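A client typically needs to parse whichever format it requested. The Python sketch below handles both cases with the standard library. The helper parse_response is hypothetical, and the two sample bodies are invented for illustration: this page does not show real response bodies or say whether CSV output includes a header row.

```python
import csv
import io
import json


def parse_response(body, output="json"):
    """Turn a response body into a list of rows (hypothetical helper)."""
    if output.lower() == "csv":
        return list(csv.reader(io.StringIO(body)))
    return json.loads(body)


# Invented sample bodies; real responses may be shaped differently.
sample_json = '[["Bacillus", "Bacillus subtilis"]]'
sample_csv = "Bacillus,Bacillus subtilis\r\n"

json_rows = parse_response(sample_json)
csv_rows = parse_response(sample_csv, output="csv")
print(json_rows)
print(csv_rows)
```

Note that csv.reader returns every field as a string, so numeric columns from CSV output would need to be converted by the caller.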

Array

The Array argument allows you to specify whether you want a JSON associative array or an indexed array, if you are using JSON as your output type. For more information read this link.

This argument is optional, and the POGO database will return numerical arrays by default.

Option	Explanation
ASSOC	Associative Array
NUM	Numerical Array

This is an example of returning an associative array from the data table:


		http://pogo.ece.drexel.edu/query.php?type=data&output=JSON&array=ASSOC&limit=10
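If you request the default numerical arrays but still want to address values by column name, the rows can be paired with the column names on the client side. The Python sketch below assumes that each numerical row lists its values in the same order as the columns given in the select argument. This page does not confirm that ordering, and the sample row is invented.

```python
def rows_to_dicts(rows, columns):
    """Pair each value in a numerical row with its column name.

    Hypothetical helper; assumes values arrive in `columns` order.
    """
    return [dict(zip(columns, row)) for row in rows]


# Columns as they would be passed in the select argument.
columns = ["genus", "species"]

# Invented numerical-array response for illustration.
num_rows = [["Bacillus", "Bacillus subtilis"]]

records = rows_to_dicts(num_rows, columns)
print(records[0]["species"])
```

Requesting array=ASSOC avoids this step, at the cost of repeating the column names in every row of the response.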

Properties

This section details the columns available in our data and taxonomy tables. Each column can be used in where statements, and in the select arguments.

Taxonomy Table

The data in our taxonomy table is collected from NCBI, with some small changes.

Column Name	Description	Type
id	This is a unique identifier for the genome. genome_id1 and genome_id2 in the data table correspond to these values.	integer
genome	The name of the genome, which is also a unique identifier.	string
phylum	Phylum of genome.	string
class	Class of genome.	string
ord	Order of genome.	string
family	Family of genome.	string
genus	Genus of genome.	string
species	Species of genome.	string
superkingdom	Superkingdom of genome.	string

Comparison Data

The comparison table contains all the information you see on the regular webpage, like orthologs, 16S_rRNA, and other marker genes.

Column Name	Description	Type
id	This is a unique identifier for the genome comparison.	integer
genome_id1, genome_id2	An id of one of the two genomes in the comparison.	string
number_of_genes1, number_of_genes2	Number of genes from respective genome in comparison.	integer
orthologs_criterion1, orthologs_criterion2	See the about page for more about ortholog criteria.	integer
Average_Amino_Acid_Identity	The Average Amino Acid Identity. See the about page for more.	float
Genomic_Fluidity	See the about page for more about Genomic Fluidity.	float
16S_rRNA	16S_rRNA identity.	float
ArgS, CdsA, CoaE, etc.	Identities of other marker genes (besides 16S rRNA).	float
genome1_name, genome2_name	The name of the genome.	string
genome1_phylum, genome2_phylum	The phylum of the genome.	string
genome1_class, genome2_class	The class of the genome.	string
genome1_genus, genome2_genus	The genus of the genome.	string
genome1_species, genome2_species	The species of the genome.	string
genome1_superkingdom, genome2_superkingdom	The superkingdom of the genome.	string

Blast files

To get a tarball of blast files from our database, you need to query our download URL. This is done in the same way as a regular query.

The ids variable corresponds to the "id" column in the comparison table.

This example requests a tarball containing blast files from the comparisons with the ids 2354, 19201, and 623719.


    http://pogo.ece.drexel.edu/download.php?ids=2354,19201,623719
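When the comparison ids come from an earlier query, the download URL can be generated from a list. This is a minimal Python sketch; the helper name build_download_url is hypothetical, and only the endpoint and the ids variable come from this page.

```python
# Download endpoint taken from this page.
DOWNLOAD_URL = "http://pogo.ece.drexel.edu/download.php"


def build_download_url(comparison_ids):
    """Join comparison ids into the comma-separated ids variable.

    Hypothetical helper; int() rejects values that are not whole numbers.
    """
    ids = ",".join(str(int(i)) for i in comparison_ids)
    return f"{DOWNLOAD_URL}?ids={ids}"


url = build_download_url([2354, 19201, 623719])
print(url)
```

The resulting URL matches the example above and can be fetched with any HTTP client to save the tarball.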

Examples

Taxonomy Comparisons

Comparing genera and other taxonomic ranks is slightly more complicated, because each comparison involves two genomes and we can never be sure which one is stored as genome1 and which as genome2. You therefore need slightly more complex statements to select properly by taxonomy. Here is a pseudo-code where statement showing how to correctly ask for all comparisons of A vs B:

 if (genome1_genus is A and genome2_genus is B)
 OR
 if (genome1_genus is B and genome2_genus is A)

 >>> Then show me the results

One Genus vs Another


    http://pogo.ece.drexel.edu/query.php?type=data&where=(genome1_genus='Bacillus' and genome2_genus='Chlamydia') or (genome1_genus='Chlamydia' and genome2_genus='Bacillus')

One Genus vs Itself


    http://pogo.ece.drexel.edu/query.php?type=data&where=genome1_genus='Bacillus' and genome2_genus='Bacillus'

One Species vs All Others


    http://pogo.ece.drexel.edu/query.php?type=data&where=genome1_species='Haemophilus influenzae' or genome2_species='Haemophilus influenzae'

One Species vs Itself


    http://pogo.ece.drexel.edu/query.php?type=data&where=genome1_species='Haemophilus influenzae' and genome2_species='Haemophilus influenzae'
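The where statements in the examples above follow a regular pattern, so they can be generated rather than typed by hand. The Python sketch below is one possible approach. The helper names versus_clause and versus_all_clause are hypothetical, and the code does not escape quote characters inside the names, so it is only suitable for trusted input.

```python
def versus_clause(rank, a, b):
    """Build a where statement for A vs B, whichever genome holds which.

    `rank` is a column suffix such as "genus" or "species".
    """
    left, right = f"genome1_{rank}", f"genome2_{rank}"
    if a == b:
        # A taxon compared with itself needs only one condition pair.
        return f"{left}='{a}' and {right}='{a}'"
    return (f"({left}='{a}' and {right}='{b}') or "
            f"({left}='{b}' and {right}='{a}')")


def versus_all_clause(rank, a):
    """Build a where statement for A on either side of the comparison."""
    return f"genome1_{rank}='{a}' or genome2_{rank}='{a}'"


print(versus_clause("genus", "Bacillus", "Chlamydia"))
print(versus_clause("genus", "Bacillus", "Bacillus"))
print(versus_all_clause("species", "Haemophilus influenzae"))
```

Each generated statement can be supplied as the where variable of a type=data query.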