Tag Archives: API

Dealing with UniProt: the Python programming interface.

My thesis is mostly focused on proteins, so you can be sure that I got really familiar with UniProt. UniProt comes with a strong user interface that eases any approach. Biochemists can easily find what they need, just as a bioinformatician can set up scripts to obtain information directly from the database. In fact, UniProt has a very good programming interface, usable from all the main programming languages, and very well explained in a detailed official tutorial. To find it, you can click this link, or you can search Google with the query “uniprot programmatically”. For me, it was quite complicated to find this page by browsing the website, since the UniProt documentation is huge (and I have no patience for this).

For instance, I have to retrieve a bunch of helix-turn-helix transcription factors in FASTA format. I’ve got a text file with one ID per line and must load the sequences as SeqRecord objects using Biopython’s Bio.SeqIO module. It is quite easy indeed; the whole game is in the following function:

import urllib,urllib2

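def getseq(ID, extension):
    # Build the entry address: base URL + ID + "." + file type (e.g. "fasta")
    base_url = "http://www.uniprot.org/uniprot/"
    url = base_url + ID + "." + extension
    req = urllib2.Request(url)
    response = urllib2.urlopen(req)
    # The body of the response comes back as a plain string
    return response.read()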

Please consider that I still can’t figure out how to display ‘\t’ tabs in WordPress, so check the indentation if you ever use this. The value returned by response.read() can be handled as a string. You can call the function once per ID and write the results to a text file, which can then be parsed with the SeqIO.parse() method. Note that the function itself only uses urllib2 (a Python 2 module); the urllib import matters only if you also need helpers such as urllib.urlencode to build queries. The amazing thing about UniProt is that programmatic access to the website is made easy by the very simple organization of the database. If you know the ID, you just have to append the file type you need as an extension and build a web address from it.
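To make the whole workflow concrete, here is a minimal sketch of how it could look in Python 3, where urllib2 no longer exists and urllib.request takes its place. It is only an illustration under a few assumptions: the helper names (build_url, read_ids, fetch_text, fetch_records) are mine, the base URL is the one used above and may be redirected by UniProt to a newer address, and Biopython has to be installed for the parsing step.

```python
from io import StringIO
from urllib.request import urlopen

# Base URL as used in this post; UniProt may redirect it to a newer address.
BASE_URL = "http://www.uniprot.org/uniprot/"


def build_url(entry_id, extension="fasta", base_url=BASE_URL):
    """Return the address of one entry: base URL + ID + '.' + file type."""
    return base_url + entry_id + "." + extension


def read_ids(lines):
    """Collect one ID per line, skipping blank lines and stray whitespace."""
    return [line.strip() for line in lines if line.strip()]


def fetch_text(url):
    """Download a URL and decode the body to text (needs network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def fetch_records(ids, fetch=fetch_text):
    """Fetch each ID as FASTA and parse it into Biopython SeqRecord objects."""
    from Bio import SeqIO  # third-party: requires Biopython to be installed

    records = []
    for entry_id in ids:
        fasta_text = fetch(build_url(entry_id))
        # StringIO lets SeqIO.parse read the string as if it were a file
        records.extend(SeqIO.parse(StringIO(fasta_text), "fasta"))
    return records
```

With a file of IDs (say a hypothetical ids.txt, one ID per line), opening it and passing the handle through read_ids() and then fetch_records() should give back a list of SeqRecord objects, with no intermediate text file needed. The fetch argument is there only so the download step can be swapped out, for example to test the parsing offline.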

Learn Ensembl API on EBI's website. An open workshop.

The Ensembl database comes with a strong and well documented Application Programming Interface. I must say that I find Ensembl much better than other databases such as UniProt or NCBI, which seem more oriented towards providing good graphical interfaces than towards making programmers’ lives easier. That makes them very good tools for wet lab biologists who need a bit of bioinformatics, but a slippery slope for bioinformaticians. But this is just my own preference.

If you happen to visit Biostar’s homepage, you will notice a link at the top directing you to an open workshop for learning the basics of the Ensembl API. This will take you to a very detailed (and quite pleonastic) description of the course. There’s no need to apply or register. After leading you through several introductory pages, the website takes you to a bunch of video classes you’ll enjoy for sure.

The Ensembl API is written in Perl, and this is despicable to me since I am a proud Python fanboy. Note that this course won’t teach you the basics of Perl, so some Perl knowledge is necessary.

So, if you want to learn the Ensembl API, the links provided here will definitely help.