
Cool GO annotation visualizations with Gephi + Bio4j


Hi everyone!

After a few months without a chance to play with Gephi, it was high time to dedicate a lab day to it.
I thought a good feature would be to offer the equivalent .gexf file for the graph currently shown in the “GoAnnotation Graph Viz” tab, so that you could open it in Gephi and adapt it to your specific needs.
So I got down to work, and this is what happened:

First of all, I was really happy to see that there was a new version of Gephi (0.8), as well as a good bunch of new (at least for me… :D ) layout algorithm plugins such as Parallel Force Atlas, Circular Layout and Layered Layout. Once I had downloaded and installed everything, I started having some fun with it and getting to know how filters work (I hadn’t used them before).
Even though I got stuck a couple of times trying to figure out how to use some of them, these small setbacks were easy to solve thanks to the great support in the Gephi forums, where my newbie questions were answered quickly. Thanks, Gephi team!

As a source for the graph, I used the public EHEC GO annotations we produced for the E. coli O104:H4 Genome Analysis Crowdsourcing we coordinated last summer, and I chose the Molecular Function sub-ontology for the visualization.

When I first loaded the .gexf file in Gephi without applying any filters, this is what I got:

As you can (maybe) see, the size of each GO term node is proportional to the number of proteins it annotates; still, it pretty much looks like a big hairball…
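For readers who want to see the sizing rule spelled out, here is a minimal Python sketch of "node size proportional to the number of annotated proteins". The annotation pairs, the `scale` factor and the function name `term_sizes` are all hypothetical; this is not how Bio4j GO Tools computes sizes internally, only an illustration of the idea.

```python
from collections import Counter

# Hypothetical (protein, GO term) annotation pairs; the protein IDs are made up.
annotations = [
    ("P1", "GO:0003677"),
    ("P2", "GO:0003677"),
    ("P3", "GO:0003677"),
    ("P1", "GO:0005524"),
]


def term_sizes(pairs, scale=5.0):
    """Return a node size per GO term, proportional to its protein count.

    Duplicate (protein, term) pairs are counted once.
    `scale` is an arbitrary illustrative factor, not a Gephi setting.
    """
    counts = Counter(term for _protein, term in set(pairs))
    return {term: scale * n for term, n in counts.items()}


sizes = term_sizes(annotations)
for term, size in sorted(sizes.items()):
    print(term, size)
```

With this rule, a term annotating three proteins is drawn three times as large as a term annotating one.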

Then I applied the following set of filters:

The goal was to keep the GO terms with at least 6 protein annotations, plus the proteins annotated with these terms (their neighborhoods). This is what it looked like after applying the Parallel Force Atlas layout algorithm:
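The filtering logic itself (keep GO terms with at least 6 annotated proteins, then add their neighboring proteins) can also be sketched outside Gephi. The snippet below is only an approximation in plain Python, under the assumption that the graph is a list of protein-to-term edges; the function name `filter_terms` and the sample identifiers are hypothetical, and it does not use or reproduce Gephi's own filter implementation.

```python
from collections import defaultdict


def filter_terms(pairs, min_proteins=6):
    """Keep GO terms with at least `min_proteins` proteins, plus those proteins.

    `pairs` is an iterable of (protein, go_term) edges.
    Returns (kept_terms, kept_proteins, kept_edges).
    """
    proteins_by_term = defaultdict(set)
    for protein, term in pairs:
        proteins_by_term[term].add(protein)

    # Step 1: threshold on the number of distinct proteins per term.
    kept_terms = {
        term for term, proteins in proteins_by_term.items()
        if len(proteins) >= min_proteins
    }

    # Step 2: add the neighborhood, i.e. every protein attached to a kept term.
    kept_proteins = set()
    for term in kept_terms:
        kept_proteins |= proteins_by_term[term]

    kept_edges = [(p, t) for p, t in pairs if t in kept_terms]
    return kept_terms, kept_proteins, kept_edges


# Made-up sample: one term with six proteins, one with only two.
sample = [(f"P{i}", "GO:A") for i in range(6)] + [("P0", "GO:B"), ("P9", "GO:B")]
terms, proteins, edges = filter_terms(sample)
print(sorted(terms), len(proteins), len(edges))
```

Note that a protein such as `P0` survives because it is attached to a kept term, while its edge to the discarded term is dropped.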

I then decided to get rid of the protein labels, since there were far too many of them and they were not very useful; for that I used the option ‘Hide nodes/edges labels if not in filtered graph’.
After doing this and applying the black background preview setting, the visualization finally looked pretty decent:

Please go here to check the version exported with the Seadragon plugin, where you can zoom and move around!

Well, if you like the result (or you don’t, but you want to play with this and get a better viz!), I have just uploaded a new version of the Bio4j GO Tools viewer where you can download the corresponding .gexf file for your GO annotations XML file.
Just press the button highlighted in the screenshot and enter the URL for your GO annotations XML file:

(You can use the public EHEC GO annotation results URL I used as a sample for this post: https://s3-eu-west-1.amazonaws.com/pablo-tests/EHECAnnotationVersion2.xml )
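Once you have a .gexf file, you can also take a quick look at it without opening Gephi, since GEXF is plain XML. The sketch below counts nodes and edges using only Python's standard library. The embedded sample document is hand-written for illustration (GEXF 1.2draft namespace, made-up protein ID); the exact node IDs, labels and attributes in the file produced by Bio4j GO Tools are an assumption here and may well differ.

```python
import xml.etree.ElementTree as ET

# Hand-written minimal GEXF document, for illustration only.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="undirected">
    <nodes>
      <node id="GO:0003677" label="DNA binding"/>
      <node id="P1" label="hypothetical protein 1"/>
    </nodes>
    <edges>
      <edge id="0" source="P1" target="GO:0003677"/>
    </edges>
  </graph>
</gexf>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_gexf(text):
    """Return (node_labels_by_id, edge_list) for a GEXF document given as a string."""
    root = ET.fromstring(text)
    labels = {}
    edge_list = []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "node":
            labels[element.get("id")] = element.get("label")
        elif name == "edge":
            edge_list.append((element.get("source"), element.get("target")))
    return labels, edge_list


labels, edge_list = summarize_gexf(SAMPLE_GEXF)
print(len(labels), "nodes,", len(edge_list), "edges")
```

Matching on the local tag name rather than a fixed namespace keeps the sketch working if the file declares a different GEXF version.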

So, that’s all for now. Please let me know if you play around with this and get some cool visualizations!

@pablopareja

Source: http://blog.bio4j.com/2011/11/cool-go-annotation-visualizations-with-gephi-bio4j/#comment-274

