Cool GO annotation visualizations with Gephi + Bio4j
After a few months without a chance to play with Gephi, it was about time to dedicate a lab day to it.
I thought a good feature would be to offer the equivalent .gexf file for the graph currently shown in the “GoAnnotation Graph Viz” tab, so that you could load it in Gephi and adapt it to your specific needs.
Then I got down to work and this is what happened:
First of all, I was really happy to see that there was a new version of Gephi (0.8), as well as a good bunch of new (at least for me…) layout algorithm plugins such as Parallel Force Atlas, Circular Layout and Layered Layout. Once I had downloaded and installed everything, I started to have some fun with it and to get to know how filters work (I hadn’t used them before).
Even though I got stuck a couple of times trying to figure out how to use some of them, I easily solved these small setbacks thanks to the great support in the Gephi forums, where my newbie questions were answered quickly. Thanks, Gephi team!
As the source for the graph I used the public EHEC GO annotations we produced for the E. coli O104:H4 Genome Analysis Crowdsourcing we coordinated last summer, and I chose the Molecular Function sub-ontology for the visualization.
When I first loaded the gexf file in Gephi without applying any kind of filters this is what I got:
As you (maybe) can see, the size of GO term nodes is proportional to the number of proteins they annotate; still it pretty much looks just like a big hair-ball…
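To make the sizing rule concrete, here is a minimal Python sketch of one way to derive node sizes that are proportional to the number of proteins annotating each GO term. It is only an illustration: the function name `term_sizes`, the `max_size` parameter and the placeholder identifiers are assumptions made for this example, not something taken from Bio4j or Gephi.

```python
from collections import Counter


def term_sizes(annotations, max_size=100.0):
    """Return a size per GO term, proportional to its protein count.

    `annotations` is an iterable of (protein_id, go_term_id) pairs.
    The most annotated term gets `max_size` (a hypothetical parameter),
    and every other term is scaled linearly against it.
    """
    # Deduplicate pairs so a protein is counted once per term.
    counts = Counter(term for _protein, term in set(annotations))
    if not counts:
        return {}
    largest = max(counts.values())
    return {term: max_size * n / largest for term, n in counts.items()}


# Placeholder identifiers, not real annotation data.
sample = [
    ("protein_1", "term_a"),
    ("protein_2", "term_a"),
    ("protein_1", "term_b"),
]
sizes = term_sizes(sample)
```

With this sample, `term_a` (two proteins) ends up twice as large as `term_b` (one protein).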
Then I applied the following set of filters:
They keep the GO terms with at least 6 protein annotations, plus the proteins annotated with those terms (their neighborhoods). This is what it looked like after applying a Parallel Force Atlas layout:
I then decided to get rid of the protein labels, since there were far too many of them and they were not very useful to see; for that I used the option ‘Hide nodes/edges labels if not in filtered graph’.
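Outside Gephi, the same selection can be approximated in a few lines of Python. This is a rough analogue under stated assumptions, not Gephi's own filter implementation: it assumes the graph is given as (protein, GO term) pairs only, and the function name `filter_terms_with_neighbors` and the sample identifiers are invented for this example.

```python
def filter_terms_with_neighbors(edges, min_proteins=6):
    """Keep GO terms with at least `min_proteins` proteins, plus those proteins.

    `edges` is an iterable of (protein_id, go_term_id) pairs.
    Returns (kept_terms, kept_proteins, kept_edges).
    """
    edges = list(edges)

    # Group the distinct proteins attached to each GO term.
    proteins_by_term = {}
    for protein, term in edges:
        proteins_by_term.setdefault(term, set()).add(protein)

    # Terms that pass the threshold.
    kept_terms = {
        term for term, proteins in proteins_by_term.items()
        if len(proteins) >= min_proteins
    }

    # Their neighborhoods: every protein attached to a kept term.
    kept_proteins = set()
    for term in kept_terms:
        kept_proteins |= proteins_by_term[term]

    # Only edges that end in a kept term survive.
    kept_edges = [(p, t) for p, t in edges if t in kept_terms]
    return kept_terms, kept_proteins, kept_edges


# Placeholder data: term_a has six proteins, term_b only two.
sample_edges = [(f"protein_{i}", "term_a") for i in range(1, 7)]
sample_edges += [("protein_1", "term_b"), ("protein_7", "term_b")]
terms, proteins, edges = filter_terms_with_neighbors(sample_edges)
```

Here `term_b` is dropped because it has fewer than 6 proteins, and `protein_7` disappears with it since it is not attached to any remaining term.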
After doing this and applying the black background preview setting, the visualization finally looked pretty decent:
Please go here to check the version exported with the Sea Dragon plugin, where you can zoom and move around!
Well, if you like the result (or you don’t but you want to play with this and get a better viz!), I just uploaded a new version of Bio4j GO Tools viewer where you can download the corresponding .gexf file for your GO annotations XML file.
Just press the button highlighted in the screenshot and enter the URL for your GO annotations XML file:
(You can use the public EHEC GO annotation results URL I used as a sample for this post: https://s3-eu-west-1.amazonaws.com/pablo-tests/EHECAnnotationVersion2.xml )
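If you are curious about what a .gexf file looks like inside, the sketch below builds a tiny one with Python's standard library. It follows the basic GEXF 1.2 layout (a `graph` element holding `nodes` and `edges`), but it is not the code the Bio4j GO Tools viewer uses, and the file it produces will be much simpler than the real export. The function name `build_gexf` and the sample identifiers are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"


def build_gexf(nodes, edges):
    """Return a minimal GEXF 1.2 document as a string.

    `nodes` is an iterable of (node_id, label) pairs and
    `edges` an iterable of (source_id, target_id) pairs.
    """
    # Serialize GEXF elements without a namespace prefix.
    ET.register_namespace("", GEXF_NS)

    def tag(name):
        return f"{{{GEXF_NS}}}{name}"

    root = ET.Element(tag("gexf"), {"version": "1.2"})
    graph = ET.SubElement(
        root, tag("graph"), {"mode": "static", "defaultedgetype": "undirected"}
    )

    nodes_el = ET.SubElement(graph, tag("nodes"))
    for node_id, label in nodes:
        ET.SubElement(nodes_el, tag("node"), {"id": node_id, "label": label})

    edges_el = ET.SubElement(graph, tag("edges"))
    for index, (source, target) in enumerate(edges):
        ET.SubElement(
            edges_el,
            tag("edge"),
            {"id": str(index), "source": source, "target": target},
        )

    return ET.tostring(root, encoding="unicode")


# Placeholder graph: one GO term annotated by two proteins.
gexf_text = build_gexf(
    nodes=[("term_a", "term_a"), ("protein_1", "protein_1"), ("protein_2", "protein_2")],
    edges=[("protein_1", "term_a"), ("protein_2", "term_a")],
)
```

Writing `gexf_text` to a file with a `.gexf` extension should give you something Gephi can open, although node sizes, colors and other attributes are left out of this sketch.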
So, that’s all for now, please let me know if you play around with this and get some cool visualizations!