For this week’s 5-Minute Interview, I chatted with Irina Balaur, Postdoctoral Researcher at the European Institute for Systems Biology and Medicine in Lyon, France. I caught up with Irina at GraphConnect Europe last April.
Here’s what we talked about:
What Project Are You Currently Working On?
Irina Balaur: I’m using Neo4j as part of the eTRIKS project, which is a large European collaboration between research, academia and pharmaceutical companies to develop a knowledge management system for translational medicine. The goal is to facilitate the development of tools that can be used easily by clinicians. I’m using Neo4j to develop the components and framework as part of a multi-scale approach in systems biology and medicine.
Why Did You Choose Neo4j?
Irina: It was the best solution we were able to find for data management and integration, and we are still very happy with it. Biomedical data is heterogeneous and highly connected, and we needed to infer new associations among concepts. For example, if we have a protein that is a biomarker for a disease and a drug that targets that protein, we can infer a relationship between the drug and the disease.
Neo4j is a great tool for data integration. We use algorithmic methods that are either already provided by the Neo4j framework or that we develop ourselves in different programming languages.
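To make the drug-protein-disease inference concrete, here is a minimal Python sketch. It does not use Neo4j or any eTRIKS data: the node names (`DrugA`, `ProteinX`, `DiseaseY`) and the relationship types (`TARGETS`, `BIOMARKER_FOR`) are hypothetical, chosen only to illustrate how chaining two known relationships suggests a third.

```python
from collections import defaultdict

# Hypothetical toy edges as (source, relationship, target) triples.
# None of these names come from the interview or from a real dataset.
EDGES = [
    ("DrugA", "TARGETS", "ProteinX"),
    ("ProteinX", "BIOMARKER_FOR", "DiseaseY"),
    ("DrugB", "TARGETS", "ProteinZ"),  # ProteinZ has no known disease link
]


def infer_drug_disease(edges):
    """Return sorted (drug, protein, disease) triples found by chaining
    a TARGETS edge with a BIOMARKER_FOR edge through a shared protein."""
    # Index diseases by the protein that is a biomarker for them.
    diseases_by_protein = defaultdict(set)
    for source, rel, target in edges:
        if rel == "BIOMARKER_FOR":
            diseases_by_protein[source].add(target)

    # For each drug target, look up the diseases linked to that protein.
    inferred = set()
    for source, rel, target in edges:
        if rel == "TARGETS":
            for disease in diseases_by_protein.get(target, ()):
                inferred.add((source, target, disease))
    return sorted(inferred)


if __name__ == "__main__":
    for drug, protein, disease in infer_drug_disease(EDGES):
        print(f"{drug} may be associated with {disease} (via {protein})")
```

An inferred link like this is only a candidate association to investigate further, not an established finding. In a graph database the same idea is usually expressed as a two-step path pattern rather than hand-written loops.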
What Were Some Surprising Results or Moments?
Irina: For my PhD project, I was using an epigenetic interdependency framework as the database, a basic network that took a large amount of programming to develop. After I discovered Neo4j, it took only a couple of hours of work to move the initial database of genetic and epigenetic interdependencies into the graph database. I’m really happy with Neo4j because it makes inferring associations much easier. I wish I had learned about Neo4j four or five years ago.
My colleagues are really happy with the software as well. For example, we have integrated data on human metabolic reconstruction in Neo4j. One colleague was interested in exploring the subnetworks, neighborhoods and shortest paths between different metabolites of interest to him, and he found it really helpful. Other colleagues are really pleased with Cypher because it’s a declarative query language.
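For readers unfamiliar with neighborhood and shortest-path exploration, the following Python sketch shows both on a tiny in-memory network. It is an illustration only, not the team's method: the adjacency list `NETWORK` is a hypothetical, heavily simplified toy, and the helper names `shortest_path` and `neighborhood` are invented for this example. A graph database would normally run these traversals for you.

```python
from collections import deque

# Hypothetical undirected toy network of metabolites (simplified on purpose).
NETWORK = {
    "glucose": ["G6P"],
    "G6P": ["glucose", "F6P", "6PG"],
    "F6P": ["G6P", "FBP"],
    "FBP": ["F6P"],
    "6PG": ["G6P"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search: return one shortest path as a list of nodes,
    or None if either node is unknown or no path exists."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # maps each visited node to its predecessor
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def neighborhood(graph, node, radius=1):
    """Return the set of nodes within `radius` hops of `node`, excluding it."""
    seen = {node}
    frontier = {node}
    for _ in range(radius):
        frontier = {n for f in frontier for n in graph.get(f, ())} - seen
        seen |= frontier
    return seen - {node}


if __name__ == "__main__":
    print(shortest_path(NETWORK, "glucose", "FBP"))
    print(sorted(neighborhood(NETWORK, "G6P")))
```

Breadth-first search suffices here because every edge counts as one hop; weighted networks would need a different algorithm such as Dijkstra's.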
If You Could Go Back, Is There Anything You Would Have Done Differently?
Irina: It’s not necessarily different. What I need now is to connect output from Neo4j with other tools, using formats like JSON that are well established in the systems biology community. We need more tools to share networks and do more data analytics, or some type of interface that would help clinicians communicate with biologists.