Computational Biology: Simulating Science, Accelerating Discovery

Jha explains that computational biology allows researchers to simulate biological processes in the same way the aviation industry uses models to simulate flight conditions. "You cannot test your plane in every condition; it's a very expensive and slow process," he notes. Similarly, researchers can test and refine models before applying them in real-world experiments, making the process more efficient.

By iteratively refining these models based on experimental feedback, scientists can better understand intricate biological mechanisms. "If the model is good, you can make some predictions, go back, and test it. If not, you improve your model," Jha illustrates.
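The predict, test, and improve cycle Jha describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-parameter dose-response model, the measurements, the tolerance, and every function name are hypothetical and do not come from Jha or Elucidata.

```python
# Minimal sketch of a predict -> test -> refine loop.
# Everything here is hypothetical: a one-parameter model (response = rate * dose)
# and made-up "experimental" measurements used only for illustration.

def predict(rate, doses):
    """Model prediction for each dose."""
    return [rate * dose for dose in doses]

def mean_abs_error(predicted, observed):
    """Average absolute gap between model predictions and measurements."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

def refine(rate, doses, observed, step):
    """One gradient step on the mean squared error with respect to rate."""
    grad = sum(2 * (rate * d - o) * d for d, o in zip(doses, observed)) / len(doses)
    return rate - step * grad

def fit_by_iteration(doses, observed, rate=0.5, tolerance=0.2, step=0.01, max_rounds=1000):
    """Keep improving the model until its predictions match the experiment."""
    for round_number in range(1, max_rounds + 1):
        error = mean_abs_error(predict(rate, doses), observed)
        if error <= tolerance:
            # The model is good enough: use it to make predictions.
            return rate, round_number
        # The model is not good enough: improve it and test again.
        rate = refine(rate, doses, observed, step)
    return rate, max_rounds

doses = [1.0, 2.0, 3.0, 4.0]
observed = [2.1, 3.9, 6.2, 7.8]  # hypothetical measurements
rate, rounds = fit_by_iteration(doses, observed)
print(f"fitted rate {rate:.2f} after {rounds} rounds")
```

Real biological models have many more parameters and far noisier measurements, but the control flow is the same: compare predictions with experimental feedback, and refine only while the gap is too large.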

The Trillion-Dollar Problem: Bridging the Gap Between Tech and Life Sciences

One of the most critical differences between tech companies and life sciences is the sheer volume of data available to each. "They think they have a trillion-dollar problem, but they have hundreds of ten-million-dollar problems," states Jha. As he explains, the entire paradigm must shift to fit a world where many valuable problems exist, rather than focusing on a single, enormously valuable one.

Tech giants like Google benefit from access to billions of users generating data daily, providing them with an immense pool of information to fuel AI models. In contrast, life sciences data—while equally critical—remains far scarcer and harder to obtain. For instance, even large clinical trials for widespread conditions like breast cancer typically involve only a few thousand patients, a scale that pales in comparison to the vast data sources available to tech companies.

"There's a significant disparity," Jha points out. "How much data about my purchasing habits is available versus my health data? We don't have access to that, nor is it consolidated in one single place."

Clean Data Over Big Data: The Key to AI Success

Data cleaning and preparation are often underestimated in AI workflows, particularly in life sciences. Abhishek Jha compares this to "technical debt," where rushing to amass data without proper structuring leads to long-term challenges. For instance, when one dataset records "hepatic carcinoma" and another records "liver cancer" for the same condition, researchers can spend months standardizing the data before any analysis begins.
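As a rough illustration of what that standardization involves, the sketch below maps variant disease labels onto one canonical term. The synonym table, field names, and sample records are all hypothetical. A real pipeline would draw on a curated ontology rather than a hand-written dictionary, and this is not a description of how Polly works.

```python
from collections import Counter

# Hypothetical synonym table, written by hand for illustration only.
CANONICAL_TERMS = {
    "liver cancer": "liver cancer",
    "hepatic carcinoma": "liver cancer",
    "hepatic cancer": "liver cancer",
}

def normalize(label):
    """Lowercase the label and collapse repeated whitespace."""
    return " ".join(label.lower().split())

def harmonize(records, field="disease"):
    """Return copies of the records with canonical labels, plus a count
    of normalized labels that had no entry in the synonym table."""
    harmonized = []
    unmapped = Counter()
    for record in records:
        key = normalize(record[field])
        canonical = CANONICAL_TERMS.get(key)
        if canonical is None:
            unmapped[key] += 1   # flag for manual review
            canonical = key      # keep the normalized label as-is
        harmonized.append({**record, field: canonical})
    return harmonized, unmapped

# Hypothetical records from two differently labeled datasets.
records = [
    {"sample_id": "S1", "disease": "Liver Cancer"},
    {"sample_id": "S2", "disease": "hepatic  carcinoma"},
    {"sample_id": "S3", "disease": "Breast cancer"},
]

clean, unmapped = harmonize(records)
for record in clean:
    print(record["sample_id"], record["disease"])
print("needs review:", dict(unmapped))
```

Reporting unmapped labels instead of silently dropping them keeps the cleanup auditable, which is where much of the manual effort in standardization tends to go.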

"Clean data is always more valuable than more data," Jha emphasizes. "Investing in a thoughtful data strategy is crucial for AI success." Without standardized terminology and organized datasets, life sciences risk delaying breakthroughs and limiting the potential of AI-driven innovation.

About Abhishek Jha

Abhishek Jha is the Co-Founder and CEO of Elucidata, a trailblazing biotech company recognized as one of Fast Company's Most Innovative Biotech Companies of 2024. Under his leadership, Elucidata has transformed biological discovery by leveraging data-centric approaches and cutting-edge AI and ML innovations.

Elucidata's flagship platform, Polly, accelerates research by harmonizing multi-modal biomedical data—such as Omics, Assay, Real-World, Clinical, EHR, and CRO data—into a Unified Data Model. Polly's LLM-powered curation enables 10x faster data preparation with 99.99% accuracy, drastically reducing the time to actionable insights and advancing precision medicine.

Before founding Elucidata in 2015, Abhishek spent nearly six years as a Senior Scientist at Agios Pharmaceuticals, contributing to the development of four first-in-class drugs. His experience includes designing algorithms to integrate high-throughput metabolic data with transcriptional data and deploying visualization tools for clinical applications.

Abhishek completed a postdoctoral fellowship at MIT, where he developed computational models in immunology. His career has been defined by a passion for combining technology and science to improve patient outcomes. Beyond his professional pursuits, Abhishek enjoys discussing innovations in biotech and mentoring the next generation of entrepreneurs.

