Intrigued by the potential of genomics to answer fundamental questions of science and disease, I started my graduate work at Harvard Medical School while bioinformatics and computational biology were still emerging fields. Instead, I found myself engrossed in the even more nascent field of proteomics, which held the additional promise of a direct understanding of the complex world of protein interactions that guide the cell but are only hinted at in its DNA.
However, much work remained to be done to develop the computational methods needed to make biological sense of the raw, numerical data. It was here that I discovered my passion for engineering the systems that power our new -omics-driven world, and for doing so at the throughput that today’s science demands. At Genentech, my group develops new algorithms to analyze data from cutting-edge proteomic and related analytical methods, along with the data platforms required to analyze, organize, and visualize this data at scale.
Nat Biotechnol. 2016 Aug 9;34(8):811-3.
Bakalarski CE, Gan Y, Wertz I, Lill JR, Sandoval W.
Within the Computational Proteomics and Analytical Data Science group, our team of scientists and engineers works at the intersection of biology, technology, and computer science.
Our computational group’s work spans three major areas of impact. First, we work collaboratively with colleagues in diverse therapeutic disciplines to design, execute, and analyze proteomic and other high-content experiments that address specific drug discovery research topics. Second, our group develops the novel algorithms and tools essential to analyze data generated by cutting-edge technologies developed at Genentech and beyond.
Lastly, we apply modern software engineering practices to automate data analysis within our own analytical data platform, where we leverage ontologies and semantic representations to build a unified knowledge graph of proteomic data that can be mined for many applications, including machine learning approaches.