ERF 1043


Distinguished Lecturer Series

Sorin Istrail

Brown University

Eric Davidson’s “Regulatory Genome” for Computer Science


In his book The Regulatory Genome: Gene Regulatory Networks (GRN) in Development and Evolution (Academic Press 2006), Eric Davidson, the foremost experimentalist of regulatory genomics, forcefully reminds us that in the scientific method, causality is everything; all other approaches are just distractions. Davidson, a notoriously elegant writer, offers devastating criticism of the “posterior Biology” approaches all too impatiently employed today: “measure first” the expression of thousands of genes and then “computationally infer Biology.” The last century’s luminaries of mathematical statistics taught us in no uncertain terms that causality cannot be inferred from statistical tables. Davidson aligns with them, adding to their argument a practical dose of reality. The exquisite regulatory mechanisms, locked down by evolution, can only be revealed through systematic experimental perturbations. In the absence of ocean-deep “prior Biology” knowledge, no amount of clustering statistics, or other skinny deep dives, would be able to infer “Biology.” Like his mentor Max Delbrück, and with the sea urchin genome in hand, Eric Davidson became the leading liberator of quantitative principles of cell regulation, trapped in the qualitative, descriptive world of biology without genomic sequence.


In this talk we will discuss several computer science problems inspired by our 15-year-long collaboration with Professor Davidson, who died in 2015, and rooted in his seminal research on causality, completeness, genomic Boolean logic, and genomically encoded regulatory information. Our collaboration produced the CYRENE cisGRN-Lexicon database, containing the regulatory architecture of 600+ transcription-factor-encoding genes and other regulatory genes in eight species (human, mouse, fruit fly, sea urchin, nematode, rat, chicken, and zebrafish), and the CYRENE cisGRN-Browser, a full genome browser dedicated to cis-regulatory genomics.

Professor Davidson’s legacy consisted of 400+ papers and six books; he mentored about 300 Ph.D.s, postdocs and faculty in his laboratory in the Division of Biology at California Institute of Technology. He was also a beacon of critical discourse. In this spirit, our presentation will include some critical comments about “computational systems biology considered harmful” avenues. As our beloved teacher and mentor, Davidson united us (biologists, physicists, biochemists, engineers, mathematicians and computer scientists, as in his Caltech laboratory) in a research renaissance movement in the quest for the functional meaning of DNA. From such research will ultimately come, by experimental demonstration, the revelation of the much-sought laws of regulatory Biology.


Sorin Istrail is the Julie Nguyen Brown Professor of Computational and Mathematical Sciences and Professor of Computer Science at Brown University. He is the former Director of the Center for Computational Molecular Biology at Brown University. Before joining Brown, he was the Senior Director and then Head of Informatics Research at Celera Genomics, where his group played a central role in the construction of the sequence of the human genome; they co-authored the 2001 Science paper “The Sequence of the Human Genome,” which, with over 12,000 citations to date, is one of the most cited scientific papers. His group at Celera Genomics also built a powerful suite of genome-wide algorithms that was used for the comparison of all human genome assemblies to date. In 2002 his Celera group, in collaboration with the company ClearForest, won the ACM KDD Cup, the top international data mining/machine learning competition; the challenge that year was the automatic annotation of a section of the Drosophila genome. In 2003 he joined the ranks of Applied Biosystems Science Fellows, one of just six Science Fellows in a company of 800 scientists. Before Celera, Professor Istrail founded and led the Computational Biology Project at Sandia National Laboratories (1992-2000). In 2000, he obtained the negative solution (computational intractability) of a 50-year-old unresolved problem in statistical mechanics, the Three-Dimensional Ising Model Problem. This work was included in the Top 100 Most Important Discoveries of the U.S. Department of Energy’s first 25 years, and as the 7th top achievement of DOE in Advanced Scientific Computing. Professor Istrail’s research focuses on computational molecular biology, human genetics and genome-wide association studies, medical bioinformatics of autism, multiple sclerosis, HIV, preterm labor and viral immunology, algorithms and computational complexity, and statistical physics.
In 2014 he co-founded the “Grigore Moisil” Institute for Computer Science and Applications at the University “Alexandru Ioan Cuza” Iasi, Romania. He is Editor-in-Chief of the Journal of Computational Biology and, together with Pavel Pevzner and Mike Waterman, co-founder of the Annual International Conference on Research in Computational Molecular Biology (RECOMB) series; he is also co-Editor of the MIT Press Computational Molecular Biology series and of the Springer-Verlag Lecture Notes in Bioinformatics series. He is Professor Honoris Causa of the “Alexandru Ioan Cuza” University, Iasi, Romania.

Sorin Istrail obtained his PhD in Computer Science from the University of Bucharest, with his beloved professors Solomon Marcus and Sergiu Rudeanu as his PhD advisors. He did postdoctoral studies in computer science with Professor Albert Meyer of the Laboratory for Computer Science, MIT, and postdoctoral studies in molecular biology with Professor Eric Davidson of the Division of Biology at California Institute of Technology.