Research

We want to understand how different cell states emerge in diverse biological systems (viruses, bacteria, and mammalian cells) and to use this knowledge to generate desired, and even new, cell states in a dish. To do so, we develop new technologies to measure the expression levels of thousands of genes in single cells. These high-throughput measurements create two challenges. First, the resulting data sets are high-dimensional and difficult to interpret. To understand the data, we use tools from differential geometry and machine learning. Second, high-throughput measurements destroy the cells and provide only static snapshots. To obtain information about the dynamics, we use synthetic biology to engineer cells to record their histories in their own DNA. Finally, we develop organoid systems and microfluidic platforms to control cell states in vitro using the understanding obtained from the high-throughput data.

 

Cancer biology

Dysregulation of transitions between mammalian cell states can lead to diseases such as cancer. We have developed a method to read out somatic mutations alongside the transcriptome of individual cells using commercial droplet-based single-cell sequencing platforms. We have applied this technology to profile cells obtained from the bone marrow of patients with a type of blood cancer called myeloproliferative neoplasm (MPN), and measured how the differentiation trajectory of the mutated blood cells deviates from that of the wild-type cells. We also reconstructed the lineage tree of individual blood stem cells using the patterns of spontaneous somatic mutations accrued in their genomes over time. Strikingly, in two newly diagnosed MPN patients, we found that the cancer mutation occurred in a single blood stem cell several decades before disease diagnosis: at age 9 in a 34-year-old patient, and at age 19 in a 63-year-old patient. Our findings raise the possibility of detecting and eradicating the mutated cells before cancer ever emerges.
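To give a flavor of how shared somatic mutations can encode a lineage tree, here is a minimal Python sketch. It is a toy illustration, not the analysis pipeline used in the work described above. It assumes each mutation arises only once and is never lost, so the sets of cells carrying each mutation must be nested or disjoint. The cell names, mutation names, and function names are all hypothetical.

```python
from itertools import combinations


def carriers(genotypes):
    """Map each mutation to the set of cells that carry it."""
    result = {}
    for cell, mutations in genotypes.items():
        for mutation in mutations:
            result.setdefault(mutation, set()).add(cell)
    return result


def is_tree_like(genotypes):
    """True if every pair of carrier sets is nested or disjoint.

    Under the assumption that each mutation arises once and is never
    lost, this is the condition for the data to fit a single tree.
    """
    sets = list(carriers(genotypes).values())
    return all(
        a <= b or b <= a or a.isdisjoint(b)
        for a, b in combinations(sets, 2)
    )


def clades(genotypes):
    """Return the distinct carrier sets, largest first; each is a clade."""
    unique = {frozenset(s) for s in carriers(genotypes).values()}
    return sorted(unique, key=lambda s: (-len(s), sorted(s)))


# Hypothetical example: four cells and the mutations detected in each.
example = {
    "c1": {"m1", "m2"},
    "c2": {"m1", "m2"},
    "c3": {"m1", "m3"},
    "c4": {"m4"},
}

if __name__ == "__main__":
    print(is_tree_like(example))
    for clade in clades(example):
        print(sorted(clade))
```

In this toy data, mutation m1 is shared by c1, c2, and c3, which places them in one clade, while m2 further groups c1 with c2. Real data would also require handling sequencing errors and missing calls, which this sketch ignores.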

 

Synthetic biology

Recording histories of cells in their own DNA. A fundamental challenge in modern biology is to reveal the developmental lineage tree of a multicellular organism. John Sulston accomplished this feat for the worm C. elegans by observing its development and hand-drawing its lineage tree. Is it possible to map the lineage history of trillions of cells without directly observing every division? At Caltech, we developed a synthetic platform called MEMOIR for individual cells to autonomously record their lineage history in their own DNA. Working with Fernando Camargo's lab, we recently extended this framework to record the lineage histories of single cells in vivo in engineered mice. Our platform uses Cas9 to generate heritable mutations in synthetic target arrays that are transcribed and read out in individual cells using single-cell RNA sequencing. We have used our CARLIN (CRISPR array repair lineage tracing) mice to study the behavior of individual blood stem cells.
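The recording idea can be illustrated with a small simulation. The Python sketch below is a toy model under simplifying assumptions, not the MEMOIR or CARLIN implementation: each cell carries an array of target sites, each unedited site has a fixed chance of acquiring a random, irreversible edit at every division, and daughters inherit all edits of their parent. The edit probability, the number of possible alleles, and all names are hypothetical choices made for illustration.

```python
import random

UNEDITED = 0  # hypothetical marker for an unmodified target site


def divide(barcode, rng, edit_prob=0.2, n_alleles=50):
    """Return two daughter barcodes derived from a parent barcode.

    Existing edits are inherited unchanged; each still-unedited site
    may acquire a new, irreversible edit with probability edit_prob.
    """
    daughters = []
    for _ in range(2):
        child = list(barcode)
        for i, site in enumerate(child):
            if site == UNEDITED and rng.random() < edit_prob:
                child[i] = rng.randint(1, n_alleles)
        daughters.append(tuple(child))
    return daughters


def grow(n_sites=10, generations=4, seed=0):
    """Grow a clone from one unedited founder cell; return all barcodes."""
    rng = random.Random(seed)
    cells = [tuple([UNEDITED] * n_sites)]
    for _ in range(generations):
        cells = [d for cell in cells for d in divide(cell, rng)]
    return cells


def shared_edits(a, b):
    """Count sites at which two barcodes carry the same edit."""
    return sum(1 for x, y in zip(a, b) if x != UNEDITED and x == y)


if __name__ == "__main__":
    cells = grow()
    print(len(cells), "cells after 4 generations")
    print("edits shared by the first two cells:", shared_edits(cells[0], cells[1]))
```

In this model, closely related cells tend to share more edits than distant ones, because shared edits were acquired in a common ancestor. That nesting of edits is what allows a lineage tree to be inferred from barcodes read out at a single endpoint.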

 

Technology development

Single-cell RNA sequencing of bacteria. Single-cell RNA sequencing of mammalian cells has revolutionized our understanding of cell states in mammalian development and disease. Working with Adam Rosenthal, we have recently developed a method to sequence the transcriptome of individual bacterial cells. Our method uses DNA probes and leverages existing commercial microfluidic platforms (10X). We used this method to correctly identify known cell states and to uncover previously unreported cell states across different types of bacteria and growth conditions. Our high-throughput, highly resolved single-cell transcriptomic platform can be broadly used to study heterogeneity in microbial populations, with applications to the microbiome and antibiotic resistance.

 

Theory and computation

We use a diverse set of theoretical and computational tools to model and understand cell states. For example, we have used first-principles biophysical models to understand how mechanical forces on DNA can regulate gene expression. We have also used concepts from differential geometry to understand the geometry of single-cell gene expression data. We are actively developing machine learning approaches for understanding high-dimensional biological data sets using mathematical tools such as graph theory.

 

Controlling cell states in vitro

Ultimately, our goal is to use the knowledge from our measurements and models to control cell states in a dish. To do so, we develop new technology platforms for manipulating and observing cells in vitro. For example, we have developed microfluidic platforms to observe the cell cycle of individual mammalian cells across tens of generations. We also develop 3D culture systems of stem cells to accurately mimic embryonic development. We have successfully developed the first 3D organoid model derived from human pluripotent stem cells that recapitulates the development of somites in vitro. We are using these systems to generate cell states that can be transplanted into patients to treat disease.