Current Research Projects

NIH – R01 “Quantifying Molecular Consequences of Human Missense Variants with Large-Scale Interactome Perturbation Studies,” joint with Dr. Haiyuan Yu of the Weill Institute for Cell and Molecular Biology and Department of Biological Statistics and Computational Biology at Cornell.

The dramatic increase of DNA variants discovered through advances in sequencing technologies has been inadequately translated into therapeutic successes.  Although many of these variants are related to human disorders, the overwhelming number of non-functional variants makes the assessment of functional significance a steep challenge.  In this study, we aim to develop a high-throughput pipeline to quickly clone and directly test a large number of coding variants for their impact on the human interactome network and use the results to build a machine learning pipeline to predict functional impact of all coding variants, in anticipation that both our experimental data and computational pipeline will lead to broad clinical and therapeutic applications.

NIH – R01 “Genetic Transmission of Components of the Human Gut Microbiome,” joint with Dr. Ruth Ley of the Max Planck Institute for Developmental Biology; Dr. Timothy Spector of the Department of Genetic Epidemiology at King’s College London; and Dr. Ilana Brito of the Meinig School of Biomedical Engineering at Cornell.

Despite the importance of both variation in the human genome and variation in the gut microbiome to human health, there is currently little knowledge connecting the two. But it is likely that variation in the human genome can result in differences in the composition of the gut microbiota with potential impact on disease outcomes. Results obtained from the proposed research will bridge this knowledge gap, and will ultimately be used to improve the lifestyles of individuals suffering from common diseases such as obesity and diabetes, and to develop preventative measures to mitigate the manifestation of disease.

NIH – R01 “Heterochromatin and Satellite Repeat Sequence Variation in Natural Populations,” joint with Dr. Daniel Barbash.

This project aims to apply novel bioinformatics approaches and experimental designs to quantitatively describe and to understand the process of turnover of heterochromatic satellite DNA sequences in the genome. Focusing on the genomes of Drosophila species, we will make use of analysis of both short repeated “words” or “kmers” and complex satellite structures in the genome sequences of inbred lines of diverse species and of mutation-accumulation lines. We will model satellite changes as a Gaussian process, and score their meiotic behavior by testing for departures from Mendelian segregation by genome sequencing of backcross progeny.
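
As a rough illustration of two ideas mentioned above, the following minimal Python sketch counts overlapping k-mers in a sequence and tests allele counts in backcross progeny for departure from the 1:1 ratio expected under Mendelian segregation. It is not the project's actual pipeline: the function names, the toy sequence and the choice of a one-degree-of-freedom chi-square test are assumptions made for illustration only.

```python
from collections import Counter
from math import erfc, sqrt


def count_kmers(sequence, k):
    """Count every overlapping k-mer in a DNA sequence (hypothetical helper)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def segregation_test(n_allele_a, n_allele_b):
    """Chi-square test (1 df) for departure from a 1:1 segregation ratio.

    Returns the chi-square statistic and its p-value. For one degree of
    freedom the chi-square survival function equals erfc(sqrt(x / 2)).
    """
    total = n_allele_a + n_allele_b
    expected = total / 2
    chi2 = ((n_allele_a - expected) ** 2 + (n_allele_b - expected) ** 2) / expected
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Toy satellite-like sequence; real input would be sequencing reads.
    print(count_kmers("AATATAATAT", 5).most_common(1))
    # Illustrative progeny counts carrying each of the two parental alleles.
    print(segregation_test(60, 40))
```

In practice k-mer counts would be normalized for sequencing depth before comparing lines, and many segregation tests across the genome would call for a multiple-testing correction; both steps are omitted here.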

NIH – R01 subcontract with University of Arizona “Reference-Quality Drosophila Genome Assemblies for Evolutionary Analysis of Previously Inaccessible Genomic Regions,” joint with Dr. Rod Wing of the University of Arizona and Dr. Manyuan Long of the University of Chicago.

Dr. Clark’s role on this project is to investigate the evolution of previously inaccessible regions of the genome of Drosophila, including piRNA clusters, heterochromatic repeats and the Y chromosome. The Clark lab will generate inbred lines and their F1 hybrids of Drosophila melanogaster and several other species of Drosophila to analyze the polymorphism and evolution of piRNA clusters and of heterochromatic repeats. The whole-genome library construction and sequencing will be done at the University of Arizona by the lab of Dr. Rod Wing, and piRNA and miRNA libraries will be constructed and sequenced at Cornell. Much of the effort in the Clark lab will be in computational analysis of these sequences, producing and testing the assemblies, and developing models of evolutionary divergence of these genome elements (piRNA clusters and heterochromatin). The lab will also examine the structure and organization of Y chromosomal genes and repeats. The results will be shared with Drs. Manyuan Long (Chicago) and Rod Wing (University of Arizona) in this collaborative effort.

NIH – R01 “Regulation of Gamete Use and Neural Pathways in Reproduction,” joint with Dr. Mariana Wolfner.

Using Drosophila melanogaster as a model for male- and female-derived proteins that interact after mating and prior to fertilization, this project aims to test the roles of candidate genes for this process by a series of knockdown experiments tested across a range of natural variation. Aim 1 considers genes that are expressed in the female, while Aim 2 is focused on genes expressed in the male.   The project has significance to understanding the molecular nature of mating interactions, and we anticipate that the results will be relevant to idiopathic infertility in humans, which appears to arise from a reproductive incompatibility between the particular pair of individuals involved.

NIH – R01 “Population Genetic Inferences from Dense Genotype Data,” joint with  and Dr. Rasmus Nielsen of UC Berkeley.  (Renewal Pending)

This project aims to understand the population-level forces at play on the human genome by analysis of genome-wide SNP data and next-generation sequences using newly developed statistical methods.  Estimation of model parameters from alignments of next-generation sequence reads will be done so as to accommodate base-calling uncertainty, and segment-wise inference of ancestry in admixed genomes will be applied to understand past admixture history. Identity-by-descent methods will be pursued to allow the most reliable inferences about demography, natural selection and other population forces acting on human genetic variation.

NIH – R01 “Population Genetic Consequences of Explosive Population Growth in Humans,” joint with Dr. Alon Keinan, Cornell; Dr. Yun Song of UC Berkeley; and Dr. John Novembre, Univ. of Chicago.

This project will develop methods of population genetic analysis to understand the role of recent rapid population expansion in shaping patterns of variation in human populations. Improved methods for genetic inference in the face of such rapid growth will be developed, correcting the misapplication of standard methods which were developed for stable populations. Rapid population expansion dramatically inflates the abundance of rare variants in the population, and the impact of this on the genetic architecture of human disease risk will be quantified.
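
To make the constant-size baseline concrete, the short Python sketch below computes the expected proportion of segregating sites at each derived-allele count under the standard neutral model of a stable population, where the expected number of sites with derived count i is proportional to 1/i. It only illustrates the reference point against which an excess of rare variants would be measured; the function name is hypothetical and the sketch does not represent the methods to be developed in the project.

```python
def neutral_sfs_proportions(n):
    """Expected site frequency spectrum for a sample of n chromosomes.

    Assumes the standard neutral model with constant population size,
    under which the expected number of segregating sites with derived-allele
    count i (i = 1 .. n-1) is theta / i. Returns proportions, so theta
    cancels out.
    """
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    props = neutral_sfs_proportions(10)
    # First entry is the expected share of singletons under constant size.
    print(f"expected singleton proportion: {props[0]:.3f}")
```

An observed singleton proportion well above this baseline is one signature consistent with recent population growth, although other forces such as purifying selection or sequencing error can produce a similar excess.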

NIH – R01 “Functional & Comparative Genomics of Drosophila Immunity,” joint with Dr. Brian Lazzaro.

This project aims to experimentally define and quantitatively model the regulation of the innate immune response, emphasizing the balance of cis-acting and trans-acting genetic variation and the role of regulatory microRNAs. A well-characterized reference set of Drosophila melanogaster genetic lines will be exploited for experimental work and for development of the quantitative model. This project is aimed at understanding how genetic variation in populations mediates individual differences in the efficacy of immune defense against microbial infection.