Linnen, Catherine
Catherine Linnen - A&S, Biology
Linnen Lab Research Overview and Computational needs (Fall 2013)
PI: Catherine Linnen
Postdoc: Dr. Matthew Niemiller
PhD Students: Robin Bagley, Kim Duong, John Terbot
Lab Manager: Adam Leonberger
Graduate: Ashleigh N Glover
Current Undergraduates (Fall 2013): Alonna Ballinger, Mary Collins, Katherine Harper, Melanie Hurst, Ahmad Nawaz, and Taylor Shackleford
Current Undergraduates (Fall 2020): Jeremy Davis, Ryan Ridenbaugh - Added 08/03/2020
Research Overview
Genomics of Adaptation and Speciation
Research in my lab is motivated by a desire to understand the mechanisms responsible for generating biodiversity. Biologists have long sought to understand how changes in DNA sequences (genotype) give rise to changes in the appearance and/or behavior of organisms (phenotype). To date, most of our successes in linking genotype to phenotype have been in a handful of model organisms in the lab (e.g., worms, yeast, fruit flies, and lab mice); however, recent advances in sequencing technologies and novel statistical methods are now making it possible to study this connection in a wide range of organisms in nature. This allows us to understand not only how genetic variation contributes to phenotypic variation, but also how evolutionary processes such as natural selection act on this variation to shape its distribution in space and time in the wild.
Using an integrative approach that combines lab and fieldwork and applies the tools of molecular phylogenetics, ecological genetics, and population genomics, I am linking genotype and phenotype in two systems: the deer mouse, Peromyscus maniculatus, and the pine-sawfly genus Neodiprion (order Hymenoptera; family Diprionidae). Deer mice and sawflies share several features that make them excellent study organisms, including: they are abundant and easy to find in nature; they can be kept under controlled laboratory conditions; crosses can be made between divergent populations and species; and they have many traits that are variable within and between species. To gain insight into how organisms adapt to their environment and why this process sometimes leads to the formation of new species, I specifically focus on those traits that impact the ability of individuals to survive and reproduce (e.g., cryptic coat color in deer mice) and/or contribute to reproductive barriers between populations and species (e.g., host use in sawflies). My long-term research goals in both systems are to: (1) identify the genes, mutations, and molecular mechanisms responsible for phenotypic differences between populations and species, (2) measure the strength of natural selection acting on those phenotypes and their underlying genes, (3) reconstruct the histories of ecologically important genes (gene trees) and the populations in which they occur (population/species trees), and (4) determine the extent to which the course of evolution varies across different organisms and traits.
Assembly of N. lecontei and N. pinetum draft genome
Description:
Neodiprion lecontei and Neodiprion pinetum are sister species that differ in a number of ecologically relevant traits, including host use and color. We are currently working to build genomic resources that will help us identify the genetic changes responsible for these differences. First, we are generating a de novo genome assembly using Illumina HiSeq data. Second, we are assembling draft transcriptomes (Project 2).
Personnel:
Kim Duong, Adam Leonberger, Matt Niemiller, Danielle Herrig, Catherine Linnen
Cole Railey, Added 03/05/2021
Kathryn M Everson, Added on LCC cluster 05/31/2023
Alaine C Hippee, PostDoc, Added on MCC cluster 08/08/2023
Jakeb N Wattrs, Added on MCC cluster 08/08/2023
Computational methods:
read processing, genome assembly, read mapping, and annotation.
Software:
fastx_toolkit, ea-utils, Jellyfish, Quake, SOAP_ec, Reptile, FLASH, SOAPdenovo2, Allpaths, BWA, SOAP.coverage, Maker, Augustus, GeneMarkES, SNAP, RepeatModeler, RepeatMasker, NCBI BLAST, BLAST2GO
Software availability: all programs are accessible on DLX EXCEPT fastx_toolkit (installation needs libgtextutils-0.6), Maker (installation needs CPAN and BioPerl), RepeatModeler (installation needs TRF), and BLAST2GO. BLAST2GO is commercially available.
N. lecontei and N. pinetum draft transcriptomes
Description: de novo and genome-guided transcriptome assembly using Illumina MiSeq and possibly HiSeq data
Personnel:
Kim Duong, Adam Leonberger, Matt Niemiller, Catherine Linnen
Computational methods:
read processing, transcriptome assembly, read mapping, annotation, and differential expression analysis
Software:
fastx_toolkit, ea-utils, Jellyfish, Trinity, GMAP, GSNAP, PASA, RSEM, NCBI BLAST, BLAST2GO, DESeq, edgeR, baySeq
Software availability: all programs are accessible on DLX EXCEPT fastx_toolkit (installation needs libgtextutils-0.6), GMAP (installation needs root access), GSNAP (installation needs root access), PASA (installation needs MySQL, CPAN, DBD::mysql, BLAT), BLAST2GO, DESeq (installation needs Bioconductor), edgeR (installation needs Bioconductor), and baySeq (installation needs Bioconductor). BLAT and BLAST2GO are commercially available.
Population genomic analysis of N. lecontei
Description: Neodiprion lecontei occurs on a wide range of hosts and we hypothesize that divergence in host use may lead to species formation. To test this hypothesis, we will use genome-wide data from a range-wide sampling of Neodiprion lecontei and will use these data to assess the contributions of divergent host use and geography to diversification. This project will involve processing HiSeq data derived from multiplexed reduced-representation libraries prepared using the ddRAD protocol outlined in Peterson et al. PLOS ONE 2012. Initial analysis steps will involve filtering sequence data and calling SNPs, using both reference and reference-free approaches. Once SNPs are called, we will then apply multiple population genetic analyses to these datasets. Anticipated software needs are outlined below (although additional programs may be needed).
Personnel:
Robin Bagley, Matt Niemiller, Adam Leonberger, Claire O'Quin PostDoc, Catherine Linnen
Computational methods:
read processing and filtering, demultiplexing, read assembly, SNP calling, and population genetic analysis
SNP calling software (via https://github.com/brantp/rtd): BLAT, MCL/MCXLOAD, Muscle, Samtools, GNU Parallel, Numpy, Gdata, Editdist, Genome Analysis Toolkit (GATK) Unified Genotyper.
Population genetic software (tentative list):
Arlequin, Migrate, Structure, Spagedi, IMA2, Eigensoft
Software availability: to our knowledge, these programs are not currently available on DLX. They should all be available for free download online. Many can be installed on an as-needed basis within individual students’ working directories.
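Illustrative note on demultiplexing: the read-processing steps above begin by assigning each read from a multiplexed library to its source sample. The following is a minimal Python sketch of that idea only; it is not the rtd pipeline referenced above. The barcode sequences, sample names, and the demultiplex function are hypothetical, and the sketch assumes inline barcodes at the start of each read with exact matching.

```python
from collections import defaultdict

# Hypothetical inline barcodes mapped to sample names (illustrative only).
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}


def demultiplex(reads, barcodes):
    """Group (name, sequence, quality) reads by their leading inline barcode.

    The barcode is trimmed from the sequence and quality strings. Reads with
    no exact barcode match are collected under the key "unassigned".
    """
    bins = defaultdict(list)
    for name, seq, qual in reads:
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                n = len(barcode)
                bins[sample].append((name, seq[n:], qual[n:]))
                break
        else:
            # No barcode matched this read.
            bins["unassigned"].append((name, seq, qual))
    return dict(bins)


# Toy reads standing in for FASTQ records.
reads = [
    ("r1", "ACGTTTGACC", "IIIIIIIIII"),
    ("r2", "TGCAGGATCC", "IIIIIIIIII"),
    ("r3", "NNNNGGATCC", "##########"),
]
bins = demultiplex(reads, BARCODES)
```

A production workflow would typically also tolerate barcode mismatches and check the restriction-site remnant; those steps are omitted here.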
Identification of genes underlying color variation in Neodiprion
Description: There is a tremendous amount of variation in larval color both within and between Neodiprion species. We have conducted a cross between two Neodiprion lecontei populations with very divergent color patterns and have measured phenotypes in F2 males. We will generate a genome-wide SNP dataset using the same approach as for Project 3, and then we will perform a QTL analysis to identify regions of the genome that contribute to this color variation.
Personnel:
Taylor Shackleford, Adam Leonberger, Matt Niemiller, Claire O'Quin PostDoc, Catherine Linnen
Computational methods:
read processing and filtering, demultiplexing, read assembly, SNP calling, and QTL analysis
Software:
SNP calling software (via https://github.com/brantp/rtd): BLAT, MCL/MCXLOAD, Muscle, Samtools, GNU Parallel, Numpy, Gdata, Editdist, Genome Analysis Toolkit (GATK) Unified Genotyper.
QTL analysis (to be decided): rQTL?
Software availability: all programs should be freely available online. However, some QTL mapping programs must be purchased, so if we do choose one of these, we will be in contact.
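Illustrative note on QTL analysis: since the QTL software is still to be decided, the following Python sketch shows only the basic idea behind a single-marker scan, not the method the lab will use. It assumes each F2 male falls into one of two genotype classes at a marker and that the phenotype is a single numeric score; the function names and toy data are hypothetical. For each marker, a model with one overall mean is compared with a model that fits a separate mean per genotype class, giving LOD = (n/2) log10(RSS0/RSS1).

```python
import math


def rss(values):
    """Residual sum of squares around the mean of `values`."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def marker_lod(genotypes, phenotypes):
    """LOD score for one marker.

    Compares a single-mean model (no QTL) with a model that fits a separate
    mean for each genotype class: LOD = (n / 2) * log10(RSS0 / RSS1).
    Assumes some residual variation remains within classes (RSS1 > 0).
    """
    groups = {}
    for g, p in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(p)
    rss0 = rss(phenotypes)
    rss1 = sum(rss(vals) for vals in groups.values())
    n = len(phenotypes)
    return (n / 2) * math.log10(rss0 / rss1)


# Toy data: hypothetical color scores for six F2 males.
phenotypes = [1.0, 1.2, 0.9, 3.0, 3.1, 2.8]
linked_marker = ["A", "A", "A", "B", "B", "B"]    # tracks the phenotype
unlinked_marker = ["A", "B", "A", "B", "A", "B"]  # unrelated to the phenotype

lod_linked = marker_lod(linked_marker, phenotypes)
lod_unlinked = marker_lod(unlinked_marker, phenotypes)
```

In this toy example the marker that tracks the phenotype receives a much higher LOD score than the unrelated marker. Dedicated QTL packages add interval mapping, permutation-based significance thresholds, and handling of missing genotypes, none of which is shown here.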
Whole-genome phylogenetic analysis of eastern North American Neodiprion
Description: Eastern Neodiprion species diverged rapidly and recently, which has made it difficult to infer phylogeny from small numbers of markers. We therefore plan to generate a genome-scale dataset for phylogenetic inference. We will generate low-coverage whole-genome sequences from paired-end Illumina HiSeq data, map these to existing genome assemblies (in progress; see Project #1), and then perform phylogenetic analysis on the data.
Personnel:
Matt Niemiller, Adam Leonberger, John Terbot, Catherine Linnen
Computational methods:
read processing, genome assembly, read mapping, phylogenetic analysis
Software:
Software for processing data, mapping to the existing genome assembly, and calling SNPs is TBD, but will probably involve many of the same tools as the other projects described above. Additionally, we would most likely use the following phylogenetic/population genetic analysis software: BEAST, RAxML, Lagrange, RASP, IMA2, PAML.
Software availability: all programs should be available online for free.
Publications:
2013
- Linnen, C.R., Y.-P. Poh, B.K. Peterson, R.D.H. Barrett, J. G. Larson, J. Jensen, and H.E. Hoekstra. 2013. Adaptive evolution of multiple traits through multiple mutations at a single gene. Science 339: 1312-1316. (*selected by Faculty of 1000)
2012
- Linnen, C.R. and D.R. Smith. 2012. Recognition of two additional pine-feeding Neodiprion species (Hymenoptera: Diprionidae) in the eastern United States. Proceedings of the Entomological Society of Washington. 114: 492-500.
2010
- Linnen, C.R. 2010. Species-tree estimation for complex divergence histories: a case study in Neodiprion sawflies. In Estimating species trees: in practice and theory (L.L. Knowles and L.S. Kubatko, eds.). Wiley-Blackwell, Hoboken, NJ. (invited contribution).
- Linnen, C.R. and B.D. Farrell. 2010. A test of the sympatric host race formation hypothesis in Neodiprion (Hymenoptera: Diprionidae). Proceedings of the Royal Society of London B. 277: 3131-3138.
- Manceau, M. †, V.S. Domingues†, C.R. Linnen†, E.B. Rosenblum and H.E. Hoekstra. 2010. Convergence in pigmentation at multiple levels: mutation, genes, and function. Philosophical Transactions of the Royal Society B 365: 2439-2450. (†authors contributed equally)
- Weber, J.N., M.B. Peters, O.V. Tsyusko, C.R. Linnen, C. Hagen, N.A. Schable, T.D. Tuberville, A.M. McKee, S.L. Lance, K.L. Jones, H.S. Fisher, M.J. Dewey, H.E. Hoekstra and T.C. Glenn. 2010. Five hundred microsatellite loci for Peromyscus. Conservation Genetics. 11: 1243-1246.
Publications and grant funding arising out of use of DLX.
We have not yet submitted anything for publication.
Pending Funding:
Preliminary results from Project #1 (genome assembly) were used as preliminary data in a successful NSF proposal, which is now funding Project #3. The proposal is entitled: “Comparative analysis of host-shift speciation in the redheaded pine sawfly, Neodiprion lecontei (Hymenoptera: Diprionidae)” 0,000; 2013-2016; PI: Catherine Linnen.
Grants:
Linnen, Catherine DEB-1257739 REU Supplement: Comparative analysis of host-shift speciation in the redheaded pine sawfly, Neodiprion lecontei National Science Foundation 3/10/2014 - 3/31/2016 SCOPE
Linnen, Catherine DEB-1257739 Comparative analysis of host-shift speciation in the redheaded pine sawfly, Neodiprion lecontei (Hymenoptera: Diprionidae) National Science Foundation 4/1/2013 - 3/31/2016 $418,656
Linnen, Catherine DEB-1257739 Comparative analysis of host-shift speciation in the redheaded pine sawfly, Neodiprion lecontei (Hymenoptera: Diprionidae) $203,932 National Science Foundation 4/1/2013 - 3/31/2016
Center for Computational Sciences