Jakubek Swartzlander, Yasminka A.
Alzheimer’s Disease Neuroimaging Initiative (adni_mosaicism)
In this work we will generate mosaic mutational profiles from ADNI whole-genome sequencing (WGS) data (~60X coverage) for 800 individuals. These data are stored in VCF files (one per chromosome, 23 total). We have permission to download the data for this project.
Personnel:
Sasha Jakubek Swartzlander (PI)
Aaron Smith (data scientist)
Abraham Dutch (data scientist)
Software:
VCFtools
Eagle (genotype phasing)
MOCHA
hapLOH
bedtools
TOPMed admixture mapping (topmed_admixture_mapping)
This work is funded by an NCI award (K22). We will be working with processed sequencing data from a few thousand individuals to identify differences in mosaic mutational patterns across individuals and across DNA segments of different genetic ancestry. The amount of data loaded onto the cluster at any one time will vary. These data come from ~80,000 individuals, but we will work with at most 10,000 at a time, using VCF data for a single chromosome. We have permission to download the data for this project.
Personnel:
Sasha Jakubek Swartzlander (PI)
Aaron Smith (data scientist)
Abraham Dutch (data scientist)
Software:
MOCHA and custom scripts
Breast tissue whole genome sequencing project (komen_pilot_normal_breast)
We will work with ~25 BAM files of whole-genome sequencing (WGS) data at ~40X coverage to investigate copy number profiles in these breast tissues. The data use agreement for these data is in progress.
Personnel:
Sasha Jakubek Swartzlander (PI)
Aaron Smith (data scientist)
Abraham Dutch (data scientist)
Software:
GATK
VCFtools
Eagle (genotype phasing)
MOCHA
hapLOH
bedtools
Center for Computational Sciences