Nagarajan, Radha*
Not a current user.
Affiliation: Radha Nagarajan, Ph.D. Associate Professor, Division of Biomedical Informatics, College of Medicine, University of Kentucky
Project: Implementing Scalable Analytics Pipelines for Large Observational Healthcare Data
The aim of the proposed work is to implement healthcare data analytics pipelines using open-source Big Data ecosystems for large observational healthcare data sets, such as those from electronic health records, claims data and other sources, with the potential to assist in clinical, operational and business decision making. These data sets can span multiple terabytes and thus demand sophisticated infrastructure for their querying and analysis. The high-performance computing infrastructure support will be used to install the necessary open-source in-memory Big Data tools that can assist primarily in querying and analyzing large healthcare data sets, tasks that are infeasible using existing infrastructure. The performance of traditional as well as novel predictive and prescriptive modeling frameworks will be investigated within the proposed pipeline.
The HPC cluster will be used to parallelize some of the algorithms I am currently developing for translational bioinformatics, personalized medicine and systems biology. The objective is to enable hypothesis testing, validation and discovery. The data sets I will be investigating include those generated from high-throughput molecular assays such as microarrays and next-generation sequencing. These data sets provide genome-level resolution of disease phenotypes. They are large and high-dimensional by nature, and their analysis is computationally intensive. Traditionally, constraints and filters are imposed as a part of the analytics pipeline. However, these ad hoc constraints often introduce bias into the analysis and limit knowledge discovery. I believe parallelizing the necessary algorithms across high-performance computing infrastructure would enable timely discovery of meaningful patterns from these data sets in an unbiased manner. This project falls in line with the mission of the Center for Clinical and Translational Science (CCTS) Award at the University of Kentucky.
Collaborators
Eric Durbin (KCR)
Sally Ellingson-Dockery (KCR)
Software
R x64 2.15.2 (Windows)
Hadoop
Publications
Grants
Center for Computational Sciences