Elucidating Pleiotropy of Genetic Variability on Human Complex Traits using Phenome-wide Analysis

Understanding the genome-phenome associations behind human complex traits will be a primary focus for the practice of precision medicine. Key challenges include identifying the genetic variants that contribute to inter- and intra-phenotypic variation among individuals, elucidating the pleiotropic architecture of common complex traits, and demonstrating how personal biomedical data, including genetic data, can be leveraged to estimate certain clinical outcomes. A number of population-based studies with genomic and deep phenotype data have recently become widely available, providing new research opportunities to interrogate genotype-phenotype associations across unexplored genetic landscapes. The extensive phenotypic information encoded in large-scale biomedical data, ranging from genotype data and electronic health records (EHR) to longitudinal studies in the social sciences, is a powerful resource for characterizing the pleiotropic architecture of human complex traits. My projects aim to comprehensively uncover the underlying genetic mechanisms using recent advances in computational methodology, including natural language processing (NLP) techniques, supervised and unsupervised machine learning (ML), and predictive modeling. By combining various types of bioinformatic data with relevant computational approaches, the genetic pleiotropy of several human traits can be catalogued in practical terms with descriptive phenotypes, which helps fill current knowledge gaps in genetic analytics. The goal of my thesis is to implement an interdisciplinary platform for phenome-wide analysis that discovers novel human genome-phenome associations using multidimensional biomedical data.
As a promising tool for systematically evaluating the sizeable amount of personal health data, phenome-wide analysis is implemented here across both the biomedical and social sciences domains, expanding its methodological utility to non-clinical traits. In this dissertation, I aim to address the following challenges with large-scale phenome-wide analysis: (1) identifying individuals at high risk for polycystic ovary syndrome (PCOS) with a polygenic and phenotype risk score (PPRS) prediction model validated by ML techniques (Chapter 2), (2) investigating the genetic epidemiology of diverticular diseases through genome-wide association studies (GWAS) based on large multicenter EHR data processed with NLP technology (Chapter 3), and (3) exploring the pleiotropic structure of human intelligence based on cognitive test scores from a 60-year longitudinal social survey, using unsupervised cluster analysis (Chapter 4). Taken together, my dissertation establishes a broad new paradigm of phenome-wide analytics as an innovative tool to annotate the pleiotropy of diverse genetic components across multiple scientific domains, from the medical sciences to the social sciences. Recent advances in computational methodologies for large-scale data are additionally integrated with phenome-wide analytics to demonstrate efficient use cases for biomedical big data. At the onset of the era of precision medicine, I expect this thesis to contribute to the realization of tailored prognosis, diagnosis, and intervention by improving our understanding of the phenotypic heterogeneity of human complex traits through effective analysis of phenome-wide and genome-wide data.