Principled Analysis of High-Dimensional Data: How to Identify Signals while under the Curse of Dimensionality

Modern data sets are increasingly vast, not only in the number of samples but also in the number of measurements, or features, that they contain. This high dimensionality poses distinctive problems for data analysis because of a set of phenomena known as "the curse of dimensionality." This thesis presents an overview of these challenges, surveys dimensionality reduction (DR) methods that address them, and reports some results on the efficacy and quality of these methods. Motivated by this overview and by heuristics for the practical and principled use of DR methods, a novel method called EMBEDR is proposed, and several results from its application to single-cell omics data sets are discussed. In addition, the examination of another high-dimensional data set, one involving circular variables, motivated the development of a novel regression scheme, which is proposed and detailed.
