Oh Lab - Sehyun Oh, PhD

I am an Assistant Professor at CUNY SPH, with expertise in both experimental biology and bioinformatics. As a molecular biologist by training, I studied DNA repair and telomere maintenance mechanisms during my doctoral and postdoctoral research. As a bench scientist, I began to notice how difficult it was to argue that my findings in cell lines were actually happening in living organisms and were relevant to public health, and this made me interested in the potential of large public datasets. I made a career transition from a bench scientist to a bioinformatics scientist in 2017. Since then, I have worked on large public omics data analysis, statistical method development for high-dimensional data, cloud-based computing, and user-friendly software development. Currently, I am developing an omics data repository designed for the easy application of Artificial Intelligence and Machine Learning tools, and incorporating histopathology images into multi-modal analysis. My overarching career goal is to facilitate interdisciplinary research by developing intuitive bioinformatics infrastructure and user-friendly tools that lower barriers across different disciplines and resources. In my free time, I enjoy ballroom dancing and exploring different neighborhoods in New York.

Positions


2023 - Present	Assistant Professor	CUNY School of Public Health	New York, NY, USA
2022 - 2023	Research Assistant Professor	CUNY School of Public Health	New York, NY, USA

Education and Training


Postdoc	Bioinformatics	City University of New York	New York, NY, USA
Postdoc	Microbiology	Columbia University	New York, NY, USA
Ph.D.	Molecular Biology	University of Minnesota - Twin Cities	Minneapolis, MN, USA
B.S.	Biological Sciences	Seoul National University	Seoul, Korea

Current Funding


2024 - 2029	NIH 1U01 CA230551 (Co-I), Exploiting public metagenomic data to uncover cancer-microbiome relationships.
	The human microbiome is implicated in the development and response to treatment of some cancers, including infectious agents estimated to be responsible for ~20% of the global cancer burden. However, previously unrecognized bacterial and viral strains, as well as loss of normal structure and function of human-associated microbiomes, likely play additional roles in disease etiology and treatment. This project investigates the role of the human microbiome in cancer by applying novel and state-of-the-art methods to published metagenomic data, and provides enhanced, expanded, and more efficiently usable microbiome data resources back to the cancer research community for a broad range of investigations.
2024 - 2029	NIH 2U24 CA180996 (Co-I), Cancer genomics: integrative and scalable solutions in R/Bioconductor.
	Researchers gather a wide range of complex genetic information in order to comprehend the intricate factors involved in the development and treatment of cancer. This project develops, expands, and sustains essential software and data resources that aid cancer researchers in effectively managing and analyzing this information using advanced computational and statistical approaches.
2025 - 2027	NIH 2U24 CA180996 (Co-I), Multi-Omic Integration of TCGA Histopathology Images through A Cross-platform Pipeline Bridging Python and R/Bioconductor
	This proposal aims to create a robust computational infrastructure for integrating histopathology-derived image features with genomic, transcriptomic, and clinical data from large cancer cohorts such as TCGA.

Completed Funding


2022 - 2023	NIH 2U24 CA180996 AI/ML Supplement (Co-I), Cancer genomics: integrative and scalable solutions in R/Bioconductor.
	A wealth of genomic datasets has been made publicly available and reusable to research communities; however, many are not readily usable by Machine Learning (ML) algorithms. This project targets hundreds of primarily cancer-focused, multi-modal genomic datasets that have previously been harmonized for analysis in the Bioconductor Project for open-source Bioinformatics, and translates them to formats broadly suited for large-scale ML.
2024 - 2025	PSC-CUNY Research Award (PI), Construct informatics infrastructure for transfer learning in biomedical research
	This proposal reinforces the pre-trained model for biological signatures (RAVmodel from GenomicSuperSignature) with manual curation, improving its usability and interpretability and making it feasible to extend our method to different model systems and biological modalities.