Scientific Software

Computational Phenotyping Pipelines

Reusable analysis pipelines for latent structure, clustering, and subgroup discovery.

Scientific Problem

Human brain health data are high-dimensional, heterogeneous, and often poorly described by a single average effect.

Figure: Pipeline diagram running from multidimensional clinical data collection through self-organizing maps and Gaussian mixture model phenotype discovery to random forest explanation.

Why It Was Needed

Large cohort studies need reproducible ways to discover cognitive profiles, validate subgroup structure, compare model outputs, and interpret drivers without turning each analysis into a bespoke workflow.

What It Enables

These pipelines make it possible to identify hidden cognitive phenotypes, test sensitivity across cohorts, and connect subgroup membership to biological, psychosocial, and clinical drivers.

They form the computational layer for discovering hidden structure in heterogeneous translational data, and they are designed around reproducibility, model interpretation, and the ability to connect statistical structure to scientific and clinical questions.

Scientific Infrastructure

The pipelines combine dimensionality reduction, self-organizing maps, clustering, supervised modeling, variable importance, and visual interpretation. They are built to support both discovery and explanation: finding structure first, then asking what variables define or modify that structure.
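The discover-then-explain pattern described above can be illustrated with a minimal sketch. This is not the pipelines' actual code: the function name `discover_and_explain` and its parameters are hypothetical, PCA stands in for the self-organizing map step (scikit-learn has no SOM), and selecting the number of subgroups by lowest BIC is an assumed choice made only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def discover_and_explain(X, feature_names, max_components=6, seed=0):
    """Hypothetical helper: find subgroups, then rank the variables that separate them."""
    X_scaled = StandardScaler().fit_transform(X)

    # Dimensionality reduction. Assumption: PCA is used here in place of a SOM.
    n_pcs = min(5, X_scaled.shape[1])
    Z = PCA(n_components=n_pcs, random_state=seed).fit_transform(X_scaled)

    # Discovery: fit Gaussian mixtures with 1..max_components components
    # and keep the one with the lowest BIC (an assumed selection rule).
    candidates = [
        GaussianMixture(n_components=k, n_init=5, random_state=seed).fit(Z)
        for k in range(1, max_components + 1)
    ]
    best = min(candidates, key=lambda model: model.bic(Z))
    labels = best.predict(Z)

    # Explanation: a random forest predicts subgroup membership from the
    # original (scaled) variables; impurity-based importances rank the drivers.
    forest = RandomForestClassifier(n_estimators=300, random_state=seed)
    forest.fit(X_scaled, labels)
    ranked = sorted(
        zip(feature_names, forest.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return labels, best.n_components, ranked
```

Calling `discover_and_explain(X, names)` on a samples-by-variables array returns one subgroup label per row, the selected number of subgroups, and the variables sorted by importance. Keeping the two stages separate mirrors the text: structure is found without reference to any outcome, and only afterwards is a supervised model asked which variables define it.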

Scientific Story

This thread began in NeuroHIV cognitive phenotyping and now supports broader translational questions in aging, Long COVID, mental health, sleep, and biomarker integration.

Related Publications

2024 article highlighted

Identifying and distinguishing cognitive profiles among virally suppressed people with HIV

Erin E. Sundermann, Raha Dastgheyb, David J. Moore, Alison S. Buchholz, Mark W. Bondi, Ronald J. Ellis, Scott L. Letendre, Robert K. Heaton, Leah H. Rubin

Neuropsychology

Identifies six cognitive profiles among virally suppressed people with HIV and the factors that distinguish them.

2023 article highlighted

Machine learning approaches to understand cognitive phenotypes in people with HIV

Shibani S. Mukerji, Kalen J. Petersen, Kilian M. Pohl, Raha M. Dastgheyb, Howard S. Fox, Robert M. Bilder, Marie-Josée Brouillette, Alden L. Gross, Lori A. J. Scott-Sheldon, Robert H. Paul, Dana Gabuzda

The Journal of Infectious Diseases

Frames machine learning as a tool for discovering cognitive biotypes in people with HIV.

2024 article selected

Biopsychosocial phenotypes in people with HIV in the CHARTER cohort

Bin Tang, Ronald J. Ellis, Florin Vaida, Anya Umlauf, Donald R. Franklin, Raha Dastgheyb, Leah H. Rubin, Patricia K. Riggs, Jennifer E. Iudicello, David B. Clifford, David J. Moore, Robert K. Heaton, Scott L. Letendre

Brain Communications

Describes biopsychosocial phenotypes among people with HIV in the CHARTER cohort, part of the research thread in computational phenotyping and cognitive subgroup discovery.

2021 article highlighted

Patterns and predictors of cognitive function among virally suppressed women with HIV

Raha M. Dastgheyb, Alison S. Buchholz, Kathryn C. Fitzgerald, Yanxun Xu, Dionna W. Williams, Gayle Springer, Kathryn Anastos, Deborah R. Gustafson, Amanda B. Spence, Adaora A. Adimora, Drenna Waldrop, David E. Vance, Joel Milam, Hector Bolivar, Kathleen M. Weber, Norman J. Haughey, Pauline M. Maki, Leah H. Rubin

Frontiers in Neurology

Uses self-organizing maps and random forests to characterize cognitive profiles in virally suppressed women with HIV.