RefinedScience AML Dataset
RefinedScience
This longitudinal dataset includes 393 newly diagnosed Acute Myeloid Leukemia (AML) patients treated with Venetoclax plus Azacitidine as a first-line therapy. The dataset tracks patients from their diagnosis date (EVENT) through treatment, follow-up, and subsequent therapies. It is derived from the RefinedScience platform, which combines high-fidelity clinical data with single-cell omics to identify novel drug targets and stratify patient risk.
A key feature is access to companion analysis tooling for patients who had CITE-seq performed on bone marrow specimens at diagnosis and/or follow-up timepoints. The dataset supports precision medicine research by linking granular clinical outcomes with deep molecular profiling.
Datasets
RefinedScience AML Dataset
Data Collection Time Frame
2015 - 2025
Participants
Current: 393 unique patients (3,186 rows of data)
Geographical Coverage
Aurora, Colorado, USA (UCHealth)
Usage examples
- Drug Discovery & Biomarker Research: Identify and validate novel drug targets or biomarkers for AML patients who develop resistance to standard therapies.
- Predictive Modeling: Develop machine learning models to predict long-term survival and risk stratification for patients treated with Venetoclax and Azacitidine.
- Outcomes Analysis: Analyze the impact of evolving toxicities, hospital events, and short-term disease responses on overall survival.
- Single-Cell Analysis: Utilize the companion analysis tooling to explore gene expression and protein markers (CITE-seq) in bone marrow specimens to understand disease heterogeneity.
- Molecular Subtype Analysis: Explore mutation-specific and molecular response patterns across treatment and relapse.
- Translational Trial Design: Inform biomarker enrichment strategies and combination therapies for Venetoclax-based regimens.
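As a starting point for the survival-oriented use cases above, the following is a minimal sketch of a Kaplan-Meier estimate of overall survival. It assumes each patient has been reduced to one follow-up duration and one event flag (1 = death observed, 0 = censored). The function name, the input layout, and the toy values are hypothetical and are not taken from the dataset or its tooling.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with at least one death.

    durations: follow-up time per patient (e.g., days since diagnosis)
    events:    1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # deaths plus censorings leaving the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Synthetic toy values for illustration only, NOT drawn from the dataset.
toy_durations = [5, 8, 8, 12, 15]
toy_events = [1, 1, 0, 1, 0]

for time, prob in kaplan_meier(toy_durations, toy_events):
    print(time, round(prob, 3))
```

In practice the per-patient duration and event flag would first have to be derived from the longitudinal rows, and an established survival analysis library would usually be preferable to a hand-written estimator.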
Data modalities
EHR data
- Demographics
- Lab results
- Medical history
- Pathology/Molecular reports (available at diagnostic and follow-up Events)
- Therapy
- Social history
- Procedures
- Hospitalizations and ICU visits
Derived data
- Treatment response
- Risk classifications (e.g., ELN)
Omics data
- CITE-seq (via companion analysis tooling)
- Integrated molecular profiling, including cytogenetic and targeted sequencing data (via companion analysis tooling)