tranSMART tree Hierarchy or Ontology refers to the overall organization and representation of the study concepts in the tranSMART User Interface (UI) as well as to recommended Ontologies and terminology to be used for data curation. 

For the purpose of developing tranSMART tree Ontologies we will group studies into 2 major Categories:

I. Drug Discovery and Development Studies

I.1 Discovery Studies

Data from Wet Lab experiments to discover new drugs, new drug targets or biomarkers of response to specific classes of drugs can be classified as "Discovery Studies". Data collected in these types of studies are usually phenotypic information about cells and cell lines, readouts of cell-based assays, and in-vitro protein-protein, protein-DNA and other interactions. Discovery stage studies can also include experiments on animal models of disease. New targets and drug candidates discovered in this phase don't always proceed to the next phase of drug development. However, analyzing these data together with genetic and other biomarkers could generate additional insight into disease mechanisms and help plan new discovery efforts or the development of personalized treatment.

An example of a "Discovery" type study is an in-vitro experiment to test cell line response to anticancer drugs. The Cancer Cell Line Encyclopedia (CCLE) Drug Sensitivity Test Study includes IC50 and EC50 results of an anti-cancer compound screen using CCLE cell lines with known mutations. The results of this study can be used to correlate cancer mutations with the efficacy of different compound classes.

I.1a. Case Study: CCLE Drug Sensitivity Test Study

Cell Line Information includes CCLE annotation mapped to standard ontologies. Cell Line Ontology is used for cell line names. SNOMEDCT is used for disease terms.

Mutations are annotated using COSMIC Catalog, Entrez Gene ID, and HUGO Symbol.

EC50 and IC50 results are loaded for each cell line per compound tested. CHEBI Ontology is used for compound names. If a large number of compounds are being tested, they can be loaded into higher level Ontology groups (e.g. Tanespimycin belongs to the Macrocycle group).
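The grouping of compounds into higher level folders can be sketched as follows. This is only an illustration: the function name `compound_path`, the root folder name "Compounds" and the backslash separator are assumptions, and only the Tanespimycin/Macrocycle pair comes from the text above.

```python
# Hypothetical compound -> higher level group lookup.
# Only the Tanespimycin entry is taken from the text; a real mapping
# would be derived from the ontology used for curation.
COMPOUND_GROUPS = {"Tanespimycin": "Macrocycle"}

SEP = "\\"  # illustrative path separator (assumption)


def compound_path(compound, groups=COMPOUND_GROUPS, root="Compounds"):
    """Build a tree path for a compound, nesting it under its group if known."""
    group = groups.get(compound)
    parts = [root, group, compound] if group else [root, compound]
    return SEP.join(parts)
```

A compound with no known group stays directly under the root folder, so the tree remains loadable even when the group mapping is incomplete.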

I.2 Preclinical Efficacy and Safety Studies

I.3 Clinical Trials

Clinical trial data combined with genetic or high dimensional exploratory biomarkers can provide a treasure trove of data for exploratory research. tranSMART tree Hierarchy design can facilitate navigation and interpretation of different observations, events, measurements, safety and efficacy assessments and other patient level data collected in the course of a clinical trial.

The eTRIKS project of the European Innovative Medicines Initiative (IMI) has recently released Standards Starter Pack Guidelines for IMI members, with recommended Ontologies for data curation and a tranSMART Master Tree Hierarchy developed in accordance with CDISC SDTM (Study Data Tabulation Model). The Standards Starter Pack can be downloaded using the link on this page in the Resources section (1). SDTM defines a Standard Structure for data to be submitted to the FDA. The FDA Janus Clinical Trials Repository (CTR) is built for SDTM formatted data. Maintaining a consistent format would greatly facilitate the use of clinical trial data for exploratory research. At the same time, the differences between the CTR relational and the tranSMART dimensional database models for patient level data have to be taken into consideration for the tranSMART tree Hierarchy to maintain data integrity. In tranSMART version 16.2 and earlier, all subject level clinical data are loaded into the "Observation Fact" table. Related observations are associated through the category path.
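To make the relational versus dimensional difference concrete, the sketch below turns one wide relational row into several single-value facts that share a path prefix. It is a simplified illustration, not the actual "Observation Fact" schema: the function name `row_to_facts`, the tuple layout, the example folder and the backslash separator are all assumptions.

```python
SEP = "\\"  # illustrative path separator (assumption)


def row_to_facts(subject_id, category_path, row):
    """Turn one wide relational row into (subject, path, value) facts.

    Each non-empty column becomes its own fact; the shared category
    path prefix is what keeps the related observations together.
    """
    return [
        (subject_id, category_path + SEP + label, value)
        for label, value in row.items()
        if value is not None
    ]
```

For example, calling `row_to_facts("S001", "Clinical Data\\Vital Signs (VS)", {"Systolic BP": 120, "Pulse": None})` yields a single fact for Systolic BP, because the empty Pulse column is skipped.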

Original data structure, user group, planned data use and tranSMART database architecture are the four main factors to consider when designing a master tranSMART tree for a project. The intent of this Library is to collect examples of different tranSMART trees used by the Community.

I.3a tranSMART tree SDTM Format:

Moving data from a traditional relational database into tranSMART is challenging, especially for “Events Class” SDTM Domains. Some of the data have to be loaded multiple times to facilitate typical analyses.

I.3aa tranSMART tree SDTM Format AE Domain:

I.3b tranSMART tree for Parallel Design Clinical Trial:

The tranSMART tree Hierarchy is designed as a narrative of the Clinical Trial protocol: observations, events and measurements designated as Safety Endpoints are loaded under Clinical Data - Safety Endpoints; assessments, measurements and calculated scores collected and analyzed to assess efficacy are loaded under Clinical Data - Efficacy Endpoints. At the same time, the SDTM domain data structure is maintained. Corresponding domain codes are added to the relevant upper level folders (e.g. Adverse Events (AE), Vital Signs (VS), Medical History (MH), etc.). SDTM codes are added to Data Labels (e.g. AE by Severity Scale (AESEV), AE by Body System Organ Class (AEBODSYS), etc.). The "Narrative" design allows biologists and non-clinical analysts to easily navigate data. SDTM codes can be used to cross-reference tranSMART loaded data with the original data stored in a restricted access clinical trial database. Note that Adverse Events are loaded multiple times to associate each AE "observation fact" with other "observation facts" categorizing Adverse Events.
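One possible reading of "loaded multiple times" is sketched below: a single adverse event is emitted once per categorizing SDTM variable, each time under a Data Label that carries the SDTM code. The folder and label names are taken from the text; the `Observation` record, the function name `expand_adverse_event`, the input layout and the backslash separator are assumptions made for illustration, not the format expected by any loader.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """Hypothetical flattened record: one value at one tree path."""
    subject_id: str
    path: str
    value: str


SEP = "\\"  # illustrative path separator (assumption)
AE_FOLDER = SEP.join(["Clinical Data", "Safety Endpoints", "Adverse Events (AE)"])

# Data Labels carry the SDTM variable code in parentheses, as described above.
AE_LABELS = {
    "AESEV": "AE by Severity Scale (AESEV)",
    "AEBODSYS": "AE by Body System Organ Class (AEBODSYS)",
}


def expand_adverse_event(subject_id, term, record):
    """Emit one observation per categorizing variable for a single adverse event."""
    rows = []
    for code, label in AE_LABELS.items():
        category = record.get(code)
        if category is None:
            continue  # skip categorizations missing from the source record
        path = SEP.join([AE_FOLDER, label, category])
        rows.append(Observation(subject_id, path, term))
    return rows
```

Because the SDTM code stays in every Data Label, each emitted path can still be traced back to the corresponding variable in the original clinical trial database.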

The I.3b tranSMART tree for Parallel Design Clinical Trial, curated for loading with tMDataLoader (pre-installed on tranSMART 16.2), can be downloaded using the link on this page in the Resources section (2).

II. Observational Studies/Natural History of a Disease

II.1 Cross-sectional study

II.2 Case-control study

II.3 Cohort study

II.3a tranSMART tree for Cohort NEPTUNE Study

The Nephrotic Syndrome Study Network (NEPTUNE) is a North American multi-center collaborative consortium established to develop a translational research infrastructure for Nephrotic Syndrome. The NEPTUNE study is a prospective observational study that enrolls a diverse population of patients reflective of the disease burden in North America. The two primary study outcomes are change in urinary protein excretion and change in renal function. A custom controlled dictionary and tranSMART tree Ontology have been developed for this study based on the standard terminology used by disease experts and on the standard analysis results and assessment categories.

The tranSMART tree for the NEPTUNE Study, curated for loading with tMDataLoader (pre-installed on tranSMART 16.2), can be downloaded using the link on this page in the Resources section (3). The NEPTUNE tranSMART tree Ontology file is also attached (4).

II.3b tranSMART tree for Cohort ADNI Study

The Alzheimer’s Disease Neuroimaging Initiative (ADNI) researchers collect, validate and utilize data such as MRI and PET images, genetics, cognitive tests, and CSF and blood biomarkers as predictors of the disease. The data come from the North American ADNI study participants, including cohorts of Alzheimer’s disease patients, mild cognitive impairment subjects and elderly controls.

Data are organized by Domain.

Vasco Verissimo's Master's thesis (Tese de Mestrado) on ADNI curation was used to create the Test ADNI Study.

The Example Study can be found in Resources (5) or on Google Drive.

II.4 Ecological study



  1. eTRIKS-Standards-Starter-Pack-v1.0.pdf
  2. Rheumatoid Arthritis_STUDYABC_NewDrugABC Phase
  4. NEPTUNE_tranSMART_v1_4_Ontology.xlsx
  5. ADNI