Hackathon

Dates: Wed 5th February to Fri 7th February 9:00 to 17:00

Participants

eTRIKS/Imperial: Florian Guitton, Axel Oehmichen, Peter Rice

CTMM TraIT/The Hyve: Riza Nugraha, Ruslan Forostianov, Gustavo dos Santos Lopes, Kees von Bochove

University of Michigan: Terry Weymouth

Recombinant at Deloitte: Dave John plus others

Pfizer: Ami Khandeshi

Harvard: Michael McDuffie

Goals

Merge of Sanofi RC2 with public code branches into a clean v1.2alpha branch

Branches to be included are described in detail on a separate page

Develop an approach to publicly accessible testing and QA

TranSMART v1.2 Features Specification

Feature/Provider | Description

High-Dimensional data models

Sanofi RC2, Hyve/Cognizant, eTRIKS

  • Support of new omic data types: miRNAseq, miRNA/qPCR, RBM proteomics, mass spec proteomics, mass spec metabolomics, RNAseq with data dictionaries, annotation file management, gene list support, sub-categorization in advanced workflows
  • Improved ETL performance and scalability
  • Refactoring of code for high dimension data types (generic API developed)

Improved search and browsing

Sanofi RC1+RC2, Recombinant, Hyve/Cognizant
  • Centralised search functionality: Browse tab to navigate at 4 levels with data dictionaries and file export; filter options for field values; tagging capabilities
  • New browsing capabilities: hierarchical data management; link to level 2 data; Analyze tab to browse level 2 (processed) data, select cohorts/subsets, launch analysis; data export enhancements
  • Grid view enhancements

Serial Data

Sanofi RC2, Hyve/Cognizant

Accommodate serial low- and high-dimensional data (time series, dose responses, etc.).

Analysis of ‘serial’ data matrix. Analysis of individual columns (time points). Use of time dimension as a variable.

Sample ID incorporation

Sanofi RC2, Hyve/Cognizant
Definition of multiple samples per patient. Association of high and low dimensional data to a single sample. Display linking sample and patient ID in grid view.
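
The sample-to-patient association described above can be illustrated with a minimal sketch. It is not the tranSMART data model: the record layout and the names `samples`, `low_dim`, `high_dim`, `grid_rows` and `samples_by_patient` are hypothetical and chosen only for this example. It assumes every sample belongs to exactly one patient and that both low- and high-dimensional values are keyed by sample ID.

```python
from collections import defaultdict

# Hypothetical records: several samples may point at the same patient.
samples = [
    {"sample_id": "S1", "patient_id": "P1"},
    {"sample_id": "S2", "patient_id": "P1"},
    {"sample_id": "S3", "patient_id": "P2"},
]

# Hypothetical low- and high-dimensional values, both keyed by sample ID.
low_dim = {
    "S1": {"tissue": "tumour"},
    "S2": {"tissue": "normal"},
    "S3": {"tissue": "tumour"},
}
high_dim = {
    "S1": {"TP53": 7.2},
    "S2": {"TP53": 5.9},
}


def samples_by_patient(samples):
    """Group sample IDs under the patient they belong to."""
    index = defaultdict(list)
    for sample in samples:
        index[sample["patient_id"]].append(sample["sample_id"])
    return dict(index)


def grid_rows(samples, low_dim, high_dim):
    """Build one grid row per sample, showing patient ID and sample ID together."""
    rows = []
    for sample in samples:
        sample_id = sample["sample_id"]
        row = {"patient_id": sample["patient_id"], "sample_id": sample_id}
        row.update(low_dim.get(sample_id, {}))   # low-dimensional values, if any
        row.update(high_dim.get(sample_id, {}))  # high-dimensional values, if any
        rows.append(row)
    return rows
```

Keying both kinds of data by sample ID, rather than by patient ID, is what lets a single patient carry several samples without their values colliding.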

Oracle compatibility

eTRIKS, Hyve, Recombinant
Maintain Oracle support in parallel with Postgres, with test systems and test suites for both.

Incremental data load

Sanofi RC2, Hyve/Cognizant

Enable incremental loading of data for a study:

  • Add new variables to an existing study
  • Add new patients to an existing study
  • Overwrite values for certain variables in an existing study
  • Change label names of variables previously loaded
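
The four incremental operations listed above can be sketched as a single merge step. This is a toy in-memory illustration under stated assumptions, not the tranSMART ETL: the `Study` class, the `load_increment` function and the `(patient_id, variable, value)` row format are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Study:
    """Toy study: observations[patient_id][variable] = value."""
    observations: dict = field(default_factory=dict)
    labels: dict = field(default_factory=dict)  # variable -> display label


def load_increment(study, rows, labels=None):
    """Merge (patient_id, variable, value) rows into an existing study.

    New patients and new variables are added, a value for an existing
    patient/variable pair overwrites the old one, and data not mentioned
    in the increment is left untouched. Optional labels rename variables.
    """
    for patient_id, variable, value in rows:
        study.observations.setdefault(patient_id, {})[variable] = value
        # A variable seen for the first time gets its own name as its label.
        study.labels.setdefault(variable, variable)
    for variable, label in (labels or {}).items():
        study.labels[variable] = label
    return study


# Initial load, followed by an increment that exercises all four cases.
study = Study()
load_increment(study, [("P1", "age", 54), ("P1", "sex", "F")])
load_increment(
    study,
    [("P1", "age", 55), ("P1", "bmi", 23.1), ("P2", "age", 61)],
    labels={"age": "Age at enrolment"},
)
```

After the second call, patient P1 has an overwritten `age`, a new `bmi` variable and an unchanged `sex`, patient P2 has been added, and the `age` variable carries its new label.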

Variation/VCF support

eTRIKS, OncoTrack, Hyve, Sanofi RC2

TranSMART NGS working group developments.

Improvements in SNP variation automation and performance.

Recombinant FDA

Recombinant
Use of i2b2 modifiers. Cross-trial support. Personalized workspace features (e.g., analysis saving) ?

Pfizer GWAS

Recombinant
Pfizer GWAS 1.0 available. New version with minor changes expected – can be updated later.

Faceted search

JnJ, eTRIKS, Hyve, Recombinant

Already merged in multiple branches. To be merged with Sanofi RC1+2 Browse/Analyze tab functionality which was preferred in the community vote.

Analytics improvements

Sanofi RC2, eTRIKS, Recombinant
  • Refactoring of analytics code (advanced workflows)
  • Improved boxplot, line graph and correlation analytics
  • Support for multiple cohorts
  • Adaptation of legacy analytics to new omic data types
  • New advanced workflows from Recombinant

R Interface

Takeda, Hyve, eTRIKS, Recombinant
Presented by Takeda in Paris. Merge new R interface developments by Hyve for JnJ/eTRIKS.

Core API

Hyve
Update with the latest functionality.

Security improvements

eTRIKS, Sanofi RC2

Improved security from eTRIKS and Sanofi exercise.

Security models for APIs.

Multiple cohorts

eTRIKS
General support for more than 2 cohorts for comparison/analysis

Postgres 9.3 support

eTRIKS

Maintaining Postgres 9.3 as the main supported version

Code review

Support for Grails 2.3.x. Code cleanup proposed in Paris hackathon to be applied to code base after Boston.

 

Code Branches

Test Data

Sanofi RC2 test data for hackathon internal use: examples for miRNAseq, qPCR miRNA, RNAseq, metabolomics and RBM proteomics. Mass-spec proteomics data is also expected.

eTRIKS script to extract and load pathway data from KEGG

Public studies curated by Rancho Biosciences

  • Oncology: GSE1456 GSE4271 GSE4698 GSE4922 GSE20194 GSE27831
  • Inflammatory: GSE8650 GSE13732 GSE24060 GSE17755
  • Asthma: GSE13168

 

 
