
Overview of all known ETL tools and their capabilities for loading data into the tranSMART database.

Overview of the ETL tools

In release 16, the tranSMART Project Management Committee decided to focus future efforts on transmart-batch and tMDataLoader.

However, the tranSMART 19.0 code for transmart-data revises and updates the Kettle procedures: it upgrades to the latest version of Kettle and improves performance, especially for RNAseq data.

The Foundation provides an extensive set of curated datasets that can be loaded with transmart-data, including predefined parameter files and data for clinical and high-dimensional (HDD) data types. The data can also be downloaded, unpacked, and rearranged for the other ETL tools.

We are working with the developers of tMDataLoader to provide a closer integration with tranSMART 19.0.

transmart-data
  • Open source: yes
  • tranSMART version (Development Roadmap): 19.0 and earlier, plus 17.1 server-only. Also used for loading data dictionaries and for database management.
  • Download:
    16.3 and earlier: https://github.com/tranSMART-Foundation/transmart-data
    19.0 and later: https://github.com/tranSMART-Foundation/transmart/tree/master/transmart-data
    17.1: https://github.com/tranSMART-Foundation/transmart-core/tree/master/transmart-data
  • Manual: Loading data with transmart-data

tMDataLoader
  • Open source: yes
  • tranSMART version (Development Roadmap): 19.0 and earlier (17.1 in development)
  • Download: https://github.com/Clarivate-LSPS/tMDataLoader
  • Manual:
    https://github.com/Clarivate-LSPS/tMDataLoader/wiki
    https://drive.google.com/file/d/0ByehpOFIhEbadDZiT1VvYW5ERnM/view

transmart-batch
  • Open source: yes
  • tranSMART version (Development Roadmap): 19.0 and earlier, plus 17.1 server-only
  • Download:
    16.3 and earlier: https://github.com/tranSMART-Foundation/transmart-batch
    19.0 and later: https://github.com/tranSMART-Foundation/transmart/tree/master/transmart-batch
    17.1: https://github.com/tranSMART-Foundation/transmart-core/blob/master/transmart-batch
  • Manual:
    16.2 and earlier: https://github.com/tranSMART-Foundation/transmart-batch/tree/master/docs
    17.1: https://github.com/tranSMART-Foundation/transmart-core/blob/master/transmart-batch/docs/

Integrated Curation Environment (ICE)
  • Open source: yes
  • tranSMART version (Development Roadmap): 19.0 and earlier. A binary version was included in older releases as part of tranSMART-ETL.
  • Download:
    16.3: https://github.com/transmart/transmart-ICE
    19.0 and later: https://github.com/tranSMART-Foundation/transmart/tree/master/transmart-ICE
  • Manual: https://drive.google.com/file/d/0B8lizkKDeaKhMWZBWnlnODVEQW8/view

Kettle
  • Open source: yes
  • tranSMART version (Development Roadmap): 19.0 and earlier
  • Download:
    16.3 and earlier: https://github.com/transmart/tranSMART-ETL
    19.0 and later: https://github.com/tranSMART-Foundation/transmart/tree/master/transmart-etl
  • Manual: Loading data with Kettle (Step by step tranSMART ETL Guide)
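
The source location of transmart-data depends on the tranSMART release, which is easy to get wrong when scripting a checkout. The following is a minimal sketch of that lookup in Python. The function name `transmart_data_repo` and the version parsing are hypothetical and only for illustration; the URLs and version ranges are the ones listed above for transmart-data. Releases that fall outside the listed ranges are rejected rather than guessed.

```python
# Hypothetical helper: only the URLs and version ranges come from this page.
TRANSMART_DATA_REPOS = {
    "16.3 and earlier": "https://github.com/tranSMART-Foundation/transmart-data",
    "17.1": "https://github.com/tranSMART-Foundation/transmart-core/tree/master/transmart-data",
    "19.0 and later": "https://github.com/tranSMART-Foundation/transmart/tree/master/transmart-data",
}


def transmart_data_repo(version: str) -> str:
    """Return the transmart-data source location for a version such as "16.3"."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0

    if (major, minor) == (17, 1):
        return TRANSMART_DATA_REPOS["17.1"]
    if (major, minor) <= (16, 3):
        return TRANSMART_DATA_REPOS["16.3 and earlier"]
    if (major, minor) >= (19, 0):
        return TRANSMART_DATA_REPOS["19.0 and later"]
    # Releases between the listed ranges have no documented location here.
    raise ValueError(f"no transmart-data location is listed for tranSMART {version}")
```

The same pattern would apply to transmart-batch, whose download locations follow the same three version ranges.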

Overview of the loading capabilities (as of 05.11.2019)

(Empty cells have not yet been tested.)

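| Data type / ETL tool | transmart-data Support | transmart-data HDD | tMDataLoader Support | tMDataLoader HDD | transmart-batch Support | transmart-batch HDD | ICE Support | ICE HDD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Clinical | Y | - - - | Y | - - - | Y | - - - | Y | - - - |
| Dictionary | Y | - - - | N | - - - | N | - - - | N | - - - |
| aCGH / CNV * | Y | Y | Y | Y | Y | Y | N | N |
| Metabolomics | Y | Y | Y | Y | Y | Y | Y | Y |
| miRNA-qPCR | Y | Y | Y | Y | Y | Y | Y | Y |
| miRNA-RNAseq | Y | Y | | | | | Y | Y |
| mRNA expression | Y | Y | Y | Y | Y | Y | Y | Y |
| MS-Proteomics | Y | Y | | | | | Y | Y |
| Proteomics | Y | Y | Y | Y | Y | Y | Y | Y |
| RBM proteomics | Y | Y | | | | | Y | Y |
| RNASeq by position | Y | Y | Y | Y | Y | Y | | |
| RNAseq by gene | Y | Y | | | | | Y | Y |
| SNP | N | N | Y | Y | N | N | Y | Y |
| VCF | Y | Y | Y | Y | N | N | N | N |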

* aCGH data is called CNV data in the ETL tool transmart-batch

 Explanation of the table


Columns:

  • Support: indicates whether the tool can import this data type at all
  • HDD: indicates whether the tool can import this data type as a high-dimensional data type (for an explanation of HDD, see Supported Data Types)

Values:

  • Y: the ETL tool is able to load the specified data
  • N: the ETL tool is NOT able to load the specified data
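
To make the table easier to query, the sketch below transcribes it into a small Python structure and looks up which tools are marked Y for a given data type. The names `TOOLS`, `MATRIX`, and `tools_for` are hypothetical, as is the compact encoding: each tool contributes two characters (Support, then HDD), with "-" standing for the "- - -" cells and "?" for empty, untested cells. The placement of the empty cells follows the table as laid out on this page.

```python
from __future__ import annotations

TOOLS = ["transmart-data", "tMDataLoader", "transmart-batch", "ICE"]

# One entry per data type; each group of two characters is (Support, HDD)
# for the tool at the same position in TOOLS.
MATRIX = {
    "Clinical":           "Y- Y- Y- Y-",
    "Dictionary":         "Y- N- N- N-",
    "aCGH / CNV":         "YY YY YY NN",
    "Metabolomics":       "YY YY YY YY",
    "miRNA-qPCR":         "YY YY YY YY",
    "miRNA-RNAseq":       "YY ?? ?? YY",
    "mRNA expression":    "YY YY YY YY",
    "MS-Proteomics":      "YY ?? ?? YY",
    "Proteomics":         "YY YY YY YY",
    "RBM proteomics":     "YY ?? ?? YY",
    "RNASeq by position": "YY YY YY ??",
    "RNAseq by gene":     "YY ?? ?? YY",
    "SNP":                "NN YY NN YY",
    "VCF":                "YY YY NN NN",
}


def tools_for(data_type: str, hdd: bool = False) -> list[str]:
    """List the tools marked Y for a data type (Support column, or HDD if hdd=True)."""
    cells = MATRIX[data_type].split()
    column = 1 if hdd else 0
    # Only an explicit "Y" counts; "N", "-" and "?" are all treated as "not confirmed".
    return [tool for tool, cell in zip(TOOLS, cells) if cell[column] == "Y"]
```

For example, `tools_for("SNP")` returns tMDataLoader and ICE, and `tools_for("VCF", hdd=True)` returns transmart-data and tMDataLoader. Treating untested cells as "not confirmed" is a conservative choice; a caller that needs to distinguish "N" from "untested" can inspect `MATRIX` directly.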


Further information
