Table of Contents
Part I. INTRODUCTION
Part II. SOFTWARE PREREQUISITES
1. Compiler and tools:
2. R and Bioconductor
3. Web application development tools
4. PostgreSQL database
Part III. TRANSMART INSTALLATION
1. Download tranSMART Data
2. Install tranSMART Data
Part IV. BUILD AND START TRANSMART WEB SERVER
1. Building tranSMART from source
2. Start tranSMART application server
3. Check tranSMART is correctly started
PART V. EXTRA: PostgreSQL data directory migration to GPFS
1. Fresh install data directory modification:
2. Change existing data directory:
3. Migrate tablespaces to GPFS
PART VI. EXTRA: Enable tranSMART DB on Power8 Running RHEL 7.1
Part I. INTRODUCTION
Translational medical research and personalized health have become hot topics in the life sciences in recent years, driven by the availability and capability of next-generation sequencing tools and clinical data. TranSMART (www.transmartfoundation.org) is an open source platform for knowledge management of translational research data, including next-generation sequencing (NGS) data such as DNA, RNA and protein sequences, along with clinical information such as clinical trials, patient demographics and disease conditions. It serves as an environment in which bioinformatics scientists develop and refine research hypotheses by investigating correlations between genomic sequences and phenotypic data, and assess their analytical results in the context of published literature.
IBM Power8 is a data-centric system designed to provide high performance for big data handling, including data extract, transform and load (ETL) and analysis, which are bottlenecks on the tranSMART platform. This guide provides step-by-step instructions for installing tranSMART version 1.2 on an IBM Power8 system running Ubuntu 14.04.2. Tips for system and application tuning and optimization are also included.
Part II. SOFTWARE PREREQUISITES
Before installing the tranSMART package, the GCC and Java compilers and tools must be present on the system. In addition, tranSMART has the following prerequisites:
1. Compiler and tools:
Make sure GCC, JAVA7, ANT, GIT and make are installed on the system. If not:
$ sudo apt-get install openjdk-7-jdk
$ sudo apt-get install icedtea-7-plugin
$ sudo apt-get install make
$ sudo apt-get install ant
$ sudo apt-get install git
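Before running the install commands, it can help to see which of the tools are already present. The following is a minimal sketch, not part of the original guide: `missing_tools` is a hypothetical helper name, and the command names passed to it (`gcc`, `java`, `javac`, `ant`, `git`, `make`) are assumed to be the executables that the packages above provide.

```shell
# Print the name of every given command that cannot be found on PATH.
# (missing_tools is a hypothetical helper, not a tranSMART tool.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Check the tools named in this section.
missing=$(missing_tools gcc java javac ant git make)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All prerequisite tools found."
fi
```

Only the tools reported as missing then need an `apt-get install`.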
Set ANT_HOME and JAVA_HOME in your user profile (.profile or .bash_profile), and add the locations where these packages are installed to PATH.
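One possible way to fill in these variables without hard-coding paths is to derive them from the installed executables. This is a sketch under assumptions: `tool_home` is a hypothetical helper, and it assumes each tool lives in a `<home>/bin/<command>` layout reachable through symlinks from PATH, which is the usual layout for the Ubuntu packages but is not stated in the guide.

```shell
# Resolve the installation root of a command: follow symlinks to the real
# executable, then strip the trailing "bin/<command>" from its path.
# (tool_home is a hypothetical helper, not part of tranSMART.)
tool_home() {
    bin_path=$(command -v "$1") || return 1
    real_path=$(readlink -f "$bin_path")
    dirname "$(dirname "$real_path")"
}

# Lines of this form could be placed in ~/.profile; each variable is
# only set when the corresponding tool is actually installed.
if java_home=$(tool_home javac); then
    export JAVA_HOME="$java_home"
    PATH="$JAVA_HOME/bin:$PATH"
fi
if ant_home=$(tool_home ant); then
    export ANT_HOME="$ant_home"
    PATH="$ANT_HOME/bin:$PATH"
fi
export PATH
```

Running `echo "$JAVA_HOME" "$ANT_HOME"` afterwards shows what was detected; if the result looks wrong for your system, set the variables by hand instead.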
2. R and Bioconductor
Two ways of installing R and Bioconductor:
...
$ sudo R
> library(Rserve)
> Rserve()
or, to start Rserve on port 6311 from the shell:
$ R CMD $path_to/Rserve/libs/Rserve
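To confirm that Rserve is actually accepting connections on port 6311, a simple TCP probe can be used. The sketch below is an illustration only: `port_open` is a hypothetical helper, and it assumes `bash` and the coreutils `timeout` command are available and that Rserve listens on the local host.

```shell
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash is invoked explicitly.
# (port_open is a hypothetical helper, not part of Rserve or tranSMART.)
port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

if port_open 127.0.0.1 6311; then
    echo "Port 6311 is accepting connections."
else
    echo "Nothing is listening on port 6311."
fi
```

A successful probe only shows that something is listening on the port; it does not verify that the listener is Rserve.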
3. Web application development tools
1) PHP 5
$ sudo apt-get install php5
$ sudo service apache2 status|stop|start|restart #status
2) Tomcat 7
$ sudo apt-get install tomcat7
$ sudo service tomcat7 status|stop|start|restart
Tip: Stop Tomcat before copying the tranSMART WAR file to /var/lib/tomcat7/webapps
3) GRAILS
$ sudo apt-get install curl #ignore if already installed
$ curl http://get.sdkman.io | bash
Tip: reopen the terminal window or execute "$ source .profile"
$ sdk install grails 2.3.11
Tip: answer Y when asked "Do you want grails 2.3.11 to be set as default? (Y/n)"
$ sdk install groovy
4. PostgreSQL database
$ sudo apt-get install postgresql-9.3
$ sudo apt-get install libpg-java
$ sudo apt-get install libpostgresql-jdbc-java
$ sudo apt-get install postgresql-server-dev-9.3
$ sudo -i -u postgres
$ psql
postgres=# \q
$ sudo service postgresql status|stop|restart
Part III. TRANSMART INSTALLATION
1. Download tranSMART Data
- transmart-data
$ git clone https://github.com/transmart/transmart-data.git
$ cd transmart-data
...
a) Transmart Main App
$ git clone https://github.com/transmart/transmartApp.git
$ git fetch --tags
$ git checkout
b) Core API
$ git clone https://github.com/transmart/transmart-core-api.git
$ git fetch --tags
$ git checkout
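The fetch-and-checkout step can be wrapped so that a mistyped tag fails early instead of leaving the working tree on the wrong revision. This is a sketch only: `checkout_release` is a hypothetical helper, and the tag to use is not given in this guide, so it is passed in as a parameter rather than assumed.

```shell
# Fetch tags for a cloned repository and check out the given tag,
# failing with a message if the tag does not exist.
# (checkout_release is a hypothetical helper; the tag name must be
# supplied by the reader, it is not specified in this guide.)
checkout_release() {
    repo_dir=$1
    tag=$2
    git -C "$repo_dir" fetch --tags || return 1
    if ! git -C "$repo_dir" rev-parse -q --verify "refs/tags/$tag" >/dev/null; then
        echo "tag not found: $tag" >&2
        return 1
    fi
    git -C "$repo_dir" checkout -q "$tag"
}

# Intended use, with TAG set to the release you want to build:
#   git -C ~/transmartApp tag --list      # show the available tags
#   checkout_release ~/transmartApp "$TAG"
#   checkout_release ~/transmart-core-api "$TAG"
```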
2. Install tranSMART Data
- Setup
$ sudo -i -u postgres
$ psql
postgres=# alter user postgres password 'postgres';
...
$ cd ~/transmart-data
$ make -C config install
Part IV. BUILD AND START TRANSMART WEB SERVER
1. Building tranSMART from source
$ cd ~/transmartApp
$ grails clean
$ grails upgrade
$ grails war --plain-output
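Once the WAR has been built, it has to be copied into Tomcat's webapps directory while Tomcat is stopped. The sketch below illustrates that copy step. `deploy_war` is a hypothetical helper, and the WAR location `~/transmartApp/target/transmart.war` is an assumption about where the Grails build writes its output; check the build log for the real path. The target name `transmart.war` is chosen so that the application is served under `/transmart`.

```shell
# Copy a built WAR into a webapps directory under the name transmart.war.
# Fails if the WAR file does not exist.
# (deploy_war is a hypothetical helper, not part of tranSMART or Tomcat.)
deploy_war() {
    war=$1
    webapps=$2
    if [ ! -f "$war" ]; then
        echo "WAR file not found: $war" >&2
        return 1
    fi
    cp "$war" "$webapps/transmart.war"
}

# Intended use on the target host, from a shell that can write to the
# webapps directory (the WAR path is an assumption, see above):
#   sudo service tomcat7 stop
#   deploy_war ~/transmartApp/target/transmart.war /var/lib/tomcat7/webapps
#   sudo service tomcat7 start
```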
2. Start tranSMART application server
- Start Solr for database search
...
$ sudo R
>library ("Rserve")
>Rserve ()
>q ()
3. Check tranSMART is correctly started
Open a web browser at: http://yourhost.com:8080/transmart
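The application can take a while to deploy after Tomcat starts, so a first request may fail even though nothing is wrong. A small retry loop makes the check scriptable. This is a sketch: `wait_for` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary illustrative values.

```shell
# Retry a command until it succeeds.
# Usage: wait_for <attempts> <delay_seconds> <command> [args...]
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
# (wait_for is a hypothetical helper, not part of tranSMART.)
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Intended use: poll the tranSMART URL up to 30 times, 2 seconds apart.
#   wait_for 30 2 curl -fsS -o /dev/null http://yourhost.com:8080/transmart
```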
PART V. EXTRA: PostgreSQL data directory migration to GPFS
1. Fresh install data directory modification:
% sudo su - postgres
To check the current data_directory and database version:
% psql -d postgres -U postgres
postgres=# SHOW data_directory;
data_directory
------------------------------
/var/lib/postgresql/9.3/main
...
% sudo service postgresql start
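When the server is stopped, `SHOW data_directory` is not available, but the configured value can still be read from the configuration file. The following sketch assumes the setting appears as a single-quoted `data_directory = '...'` line, and that the file lives at `/etc/postgresql/9.3/main/postgresql.conf`, the usual location for the Ubuntu package; neither is stated in this guide. `conf_data_directory` is a hypothetical helper.

```shell
# Print the data_directory value from a postgresql.conf file.
# Commented-out lines (starting with '#') are ignored.
# (conf_data_directory is a hypothetical helper, not a PostgreSQL tool.)
conf_data_directory() {
    sed -n "s/^[[:space:]]*data_directory[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$1"
}

# Intended use (the path is an assumption, see above):
#   conf_data_directory /etc/postgresql/9.3/main/postgresql.conf
```

Comparing this value before and after the change, and again with `SHOW data_directory` once the server is running, confirms that the new location is in effect.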
2. Change existing data directory:
Under user postgres, do:
Stop database server:
...
% cd /var/lib/postgresql/9.3
% ln -fs /gpfs/fs1/postgres/DB
(Tip: make sure the directory permission is 700)
Restart the database server.
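The permission tip can be checked before restarting, since PostgreSQL refuses to start when the data directory is accessible to other users. A minimal sketch, assuming GNU `stat` (as shipped on Ubuntu); `has_mode_700` is a hypothetical helper name.

```shell
# Return 0 if the given directory has permission mode 700, else non-zero.
# Relies on GNU stat's -c option to print the octal mode.
# (has_mode_700 is a hypothetical helper, not a PostgreSQL tool.)
has_mode_700() {
    [ "$(stat -c '%a' "$1")" = "700" ]
}

# Intended use with the GPFS directory from this section:
#   has_mode_700 /gpfs/fs1/postgres/DB || chmod 700 /gpfs/fs1/postgres/DB
```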
3. Migrate tablespaces to GPFS
Use the same procedure as in "Change existing data directory".
Tips:
...
- kitchen.sh: set the Java memory to increase upload performance
PART VI. EXTRA: Enable tranSMART DB on Power8 Running RHEL 7.1
Install database:
$ yum install postgresql.ppc64le
$ yum install postgresql-server.ppc64le
Start and configure:
$ service postgresql initdb
$ chkconfig postgresql on
$ service postgresql start
Modify vars script:
PGSQL_BIN=/usr/bin/
TABLESPACES=/var/lib/pgsql/tablespaces/
KETTLE_JOBS_PSQL=$HOME/transmart-data/env/tranSMART-ETL/Postgres/GPL-1.0/Kettle/Kettle-ETL/
KITCHEN=$HOME/transmart-data/env/data-integration/kitchen.sh
Changes in pg_hba.conf:
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
Create tranSMART postgres database:
% sudo -u postgres bash -c "source vars; PGSQL_BIN=/usr/bin/ PGDATABASE=template1 make -C ddl/postgres/GLOBAL tablespaces"
% make postgres
Copy the tranSMART configuration files:
% sudo bash -c "source vars; TSUSER_HOME=~rchen/ make -C config/ install"
% make -C env/ data-integration
% make -C env/ update_etl
After database setup and configuration, load datasets into PostgreSQL in the same way as on Ubuntu.
...