Skip to end of metadata
Go to start of metadata

i2b2-related and patient information

Stored procedure to delete all clinical + mRNA data

On Postgresql in pqsl

 SELECT TM_CZ.I2B2_BACKOUT_TRIAL('GSE8581', '\Public Studies\COPD GSE8581\', NULL);

On Oracle in sqlplus

 EXECUTE TM_CZ.I2B2_BACKOUT_TRIAL('GSE8581', '\Public Studies\COPD GSE8581\', NULL);

This will also delete the mRNA data in de_subject_microarray_data (but not its platform)

Alternatively, delete clinical data manually (instead of above-mentioned 'backout' stored procedure):

Manual delete clinical data

delete from I2B2METADATA.I2B2 where c_fullname = '\Public Studies\{STUDYNAME}\';
delete from I2B2METADATA.I2B2_SECURE where c_fullname = '\Public Studies\{STUDYNAME}\';
delete from I2B2DEMODATA.CONCEPT_DIMENSION where concept_path = '\Public Studies\{STUDYNAME}\';
delete from I2B2DEMODATA.PATIENT_DIMENSION where SOURCESYSTEM_CD like '{STUDYID}:%'; --(Study_ID always UPPERCASE!)
delete from I2B2DEMODATA.OBSERVATION_FACT where SOURCESYSTEM_CD like '{STUDYID}'; --(Study_ID always UPPERCASE!)

Manual delete mRNA data

delete from DEAPP.DE_SUBJECT_MICROARRAY_DATA where TRIAL_NAME like '{STUDYID}';
delete from DEAPP.DE_SUBJECT_SAMPLE_MAPPING where TRIAL_NAME like '{STUDYID}';

Please note that the SQL queries mentioned only delete the mRNA data and not the i2b2 concept and observations related to the mRNA data (e.g. the nodes in the tree)

To delete mRNA nodes:

delete from I2B2METADATA.I2B2 where c_fullname like '\\Public Studies\\{STUDYNAME}\\ {topnode of mRNA data}\\%';
delete from I2B2METADATA.I2B2_SECURE where c_fullname like '\\Public Studies\\{STUDYNAME}\\ {topnode of mRNA data}\\%';
delete from I2B2DEMODATA.CONCEPT_DIMENSION where concept_path like '\\Public Studies\\{STUDYNAME}\\ {topnode of mRNA data}\\%';

Delete gene expression platform files
You have to delete from these tables:
delete from deapp.de_gpl_info where platform=<platform>;
delete from deapp.de_mrna_annotation where gpl_id=<platform>;

The first command removed the platform from the list of loaded platforms, and is needed to allow the platform to be reloaded later.

The second command removes the platform annotation data.

Make sure that you only delete the annotation data for the particular GPL_ID and not all.
  • Be aware that deleting platform data might also break analysis of expression for any other loaded studies that use the same platform!
  • The i2b2_backout_trial procedure above keeps platform data because other studies may be loaded with references to it.


VCF data

First create dump to be sure:

sudo -u postgres LANG=en_US pg_dump transmart > transmart_<version>.sql
(Use locale -a to check which locales you have)

delete from deapp.de_variant_subject_detail where dataset_id='TCGAOV';
delete from deapp.de_variant_subject_summary where dataset_id='TCGAOV';
delete from deapp.de_variant_subject_idx where dataset_id='TCGAOV';
delete from deapp.de_variant_population_data where dataset_id='TCGAOV';
delete from deapp.de_variant_population_info where dataset_id='TCGAOV';
delete from deapp.de_variant_metadata where dataset_id='TCGAOV';
delete from deapp.de_variant_dataset where dataset_id='TCGAOV';
delete from deapp.de_subject_sample_mapping where trial_name='TCGAOV' and platform='VCF';

delete from deapp.de_gpl_info where platform = 'VCF_hg36';

Please note that the SQL queries mentioned there only delete the VCF data and not the i2b2 concept and observations related to the VCF data. These can be removed using the i2b2_backout_trial procedure (with all clinical data) or manually. These i2b2 tables to check for the vcf nodes are: i2b2metadata.i2b2, i2b2metadata.i2b2_secure, i2b2demodata.concept_dimension. For example:

delete from I2B2METADATA.I2B2 where c_fullname = '\Private Studies\TCGAOV\DNASeq\';
delete from I2B2METADATA.I2B2_SECURE where c_fullname = '\Private Studies\TCGAOV\DNASeq\';
delete from I2B2DEMODATA.CONCEPT_DIMENSION where concept_path = '\Private Studies\TCGAOV\DNASeq\';

Proteomics

/* select all protein platform data */
SELECT * FROM TM_CZ.LT_PROTEIN_ANNOTATION;
SELECT * FROM DEAPP.DE_GPL_INFO WHERE GPL_ID = '%{PROTEOMICS_PLATFORM_ID}%';
SELECT * FROM DEAPP.DE_PROTEIN_ANNOTATION;

/* delete all protein (high-dim) data */

DELETE from DEAPP.DE_SUBJECT_PROTEIN_DATA where TRIAL_NAME like '%{STUDY_ID}%';
DELETE from DEAPP.DE_SUBJECT_SAMPLE_MAPPING where TRIAL_NAME like '%{STUDY_ID}%';
commit;

Note: always run 'commit;' if you want to delete from Oracle database.

Metabolomics

/* delete all metabolomics platform data */
DELETE FROM DEAPP.DE_GPL_INFO WHERE PLATFORM = '%{METABOLOMICS_PLATFORM_ID}%';
DELETE FROM deapp.de_metabolite_annotation WHERE GPL_ID = '%{METABOLOMICS_PLATFORM_ID}%';
DELETE FROM deapp.de_metabolite_sub_pathways WHERE GPL_ID = '%{METABOLOMICS_PLATFORM_ID}%';
DELETE FROM deapp.de_metabolite_super_pathways WHERE GPL_ID = '%{METABOLOMICS_PLATFORM_ID}%';

/* delete metabolomics (high-dim) data -- Don't forget to also remove the nodes from i2b2 tables (see below)*/

DELETE from DEAPP.de_subject_metabolomics_data where TRIAL_NAME like '%{STUDY_ID}%';
DELETE from DEAPP.DE_SUBJECT_SAMPLE_MAPPING where TRIAL_NAME like '%{STUDY_ID}%';

/* delete metabolomics concept paths (actually, the nodes itself, from the dataset explorer tree) */

delete from I2B2METADATA.I2B2 where c_fullname like '\\Private Studies\\Metabolo_UNGO140411\\{topnode of mRNA data}\\%';
delete from I2B2METADATA.I2B2_SECURE where c_fullname like '\\Private Studies\\Metabolo_UNGO140411\\{topnode of mRNA data}\\%';
delete from I2B2DEMODATA.CONCEPT_DIMENSION where concept_path like '\\Private Studies\\Metabolo_UNGO140411\{topnode of mRNA data}\\%'; 

commit;

Note: always run 'commit;' if you want to delete from Oracle database.


RNAseq

Data

delete from deapp.de_subject_rnaseq_data where trial_name='CELL-LINE';

Platform

delete from deapp.de_chromosomal_region where gpl_id='Illumina';

  • Be aware that deleting platform data might also affect other studies that use this data!


aCGH

Data

Seems to be deleted with backout trial. Otherwise:

delete from deapp.de_subject_acgh_data where trial_name='CELL-LINE';

Platform

delete from deapp.de_chromosomal_region where gpl_id='244k-gene';

  • Be aware that deleting platform data might also affect other studies that use this data!


Deleting Program/Browse Data

Listing folders

To see all the folders and their numeric folder_ids:

select folder_level,folder_id,folder_type,folder_name,folder_full_name from fmapp.fm_folder order by folder_full_name; 

Subfolders (studies, assays, analysis, folders) will be listed below their parents in the output.

Deleting Folders

These are the instructions on how to delete folders from the backend.

The following components will be deleted if the SQL statement below is run.  The components will be deleted in the order listed.  

  • Business Object associated with the folder being deleted
    • In the case of a study, the relevant record in BIOMART.BIO_EXPERIMENT
    • In the case of a gene expression analysis, the relevant record in  BIOMART.BIO_ASSAY_ANALYSIS
    • Business Object's UID from BIOMART.BIO_DATA_UID
    • Folder Association record (from FMAPP.FM_FOLDER_ASSOCIATION table)
  • File Association record(s) (from FMAPP.FM_FOLDER_FILE_ASSOCIATION)
    Note: this only applies to folders that contain files
    • Folder Record (from FMAPP.FM_FOLDER)
    • Folder UID Record (from FMAPP.FM_DATA_UID)
    • Folder Template Association (from AMAPP.AM_TAG_TEMPLATE_ASSOCIATION)
    • Associated Metadata (from AMAPP.AM_TAG_ASSOCIATION)

This is the SQL to perform the deletions described above. (replace 'REPLACEME' with desired numeric folder ID):

/*Uncomment this if you are deleting a study*/
/*DELETE FROM BIOMART.BIO_EXPERIMENT WHERE bio_experiment_id IN
(Select bio_data_id FROM BIOMART.BIO_DATA_UID WHERE UNIQUE_ID IN
(Select object_uid from FMAPP.fm_folder_association WHERE FOLDER_ID = REPLACEME));
DELETE FROM BIOMART.BIO_DATA_UID WHERE BIO_DATA_TYPE = 'BIO_EXPERIMENT' AND BIO_DATA_ID NOT IN (SELECT BIO_EXPERIMENT_ID FROM BIOMART.BIO_EXPERIMENT);
DELETE from searchapp.search_auth_sec_object_access where secure_object_id in
(select search_secure_object_id from searchapp.search_secure_object where bio_data_id not in (select bio_data_id from biomart.bio_data_uid) and data_type='BIO_CLINICAL_TRIAL');
DELETE from searchapp.search_secure_object where bio_data_id not in (select bio_data_id from biomart.bio_data_uid) and data_type='BIO_CLINICAL_TRIAL';*/ 
/*Uncomment this if you are deleting an analysis*/
/*DELETE FROM BIOMART.BIO_ASSAY_ANALYSIS WHERE bio_assay_analysis_id IN
(Select bio_data_id FROM BIOMART.BIO_DATA_UID WHERE UNIQUE_ID IN
(Select object_uid from FMAPP.fm_folder_association WHERE FOLDER_ID =REPLACEME));
DELETE FROM BIOMART.BIO_DATA_UID WHERE BIO_DATA_TYPE = 'BIO_ASSAY_ANLYSIS' AND BIO_DATA_ID NOT IN (SELECT BIO_ASSAY_ANALYSIS_ID FROM BIOMART.BIO_ASSAY_ANALYSIS);*/
/*if you are deleting a folder (as opposed to study or analysis) you can start here*/
delete from fmapp.fm_folder_association where folder_id in (select folder_id from fmapp.fm_folder where folder_id = REPLACEME);
delete from fmapp.fm_folder_FILE_association where folder_id in (select folder_id from fmapp.fm_folder where folder_id = REPLACEME);
delete from fmapp.fm_folder where folder_id = REPLACEME;
delete from fmapp.fm_data_uid where not exists
(select * from fmapp.fm_folder where folder_id = fm_Data_uid.fm_data_id and fm_data_type = 'FM_FOLDER')
and not exists
(select * from fmapp.fm_file where file_id = fm_Data_uid.fm_data_id and fm_data_type = 'FM_FILE');
DELETE FROM amapp.am_tag_TEMPLATE_association WHERE OBJECT_uid NOT IN
(SELECT UNIQUE_ID FROM FMAPP.FM_DATA_UID);
DELETE FROM amapp.am_tag_association WHERE SUBJECT_uid NOT IN
(SELECT UNIQUE_ID FROM FMAPP.FM_DATA_UID);
Delete from amapp.am_data_uid where unique_id not in
(select object_uid from amapp.am_tag_association);
delete from amapp.am_tag_value where tag_value_id not in
(select am_data_id from amapp.am_data_uid);


  • No labels