
Version 17.1 and higher

Adding de novo transcripts of a known gene to the tranSMART dictionary works in essentially the same way as adding regular transcripts.

Regular transcripts are loaded with two dictionary files. Both contain Ensembl transcript IDs: one links each ID to its associated transcript name, and the other links it to an Entrez gene ID. Since de novo transcripts have neither an official ID nor an associated name, the uploader can choose both, as long as they are unique.

Adding de novo transcripts does not require reloading previously uploaded transcripts; providing only the new ones is sufficient.


For example, to add two de novo transcripts that belong to TP53 (Entrez gene ID: 7157), create the following two tab-separated files.

transcript_id_name_mapping.txt
Ensembl Transcript ID	Associated Transcript Name
DENOVO00000001	TP53_DENOVO1
DENOVO00000002	TP53_DENOVO2

transcript_id_gene_mapping.txt
Ensembl Transcript ID	EntrezGene ID
DENOVO00000001	7157
DENOVO00000002	7157
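
The two files can also be generated with a short script, which keeps the tab separators intact and catches duplicate identifiers early. The following is a minimal sketch, not part of transmart-loader: the function name `write_mapping_files` and its parameters are hypothetical, and it checks uniqueness only within the new entries, not against IDs already in the tranSMART dictionary.

```python
import csv
from pathlib import Path


def write_mapping_files(transcripts, entrez_gene_id, out_dir="."):
    """Write the two tab-separated dictionary files for de novo transcripts.

    transcripts: list of (transcript_id, transcript_name) pairs chosen by the uploader.
    entrez_gene_id: Entrez gene ID of the known gene the transcripts belong to.
    """
    ids = [t_id for t_id, _ in transcripts]
    names = [name for _, name in transcripts]
    # Uploader-chosen IDs and names must be unique (checked only within this batch).
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate transcript IDs")
    if len(set(names)) != len(names):
        raise ValueError("duplicate transcript names")

    out_dir = Path(out_dir)
    name_file = out_dir / "transcript_id_name_mapping.txt"
    gene_file = out_dir / "transcript_id_gene_mapping.txt"

    # File 1: transcript ID -> transcript name, with the header row shown above.
    with name_file.open("w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["Ensembl Transcript ID", "Associated Transcript Name"])
        writer.writerows(transcripts)

    # File 2: transcript ID -> Entrez gene ID (the same gene for every transcript).
    with gene_file.open("w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["Ensembl Transcript ID", "EntrezGene ID"])
        writer.writerows((t_id, entrez_gene_id) for t_id in ids)

    return name_file, gene_file


if __name__ == "__main__":
    write_mapping_files(
        [("DENOVO00000001", "TP53_DENOVO1"), ("DENOVO00000002", "TP53_DENOVO2")],
        entrez_gene_id=7157,
    )
```

Run as a script, this writes the two TP53 example files into the current directory.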

 

Then, from the transmart-loader directory, run the following two commands in this order:

java -cp target/loader-jar-with-dependencies.jar org.transmartproject.pipeline.dictionary.TranscriptDictionary <path_to_file>/transcript_id_name_mapping.txt

java -cp target/loader-jar-with-dependencies.jar org.transmartproject.pipeline.dictionary.GeneTransDictionary <path_to_file>/transcript_id_gene_mapping.txt
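
If the two loads are scripted, the order can be enforced by stopping as soon as the first command fails. The sketch below wraps the two commands above; the helper names `build_loader_commands` and `run_loaders` are hypothetical, and it assumes that the loader exits with a non-zero status on failure, which is not confirmed here. Pass `file_dir` as an absolute path, because the commands run with the transmart-loader directory as the working directory.

```python
import subprocess
from pathlib import Path

JAR = "target/loader-jar-with-dependencies.jar"
# Loader classes in the order given above: transcript names first, then gene links.
LOADERS = [
    ("org.transmartproject.pipeline.dictionary.TranscriptDictionary",
     "transcript_id_name_mapping.txt"),
    ("org.transmartproject.pipeline.dictionary.GeneTransDictionary",
     "transcript_id_gene_mapping.txt"),
]


def build_loader_commands(file_dir):
    """Return the two java command lines as argument lists, in load order."""
    return [
        ["java", "-cp", JAR, main_class, str(Path(file_dir) / file_name)]
        for main_class, file_name in LOADERS
    ]


def run_loaders(file_dir, loader_dir="."):
    """Run both loaders from the transmart-loader directory; stop on the first failure."""
    for command in build_loader_commands(file_dir):
        # check=True raises CalledProcessError if java exits with a non-zero status
        # (assumption: the loader signals failure through its exit status).
        subprocess.run(command, cwd=loader_dir, check=True)
```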
