Updated 2025-01-18. Source: SLTechnology News&Howtos (Shulou.com)
This article explains how to filter transcripts by class_code: given the merged.combined.gtf file produced by gffcompare, how to keep only the transcripts that carry selected class codes.
Problem description
The basic steps for identifying long non-coding RNA (lncRNA) from a strand-specific library are:
HISAT2 aligns the raw sequencing reads to the reference genome.
samtools produces the sorted BAM file.
StringTie assembles the transcripts of each sample into a GTF file:
stringtie -p 12 --rf -o transcripts.gtf sorted.bam
gffcompare merges the GTF files of all samples:
ls *.gtf > list.merged.txt
~/Biotools/gffcompare/gffcompare -r reference.gtf -i list.merged.txt -o merged
This produces merged.combined.gtf, in which each transcript is assigned a class_code that describes its position relative to the reference annotation.
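The assembly and merge commands above can be generated from a script when there are many samples. The following is a minimal sketch, assuming each sample has a sorted BAM named <sample>.sorted.bam (a hypothetical naming scheme, as are the sample names) and that stringtie and gffcompare are on PATH. It only builds and prints the command lines with the flags shown above; it does not run them.

```python
import shlex


def stringtie_cmd(sample, threads=12):
    """Build the StringTie command for one sample (file names are hypothetical)."""
    return ["stringtie", "-p", str(threads), "--rf",
            "-o", f"{sample}.gtf", f"{sample}.sorted.bam"]


def gffcompare_cmd(reference_gtf, gtf_list, prefix="merged"):
    """Build the gffcompare command that merges the GTF files listed in gtf_list."""
    return ["gffcompare", "-r", reference_gtf, "-i", gtf_list, "-o", prefix]


samples = ["sampleA", "sampleB"]  # hypothetical sample names
commands = [stringtie_cmd(s) for s in samples]
commands.append(gffcompare_cmd("reference.gtf", "list.merged.txt"))

# Print the commands so they can be reviewed before being run in a shell.
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments avoids quoting problems if the same lists are later passed to subprocess.run.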
The figure illustrating the class codes (not reproduced here) comes from the paper "GFF Utilities: GffRead and GffCompare".
lncRNA candidates are usually taken from the transcripts with class_code u, x or i. For example, the paper "Global identification of Arabidopsis lncRNAs reveals the regulation of MAF4 by a natural antisense RNA" mentions: only transcripts with TAIR10 annotation [Cufflinks class codes 'u' (intergenic transcripts), 'x' (Exonic overlap with reference on the opposite strand), 'i' (transcripts entirely within intron)] were retained.
So the question is: how can merged.combined.gtf be used to obtain a GTF file that contains only the transcripts with class_code u, x or i?
One way is the Python module pyGTF; its GitHub repository is https://github.com/chengcz/pyGTF
It can be installed directly with pip:
pip install pyGTF
The module parses annotation files in GTF format.
With it, a simple script can be written:
import sys
from pyGTF import GTFReader

in_gtf = sys.argv[1]       # input GTF, e.g. merged.combined.gtf
class_code = sys.argv[2]   # class code to keep, e.g. u, x or i
out_gtf = sys.argv[3]      # output GTF

with open(out_gtf, 'w') as fw:
    with GTFReader(in_gtf, flag_stream=True) as fi:
        for i in fi:
            # write the transcript only if its class_code matches
            if i._attri['class_code'] == class_code:
                i.to_gtf(fw)
Usage (the script is saved as 01.py; the second argument is the class code to keep):
python 01.py in.gtf i out.gtf
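The script above keeps one class code per run. As an alternative, here is a minimal standard-library sketch that keeps several class codes at once, without pyGTF. It assumes that class_code is written only on the transcript lines of merged.combined.gtf and that exon lines share the transcript_id of their transcript; the function names and the demo records are hypothetical.

```python
import re

# Matches GTF attribute pairs such as: transcript_id "TCONS_00000001";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Turn the ninth GTF column into a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(field))


def filter_by_class_code(lines, wanted):
    """Return the GTF lines of transcripts whose class_code is in `wanted`."""
    records = []
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) < 9:
            continue  # skip comments and malformed lines
        records.append((line, cols[2], parse_attributes(cols[8])))

    # First pass: collect the ids of transcripts with a wanted class code.
    keep_ids = {attrs["transcript_id"] for _, feature, attrs in records
                if feature == "transcript"
                and attrs.get("class_code") in wanted
                and "transcript_id" in attrs}

    # Second pass: keep transcript and exon lines belonging to those ids.
    return [line for line, _, attrs in records
            if attrs.get("transcript_id") in keep_ids]


# Hypothetical demo records in the layout assumed above.
demo = [
    'chr1\tStringTie\ttranscript\t100\t500\t.\t+\t.\ttranscript_id "TCONS_1"; gene_id "XLOC_1"; class_code "u";\n',
    'chr1\tStringTie\texon\t100\t500\t.\t+\t.\ttranscript_id "TCONS_1"; gene_id "XLOC_1"; exon_number "1";\n',
    'chr1\tStringTie\ttranscript\t900\t1500\t.\t-\t.\ttranscript_id "TCONS_2"; gene_id "XLOC_2"; class_code "=";\n',
    'chr1\tStringTie\texon\t900\t1500\t.\t-\t.\ttranscript_id "TCONS_2"; gene_id "XLOC_2"; exon_number "1";\n',
]
kept = filter_by_class_code(demo, {"u", "x", "i"})
```

For a real file, the lines can be read with open("merged.combined.gtf") and the returned list written out with writelines. Two passes are used so that the result does not depend on transcript lines preceding their exon lines.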
Another thing learned today: samtools can report the sequence lengths in a FASTA file and extract sequences by name.
Reference
https://www.cnblogs.com/xudongliang/p/5200655.html
Use the command
samtools faidx input.fasta
This generates an index file, input.fasta.fai, with five tab-separated columns. The first column is the sequence name, the second is the sequence length, and the fourth is the number of bases per line.
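Since the index is a plain tab-separated file, the sequence lengths can be read from it with a few lines of Python. This is a minimal sketch; the function name read_fai and the two example records (including their lengths and offsets) are hypothetical.

```python
def read_fai(lines):
    """Map sequence name -> sequence length from the lines of a .fai index."""
    lengths = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        # Columns: name, length, byte offset, bases per line, bytes per line
        name, length, offset, linebases, linewidth = line.rstrip("\n").split("\t")[:5]
        lengths[name] = int(length)
    return lengths


# Hypothetical index content for two sequences.
example_fai = [
    "TCONS_00000018\t1250\t16\t60\t61\n",
    "TCONS_00000019\t830\t1303\t60\t61\n",
]
lengths = read_fai(example_fai)
```

For a real index, pass an open file handle instead of the list, for example read_fai(open("input.fasta.fai")).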
To extract a sequence, pass its name after the FASTA file. The example extracts a single sequence; samtools faidx also accepts several names on one command line.
samtools faidx input.fasta TCONS_00000018 > TCONS_00000018.fa
A position range can be added after the name:
samtools faidx input.fasta TCONS_00000018:1-10
>TCONS_00000018:1-10
TGGGCGAACG
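The region syntax name:start-end is 1-based and includes both ends, which is why 1-10 returns ten bases. The following is a minimal pure-Python sketch of that coordinate convention, not of how samtools works internally: it reads the whole FASTA into memory instead of using the index. The function names are hypothetical, and in the demo only the first ten bases come from the output above; the rest are made up.

```python
def read_fasta(lines):
    """Read FASTA lines into a dict of sequence name -> sequence string."""
    parts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # name is the first word of the header
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(p) for n, p in parts.items()}


def fetch(seqs, region):
    """Return a whole sequence ('name') or a 1-based inclusive slice ('name:start-end')."""
    name, sep, span = region.rpartition(":")
    if not sep:
        return seqs[region]
    start, end = (int(x) for x in span.split("-"))
    # 1-based inclusive [start, end] becomes the 0-based slice [start-1:end]
    return seqs[name][start - 1:end]


# Demo: bases after the first ten are hypothetical.
demo_fasta = [">TCONS_00000018", "TGGGCGAACGTTAC", "GGATC"]
seqs = read_fasta(demo_fasta)
first_ten = fetch(seqs, "TCONS_00000018:1-10")
```

Here first_ten holds the same ten bases as the samtools output shown above.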