How to output gene location information with Perl

Updated: 2025-03-30. Source: SLTechnology News&Howtos (shulou), Development


This article shows how to use Perl to output gene location information from a GFF file, listing each gene's transcripts alongside its coordinates.

The script outputs gene locations sorted first by chromosome and then by position within the chromosome.

When organizing a genome's GFF file, we want to output the location of each gene together with the IDs of the transcripts that belong to it, ordered by chromosome. To do this, the gene records are stored in a Perl hash and the hash keys are sorted by value, using two values as sort keys: the chromosome name, then the gene's start coordinate. The script in this article extracts the location of every gene and its corresponding transcripts from the GFF file.
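The two-key hash sort described above can be tried on its own before reading a real file. The following is a minimal sketch, assuming a small made-up gene table (the gene IDs and coordinates are hypothetical, not taken from any GFF file). The `or` between the two comparisons falls through to the numeric start comparison only when the chromosome names compare equal.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical gene table: gene ID => [chromosome, start, end, strand]
my %gene_region = (
    geneA => ['chr2', 500,  900,  '+'],
    geneB => ['chr1', 3000, 4200, '-'],
    geneC => ['chr1', 150,  800,  '+'],
    geneD => ['chr2', 90,   400,  '-'],
);

# First key: chromosome name, compared as a string (cmp).
# Second key: start coordinate, compared as a number (<=>).
# cmp returns 0 for equal chromosomes, so "or" moves on to the second key.
my @sorted_ids = sort {
    $gene_region{$a}[0] cmp $gene_region{$b}[0]
        or
    $gene_region{$a}[1] <=> $gene_region{$b}[1]
} keys %gene_region;

# Print one tab-separated line per gene, in sorted order.
print join("\t", $_, @{ $gene_region{$_} }), "\n" for @sorted_ids;
```

With this table the genes come out as geneC, geneB (both chr1, by start), then geneD, geneA (both chr2, by start).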

The full Perl script is as follows:

    #!/usr/bin/perl -w
    use strict;
    use Cwd qw(abs_path getcwd);
    use Getopt::Long;
    use Data::Dumper;

    die "Usage: perl $0 <gff_file> <output_file>\n" unless (@ARGV == 2);

    my $gff         = $ARGV[0];
    my %gene        = ();    # gene ID => 1, used to check transcript parents
    my %gene_region = ();    # gene ID => [chr, start, end, strand]
    my %mRNA2Gene   = ();
    my %Gene2mRNA   = ();    # gene ID => [transcript IDs]

    open(IN,  '<', $gff)     or die "Cannot open $gff: $!";
    open(OUT, '>', $ARGV[1]) or die "Cannot write $ARGV[1]: $!";
    print OUT "#gene_ID\tchr\tstart\tend\tstrand\ttranscript_id\n";

    while (<IN>) {
        chomp;
        next if (/^#/);
        my @tmp = split(/\t/);
        next unless @tmp >= 9;    # skip blank or malformed lines

        # Gene rows: record chromosome, start, end and strand
        if ($tmp[2] =~ /^gene/) {
            my ($id) = ($tmp[8] =~ /ID=([^;]+)/);
            next unless defined $id;
            $gene{$id} = 1;
            $gene_region{$id} = [$tmp[0], $tmp[3], $tmp[4], $tmp[6]];
        }

        # Transcript rows: attach the transcript ID to its parent gene
        if ($tmp[2] =~ /mRNA|transcript/i) {
            my ($id)  = ($tmp[8] =~ /ID=([^;]+)/);
            my ($pid) = ($tmp[8] =~ /Parent=([^;]+)/);
            if (defined $id && defined $pid && exists $gene{$pid}) {
                push @{$Gene2mRNA{$pid}}, $id;
            }
        }
    }
    close(IN);

    # Multi-level sort: first by chromosome, then by gene start position
    for my $id (sort { $gene_region{$a}->[0] cmp $gene_region{$b}->[0]
                       or $gene_region{$a}->[1] <=> $gene_region{$b}->[1] }
                keys %gene_region) {
        # A gene without transcripts gets an empty list instead of undef
        print OUT "$id\t" . join("\t", @{$gene_region{$id}}, sort @{ $Gene2mRNA{$id} || [] }) . "\n";
    }
    close(OUT);
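One thing to keep in mind about the sort in the script: the chromosome column is compared with `cmp`, which is a string comparison, so a name like chr10 sorts before chr2. If numeric chromosome order matters for your output, the comparison could be swapped for something like the sketch below. This is only an illustration, not part of the original script: the helpers `chr_key` and `by_chr` are hypothetical names, and the sketch assumes chromosome names look like `chr1` or `1`, with anything else (chrX, chrM, scaffolds) placed after the numbered chromosomes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a chromosome name into (class, value).
# Class 0: numbered chromosomes such as "chr10" or "10", value is the number.
# Class 1: everything else, value is the name itself.
sub chr_key {
    my ($chr) = @_;
    return $chr =~ /^(?:chr)?(\d+)$/i ? (0, $1) : (1, $chr);
}

# Hypothetical comparator: numbered chromosomes first, in numeric order,
# then the remaining names in string order.
sub by_chr {
    my ($x, $y) = @_;
    my ($x_class, $x_val) = chr_key($x);
    my ($y_class, $y_val) = chr_key($y);
    return $x_class <=> $y_class if $x_class != $y_class;
    return $x_class == 0 ? $x_val <=> $y_val : $x_val cmp $y_val;
}

my @chrs    = qw(chr10 chr2 chrX chr1 chrM);
my @plain   = sort @chrs;                       # default string order
my @natural = sort { by_chr($a, $b) } @chrs;    # numeric-aware order

print "string order:  @plain\n";
print "natural order: @natural\n";
```

In the script, the first sort key would then become `by_chr($gene_region{$a}->[0], $gene_region{$b}->[0])`, with the start-position comparison left unchanged after the `or`.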
