
What is the GFF3 format?

2025-01-18 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article explains the GFF3 format in detail. The editor finds it very practical and shares it here as a reference; hopefully you will take something away from it.

GFF3 format description

Each line of a GFF3 file describes one sequence feature (except comment lines, which begin with #). Each line has exactly nine columns, that is, each sequence feature has nine fields. Columns must be separated by tab characters only, and if a field of a feature is empty, "." must be written in its place. The format is as follows:

2L	FlyBase	transcript	7529	9484	.	+	.	ID=FBtr0300690;Parent=FBgn0031208;Name=CG11023-RC;biotype=protein_coding;transcript_id=FBtr0300690
2L	FlyBase	five_prime_UTR	7529	7679	.	+	.	Parent=FBtr0300690
2L	FlyBase	exon	7529	8116	.	+	.	Parent=FBtr0300690;Name=FBtr0300690-1
2L	FlyBase	CDS	7680	8116	.	+	0	ID=CDS:FBpp0289914;Parent=FBtr0300690;protein_id=FBpp0289914
2L	FlyBase	exon	8193	8589	.	+	.	Parent=FBtr0300690;Name=FBtr0300690-2
2L	FlyBase	CDS	8193	8589	.	+	1	ID=CDS:FBpp0289914;Parent=FBtr0300690;protein_id=FBpp0289914
2L	FlyBase	CDS	8668	9276	.	+	0	ID=CDS:FBpp0289914;Parent=FBtr0300690;protein_id=FBpp0289914
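As a minimal sketch of the row layout described above, the following Python snippet splits one GFF3 line into its nine tab-separated columns and maps "." to None. The names `Feature` and `parse_gff3_line` are hypothetical helpers invented for this illustration; they do not come from any library.

```python
from typing import NamedTuple, Optional


class Feature(NamedTuple):
    """One GFF3 row (hypothetical helper type; one field per column)."""
    seqid: str
    source: Optional[str]
    ftype: str
    start: int
    end: int
    score: Optional[float]
    strand: Optional[str]
    phase: Optional[int]
    attributes: str


def parse_gff3_line(line: str) -> Optional[Feature]:
    """Parse one line; return None for blank lines and '#' comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    # "." marks an empty field, so it is mapped to None here.
    return Feature(
        seqid=seqid,
        source=None if source == "." else source,
        ftype=ftype,
        start=int(start),
        end=int(end),
        score=None if score == "." else float(score),
        strand=None if strand == "." else strand,
        phase=None if phase == "." else int(phase),
        attributes=attrs,
    )


row = ("2L\tFlyBase\tCDS\t7680\t8116\t.\t+\t0\t"
       "ID=CDS:FBpp0289914;Parent=FBtr0300690;protein_id=FBpp0289914")
feature = parse_gff3_line(row)
print(feature.ftype, feature.start, feature.end, feature.strand, feature.phase)
```

Rejecting any line that does not have exactly nine columns is a deliberate choice in this sketch: a file whose columns were separated by spaces instead of tabs then fails loudly rather than being misread.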

The nine columns from left to right are:

1. seqid: the name of the scaffold or chromosome.
2. source: the name of the software that produced the sequence feature, or the data source (database name or project name).
3. type: the type of the sequence feature, such as mRNA, CDS, etc.
4. start: the start position of the feature on the scaffold or chromosome, counting from 1.
5. end: the end position of the feature on the scaffold or chromosome, also counting from 1.
6. score: the score of the feature, generally the E-value for sequence similarity features and the P-value for ab initio gene prediction features.
7. strand: "+" means the feature is on the positive (forward) strand of the scaffold or chromosome, and "-" means it is on the negative (reverse) strand.
8. phase: can be "0", "1" or "2". "0" means the first base of the feature is the first base of a codon, "1" means the second base of the feature is the first base of the first complete codon, and so on. Features other than CDS carry "." in this column, as in the example rows.
9. attributes: other attributes of the feature. There can be several, and they must be separated by ";", for example "ID=some-id;Name=some-name;Parent=some-parent".

Please note the Parent attribute. Sequence features are nested: one feature (e.g. an exon) may belong to another feature (e.g. a gene). The Parent attribute states which feature the current feature belongs to. If a feature has no Parent attribute, its parent is the scaffold or chromosome itself.
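The last two columns are perhaps the easiest to get wrong, so here is a hedged Python sketch of both, using values from the example rows earlier in the article. `parse_attributes` and `next_phase` are hypothetical helper names invented for this illustration. The sketch assumes attribute values contain no escaped ";" or "=" characters, and `next_phase` assumes the segments of one coding sequence are visited in translation order.

```python
from collections import defaultdict


def parse_attributes(column9: str) -> dict:
    """Split 'tag=value;tag=value' into a dict (values kept as raw strings)."""
    attrs = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing ';'
        tag, _, value = pair.partition("=")
        attrs[tag] = value
    return attrs


def next_phase(start: int, end: int, phase: int) -> int:
    """Phase expected for the following CDS segment of the same coding sequence."""
    # Bases left after skipping the `phase` leading bases of this segment.
    coding_bases = (end - start + 1) - phase
    # Bases still missing from the last, incomplete codon of this segment.
    return (3 - coding_bases % 3) % 3


# Column 3 (type) and column 9 (attributes) of three of the example rows.
rows = [
    ("transcript", "ID=FBtr0300690;Parent=FBgn0031208;Name=CG11023-RC"),
    ("exon", "Parent=FBtr0300690;Name=FBtr0300690-1"),
    ("CDS", "ID=CDS:FBpp0289914;Parent=FBtr0300690;protein_id=FBpp0289914"),
]

children = defaultdict(list)  # parent ID -> types of its child features
for ftype, column9 in rows:
    parent = parse_attributes(column9).get("Parent")
    if parent is not None:
        # A feature may list several parents, separated by commas.
        for parent_id in parent.split(","):
            children[parent_id].append(ftype)

print(dict(children))
print(next_phase(7680, 8116, 0), next_phase(8193, 8589, 1))
```

The first print groups the exon and the CDS under the transcript FBtr0300690, and the transcript under the gene FBgn0031208. The second prints 1 and 0, which match the phases of the second and third CDS rows in the example.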
