How to download TCGA data using TCGAbiolinks

2025-03-05 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/02 Report

This article explains how to use TCGAbiolinks to query and download TCGA data, and how to get the downloaded files into a form that is ready for analysis.

TCGAbiolinks is an R package for analyzing and processing TCGA data. It queries and downloads TCGA data through the GDC API, and it also provides common analysis functions such as differential analysis, survival analysis and enrichment analysis.

http://bioconductor.org/packages/release/bioc/html/TCGAbiolinks.html

The basic usage of this R package is described below.

1. Query

Querying works much like searching on the GDC web page, except that the options selectable on the page become function parameters.

The query is built around the project parameter, and the other parameters filter the data. The following parameters are commonly used.

data.category

data.type

workflow.type

experimental.strategy

platform

access

The above parameters correspond to the selection menus on the GDC web page.

In addition, there are two other important parameters. The legacy parameter defaults to FALSE, which means the query runs against the harmonized database; TRUE means it runs against the GDC legacy archive. The barcode parameter restricts the query to a subset of the samples.
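As a minimal sketch of such a query, the call below uses GDCquery, the query function of TCGAbiolinks. The project TCGA-CHOL and all filter values are arbitrary choices for illustration, and "STAR - Counts" assumes the workflow naming of current GDC releases (older releases used other names). The legacy parameter is not passed, because newer versions of the package may no longer accept it.

```r
# Install once if needed: BiocManager::install("TCGAbiolinks")
library(TCGAbiolinks)

# project is the core of the query; the remaining arguments filter the files.
# All values below are illustrative assumptions, not taken from the article.
query <- GDCquery(
  project = "TCGA-CHOL",
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  workflow.type = "STAR - Counts",
  experimental.strategy = "RNA-Seq",
  access = "open"
)
```

Running this needs network access to GDC. Each argument can be dropped to widen the search, as long as project is kept.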

The basic unit of the query result is a file. The files can be viewed with the following code (datatable comes from the DT package)

datatable(getResults(query))

The result is an HTML table, viewed as a web page, with each row representing a file.
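The same results can also be inspected directly in the R console, and the barcode parameter mentioned earlier can reuse them. The sketch below repeats an illustrative query (the project and filter values are assumptions), lists the barcodes found in the cases column of the result table, and runs the query again for the first three of them only.

```r
library(TCGAbiolinks)

# Illustrative query; project and filter values are assumptions.
query <- GDCquery(
  project = "TCGA-CHOL",
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  workflow.type = "STAR - Counts"
)

files <- getResults(query)   # one row per file
print(dim(files))
print(head(files$cases))     # sample barcodes linked to the files

# Restrict the same query to a few samples through the barcode parameter.
subset_barcodes <- head(files$cases, 3)
query_subset <- GDCquery(
  project = "TCGA-CHOL",
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  workflow.type = "STAR - Counts",
  barcode = subset_barcodes
)
print(nrow(getResults(query_subset)))
```

Taking the barcodes from an existing query avoids typing them by hand; any character vector of TCGA barcodes could be passed instead.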

2. Download

Besides viewing the query results, you can also download the corresponding files.

The process has two steps. The first step, GDCdownload, downloads the raw data from GDC, either through the API or through gdc-client; the API is the faster of the two. The second step, GDCprepare, organizes the raw data: GDC delivers each file separately, so the files must be combined before they can be used in downstream analysis. For expression profile data, for example, the samples need to be merged and the sample IDs converted, all of which GDCprepare handles.
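The two steps might look like the following sketch. The query and the value of files.per.chunk are illustrative assumptions; running the code needs network access to GDC and writes the downloaded files under the default GDCdata directory.

```r
library(TCGAbiolinks)

# Illustrative query; project and filter values are assumptions.
query <- GDCquery(
  project = "TCGA-CHOL",
  data.category = "Transcriptome Profiling",
  data.type = "Gene Expression Quantification",
  workflow.type = "STAR - Counts"
)

# Step 1: download the raw files through the GDC API, a few files per request.
GDCdownload(query, method = "api", files.per.chunk = 10)

# Step 2: merge the per-file data into a single object.
data <- GDCprepare(query)
```

Passing method = "client" to GDCdownload switches the download to gdc-client.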

The prepared results are stored in the data object. For expression data this is a SummarizedExperiment, so the sample information can be viewed with colData(data) and the expression matrix with assay(data).
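To show what the sample information and the expression matrix look like without downloading anything, the sketch below builds a tiny stand-in object by hand and reads it with colData and assay from the SummarizedExperiment package. The gene names, sample names, counts and the assay name are all made up; in a real session data would come from GDCprepare(query).

```r
library(SummarizedExperiment)

# Toy stand-in for the object returned by GDCprepare(); all values are made up.
counts <- matrix(
  c(10L, 0L, 25L, 12L, 3L, 30L),
  nrow = 3,
  dimnames = list(c("gene_1", "gene_2", "gene_3"), c("sample_A", "sample_B"))
)
data <- SummarizedExperiment(
  assays = list(unstranded = counts),  # assay name is an assumption
  colData = DataFrame(
    sample_type = c("Primary Tumor", "Solid Tissue Normal"),
    row.names = colnames(counts)
  )
)

sample_info <- colData(data)  # one row per sample
expr <- assay(data)           # genes in rows, samples in columns

print(sample_info)
print(dim(expr))
print(expr[1:2, ])
```

When an object holds more than one assay, assay(data) returns the first one; assayNames(data) lists the names of all of them.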

Once the data has been downloaded and prepared, it is ready for analysis. Different types of data call for different analysis methods; for the specific methods, please refer to the official documentation.
