2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
How to batch download TCGA data with gdc-client
Many newcomers are unsure how to download TCGA data in bulk with gdc-client. This article walks through the process in detail.
The online download function of the GDC portal is only suitable for small data sets. To download a large volume of TCGA data, use gdc-client, the client tool provided by GDC. It is available at
https://gdc.cancer.gov/access-data/gdc-data-transfer-tool
gdc-client is a command-line tool that supports Windows, Linux, and macOS. You can download files with it in the following two ways.
1. Manifest
First, filter for the data set you are interested in on the GDC online database, then add the files to the cart with the shopping cart icon.
Click the Cart button on the navigation bar, then click to download the Manifest file.
The manifest is a tab-separated text file with one row per file.
The first column is the UUID of the file. In the GDC database, every item is uniquely identified by a UUID. To download in bulk with a manifest file, run
gdc-client download -m gdc_manifest_20190610_105445.txt
The files are downloaded to the current directory, and each file is saved in a folder named after its UUID.
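Because each file lands in its own UUID folder, it is easy to check whether a bulk download finished. The following is a minimal Python sketch, not part of gdc-client. It assumes the manifest has a header row with `id` and `filename` columns and that files are stored as `<download_dir>/<id>/<filename>`. The function names `read_manifest` and `missing_files` are hypothetical helpers introduced here for illustration.

```python
import csv
from pathlib import Path


def read_manifest(path):
    """Return the rows of a tab-separated manifest as a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def missing_files(manifest_rows, download_dir="."):
    """List the UUIDs whose file is absent from <download_dir>/<id>/<filename>.

    Assumes the manifest columns are named "id" and "filename".
    """
    root = Path(download_dir)
    return [
        row["id"]
        for row in manifest_rows
        if not (root / row["id"] / row["filename"]).is_file()
    ]
```

Calling `missing_files(read_manifest("gdc_manifest_20190610_105445.txt"))` in the download directory would return the UUIDs that still need to be fetched. An empty list suggests every file in the manifest is present.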
The files I downloaded here are FPKM gene expression files.
The files have no header row, and each sample is stored in a separate file. In practice, we usually need to merge them into a single table with genes as rows and samples as columns. Data downloaded this way does not include the sample information for each file; that information can be obtained by downloading the Sample Sheet.
The Sample Sheet records the sample that each file belongs to, and combining it with the downloaded files yields the gene expression table.
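The merge described above can be sketched in Python using only the standard library. This is an illustrative sketch under several assumptions: the Sample Sheet is tab-separated with columns named `File ID`, `File Name`, and `Sample ID`; each expression file is a headerless, tab-separated, two-column file (gene ID, value), optionally gzip-compressed; and files sit in `<download_dir>/<File ID>/<File Name>`. The function names are hypothetical. If two files share a Sample ID, the later one overwrites the earlier one in this sketch.

```python
import csv
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def read_expression(path):
    """Read a headerless two-column file (gene ID, value) into a dict."""
    with open_text(path) as handle:
        reader = csv.reader(handle, delimiter="\t")
        return {row[0]: row[1] for row in reader if len(row) >= 2}


def build_matrix(sample_sheet, download_dir, out_path):
    """Write a table with genes as rows and samples as columns."""
    with open(sample_sheet, newline="") as handle:
        sheet = list(csv.DictReader(handle, delimiter="\t"))

    # Map each sample ID to its {gene: value} dict.
    columns = {}
    for row in sheet:
        file_path = Path(download_dir) / row["File ID"] / row["File Name"]
        columns[row["Sample ID"]] = read_expression(file_path)

    # Union of all gene IDs, sorted for a stable row order.
    genes = sorted(set().union(*columns.values()))

    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["gene_id", *columns])
        for gene in genes:
            # Genes missing from a sample are written as "NA".
            writer.writerow([gene] + [columns[s].get(gene, "NA") for s in columns])
    return genes, list(columns)
```

Values are kept as strings so that nothing is rounded during the merge. Convert them to numbers in whatever tool performs the downstream analysis.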
2. UUID
The second way is to download a file directly by its UUID. Click a file name in the GDC online database to see its UUID.
To download a file by UUID, run
gdc-client download cadfedcc-2742-42ad-9fd3-733d01086392
The two ways are essentially the same: both use the UUID to identify a unique file and download it. Note that either way downloads only the original files; for downstream analysis you will need to convert them to a suitable format.
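Since both ways end in the same `gdc-client download` call, a script can wrap them in one place. The following Python sketch is an assumption-laden illustration, not an official interface: `build_download_command` and `run_download` are hypothetical helper names, and it relies on `gdc-client download` accepting `-m` for a manifest, `-d` for an output directory, and one or more UUIDs as positional arguments. Check `gdc-client download --help` for your installed version before relying on it.

```python
import shutil
import subprocess


def build_download_command(uuids=(), manifest=None, out_dir=None,
                           executable="gdc-client"):
    """Build the argument list for a gdc-client download call."""
    if not uuids and manifest is None:
        raise ValueError("provide at least one UUID or a manifest file")
    command = [executable, "download"]
    if manifest is not None:
        command += ["-m", str(manifest)]   # bulk download from a manifest
    if out_dir is not None:
        command += ["-d", str(out_dir)]    # assumed flag for the target directory
    command += list(uuids)                 # UUIDs are passed positionally
    return command


def run_download(command):
    """Run the command, failing early if the executable is not on PATH."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    subprocess.run(command, check=True)
```

For example, `run_download(build_download_command(uuids=["cadfedcc-2742-42ad-9fd3-733d01086392"]))` would run the same command as the UUID example above, and passing `manifest="gdc_manifest_20190610_105445.txt"` instead would reproduce the manifest example.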