2025-03-31 Update From: SLTechnology News & Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to use Python to crawl tens of millions of rows of fund data. It is shared here as a practical reference; read on to see how it is done.
Fund code
Knowing the fund codes is a prerequisite for crawling fund data. How do you get them? Open the official website and look around.
After some clicking, I found a page listing all fund codes and initially planned to crawl it page by page:
http://fund.eastmoney.com/allfund.html
But opening F12 (the browser developer tools) unexpectedly revealed the file fundcode_search.js, shown in the figure below.
Right-click it and open it in a new tab.
It turns out all the fund codes are in this one file, which makes things even easier.
import requests
import re
import json
import pandas as pd

url = 'http://fund.eastmoney.com/js/fundcode_search.js'
r = requests.get(url)
# Extract the JS array literal assigned in "var r = [...]"
a = re.findall('var r = (.*])', r.text)[0]
b = json.loads(a)
fundcode = pd.DataFrame(b, columns=['fundcode', 'fundsx', 'name', 'category', 'fundpy'])
fundcode = fundcode.loc[:, ['fundcode', 'name', 'category']]
fundcode.to_csv('fundcode_search.csv', index=False, encoding='utf-8-sig')
Running this yields 10,736 fund codes in total.
Crawling historical fund data
With more than ten thousand fund codes, crawling three years of net asset value (NAV) history for each gives, give or take, some ten million rows of data.
The method was shown earlier: open the fund website and use the browser's built-in network analysis tools, which make it easy to find the data interface.
In the request URL, callback names the JS callback function and can be deleted; fundCode is the fund code; pageIndex is the page number; pageSize is the number of records returned per page; startDate and endDate are the start and end dates, respectively.
fundCode = '001618'  # fund code
pageIndex = 1
startDate = '2018-02-22'  # start date
endDate = '2020-07-10'  # end date
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0',
    'Referer': 'http://fundf10.eastmoney.com/jjjz_{0}.html'.format(fundCode)
}
url = 'http://api.fund.eastmoney.com/f10/lsjz?fundCode={0}&pageIndex={1}&pageSize=5000&startDate={2}&endDate={3}&_=1555586870418'\
    .format(fundCode, pageIndex, startDate, endDate)
response = requests.get(url, headers=header)

Thank you for reading! This article on how to use Python to crawl tens of millions of rows of fund data ends here. I hope the content above is helpful to you; if you found the article useful, feel free to share it so more people can see it.
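The response body still needs to be parsed before the NAV rows can be used. As a minimal sketch (the field names Data, LSJZList, FSRQ for the date, and DWJZ for the unit NAV are assumptions inferred from this kind of endpoint, not confirmed by the article), one page could be flattened into a DataFrame like this:

```python
import pandas as pd

def parse_lsjz(payload):
    """Flatten one page of an lsjz-style JSON response into a DataFrame.

    Assumes the records live under payload['Data']['LSJZList']; these
    key names are hypothetical, so inspect response.json() first.
    """
    records = payload.get('Data', {}).get('LSJZList', []) or []
    return pd.DataFrame(records)

# Hypothetical sample payload, for illustration only
sample = {'Data': {'LSJZList': [
    {'FSRQ': '2020-07-10', 'DWJZ': '1.2345'},
    {'FSRQ': '2020-07-09', 'DWJZ': '1.2210'},
]}}
print(parse_lsjz(sample))
```

In a real crawl you would call parse_lsjz(response.json()) for each page, increment pageIndex until a page returns fewer than pageSize rows, and concatenate all pages (and all fund codes) with pd.concat before saving.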