This GitHub project helps you learn data science from scratch

2025-04-05 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article introduces a GitHub project that helps you learn data science from scratch. Many people are unsure how to get started with data science, so the editor consulted various materials and put together a simple, easy-to-follow overview of the project. We hope it answers your questions about using this GitHub project to learn data science from scratch.

Free resources to learn data science from scratch.

How to get started with data science?

This GitHub project provides free learning resources: a highly detailed learning roadmap, a number of free online courses, a large number of data science projects, and more than 100 free machine learning books.

The project collects resources scattered across the Internet and arranges them in a deliberate order, so that data science beginners no longer have to hunt for free, structured learning material. The project's authors say it will be updated continuously as new free resources appear.

Learning roadmap for data scientists

As the saying goes, "sharpening the axe does not delay the woodcutting." The project begins with a detailed data science roadmap that lists what data science learners need to master:

Fundamentals (basic matrices and algebra, etc.)

Statistics (probability theory, Bayes' theorem, etc.)

Programming

Machine learning

Text mining / natural language processing

Data visualization

Big data

Data acquisition

Data munging

Toolbox

Fundamentals a data scientist needs to master

Before you become a data scientist, you need a theoretical grounding in matrices: how matrix operations work and how matrix transformations behave. The project author also introduces a variety of data structures and related concepts, including hash functions and binary trees.

Taking the binary tree as an example, the project author explains what it is: "In computer science, a binary tree is a tree data structure in which each node has at most two child nodes, called the left and right child nodes."

Binary tree
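The definition quoted above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the project: the names Node, in_order and height are hypothetical and chosen here for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A binary tree node: a value plus at most two children."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order(node: Optional[Node]) -> List[int]:
    """Visit the left subtree, then the node itself, then the right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def height(node: Optional[Node]) -> int:
    """Number of nodes on the longest path from this node down to a leaf."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


# Example tree (values are arbitrary):
#         1
#        / \
#       2   3
#      / \
#     4   5
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(in_order(root))  # [4, 2, 5, 1, 3]
print(height(root))    # 3
```

Each node stores its children directly, so "at most two child nodes" is enforced by the shape of the class rather than by a runtime check.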

Beyond matrices, data science beginners also need to master more than ten other topics (some sections are still being updated), such as relational algebra, database fundamentals, the CAP theorem, and ETL.

Statistics

The project covers a lot of statistics, including data set selection, descriptive statistics, exploratory data analysis, histograms, probability theory, and Bayes' theorem.

Taking exploratory data analysis as an example, the project author approaches it from two angles, data visualization and data analysis, and introduces the development environment, the required libraries, how to install them, and the analysis methods needed to complete a whole data analysis task.
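As a small worked example of Bayes' theorem, one of the topics listed above, the sketch below computes a posterior probability for a diagnostic-test scenario. The scenario, the numbers, and the function name posterior are assumptions made for illustration and do not come from the project.

```python
def posterior(prior: float, true_positive_rate: float,
              false_positive_rate: float) -> float:
    """P(condition | positive test) by Bayes' theorem.

    P(C | +) = P(+ | C) * P(C) / P(+), where
    P(+) = P(+ | C) * P(C) + P(+ | not C) * P(not C).
    """
    evidence = (true_positive_rate * prior
                + false_positive_rate * (1.0 - prior))
    return true_positive_rate * prior / evidence


# Hypothetical numbers: 1% of people have the condition, the test detects
# it 95% of the time, and it wrongly flags 5% of healthy people.
p = posterior(prior=0.01, true_positive_rate=0.95, false_positive_rate=0.05)
print(round(p, 2))  # 0.16
```

Even with a fairly accurate test, the posterior stays low because the condition is rare, which is the usual reason this example is used to motivate the theorem.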

For data visualization, the project links to the Seaborn home page.

For data analysis, the project author introduces PCA (principal component analysis), a dimensionality reduction method, to help learners understand what it is and how to implement it in Python.
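To give a feel for what PCA computes, here is a minimal pure-Python sketch that finds only the first principal component by running power iteration on the sample covariance matrix. It is not the project's implementation (which is not reproduced in this article), and in practice a library routine would normally be used instead; the function name and the toy data are assumptions.

```python
import math
from typing import List, Sequence


def first_principal_component(data: Sequence[Sequence[float]],
                              iterations: int = 200) -> List[float]:
    """Unit vector along the direction of greatest variance in `data`."""
    n, d = len(data), len(data[0])
    # Center each column on its mean.
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize. The all-ones start is an assumption; it fails only if
    # it happens to be orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance at all
            break
        v = [x / norm for x in w]
    return v


# Toy data lying exactly on the line y = 2x, so the first principal
# component should point along (1, 2) after normalization.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component = first_principal_component(points)
print(component)
```

Projecting the centered data onto this vector reduces each two-dimensional point to a single coordinate, which is the dimensionality reduction step.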

Programming

Becoming a data scientist is inseparable from programming. The project covers Python and R, including R setup and RStudio. Taking R setup and RStudio as an example, the project author describes installation on two platforms, Linux and Windows. Much of this part, however, is still to be added.

Content to be added

Machine learning

The project also lists the machine learning knowledge needed for data science, around 30 items in all, including numerical variables, categorical variables, supervised learning, unsupervised learning, training and test sets, classifiers, overfitting, bias and variance, and support vector machines.

Taking the support vector machine as an example, the project author first describes what it is used for (both classification and regression tasks) and then explains how it works in simple, clear language. The author also lists further material on support vector machines, which readers can explore on their own through the links.
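Of the items listed above, the split into training and test sets is easy to show in code. The following is a minimal sketch using only the Python standard library; the function name, the default test fraction, and the seed are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(rows: Sequence[T], test_fraction: float = 0.25,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of `rows` and hold out a fraction as the test set."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = train_test_split(range(100))
print(len(train), len(test))  # 75 25
```

A model is fitted on the training rows only and then scored on the held-out test rows; a large gap between the two scores is the usual symptom of overfitting.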

Support vector machine
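To make the classification use case concrete, the sketch below trains a linear support vector machine by sub-gradient descent on the regularized hinge loss. It is a simplified illustration rather than the project's material: it handles only two linearly separable classes, does not cover regression or kernels, and the hyperparameters and toy data are assumptions.

```python
from typing import List, Sequence, Tuple


def train_linear_svm(points: Sequence[Sequence[float]],
                     labels: Sequence[int],
                     lam: float = 0.01, lr: float = 0.1,
                     epochs: int = 100) -> Tuple[List[float], float]:
    """Minimize lam/2 * |w|^2 + mean(max(0, 1 - y * (w.x + b)))."""
    n, dim = len(points), len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1:  # point is inside the margin or misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy data: labels must be +1 or -1.
X = [(2, 2), (3, 3), (2, 3), (3, 2), (-2, -2), (-3, -3), (-2, -3), (-3, -2)]
y = [1, 1, 1, 1, -1, -1, -1, -1]
w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])
```

The hinge loss only penalizes points whose margin is below 1, so the points closest to the boundary (the support vectors) drive where the separating hyperplane ends up.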

Beyond the topics above, the project author has also organized material on text mining, data visualization, and other subjects, which are not covered here.

Free online courses

Building on a project by GitHub user Developer-Y, this project collects a large number of free online courses covering artificial intelligence, machine learning, and robotics. The machine learning part is subdivided into introductory machine learning, data mining, data science, probabilistic graphical models, deep learning, reinforcement learning, advanced machine learning, machine-learning-based natural language processing and computer vision, time series analysis, probability and statistics, linear algebra, and more.

Partial screenshot of the list of free online courses provided by the project.

The course list includes the familiar Andrew Ng machine learning courses, as well as rich course resources from Carnegie Mellon University, Stanford University, ETH Zurich, the University of California, Berkeley, Microsoft, and other institutions.

Open-source artificial intelligence projects

The project also lists a large number of open-source artificial intelligence projects covering machine learning, deep learning, natural language processing, computer vision, and other areas.

This resource comes from a GitHub repository created by Ashish Patel, an AI researcher and data scientist, and currently contains 71 entries. Each entry links to the corresponding project and code.

Partial screenshot of the project list.

The current list includes projects in object detection, chatbots, GUIs, unsupervised learning, regression analysis, sentiment analysis, recommendation systems, data science, NLP, computer vision, and more. The list will be updated continuously.

100+ free machine learning books

The author of the project compiled a list of machine learning books from Insane. The list was updated in January 2021 and includes the familiar "Deep Learning" (known to Chinese readers as the "Flower Book"), as well as books on graph algorithms, natural language processing, data mining, GANs, Python, and so on.

That concludes this look at the GitHub project that helps you learn data science from scratch. Theory is best learned alongside practice, so go and try it!
