2025-03-30 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
In this issue, the editor shares some suggestions for beginners learning the R language. The article is rich in content and approaches the topic from a professional point of view. I hope you take something away from it.
Recently, a lot of people have been asking me how to get started with R.
To be honest, this question would be answered more convincingly by a veteran who has worked in data science for many years, has a lot of project experience, and has written a lot of code.
I am not the ideal person to answer it, for the following reasons:
First, I have been learning for only a short time. I officially began in September 2016, only about 10 months ago, so my learning has been something of a crash course.
Second, I had no programming background before learning R (not counting the SQL I studied in college and the little VBA I know).
Third, I am a liberal arts student with no strong background in mathematics or statistics.
From a different perspective, though, I think I am qualified to answer, for the following reasons:
As a liberal arts programming novice who is hopeless at math, I better understand the confusion and struggle of beginners with no programming background and weak math when they first encounter R.
Short as my learning period was, its results have been tested in practice and recognized by many readers.
I learned R on a university campus rather than under workplace pressure, so I was free of utilitarian haste. I also learned on demand, picking the parts with the highest return on effort and the most practical value, so my experience of laying a foundation and pacing the learning may be a more useful reference.
Now I will begin my answer (biaoyanqiao!)
On your motivation for learning:
First of all, before you set out to learn R, ask yourself what your purpose is.
Is it required for a university course? Are you building up data analysis skills in advance? Are you topping up your skills reluctantly to cope with pressure at work? Did you act on a whim, seeing how big data is booming and wanting to join in? Or is it purely out of interest, to realize some ideas of your own?
Different goals mean different amounts of time and effort, different learning paths, different modules to study, and different results.
Be sure to set your goals and learn according to your needs. Otherwise you will be lost before you even get started: besides the built-in base packages, there are more than ten thousand extension packages on CRAN, and if you count the small packages that individuals develop and host on GitHub, there may be tens of thousands. That is more than you can count on your fingers and enough to keep you learning for a lifetime.
On understanding the R language:
Here I want to share my own views on R rather than repeat the concept explanations, development history, and feature overviews that are already everywhere.
R was developed by statisticians, and its mission from birth has been statistical computing and data visualization. These are the two core directions of the language.
Of the two, learning statistical computing rests on classroom theory and professional background. To be honest, R only provides a platform for implementation; it does not change existing theories and models or create new ones.
Most of the formulas and model algorithms used in statistical computing are already wrapped in extension packages. After loading a package, you only need to call the right function and set its parameters. These functions are no different in kind from Excel functions, and there is nothing to fear.
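As a minimal sketch of this "call a function, set its parameters" workflow, the example below fits a linear regression with lm() from R's built-in stats package on the built-in mtcars data set. The choice of model and variables (mpg explained by wt, then by wt and hp) is only an illustrative assumption, not something taken from the author's own work.

```r
# Fit a linear model: fuel economy (mpg) as a function of car weight (wt).
# lm() ships with base R (stats package); mtcars is a built-in data set.
fit <- lm(mpg ~ wt, data = mtcars)

# Extract the estimated coefficients (intercept and slope).
coefs <- coef(fit)
print(coefs)

# Changing a parameter works much like changing an argument of an Excel
# function: here the formula gains a second predictor, horsepower (hp).
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
print(summary(fit2)$r.squared)
```

Calling the function is the easy part. Knowing what the coefficients and R-squared mean, and whether a linear model is appropriate at all, still comes from statistics rather than from R.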
As for parameter tuning, model testing, and optimization, the knowledge they rely on comes mostly from classroom study and professional background and has little to do with the R software. Even when you need to write your own algorithm, you are only tuning and computing according to mature theory on top of existing functions. Apart from basic syntax, that depends on professional background and industry experience, not on the software.
In the final analysis, what matters in statistical learning is theoretical background and business experience. What you actually need from R is only the functions in built-in and extension packages plus basic syntax.
Compare learning SPSS: someone who does not know statistics will find it hard to learn SPSS well even if they know every module and menu (like me). Likewise, someone who does not understand statistics and mathematics will find it hard to learn the statistical computing side of R even if they are familiar with its basic syntax and the functions of many extension packages (again, like me).
The data visualization side of R is slightly different. Data visualization does not rely much on mathematics (apart from plots that render algorithms, few need heavy computation), but it depends heavily on a graphics grammar and on concepts of visual presentation.
R has four sets of graphics grammar (basic graphics, advanced graphics, lattice, and ggplot2). Unfortunately, I know only one of them: ggplot2.
My motive for learning something is simple. Doing a thing well means not just getting it right but making the result pleasing to the eye, impressive to others, and, most importantly, something the boss will praise (don't you want a promotion and a raise?).
That means I need one grammar that is elegant, efficient, compatible, and close to the concepts of visualization. My time and energy do not allow me to spread my effort evenly across four lines of work, and my multitasking ability is poor.
If I were too greedy, I might end up understanding every grammar but being mediocre at each, which I cannot tolerate. ggplot2 is the perfect choice for me.
Even so, is it enough to be proficient in the grammar or to know it by heart? Of course not. Knowing it by heart does not guarantee that you can easily realize your ideas, because data visualization depends on more than the tool and its syntax. It depends even more on understanding the data source, understanding visualization, and mastering design (how to choose colors, lay out the figure, pair fonts, and so on).
If learning software follows the 80/20 rule, I think learning R does too.
Eighty percent of your energy should go to the statistical theory and business knowledge that lie outside the software (and can be self-taught). The part that is implemented in R should not be studied in isolation, although your basic R syntax must of course be solid. Once the theory is clear, many things fall into place and are easily solved. This is especially true of statistics and data analysis.
Data visualization requires you to master the basics firmly (basic syntax and data cleaning skills) and to be able to use one graphics grammar (I recommend ggplot2). After that, do not focus too much on the tools and the code; build up visualization literacy and improve your design taste instead. For data visualization I would adjust the 80/20 rule to 50/50, because ggplot2 is not easy to master.
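To give a feel for what a graphics grammar looks like, here is a minimal sketch of ggplot2's layered style. It assumes the ggplot2 package is installed; the data set (the built-in mtcars), the variables, and the labels are illustrative choices only.

```r
# Assumption: ggplot2 is installed, e.g. via install.packages("ggplot2").
library(ggplot2)

# A plot is built by adding layers on top of a data + aesthetic mapping.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 3) +                                     # layer 1: one point per car
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # layer 2: linear trend per group
  labs(
    title  = "Fuel economy versus weight",
    x      = "Weight (1000 lbs)",
    y      = "Miles per gallon",
    colour = "Cylinders"
  ) +
  theme_minimal()                                            # overall look of the figure

# Printing the object draws it; at the console, typing p is enough.
if (interactive()) print(p)
```

The code decides what is drawn. Choices such as colors, fonts, and layout are where the design half of the 50/50 split comes in.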
Flexible things such as design, aesthetics, and creativity are hard to acquire from one or two books or courses. They are absorbed from life and accumulated bit by bit every day. Of course, if you consciously cultivate them through courses and books, it will pay off over time.
On the R learning path:
General skills:
Fundamentals: data structures, variable types, data import and export, merging and appending data, long/wide reshaping, indexing, slicing, aggregation.
Advanced: regular expressions, concatenating and splitting, matching and replacing, missing-value imputation, de-duplication and sorting, control flow (loops and conditionals).
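The general skills listed above can be sketched in a few lines of base R. Everything in the example is hypothetical: the store and city names, the sales figures, and the choice of mean imputation are made up purely for illustration.

```r
# Hypothetical toy data: monthly sales per store (names and values are made up).
sales <- data.frame(
  store = c("S1", "S2", "S3", "S3"),
  jan   = c(10, 20, NA, NA),
  feb   = c(12, 18, 30, 30),
  stringsAsFactors = FALSE
)
stores <- data.frame(
  store = c("S1", "S2", "S3"),
  city  = c("Beijing-North", "Shanghai-East", "Beijing-South"),
  stringsAsFactors = FALSE
)

# Data export and import through a temporary CSV file.
path <- tempfile(fileext = ".csv")
write.csv(sales, path, row.names = FALSE)
sales <- read.csv(path, stringsAsFactors = FALSE)

# De-duplication, then missing-value imputation with the column mean.
sales <- unique(sales)
sales$jan[is.na(sales$jan)] <- mean(sales$jan, na.rm = TRUE)

# Merging: attach the city to each sales row (similar to a SQL join).
merged <- merge(sales, stores, by = "store")

# Regular expression: drop everything from the hyphen onward in the city name.
merged$city <- sub("-.*$", "", merged$city)

# Wide-to-long reshaping with base R's reshape().
long <- reshape(
  merged,
  direction = "long",
  varying   = c("jan", "feb"),
  v.names   = "amount",
  timevar   = "month",
  times     = c("jan", "feb"),
  idvar     = "store"
)

# Aggregation: total amount per city.
totals <- aggregate(amount ~ city, data = long, FUN = sum)

# Sorting, indexing, and slicing: largest total first, then take the top city.
totals <- totals[order(-totals$amount), ]
top_city <- as.character(totals$city[1])

# Control flow: a loop with a condition.
for (i in seq_len(nrow(totals))) {
  label <- if (totals$amount[i] > 50) "high" else "low"
  cat(as.character(totals$city[i]), label, "\n")
}
```

Packages such as dplyr and tidyr offer alternatives for most of these steps; base R is used here only so the sketch runs without installing anything.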
Specialized skills:
Statistics and analysis: go and study the textbooks
Data visualization: ggplot2 grammar + design + aesthetics + creativity
Once your general skills are more or less in place, there is no need to keep circling in that small area. Find your own data and work through a case, which is the best way to learn. Most progress comes from solving the unknown problems that come up in a case.
I have not read many R books, so I will not recommend any here. If you really want to learn, do you need someone else to recommend books? Just look at the book lists on Douban.
Use a search engine to solve problems as they come up. For almost any problem you run into, someone before you has already posted a detailed answer online.
Answers to some beginners' questions:
1. Does R require a deep programming background? Mine is basically zero. Am I unsuited to learn it?
My programming background was also zero before I learned R. People who already have one are called programmers, and programmers pick up R without blinking.
2. Do you need a strong mathematical background to learn R? I am a liberal arts student and terrible at math. Does that mean I cannot learn it?
Let's shake hands: I am a liberal arts student just like you, and I am extremely poor at math. If you plan to move into data mining, you may need to catch up on advanced mathematics, linear algebra, probability and statistics, and algorithms. If you only use R for business analysis and visualization, your math is probably already above the threshold.
3. I have been learning R for a long time, about a year. I have read a lot of books and can understand all the basic syntax and ggplot2, but I just cannot write code myself, and I get flustered when I have to draw a plot.
Have you been reading textbooks the whole time, with even your practice code copied from them? How many real cases have you worked through, how much real business data have you analyzed, and how many problems beyond the textbook have you solved in practice? Practicing more beats reading more.
4. Can I have your chart template?
Sorry, I do not provide templates, only code and case data. (It is very hard to make a template in R.)
5. Hello, are you there? Could you draw a chart for me?
...... (I would like to reply that I am not here.)
6. Could you recommend an introductory book?
In fact, I do not think getting started with R requires an introductory book, because I did not learn from a book at that stage. But since you ask, here is some advice. If you are a student with plenty of time, I recommend "R in Action", but read it selectively rather than cover to cover. Read the first few chapters on data structures, variable types, and data cleaning closely (skip the conceptual and purely explanatory content), read the statistics chapters in the middle as needed, and read the final part on document and report output carefully (you may not need the LaTeX and HTML material).
For data visualization I recommend two books: "R Graphics Cookbook" and "ggplot2: Elegant Graphics for Data Analysis". The first is my first choice because it is more approachable. The second, though it is the masterpiece of ggplot2's author, has loftier and more specialized aims and is not very friendly to beginners.
If you are a working professional, I do not particularly recommend textbooks, because reading them eats into the time you have for practice. Use fragments of spare time to take online courses instead; introductory ones are free. Tianshan Intelligent Community is a good platform with free courses. I teach there too, and it has many free courses on big data. You can also find many good courses on NetEase Cloud Classroom. Get started with free courses, then take advantage of your access to front-line business data and use R in your daily work as much as possible. You will progress faster.
7. Rubik's Cube, how did you learn R? Can you share some experience?
I am a little embarrassed to answer this, but I will be shameless and say it anyway. I belong to the practical school. I usually practice by scraping data straight from the Internet with crawlers. During my internship, whenever R could do the job I firmly refused to use Excel, forcing myself to find use cases for R. I then kept publishing through my WeChat official account, Zhihu column, and personal blog, which forced me to keep practicing.
Of course, a solid foundation is very important. Otherwise you will have to keep your notes at hand every time you write code and look things up here and there, which wastes time.
Make good use of the help documentation. R has a powerful help system: you can go straight to the documentation home page of an extension package, or type ? followed by a function name to look up the detailed usage and parameter settings of that function.
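A few ways into R's help system, as a minimal sketch. The functions looked up here (median, sd) and the search phrase are arbitrary examples. The help-page calls are wrapped in interactive() because they open a viewer or pager and are meant for the console.

```r
if (interactive()) {
  # Open the help page for a function; the two forms are equivalent.
  ?median
  help("median")

  # Search the installed documentation by keyword when you do not know
  # the function name (the shorthand is ??"linear model").
  help.search("linear model")

  # Open the documentation index of a package (here: the built-in stats package).
  help(package = "stats")

  # Run the examples that ship with a help page.
  example(median)
}

# Inspect a function's parameters and defaults without opening the help page.
print(args(sd))
param_names <- names(formals(sd))
print(param_names)
```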
Practice regularly: set aside a fixed time every day, depending on your own circumstances.
Finally, one piece of advice: a programming language for data analysis only proves itself when used on real data analysis, just as a tiger has the wildness of the king of beasts only in the forest. So once you feel you have mastered the basics, the way to advance is to put it to use in real work.
These are the suggestions the editor wanted to share with beginners of the R language. If you have similar doubts, you may find the analysis above helpful. If you want to know more, you are welcome to follow the industry information channel.