How can a complete beginner start learning big data?

Updated 2025-02-28. From: SLTechnology News&Howtos (shulou), Internet Technology


Shulou(Shulou.com)06/02 Report--

Big data is a very fashionable technical term at the moment. It has naturally given rise to occupations centered on big data processing, whose practitioners influence the business decisions of enterprises through data mining and analysis.

Abroad, these people are called data scientists. The title was coined in 2008 by D.J. Patil and Jeff Hammerbacher, who later became heads of the data science teams at LinkedIn and Facebook, respectively. The position of data scientist has also begun to create value in traditional American industries such as telecommunications, retail, finance, manufacturing, logistics, health care and education.

In China, however, the application of big data is still in its infancy, and the talent market is not yet mature. "You can hardly expect a generalist to complete every link in the whole chain. More companies will recruit people who complement their existing teams, based on the resources they already have and the gaps they need to fill," Wang Yuyao, director of China Business Analysis and Strategy at LinkedIn, told China Business Weekly.

As a result, each company has different requirements for big data work: some emphasize database programming, some stress knowledge of applied mathematics and statistics, some require experience at consulting firms or investment banks, and some look for practical people who understand products and markets. Because of this, many companies give new titles and definitions to the people who work with big data, according to their own business types and division of labor. Data mining engineer, big data expert, data researcher and user analyst are all titles that often appear in domestic companies. We refer to them collectively as "big data engineers".

We live in an era of "technology explosion" and "shared open source". Advanced technologies iterate faster than at any other time in history, and they are no longer locked away: everyone can access and learn them. Lifelong learning has become something each of us has to face, and this is particularly evident in the field of big data and artificial intelligence. The endless stream of new technologies brings us convenience, but it also makes it hard to learn and choose efficiently. Against this background, learning big data requires sound logic and methods.

This article tries to help readers make good use of "shared open source" learning tools and channels, avoid the pitfalls that novices easily fall into, and master the target technology well at the lowest cost in time and money.

The article first analyzes the background of the times, then describes the current tiers of talent in the big data field, and finally offers a guide for big data and artificial intelligence practitioners to advance from novice to expert.


1. Background

"Technology explosion" and "shared open source" are the most distinctive labels of this era, and the author believes that each is both a cause and an effect of the other. First, in the era of "technology explosion", the best way for a research team at the forefront of technology to monetize its work is "shared open source". By contrast, before the Internet and the mobile Internet developed, information was very closed. As soon as a technological innovation appeared, it had to be patented and protected by the government, and the only ways to monetize it were to sell the patent or to organize production and turn it into products.

Today the Internet and the mobile Internet are very mature, and new information spreads to every corner of the world in a very short time at very low cost. Research teams at the forefront of technology only need to upload their work as early as possible to neutral sharing and open source sites such as arXiv or GitHub, and it is immediately protected by global public opinion. This protection is far stronger than the patent protection of any single country.

After that, as long as the new technology really has application value or academic value, capital giants, technology heavyweights and related organizations will line up to make generous offers. For a cutting-edge team, the point at which a technology can be monetized comes much earlier than the point at which it is turned into products.

Second, because of the "technology explosion", there are always new technologies waiting for frontier teams to study and discover. The best way for such a team to stay ahead is therefore not to hold on to its existing results, but to monetize them through "shared open source" as soon as possible and move on to new research.

Finally, "shared open source" has in turn greatly accelerated the "technology explosion". The rapid development of any technology needs a large pool of talent to support it. In earlier periods of history, the main channel for sharing knowledge and training talent was the school. That channel took only one form, and it often had a considerable threshold that shut a good number of aspiring young people out.

Today, the fastest channel for spreading knowledge is the Internet. Thanks to "shared open source", the world's best educational resources and most advanced academic and technical ideas suddenly have no threshold and are open to everyone equally. The result is that as soon as a field of technology has a major breakthrough and broad application prospects (big data and artificial intelligence, for example), the corresponding tiers of talent catch up automatically within a short time.

Research teams at the academic forefront of big data only need to keep pushing into new territory. The tiers of talent behind them will automatically take on supporting work such as demonstrating new technology and turning it into products. This keeps the field and its related industries developing healthily, which in turn channels more resources to the cutting-edge teams at the top of the pyramid and supports their work.

Big data (very large collections of data) is a very fashionable term in modern society and can be seen as an advanced stage of data science. Data science does not have an independent disciplinary system of its own: statistics, machine learning, data mining, databases, distributed computing, cloud computing, information visualization and other fields all supply technologies or methods for dealing with data. Big data has given rise to professions whose practitioners influence the business decisions of enterprises by analyzing and mining data.

As noted above, the application of big data in China is in its infancy and the talent market is not very mature, so companies differ in what they require: database programming, applied mathematics and statistics, consulting experience, or an understanding of products and markets. They also use a variety of titles, such as data mining engineer, big data expert, data researcher and user analyst, which we collectively call "big data engineer".

For some large companies, people with a master's degree are the preferred choice, but Xue Guirong, a researcher at Alibaba Group, stressed that education is not the most important factor. People who have experience in large-scale data processing and the curiosity to hunt for treasure in the ocean of data are better suited to this job.

Find the right × ×, and learn through rough-and-tumble practice

This is no longer an era in which you can fall off a cliff, find a secret manual, train alone for a few years and then conquer the world. Whether it is a master like Hinton (the father of the BP, or backpropagation, algorithm, who later sought to overturn it) or a rising star like He Kaiming (a remarkable high achiever who publishes best papers as easily as most people publish papers), each explores together with colleagues inside a very reliable team. You do not need many good × ×; one or two truly reliable ones are enough. The importance of teammates is explained later.

The last piece of advice for the novice foundation-building stage is not to stay in it for too long, and not to wait until you feel "ready" before starting to practice. That feeling of not being ready often reflects a novice's lack of self-confidence, and you will never be "ready" unless you push yourself further. In general, students leaning toward AI areas such as computer vision or natural language processing can pick a matching practical project and move on to the next stage after completing Andrew Ng's (Wu Enda's) "Deep Learning" course, and students leaning toward data mining can do so after completing the "Machine Learning" course.

So what kind of practice should we choose? The best case is to have an expert lead a team through a real project, but such opportunities are rare, so we will not discuss them here. The common approach is to take part in a big data competition. Alibaba's Tianchi in China and Kaggle abroad are both open big data competition platforms, where organizations publish real projects for everyone to practice on and compete in. At this point you may still have a big question in mind: "Even if I have finished the basic courses, can I really start practicing with no one to guide me?" The following sections explain, step by step, how to practice through this kind of rough-and-tumble.

For the first time in the world

Find the highest baseline

The "baseline" here can be understood as the results your predecessors achieved when they did the same kind of work, which you can use as a reference. In the situation above, if an expert leads the team, that expert's earlier practical experience becomes the baseline for all team members. Is there a more general solution for readers who have no access to an expert? The answer is yes. Suppose you do not know how to start on a kind of problem, for example you have just finished the "Deep Learning" course but do not know how to approach a natural language processing project. The best way is to use domestic thesis search platforms such as Wanfang and CNKI to look up dissertations from domestic universities in the relevant field. Most of these dissertations are in Chinese and introduce a lot of basic background knowledge, which is exactly what we need.

A popular self-help saying goes: "It is not the alarm clock but the dream that wakes me up every day." It may sound inspirational, but for 90% of people it is nonsense. Looking back, what wakes us up every day is often "wages docked for being late to work" or "the boss's death stare after arriving late at the lab". This is reality. It sounds cruel, but we can put it to good use. When it comes to leveling up and making progress on projects, the biggest forces that keep us moving are often "the disdain of teammates when we fail to finish a task before the deadline" and "the sense of achievement that comes from a quick win".

To do this well, besides the reasonable division of tasks mentioned in the previous section, the most important thing is to have a reliable team leader who keeps pushing and drives the team forward decisively at every milestone. Finally, according to Maslow's hierarchy of needs, dreams belong to the "self-actualization needs" at the top of the model. If a person can be woken by a "dream", that person's other needs must already be well met. So I sincerely hope that one day you can be woken in the morning by your own dream.

How to become a big data engineer

Because big data talent is currently in short supply, it is hard for companies to recruit the right people, who should be highly educated and preferably have experience in large-scale data processing. Many enterprises therefore look for talent internally.

In August this year, Alibaba held a big data competition. It took data from the Tmall platform, stripped out sensitive information, and handed it to more than 7,000 teams competing on its cloud computing platform, with separate internal and external tracks. "In this way we motivate internal staff and also discover outside talent, so that big data engineers emerge from all industries."

Yan Liping suggests that this position is worth trying for people who have long worked in database management, data mining or programming, including traditional quantitative analysts and Hadoop engineers, as well as any manager who has to make judgments and decisions based on data, such as operations managers in certain fields. Talented people in any field can become big data engineers as long as they learn to use data.

Salary and benefits

As the "giant pandas" of IT occupations, big data engineers enjoy pay and benefits that can be said to be at the top of the field. According to Yan Liping's observation, 10% of domestic recruitment in IT, communications and industry is related to big data, and the proportion is still rising. Yan Liping said, "The big data era arrived very suddenly. It is developing aggressively in China, but talent is very limited, and right now demand completely outstrips supply." In the United States, big data engineers earn an average of $175,000 a year. In China's top Internet companies, it is understood that big data engineers may earn 20% to 30% more than people in other positions at the same level, and they are highly valued by their employers.

Career development path

Because big data talent is scarce, the data departments of most companies have a flat structure, roughly divided into three levels: data analyst, senior researcher and department director. Large companies may split teams by application domain, while in small companies one person has to take on several roles. Some Internet companies that place special emphasis on big data strategy set up an additional top position, such as Alibaba's chief data officer. "Most people in this position will develop in the direction of research and become important data strategy talent," Yan Liping said. On the other hand, big data engineers understand the business and the products as well as business staff do, so they can also move to product or marketing departments, or even rise to senior management.
