

What should you learn for big data? A big data learning roadmap

2025-02-05 Update From: SLTechnology News&Howtos



Recently, many readers have asked me how to learn big data skills: how to get started, how to do big data analysis, what technologies data science requires, what the application prospects of big data are, and so on. Because big data technology covers so much ground and its applications are so broad, the key technologies differ considerably between fields and directions, so it is hard to explain clearly in a few words. In this article, starting from data science and the key technology systems of big data, I will discuss what the core technologies of big data are, how to learn them, and how to avoid common misunderstandings in studying big data, for your reference.

1. The goal of big data applications is pervasive intelligence

To learn big data well, we must first be clear about the goal of big data applications. I once said that big data is like a panacea, or like Baidu's "Box Computing" of a few years ago: the box can be filled with anything. Why? Because the frame of big data is so large. Its ultimate goal is to use a series of information technologies to achieve deep human insight and intelligent decision-making under conditions of massive data, and ultimately to move toward pervasive human-machine intelligence fusion. This is not only an extension of traditional information management but also the core technological driving force behind the development of intelligent management in human society. Through big data applications we face the past by discovering patterns in data and summarizing the known, and we face the future by mining trends in data and predicting the unknown, thereby improving people's ability to understand things and make decisions, and finally realizing pervasive intelligence across society. Whether it is business intelligence, machine intelligence, artificial intelligence, or intelligent customer service, intelligent question answering, intelligent recommendation, intelligent healthcare, intelligent transportation, and other related technologies and systems, their essence is an evolution toward this goal. With the rapid development of cloud computing platforms and big data technology, obtaining the technology and support for big data infrastructure is becoming easier and easier; at the same time, the comprehensive data collection capabilities of the mobile Internet and the Internet of Things have objectively driven the accumulation and explosion of big data.

In short, big data is a big box, and everything fits inside. Collecting big data sources cannot be separated from the Internet of Things and smartphones; storing massive data cannot scale without cloud computing; analyzing big data with traditional machine learning and data mining techniques is relatively slow and must be extended with parallel and distributed computing; automated feature engineering on big data relies on deep learning; interactive presentation of big data relies on visualization; and the analysis techniques for specific fields and multimodal data are extremely broad: financial big data, traffic big data, medical big data, security big data, telecom big data, e-commerce big data, social big data, text big data, image big data, video big data… The scope is simply too wide. So first of all we must identify the core goal of big data applications; only once this is clear can we grasp the common key technologies according to the characteristics of different industries and study with a clear purpose.

Fig. 1 Relationship map of foreign big data enterprises: traditional information technology companies are also moving toward intelligence, competing with and supporting the emerging big data companies.

2. Data science and its key technology system, viewed from the big data landscape

Having defined the goal of big data applications, let us look at data science (Data Science). Data science can be understood as a cross-disciplinary collection of scientific methods, technologies, and systems for acquiring knowledge from data; its goal is to extract valuable information from data. It combines theories and techniques from many fields, including applied mathematics, statistics, pattern recognition, machine learning, artificial intelligence, deep learning, data visualization, data mining, data warehousing, and high-performance computing. Jim Gray, a Turing Award winner, described data science as the "fourth paradigm" of science (after the empirical, theoretical, and computational paradigms) and asserted that, because of the influence of information technology and the proliferation of data, scientific problems in every field will in the future be driven by data.


Fig. 2 A typical data science process: raw data acquisition, data preprocessing and cleaning, exploratory data analysis, computational modeling, data visualization and reporting, and data products and decision support.
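To make the process in Fig. 2 concrete, here is a minimal end-to-end sketch in Python. It assumes pandas and scikit-learn are installed and uses scikit-learn's bundled iris dataset in place of real raw data, walking through acquisition, cleaning, exploration, modeling, and reporting:

```python
# A minimal, self-contained sketch of the data science process in Fig. 2,
# using scikit-learn's bundled iris data so no external files are assumed.
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# 1. Raw data acquisition (here: a bundled sample dataset).
df = load_iris(as_frame=True).frame

# 2. Preprocessing and cleaning: drop duplicates, confirm no missing values.
df = df.drop_duplicates()
assert df.isna().sum().sum() == 0

# 3. Exploratory analysis: basic descriptive statistics per class.
print(df.groupby("target").mean())

# 4. Computational modeling: fit a simple classifier.
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="target"), df["target"], test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 5. Reporting / decision support: evaluate and summarize.
print(classification_report(y_test, model.predict(X_test)))
```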

Traditional information technology mostly computed on structured, small-scale data. In the big data era, data has become larger, data sources are heterogeneous, and intelligent prediction and analysis are needed, so the core technologies are inseparable from machine learning, data mining, artificial intelligence, and so on. In addition, distributed storage and management of massive data and parallelization of machine learning algorithms must be considered. The large-scale growth of data has therefore objectively driven the prosperity of the DT (Data Technology) ecosystem, which spans big data collection, data preprocessing, distributed storage, NoSQL databases, multi-mode computing (batch, online, real-time streaming, in-memory), multimodal computing (image, text, video, audio), data warehousing, data mining, machine learning, artificial intelligence, deep learning, parallel computing, visualization, and other technical areas at different levels. Clearly, the big data landscape under this new DT ecosystem is very complex, and of course there is an element of bubble in it. The landscape will keep changing: just like applications in the PC era, websites on the Internet, and apps on the mobile Internet, the technologies and products of the big data era are undergoing survival of the fittest. Let us look at the 2017 edition of the big data landscape:

Fig. 3 Big data industry maps at home and abroad (the foreign landscape and the Zhongguancun version), covering data, technology, applications, enterprises, etc.

The landscape above basically covers the foreign big data technology and industry chain (the domestic Zhongguancun version still lists too few big data technologies and enterprises, mostly traditional IT companies now collecting data). It displays related technologies, products, and enterprises across big data sources, open-source technology frameworks, big data infrastructure, core computation and mining analysis, and industry applications. Along the big data industry chain, from data sources > open-source technology > infrastructure > analysis and computation > industry application to product delivery, each link and its subdivisions involve a large number of data analysis technologies. Whether you are learning the technology or developing products, it is well worth analyzing and understanding the big data industry landscape. We will not elaborate on the details of the map here. From a learning perspective, we will focus on the core technologies contained in the DT (Data Technology) ecosystem and the logical relationships among the various technical fields. This is the first question to clarify in learning big data:

(1) Machine learning (machine learning): Let us start with machine learning. Why first? Because machine learning is the pivotal link in big data processing: it connects upward to deep learning and artificial intelligence, and downward to data mining and statistical learning. Machine learning is an interdisciplinary subject of computer science and statistics. Its core goal is to give computers the ability to automatically classify and predict from data, through a series of steps such as function mapping, training on data, optimization, and model evaluation. The field includes many kinds of intelligent processing algorithms, such as classification, clustering, regression, and association analysis, and each category contains many algorithms, such as SVM, neural networks, logistic regression, decision trees, EM, HMM, Bayesian networks, random forests, LDA, and so on. Whatever "top ten" or "top twenty" algorithm rankings you find online are only the tip of the iceberg, and with the breakthrough development of deep learning, machine learning algorithms keep expanding rapidly. In a word, making big data processing intelligent has machine learning at its core. Machine learning is the core technology behind deep learning, data mining, business intelligence, artificial intelligence, big data, and related concepts. Machine learning applied to image processing and recognition is machine vision; machine learning applied to modeling human language is natural language processing; machine vision and natural language processing are in turn core technologies supporting artificial intelligence; and machine learning applied to general data analysis is data mining. Deep learning (deep learning) is a popular sub-field of machine learning, a family of variants of the original artificial neural network algorithms. Because of its remarkable performance on image and speech recognition under big data conditions, it is expected to become the key technology for breakthroughs in artificial intelligence, and major research institutions and IT giants have paid great attention to it.
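As a small illustration of the algorithm families named above (not any one canonical implementation), the following sketch assumes scikit-learn and NumPy are installed and shows classification, clustering, and regression sharing the same fit/predict pattern on toy data:

```python
# Minimal sketch: three of the algorithm families above, via scikit-learn.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

X, y = make_blobs(n_samples=300, centers=3, random_state=42)

# Classification (supervised): an SVM trained on labeled data.
clf = SVC(kernel="rbf").fit(X, y)
print("SVM training accuracy:", clf.score(X, y))

# Clustering (unsupervised): k-means finds groups without labels.
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("cluster sizes:", np.bincount(km.labels_))

# Regression (supervised, continuous target): fit y = f(x) on toy data.
x = np.linspace(0, 10, 50).reshape(-1, 1)
reg = LinearRegression().fit(x, 3 * x.ravel() + np.random.randn(50))
print("recovered slope (should be near 3):", reg.coef_[0])
```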

(2) Data mining (data mining): Data mining can be said to be a superset of machine learning, a relatively broad concept. Like mining ore, the task is to dig gems out of a large quantity of rock, that is, to extract valuable, regular information from massive data. The core techniques of data mining come from the machine learning field; deep learning, for instance, is a popular family of machine learning algorithms that can of course also be used for data mining. The traditional business intelligence (BI) field also includes data mining: OLAP multidimensional analysis can do mining analysis, and even basic statistical analysis in Excel can mine. The key is whether your technique can really dig out useful information, and whether that information can guide decisions. The term data mining appeared earlier than machine learning and has a wide range of applications. Data mining and machine learning are the core technologies of big data analysis, supporting each other and providing the models and algorithms for big data processing. Models and algorithms are the key to big data processing, while learned models are rarely used in exploratory interactive analysis, visual analysis, or data collection, storage, and management.
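In the spirit of the point that even basic statistics can "mine", here is a tiny pandas sketch on a hypothetical transactions table: simple aggregation and a co-occurrence table already surface patterns that could guide a decision.

```python
# Even basic descriptive statistics can surface patterns; a pandas sketch
# on a made-up transactions table (all names and values are hypothetical).
import pandas as pd

orders = pd.DataFrame({
    "customer": ["a", "a", "b", "b", "c", "c", "c"],
    "category": ["food", "drink", "food", "food", "drink", "food", "drink"],
    "amount":   [12.0, 3.5, 8.0, 15.0, 4.0, 9.5, 5.0],
})

# Summarization: total and average spend per customer.
print(orders.groupby("customer")["amount"].agg(["sum", "mean"]))

# A crude co-occurrence view: which customers buy which categories.
print(pd.crosstab(orders["customer"], orders["category"]))
```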

(3) Artificial intelligence (artificial intelligence): AI and big data promote each other. On the one hand, the development of the basic theory and technology of AI provides richer models and algorithms for big data machine learning and data mining, such as the deep learning techniques and methods of recent years (reinforcement learning, adversarial learning, etc.). On the other hand, big data provides new power and fuel for the development of AI: once data reaches scale, traditional machine learning algorithms face challenges and must be parallelized, accelerated, and improved. The ultimate goal of AI is to make machines intelligent and human-like, able to do the same work as humans. The human brain handles all kinds of complex problems on a few tens of watts of power; although machines greatly exceed humans in raw computing power, it is hard for them to match human understanding, perceptual inference, memory, imagination, and psychology, so truly anthropomorphic artificial intelligence remains technically difficult. A considerable part of the techniques and algorithms of artificial intelligence and machine learning overlap. Deep learning has achieved great success in computer vision and the game of Go: Google's systems learned to recognize cats automatically, and Google's AlphaGo beat top professional human Go players. However, at this stage deep learning cannot achieve brain-like computing; at best it reaches a bionic level, and uniquely human capacities such as emotion, memory, cognition, and experience are difficult for machines to attain in the short term.

(4) Other basic technologies for big data processing: as shown in figure 4, the foundations of big data include computer science topics such as programming, cloud computing, distributed computing, and system architecture design; the theoretical basis of machine learning, such as algorithms, data structures, probability theory, algebra, matrix analysis, statistical learning, and feature engineering; business analysis and understanding, such as domain knowledge management, product design, and visualization; and data management, such as data collection, data preprocessing, databases, data warehouses, information retrieval, multidimensional analysis, and distributed storage. These theories and technologies serve the basic management of big data, machine learning, and application decision-making.

Fig. 4 The technical dimensions of data science

The figure above shows the five technical dimensions of data science, which basically cover its key supporting technology system. The related technologies are organized along data management, the basic theory and technology of computer science, data analysis, business understanding and decision-making, and design; among these, the basic theory of computer science and data analysis methods are the most important to learn. At this stage, most big data products and services sit in the data management segment; connecting the analysis segment with the business decision segment is the key breakthrough for the future development of data science and the big data industry.

In addition, the Art & Design dimension in the figure lists only communication and visualization, which in fact understates it. This "art" (Art) also illustrates the essential difference between data science and traditional information technology: the core competence of data science is to form ideas from problems and then turn those ideas into learning models, and this ability is an art. Without such design art, making computers intelligent is not so easy. Why elevate it to an art? Because experience tells us that there is no standard answer for turning a real problem into a model: there is more than one model to choose from, multiple technical routes, many possible evaluation metrics, and even many optimization methods. The essence of machine learning is dealing with this art: given raw data, constraints, and a problem description, there is no standard answer, and every choice of scheme is a hypothesis; one must have the ability to verify or falsify these hypotheses with rigorous testing and experimentation. At this level, all future scientific problems, as well as business and government decision-making problems, will be data science problems, and machine learning is the core of data science.
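One way to make this "hypothesis and experiment" view concrete: treat each candidate model as a hypothesis and let cross-validation be the experiment. A minimal sketch, assuming scikit-learn and using one of its bundled datasets:

```python
# Compare several candidate models on the same data with cross-validation;
# each model is a hypothesis, and the 5-fold experiment supports or
# falsifies it.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

candidates = {
    "logistic":      LogisticRegression(max_iter=5000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:14s} mean accuracy = {scores.mean():.3f}")
```

None of the three is "the" answer; which hypothesis survives depends on the data, the metric, and the constraints, which is exactly the art the paragraph describes.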

3. The blind men and the elephant: how to build a complete knowledge structure and analytical ability for big data

From digitalization, informatization, and networking to the coming intelligent era, cutting-edge information technologies such as the mobile Internet, the Internet of Things, cloud computing, big data, and artificial intelligence have become popular one after another, representing the general trend of information technology development. What is the technical scope of big data, and what are the logical relationships within it? Presumably many people feel their way based only on the field they are familiar with, like blind men touching an elephant (figure 5). In fact, calling it "blind men touching the elephant" is not derogatory; after all, mastering any field starts from touching it blindly. Big data and data science are quite nebulous concepts: the analysis goals and the technologies adopted are all-encompassing. It is just like programming: front-end and back-end, B/S and C/S architectures, embedded systems, enterprise applications and apps, dozens of development languages, and the technologies needed in different directions differ greatly.

Fig. 5 Big data: the blind men and the elephant

Therefore, how to go from points to a complete picture, building a full knowledge structure and analytical capability in the big data field, is very important; specific technologies and languages are just tools. A big data knowledge structure requires both deep knowledge of the fundamentals and a broad overall view of applications, centered on the most essential, optimal, and critical core technologies and knowledge that the development of the big data industry needs. With a reasonable knowledge structure and a scientific big data mindset, you can improve your analytical skill in practice. This goal is ambitious but achievable. First, understand the big data industry chain; then clarify the big data technology stack, that is, the relevant technology systems; and finally set learning goals and an application direction: which industry the data comes from, whether the focus is storage or machine learning, what the data scale is, and whether the data type is text, images, web pages, or commercial databases. The technologies used in each direction differ considerably, so you must find your point of interest and your breakthrough point for learning.

Fig. 6 A reference map of the big data technology stack and learning route

The big data technology stack and learning roadmap above can be regarded as a general outline of big data study. It is highly professional, well worth deep study by beginners, and a richer supplement to the data science technology system described earlier. For example, the foundational part includes linear algebra, relational algebra, database fundamentals, the CAP theorem, OLAP, multidimensional data models, ETL for data preprocessing, and so on. In short, learning big data is not like cooking: you cannot wait until all the ingredients are ready, because the technical system of this field and its application goals are so numerous and complicated that mastering most of its core theories and technologies is difficult even in ten or twenty years. Instead, combine your own interests or work needs, find one point and dive in, master the relevant technology of that point, and deeply understand the process, application, and evaluation of its analysis. After getting one point thoroughly, extend from point to surface, draw analogies, and gradually cover all the fields of big data, thereby building a complete knowledge structure and technical capability system. This is the best way to learn big data.

4. How should you learn big data: characteristics of data science and common learning misunderstandings

(1) Big data learning should be business-driven, not technology-driven: the core competence of data science is solving problems. The core goal of big data is data-driven intelligence, that is, solving specific problems, whether in scientific research, business decision-making, or government management. So before learning, clarify and understand the problem, the so-called problem orientation and goal orientation, and only then study and select the appropriate technology to apply; this keeps the learning targeted. Statements like "big data analysis means Hadoop and Spark" are not rigorous. Different business areas need support from theories, technologies, and tools in different directions. For example, text and web pages require natural language modeling; data streams change over time; images, audio, and video mostly need mixed spatio-temporal modeling. On the processing side, collection needs support from crawlers, inverted indexing, and preprocessing; storage needs distributed cloud storage and cloud computing resource management; computation needs models for classification, prediction, and description; and applications need visualization, knowledge bases, decision evaluation, and so on. It is the business that determines the technology, not the technology that dictates the business. This is the first misunderstanding to avoid in learning big data.
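As a toy illustration of one of the collection-side techniques mentioned above, the sketch below builds an inverted index in plain Python, mapping each term to the documents that contain it (the three documents are made up):

```python
# A toy inverted index: the core text-retrieval structure behind search
# over crawled documents. Plain Python, no external dependencies.
from collections import defaultdict

docs = {
    1: "big data needs distributed storage",
    2: "machine learning drives big data analysis",
    3: "storage and computing scale out together",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.lower().split():
        index[token].add(doc_id)   # map each term to the docs containing it

# Query: documents containing both "big" and "data".
print(index["big"] & index["data"])   # -> {1, 2}
```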

(2) Big data should make good use of open source, not reinvent the wheel: the technical gene of data science lies in open source. Open source in frontier IT fields has become an irreversible trend: Android's open source made smartphones affordable and brought us into the mobile Internet era; open-source intelligent hardware will lead us into the Internet of Things era; the big data open-source ecosystem represented by Hadoop and Spark accelerated the de-IOE process (away from IBM, Oracle, EMC), forcing traditional IT giants to embrace open source; the open-sourced deep learning stacks of Google and the OpenAI alliance (represented by TensorFlow, Torch, Caffe, and others) are accelerating the development of artificial intelligence; and R and Python, the standard languages of data science, were born and flourished because of open source. Nokia declined partly because it failed to grasp the open-source trend. Why open source? Because IT development has become industrialized and componentized: the basic technology stacks and tool libraries in every field are now very mature, and the next stage is about combining them quickly, building with existing blocks, and shipping fast. Whether it is Linux, Android, or TensorFlow, their basic component libraries largely reuse existing open-source libraries combined with new technical methods, rarely rebuilding wheels from scratch. Moreover, open source as a crowdsourced development model embodies collective-intelligence programming: no single company can accumulate the development intelligence of engineers worldwide, but a star open-source project on GitHub can. So make good use of open source and collective-intelligence programming instead of repeatedly reinventing the wheel. This is the second misunderstanding to avoid.

(3) Big data study should start from a point, not grasp at everything: data science must balance fragmentation and systematization. From the earlier analysis of the big data technology system, we can see that its depth and breadth are incomparable to traditional information technology. Our energy is limited, and it is difficult to master the theory and technology of many big data fields in a short time, so data science must handle the relationship between fragmentation and systematization. What is the fragmentation? It covers both business and technology. Big data is not only Google, Amazon, BAT, and other Internet companies; every industry and enterprise that pays attention to data shows traces of it: real-time sensor data on a production line, sensor data from vehicles, operating-status data of high-speed rail equipment, monitoring data from transportation departments, case data from medical institutions, and the huge data volumes of government departments. Big data's business scenarios and analysis goals are fragmented and differ greatly from one another. On the technical level, big data technology is likewise a catch-all: every technology serving data analysis and decision-making belongs to this category, so the technical system is also fragmented. How to grasp the systematic side? Big data applications in different fields share common key technologies, and their system architectures have much in common, such as high scalability (horizontal scaling of data and vertical scaling of business), high fault tolerance, support for multi-source heterogeneous environments, and compatibility and integration with existing systems; every big data system should consider these issues. Grasping both fragmented learning and systematic design is inseparable from the two misunderstandings above. The suggestion is to start from application: begin with the needs of one actual application field, dig into one technical point first, and after building some foundation, gradually expand horizontally to understand the systematic technology.

(4) Big data requires the courage to practice, not armchair strategy: data science or data engineering? Big data produces value only when combined with applications in specific fields. Whether you are doing data science or data engineering is a key question to clarify when studying big data. Doing data science for academic papers is fine, but bringing big data to the ground is another matter: transforming the achievements of data science into deployable data engineering is very difficult, which is one reason many enterprises question the value of data science. Beyond the fact that this transformation takes time, practitioners themselves also need to examine and reflect: how do industry and government regulators adopt research intelligence, and how does data analysis transform into realized value? Data science researchers and enterprise big data engineers both have to think about these key questions. At present, the key problems data engineering must solve follow the chain Data > Knowledge > Service: collect and manage the data, mine and analyze it to acquire knowledge, and turn knowledge rules into decision support and continuous services. Only by solving these three problems can a big data application count as landed. From a learning perspective, this Data > Knowledge > Service chain is the overall goal of big data study, especially the practical ability of data science, where practice matters more than theory. From model, features, error, experiment, and testing to application, every step should consider whether the model can solve the actual problem and whether the model is interpretable; have the courage to try and iterate, because the model and the software package themselves are not omnipotent. Big data applications must attend to robustness and effectiveness: a greenhouse model is useless, and a model that only looks fine on its training and test sets is not enough. For big data to get out of the laboratory and land as engineering: first, do not work behind closed doors assuming that once the model converges all is well; second, leave the laboratory and engage fully with the actual decision problems of the industry; third, both correlation and causality matter, for a model that cannot describe causality gives little help in solving practical problems; fourth, pay attention to iterating and productionizing the model, continuously upgrading and optimizing it to handle incremental learning on new data and dynamic model adjustment. Therefore, in studying big data you must be clear whether you are doing data science or data engineering, what technical abilities each requires, and which stage you are at now; otherwise, pursuing technology for its own sake makes it hard to learn and use big data well.
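For the fourth point, incremental learning on new data, here is a minimal sketch using scikit-learn's SGDClassifier, one of the estimators that supports out-of-core updates via partial_fit; the arriving data batches are simulated:

```python
# Incremental learning sketch: update a model batch by batch instead of
# retraining from scratch, the way a deployed model absorbs new data.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
# loss="log_loss" on scikit-learn >= 1.1 (older versions use loss="log").
model = SGDClassifier(loss="log_loss", random_state=0)
classes = np.array([0, 1])

for batch in range(10):                       # pretend each batch arrives later
    X = rng.randn(200, 5)
    y = (X[:, 0] + 0.1 * rng.randn(200) > 0).astype(int)
    model.partial_fit(X, y, classes=classes)  # update without full retraining
    print(f"batch {batch}: accuracy on this batch = {model.score(X, y):.2f}")
```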

(5) The three stages of big data study: the technical route at each stage has its own emphasis; grasp the principal contradiction. In implementing big data applications, technology and cost considerations make it impossible to solve every problem in a short time. Big data applications have their own rules and characteristics: the analysis goal must match the data scale; the choice of analysis technique depends on the data structure and the data sources; data integration must cover a sufficiently complete business context, with no key data links missing; and so on. According to application goals, big data learning can be divided into three stages:

1) The big data infrastructure stage: the focus here is storing, managing, and making the data usable, while considering how the big data platform interconnects and combines with the existing business systems. In a word, do a good job of global data integration and solve the data silo problem. To build the big data infrastructure, you must clearly define the selection and use of the core components at each layer (collection, storage, analysis), build a stable big data cluster or choose a private cloud service cluster, run it in parallel with the production systems, and ensure that the historical and real-time data to be analyzed are collected and flow continuously into the big data system. Key technologies to learn at this stage include collection crawlers, data interfaces, distributed storage, ETL data preprocessing, data integration, database and data warehouse management, cloud computing and resource scheduling management, and so on.
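A minimal ETL sketch for this stage, with SQLite standing in for the real warehouse and pandas doing the transformation (the table, file, and column names are hypothetical):

```python
# Extract raw records, clean them, and load them into a store.
import io
import sqlite3
import pandas as pd

raw_csv = io.StringIO(
    "id,city,temp_c\n"
    "1,beijing,21.5\n2,shanghai,\n3,beijing,19.0\n4,shanghai,25.0\n")

# Extract: read the raw feed (an in-memory CSV here, for self-containment).
df = pd.read_csv(raw_csv)

# Transform: fill missing readings with the per-city mean, normalize names.
df["temp_c"] = df.groupby("city")["temp_c"].transform(lambda s: s.fillna(s.mean()))
df["city"] = df["city"].str.title()

# Load: write to the store and read back to verify.
with sqlite3.connect("demo.db") as conn:
    df.to_sql("readings", conn, if_exists="replace", index=False)
    print(pd.read_sql("SELECT * FROM readings", conn))
```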

2) The big data descriptive-analysis stage: this stage focuses on basic descriptive statistics and exploratory visual analysis of the data, offline or online, so that under massive storage the system can support interactive query, aggregation, statistics, and visualization. If a BI system is being built, traditional BI techniques must be integrated for OLAP, KPI, report, chart, and dashboard analysis, along with preliminary descriptive data mining. This basic analysis stage tests both the quality of data integration and the stability of distributed storage management under massive data, and it should be able to replace or integrate the various reports of traditional BI. Key technologies to learn at this stage include visualization, exploratory interactive analysis, multidimensional analysis, and the query design of all kinds of basic reports and charts.
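A small sketch of the descriptive stage, assuming pandas and matplotlib: an OLAP-style pivot summary plus a basic chart of the kind a BI dashboard would render (the sales table is made up):

```python
# Descriptive-stage sketch: summary statistics and a simple chart.
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.DataFrame({
    "month":   ["Jan", "Jan", "Feb", "Feb", "Mar", "Mar"],
    "region":  ["north", "south"] * 3,
    "revenue": [100, 80, 120, 90, 150, 95],
})

# OLAP-style summary: revenue by month and region (a tiny pivot cube).
cube = sales.pivot_table(values="revenue", index="month",
                         columns="region", aggfunc="sum")
print(cube)

# Basic visualization, the kind a BI dashboard would render.
cube.plot(kind="bar", title="Revenue by month and region")
plt.tight_layout()
plt.savefig("revenue.png")   # or plt.show() in an interactive session
```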

3) The big data advanced predictive-analysis and production-deployment stage: once the preliminary descriptive analysis results are reasonable and meet expectations, and the distributed data management and descriptive analysis are stable and mature, further intelligent analysis can be pursued as needed. Advanced predictive mining is carried out with machine learning models suited to massive data, such as deep learning; the mining models and data quality are optimized through step-by-step iteration to form stable, reliable, and scalable predictive models; and the analysis results are applied within the relevant business services for decision support, verification, deployment, evaluation, and feedback. Key technologies at this stage include machine learning modeling, decision support, visualization, model deployment, and operations and maintenance.
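A minimal sketch of the predictive stage's train-evaluate-deploy loop, assuming scikit-learn and joblib, with model persistence standing in for deployment:

```python
# Train, evaluate, persist, and reload a model: the minimal loop behind
# "verification, deployment, evaluation, and feedback".
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

# "Deployment": persist the trained model, reload it in the serving process.
joblib.dump(model, "model.joblib")
served = joblib.load("model.joblib")
print("prediction for one new case:", served.predict(X_test[:1]))
```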

Through the technical study of the stages above, several key issues deserve attention. First, attach importance to visualization and business decision-making: big data analysis results serve decisions, and the quality of visualization plays a decisive role in how big data informs decisions. Second, ask yourself whether Hadoop, Spark, and the like are really necessary; technology selection and the technical route must be weighed against the whole big data stack. Third, modeling is central, and model selection and evaluation are critical. In classrooms and laboratories, most model evaluation is static, seldom considering running speed, real-time behavior, or incremental processing, which leads to complex, bloated models with very complicated feature variables; meanwhile, the ensemble methods that dominate Kaggle competitions, such as XGBoost-style boosting and random forest models, are rarely covered in data mining and machine learning textbooks, so do not trust books alone but draw fully on the practical experience of industry. Fourth, the choice of development languages: Java is a must for the basic framework systems, Python is a must for application-level machine learning and data analysis libraries, and C++ is needed to go deep into the internals of the various frameworks and learning libraries. Fifth, productionizing the model: real data must be turned into model inputs through a feature pipeline, and minimizing the performance gap between online and offline models is a key problem to solve.
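On the fifth point, one common way to narrow the online/offline gap is to bundle feature transformation and model into a single scikit-learn Pipeline, so serving replays exactly the preprocessing learned offline. A minimal sketch:

```python
# Bundle preprocessing and model so the same transformation runs offline
# (training) and online (serving).
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),            # fitted on training data only
    ("clf", LogisticRegression(max_iter=5000)),
])
pipe.fit(X_train, y_train)

# At serving time, raw features go in; the pipeline replays the scaling
# parameters learned offline before predicting.
print("accuracy:", pipe.score(X_test, y_test))
```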

(6) Other notes: Kaggle, crowdsourcing, and training. Crowdsourcing is an Internet-based form of innovative production organization: companies use the network to distribute work, gather ideas, and solve problems by involving the most suitable people. Wikipedia and the IT resource community GitHub are typical crowdsourcing platforms, and crowdsourcing plus open source has greatly accelerated the IT industry. As the top crowdsourcing platform in data science, Kaggle's influence goes further still (which is why it was recently acquired by Google). Companies and researchers publish data on Kaggle, and data analysts compete to produce the best models. The essence of this crowdsourcing model is collective-intelligence programming: almost every predictive modeling problem admits many strategies, and no analyst can find the best solution from the very start; Kaggle's goal is to solve this through crowdsourcing, making data science a collective-intelligence movement. So for learning big data, spending time on Kaggle is highly recommended; it is a good platform for gaining experience. As for big data training courses: if you do not know much about the basic theory and technology, a course can get you started; once you have a foundation, you must rely on yourself, practicing more and solving real problems.

5. Conclusion and outlook

To summarize: big data is not a silver bullet. Its rise simply illustrates a phenomenon: with the rapid development of science and technology, data occupies an ever larger share of human life and decision-making. Faced with the breadth and depth of the big data technology stack and toolset, learning big data analysis skills can feel like a blind man touching an elephant. Nevertheless, learning and applying technologies are interlinked, and all roads lead to Rome. The key is to find the right entry point, combine theory with practice, keep an overall view and engineering thinking, and grasp the principal contradictions in complex system design and development and in the key technical systems. Become familiar with the basic theory and algorithms of big data, cut in from an application, go from point to surface, draw analogies, and expand horizontally, thereby building a complete big data knowledge structure and core technical competence; the learning outcome will be much better.

In addition, technological development follows the law of quantitative change leading to qualitative change. Artificial intelligence + Internet of Things + big data + cloud computing are developing four-in-one (they emerged in sequence, but the substantial technical breakthroughs have all come in recent years). The infrastructure and core architecture of the future intelligent era will be built on these four levels, and the trend of social evolution is also obvious: agricultural era > industrial era > Internet era > intelligent era. In this four-in-one intelligent technology chain, the Internet of Things focuses on data collection, cloud computing on infrastructure, big data technology sits at the core, and artificial intelligence is the development goal. Learning big data technology therefore also requires a comprehensive understanding of all four.

Finally, a splash of cold water on big data's prospects. Future demand for big data jobs will not be as large as the media advertise, and day-to-day big data work will not be as cool as in American blockbusters. Do not always stare at BAT; big data development in China is still in its infancy. In short, technology is just technology: real knowledge comes from practice, and landing solutions to real problems is what matters, and it took Palantir ten years to sharpen its sword. Still, in the big data era everyone should know something about data analysis; that is the most practical takeaway. Don't know how to program? Then start with Python. If, in the artificial intelligence era, both grandmothers and primary school students can program, it will surely be in Python. :)



