
The Outlook for Big Data: Four Trends You Need to Know

2025-01-17 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/03 Report--

Since 2012, almost everyone (at least in the Internet industry) has been talking about big data, to the point where people seemed embarrassed to join a conversation if their work had nothing to do with it. From 2016 onward, big data systems gradually entered the deployment stage in enterprises. The hype faded and was followed by a period of vigorous application development, and landmark IPOs of companies built on mature technologies kept appearing in capital markets at home and abroad. In the blink of an eye, the bubble that big data went through a few years ago has clearly shifted to artificial intelligence. Over the past year, the explosion of public awareness around AI has been even more intense than the one big data saw in its day. More recently, the hot spot has moved to blockchain, which has caused some anxiety in the industry.

However the technical hot spots shift, one thing is clear: as the industry settles down to deliver real deployments, the big data ecosystem is becoming more and more specialized. Today I will discuss some new changes and trends in the big data field.


I. Data governance and security

Among the current trends, this one deserves to be discussed first.

Over the years, data has been rapidly accumulated in enterprises. The Internet of things (IoT) continues to accelerate the generation of data.

For many enterprises, the big data solution is to build a data lake on top of open source technologies such as Apache Hadoop, that is, an enterprise-wide data management platform that stores all enterprise data in its native format. By providing a single data repository, the data lake is meant to eliminate information silos, and the whole organization can use it for business analysis, data mining and other applications. Once a data lake exists, people tend to treat it as an all-purpose collection point for big data: clickstream data, Internet of Things data, log data and so on are all required to enter the lake, while the fact that these data are difficult to process is ignored.

But unless you know exactly what is in the data lake and can reach the right data for analysis, the lake is of little use no matter how big it is. In the end, many organizations come to realize that their data lake is an underperforming resource: people do not know what is stored in it, how to access it, or how to gain insight from the data.

Making it easy to find the data you want while also managing permissions is not simple. Beyond the data lake itself, another theme of governance is giving everyone easy access to reliable data in a secure and auditable manner.
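The idea of easy but secure and auditable access can be made concrete with a small sketch. The following Python example is only an illustration under stated assumptions: the `DatasetEntry` and `DataCatalog` classes, the role names and the `/lake/raw/clickstream` path are all hypothetical and do not correspond to any particular governance product. It shows a catalog that records what is in the lake, lets users search it, checks a role before revealing a location, and logs every request.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DatasetEntry:
    """Catalog record for one dataset in the lake (hypothetical schema)."""
    name: str
    location: str        # where the raw files live (illustrative path)
    description: str
    allowed_roles: set   # roles permitted to read this dataset


@dataclass
class DataCatalog:
    """Toy catalog: searchable metadata, role checks, and an audit trail."""
    entries: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def register(self, entry):
        self.entries[entry.name] = entry

    def search(self, keyword):
        """Return names of datasets whose name or description mentions keyword."""
        kw = keyword.lower()
        return sorted(
            e.name for e in self.entries.values()
            if kw in e.name.lower() or kw in e.description.lower()
        )

    def request_access(self, user, role, dataset):
        """Return the dataset location if the role is allowed, else None.

        Every request, granted or denied, is appended to the audit log.
        """
        entry = self.entries.get(dataset)
        granted = entry is not None and role in entry.allowed_roles
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "dataset": dataset,
            "granted": granted,
        })
        return entry.location if granted else None


catalog = DataCatalog()
catalog.register(DatasetEntry(
    name="web_clickstream",
    location="/lake/raw/clickstream",
    description="Raw clickstream events from the public website",
    allowed_roles={"analyst", "data_engineer"},
))

print(catalog.search("click"))
print(catalog.request_access("alice", "analyst", "web_clickstream"))
print(catalog.request_access("bob", "intern", "web_clickstream"))
```

In this sketch denied requests are logged as well as granted ones, since an audit trail that only records successes cannot answer who tried to reach a dataset.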

Therefore, from the point of view of managing and making good use of corporate data assets, data governance should be treated like a top-level company policy: it needs to be taken seriously and implemented through corresponding strategies and processes. The ultimate goal is to improve data management, ensure data quality, and make data openly shareable across the organization. Data governance is also a combination of decision-making, roles and operational processes, with specific people held responsible for the data assets.

II. The development of data workbenches dedicated to collaboration

In most large enterprises, big data adoption starts with a handful of independent, piecemeal projects: a small Hadoop cluster here, an analysis tool there, a simple business model somewhere else, followed by the realization that new positions need to be created (data scientist, chief data officer), and so on.

Now business scenarios are becoming more numerous, heterogeneity is becoming more pronounced, and a wide variety of tools is in use across the enterprise. Within the organization, the centralized "data science department" is gradually giving way to a more decentralized structure, because a centralized department increasingly becomes a bottleneck and is more likely to waste resources.

Data scientists, data engineers and data analysts are increasingly embedded in different business units. The need for a platform is therefore clear: it has to make everything work together, because success with big data depends on building a pipeline made up of technology, people and processes.

As a result, new kinds of collaboration platforms (such as Jupyter) are emerging quickly and are leading the development of the field known as DataOps (by analogy with DevOps).

III. Automation of data science

Data scientists are still in high demand in the market. Yet such people are hard to find, and even Fortune 1000 companies struggle to recruit enough of them. In some organizations, the data science department is turning from an enabler into a bottleneck.

At the same time, the popularity of AI and the spread of self-service tools have made it easier for data engineers, and even data analysts with limited data science skills, to perform basic operations that until recently were the domain of data scientists. With the help of automation tools, a large share of big data work, especially the simple and tedious tasks, will be handled by data engineers and data analysts without involving data scientists with deep technical skills. Even so, data scientists do not need to be too worried for now.

In the foreseeable future, self-service tools and automated models will augment data scientists rather than replace them. They will free data scientists to focus on tasks that require judgment, creativity, social skills, or vertical industry knowledge, so that they can better live up to the title of scientist.

IV. The rise of the big data administrator

The big data administrator (BDA) is named by analogy with the database administrator (DBA). The two abbreviations differ only in the order of their letters, but the roles are very different. A clear trend is that enterprises will need a new job role, the big data administrator. The DBA is already familiar to everyone, but it differs considerably from the data administrator of the big data era.

The data administrator sits between the data consumer and the data engineer. To be successful, the data administrator must, in addition to maintaining the big data system, understand the meaning of the data and master some of the technologies applied to it.

The data administrator needs to know what types of data analysis are performed across the organization, which datasets are well suited to that work, and how to transform the data from its original state into the shape and format that data users need. Data administrators should use systems such as self-service data platforms to speed up the end-to-end process by which data consumers reach the underlying datasets, without having to make countless copies of the data.
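One way to picture "transforming data into the form users need without making copies" is a lazy view over the raw records. The sketch below is a minimal Python illustration, not a description of any specific self-service platform: the field names (`user_id`, `page`, `duration_ms`), the sample rows and the helper functions `clean_view` and `total_time_per_page` are all assumptions made for the example. The generator cleans each raw row on the fly, so consumers read a tidy shape while only the original data is stored.

```python
# Hypothetical raw clickstream rows as they might land in the lake:
# inconsistent casing, stray whitespace, numbers stored as strings,
# and some rows with no user attached.
RAW_CLICKS = [
    {"user_id": 1, "page": " Home ", "duration_ms": "1200"},
    {"user_id": None, "page": "home", "duration_ms": "300"},
    {"user_id": 2, "page": "pricing", "duration_ms": "800"},
    {"user_id": 3, "page": "HOME", "duration_ms": "500"},
]


def clean_view(raw_rows):
    """Lazily yield cleaned rows; the raw data is never copied or modified."""
    for row in raw_rows:
        if row.get("user_id") is None:
            continue  # skip events that cannot be attributed to a user
        yield {
            "user_id": row["user_id"],
            "page": row["page"].strip().lower(),
            "ms": int(row["duration_ms"]),
        }


def total_time_per_page(rows):
    """Example consumer: total time spent per page, in milliseconds."""
    totals = {}
    for row in rows:
        totals[row["page"]] = totals.get(row["page"], 0) + row["ms"]
    return totals


print(total_time_per_page(clean_view(RAW_CLICKS)))
```

Each consumer can call `clean_view` again to get a fresh pass over the same raw rows, which is the point of the design: the cleaning logic is shared, and no second cleaned dataset has to be stored and kept in sync.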

Conclusion

These four trends are new requirements raised by the practical development of data science. Whoever achieves good results in these areas will take a leading position in the big data era.


© 2024 shulou.com SLNews company. All rights reserved.
