New trends in data applications | The 8th Tencent Cloud Techo TVP Developer Summit concludes successfully

2025-03-29 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > IT Information >

Shulou(Shulou.com)11/24 Report--

In the data-driven era, making effective use of big data has become an important topic across industries. As cloud computing, artificial intelligence and other emerging technologies develop rapidly, data technology is also evolving and showing new trends and characteristics. How should enterprises keep up with these changes in order to gain insight into the value behind their data?

On August 19, 2023, the eighth Techo TVP Developer Summit, themed "Data-Driven Intelligence, Smart Enabling the Future" and hosted by Tencent Cloud TVP, came to a successful conclusion. The summit brought together six leaders and experts from the data technology industry to share ideas and practices around the latest progress, trends and innovative applications of data technology, and to offer inspiration to developers.

Opening remarks from the host

▲ Lu Dongming, interview program founder and Tencent Cloud TVP, gives the opening speech.

The summit was hosted by Lu Dongming, a Tencent Cloud TVP. Known as "Uncle Ming", he is also the founder and host of the interview program "talking about threesome", which focuses on big data and AI. Uncle Ming opened the summit by adapting the classic line from A Tale of Two Cities by the British novelist Charles Dickens: this is the best of times and the worst of times for data technology in China. Database and big data technology in China has never been more prosperous, but the dazzling array of technologies and constantly changing products also brings unprecedented challenges to developers and enterprises. Faced with so many database technologies, enterprises and developers need to think clearly about how to choose among them and how to combine them for different goals.

Four Trends of the Data Platform under the Democratization of AI

▲ Shi Kai, founder of the Lean Data Methodology and Tencent Cloud TVP, gives his talk.

Mr. Shi Kai, author and founder of the Lean Data Methodology and a Tencent Cloud TVP, gave a talk titled "Four Trends of the Data Platform under the Democratization of AI".

Mr. Shi pointed out that we are rapidly moving from the era of "data democratization" to the era of "AI democratization". In the era of data democratization, everyone can be empowered by data and get real-time feedback and insight by using and analyzing it. With the emergence of ChatGPT, the era of AI democratization is arriving quickly. Artificial intelligence will benefit everyone, but it also poses a major challenge to the enterprise data platform: the contradiction between unlimited growth in demand for data applications and limited, fragmented data productivity. At the same time, large models give data practitioners new room for imagination. Everyone hopes that AI can assist data production and data analysis and shorten the path from data source to value.

To this end, Mr. Shi put forward four major trends for the future development of the data platform:

Making the value of the data platform explicit. As enterprises invest more and more in data, more of them want data to generate value directly for the business. This brings a new challenge to the data platform: how to tie its value directly to business value.

Modernizing the data platform architecture. The data platform will develop toward converged analytics, ease of use, trustworthiness and decentralization. New data architecture practices represented by Data Fabric and Data Mesh are gradually on the rise.

AIGC empowering the data value chain. The data platform will integrate new AIGC technology to eliminate waste in the enterprise data production value chain.

AIGC capabilities as a platform and a service. AIGC will become a capability that enterprises can call on and tune. Most enterprises do not need to build their own large model; they should instead focus on how to integrate large-model capabilities to deepen the mining of data value.

Finally, Mr. Shi Kai summed up his view in one sentence: "Digital transformation begins with problems, originates in the business, is built on data, lands in scenarios, is measured by value, and ends with the organization." No matter how the data platform evolves, the core proposition for enterprises remains the same: grow from the business, nourish it with data, land in concrete scenarios, deliver business value, and so advance the digitization of the whole enterprise.

Cost and Ease of Use: The Cloud-Native Serverless Evolution of Tencent Cloud ES

▲ Gao Pan, R&D Director of Tencent Cloud ES, gives his talk.

Moving from technological imagination back to the reality of enterprise data governance: as growth shifts from extensive to intensive, reducing cost and improving data efficiency has become the focus of enterprises and developers. Mr. Gao Pan, R&D Director of Tencent Cloud ES, gave a talk titled "Cost and Ease of Use: The Cloud-Native Serverless Evolution of Tencent Cloud ES".

Mr. Gao Pan said that Tencent Cloud ES is a fully managed, cloud-native ELKB service on Tencent Cloud. It is based on open-source ES, with in-house kernel modifications targeting cost, performance, stability and scalability. According to the talk, cost is reduced by 50% to 80%, query performance is improved by 3%, write performance is improved by 2 times, the SLA reaches 99.99%, and scalability is improved by more than 10 times.

Tencent Cloud Big Data ES serves a wide range of scenarios, of which logging is the most common and the largest. Because logs have relatively low value density but usually come in large volumes, enterprises focus on cost control in logging scenarios. Tencent Cloud Big Data ES has therefore made many cost-oriented optimizations, using link integration, autonomous index management, storage-compute separation and other techniques to greatly reduce access costs, operation and maintenance costs, and resource costs.

With the cost problem addressed, Gao Pan also hopes to keep improving ease of use and to provide users with a one-stop big data analysis service. Although various vendors offer PaaS ES services built on the underlying ES kernel, users still have to spend effort on operation and maintenance work such as cluster creation, data link configuration and index life cycle management. He and his team therefore built on Tencent Cloud's PaaS version of ES and introduced a Serverless ES service in which users do not need to care about clusters or nodes and which is free of operation and maintenance. Cost has been optimized further as well: unlike the node-based billing of the PaaS service, Serverless is billed according to the number of writes and queries, so users truly pay on demand. In terms of stability, cluster and index management is handled by a unified back-end operations team to avoid failures caused by improper use. The service is also 100% compatible with the open-source ES API and with the ELK ecosystem.
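To make the billing difference concrete, here is a minimal sketch that compares node-based and usage-based charges for the same month. The function names and every price in it are hypothetical and are not Tencent Cloud's actual rates; only the shape of the two billing models follows the talk.

```python
def node_based_cost(nodes: int, hours: float, price_per_node_hour: float) -> float:
    """PaaS-style billing: provisioned nodes are charged whether or not they are busy."""
    return nodes * hours * price_per_node_hour


def usage_based_cost(writes: int, queries: int,
                     price_per_1k_writes: float, price_per_1k_queries: float) -> float:
    """Serverless-style billing: only write and query requests are charged."""
    return (writes / 1000) * price_per_1k_writes + (queries / 1000) * price_per_1k_queries


# Hypothetical low-traffic log workload over a 30-day month (720 hours).
paas_bill = node_based_cost(nodes=3, hours=720, price_per_node_hour=0.5)
serverless_bill = usage_based_cost(writes=2_000_000, queries=50_000,
                                   price_per_1k_writes=0.01, price_per_1k_queries=0.05)
print(f"node-based: {paas_bill:.2f}, usage-based: {serverless_bill:.2f}")
```

With made-up numbers like these, the usage-based bill is far lower because the cluster is mostly idle. For a cluster that is busy around the clock the comparison can go the other way, so the result depends entirely on the workload and the real price list.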

Building an Enterprise Real-Time Data Warehouse: TCHouse-D, a Stable and Reliable Data Warehouse Based on Apache Doris

▲ Li de, R&D technical lead of Tencent Cloud Doris, gives his talk.

Apache Doris is a well-known open-source data warehouse project of the ASF, and it has won the favor of many developers because it is easy to use and flexible. Li de, R&D technical lead of Tencent Cloud Doris and a PMC member of the Apache Doris community, gave a talk titled "Building an Enterprise Real-Time Data Warehouse: A Stable and Reliable Data Warehouse Based on Apache Doris".

At the beginning of his talk, Mr. Li de briefly introduced Tencent Cloud Big Data TCHouse-D, a real-time data warehouse service built by Tencent Cloud on Apache Doris. It is 100% compatible with Apache Doris and with the MySQL protocol, and supports business scenarios such as concurrent queries, multidimensional analysis, interactive analysis, real-time data warehousing and federated lakehouse analysis. It is easy to use, flexible, safe and reliable, ecosystem-compatible and fully featured. Mr. Li then shared what he considers the requirements of an enterprise-level, real-time, updatable data warehouse:

Real-time writes with insert, delete, update and query: data can be written in real time or in batches and is visible immediately, and the warehouse can connect to real-time systems such as Flink and Kafka.

Real-time synchronization of data changes: support for full-database and incremental synchronization, automatic back-pressure rate control for streaming writes, and real-time, non-blocking automatic synchronization of table schema changes.

Enterprise-grade stability and reliability: complete authentication, authorization and audit functions; comprehensive monitoring, alerting and inspection; a fully managed service; and high availability for reads and writes.

TCHouse-D is designed strictly against these standards. For real-time writes with insert, delete, update and query, it draws on the pre-aggregation model of Google Mesa, and the storage engine supports fast data import through an LSM-like data structure. For real-time synchronization, MySQL Binlog can be synchronized in real time, full-database and incremental synchronization are supported, field changes are synchronized automatically, and a two-phase commit provides Exactly-Once semantics. As a cloud product, TCHouse-D also invests heavily in stability. It supports a two-level alerting system for operators and users, regular inspections, automatic back-pressure rate limiting for real-time writes, and health checks for Tablets and Compaction. In addition, a role-based permission system, whitelists and dual backup of metadata protect the security and reliability of the service.
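The way a two-phase commit yields exactly-once results can be illustrated with a toy sink. This is only a sketch of the general technique, not the TCHouse-D or Apache Doris implementation; the class, method names and transaction ids are all hypothetical.

```python
class TwoPhaseSink:
    """Toy table: rows are staged in phase one and become visible only in phase two."""

    def __init__(self) -> None:
        self.visible = []        # rows that readers can see
        self._staged = {}        # txn_id -> rows prepared but not yet committed
        self._committed = set()  # txn_ids that have already been published

    def prepare(self, txn_id: str, rows: list) -> None:
        # Phase 1: persist the batch under its transaction id; readers cannot see it yet.
        if txn_id in self._committed:
            return  # replay of an already committed batch is ignored
        self._staged[txn_id] = list(rows)

    def commit(self, txn_id: str) -> None:
        # Phase 2: publish the staged batch. Idempotent, so a retry after a crash is safe.
        if txn_id in self._committed:
            return
        rows = self._staged.pop(txn_id, None)
        if rows is None:
            raise KeyError(f"transaction {txn_id!r} was never prepared")
        self.visible.extend(rows)
        self._committed.add(txn_id)

    def abort(self, txn_id: str) -> None:
        # Drop a staged batch, for example when the upstream checkpoint fails.
        self._staged.pop(txn_id, None)


sink = TwoPhaseSink()
sink.prepare("checkpoint-1", [(1, "a"), (2, "b")])
sink.commit("checkpoint-1")

# The upstream job restarts and replays the same checkpoint: nothing is duplicated.
sink.prepare("checkpoint-1", [(1, "a"), (2, "b")])
sink.commit("checkpoint-1")

# A batch whose checkpoint fails is aborted and never becomes visible.
sink.prepare("checkpoint-2", [(3, "c")])
sink.abort("checkpoint-2")
print(sink.visible)
```

The key design point is that the transaction id is stable across retries, so the sink can recognize a replayed batch instead of writing it a second time.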

To meet the audience's expectations, Mr. Li de shared the roadmap for TCHouse-D: hot/cold data tiering, compute nodes, cross-cluster synchronous replication, storage-compute separation and other features are under development and are expected to be released in Q4 this year or early next year.

DataOps Exploration: Selection Analysis of the Top Ten Apache DataOps Projects

▲ Guo Wei, Apache Software Foundation Member and Tencent Cloud TVP, gives his talk.

In the field of big data, enterprises often pay attention to the results of data extraction and efficient mining, but seldom examine the closed-loop process of data generation, storage, integration, circulation and regeneration. Mr. Guo Wei, Apache Software Foundation Member and Tencent Cloud TVP, gave a talk titled "DataOps Exploration: Selection Analysis of the Top Ten Apache DataOps Projects".

To help the audience understand DataOps more intuitively, Mr. Guo summed it up succinctly as the whole closed loop of storing data in a database, building dashboards, integrating the data into a data lake to build data models, mining it, and finally predicting results and generating new data. In 2019, Gartner divided IT into three eras: IT craftsmanship, IT industrialization and IT digitalization. Mr. Guo pointed out that with the rapid development of AI and the emergence of large models, we are entering a fourth era, the era of intelligent IT, and that DataOps will likewise move from BI toward AI. He then gave a detailed introduction and selection analysis of ten popular open-source DataOps projects, including Apache SeaTunnel, Apache Airflow, Apache DolphinScheduler and Apache NiFi, to help enterprises and developers find suitable projects and build their own DataOps platforms.

On the topic everyone was interested in, the collision between large models and DataOps and where it is heading, Mr. Guo said that retraining open-source models will become the norm for enterprises. He used a case video, "training your own private ChatGPT for the price of a cup of Starbucks coffee", to show vividly that training a large model is feasible. The ultimate goal of DataOps is to make data generation faster, and combining large models with DataOps is something every company and individual should try boldly.

Finally, looking ahead, Mr. Guo said that the essence of Ops is to improve efficiency: between people, between business and technology, between design and R&D, and among people at different levels. He believes that "ChatGPT-like" applications will also appear in the DataOps field, allowing people to understand data through natural language.

Architecture and Practice of Tencent Cloud Intelligent Storage in AIGC Scenarios

▲ Wang Miao, head of intelligent storage R&D at Tencent Cloud, gives his talk.

As an important application scenario for large models, AIGC is currently sought after by many industries. Some institutions predict that AIGC will become a trillion-scale market in 5 to 10 years. Mr. Wang Miao, head of intelligent storage R&D at Tencent Cloud, gave a talk titled "Architecture and Practice of Tencent Cloud Intelligent Storage in AIGC Scenarios". He introduced in detail the technical architecture and main capabilities of Tencent Cloud intelligent storage, as well as the specific problems it can help enterprises solve in AIGC scenarios.

Mr. Wang Miao first described the technical architecture of the intelligent storage system layer by layer: the access layer, logic processing layer, data processing layer, storage layer and underlying basic services. He then summarized the three core elements of AIGC scenarios: content generation, content security and content intelligence. Around these three elements, and covering every stage of an AIGC workflow from data collection, data preprocessing, feature engineering and model training to inference, content review and content intelligence, Tencent Cloud provides an end-to-end intelligent storage solution.

In the Tencent Cloud intelligent storage solution, COS serves as the unified storage base of the data lake, and the data accelerators GooseFS and GooseFSx are provided for the training phase, which has a strong demand for bandwidth. Through distributed acceleration services and rich protocol support, they greatly improve the efficiency of data reads and writes and the convenience of access. For content security, Tencent Cloud builds on the rich content review capabilities of Cloud Infinite (CI), combines them with the particular needs of AIGC, and uses customized models to provide an integrated storage content security solution from input to output. On copyright protection, Mr. Wang Miao also explained in detail the technical principle of CI digital watermarking: using the discrete Fourier transform, images and video frames are converted between the time domain and the frequency domain, and the watermark information is embedded during the conversion, so that the watermark is hidden and the copyright of digital products is protected. AIGC products also inevitably need to be distributed. For this, Tencent Cloud intelligent storage provides a smart compression service that can reduce the size of JPG and PNG images by more than 50% without changing the image format, greatly saving distribution traffic.
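The frequency-domain watermarking idea described above can be sketched in a few lines. This is a deliberately simplified illustration and not Tencent Cloud's algorithm: it hides a single bit in one row of pixel intensities, uses a naive O(n^2) transform, and needs the original row for detection. The function names, the chosen frequency index `k` and the `strength` value are all assumptions made for the example.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a sequence of numbers."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform, returning complex samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def embed_bit(pixels, bit, k=3, strength=8.0):
    """Hide one bit by nudging frequency coefficient k up (bit 1) or down (bit 0)."""
    if not 0 < k < len(pixels) / 2:
        raise ValueError("k must lie strictly between 0 and len(pixels) / 2")
    spectrum = dft(pixels)
    delta = strength if bit else -strength
    spectrum[k] += delta
    spectrum[-k] += delta  # mirror coefficient keeps the output real-valued
    return [value.real for value in idft(spectrum)]


def extract_bit(original, marked, k=3):
    """Non-blind detection: compare coefficient k of the marked and original rows."""
    diff = dft(marked)[k] - dft(original)[k]
    return 1 if diff.real > 0 else 0


row = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]  # one row of grey levels
marked = embed_bit(row, 1)
print([round(v, 2) for v in marked], extract_bit(row, marked))
```

Because the change is spread over the whole row as a low-amplitude cosine, each pixel moves by at most `2 * strength / len(row)` grey levels, which is why such marks are hard to see. Real systems work on two-dimensional transforms, embed many bits, and are designed to survive compression and cropping.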

Finally, Mr. Wang Miao shared a customer case in the text-to-image field. By helping the customer deploy GooseFS on its training nodes, the Tencent Cloud intelligent storage team built TB/s-level throughput, which greatly improved the customer's training efficiency and model iteration speed. After the business launched and faced a large volume of requests and AIGC output, the customer used the AIGC automatic review function of Cloud Infinite to review text and images tens of millions of times a day, which solved the content security problem. When distributing images, the combination of AVIF adaptation and smart compression delivers the smallest suitable image to each platform, reducing image download bandwidth by 50%, saving operating costs and improving access speed.
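Delivering "the smallest suitable image to each platform" is, at its core, a content negotiation step. The sketch below shows one simple way to do it from the HTTP Accept header. It is an assumption about the general approach rather than a description of the Tencent Cloud service, and the function name, MIME list and byte sizes are made up for the example.

```python
def pick_variant(accept_header: str, variants: dict, fallback: str = "image/jpeg") -> str:
    """Return the MIME type of the smallest variant the client explicitly accepts.

    `variants` maps MIME type to encoded size in bytes and is assumed to contain
    the fallback type. Wildcards such as image/* are deliberately ignored, so a
    client only gets AVIF or WebP when it names that format.
    """
    accepted = {part.split(";")[0].strip().lower() for part in accept_header.split(",")}
    candidates = {mime: size for mime, size in variants.items()
                  if mime in accepted or mime == fallback}
    return min(candidates, key=candidates.get)


# Hypothetical encoded sizes of the same picture in three formats.
sizes = {"image/avif": 41_000, "image/webp": 58_000, "image/jpeg": 120_000}

print(pick_variant("image/avif,image/webp,image/*,*/*;q=0.8", sizes))
print(pick_variant("image/webp,*/*", sizes))
print(pick_variant("*/*", sizes))
```

Choosing by measured size rather than by a fixed format preference matters because a newer format is not smaller for every image.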

Round-table discussion

▲ round table conversation

After the talks, which were full of practical content, came the round-table session specially planned for this summit. Unlike in the past, this round table was held as a debate moderated by Uncle Ming. The five guests, Shi Kai, Gao Pan, Li de, Guo Wei and Wang Miao, each took a position on the topics and offered different and original views. Nearly every question produced a clash between the two sides, and the exchanges were brilliant. The audience thoroughly enjoyed it and also took away a lesson in critical thinking.

With the arrival of the era of AI democratization, will big data become more prosperous?

Shi Kai, Gao Pan and Wang Miao took the affirmative side. They all believe that AI will make every industry more prosperous, that the amount of data will grow sharply, and that the market will demand more computing power and efficiency from big data, which will in turn drive technological upgrades and push big data to a higher level.

Mr. Li de held the opposite view. After asking the audience whether operating systems were a hotter topic 20 years ago or today, he argued that when AI has matured fully, databases and big data will be hidden behind applications, and people's direct demand for databases or big data may decline. Mr. Guo Wei agreed: he believes that big data will become infrastructure, and that all real business logic will be handled by AI models.

Host Uncle Ming also shared his point of view. In his opinion, our understanding and exploration of data is not yet deep enough. As AI develops, data requirements are changing too, new data types or data characteristics are likely to emerge, and data engineers may then face new challenges. The evolution has gone from text (Text) to large text, then to pictures (Image) and then to video (Video). What comes after video leaves a lot of room for imagination.

Is the successful path for the future development of China's data technology "big and comprehensive" or "small and beautiful"?

Mr. Wang Miao leaned toward "small and beautiful". He believes that companies in vertical scenarios have enough domain knowledge to respond quickly to the needs of their fields once that knowledge is combined with big data technology. He also suggested that small and beautiful companies stand on the shoulders of giants: for the underlying technology they can use open-source software or cloud services, and concentrate their energy and resources on shipping their own products quickly. Mr. Gao Pan sees the question as one of division of labor. Small and beautiful companies should explore their own fields in depth, build good products and then cooperate with large companies, while large and comprehensive cloud vendors should focus on integration and provide customers with complete solutions.

Guo Wei, Shi Kai and Li de thought that "big and comprehensive" is the better path. Mr. Guo Wei pointed out that the needs of customer enterprises are diverse: perhaps 20% choose to assemble their own stack from small and beautiful single-purpose tools, while 80% may rely more on one-stop solutions. Shi Kai said that in today's fierce market, companies that are not big and comprehensive may struggle to survive, and that customers and vendors understand technology and business goals differently. A database product company therefore needs to present itself as large and comprehensive and emphasize the advantages of its products in order to gain industry recognition. Mr. Li de held a similar view: small and beautiful is the ideal vision, while big and comprehensive is the realistic path. From the perspective of commercial success, product positioning and marketing are very important, and many small and beautiful companies cannot match the positioning and publicity that make large and comprehensive companies household names.

Uncle Ming, for his part, said that small and beautiful companies are the source of innovation and that he hopes to see them succeed. However, large and comprehensive companies have more advantages in integrating resources and controlling costs, so they are more likely to succeed.

In the era of multiple weapons, what are the "weapons" that help developers improve their combat effectiveness?

Mr. Gao Pan offered suggestions from the perspective of the "weapons" themselves. Although technical products are numerous and complicated, developers only need to pick one well-recognized product in each field for in-depth study, according to the needs of their own scenarios: for example, Spark for offline scenarios, MySQL and PG for TP scenarios, and ES and Doris for AP scenarios. The rest can then be learned by analogy.

Shi Kai believes that the more technologies proliferate, the more important it is to maintain core competence. He named three abilities a developer needs: the ability to learn, the ability to reason logically and the ability to communicate. Learning ability ensures faster growth, logical ability helps solve problems better, and communication skills create a good atmosphere and environment in which to go farther, more steadily and faster.

Mr. Li de also shared three abilities. The first is the ability to use tools, such as ChatGPT and mature "wheels" (existing tools or components), to complete business requirements. The second is participating in open source: studying and researching open-source code helps developers progress faster. The last is the ability to summarize. Summarizing forces you to think, and being good at it raises the level of your thinking.

Participating in open source was also one of Mr. Guo Wei's suggestions to developers. In addition, he reminded developers to pay attention to large models, especially private models, which can perform better than expected in assisted programming. He added that a deep understanding of business processes and requirements is often what distinguishes good programmers from ordinary ones. Senior developers must not only write code but also understand the business and take part in business processes, so that they can handle business requirements better.

Mr. Wang Miao stressed that developers need a management mindset. In architecture design and technology selection, they should weigh the ratio of input to output and decide whether something should be done and how many resources to invest in it. This is a quality developers need in order to grow into well-rounded professionals.

Finally, the host, Uncle Ming, summed up his advice for participants in three words: "difference", "understanding" and "persuasion". "Difference" means both being different and embracing change. In an era of serious homogenization, developers must seek out differences, observe market changes and seize opportunities in order to get ahead in the next cycle. "Understanding" means understanding a system and understanding a business, which will become more and more important. "Persuasion" matters because truly successful developers often end up leading teams, and persuasion is essential on that path.

Conclusion

▲ Summit site

With that, the summit officially came to an end. The six experts shared their thinking and held in-depth exchanges on the latest progress and future trends of data technology, offering both an outlook on where data technology is heading and practical experience that can be put to use.

Tencent Cloud TVP will continue to keep pace with the times, stay true to its original aspiration of "using technology to influence the world", and keep creating "the most informative, interesting and useful" developer summit for developers. Let's look forward to the next Techo TVP Developer Summit.
