2025-03-29 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
I'm a lucky man. Although luck cannot be copied, vision and effort can.
Guan Tao, known as Boss Guan, was born in the 1980s and is a P10 at Alibaba, where he is the person in charge of the general computing platform and a researcher in the computing platform group. His 12 years of working life have been split between two employers: Microsoft and Alibaba.
Guan Tao's "flower name" (the nickname Alibaba employees adopt) is a homonym of his own name: Guan Tao, which reads as "watching the waves" and suggests gazing leisurely at the sea. In the MaxCompute technical team, however, the job is not only to build core technology but also to "get results on the ground" and carry Aliyun's customer scale and revenue, much like a small start-up, so people prefer to call him Boss Guan, which makes him sound suddenly down-to-earth and approachable.
Boss Guan is a northerner, tall and a little bookish. For work he leads a multinational team spread across the Pacific (Beijing, Hangzhou, Seattle, California), and he occasionally likes to mix Chinese and English when he speaks.
"I am an interest-driven person. Looking at my career as a whole, I have been quite lucky to do what I am interested in and to get into the IT business."
People who know Guan Tao can't help saying, "This guy's luck is so good."
All the way through school he never had to sit an entrance examination, and his reason for choosing computer science at university was just as capricious: he liked playing games. After graduating he joined Microsoft, where he became its youngest technical manager. He later moved to Aliyun, and in less than three years he was a P10 and head of the team behind MaxCompute, Alibaba's general computing platform.
"A long time ago, probably when I was in junior high school, I got my first computer, the famous 486, with a math coprocessor, a clock speed of 266 MHz, and only 4 MB of memory."
Anyone who plays games knows that you often hit levels that are too hard to get past. Guan Tao wondered at the time: how could he get around these limits built into the system? So he went through a lot of magazines and books, trying to modify the saved-game files. That was when he first learned what hexadecimal was, and it was also his first contact with programming.
In the end, through his own tinkering, he could steer his game characters to victory in every direction; they were simply invincible. As for the feeling of making a program run the way you want: "Hey, it was fun."
As a result, he began to think that this field (computing) was a good one. When he graduated from high school, he was recommended for admission to Nankai University on the strength of his mathematics competition results. One option was the mathematics department, Nankai's flagship major, but in the end Guan Tao chose computer science because of his interest.
Life has many forks. Sometimes, once the first choice is made, the road ahead seems to open up by itself. It looks like going with the flow, but it is really the result of choices.
From Beijing, 200 km away, to Seattle, more than 8,000 km away
Work calls for a small change every so often
In 2006, Guan Tao graduated. He was about to start his career and was eager to try his hand.
During his three years as a graduate student, his advisor had an extra rule: no internships. As a result Guan Tao knew little about the job market, and "not much" about Microsoft either. But Beijing had MSRA, Microsoft Research Asia, which was said to be the best R&D center at the time.
He went in with the attitude of giving it a try, and after a full day of interviews he got the offer without trouble. "It didn't seem to be that difficult," he recalls.
Guan Tao spent six years in Beijing, 200 km from Chengde, Hebei province, and was among the first few dozen people on Microsoft's Bing search team there. Moving from work closer to the storage layer to the computing layer, he kept building himself up through projects. He is an interest-driven person, but at work he is willing to be a perfectionist.
His first project at Microsoft was to build a distributed KV + ObjectStore system for the image and video storage behind Bing search. In 2006 there was no open source system like HBase, so a small team of six wrote a distributed KV store entirely from scratch. It was eventually deployed on 3,000 machines and served regular online traffic. In the process they ran into the many practical challenges of distributed systems and learned a great deal. "This project was a good opportunity and a good start."
The second project was the search back end's IndexGen Pipeline: a customized storage and computing system that supported storing and processing very large data sets, at the 100B scale, for general search. It later became the second-generation architecture of the Microsoft Bing search back end and is still in service today.
Later he led the work on interactive queries over big data (JetScope on Cosmos), and in the end more than half of Microsoft's teams were using the system.
In Guan Tao's view, whether in life or in a career, making a small change every so often is a good choice. You see and learn more, and things stay fresh. Going from writing code under someone else's direction, to owning some modules independently, to leading your own project team and then a larger technical team: each step requires setting your own timeline and keeping hold of the pace of your own development.
After six years at Microsoft, he prepared to make a bigger change: he applied to move to Microsoft headquarters in Seattle, USA.
In that city, more than 8,000 kilometers away, the winters are not too cold and the summers not too hot, and there is his favorite sport, snowboarding. He makes a point of driving to a different ski resort on the last day of every year.
During his time in the United States, Guan Tao continued to work in depth on interactive queries, StructuredData optimization and improvement, and related areas, and he also built up a lot of experience in managing multinational technical teams. "The team in the United States has nearly 40 years of history behind it, and its members are more senior than those in Beijing. In the United States you get to see different people and different projects."
During his 10 years at Microsoft, Guan Tao also kept an eye on China's domestic companies, represented by BAT, which were developing very well and at an accelerating pace.
Employee No. 22 of the Seattle branch
Returning after 10 years means facing more challenges
"At that time, the overseas office had just been set up, and I was employee No. 22 of Alibaba's Seattle branch."
"After 10 years at Microsoft, what is the situation in China like?" His curiosity grew, and when the chance came, Guan Tao moved to Alibaba and became a member of the team behind MaxCompute, Alibaba's general computing platform. That was January 2016.
The predecessor of MaxCompute is ODPS, Alibaba's unified internal big data platform. At present, 99% of Alibaba's data storage and 95% of its computing run on this platform. If Alibaba Group's data system is compared to an aircraft carrier battle group, then MaxCompute is the carrier in the middle.
Taking on a huge, relatively mature platform that had been under development for nearly six years brought many challenges. Guan Tao joined Alibaba in January 2016 and took the helm of MaxCompute at the 2016 annual meeting. The platform had already gone from 0 to 1. How could it go from 1 to 10? He did not have much time.
He believes that the gradual development of a large-scale system is a process of continuous self-evolution, and big data systems are no exception.
His Microsoft experience helped: he had worked on the same kind of big data engine (though at a very different scale), so his earlier technical and engineering experience could be reused. His extensive experience managing multinational technical teams also made it easier for him to adapt to working at Alibaba.
From MaxCompute 1.0 to MaxCompute 2.0
"We're changing engines on a flying plane."
Guan Tao recalled: "When we came in, MaxCompute 1.0 was already carrying the core business of Alibaba and Aliyun and was technically mature, so an engine upgrade carried technical risks and problems (we call them regressions, in both functionality and performance). To keep the change transparent to the upper layers, we first upgraded the framework so that different versions of the engine could be deployed online at the same time, then shifted traffic over bit by bit while observing the effect." Only then came the major surgery at the engine level.
It's kind of like "changing the engine on a flying plane".
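The rollout described above, with two engine versions online at once and traffic moved over in small steps while regressions are watched for, can be sketched in a few lines. This is only an illustration under assumptions: `VersionRouter`, `has_regression`, the 10% step and the latency tolerance are hypothetical names and values, not MaxCompute's actual framework.

```python
import random
from dataclasses import dataclass, field


@dataclass
class VersionRouter:
    """Hypothetical router that sends each job to the old or the new engine."""
    new_percent: int = 0       # share of jobs (0-100) sent to the new engine
    step_percent: int = 10     # assumed size of each traffic increase
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def pick(self) -> str:
        """Choose an engine version for one incoming job."""
        return "v2" if self.rng.randrange(100) < self.new_percent else "v1"

    def advance(self, regression_detected: bool) -> None:
        """Shift more traffic to v2, or roll everything back to v1."""
        if regression_detected:
            self.new_percent = 0
        else:
            self.new_percent = min(100, self.new_percent + self.step_percent)


def has_regression(old_latency: float, new_latency: float,
                   tolerance: float = 0.05) -> bool:
    """Treat the new engine as regressed if it is more than `tolerance` slower."""
    return new_latency > old_latency * (1 + tolerance)


router = VersionRouter()
for _ in range(3):
    # Assumed measurements: the new engine is faster, so traffic keeps shifting.
    router.advance(has_regression(old_latency=10.0, new_latency=8.0))
print(router.new_percent)  # 30
```

Keeping the share as an integer percentage avoids floating-point drift when the step is applied repeatedly, and a single detected regression sends all traffic back to the old engine.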
Today MaxCompute 2.0 runs at a scale of nearly 100,000 machines. Compared with version 1.0, its performance has more than doubled, which saves Alibaba more than 2 billion of its budget every year. It also gives Alibaba's big data engine a relatively good position for the next 3-5 years.
For the specific content of MaxCompute2.0, you can search MaxCompute on the forums of Yunqi Community to learn about it.
How to manage a technical team
Guan Tao's view is that, in the final analysis, a technical manager is still a manager.
1. The first thing to consider is not what you want to do but what you can help the team do; the role carries more of an "altruistic" sense of responsibility.
2. Technology is forward-looking. A technical team manager should lead the team forward purposefully and in the right direction, and getting the future direction right is very important.
3. On recruitment, think about how to hire the right people and how to distribute talent. We are now at the stage where the office goes wherever the talent is.
Alibaba's Singles' Day from a big data point of view
Support for Singles' Day starts with two unifications: unified data and unified resources.
Data has the property that 1 + 1 is greater than 2: computing over different data sets together produces more value than computing over each separately. The key is how to connect all the data.
A few years ago, Alibaba built its data middle platform (Zhongtai) and put all its internal data together. Physically the data is distributed across nearly 100,000 servers in many locations, but logically it is unified, and the distribution and scheduling of data are transparent to users. This lets rich data help products and businesses move forward.
Unified resources means putting all the machines into one large resource pool (known internally as the mixed-deployment project) and connecting the resource scheduling systems, which greatly helps both machine efficiency and disaster recovery for the whole system.
Everyone who works on big data knows that data can grow fivefold in three years, but the number of machines cannot, because the cost would be too high to be realistic. The alternative is hybrid deployment on existing servers. "This has also been a project we have focused on for nearly a year: deploying machines from different BUs and of different types in the same resource pool."
With these two unifications in place, when the traffic peak arrives on Singles' Day, the platform can choose to stop less important work (scheduling among millions of jobs by priority and dependency) so that those machines can be used to support the peak. After the peak, most of the machines are switched back to computation so that the necessary results are produced as soon as possible.
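A minimal sketch of choosing which jobs to pause by priority and dependency might look like the following. It is an assumption-laden illustration, not the platform's scheduler: the `Job` fields, the `min_priority` threshold and the job names are all hypothetical. A job is protected if it is important enough itself or if a protected job depends on it; the remaining jobs are paused from the lowest priority up until enough machines are freed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    priority: int            # larger means more important (assumed convention)
    machines: int            # machines the job currently occupies
    depends_on: tuple = ()   # names of jobs whose output this job needs


def protected_jobs(jobs, min_priority):
    """Important jobs plus everything they depend on, directly or indirectly."""
    by_name = {job.name: job for job in jobs}
    protected = set()
    stack = [job.name for job in jobs if job.priority >= min_priority]
    while stack:
        name = stack.pop()
        if name in protected:
            continue
        protected.add(name)
        stack.extend(by_name[name].depends_on)
    return protected


def jobs_to_pause(jobs, machines_needed, min_priority):
    """Pause unprotected jobs, least important first, until enough is freed."""
    keep = protected_jobs(jobs, min_priority)
    candidates = sorted(
        (job for job in jobs if job.name not in keep),
        key=lambda job: (job.priority, job.name),
    )
    paused, freed = [], 0
    for job in candidates:
        if freed >= machines_needed:
            break
        paused.append(job.name)
        freed += job.machines
    return paused, freed


jobs = [
    Job("billing_report", priority=9, machines=40, depends_on=("order_etl",)),
    Job("order_etl", priority=3, machines=30),
    Job("ad_hoc_stats", priority=1, machines=50),
    Job("weekly_backfill", priority=2, machines=80),
]
print(jobs_to_pause(jobs, machines_needed=100, min_priority=8))
# (['ad_hoc_stats', 'weekly_backfill'], 130)
```

Note that `order_etl` keeps running despite its low priority, because the high-priority `billing_report` depends on it.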
On Singles' Day this year, the big data cluster used this elasticity to support more than 1x4 of the trading traffic during the peak hours.
No pieces are added to the board; the layout that defends the general is completed only by moving the pieces already there. Of course, before this the team had already switched MaxCompute from version 1.0 to version 2.0, and that performance improvement was also key to supporting the Singles' Day data volume.
With hardware growing by less than one third, the volume of data processed doubled compared with last year, reaching 600 PB in a single day. It is fair to say that MaxCompute performed well in this campaign, even better than last year.
The future: cloud, new hardware, unstructured computing, non-relational computing and AI are the trends
Will DBAs be eliminated?
Last year, Hu Xiaoming, former president of Aliyun, said: "Competition in cloud computing on the Internet is an overall contest within the world's oligopoly economy. In my view, it is a contest between Hangzhou and Seattle. Whoever embraces technology embraces the future." The pride in those words is plain.
Guan Tao believes that cloud computing has spread from Internet companies to traditional enterprises and beyond. Examples include Hangzhou's City Brain and its "at most one visit" program, which are 2G (To Government) projects, as well as Industry 4.0 projects built on the Industrial Brain.
Judging from the current market attitude, enterprises are likely to be more open, to welcome and embrace this technological change, and to complete their own digital transformation. "Cloud computing will not be an oligopoly; it will be inclusive," Guan said.
A forward-looking topic: in big data processing, what should programmers focus on in the future?
1. Development of new hardware
The computing layer is ever more tightly coupled to innovation in new hardware, and hardware will bring about a platform revolution. New hardware is developing on several fronts: chips (CPU features such as AVX and SIMD, ARM many-core architectures, GPU, FPGA, ASIC), memory and storage (NVM, SSD, SRM), and networking (smart network cards and RDMA). How this new hardware and software work together deserves attention.
2. There are many opportunities in non-relational computing (graph computing).
Big data is still largely at the relational processing level, including stream and batch processing over relational data. In fact, non-relational computing is becoming more and more common, for example in knowledge graphs and user profiles. This data is not organized as relations; it is expressed as a graph of vertices and edges, which is closer to how the physical world is abstracted. Relationships between people and goods, risk control, and knowledge graphs are all cases where a graph is a more natural way to describe how real-world entities relate.
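To make the vertices-and-edges idea concrete, here is a small sketch of people-and-goods relationships stored as labeled edges, with a two-hop traversal over them. Everything in it is assumed for illustration: `SimpleGraph`, the edge labels `bought` and `bought_by`, and the sample data are hypothetical and have nothing to do with any MaxCompute API.

```python
from collections import defaultdict


class SimpleGraph:
    """Hypothetical in-memory graph: labeled, directed edges between vertices."""

    def __init__(self):
        self._edges = defaultdict(set)   # (vertex, label) -> set of neighbors

    def add_edge(self, src, label, dst):
        self._edges[(src, label)].add(dst)

    def neighbors(self, vertex, label):
        return set(self._edges.get((vertex, label), set()))


def record_purchase(graph, person, item):
    """Store the person-goods relationship in both directions."""
    graph.add_edge(person, "bought", item)
    graph.add_edge(item, "bought_by", person)


def also_bought(graph, person):
    """Goods bought by people who share at least one purchase with `person`."""
    mine = graph.neighbors(person, "bought")
    peers = set()
    for item in mine:
        peers |= graph.neighbors(item, "bought_by")
    peers.discard(person)
    suggestions = set()
    for peer in peers:
        suggestions |= graph.neighbors(peer, "bought")
    return sorted(suggestions - mine)


g = SimpleGraph()
record_purchase(g, "alice", "phone")
record_purchase(g, "alice", "case")
record_purchase(g, "bob", "phone")
record_purchase(g, "bob", "charger")
record_purchase(g, "carol", "book")
print(also_bought(g, "alice"))  # ['charger']
```

The same question asked of relational tables would need self-joins on a purchases table; in the graph form it is a short walk along edges, which is the sense in which a graph describes such relationships more directly.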
Early next year, MaxCompute's graph computing system MaxGraph will be launched, which supports machine learning operations such as graph storage, query, pattern matching and GraphEmbedding.
3. Unstructured data will become the mainstream of big data.
With the development of IoT, short video, image and voice data keep growing and may come to account for 80% of all data. This kind of data has heterogeneous structure, and it is very large in volume but low in value per unit (compared with traditional structured data), so analyzing and processing unstructured data quickly and efficiently is a key challenge for computing platforms.
Last year, MaxCompute released an unstructured data processing module that can handle data, including video and audio, in a user-defined way.
4. AI for Everything (also for BigData)
Will DBAs be eliminated?
Big data is defined by being big: not only in the scale of data processed, but also in the management and optimization of the entire mass of data. The traditional database approach, which relies on DBA manpower for management, will no longer be workable.
AI can be used to optimize data distribution, data management, computation and cost (for example, automatic SubQuery merging and intelligent index building). "Making big data driverless" is also a trend of the future.
A message from Boss Guan
Wake up every day with a feeling of passion for the difference technology will make in people's life.
(Every morning when you wake up, you will be excited about technological progress and the development and improvement it brings to human life.)
-- quoted from the complete Biography of Bill Gates ("Biography of Bill Gates")
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
© 2024 shulou.com SLNews company. All rights reserved.