2025-04-11 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article looks at how Jupyter Notebook is adapting to the direction in which data science is developing.
01
This article discusses where data science is heading and the forces that drive the development of our tools.
It is a review of a tool I use often in data science: Jupyter Notebook.
First, I look at how the practice of data science has changed over the past few years. Then I highlight three main forces that have changed the way I use notebooks today.
02 Data science today
The field of data science is changing rapidly. We have entered an era in which slogans such as "the sexiest job of the 21st century" and "data is the new oil" are outdated, replaced by concrete business problems and technology-driven challenges. I think this change is twofold: we now have to deal with (1) the need to bring analyses and experiments into production, and (2) the rapid adoption of cloud technology.
First, the need for production. Building data products and deploying experimental artifacts as part of the software engineering life cycle has become steadily more common over the years. With the rise of machine learning engineers and data science software developers, more and more of the work is engineering work. In addition, analysis is no longer limited to publications or charts, because there is a growing need to reproduce experiments and deploy their artifacts.
Next, the exponential growth of data requires the use of cloud technology. We can't simply load a 1 TB dataset into Pandas on our own laptops! The popularity of tools such as Docker and Kubernetes lets us scale data processing workloads to an unprecedented level. Adopting the cloud means that we have to consider scalability, resource provisioning, and infrastructure when managing workloads. However, the earlier Jupyter Notebook ecosystem, although it was an important part of the data scientist's toolbox, did not keep pace with these changes.
As I said, Jupyter Notebook as we know it was not built for these changes. Notebooks are meant for exploration, not for production. They are meant to run on a single machine, not on a cluster. However, over the past five years, the Jupyter Notebook ecosystem has grown: we now have JupyterLab, a range of plug-ins, new kernels for other languages, and third-party tools available to us. Of course, we can still start a notebook by typing jupyter notebook into the terminal, but now there is much more to it than that!
This raises the question: what drives these changes, and how can we use this larger notebook ecosystem to cope with today's changes in data science?
03 Three forces of change
Jupyter Notebook's ecosystem is growing, and I think it's driven by three forces:
Experimenting on the cloud: big data needs a lot of compute and storage, and the average consumer machine cannot always provide it.
Supporting developer workflows: more and more data science teams are adopting software engineering best practices such as version control, git-flow, and pull requests.
Rapid shift from analysis to production: testing hypotheses in a controlled environment is not enough. Software written for analysis should be easy to reuse in production.
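To make the developer-workflow point concrete: a notebook file is plain JSON, and cell outputs and execution counters change on every run, which makes diffs and pull requests noisy. The following is a minimal sketch, not a tool from the original article, that clears those fields before a commit. It assumes the nbformat 4 layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count"); the function name strip_outputs and the sample notebook are hypothetical.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of an nbformat-4 notebook with outputs and counters cleared."""
    # A JSON round-trip gives a deep copy, so the caller's object is untouched.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# A tiny hand-written notebook, used here only for illustration.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Exploration"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 7,
            "source": ["1 + 1"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 7,
                    "data": {"text/plain": ["2"]},
                    "metadata": {},
                }
            ],
        },
    ],
}

cleaned = strip_outputs(example)
print(json.dumps(cleaned["cells"][1], indent=1))
```

In practice a team would more likely rely on an existing third-party tool for this (nbstripout is one example) and run it as a Git filter or pre-commit hook; the sketch only shows what such a step does to the file.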
Moving towards a cloud-first environment means we can run notebook-based tasks on machines that are more powerful than our own. For example, a managed notebook instance lets us run Jupyter Notebook on a remote server without any installation or setup on our side. On the other hand, the move towards a more production-oriented workflow gives us a set of tools for bringing notebook-based tasks under software engineering practices. We will see more of these tools in the next part of this article.
Finally, note that the development of these tools does not depend on a single entity or organization. As we will see later, the gaps may be filled by individuals who provide third-party plug-ins or by organizations that provide managed services.
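As one illustration of moving notebook code towards production, the code cells of a notebook can be collected into a plain Python script that is then imported, tested, and reviewed like any other module. This is a rough sketch under the same nbformat 4 assumption (cell "source" stored as a string or a list of strings); notebook_to_script and the demo notebook are hypothetical names, and the line filter for IPython magics is deliberately crude.

```python
def notebook_to_script(notebook: dict) -> str:
    """Concatenate the code cells of an nbformat-4 notebook into one script."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        # Crude filter: drop IPython magics (%...) and shell escapes (!...),
        # which are not valid in a plain Python script.
        lines = [
            line for line in text.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        if any(line.strip() for line in lines):
            chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


# Hypothetical notebook content for the demonstration.
demo = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 1,
            "outputs": [],
            "source": [
                "%matplotlib inline\n",
                "def mean(xs):\n",
                "    return sum(xs) / len(xs)\n",
            ],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 2,
            "outputs": [],
            "source": "print(mean([1, 2, 3]))",
        },
    ],
}

script = notebook_to_script(demo)
print(script)
```

The nbconvert tool that ships with Jupyter offers a maintained version of this conversion (jupyter nbconvert --to script); the sketch is only meant to show that the step from exploratory cells to reusable code can be automated.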
We looked at two drivers of change in data science: (1) the adoption of cloud computing, and (2) the growing demand for production. We found that Jupyter Notebook as we traditionally know it covers only a small part of this landscape: it is usually used for exploration (rather than production) and runs only on our local machines (not in the cloud).
Then, using the same framework, we identified three forces of change that have pushed the Jupyter Notebook ecosystem to evolve: more experiments on the cloud, support for developer workflows, and a faster shift from analysis to production. These forces may lead to new tools, plug-ins, and notebook-like products that fill these gaps.