A Detailed Introduction to the Faust Library in Python

2025-01-19 Update From: SLTechnology News&Howtos

This article introduces the Faust library for Python: what it is, how it processes streams, how to install it, and answers to some common questions.

Contents

Faust is a stream processing library

Porting ideas from Kafka Streams to Python

An agent is an async def function, so it can also perform other operations asynchronously

Using a Kafka topic as the "write-ahead log"

Faust supports any type of stream data

Faust is statically typed

Introduction to Faust

High availability

Distributed

Fast

Flexibility

Installation

Bundles

Download and install from source

Use the development version

FAQ

Faust is a stream processing library that ports the ideas from Kafka Streams to Python.

It is used at Robinhood to build high-performance distributed systems and real-time data pipelines that process billions of events every day.

Faust provides both stream processing and event processing, and is similar in purpose to tools such as Kafka Streams, Apache Spark, Storm, Samza, and Flink.

It does not use a DSL; it is just Python! This means that you can use all your favorite Python libraries when stream processing:

NumPy, PyTorch, Pandas, NLTK, Django, Flask, SQLAlchemy, etc.

Faust requires Python 3.6 or later, because it relies on the async/await syntax and on variable type annotations.

A typical example processes a stream of incoming orders.

The agent decorator defines a "stream processor" that essentially consumes from a Kafka topic and does something for every event it receives.

The agent is an async def function, so it can also perform other operations asynchronously, such as web requests.
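The idea of an agent as an async function iterating over a stream can be sketched with the standard library alone. This is not Faust code: the stream is a plain async generator, and the names order_stream, lookup_account and process_orders are hypothetical stand-ins chosen for illustration. In Faust the processing function would be registered with the agent decorator and fed from a Kafka topic.

```python
import asyncio


async def order_stream(orders):
    """Stand-in for a stream: yields one event at a time."""
    for order in orders:
        await asyncio.sleep(0)  # let other tasks run between events
        yield order


async def lookup_account(account_id):
    """Hypothetical async side call (for example a web request), simulated here."""
    await asyncio.sleep(0)
    return f"account-{account_id}"


async def process_orders(stream):
    """Agent-style processor: iterate the stream and await other work per event."""
    results = []
    async for order in stream:
        name = await lookup_account(order["account_id"])
        results.append((name, order["amount"]))
    return results


def run_demo():
    orders = [{"account_id": 1, "amount": 30}, {"account_id": 2, "amount": 5}]
    return asyncio.run(process_orders(order_stream(orders)))


print(run_demo())
```

Because the processor is a coroutine, the awaited lookup does not block the event loop, which is what lets a single worker interleave many such operations.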

The system can also persist state, acting like a database. Tables are named distributed key/value stores that you can use like regular Python dictionaries.

Tables are stored locally on each machine in a very fast embedded database written in C++, called RocksDB.

Tables can also store aggregate counts that are optionally "windowed", so you can keep track of, for example, the number of clicks in the last day or the number of clicks in the last hour. Like Kafka Streams, we support tumbling, hopping, and sliding windows of time, and old windows can be expired to stop data from filling up.
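To make the windowing idea concrete, here is a minimal sketch of a tumbling-window counter with expiry, written with plain dictionaries. It is an illustration only and does not use Faust's table API; the class name TumblingCounter and its size and expires parameters are assumptions made for this example, and timestamps are plain integers in seconds.

```python
from collections import defaultdict


class TumblingCounter:
    """Counts events per key in fixed, non-overlapping time windows."""

    def __init__(self, size, expires):
        self.size = size        # window length in seconds
        self.expires = expires  # how long a closed window is kept, in seconds
        # window start time -> key -> count
        self.windows = defaultdict(lambda: defaultdict(int))

    def _window_start(self, timestamp):
        return timestamp - (timestamp % self.size)

    def add(self, key, timestamp):
        self.windows[self._window_start(timestamp)][key] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop windows that closed more than `expires` seconds ago.
        stale = [s for s in self.windows if s + self.size + self.expires <= now]
        for start in stale:
            del self.windows[start]

    def count(self, key, timestamp):
        window = self.windows.get(self._window_start(timestamp), {})
        return window.get(key, 0)


clicks = TumblingCounter(size=60, expires=120)
clicks.add("/home", 5)
clicks.add("/home", 59)
clicks.add("/home", 61)
print(clicks.count("/home", 30), clicks.count("/home", 61))
```

A hopping window would assign each event to several overlapping windows instead of exactly one; the expiry step stays the same.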

For reliability, we use a Kafka topic as a "write-ahead log".

Whenever a key is changed, we publish the change to the changelog. Standby nodes consume from this changelog to keep an exact replica of the data, which enables instant recovery if any node fails.

To the user, a table is just a dictionary, but the data persists between restarts and is replicated across nodes, so other nodes can take over automatically on failure.
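The changelog mechanism described above can be modelled in a few lines. This sketch is an assumption-laden stand-in: a Python list plays the role of the Kafka changelog topic, and ChangelogTable and recover are hypothetical names, not part of Faust.

```python
class ChangelogTable(dict):
    """A dict that appends every change to a changelog (stand-in for a Kafka topic)."""

    def __init__(self, changelog):
        super().__init__()
        self.changelog = changelog

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        # Publish the change so that standbys can replay it later.
        self.changelog.append((key, value))


def recover(changelog):
    """Rebuild the table state on a standby by replaying the changelog in order."""
    state = {}
    for key, value in changelog:
        state[key] = value
    return state


log = []
table = ChangelogTable(log)
table["a"] = 1
table["a"] = 2
table["b"] = 5
print(recover(log) == dict(table))
```

Replaying in order matters: later entries for the same key overwrite earlier ones, so the standby ends up with the latest value for every key.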

For example, a table can be used to count page views by URL.

The data sent to the Kafka topic is partitioned, which means that the clicks are sharded by URL, so every count for the same URL is delivered to the same Faust worker instance.
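The effect of key-based partitioning can be sketched as follows. The hash function here (CRC32) is chosen only because it is deterministic and in the standard library; it is not necessarily what Kafka uses, and the function names and the partition count are assumptions for illustration.

```python
import zlib
from collections import Counter


def partition_for(key, num_partitions):
    """Map a key to a partition deterministically."""
    # crc32 is stable across processes, unlike the built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def count_page_views(urls, num_partitions=4):
    """Route each click to the worker owning its partition, then count locally."""
    workers = [Counter() for _ in range(num_partitions)]
    for url in urls:
        workers[partition_for(url, num_partitions)][url] += 1
    return workers


workers = count_page_views(["/a", "/b", "/a", "/a"])
for index, counts in enumerate(workers):
    print(index, dict(counts))
```

Since a given URL always maps to the same partition, each worker holds the complete count for the URLs it owns and never needs to merge counts with another worker.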

Faust supports any type of stream data

This includes bytes, Unicode, and serialized structures. Faust also comes with "models" that use modern Python syntax to describe how the keys and values in a stream are serialized.
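As a rough analogy for what a model does, the sketch below uses a standard-library dataclass with JSON encoding. It is not Faust's model API: the Order class, its fields, and the dumps and loads methods are hypothetical and only show how annotated fields can drive serialization.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Order:
    """Annotated fields describe the shape of each message value."""
    account_id: str
    amount: int

    def dumps(self) -> bytes:
        # Serialize the fields to JSON bytes, as they might travel on a topic.
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def loads(cls, raw: bytes) -> "Order":
        # Rebuild the object from the serialized form.
        return cls(**json.loads(raw.decode("utf-8")))


order = Order(account_id="3fae", amount=3)
print(Order.loads(order.dumps()) == order)
```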

Faust is statically typed

It is statically typed using the mypy type checker, so you can take full advantage of static types when writing applications.

The Faust source code is small and well organized, and it is a good resource for learning how Kafka Streams is implemented.

Learn more about Faust on its introduction page, which covers system requirements, installation instructions, community resources, and more. You can also go directly to the quick start tutorial to see Faust in action by writing a stream processing application, and then explore the user guide for in-depth information organized by topic.

Introduction to Faust

Faust is very easy to use. To get started with other stream processing solutions, you have to deal with complicated hello-world projects and infrastructure requirements. Faust only requires Kafka; the rest is just Python. If you know Python, you can already use Faust to do stream processing, and it can integrate with just about anything.

One of the simplest applications you can build is a small stream processor written in plain Python.

You may be intimidated by the async and await keywords, but you do not need to know how asyncio works to use Faust: just mimic the examples and you will be fine.

The example application starts two tasks: one processes a stream, and the other is a background thread that sends events to that stream. In a real application, your system publishes events to Kafka topics that your processors consume from; the background thread is only needed to feed data into the example.
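The two-task layout described above can be imitated with asyncio alone. In this sketch an asyncio.Queue stands in for the Kafka topic and both sides run as coroutines in one event loop; sender, processor and main are hypothetical names, and the None sentinel exists only so that the demo terminates, whereas a real stream is unbounded.

```python
import asyncio


async def sender(queue, count):
    """Background producer: feeds events into the stand-in topic."""
    for i in range(count):
        await queue.put(f"event-{i}")
    await queue.put(None)  # sentinel so the demo can stop


async def processor(queue, seen):
    """Stream processor: handles each event as it arrives."""
    while True:
        event = await queue.get()
        if event is None:
            break
        seen.append(event.upper())


async def main(count=3):
    queue = asyncio.Queue()
    seen = []
    # Run producer and consumer concurrently in the same event loop.
    await asyncio.gather(sender(queue, count), processor(queue, seen))
    return seen


print(asyncio.run(main()))
```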

High availability

Faust is highly available and can survive network problems and server crashes. If a node fails, it can recover automatically, and tables have standby nodes that take over.

Distributed

Start more instances of your application as needed.

Fast

A single-core Faust worker instance can already process tens of thousands of events per second, and we are reasonably confident that throughput will increase once we can support a more optimized Kafka client.

Flexibility

Faust is just Python, and a stream is an infinite asynchronous iterator. If you know how to use Python, you already know how to use Faust, and it works with your favorite Python libraries, such as Django, Flask, SQLAlchemy, NLTK, NumPy, Scikit, TensorFlow, and so on.

Installation

You can install Faust either from the Python Package Index (PyPI) or from source.

To install it using pip, run: pip install -U faust

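Bundles

Faust also defines a group of setuptools extensions that can be used to install Faust together with the dependencies for a given feature.

You can specify these in your requirements or on the pip command line by using square brackets, separating multiple bundles with commas, for example: pip install "faust[rocksdb,uvloop,fast]"

The following bundles are available:

Stores: faust[rocksdb]

Optimization: faust[fast]

Sensors: faust[datadog], faust[statsd]

Event loops: faust[uvloop], faust[gevent], faust[eventlet]

Debugging: faust[debug], faust[setproctitle]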

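Download and install from source

After downloading and unpacking the source archive, build and install it from the unpacked directory by running python setup.py build, followed by python setup.py install.

If you are not currently using a virtualenv, the last command must be executed as a privileged user.

Use the development version

You can install the latest development snapshot of Faust with pip by pointing it at the project's source repository instead of a released package.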

FAQ

Can I use Faust with Django/Flask/etc.?

Yes. Use gevent or eventlet as a bridge to integrate with asyncio.

Use gevent

This approach works with any blocking Python library that can work with gevent.

Using gevent requires you to install the aiogevent module, which you can install as a bundle with Faust: pip install -U "faust[gevent]"

Then, to actually use gevent as the event loop, either pass the -L option to the faust program, for example faust -L gevent -A myproj worker -l info (where myproj is your application module), or add import mode.loop.gevent at the top of your entry point script.

Remember: it is very important that this import is at the very top of the module and that it executes before you import other libraries.

Use eventlet

This approach works with any blocking Python library that can work with eventlet.

Using eventlet requires you to install the aioeventlet module, which you can install as a bundle with Faust: pip install -U "faust[eventlet]"

Then, to actually use eventlet as the event loop, either pass the -L option to the faust program, for example faust -L eventlet -A myproj worker -l info (where myproj is your application module), or add import mode.loop.eventlet at the top of your entry point script.

Warning

It is very important that this import is at the very top of the module and that it executes before you import other libraries.

Can I use Faust with Tornado?

Yes. Use the tornado.platform.asyncio bridge.

Link: http://www.tornadoweb.org/en/stable/asyncio.html

Can I use Faust with Twisted?

Yes. Use the asyncio reactor implementation.

Link: https://twistedmatrix.com/documents/17.1.0/api/twisted.internet.asyncioreactor.html

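Will Faust support Python 3.5 or earlier?

There are currently no plans to support Python 3.5, but you are welcome to contribute to the project.

Here are the steps that would be needed to achieve this:

A source code transformation that rewrites variable annotations into type comments. For example, a class attribute declared as x: int = 0 must be rewritten as x = 0  # type: int.

A source code transformation that rewrites async functions. For example, an async def function whose body uses await must be rewritten as a generator-based coroutine that uses yield from instead.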
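The two rewrites can be shown side by side in a small sketch. The class and function names are hypothetical. The block as a whole needs Python 3.6 or later because it contains the "before" forms; only the "after" forms would be valid on Python 3.5. The decorator used for the generator-based form is types.coroutine from the standard library, which is an assumption made here so that the example still runs on current Python versions.

```python
import asyncio
import types


# Before: variable annotations (Python 3.6+ syntax).
class PointAnnotated:
    x: int = 0
    y: int = 0


# After: the same attributes with type comments, valid on Python 3.5.
class PointCommented:
    x = 0  # type: int
    y = 0  # type: int


# Before: a native coroutine using await.
async def double_native(value):
    await asyncio.sleep(0)
    return value * 2


# After: a generator-based coroutine using yield from.
@types.coroutine
def double_generator(value):
    yield from asyncio.sleep(0)
    return value * 2


async def compare(value):
    # Both forms can be awaited and produce the same result.
    return await double_native(value), await double_generator(value)


print(asyncio.run(compare(21)))
```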
