
An In-depth Analysis of Pulsar Functions

2025-01-22

Today we will take an in-depth look at Pulsar Functions. Many readers may not be very familiar with it, so the following summary is meant to make it easier to follow; hopefully you will take something away from this article.

A Brief Introduction to Pulsar Functions

Pulsar Functions is a lightweight compute framework. Its main goal is to provide a platform that is simple to deploy and simple to operate, with "simplicity" as its defining characteristic.

At a high level, the processing flow is: messages are consumed from an input topic, passed into a function for computation and processing, and the results are finally written to an output topic.

Pulsar Functions can cover more than 90% of stream-processing scenarios, for example message filtering, message routing, and message enrichment. For more usage scenarios, refer to the previously published technical blog: event-processing design patterns based on Pulsar Functions.
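
To make the input topic → function → output topic flow concrete, here is a minimal filter-style function sketch. The class name and the keyword rule are made up for illustration; the `Function`/`Context` interfaces are the standard Pulsar Functions Java API.

```java
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

/**
 * A minimal filter-style function: each message consumed from the input topic
 * is passed to process(); returning a value publishes it to the output topic,
 * while returning null drops the message.
 */
public class KeywordFilterFunction implements Function<String, String> {

    @Override
    public String process(String input, Context context) {
        // Illustrative rule: only forward messages that mention "pulsar".
        if (input != null && input.toLowerCase().contains("pulsar")) {
            return input;   // routed to the configured output topic
        }
        return null;        // filtered out, nothing is emitted
    }
}
```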

Pulsar Functions is composed of three main modules: Instance, Runtime, and Function Worker.

At the Runtime level, three execution modes are supported: thread, process, and externally managed Kubernetes.

For the introduction to Pulsar Functions and the walkthrough of the underlying source code, refer to the 01:00-13:20 segment of the playback video.

Learn more about Functions

For this sharing session, the analysis starts from several important stages inside Functions, mainly: how a Pulsar Function is submitted, how the Function Worker schedules it, and how the Pulsar Function is executed.

Submission Workflow

When a function is created or executed, its configuration is exposed to the user as a `FunctionConfig`; by filling in a `FunctionConfig`, the user drives the internal function operations.

A function can be submitted to any worker; the corresponding JSON configuration provides the tenant/namespace/name as well as the input/output settings.
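
Roughly, a submission could look like the sketch below, using the Java admin client. The tenant, namespace, topic names, class name, jar path, and admin URL are placeholders for this example; the `FunctionConfig` builder and `createFunction` call are the Pulsar admin API.

```java
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.functions.FunctionConfig;

import java.util.Collections;

public class SubmitFunctionExample {
    public static void main(String[] args) throws Exception {
        // Build the FunctionConfig the user hands to the worker:
        // tenant/namespace/name plus input/output topics and the entry class.
        FunctionConfig config = FunctionConfig.builder()
                .tenant("public")                                 // example tenant
                .namespace("default")                             // example namespace
                .name("keyword-filter")                           // example function name
                .className("com.example.KeywordFilterFunction")   // class packaged in the jar
                .inputs(Collections.singletonList("persistent://public/default/input-topic"))
                .output("persistent://public/default/output-topic")
                .build();

        try (PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")          // any worker/broker admin endpoint
                .build()) {
            // The jar is uploaded along with the config; the worker later
            // stores it in BookKeeper for subsequent use.
            admin.functions().createFunction(config, "/path/to/keyword-filter.jar");
        }
    }
}
```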

After the configuration is built, an AuthN/AuthZ check is performed to verify whether security- and encryption-related settings were supplied when the function was configured. After that, the format of the `FunctionConfig` and other aspects are validated again.

Finally, the jar packages are stored on the BookKeeper side so they can be fetched for later use.

After the above steps are completed, the submission workflow writes every function to the MetaData Topic and records it in map form, keyed by FQFN.

FQFN stands for Fully Qualified Function Name; its format is a combination of three fields: tenant, namespace, and function name.

The FQFN serves as the key under which metadata is stored: the user-provided `FunctionConfig` is populated into the Function MetaData associated with it. The MetaData Topic Tailer shown in the figure monitors the MetaData Topic in real time and triggers follow-up actions whenever entries are written, updated, or changed.
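
The sketch below illustrates the idea of keeping function metadata in map form keyed by FQFN. It is purely illustrative: the `FunctionMeta` record and class name here are stand-ins, not Pulsar's actual Function MetaData types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative sketch: the worker keeps function metadata in map form,
 * keyed by the FQFN (tenant/namespace/function-name).
 */
public class FunctionMetaDataStore {

    record FunctionMeta(String fqfn, long version, String configJson) {}

    private final Map<String, FunctionMeta> metadata = new ConcurrentHashMap<>();

    static String fqfn(String tenant, String namespace, String name) {
        return tenant + "/" + namespace + "/" + name;   // e.g. "public/default/keyword-filter"
    }

    void put(FunctionMeta meta) {
        metadata.put(meta.fqfn(), meta);
    }
}
```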

Before operations such as create / update / delete actually start inside Functions, the state must be updated. The general process is as follows:

Copy the current state

Merge the update into the state

Increment the current version number

Write the data to the MetaData Topic

The Tailer reads the data back and verifies it

If there is no conflict, the update takes effect

The process above assumes a single function worker; with multiple function workers, conflicts can occur.

When multiple function workers are running, a conflict arises if the same function is updated concurrently. The resolution is a First Writer Wins policy: once the first request is accepted successfully, the other requests are rejected.
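
To make the copy → merge → bump-version → write → verify cycle and the First Writer Wins check concrete, here is a simplified sketch, not the actual Pulsar implementation. Writing to the MetaData Topic and reading it back via the Tailer is modelled here by a single atomic compare-and-set on the version snapshot.

```java
import java.util.concurrent.atomic.AtomicReference;

/**
 * Simplified sketch of the metadata update cycle with First Writer Wins.
 * In Pulsar the write goes to the MetaData Topic and the Tailer reads it
 * back to verify; here an atomic reference stands in for that round trip.
 */
public class MetaDataUpdater {

    record Meta(long version, String configJson) {}

    private final AtomicReference<Meta> current = new AtomicReference<>(new Meta(0, "{}"));

    public boolean update(String newConfigJson) {
        Meta snapshot = current.get();                      // 1. copy the current state
        Meta merged = new Meta(snapshot.version() + 1,      // 2./3. merge and bump the version
                               newConfigJson);
        // 4./5. write and verify: only the writer that still sees the original
        // snapshot succeeds; a concurrent writer that lost the race is rejected.
        return current.compareAndSet(snapshot, merged);     // 6. First Writer Wins
    }
}
```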

Having walked through this process, we can see that the submission workflow has both strengths and weaknesses. On the plus side, a function can be submitted to any worker and the state machine is deterministic. On the minus side, the MetaData Topic data grows without bound, since no compaction-related operations are configured for it, which is a real pity.

For more details on how a Function is submitted, refer to the 13:30-37:00 segment of the playback.

Scheduling Workflow

Once the function worker has the metadata described above, how is the rest of the process scheduled?

The entire scheduling process of the Function Worker goes through the `IScheduler` interface; a simplified sketch of that contract follows.
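
The sketch below only illustrates the shape of such a scheduler contract; the types and method signature are made up for this example and differ from the actual `IScheduler` interface in the Pulsar Functions worker code.

```java
import java.util.List;
import java.util.Set;

/**
 * Illustrative scheduler contract: given the function instances that still
 * need a home, the current assignments, and the set of live workers, produce
 * a new list of assignments.
 */
interface SimpleScheduler {

    record Assignment(String fqfn, int instanceId, String workerId) {}

    List<Assignment> schedule(List<String> unassignedInstances,
                              List<Assignment> currentAssignments,
                              Set<String> workers);
}
```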

The Function Worker switches into "scheduling mode" in the following situations:

CRUD actions: create / update / delete

Worker changes: for example, a new worker joins, the leadership changes, etc.

Although a function can be submitted to any worker, scheduling can only be performed by the worker that holds the "leader" role.

So how is it decided who the leader is?

In previous live streams we discussed Pulsar's message subscription modes, one of which is Failover mode. Pulsar Functions borrows this pattern here.

When each worker comes online, it subscribes to the "Coordination Topic" in failover mode. Under the failover rules, only one active consumer becomes the leader at any one time; by analogy, worker2 in the figure above becomes the leader for that period.
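
This leader election simply piggybacks on the normal client API: every worker subscribes to the same topic with the same subscription name in Failover mode, and the broker keeps exactly one consumer active. The topic and subscription names below are placeholders; the consumer-builder calls are the standard Pulsar Java client API.

```java
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.SubscriptionType;

public class LeaderElectionSketch {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        // Every worker subscribes to the same coordination topic with the same
        // subscription name in Failover mode; the broker keeps exactly one
        // consumer active, and that worker acts as the leader.
        Consumer<byte[]> coordination = client.newConsumer()
                .topic("persistent://public/functions/coordinate")   // placeholder name
                .subscriptionName("participants")                     // placeholder name
                .subscriptionType(SubscriptionType.Failover)
                .subscribe();

        // The active consumer (the leader) goes on to write assignments to the
        // Assignment Topic; standby consumers take over if it fails.
        coordination.close();
        client.close();
    }
}
```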

Once there is a leader worker, it is responsible for writing the assignment data to the Assignment Topic to record the scheduling information.

For more information on the scheduling process, please refer to the playback video from 37:00 to 45:00.

Execution Workflow

So how does the Function actually run once submission and scheduling are complete?

In the figure above, the Assignment Tailer listens for changes to the Assignment Topic and passes the change events to the Function Runtime Manager; the subsequent operations are then carried out through the Spawner.

The Spawner is the abstraction of the execution environment used to run Functions. It also manages the Function's lifecycle and exchanges data with the Function over gRPC.
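
As a purely illustrative sketch, the role the Spawner plays can be pictured as a small lifecycle contract around a runtime instance. The interface and method names below are made up for this example and are not Pulsar's actual classes.

```java
/**
 * Illustrative lifecycle contract for a function instance. In Pulsar the
 * Spawner/Runtime starts the instance in a thread, process, or Kubernetes pod
 * and talks to it over gRPC for status and metrics; this interface only
 * sketches that responsibility.
 */
interface InstanceSpawner {

    /** Start the function instance in the chosen runtime (thread/process/K8s). */
    void start() throws Exception;

    /** Query the running instance, e.g. over a gRPC channel, for its status. */
    String fetchStatus() throws Exception;

    /** Stop the instance and release its resources. */
    void stop() throws Exception;
}
```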

Future Work

There are two follow-up directions for Pulsar Functions. One is to further improve the existing features, for example addressing the previously mentioned unbounded growth of the Function MetaData Topic data (possibly by adding a compaction capability for it), and extending dynamic Runtime selection, among others.

The other is to keep extending Pulsar Functions with new features, such as the Function Mesh currently under development. Stay tuned.

Q&A

Q: Can Pulsar Functions read data directly from a third-party key-value database and write the results back to that key-value database after processing? Or can the data only be written to a Pulsar topic through Pulsar IO, with Pulsar Functions then reading from the topic for processing?

A: Pulsar Functions does not support reading from or writing to an external key-value store directly; this can be achieved with the help of Pulsar IO, i.e. in the way described in the second half of the question.

After reading the above, do you now have a deeper understanding of Pulsar Functions? If you would like to learn more, please keep following for related content. Thank you for your support.
