Source: Redislabs
Authors: Pieter Cailliau, Luca Antiga
Translation: Kevin
Introduction
Today we released a preview version of RedisAI, developed in collaboration with [tensor]werk. RedisAI is a Redis module that can store and serve tensors and execute deep learning tasks. In this blog post, we introduce the features of this new module and explain why we think it can disrupt machine learning (ML) and deep learning (DL) solutions.
There are two main motivations behind RedisAI: first, moving data to the host that executes the AI model is expensive and hurts the real-time experience; second, serving models has always been a DevOps challenge in the AI field. RedisAI was built so that users can serve, update, and combine their models without moving data out of Redis.
Data locality matters
To show why data locality matters when running machine learning and deep learning models, consider a chatbot as an example. Chatbots typically use a recurrent neural network (RNN) with a sequence-to-sequence (seq2seq) architecture to answer user questions. More advanced models use two input vectors and two output vectors, and preserve the context of the conversation as a numerical intermediate-state vector. The model takes the user's last message as input, the intermediate state represents the history of the conversation, and its output is the response to the user's message plus the new intermediate state.
To support this kind of stateful interaction, the intermediate state must be saved in a database, which makes Redis + RedisAI a very good fit. Below we compare the traditional approach with the RedisAI approach.
1. The traditional approach
A chatbot is typically built as a Flask application (or some similar solution), perhaps integrated with Spark. On receiving a user message, the server must fetch the intermediate state from Redis. Since Redis has no native data type for tensors, the state must first be deserialized; then, after the recurrent neural network (RNN) runs, the new intermediate state must be serialized and saved back to Redis.
Given the time complexity of the RNN, the CPU cost of serialization/deserialization, and the substantial network overhead, we need a better solution to guarantee the user experience. The sketch below illustrates the round trip.
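As a concrete illustration, here is a minimal sketch of that round trip using redis-py and NumPy; run_rnn is a hypothetical stand-in for the actual seq2seq model, and the key names are made up:

import numpy as np
import redis

r = redis.Redis()

def run_rnn(message_vec, state):
    # Hypothetical stand-in for a seq2seq RNN step: returns (reply, new state).
    new_state = np.tanh(state + message_vec.mean())
    return new_state[:16], new_state

def handle_message(user_id, message_vec):
    # 1. Fetch the serialized intermediate state over the network.
    blob = r.get("chat:state:%s" % user_id)
    # 2. Deserialize by hand: Redis has no native tensor type.
    state = np.frombuffer(blob, dtype=np.float32) if blob else np.zeros(512, dtype=np.float32)
    # 3. Run the RNN inside the application process.
    reply, new_state = run_rnn(message_vec, state)
    # 4. Serialize the new state and ship it back over the network.
    r.set("chat:state:%s" % user_id, new_state.astype(np.float32).tobytes())
    return reply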
2. The RedisAI approach
RedisAI provides a data type called Tensor, which lets you manipulate tensors with a small set of simple commands from any mainstream Redis client (see the sketch below). For the runtime aspects of a model, it also provides two further data types: Models and Scripts.
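For example, a tensor can be written and read with the AI.TENSORSET and AI.TENSORGET commands. The following sketch uses redis-py's generic execute_command; the key names are illustrative, and the exact reply layout varies across RedisAI versions:

import redis

r = redis.Redis()

# Store a 2x2 FLOAT tensor from explicit values (BLOB accepts raw bytes instead).
r.execute_command("AI.TENSORSET", "chat:state:42", "FLOAT", 2, 2,
                  "VALUES", 1.0, 2.0, 3.0, 4.0)

# Read the values back; no client-side serialization code is needed.
print(r.execute_command("AI.TENSORGET", "chat:state:42", "VALUES"))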
Model commands are parameterized by the device they run on (CPU or GPU) and backend-specific settings. RedisAI has built-in support for mainstream machine learning frameworks such as TensorFlow and PyTorch, and will soon support the ONNX Runtime backend, adding support for traditional machine learning models as well. Crucially, the command that runs a Model is agnostic to its backend:
AI.MODELRUN model_key INPUTS input_key1 ... OUTPUTS output_key1 ...
This lets users decouple the choice of backend (usually made by data scientists) from the application services: swapping in a new model is as simple as setting a new key. RedisAI manages all model-run requests in a processing queue and executes them on separate threads, so Redis can keep responding to other requests as usual. Script commands can also run on CPU or GPU and let users manipulate tensors with TorchScript, a Python-like domain-specific language for tensor operations. Scripts let users preprocess input data before a model runs and post-process results afterwards, for example to ensemble different models and improve performance.
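Putting the three data types together, a minimal sketch of the flow might look as follows; the model file, graph node names, and keys are assumptions made for illustration, and AI.SCRIPTSET's exact argument order has varied across RedisAI versions:

import redis

r = redis.Redis()

# Register a pre-exported TensorFlow graph under a key, bound to a device.
with open("model.pb", "rb") as f:
    r.execute_command("AI.MODELSET", "chat:model", "TF", "CPU",
                      "INPUTS", "input", "state_in",
                      "OUTPUTS", "output", "state_out", f.read())

# Run the model inside Redis: tensor keys in, tensor keys out, backend-agnostic.
r.execute_command("AI.MODELRUN", "chat:model",
                  "INPUTS", "chat:msg:42", "chat:state:42",
                  "OUTPUTS", "chat:reply:42", "chat:state:42")

# A TorchScript script for post-processing the model output.
script = """
def postprocess(reply):
    return reply.softmax(0)
"""
r.execute_command("AI.SCRIPTSET", "chat:post", "CPU", script)
r.execute_command("AI.SCRIPTRUN", "chat:post", "postprocess",
                  "INPUTS", "chat:reply:42", "OUTPUTS", "chat:reply:42")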
[Figure: overview of RedisAI data types and backends]
In the future we plan to support batched command execution through a DAG command, which will let users execute multiple RedisAI commands as one atomic operation, e.g., running instances of a model on different devices and averaging their predictions with a script (a hypothetical sketch of such a pipeline follows below). With the DAG command, the computations can run in parallel, followed by an aggregation step. For a fuller and deeper feature list, visit redisai.io. The new architecture can be summarized as: [Figure: simplified architecture]
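Since the DAG command was still on the roadmap when this was written, the following is only a hypothetical sketch of what an atomic ensemble pipeline could look like; the eventual syntax may differ:

import redis

r = redis.Redis()

# Hypothetical: run two replicas of a model and average their predictions
# with a script, as one atomic, parallelizable operation.
r.execute_command(
    "AI.DAGRUN",
    "LOAD", 1, "chat:msg:42",          # read this key from the keyspace
    "PERSIST", 1, "chat:reply:42",     # write only this key back
    "|>", "AI.MODELRUN", "model:a", "INPUTS", "chat:msg:42", "OUTPUTS", "out:a",
    "|>", "AI.MODELRUN", "model:b", "INPUTS", "chat:msg:42", "OUTPUTS", "out:b",
    "|>", "AI.SCRIPTRUN", "ensemble", "mean", "INPUTS", "out:a", "out:b",
    "OUTPUTS", "chat:reply:42",
)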
Model serving can be simpler
Writing code in a Jupyter notebook and deploying it in a Flask application is not the best production solution. How can users be sure their resources are used optimally? What happens to the intermediate state of the chatbot above if a user's host goes down? Users may end up reinventing existing Redis functionality to work around these problems. And because the complexity of bolted-together solutions is often higher than expected, stubbornly sticking with them can be very challenging.

As an enterprise-grade data store, Redis with RedisAI supports the Tensors, Models, and Scripts data types that deep learning needs, achieving deep integration between Redis and AI models. If you need to scale the model's compute capacity, you simply scale out the Redis cluster, so users can serve as many models as needed in production while reducing infrastructure and overall costs. Finally, RedisAI fits well into the existing Redis ecosystem: users can run Scripts to pre- and post-process user data, use RedisGears to transform data structures correctly, and use RedisGraph to keep the data up to date.
Conclusion and follow-up plan
1. In the short term, we want RedisAI to reach a stable state as soon as possible while supporting the three mainstream backends (TensorFlow, PyTorch, and ONNX Runtime).
2. We want to load these backends dynamically, so users can choose exactly the backends they need; for example, this would allow using TensorFlow Lite for edge use cases.
3. We plan to implement auto-batching, which can automatically merge queued requests for the same model.
4. RedisAI will collect runtime statistics for each model, which can be used to measure model performance.
5. Complete the DAG feature described above.