2025-04-03 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to use ES (Elasticsearch) to monitor Redis. The approach described here is simple, fast, and practical.
Preface
Figure: Redis heat ranking
Redis is popular and easy to use, and it holds an important place both in business application systems and in big data. But Redis is also fragile: used carelessly, it causes many problems. Before 2012 I mainly used memcached, and then moved to Redis. Since then I have worked with single-instance mode, master-slave mode, Sentinel mode, proxy mode, and cluster mode. At the company level Redis is rarely used well, and understanding of it tends to be one-sided, which leads to many problems in real projects.
To use Redis well, you need to master it at three levels:
Development level
Architecture level
Operation and maintenance level
Of these, the architecture and operations levels matter most. Most small and medium-sized companies only cover common functionality at the development level; once the data volume grows or the business becomes more complex, architecture and operations problems appear easily. This article explores a Redis monitoring system. The industry already has many mature products, but I find them conventional: they provide only coarse-grained monitoring and are not refined to fit the characteristics of a particular business, so they cannot feed back into architecture and development optimization.
The content of this article will focus on the following issues:
What aspects does a Redis monitoring system cover?
What did we do to build a Redis monitoring system?
How fine-grained should a Redis monitoring system be?
Why use ELK to build a monitoring system?
Requirements background
Project description
The company works in the Internet of Vehicles industry and has millions of real car owners. Its business projects focus on lifestyle services for car owners. To improve system performance, Redis was introduced as the cache middleware, as follows:
The deployment architecture uses Redis Cluster mode
There are dozens of back-end applications, with more than 200 application instances
All application systems share one cache cluster
The cluster has dozens of nodes; counting the disaster recovery backup environment, the number of nodes doubles
The cluster nodes are configured with a large amount of memory
Figure: schematic diagram of Redis cluster architecture and application architecture
Problem description
When the system first went live, everything about Redis was normal. As more application systems and more of their sub-modules were connected, problems began to appear, noticeable both on the application side and on the cluster servers:
Cluster nodes crashing
Cluster nodes hanging (the process is still alive but does not respond)
Some back-end applications getting very slow responses from the cluster
The root cause is gaps at the architecture and operations levels. Monitoring the Redis cluster servers themselves is easy, and Redis provides many commands for it directly. But that only shows common server metrics. It does not allow in-depth analysis, and it says nothing about what is happening inside Redis, especially how business applications use the cluster:
How heavily is the Redis cluster used?
Which applications consume the most Redis memory?
Which applications account for the most Redis accesses?
Which applications use Redis data types inappropriately?
How are Redis resources distributed across application modules?
What are the hot spots in how applications use the Redis cluster?
Monitoring system
The purpose of monitoring is not only to watch Redis itself but also to use Redis better. Traditional monitoring tends to be simple and unsystematic. For Redis, I think monitoring should cover at least three things: the server side, the application side, and joint analysis of the two.
Server:
First, the operating system level: CPU, memory, network IO, disk IO, the processes running on the server, and so on.
Redis runtime information: server information, number of client connections, memory consumption, persistence information, number of keys, master-slave synchronization, command statistics, cluster information, and so on.
The Redis run log, which records important operations such as persistence runs and helps analyze crashes and hangs.
Application side:
On the application side, capture how applications use Redis: which applications and modules occupy the most Redis resources, which consume the most, which misuse Redis, and so on.
Joint analysis:
Joint analysis combines server-side runtime information with application-side usage behavior. For example, a sudden block on the server side may be caused by an application writing a very large cached value, or by a list-type key that holds so much data that operations on it block.
Solution
Why choose the Elastic-Stack technology stack?
Most third-party tools monitor only metrics, and detailed logs still go to ELK (Elasticsearch, Logstash, Kibana). In other words, even with third-party metric monitoring you still need to set up an ELK cluster to view detailed logs.
In addition, the Elastic Stack is well integrated: it handles both metrics and log files, and the path from collection to storage to the final dashboards fits together well, so the barrier to entry is low.
Below is what we did in detail.
Server system
The Elastic Stack family includes Metricbeat, which supports system-level information collection. With simple configuration (the Elastic cluster address and the system metrics module) it can go live, and it creates ready-made system monitoring dashboards in Kibana. This is simple and fast, and ordinary operations staff can do it.
Figure: Metricbeat schematic diagram
The sample configuration of system metrics information collection is as follows:
Server cluster
To collect Redis cluster runtime information, the usual industry approach is to run the Redis info command periodically.
The information obtained by info includes the following:
Server: general information about the Redis server
Clients: client connection information
Memory: memory consumption information
Persistence: RDB and AOF related information
Stats: general statistics
Replication: master / slave replication information
CPU: CPU consumption statistics
Commandstats: Redis command statistics
Cluster: Redis cluster information
Keyspace: database related statistics
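The sections listed above arrive as plain text: a "# Section" header line followed by "field:value" lines. As a minimal sketch, and not the code used in the project, the reply could be parsed into a nested map like this. The class name RedisInfoParser is hypothetical, and the text is assumed to have already been fetched from a node by whatever client is in use.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: parses the text reply of the Redis INFO command. */
final class RedisInfoParser {

    /** Returns section name (lower-cased) -> (field name -> raw string value). */
    static Map<String, Map<String, String>> parse(String infoText) {
        Map<String, Map<String, String>> sections = new LinkedHashMap<>();
        // Used only if fields appear before the first "# Section" header.
        String current = "default";
        for (String rawLine : infoText.split("\r?\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("#")) {
                // Header line such as "# Memory" starts a new section.
                current = line.substring(1).trim().toLowerCase(Locale.ROOT);
                continue;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue; // not a field:value line, skip it
            }
            sections.computeIfAbsent(current, k -> new LinkedHashMap<>())
                    .put(line.substring(0, colon), line.substring(colon + 1));
        }
        return sections;
    }
}
```

Values are kept as strings on purpose, because some fields (for example the keyspace line db0:keys=5,expires=0,avg_ttl=0) are not simple numbers and need a second parsing step.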
Metricbeat also has a Redis module, which likewise relies on the info command, but its implementation has some limitations:
Metricbeat cannot express the master-slave relationships within a Redis cluster.
Some Redis statistics are always cumulative, such as the number of commands processed, so the peak number of commands in a period cannot be obtained.
Metricbeat does not follow changes in Redis cluster state dynamically, such as nodes being added to or taken out of the cluster.
So we took the CacheCloud product (open-sourced by the Sohu team) as a reference and designed and developed our own Agent. It periodically collects information from the Redis cluster, does some simple calculation of statistical values internally, converts the result to JSON, and writes it to a local file. Logstash then collects the file and sends it to Elasticsearch.
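The article does not show the Agent's code. As a rough sketch of the "simple calculation" step, one way to turn a cumulative counter such as total_commands_processed (a real field in the INFO stats section) into a per-interval value is to remember the previous sample for each node and subtract. The class CounterDelta, its method names, and the JSON field names below are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: converts cumulative INFO counters into per-interval deltas. */
final class CounterDelta {

    // Last seen value per "node|counter".
    private final Map<String, Long> previous = new HashMap<>();

    /**
     * Returns the increase since the previous sample for this node and counter.
     * Returns 0 for the first sample, or when the counter went backwards
     * (for example after a node restart), because no valid delta exists then.
     */
    long delta(String node, String counter, long currentValue) {
        Long last = previous.put(node + "|" + counter, currentValue);
        if (last == null || currentValue < last) {
            return 0L;
        }
        return currentValue - last;
    }

    /**
     * Formats one sample as a JSON line for a log file.
     * Assumes node and counter contain no characters that need JSON escaping.
     */
    static String toJsonLine(String node, String counter, long delta) {
        return String.format("{\"node\":\"%s\",\"counter\":\"%s\",\"delta\":%d}",
                node, counter, delta);
    }
}
```

With per-interval deltas stored in Elasticsearch, the peak number of commands in a time window becomes a plain max aggregation over the delta field, which is what the cumulative value alone cannot provide.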
Figure: schematic diagram of Redis server running information collection architecture
Server log
Collecting the Redis server run log is easy with Filebeat from the Elastic Stack family. Filebeat has a Redis module; configure the Elastic server address and the log file path and it works.
Figure: server log collection process
Redis running log collection configuration:
Application side
Application-side information collection is the most important part of the whole Redis monitoring system, and also the hardest to implement, with the longest pipeline. First, we modified the jedis source code (our technology stack is Java) to add instrumentation (buried-point) code, recompiled it, and referenced the build in the application projects. Every command an application issues against the Redis cluster is captured, and the key information is recorded and written to a local file.
Figure: Redis application behavior collection architecture diagram
The format of the data collected by the application is as follows:
Figure: data collected by the application side
Jedis modification:
The modified jedis records the following information:
R_host: the address and port of the Redis cluster node being accessed, in the form ip:port
R_cmd: the type of command executed, such as get, set, hget, hset
R_start: the start time of command execution
R_cost: time consumed
R_size: the size of the value that is read or written
R_key: the key name
R_keys: the key split into segments, as an array of unlimited length. Because all application systems share one cluster, key names follow a standard and are divided by a special symbol, for example "application name _ system module _ dynamic variable _ xxx". This mainly makes it easy for us to tell them apart.
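To make the record layout concrete, here is a small sketch of one captured command as a Java object, including the split of the key into segments. It is an illustration only: the class RedisAccessRecord is hypothetical, the field names are written in lower case as an assumption, and the separator "_" is taken from the naming example above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical record of one Redis command as seen by the application side. */
final class RedisAccessRecord {
    final String host;      // ip:port of the node that served the command
    final String cmd;       // get, set, hget, hset, ...
    final long startMillis; // command start time, epoch milliseconds
    final long costMillis;  // elapsed time
    final long sizeBytes;   // size of the value read or written
    final String key;       // full key name
    final List<String> keys; // key split into segments

    RedisAccessRecord(String host, String cmd, long startMillis,
                      long costMillis, long sizeBytes, String key) {
        this.host = host;
        this.cmd = cmd;
        this.startMillis = startMillis;
        this.costMillis = costMillis;
        this.sizeBytes = sizeBytes;
        this.key = key;
        this.keys = splitKey(key);
    }

    /** Splits "app_module_variable_xxx" into its segments (separator assumed to be "_"). */
    static List<String> splitKey(String key) {
        return Arrays.asList(key.split("_"));
    }

    /** Minimal JSON string quoting; handles only backslash and double quote. */
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders the record as one JSON line suitable for a log file. */
    String toJson() {
        StringBuilder keysJson = new StringBuilder("[");
        for (int i = 0; i < keys.size(); i++) {
            if (i > 0) {
                keysJson.append(',');
            }
            keysJson.append(quote(keys.get(i)));
        }
        keysJson.append(']');
        return "{\"r_host\":" + quote(host)
                + ",\"r_cmd\":" + quote(cmd)
                + ",\"r_start\":" + startMillis
                + ",\"r_cost\":" + costMillis
                + ",\"r_size\":" + sizeBytes
                + ",\"r_key\":" + quote(key)
                + ",\"r_keys\":" + keysJson + "}";
    }
}
```

Storing the segments as an array lets the first segment serve as the application name and the second as the module name when aggregating later.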
jedis is modified in the following places:
Connection.java: at the start of a command, record the execution start time; at the end, record the end time, time consumed, and so on, and write them to the log stream.
JedisClusterCommand.java: at the point where the key is obtained, record it so that application key usage can be analyzed later.
Connection.java is instrumented in two places:
Figure: where the code is buried in the class Connection.java file
Figure: where the code is buried in the class Connection.java file
JedisClusterCommand.java is instrumented in one place:
Figure: buried point code of class JedisClusterCommand file
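The actual changes to the jedis source appear only in the figures above and are not reproduced here. The general pattern they describe, recording a start time before the command and the cost after it, can be sketched independently of jedis as a wrapper. Everything in this block is hypothetical: TimedCommand is not a jedis class, and the output format is chosen only for illustration.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical illustration of start/end timing around one command call. */
final class TimedCommand {

    /**
     * Runs the given call, then reports command name, key, start time and cost
     * to the sink. The sink stands in for the log stream; in a real setup it
     * would be a logger.
     */
    static <T> T run(String cmd, String key, Supplier<T> call, Consumer<String> sink) {
        long start = System.currentTimeMillis(); // wall-clock start time for the record
        long startNanos = System.nanoTime();     // monotonic clock for measuring cost
        try {
            return call.get();
        } finally {
            // Report even when the call throws, so failures are visible as well.
            long costMillis = (System.nanoTime() - startNanos) / 1_000_000L;
            sink.accept("r_cmd=" + cmd + " r_key=" + key
                    + " r_start=" + start + " r_cost=" + costMillis);
        }
    }
}
```

Using System.nanoTime for the cost avoids wrong durations when the wall clock is adjusted while a command is running.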
Logback modification:
All applications use logback to write log files. To make the data more precise, the log output also needs to carry some information about the application side, as shown below:
App_ip: the IP address of the server on which the application is deployed
App_host: the name of the server on which the application is deployed
Customize a Layout to automatically obtain the IP address and server name of the application side:
Figure: custom Layout of Logback
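The custom Layout itself is only shown in the figure. The part that looks up the IP address and server name needs nothing beyond the JDK, and could look roughly like the sketch below. The class AppHostInfo is hypothetical, and the lower-case field names in the prefix are an assumption; a custom logback Layout could call detect() once at startup and prepend the prefix to each line.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: resolves the local IP address and host name once. */
final class AppHostInfo {
    final String appIp;
    final String appHost;

    AppHostInfo(String appIp, String appHost) {
        this.appIp = appIp;
        this.appHost = appHost;
    }

    /** Looks up the local host; falls back to the loopback address if it cannot be resolved. */
    static AppHostInfo detect() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return new AppHostInfo(local.getHostAddress(), local.getHostName());
        } catch (UnknownHostException e) {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            return new AppHostInfo(loopback.getHostAddress(), loopback.getHostName());
        }
    }

    /** Text to prepend to each log line. */
    String prefix() {
        return "app_ip=" + appIp + " app_host=" + appHost + " ";
    }
}
```

On servers with several network interfaces, InetAddress.getLocalHost may not return the address that other machines use to reach the application, so the result is worth checking in each environment.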
App configuration:
App configuration is the final step; it mainly outputs the instrumentation log data. Configure it in the application's logback.xml file:
Figure: configure the application log file logback.xml
Log collection:
Logstash collects the application logs: configure the log directory as input and point the output at the Elastic cluster. This completes the log collection part of the monitoring system.
Log analysis
Redis server log analysis is relatively simple: these are conventional metrics, and once the key charts are created, problems are easy to spot. The focus here is log analysis on the application side.
Figure: some behavior diagrams using Redis on the application side
After the ELK monitoring system was launched, we observed and analyzed continuously for two weeks and obtained some monitoring results, such as:
Some values on the application side are too large, actually exceeding 1 MB. Accessing such values takes a long time and causes serious blocking.
Some applications actually use Redis as a database
Some use the List type as a message queue, accessing hundreds of thousands of entries at a time
Some applications operate on the cluster very frequently, accounting for more than half of all operations
There are many more, so we won't describe them one by one.
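In the setup described here, these findings come from charts and aggregations in Kibana. Purely to illustrate the logic behind two of them, values over 1 MB and one application's share of all operations, the same checks can be written over a list of collected records. The class UsageAnalysis is hypothetical, and the application name is assumed to be the first "_"-separated segment of the key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of two checks over collected access records. */
final class UsageAnalysis {
    static final long ONE_MB = 1024L * 1024L;

    /** One access: the key, the value size, and the application derived from the key. */
    static final class Access {
        final String app;
        final String key;
        final long sizeBytes;

        Access(String key, long sizeBytes) {
            this.key = key;
            this.sizeBytes = sizeBytes;
            int sep = key.indexOf('_');
            // First key segment is taken as the application name (assumption).
            this.app = sep > 0 ? key.substring(0, sep) : key;
        }
    }

    /** Keys whose value size exceeds 1 MB, in input order. */
    static List<String> oversizedKeys(List<Access> records) {
        List<String> result = new ArrayList<>();
        for (Access a : records) {
            if (a.sizeBytes > ONE_MB) {
                result.add(a.key);
            }
        }
        return result;
    }

    /** Fraction of all accesses issued by each application, sorted by application name. */
    static Map<String, Double> callShareByApp(List<Access> records) {
        Map<String, Long> counts = new TreeMap<>();
        for (Access a : records) {
            counts.merge(a.app, 1L, Long::sum);
        }
        Map<String, Double> share = new TreeMap<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            share.put(e.getKey(), e.getValue() / (double) records.size());
        }
        return share;
    }
}
```

An application whose share is above 0.5 corresponds to the "more than half of all operations" finding above.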
Follow-up plan
The monitoring system is the architect's eyes. With it, an optimization and transformation plan for Redis is easy to work out:
Application side: all misuse should be corrected.
Server side: split by application data into dedicated clusters for particular applications or scenarios.
Developers: any new business module that needs to connect to Redis must be reported to the architects for review.
At this point you should have a better understanding of how to use ES to monitor Redis. The best way to consolidate it is to try it in practice.