Deep understanding of ceph mgr

Updated 2025-02-22. From: SLTechnology News&Howtos

Background

Monitoring is the first step in management, so the main function of ceph-mgr is to expose cluster metrics to the outside world. What is monitoring? Suppose users get 5xx responses when visiting a website. A monitoring system collects the number of 5xx responses, stores it, sends an alert to the developers when there are too many, and then provides further information to help them solve the problem, such as logs and metric charts. A monitoring system is therefore a data system made up of collection, storage, analysis (including alerting), and visualization.
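The four stages named above can be sketched in a few lines of Python. This is only an illustration of the 5xx example: the class name ErrorRateMonitor, the window size, and the threshold are all invented here and do not come from any real monitoring system.

```python
from collections import deque


class ErrorRateMonitor:
    """Toy monitor (hypothetical): collect, store, analyze, visualize."""

    def __init__(self, window=5, threshold=10):
        # Storage: keep only the most recent `window` samples.
        self.samples = deque(maxlen=window)
        self.threshold = threshold
        self.alerts = []

    def collect(self, status_codes):
        """Collection: count the 5xx responses seen in one interval."""
        count = sum(1 for code in status_codes if 500 <= code <= 599)
        self.samples.append(count)
        return count

    def analyze(self):
        """Analysis: raise an alert when the windowed total is too high."""
        total = sum(self.samples)
        if total > self.threshold:
            self.alerts.append(
                f"too many 5xx responses: {total} in the last "
                f"{len(self.samples)} intervals"
            )
            return True
        return False

    def render(self):
        """Visualization: a crude text chart, oldest interval first."""
        last = len(self.samples) - 1
        return "\n".join(
            f"t-{last - i}: " + "#" * n for i, n in enumerate(self.samples)
        )
```

A real system splits these stages across separate components, which is exactly how the projects discussed below differ from one another.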

Before ceph-mgr, Ceph and its community made a number of attempts at monitoring.

Calamari

Calamari is a monitoring and management program developed for the enterprise edition of Ceph by Inktank, the company behind Ceph. It was open-sourced after Red Hat acquired the company and is now basically at a standstill. Its basic principle is to use salt to remotely execute Python scripts, which collect data or run commands through the admin socket that each Ceph daemon exposes locally. It consists of the following parts:

Collection: Ceph data through salt + Python scripts; host data through diamond.
Storage: a bundled graphite.
Analysis: none.
Visualization: a custom front end.
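The collection step usually ends with a nested JSON document of counters that has to be flattened into named metrics before it can be stored. Here is a minimal sketch of that flattening, assuming the daemon returns counters as nested JSON (for example via the ceph daemon ... perf dump command). The SAMPLE document is invented for illustration, and flatten_counters is a hypothetical helper, not code from Calamari.

```python
import json


def flatten_counters(tree, prefix=""):
    """Yield (dotted_path, value) for every numeric leaf of a nested dict."""
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into sub-sections, extending the dotted path.
            yield from flatten_counters(value, path)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            yield path, value


# Invented sample, shaped like a counter dump; not real cluster output.
SAMPLE = (
    '{"osd": {"op_r": 12, "op_w": 7, '
    '"op_latency": {"avgcount": 19, "sum": 0.5}}}'
)

metrics = dict(flatten_counters(json.loads(SAMPLE)))
```

Each flattened path can then be handed to whichever storage backend is in use.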

Evaluation:

Many technologies are involved and mixed together, so the burden of understanding and maintaining the project is heavy. They include, but are not limited to: vagrant, salt, django, graphite, node, diamond.

Installation took me about two days, only for me to find that the master branches of the repositories involved could not run together. First, the GitHub Release contains only Ubuntu packages, uploaded in October 2015, and our system is CentOS 7.2, so I could only follow a tutorial to compile and install from source. During installation, package files conflicted when salt installed the packages. During the rpm build, a file could not be found. After downloading the front end, romana, from Release into the corresponding directory, I found that some front-end files were missing, so I had to change its Django view, and I did not know Django well enough to know how. Finally, once the front-end page was displayed, I found that an API the front end requests was not implemented in the back end.

The project has stalled and development resources have shifted to ceph-mgr. The project's GitHub Insights show that commits became scarce after 2017 and that two people account for most of the commits. The first, [jcsp], now mainly works on ceph-mgr.

Cephmetrics

Cephmetrics is built on collectd plug-ins: it collects data from the admin socket, sends it to graphite, and uses grafana for charts.
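The step of sending data to graphite can be sketched as follows, assuming graphite's plaintext protocol, in which each line is "path value timestamp" and Carbon listens on TCP port 2003 by default. The function names and the "ceph" prefix are invented for illustration and are not taken from Cephmetrics.

```python
import socket


def to_graphite_lines(metrics, timestamp, prefix="ceph"):
    """Format a {path: value} mapping as graphite plaintext lines."""
    lines = [
        f"{prefix}.{path} {metrics[path]} {int(timestamp)}"
        for path in sorted(metrics)
    ]
    return "\n".join(lines) + "\n"


def send_to_graphite(payload, host, port=2003):
    """Send the formatted payload to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload.encode("ascii"))
```

For example, to_graphite_lines({"osd.op_r": 12}, 1700000000) produces the single line "ceph.osd.op_r 12 1700000000". In the real project collectd's output plug-in does this work, so a collection plug-in never formats lines itself.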

Evaluation:

The project is divided more clearly than calamari, and each component is a mainstream industry solution: collectd (collection) + graphite (storage and computation) + grafana (visualization). I am more optimistic about this solution.

The collectd plug-in is deployed on every machine, which solves the load-balancing problem of collection. Deploying, upgrading, and managing the plug-in is relatively troublesome and may affect the target host, but these problems are minor and the approach is acceptable.

The dashboard is poor and there is a lot of redundant code. The choice of data shown in the dashboard and its placement are not well thought out: related data is not grouped together, charts are not built around a purpose, and the result feels like a pile of data. Redundant code means that the repository contains ansible deployment code, collectd configuration for collecting system data such as cpu, and other material that has nothing to do with Ceph itself, which increases the cognitive burden.

Ceph_exporter

Ceph_exporter uses librados to fetch data from the ceph monitor and exposes the metrics over HTTP in the format specified by prometheus.
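The prometheus text format itself is simple: a HELP line, a TYPE line, then one line per sample with optional labels. Below is a minimal sketch of rendering it. The metric name ceph_osd_up and its labels are hypothetical, and render_metric is an illustrative helper, not part of ceph_exporter (which is a separate program and does not use this code).

```python
def escape_label(value):
    """Escape backslash, double quote and newline inside a label value."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_metric(name, help_text, samples, metric_type="gauge"):
    """Render one metric family; samples is a list of (labels, value)."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            body = ",".join(
                f'{key}="{escape_label(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{body}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

An exporter serves text like this from an HTTP endpoint, and prometheus scrapes it on a schedule.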

Evaluation:

It is a pure collection component that only needs to be deployed in one place and communicates with the ceph monitor. The model is easy to understand, and I am very optimistic about it.

One drawback is inherent to the prometheus ecosystem itself. Its plug-ins are spread across various repositories in the form of exporters and are deployed separately. With so many exporters, each an independent process, managing them is a big problem. Management includes deployment, monitoring, upgrading, configuration management, and starting and stopping, and each of these is a problem of its own.

In contrast, collectd, as a collection framework, provides common basic functions to all plug-ins, which makes plug-ins very simple to implement:

It provides a runtime environment for plug-ins. A plug-in only needs to provide read (for an input plug-in) or write (for an output plug-in); it does not need to start a process or handle signals.

It provides a configuration system for plug-ins. A plug-in does not have to worry about how to configure itself: as long as the user supplies the settings in the unified format of the collectd configuration file, the plug-in can read them in a unified way.

It provides a logging mechanism for plug-ins. A plug-in can use collectd's logging, so it does not have to worry about supporting log levels, writing to different destinations, and so on.

It provides a data channel for plug-ins. Data flows between plug-ins, and a plug-in does not care where the output goes, whether graphite, influxdb, or opentsdb. It only implements the read callback to collect data, and different output plug-ins are then configured to send the data to different places.

Ceph-mgr

Against this background, Ceph officially developed ceph-mgr, whose main goal is to manage Ceph clusters and provide a unified entry point for the outside world. To learn more about ceph-mgr, you need to understand how it runs.

According to the official documentation, ceph-mgr runs as the executable ceph-mgr. Searching for ceph-mgr in src/CMakeLists.txt turns up add_executable (ceph-mgr ${mgr_srcs}), which suggests (this is a guess) that ceph-mgr is compiled mainly from the files in src/mgr, with the main function in src/ceph_mgr.cc. Those are the relevant files for readers who want to dig deeper. What follows is a summary of how ceph-mgr works.

ceph-mgr is event-driven: it waits for an event, processes the event and returns the result when one arrives, and then goes back to waiting. Its main threads include:

Messenger thread. This is the event-driven main thread. It listens on a port to which the outside world sends input events, receives each event, and dispatches it to the event handlers. ceph-mgr subscribes to topics on the monitor, such as mgrmap and osdmap, and the monitor sends an event to the port the messenger listens on when that data changes. Event handlers include:

MgrStandby. ceph-mgr achieves high availability through standbys, and every running ceph-mgr contains a MgrStandby. MgrStandby does not run in a thread of its own: it lives in the callbacks invoked when the messenger receives a message and in the scheduled tasks run by the timer thread, and it manages the other entities. The only message it handles itself is mgrmap: it takes over when the active mgr dies and steps back when it is no longer the active one. Because the choice of the active mgr is managed by the monitor, the main logic in MgrStandby is relatively simple. It holds a Mgr instance, which is created and stored as an attribute of MgrStandby when a mgrmap is received. When a message arrives and MgrStandby has a Mgr instance, it passes the message to that instance for processing, and its timer function also calls the timer function of Mgr. In this way MgrStandby is the entry point for the main work.

Mgr. As mentioned above, Mgr is attached to MgrStandby and has no thread of its own. It maintains cluster membership information in memory by handling mon_map, fs_map, osd_map, and other events. It manages the ceph-mgr plug-ins, is the source of all data for them, and notifies them when specific events occur. For example, a plug-in's notify function is called back by Mgr.

DaemonServer. An independent thread that listens on the same port as the main messenger (to be confirmed). It is the primary maintainer of cluster metric data and is responsible for performing operations on the cluster, such as telling an OSD to scrub a PG.

Plugin thread. Plugins are written in Python. Each plugin runs in a separate thread, and the function that thread calls is the serve method of the plugin's Python class. A plugin can run an HTTP server in serve to provide services externally. ceph-mgr provides get, get_server, and other functions to plugins, which return data about cluster metrics. For example, the prometheus plug-in exposes Ceph's internal metrics in prometheus format over HTTP, which makes Ceph clusters easier to monitor.

Ceph is written in C++. Ceph calls methods defined by a Python plugin (such as serve), and a Python plugin can call functions defined in C++ (such as get). The mechanism for these calls in both directions is provided by CPython's C API. Its basic principle is as follows.

C++ calls Python. Every Python entity seen from C++ is of type PyObject: modules, functions, classes, and data alike. CPython provides PyImport_Import to get the PyObject of a module by name, the class can then be obtained from the module's attributes with PyObject_GetAttrString, and so on. CPython also provides functions that build the PyObject of a Python value from a C value, such as PyObject* PyString_FromString (const char *). Given a function object and an argument object, the function can be called with PyObject* PyObject_CallObject (), and the resulting PyObject* is converted back to a C++ type.

Python calls C++. Define PyObject* ceph_state_get (PyObject* self, PyObject* args) in C++, and inside the function convert the arguments to C++ types with PyArg_ParseTuple (args, "ss:ceph_state_get", & handle, & what). This implements a Python function. Add it to a method table with PyMethodDef CephStateMethods [] = {{"get", ceph_state_get, METH_VARARGS, "Get a cluster object"}}. Py_InitModule ("ceph_state", CephStateMethods) then defines the functions in the table as attributes of the ceph_state module and registers the module in Python's sys.modules. Because the table registers the function under the name "get", Python calls it as ceph_state.get.
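The threading model described above can be imitated in pure Python to make the control flow concrete. This is a toy model only: the class names mirror the ones in the text, but every method body, the event tuples, and the shape of the mgrmap dictionary are assumptions made for illustration, and Mgr.get stands in for the function that the real ceph-mgr implements in C++.

```python
import queue
import threading


class Plugin:
    """Stand-in for a Python plugin; serve() runs in its own thread."""

    def __init__(self, name, mgr):
        self.name = name
        self.mgr = mgr
        self.seen = []
        self._stop = threading.Event()

    def serve(self):
        # A real plugin might run an HTTP server here; this one just waits.
        self._stop.wait()

    def notify(self, kind):
        # Called back by Mgr; the plugin pulls the data through get().
        self.seen.append((kind, self.mgr.get(kind)))

    def shutdown(self):
        self._stop.set()


class Mgr:
    """Keeps cluster state in memory and notifies plugins; no own thread."""

    def __init__(self, plugin_names):
        self.state = {}
        self.plugins = [Plugin(name, self) for name in plugin_names]
        self.threads = [
            threading.Thread(target=p.serve, daemon=True) for p in self.plugins
        ]
        for thread in self.threads:
            thread.start()

    def get(self, what):
        return self.state.get(what)

    def handle(self, kind, payload):
        self.state[kind] = payload
        for plugin in self.plugins:
            plugin.notify(kind)

    def shutdown(self):
        for plugin in self.plugins:
            plugin.shutdown()
        for thread in self.threads:
            thread.join()


class MgrStandby:
    """Handles mgrmap itself and forwards everything else to Mgr."""

    def __init__(self, name, plugin_names):
        self.name = name
        self.plugin_names = plugin_names
        self.active_mgr = None

    def dispatch(self, kind, payload):
        if kind == "mgrmap":
            self.handle_mgrmap(payload)
        elif self.active_mgr is not None:
            self.active_mgr.handle(kind, payload)

    def handle_mgrmap(self, mgrmap):
        if mgrmap.get("active") == self.name:
            if self.active_mgr is None:
                self.active_mgr = Mgr(self.plugin_names)  # take over
        elif self.active_mgr is not None:
            self.active_mgr.shutdown()  # step back
            self.active_mgr = None


def messenger_loop(events, standby):
    """Wait for an event, dispatch it, wait again; None stops the loop."""
    while True:
        event = events.get()
        if event is None:
            break
        standby.dispatch(*event)
```

Feeding the loop an osdmap event before any mgrmap shows the standby behaviour: the event is dropped because no Mgr instance exists yet. Once a mgrmap naming this daemon as active arrives, later events reach the plugins.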

Author: Li Yichao [Senior Software Development engineer]
