How to Import Data from an RRD Database into MySQL

2025-04-05 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31 Report --

This article shows you how to import the data from an RRD database into MySQL. The content is concise and easy to follow, and I hope you get something out of the detailed walkthrough.


First, a brief introduction to the RRD database and RRDTOOL

RRD stands for Round Robin Database. Its design concept is round-robin storage: after one cycle (whose length you can define yourself), new data overwrites the oldest data. An RRD database is therefore suited to dynamic data that does not need long-term storage. Because old data is periodically overwritten, the size of the database is essentially fixed and does not grow over time.
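The fixed-size, overwrite-on-wrap behavior can be sketched with a circular buffer. This is only a minimal illustration of the storage concept, not how rrdtool is actually implemented:

```python
from collections import deque

# A ring buffer with room for 4 samples: once it is full, each new
# sample silently displaces the oldest one, so the size never grows.
ring = deque(maxlen=4)
for sample in [10, 20, 30, 40, 50, 60]:
    ring.append(sample)

print(list(ring))  # only the four most recent samples survive: [30, 40, 50, 60]
```

This is exactly why an RRD file's size is fixed: the number of slots is decided at creation time, and writes wrap around.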

RRDTOOL is free software developed by Tobias Oetiker that uses RRD as its storage format. It provides a number of tools for operating on RRD databases, including creating, updating, querying, and generating graphs, and it also offers APIs in many languages for ease of use.

Ganglia is a distributed monitoring system that uses RRD databases for data storage and visualization. The Hadoop package includes Ganglia-related hooks; by modifying some parameters and doing some configuration in Ganglia, you can monitor a Hadoop cluster. The data for each monitored attribute is stored in its own RRD database.

Second, importing the data into MySQL

There are also situations where you may want to keep the data collected by rrdtool for a long time for analysis. The data in an RRD database is continuously updated; it can retain long-term data, but only at reduced precision. For example, suppose an RRD database has a step of 15 seconds: every 15 seconds a new value (such as memory usage) is stored and an old one is overwritten. Such a database holds 5761 rows, i.e. one day of data at 15-second resolution (86400 seconds / 15, plus one), and always contains the most recent day. Higher-step values are then consolidated from these values; for example, 360-second averages can be computed from the 15-second samples and stored at a 360-second step, so the same number of rows covers a longer interval, about 24 days. By analogy, you can keep a year of data at one-day resolution, but the precision is then only one day, and the old 15-second samples have been overwritten. If you want to keep all of the fine-grained data, you need to export it regularly with a script.
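The consolidation described above, averaging 15-second samples into 360-second values, can be sketched as follows. The sample numbers are hypothetical; inside rrdtool this is done by its round-robin archives (RRAs):

```python
# 24 hypothetical 15-second samples (6 minutes of raw data).
samples = [float(i) for i in range(24)]

step = 15                      # seconds between raw samples
target = 360                   # desired consolidated step
per_bucket = target // step    # 24 raw samples feed one consolidated value

# Average each full bucket of 24 raw samples into one 360-second value.
consolidated = [
    sum(samples[i:i + per_bucket]) / per_bucket
    for i in range(0, len(samples), per_bucket)
]
print(consolidated)  # one 360-second average: [11.5]
```

Each consolidation step trades precision for coverage: one row now represents 360 seconds instead of 15, which is why the same row count can span 24 days.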

This is convenient to do on Linux. Perl, Python, Lua, or Ruby are all good choices, and so is the shell. Then use crond to schedule the script so it runs regularly at a fixed time. The following is sample code in Python:

(Note: my Python is average; I basically wrote this code while reading books, so there may be plenty of problems. Corrections are welcome.)

The first script does the initialization, creating the databases and the corresponding tables:

import os
import string

import MySQLdb

root = "/var/lib/ganglia/rrds/hap-clu"
dirs = os.listdir(root)

# "." and "-" are not allowed in MySQL identifiers, so map both to "_"
map1 = string.maketrans('.', '_')
map2 = string.maketrans('-', '_')

conn = MySQLdb.connect(host='localhost', user='root', passwd='123456')
cursor = conn.cursor()

for onedir in dirs:
    dbname = onedir.translate(map1).translate(map2)
    cursor.execute("create database if not exists " + dbname)
    conn.commit()
    conn.select_db(dbname)
    print "DB: " + dbname + " ..."
    files = os.listdir(root + "/" + onedir)
    for onefile in files:
        tablename = onefile[:-4].translate(map1)  # strip the ".rrd" suffix
        if dbname == "__SummaryInfo__":
            cursor.execute("create table if not exists " + tablename
                           + " (time_id int not null primary key, value varchar(30), num varchar(30))")
        else:
            cursor.execute("create table if not exists " + tablename
                           + " (time_id int not null primary key, value varchar(30))")
        conn.commit()
        # print "CREATE TABLE " + tablename
    print "CREATE DATABASE " + dbname
cursor.close()

Several points need explanation here:

1. The storage directory: this is Ganglia's default directory, though it can be changed, and other applications will differ. The final path component, hap-clu, is the cluster's name. Inside this directory, each node has its own subdirectory, usually named after its IP address, plus a final summary directory. Correspondingly, the script creates one database per directory (node), with one table per attribute.

"." is not allowed in naming conventions for 2.MYSQL databases and tables. And "-", so the corresponding database name and table name should be converted accordingly. What is used here is

Translate function.
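In Python 3 the same name sanitization is done with str.maketrans; the scripts in this article are Python 2 and use the string module instead. A small sketch of the conversion:

```python
# Python 3 version of the name sanitization: MySQL database/table
# names may not contain "." or "-", so both characters become "_".
name_map = str.maketrans({'.': '_', '-': '_'})

dbname = "10.0.0.1".translate(name_map)                # node directory -> database name
tablename = "swap_free.rrd"[:-4].translate(name_map)   # RRD file name -> table name

print(dbname)     # 10_0_0_1
print(tablename)  # swap_free
```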

3. I originally thought this script would only need to run once, but in actual use the number of databases and tables can grow. For example, when a node is newly added, a database must be created for it in time; and for existing nodes, the data for some attributes may only be detected later. In my case, the swap statistics (and their RRD database) only appeared after the system had been running for a while. I don't know whether that is a configuration problem or normal behavior, but to keep things running smoothly, this script runs every day at the same time as the script that inserts the data, just before it.

The script that inserts the data:

import os
import string

import MySQLdb
import rrdtool
# from xml.etree.ElementTree import ElementTree

# working directory
root = "/var/lib/ganglia/rrds/hap-clu"
dirs = os.listdir(root)

# mysql table name limit: "." and "-" are not allowed
map1 = string.maketrans('.', '_')
map2 = string.maketrans('-', '_')

conn = MySQLdb.connect(host='localhost', user='root', passwd='123456')
cursor = conn.cursor()

for onedir in dirs:
    dbname = onedir.translate(map1).translate(map2)
    conn.select_db(dbname)
    print "DB: " + dbname + " ..."
    files = os.listdir(root + "/" + onedir)
    os.chdir(root + "/" + onedir)
    for onefile in files:
        # it seems that all consolidation functions here are AVERAGE
        tablename = onefile[:-4].translate(map1)
        data = rrdtool.fetch(onefile, "AVERAGE")
        firsttime = data[0][0]
        count = 0
        while count < 5761:
            time = firsttime + 15 * count
            value = data[2][count][0]
            if value is None:
                count += 1
                continue
            if dbname == "__SummaryInfo__":
                num = data[2][count][1]
                fvalue = [time, str(value), str(num)]
                try:
                    cursor.execute("insert into " + tablename + " values (%s, %s, %s)", fvalue)
                except MySQLdb.IntegrityError:
                    pass  # this timestamp was already inserted on an earlier run
            else:
                fvalue = [time, str(value)]
                try:
                    cursor.execute("insert into " + tablename + " values (%s, %s)", fvalue)
                    # print "OK " + str(count)
                except MySQLdb.IntegrityError:
                    pass
            count += 1
        conn.commit()
        print "UPDATING TABLE " + tablename
cursor.close()

Description:

1. Python has an rrdtool module, so the corresponding commands can be called directly as functions, and the result comes back as lists and tuples that are easy to traverse. The alternative is to export the RRD to XML by calling an external command (RRDTOOL has this feature built in). The advantage is that the XML data is easy to read; the disadvantage is that it is cumbersome and inefficient, and you still have to parse the XML.
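rrdtool.fetch returns a tuple shaped like ((start, end, step), column_names, rows), and the timestamp of each row is reconstructed as start + i * step, which is what the insert script above relies on. A sketch using a hand-made result in that shape (the numbers are hypothetical, so no rrdtool installation is needed):

```python
# Fake fetch() result in rrdtool's shape: ((start, end, step), names, rows).
data = ((1000, 1045, 15), ('value',), [(0.5,), (None,), (0.7,)])

(start, end, step), names, rows = data
points = []
for i, row in enumerate(rows):
    value = row[0]
    if value is None:        # gaps in the RRD come back as None; skip them
        continue
    points.append((start + i * step, value))

print(points)  # [(1000, 0.5), (1030, 0.7)]
```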

2. count is the number of rows of data stored in the RRD; for convenience it is hard-coded to the default value here. Strictly speaking, you should first query the database's structure with rrdtool info, obtain this value, and then call rrdtool.fetch, which returns all of the stored values.

3. About commit: when I first started I was not familiar with the API and left this call out, with the result that no data was imported at all. Then I committed after every single insert, which was so slow that one update took about a day, making it useless. Moving the commit to the end (once per table) made it fast.

4. Because the insertion frequency differs from the RRD update frequency, and to guarantee continuity of the data (nothing lost), inserts run more often than updates, which produces a lot of duplicate data. The primary key (the timestamp, in UNIX seconds) together with IntegrityError is used to skip data that has already been inserted. One question I considered is how slow inserts become once the table holds a large number of rows: a single table gains about 5700 rows a day, roughly 170,000 a month, and over 2 million in a year. So far, with 50,000 rows in a table, insertion speed has not noticeably slowed, so I plan to keep running it and observe. If it gets too slow, another approach will be needed.
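The duplicate-skipping idea, a timestamp primary key plus catching the integrity error, can be demonstrated with the stdlib sqlite3 module. The article's scripts use MySQLdb; sqlite3 is used here only so the example is self-contained and runnable:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table mem (time_id int not null primary key, value text)")

# The third row repeats a timestamp, as happens when the script re-runs
# over an interval it already imported.
rows = [(1000, "0.5"), (1015, "0.6"), (1000, "0.5")]
inserted = 0
for row in rows:
    try:
        cur.execute("insert into mem values (?, ?)", row)
        inserted += 1
    except sqlite3.IntegrityError:
        pass   # timestamp already present: skip it, like the MySQLdb version
conn.commit()  # one commit at the end, not one per insert

print(inserted)  # 2
```

The primary key makes the database itself reject duplicates, so the script never has to check what was already imported.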

The above is how to import the data from an RRD database into MySQL.
