2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com) 05/31 Report--
This article shows how to import data from an RRD database into MySQL. The content is concise and easy to follow; I hope you get something out of the detailed walkthrough.
Import data from an RRD database into MySQL
I. A brief introduction to the RRD database and RRDTOOL
RRD stands for Round Robin Database. Its design concept is round-robin storage: after one cycle (whose length you can define yourself), new data overwrites the oldest data. The RRD format is therefore suited to dynamic data that does not need long-term retention. Because old data is periodically overwritten, the database file has an essentially fixed size and does not grow over time.
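The round-robin idea can be illustrated with a few lines of Python. This is a hypothetical sketch of the storage concept, not part of RRDTOOL itself: a fixed-size buffer where each new sample lands in slot `count mod size`, so writes overwrite the oldest data and storage never grows.

```python
class RoundRobinBuffer:
    """Fixed-size store: slot = (sample index) mod size, so new data
    overwrites the oldest data and the total size never grows."""

    def __init__(self, size):
        self.size = size
        self.slots = [None] * size
        self.count = 0  # total samples ever written

    def update(self, value):
        self.slots[self.count % self.size] = value
        self.count += 1

    def fetch(self):
        """Return the retained samples oldest-first."""
        if self.count <= self.size:
            return self.slots[:self.count]
        start = self.count % self.size
        return self.slots[start:] + self.slots[:start]

buf = RoundRobinBuffer(4)
for v in range(6):       # write 6 samples into 4 slots
    buf.update(v)
print(buf.fetch())       # [2, 3, 4, 5] -- the two oldest samples are gone
```

The buffer always holds exactly `size` values, which is why an RRD file's size is fixed no matter how long it runs.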
RRDTOOL is free software developed by Tobias Oetiker that uses RRD as its storage format. It provides a number of tools for operating on RRD databases, including creating, updating, querying, and generating graphs, and it also offers APIs in many languages for convenience.
Ganglia is a distributed monitoring system that uses RRD databases for data storage and visualization. The Hadoop distribution includes Ganglia-related configuration, so a Hadoop cluster can be monitored by adjusting a few parameters and settings in Ganglia. The data for each monitored attribute is stored in its own RRD database.
II. Import data into MySQL
There are situations where you may want to keep the data collected by rrdtool for long-term analysis. The data in an RRD database is continuously updated; although it can retain long-term data, the precision degrades. For example, if the step of an RRD database is 15 seconds, a new value (such as memory usage) is stored every 15 seconds and an old value is overwritten. Such an RRD stores 5761 of these values (one day's worth at 15-second steps), so over time it always holds only the most recent day of data. Higher-step values are then computed from these: for example, the 15-second samples can be averaged into 360-second values, which are stored at a 360-second step. The same number of rows then covers a longer interval, about 24 days. By analogy, data can be kept for a year at a one-day step, but by then the precision is only one day, and the old 15-second samples have been overwritten. If you want to keep all of this data, you need to import it regularly with a script.
This is convenient to do on Linux. Perl, Python, Lua, or Ruby are good choices, and so is shell. Then use crond to schedule it to run regularly at a set time. Below is sample code in Python.
(Note: my Python is average; I basically wrote this code while reading books, so there may be plenty of problems. Corrections are welcome.)
The first script does the initialization, creating the databases and the corresponding tables:
import os
import string

import MySQLdb

root = "/var/lib/ganglia/rrds/hap-clu"
dirs = os.listdir(root)
# MySQL identifiers may not contain '.' or '-', so map both to '_'
map1 = string.maketrans('.', '_')
map2 = string.maketrans('-', '_')
conn = MySQLdb.connect(host='localhost', user='root', passwd='123456')
cursor = conn.cursor()
for onedir in dirs:
    dbname = onedir.translate(map1).translate(map2)
    cursor.execute("create database if not exists " + dbname)
    conn.commit()
    conn.select_db(dbname)
    # print onedir
    print "DB: " + dbname + " ..."
    files = os.listdir(root + "/" + onedir)
    for onefile in files:
        tablename = onefile[:-4].translate(map1)   # strip the ".rrd" suffix
        if dbname == "__SummaryInfo__":
            cursor.execute("create table if not exists " + tablename
                           + " (time_id int not null primary key, value varchar(30), num varchar(30))")
        else:
            cursor.execute("create table if not exists " + tablename
                           + " (time_id int not null primary key, value varchar(30))")
        conn.commit()
        # print "CREATE TABLE " + tablename
    print "CREATE DATABASE " + dbname
cursor.close()
A few explanations are needed here:
1. The storage directory: this is Ganglia's default directory, though it can be changed, and other applications will differ. The trailing hap-clu is the cluster's name. Inside this directory each node has its own subdirectory, usually named by IP address, plus a summary directory at the end. Correspondingly, the script creates one database per directory (node), with one table per attribute.
2. The naming rules for MySQL databases and tables do not allow "." or "-", so the database and table names must be converted accordingly. The translate function is used for this.
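The scripts above use the Python 2 string.maketrans; in Python 3 the same sanitizing would use str.maketrans, as in this small sketch:

```python
# Python 3 equivalent of the Python 2 string.maketrans used in the scripts:
# map the characters MySQL forbids in identifiers ('.' and '-') to '_'.
table = str.maketrans({'.': '_', '-': '_'})

def sanitize(name):
    return name.translate(table)

print(sanitize("10.0.0.1"))             # 10_0_0_1   (a node directory name)
print(sanitize("mem-free.rrd"[:-4]))    # mem_free   (".rrd" stripped first)
```

The `[:-4]` slice mirrors the scripts' `onefile[:-4]`, which drops the ".rrd" extension before the name becomes a table name.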
3. I originally thought this script would only need to run once, but in practice the number of databases and tables can grow. For example, when a node is newly added, a database must be created for it promptly; and for existing nodes, some attributes may only start being collected later. In my case, the swap statistics only appeared, and their RRD databases were only created, after the cluster had been running for a while. I do not know whether that is a configuration issue or normal behavior, but to keep things running smoothly, this script runs every day at the same time as the insertion script, just before it.
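Scheduling both scripts with crond, with the table-creation script shortly before the insertion script, could look like the following crontab sketch. The paths and times are placeholders of my own, not from the original setup:

```shell
# m  h  dom mon dow  command
55  23  *   *   *    /usr/bin/python /opt/scripts/create_tables.py  # create any new DBs/tables first
0   0   *   *   *    /usr/bin/python /opt/scripts/insert_rrd.py     # then import the day's RRD data
```

Running the creation script a few minutes earlier guarantees that tables for newly appeared nodes or attributes exist before the insert script needs them.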
A script that inserts data:
import os
import string

import MySQLdb
import rrdtool
# from xml.etree.ElementTree import ElementTree

# working directory (Ganglia's default RRD location; hap-clu is the cluster name)
root = "/var/lib/ganglia/rrds/hap-clu"
dirs = os.listdir(root)
# MySQL identifiers may not contain '.' or '-'
map1 = string.maketrans('.', '_')
map2 = string.maketrans('-', '_')
conn = MySQLdb.connect(host='localhost', user='root', passwd='123456')
cursor = conn.cursor()
for onedir in dirs:
    dbname = onedir.translate(map1).translate(map2)
    conn.select_db(dbname)
    print "DB: " + dbname + " ..."
    files = os.listdir(root + "/" + onedir)
    os.chdir(root + "/" + onedir)
    for onefile in files:
        # it seems that all the RRDs use the AVERAGE consolidation function
        tablename = onefile[:-4].translate(map1)
        data = rrdtool.fetch(onefile, "AVERAGE")
        firsttime = data[0][0]
        count = 0
        while count < 5761:
            time = firsttime + 15 * count
            value = data[2][count][0]
            if value is None:
                count += 1
                continue
            if dbname == "__SummaryInfo__":
                num = data[2][count][1]
                fvalue = [time, str(value), str(num)]
                try:
                    cursor.execute("insert into " + tablename + " values (%s,%s,%s)", fvalue)
                except MySQLdb.IntegrityError:
                    pass
            else:
                fvalue = [time, str(value)]
                try:
                    cursor.execute("insert into " + tablename + " values (%s,%s)", fvalue)
                    # print "OK " + str(count)
                except MySQLdb.IntegrityError:
                    pass
            count += 1
        conn.commit()
        print "UPDATING TABLE " + tablename
cursor.close()
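The hard-coded loop bound of 5761 in the script above corresponds to one day of 15-second steps; a quick check in plain arithmetic (not an rrdtool call):

```python
step = 15             # seconds between samples
period = 24 * 3600    # one day, in seconds
rows = period // step + 1   # the +1 matches the 5761 values cited earlier
print(rows)   # 5761
```

A more robust script would read the step and row count from the RRD itself rather than assuming them, as note 2 below explains.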
Notes:
1. Python has an rrdtool module, so the corresponding commands can be called directly as functions in the module, and the result is a list or tuple that is easy to traverse. Another approach is to call an external command to export the RRD to XML (RRDTOOL has this built in). The advantage is that the XML contains much the same data; the disadvantage is that it is cumbersome and inefficient, since the XML then has to be parsed.
2. count is the number of data rows stored in the RRD; for convenience it is simply set to the default value here. Strictly speaking, the script should first call rrdtool info to query the RRD's structure, obtain this value, and then call rrdtool.fetch, which retrieves all the stored values.
3. About commit: when I first started I was not familiar with the API and left this statement out, with the result that no data was imported at all. I then committed after every insert, which made insertion so slow that one update took about a day, making the script useless. After moving the commit to the end of the loop it runs quickly.
4. Because the insertion frequency differs from the RRD update frequency, the script inserts more often than the RRD rolls over, to keep the data continuous (nothing lost). This produces a lot of duplicate rows, so the primary key (the timestamp, in Unix seconds) and IntegrityError are used to skip rows that have already been inserted. One concern I had was how slow insertion would become once a table holds a large number of rows (a single table gains about 5700 rows per day, roughly 170,000 per month, and over 2 million in a year). So far, with 50,000 rows in a table, insertion has not slowed down noticeably, so I plan to keep running it and observe. If it becomes too slow, another approach will be needed.
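The duplicate-skipping idea in note 4 (a primary key on the timestamp plus catching IntegrityError) can be demonstrated with the standard-library sqlite3 module in place of MySQLdb. This is a sketch under that substitution, not the article's exact code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table metric (time_id int not null primary key, value varchar(30))")

# the last row repeats a timestamp, as happens when import runs overlap
rows = [(1000, "0.5"), (1015, "0.6"), (1000, "0.5")]
for row in rows:
    try:
        cur.execute("insert into metric values (?, ?)", row)
    except sqlite3.IntegrityError:
        pass            # already imported in a previous run: skip it
conn.commit()           # one commit at the end, as in note 3

cur.execute("select count(*) from metric")
print(cur.fetchone()[0])   # 2
```

The primary key makes the import idempotent: overlapping runs simply skip rows they have already stored, so the insertion window can safely exceed the RRD's retention window.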
The above is how to import data from an RRD database into MySQL.