In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/01 Report--
In this issue, the editor will bring you an example analysis of PG_repack and table expansion, which is the umbrella of PostgreSQL disk space. The article is rich in content and analyzes and describes it from a professional point of view. I hope you can get something after reading this article.
In the recent use of PG, I found that this database is really a bottomless pit, there are too many things, but we learn everything through the trunk and branches, and the follow-up learning sometimes depends on self-consciousness and luck.
Today, the operation of pg_repack, this plug-in, PG is not as simple as ORACLE or SQL SERVER, because many of the functions of PG are done through plug-ins, which is different from MYSQL plug-ins.
To get to the point, there are usually some problems that need to be managed in PG as follows:
Free space reclaimed from the table to the disk after a large number of records are deleted
Rebuild a table to reorder records and compress / package them into fewer pages. This may cause the query to get only one page from disk (or
< n个页面),而不是n个页面。换句话说,IO越少,性能越好。 从由于不正确的auto vaccum设置而导致大量膨胀的表中不能回收空闲空间。 安装 pg_repack 是并不是一件难事,正常的编译,create extensiton pg_repack ,然后在配置文件中 shared_preload_libraries = 'pg_repack' 重新启动PG 即可 下面我们就是要模拟一个表膨胀的案例,然后再用 pg_repack 来解决一些问题 1 我们在postgres 数据库中创建一张表 CREATE TABLE large_test (id serial primary key,num1 bigint, num2 double precision, num3 double precision); 2 插入测试数据 INSERT INTO large_test (num1, num2, num3) SELECT round(random()*10), random(), random()*142 FROM generate_series(1, 2000000) s(i); 3 我们查看这个表到底在机器物理那个文件上体现 select oid,datname from pg_database ;OK We confirm that the physical file of our table should be in the 13287 folder, where we use the oid2name command to see where your table is in the file.
In the following figure of oid2name-t large_test, we can lock the physical table in the file 16455.
This one has a record of 2 million, and the size is 115MB.
Let's start with a table expansion operation, and we open two things.
1 the first thing inserts 2 million data into the table
2 the second thing updates the value of a field in the table
We can take a look at the size of the watch, which soared from 115MB to 345 MB in an instant.
If logically speaking, the size of the table should not be about 230MB, how so exaggerated to reach 345MB.
In fact, from the point of view of the structural design of PG's table, (I wrote a text four months ago), the main reason is that PG's undo log is actually blended into the physical design of the table, each time UPDATE actually does not make real data modification, but re-inserts a new row (which reminds me of cassandra), so how many rows have been updated, the space of the data is * 2, so it causes the table to expand. And vaccum and auto vaccum. (vaccum was also written, about 2 months ago), so sometimes we sacrifice our artifact, (note: please maintain the operation during non-working hours) PG_REPACK tool to shrink our overinflated table, of course, auto vaccum can also be solved, but if your watch expands too much, and during non-working hours, it is also a good way to solve this problem at one time.
Let's start repack next.
Pg_repack-d postgres-- table public.large_test
After more than 10 seconds of work, we look at where the physical file of the large_test table is, and I can see that the name of the physical file has changed after repack.
We are looking at the size of this physical file 230 MB compared to the disk occupancy rate just now, we can clearly know that the space of those abandoned lines has been released to the system.
So the question is, what exactly does repack do and how does it work? in fact, the principle of hot repack is very simple, which means the same meaning as MYSQL's alter table table engine=innodb (if you are the DBA of MYSQL, you will probably soon understand). Of course, if you are the DBA of SQL SERVER, you know the function of shinrk database.
This is equivalent to rewriting a new file, kicking out the original physical file, and reorganizing the table.
The advantage is not only that the table takes up less space, but also that the speed of accessing the table will be faster. Finally, this command can be run in parallel, followed by the parameter J and your number of parallelism.
Finally, if you report some strange errors in installing pg_repack, you can try to install it.
Sudo yum-y install postgresql-static.x86_64
Finally, if you want to operate this command remotely, it is OK, but your remote machine must also install this plug-in, not local installation, remote operation of a PG without plug-ins is not allowed.
The above is the example analysis of PG_repack and table expansion of PostgreSQL disk space shared by Xiaobian. If you happen to have similar doubts, please refer to the above analysis to understand. If you want to know more about it, you are welcome to follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.