
[GP] Greenplum installation

Updated 2025-04-02. From: SLTechnology News&Howtos (shulou)


Shulou (Shulou.com), 06/01

I. Configure the system in preparation for installing Greenplum

1. Greenplum cluster overview

This walkthrough uses a cluster of 1 master and 3 segment hosts. The IP addresses are:

192.168.6.119 master
192.168.6.120 segment
192.168.6.121 segment, standby master

2. Modify the /etc/hosts file (on all machines)

[gpadmin@dw-greeplum-1 gpseg-1]$ more /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.6.119 dw-greeplum-1 mdw
192.168.6.120 dw-greeplum-2 sdw1
192.168.6.121 dw-greeplum-3 sdw2

After configuring this file, be sure to also set the hostname in /etc/sysconfig/network (on all machines, each with its own hostname). On the master it looks like this:

[root@mdw ~]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=dw-greeplum-1
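Each cluster line in the /etc/hosts file above has three fields: IP address, hostname, and a short alias. As a minimal sketch that is not part of the original procedure, the following function flags entries with fewer than three fields. The name `check_hosts_fields` and the decision to skip loopback lines are assumptions made for illustration.

```shell
# Hypothetical helper: print every non-comment, non-loopback line of a
# hosts-format file that has fewer than three fields (IP, hostname, alias).
# Returns non-zero if at least one such line is found.
check_hosts_fields() {
    local hosts_file="${1:-/etc/hosts}"
    awk '
        NF == 0 { next }                              # skip blank lines
        $1 ~ /^#/ { next }                            # skip comments
        $1 == "127.0.0.1" || $1 == "::1" { next }     # skip loopback entries
        NF < 3 { print "incomplete entry: " $0; bad = 1 }
        END { exit bad + 0 }
    ' "$hosts_file"
}
```

Running `check_hosts_fields` with no argument reads /etc/hosts; running it on each machine would show whether any cluster entry is missing its hostname or alias.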

3. Create the user and group (on all machines)

[root@mdw ~]# groupadd -g 530 gpadmin
[root@mdw ~]# useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
[root@mdw ~]# passwd gpadmin
Changing password for user gpadmin.
New password:
BAD PASSWORD: it is too simplistic/systematic
BAD PASSWORD: is too simple
Retype new password:
passwd: all authentication tokens updated successfully.

4. Modify the system kernel parameters (on all machines; refer to page 216 in the book)

In addition, disable SELinux:

[root@mdw selinux]# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
# targeted - Targeted processes are protected,
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
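Since this setting has to match on every machine, a small check can help. The sketch below reads the configured mode from a file in the format shown above. The function name `selinux_mode_from_file` is hypothetical, and the sketch assumes the `SELINUX=` line has no leading whitespace.

```shell
# Hypothetical helper: print the value of the SELINUX= setting in an
# SELinux config file. Comment lines and SELINUXTYPE= are ignored because
# the text before the first "=" must be exactly "SELINUX".
selinux_mode_from_file() {
    local config_file="${1:-/etc/selinux/config}"
    awk -F= '
        $1 == "SELINUX" { mode = $2; gsub(/[ \t]/, "", mode) }
        END { print mode }
    ' "$config_file"
}
```

Note that the file only controls the mode used at the next boot; `getenforce` reports the mode currently in effect.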

All right, now that the preparatory work is done, we can start to install Greenplum.

II. Install Greenplum

1. Create the installation directory (on every machine)

[root@mdw ~]# mkdir /opt/greenplum
[root@mdw ~]# chown -R gpadmin:gpadmin /opt/greenplum

After that, the installation files are placed in this directory.

2. Download the installation package: http://pan.baidu.com/s/1b60kqi

3. Install Greenplum on the master (master only)

Put the downloaded file somewhere you can find it on the CentOS system, make it executable, and run it to start the installation.

[root@mdw ~]# chmod +x greenplum-db-4.3.3.1-build-1-RHEL5-x86_64.bin
[root@mdw ~]# ./greenplum-db-4.3.3.1-build-1-RHEL5-x86_64.bin

During the installation you need to change the default installation directory; enter /opt/greenplum/greenplum-db-4.3.8.1

Once the installer finishes, Greenplum is installed on the master. Because everything so far was done as root, change the owner of the files in the installation directory to gpadmin:

[root@mdw ~]# chown -R gpadmin:gpadmin /opt/greenplum

At this point Greenplum is installed only on the master. To complete the installation across the cluster, the installed package must be copied to every segment host. The following steps set up connectivity between all nodes and then distribute the package to each of them.

4. Create the configuration files

[root@mdw ~]# su gpadmin
[gpadmin@mdw root]$ cd
[gpadmin@dw-greeplum-1 gpseg-1]$ more /opt/greenplum/greenplum-db/conf/hostlist
dw-greeplum-1
dw-greeplum-2
dw-greeplum-3
[gpadmin@dw-greeplum-1 gpseg-1]$ more /opt/greenplum/greenplum-db/conf/seg_hosts
dw-greeplum-2
dw-greeplum-3

Note that you must switch to the gpadmin user at this point and create the hostlist and seg_hosts files with the contents shown above.
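The two files differ only in whether the master is included. As a rough sketch of how they could be generated together, assuming a hypothetical helper named `write_host_files` that is not part of Greenplum:

```shell
# Hypothetical helper: write hostlist (all hosts) and seg_hosts (segment
# hosts only) into a directory.
# Arguments: target directory, master hostname, then the segment hostnames.
write_host_files() {
    local conf_dir="$1" master="$2"
    shift 2
    mkdir -p "$conf_dir" || return 1
    # hostlist: master first, then every segment host, one per line.
    printf '%s\n' "$master" "$@" > "$conf_dir/hostlist"
    # seg_hosts: segment hosts only.
    printf '%s\n' "$@" > "$conf_dir/seg_hosts"
}
```

With the hostnames from this article, the call would be `write_host_files /opt/greenplum/greenplum-db/conf dw-greeplum-1 dw-greeplum-2 dw-greeplum-3`, run as gpadmin.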

5. Connect all nodes

greenplum_path.sh holds the environment variable settings needed to run Greenplum, including GPHOME, PYTHONHOME, and so on.

[gpadmin@mdw ~]$ source /opt/greenplum/greenplum-db/greenplum_path.sh
[gpadmin@mdw ~]$ gpssh-exkeys -f /opt/greenplum/greenplum-db/conf/hostlist
[STEP 1 of 5] create local ID and authorize on local host
... /home/gpadmin/.ssh/id_rsa file exists. Key generation skipped
[STEP 2 of 5] keyscan all hosts and update known_hosts file
[STEP 3 of 5] authorize current user on remote hosts
... Send to sdw1
... Send to sdw2
... Send to sdw3
[STEP 4 of 5] determine common authentication file content
[STEP 5 of 5] copy authentication files to all remote hosts
... Finished key exchange with sdw1
... Finished key exchange with sdw2
... Finished key exchange with sdw3
[INFO] completed successfully

This output means the key exchange succeeded.

Note that gpssh-exkeys must be run as gpadmin, because it generates the SSH keys for passwordless login under /home/gpadmin/.ssh. If it is run as root, the generated keys end up either under root's home directory or under /home/gpadmin but owned by root, and later operations performed as gpadmin will fail for lack of permission.

With the keys exchanged, gpssh can be used to run commands on all hosts as a batch:

[gpadmin@mdw ~]$ gpssh -f /opt/greenplum/greenplum-db/conf/hostlist
Note: command history unsupported on this machine.
=> pwd
[sdw1] /home/gpadmin
[sdw3] /home/gpadmin
[sdw2] /home/gpadmin
[mdw] /home/gpadmin
=> exit

Here pwd is the Linux command that prints the current directory. Its output shows where the batch commands run, and that gpssh is connected to four nodes at once. If the entries in /etc/hosts have only two fields and no hostname is set, only two nodes can be connected at a time, and which two is random. It took me a long time to track this mistake down.

This is only a test; after exit, carry on with the remaining steps.
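The ownership problem described in the note on gpssh-exkeys can be checked directly. The sketch below is one possible check, assuming GNU `stat` (as on CentOS); the function name `dir_owned_by_uid` is hypothetical. It compares numeric user IDs, so for this article's setup the expected value is 530, the uid given to gpadmin earlier.

```shell
# Hypothetical helper: succeed only if the directory exists and is owned
# by the given numeric user ID. Relies on GNU stat (-c %u prints the uid).
dir_owned_by_uid() {
    local dir="$1" uid="$2"
    [ -d "$dir" ] && [ "$(stat -c %u "$dir")" = "$uid" ]
}
```

For example, `dir_owned_by_uid /home/gpadmin/.ssh 530 || echo "wrong owner"` would report the case where the keys were generated as root.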

6. Distribute the installation package to each segment node

With connectivity in place, copy the Greenplum installation on the master to every segment node as a batch.

[gpadmin@mdw conf]$ cd /opt/greenplum/

Package it:

[gpadmin@mdw greenplum]$ tar -cf gp.4.3.tar greenplum-db-4.3.8.1/

Then use the gpscp command to copy the archive to every machine:

[gpadmin@mdw greenplum]$ gpscp -f /home/gpadmin/conf/hostlist gp.4.3.tar =:/opt/greenplum/

If nothing goes wrong, the batch copy succeeds. You can check the corresponding directory on each segment node, and then unpack the tar archive. We do this as a batch operation:

[gpadmin@mdw conf]$ gpssh -f hostlist
=> cd /opt/greenplum
[sdw3]
[sdw1]
[sdw2]
[mdw]
=> tar -xf gp.4.3.tar
[sdw3]
[sdw1]
[sdw2]
[mdw]

Create the symbolic link:

=> ln -s ./greenplum-db-4.3.8.1 greenplum-db
[sdw3]
[sdw1]
[sdw2]
[mdw]
=> ll

(ll can be used to check whether the installation succeeded.)

This completes the installation of all nodes.
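The package, unpack, and link sequence above can be rehearsed locally before touching the cluster. The following is only a sketch under that assumption: `pack_unpack_link` is a hypothetical name, it works on local directories, and it does not replace gpscp or gpssh.

```shell
# Hypothetical helper: pack a versioned directory, unpack it under a target
# directory, and point a version-independent symlink at the unpacked copy.
# Arguments: source parent dir, versioned dir name, target dir, link name.
pack_unpack_link() {
    local src_parent="$1" versioned="$2" target="$3" link_name="$4"
    local archive="$src_parent/$versioned.tar"
    # -C changes directory first, so the archive holds relative paths only.
    tar -cf "$archive" -C "$src_parent" "$versioned" || return 1
    mkdir -p "$target" || return 1
    tar -xf "$archive" -C "$target" || return 1
    # -n (GNU ln) replaces an existing link instead of descending into it.
    ln -sfn "./$versioned" "$target/$link_name"
}
```

The relative link target mirrors the `ln -s ./greenplum-db-4.3.8.1 greenplum-db` command used above, so paths such as /opt/greenplum/greenplum-db/greenplum_path.sh keep working regardless of the version directory name.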

III. Initialize the database

The first few steps here are preparation for the initialization itself.

1. Create the Greenplum data directories as a batch

[gpadmin@mdw conf]$ gpssh -f hostlist
=> mkdir gpdata
[sdw3]
[mdw]
[sdw2]
[sdw1]
=> cd gpdata
[sdw3]
[mdw]
[sdw2]
[sdw1]
=> mkdir gpmaster gpdatap1 gpdatap2 gpdatam1 gpdatam2
[sdw3]
[mdw]
[sdw2]
[sdw1]
=> ll
[sdw3] total 20
[sdw3] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatam1
[sdw3] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatam2
[sdw3] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatap1
[sdw3] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatap2
[sdw3] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpmaster
[mdw] total 20
[mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatam1
[mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatam2
[mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatap1
[mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatap2
[mdw] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpmaster
[sdw2] total 20
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatam1
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatam2
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatap1
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatap2
[sdw2] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpmaster
[sdw1] total 20
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatam1
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatam2
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatap1
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpdatap2
[sdw1] drwxrwxr-x 2 gpadmin gpadmin 4096 July 18 19:46 gpmaster
=> exit

2. Configure the .bash_profile environment variables (on every machine)

[gpadmin@mdw ~]$ cd
[gpadmin@mdw ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
source /opt/greenplum/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/home/gpadmin/gpdata/gpmaster/gpseg-1
export PGPORT=2345
export PGDATABASE=testDB

[gpadmin@mdw ~]$ source .bash_profile

(This makes the environment variables take effect.)

In my experience, the environment variables sometimes do not take effect after source, in which case I restart the machine.

There is a problem here. Greenplum installs its own Python, version 2.6.6 (I have forgotten what it is used for and will add that later), while CentOS 6.5 ships with its own Python, version 2.6.2. Once the environment variables above are set, installing software with yum can fail (yum is written in Python), because two Python installations are now visible and the system does not know which one to use. I have not tried upgrading the system Python. When I need to install software, I comment out the Greenplum environment variables, and re-enable them once the installation is done.
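An alternative to commenting the variables out is to drop them for a single command only. The sketch below assumes that the interfering variables are PYTHONHOME (named earlier in this article) plus PYTHONPATH and LD_LIBRARY_PATH; the last two are assumptions, so check your own greenplum_path.sh. The function name `run_without_gp_python` is hypothetical.

```shell
# Hypothetical helper: run one command with the Python-related variables
# removed from its environment. The calling shell keeps its own settings,
# so nothing needs to be commented out or restored afterwards.
run_without_gp_python() {
    env -u PYTHONHOME -u PYTHONPATH -u LD_LIBRARY_PATH "$@"
}
```

A call such as `run_without_gp_python yum install <package>` (as root) would then run yum against the system Python while leaving the Greenplum settings in place for everything else. I have not verified this against every yum failure mode, so treat it as a starting point.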

3. Create the initialization configuration file

[gpadmin@dw-greeplum-1 gpseg-1]$ more /opt/greenplum/greenplum-db/conf/initgp_config
ARRAY_NAME="Greenplum"
SEG_PREFIX=gpseg
PORT_BASE=33000
declare -a DATA_DIRECTORY=(/home/gpadmin/gpdata/gpdatap1 /home/gpadmin/gpdata/gpdatap2)
MASTER_HOSTNAME=dw-greeplum-1
MASTER_DIRECTORY=/home/gpadmin/gpdata/gpmaster
MASTER_PORT=2345
TRUSTED_SHELL=/usr/bin/ssh
MIRROR_PORT_BASE=43000
REPLICATION_PORT_BASE=34000
MIRROR_REPLICATION_PORT_BASE=44000
declare -a MIRROR_DATA_DIRECTORY=(/home/gpadmin/gpdata/gpdatam1 /home/gpadmin/gpdata/gpdatam2)
MACHINE_LIST_FILE=/opt/greenplum/greenplum-db/conf/seg_hosts

4. Initialize the database

Either initialize and make the third machine the standby in one step:

[gpadmin@mdw ~]$ gpinitsystem -c /opt/greenplum/greenplum-db/conf/initgp_config -s dw-greeplum-3

or initialize everything first, and then designate the third machine as the standby:

[gpadmin@mdw ~]$ gpinitsystem -c initgp_config -h seg_hosts
[gpadmin@mdw ~]$ gpinitstandby -s dw-greeplum-3

sdw3 refers to the node where the master's standby is located. The book and some material online place the standby on the last node, which may simply be a convention.
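Before running gpinitsystem, it may be worth confirming that the directories named in initgp_config actually exist. The sketch below is one way to do that on the local host only. It relies on the file being valid bash, as shown above, and sources it in a separate bash process; the function name `missing_data_dirs` is hypothetical. Only source files you trust, since sourcing executes their contents.

```shell
# Hypothetical helper: list directories named in a gpinitsystem-style
# configuration file that do not exist on this host.
# Returns 0 if all exist, 1 if any is missing, 2 if the file cannot be read.
# Pass a path containing a slash so that "source" does not search PATH.
missing_data_dirs() {
    local config_file="$1"
    bash -c '
        source "$1" || exit 2
        status=0
        for dir in "$MASTER_DIRECTORY" "${DATA_DIRECTORY[@]}" "${MIRROR_DATA_DIRECTORY[@]}"; do
            if [ -n "$dir" ] && [ ! -d "$dir" ]; then
                echo "missing: $dir"
                status=1
            fi
        done
        exit "$status"
    ' _ "$config_file"
}
```

In this article every directory was created on every host, so the same check could be run on each machine; in other layouts the master directory would only be expected on the master.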

If any of the configuration above is wrong, gpinitsystem will fail. The log is written on the master node under /home/gpadmin/gpAdminLogs/, in a file named gpinitsystem_2016XXXX.log.
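To get to the relevant lines quickly, something like the following sketch could be used. It picks the most recently modified gpinitsystem log in a directory and prints the lines that mention an error. Matching the words "error" and "fatal" case-insensitively is an assumption about what is worth reading first, and the function name `latest_init_log_errors` is hypothetical.

```shell
# Hypothetical helper: print the name of the newest gpinitsystem log in a
# directory, followed by its lines that mention "error" or "fatal".
latest_init_log_errors() {
    local log_dir="${1:-$HOME/gpAdminLogs}"
    local newest
    # ls -t sorts by modification time, newest first.
    newest=$(ls -t "$log_dir"/gpinitsystem_*.log 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "no gpinitsystem log found in $log_dir" >&2
        return 1
    fi
    echo "log file: $newest"
    # -n prefixes each match with its line number in the log.
    grep -n -i -E 'error|fatal' "$newest"
}
```

The matching lines are only an entry point; the surrounding lines in the log usually carry the detail needed to find the cause.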

Note that if initialization fails, be sure to check the log file carefully. Simply repeating the installation is pointless; what matters is finding the root cause. I had made some mistakes when pasting settings into /etc/sysctl.conf, and for a long time I could not find the cause. In the end I contacted the official Greenplum community. Many thanks for their help!
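Paste mistakes of this kind are easy to miss by eye. As a minimal sketch, assuming the usual "key = value" layout of sysctl.conf with "#" or ";" comment lines, the function below flags lines without an equals sign, lines with an empty key or value, and keys that appear twice. The name `check_sysctl_file` is hypothetical, and the check says nothing about whether the values themselves are right.

```shell
# Hypothetical helper: report malformed lines and repeated keys in a
# sysctl.conf-style file. Returns non-zero if anything was reported.
check_sysctl_file() {
    local sysctl_file="${1:-/etc/sysctl.conf}"
    awk '
        NF == 0 || $1 ~ /^[#;]/ { next }     # skip blank lines and comments
        {
            eq = index($0, "=")
            if (eq == 0) {
                print "line " NR ": no equals sign: " $0
                bad = 1
                next
            }
            key = substr($0, 1, eq - 1)
            value = substr($0, eq + 1)
            gsub(/[ \t]/, "", key)
            if (key == "" || value ~ /^[ \t]*$/) {
                print "line " NR ": empty key or value: " $0
                bad = 1
                next
            }
            if (key in seen) {
                print "line " NR ": key " key " already set on line " seen[key]
                bad = 1
            }
            seen[key] = NR
        }
        END { exit bad + 0 }
    ' "$sysctl_file"
}
```

After correcting the file, `sysctl -p` (as root) reloads it and prints each setting it applies, which gives a second chance to spot a bad line.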

Questions:

1. When I configured the cluster on virtual machines (in bridged mode), I got an error saying that a file on sdw1 could not be copied to sdw2. Someone on Stack Overflow suggested configuring DNS, and after I did so gpinitsystem succeeded. I still do not fully understand why DNS is needed, since nothing here actually requires DNS resolution.

2. To save effort with the virtual machines, I installed CentOS once and then copied the resulting virtual machine files on disk to create the others. After copying, I found that the network interface name in the copied CentOS differed from the one in the source: instead of eth0 it used eth2 or Auto-eth2.

Lessons learned:

1. Many problems came up during the installation, a lot of them related to Linux networking, but they are easy to solve once you understand them.

2. gpinitsystem can fail for many reasons. Be sure to read the log carefully; it states the cause of the error. Blindly retrying will not solve the problem.

References:

http://www.cnblogs.com/renlipeng/p/5685432.html - recommended; follows the same steps as the book, very thorough

http://www.cnblogs.com/liuyungao/p/5689588.html - some other operations
