How to install Greenplum on Windows

Installing the Greenplum Community Edition (CE). Here are step-by-step instructions on how to install the Greenplum Database CE software and get your single-node Greenplum Database system up and running.

Before installing, we have to change the following OS configuration parameters:

----------

LINUX

----------

In /etc/sysctl.conf: 

sysctl.kernel.shmmax = 500000000

sysctl.kernel.shmmni = 4096

sysctl.kernel.shmall = 4000000000

sysctl.kernel.sem = 250 512000 100 2048

sysctl.kernel.sysrq = 1

sysctl.kernel.core_uses_pid = 1

sysctl.kernel.msgmnb = 65536

sysctl.kernel.msgmax = 65536

sysctl.kernel.msgmni = 2048

sysctl.net.ipv4.tcp_syncookies = 1

sysctl.net.ipv4.ip_forward = 0

sysctl.net.ipv4.conf.default.accept_source_route = 0

sysctl.net.ipv4.tcp_tw_recycle = 1

sysctl.net.ipv4.tcp_max_syn_backlog = 4096

sysctl.net.ipv4.conf.all.arp_filter = 1

sysctl.net.ipv4.ip_local_port_range = 1025 65535

sysctl.net.core.netdev_max_backlog = 10000

sysctl.vm.overcommit_memory = 2

For RHEL version 6.x platforms, do not include the sysctl. prefix in the above parameters, as follows:

kernel.shmmax = 500000000

kernel.shmmni = 4096

kernel.shmall = 4000000000

kernel.sem = 250 512000 100 2048

kernel.sysrq = 1

kernel.core_uses_pid = 1

kernel.msgmnb = 65536

kernel.msgmax = 65536

kernel.msgmni = 2048

net.ipv4.tcp_syncookies = 1

net.ipv4.ip_forward = 0

net.ipv4.conf.default.accept_source_route = 0

net.ipv4.tcp_tw_recycle = 1

net.ipv4.tcp_max_syn_backlog = 4096

net.ipv4.conf.all.arp_filter = 1

net.ipv4.ip_local_port_range = 1025 65535

net.core.netdev_max_backlog = 10000

vm.overcommit_memory = 2

In /etc/security/limits.conf: 

* soft nofile 65536  

* hard nofile 65536  

* soft nproc 131072 

* hard nproc 131072 

OPTIONAL

==================================================================================================================

In /etc/hosts:

Include the host names and all interface address names for every machine participating in your Greenplum Database system.
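For example, a minimal /etc/hosts for this single-node walkthrough might look like the following (the hostname sachi and the address are illustrative; use your own):

127.0.0.1    localhost

192.168.1.6  sachi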

File system recommendations: XFS is the preferred file system on Linux platforms for data storage. Greenplum recommends the following xfs mount options:

rw,noatime,inode64,allocsize=16m

Also, each disk device file should have a read-ahead value of 16384. For example, to verify the read-ahead value of a disk device:

# /sbin/blockdev --getra /dev/sdb
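To set the read-ahead value on a device (here /dev/sdb is just an example data disk):

# /sbin/blockdev --setra 16384 /dev/sdb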

=================================================================================================================

Add the Greenplum database Admin account:

# useradd gpadmin

# passwd gpadmin

# New password: password

# Retype new password: password

You cannot run the Greenplum Database SNE server as root. Use this newly created gpadmin account whenever you work with Greenplum.

Installing the Greenplum Database Community edition (CE):

1. Download or copy the Greenplum Database CE from www.greenplum.com/

2. Unzip the installer file:

# unzip greenplum-db-4.2.2.4-build-1-CE-RHEL5-x86_64.zip

3. Launch the installer using bash:

# /bin/bash greenplum-db-4.2.2.4-build-1-CE-RHEL5-x86_64.bin

4. The installer prompts you to provide an installation path. Press ENTER to accept the default install path (/usr/local/greenplum-db-4.2.2.4), or enter a new path.

5. The installer installs the Greenplum Database CE software and creates a greenplum-db symbolic link one directory level above your version-specific Greenplum Database installation directory.

6. Change the ownership of your Greenplum Database SNE installation so that it is owned by the gpadmin user:

# chown -R gpadmin /usr/local/greenplum-db-4.2.2.4

# chgrp -R gpadmin /usr/local/greenplum-db-4.2.2.4

7. Preparing the Data Directory Locations

Every Greenplum Database SNE instance has a designated storage area on disk that is called the data directory location.

8. Create or choose a directory that will serve as your master data storage area

User data is not stored here; the master stores only metadata (data about the data), including the global system catalog.

# mkdir /gpmaster

# chown gpadmin /gpmaster

# chgrp gpadmin /gpmaster

9. Create or choose the directories that will serve as your segment storage areas:

This is the file system location where the database data is stored.

# mkdir /gpdata1

# chown gpadmin /gpdata1

# chgrp gpadmin /gpdata1

# mkdir /gpdata2

# chown gpadmin /gpdata2

# chgrp gpadmin /gpdata2

10. Configuring Greenplum Database SNE / CE Environment Variables:

$ vi .bashrc

Then add the following entry:

source /usr/local/greenplum-db/greenplum_path.sh

Now source it:

$ source ~/.bashrc

11. Now let’s initialize Greenplum database:

Greenplum provides a utility called gpinitsystem which initializes a Greenplum Database system. After the Greenplum Database SNE system is initialized and started, you can then create and manage databases by connecting to the Greenplum master database process.

12. Log in to the system as the gpadmin user:

# su - gpadmin

13. Copy the hostlist_singlenode example file from your Greenplum Database installation to the current directory:

$ cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/hostlist_singlenode .

14. Copy the gpinitsystem_singlenode example file from your Greenplum Database SNE installation to the current directory:

$ cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/gpinitsystem_singlenode .

15. Edit the gpinitsystem_singlenode file and enter your configuration settings, or leave the defaults (make sure the data directory and master directory settings point to the locations you created in steps 8 and 9). Some default parameters in this file are:

ARRAY_NAME="GPDB SNE"

MACHINE_LIST_FILE=./hostlist_singlenode

SEG_PREFIX=gpsne

PORT_BASE=50000

declare -a DATA_DIRECTORY=(/disk1/gpdata1 /disk2/gpdata2)

MASTER_HOSTNAME=sachi

MASTER_DIRECTORY=/home/gpmaster

MASTER_PORT=5432

16. Run the gpssh-exkeys utility to exchange ssh keys for the local host:

$ gpssh-exkeys -h sachi

For example, on a host named dbaref the command failed at first because sshd was not running:

[gpadmin@dbaref ~]$ gpssh-exkeys -h dbaref

[STEP 1 of 5] create local ID and authorize on local host

[ERROR dbaref] authentication check failed:

     ssh: connect to host dbaref port 22: Connection refused

[ERROR] cannot establish ssh access into the local host

To resolve this issue:

1. Disable the firewall.

2. Start the sshd service if it is not already running:

[root@dbaref Downloads]# /sbin/service sshd status

Redirecting to /bin/systemctl  status sshd.service

sshd.service — OpenSSH server daemon.

      Loaded: loaded (/lib/systemd/system/sshd.service; disabled)

      Active: inactive (dead)

      CGroup: name=systemd:/system/sshd.service

[root@dbaref Downloads]# /sbin/service sshd start

Redirecting to /bin/systemctl  start sshd.service

[root@dbaref Downloads]# /sbin/service sshd status

Redirecting to /bin/systemctl  status sshd.service

sshd.service — OpenSSH server daemon.

      Loaded: loaded (/lib/systemd/system/sshd.service; disabled)

      Active: active (running) since Mon, 20 May 2013 08:45:07 -0400; 5s ago

    Main PID: 2764 (sshd)

      CGroup: name=systemd:/system/sshd.service

          └ 2764 /usr/sbin/sshd -D

[root@dbaref Downloads]# 

Now su to gpadmin and run it again

[gpadmin@dbaref ~]$  gpssh-exkeys -h dbaref

[STEP 1 of 5] create local ID and authorize on local host

  … /home/gpadmin/.ssh/id_rsa file exists … key generation skipped

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] authorize current user on remote hosts

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts

[INFO] completed successfully

[gpadmin@dbaref ~]$ 

======================================================================

Note: I have struggled with this many times. Here is one more option.

[root@sachi ~]# mkdir /etc/ssh/gpadmin

[root@sachi ~]# cp /home/gpadmin/.ssh/authorized_keys /etc/ssh/gpadmin/

[root@sachi ~]# chown -R gpadmin:gpadmin /etc/ssh/gpadmin

[root@sachi ~]# chmod 755 /etc/ssh/gpadmin

[root@sachi ~]# chmod 644 /etc/ssh/gpadmin/authorized_keys

[root@sachi ~]# vi /etc/ssh/sshd_config 

#RSAAuthentication yes

#PubkeyAuthentication yes

# changed .ssh/authorized_keys to /etc/ssh/gpadmin/authorized_keys <<<<<<<

AuthorizedKeysFile      /etc/ssh/gpadmin/authorized_keys

#AuthorizedKeysCommand none

#AuthorizedKeysCommandRunAs nobody

Restart the sshd service (or reboot the server) for the change to take effect.

======================================================================

17. Initialize Greenplum Database SNE:

$ gpinitsystem -c gpinitsystem_singlenode

[gpadmin@sachi ~]$ gpinitsystem -c gpinitsystem_singlenode 

20130423:19:53:54:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking configuration parameters, please wait…

20130423:19:53:54:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Reading Greenplum configuration file gpinitsystem_singlenode

20130423:19:53:54:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Locale has not been set in gpinitsystem_singlenode, will set to default value

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Locale set to en_US.utf8

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-No DATABASE_NAME set, will exit following template1 updates

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Detected a single host GPDB array build, reducing value of BATCH_DEFAULT from 60 to 4

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking configuration parameters, Completed

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Commencing multi-home checks, please wait…

.

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Configuring build for standard array

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Commencing multi-home checks, Completed

20130423:19:53:55:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Building primary segment instance array, please wait…

..

20130423:19:53:56:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking Master host

20130423:19:53:56:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking new segment hosts, please wait…

..

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checking new segment hosts, Completed

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Greenplum Database Creation Parameters

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:—————————————

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master Configuration

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:—————————————

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master instance name       = GPDB SINGLENODE

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master hostname            = sachi

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master port                = 5432

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master instance dir        = /home/gpmaster/gpsne-1

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master LOCALE              = en_US.utf8

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Greenplum segment prefix   = gpsne

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master Database            = 

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master connections         = 250

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master buffers             = 128000kB

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Segment connections        = 750

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Segment buffers            = 128000kB

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Checkpoint segments        = 8

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Encoding                   = UNICODE

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Postgres param file        = Off

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db/./bin/initdb

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db/./lib

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Ulimit check               = Passed

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Array host connect type    = Single hostname per node

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [1]      = ::1

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [2]      = 172.16.72.1

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [3]      = 192.168.122.1

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [4]      = 192.168.133.1

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [5]      = 192.168.1.6

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [6]      = fe80::250:56ff:fec0:1

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [7]      = fe80::250:56ff:fec0:8

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Master IP address [8]      = fe80::8e89:a5ff:fe80:f8e6

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Standby Master             = Not Configured

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Primary segment #          = 2

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Total Database segments    = 2

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Trusted shell              = ssh

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Number segment hosts       = 1

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Mirroring config           = OFF

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:—————————————-

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Greenplum Primary Segment Configuration

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:—————————————-

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-sachi     /disk1/gpdata1/gpsne0     40000     2     0

20130423:19:53:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-sachi     /disk2/gpdata2/gpsne1     40001     3     1

Continue with Greenplum creation Yy/Nn>

Y

20130423:19:54:13:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Building the Master instance database, please wait…

20130423:19:55:19:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Starting the Master in admin mode

20130423:19:55:39:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Commencing parallel build of primary segment instances

20130423:19:55:39:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait…

..

20130423:19:55:39:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait…

………………………………………………………………………………………………………………………….

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:————————————————

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Parallel process exit status

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:————————————————

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Total processes marked as completed           = 2

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Total processes marked as killed              = 0

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Total processes marked as failed              = 0

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:————————————————

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Deleting distributed backout files

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Removing back out file

20130423:19:57:59:004006 gpinitsystem:sachi:gpadmin-[INFO]:-No errors generated from parallel processes

20130423:19:58:00:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Starting gpstop with args: -a -i -m -d /home/gpmaster/gpsne-1

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Gathering information and validating the environment…

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Obtaining Segment details from master…

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Greenplum Version: ‘postgres (Greenplum Database) 4.2.2.4 build 1 Community Edition’

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-There are 0 connections to the database

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=’immediate’

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Master host=sachi

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=immediate

20130423:19:58:00:016987 gpstop:sachi:gpadmin-[INFO]:-Master segment instance directory=/home/gpmaster/gpsne-1

20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Starting gpstart with args: -a -d /home/gpmaster/gpsne-1

20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Gathering information and validating the environment…

20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Greenplum Binary Version: ‘postgres (Greenplum Database) 4.2.2.4 build 1 Community Edition’

20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Greenplum Catalog Version: ‘201109210’

20130423:19:58:01:017070 gpstart:sachi:gpadmin-[INFO]:-Starting Master instance in admin mode

20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information

20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Obtaining Segment details from master…

20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Setting new master era

20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Master Started…

20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Checking for filespace consistency

20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-Obtaining current filespace entries used by TRANSACTION_FILES

20130423:19:58:02:017070 gpstart:sachi:gpadmin-[INFO]:-TRANSACTION_FILES OIDs are consistent for pg_system filespace

20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-TRANSACTION_FILES entries are consistent for pg_system filespace

20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-Checking for filespace consistency

20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-Obtaining current filespace entries used by TEMPORARY_FILES

20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-TEMPORARY_FILES OIDs are consistent for pg_system filespace

20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-TEMPORARY_FILES entries are consistent for pg_system filespace

20130423:19:58:03:017070 gpstart:sachi:gpadmin-[INFO]:-Shutting down master

20130423:19:58:04:017070 gpstart:sachi:gpadmin-[INFO]:-No standby master configured.  skipping…

20130423:19:58:04:017070 gpstart:sachi:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait…

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-Process results…

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:——————————————————

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-   Successful segment starts                                            = 2

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-   Failed segment starts                                                = 0

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:——————————————————

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances 

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:——————————————————

20130423:19:58:06:017070 gpstart:sachi:gpadmin-[INFO]:-Starting Master instance sachi directory /home/gpmaster/gpsne-1 

20130423:19:58:08:017070 gpstart:sachi:gpadmin-[INFO]:-Command pg_ctl reports Master sachi instance active

20130423:19:58:08:017070 gpstart:sachi:gpadmin-[INFO]:-Database successfully started

20130423:19:58:08:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode

20130423:19:58:08:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Loading gp_toolkit…

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Scanning utility log file for any warning messages

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[WARN]:-*******************************************************

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[WARN]:-Scan of log file indicates that some warnings or errors

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[WARN]:-were generated during the array creation

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Please review contents of log file

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20130423.log

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-To determine level of criticality

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[WARN]:-*******************************************************

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Greenplum Database instance successfully created

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:——————————————————-

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-To complete the environment configuration, please 

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-update gpadmin .bashrc file with the following

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-2. Add «export MASTER_DATA_DIRECTORY=/home/gpmaster/gpsne-1»

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-   or, use -d /home/gpmaster/gpsne-1 option for the Greenplum scripts

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-   Example gpstate -d /home/gpmaster/gpsne-1

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20130423.log

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Review options for gpinitstandby

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:——————————————————-

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-The Master /home/gpmaster/gpsne-1/pg_hba.conf post gpinitsystem

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-has been configured to allow all hosts within this new

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-new array must be explicitly added to this file

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory

20130423:19:58:10:004006 gpinitsystem:sachi:gpadmin-[INFO]:——————————————————-

[gpadmin@sachi ~]$ pwd

/home/gpadmin

[gpadmin@sachi ~]$ 

18. After the Greenplum Database SNE system is initialized and started, you can connect to the Greenplum master database process using the psql client program:

$ createdb mydb

$ psql mydb

19. Now export the master data directory:

$ vi .bashrc

Then add the following entry:

export MASTER_DATA_DIRECTORY=/gpmaster/gpsne-1

Now source it:

$ source ~/.bashrc

20. Now you can perform any database operations (DDL, DML) using the psql program.

Uninstall Greenplum:

To uninstall, run the following commands:

$ gpdeletesystem -d /gpmaster/gpsne-1

$ rm -rf /usr/local/greenplum-db-4.2.2.4

$ rm /usr/local/greenplum-db

Optionally, you can also remove the environment variables and restore the default OS parameter settings.

In today's data-driven world, the choice of database can significantly impact the performance and efficiency of your AI/ML workloads. VMware's Greenplum is a powerful, massively parallel processing (MPP) data platform for analytics, machine learning, and AI. In this blog post, we will touch upon what makes Greenplum an excellent choice for developers and data scientists, but the post is focused on providing detailed steps to install Greenplum on Windows using WSL2 (Windows Subsystem for Linux 2). I hope this blog post will enable you to easily install Greenplum and try out its powerful features. I will focus extensively on installing it on Rocky Linux 8 in WSL2, but you can also use the same instructions to install it directly on Rocky Linux 8.

Introduction to Greenplum

Greenplum Database is an advanced, fully-featured, open-source data warehouse and analytics platform. It provides a robust feature set with capabilities designed for big data and large-scale data analytics. Here are some of the key features that make Greenplum stand out:

  1. Massively Parallel Processing (MPP): Greenplum leverages a shared-nothing architecture, enabling it to distribute data and query workloads across multiple nodes. This results in exceptional performance and scalability for large datasets.
  2. High Availability: Greenplum offers robust high-availability features, including data replication and failover mechanisms, ensuring continuous availability of your data.
  3. Advanced Analytics: With built-in support for advanced analytics, Greenplum allows users to perform complex data processing tasks directly within the database. This includes in-database machine learning, geospatial analytics, and more.

Greenplum's architecture and features make it particularly well-suited for AI/ML workloads. One of the exciting additions to Greenplum is support for pgvector, an extension for vector similarity search, which is essential for AI applications; it can be used to build recommendation engines and nearest-neighbor searches. Greenplum provides several features that are key to building AI/ML applications:

  1. Integration with AI/ML Libraries: Greenplum integrates well with popular AI/ML libraries and frameworks such as TensorFlow, PyTorch, and scikit-learn. This makes it easier to bring models into production and leverage Greenplum’s processing power.
  2. pgvector for Similarity Search: The pgvector extension allows you to perform fast and efficient vector similarity searches. This is crucial for AI applications that require comparing high-dimensional data, such as image and text embeddings (see the short SQL sketch after this list).
  3. Scalability: Greenplum’s MPP architecture ensures that AI/ML models can be trained on large datasets efficiently. The ability to scale horizontally across multiple nodes means you can handle ever-growing data volumes.
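As a minimal sketch, assuming the pgvector extension is available in your Greenplum database, a similarity search can look like this (the table name and values are illustrative):

CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE item_embeddings (id bigserial PRIMARY KEY, embedding vector(3));
INSERT INTO item_embeddings (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');
-- return the ids of the nearest neighbors by Euclidean distance
SELECT id FROM item_embeddings ORDER BY embedding <-> '[3,1,2]' LIMIT 5;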

Trying Out Greenplum

As part of my AI/ML workflow, Greenplum plays a crucial role in handling and processing large datasets efficiently. Its ability to scale and perform advanced analytics and machine learning processes directly within the database accelerates the AI/ML workflow. Setting up Greenplum on a single node is perfect for development and testing, allowing you to explore and experiment with Greenplum’s features. Now, let’s dive into the steps to install a single node Greenplum instance on Windows using WSL2.

Step-by-Step Installation Guide

Prerequisites

  • Windows 10 or later with WSL2 enabled.
  • Rocky Linux version 8 container rootfs
  • Basic knowledge of Linux command line.

Step 1: Download and Import Rocky Linux to WSL2

  1. Download the Rocky Linux 8 container base rootfs tar file (Rocky-8-Container-Base.latest.x86_64.tar.xz) from the official Rocky Linux website.

  2. Open PowerShell as Administrator and import the Rocky Linux tar file downloaded in the previous step into WSL:

wsl --import RockyLinux8 <InstallLocation> Rocky-8-Container-Base.latest.x86_64.tar.xz --version 2
  3. Launch the Rocky Linux WSL instance, either from the command line or by selecting it from the dropdown in Windows Terminal. It generally takes a few minutes for the distribution to become available in Windows Terminal.

Step 2: Enable systemd in WSL2

systemd is a system and service manager for Linux operating systems. It is responsible for initializing the system and managing system processes after booting. systemd provides various features such as parallelized startup, on-demand starting of daemons, and monitoring and maintaining services.

systemd provides key features such as:

  • Service Management: systemd can start, stop, and manage services efficiently.
  • Parallel Initialization: Speeds up the boot process by initializing services in parallel.
  • Socket-based Activation: Services are started on-demand using sockets.
  • Logging: Integrates with journalctl for system and service logging

Windows Subsystem for Linux (WSL) provides a compatibility layer for running Linux binaries natively on Windows. By default, WSL uses a simplified init system to manage the Linux environment, which does not include systemd. This is because WSL aims to be lightweight and systemd introduces additional complexity and resource usage. However, with the introduction of WSL2, which uses a full Linux kernel, it’s now possible to enable systemd and take advantage of its features.

Here's how you can enable systemd in WSL2:

  1. To enable systemd in WSL2, create or modify the /etc/wsl.conf file.
  2. Add the following lines to wsl.conf to enable systemd at boot:
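[boot]
systemd=true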

  3. Restart your WSL2 instance.

After modifying the configuration file, you need to restart your WSL2 instance for the changes to take effect. Open PowerShell and run:
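wsl --shutdown

Then relaunch the RockyLinux8 instance, for example with wsl -d RockyLinux8.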

  4. Verify that systemd is running.

Once WSL2 restarts, verify that systemd is running by checking the process with the following command:
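ps -p 1 -o comm=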

You should see systemd as the output, indicating that it is successfully running.

Step 3: Install Required Packages

Before we proceed with installing Greenplum, it’s crucial to update your system and install several essential packages. These packages include openssh-server, libcap, and vim.

  • openssh-server: This package installs the OpenSSH server, which allows you to securely connect to your WSL instance using SSH. SSH is essential for running remote commands, managing files, and maintaining your Greenplum installation.
  • libcap: The libcap library provides the POSIX 1003.1e capabilities interface. It allows programs to use fine-grained privileges rather than the traditional all-or-nothing root access. This is important for enabling non-root users to perform specific administrative tasks without granting full root permissions.
  • vim: I prefer to use vim as a robust and efficient way to manage text files within my WSL environment. You can use any text editor of your choice.
sudo dnf update
sudo dnf makecache
sudo dnf install openssh-server libcap vim

This sets up the necessary packages to move onto the next step.

Step 4: Set Up User for Greenplum

To run Greenplum effectively, it is best practice to create a dedicated user and group for managing the database. This ensures that the database processes run under a specific user account, enhancing security and manageability.

  1. Create a user and group for Greenplum

First, create a group named gpadmin. This group will be associated with the Greenplum user and help manage permissions more effectively. Next, create a user named gpadmin and add it to the gpadmin group. This user will own the Greenplum processes and files. Assign a password to the gpadmin user. You will be prompted to enter and confirm the password.

sudo groupadd gpadmin
sudo useradd -g gpadmin gpadmin
sudo passwd gpadmin
  2. Edit the sudoers file to allow gpadmin to execute commands without a password

To manage Greenplum effectively, the gpadmin user needs to execute administrative commands without constantly entering the password. This can be achieved by editing the sudoers file. Open the sudoers file using the visudo command. The visudo command safely edits the sudoers file and ensures that syntax errors are not saved, which could prevent gaining superuser access.
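sudo visudo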

  3. Add the following line:

In the visudo editor, scroll down to find the section where user privileges are defined. Add the following line to grant the gpadmin user the ability to execute any command without a password. This line means that the gpadmin user can execute all commands (ALL) as any user ((ALL)) without providing a password (NOPASSWD).

gpadmin ALL=(ALL) NOPASSWD: ALL

We created a dedicated gpadmin user and group, and configured sudo privileges, ensuring that your Greenplum installation runs under a specific user account with the required permissions. This enhances both security and manageability of your Greenplum environment.

Step 5: Set Up SSH

The Secure Shell (SSH) protocol provides secure command-line access to remote machines. For Greenplum, SSH is used to enable communication between nodes in the cluster. Setting up SSH involves generating SSH keys and configuring the SSH service.

SSH host keys are used to identify the server. These keys need to be generated if they are not already present.

  1. The commands below generate the RSA, ECDSA and ED25519 SSH host keys (run as root)
# As root
ssh-keygen -t rsa -b 2048 -f /etc/ssh/ssh_host_rsa_key -N ''
ssh-keygen -t ecdsa -b 256 -f /etc/ssh/ssh_host_ecdsa_key -N ''
ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ''
  2. The commands below generate an SSH key and configure SSH for the gpadmin user

Switch to the gpadmin user. This ensures that the keys are generated in the correct user context

# As gpadmin
sudo -i -u gpadmin
ssh-keygen -t rsa -b 2048
  3. Copy the public key to authorized keys
ssh-copy-id gpadmin@127.0.0.1
  4. Test SSH to 127.0.0.1 with no password to ensure it is ready:
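ssh gpadmin@127.0.0.1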

If everything is configured correctly, you will be logged into the same system without being asked for a password.

By generating SSH host and user keys and configuring SSH for the gpadmin user, you enable secure, passwordless authentication. This setup is essential for the seamless operation of Greenplum, as it relies on SSH for internal communication between cluster nodes.

Step 6: Configure Ping for non-root users

The ping command is commonly used to test network connectivity by sending ICMP echo requests to a target host. By default, the ping command requires root privileges to send these requests. However, you can configure ping to work for non-root users by using the setcap command, which sets file capabilities on the executable.

  1. The setcap command allows you to set specific capabilities on executables, granting them privileges without requiring full root access. In this case, we use it to grant the cap_net_raw capability to the ping command, allowing non-root users to execute it.
sudo setcap cap_net_raw+p /usr/bin/ping
  2. Test the ping command as a non-root user to ensure that it works correctly:
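ping -c 3 127.0.0.1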

You should see the usual ping output, indicating that the command is functioning properly for non-root users.

Step 7: Install Greenplum Software

Now we are ready to begin the process of downloading the RPM package and installing it in your Rocky Linux environment. First, you need to download the Greenplum RPM package. You can find the latest version of the Greenplum Database RPM on the official Greenplum website or on the Pivotal/Tanzu network.

  1. Download the .rpm file to your local machine. Save it in a directory that you can easily access from your WSL2 environment.
  2. Use the dnf package manager to install the Greenplum RPM package. This command installs the Greenplum software and its dependencies.
  3. After installation, change the ownership of the Greenplum installation directory to the gpadmin user. This ensures that the gpadmin user has the necessary permissions to manage and operate Greenplum.
cp /mnt/c/Users/pradeepl/Downloads/greenplum-db-7.1.0-el8-x86_64.rpm .
sudo dnf install greenplum-db-7.1.0-el8-x86_64.rpm
sudo chown -R gpadmin:gpadmin /usr/local/greenplum-db/
  4. Verify the installation. After installing Greenplum, you should verify that the software is installed correctly. Switch to the gpadmin user and check the Greenplum version.
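One way to do this, as a sketch (the greenplum_path.sh location follows the default install path used above):

sudo -i -u gpadmin
source /usr/local/greenplum-db/greenplum_path.sh
postgres --gp-version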


Step 8: Configure Environment for gpadmin

To ensure that the Greenplum database functions correctly, we need to configure the environment for the gpadmin user. This involves setting environment variables and sourcing the Greenplum environment script. Add the following lines to the .bashrc file. These lines set the GPHOME variable to the Greenplum installation directory, source the Greenplum environment script, and set the COORDINATOR_DATA_DIRECTORY to the appropriate data directory.

The COORDINATOR_DATA_DIRECTORY is an environment variable used in Greenplum to specify the location of the data directory for the Greenplum Coordinator, which is the master node of the Greenplum Database cluster. This directory stores metadata and data for the coordinator node, which includes information about the entire Greenplum Database system, such as the system catalog, user data, and other critical information.

Add the following content to the .bashrc file for gpadmin:

export GPHOME=/usr/local/greenplum-db
. $GPHOME/greenplum_path.sh
export COORDINATOR_DATA_DIRECTORY=/home/gpadmin/gp/gpsne-1

Reload the .bashrc file:
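source ~/.bashrc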

Step 9: Initialize Greenplum

After setting up the environment and ensuring that SSH is configured properly, the next step is to initialize the Greenplum database. This involves creating necessary directories, configuring the initialization files, and running the initialization command to set up the Greenplum cluster.

  1. Create necessary directories and configuration files:

First, create a directory to hold the Greenplum configuration and data files. Copy the sample configuration file provided by Greenplum to this directory.

mkdir ~/gp
cd ~/gp
cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_singlenode .
  2. Edit the gpinitsystem_singlenode file and update the necessary lines to configure the data directories and coordinator hostname.
declare -a DATA_DIRECTORY=(/home/gpadmin/gp /home/gpadmin/gp)
COORDINATOR_HOSTNAME=127.0.0.1
COORDINATOR_DIRECTORY=/home/gpadmin/gp
  3. Create a hostlist_singlenode file in /home/gpadmin/gp and add 127.0.0.1 to it. This file lists the hosts that will participate in the Greenplum cluster.
echo "127.0.0.1" > ~/gp/hostlist_singlenode
  4. Initialize the cluster (if an SSH prompt comes up the first time, enter "yes"):

Run the gpinitsystem command to initialize the Greenplum cluster using the gpinitsystem_singlenode configuration file. If prompted with an SSH warning, enter “yes” to continue.

gpinitsystem -c gpinitsystem_singlenode
  5. Verify the initialization. After the initialization process completes, verify that the Greenplum cluster is up and running by using the gpstate command:
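gpstate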

This command provides information about the status of the coordinator and segment instances.


Initializing the Greenplum database involves creating the necessary directories, configuring initialization files, and running the initialization command. By following these steps, you ensure that your Greenplum cluster is set up correctly and ready for use. The initialization process configures the coordinator and segment instances, enabling you to manage and use Greenplum for your data processing needs.

Step 10: Validation

The final step in setting up your Greenplum database is to validate that the system can be started and stopped correctly. This ensures that all configurations are correct and that Greenplum is functioning as expected.

Validate that you can stop and restart Greenplum. The gpstart command starts the Greenplum database cluster, initializing all processes and ensuring that the database is ready for use. Validating the start and stop functionality of Greenplum ensures that your setup is complete and that the database is functioning correctly. This step confirms that all configurations are correct and that Greenplum can be managed effectively.
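A minimal check, assuming the gpadmin environment configured in Step 8 is loaded:

gpstop -a    # stop the cluster without prompting for confirmation
gpstart -a   # start it back up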

By following these steps, you should have a working Greenplum installation on Windows using WSL2 with Rocky Linux. Greenplum’s powerful features and scalability make it an excellent choice for handling AI/ML workloads, providing a robust platform for your data processing and analytics needs. If you encounter any issues or have questions, feel free to leave a comment below.

Step 11: Connect to your newly minted Greenplum system

Now that we have set up Greenplum, we can log in using the command below:

psql -d postgres -U gpadmin

We can then create a table, insert some data, and query it as shown below:

CREATE TABLE sample_table (
    id SERIAL PRIMARY KEY,
    name VARCHAR(50),
    age INT
);
INSERT INTO sample_table (name, age) VALUES
('Alice', 30),
('Bob', 25),
('Charlie', 35);
SELECT * FROM sample_table;

Running the SELECT should return the three rows that were just inserted.

Conclusion

In this guide, we’ve walked through the process of setting up Greenplum on a single-node system using Rocky Linux on WSL2. This setup is ideal for development and testing purposes, allowing you to explore the powerful features of Greenplum in a controlled environment.

We started by enabling systemd in WSL2 to ensure that essential services, such as sshd, can be managed efficiently. Next, we installed the required packages and set up a dedicated gpadmin user with the necessary permissions to manage the Greenplum database. We also configured SSH for secure, passwordless communication, and set up ping to work for non-root users.

After downloading and installing the Greenplum software, we configured the environment for the gpadmin user, ensuring that all necessary environment variables were set. We then initialized the Greenplum database, carefully configuring the system to run on a single node.

Finally, we validated our setup by ensuring that the Greenplum database could be started and stopped without issues, confirming that our configuration was correct and functional.

Next Steps

With your Greenplum setup complete, you can now start exploring its capabilities. Use this single-node setup to develop and test your applications, experiment with data processing, and familiarize yourself with Greenplum’s powerful features. When you’re ready, you can scale your setup to a multi-node cluster for production use, leveraging the full power of Greenplum’s massively parallel processing capabilities. We will start doing this in the next set of blog posts.

Installing the Greenplum Client and Loader Tools on Windows

This section describes the client and loader tool package installation procedure for Windows systems.

Prerequisites

You must have operating system superuser privileges to install the tools package.

Procedure

Perform the following procedure to install the client and loader tools package on a Windows system.

  1. The Greenplum Database client and loader tools for Windows require a recent Microsoft Visual C++ Redistributable for Visual Studio 2017. You must download and install an update as described in the Microsoft support article titled The latest supported Visual C++ downloads.

  2. If you plan to use the gpload.bat Greenplum Database loader program for Windows:

    1. Ensure that a 64-bit version of Python 2.7 is installed on your system. Refer to Python 2.7.16 or the source of your choice for Python download and install instructions.
    2. You must also add the Python directory to your PATH.
  3. Locate the installer .msi file that you downloaded from Broadcom Support Portal in a previous step. The naming format of the Windows installer file is greenplum-db-clients-<version>-x86_64.msi.

    For more information about download prerequisites, troubleshooting, and instructions, see Download Broadcom products and software.

  4. Double-click on the greenplum-db-clients-<version>-x86_64.msi file to launch the installer.

  5. Click Next on the Greenplum Clients Setup Wizard Welcome screen.

  6. Read through the End-User License Agreement, and click I Agree to accept the terms of the license.

  7. By default, the Greenplum Database client and load tools are installed into the following directory:

    C:\Program Files\Greenplum\greenplum-clients\
    

    Click Browse on the Custom Setup screen to choose another location.

  8. Click Next when you have chosen the desired install path.

  9. Click Install to begin the installation.

  10. Click Finish to exit the Windows client and load tools installer.


Overview

  • The World’s First Open-Source & Massively Parallel Data Platform.
  • Greenplum is based on PostgreSQL Database

Minimum recommended specifications for servers intended to support Greenplum Database in a production environment.

  • Operating System
  • SUSE Linux SLES 10.2 or higher
  • CentOS 5.0 or higher
  • RedHat Enterprise Linux 5.0 or higher
  • Oracle Unbreakable Linux 5.5
  • Solaris x86 v10 update 7

User
gpadmin on all nodes

File Systems

  • xfs required for data storage on SUSE Linux and Red Hat (ext3 supported for root file system)
  • zfs required for data storage on Solaris (ufs supported for root file system)

Minimum CPU

  • Pentium Pro compatible (P3/Athlon and above)

Minimum Memory

  • 16 GB RAM per server

Disk Requirements

  • 150MB per host for Greenplum installation
  • Approximately 300MB per segment instance for meta data
  • Appropriate free space for data with disks at no more than 70% capacity
  • High-speed, local storage

Network Requirements

  • Gigabit Ethernet within the array
  • Dedicated, non-blocking switch

Software and Utilities

  • bash shell
  • GNU tar
  • GNU zip
  • GNU readline (Solaris only)

OS : Redhat Enterprise Linux 5 or 6

Host Name           IP Address     Host Type
2ndquadrant.in-1    192.168.2.2    Master
2ndquadrant.in-2    192.168.2.3    Standby master, segment host
2ndquadrant.in-3    192.168.2.4    Segment host
2ndquadrant.in-4    192.168.2.5    Segment host

Here’s a list of steps to keep on your desk

  1. Configure your operating system for Greenplum.
  2. Install Greenplum on the master host.
  3. Run gpseginstall to install and configure Greenplum Database on the other hosts.
  4. Create the master directory on the master.
  5. Create the master data directory location on your standby master (gpssh can help here).
  6. Create data directories on all segments (gpssh can help here).
  7. Use Network Time Protocol (NTP) to synchronize the system clocks on all hosts.
  8. Validate your systems.
  9. Initialize the Greenplum Database system.
  10. Configure environment variables and client authentication.

Step 1. Configure your operating system for Greenplum Database

Greenplum requires that certain operating system (OS) parameters be set on all hosts in your Greenplum Database system (masters and segments).

  • Change the hostname for each node and edit the /etc/hosts file to make sure that it includes the host names and all interface address names for every machine participating in your Greenplum Database system.

At Master:

#cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.3 Beta (Maipo)

— Network configuration

# cd /etc/sysconfig/network-scripts/
# vi ifcfg-ens33
BOOTPROTO=static
IPADDR=192.168.2.2
NETMASK=255.255.255.0
DEVICE="ens33"
ONBOOT=yes
#vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=2ndquadrant.in-1
GATEWAY=192.168.2.1
## Configure DNS Server
# vi /etc/resolv.conf
nameserver 8.8.8.8     # Replace with your nameserverip
nameserver 192.168.2.1  # Replace with your nameserverip

– Hosts Entry

# vi /etc/hosts
192.168.2.2  2ndquadrant.in-1  2ndquadrant.in
192.168.2.3  2ndquadrant.in-2  2ndquadrant.in
192.168.2.4  2ndquadrant.in-3  2ndquadrant.in
192.168.2.5  2ndquadrant.in-4  2ndquadrant.in

–Restart the network services

# systemctl restart network
        (OR)
# service network restart

 Note: Change the hostname for each segment and standby node using the methods above. For more reference:

https://www.tutorialdba.com/2018/03/how-to-make-partition-and-configure.html

  • Add the following kernel parameters to the end of your /etc/sysctl.conf file on all nodes, and then run sysctl -p. Note: XFS is the preferred file system on Linux platforms for data storage; Greenplum recommends the following xfs mount options (the xfs_mount_options line records those mount options and is not itself a sysctl parameter):
xfs_mount_options = rw,noatime,inode64,allocsize=16m
kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
vm.overcommit_memory = 2

--Then run:
# sysctl -p
  • Set the following parameters in the /etc/security/limits.conf file:
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072

Note:

# these nproc values are overridden in limits.d/90-nproc.conf

In /etc/security/limits.d/90-nproc.conf:
* soft nproc 131072
* hard nproc 131072
  • The Linux disk I/O scheduler for disk access supports different policies, such as CFQ, AS, and deadline. Greenplum recommends the following scheduler option: deadline

To specify a scheduler, run the following:

echo 'deadline' > /sys/block/devname/queue/scheduler
  • Each disk device file should have a read-ahead (blockdev) value of 16384.
To verify the read-ahead value of a disk device:
# /sbin/blockdev --getra /dev/sdb
You will need to set this for each data disk device (for example sda, sdb, sdc).
To set blockdev (read-ahead) on a device:
# /sbin/blockdev --setra 16384 /dev/sd?
Add to /etc/rc.local (the echo line below disables transparent huge pages, as recommended for Greenplum):
blockdev --setra 16384 /dev/sd?
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
  • Turn off the firewall and SELinux
These commands disable firewalld when run as root:
# systemctl stop firewalld.service
# systemctl disable firewalld.service
# chkconfig iptables off
# service iptables stop

These commands disable SELINUX when run as root
# vi /etc/selinux/config
  SELINUX=disabled
  • Create a user for Greenplum named gpadmin, in group gpadmin, with a bash shell and home directory /home/gpadmin.
Add the Greenplum Database admin account:
# groupadd gpadmin
# useradd gpadmin -g gpadmin
# passwd gpadmin
# New password: gpadmin
# Retype new password: gpadmin
  • Create a directory for the host files and configuration files that will be used in the installation:
    # cd /home/gpadmin
    # mkdir gpconfigs

Step 2. Install Greenplum on master host.

Greenplum database installation types are:

  1. Installing the RPM Distribution
  2. Installing the Binary Distribution

If you do not have root access on the master host machine, run the binary installer as the gpadmin user and install the software into a directory in which you have write permission.

1. Installing the RPM Distribution

  1. Log in as root on the machine that will become the Greenplum Database master host.
  2. Download or copy the RPM distribution file to the master host machine. The RPM distribution filename has the format greenplum-db-<version>-<platform>.rpm where <platform> is similar to RHEL7-x86_64 (Red Hat 64-bit) or SuSE12-x86_64 (SuSe Linux 64 bit).
  3. Install the local RPM file:
# rpm -Uvh ./greenplum-db-<version>-<platform>.rpm
Preparing...                ########################################### [100%]
1:greenplum-db           ########################################### [100%]

The RPM installation copies the Greenplum Database software into a version-specific directory, /usr/local/greenplum-db-<version>.

  4. Change the ownership and group of the installed files to gpadmin:
# chown -R gpadmin /usr/local/greenplum*
# chgrp -R gpadmin /usr/local/greenplum*

2. Installing the Binary Distribution

  1. Log in as root on the machine that will become the Greenplum Database master host.If you do not have root access on the master host machine, run the binary installer as the gpadmin user and install the software into a directory in which you have write permission.
  2. Download or copy the Binary Installation distribution file to the master host machine. The Binary Installer distribution filename has the format greenplum-db-<version>-<platform>.zip where <platform> is similar to RHEL7-x86_64 (Red Hat 64-bit) or SuSE12-x86_64 (SuSe Linux 64 bit).
  3. Unzip the installer file:
# unzip greenplum-db-<version>-<platform>.zip
  4. Launch the installer using bash:
# /bin/bash greenplum-db-<version>-<platform>.bin
  5. The installer prompts you to accept the Greenplum Database license agreement. Type yes to accept the license agreement.
  6. The installer prompts you to provide an installation path. Press ENTER to accept the default install path (/usr/local/greenplum-db-<version>), or enter an absolute path to a custom install location. You must have write permission to the location you specify.
  7. The installer installs the Greenplum Database software and creates a greenplum-db symbolic link one directory level above the version-specific installation directory. The symbolic link is used to facilitate patch maintenance and upgrades between versions. The installed location is referred to as $GPHOME.
  8. If you installed as root, change the ownership and group of the installed files to gpadmin:
# chown -R gpadmin /usr/local/greenplum*
# chgrp -R gpadmin /usr/local/greenplum*

Ref: important Greenplum database files and directories

Step 3. Run gpseginstall to install and configure Greenplum Database on the other hosts

  1. Log in to the master host as root:
$ su -
  2. Source the path file from your master host's Greenplum Database installation directory:
# source /usr/local/greenplum-db/greenplum_path.sh
  3. Create a file called hostfile_exkeys that has the machine configured host names and host addresses (interface names) for each host in your Greenplum system (master, standby master and segments).
# vi hostfile_exkeys
      2ndquadrant.in-1
      2ndquadrant.in-2 
      2ndquadrant.in-3
      2ndquadrant.in-4
  4. Run the following command to exchange ssh keys among the hosts:
# gpssh-exkeys -f hostfile_exkeys

5. Run the gpseginstall utility referencing the hostfile_exkeys file you just created. This example runs the utility as root. The utility creates the Greenplum system user gpadmin on all hosts and sets that user's password to "gpadmin" on all hosts.

# gpseginstall -f hostfile_exkeys -u gpadmin -p gpadmin

Note:

  • Use the -u and -p options to specify a different system user and password.
  • Do not use the default password option for production environments.
  • Change the password immediately after installation.

Validating the Installation:

To make sure the Greenplum software was installed and configured correctly, run the following confirmation steps from your Greenplum master host  as  gpadmin user.

  • Log in to the master host as gpadmin:
$ su - gpadmin
  • Source the path file from Greenplum Database installation directory:
$ source /usr/local/greenplum-db/greenplum_path.sh
  • Use the gpssh utility to see if you can log in to all hosts without a password prompt, and to confirm that the Greenplum software was installed on all hosts. Use the hostfile_exkeys file you used for installation. For example:
$ gpssh -f hostfile_exkeys -e ls -l $GPHOME

If the installation was successful, you should be able to log in to all hosts without a password prompt. All hosts should show that they have the same contents in their installation directories, and that the directories are owned by the gpadmin user.

If you are prompted for a password, run the following command to redo the ssh key exchange:

$ gpssh-exkeys -f hostfile_exkeys

Optional installation :

1. Installing Oracle Compatibility Functions

Many Oracle Compatibility SQL functions are available in Greenplum Database

You need to run the installation script $GPHOME/share/postgresql/contrib/orafunc.sql once for each database.

To install:

$ psql -d testdb -f $GPHOME/share/postgresql/contrib/orafunc.sql

To uninstall Oracle Compatibility Functions, use the script:

$GPHOME/share/postgresql/contrib/uninstall_orafunc.sql

Note: The following functions are available by default and can be accessed without running the Oracle Compatibility installer: sinh, tanh, cosh and decode.

2. Greenplum Database Extensions

Use gppkg (the Greenplum package manager) to install Greenplum Database extensions such as pgcrypto, PL/R, PL/Java, PL/Perl, and PostGIS, along with their dependencies, across an entire cluster.

For more info: https://gpdb.docs.pivotal.io/43230/install_guide/refs/gppkg.html
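
A minimal sketch of installing and then listing a package with gppkg; the package file name below is hypothetical and depends on the exact version downloaded for your platform:

$ gppkg -i pgcrypto-1.2-gp4.3-rhel5-x86_64.gppkg
$ gppkg -q --all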

Step 4. Create master directory on the master

The master does not store any user data; only the system catalog tables and system metadata are stored on the master instance, so you do not need to designate as much storage space as on the segments.

Note: For how to create a data partition on Linux RHEL 7, see https://www.tutorialdba.com/2018/03/how-to-make-partition-and-configure.html. In that example the /data partition is created with postgres user permissions; change the ownership to gpadmin instead.

Master : as root user

# su -
# mkdir /data/master

Change ownership of this directory to the gpadmin user:

# chown gpadmin /data/master

Step 5. Create the master data directory location on your standby master

As root User:

Using gpssh, create the master data directory location on your standby master as well. For example:

# source /usr/local/greenplum-db-4.3.x.x/greenplum_path.sh 
# gpssh -h 2ndquadrant.in-2  -e 'mkdir /data/master'
# gpssh -h 2ndquadrant.in-2  -e 'chown gpadmin /data/master'

Step 6. Create data directories on all segments:

  1. On the master host, log in as root:
su -
  2. Create a file called hostfile_gpssh_segonly. This file should have only one machine-configured host name for each segment host. For example, if you have three segment hosts:
2ndquadrant.in-2
2ndquadrant.in-3
2ndquadrant.in-4
  3. Using gpssh, create the primary and mirror data directory locations on all segment hosts at once using the hostfile_gpssh_segonly file you just created. For example:
# source /usr/local/greenplum-db-4.3.x.x/greenplum_path.sh 
# gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/primary'
# gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/mirror'
# gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/primary'
# gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/mirror'

Step 7. Use Network Time Protocol (NTP) to synchronize the system clocks on all hosts

  • All hosts require the same time zone setting for time consistency.
  • All hosts need to synchronize time over the network to keep an accurate time. As time passes, hosts’ internal clocks tend to drift, which can lead to inconsistent-time issues, especially in server and client log files, or when replicating server resources or databases.
  • Greenplum recommends using NTP (Network Time Protocol) to synchronize the system clocks on all hosts that comprise your Greenplum Database system.
  • NTP on the segment hosts should be configured to use the master host as the primary time source, and the standby master as the secondary time source.
  • On the master and standby master hosts, configure NTP to point to your preferred time server.

1. NTP installation and configuration on the master:

The NTP service is installed from a Yum repository using the following command.

# yum install ntp

Turn on the NTP service and make sure it starts automatically on reboot.

# service ntpdate start
# service ntpd start
# chkconfig ntpd on
# chkconfig ntpdate on

The NTP service is configured using the “/etc/ntp.conf” file. Configuration changes have to be followed by a restart of the NTP service.

# service ntpd restart

Checking the synchronization status and your system time

# ps -ef | grep ntp
# ntpq -p
# date -R

2. On the master host, log in as root and edit the /etc/ntp.conf file.

server 2ndquadrant.in-1 prefer
server 2ndquadrant.in-2

3. On the master host, use the NTP daemon to synchronize the system clocks on all Greenplum hosts. For example, using gpssh:

# gpssh -f /home/gpadmin/gpconfigs/hostfile_gpssh_segonly -v -e 'yum install ntp; service ntpdate start; service ntpd start; chkconfig ntpd on ; chkconfig ntpdate on'

Enabling the firewall

You should re-enable the firewall after the installation is complete.

As root (the first two commands apply to RHEL 7 firewalld, the last two to RHEL 6 iptables):

# systemctl start firewalld.service
# systemctl enable firewalld.service
# chkconfig iptables on
# service iptables start

Checking the iptables rules (RHEL 6):

# cat /etc/sysconfig/iptables
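
If the firewall is re-enabled, the Greenplum ports must stay reachable between the hosts. A sketch for RHEL 7 firewalld follows; it assumes the master port 5432 and a segment port range starting at PORT_BASE=40000 (used later in this guide) with 64 ports reserved, so adjust the range to the ports your cluster actually uses:

# firewall-cmd --permanent --add-port=5432/tcp
# firewall-cmd --permanent --add-port=40000-40063/tcp
# firewall-cmd --reload
# firewall-cmd --list-ports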

Step 8. Validating your systems

Use the gpcheck and gpcheckperf utilities to validate the configuration and performance of your systems.

The following tests should be run prior to initializing your Greenplum Database system.

  1. Validating OS Settings (gpcheck can help here)
  2. Validating Network Performance (gpcheckperf can help here)
  3. Validating Disk I/O and Memory Bandwidth (gpcheckperf can help here)
  • gpcheck verifies that all hosts in your array have the recommended OS settings for running a production Greenplum Database system.
  • gpcheckperf calls gpssh and gpscp, so these Greenplum utilities must be in your $PATH.

1. Validating OS Settings

  1. Log in on the master host as gpadmin.
  2. Source the greenplum_path.sh path file from your Greenplum installation. For example:
$ source /usr/local/greenplum-db/greenplum_path.sh
  3. Create a file called hostfile_gpcheck that has the machine-configured host names of each Greenplum host (master, standby master and segments), one host name per line. Make sure there are no blank lines or extra spaces. This file should have just a single host name per host. For example:
2ndquadrant.in-1
2ndquadrant.in-2
2ndquadrant.in-3
2ndquadrant.in-4
  4. Run the gpcheck utility using the host file you just created. For example:
$ gpcheck -f hostfile_gpcheck -m 2ndquadrant.in-1 -s 2ndquadrant.in-2
  5. After gpcheck finishes verifying OS parameters on all hosts (master and segments), you might be prompted to modify certain OS parameters before initializing your Greenplum Database system.

2. Validating Network Performance (optional)

To test network performance, run gpcheckperf with one of the network test run options: parallel pair test (-r N), serial pair test (-r n), or full matrix test (-r M). The utility runs a network benchmark program that transfers a 5 second stream of data from the current host to each remote host included in the test. By default, the data is transferred in parallel to each remote host and the minimum, maximum, average and median network transfer rates are reported in megabytes (MB) per second. If the summary transfer rate is slower than expected (less than 100 MB/s), you can run the network test serially using the -r n option to obtain per-host results. To run a full-matrix bandwidth test, you can specify -r M which will cause every host to send and receive data from every other host specified. This test is best used to validate if the switch fabric can tolerate a full-matrix workload.

Most systems in a Greenplum Database array are configured with multiple network interface cards (NICs), each NIC on its own subnet. When testing network performance, it is important to test each subnet individually. For example, consider the following network configuration of two NICs per host:

Network Interface Configuration

Greenplum Host       Subnet1 NIC   Subnet2 NIC
2ndquadrant.in-2     2nd2-1        2nd2-2
2ndquadrant.in-3     2nd3-1        2nd3-2
2ndquadrant.in-4     2nd4-1        2nd4-2

You would create two distinct host files for use with the gpcheckperf network test, one per subnet:

Network Test Host File Contents

hostfile_gpchecknet_ic1   hostfile_gpchecknet_ic2
2nd2-1                    2nd2-2
2nd3-1                    2nd3-2
2nd4-1                    2nd4-2

You would then run gpcheckperf once per subnet. For example (run in parallel pairs test mode):

$ gpcheckperf -f hostfile_gpchecknet_ic1 -r N -d /tmp > subnet1.out
$ gpcheckperf -f hostfile_gpchecknet_ic2 -r N -d /tmp > subnet2.out

Note:

  • If you have an odd number of hosts to test, run in serial test mode (-r n).
  • If you are testing an even number of hosts, run in parallel pairs test mode (-r N).
  • If hosts have multiple unbonded NICs, the convention is to append a dash (-) and a number to the host name. For example, sdw1-1 and sdw1-2 are the two interface names for host sdw1. However, NIC bonding is recommended to create a load-balanced, fault-tolerant network; this provides full network HA. For more about bonding types, see https://www.interserver.net/tips/kb/network-bonding-types-network-bonding/

3. Validating Disk I/O and Memory Bandwidth (optional)

Disk I/O Performance (dd test) – The disk test uses the dd command (a standard UNIX utility) to test the sequential throughput performance of a logical disk or file system.

Memory Bandwidth (stream test) – The memory test uses the STREAM benchmark program to measure sustainable memory bandwidth. Results are reported in MB per second (MB/s). To run both tests together, invoke gpcheckperf with the disk and stream test run options (-r ds).

To run the disk and stream tests

  1. Log in on the master host as gpadmin.
  2. Source the greenplum_path.sh path file from your Greenplum installation. For example:
$ source /usr/local/greenplum-db/greenplum_path.sh
  3. Create a host file named hostfile_gpcheckperf that has one host name per segment host. Do not include the master host. For example:
2ndquadrant.in-2
2ndquadrant.in-3
2ndquadrant.in-4
  4. Run the gpcheckperf utility using the hostfile_gpcheckperf file you just created. Use the -d option to specify the file systems you want to test on each host (you must have write access to these directories). You will want to test all primary and mirror segment data directory locations. For example:
$ gpcheckperf -f hostfile_gpcheckperf -r ds -D \
-d /data/primary \
-d /data/mirror
  5. The utility may take a while to perform the tests because it copies very large files between the hosts. When it is finished, you will see the summary results for the Disk Write, Disk Read, and Stream tests.

Step 9. Initialize Greenplum Database System

Steps to initialize the Greenplum database:

  1. Create a host file that contains the host addresses of your segments.
  2. Create the Greenplum Database configuration file (gpinitsystem_config, similar to an Oracle pfile). By default, Greenplum Database is initialized using the locale of the master host system. Make sure this is the correct locale you want to use, as some locale options cannot be changed after initialization.
  3. Run the Greenplum Database initialization utility on the master host.

1. Create a host file that contains the host addresses of your segments
  • Log in as gpadmin.
$ su - gpadmin
  • Create a file named hostfile_gpinitsystem. In this file add the host address name(s) of your segment host interfaces, one name per line, no extra lines or spaces. For example, if you have three segment hosts with two network interfaces each:
$ vi hostfile_gpinitsystem
2nd2-1
2nd2-2
2nd3-1
2nd3-2
2nd4-1
2nd4-2

Save and close the file.

2. Creating the Greenplum Database configuration file (gpinitsystem_config, similar to an Oracle pfile)

Your Greenplum Database configuration file tells the gpinitsystem utility how you want to configure your Greenplum Database system. An example configuration file can be found in $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config.

  • Log in as gpadmin.
$ su - gpadmin
  • Make a copy of the gpinitsystem_config file to a location writable by the gpadmin user:
$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/
  • Open the file you just copied in a text editor. Set all of the required parameters according to your environment. See the gpinitsystem reference for more information. A Greenplum Database system must contain a master instance and at least two segment instances (even if setting up a single-node system). The DATA_DIRECTORY parameter determines how many segments per host will be created. If your segment hosts have multiple network interfaces and you used their interface address names in your host file, the number of segments will be evenly spread over the number of available interfaces.

Here is an example of the required parameters in the gpinitsystem_config file:

$ cat /home/gpadmin/gpconfigs/gpinitsystem_config
	ARRAY_NAME="2ndquadrant.in"
	SEG_PREFIX=2ndseg
	PORT_BASE=40000
	declare -a DATA_DIRECTORY=(/data/primary /data/primary /data/primary)
	MASTER_HOSTNAME=2ndquadrant.in-1
	MASTER_DIRECTORY=/data/master
	MASTER_PORT=5432
	TRUSTED_SHELL=ssh
	CHECK_POINT_SEGMENTS=8
	ENCODING=UNICODE
  • (Optional) If you want to deploy mirror segments, uncomment and set the mirroring parameters according to your environment. Here is an example of the optional mirror parameters in the gpinitsystem_config file:
MIRROR_PORT_BASE=50000
REPLICATION_PORT_BASE=41000
MIRROR_REPLICATION_PORT_BASE=51000
declare -a MIRROR_DATA_DIRECTORY=(/data/mirror /data/mirror /data/mirror)
  • Save and close the file.

Note: You can initialize your Greenplum system with primary segments only and deploy mirrors later using the gpaddmirrors utility.
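
A minimal sketch of deploying mirrors afterwards with gpaddmirrors; the -p option sets the offset added to the primary port numbers to calculate the mirror ports, and the value 10000 is only an example:

$ gpaddmirrors -p 10000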

3. Run the Greenplum Database initialization utility on the master host

The gpinitsystem utility will create a Greenplum Database system using the values defined in the configuration file.

Syntax:

gpinitsystem -c cluster_configuration_file 
            [-h hostfile_gpinitsystem]
            [-B parallel_processes] 
            [-p postgresql_conf_param_file]
            [-s standby_master_host [-P standby_master_port]
              [-F standby_master_filespaces]]         
            [-m number | --max_connections=number]
            [-b size | --shared_buffers=size]
            [-n locale | --locale=locale] [--lc-collate=locale] 
            [--lc-ctype=locale] [--lc-messages=locale] 
            [--lc-monetary=locale] [--lc-numeric=locale] 
            [--lc-time=locale]
            [-e password | --su_password=password] 
            [-S] [-i] [-a] [-q] [-l logfile_directory] [-D]
            [-I input_configuration_file]
            [-O output_configuration_file]

gpinitsystem -v

gpinitsystem -h

To run the initialization utility

  1. Run the following command referencing the path and file name of your initialization configuration file (gpinitsystem_config) and host file (hostfile_gpinitsystem). For a fully redundant system (with a standby master and a spread-mirror configuration), include the -s and -S options:
$ gpinitsystem -c gpconfigs/gpinitsystem_config -h gpconfigs/hostfile_gpinitsystem \
 -s 2ndquadrant.in-2 -S -b 500MB
  2. The utility will verify your setup information and make sure it can connect to each host and access the data directories specified in your configuration. If all of the pre-checks are successful, the utility will prompt you to confirm your configuration. For example:
=> Continue with Greenplum creation? Yy/Nn
  3. Press y to start the initialization.
  4. The utility will then begin setup and initialization of the master instance and each segment instance in the system. Each segment instance is set up in parallel. Depending on the number of segments, this process can take a while.
  5. At the end of a successful setup, the utility will start your Greenplum Database system. You should see:
=> Greenplum Database instance successfully created.
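
At this point you can verify the cluster status with the gpstate utility (a sketch; it assumes greenplum_path.sh has been sourced and MASTER_DATA_DIRECTORY is set as described in Step 10 below, otherwise pass the master data directory with -d):

$ gpstate -s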

Step 10. Configuring environment variables and client authentication

1. Set up your user environment for Greenplum (.bash_profile)

Set these in your user’s startup shell profile (such as ~/.bashrc or ~/.bash_profile), or in /etc/profile if you want to set them for all users.

# su - gpadmin

$ vi .bash_profile
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/2ndseg-1
export PGDATABASE=postgres
export PGPORT=5432
export PGUSER=gpadmin 
export PGPASSWORD=gpadmin

Note: You can access the Greenplum database without a password prompt by using the PGPASSWORD variable or a ~/.pgpass file, as sketched below.
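
A sketch of a ~/.pgpass entry for the gpadmin user; the field order is host:port:database:user:password, and the file must be readable only by its owner or it will be ignored:

$ cat > ~/.pgpass <<'EOF'
*:5432:*:gpadmin:gpadmin
EOF
$ chmod 600 ~/.pgpass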

2. pg_hba.conf – host-based authentication

  • Client access and authentication is controlled by the standard PostgreSQL host-based authentication file, pg_hba.conf.
  • The authentication method for local database connections is initially trust, which means that Greenplum Database trusts any user logged in to the master server to access the database system using whatever database role they specify. Trust authentication is insufficiently secure for a production database system. You should replace all instances of trust authentication in the pg_hba.conf file with a stronger authentication method, such as ident for local connections or md5 passwords for remote connections.
  • The pg_hba.conf file just controls who can initiate a database session and how those connections are authenticated. By setting object privileges you can control database access.
  • Every Greenplum user is authenticated/authorized using five fields: TYPE, DATABASE, USER, ADDRESS and METHOD.

Each remote client access record should be in the following format:

#TYPE      DATABASE   USER   ADDRESS               AUTH-METHOD
host       database   user   address/CIDR          auth-method  [auth-options]
hostssl    database   user   address/CIDR          auth-method  [auth-options]
hostnossl  database   user   address/CIDR          auth-method  [auth-options]
host       database   user   IP-address  IP-mask   auth-method  [auth-options]
hostssl    database   user   IP-address  IP-mask   auth-method  [auth-options]
hostnossl  database   user   IP-address  IP-mask   auth-method  [auth-options]

Each UNIX-domain socket (any user on the local system) access record should be in the following format:

#TYPE   DATABASE   USER   AUTH-METHOD
local   database   user   auth-method  [auth-options]

Note: 255.0.0.0 represents an IPv4 CIDR mask length of IP/8, and 255.255.255.255 represents a CIDR mask length of IP/32.
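
For example, the following two entries are equivalent; the first uses an IP address with a separate netmask and the second uses CIDR notation (the 192.168.1.0/24 network and the md5 method are only illustrative):

host  all  gpadmin  192.168.1.0  255.255.255.0  md5
host  all  gpadmin  192.168.1.0/24              md5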

Editing pg_hba.conf

  1. Open the file $MASTER_DATA_DIRECTORY/pg_hba.conf in a text editor.
  2. Add a line to the file for each type of connection you want to allow. Records are read sequentially, so the order of the records is significant. Typically, earlier records will have tight connection match parameters and weaker authentication methods, while later records will have looser match parameters and stronger authentication methods. For example:
# allow the gpadmin user local access to all databases
# using ident authentication
local   all   gpadmin   ident         sameuser
host    all   gpadmin   127.0.0.1/32  ident
host    all   gpadmin   ::1/128       ident
# allow the 'dba' role access to any database from any 
# host with IP address 192.168.x.x and use md5 encrypted 
# passwords to authenticate the user
# Note that to use SHA-256 encryption, replace md5 with 
# password in the line below
host    all   dba   192.168.0.0/16  md5
# allow all roles access to any database from any 
# host and use ldap to authenticate the user. Greenplum role 
# names must match the LDAP common name.
host    all   all   192.168.0.0/32  ldap ldapserver=usldap1 
ldapport=1389 ldapprefix="cn=" 
ldapsuffix=",ou=People,dc=company,dc=com"

3. Save and close the file.

4. Reload the pg_hba.conf configuration file for your changes to take effect:

$ gpstop -u

3. postgresql.conf configuration file

  1. To limit the number of active concurrent sessions to your Greenplum Database system, you can configure the max_connections server configuration parameter. This is a local parameter, meaning that you must set it in the postgresql.conf file of the master, the standby master, and each segment instance (primary and mirror). The recommended value of max_connections on segments is 5-10 times the value on the master.
  2. When you set max_connections, you must also set the dependent parameter max_prepared_transactions. This value must be at least as large as the value of max_connections on the master, and segment instances should be set to the same value as the master.
  • In $MASTER_DATA_DIRECTORY/postgresql.conf (including the standby master):
max_connections=100
max_prepared_transactions=100
  • In SEGMENT_DATA_DIRECTORY/postgresql.conf for all segment instances:
max_connections=500
max_prepared_transactions=100

Using the gpconfig utility:

$ gpconfig -c max_connections -v 1000 -m 200
$ gpconfig -c max_prepared_transactions -v 200

4. Stop and restart your Greenplum Database:

$ gpstop -r

Check the max_connections setting:

$ gpconfig -s max_connections

Key points about max_connections:

  • The value on the segments must be greater than the value on the master. The recommended value of max_connections on segments is 5-10 times the value on the master.
  • The value of max_prepared_transactions must be greater than or equal to max_connections on the master.
  • When raising the number of connections, keep your system RAM in mind. A rough sizing estimate is (work_mem x number of master connections) + effective_cache_size + maintenance_work_mem + shared_buffers + WAL buffers + commit log buffers; for example, with work_mem = 32MB and 200 master connections, the work_mem contribution alone can reach roughly 6.4GB.

5. Try to connect to the postgres database using psql, for example:
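
A minimal sketch of the connection test; with the PGPORT, PGUSER, PGPASSWORD and PGDATABASE settings from the .bash_profile above, psql needs no extra arguments:

$ psql -d postgres
postgres=# SELECT version();
postgres=# \q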
