Use the following instructions to install PostgreSQL and set up a database on the appropriate hosts. It's useful to set a password for the PostgreSQL superuser (typically postgres). Note the host name and port number where you install PostgreSQL because you will need to specify them when you install the JDBC connector to PostgreSQL in a later step. PostgreSQL listens on port 5432 by default, but installations can be configured to use a different port, so confirm the port used in your environment.
You will also need to create a PostgreSQL database and user account for Cloudera Manager to use to store data. See your PostgreSQL documentation for more information about installation and configuration.
To install PostgreSQL on a Red Hat system:
To install PostgreSQL on a SLES system:
To install PostgreSQL on an Ubuntu system:
Configuring Your Systems to Support PostgreSQL
You must configure the PostgreSQL database to run as expected. This involves:
- Configuring PostgreSQL to accept network connections.
- Initializing the database to work with Cloudera Manager.
- Configuring the operating system to start PostgreSQL.
Configuring PostgreSQL to accept network connections
By default, PostgreSQL only accepts connections on the loopback interface. Remember to reconfigure PostgreSQL to accept connections from the Fully Qualified Domain Name (FQDN) of the machines hosting the management roles. If you do not make these changes, the management processes will not be able to connect to and use the database on which they depend.
Initializing and configuring the external PostgreSQL database
- Prepare the external PostgreSQL database for use with the Cloudera Manager Server.
- On Red Hat and SLES systems:
- On Debian/Ubuntu systems:
- Enable MD5 authentication. Edit pg_hba.conf, which is usually found in /var/lib/pgsql/data or /etc/postgresql/8.4/main. Add the following line:
Add this line before another line in the configuration file that references ident authentication.
You can modify the contents of this line to support different configurations. For example, if you want to access PostgreSQL from a different host, replace 127.0.0.1 with your IP address and update postgresql.conf, which is typically found in the same directory as pg_hba.conf, to include:
- Start the PostgreSQL database.
- On Red Hat and SLES systems:
- On Debian/Ubuntu systems:
- Configure the PostgreSQL server to start at boot.
- On Red Hat systems:
- On SLES systems:
- On Debian/Ubuntu systems:
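The MD5 authentication step above adds a host record to pg_hba.conf. The exact line from the original instructions is not reproduced here, so the following is only a minimal sketch of how such a record could be composed, assuming the standard pg_hba.conf field order (connection type, database, user, address, authentication method). The function name `pg_hba_md5_entry` and the sample address 192.0.2.15 are hypothetical. To accept remote connections, PostgreSQL must also listen on a non-loopback interface, which is controlled by the `listen_addresses` parameter in postgresql.conf.

```python
import ipaddress


def pg_hba_md5_entry(address: str, database: str = "all", user: str = "all") -> str:
    """Build a pg_hba.conf 'host' record that uses md5 password authentication.

    Hypothetical helper for illustration; it only formats text and does not
    touch any configuration file.
    """
    # Validate the address; a bare IPv4 address is treated as a /32 network.
    network = ipaddress.ip_network(address, strict=False)
    return f"host    {database}    {user}    {network.with_prefixlen}    md5"


# Loopback-only access, as in the default example that uses 127.0.0.1.
print(pg_hba_md5_entry("127.0.0.1"))
# Access from another host (192.0.2.15 is a placeholder documentation address).
print(pg_hba_md5_entry("192.0.2.15"))
```

Place the generated line before any existing line that references ident authentication, because PostgreSQL uses the first record that matches a connection.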
Creating the PostgreSQL Databases for Cloudera Manager
The next step involves creating databases and user accounts for all database-backed services in Cloudera Manager.
You must create databases for each of the following features that are part of the Management Services:
- Activity Monitor
- Service Monitor
- Report Manager
- Host Monitor
You can create these databases on the host where the Cloudera Manager Server will run, or on any other nodes in the cluster. For performance reasons, you should typically install each database on the host on which the service runs, as determined by the roles you will assign during installation or upgrade. In larger deployments or in cases where database administrators (DBAs) are managing the databases the services will use, databases may be separated from services, but do not undertake such an implementation lightly.
The examples that follow allow access only to a specific user on a specific host ('amon_user' on 'myhost1', 'smon_user' on 'myhost2', 'repman_user' on 'myhost3', or 'hmon_user' on 'myhost4'), where 'myhost1', 'myhost2', 'myhost3', and 'myhost4' refer to the name of the host on which you are creating the database. To restrict access in this way, you must use the hostname if this host will also have the corresponding role (Activity Monitor, Service Monitor, Report Manager, or Host Monitor respectively), as Cloudera recommends. If instead another host will have the corresponding role, and you want to allow access to the database only from that host, you must specify the fully qualified domain name of that host.
If you later decide to move the role (for example, Activity Monitor) to another machine, you must grant access to the corresponding user (for example 'amon_user') on the new host as well.
Note the values you enter for database names, user names, and passwords. The Cloudera Manager installation wizard requires this information to correctly connect to these databases.
The database must be configured to support UTF-8 character set encoding. The sample commands below include the required options to enable UTF-8 support.
To create the PostgreSQL Databases for Cloudera Manager:
- Connect to PostgreSQL.
- Create a database for the Activity Monitor feature and assign permissions to a database user. The database name, user name, and password can be anything you want.
- Create a database for the Service Monitor feature and assign permissions to a database user. The database name, user name, and password can be anything you want.
- Create a database for the Report Manager feature and assign permissions to a database user. The database name, user name, and password can be anything you want.
- Create a database for the Host Monitor feature and assign permissions to a database user. The database name, user name, and password can be anything you want.
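The original SQL commands for these steps are not shown here. As a rough sketch only, the statements for each feature could be generated as follows, assuming the standard PostgreSQL `CREATE ROLE ... LOGIN PASSWORD` and `CREATE DATABASE ... OWNER ... ENCODING 'UTF8'` forms. The helper name `create_db_statements`, the database names, and the passwords are hypothetical placeholders; only the user names come from the text above.

```python
import re

# Conservative pattern for unquoted PostgreSQL identifiers (an illustrative restriction).
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def create_db_statements(db_name: str, user: str, password: str) -> list[str]:
    """Return SQL that creates a login role and a UTF-8 database owned by it."""
    for name in (db_name, user):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    # Double any single quotes so the password is a valid SQL string literal.
    escaped = password.replace("'", "''")
    return [
        f"CREATE ROLE {user} LOGIN PASSWORD '{escaped}';",
        f"CREATE DATABASE {db_name} OWNER {user} ENCODING 'UTF8';",
    ]


# Placeholder database names and passwords for the four Management Services features.
features = {
    "amon": "amon_user",      # Activity Monitor
    "smon": "smon_user",      # Service Monitor
    "repman": "repman_user",  # Report Manager
    "hmon": "hmon_user",      # Host Monitor
}
for db_name, user in features.items():
    for statement in create_db_statements(db_name, user, f"{db_name}_password"):
        print(statement)
```

The printed statements can be run from a psql session opened as the PostgreSQL superuser. Record the names and passwords you choose, because the installation wizard asks for them.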
Configuring PostgreSQL Settings
There are several settings you should update to ensure your system performs as expected. Update these settings in the /etc/postgresql.conf file. Settings vary based on cluster size and resources.
Large Clusters
Large clusters may contain up to 1000 hosts. For large clusters consider the following suggestions as a starting point for settings.
- max_connections: For large clusters, each database is typically hosted on a different machine. The general rule is to allow each database on a host a maximum of 100 connections and then add 50 extra connections. As a result, in the normal case for large clusters, configure each of the five machines that hosts a single database for 150 connections. You may have to increase the system resources available to PostgreSQL, as described at http://www.postgresql.org/docs/9.1/static/kernel-resources.html.
- shared_buffers: 1024MB. Note that this requires that the operating system can allocate sufficient shared memory. See Postgres information on Managing Kernel Resources for more information on setting kernel resources.
- wal_buffers: 16MB. This value is derived from the shared_buffers value. Setting wal_buffers to approximately 3% of shared_buffers, up to a maximum of approximately 16MB, works well in most cases.
- checkpoint_segments: 128. The PostgreSQL Tuning Guide recommends values between 32 and 256 for write-intensive systems, such as this one.
- checkpoint_completion_target: 0.9. This setting is only available in PostgreSQL 8.3 and later. These versions are highly recommended.
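The two sizing rules in the list above (100 connections per database on a host plus 50 extra, and wal_buffers at roughly 3% of shared_buffers capped at about 16MB) are simple arithmetic. The following sketch works through them; the function names are hypothetical, and rounding to the nearest whole megabyte is an assumption made for illustration.

```python
def max_connections(databases_on_host: int) -> int:
    """100 connections per database hosted on the machine, plus 50 extra."""
    return 100 * databases_on_host + 50


def wal_buffers_mb(shared_buffers_mb: int) -> int:
    """About 3% of shared_buffers, rounded to whole MB and capped at 16MB."""
    return min(16, round(shared_buffers_mb * 0.03))


# Large cluster: one database per machine, shared_buffers of 1024MB.
print(max_connections(1))    # 150
print(wal_buffers_mb(1024))  # 16 (3% would be about 31MB, so the cap applies)

# Smaller setup: shared_buffers of 256MB.
print(wal_buffers_mb(256))   # 8
```

A host that carries more than one database needs a proportionally higher limit, for example 250 connections for two databases.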
Small to Mid-sized Clusters
For small to mid-sized clusters, consider the following suggestions as a starting point for settings. If resources are especially limited, consider reducing the buffer sizes and checkpoint segments further. Ongoing tuning may be required based on each machine's resource utilization. For example, if Cloudera Manager is running on the same machine as other roles, the following values may be acceptable:
- shared_buffers: 256MB
- wal_buffers: 8MB
- checkpoint_segments: 16
- checkpoint_completion_target: 0.9
Configuration Settings for Postgres 8.1
Cloudera recommends using PostgreSQL 8.4 or later. While more recent versions provide better results, earlier versions may be used. For example, Cloudera supports PostgreSQL 8.1, which is bundled with some older Linux distributions. If you use PostgreSQL 8.1, settings such as checkpoint_completion_target are not available. Consequently, consider using the following recommended settings:
- shared_buffers: 131072
- wal_buffers: 4096
- checkpoint_segments: 256
Note that because PostgreSQL 8.1 does not support entering parameters in MB, the preceding values are provided in buffers or segments. For example, each buffer is 8KB, so 131072 is equivalent to 1024 MB.
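The conversion between buffer counts and megabytes can be checked with a short calculation. This sketch assumes the 8KB buffer size stated above; the function names are hypothetical.

```python
BUFFER_SIZE_KB = 8  # size of one buffer, as stated in the text


def buffers_to_mb(buffers: int) -> int:
    """Convert a count of 8KB buffers to whole megabytes."""
    return buffers * BUFFER_SIZE_KB // 1024


def mb_to_buffers(megabytes: int) -> int:
    """Convert megabytes to a count of 8KB buffers."""
    return megabytes * 1024 // BUFFER_SIZE_KB


print(buffers_to_mb(131072))  # 1024, the shared_buffers value in MB
print(buffers_to_mb(4096))    # 32, the wal_buffers value in MB
print(mb_to_buffers(256))     # 32768 buffers for a 256MB shared_buffers
```

The same helpers can translate the MB-based values suggested for smaller clusters into buffer counts when configuring PostgreSQL 8.1.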
After updating database settings, you must restart PostgreSQL for the new settings to take effect.
Restarting PostgreSQL
After making database configuration changes, you must restart the database for the changes to be applied.
To restart PostgreSQL:
Backing up the Databases
Cloudera recommends that you periodically back up the databases that Cloudera Manager uses to store configuration, monitoring, and reporting data. Be sure to back up all of the databases you are using with Cloudera Manager:
- Cloudera Manager database: Contains all the information about what services you have configured, their role assignments, all configuration history, commands, users, and running processes. This is a relatively small database (<100MB), and is the most important to back up.
- Activity Monitor database: Contains information about past activities. In large clusters, this database can grow large.
- Service Monitor database: Contains monitoring information about daemons. In large clusters, this database can grow large.
- Report Manager database: Keeps track of disk utilization over time. Medium-sized.
- Host Monitor database: Contains information about host status. Relatively small.
Backing Up the PostgreSQL Database
It's important that you periodically back up the external PostgreSQL database that Cloudera Manager uses to store configuration information.
To back up the PostgreSQL database, you can simply back up the /var/lib/cloudera-scm-server-db directory.
You can also use the pg_dump utility to back up the external PostgreSQL database.
To use the pg_dump utility:
- Log in to the host where the Cloudera Manager Server is installed.
- Run the following command as root:
- Run the following command as root:
- Enter the password specified for the com.cloudera.cmf.db.password property on the last line of the db.properties file. Cloudera Manager generated the password for you during installation.
For more information about using the pg_dump utility, see this page.
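The exact commands for the steps above are not reproduced here. As a hedged sketch, a pg_dump invocation could be assembled as follows, using the standard pg_dump options `-h` (host), `-p` (port), `-U` (user), and `-f` (output file). The host, port, user, database name, and output path shown are placeholders; take the real values from your db.properties file. The helper name `pg_dump_command` is hypothetical.

```python
import shlex
from datetime import date
from typing import Optional


def pg_dump_command(host: str, port: int, user: str, db_name: str,
                    backup_dir: str = "/tmp", day: Optional[date] = None) -> list[str]:
    """Build the argument list for a plain-text pg_dump backup of one database."""
    day = day or date.today()
    # Date-stamped file name so that successive backups do not overwrite each other.
    outfile = f"{backup_dir}/{db_name}_backup.{day:%Y%m%d}.sql"
    return ["pg_dump", "-h", host, "-p", str(port), "-U", user, "-f", outfile, db_name]


# Placeholder connection details; substitute the values from db.properties.
command = pg_dump_command("localhost", 5432, "scm", "scm")
print(shlex.join(command))
```

The sketch only prints the command. When you run it, pg_dump prompts for the database password described in the last step.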