This article is completely off-line deployment CDH6 cluster in CentOS7.4. CDH6 deploy clusters and cluster CDH5 difference is relatively large.
Description: This article All operations are carried out under the root user.
document dowload
First, some installations CDH6 cluster file must first be downloaded outside the network environment is good.
Cloudera Manager 6.3.0
CM6 RPM: https: //archive.cloudera.com/cm6/6.3.0/redhat7/yum/RPMS/x86_64/
need to download all the RPM files in the link, save it to cloudera-repos
the directory.
ASC files: https: //archive.cloudera.com/cm6/6.3.0/allkeys.asc
also asc need to download a file, save to the same cloudera-repos
directory:
[root@node01 upload]# tree cloudera-repos/
cloudera-repos/
├── allkeys.asc
├── cloudera-manager-agent-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-daemons-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-db-2-6.3.0-1281944.el7.x86_64.rpm
├── enterprise-debuginfo-6.3.0-1281944.el7.x86_64.rpm
└── oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
MySQL JDBC driver
Requires 5.1.26 or later jdbc drive, click here Direct downloadmysql-connector-java-5.1.47.tar.gz
Cloudera Manager yum repository configuration
Note: Do not try to use FTP to build a library of CM YUM!
First install httpd
and createrepo
:
yum -y install httpd createrepo
Start the httpd
service and set the boot from the start:
systemctl start httpd
systemctl enable httpd
And then proceeds to the above-prepared storage directory Cloudera Manager RPM package cloudera-repos
under:
cd /data6/upload/cloudera-repos/
Generating metadata RPM:
createrepo .
chmod 777 -R cloudera-repos
Then the cloudera-repos
directory to httpd under the html directory:
mv cloudera-repos /var/www/html/
To ensure that these can be viewed through a browser RPM package:
Then create cm6 in the repo files (each node needs to be configured):
cd /etc/yum.repos.d
vim cloudera-manager.repo
Add the following:
[cloudera-manager]
name=Cloudera Manager 6.3.0
baseurl=http://node01/cloudera-repos/
gpgcheck=0
enabled=1
Save, quit, and then execute yum clean all && yum makecache
the command:
[root@master02 ~]# yum clean all && yum makecache
Loaded plugins: fastestmirror, langpacks
Cleaning repos: ChinaUnicom-Packages cloudera-manager
Cleaning up everything
Maybe you want: rm -rf /var/cache/yum, to also free up space taken by orphaned data from disabled or removed repos
Loaded plugins: fastestmirror, langpacks
ChinaUnicom-Packages | 3.6 kB 00:00:00
cloudera-manager | 2.9 kB 00:00:00
(1/7): ChinaUnicom-Packages/group_gz | 156 kB 00:00:00
(2/7): ChinaUnicom-Packages/filelists_db | 3.1 MB 00:00:00
(3/7): ChinaUnicom-Packages/primary_db | 3.1 MB 00:00:00
(4/7): ChinaUnicom-Packages/other_db | 1.2 MB 00:00:00
(5/7): cloudera-manager/filelists_db | 118 kB 00:00:00
(6/7): cloudera-manager/other_db | 1.0 kB 00:00:00
(7/7): cloudera-manager/primary_db | 8.6 kB 00:00:00
Determining fastest mirrors
Metadata Cache Created
Cloudera Manager Server installation
This step only need to operate on the CM Server node.
Execute the following command:
# 安装openjdk8
yum install oracle-j2sdk1.8
# 安装 cm manager(只需在server节点安装)
yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
We will need a lot of dependencies, so that still need to take a LAN yum source or manually install the rpm package
Local Parcel configuration repository
Cloudera Manager Server After the installation is complete, go to your local Parcel repository directory:
cd /opt/cloudera/parcel-repo
The first part of the download of CDH parcels upload the files to the directory, and then perform the modification sha file:
mv /data6/upload/parcels/* /opt/cloudera/parcel-repo/
mv CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha1 CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
Then execute the following command to change the file owner:
chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
The final /opt/cloudera/parcel-repo
catalog as follows:
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
└── manifest.json
Install Database
MySQL installation instructions have been prepared in the environment section, where you skipped MySQL installed.
Database Configuration
CDH to have an official recommendation of the MySQL configuration elements:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
Configuring mysql jdbc driver
Download from the previous good mysql-connector-java-5.1.47.tar.gz
package unzip the mysql-connector-java-5.1.47-bin.jar
file, the mysql-connector-java-5.1.47-bin.jar
file is uploaded to the CM Server node /usr/share/java/
and rename the directory mysql-connector-java.jar
(If the /usr/share/java/
directory does not exist, you need to manually create):
tar zxvf mysql-connector-java-5.1.47.tar.gz
mkdir -p /usr/share/java/
cp mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar
CDH create the database needed
The need to install the service reference table and a database created corresponding to a user, user name easy to be recorded and the corresponding password database utf8 encoding must be used to create a database:
Service Name | data storage name | username |
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Hue | hue | hue |
Hive Metastore Server | metastore | hive |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | are not | are not |
Cloudera Navigator Metadata Server | navms | navms |
Oozie | oozie | oozie |
Create a database and corresponding user:
# scm
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
# amon
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
# rman
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
# hue
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
# hive
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
# sentry
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
# nav
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
# navms
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
# oozie
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
# flush
FLUSH PRIVILEGES;
Set Cloudera Manager database
Cloudera Manager Server includes a script to configure the database.
mysql数据库与CM Server是同一台主机
执行命令:/opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
mysql数据库与CM Server不在同一台主机上
执行命令:/opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h <mysql-host-ip> --scm-host <cm-server-ip> scm scm
[root@master02 ~]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h 10.172.54.51 --scm-host 10.172.54.52 scm scm
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.8.0_181-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
安装CDH节点
启动Cloudera Manager Server服务
systemctl start cloudera-scm-server
然后等待Cloudera Manager Server启动,可能需要稍等一会儿,可以通过命令tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log
去监控服务启动状态。
当看到INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
日志打印出来后,说明服务启动成功,可以通过浏览器访问Cloudera Manager WEB界面了。
访问Cloudera Manager WEB界面
打开浏览器,访问地址:http://<server_host>:7180
,默认账号和密码都为admin:
欢迎页面
首先是Cloudera Manager的欢迎页面,点击页面右下角的【继续】按钮进行下一步:
### 接受条款
勾选接受条款,点击【继续】进行下一步:
### 版本选择
这里我就选择免费版了:
第二个欢迎界面
选择版本以后会出现第二个欢迎界面,不过这个是安装集群的欢迎页:
选择主机
这一步是要搜索并选择用于安装CDH集群的主机,在主机名称后面的输入框中输入各个节点的hostname,中间使用英文逗号分隔开,然后点击搜索,在结果列表中勾选要安装CDH的节点即可:
指定存储库
Cloudera Manager Agent
这里选择自定义,填写上面使用httpd搭建好的Cloudera Manager YUM 库URL:
CDH and other software
如果我们之前的【配置本地Parcel存储库】步骤操作无误的话,这里会自动选择【使用Parcel】,并加载出CDH版本,确认无误后点击【继续】:
因此,不需要自己手动安装 Cloudera Manager Agent了
JDK安装选项
这一步骤我就不再勾选安装JDK了,因为我在环境准备部分已经安装过了。取消勾选,然后继续:
SSH登录配置
用于配置集群主机之间的SSH登录,填写root用户的密码,根据集群配置填写合适的【同时安装数量】值即可:
安装Agent
到这一步会自动进行节点Agent的安装,稍等一会儿,即可安装完成:
安装Parcels
这一步同样是自动安装,分配步骤的速度主要取决于网络环境,耐心等待即可:
主机检查
等待检查完成即可:
Cloudera 建议将 /proc/sys/vm/swappiness
设置为最大值 10。当前设置为 30。使用 sysctl
命令在运行时更改该设置并编辑 /etc/sysctl.conf
,以在重启后保存该设置。您可以继续进行安装,但 Cloudera Manager 可能会报告您的主机由于交换而运行状况不良。
解决方法:
临时修改:
sysctl vm.swappiness=10
cat /proc/sys/vm/swappiness
这里我们的修改已经生效,但是如果我们重启了系统,又会变成30.
永久修改:
在/etc/sysctl.conf
文件里添加如下参数:
vm.swappiness=10
或者:
echo 'vm.swappiness=10'>> /etc/sysctl.conf
已启用透明大页面压缩,可能会导致重大性能问题。请运行echo never > /sys/kernel/mm/transparent_hugepage/defrag
和echo never > /sys/kernel/mm/transparent_hugepage/enabled
以禁用此设置,然后将同一命令添加到 /etc/rc.local
等初始化脚本中,以便在系统重启时予以设置。
安装上面的提示执行即可;
安装CDH集群
选择服务类型
这里我选择自定义服务,Zookeeper, HDFS,Yarn:
可以先安装基础组件,然后用到啥在安装啥
如果所有服务都安装,可能安装过程中会出现很多问题
角色分配
CDH会自动给出一个角色分配,如果觉得不合理,我们可以手动调整一下,注意角色分配均衡:
数据库设置
错误问题
no leveldbjni-1.8 in java.library.path
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Exception in thread "main" java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-1.8.so, Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni.so, no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-64-1-4792575431304239050.8: libstdc++.so.6: : , /tmp/libleveldbjni-64-1-6079277982211108711.8: libstdc++.so.6: : ]
出错原因:当前节点的glibc升级有关。既然不存在leveldbjni的库,那便给他安装一个。
安装leveldbjni库的方式非常有趣:
1) 首先下载leveldbjni-all-1.8.jar
2)解压该jar包,在\META-INF\native\linux64目录下找到libleveldbjni.so文件
3) 将libleveldbjni.so文件上传到1中java.library.path中
卸载Cloudera Manager
如果因为其他原因,需要卸载Cloudera Manager,在各节点执行如下步骤即可。
systemctl stop cloudera-scm-server
systemctl stop cloudera-scm-agent
yum -y remove 'cloudera-manager-*'
yum clean all
umount cm_processes
umount /var/run/cloudera-scm-agent/process
rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /var/log/cloudera* /var/run/cloudera*
rm -rf /tmp/.scmpreparenode.lock
rm -Rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper
rm -rf /var/lib/hadoop-* /var/lib/impala /var/lib/solr /var/lib/zookeeper /var/lib/hue /var/lib/oozie /var/lib/pgsql /var/lib/sqoop2 /data/dfs/ /data/impala/ /data/yarn/ /dfs/ /impala/ /yarn/ /var/run/hadoop-*/ /var/run/hdfs-*/ /usr/bin/hadoop* /usr/bin/zookeeper* /usr/bin/hbase* /usr/bin/hive* /usr/bin/hdfs /usr/bin/mapred /usr/bin/yarn /usr/bin/sqoop* /usr/bin/oozie /etc/hadoop* /etc/zookeeper* /etc/hive* /etc/hue /etc/impala /etc/sqoop* /etc/oozie /etc/hbase* /etc/hcatalog
systemctl stop mariadb
yum -y remove mariadb-*
rm -rf /var/lib/mysql
rm -rf /var/log/mysqld.log
rm -rf /usr/lib64/mysql
rm -rf /usr/share/mysql
rm -rf /opt/cloudera
问题
ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
2019-08-27 20:35:50,469 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
2019-08-27 20:35:50,600 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
java.util.concurrent.ExecutionException: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
at com.ning.http.client.providers.netty.future.NettyResponseFuture.abort(NettyResponseFuture.java:231)
at com.ning.http.client.providers.netty.request.NettyRequestSender.abort(NettyRequestSender.java:422)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:290)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithCertainForceConnect(NettyRequestSender.java:142)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequest(NettyRequestSender.java:117)
at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.execute(NettyAsyncHttpProvider.java:87)
at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:506)
at com.ning.http.client.AsyncHttpClient$BoundRequestBuilder.execute(AsyncHttpClient.java:229)
at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfoFuture(ParcelDownloaderImpl.java:592)
at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfo(ParcelDownloaderImpl.java:544)
at com.cloudera.parcel.components.ParcelDownloaderImpl.syncRemoteRepos(ParcelDownloaderImpl.java:357)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:464)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:459)
at com.cloudera.cmf.persist.ReadWriteDatabaseTaskCallable.call(ReadWriteDatabaseTaskCallable.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at java.net.InetAddress.getByName(InetAddress.java:1076)
at com.ning.http.client.NameResolver$JdkNameResolver.resolve(NameResolver.java:28)
at com.ning.http.client.providers.netty.request.NettyRequestSender.remoteAddress(NettyRequestSender.java:358)
at com.ning.http.client.providers.netty.request.NettyRequestSender.connect(NettyRequestSender.java:369)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:283)
... 15 more
不影响使用
安装hive报错:org.apache.hadoop.hive.metastore.HiveMetaException: Failed to retrieve schema tables from Hive Metastore DB,Not supported
[root@master01 ~]# rpm -qa|grep mysql-connector-java
mysql-connector-java-5.1.25-3.el7.noarch
jdbc版本不对,要求使用5.1.26以上版本的jdbc驱动