I. Cluster Planning
Here we build a three-node Storm cluster: the Supervisor and LogViewer services are deployed on all three hosts. To ensure availability, in addition to the primary Nimbus service deployed on hadoop001, a backup Nimbus service is deployed on hadoop002. The Nimbus services are coordinated through the ZooKeeper cluster: if the primary Nimbus becomes unavailable, the backup Nimbus takes over as the new primary.
II. Prerequisites
Storm depends on Java 7+ and Python 2.6.6+, so both need to be installed in advance. In addition, to ensure high availability, we do not use Storm's built-in ZooKeeper but an external ZooKeeper cluster. Since these three pieces of software are dependencies of multiple frameworks, their installation steps are collected separately:
- Installing the JDK on Linux
- Installing Python on Linux
- Setting up ZooKeeper in standalone and clustered environments
III. Cluster Setup
1. Download and Unpack
Download the installation package and then unpack it. Official downloads: http://storm.apache.org/downloads.html
# Unpack the archive
tar -zxvf apache-storm-1.2.2.tar.gz
2. Configure Environment Variables
# vim /etc/profile
Add the following environment variables:
export STORM_HOME=/usr/app/apache-storm-1.2.2
export PATH=$STORM_HOME/bin:$PATH
Make the configuration take effect:
# source /etc/profile
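To confirm the variables took effect in the current shell, you can re-check them directly (a minimal sketch, re-creating the exports with the path configured above):

```shell
# Recreate the exports from /etc/profile and verify that PATH picks them up
export STORM_HOME=/usr/app/apache-storm-1.2.2
export PATH=$STORM_HOME/bin:$PATH

# Both checks print "ok" when the configuration is in place
[ -n "$STORM_HOME" ] && echo "STORM_HOME ok"
case ":$PATH:" in *":$STORM_HOME/bin:"*) echo "PATH ok" ;; esac
```

On a real node, the storm command should now also resolve from any directory.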
3. Cluster Configuration
Modify the ${STORM_HOME}/conf/storm.yaml file as follows:
# List of hosts in the ZooKeeper cluster
storm.zookeeper.servers:
- "hadoop001"
- "hadoop002"
- "hadoop003"
# List of Nimbus nodes
nimbus.seeds: ["hadoop001","hadoop002"]
# Directory on local disk where Nimbus and Supervisor store a small amount of state (jars, config files, etc.)
storm.local.dir: "/home/storm"
# Ports for worker processes; each worker uses one port to receive messages
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
The supervisor.slots.ports parameter configures the ports on which worker processes receive messages. By default, four workers are started on each Supervisor node; you can of course adjust this to your needs and server capacity. If you only want to start two workers, just configure two ports here.
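For example, a two-worker setup would list only two ports (the port numbers here follow the defaults above):

```yaml
supervisor.slots.ports:
- 6700
- 6701
```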
4. Distribute the Installation Package
Distribute the Storm installation package to the other servers; after doing so, it is recommended to configure the Storm environment variables on those two servers as well.
scp -r /usr/app/apache-storm-1.2.2/ root@hadoop002:/usr/app/
scp -r /usr/app/apache-storm-1.2.2/ root@hadoop003:/usr/app/
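With more nodes, a small loop avoids repeating the command. This sketch only prints each scp invocation as a dry run; remove the echo to actually copy:

```shell
# Print the scp command for each remaining node (dry run).
# Remove "echo" to perform the actual copy.
STORM_DIR=/usr/app/apache-storm-1.2.2
for host in hadoop002 hadoop003; do
  echo scp -r "$STORM_DIR/" "root@${host}:/usr/app/"
done
```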
IV. Starting the Cluster
4.1 Start the ZooKeeper Cluster
Start the ZooKeeper service on each of the three servers:
zkServer.sh start
4.2 Start the Storm Cluster
Since multiple processes need to be started, all of them are launched as background processes. Change to the ${STORM_HOME}/bin directory and execute the following commands:
On hadoop001 and hadoop002:
# Start the master node: nimbus
nohup sh storm nimbus &
# Start the worker node: supervisor
nohup sh storm supervisor &
# Start the web UI: ui
nohup sh storm ui &
# Start the log viewer service: logviewer
nohup sh storm logviewer &
On hadoop003:
hadoop003 only needs the supervisor service and the logviewer service:
# Start the worker node: supervisor
nohup sh storm supervisor &
# Start the log viewer service: logviewer
nohup sh storm logviewer &
4.3 Check the Cluster
Use the jps command to view the running processes on each of the three servers:
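The daemon names Storm registers with the JVM are nimbus, supervisor, core (the UI typically appears as core in jps), and logviewer. The check below greps a captured jps listing for illustration (the PIDs are placeholders); on a live node, pipe the output of jps directly:

```shell
# Illustrative jps listing for hadoop001/hadoop002; hadoop003 would show
# only supervisor and logviewer. PIDs are placeholders.
jps_listing="2080 nimbus
2223 supervisor
2351 core
2406 logviewer"

# Prints one "<daemon> running" line per daemon found in the listing
for daemon in nimbus supervisor core logviewer; do
  echo "$jps_listing" | grep -q " ${daemon}\$" && echo "${daemon} running"
done
```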
Access port 8080 on hadoop001 or hadoop002 to open the Storm UI. If it shows one primary and one backup Nimbus, three Supervisors, and four slots (i.e. four available worker processes) per Supervisor, the cluster has been set up successfully.
V. High-Availability Verification
Here we manually simulate a failure of the primary Nimbus by killing the Nimbus process on hadoop001 with the kill command. Afterwards, you can see that the Nimbus on hadoop001 is in the Offline state, and the Nimbus on hadoop002 has become the new Leader.
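Locating the Nimbus PID can be scripted from jps output. This sketch uses a captured listing as a stand-in for a live jps call and only prints the kill command rather than executing it:

```shell
# Extract the nimbus PID from jps-style output (captured listing as a
# stand-in; on a live node use: jps). Printing keeps this a dry run.
jps_listing="2080 nimbus
2223 supervisor"
nimbus_pid=$(echo "$jps_listing" | awk '$2 == "nimbus" {print $1}')
echo "would run: kill ${nimbus_pid}"
```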
For more articles in the big data series, see the GitHub open-source project: Big Data Getting Started.