Quick setup of MySQL, Flume, ZooKeeper, and Kafka

Copyright notice: this is an original article by the blogger 大壮; do not repost without the author's permission. https://blog.csdn.net/qq_33792843/article/details/84395439

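I'm getting ready to do real-time data computation.

The data source is about 20 MySQL tables. Flume parses the binlog and sinks the events to Kafka; Spark Streaming consumes them, processes the business data in real time, and writes the resulting target data back to our MySQL.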
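To make the pipeline more concrete, here is a minimal sketch of what the Flume agent configuration might look like, written out by a small shell function. The agent name `a1`, the component names, the channel sizes, and the function name `write_agent_conf` are assumptions for illustration. The source class `com.example.flume.BinlogSource` is a placeholder: this post does not say which binlog source is used, and as far as I know stock Flume does not ship one. The sink type and its `kafka.bootstrap.servers` and `kafka.topic` properties follow the Flume 1.8 user guide; whether that sink works with the Kafka version installed later is something to verify against the Flume documentation.

```shell
#!/bin/sh
# Write a sketch of a Flume agent config to the path given as $1.
write_agent_conf() {
    cat > "$1" <<'EOF'
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Placeholder class name: substitute the binlog source you actually use.
a1.sources.r1.type = com.example.flume.BinlogSource
a1.sources.r1.channels = c1

# In-memory channel; the sizes are arbitrary example values.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# Kafka sink writing to the "test" topic on a local broker.
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = localhost:9092
a1.sinks.k1.kafka.topic = test
a1.sinks.k1.channel = c1
EOF
}

# Example: write the sketch to a scratch file and show it.
conf_file="${TMPDIR:-/tmp}/binlog-to-kafka.conf"
write_agent_conf "$conf_file"
cat "$conf_file"
```

An agent would then be started with something like flume-ng agent --conf conf --conf-file <that file> --name a1.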

I. MySQL setup

0. Check whether MySQL is already installed and remove any installed packages

yum list installed "mysql*"
 
yum remove mysql-community-client.x86_64 mysql-community-common.x86_64 mysql-community-devel.x86_64 mysql-community-libs.x86_64 mysql-community-libs-compat.x86_64 mysql-community-server.x86_64 mysql80-community-release.noarch
 
 
Remove the MySQL directory
 
[root@centos1 ~]# whereis mysql
mysql: /usr/local/mysql
[root@centos1 ~]# rm -rf  /usr/local/mysql


 
1. Update the MySQL repository
rpm -Uvh https://repo.mysql.com/mysql80-community-release-el6.rpm
 
2. Install MySQL
yum install -y mysql-community-server mysql-community-client
 
 
 
Download speed varies; please wait.
 
 
 
3. Enable mysqld for the relevant runlevels
chkconfig --level 2345 mysqld on
 
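Aside:
2, 3, 4, and 5 correspond to different runlevels:

0 - halt (shuts the system down)
1 - single user mode (single-user maintenance mode, used when the system has problems)
2 - Multi-user, without NFS (like runlevel 3 below, but without NFS service)
3 - Full multi-user mode (text-only mode with full networking)
4 - unused (reserved by the system)
5 - X11 (like runlevel 3, but also loads X Window)
6 - reboot (restarts the system)

If 2, 3, 4, and 5 are not on, the nginx service is not started in runlevels 2, 3, 4, and 5 (that is, nginx will not start at boot).
For example,
chkconfig --level 2345 nginx on sets the nginx service to start in runlevels 2, 3, 4, and 5 (that is, it enables nginx at boot).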
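To confirm the setting took effect, the output of `chkconfig --list <service>` can be checked. The sketch below pulls out the runlevels marked "on" from one such line. The function name `runlevels_on` is made up for this example, and the sample line is typed by hand rather than captured from a real host.

```shell
#!/bin/sh
# Print the runlevels marked "on" in one line of `chkconfig --list <service>` output.
runlevels_on() {
    # Put each whitespace-separated field (e.g. "2:on") on its own line,
    # then keep the level number of every field whose state is "on".
    printf '%s\n' "$1" | tr -s ' \t' '\n' |
        awk -F: '$2 == "on" { printf "%s", $1 } END { print "" }'
}

# Hand-written sample line for illustration.
sample='mysqld          0:off   1:off   2:on    3:on    4:on    5:on    6:off'
runlevels_on "$sample"    # prints 2345
```

On a real host this would be called as runlevels_on "$(chkconfig --list mysqld)".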
 
4. Start the service
service mysqld start
 
5. Look up the initial password
grep password /var/log/mysqld.log
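The log line of interest ends with the generated password, so it can be extracted rather than read by eye. This is a small sketch: the function name `extract_temp_password` is made up, and the sample log line and its password are invented for the demo.

```shell
#!/bin/sh
# Print the most recent temporary root password recorded in the mysqld log file $1.
extract_temp_password() {
    # The password is the last whitespace-separated field of the matching line.
    grep 'temporary password' "$1" | tail -n 1 | awk '{ print $NF }'
}

# Demo on a hand-written sample log (the password here is made up).
sample_log="${TMPDIR:-/tmp}/mysqld-sample.log"
cat > "$sample_log" <<'EOF'
2018-11-23T02:10:11.123456Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: Xy3#sampleOnly
EOF
extract_temp_password "$sample_log"
```

On a real host the call would be extract_temp_password /var/log/mysqld.log, and the result is what mysql -uroot -p expects at the prompt.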

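  6. I can't be bothered with a password, since this is only a test. The repository installed above provides MySQL 8.0, where the old Password column of mysql.user and the PASSWORD() function no longer exist, so use ALTER USER instead. If the validate_password component is enabled, it may reject an empty password.

mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY '' ; 
mysql> flush privileges ; 
mysql> quit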

II. Flume installation

wget http://mirrors.tuna.tsinghua.edu.cn/apache/flume/1.8.0/apache-flume-1.8.0-bin.tar.gz

tar -zxvf apache-flume-1.8.0-bin.tar.gz

mv apache-flume-1.8.0-bin /usr/local/flume

Set the environment variables

vim /etc/profile.d/flume.sh

source /etc/profile
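The post does not show what goes into /etc/profile.d/flume.sh. A minimal sketch, assuming Flume was moved to /usr/local/flume as in the step above (the variable name `FLUME_HOME` is a common convention, not something this post specifies):

```shell
# /etc/profile.d/flume.sh (assumed contents)
# Install location used earlier in this post.
export FLUME_HOME=/usr/local/flume
# Make flume-ng available on the command line.
export PATH="$PATH:$FLUME_HOME/bin"
```

After sourcing the profile, flume-ng version is a quick way to check that the PATH entry works.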

   Set JAVA_HOME in Flume's configuration file

cd /usr/local/flume/conf
cp flume-env.sh.template flume-env.sh

vim flume-env.sh
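The only line that needs adding in flume-env.sh is the JAVA_HOME export. The path below is hypothetical; it depends entirely on where the JDK is installed on your machine.

```shell
# conf/flume-env.sh (sketch)
# Hypothetical JDK location; replace it with the JDK path on your machine.
export JAVA_HOME=/usr/java/default
```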
 

III. ZooKeeper installation

1.1 Download

cd /usr/local/kafka
wget https://archive.apache.org/dist/zookeeper/zookeeper-3.3.6/zookeeper-3.3.6.tar.gz
tar -zxvf zookeeper-3.3.6.tar.gz
vim /etc/profile
 

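1.2 Install

Just extract the archive with tar into the directory where you want it installed. Version 3.4.5 is used as the example here (the download step above fetched 3.3.6; substitute whichever version you actually downloaded).

Here it is extracted to /usr/myapp; change this to your own install directory as needed (note that if you change it, the paths in the later commands and configuration files must be changed to match).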

tar -zxf zookeeper-3.4.5.tar.gz -C /usr/myapp

1.3 Configure

Create data and logs directories under the installation root to hold data and logs:

cd /usr/myapp/zookeeper-3.4.5
mkdir data
mkdir logs

Create a zoo.cfg file in the conf directory, write the following content, and save it:

tickTime=2000
dataDir=/usr/myapp/zookeeper-3.4.5/data
dataLogDir=/usr/myapp/zookeeper-3.4.5/logs
clientPort=2181
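The directory creation and the zoo.cfg above can be combined into one step, which keeps the paths in the file consistent with the install directory. A minimal sketch; the function name `write_zoo_cfg` is made up, and the demo writes to a scratch directory instead of /usr/myapp.

```shell
#!/bin/sh
# Create data/ and logs/ under the ZooKeeper home given as $1 and write conf/zoo.cfg.
write_zoo_cfg() {
    zk_home=$1
    mkdir -p "$zk_home/data" "$zk_home/logs" "$zk_home/conf" || return 1
    # Same four settings as above, with the paths derived from $zk_home.
    cat > "$zk_home/conf/zoo.cfg" <<EOF
tickTime=2000
dataDir=$zk_home/data
dataLogDir=$zk_home/logs
clientPort=2181
EOF
}

# Demo against a scratch directory so nothing under /usr is touched.
demo_home="${TMPDIR:-/tmp}/zookeeper-demo"
write_zoo_cfg "$demo_home"
cat "$demo_home/conf/zoo.cfg"
```

For the layout used in this post the call would be write_zoo_cfg /usr/myapp/zookeeper-3.4.5.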

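1.4 Start and stop

Go into the bin directory. To start, stop, restart, and check the current node's status (including its role in a cluster), run the following respectively: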

./zkServer.sh start
./zkServer.sh stop
./zkServer.sh restart
./zkServer.sh status
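The status command reports the node's role on a line beginning with "Mode:" (for example standalone, leader, or follower). A small sketch for pulling that value out in a script; the function name `zk_mode` is made up and the input in the demo is typed by hand.

```shell
#!/bin/sh
# Read `zkServer.sh status` output on stdin and print the node's mode, if any.
zk_mode() {
    # Keep only the value after "Mode: "; print nothing when no such line exists.
    sed -n 's/^Mode: //p'
}

# Hand-written sample output for illustration.
printf 'Using config: /usr/myapp/zookeeper-3.4.5/bin/../conf/zoo.cfg\nMode: standalone\n' | zk_mode
```

On a real host: ./zkServer.sh status 2>&1 | zk_mode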


A fuller zoo.cfg for reference (this one assumes ZooKeeper lives under /usr/software/zookeeper):

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/usr/software/zookeeper/data
dataLogDir=/usr/software/zookeeper/logs
# the port at which the clients will connect
clientPort=2181
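Because initLimit and syncLimit are counted in ticks, their real durations come from multiplying by tickTime. A quick worked calculation with the values from the file above:

```shell
#!/bin/sh
# Values copied from the zoo.cfg above.
tickTime=2000
initLimit=10
syncLimit=5

# Each limit is a number of ticks, so multiply by the tick length in milliseconds.
echo "initLimit window: $((tickTime * initLimit)) ms"   # prints: initLimit window: 20000 ms
echo "syncLimit window: $((tickTime * syncLimit)) ms"   # prints: syncLimit window: 10000 ms
```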

IV. Kafka installation

https://blog.csdn.net/qq_33792843/article/details/75727921

Notes on Kafka configuration:
http://www.cnblogs.com/yinchengzhe/p/5111635.html


kafka.common.KafkaException: Socket server failed to bind to centos1:9092: Unresolved address.
Solution: https://www.cnblogs.com/yy3b2007com/p/8684974.html

kafka.admin.AdministrationException: replication factor: 1 larger than available brokers: 0
Solution: https://blog.csdn.net/g1219371445/article/details/78828915
In short: start Kafka first, then create the topic.


java.net.UnknownHostException: <hostname>: <hostname>: Name or service not known
Solution: https://blog.csdn.net/huanbia/article/details/69055523
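An UnknownHostException like the one above typically means the machine's own hostname does not resolve. One way to check is to see whether /etc/hosts already maps 127.0.0.1 to that name. The sketch below does this against a scratch file; the function name `hosts_has_name` is made up, and centos1 is the hostname that appears in the errors earlier in this post.

```shell
#!/bin/sh
# Succeed if the hosts file $1 maps 127.0.0.1 to the name $2.
hosts_has_name() {
    awk -v name="$2" '
        # Look only at 127.0.0.1 lines and compare every name listed on them.
        $1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$1"
}

# Demo on a scratch hosts file.
demo_hosts="${TMPDIR:-/tmp}/hosts-demo"
printf '127.0.0.1 localhost centos1\n' > "$demo_hosts"
if hosts_has_name "$demo_hosts" centos1; then
    echo "centos1 resolves via 127.0.0.1"
else
    echo "add centos1 to the 127.0.0.1 line"
fi
```

On a real host the check would be hosts_has_name /etc/hosts "$(hostname)".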

Testing should also follow the documentation:

0. Edit /etc/hosts and add the hostname to the 127.0.0.1 entry

1. bin/kafka-server-start.sh config/server.properties

2. ./kafka-create-topic.sh --partition 1 --replica 1 --zookeeper localhost:2181 --topic test

3. In another terminal: ./kafka-console-consumer.sh --zookeeper localhost:2181 --topic test
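The steps above start a consumer but never send anything to the topic. To see a message arrive, something has to produce one. A minimal sketch using the console producer that ships alongside the scripts above; the function name `send_test_message` and the `KAFKA_BIN` variable are assumptions for this example, and the broker address assumes the default port 9092 seen in the error messages earlier.

```shell
#!/bin/sh
# Send one line to the "test" topic through Kafka's console producer.
# KAFKA_BIN is an assumed variable pointing at Kafka's bin directory (default: current directory).
send_test_message() {
    printf '%s\n' "$1" |
        "${KAFKA_BIN:-.}/kafka-console-producer.sh" --broker-list localhost:9092 --topic test
}
```

With the broker running, send_test_message "hello kafka" from the bin directory should make the line show up in the consumer terminal.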
