[ClickHouse Series] Installing and Using ClickHouse with Docker

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/langhailove_2008/article/details/88249926

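While working through this I consulted other people's blog posts:
[Blog 1]
[Blog 2]
Neither was very detailed, though, and there were plenty of pitfalls I had to hit on my own.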

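1. Installing with Docker (look up other installation methods yourself). If you have never used Docker, read up on it first. (If there is demand I can write a separate doc about Docker.)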

First, find a machine that has Docker installed, for example 10.x.xx.xxx

# Create a working directory first
[root@XXXXXXX clickhouse]# pwd
/home/wangjie/docker/clickhouse


Next, write the dockerfile. If you don't know the dockerfile format, look it up first.

# Write the dockerfile; the file name is either dockerfile or Dockerfile
[root@XXXXXXX clickhouse]# vim dockerfile
FROM ubuntu:18.04

ARG repository="deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
ARG version=19.1.10
ARG gosu_ver=1.10

RUN apt-get update \
    && apt-get install --yes --no-install-recommends \
        apt-transport-https \
        dirmngr \
        gnupg \
    && mkdir -p /etc/apt/sources.list.d \
    && apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4 \
    && echo $repository > /etc/apt/sources.list.d/clickhouse.list \
    && apt-get update \
    && env DEBIAN_FRONTEND=noninteractive \
        apt-get install --allow-unauthenticated --yes --no-install-recommends \
            clickhouse-common-static=$version \
            clickhouse-client=$version \
            clickhouse-server=$version \
            libgcc-7-dev \
            locales \
            tzdata \
            wget \
    && rm -rf \
        /var/lib/apt/lists/* \
        /var/cache/debconf \
        /tmp/* \
    && apt-get clean

ADD https://github.com/tianon/gosu/releases/download/$gosu_ver/gosu-amd64 /bin/gosu

RUN locale-gen en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

RUN mkdir /docker-entrypoint-initdb.d

COPY docker_related_config.xml /etc/clickhouse-server/config.d/
COPY entrypoint.sh /entrypoint.sh

RUN chmod +x \
    /entrypoint.sh \
    /bin/gosu

EXPOSE 9000 8123 9009
VOLUME /var/lib/clickhouse

ENV CLICKHOUSE_CONFIG /etc/clickhouse-server/config.xml

ENTRYPOINT ["/entrypoint.sh"]

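Build the clickhouse image. This can take a while, and longer still if something fails, because you will have to rebuild several times. It is worth it, though: once you have your own image you can experiment as freely as you like.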

[root@XXXXXXX clickhouse]# docker build -t clickhouse-server-demo:1.0 .
Sending build context to Docker daemon  3.072kB
Step 1/18 : FROM ubuntu:18.04
18.04: Pulling from library/ubuntu
6cf436f81810: Pull complete
987088a85b96: Pull complete
b4624b3efe06: Pull complete
d42beb8ded59: Pull complete
....(this usually works as long as the machine can reach the Internet)....
The following additional packages will be installed:
  gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server gpgconf
  gpgsm libasn1-8-heimdal libassuan0 libgssapi3-heimdal libhcrypto4-heimdal
  libheimbase1-heimdal libheimntlm0-heimdal libhx509-5-heimdal
  libkrb5-26-heimdal libksba8 libldap-2.4-2 libldap-common libnpth0
  libreadline7 libroken18-heimdal libsasl2-2 libsasl2-modules-db libsqlite3-0
  libwind0-heimdal pinentry-curses readline-common
Suggested packages:
  dbus-user-session libpam-systemd pinentry-gnome3 tor parcimonie xloadimage
  scdaemon pinentry-doc readline-doc
Recommended packages:
  libsasl2-modules
The following NEW packages will be installed:
  apt-transport-https dirmngr gnupg gnupg-l10n gnupg-utils gpg gpg-agent
  gpg-wks-client gpg-wks-server gpgconf gpgsm libasn1-8-heimdal libassuan0
  libgssapi3-heimdal libhcrypto4-heimdal libheimbase1-heimdal
  libheimntlm0-heimdal libhx509-5-heimdal libkrb5-26-heimdal libksba8
  libldap-2.4-2 libldap-common libnpth0 libreadline7 libroken18-heimdal
  libsasl2-2 libsasl2-modules-db libsqlite3-0 libwind0-heimdal pinentry-curses
  readline-common
0 upgraded, 31 newly installed, 0 to remove and 3 not upgraded.
....(this usually works as long as the machine can reach the Internet)....
The following additional packages will be installed:
  gcc-7-base libasan4 libatomic1 libcilkrts5 libgomp1 libitm1 liblsan0 libmpx2
  libpsl5 libquadmath0 libssl1.1 libtsan0 libubsan0
Recommended packages:
  libcap2-bin libc6-dev publicsuffix ca-certificates
The following NEW packages will be installed:
  clickhouse-client clickhouse-common-static clickhouse-server gcc-7-base
  libasan4 libatomic1 libcilkrts5 libgcc-7-dev libgomp1 libitm1 liblsan0
  libmpx2 libpsl5 libquadmath0 libssl1.1 libtsan0 libubsan0 locales tzdata
  wget
0 upgraded, 20 newly installed, 0 to remove and 3 not upgraded.
....(this usually works as long as the machine can reach the Internet)....
Current default time zone: 'Etc/UTC'
Local time is now:      Wed Mar  6 08:51:38 UTC 2019.
Universal Time is now:  Wed Mar  6 08:51:38 UTC 2019.
....(if an error appears, search for it online)....
Warning: apt-key output should not be parsed (stdout is not a terminal)
Executing: /tmp/apt-key-gpghome.QG893kLkNP/gpg.1.sh --keyserver keyserver.ubuntu.com --recv E0C56BD4
gpg: keyserver receive failed: Server indicated a failure
The command '/bin/sh -c apt-get update     && apt-get install --yes --no-install-recommends         apt-transport-https         dirmngr         gnupg     && mkdir -p /etc/apt/sources.list.d     && apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4     && echo $repository > /etc/apt/sources.list.d/clickhouse.list     && apt-get update     && env DEBIAN_FRONTEND=noninteractive         apt-get install --allow-unauthenticated --yes --no-install-recommends             clickhouse-common-static=$version             clickhouse-client=$version             clickhouse-server=$version             libgcc-7-dev             locales             tzdata             wget     && rm -rf         /var/lib/apt/lists/*         /var/cache/debconf         /tmp/*     && apt-get clean' returned a non-zero code: 2
You have mail in /var/spool/mail/root


# Example 1: the build fails as shown above. How do we fix it?
[root@XXXXXXX clickhouse]# docker images
REPOSITORY                         TAG                 IMAGE ID            CREATED             SIZE
<none>                             <none>              5b07a3f2033d        17 minutes ago      88.1MB
ubuntu                             18.04               47b19964fb50        4 weeks ago         88.1MB
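# The image is only half-built, but this intermediate image 5b07a3f2033d can be used to debug the error above
[root@XXXXXXX clickhouse]# docker run -it 5b07a3f2033d /bin/bash
root@186ew89a3lk8:/#
# From here, re-run the failing command (see the RUN instruction in the dockerfile above) piece by piece to find the cause of the error and fix it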
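The keyserver failure in example 1 (gpg: keyserver receive failed) is often transient. One possible workaround, offered as an assumption rather than something the steps above did, is to retry the key fetch a few times from inside the debugging container. The `retry` helper below is a hypothetical name introduced only for this sketch:

```shell
# retry MAX DELAY CMD...
# Run CMD until it succeeds, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on success, otherwise CMD's last exit status.
# (Hypothetical helper; not part of Docker, apt or ClickHouse.)
retry() {
    retry_max=$1
    retry_delay=$2
    shift 2
    retry_attempt=1
    while true; do
        retry_status=0
        "$@" || retry_status=$?
        if [ "$retry_status" -eq 0 ]; then
            return 0
        fi
        if [ "$retry_attempt" -ge "$retry_max" ]; then
            echo "retry: giving up after $retry_attempt attempts: $*" >&2
            return "$retry_status"
        fi
        echo "retry: attempt $retry_attempt failed (exit $retry_status), retrying in ${retry_delay}s" >&2
        retry_attempt=$((retry_attempt + 1))
        sleep "$retry_delay"
    done
}

# Example, using the exact command that failed in the build log:
# retry 5 3 apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4
```

If the fetch keeps failing even with retries, the problem is more likely the network or the keyserver itself, and retrying will not help.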


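# After fixing the problem, remove the container and image above and run the build again; repeat until it succeeds or fails with a new error
[root@XXXXXXX clickhouse]# docker ps -a    # find the ID of the container you just created
[root@XXXXXXX clickhouse]# docker rm <container-id>
[root@XXXXXXX clickhouse]# docker rmi 5b07a3f2033d    # remove the half-built image
....(if an error appears, search for it online)....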
Step 12/18 : COPY docker_related_config.xml /etc/clickhouse-server/config.d/
COPY failed: stat /var/lib/docker/tmp/docker-builder800178589/docker_related_config.xml: no such file or directory
# Example 2: the build fails as shown above. How do we fix it?
[root@XXXXXXX clickhouse]# docker images
REPOSITORY                         TAG                 IMAGE ID            CREATED             SIZE
<none>                             <none>              300535bbbf56        4 minutes ago       474MB
ubuntu                             18.04               47b19964fb50        4 weeks ago         88.1MB
# This time the half-built image is 474MB and only the last few steps remain. Again, the intermediate image 300535bbbf56 can be used to debug the error above
[root@XXXXXXX clickhouse]# docker run -it 300535bbbf56 /bin/bash
root@271df44d0fa9:/
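# From here, start from the failing instruction (see the dockerfile above) to find the cause of the error and fix it,
# i.e. COPY docker_related_config.xml /etc/clickhouse-server/config.d/
# The root cause is simply that docker_related_config.xml does not exist, so it has to be created locally. But what should it contain? (Pull the official image with docker pull, as described later, start a container from it, and cat the file.)
For example:
# Start any container from the pulled image and enter it
[root@XXXXXXX clickhouse]# docker run -it 76f15457b167 /bin/bash   
# Inside the container, view the file and copy its contents
root@b1d85472110e:/# cat /etc/clickhouse-server/config.d/docker_related_config.xml   
# ---- back on the host ----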
[root@XXXXXXX clickhouse]#  vim  docker_related_config.xml
<yandex>
     <!-- Listen wildcard address to allow accepting connections from other containers and host network. -->
    <listen_host>::</listen_host>
    <listen_host>0.0.0.0</listen_host>
    <listen_try>1</listen_try>

    <!--
    <logger>
        <console>1</console>
    </logger>
    -->
</yandex>
[root@XXXXXXX clickhouse]# ll
total 8
-rw-r--r-- 1 root root 1432 Mar  6 15:10 dockerfile
-rw-r--r-- 1 root root  315 Mar  6 16:37 docker_related_config.xml
# The dockerfile above contains two COPY instructions, and entrypoint.sh is missing as well, so the next build fails like this
Step 13/18 : COPY entrypoint.sh /entrypoint.sh
COPY failed: stat /var/lib/docker/tmp/docker-builder219246264/entrypoint.sh: no such file or directory
# Create it the same way. But what should entrypoint.sh contain?
For example:
[root@XXXXXXX clickhouse]# docker run -it 76f15457b167 /bin/bash    # start any container from the pulled image and enter it
root@b1d85472110e:/# cat /entrypoint.sh    # view the file inside the container and copy its contents
# The failing instruction was: COPY entrypoint.sh /entrypoint.sh
[root@XXXXXXX clickhouse]#  vim entrypoint.sh
#!/bin/bash

# set some vars
CLICKHOUSE_CONFIG="${CLICKHOUSE_CONFIG:-/etc/clickhouse-server/config.xml}"
USER="$(id -u clickhouse)"
GROUP="$(id -g clickhouse)"

# port is needed to check if clickhouse-server is ready for connections
HTTP_PORT="$(clickhouse extract-from-config --config-file $CLICKHOUSE_CONFIG --key=http_port)"

# get CH directories locations
DATA_DIR="$(clickhouse extract-from-config --config-file $CLICKHOUSE_CONFIG --key=path || true)"
TMP_DIR="$(clickhouse extract-from-config --config-file $CLICKHOUSE_CONFIG --key=tmp_path || true)"
USER_PATH="$(clickhouse extract-from-config --config-file $CLICKHOUSE_CONFIG --key=user_files_path || true)"
LOG_PATH="$(clickhouse extract-from-config --config-file $CLICKHOUSE_CONFIG --key=logger.log || true)"
LOG_DIR="$(dirname $LOG_PATH || true)"
ERROR_LOG_PATH="$(clickhouse extract-from-config --config-file $CLICKHOUSE_CONFIG --key=logger.errorlog || true)"
ERROR_LOG_DIR="$(dirname $ERROR_LOG_PATH || true)"
FORMAT_SCHEMA_PATH="$(clickhouse extract-from-config --config-file $CLICKHOUSE_CONFIG --key=format_schema_path || true)"

# ensure directories exist
mkdir -p \
    "$DATA_DIR" \
    "$ERROR_LOG_DIR" \
    "$LOG_DIR" \
    "$TMP_DIR" \
    "$USER_PATH" \
    "$FORMAT_SCHEMA_PATH"

if [ "$CLICKHOUSE_DO_NOT_CHOWN" != "1" ]; then
    # ensure proper directories permissions
    chown -R $USER:$GROUP \
        "$DATA_DIR" \
        "$ERROR_LOG_DIR" \
        "$LOG_DIR" \
        "$TMP_DIR" \
        "$USER_PATH" \
        "$FORMAT_SCHEMA_PATH"
fi

if [ -n "$(ls /docker-entrypoint-initdb.d/)" ]; then
    gosu clickhouse /usr/bin/clickhouse-server --config-file=$CLICKHOUSE_CONFIG &
    pid="$!"

    # check if clickhouse is ready to accept connections
    # will try to send ping clickhouse via http_port (max 12 retries, with 1 sec delay)
    if ! wget --spider --quiet --tries=12 --waitretry=1 --retry-connrefused "http://localhost:$HTTP_PORT/ping" ; then
        echo >&2 'ClickHouse init process failed.'
        exit 1
    fi
    clickhouseclient=( clickhouse-client --multiquery )
    echo
    for f in /docker-entrypoint-initdb.d/*; do
        case "$f" in
            *.sh)
                if [ -x "$f" ]; then
                    echo "$0: running $f"
                    "$f"
                else
                    echo "$0: sourcing $f"
                    . "$f"
                fi
                ;;
            *.sql)    echo "$0: running $f"; cat "$f" | "${clickhouseclient[@]}" ; echo ;;
            *.sql.gz) echo "$0: running $f"; gunzip -c "$f" | "${clickhouseclient[@]}"; echo ;;
            *)        echo "$0: ignoring $f" ;;
        esac
        echo
    done

    if ! kill -s TERM "$pid" || ! wait "$pid"; then
        echo >&2 'Finishing of ClickHouse init process failed.'
        exit 1
    fi
fi

# if no args passed to `docker run` or first argument start with `--`, then the user is passing clickhouse-server arguments
if [[ $# -lt 1 ]] || [[ "$1" == "--"* ]]; then
    exec gosu clickhouse /usr/bin/clickhouse-server --config-file=$CLICKHOUSE_CONFIG "$@"
fi

# Otherwise, we assume the user want to run his own process, for example a `bash` shell to explore this image
exec "$@"
[root@XXXXXXX clickhouse]# ll
total 12
-rw-r--r-- 1 root root 1432 Mar  6 15:10 dockerfile
-rw-r--r-- 1 root root  315 Mar  6 16:37 docker_related_config.xml
-rw-r--r-- 1 root root 3189 Mar  6 17:03 entrypoint.sh
# Then remove the half-built image
[root@XXXXXXX clickhouse]# docker rmi <half-built-image-id>
# Build again
[root@XXXXXXX clickhouse]# docker build -t clickhouse-server-demo:1.0 .
...(repeat the steps above until the build stops failing)...
Successfully built 7e0487ee8225
Successfully tagged clickhouse-server-demo:1.0
You have mail in /var/spool/mail/root
# If you see this, the build succeeded
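Both COPY failures above came from files missing in the build directory, and each one cost a full rebuild to discover. A small pre-flight check could catch them up front. The sketch below is only an illustration; `check_build_context` is a hypothetical helper name, and the file list is the one this article's dockerfile happens to need:

```shell
# check_build_context DIR FILE...
# Print a line for every FILE that is missing from DIR.
# Returns 0 if all files exist, 1 otherwise.
# (Hypothetical helper; not a Docker command.)
check_build_context() {
    ctx_dir=$1
    shift
    ctx_missing=0
    for ctx_file in "$@"; do
        if [ ! -f "$ctx_dir/$ctx_file" ]; then
            echo "missing: $ctx_file"
            ctx_missing=1
        fi
    done
    return "$ctx_missing"
}

# Example for this article's build directory; the build only starts
# when every file referenced by COPY is present:
# check_build_context . dockerfile docker_related_config.xml entrypoint.sh \
#     && docker build -t clickhouse-server-demo:1.0 .
```

This only checks that the files exist; it says nothing about whether their contents are correct.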
Actually, the easier way to get an image is simply to docker pull one. How do you get the image you need? Look here:

# First pull clickhouse-server
[root@XXXXXXX clickhouse]# docker pull yandex/clickhouse-server
...(wait; the pull can fail too, depending on your network)...
# Once it succeeds, the image shows up in the list.
[root@XXXXXXX clickhouse]# docker images |grep click
yandex/clickhouse-server           latest              76f15457b167        2 days ago          475MB
# Then pull clickhouse-client
[root@XXXXXXX clickhouse]# docker pull yandex/clickhouse-client
# Once it succeeds, the image shows up in the list.
[root@XXXXXXX clickhouse]# docker images | grep click
yandex/clickhouse-server           latest              76f15457b167        2 days ago          475MB
yandex/clickhouse-client           latest              52a6b316725a        13 days ago         450MB

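Start an instance, i.e. a clickhouse container

# List the clickhouse images you already have
[root@XXXXXXX clickhouse]# docker images | grep click
clickhouse-server-demo             1.0                 7e0487ee8225        27 minutes ago      475MB    # built from our own dockerfile; edit the dockerfile to change what the image contains
yandex/clickhouse-server           latest              76f15457b167        2 days ago          475MB    # pulled images are the least effort, but their contents are fixed
yandex/clickhouse-client           latest              52a6b316725a        13 days ago         450MB    # pulled images are the least effort, but their contents are fixed
# Pick whichever image you want to start; here we use the pulled one
# First start the clickhouse server
# Note: -p 34424:34424 publishes a port that ClickHouse does not listen on by default (it uses 8123, 9000 and 9009, as the PORTS column below shows); the client in the next step connects through --link instead. To reach the server from the host you would publish, for example, -p 8123:8123 -p 9000:9000.
[root@XXXXXXX clickhouse]# docker run -d --name demo-clickhouse-server -p 34424:34424 yandex/clickhouse-server
38f0f69b59413f213cbc6c325b70b017a2f22c01458b5db390cb23e825c9f6b5
You have mail in /var/spool/mail/root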
[root@XXXXXXX clickhouse]# docker ps -a
CONTAINER ID        IMAGE                                            COMMAND                  CREATED             STATUS                       PORTS                                                    NAMES
38f0f69b5941        yandex/clickhouse-server                         "/entrypoint.sh"         5 seconds ago       Up 4 seconds                 8123/tcp, 9000/tcp, 9009/tcp, 0.0.0.0:34424->34424/tcp   demo-clickhouse-server


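Connect to the clickhouse container instance

# Next, start a clickhouse client and connect it to the server (the command drops you straight into the interactive clickhouse prompt)
[root@XXXXXXX clickhouse]# docker run -it --rm --link demo-clickhouse-server:clickhouse-server_wj yandex/clickhouse-client --host clickhouse-server_wj
A brief note on the command above: --link gives the target server container the alias clickhouse-server_wj (any name will do), and --host then refers to that alias.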
ClickHouse client version 19.1.9.
Connecting to clickhouse-server:9000.
Connected to ClickHouse server version 19.1.10 revision 54413.

38f0f69b5941 :)
38f0f69b5941 :) exit
Bye.


# If you exit the client, how do you get back in?
# Just run the same command again: docker run -it --rm --link demo-clickhouse-server:clickhouse-server_wj yandex/clickhouse-client --host clickhouse-server_wj


Trying it out

# Continuing from the prompt above
38f0f69b5941 :) show databases;

SHOW DATABASES

┌─name────┐
│ default │
│ system  │
└─────────┘

2 rows in set. Elapsed: 0.013 sec.

38f0f69b5941 :) use system

USE system

Ok.

0 rows in set. Elapsed: 0.006 sec.

38f0f69b5941 :) show tables;

SHOW TABLES

┌─name───────────────────────────┐
│ aggregate_function_combinators │
│ asynchronous_metrics           │
│ build_options                  │
│ clusters                       │
│ collations                     │
│ columns                        │
│ contributors                   │
│ data_type_families             │
│ databases                      │
│ dictionaries                   │
│ events                         │
│ formats                        │
│ functions                      │
│ graphite_retentions            │
│ macros                         │
│ merge_tree_settings            │
│ merges                         │
│ metrics                        │
│ models                         │
│ mutations                      │
│ numbers                        │
│ numbers_mt                     │
│ one                            │
│ parts                          │
│ parts_columns                  │
│ processes                      │
│ replicas                       │
│ replication_queue              │
│ settings                       │
│ table_engines                  │
│ table_functions                │
│ tables                         │
└────────────────────────────────┘

32 rows in set. Elapsed: 0.009 sec.

38f0f69b5941 :)

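To sum up: this works, but every setting is a default, and as you can see the client connected without a user name or password.

A production system can't leave clickhouse unprotected like that. So how is the configuration file set up, and how is it mapped from the host into the container? (Section 2, on installing with docker-compose, covers this.)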
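One piece of that configuration is the password itself: users.xml accepts a SHA-256 hex digest in its password_sha256_hex element instead of a plaintext password. As a minimal sketch, assuming a Linux host with coreutils (`sha256sum`, `base64`), the digest can be produced like this. The function names `sha256_hex` and `random_password` are made up for this example:

```shell
# sha256_hex TEXT
# Print the lowercase hex SHA-256 digest of TEXT.
# printf '%s' is used so that no trailing newline gets hashed.
sha256_hex() {
    printf '%s' "$1" | sha256sum | awk '{print $1}'
}

# random_password
# Print 8 characters of base64-encoded random data.
random_password() {
    head -c 32 /dev/urandom | base64 | head -c 8
}

# Example: generate a password, show it once, and print the digest
# to paste into <password_sha256_hex>...</password_sha256_hex>:
# pw=$(random_password); echo "$pw"; sha256_hex "$pw"
```

The digest goes into the config file; the plaintext password is what the client supplies when connecting.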

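2. Installing with docker-compose

After the steps in section 1 you can start learning to use clickhouse, and you have a basic picture of the clickhouse-server and clickhouse-client sides (similar to mysql).
This section covers the slightly more advanced docker-compose installation and also explains the clickhouse configuration files that section 1 skipped.
Let's create a new directory for the compose setup.

# Create a new directory and the related files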
[root@XXXXXXX clickhouse_compose]# cd clickhouse_server/
[root@XXXXXXX clickhouse_server]# touch config.xml
[root@XXXXXXX clickhouse_server]# mkdir data
[root@XXXXXXX clickhouse_server]# mkdir log
[root@XXXXXXX clickhouse_server]# touch users.xml
[root@XXXXXXX clickhouse_server]# touch log/clickhouse-server.log
[root@XXXXXXX clickhouse_server]# touch log/clickhouse-server.err.log
[root@XXXXXXX clickhouse_server]# touch docker-compose.yml
# The final structure looks like this
[root@XXXXXXX clickhouse_server]# tree .
.
├── config.xml
├── data
├── docker-compose.yml
├── log
│   ├── clickhouse-server.err.log
│   └── clickhouse-server.log
└── users.xml
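The layout shown in the tree output can also be created in one step. The following is just a convenience sketch; `make_layout` is a hypothetical function name, and the file names are taken from the tree listing:

```shell
# make_layout BASE
# Create BASE with the empty files and directories from the tree listing:
# config.xml, users.xml, docker-compose.yml, data/ and the two log files.
make_layout() {
    layout_base=$1
    mkdir -p "$layout_base/data" "$layout_base/log"
    touch "$layout_base/config.xml" \
          "$layout_base/users.xml" \
          "$layout_base/docker-compose.yml" \
          "$layout_base/log/clickhouse-server.log" \
          "$layout_base/log/clickhouse-server.err.log"
}

# Example: make_layout clickhouse_server
```

Because it uses mkdir -p and touch, running it again on an existing directory leaves existing file contents untouched.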
# Then fill in the contents of these files
# Start with config.xml
[root@XXXXXXX clickhouse_server]# vim config.xml
<?xml version="1.0"?>
<yandex>
    <logger>
        <level>trace</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>

    <http_port>8123</http_port>   <!-- port for accessing clickhouse over HTTP (by URL) -->
    <tcp_port>9000</tcp_port>     <!-- port for accessing clickhouse over the native TCP protocol -->

    <!-- For HTTPS and SSL over native protocol. -->
    <!--
    <https_port>8443</https_port>
    <tcp_ssl_port>9440</tcp_ssl_port>
    -->

    <!-- Used with https_port and tcp_ssl_port. Full ssl options list: https://github.com/yandex/ClickHouse/blob/master/contrib/libpoco/NetSSL_OpenSSL/include/Poco/Net/SSLManager.h#L71 -->
    <openSSL>
        <server> <!-- Used for https server AND secure tcp port -->
            <!-- openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout /etc/clickhouse-server/server.key -out /etc/clickhouse-server/server.crt -->
            <certificateFile>/etc/clickhouse-server/server.crt</certificateFile>
            <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile>
            <!-- openssl dhparam -out /etc/clickhouse-server/dhparam.pem 4096 -->
            <dhParamsFile>/etc/clickhouse-server/dhparam.pem</dhParamsFile>
            <verificationMode>none</verificationMode>
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
        </server>

        <client> <!-- Used for connecting to https dictionary source -->
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
            <!-- Use for self-signed: <verificationMode>none</verificationMode> -->
            <invalidCertificateHandler>
                <!-- Use for self-signed: <name>AcceptCertificateHandler</name> -->
                <name>RejectCertificateHandler</name>
            </invalidCertificateHandler>
        </client>
    </openSSL>

    <!-- Default root page on http[s] server. For example load UI from https://tabix.io/ when opening http://localhost:8123 -->
    <!--
    <http_server_default_response><![CDATA[<html ng-app="SMI2"><head><base href="http://ui.tabix.io/"></head><body><div ui-view="" class="content-ui"></div><script src="http://loader.tabix.io/master.js"></script></body></html>]]></http_server_default_response>
    -->

    <!-- Port for communication between replicas. Used for data exchange. -->
    <interserver_http_port>9009</interserver_http_port>

    <!-- Hostname that is used by other replicas to request this server.
         If not specified, then it is determined analogously to the 'hostname -f' command.
         This setting could be used to switch replication to another network interface.
      -->
    <!--
    <interserver_http_host>example.yandex.ru</interserver_http_host>
    -->

    <!-- Listen specified host. use :: (wildcard IPv6 address), if you want to accept connections both with IPv4 and IPv6 from everywhere. -->
    <!-- <listen_host>::</listen_host> -->
    <!-- Same for hosts with disabled ipv6: -->
    <!-- <listen_host>0.0.0.0</listen_host> -->

    <!-- Default values - try listen localhost on ipv4 and ipv6: -->
    <!--
    <listen_host>::1</listen_host>
    <listen_host>127.0.0.1</listen_host>
    -->

    <max_connections>4096</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>

    <!-- Maximum number of concurrent queries. -->
    <max_concurrent_queries>100</max_concurrent_queries>

    <!-- Set limit on number of open files (default: maximum). This setting makes sense on Mac OS X because getrlimit() fails to retrieve
         correct maximum value. -->
    <!-- <max_open_files>262144</max_open_files> -->

    <!-- Size of cache of uncompressed blocks of data, used in tables of MergeTree family.
         In bytes. Cache is single for server. Memory is allocated only on demand.
         Cache is used when 'use_uncompressed_cache' user setting turned on (off by default).
         Uncompressed cache is advantageous only for very short queries and in rare cases.
      -->
    <uncompressed_cache_size>8589934592</uncompressed_cache_size>

    <!-- Approximate size of mark cache, used in tables of MergeTree family.
         In bytes. Cache is single for server. Memory is allocated only on demand.
         You should not lower this value.
      -->
    <mark_cache_size>5368709120</mark_cache_size>


    <!-- Path to data directory, with trailing slash. -->
    <path>/var/lib/clickhouse/</path>

    <!-- Path to temporary data for processing hard queries. -->
    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>

    <!-- Path to configuration file with users, access rights, profiles of settings, quotas. -->
    <users_config>users.xml</users_config> 
    <!-- <users>
    <default>
            <password_sha256_hex>8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92</password_sha256_hex>
            <networks incl="networks" replace="replace">
                <ip>127.0.0.1/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <ck>
            <password_sha256_hex>8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92</password_sha256_hex>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </ck>
    </users> -->
    <!-- Default profile of settings.. -->
    <default_profile>default</default_profile>

    <!-- Default database. -->
    <default_database>default</default_database>

    <!-- Server time zone could be set here.

         Time zone is used when converting between String and DateTime types,
          when printing DateTime in text formats and parsing DateTime from text,
          it is used in date and time related functions, if specific time zone was not passed as an argument.

         Time zone is specified as identifier from IANA time zone database, like UTC or Africa/Abidjan.
         If not specified, system time zone at server startup is used.

         Please note, that server could display time zone alias instead of specified name.
         Example: W-SU is an alias for Europe/Moscow and Zulu is an alias for UTC.
    -->
    <!-- <timezone>Europe/Moscow</timezone> -->

    <!-- You can specify umask here (see "man umask"). Server will apply it on startup.
         Number is always parsed as octal. Default umask is 027 (other users cannot read logs, data files, etc; group can only read).
    -->
    <!-- <umask>022</umask> -->

    <!-- Configuration of clusters that could be used in Distributed tables.
         https://clickhouse.yandex/reference_en.html#Distributed
      -->
    <remote_servers incl="clickhouse_remote_servers" >
        <!-- Test only shard config for testing distributed storage -->
        <test_shard_localhost>
            <shard>
                <replica>
                    <host>localhost</host>
                    <port>9000</port>
                </replica>
            </shard>
        </test_shard_localhost>
    </remote_servers>


    <!-- If element has 'incl' attribute, then for its value the corresponding substitution from another file will be used.
         By default, path to file with substitutions is /etc/metrika.xml. It could be changed in config in 'include_from' element.
         Values for substitutions are specified in /yandex/name_of_substitution elements in that file.
      -->

    <!-- ZooKeeper is used to store metadata about replicas, when using Replicated tables.
         Optional. If you don't use replicated tables, you could omit that.

         See https://clickhouse.yandex/reference_en.html#Data%20replication
      -->
    <zookeeper incl="zookeeper-servers" optional="true" />

    <!-- Substitutions for parameters of replicated tables.
          Optional. If you don't use replicated tables, you could omit that.

         See https://clickhouse.yandex/reference_en.html#Creating%20replicated%20tables
      -->
    <macros incl="macros" optional="true" />


    <!-- Reloading interval for embedded dictionaries, in seconds. Default: 3600. -->
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>


    <!-- Maximum session timeout, in seconds. Default: 3600. -->
    <max_session_timeout>3600</max_session_timeout>

    <!-- Default session timeout, in seconds. Default: 60. -->
    <default_session_timeout>60</default_session_timeout>

    <!-- Sending data to Graphite for monitoring. Several sections can be defined. -->
    <!--
        interval - send every X second
        root_path - prefix for keys
        hostname_in_path - append hostname to root_path (default = true)
        metrics - send data from table system.metrics
        events - send data from table system.events
        asynchronous_metrics - send data from table system.asynchronous_metrics
    -->
    <!--
    <graphite>
        <host>localhost</host>
        <port>42000</port>
        <timeout>0.1</timeout>
        <interval>60</interval>
        <root_path>one_min</root_path>
        <hostname_in_path>true</hostname_in_path>

        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </graphite>
    <graphite>
        <host>localhost</host>
        <port>42000</port>
        <timeout>0.1</timeout>
        <interval>1</interval>
        <root_path>one_sec</root_path>

        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>false</asynchronous_metrics>
    </graphite>
    -->


    <!-- Query log. Used only for queries with setting log_queries = 1. -->
    <query_log>
        <!-- What table to insert data. If table is not exist, it will be created.
             When query log structure is changed after system update,
              then old table will be renamed and new table will be created automatically.
        -->
        <database>system</database>
        <table>query_log</table>

        <!-- Interval of flushing data. -->
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>


    <!-- Uncomment if use part_log
    <part_log>
        <database>system</database>
        <table>part_log</table>

        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </part_log>
    -->


    <!-- Parameters for embedded dictionaries, used in Yandex.Metrica.
         See https://clickhouse.yandex/reference_en.html#Internal%20dictionaries
    -->

    <!-- Path to file with region hierarchy. -->
    <!-- <path_to_regions_hierarchy_file>/opt/geo/regions_hierarchy.txt</path_to_regions_hierarchy_file> -->

    <!-- Path to directory with files containing names of regions -->
    <!-- <path_to_regions_names_files>/opt/geo/</path_to_regions_names_files> -->


    <!-- Configuration of external dictionaries. See:
         https://clickhouse.yandex/reference_en.html#External%20Dictionaries
    -->
    <dictionaries_config>*_dictionary.xml</dictionaries_config>

    <!-- Uncomment if you want data to be compressed 30-100% better.
         Don't do that if you just started using ClickHouse.
      -->
    <compression incl="clickhouse_compression">
    <!--
        <!- - Set of variants. Checked in order. Last matching case wins. If nothing matches, lz4 will be used. - ->
        <case>

            <!- - Conditions. All must be satisfied. Some conditions may be omitted. - ->
            <min_part_size>10000000000</min_part_size>        <!- - Min part size in bytes. - ->
            <min_part_size_ratio>0.01</min_part_size_ratio>   <!- - Min size of part relative to whole table size. - ->

            <!- - What compression method to use. - ->
            <method>zstd</method>
        </case>
    -->
    </compression>

    <!-- Allow to execute distributed DDL queries (CREATE, DROP, ALTER, RENAME) on cluster.
         Works only if ZooKeeper is enabled. Comment it if such functionality isn't required. -->
    <distributed_ddl>
        <!-- Path in ZooKeeper to queue with DDL queries -->
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>

    <!-- Settings to fine tune MergeTree tables. See documentation in source code, in MergeTreeSettings.h -->
    <!--
    <merge_tree>
        <max_suspicious_broken_parts>5</max_suspicious_broken_parts>
    </merge_tree>
    -->

    <!-- Protection from accidental DROP.
         If size of a MergeTree table is greater than max_table_size_to_drop (in bytes) then the table cannot be dropped with any DROP query.
         If you want to delete one table and don't want to restart clickhouse-server, you can create the special file <clickhouse-path>/flags/force_drop_table and run DROP once.
         By default max_table_size_to_drop is 50GB, max_table_size_to_drop=0 allows to DROP any tables.
         Uncomment to disable protection.
    -->
    <!-- <max_table_size_to_drop>0</max_table_size_to_drop> -->

    <!-- Example of parameters for GraphiteMergeTree table engine -->
    <graphite_rollup_example>
        <pattern>
            <regexp>click_cost</regexp>
            <function>any</function>
            <retention>
                <age>0</age>
                <precision>3600</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>60</precision>
            </retention>
        </pattern>
        <default>
            <function>max</function>
            <retention>
                <age>0</age>
                <precision>60</precision>
            </retention>
            <retention>
                <age>3600</age>
                <precision>300</precision>
            </retention>
            <retention>
                <age>86400</age>
                <precision>3600</precision>
            </retention>
        </default>
    </graphite_rollup_example>

    <!-- Directory in <clickhouse-path> containing schema files for various input formats.
         The directory will be created if it doesn't exist.
      -->
    <format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>
</yandex>
# Detailed explanations of the settings above will be added later, directly at the end of each line


# Write users.xml
[root@XXXXXXX clickhouse_server]# vim users.xml
<?xml version="1.0"?>
<yandex>
    <!-- Profiles of settings. -->
    <profiles>
        <!-- Default settings. -->
        <default>
            <!-- Maximum memory usage for processing single query, in bytes. -->
            <max_memory_usage>10000000000</max_memory_usage>

            <!-- Use cache of uncompressed blocks of data. Meaningful only for processing many of very short queries. -->
            <use_uncompressed_cache>0</use_uncompressed_cache>

            <!-- How to choose between replicas during distributed query processing.
                 random - choose random replica from set of replicas with minimum number of errors
                 nearest_hostname - from set of replicas with minimum number of errors, choose replica
                  with minimum number of different symbols between replica's hostname and local hostname
                  (Hamming distance).
                 in_order - first live replica is chosen in specified order.
            -->
            <load_balancing>random</load_balancing>
        </default>

        <!-- Profile that allows only read queries. -->
        <readonly>
            <readonly>1</readonly>
        </readonly>
    </profiles>

    <!-- Users and ACL. -->
    <users>
        <!-- If user name was not specified, 'default' user is used. -->
        <default>
            <!-- Password could be specified in plaintext or in SHA256 (in hex format).

                 If you want to specify password in plaintext (not recommended), place it in 'password' element.
                 Example: <password>qwerty</password>.
                 Password could be empty.

                 If you want to specify SHA256, place it in 'password_sha256_hex' element.
                 Example: <password_sha256_hex>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex>

                 How to generate decent password:
                 Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'
                 In first line will be password and in second - corresponding SHA256.
            -->
            <password></password>

            <!-- List of networks with open access.

                 To open access from everywhere, specify:
                    <ip>::/0</ip>

                 To open access only from localhost, specify:
                    <ip>::1</ip>
                    <ip>127.0.0.1</ip>

                 Each element of list has one of the following forms:
                 <ip> IP-address or network mask. Examples: 213.180.204.3 or 10.0.0.1/8 or 2a02:6b8::3 or 2a02:6b8::3/64.
                 <host> Hostname. Example: server01.yandex.ru.
                     To check access, DNS query is performed, and all received addresses compared to peer address.
                 <host_regexp> Regular expression for host names. Example, ^server\d\d-\d\d-\d\.yandex\.ru$
                     To check access, DNS PTR query is performed for peer address and then regexp is applied.
                     Then, for result of PTR query, another DNS query is performed and all received addresses compared to peer address.
                     It is strongly recommended that the regexp ends with $
                 All results of DNS requests are cached till server restart.
            -->
            <networks incl="networks" replace="replace">
                <ip>::1</ip>
                <ip>127.0.0.1</ip>
            </networks>

            <!-- Settings profile for user. -->
            <profile>default</profile>

            <!-- Quota for user. -->
            <quota>default</quota>
        </default>

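        <!-- Example of user with readonly access. Note: two users are defined below, seluser (profile readonly: read-only access) and inuser (profile default: default permissions); their passwords are as follows. -->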
        <seluser>
            <password>8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92</password>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>readonly</profile>
            <quota>default</quota>
        </seluser>
        <inuser>
            <password>8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92</password>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </inuser>
    </users>

    <!-- Quotas. -->
    <quotas>
        <!-- Name of quota. -->
        <default>
            <!-- Limits for time interval. You could specify many intervals with different limits. -->
            <interval>
                <!-- Length of interval. -->
                <duration>3600</duration>

                <!-- No limits. Just calculate resource usage for time interval. -->
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
</yandex>
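
The comments in users.xml mention the `password_sha256_hex` element as an alternative to a plaintext `password`. In the listing above, the 64-character hex string is placed in the plaintext `password` element, so that string is itself the password, which is why the client command later in this post passes the same string to `--password`. If you would rather store only a hash, a minimal sketch for producing the hex digest follows. The helper name `sha256_hex` is hypothetical, and it assumes `sha256sum` from GNU coreutils is available.

```shell
# Hypothetical helper: print the SHA256 hex digest of a password,
# suitable for the password_sha256_hex element in users.xml.
sha256_hex() {
    # printf '%s' avoids the trailing newline that would change the digest.
    printf '%s' "$1" | sha256sum | awk '{print $1}'
}

# Usage:
#   sha256_hex 'my-secret'
```

With `password_sha256_hex` in users.xml, the client would pass the original password to `--password`, not the digest.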


# Write docker-compose.yml
[root@XXXXXXX clickhouse_server]# vim docker-compose.yml
version: '3'
services:
  clickhouse-server:
    image: yandex/clickhouse-server
    container_name: clickhouse-server_wj
    hostname: clickhouse-server_wj
    ports:
      - 8123:8123
    expose:
      - 9000
      - 9009
    volumes:
      - ./config.xml:/etc/clickhouse-server/config.xml
      - ./users.xml:/etc/clickhouse-server/users.xml
      - ./data:/var/lib/clickhouse
      - ./log/clickhouse-server.log:/var/log/clickhouse-server/clickhouse-server.log
      - ./log/clickhouse-server.err.log:/var/log/clickhouse-server/clickhouse-server.err.log
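
One pitfall with the volumes above: when a bind-mount source path does not exist on the host, Docker creates it as a directory, which breaks the two log-file mounts. A minimal sketch for preparing the host side before the first start follows. The helper name `prepare_mounts` is hypothetical, and it assumes config.xml and users.xml already exist in the same directory as written in the earlier steps.

```shell
# Hypothetical helper: create the host paths that docker-compose.yml bind-mounts.
# $1: directory containing docker-compose.yml (defaults to the current directory).
prepare_mounts() {
    base="${1:-.}"
    # Data directory and log directory.
    mkdir -p "$base/data" "$base/log" || return 1
    # The log mounts are file-to-file, so the files must exist beforehand.
    touch "$base/log/clickhouse-server.log" "$base/log/clickhouse-server.err.log"
}

# Usage (run next to docker-compose.yml before "docker-compose up -d"):
#   prepare_mounts .
```
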


# Run docker-compose up -d in the directory that contains the docker-compose.yml file
[root@XXXXXXX clickhouse_server]# docker-compose up -d
Creating network "clickhouse_server_default" with the default driver
Creating clickhouse-server_wj ... done
# Check that the service is up, as follows:
[root@XXXXXXX clickhouse_server]# docker ps
CONTAINER ID        IMAGE                      COMMAND             CREATED             STATUS              PORTS                                                    NAMES
dc9b85c22b6e        yandex/clickhouse-server   "/entrypoint.sh"    28 seconds ago      Up 27 seconds       9000/tcp, 0.0.0.0:8123->8123/tcp, 9009/tcp               clickhouse-server_wj
# Then connect to it with the client
[root@XXXXXXX clickhouse_server]# docker run -it --rm --link clickhouse-server_wj:clickhouse-server --net clickhouse_server_default yandex/clickhouse-client --host clickhouse-server --user seluser --password 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92
ClickHouse client version 19.1.9.
Connecting to clickhouse-server:9000 as user seluser.
Connected to ClickHouse server version 19.1.10 revision 54413.

clickhouse-server_wj :)
clickhouse-server_wj :) show databases;

SHOW DATABASES

┌─name────┐
│ default │
│ system  │
└─────────┘

2 rows in set. Elapsed: 0.017 sec.
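
Since the compose file publishes port 8123, the same query can also be sent over the ClickHouse HTTP interface from the Docker host, without a client container. The following is only a sketch. The helper name `ch_http_query` is hypothetical, and it assumes `curl` is installed and that the user is allowed to connect from your address (seluser and inuser above use `::/0`).

```shell
# Hypothetical helper: send one query to the ClickHouse HTTP interface.
# $1: base URL, $2: user, $3: password, $4: query text.
ch_http_query() {
    # The query is sent as the POST body; credentials use HTTP basic auth.
    curl --silent --show-error --user "$2:$3" --data-binary "$4" "$1"
}

# Usage (replace <password> with the value from users.xml):
#   ch_http_query 'http://localhost:8123/' seluser '<password>' 'SHOW DATABASES'
```
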

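# The other arguments in the command above need no explanation. One thing worth mentioning is where the name in --net clickhouse_server_default comes from: it can be read from the details of the running clickhouse-server_wj server container.
# Focus on the Networks section shown below: the network is clickhouse_server_default
[root@XXXXXXX clickhouse_server]# docker inspect dc9b85c22b6e
... (n lines omitted) ...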
           "Networks": {
                "clickhouse_server_default": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": [
                        "clickhouse-server",
                        "dc9b85c22b6e"
                    ],
                    "NetworkID": "ba7cf08769082f9afbe245f14ab2d7f9d7fc1f011e9d522ed2e75e9f6ebcee3f",
                    "EndpointID": "5abb64ebb7f4fd8641e6e1598e888b44cd26b0b2bba8ef6d28ba62b638de6c56",
                    "Gateway": "172.19.0.1",
                    "IPAddress": "172.19.0.2",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:13:00:02",
                    "DriverOpts": null
                }
            }
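
Instead of scrolling through the full inspect output, the network name can be extracted with a Go template. The following is a sketch under the assumption that the container is running; the helper name `container_networks` is hypothetical.

```shell
# Hypothetical helper: print the names of the networks a container is attached to,
# one per line, using docker inspect's Go-template output.
# $1: container name or ID.
container_networks() {
    docker inspect --format '{{range $name, $conf := .NetworkSettings.Networks}}{{println $name}}{{end}}' "$1"
}

# Usage:
#   container_networks clickhouse-server_wj
```

Compose names the default network `<project>_default`, where the project name defaults to the directory name. That matches the `Creating network "clickhouse_server_default"` line printed by `docker-compose up -d` above.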

At this point you are ready to start playing with ClickHouse; use it the same way as in section 1.

Reposted from blog.csdn.net/langhailove_2008/article/details/88249926