Doris-01: Introduction and Installation of Doris

Introduction to Doris

Overview

Apache Doris was developed by Baidu's big data department (it was previously called Baidu Palo and was renamed Doris after being contributed to the Apache community in 2018). Within Baidu, more than 200 product lines use it, with more than 1,000 machines deployed, and a single business line can reach hundreds of TB of data.

Apache Doris is a modern MPP (Massively Parallel Processing) analytical database product. Query results can be obtained with only sub-second response time, effectively supporting real-time data analysis. The distributed architecture of Apache Doris is very simple, easy to operate and maintain, and can support very large data sets of more than 10PB.

Apache Doris can meet a variety of data analysis needs, such as fixed historical reports, real-time data analysis, interactive data analysis, and exploratory data analysis.


Core advantages

  • Simple and easy to use: deployment requires only two processes and no dependency on other systems; supports online cluster scaling and automatic replica repair; compatible with the MySQL protocol and standard SQL;
  • High performance: built on a columnar storage engine, a modern MPP architecture, a vectorized query engine, pre-aggregated materialized views, and data indexes, it delivers extremely fast performance for low-latency, high-throughput queries;
  • Unified data warehouse: a single system that can simultaneously support real-time data serving, interactive data analysis, and offline data processing scenarios;
  • Federated query: supports federated query analysis over data lakes such as Hive, Iceberg, and Hudi, and over databases such as MySQL and Elasticsearch;
  • Multiple import methods: supports batch pull imports from HDFS/S3 and streaming pull imports from MySQL Binlog/Kafka; supports micro-batch push writes through the HTTP interface and real-time push writes using INSERT over JDBC (see the sketch after this list);
  • Rich ecosystem: Spark can read and write Doris through the Spark Doris Connector; the Flink Doris Connector works with Flink CDC to write data to Doris exactly once; the DBT Doris Adapter makes it easy to run data transformations inside Doris.
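
A minimal sketch of the INSERT-over-JDBC write path mentioned above, assuming a hypothetical table example_db.user_events already exists; any MySQL-protocol client connected to the FE query port (9030 by default) can run it:

INSERT INTO example_db.user_events (event_time, user_id, event_type)
VALUES ('2023-01-01 10:00:00', 10001, 'click');

-- batching several rows per statement amortizes per-write overhead;
-- high-throughput ingestion should still use the HTTP-based load paths
INSERT INTO example_db.user_events (event_time, user_id, event_type)
VALUES ('2023-01-01 10:00:01', 10002, 'view'),
       ('2023-01-01 10:00:02', 10003, 'click');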

Usage scenarios

After various stages of data integration and processing, data typically lands in the real-time data warehouse Doris or in an offline lakehouse (Hive, Iceberg, Hudi). Apache Doris is widely used in the following scenarios.

  • Report analysis
    • Real-time dashboards
    • Reports for internal analysts and managers
    • High-concurrency, user- or customer-facing report analysis (customer-facing analytics). For example, site analytics for website owners and advertising reports for advertisers usually require thousands of QPS and millisecond-level query latency. JD.com, a well-known e-commerce company, uses Apache Doris for advertising reports, writing 10 billion rows of data per day and serving tens of thousands of QPS of concurrent queries, with a 99th-percentile query latency of 150 ms.
  • Ad-hoc query: self-service analysis for analysts, where query patterns are not fixed and high throughput is required. Xiaomi has built a growth analytics platform (Growing Analytics, GA) on Doris that uses user behavior data for business growth analysis: average query latency is 10 s, the 95th-percentile query latency is within 30 s, and the platform serves tens of thousands of SQL queries per day.
  • Unified data warehouse construction: one platform satisfies unified data warehouse needs and simplifies a cumbersome big data software stack. Haidilao's Doris-based unified data warehouse replaced an old architecture composed of Spark, Hive, Kudu, HBase, and Phoenix, greatly simplifying the architecture.
  • Data lake federated query: federate analysis over data in Hive, Iceberg, and Hudi through external tables, greatly improving query performance while avoiding data copying (see the sketch after this list).
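
As an illustration of the external-table approach, here is a minimal sketch of mapping a Hive table into Doris (using the syntax of the 0.x-era version this article covers); the Hive database/table names and the Metastore address are placeholders, and the column list must match the Hive table's schema:

CREATE EXTERNAL TABLE example_db.hive_orders (
    order_id BIGINT,
    amount DECIMAL(10, 2)
) ENGINE = HIVE
PROPERTIES (
    "database" = "hive_db",
    "table" = "orders",
    "hive.metastore.uris" = "thrift://hive-metastore-host:9083"
);

-- once mapped, the Hive data can be queried (and joined with Doris tables) in place
SELECT order_id, amount FROM example_db.hive_orders WHERE amount > 100;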

Architecture

The architecture of Doris is very simple. It has only two roles and two processes: FE (Frontend) and BE (Backend). It does not rely on external components, which makes it easy to deploy and operate, and both FE and BE can be scaled out linearly.

  • FE (Frontend): stores and maintains cluster metadata; responsible for receiving and parsing query requests, planning queries, scheduling query execution, and returning query results. FE has three main roles:

    Leader and Follower: Mainly used to achieve high availability of metadata, ensuring that in the event of a single node failure, metadata can be restored online in real time without affecting the entire service.

    Observer: used to scale out query capacity and also serves as a metadata replica. If cluster load is very high and you need to scale overall query capacity, you can add Observer nodes. Observers do not participate in any writes, only reads.

  • BE (Backend): responsible for physical data storage and computation; executes queries in a distributed manner based on the physical plan generated by FE.

    Data reliability is guaranteed by BE: BE stores multiple replicas of the data (three by default), and the replica count can be adjusted dynamically on demand, as the sketch below shows.
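
    For illustration, a minimal sketch of adjusting replicas, assuming a hypothetical table example_db.site_visits created with PROPERTIES ("replication_num" = "3"):

    -- reduce the replica count of a single-partition table from 3 to 2;
    -- Doris adjusts the replicas in the background
    ALTER TABLE example_db.site_visits SET ("replication_num" = "2");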

Both types of processes scale horizontally, and a single cluster can support hundreds of machines and tens of petabytes of storage capacity. The two processes use consistency protocols to guarantee high service availability and high data reliability. This highly integrated architecture greatly reduces the operational cost of a distributed system.

  • MySQL Client: Doris speaks the MySQL protocol, so users can connect to Doris directly with any MySQL ODBC/JDBC driver or MySQL client.

  • Broker: an independent, stateless process that encapsulates a file system interface and lets Doris read files from remote storage systems, including HDFS, S3, BOS, etc.

Technical overview

  • In terms of interface, Doris adopts the MySQL protocol, is highly compatible with MySQL syntax, and supports standard SQL. Users can access Doris through various client tools, and it integrates seamlessly with BI tools.

  • In terms of storage engine , Doris uses columnar storage to encode, compress and read data in columns, which can achieve extremely high compression ratios while reducing the scanning of a large amount of irrelevant data, thereby making more effective use of IO and CPU resources.

  • Doris also supports rich index structures to reduce data scanning:

    • Sorted Compound Key Index: up to three columns can be specified to form a composite sort key. This index effectively prunes data and better supports high-concurrency reporting scenarios.
    • Z-order Index: efficiently supports range queries on any combination of fields in the data model.
    • Min/Max Index: efficiently filters equality and range queries on numeric types.
    • Bloom Filter: very effective for equality filtering and pruning on high-cardinality columns.
    • Inverted Index: enables fast retrieval of any field.
  • In terms of storage models, Doris supports multiple storage models optimized for different scenarios (see the sketch at the end of this overview):

    • Aggregate Key model: value columns with the same key are merged, greatly improving performance through pre-aggregation.
    • Unique Key model: keys are unique; a row with the same key overwrites the existing one, enabling row-level data updates.
    • Duplicate Key model: a detail model that stores fact-table detail data as-is.
  • In terms of the query engine, Doris adopts the MPP model, with parallel execution both between nodes and within a node. It also supports distributed shuffle joins of multiple large tables, so it can better handle complex queries.

    The Doris query engine is vectorized: all memory structures are laid out in columnar format, which significantly reduces virtual function calls, improves cache hit rates, and makes efficient use of SIMD instructions. In wide-table aggregation scenarios, performance is 5-10x that of a non-vectorized engine.

    Doris adopts Adaptive Query Execution technology, which dynamically adjusts the execution plan based on runtime statistics. For example, Runtime Filter technology generates filters at runtime and pushes them to the probe side, automatically pushing each filter down to the lowest scan node on the probe side, which significantly reduces the amount of data scanned on the probe side and accelerates join performance. Doris' Runtime Filter supports In/Min/Max/Bloom Filter.

  • In terms of the optimizer, Doris uses an optimization strategy that combines CBO and RBO. RBO supports constant folding, subquery rewriting, predicate pushdown, etc.; CBO supports join reordering. The CBO is still under continuous improvement, mainly around more accurate collection and derivation of statistics and more accurate cost model estimation.
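
To make the storage models and indexes above concrete, here is a minimal sketch with one table per model; all database, table, and column names are hypothetical, and the bloom_filter_columns property illustrates the Bloom Filter index described earlier:

-- Aggregate Key model: value columns with the same key are pre-aggregated (SUM here)
CREATE TABLE example_db.page_visits (
    event_day DATE,
    site_id INT,
    pv BIGINT SUM DEFAULT "0"
)
AGGREGATE KEY (event_day, site_id)
DISTRIBUTED BY HASH (site_id) BUCKETS 10
PROPERTIES (
    "replication_num" = "3",
    "bloom_filter_columns" = "site_id"
);

-- Unique Key model: a later row with the same key overwrites the earlier one (row-level update)
CREATE TABLE example_db.user_profiles (
    user_id BIGINT,
    city VARCHAR(32),
    last_visit DATETIME
)
UNIQUE KEY (user_id)
DISTRIBUTED BY HASH (user_id) BUCKETS 10;

-- Duplicate Key model: every row is kept; the key only defines the sort order
CREATE TABLE example_db.access_log (
    log_time DATETIME,
    user_id BIGINT,
    url VARCHAR(256)
)
DUPLICATE KEY (log_time, user_id)
DISTRIBUTED BY HASH (user_id) BUCKETS 10;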

Compile and install

To install Doris, you must first compile it from source. There are two main ways: compiling with the Docker development image (recommended) and compiling directly.

For direct compilation, please refer to the official website: https://doris.apache.org/zh-CN/installing/compilation.html

Compile using Docker development image

(1) Download the source code and unzip it

wget https://dist.apache.org/repos/dist/dev/incubator/doris/0.15/0.15.0-rc04/apache-doris-0.15.0-incubating-src.tar.gz

Unzip to /opt/software/:

tar -zxvf apache-doris-0.15.0-incubating-src.tar.gz -C /opt/software

(2) Download the Docker image

docker pull apache/incubator-doris:build-env-for-0.15.0

You can use the following command to check whether the image download is complete.

docker images

(3) Mount the local directory to run the image

Run the image with the local Doris source directory mounted, so that the compiled binary output is stored on the host and does not disappear when the container exits. At the same time, mount the image's Maven .m2 directory to a host directory to avoid re-downloading Maven dependencies every time a compilation container is started.

docker run -it \
-v /opt/software/.m2:/root/.m2 \
-v /opt/software/apache-doris-0.15.0-incubating-src/:/root/apache-doris-0.15.0-incubating-src/ \
apache/incubator-doris:build-env-for-0.15.0

(4) Switch to JDK 8

alternatives --set java java-1.8.0-openjdk.x86_64
alternatives --set javac java-1.8.0-openjdk.x86_64
export JAVA_HOME=/usr/lib/jvm/java-1.8.0

(5) Prepare Maven dependencies

The compilation process downloads many dependencies. You can extract the pre-prepared doris-repo.tar.gz into the corresponding directory mounted into Docker to skip the dependency downloads and speed up compilation.

tar -zxvf doris-repo.tar.gz -C /opt/software

You can also speed up downloads by specifying the Alibaba Cloud mirror repository:

vim /opt/software/apache-doris-0.15.0-incubating-src/fe/pom.xml
# add under the <repositories> tag:
<repository>
    <id>aliyun</id>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
</repository>

vim /opt/software/apache-doris-0.15.0-incubating-src/be/pom.xml
# add under the <repositories> tag:
<repository>
    <id>aliyun</id>
    <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
</repository>

(6) Compile Doris

sh build.sh

If this is your first time using the build-env-for-0.15.0 image or a later version, use the following command for the first compilation:

sh build.sh --clean --be --fe --ui

Because thrift was upgraded (0.9 -> 0.13) in the build-env-for-0.15.0 image, you need the --clean option to force regeneration of the code files with the new thrift version; otherwise incompatible code will be produced.

Installation requirements

Software and hardware requirements

(1) Linux operating system requirements

Linux system    Version
CentOS          7.1 and above
Ubuntu          16.04 and above

(2) Software requirements

Software    Version
Java        1.8 and above
GCC         4.8.2 and above

(3) Development and testing environment

Module      CPU        Memory   Disk                   Network            Number of instances
Frontend    8 cores+   8 GB+    SSD or SATA, 10 GB+    Gigabit Ethernet   1
Backend     8 cores+   16 GB+   SSD or SATA, 50 GB+    Gigabit Ethernet   1-3

(4) Production environment

Module      CPU         Memory   Disk                        Network         Number of instances (minimum)
Frontend    16 cores+   64 GB+   SSD or RAID card, 100 GB+   10GbE NIC       1-5
Backend     16 cores+   64 GB+   SSD or SATA, 100 GB+        10GbE NIC       10-100

Note 1:

  1. FE disk space is mainly used to store metadata, including logs and images, and typically ranges from a few hundred MB to several GB.
  2. BE disk space is mainly used to store user data. Total disk space should be estimated as total user data × 3 (for 3 replicas), plus an additional 40% reserved for background compaction and intermediate data. For example, 1 TB of user data needs roughly 1 TB × 3 × 1.4 ≈ 4.2 TB of raw BE disk space.
  3. Multiple BE instances can be deployed on one machine, but only one FE. If you need 3 replicas of the data, you need at least 3 machines, each running one BE instance (not one machine running 3 BE instances). It is recommended to separate FE and BE in production. The clocks of the servers hosting multiple FEs must be consistent (a clock deviation of up to 5 seconds is allowed).
  4. A test environment can run with a single BE. In a real production environment, the number of BE instances directly determines overall query latency.
  5. Turn off swap on all deployment nodes.

Note 2: Number of FE nodes

  1. FE roles are divided into Follower and Observer (the Leader is a role elected from within the Follower group; below, "Follower" includes the Leader).
  2. There must be at least 1 FE node (1 Follower). Deploying 1 Follower and 1 Observer achieves read high availability; deploying 3 Followers achieves read-write high availability (HA).
  3. The number of Followers must be odd; the number of Observers is arbitrary.
  4. Based on past experience, when cluster availability requirements are high (such as serving online traffic), deploy 3 Followers and 1-3 Observers; for offline business, 1 Follower and 1-3 Observers are recommended.
  • We usually recommend around 10-100 machines to get the full performance of Doris (3 of them hosting FE for HA, and the rest hosting BE).
  • Of course, Doris performance is directly related to the number and configuration of nodes; it can still run smoothly on as few as 4 low-spec machines (one FE, three BEs, one of which is co-located with an Observer FE to provide a metadata backup).
  • If FE and BE are co-located, watch out for resource contention, and make sure the metadata directory and data directories are on different disks.

Port requirements

Instance   Port name                Default port   Communication direction         Description
BE         be_port                  9060           FE --> BE                       thrift server port on BE; receives requests from FE
BE         webserver_port           8040           BE <--> BE                      http server port on BE
BE         heartbeat_service_port   9050           FE --> BE                       heartbeat service (thrift) port on BE; receives heartbeats from FE
BE         brpc_port                8060           FE <--> BE, BE <--> BE          brpc port on BE; used for communication between BEs
FE         http_port                8030           FE <--> FE, user <--> FE        http server port on FE
FE         rpc_port                 9020           BE --> FE, FE <--> FE           thrift server port on FE; must be configured identically on every FE
FE         query_port               9030           user <--> FE                    MySQL server port on FE
FE         edit_log_port            9010           FE <--> FE                      bdbje communication port between FEs
Broker     broker_ipc_port          8000           FE --> Broker, BE --> Broker    thrift server port on Broker; receives requests

Note:

  1. When deploying multiple FE instances, make sure the http_port of every FE is the same.
  2. Before deployment, make sure each port is reachable in the required direction.

Cluster deployment

          Host 1        Host 2          Host 3
FE        Leader        Follower        Observer
BE        BE            BE              BE
Broker    Broker        Broker          Broker

It is recommended that FE and BE be separated in the production environment.

(1) Create a directory and copy the compiled files

  • Create directory and copy compiled files

    mkdir /opt/module/apache-doris-0.15.0
    cp -r /opt/software/apache-doris-0.15.0-incubating-src/output \
    /opt/module/apache-doris-0.15.0
    
  • Modify the number of open files (per node)

    sudo vim /etc/security/limits.conf
    * soft nofile 65535
    * hard nofile 65535
    * soft nproc 65535
    * hard nproc 65535
    

    The change takes effect permanently after a reboot; alternatively, ulimit -n 65535 applies it temporarily for the current session.

(2) Deploy FE nodes

  • Create the directory where fe metadata is stored

    mkdir /opt/module/apache-doris-0.15.0/doris-meta
    
  • Modify fe configuration file

    vim /opt/module/apache-doris-0.15.0/fe/conf/fe.conf
    # specify the metadata path:
    meta_dir = /opt/module/apache-doris-0.15.0/doris-meta
    # bind the IP (change to each machine's own IP)
    priority_networks = 192.168.8.101/24
    

    Notice:

    • In production, it is strongly recommended to put the metadata directory somewhere separate, not inside the Doris installation directory, ideally on a dedicated disk (an SSD if available).
    • If the machine has multiple IPs (e.g., internal and external networks, virtual machines, Docker), you must bind an IP so the node identifies itself correctly.
    • JAVA_OPTS sets the default Java maximum heap to 4 GB; in production, it is recommended to raise it to 8 GB or more.
  • Start the FE on hadoop1

    /opt/module/apache-doris-0.15.0/fe/bin/start_fe.sh --daemon
    

(3) Configure BE node

  • Distribute BE

    scp -r /opt/module/apache-doris-0.15.0/be hadoop2:/opt/module
    scp -r /opt/module/apache-doris-0.15.0/be hadoop3:/opt/module
    
  • Create BE data storage directory (each node)

    mkdir /opt/module/apache-doris-0.15.0/doris-storage1
    mkdir /opt/module/apache-doris-0.15.0/doris-storage2
    
  • Modify BE's configuration file (each node)

    vim /opt/module/apache-doris-0.15.0/be/conf/be.conf
    # specify the data storage paths:
    storage_root_path = /opt/module/apache-doris-0.15.0/doris-storage1;/opt/module/apache-doris-0.15.0/doris-storage2
    # bind the IP (change to each machine's own IP)
    priority_networks = 192.168.8.101/24
    

Notice:

  • storage_root_path defaults to be/storage and must be created manually. Separate multiple paths with semicolons (do not add one after the last path).

  • The storage medium of each directory (HDD or SSD) can be indicated by a suffix on the path, and a capacity limit in GB can be appended after a comma, for example:

    storage_root_path = /home/disk1/doris.HDD,50;/home/disk2/doris.SSD,10;/home/disk2/doris

    where:

    /home/disk1/doris.HDD,50 means a storage limit of 50 GB, on HDD;

    /home/disk2/doris.SSD,10 means a storage limit of 10 GB, on SSD;

    /home/disk2/doris means the storage limit is the remaining disk capacity, with the medium defaulting to HDD.

  • If the machine has multiple IPs (e.g., internal and external networks, virtual machines, Docker), you must bind an IP so the node identifies itself correctly.

(4) Add all BE nodes in FE

BE nodes need to be added in FE before they can join the cluster. You can use mysql-client to connect to FE.

  • Install MySQL Client:

    • Create a directory:

      mkdir /opt/software/mysql-client/
      
    • Upload the following three rpm packages to /opt/software/mysql-client/:

      mysql-community-client-5.7.28-1.el7.x86_64.rpm

      mysql-community-common-5.7.28-1.el7.x86_64.rpm

      mysql-community-libs-5.7.28-1.el7.x86_64.rpm

    • Check whether the current system has installed MySQL

      sudo rpm -qa|grep mariadb
      # if it exists, uninstall it first
      sudo rpm -e --nodeps mariadb mariadb-libs mariadb-server
      
    • Install

      rpm -ivh /opt/software/mysql-client/*
      
  • Use MySQL Client to connect to FE:

    mysql -h hadoop1 -P 9030 -uroot
    

    By default, root has no password. Use the following command to set the root password.

    SET PASSWORD FOR 'root' = PASSWORD('000000');
    
  • Add BE

    ALTER SYSTEM ADD BACKEND "hadoop1:9050";
    ALTER SYSTEM ADD BACKEND "hadoop2:9050";
    ALTER SYSTEM ADD BACKEND "hadoop3:9050";
    
  • View BE status

    SHOW PROC '/backends';
    

(5) Start BE

  • Start BE (per node)

    /opt/module/apache-doris-0.15.0/be/bin/start_be.sh --daemon
    
  • View BE status

    mysql -h hadoop1 -P 9030 -uroot -p
    SHOW PROC '/backends';
    

    If Alive is true, the BE node is alive.

(6) Deploy FS_Broker (optional)

Broker is deployed independently of Doris as a plug-in process. If you need to import data from a third-party storage system, you need to deploy the corresponding Broker. By default, an fs_broker is provided for reading HDFS, Baidu Cloud BOS, and Amazon S3. fs_broker is stateless, and it is recommended to deploy one Broker on every FE and BE node.

  • Compile FS_BROKER and copy files

    Enter the fs_brokers directory under the source directory and compile with sh build.sh.

    Copy the generated Broker directory from the fs_broker output directory to every node that needs it, and rename it apache_hdfs_broker. It is recommended to keep it at the same level as the BE or FE directory.

  • Start Broker

    /opt/module/apache-doris-0.15.0/apache_hdfs_broker/bin/start_broker.sh --daemon
    
  • Add Broker: to let Doris' FE and BE know which nodes the Broker runs on, add the Broker node list with a SQL command.

    Use mysql-client to connect to the started FE and execute the following command:

    mysql -h hadoop1 -P 9030 -uroot -p
    ALTER SYSTEM ADD BROKER broker_name "hadoop1:8000","hadoop2:8000","hadoop3:8000";
    

    Here broker_name is a name you choose for this group of Brokers; the host part is the IP of the node where each Broker runs, and broker_ipc_port is set in the Broker configuration file conf/apache_hdfs_broker.conf.

  • Check Broker status

    Use mysql-client to connect to any started FE and execute the following command to view the Broker status:

    SHOW PROC "/brokers";
    

    Note: in a production environment, all instances should be started by a daemon supervisor such as Supervisor, to ensure processes are automatically restarted after exiting. In versions 0.9.0 and earlier, if you want to use a daemon supervisor, you need to modify each start_xx.sh script to remove the trailing & symbol. Starting from version 0.10.0, just call sh start_xx.sh to start.

Scaling up and down

Doris can easily expand and shrink FE, BE, and Broker instances.

FE expansion and contraction

FE high availability can be achieved by expanding FE to 3 or more nodes.

(1) After logging in with a MySQL client, you can check the FE status with a SQL command. Currently, there is only one FE:

mysql -h hadoop1 -P 9030 -uroot -p
SHOW PROC '/frontends';

You can also monitor through the web page on port 8030; the account is root, and the password is blank by default (leave it empty).

(2) Add FE node

FE is divided into three roles: Leader, Follower, and Observer. A cluster has at most one Leader and can have multiple Followers and Observers. The Leader and Followers form a Paxos election group; if the Leader goes down, the remaining Followers automatically elect a new Leader, guaranteeing high write availability. Observers synchronize the Leader's data but do not participate in elections.

If only one FE is deployed, the FE will be the Leader by default. On this basis, several Followers and Observers can be added.

ALTER SYSTEM ADD FOLLOWER "hadoop2:9010";
ALTER SYSTEM ADD OBSERVER "hadoop3:9010";

(3) Configure and start Follower and Observer

When starting a new Follower or Observer for the first time, the startup command needs the parameter --helper leader_host:edit_log_port:

  • Distribute FE and modify the configuration of FE:

    scp -r /opt/module/apache-doris-0.15.0/fe hadoop2:/opt/module/apache-doris-0.15.0
    scp -r /opt/module/apache-doris-0.15.0/fe hadoop3:/opt/module/apache-doris-0.15.0
    
  • Start the Follower on hadoop2:

    /opt/module/apache-doris-0.15.0/fe/bin/start_fe.sh --helper hadoop1:9010 --daemon
    
  • Start the Observer on hadoop3:

    /opt/module/apache-doris-0.15.0/fe/bin/start_fe.sh --helper hadoop1:9010 --daemon
    

(4) Check the running status

Use mysql-client to connect to any started FE.

SHOW PROC '/frontends';

(5) Delete FE node command

ALTER SYSTEM DROP FOLLOWER[OBSERVER] "fe_host:edit_log_port";

Note: when deleting a Follower FE, make sure that the number of remaining Followers (including the Leader) is odd.

BE expansion and contraction

(1) Add BE node

On the MySQL client, add BE nodes through the ALTER SYSTEM ADD BACKEND command.

(2) DROP method to delete BE nodes (not recommended)

ALTER SYSTEM DROP BACKEND "be_host:be_heartbeat_service_port";

Note: DROP BACKEND deletes the BE directly, and the data on it cannot be recovered! Therefore, we strongly recommend against using DROP BACKEND to delete BE nodes. When you execute this statement, there are corresponding prompts to prevent accidental operation.

(3) DECOMMISSION method to delete BE nodes (recommended)

ALTER SYSTEM DECOMMISSION BACKEND "be_host:be_heartbeat_service_port";
  • This command safely deletes a BE node. After it is issued, Doris tries to migrate the data on that BE to other BE nodes; when all data has been migrated, Doris automatically deletes the node.
  • The command is an asynchronous operation. After execution, SHOW PROC '/backends'; shows the BE node with isDecommission set to true, indicating the node is going offline.
  • The command may not complete. For example, if the remaining BEs do not have enough storage space for the data on the offline BE, or the number of remaining machines does not satisfy the minimum replica count, the command cannot finish, and the BE stays in the isDecommission=true state.
  • The progress of DECOMMISSION can be tracked through the TabletNum column of SHOW PROC '/backends'; while migration is in progress, TabletNum keeps decreasing.
  • The operation can be cancelled with CANCEL DECOMMISSION BACKEND "be_host:be_heartbeat_service_port"; after cancellation, the data remaining on the BE stays where it is, and Doris re-balances the load afterwards. See the sketch after this list.
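
Putting the safe-offline workflow above together as a single SQL sequence (host and port are placeholders matching the earlier examples):

-- start the safe offline; Doris begins migrating this BE's data to other BEs
ALTER SYSTEM DECOMMISSION BACKEND "hadoop3:9050";

-- monitor: isDecommission stays true and TabletNum decreases while migration runs
SHOW PROC '/backends';

-- optional: abort the offline; remaining data stays put and Doris re-balances later
CANCEL DECOMMISSION BACKEND "hadoop3:9050";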

Broker expansion and contraction

There is no hard requirement on the number of Broker instances; usually one per physical machine is enough. Brokers can be added and removed with the following commands:

ALTER SYSTEM ADD BROKER broker_name "broker_host:broker_ipc_port"; 
ALTER SYSTEM DROP BROKER broker_name "broker_host:broker_ipc_port"; 
ALTER SYSTEM DROP ALL BROKER broker_name;

Broker is a stateless process and can be started and stopped at will. Of course, jobs running on it when it stops will fail; just retry them.
