
---
title: "CentOS7 安装 ClickHouse 集群"
date: 2020-09-23T10:18:00+08:00
lastmod: 2020-10-10T01:40:00+08:00
tags: []
categories: ["clickhouse"]
---
# Environment
## Zookeeper servers
eth0 IP | eth1 IP | OS | ZK version | myid
---- | ---- | ---- | ---- | ----
10.0.4.101 | 10.1.4.101 | CentOS7.8 | 3.4.14 | 101
10.0.4.102 | 10.1.4.102 | CentOS7.8 | 3.4.14 | 102
10.0.4.103 | 10.1.4.103 | CentOS7.8 | 3.4.14 | 103
- The eth0 NIC serves client traffic; the eth1 NIC carries Zookeeper intra-cluster communication
- Configure time synchronization, and disable selinux and firewalld
## ClickHouse servers
eth0 IP | eth1 IP | OS | CH version | shard value | replica value
---- | ---- | ---- | ---- | ---- | ----
10.0.4.181 | 10.1.4.181 | CentOS7.8 | 20.3 LTS | 1 | 10.1.4.181
10.0.4.182 | 10.1.4.182 | CentOS7.8 | 20.3 LTS | 1 | 10.1.4.182
10.0.4.183 | 10.1.4.183 | CentOS7.8 | 20.3 LTS | 2 | 10.1.4.183
10.0.4.184 | 10.1.4.184 | CentOS7.8 | 20.3 LTS | 2 | 10.1.4.184
10.0.4.185 | 10.1.4.185 | CentOS7.8 | 20.3 LTS | 3 | 10.1.4.185
10.0.4.186 | 10.1.4.186 | CentOS7.8 | 20.3 LTS | 3 | 10.1.4.186
- The eth0 NIC serves client traffic; the eth1 NIC carries ClickHouse intra-cluster communication
- Configure time synchronization, and disable selinux and firewalld
# Install the Zookeeper Cluster
- A ClickHouse cluster depends on Zookeeper to manage its cluster configuration
- For installation steps, see: [Installing a Zookeeper Cluster on CentOS7](/post/zk-install/)
- Start the Zookeeper cluster; it must be running normally before you continue (a quick health check is sketched below)
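Before moving on, it is worth confirming that every Zookeeper node answers requests. A minimal check, assuming `nc` (nmap-ncat) is installed and the `ruok`/`stat` four-letter words are enabled (the default in Zookeeper 3.4):
```bash
# A healthy Zookeeper node answers "imok" to the ruok probe
for ip in 10.0.4.101 10.0.4.102 10.0.4.103; do
    echo -n "$ip: "
    echo ruok | nc "$ip" 2181
    echo
done
# "stat" additionally reports whether a node is the leader or a follower
echo stat | nc 10.0.4.101 2181 | grep Mode
```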
# Install the ClickHouse Cluster
## Configure the ClickHouse yum repository
- Run the following on every ClickHouse server
- Create the clickhouse.repo file
```bash
cat > /etc/yum.repos.d/clickhouse.repo <<'EOF'
[clickhouse-lts]
name=ClickHouse - LTS Repository
baseurl=https://mirrors.tuna.tsinghua.edu.cn/clickhouse/rpm/lts/$basearch/
gpgkey=https://mirrors.tuna.tsinghua.edu.cn/clickhouse/CLICKHOUSE-KEY.GPG
gpgcheck=1
enabled=1
EOF
```
- Rebuild the yum cache, then verify the new repo as sketched below
```bash
yum clean all
yum makecache fast
```
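To confirm the repository is usable before installing anything, you can list what it serves; this is plain yum usage with no assumptions beyond the repo id defined above:
```bash
# List ClickHouse packages offered by the new repo only
yum --disablerepo='*' --enablerepo=clickhouse-lts list available | grep clickhouse
```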
## Install ClickHouse
- Run the following on every ClickHouse server
- Install clickhouse-server and clickhouse-client (a quick version check follows)
```bash
yum install clickhouse-server clickhouse-client
```
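A quick sanity check that both packages landed and which LTS build was pulled in:
```bash
# Confirm the installed packages and the server version
rpm -qa | grep clickhouse
clickhouse-server --version
```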
## Modify the ClickHouse Configuration
- Run the following on every ClickHouse server
- I did not use /etc/metrika.xml or the config.d subdirectory; I modified config.xml directly. Back it up first
```bash
cd /etc/clickhouse-server/
cp config.xml config.xml.origin
```
- Edit /etc/clickhouse-server/config.xml; the modified parts are shown below
```xml
<!-- eth1 IP used for intra-cluster communication; each node keeps only its own line -->
<interserver_http_host>10.1.4.181</interserver_http_host> <!-- on 10.0.4.181 -->
<interserver_http_host>10.1.4.182</interserver_http_host> <!-- on 10.0.4.182 -->
<interserver_http_host>10.1.4.183</interserver_http_host> <!-- on 10.0.4.183 -->
<interserver_http_host>10.1.4.184</interserver_http_host> <!-- on 10.0.4.184 -->
<interserver_http_host>10.1.4.185</interserver_http_host> <!-- on 10.0.4.185 -->
<interserver_http_host>10.1.4.186</interserver_http_host> <!-- on 10.0.4.186 -->
<!-- Listen on all IPv4 addresses -->
<listen_host>0.0.0.0</listen_host>
<!-- Default data directory; mounting a dedicated disk here is recommended -->
<path>/var/lib/clickhouse/</path>
<!-- Custom storage policy; a simple JBOD policy as an example, not required -->
<storage_configuration>
    <disks>
        <default>
            <keep_free_space_bytes>1073741824</keep_free_space_bytes>
        </default>
        <disk1>
            <path>/clickhouse/disk1/</path> <!-- mounting a dedicated disk here is recommended -->
        </disk1>
        <disk2>
            <path>/clickhouse/disk2/</path> <!-- mounting a dedicated disk here is recommended -->
        </disk2>
    </disks>
    <policies>
        <policy_jbod>
            <volumes>
                <disk_group>
                    <disk>disk1</disk>
                    <disk>disk2</disk>
                </disk_group>
            </volumes>
        </policy_jbod>
    </policies>
</storage_configuration>
<!-- Lock the server executable in memory -->
<mlock_executable>true</mlock_executable>
<!-- Cluster definition: three shards with two replicas each -->
<remote_servers>
    <cluster_3s2r>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>10.1.4.181</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>10.1.4.182</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>10.1.4.183</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>10.1.4.184</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <internal_replication>true</internal_replication>
            <replica>
                <host>10.1.4.185</host>
                <port>9000</port>
            </replica>
            <replica>
                <host>10.1.4.186</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_3s2r>
</remote_servers>
<!-- Zookeeper connection -->
<zookeeper>
    <node index="1">
        <host>10.0.4.101</host>
        <port>2181</port>
    </node>
    <node index="2">
        <host>10.0.4.102</host>
        <port>2181</port>
    </node>
    <node index="3">
        <host>10.0.4.103</host>
        <port>2181</port>
    </node>
</zookeeper>
<!-- Per-node macros; each node keeps only its own <macros> block -->
<macros> <!-- on 10.0.4.181 -->
    <shard>1</shard>
    <replica>10.1.4.181</replica>
</macros>
<macros> <!-- on 10.0.4.182 -->
    <shard>1</shard>
    <replica>10.1.4.182</replica>
</macros>
<macros> <!-- on 10.0.4.183 -->
    <shard>2</shard>
    <replica>10.1.4.183</replica>
</macros>
<macros> <!-- on 10.0.4.184 -->
    <shard>2</shard>
    <replica>10.1.4.184</replica>
</macros>
<macros> <!-- on 10.0.4.185 -->
    <shard>3</shard>
    <replica>10.1.4.185</replica>
</macros>
<macros> <!-- on 10.0.4.186 -->
    <shard>3</shard>
    <replica>10.1.4.186</replica>
</macros>
```
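The disk1/disk2 paths referenced by the storage policy must exist and be writable by the clickhouse user before the server starts, otherwise startup fails. A minimal preparation step (mount the dedicated disks on these paths first if you follow the recommendation above):
```bash
# Create the JBOD directories from the storage policy and hand them to clickhouse
mkdir -p /clickhouse/disk1 /clickhouse/disk2
chown -R clickhouse:clickhouse /clickhouse
```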
## Start ClickHouse
- Run the following on every ClickHouse server
- Start the clickhouse-server service, then verify it as shown below
```bash
systemctl start clickhouse-server
```
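Once started, confirm the service is healthy and listening on its TCP (9000) and HTTP (8123) ports; the HTTP port answers a plain GET with `Ok.`:
```bash
# Service state and listening ports
systemctl status clickhouse-server --no-pager
ss -lntp | grep clickhouse
# The HTTP interface replies "Ok." when the server is up
curl http://127.0.0.1:8123/
# Optionally enable the service at boot as well
systemctl enable clickhouse-server
```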
# Check the Cluster Status
- Run the following on any ClickHouse server
- Query the system.clusters table
```sql
SELECT * FROM system.clusters;
```
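The same check can be run non-interactively; all six replicas defined above should show up under the `cluster_3s2r` name:
```bash
# Expect six rows: shards 1-3, two replicas each
clickhouse-client --query "
    SELECT cluster, shard_num, replica_num, host_address
    FROM system.clusters
    WHERE cluster = 'cluster_3s2r'"
```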
# Basic Usage
- Log in to clickhouse on any node
```bash
clickhouse-client -h 127.0.0.1
```
- Create a database
```sql
CREATE DATABASE db1 ON CLUSTER cluster_3s2r;
USE db1;
```
- Create the local (replicated) data table
```sql
CREATE TABLE db1.t1_local
ON CLUSTER cluster_3s2r (
col1 UInt32,
col2 String
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/t1_local', '{replica}')
ORDER BY (col1)
SETTINGS storage_policy = 'policy_jbod';
```
- Create the distributed proxy table on top of the local table
```sql
CREATE TABLE db1.t1
ON CLUSTER cluster_3s2r
AS db1.t1_local
ENGINE = Distributed(cluster_3s2r, db1, t1_local, rand());
```
- Write and query data through the distributed proxy table (a distribution check across all nodes is sketched after this)
```sql
INSERT INTO db1.t1 VALUES (1, 'aa');
SELECT * FROM db1.t1;
```
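To see the sharding and replication at work, push a batch of rows through the distributed table and compare what each node's local table holds; the two replicas inside a shard should converge to identical counts (replication is asynchronous, so allow a moment). A minimal sketch run from any node:
```bash
# Insert 100 rows through the distributed proxy table
for i in $(seq 1 100); do
    clickhouse-client --query "INSERT INTO db1.t1 VALUES ($i, 'row$i')"
done
# rand() sharding spreads the rows across the three shards;
# replicas within a shard should report the same count
for ip in 10.1.4.181 10.1.4.182 10.1.4.183 10.1.4.184 10.1.4.185 10.1.4.186; do
    echo -n "$ip: "
    clickhouse-client --host "$ip" --query "SELECT count() FROM db1.t1_local"
done
```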