高版本ClickHouse管道服务实战及集群调优
一、ZK集群优化:【ZK日志压力不再成为推送瓶颈】
1、ZK集群与CK集群隔离,不参与混布,现阶段节点个数为5,后期可切换成标准3节点ZK
2、ZK集群配置信息调整
1 | A、事务日志和快照日志进行分盘存储,同时调整日志挂载盘所在位置,由原先的/export路径下的200G空间调整为/export/grid下的5.5T |
3、测试过程中的坑
1 | 日志目录的手动创建、myid文件的缺失(根据 Zookeeper Cluster的对应关系进行节点myid文件配置)、version-2文件缺失创建、zookpeer-server.pid文件赋权操作 |
4、ZK 配置项调整内容 【采用三台标N机器 - 10.220.45.72、10.220.45.73、10.220.45.74】
*log4j.properties *
1 | zookeeper.root.logger=INFO, ROLLINGFILE |
zoo_sample.cfg
1 | tickTime=2000 |
zoo.cfg
1 | tickTime=2000 |
定时清理日志脚本
1 | 0 0 * * * sudo find /export/data/zookeeper/data/version-2 -mtime +30 -name "log.*" -exec rm -Rf {} \; |
二、CK集群优化:【标D机器-12块*5.5T磁盘容量,CK推送write平均速率保持在50MB/s ~ 100MB/s,2.6Kw/s ~ 2.9Kw/s】
1、集群调优内容
1 | A、CK单机双实例模式,9000端口和13000端口,01-06为主端口存储盘,07-12为副端口存储盘 |
2、生产同步配置
1 | <max_insert_block_size>4194304</max_insert_block_size> |
1 | sudo mkdir -p /export/grid/01/clickhouse/jbod/9000/ |
config.xml
1 | <logger> |
metrika.xml
1 | <zookeeper-servers> |
users.xml
1 |
|
整个离线表CK推送过程氛围三个阶段:1)RDD :Hive数据源获取及结构转换(时长受限于:数据源标结构、数据质量、有效数据量、5K集群稳定-节点NameNode是否正常)2)Map:待推送数据区间Hash、重分片及推送字段字符串StringTokenizer格式化(Spark最大并发数、repartition数量、CK Hash方式)3)Foreach:数据真实Insert阶段(Insert bacth大小、Thread休眠时间、Spark最大并发数、CK CPU有效承受范围、CK Merge性能-磁盘IO读写速度、推送数据量-offset量+标签数量+标签枚举量
三、推送逻辑优化
代码层面优化 重要
1 | 1、建表语句添加use_minimalistic_part_header_in_zookeeper=1参数配置 |
四、前置数据优化:【前置数据的标准化及数据质量尤为关键、涉及到推送阶段1)和阶段3)的运行时长】
代码层面优化 重要
1 | 整体推送结论如下:全流程宽表推送服务已线上平稳运行一周,整体能保证下午15:00前保证所有表的最新分区维护到OLAP,CK实现T+1的数据更新。 |
五、新CK集群 40节点(20主、20备) + 单节点单实例 + 拆分业务宽表推送时间记录
资源参数:–driver-cores 1 –driver-memory 6G –executor-cores 1 –executor-memory 20G
配置参数:parallelism = 2001;ckMultiple = 55;batchInsert = 500000000;threadTime = 0s
结论:标签宽表的数据结构较大程度地影响推送第一阶段RDD生成时间及shuffle量的大小,(单表400列,下线200列)和(单表200列)两种Hive表结构进行测试对比,后者明显shuffle量会下降,第一阶段耗时也会削减50%左右,同时Hive表的存储方式如果能从ORC调整为PARQUET,预计整体推送时间仍能提升50%以上。
目前推送测试情况如下 标签宽表10亿行*300列,CK推送数据量膨胀比为10倍的情况下,整表推送时间的基线保持在(25min + 5min = 30min)
标准表表名 2020-12-06 | 宽表行数 | 宽表列数 | Hive数据量 | CK_dist数据量 | 推送时间 |
---|---|---|---|---|---|
dmxupf_width_biz_v3_jd_usr_pin_sec_01_s_d 8.13min(历史) | 3143411674 | 26 | 41.0G | 8355452 | 2020-12-08 16:10:26 (4.6min + 24s + 9s + 10s = 5.3min)CPU 25% |
dmxupf_width_biz_v3_jd_usr_pin_jdjr_01_s_d 63.3min(历史) | 1036113721 | 305 | 35.2G | 21057585966 | 2020-12-08 14:40:08(14min + 13min + 1min + 3min = 31min)CPU 40% |
dmxupf_width_biz_v3_jd_usr_pin_basic_01_s_d 71.5min(历史) | 1522254399 | 26 | 32.3G | 8565912577 | 2020-12-08 15:36:22(6.2min + 3.9min + 51s + 3min = 14min)CPU 25% |
dmxupf_width_biz_v3_jd_usr_pin_fin_01_s_d | 476631524 | 590 | 22.6G | 11953291531 | 2020-12-08 16:23:50(17min + 5.9min + 38s + 4.3min = 28min)CPU 30% |
dmxupf_width_biz_v3_jd_usr_pin_fin_02_s_d 39.25min(历史) | 391740122 | 149 | 7.3G | 2813360320 | 2020-12-08 16:33:39(14min + 2.2min + 26s + 55s = 18min)CPU 15% |
dmxupf_width_biz_v3_jd_usr_pin_mix_01_s_d 是否要用22G??? | 3174724968 | 30 | 70.3G | 45786681688 | 2020-12-08 17:04:06(9.4min + 13min + 2.7min + 16min = 41min)CPU 40% |
dmxupf_width_biz_v3_jd_usr_pin_mix_02_s_d | 978157942 | 40 | 29.7G | 5725585312 | 2020-12-08 17:07:44 (7.3min + 3.2min + 1.6min + 2min = 16min)CPU 15% |
dmxupf_width_biz_v3_jd_usr_pin_cf_01_s_d | 892646924 | 391 | 27.6G | 13301186218 | 2020-12-08 17:30:15 (16min + 8.1min + 41s + 4.6min = 30min)CPU 20% |
dmxupf_width_biz_v3_jd_usr_pin_cf_02_s_d | 181929673 | 137 | 2.5G | 1077344533 | 2020-12-08 17:51:24 (13min + 2.6min + 20s + 40s = 17min)CPU 15% |
dmxupf_width_biz_v3_jd_usr_pin_cf_03_s_d | 3051212081 | 238 | 27.5G | 10371289201 | 2020-12-08 18:06:24 (21min + 4.9min + 38s + 3.6min = 31min)CPU 25% |
dmxupf_width_biz_v3_jd_usr_pin_cf_04_s_m 99.25min / 51.6min(历史) | 3021392287 | 17 | 20.6G | 2020-12-08 18:31:04(5.6min + 2.6min + 18s + 1min = 10min)CPU 15% | |
dmxupf_width_biz_v3_jd_usr_pin_jdmall_01_s_d | 3175643883 | 269 | 82.5G | 34435203007 | 2020-12-08 19:15:23 (7.3min + 13min + 1.4min + 9min = 32min)CPU 30% |
dmxupf_width_biz_v3_jd_usr_pin_jdmall_02_s_d 112.12min(历史) | 1380243174 | 239 | 57.7G | 32467424440 | 2020-12-08 18:51:05 (7.9min + 13min + 1.5min + 10min = 32min)CPU 35% |
dmxupf_width_biz_jd_usr_pin_pay_s_d 72.37min(历史) | 1936821583 | 144 | 49.1G | 43633474737 | 2020-12-03 17:51:02 500 + 15G + 100 + 200000 + 2s(12min + 7.2min + 12min + 1.1min + 30min = 62.3min)75%2020-12-03 18:51:30 500 + 17G + 55 + 20000000 + 0s(11min + 7.2min + 11min + 2.2min + 17min = 48.4min)75%2020-12-03 21:58:36 1000 + 20G + 150 + 20000000 + 0s(8.9min + 7.4min + 13min + 2.2min + 21min = 54min)60%2020-12-04 15:44:25 500 + 17G + 55 + 50000000 + 0s(11min + 10min + 14min + 4.1min + 20min = 59.1min)40%2020-12-07 11:20:08 500 + 20G + 55 + 100000000 + 0s(41min + 14min = 55min)30%2020-12-07 11:20:08 500 + 20G + 55 + 500000000 + 0s(11min + 7.2min + 11min + 14min = 43min)CPU 40%(11 + 13 + 2 + 13 = 40min)最终参数确认组 |
dmxupf_width_biz_jd_usr_pin_ftc_s_d 16min(历史) | 246458197 | 125 | 1.8G | 793037332 | 2020-12-08 10:59:01 (10min + 2.2min + 22s + 38s = 13min)CPU 15% |
dmxupf_width_biz_jd_usr_pin_insu_s_d 65min(历史) | 29847749741016161997 | 300 | 33.3G13.3G | 7681909161 | 2020-12-08 10:59:01 (37min + 23min + 1.5min + 8min = 69.5min)CPU 20%pin0002368问题标签业务已经下线,待明日底表数据更新后再次推送(32min + 5.2min + 20s + 1.5min = 39min)CPU 20%(27min + 3.5min + 18s + 1.6min = 32min)CPU 20% |
dmxupf_width_jd_usr_pin_crd_s_d 27.75min(历史) | 825078098 | 243 | 15.1 G | 3593827335 | 2020-12-08 12:55:05 (19min + 3.1min + 22s + 1min = 23min)CPU 20% |
dmxupf_width_cs_jd_usr_pin_s_d 6.625min(历史) | 17282332 | 51 | 127.9M | 后期标准化表名后再推送,无推送风险 | |
dmxupf_biz_lh_impt_tm_s_d 7.875min(历史) | 107191918 | 25 | 3.7G | 后期标准化表名后再推送,无推送风险 | |
dmxupf_dmt_biz_lhzt_model_s_d 3min(历史) | 213681367 | 6 | 3.2G | 后期标准化表名后再推送,无推送风险 |