How to Break Through the Bottlenecks of Tencent's Big Data Analytics Architecture
2020-03-01 174 views
- 1.— T4
- 4.storm spark streaming flink 1 storm 2 spark streaming 3 flink 1 5 10 sql 1 sql 1 count
- 5.+bitmap
- 6.App1 10 — (1,2,3,4...) — (1,2,3,4...) — (1,2,3,4...) — (1,2,3,4...) ... App2... App3... App1_ App1_ ...... App1_ App2_ ...... 1_ 1_ 1 2 2_ 1_ 1 1 Pv Uv ......
- 7.BitMap Index=17 …… 10000000 00100000 10010000 00110000 byte[n] byte-3 byte-2 byte-1 byte-0 1 bit 20 bitmap 1 238m 2 id 200k bitmap bitmap hadoop/hive count…from…group by join… id sql sql join select distinct 30 20 bit
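
To make the bitmap idea on this slide concrete, here is a minimal sketch of a byte-array bitmap in which bit `i` marks integer id `i`. It assumes that the "20" and "238m" on the slide refer to roughly 2 billion ids and the resulting bitmap size in MB; the `Bitmap` class, `bitmap_size_mib` helper, and the bit order inside each byte are illustrative choices, not taken from the slides.

```python
class Bitmap:
    """Fixed-size bitmap backed by a bytearray; bit i marks integer id i."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.data = bytearray((capacity + 7) // 8)  # one bit per id, rounded up

    def add(self, idx: int) -> None:
        # Byte idx // 8, bit idx % 8 (least significant bit first: an assumption).
        self.data[idx >> 3] |= 1 << (idx & 7)

    def contains(self, idx: int) -> bool:
        return bool(self.data[idx >> 3] & (1 << (idx & 7)))

    def count(self) -> int:
        # Number of set bits, i.e. the number of distinct ids seen.
        return sum(bin(b).count("1") for b in self.data)


def bitmap_size_mib(n_ids: int) -> float:
    """Memory needed for an uncompressed bitmap covering n_ids ids, in MiB."""
    return n_ids / 8 / (1024 * 1024)


bm = Bitmap(32)
bm.add(17)
bm.add(17)  # setting the same bit twice changes nothing, so counts are distinct
print(bm.contains(17), bm.count())               # True 1
print(round(bitmap_size_mib(2_000_000_000), 1))  # 238.4
```

Because setting a bit is idempotent, counting set bits gives a distinct count without the `select distinct` or join that the slide associates with SQL.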
- 8.BitMap
  - 1: 10000000 00000000 11000000 00110000
  - 2: 10100000 00000000 00010000 00100000
  - 2: 10100000 00000000 11010000 00110000
  - bitmap1 bitmap2 ...
  - 00100000 00100000 00000000 00110000
  - Oppo 10100000 00010000 00010000 10000010
  - 00100000 00000000 00010000 00000000
  - bitmap1&bitmap2
- 9.BitMap 1 2 3 4 5 6 7 8 : 1 bitmap1 bitmap2 bitmap1; bitmap1 & bitmap2; (bitmap1 | bitmap2) ^ bitmap1; bitmap1 bitmap2; app1time1 & app2time2; bitmap1 & bitmap2 & bitmap3 &…
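
The bit strings on slide 8 can be checked against the expressions on slide 9 with plain integers. In this sketch the first two bit strings of slide 8 are taken as `bitmap1` and `bitmap2` (an assumption about the lost labels). Under that reading, the third bit string on the slide is their bitwise OR, and the last bit string equals `(bitmap1 | bitmap2) ^ bitmap1`, which is the set of ids present in `bitmap2` but not in `bitmap1`. The helper names `parse`, `show`, and `popcount` are illustrative.

```python
def parse(bits: str) -> int:
    """Turn a space-separated bit string from the slide into an int."""
    return int(bits.replace(" ", ""), 2)


def show(bitmap: int, width: int = 32) -> str:
    """Format an int back into space-separated bytes."""
    s = format(bitmap, f"0{width}b")
    return " ".join(s[i:i + 8] for i in range(0, width, 8))


def popcount(bitmap: int) -> int:
    """Number of set bits, i.e. the number of ids in the bitmap."""
    return bin(bitmap).count("1")


bitmap1 = parse("10000000 00000000 11000000 00110000")
bitmap2 = parse("10100000 00000000 00010000 00100000")

union = bitmap1 | bitmap2                    # ids in either bitmap (deduplicated)
common = bitmap1 & bitmap2                   # ids in both bitmaps
only_in_2 = (bitmap1 | bitmap2) ^ bitmap1    # ids in bitmap2 but not in bitmap1

print(show(union))      # 10100000 00000000 11010000 00110000
print(show(common))     # 10000000 00000000 00000000 00100000
print(show(only_in_2))  # 00100000 00000000 00010000 00000000
print(popcount(bitmap1), popcount(bitmap2), popcount(union))  # 5 4 7
```

Chaining `&` across more bitmaps, as in `bitmap1 & bitmap2 & bitmap3 &…`, works the same way.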
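- 10.BitMap 1 2 1 1

  |          | u0 | u1 | u2 | u3 | u4 | u5 | u6 | u7 |
  |----------|----|----|----|----|----|----|----|----|
  | hw       | 1  | 0  | 0  | 0  | 0  | 1  | 0  | 0  |
  | oppo     | 0  | 1  | 1  | 1  | 0  | 1  | 1  | 0  |
  | vivo     | 1  | 1  | 0  | 0  | 1  | 0  | 0  | 1  |
  | mi       | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 1  |
  | BJ       | 1  | 1  | 1  | 0  | 0  | 0  | 1  | 0  |
  | SH       | 1  | 0  | 0  | 0  | 1  | 0  | 0  | 0  |
  | SZ       | 0  | 0  | 1  | 0  | 0  | 1  | 1  | 0  |
  | GZ       | 0  | 1  | 1  | 0  | 1  | 0  | 0  | 0  |
  | kashi    | 0  | 0  | 0  | 1  | 0  | 0  | 0  | 0  |
  | hw_BJ    | 1  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
  | mi_kashi | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |

  A (10) and B (10): An&Bn gives An_Bn, so 10+10=20 bitmaps rather than 10*10=100 bitmaps. mi_kashi: an AB bitmap with no bits set.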
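
The table on this slide can be reproduced with one bitmap per single value and an AND at query time. The sketch below only contains the rows shown on the slide (4 values of A, 5 values of B); the names `dim_a`, `dim_b`, `combine`, and `users` are illustrative, and packing u0 as the leftmost bit is an assumption made to match the slide layout.

```python
USERS = 8  # columns u0..u7 on the slide


def row(bits: str) -> int:
    """Pack a slide row such as '1 0 0 0 0 1 0 0' so that u0 is the leftmost bit."""
    return int(bits.replace(" ", ""), 2)


# One bitmap per value of dimension A and per value of dimension B.
dim_a = {
    "hw":   row("1 0 0 0 0 1 0 0"),
    "oppo": row("0 1 1 1 0 1 1 0"),
    "vivo": row("1 1 0 0 1 0 0 1"),
    "mi":   row("0 0 0 0 0 0 0 1"),
}
dim_b = {
    "BJ":    row("1 1 1 0 0 0 1 0"),
    "SH":    row("1 0 0 0 1 0 0 0"),
    "SZ":    row("0 0 1 0 0 1 1 0"),
    "GZ":    row("0 1 1 0 1 0 0 0"),
    "kashi": row("0 0 0 1 0 0 0 0"),
}


def combine(a: str, b: str) -> int:
    """Compute the An_Bn bitmap on demand as An & Bn instead of storing it."""
    return dim_a[a] & dim_b[b]


def users(bitmap: int) -> list[str]:
    """List the user labels whose bit is set."""
    return [f"u{i}" for i in range(USERS) if (bitmap >> (USERS - 1 - i)) & 1]


print(users(combine("hw", "BJ")))     # ['u0']
print(users(combine("mi", "kashi")))  # []
print(users(combine("oppo", "SZ")))   # ['u2', 'u5', 'u6']
```

Storing only the per-value bitmaps keeps the count additive in the number of values, and empty combinations such as mi_kashi never need to be materialized.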
- 11.BitMap 1 2 3 4 20 1-2 30 bitmap id 200k 20 bitmap bitmap bit 238m 42 2:42 10 1 10-20 id 42 gzip
- 12.+BitMap bitmap Dcache/bdb Mq Flink/Storm Hdfs hive SQL Id
- 13.Id id bitmap 1 50 id ID 20 Id 300 /s Id 5 hash Dcache/bdb 1s 20 sdk id id
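
A bitmap is indexed by integers, so string identifiers have to be mapped to integer positions first. The following is a minimal sketch of such a mapping, assuming that this is the role of the Id service mentioned on this slide; the slide text does not confirm the details. The `IdServer` class is hypothetical, and an in-memory dict stands in for the Dcache/bdb storage named on the slide.

```python
class IdServer:
    """Hypothetical id mapper: assigns each new string id the next free integer."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}  # stand-in for a persistent key-value store

    def get_or_create(self, raw_id: str) -> int:
        idx = self._ids.get(raw_id)
        if idx is None:
            idx = len(self._ids)  # dense, sequential positions keep bitmaps compact
            self._ids[raw_id] = idx
        return idx


server = IdServer()
first = server.get_or_create("device-a")
second = server.get_or_create("device-b")
again = server.get_or_create("device-a")
print(first, second, again)  # 0 1 0
```

Sequential assignment keeps the highest bit position equal to the number of distinct ids seen, which bounds the bitmap size; a plain hash of the id would scatter bits over a much larger range.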
- 14.BitMap 1 2 20 bitmap 238m 20 100byte 8k 100byte 65536 8k 100byte 65536 8k 100byte 65536 8k 64k 2m 524288 64k 16777216 2m 524288 64k 16777216 2m 524288 64k 16777216 2m
- 15.BitMap 3 RoaringBitmap value bitmap a b 2 short[] RoaringBitmap bitmap RoaringBitmap java bitmap key value 65535 4096 short value short[][] 65535 bitmap key 4096 65535*4096 short gc bitmap 4096 c 5000 bitmap 1001000000000000 16 short key 65535 1001000000000000 16 short value 4096 bitmap <4096 8k Short[] <4096 65535 8k Short[] <4096 65535 8k Short[] <4096 65535 8k 4096 short[]
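
The key/value split behind RoaringBitmap can be sketched in a few lines: the high 16 bits of a 32-bit integer select a container, and the low 16 bits are stored inside it, either as a sorted array while the container holds at most 4096 values or as an 8 KB bitmap beyond that. This is a simplified illustration, not the RoaringBitmap library's implementation; `RoaringSketch` and its methods are hypothetical names, and run containers and other optimizations are left out.

```python
from bisect import bisect_left

ARRAY_LIMIT = 4096          # an array container holds at most this many values
BITMAP_BYTES = 65536 // 8   # a bitmap container covers 65536 values in 8 KB


class RoaringSketch:
    """Simplified two-level bitmap: high 16 bits -> container of low 16 bits."""

    def __init__(self) -> None:
        self.containers: dict[int, list[int] | bytearray] = {}

    def add(self, x: int) -> None:
        key, low = x >> 16, x & 0xFFFF
        c = self.containers.setdefault(key, [])
        if isinstance(c, list):
            i = bisect_left(c, low)
            if i < len(c) and c[i] == low:
                return  # already present
            c.insert(i, low)  # keep the array sorted
            if len(c) > ARRAY_LIMIT:
                # Too dense for an array: switch this container to a bitmap.
                bits = bytearray(BITMAP_BYTES)
                for v in c:
                    bits[v >> 3] |= 1 << (v & 7)
                self.containers[key] = bits
        else:
            c[low >> 3] |= 1 << (low & 7)

    def contains(self, x: int) -> bool:
        c = self.containers.get(x >> 16)
        if c is None:
            return False
        low = x & 0xFFFF
        if isinstance(c, list):
            i = bisect_left(c, low)
            return i < len(c) and c[i] == low
        return bool(c[low >> 3] & (1 << (low & 7)))

    def kind(self, key: int) -> str:
        return "array" if isinstance(self.containers[key], list) else "bitmap"


r = RoaringSketch()
for value in range(5000):   # 5000 values share key 0, which exceeds the array limit
    r.add(value)
r.add(70000)                # key 1, low part 4464
print(r.kind(0), r.kind(1))                # bitmap array
print(r.contains(4999), r.contains(5000))  # True False
```

An array container with n values costs about 2n bytes, so it is smaller than the fixed 8 KB bitmap exactly while n stays below 4096, which is why the threshold sits there.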
- 16.BitMap 4 ArrayBitmap range Bitmap Bitmap , 65535) min(init, 65535) auto threshold n=range/bucket+(1?0) m=range/8+(1?0) bitmap bitmap <8k 1 2 3 4 init~bucket <8k init~bucket <8k init~bucket <8k roaring bitmap 10 50 100 5000 1 2 >=threshold 20 500 1000 3000 bitmap roaring bitmap roaring bitmap roaring bitmap roaring bitmap RoaringBitmap
- 17.• 1 hadoop+hive olap • PCG Flink+bitmap demo olap bitmap
- 19.DMP 13 7 1000+ beacon.qq.com 4000 + MAU
- 20.1 1 — — 1 3 10 —
- 21.2 5 30 3 vip 1 10 35 docker 1 2 3 30 2 1 m10 6 m10 flink 11 m10 id server dcache 10 m10+3ts80 10 1 dcache 4 m10 200g id server 10
- 22.apk 1 hive 25 2 3 4 bitmap 1200 1 2 3 4 20 app imei flink2 app app 10 7 8 app_ _ 12 20 1-2 90 bitmap dcache10 400g id 1 bitmap bitmap app bimap bitmap 11
- 23.3000 100 hive 3000 app 48 20 3000 join1000 3000 7 bitmap 1 bitmap 100 1000 bitmap 20 1000 200 20 ID 200 MQ ...... / / / bitmap Time_app bitmap(imei) bitmap Time_app_ bitmap(imei) ...... 3000 -
- 24.ABtest
- 25.ABtest 1 2 Abtest 1 PV UV 2 vivo CTR ... CTR 1 CTR 2 3 1 pv uv 2 3 ... ... ... ...
- 26.ABtest ID TestID Log hdfs ABTest
- 27.ABtest
- 28.mysql Offset Flink Flink group count array/bitmap hdfs ID Server bitmap hdfs Bitmap Dcache
- 29.olap mysql Offset Flink Flink group count hdfs ID Server Dcache+Bdb array/bitmap ( ( bitmap Bitmap ) )
- 30.Olap
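- 31.druid olap

  | rowid | timestamp | dimensions |     |      | metrics |    |
  |-------|-----------|------------|-----|------|---------|----|
  | 1     | 01:01     | gd         | 1.1 | hw   | 100     | 50 |
  | 2     | 01:02     | bj         | 1.1 | oppo | 120     | 50 |
  | 3     | 01:03     | sh         | 1.2 | vivo | 80      | 50 |
  | 4     | 01:04     | gd         | 1.1 | oppo | 60      | 50 |
  | ...   |           |            |     |      |         |    |

  Bitmap per dimension value (bits 31 ... 1 0, one bit per rowid; wah):

  | value | bitmap               |
  |-------|----------------------|
  | gd    | ...00000000 00001001 |
  | bj    | ...00000000 00000010 |
  | sh    | ...00000000 00000100 |
  | 1.1   | ...00000000 00001011 |
  | 1.2   | ...00000000 00000100 |

  Select ... where =gd and =1.1 ... group by ...: gd&1.1 = ...00000000 00001001, i.e. rowid 1 and 4, so the metric is 100+60=160.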
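
The query on this slide can be traced with a small bitmap index over the four rows. In the sketch below the column names `region`, `version`, `brand`, and `metric` are hypothetical, since the original column headers were lost, and the timestamp of rowid 1 is assumed to be 01:01. Bit `rowid - 1` of each bitmap marks the rows that contain the value.

```python
from collections import defaultdict

# The four rows from the slide; column names are placeholders.
rows = [
    {"ts": "01:01", "region": "gd", "version": "1.1", "brand": "hw",   "metric": 100},
    {"ts": "01:02", "region": "bj", "version": "1.1", "brand": "oppo", "metric": 120},
    {"ts": "01:03", "region": "sh", "version": "1.2", "brand": "vivo", "metric": 80},
    {"ts": "01:04", "region": "gd", "version": "1.1", "brand": "oppo", "metric": 60},
]


def build_index(rows, columns):
    """For every column value, build a bitmap with one bit per row."""
    index = {c: defaultdict(int) for c in columns}
    for pos, r in enumerate(rows):
        for c in columns:
            index[c][r[c]] |= 1 << pos
    return index


def sum_where(rows, index, metric, **filters):
    """AND the bitmaps of all filter values, then sum the metric over matching rows."""
    hits = (1 << len(rows)) - 1  # start with every row selected
    for column, value in filters.items():
        hits &= index[column].get(value, 0)
    return sum(r[metric] for pos, r in enumerate(rows) if (hits >> pos) & 1)


index = build_index(rows, ["region", "version", "brand"])
print(format(index["region"]["gd"], "08b"))    # 00001001
print(format(index["version"]["1.1"], "08b"))  # 00001011
print(sum_where(rows, index, "metric", region="gd", version="1.1"))  # 160
```

The filter is resolved entirely on bitmaps, so only the matching rows are touched when the metric is summed.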
- 32.druid impala 1 druid imei 2 3 impala imei app olap olap imei
- 33.Cube: Cube[t][d][v] = bitmap(rowid) bitmap(imei); t (Z axis), d (Y axis), v (X axis). t2: cube[t2][ ][ ]; d2: cube[ ][d2][ ]; t0-t1, d0=v0 and d0=v1: cube[<2][d0][<2]. Axis ticks: t0 t1 t2; d0 d1 d2 3; v0 v1 v2 v3 v4 v5.
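
The slicing notation on this slide can be read as selecting a sub-block of cells and merging their bitmaps. The sketch below stores `Cube[t][d][v]` sparsely as a dict keyed by `(t, d, v)` and merges the selected cells with a bitwise OR; the OR merge, the sample bitmaps, and the name `cube_slice` are assumptions for illustration and are not given on the slide.

```python
from functools import reduce
from itertools import product
from operator import or_

T, D, V = 3, 3, 6  # axis sizes matching the ticks t0..t2, d0..d2, v0..v5

# Sparse cube: only non-empty cells are stored. Bitmaps are made-up sample data.
cube = {
    (0, 0, 0): 0b000001,
    (0, 0, 1): 0b000010,
    (1, 0, 0): 0b000101,
    (1, 0, 1): 0b001000,
    (1, 1, 0): 0b010000,
    (2, 0, 0): 0b100000,
    (2, 2, 3): 0b000001,
}


def cube_slice(cube, ts, ds, vs) -> int:
    """OR together the bitmaps of every cell in the selected sub-block."""
    return reduce(or_, (cube.get(cell, 0) for cell in product(ts, ds, vs)), 0)


# cube[<2][d0][<2]: times t0-t1, dimension d0, values v0 and v1
print(bin(cube_slice(cube, range(2), [0], range(2))))    # 0b1111
# cube[t2][ ][ ]: everything at time t2
print(bin(cube_slice(cube, [2], range(D), range(V))))    # 0b100001
# cube[ ][d2][ ]: everything for dimension d2
print(bin(cube_slice(cube, range(T), [2], range(V))))    # 0b1
```

Because the cells hold bitmaps rather than pre-aggregated counts, ids that appear in several cells are counted once after the merge.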
- 34.1-3 Olap 1 Gd bitmap(imei) Gd_1.1 bitmap(imei) ...... 2 3 1 10 olap App App App imei rowid olap + druid mdb 3 imei 1 10 /100 Gd_1.1_vivo_ bitmap(imei) Gd_1.1_oppo_ bitmap(imei) ...... 2 30 3 2 Id Olap 1 2 imei rolap cube molap dimensions timestamp metrics ... 1 01:01:08 gd 1000 100 950 0 0 2 01:01:09 gd 1.1 200 30 180 0 0 3 03:01:20 gd 1.1 vivo ... 60 10 50 80 50 4 03:01:25 gd 1.1 oppo ... 80 15 70 60 50 5 03:02:10 bj 1.1 hw ... sum avg
- 35.Olap druid 1 Select sum( ) where =gd 2 Select sum( ) where =gd 3 Select sum( ) where and timestamp=[from, to] =gd Olap Olap druid +druid Olap