Real-Time Data Analytics with TiDB (Ma Xiaoyu)

2020-03-01

  • 1.TiDB @PingCAP
  • 3.About Me: @PingCAP, BigData Infra Team Lead, SQL on Hadoop
  • 5.Data Sink, SQL, BI Tools, OLTP DBs, Server Events, Web Console
  • 6.NoSQL, Hadoop, RDBMS, Update NoSQL
  • 8.DBA Icon credit to Recep Kutuk, becris, Kiranshastry@www.flaticon.com
  • 9.Merge Delta Unified View Refined Data
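The labels on slide 9 (Merge, Delta, Unified View, Refined Data) suggest a pipeline in which incremental changes are periodically merged into an already-refined data set to produce one view for readers. A minimal sketch of that idea, assuming a keyed snapshot and a list of change records; the function name, the record shape, and the use of `None` as a delete marker are all illustrative assumptions, not taken from the talk:

```python
def merge_delta(base, delta):
    """Apply delta records over a base snapshot; a value of None marks a delete."""
    view = dict(base)  # copy so the refined snapshot itself is not mutated
    for key, value in delta:
        if value is None:
            view.pop(key, None)   # delete (ignored if the key is absent)
        else:
            view[key] = value     # insert or update
    return view


# Hypothetical refined data (order id -> status) and a batch of later changes.
base = {1: "created", 2: "paid", 3: "shipped"}
delta = [(2, "refunded"), (3, None), (4, "created")]

unified = merge_delta(base, delta)
print(unified)
```

Until the merge has run, readers of the refined data do not see the delta, which is one reason such pipelines struggle to serve fresh data.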
  • 10.vs
  • 11.TiDB
  • 12.TiDB: MySQL, ACID, SQL
  • 14.TiDB architecture (diagram): MySQL Clients and Syncer sit in front of the TiDB servers (TiDB Cluster); TiDB reaches the TiKV Cluster (Storage) through the DistSQL API; the PD Cluster supplies TSO, data location, and cluster metadata.
  • 15.Multi-Raft: Raft
  • 16.TiKV node layers (diagram): Coprocessor, Transaction, MVCC, Raft, RocksDB. Region 1:[a-e], Region 2:[f-j], Region 3:[k-o], Region 4:[p-t], Region 5:[u-z]; each Region appears three times across four RocksDB instances, and the copies of one Region form a Raft group.
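The Region ranges on slide 16 partition the key space into contiguous slices. The sketch below shows how a key could be routed to its Region by range lookup. It uses the single-letter ranges from the slide as a simplification; real TiKV keys are byte strings, and `region_for_key` is a hypothetical helper, not a TiKV API:

```python
from bisect import bisect_right

# Region start keys from the slide: [a-e], [f-j], [k-o], [p-t], [u-z].
REGION_STARTS = ["a", "f", "k", "p", "u"]


def region_for_key(key: str) -> int:
    """Return the 1-based Region number whose range covers the key's first letter."""
    if not key:
        raise ValueError("empty key")
    first = key[0].lower()
    if not "a" <= first <= "z":
        raise ValueError(f"key outside the illustrated range: {key!r}")
    # bisect_right counts how many start keys are <= first,
    # which is exactly the 1-based index of the covering Region.
    return bisect_right(REGION_STARTS, first)


if __name__ == "__main__":
    for k in ["apple", "hello", "kiwi", "tidb", "zebra"]:
        print(k, "-> Region", region_for_key(k))
```

Because the ranges are sorted and non-overlapping, the lookup is a binary search, and splitting a Region only requires inserting one more start key.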
  • 17.MySQL 5.7 → MySQL 8.0; Schema; JSON; SQL
  • 18.DBA Maybe yes, maybe no ? ?
  • 19.TiDB SQL → Ad Hoc Query; SQL → Data Science / Machine Learning; Join → TiDB, Hadoop
  • 20.TiSpark: TiSpark, TiDB, Apache Spark; Apache Zeppelin, R, Join; Hive, TiDB; Apache Spark, TiDB (WIP)
  • 21.TiSpark architecture (diagram): the Spark Driver (with TiSpark) retrieves data locations from the Placement Driver (PD) over gRPC; each Spark Exec (with TiSpark) retrieves data from TiKV, the distributed storage layer, over gRPC.
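The flow on slide 21 has two steps: the driver asks PD where the data lives, then the executors read from TiKV in parallel. The sketch below is a plain-Python simulation of the driver-side planning step only. It is not the TiSpark API; the `Region` class, the store names, and `plan_tasks` are assumptions made for illustration, and the key ranges reuse the single-letter Regions from slide 16:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    region_id: int
    start: str   # inclusive first letter of the range
    end: str     # inclusive last letter of the range
    store: str   # hypothetical TiKV store that serves this Region


# Stand-in for the location data that PD would return to the driver.
PD_REGIONS = [
    Region(1, "a", "e", "tikv-1"),
    Region(2, "f", "j", "tikv-2"),
    Region(3, "k", "o", "tikv-3"),
    Region(4, "p", "t", "tikv-1"),
    Region(5, "u", "z", "tikv-2"),
]


def plan_tasks(scan_start: str, scan_end: str) -> list:
    """Clip the scan range to each overlapping Region, yielding one task per Region."""
    tasks = []
    for r in PD_REGIONS:
        lo, hi = max(scan_start, r.start), min(scan_end, r.end)
        if lo <= hi:  # the scan overlaps this Region
            tasks.append((r.region_id, r.store, lo, hi))
    return tasks


if __name__ == "__main__":
    # Each tuple could then be handed to a separate executor.
    for task in plan_tasks("d", "m"):
        print(task)
```

One task per Region means a large scan fans out across the storage nodes instead of funnelling through a single reader.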
  • 22.DBA Maybe yes, maybe no Icon credit to Recep Kutuk@www.flaticon.com
  • 24.TiDB; IO; TiDB + TiSpark
  • 26.TiFlash architecture (diagram): TiDB and TiSpark Workers access two clusters. The TiFlash Extension Cluster has TiFlash Node 1 and TiFlash Node 2; the TiKV Cluster has TiKV Nodes 1-3 (Stores 1-3). Replicas of Regions 1-4 are spread across the nodes.
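TiFlash is TiDB's columnar storage extension, placed next to the row-oriented TiKV cluster. The toy comparison below shows why the two layouts suit different queries. The table, its columns, and the helper names are invented for illustration and say nothing about TiFlash's actual storage format:

```python
# Three hypothetical order rows: (id, customer, amount).
rows = [
    (1, "alice", 30),
    (2, "bob", 45),
    (3, "alice", 25),
]

# Row layout: each record is stored together, which suits point lookups by id.
row_store = {r[0]: r for r in rows}

# Columnar layout: each column is stored together, so an aggregate
# over one column never touches the others.
column_store = {
    "id": [r[0] for r in rows],
    "customer": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def point_lookup(order_id):
    """OLTP-style access: fetch one whole record."""
    return row_store[order_id]


def total_amount():
    """OLAP-style access: scan a single column."""
    return sum(column_store["amount"])


if __name__ == "__main__":
    print(point_lookup(2))
    print(total_amount())
```

Keeping both layouts over the same data lets transactional lookups and analytical scans each use the representation that fits them.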
  • 27.TiFlash TiDB 4 3
  • 28.C TiFlash TiDB 4 C 4
  • 29.Why TiFlash here
  • 30.DBA with TiFlash Extension Icon credit to Recep Kutuk@www.flaticon.com
  • 31.Everything comes with a price: NoSQL, TiDB, Hadoop, PB
  • 33.Binlog; TiDB SQL; MySQL; Spark
  • 36.Thank You!