[Huawei] Apache CarbonData: Second-Level Response for Big Data Ad Hoc Queries - Liang Chen
2020-02-27
- 1.Second-level response for big data ad hoc queries
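- 2.Liang Chen / 陈亮 • Leader of Huawei's Big Data Open Source Development Department • Apache CarbonData PMC & Committer • More than 10 years of development and hands-on experience in big data and BI projects, with a deep understanding of open-source big data technologies (Hadoop, Spark, CarbonData, etc.) • Email: chenliang613@apache.org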
- 3.Big data is profoundly changing telecom operators, now and in the future • Stakeholders: Biz Customer, Consumer, OM Team, Partners • Application areas: network efficiency; network performance management and SQM policy assurance; customer care and CEM; 360° customer insight; market analysis; real-time marketing and recommendation; data monetization; fast decision-making and root-cause analysis; customer loyalty retention; fine-grained customer segmentation and personalized recommendation; open co-opetition with OTT; network problems and planning; customer care and process optimization; prediction and influence analysis; M2M and location analysis • [Architecture diagram, labels only: Big Data Suits, BSS suits, OSS suits, E2E ICT Resource Orchestration Engine, Telco apps and IT apps (SaaS), Middleware (PaaS), Cloud OS/OpenStack (Local Resource, IaaS), SDN controller, access, metro (Ethernet + OTN), and backbone (Router + WDM) network elements; tagline: "Smarter SoftCom", the intelligent convergence of business and operations] • Use cases marked on the diagram: real-time SDN elephant-flow mining; IPRAN traffic simulation; dynamic cell congestion control; retention of potential churners; one-stop service optimization; SON automatic real-time network optimization; fast fault correlation handling; open monetization
- 4.How to choose storage for complex big data requirements?
- 5.NoSQL database • Key-value store: low latency, <5 ms • Cannot support multi-dimensional queries
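The trade-off on this slide can be illustrated with a minimal sketch, assuming a plain in-memory dictionary stands in for the key-value store. The record layout, field names, and helper functions are hypothetical and are not taken from the talk or from any specific NoSQL product. A lookup by key is a single hash probe, while a filter on non-key fields has no index to use and must scan every record.

```python
# Hypothetical records keyed by subscriber id (illustrative data only).
store = {
    "u1": {"region": "north", "device": "phone", "usage_mb": 120},
    "u2": {"region": "south", "device": "tablet", "usage_mb": 300},
    "u3": {"region": "north", "device": "phone", "usage_mb": 80},
    "u4": {"region": "north", "device": "tablet", "usage_mb": 45},
}


def get(key):
    """Point lookup by key: one hash probe, independent of store size."""
    return store[key]


def filter_by(**conditions):
    """Multi-dimensional filter: no index on non-key fields, so scan all rows."""
    return sorted(
        key
        for key, row in store.items()
        if all(row[field] == value for field, value in conditions.items())
    )


print(get("u1"))
print(filter_by(region="north", device="phone"))
```

In this sketch the cost of `filter_by` grows with the total number of records, regardless of how few rows match.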
- 6.Multi-dimensional problem • Pre-compute all aggregation combinations • Complexity: O(2^n) • Dimension < 10 • Too much space • Slow loading speed
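The O(2^n) figure follows from counting subsets: every subset of the n dimensions is one possible group-by combination to pre-compute. A minimal sketch, assuming hypothetical dimension names and a helper called `cuboids` (neither comes from the talk):

```python
from itertools import combinations


def cuboids(dimensions):
    """Return every subset of the dimensions, i.e. every group-by combination."""
    return [
        combo
        for size in range(len(dimensions) + 1)
        for combo in combinations(dimensions, size)
    ]


dims = ["region", "device", "plan"]  # hypothetical dimension names
all_cuboids = cuboids(dims)
print(len(all_cuboids))  # 8, which is 2 ** 3

# The count doubles with every added dimension.
for n in (5, 10, 20):
    print(n, 2 ** n)
```

At 10 dimensions there are already 1,024 combinations to materialize, which is consistent with the slide's "Dimension < 10" limit and its notes on space and loading speed.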
- 7.Shared-nothing database • Parallel scan + distributed compute • Questionable scalability and fault tolerance • Cluster size < 100 data nodes • Not suitable for big batch jobs • Cannot integrate with the Hadoop ecosystem
- 8.Search engine • All columns indexed • Fast searching • Simple aggregation • Designed for search but not OLAP • Complex computation: TopN, join, multi-level aggregation • No SQL support
- 9.SQL on Hadoop • Modern distributed architecture, scales well in computation. • Pipeline based: