Challenges & Opportunities in Graph Processing at Alibaba, 钱正平
2020-03-01 173浏览
- 1.Challenges & Opportunities in Graph Processing at Alibaba Zhengping Qian Computing Infrastructure Team Alibaba Group
- 2.About me • 2009, South China University of Technology • Computer architecture, networked systems • 2009-2015, Microsoft Research Asia • Systems research group • 2015-now, Alibaba Group • Big-datasystems:stream processing, graph analytics, machine learning, etc.
- 3.A lot of collaborators...
- 4.Knowledge graph Graph G(V, E), is a math concept In this talk, it refers to a data model
- 5.How the graph (data model) leads to knowledge?
- 6.
- 7.
- 8.“天龙八部里,在石屋内段誉和钟 灵发生什么了吗”
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.找周杰伦歌曲所有作曲人写的最好的10首歌曲, 按照分数排序。 Query usingGremlin:G.V() .hasLabel('artist') .has('name', '周杰伦') .out('e_sing') .in('e_write').dedup() .out('e_write') .order().by('rank_score', decr) .limit(10) .values('name', 'rank_score')
- 16.
- 17.BIG CHALLENGES FOR COMPUTER SYSTEMS
- 18.BIG CHALLENGES FOR COMPUTER SYSTEMS
- 19.BIG FUTURE WITH MACHINE INTELLIGENCE
- 20.BIG FUTURE WITH MACHINE INTELLIGENCE
- 21.COMPUTING PLATFORM AT ALIBABA GROUP 业务层 统一 开发环境 计算服务 统一调度 异构计算 统一存储 淘宝 阿里妈妈 高德 Lazada 合一 DW Suite (D2/Base) 开发套件 流水线管理 通用计算 ODPS (MARS) 蚂蚁 UC BI/大屏 大屏 流计算 Blink … 公共云 专有云 监控 监控/部署 部署/运维 运维 机器学习 PAI 图计算平台 FLASH 伏羲 伏羲Fuxi(统一资源管理,阿里层面资源混布) (统一资源管理,阿里层面资源混布) CPU 存储计算分离 存储计算分离/分级存储能力 分级存储能力 GPU FPGA ASIC
- 22.LAYERING
- 23.DATA-PARALLEL COMPUTATION Big Data
- 24.MULTI-PROCESS PIPELINE • Chaining a set of processes by their I/O streams • E.g., a Unixpipeline:program1 program2 program3 Douglas McIlroyReference:http://budiu.info/work/cloudera14.pptx,http://en.wikipedia.org/wiki/Pipeline_(Unix)
- 25.MULTI-PROCESS PIPELINE • Chaining a set of processes by their I/O streams • E.g., a Unixpipeline:program1 program2 program3 Input p1 p2 p3 OutputReference:http://budiu.info/work/cloudera14.pptx,http://en.wikipedia.org/wiki/Pipeline_(Unix)
- 26.MULTI-PROCESS PIPELINE + DATA PARALLELISM • Chaining a set of processes by their I/O streams • E.g., a Unixpipeline:program1 program2 program3 Input • p1 p2 p3 Output Partitioning input and/or output data for parallel computation Input1 p1 Input2 p1 Input3 p1 p2 p2 p3 Output1 p3 Output2 p3 Output3Reference:http://budiu.info/work/cloudera14.pptx,http://en.wikipedia.org/wiki/Pipeline_(Unix)
- 27.IMPLEMENTATION • Scheduler, buffering (queue) and fault tolerance Input1 p1 Input2 p1 Input3 p1 p2 p2 p3 Output1 p3 Output2 p3 Output3Reference:http://budiu.info/work/cloudera14.pptx,http://en.wikipedia.org/wiki/Pipeline_(Unix)
- 28.MapReduce var result = input.SelectMany(r => Mapper(r)) .GroupBy(r => Key(r)) .Select(g => Reducer(g)); Distributed sorting var result = input.SelectMany(r => Mapper(r)) .OrderBy(r => Key(r));Reference:http://budiu.info/work/cloudera14.pptx
- 29.A TYPICAL DATA-PROCESSING JOB AT ALIBABA
- 30.BIG FUTURE WITH MACHINE INTELLIGENCE
- 31.时效性需求 • • • 100,000,000 日志事件 日志事件/秒 秒 4,000 服务器 <3秒延时 秒延时
- 32.时效性需求 时效性需求2 • 1. 2. 3. 实时视频分析挑战: 带宽 资源调度 算法参数
- 33.Input stream Model of computation R Event X R X X Computation Channel M Output stream
- 34.Challenge of fault tolerance Input replay Internal state reconstruction R X Duplicate/missing output R X M X
- 35.BIG FUTURE WITH MACHINE INTELLIGENCE Graph processing requirements
- 36.:“ ”
- 37.“ ”
- 38.
- 39.“ ” ¥100.00 ¥99.00 “ ”
- 40.FLASH Advanced Applications Query Primitives Mining Primitives QueryLanguage:Primitive Operators Computing Models Memory/Disk/SSD/Multi-core + Distributed/Cloud Data Environments Dynamic Graph, Probabilistic, Sampling, Random Graph Control plane •Stateful •Dynamic control flow •Load balancing •Sync/async execution •Push/pull, … Data plane •Graph partitioning •Compact encoding •Compression •Indexing •Caching/reuse,…
- 41.BREADTH-FIRST SEARCH IN FLASH // VertexSet A = g.filter(has(attrVal("O0241P0009"), EQ, "15170811181")).local("dst", 0); g = g.set(UNION, A); // repeat(A, g); VertexSet nextA = A.exchange(out(), min("dst"), "predst").join(g) .filter(hasNo("dst")) .local("dst", add("predst", 1)); g = g.set(UNION, nextA); endRepeat(nextA, g).until(100, A.isEmpty()); // g.filter(has("dst")).outputLocal("dst", "O0241P0009");
- 42.“SMART” CONNECTED COMPONENTS IN FLASH final int p = 2; VertexSet A = V.filter(and(has(“id”, GT, pi), has(“id”, LT, p(i+1))); repeat(A, V); A = A.union(A.filter(hasNo(“color”))).local(“color”, attrVal(“id”))); A = A.exchange(both(), “color”, “preColorPath”).local(“colorPath”, appendList(“colorPath”, “preColorPath”)); A = A.union(A.filter(hasNo(“color)).local(“color”, min(attrVal(“colorPath)))); V = V.union(A); VertexSet nextA = V.filter(and(has(“id”, GT, p^(iterationCount())), has(“id”, LT, p^(iterationCount()+1)))); endRepeat(nextA, V).until(500, nextA.isEmpty()); S = V.group(flashTypeOf(1), “colorPath”, “colorPathList”).local(“colorGroupPath”, unionGroup(“colorPathList”)).toGlobal(“resultColorPath”, “colorPathList”); V = V.local(“color”, minValue(“color”, getByIndex(S.get(“resultColorPath”), attrVal(“color”))))); V.output(”file", "id", ”color");
- 43.RACK COMPUTING • • • • • • • 阿里巴巴大数据平台已经达到: 阿里巴巴大数据平台已经达到:5K、 、10K…规模 规模 真实的图往往可以在更小规模集群计算 A “Rack”: 成千上万的 成千上万的CPU/核 核 Tb级的内存 级的内存 PB级的磁盘 级的磁盘 快速互联的网络通信设施 快速互联的网络通信设施(如RDMA或InfiniBand)
- 44.GRAPH PROCESSING FRAMEWORKS
- 45.FLATTEN:PATTERN MATCHING USING FLASH
- 46.BIG FUTURE WITH MACHINE INTELLIGENCE
- 47.CONCLUDING REMARKS • Big data infrastructure is becoming a key foundation • Significant challenges to support diverse workloads at scale • Needs collaborations, and solving real problems • At Alibaba, we believe research and realworld impact can/must go hand-in-hand
- 48.WELCOME TO JOIN US IN HANGZHOU
- 49.MUSIC IS THE BEST