Anomaly Detection and Recovery in the Real-Time Streaming System Heron, Huijun Wu (吴惠君), Twitter

2020-03-01

  • 1. Self Regulating Stream Processing in Heron. Huijun Wu, 2017.12
  • 2. Huijun Wu. Twitter, Inc.: Infrastructure, Data Platform, Real-Time Compute
  • 3. Heron Overview • Recent Improvements • Self Regulating Challenges • Dhalion Framework • Case Study
  • 4. Heron Overview: Data model (user/developer perspective). What is Heron? A real-time, distributed, fault-tolerant stream processing engine from Twitter. Topology (DAG): • Vertex: Spout, Bolt • Edge: Stream, Tuple. Compatible with the Apache Storm data model.
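The data model on this slide (spouts and bolts as vertices, streams of tuples as edges) can be illustrated with a small in-memory sketch. This is not the Heron API: the `Topology` class, its methods, and the word-count components below are hypothetical names chosen for illustration, and everything runs in a single process.

```python
from collections import defaultdict

class Topology:
    """Toy DAG (hypothetical, not Heron's API): vertices are spouts/bolts, edges are streams."""
    def __init__(self):
        self.spouts = {}                 # name -> callable returning an iterable of tuples
        self.bolts = {}                  # name -> callable(tuple) -> list of output tuples
        self.edges = defaultdict(list)   # upstream vertex name -> downstream bolt names

    def add_spout(self, name, source):
        self.spouts[name] = source

    def add_bolt(self, name, process, inputs):
        self.bolts[name] = process
        for upstream in inputs:
            self.edges[upstream].append(name)

    def run(self):
        """Push every spout tuple through the graph; collect tuples emitted by sink bolts."""
        results = defaultdict(list)

        def emit(vertex, tup):
            downstream = self.edges.get(vertex, [])
            if not downstream and vertex in self.bolts:
                results[vertex].append(tup)      # sink bolt: keep its output
            for bolt in downstream:
                for out in self.bolts[bolt](tup):
                    emit(bolt, out)

        for name, source in self.spouts.items():
            for tup in source():
                emit(name, tup)
        return dict(results)

def sentence_spout():
    # Spout: the source of the stream.
    return [("the quick fox",), ("the lazy dog",)]

def split_bolt(tup):
    # Bolt: one sentence tuple in, one tuple per word out.
    return [(word,) for word in tup[0].split()]

counts = defaultdict(int)

def count_bolt(tup):
    # Bolt: keeps a running count per word.
    counts[tup[0]] += 1
    return [(tup[0], counts[tup[0]])]

topology = Topology()
topology.add_spout("sentences", sentence_spout)
topology.add_bolt("split", split_bolt, inputs=["sentences"])
topology.add_bolt("count", count_bolt, inputs=["split"])
output = topology.run()
```

In a real deployment each vertex would run as multiple parallel instances spread across containers; the sketch only shows the shape of the graph.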
  • 5. Heron Overview: Runtime architecture (data center with multiple topologies). Topology shared services: • Scheduler • Uploader • State manager: ZooKeeper. Tools: • Tracker • UI
  • 6. Heron Overview: Runtime architecture (one particular topology). Shared between containers: • State manager: ZooKeeper. Type 1: container 0 • Topology master • Metrics cache. Type 2: container x (x>0) • Stream manager • Heron instance • Metrics manager
  • 7. Heron Overview: Runtime rate control (backpressure). Health metrics: ● Metrics/counters ○ Backpressure ● Exceptions. Backpressure example: B3 in container A triggers backpressure, which is broadcast to all stream managers so that they stop their local spouts.
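The backpressure example on this slide can be sketched as a small simulation. It is a minimal illustration, not Heron's implementation: the `StreamManager` and `Bolt` classes are hypothetical, and the high/low watermark thresholds are an assumption added here to decide when backpressure starts and ends (the slide only says that a slow bolt triggers a broadcast that stops local spouts).

```python
from collections import deque

class StreamManager:
    """Hypothetical stand-in for one container's stream manager."""
    def __init__(self, name):
        self.name = name
        self.spouts_paused = False

    def on_backpressure(self, active):
        # Stop (or resume) the spouts local to this container.
        self.spouts_paused = active

class Bolt:
    """A slow bolt whose input buffer drives backpressure (thresholds are assumed)."""
    def __init__(self, peers, high_watermark=5, low_watermark=2):
        self.buffer = deque()
        self.peers = peers              # all stream managers in the topology
        self.high = high_watermark
        self.low = low_watermark
        self.in_backpressure = False

    def _broadcast(self, active):
        self.in_backpressure = active
        for manager in self.peers:
            manager.on_backpressure(active)

    def receive(self, tup):
        self.buffer.append(tup)
        # Buffer is filling up: tell every stream manager to stop its spouts.
        if not self.in_backpressure and len(self.buffer) >= self.high:
            self._broadcast(True)

    def process_one(self):
        if self.buffer:
            self.buffer.popleft()
        # Buffer has drained enough: let the spouts resume.
        if self.in_backpressure and len(self.buffer) <= self.low:
            self._broadcast(False)

managers = [StreamManager("A"), StreamManager("B")]
slow_bolt = Bolt(peers=managers)

for i in range(5):                      # a burst arrives faster than the bolt drains
    slow_bolt.receive(i)
paused_after_burst = [m.spouts_paused for m in managers]

for _ in range(3):                      # the bolt catches up
    slow_bolt.process_one()
paused_after_drain = [m.spouts_paused for m in managers]
```

Using two thresholds instead of one avoids rapid toggling when the buffer hovers near a single limit.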
  • 8. Heron Overview • Recent Major Improvements (2016-2017) • Self Regulating Challenges • Dhalion Framework • Case Study
  • 9. Recent Improvements: Performance improvement. https://blog.twitter.com/engineering/en_us/topics/open-source/2017/optimizing-twitter-heron.html
  • 10. Recent Improvements: Resource managers. Service Provider Interface (SPI): • Modular plugins • https://github.com/twitter/heron/tree/master/heron/spi • Scheduler implementation vs. delegation. Supported resource pools: • Mesos/Aurora/Marathon • Yarn • Kubernetes • Slurm • Local. http://2015.qconshanghai.com/presentation/2792
  • 11. Recent Improvements: Elastic runtime scaling
      ● Update parallelism at runtime
      ● Adapt to stream traffic load
      ● `heron update` command
      ● Minimize impact to running topology
      ● Intelligent packing algorithm
  • 12. Recent Improvements: Stateful processing (effectively once). Delivery semantics:
      ● At most once
      ● At least once
      ● Effectively once
          ○ Distributed snapshot/state checkpointing
          ○ At-least-once event delivery plus roll back
      ● Exactly once
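The "effectively once" recipe on this slide (state checkpointing combined with at-least-once delivery plus roll back) can be sketched as follows. This is a single-process illustration under stated assumptions, not Heron's checkpointing protocol: the function name, the checkpoint interval, and the injected failure are all hypothetical.

```python
import copy

def run_with_checkpoints(events, checkpoint_every, fail_at=None):
    """Sum events; crash once before index `fail_at`, then roll back and replay.

    Returns (final total, number of event deliveries).
    """
    state = {"total": 0}
    checkpoint = (0, copy.deepcopy(state))   # (next offset to read, state snapshot)
    deliveries = 0
    offset = 0
    failed = False
    while offset < len(events):
        if fail_at is not None and offset == fail_at and not failed:
            # Simulated crash: restore the last snapshot and rewind the source.
            failed = True
            offset, snapshot = checkpoint
            state = copy.deepcopy(snapshot)
            continue
        state["total"] += events[offset]     # apply the event to the state
        deliveries += 1
        offset += 1
        if offset % checkpoint_every == 0:
            checkpoint = (offset, copy.deepcopy(state))
    return state["total"], deliveries

events = [1, 2, 3, 4, 5, 6]
total_ok, deliveries_ok = run_with_checkpoints(events, checkpoint_every=2)
total_failed, deliveries_failed = run_with_checkpoints(
    events, checkpoint_every=2, fail_at=5
)
```

In the failing run one event is delivered twice (at least once), yet the final total matches the failure-free run, because the state changes made after the last checkpoint are discarded before the replay. That is the sense in which each event affects the state effectively once.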
  • 13. Recent Improvements: High-level DSL (functional API). Original topology API vs. Heron Functional API:
      ● Programming style
          ○ Original topology API: procedural, processing component based.
          ○ Heron Functional API: functional.
      ● Abstraction level
          ○ Original topology API: low level. Developers must think in terms of "physical" spout and bolt implementation logic.
          ○ Heron Functional API: high level. Developers can write processing logic in an idiomatic fashion in the language of their choice, without needing to write and connect spouts and bolts.
      ● Processing model
          ○ Original topology API: spout and bolt logic must be created explicitly, and connecting spouts and bolts is the responsibility of the developer.
          ○ Heron Functional API: spouts and bolts are created for you automatically on the basis of the processing graph that you build.
  • 14. Recent Improvements: Multiple languages support
      ● Same data model as Java
      ● Python:
          ○ Document: https://twitter.github.io/heron/docs/developers/python/topologies/
          ○ API: https://twitter.github.io/heron/api/python/
          ○ Example: https://github.com/twitter/heron/tree/master/examples/src/python
          ○ Code repository: https://github.com/twitter/heron/tree/master/heronpy
      ● C++:
          ○ Code repository: https://github.com/twitter/heron/tree/master/heron/api/src/cpp
  • 15. Recent Improvements: Self regulating (health manager/Dhalion). Motivation: ➢ the manual, time-consuming and error-prone tasks of tuning various configuration knobs to achieve service level objectives (SLOs), as well as the maintenance of SLOs in the face of sudden, unpredictable load variation and hardware or software performance degradation. What is Dhalion: