Designing, Building and Testing Deterministic HFT Systems Peter

2020-03-01 154浏览

  • 1.Designing, Building and Testing Deterministic HFT Systems QCon Shanghai 2019 Peter Lawrey CEO, Chronicle Software
  • 2.
  • 3.Peter Lawrey Self funded $2.1m/y revenue Several Tier 1 banking clients 17m downloads 2018
  • 4.Use Cases Low latency FIX Engine < 20 us 99.9% Low latency Replication < 30 us Low latency Persisted messaging < 1 us 99% Custom FX Trading With 3 month ROI
  • 5.Low latency trading microservice A process which handles events and transaction in around 10 – 100 microseconds Ideally, it should be only 1% busy on average.
  • 6.Low latency trading microservice Sources of latency - Network and OS - Any messaging/persistence - The core business logic - Garbage collection
  • 7.Lowering latency The less you do the faster it will be. Your process should only do essential work and avoid time spent in abstraction.
  • 8.Doing less, testing with JMH double price = 1.25; double quantity = 1e6; @Benchmark public double multipleAndRoundDouble() { double value = price * quantity; return Math.round(value * 1e2) / 1e2; } @Benchmark public double multipleAndRoundBigDecimal() { return BigDecimal.valueOf(price) .multiply(BigDecimal.valueOf(quantity)) .setScale(2, BigDecimal.ROUND_HALF_UP) .doubleValue(); }
  • 9.Doing less worst Average 1 in 100 1 in 1000 double 0.052 μS 0.1 μS 0.9 μS BigDecimal 0.28 μS 0.4 μS 17 μS
  • 10.Reducing the impact of Garbage Collection Give yourself a budget e.g. 24 GB/day or 1 GB/hour => 1 GC a day
  • 11.Real time processing of transactions
  • 12.Real time processing of transactions
  • 13.Real time processing of transactions Series of low latency, non blocking tasks End to end latencies consistently low
  • 14.Real time processing of transactions
  • 15.Store every event model is getting cheaper 3.84 TB- £830 7.68 TB - £2100 15.36 TB - £4,200
  • 16.How do we guarantee correctness? Each microservice should be designed to be deterministic so it produces the same result for given inputs every time.
  • 17.How do we guarantee correctness? Time can be an input for time sensitive actions such as expiry. Time should be an input which is recorded, testable and replayable.
  • 18.Recovery in a low latency system The state of any microservice can be reconstructed by replying the inputs. This can take too long so the system can dump a progressive snapshot over N minutes
  • 19.Recovery in a low latency system On failover, restart or upgrade a system can read all of it’s inputs and/or the resulting state changes to recreate it’s state.
  • 20.Replayability reduces time to fix. Problems can be recreated quickly by taking the data from production, creating the same state on a development system, debugging and testing any fix.
  • 21.Testing microservices # setup.yamlopeningBalance:{timestampUS:'>timestampUS: