Yin Ye, Senior Engineer at Tencent: The Evolution of the Tencent Games Container Cloud Platform

2020-02-27

  • 1.The Evolution of the Tencent Games Container Cloud Platform •  Yin Ye, Senior Engineer
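  • 5.Platform overview, Technical solutions, Summary
  • 6.Platform overview •  2014 – now •  200+ apps, 230,000+ CPU cores, 800+ TB of memory •  Business scenarios •  Lightweight VMs •  Microservices •  Offline computing (big data, machine learning)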
  • 7.Technology stack
  • 8.Platform overview, Technical solutions, Summary
  • 9.Lightweight VMs •  System init (sysvinit/systemd) + SSH •  IP per light-VM •  Run monitor agent in light-VM
  • 10.systemd •  Container Interface •  container=docker •  Cgroup is needed •  udev is not available when /sys is mounted read-only •  systemd defines its shutdown signal as SIGRTMIN+3 •  ...
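A minimal sketch of how the systemd points above might translate into a `docker run` invocation for a light VM. The image name `example/centos-systemd`, the container name, and the helper name are hypothetical, and the read-only cgroup bind mount is only an assumption about how the "Cgroup is needed" requirement could be met; the slides do not say how the platform actually starts its containers. `--env` and `--stop-signal` are standard Docker CLI flags.

```python
import shlex
from typing import List


def systemd_container_argv(image: str, name: str) -> List[str]:
    """Build (but do not run) a docker command line for a systemd container."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        # Container Interface: tells systemd which container manager it runs under.
        "--env", "container=docker",
        # systemd shuts down cleanly on SIGRTMIN+3 rather than SIGTERM.
        "--stop-signal", "SIGRTMIN+3",
        # Assumption: expose the cgroup hierarchy read-only to systemd.
        "--volume", "/sys/fs/cgroup:/sys/fs/cgroup:ro",
        image, "/sbin/init",
    ]


if __name__ == "__main__":
    # Hypothetical image and container names, for illustration only.
    print(shlex.join(systemd_container_argv("example/centos-systemd", "light-vm-1")))
```

With the stop signal set this way, `docker stop` asks systemd to shut down in an orderly fashion before Docker falls back to SIGKILL.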
  • 11.Network (1) •  Bridge •  Poor performance •  Set veth txqlen=0
  • 12.Network (2) •  SR-IOV •  Good performance •  Bind VF interrupts to CPUs •  Enable RPS
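RPS is enabled per receive queue by writing a hexadecimal CPU bitmask to the queue's `rps_cpus` file in sysfs. The sketch below only computes the mask and the path; the interface name `eth1` and the CPU list are hypothetical, and the actual write needs root.

```python
from typing import Iterable


def rps_cpu_mask(cpus: Iterable[int]) -> str:
    """Format a set of CPU ids as the comma-separated 32-bit hex groups
    that the kernel accepts for rps_cpus (most significant group first)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU ids must be non-negative")
        mask |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{mask & 0xFFFFFFFF:08x}")
        mask >>= 32
        if mask == 0:
            break
    return ",".join(reversed(groups))


def rps_path(dev: str, rx_queue: int) -> str:
    """sysfs file that controls RPS for one receive queue of a device."""
    return f"/sys/class/net/{dev}/queues/rx-{rx_queue}/rps_cpus"


if __name__ == "__main__":
    # Hypothetical: steer packets from eth1's first queue to CPUs 0-3.
    print(f"echo {rps_cpu_mask(range(4))} > {rps_path('eth1', 0)}")
```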
  • 13./proc •  Lxcfs •  Kernel support
  • 14.Microservices •  Only the app in the container •  IP per container? •  Monitoring
  • 15.Network - Overview
  • 16.Underlay to overlay •  LB •  http/https/tcp/udp
  • 17.VXLAN optimization •  UDP RSS •  ethtool -N eth10 rx-flow-hash udp4 sdfn •  VXLAN offload •  VXLAN GRO •  Kernel 3.14 (net: Add GRO support for vxlan traffic)
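A toy illustration of why the `sdfn` setting matters for VXLAN. All tunnel traffic between two hosts shares the same outer source and destination IP, so hashing on IPs alone sends it to one receive queue; including the UDP ports (`f` and `n`) lets the varying outer source port spread flows across queues. This sketch uses SHA-256 as a stand-in hash, not the NIC's real RSS function, and the addresses, port range, and queue count are made up.

```python
import hashlib
from typing import Iterable, Set, Tuple

Packet = Tuple[str, str, int, int]  # outer src IP, dst IP, src port, dst port


def rx_queue(fields: tuple, n_queues: int) -> int:
    """Map header fields to a queue index with a stand-in hash."""
    digest = hashlib.sha256("|".join(map(str, fields)).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_queues


def queues_used(packets: Iterable[Packet], use_ports: bool, n_queues: int = 8) -> Set[int]:
    """Return the set of receive queues hit by the given outer headers."""
    used = set()
    for src_ip, dst_ip, src_port, dst_port in packets:
        fields = (src_ip, dst_ip, src_port, dst_port) if use_ports else (src_ip, dst_ip)
        used.add(rx_queue(fields, n_queues))
    return used


# 64 inner flows between one pair of hosts: the outer source port differs per
# flow, the destination port is the IANA-assigned VXLAN port 4789.
packets = [("10.0.0.1", "10.0.0.2", 49152 + i, 4789) for i in range(64)]

if __name__ == "__main__":
    print("IP-only hash, queues used:", len(queues_used(packets, use_ports=False)))
    print("IP+port hash, queues used:", len(queues_used(packets, use_ports=True)))
```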
  • 18.CNI •  Simple •  Plugins (macvlan, ipvlan, bridge, multus, …) •  Container runtimes (k8s, rkt, mesos, …) •  SR-IOV CNI (github.com/hustcat/sriov-cni) •  High performance (NFV, proxy, LB, …) •  VF interrupt CPU binding •  DPDK supported
  • 19.K8S extensions •  Scheduler plugin •  Cpuset and NUMA •  kubernetes#49186 (v1.8?)
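To make the "Cpuset and NUMA" point concrete, here is a hypothetical sketch of NUMA-aware CPU pinning: all of a container's CPUs are taken from one NUMA node so that its memory accesses stay local. The best-fit policy, the function name, and the data layout are assumptions for illustration; they are not taken from the platform's scheduler plugin or from the referenced Kubernetes issue.

```python
from typing import Dict, List, Optional, Tuple


def pick_cpuset(free_cpus: Dict[int, List[int]], want: int) -> Optional[Tuple[int, str]]:
    """Pick `want` free CPUs from a single NUMA node.

    Returns (node id, cpuset string such as "8,9"), or None when no single
    node has enough free CPUs. Best fit: prefer the node with the fewest
    free CPUs that still satisfies the request, to limit fragmentation.
    """
    if want <= 0:
        return None
    candidates = [(len(cpus), node) for node, cpus in free_cpus.items() if len(cpus) >= want]
    if not candidates:
        return None
    _, node = min(candidates)
    chosen = sorted(free_cpus[node])[:want]
    return node, ",".join(str(cpu) for cpu in chosen)


if __name__ == "__main__":
    # Hypothetical host: node 0 has six free CPUs, node 1 has three.
    free = {0: [0, 1, 2, 3, 4, 5], 1: [8, 9, 10]}
    print(pick_cpuset(free, 2))
    print(pick_cpuset(free, 4))
```

The returned string uses the comma-separated list syntax accepted by the cgroup `cpuset.cpus` file.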
  • 20.Monitor
  • 21.Log
  • 22.Offline computing •  TensorFlow + GPU •  NVIDIA/nvidia-docker (GPU devices, CUDA libraries) •  Spark
  • 23.Spark on K8S •  Native support for submitting Spark applications to a Kubernetes cluster. •  The submitted application runs in a driver executing in a Kubernetes pod, and executor lifecycles are also managed as pods. •  SPARK-18278 •  https://github.com/apache-spark-on-k8s
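A sketch of what submitting a job in this native mode can look like. The option names follow upstream Spark's Kubernetes support (2.3 and later); the fork linked in the slide used different names for some settings, so treat them as assumptions. The API server URL, image name, and jar path are hypothetical.

```python
import shlex
from typing import List


def spark_submit_argv(api_server: str, image: str, main_class: str,
                      app_jar: str, executors: int) -> List[str]:
    """Build (but do not run) a spark-submit command for cluster mode on k8s."""
    return [
        "spark-submit",
        # The k8s:// prefix selects the Kubernetes scheduler backend.
        "--master", f"k8s://{api_server}",
        # Cluster mode: the driver itself runs in a pod.
        "--deploy-mode", "cluster",
        "--class", main_class,
        "--conf", f"spark.executor.instances={executors}",
        "--conf", f"spark.kubernetes.container.image={image}",
        app_jar,
    ]


if __name__ == "__main__":
    print(shlex.join(spark_submit_argv(
        api_server="https://k8s.example.com:6443",    # hypothetical
        image="example/spark:latest",                 # hypothetical
        main_class="org.apache.spark.examples.SparkPi",
        app_jar="local:///opt/spark/examples/jars/spark-examples.jar",  # hypothetical path
        executors=4,
    )))
```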
  • 24.Architecture
  • 25.Comparison with Spark Standalone on K8S •  Elastic •  Spark executors scale elastically with job demand •  Simple •  Simplifies the process of running Spark jobs •  Efficient •  Only one resource scheduler, the k8s-based one
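  • 26.Image distribution •  In-house enterprise-grade image registry •  P2P transfer
  • 27.Image registry •  Token authentication •  Access control •  Operation logs and auditing •  Distributed storage
  • 28.P2P image transfer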
  • 29.Kernel •  Overlayfs + XFS •  Buffered IO throttling •  Cgroup namespace •  Isolation of network sysctl kernel parameters •  Bug fixes
  • 30.Overlayfs + XFS •  Advantages •  Simple •  Good IO performance •  XFS (project quota, inode limit) •  Some problems •  Inotify (#11705) •  Unix socket (#12080, Kernel 4.7)
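A sketch of how XFS project quotas can cap both the disk space and the inode count of one container's writable directory. It only builds the `xfs_quota` command lines. The mount point, directory, project id, and limits are hypothetical, and the filesystem is assumed to be mounted with project quotas enabled (`prjquota`); the slides do not show the platform's own tooling.

```python
from typing import List


def xfs_project_quota_cmds(mount_point: str, directory: str, project_id: int,
                           bhard: str, ihard: int) -> List[List[str]]:
    """Build the two xfs_quota calls that set up a project quota on a directory."""
    return [
        # Tag the directory tree with the project id.
        ["xfs_quota", "-x", "-c", f"project -s -p {directory} {project_id}", mount_point],
        # Hard limits on blocks (disk space) and inodes for that project.
        ["xfs_quota", "-x", "-c",
         f"limit -p bhard={bhard} ihard={ihard} {project_id}", mount_point],
    ]


if __name__ == "__main__":
    # Hypothetical layout: one upper directory per container under /data.
    for cmd in xfs_project_quota_cmds("/data", "/data/overlay/c1/upper", 1001, "10g", 100000):
        print(cmd)
```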
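  • 31.Platform overview, Technical solutions, Summary
  • 32.Summary •  Containers redefine how services are deployed and how resources are delivered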