Yin Ye, Senior Engineer at Tencent: The Evolution of the Tencent Games Container Cloud Platform
2020-02-27
- 1.The Evolution of the Tencent Games Container Cloud Platform • Yin Ye, Senior Engineer
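- 5.Platform overview • Technical solutions • Summary
- 6.Platform overview • 2014 to present • 200+ apps, 230,000+ CPU cores, 800+ TB of memory • Workloads • Lightweight VMs • Microservices • Offline computing (big data, machine learning)
- 7.Technology stack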
- 8.Platform overview • Technical solutions • Summary
- 9.Lightweight VMs • System init (sysvinit/systemd) + SSH • One IP per light VM • Monitoring agent runs inside the light VM
- 10.systemd • Container Interface • container=docker • Cgroups are required • udev is not available when /sys is mounted read-only • systemd defines SIGRTMIN+3 as its shutdown signal • ...
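
The systemd requirements on this slide can be collected into a single container launch command. The following is a minimal sketch in Python, assuming Docker as the runtime; the helper name, the image and container names, the read-only cgroup bind mount, and the `/run` tmpfs mount are illustrative assumptions rather than details from the talk.

```python
import shlex


def systemd_container_args(image, name):
    """Build a `docker run` argv for a container whose PID 1 is systemd.

    Hypothetical helper: the mounts below are common choices, not the
    platform's actual configuration.
    """
    return [
        "docker", "run", "-d", "--name", name,
        # Container Interface: tell systemd which container manager started it.
        "--env", "container=docker",
        # systemd needs to see the cgroup hierarchy (read-only is an assumption).
        "--volume", "/sys/fs/cgroup:/sys/fs/cgroup:ro",
        # A writable tmpfs on /run is commonly provided for systemd (assumption).
        "--tmpfs", "/run",
        # systemd shuts down on SIGRTMIN+3, so `docker stop` should send that.
        "--stop-signal", "SIGRTMIN+3",
        image, "/sbin/init",
    ]


if __name__ == "__main__":
    argv = systemd_container_args("example/centos-systemd:7", "lightvm-demo")
    print(shlex.join(argv))
```

Setting the stop signal matters because the default SIGTERM is not what systemd uses as a shutdown request when it runs as PID 1, so a plain `docker stop` would otherwise wait for the timeout and then kill the container.
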
- 11.Network (1) • Bridge • Poor performance • Set veth txqlen=0
- 12.Network (2) • SR-IOV • Good performance • Bind VF interrupts to CPUs • Enable RPS
- 13./proc • LXCFS • Kernel support
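
Both interrupt binding and RPS are configured by writing a CPU bitmask to a kernel file. The sketch below, in Python, computes those masks and the target paths under `/proc/irq` and `/sys/class/net`; the IRQ number, device name, and CPU choices are made-up examples, and the helper names are hypothetical.

```python
def cpu_mask(cpus):
    """Return the hex CPU bitmask in the kernel's comma-separated 32-bit groups."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    hex_digits = format(mask, "x")
    groups = []
    while hex_digits:
        groups.insert(0, hex_digits[-8:])  # peel off 8 hex digits = 32 bits
        hex_digits = hex_digits[:-8]
    return ",".join(groups)


def irq_affinity_write(irq, cpus):
    """(path, value) that pins one interrupt to the given CPUs."""
    return (f"/proc/irq/{irq}/smp_affinity", cpu_mask(cpus))


def rps_write(dev, rx_queue, cpus):
    """(path, value) that enables RPS for one receive queue of a device."""
    return (f"/sys/class/net/{dev}/queues/rx-{rx_queue}/rps_cpus", cpu_mask(cpus))


def apply(path, value):
    """Write the mask; needs root, so the demo below only prints."""
    with open(path, "w") as f:
        f.write(value)


if __name__ == "__main__":
    print(irq_affinity_write(98, [2]))          # hypothetical VF interrupt
    print(rps_write("eth1", 0, [4, 5, 6, 7]))   # hypothetical VF device
```

Keeping the mask computation separate from the write makes it easy to check the values before touching a live host.
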
- 14.Microservices • Only the app runs in the container • IP per container? • Monitoring
- 15.Network - Overview
- 16.Underlay to overlay • LB • http/https/tcp/udp
- 17.VXLAN optimization • UDP RSS • ethtool -N eth10 rx-flow-hash udp4 sdfn • VXLAN offload • VXLAN GRO • Kernel 3.14 (net:Add GRO support for vxlan traffic)
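
In the `ethtool` command on this slide, `s` and `d` select the source and destination IP, while `f` and `n` select the UDP source and destination port. The Python sketch below illustrates why that helps VXLAN: all traffic between two hosts shares the same outer IP pair, so a hash on `sd` alone always lands on one receive queue. CRC32 stands in for the NIC's real hash function, and the addresses, port range, and queue count are assumptions for illustration.

```python
import zlib

VXLAN_PORT = 4789  # IANA-assigned VXLAN destination port


def outer_headers(inner_flow_id):
    """Outer headers of one VXLAN packet between two fixed hosts."""
    # VXLAN senders derive the outer UDP source port from the inner flow.
    sport = 49152 + zlib.crc32(str(inner_flow_id).encode()) % 16384
    return {"s": "10.0.0.1", "d": "10.0.0.2", "f": sport, "n": VXLAN_PORT}


def rx_queue(outer, fields, num_queues):
    """Toy stand-in for RSS: hash only the selected outer-header fields."""
    key = "|".join(str(outer[f]) for f in fields).encode()
    return zlib.crc32(key) % num_queues


def queues_used(fields, flows=64, num_queues=8):
    """Set of receive queues hit by `flows` distinct inner flows."""
    return {rx_queue(outer_headers(i), fields, num_queues) for i in range(flows)}


if __name__ == "__main__":
    print("hash on sd:  ", sorted(queues_used("sd")))
    print("hash on sdfn:", sorted(queues_used("sdfn")))
```

With `sd` every packet produces the same hash key, so exactly one queue is used; including the ports lets the varying outer source port spread inner flows over more queues.
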
- 18.CNI • Simple • Plugins (macvlan, ipvlan, bridge, multus, …) • Container runtimes (k8s, rkt, mesos, …) • SR-IOV CNI (github.com/hustcat/sriov-cni) • High performance (NFV, proxy, LB, …) • VF interrupt CPU binding • DPDK supported
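
Part of what makes CNI simple is that a network is described by a small JSON document naming the plugin to run and its IPAM delegate. The Python sketch below generates such a configuration for the reference macvlan plugin with host-local IPAM. The network name, interface, and subnet are placeholders, and the SR-IOV plugin mentioned on the slide has its own fields that are not shown here.

```python
import json


def macvlan_conf(name, master, subnet):
    """Return a CNI network configuration for the reference macvlan plugin."""
    return {
        "cniVersion": "0.3.1",
        "name": name,
        "type": "macvlan",          # plugin binary the runtime executes
        "master": master,           # host interface the macvlan attaches to
        "ipam": {                   # address allocation is delegated to IPAM
            "type": "host-local",
            "subnet": subnet,
        },
    }


if __name__ == "__main__":
    conf = macvlan_conf("demo-net", "eth1", "10.10.0.0/24")
    print(json.dumps(conf, indent=2))
```

The container runtime passes this document to the plugin on standard input, which is why the same plugin can serve Kubernetes, rkt, or Mesos.
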
- 19.K8S extensions • Scheduler plugin • Cpuset and NUMA • kubernetes#49186 (v1.8?)
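
Cpuset and NUMA awareness comes down to giving a container CPUs that all belong to one NUMA node. The Python sketch below shows one simple way to make that choice from the kernel's cpulist format; it is an illustrative first-fit policy with hypothetical function names, not the scheduler plugin described in the talk.

```python
def parse_cpulist(text):
    """Parse the kernel cpulist format, e.g. '0-3,8' -> [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def pick_cpuset(numa_nodes, allocated, want):
    """Pick `want` free CPUs from a single NUMA node (first fit), or None.

    numa_nodes: {node_id: cpulist}, as found in
                /sys/devices/system/node/node<N>/cpulist
    allocated:  set of CPU ids already pinned to other containers
    """
    for node_id in sorted(numa_nodes):
        free = [c for c in parse_cpulist(numa_nodes[node_id]) if c not in allocated]
        if len(free) >= want:
            return node_id, free[:want]
    return None


if __name__ == "__main__":
    nodes = {0: "0-3,8-11", 1: "4-7,12-15"}   # made-up two-socket topology
    print(pick_cpuset(nodes, {0, 1, 2, 3, 8, 9}, 4))
```

Returning `None` when no single node has enough free CPUs leaves the decision to the caller, which could either reject the placement or fall back to spanning nodes.
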
- 20.Monitor
- 21.Log
- 22.Offline computing • TensorFlow + GPU • NVIDIA/nvidia-docker (GPU devices, CUDA libraries) • Spark
- 23.Spark on K8S • Native support for submitting Spark applications to a Kubernetes cluster. • The submitted application runs in a driver executing in a Kubernetes pod, and executor lifecycles are also managed as pods. • SPARK-18278 • https://github.com/apache-spark-on-k8s
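
Native submission means `spark-submit` talks to the Kubernetes API server directly through a `k8s://` master URL. The Python sketch below assembles such a command. It uses the property names from upstream Apache Spark (2.3 and later); the fork linked on the slide may use different names, and the API server address, image, and jar path are placeholders.

```python
import shlex


def spark_submit_args(api_server, image, app_jar, main_class, executors):
    """Build a cluster-mode `spark-submit` argv targeting Kubernetes."""
    return [
        "spark-submit",
        "--master", f"k8s://{api_server}",   # the driver is created as a pod
        "--deploy-mode", "cluster",
        "--class", main_class,
        "--conf", f"spark.executor.instances={executors}",
        "--conf", f"spark.kubernetes.container.image={image}",
        app_jar,
    ]


if __name__ == "__main__":
    argv = spark_submit_args(
        api_server="https://k8s.example.com:6443",
        image="example/spark:latest",
        app_jar="local:///opt/spark/examples/jars/spark-examples.jar",
        main_class="org.apache.spark.examples.SparkPi",
        executors=5,
    )
    print(shlex.join(argv))
```

The `local://` scheme indicates that the jar is already present inside the container image, so nothing has to be uploaded at submission time.
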
- 24.Architecture
- 25.Comparison with Spark Standalone on K8S • Elastic • Spark executors can be elastic depending on job demands • Simple • Simplifies the process of running Spark jobs • Efficient • Only k8s-based resource scheduler
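- 26.Image distribution • In-house enterprise-grade image registry • P2P transfer
- 27.Image registry • Token authentication • Access control • Operation logs and auditing • Distributed storage
- 28.P2P image distribution
- 29.Kernel • Overlayfs + XFS • Buffered IO throttling • Cgroup namespace • Isolation of network sysctl kernel parameters • Bug fixes
- 30.Overlayfs + XFS • Advantages • Simple • Good IO performance • XFS (project quota, inode limit) • Some problems • Inotify (#11705) • Unix socket (#12080, Kernel 4.7)
- 31.Platform overview • Technical solutions • Summary
- 32.Summary • Containers redefine how services are deployed and how resources are delivered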
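
XFS project quotas cap both the disk space and the inode count of a directory tree, which is one way to bound a container's writable layer. The Python sketch below builds the two `xfs_quota` invocations involved. It assumes the filesystem is mounted with project quotas enabled; the mount point, directory, project ID, and limits are invented for illustration and do not describe the platform's actual layout.

```python
def project_quota_commands(mount_point, directory, project_id, size, inodes):
    """Return the xfs_quota commands that cap one directory tree.

    Hypothetical helper; run the commands as root on an XFS filesystem
    mounted with the prjquota option.
    """
    # Step 1: mark the directory tree as belonging to the project.
    setup = f"project -s -p {directory} {project_id}"
    # Step 2: set hard limits on blocks (space) and inodes for that project.
    limit = f"limit -p bhard={size} ihard={inodes} {project_id}"
    return [
        ["xfs_quota", "-x", "-c", setup, mount_point],
        ["xfs_quota", "-x", "-c", limit, mount_point],
    ]


if __name__ == "__main__":
    cmds = project_quota_commands(
        "/data", "/data/containers/c1/upper", 1001, "10g", 100000
    )
    for cmd in cmds:
        print(cmd)
```

Each command is returned as an argv list so it can be passed to `subprocess.run` without shell quoting issues.
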