Why Cloud Development

    • Developer onboarding acceleration (migrate developer tools to the cloud & store development assets in the cloud)
    • Standardized workspace: immutable dev / runtime environment
    • Collaborative dev solution & troubleshooting (share & attach remote workspaces)
    • Security
    • Scalable: the entire development workspace can be replicated and distributed on-premises
    • Integrated approach to DevOps
      • Create containerized Dev / Test / Staging environments (hosted on a shared-resource cloud)
      • Running the tests in an exact copy of production
      • Integrated CI / CD pipelines

    Our Solutions

    • Remote workspaces with native toolchains
      • Using a thin client to connect to cloud-based containers / VMs (X Windows)
      • A compromise between the “cloud-native” way and the traditional “VM” way
    • WebIDE + Cloud-based workspaces (multi-year effort)

    Infrastructure

    • k8s cluster:
      • xx+ machines (most of them retired physical machines with no SLAs)
      • providing xx RAM & xx CPU cores
      • every host node is running CentOS
    • Workspace:
      • Multiple containers run within a single workspace encapsulation
      • Workspace configuration:
        • adaptable templates (called stacks) for creating new workspaces
        • resource management (quotas / limits)
      • k8s-friendly application stack definition (Docker image, kubernetes.yaml, Helm Chart)
      • The workspace engine interprets an application stack definition and generates the workspace (see the sketch after this list)
    • CloudIDE container:
      • IDE Container (IDE services): a fat single-container app with an init system
      • Dev Container (apps): CentOS based container with tini as the top-level process
      • Containers talk to each other over the network and form a complete cloud-dev system
    • Overlay network, routing & services
    • Stacks: backed by an in-house Docker registry; Dockerfiles are kept in the VCS
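
    As a rough illustration, the workspace engine could interpret a stack definition along these lines. This is a minimal Go sketch; the Stack type, its fields, and the sample values are hypothetical, not the actual in-house schema:

        package main

        import (
            "encoding/json"
            "fmt"
        )

        // Stack is a hypothetical application stack definition: a base image
        // plus resource limits, which the engine turns into a workspace.
        type Stack struct {
            Name     string `json:"name"`
            Image    string `json:"image"`    // Docker image the workspace is based on
            CPULimit string `json:"cpuLimit"` // e.g. "2"
            MemLimit string `json:"memLimit"` // e.g. "4Gi"
        }

        func main() {
            raw := `{"name":"java-dev","image":"registry.local/java:11","cpuLimit":"2","memLimit":"4Gi"}`
            var s Stack
            if err := json.Unmarshal([]byte(raw), &s); err != nil {
                panic(err)
            }
            // The real engine would now render a Pod spec / Helm release from s.
            fmt.Printf("creating workspace %q from %s (cpu=%s, mem=%s)\n",
                s.Name, s.Image, s.CPULimit, s.MemLimit)
        }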

    Challenges:

    • Availability & Stability (SLA 99%, RTO < 30min)
    • System utilization is low (7% CPU overall)
    • Start-up speed is slow (~30s)
    • Provision / Scheduler
      • Resource allocation is handled by the in-house container platform, Sigma (k8s)
      • Orchestration system: an Eclipse Che-inspired scheduler
    • Distributed Storage
      • Local PV, backup / sync to block storage
      • GlusterFS for stateful services that need persistent RWX (ReadWriteMany) volumes
    • Developer Experience on IDE
      • Code / Debug / Language Service
      • Build log
      • Real-time collaboration: last-write-wins policy / multi-cursor editing (see the sketch after this list)
      • Desktop IDE sync: FUSE-based mount and sync (e.g., sshfs)
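
    Last-write-wins is the simplest way to converge concurrent edits: every update carries a timestamp, and whichever value has the newest timestamp survives. A minimal Go sketch of such a register (illustrative only, not the actual IDE implementation):

        package main

        import (
            "fmt"
            "time"
        )

        // lwwRegister keeps the value with the newest timestamp; writes that
        // arrive with an older timestamp are simply discarded.
        type lwwRegister struct {
            value string
            ts    time.Time
        }

        func (r *lwwRegister) apply(value string, ts time.Time) {
            if ts.After(r.ts) {
                r.value, r.ts = value, ts
            }
        }

        func main() {
            var line lwwRegister
            t0 := time.Now()
            line.apply("func main() {}", t0.Add(2*time.Second)) // later edit by user A
            line.apply("fun main() {}", t0.Add(time.Second))    // earlier edit by user B: lost
            fmt.Println(line.value)                             // prints the later edit
        }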

    GitPod Arch

    Challenges

    1. Allow arbitrary code execution on GitPod (workspaces run arbitrary, untrusted user code)
    2. The GitPod service needs to scale from 1 user to 5k concurrent users

    Arch: meta clusters + ws clusters

    “Meta” cluster

              [dashboard]
                /
    ---> proxy / -----> [server] (xN) ---> database
                         /
                    messagebus (rabbitMQ)
                     /
                [ws-manager-bridge]
    
    • Dashboard: the GitPod.io frontend, written in TypeScript (React + Tailwind)
      • Key components such as the “Start Workspace” page.
      • Determines the contextUrl (gitpod.io/#/<contextUrl>) and asks the server to assemble the configuration needed to start the WS.
    • Server: serves the dashboard; a web app written in TypeScript running on Node.
      • Server exposes a JSON-RPC-over-WebSocket API, which the dashboard uses to talk to it (a sketch of such a call follows this list).
      • Server talks to the database.
      • Server is stateless and can scale out horizontally across many instances.
      • Server also bundles concerns such as the login flow & auth, talking to code-hosting services (GitHub/GitLab), and preparing the configuration required to start workspaces.
    • ws-manager-bridge: on status updates (from the workspace clusters), notifies the server via the message bus (a publishing sketch also follows this list)
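
    As a sketch of what a JSON-RPC-over-WebSocket call can look like, here is a Go client built from the stdlib net/rpc/jsonrpc codec on top of golang.org/x/net/websocket. The endpoint URL, method name, and payload are placeholders; GitPod’s actual API (and its TypeScript dashboard client) differ:

        package main

        import (
            "fmt"
            "log"
            "net/rpc/jsonrpc"

            "golang.org/x/net/websocket"
        )

        func main() {
            // Open the WebSocket the JSON-RPC calls travel over. URL and
            // origin are placeholders, not GitPod's real endpoint.
            ws, err := websocket.Dial("ws://localhost:3000/api/v1", "", "http://localhost/")
            if err != nil {
                log.Fatal(err)
            }
            defer ws.Close()

            // A *websocket.Conn is an io.ReadWriteCloser, so the stdlib
            // JSON-RPC client can run directly on top of it.
            client := jsonrpc.NewClient(ws)

            // Hypothetical method name and argument.
            var reply map[string]any
            if err := client.Call("Server.GetWorkspace", "ws-1234", &reply); err != nil {
                log.Fatal(err)
            }
            fmt.Println(reply)
        }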
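
    And a hedged sketch of how a bridge component might publish a status update to RabbitMQ, using the github.com/rabbitmq/amqp091-go client. The exchange name, routing key, and payload shape are hypothetical:

        package main

        import (
            "context"
            "encoding/json"
            "log"

            amqp "github.com/rabbitmq/amqp091-go"
        )

        func main() {
            conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
            if err != nil {
                log.Fatal(err)
            }
            defer conn.Close()

            ch, err := conn.Channel()
            if err != nil {
                log.Fatal(err)
            }
            defer ch.Close()

            // Hypothetical status payload; the real bridge forwards the
            // ws-manager status translated into GitPod's own structures.
            body, _ := json.Marshal(map[string]string{
                "workspaceId": "ws-1234",
                "phase":       "running",
            })

            // Publish to a (hypothetical) "workspace-status" exchange so
            // every server instance sees the update.
            err = ch.PublishWithContext(context.Background(),
                "workspace-status", // exchange
                "ws.status.update", // routing key
                false, false,
                amqp.Publishing{ContentType: "application/json", Body: body})
            if err != nil {
                log.Fatal(err)
            }
        }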

    “Workspace” cluster

    The server selects a cluster based on such things as cluster availability, health, and region, then talks to that cluster’s ws-manager to start the WS.
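
    A minimal Go sketch of such a selection policy (the Cluster fields and the scoring rule are illustrative, not GitPod’s actual logic):

        package main

        import "fmt"

        // Cluster describes a candidate workspace cluster (hypothetical fields).
        type Cluster struct {
            Name      string
            Healthy   bool
            Available bool    // accepting new workspaces
            Region    string
            Load      float64 // 0..1, lower is better
        }

        // pickCluster filters out unhealthy/unavailable clusters, prefers the
        // user's region, and breaks ties within a region by load.
        func pickCluster(cands []Cluster, region string) *Cluster {
            var best *Cluster
            for i := range cands {
                c := &cands[i]
                if !c.Healthy || !c.Available {
                    continue
                }
                if best == nil ||
                    (c.Region == region && best.Region != region) ||
                    (c.Region == best.Region && c.Load < best.Load) {
                    best = c
                }
            }
            return best
        }

        func main() {
            clusters := []Cluster{
                {"eu-1", true, true, "eu", 0.7},
                {"eu-2", true, true, "eu", 0.3},
                {"us-1", true, false, "us", 0.1},
            }
            fmt.Println(pickCluster(clusters, "eu").Name) // eu-2
        }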

    • ws-manager: the core service.
      • Talks directly to k8s to create the workspace Pod (a client-go sketch follows this list).
      • Talks to ws-daemon on that node to initialize the WS content (git clone the source code, etc.)
    • ws-scheduler: a k8s scheduler that finds a node on which to place the Pod.
    • ws-proxy: a Go application that acts as a reverse proxy towards workspaces, and also serves static content.
      • During startup (after the response from ws-manager), you are redirected to your WS URL, so your requests hit ws-proxy directly.
      • Normally the proxy would simply route the request to the WS Pod, but there’s a very good chance that the WS Pod isn’t running yet, so some requests are served statically by blobserver (a fallback sketch also follows this list).
    • blobserver: serves static content (it can serve content directly out of OCI / Docker images)
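
    For the ws-manager step above, creating a workspace Pod through the k8s API looks roughly like this with client-go. The Pod spec is heavily simplified and the names are hypothetical; the real Pod carries many more labels, annotations, and security settings:

        package main

        import (
            "context"
            "log"

            corev1 "k8s.io/api/core/v1"
            metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
            "k8s.io/client-go/kubernetes"
            "k8s.io/client-go/rest"
        )

        func main() {
            // Runs in-cluster, as ws-manager itself would.
            cfg, err := rest.InClusterConfig()
            if err != nil {
                log.Fatal(err)
            }
            client, err := kubernetes.NewForConfig(cfg)
            if err != nil {
                log.Fatal(err)
            }

            // Hypothetical, heavily simplified workspace Pod.
            pod := &corev1.Pod{
                ObjectMeta: metav1.ObjectMeta{
                    Name:   "ws-1234",
                    Labels: map[string]string{"component": "workspace"},
                },
                Spec: corev1.PodSpec{
                    Containers: []corev1.Container{{
                        Name:  "workspace",
                        Image: "registry-facade:5000/ws-1234", // hypothetical image ref
                    }},
                },
            }
            _, err = client.CoreV1().Pods("workspaces").Create(
                context.Background(), pod, metav1.CreateOptions{})
            if err != nil {
                log.Fatal(err)
            }
        }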
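
    And for ws-proxy’s fallback behaviour, a sketch of “route to the WS Pod if it is reachable, otherwise serve from blobserver”, using the stdlib reverse proxy. The readiness check and the addresses are illustrative, not the actual ws-proxy logic:

        package main

        import (
            "log"
            "net"
            "net/http"
            "net/http/httputil"
            "net/url"
            "time"
        )

        // proxyOrBlobserve routes to the workspace Pod when it is reachable
        // and falls back to the blobserver for static content otherwise.
        func proxyOrBlobserve(wsURL, blobURL *url.URL) http.Handler {
            ws := httputil.NewSingleHostReverseProxy(wsURL)
            blob := httputil.NewSingleHostReverseProxy(blobURL)
            return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
                // Cheap readiness probe: can we open a TCP connection to the Pod?
                conn, err := net.DialTimeout("tcp", wsURL.Host, 100*time.Millisecond)
                if err != nil {
                    blob.ServeHTTP(w, r) // Pod not up yet: serve static content
                    return
                }
                conn.Close()
                ws.ServeHTTP(w, r)
            })
        }

        func main() {
            ws, _ := url.Parse("http://10.0.3.7:23000")    // hypothetical Pod address
            blob, _ := url.Parse("http://blobserver:8080") // hypothetical service
            log.Fatal(http.ListenAndServe(":80", proxyOrBlobserve(ws, blob)))
        }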

    Node

    • On each node we operate two of our own services as DaemonSets:
      • registry-facade: the registry containerd pulls workspace images from. The facade takes the configured image (from gitpod.yml) and dynamically adds a bunch of layers on top; it manipulates the OCI image configuration and manifest, for example to include the IDE (a manifest sketch follows this list)
      • ws-daemon: similar to kubelet; a process we use to initialize content within a WS, to back up content from a WS, and to assist in setting up the user namespace that each WS runs in
    • each WS itself is a k8s Pod:
      • workspace-kit: the entrypoint of the container. Sets up the user, PID, and mount namespaces to better isolate the workload running within the WS from the node and from other WSs; this also provides Docker (DinD) or root within a WS (a namespace sketch follows this list).
      • supervisor: supervises the processes that run within the WS (it is PID 1: workspace-kit creates a PID namespace in which supervisor is the root process). Supervisor also starts the IDE, provides integration services such as detecting whether a port is being served from within the WS, and watches the IDE to keep it running.
      • IDE
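
    To make the registry-facade idea concrete: appending a layer to an OCI image manifest is, at its core, JSON manipulation. A self-contained Go sketch with hypothetical digests (the real facade must also rewrite the image configuration, e.g. the rootfs diff IDs, to match):

        package main

        import (
            "encoding/json"
            "fmt"
        )

        // Minimal slice of an OCI image manifest: just enough to append layers.
        type descriptor struct {
            MediaType string `json:"mediaType"`
            Digest    string `json:"digest"`
            Size      int64  `json:"size"`
        }

        type manifest struct {
            SchemaVersion int          `json:"schemaVersion"`
            MediaType     string       `json:"mediaType"`
            Config        descriptor   `json:"config"`
            Layers        []descriptor `json:"layers"`
        }

        func main() {
            // Manifest of the user-configured base image (from gitpod.yml).
            base := manifest{
                SchemaVersion: 2,
                MediaType:     "application/vnd.oci.image.manifest.v1+json",
                Layers: []descriptor{
                    {"application/vnd.oci.image.layer.v1.tar+gzip", "sha256:aaaa...", 52_000_000},
                },
            }

            // Dynamically append an extra layer, e.g. one that ships the IDE.
            ideLayer := descriptor{
                MediaType: "application/vnd.oci.image.layer.v1.tar+gzip",
                Digest:    "sha256:bbbb...", // hypothetical digest
                Size:      180_000_000,
            }
            base.Layers = append(base.Layers, ideLayer)

            out, _ := json.MarshalIndent(base, "", "  ")
            fmt.Println(string(out))
        }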
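
    And for the workspace-kit namespace setup, a Linux-only Go sketch that starts a shell in fresh user, PID, and mount namespaces, mapping the current user to root inside the new user namespace. This is the general mechanism behind “root within a WS” without root on the node; the real workspace-kit does considerably more:

        package main

        import (
            "log"
            "os"
            "os/exec"
            "syscall"
        )

        func main() {
            // Run a shell inside new user, PID, and mount namespaces.
            cmd := exec.Command("/bin/sh")
            cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
            cmd.SysProcAttr = &syscall.SysProcAttr{
                Cloneflags: syscall.CLONE_NEWUSER | syscall.CLONE_NEWPID |
                    syscall.CLONE_NEWNS,
                // Map the current (unprivileged) user to root inside the
                // new user namespace.
                UidMappings: []syscall.SysProcIDMap{
                    {ContainerID: 0, HostID: os.Getuid(), Size: 1},
                },
                GidMappings: []syscall.SysProcIDMap{
                    {ContainerID: 0, HostID: os.Getgid(), Size: 1},
                },
            }
            if err := cmd.Run(); err != nil {
                log.Fatal(err)
            }
        }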

    Status update route:

    • All operations trigger status updates, predominantly through k8s. ws-manager listens for those status updates, translates them into GitPod’s structures, and talks to ws-manager-bridge (a translation sketch follows this list). ws-manager-bridge then persists the state into the database and also forwards it to the message bus (which forwards it to Server).
    • The dashboard in turn listens to those updates through the JSON-RPC WebSocket – this is how you see status updates on the dashboard and on the WS startup screen.
    • Only in the “meta” cluster do we keep any state that lives beyond a single instance of a workspace.
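
    The “translates them into GitPod’s structures” step can be pictured as a small mapping from k8s Pod phases to workspace phases. A Go sketch with hypothetical phase names, not GitPod’s actual state machine:

        package main

        import (
            "fmt"

            corev1 "k8s.io/api/core/v1"
        )

        // toWorkspacePhase maps a k8s Pod phase to a (hypothetical)
        // workspace phase, as ws-manager does before handing the update
        // to ws-manager-bridge.
        func toWorkspacePhase(p corev1.PodPhase) string {
            switch p {
            case corev1.PodPending:
                return "creating"
            case corev1.PodRunning:
                return "running"
            case corev1.PodSucceeded, corev1.PodFailed:
                return "stopped"
            default:
                return "unknown"
            }
        }

        func main() {
            fmt.Println(toWorkspacePhase(corev1.PodRunning)) // running
        }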