
    This post documents the process of self-hosting Gitpod (with a Community plan). There is also a Professional plan offering available at https://www.gitpod.io/self-hosted. Please check it out.

    One of the reasons I tried self-hosting Gitpod was to better understand its workspace orchestration and provisioning capabilities, which are hard to understand thoroughly just by observing user-space behavior. The widespread adoption of the cloud is changing the Engineering Productivity landscape (which I’ve been working on), and Gitpod demonstrates a path to a cloud-based development experience, which fascinates me. I also needed to set up a testbed to finish some PRs for Gitpod’s OSS repository. So that’s how it began.

    The underlying host machine I used was a Tencent Cloud Virtual Machine (CVM) with the SA2.2XLARGE16 specification, which is a standard model with balanced performance. The VM is an 8-core 16GB instance, and it supports a pay-as-you-go billing model. Note that Gitpod has a requirement on the node’s kernel version (≥ 5.4.0), so it is better to choose Ubuntu Server 20.04 LTS as the OS image.
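    The kernel requirement can be checked before installing anything. Below is a minimal sketch, assuming GNU `sort -V` is available (it is on Ubuntu); `kernel_at_least` is a hypothetical helper name, not something Gitpod ships.

```shell
# kernel_at_least CURRENT MINIMUM: succeeds when CURRENT >= MINIMUM.
# Hypothetical helper; relies on GNU `sort -V` for version ordering.
kernel_at_least() {
  local current="${1%%-*}"   # drop the distro suffix, e.g. "-109-generic"
  local minimum="$2"
  # If the minimum sorts first (or ties), the current kernel is new enough.
  [ "$(printf '%s\n' "$minimum" "$current" | sort -V | head -n 1)" = "$minimum" ]
}

if kernel_at_least "$(uname -r)" 5.4.0; then
  echo "kernel $(uname -r) is new enough"
else
  echo "kernel $(uname -r) is older than 5.4.0"
fi
```

    Only the part of `uname -r` before the first dash is compared, so distro-specific suffixes do not affect the result.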

    Installation process

    1. Cluster setup

    To set up a Kubernetes cluster:

    • Install a single node k3s cluster (with custom node labels) following the k3s cluster setup guide.

        $ export INSTALL_K3S_EXEC="server --disable traefik --flannel-backend=none \
            --node-label gitpod.io/workload_meta=true \
            --node-label gitpod.io/workload_ide=true \
            --node-label gitpod.io/workload_workspace_regular=true \
            --node-label gitpod.io/workload_workspace_headless=true \
            --node-label gitpod.io/workload_workspace_services=true"
        $ curl -sfL https://get.k3s.io | sh -
        $ k3s -version
        k3s version v1.23.6+k3s1 (418c3fa8)
        go version go1.17.5
      
    • The k3s installer generates its own kubeconfig file, so export the KUBECONFIG environment variable to let other tools such as helm access the k3s cluster. k3s also comes with kubectl baked in, so an alias for k3s kubectl is handy.

        $ export KUBECONFIG=/etc/rancher/k3s/k3s.yaml # export KUBECONFIG=... && helm install
        $ alias k="k3s kubectl"
      
    • Install the Calico Operator & CRD, and modify the calico-config based on the Gitpod installation guide.

        $ k create -f https://projectcalico.docs.tigera.io/manifests/tigera-operator.yaml
        $ k create -f https://projectcalico.docs.tigera.io/manifests/custom-resources.yaml
      
        # download the Calico manifest <https://docs.projectcalico.org/manifests/calico-vxlan.yaml>
        # add 1 line `"container_settings": { "allow_ip_forwarding": true }` to the `plugin` section
        # copy the edited file to `/var/lib/rancher/k3s/server/manifests/`
      
    • Check the Node status again.

        $ k get node -owide
        NAME            STATUS   ROLES                  AGE   VERSION        INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
        vm-0-9-ubuntu   Ready    control-plane,master   1h    v1.23.6+k3s1   172.22.0.9    <none>        Ubuntu 20.04.4 LTS   5.4.0-109-generic   containerd://1.5.11-k3s2
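
    Since the node labels are passed in through one long INSTALL_K3S_EXEC string, a typo is easy to miss. The sketch below lists any of the five labels from the install command that are absent; `missing_gitpod_labels` is a hypothetical helper, and it assumes the comma-separated `key=value` format printed by `kubectl get node --show-labels`.

```shell
# missing_gitpod_labels LABELS: prints each expected label that is absent from
# LABELS, a comma-separated key=value list (the --show-labels column format).
missing_gitpod_labels() {
  local labels=",$1," suffix
  for suffix in meta ide workspace_regular workspace_headless workspace_services; do
    case "$labels" in
      *",gitpod.io/workload_${suffix}=true,"*) ;;      # label present
      *) echo "gitpod.io/workload_${suffix}" ;;        # label missing
    esac
  done
}

# Usage on the node (not run here):
#   missing_gitpod_labels "$(k3s kubectl get node --no-headers --show-labels | awk '{print $NF}')"
```

    An empty output means all five labels are set on the node.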
      

    2. Install Cert-Manager & Configure DNS

    Certificate and network management are well-known hard problems for software developers, even without the complexity of Kubernetes. I tried several approaches to avoid this particular rabbit hole, and later found that it’s better to just learn the necessary background and follow the installation guide.

    Some basic understanding I got:

    • cert-manager is the “Certificates as a Service” in the Kubernetes ecosystem. It introduces CRDs like Certificate / Issuer into the Kubernetes API.

    • ACME (Automated Certificate Management Environment) is a protocol proposed by Let’s Encrypt. It allows an ACME client (e.g. cert-manager) to automatically request (and renew or revoke) a certificate from a CA (e.g. Let’s Encrypt).
    • The DNS-01 challenge is a domain validation procedure in which you prove to the CA that you control the domain name by putting a specific TXT record under it (by making API calls to the DNS provider). Unlike other challenge types (such as HTTP-01), DNS-01 allows issuing wildcard certificates (which Gitpod requires).
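
    To make the DNS-01 part concrete: the TXT record is placed at `_acme-challenge.<name>`, and a wildcard name is validated at its parent. The sketch below derives that record name; `acme_challenge_name` is a hypothetical helper for illustration only, since cert-manager and the webhook do this on their own.

```shell
# acme_challenge_name DNS_NAME: prints the TXT record name used by DNS-01.
acme_challenge_name() {
  local name="${1#\*.}"   # "*.example.com" is validated at "example.com"
  name="${name%.}"        # tolerate a trailing dot
  printf '_acme-challenge.%s\n' "$name"
}

acme_challenge_name 'mygitpod.domain'
acme_challenge_name '*.ws.mygitpod.domain'
```

    While a challenge is pending, something like `dig +short TXT "$(acme_challenge_name '*.mygitpod.domain')"` should show the token the webhook wrote, which helps when a certificate appears stuck.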

    Move on to the installation:

    • Configure the root domain (@) and two wildcard subdomains (* and *.ws) at the DNS provider, pointing them to the public IP of the VM. Run a few dig queries to check that the settings work.

        $ dig <mygitpod.domain>
        A <mygitpod.domain>. 9m11s   43.156.xx.xx
        $ dig test.<mygitpod.domain>
        A test.<mygitpod.domain>. 9m41s   43.156.xx.xx
      
        $ which dig
        dig: aliased to dog
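
    The same check can be scripted so that all three records are covered in one go. This is a rough sketch that assumes the real BIND `dig` (with `+short`) rather than the `dog` alias above; `probe_names`, `check_records`, and the `probe` label are hypothetical names picked for illustration.

```shell
# probe_names DOMAIN: one test name per configured record (@, *, *.ws).
probe_names() {
  local domain="$1"
  printf '%s\n' "$domain" "probe.$domain" "probe.ws.$domain"
}

# check_records DOMAIN EXPECTED_IP: reports names that do not resolve to EXPECTED_IP.
check_records() {
  local name ips status=0
  for name in $(probe_names "$1"); do
    ips="$(dig +short A "$name")"
    if ! printf '%s\n' "$ips" | grep -qxF "$2"; then
      echo "MISMATCH $name -> ${ips:-<no answer>}"
      status=1
    fi
  done
  return "$status"
}

# Usage (not run here): check_records <mygitpod.domain> <public IP of the VM>
```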
      
    • Install cert-manager.

        $ k apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.yaml
        $ k get pod -A
        NAMESPACE         NAME                                       READY   STATUS    RESTARTS   AGE
        kube-system       calico-node-mmff6                          1/1     Running   0          18m
        kube-system       coredns-d76bd69b-9glsl                     1/1     Running   0          21m
        kube-system       calico-kube-controllers-6b77fff45-dt8lg    1/1     Running   0          18m
        kube-system       metrics-server-7cd5fcb6b7-r8wsj            1/1     Running   0          21m
        kube-system       local-path-provisioner-6c79684f77-vw9js    1/1     Running   0          21m
        tigera-operator   tigera-operator-7d8c9d4f67-9n5c8           1/1     Running   0          17m
        cert-manager      cert-manager-64d9bc8b74-rgng9              1/1     Running   0          20s
        cert-manager      cert-manager-cainjector-6db6b64d5f-gvgwx   1/1     Running   0          20s
        cert-manager      cert-manager-webhook-6c9dd55dc8-7gzbt      1/1     Running   0          20s
      
    • Create API Token from the DNS provider console (I was using DNSPod).

    cert-manager does not have built-in support for DNSPod, but it supports webhook solvers for such out-of-tree DNS providers. I tried cert-manager-webhook-dnspod and it worked out great. It also has well-documented instructions on the Tencent Cloud doc site.

    • Install cert-manager-webhook-dnspod using Helm with custom options.

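        $ git clone --depth 1 https://github.com/qqshfox/cert-manager-webhook-dnspod.git
        $ cd cert-manager-webhook-dnspod
        # Helm 3 syntax: the release name is positional (the --name flag only exists in Helm 2)
        $ helm install cert-manager-webhook-dnspod ./deploy/cert-manager-webhook-dnspod \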
            --namespace cert-manager \
            --set groupName=<GROUP_NAME> \
            --set secrets.apiID=<DNSPOD_API_ID>,secrets.apiToken=<DNSPOD_API_TOKEN> \
            --set clusterIssuer.enabled=true,clusterIssuer.email=<EMAIL_ADDRESS>
      
    • Check the generated ClusterIssuer. If the “READY” state stays False, describe the ClusterIssuer to see whether an error is reported.

        $ k get ClusterIssuer -A
        NAME                                         READY   AGE
        cert-manager-webhook-dnspod-cluster-issuer   True    1h
      
        $ k describe ClusterIssuer cert-manager-webhook-dnspod-cluster-issuer -n cert-manager
        ...
        Spec:
          Ca:
            Secret Name:  cert-manager-webhook-dnspod-ca
        Status:
          Conditions:
            Last Transition Time:  2022-06-07T11:19:16Z
            Message:               Signing CA verified
            Observed Generation:   1
            Reason:                KeyPairVerified
            Status:                True
            Type:                  Ready
        Events:
          Type    Reason           Age                From                  Message
          ----    ------           ----               ----                  -------
          Normal  KeyPairVerified  16m (x2 over 16m)  cert-manager-issuers  Signing CA verified
      
    • Create a Certificate.

        # cert.yaml
        apiVersion: cert-manager.io/v1
        kind: Certificate
        metadata:
          name: my-crt
          namespace: cert-manager # choose your namespace (the commands below use cert-manager)
        spec:
          secretName: my-crt-secret
          issuerRef:
            name: cert-manager-webhook-dnspod-cluster-issuer # refs to the ClusterIssuer
            kind: ClusterIssuer
            group: cert-manager.io
          dnsNames: # your dnsNames
          - "mygitpod.domain"
          - "*.mygitpod.domain"
          - "*.ws.mygitpod.domain"
      
    • Wait for the certificate’s “READY” state to become True (could be minutes).

        $ k apply -f cert.yaml
        $ k get cert -A
        NAMESPACE      NAME                                      READY   SECRET                                    AGE
        cert-manager   cert-manager-webhook-dnspod-ca            True    cert-manager-webhook-dnspod-ca            1h
        cert-manager   cert-manager-webhook-dnspod-webhook-tls   True    cert-manager-webhook-dnspod-webhook-tls   1h
        cert-manager   my-crt                                    True    my-crt-secret                             1h
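
    When waiting on several certificates, it can help to filter the table down to the ones that are not ready yet. A minimal sketch, assuming the column layout shown above, where NAME sits immediately left of READY; `not_ready` is a hypothetical helper.

```shell
# not_ready: reads `kubectl get cert` (or clusterissuer) table output on stdin
# and prints the NAME of every row whose READY column is not "True".
not_ready() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "READY") col = i; next }
    col && $col != "True" { print $(col - 1) }
  '
}

# Usage on the node (not run here):
#   k3s kubectl get cert -A | not_ready
```

    For a single certificate, `kubectl wait --for=condition=Ready certificate/my-crt -n cert-manager --timeout=10m` is the simpler built-in alternative.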
      
    • Check and validate the certificate (optional). The tls.key / tls.crt pair can be found in the secret my-crt-secret.

        $ k describe cert my-crt -n cert-manager
        ...
        Spec:
          Dns Names:
            mygitpod.domain
            *.mygitpod.domain
            *.ws.mygitpod.domain
          Issuer Ref:
            Group:      cert-manager.io
            Kind:       ClusterIssuer
            Name:       cert-manager-webhook-dnspod-cluster-issuer
          Secret Name:  my-crt-secret
        Status:
          Conditions:
            Last Transition Time:  2022-06-07T11:28:39Z
            Message:               Certificate is up to date and has not expired
            Observed Generation:   1
            Reason:                Ready
            Status:                True
            Type:                  Ready
          Not After:               2022-09-05T10:28:38Z
          Not Before:              2022-06-07T10:28:39Z
          Renewal Time:            2022-08-06T10:28:38Z
          Revision:                1
        Events:                    <none>
      
        # dump certificates if you need
        $ k get secret my-crt-secret -n cert-manager -o jsonpath='{.data.tls\.crt}' | base64 -d
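
    The Renewal Time in the describe output is consistent with cert-manager's default of renewing once two thirds of the certificate lifetime has elapsed (that is, one third before expiry). The sketch below redoes that arithmetic under that assumption; `renewal_time` is a hypothetical helper, it needs GNU `date`, and rounding the one-third window up to a whole second is my own choice to mimic sub-second truncation.

```shell
# renewal_time NOT_BEFORE NOT_AFTER: prints NOT_AFTER minus one third of the lifetime.
renewal_time() {
  local not_before not_after lifetime renew_before
  not_before="$(date -u -d "$1" +%s)"
  not_after="$(date -u -d "$2" +%s)"
  lifetime=$(( not_after - not_before ))
  renew_before=$(( (lifetime + 2) / 3 ))   # one third of the lifetime, rounded up
  date -u -d "@$(( not_after - renew_before ))" +%Y-%m-%dT%H:%M:%SZ
}

# Not Before / Not After values taken from the describe output above.
renewal_time 2022-06-07T10:28:39Z 2022-09-05T10:28:38Z
```

    For the 90-day Let’s Encrypt certificate above, this lands 30 days before expiry, matching the reported 2022-08-06T10:28:38Z.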
      

    3. Install Gitpod

    Gitpod previously used Helm for self-hosted installation and recently switched to a custom installer. A blog post explains some of the technical reasons: https://www.gitpod.io/blog/gitpod-installer. Gitpod now suggests using kots, which might be based on the installer as well, but comes with a pretty web UI console.

    • Install kots plugin & install Gitpod using kots.

        $ curl https://kots.io/install | bash
        $ k kots install gitpod
        Enter the namespace to deploy to: gitpod
          • Deploying Admin Console
            • Creating namespace ✓
            • Waiting for datastore to be ready ✓
        Enter a new password to be used for the Admin Console: •••••••••
          • Waiting for Admin Console to be ready ✓
      
          • Press Ctrl+C to exit
          • Go to http://localhost:8800 to access the Admin Console
      
    • The admin console listens on localhost:8800 only, so set up an Nginx server to proxy the requests.

        $ apt install nginx-full
        $ vi /etc/nginx/nginx.conf
      
        # some piece of nginx.conf
        ...
        http {
            server {
                listen 172.22.0.9:8800; # the VM's private IP; the kots port-forward already holds 127.0.0.1:8800
                location / {
                    proxy_pass  http://127.0.0.1:8800/;
                }
            }
            # include /etc/nginx/conf.d/*.conf;
            # include /etc/nginx/sites-enabled/*;
            ...
        }
        ...
      
        $ nginx
      

    Now visit the admin console from your browser. The only notable part is the “Issuer name” in the TLS certificate settings (cert-manager-webhook-dnspod-cluster-issuer). Hit “Continue” to run the preflight checks and hope they all come back green.

    • Once the installation is finished, check if all the workloads are ready.

        $ k get pod -n gitpod
        NAME                                  READY   STATUS
        kotsadm-minio-0                       1/1     Running
        kotsadm-postgres-0                    1/1     Running
        kotsadm-bb4b8b869-d879q               1/1     Running
        gitpod-telemetry-27576960-n95md       0/1     Completed
        installation-status-67d64d7cd-88wk7   1/1     Running
        svclb-proxy-9r8ws                     3/3     Running
        registry-776df46bd6-rw4jk             1/1     Running
        dashboard-7bdd74f889-wflkr            1/1     Running
        agent-smith-h44fq                     2/2     Running
        openvsx-proxy-0                       2/2     Running
        image-builder-mk3-7c7fb9fddb-x9rrs    2/2     Running
        ws-manager-76cb58c9fd-49zfv           2/2     Running
        content-service-db499c5c4-6zlwc       1/1     Running
        ide-proxy-56dc5f7d44-xqldw            1/1     Running
        minio-5c684d5449-lsk28                1/1     Running
        ws-daemon-sbb5f                       2/2     Running
        blobserve-586448694c-jwnwq            2/2     Running
        ws-proxy-64f67cd7b5-tn6nn             2/2     Running
        mysql-0                               1/1     Running
        messagebus-0                          1/1     Running
        proxy-6fd7c89748-bl8ts                2/2     Running
        registry-facade-dlzrq                 2/2     Running
        server-845864bb59-vfzwv               2/2     Running
        ws-manager-bridge-86c78c4487-ssbqv    2/2     Running
      

    4. Start a Workspace

    Now visit the domain name and the Gitpod dashboard should be alive. Configure code host integration (such as GitHub) the same way you do with Gitpod SaaS.

    Some notable points when creating a workspace:

    • When bootstrapping the first workspace, Gitpod pulls a 3GiB+ workspace-full image. If you have limited bandwidth or an unstable network connection, this process could be painful.
    • Gitpod tries to communicate with the in-cluster registry https://registry.mygitpod.domain/v2/workspace-image to commit additional layers. However, the registry host resolves to the public IP, which causes a “hairpin”: traffic travels out of the cluster and then back in via the external IP, and I got lots of timeout errors as a result.

    @iQQBot advised adding a rewrite rule to the cluster DNS (kubedns, which is CoreDNS in k3s) that maps registry.mygitpod.domain to proxy.gitpod.svc.cluster.local, so that the registry host resolves to the internal ClusterIP. proxy is the component that handles TLS termination and traffic proxying. Note that just pointing the registry host URL in the Gitpod ConfigMap at the registry service FQDN wouldn’t work (certificate validation would fail).
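
    For reference, such a rule can be written with the CoreDNS `rewrite` plugin. The sketch below only prints the line; `coredns_rewrite_rule` is a hypothetical helper, and placing the line inside the `.:53 { ... }` block of the coredns ConfigMap in kube-system is an assumption about a stock k3s setup rather than the exact change I made.

```shell
# coredns_rewrite_rule DOMAIN [NAMESPACE]: prints a CoreDNS rewrite rule that
# maps the registry host to the in-cluster proxy service.
coredns_rewrite_rule() {
  local domain="$1" namespace="${2:-gitpod}"
  printf 'rewrite name registry.%s proxy.%s.svc.cluster.local\n' "$domain" "$namespace"
}

coredns_rewrite_rule mygitpod.domain gitpod

# To apply it (not run here), add the printed line to the Corefile:
#   k3s kubectl -n kube-system edit configmap coredns
```

    Because the client still asks for registry.mygitpod.domain, the TLS certificate presented by proxy keeps matching the host name, which is why this works where changing the registry URL does not.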

    After the modification, the workspace pod quickly reached the Running state, and the web browser finally proceeded to the workspace view.

    Conclusions

    The official documentation is of pretty high quality. I had several failed attempts when I tried to bypass or violate some requirements, and I got all kinds of errors: containerd failures (raised by ws-daemon when using a CentOS-based host), certificate failures (crictl pull and other system tools failed when I first used a self-signed cert, and later I got stuck issuing ACME certs), network connectivity issues, and so on. I spent the first several nights fighting these weird problems and made no progress at all. When I started over and simply followed all the requirements in Gitpod’s guide, it worked out quickly.

    A basic familiarity with Kubernetes and the Gitpod architecture is necessary. Without the help of Gitpod member @iQQBot (thanks again, @iQQBot!), I would probably have gotten blocked somewhere and given up along the way.

    Though this single-node cluster is far from a production-ready development infrastructure, and there are still lots of black boxes across the whole system, it did give me a chance to see from the inside how Gitpod builds its magic product.

    Some suggestions on Cloud-hosting service selection:

    • Put more resources onto the disk & network part (e.g. choose a High I/O CVM instance model), especially if you’re using the in-cluster registry & object storage.