This blog post is the first in a series of articles describing our 6+ month journey towards running the dexi.io platform on a microservice architecture deployed on the container orchestration software Kubernetes.
The blog post is aimed at a technical audience, e.g. developers, devops, data scientists/analysts or others interested in such technical topics.
Later articles will likely include some more nitty-gritty details like:
- Running a MongoDB cluster in Kubernetes
- Running two Kubernetes clusters: on premise and in Google Kubernetes Engine (GKE)
- Our continuous integration/deployment (CI/CD) flow, including deployment to Kubernetes
- Reporting third-party metrics to Datadog from containers running in Kubernetes
These articles cover both the success stories and the hardships we have been through. We hope that anyone who works with microservices or containerisation, and who is considering Kubernetes or perhaps already using it, can benefit from our learnings.
Microservice Architecture or Chaos
While we were already running on a kind of microservice architecture, adding new features to the platform was not as easy as we would have liked. As an example, it required a lot of manual configuration on the production systems. This meant that instead of adding new microservices we almost always opted for adding features to the existing microservices, making them not-so-micro over time. To be able to easily add more microservices to support the requirements of both the current and future platform, we knew we needed the platform to run on a true microservice architecture, probably running in containers (like Docker containers).
There are many good and well-documented reasons for using microservices and running software in containers. Our most important reasons and requirements were:
- Scope: each microservice only covers one domain/part of the platform. This makes it easier to understand, maintain, test, operate (e.g. deploy) and document that service.
- Platform isolation:
  - Any operation of one of our enterprise customers, e.g. executing a robot, must not affect another enterprise customer or the shared system.
  - Deployment of one service must not affect another part of the system.
- Extensibility: the system must be fully extensible and pluggable, e.g. to be able to add hundreds of integrations to new and exciting services to an already long list of integrations available.
- Scalability of developer team:
  - We require a high degree of automation to support the advanced features our platform offers.
  - It must be possible for new developers to quickly start adding value.
The de facto standard for containers is Docker, which is the container technology we chose.
Container Orchestration aka Controlling the Chaos
To be able to work efficiently with a large number of services - potentially hundreds - we knew we needed some kind of container orchestration framework that handles the life cycle and the operation of these.
Our requirements for such a framework were:
- Allow for zero downtime / always up by supporting replication, rolling updates and automatic restart of services. Our customers come from all over the world and are constantly executing robots.
  - Going forward, taking the system offline for even a couple of hours to do a regular deployment is not viable. The downtime we just had for our new architecture was our first planned downtime lasting more than a few minutes. And we expect it to be our last.
  - We take our uptime very seriously. Balancing fast iteration of high quality software with high availability will continue to be one of our top priorities and the new platform must support that.
- Support compartmentalisation of services for large customers into logically separate “namespaces” or even physically separate clusters, perhaps on-premise/private cloud. Customers can have a number of different reasons for requiring their data never to leave their own physical infrastructures but would still like to be able to use the features of services like ours. Such separate clusters should be easy to connect to our own cluster.
- Make it possible to “spin up” and connect new clusters of pretty much any type of software, such as databases (e.g. MongoDB, Elasticsearch), message queues (e.g. RabbitMQ, Kafka) and computational clusters (e.g. Hadoop, Apache Spark).
- Make administration of our server infrastructure as easy as possible. The framework should function as a “distributed operating system”, abstracting away physical server resources (CPU, memory, disk) into pools of resources and handling execution and scheduling of abstract tasks (services).
- Make it easy to handle peak loads, e.g. ensure high throughput of executions or many concurrent users on the platform by making it easy to scale services.
- Allow a high performing UI (low latency) by making it possible to deploy services in different geographical regions as close to customers as possible.
- Allow easy health monitoring of services.
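Several of these requirements (replication, rolling updates, automatic restarts and health monitoring) correspond to fields of a Kubernetes Deployment. The Python sketch below is an illustration only and not the configuration described in this post: it builds such a manifest as a plain dictionary and prints it as JSON, which kubectl accepts alongside YAML. The function name, service name, image, port and the /healthz path are all hypothetical placeholders.

```python
import json


def deployment_manifest(name, image, replicas=3, port=8080):
    """Build a Deployment manifest as a plain dict (all values are placeholders)."""
    labels = {"app": name}
    # Hypothetical HTTP health endpoint; a real service may expose something else.
    probe = {"httpGet": {"path": "/healthz", "port": port}, "periodSeconds": 10}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Replication: keep this many identical pods running.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Rolling update: add at most one extra pod at a time and
            # never drop below the desired number of available pods.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            # Readiness gates traffic; liveness triggers restarts.
                            "readinessProbe": probe,
                            "livenessProbe": probe,
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = deployment_manifest("example-service", "example/service:1.0.0")
    print(json.dumps(manifest, indent=2))
```

Generating the manifest from code rather than writing it by hand is just one option; the same fields can be written directly in a YAML file.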
There are many techniques for working efficiently with microservices and many container orchestration frameworks available. Two of the major players we looked into were Apache Mesos (with DC/OS) and Kubernetes (also known as k8s).
Choosing a Framework
Spoiler alert! We chose Kubernetes.
One of the most important factors was the fact that every major cloud provider at the time, in addition to Google themselves, was jumping on the Kubernetes bandwagon (and still is), ensuring its continued rapid development and longevity.
Mesos and DC/OS have a number of appealing traits. For instance, Mesos operates at a lower level in an operating system metaphor and could thus probably be a more natural choice for a general-purpose compute cluster. Kubernetes, on the other hand, is built for persistent services and well-defined “jobs”. Ultimately the choice was based on the wide adoption of Kubernetes and the fact that Kubernetes solves infrastructure really well, whereas Mesos seems more general purpose, with DC/OS and Marathon not being the primary focus of the Mesos project.
Getting up and running with Kubernetes (Rancher)
At Dexi we’ve been running on bare metal for years, and as we started experimenting with Kubernetes we were doing so “by hand” on a small cluster of Ubuntu servers. It quickly became apparent, however, that the documentation for installing and setting up Kubernetes from scratch is lacking at best.
After a lot of hair pulling and screaming at innocent computers we came across Rancher (and Cattle). At this point we were introducing new technologies into the experiments every day so adding Rancher to the list didn’t scare us much - and Rancher does one thing superbly well: setting up and configuring a Kubernetes cluster. Literally at the click of a button the room went from frustration and madness to uncontrolled excitement: we finally had a running Kubernetes cluster!
When we set out to explore Kubernetes we wanted to run everything, from databases to our own internal services, within Kubernetes. We had researched a concept known as stateful sets as a way of running stateful applications, like a database, and assumed that this feature would provide a simple solution to that problem.
Once again we were hitting the edge cases of Kubernetes: while stateful sets are useful, they require a distributed file system (like GlusterFS) to be able to tie themselves in a predictable manner to the same physical disk upon restarts or reschedulings of the service. Introducing a network file system to run a high performance database was a high risk move that we decided against, not knowing the “dos” and “don’ts” of GlusterFS (which appeared to be the most widely adopted file system for on-premise deployments of Kubernetes). Stateful sets will, in future versions of Kubernetes, be getting support for “stickiness” to local(host) disks but do not have it yet.
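For readers unfamiliar with the mechanism: a stateful set gives each pod a stable, ordinal-based name and a persistent volume claim derived from a template, and that claim then has to be satisfied by some storage backend. The Python sketch below illustrates this under assumed names. It is not taken from our cluster, and the set name, image, mount path and storage class are hypothetical.

```python
def statefulset_manifest(name, image, replicas=3, storage="10Gi",
                         storage_class="example-storage-class"):
    """Build a StatefulSet manifest as a plain dict (all values are placeholders)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            # Name of the (headless) service that gives pods stable DNS names.
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                        }
                    ]
                },
            },
            # One persistent volume claim is created per pod from this template.
            "volumeClaimTemplates": [
                {
                    "metadata": {"name": "data"},
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        "storageClassName": storage_class,
                        "resources": {"requests": {"storage": storage}},
                    },
                }
            ],
        },
    }


def stable_identities(manifest):
    """Return the (pod name, claim name) pairs Kubernetes derives for the set."""
    name = manifest["metadata"]["name"]
    claim = manifest["spec"]["volumeClaimTemplates"][0]["metadata"]["name"]
    return [
        (f"{name}-{i}", f"{claim}-{name}-{i}")
        for i in range(manifest["spec"]["replicas"])
    ]


if __name__ == "__main__":
    example = statefulset_manifest("example-db", "example/db:1.0.0", replicas=2)
    for pod, pvc in stable_identities(example):
        print(pod, "->", pvc)
```

With two replicas the pods are named example-db-0 and example-db-1, and each keeps its own claim (data-example-db-0, data-example-db-1) across restarts. Which disk actually backs a claim depends on the storage class, which is where the file system question above comes in.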
Doing Things the k8s Way
A lot of things we expected to be simple turned out to be complex problems. If there is one thing you absolutely must do when shifting to Kubernetes, it is to change your mindset to work with the capabilities of Kubernetes.
First, you must shift your thinking from processes that run on machines to containers that run on a pool of resources. Second, understanding taints, labels, deployments, services, pods, daemon sets, stateful sets, headless services, service accounts, secrets, persistent volume claims, sidecars and more is probably essential to the success of your Kubernetes journey. So far we have been able to solve every challenge with a good understanding of these concepts and from time to time a good deal of creativity.
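As a small example of this way of thinking, consider labels. A service does not point at machines or processes; it selects whichever pods currently carry the labels named in its selector. The Python sketch below mimics that equality-based matching on plain dictionaries. It is a simplified model rather than Kubernetes code, and all names and labels in it are made up.

```python
def selects(selector, pod_labels):
    """True if every key/value pair in the selector is present on the pod."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


def pods_for_service(service, pods):
    """Names of the pods whose labels satisfy the service's selector."""
    selector = service["spec"]["selector"]
    return [
        pod["metadata"]["name"]
        for pod in pods
        if selects(selector, pod["metadata"]["labels"])
    ]


# Hypothetical objects, reduced to the fields that matter for selection.
service = {"spec": {"selector": {"app": "api", "tier": "backend"}}}
pods = [
    {"metadata": {"name": "api-1", "labels": {"app": "api", "tier": "backend"}}},
    {"metadata": {"name": "api-2", "labels": {"app": "api", "tier": "backend", "canary": "true"}}},
    {"metadata": {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}}},
]

if __name__ == "__main__":
    print(pods_for_service(service, pods))  # prints ['api-1', 'api-2']
```

Extra labels on a pod (like the canary label on api-2) do not prevent a match; only the keys named in the selector are compared.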
Kubernetes is a brilliant piece of software and we’re very happy with our new setup. A few examples:
- We have a fully automated integration and deployment pipeline (CI/CD flow).
- We can introduce new services very easily.
- Service isolation, including logging, monitoring and other things needed when running a service in production, provides a great balance between general availability (daemon sets) and service specifics (sidecars).
- Scaling a service is effortless (one command/click).
- Scaling our cluster is done easily by using daemon sets to automatically configure firewall rules (iptables), logging and monitoring on the new hosts.
It has taken us a long time to get here, though! At times it has felt like two steps forward and one step back. It’s easy to get a service running on Kubernetes. Running services at production quality takes effort and has a steep learning curve. We’ve written and rewritten a lot of YAML!
In the coming weeks and months we’ll dive deeper into some of the challenges we have had to overcome - and we will let you in on the solutions for them. We hope you’ll stay tuned and leave some comments for us: maybe you have had similar (or wildly different) experiences. Maybe you have come up with a better setup or maybe you just want to tell us that we’re awesome :)
Giving back to the Community
While you (and we) anxiously await the next blog post we would like to introduce a log program that sends logs to Google Cloud Platform’s log service, Stackdriver:
Completely open source and ready for you to use (or contribute to)!
In our own Kubernetes cluster we run this service as a daemon set. On each host the program collects the logs and sends them to Stackdriver in the same format that Google itself does for its own Kubernetes setup (Google Kubernetes Engine, GKE).
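To give an idea of what running a log collector as a daemon set can look like, here is a minimal Python sketch that builds such a manifest as a plain dictionary. It is not the manifest shipped with the program: the name and image are hypothetical, and mounting the host's /var/log directory read-only is an assumption about where the logs live.

```python
import json


def log_daemonset_manifest(name, image, host_log_dir="/var/log"):
    """Build a DaemonSet manifest as a plain dict (all values are placeholders)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # No replica count: a daemon set runs one pod per eligible node.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Assumed log location, mounted read-only from the host.
                            "volumeMounts": [
                                {
                                    "name": "host-logs",
                                    "mountPath": host_log_dir,
                                    "readOnly": True,
                                }
                            ],
                        }
                    ],
                    "volumes": [
                        {"name": "host-logs", "hostPath": {"path": host_log_dir}}
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = log_daemonset_manifest("example-log-forwarder", "example/log-forwarder:1.0.0")
    print(json.dumps(manifest, indent=2))
```

Because the daemon set schedules a pod on every node, a newly added host starts shipping logs without any manual setup.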
We have no doubt that the future of cloud software means sharing and deploying YAML files - and we’re excited to make our first contribution to that as well.
Thank you for reading,