Intro
On the 5th of November 2018, the IT team at Kogan.com started another hackday. This time, we set our sights on learning Kubernetes. We wanted to answer the question: how exactly can we leverage container orchestration to make our deployment process faster and more efficient?
In order to achieve our goal, we set out to deploy one of our major apps, which controls customer subscription preferences, with Kubernetes on two different cloud providers: Google Cloud Platform (GCP GKE) and Amazon Web Services (AWS EKS). By doing this, we hoped we could understand the pros and cons of each platform, while learning the intricacies of Kubernetes deployment at the same time.
This will be a two-part blog series. The first part is a short overview of our motivation and goals, as well as how the day actually unfolded. The second part will be more technical: it will focus on the approaches of the two teams and discuss the pros and cons of each platform.
Motivation
In order to understand our motivation, it is useful to have a general idea of our deployment process here at Kogan. Every day, we have a set time at which we deploy our major apps. The deployment pipeline consists of a fairly expensive build step, storing the artefact, and then pushing the artefact to provisioned servers. Those servers are mostly provisioned with a combination of autoscaled CloudFormation and Salt.
While this existing process is sufficient in most cases, we are still facing some outstanding issues that we would like to improve. First, autoscaling takes too long, so we aren't able to react to changes in traffic patterns as quickly as we'd like. Second, the deployment process itself can take up to 15 minutes.
One solution for this is to introduce Docker into production, as it will help us with deployment speed and standardise our process further. The number one tool for container orchestration at the moment is Kubernetes, and that's where it comes into play.
The word Kubernetes had been floating around the office for a while, but few of the engineers could say they understood it, let alone had experimented with it. Infrastructure can be a bit of a black box for some people, so we took this opportunity to learn together and get everyone somewhat across how deployment works at Kogan.
Goal & Planning
Since we wanted to learn the pros and cons of both the GCP and AWS platforms, we set up two teams of 6 to 7 developers. Each team was responsible for deploying the app to their assigned platform in a manner suitable for a staging or UAT environment.
With this in mind, we wrote down a series of infrastructure-related tasks describing what constitutes "deploying" the app. We separated the tasks into major components, like setting up a cluster, deploying the web app, and deploying dependent services.
We began with an outline for the day, and then settled in to install all the tools we would require. After completing a short tutorial with Minikube, we split up into our respective teams to put together a proposed architecture diagram, which we'd present to each other before any real hacking began.
Hackday
The initial challenges were understanding the fundamental concepts and terminology surrounding Kubernetes. What is a Pod? What is a Service? What is the difference between them? Where does a Deployment come in?
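As a rough illustration of how these three concepts relate, the sketch below builds a Deployment and a Service as plain Python dictionaries and prints them as JSON, which kubectl can read as well as YAML. The app name `preferences`, the image `example/preferences:latest` and the port numbers are hypothetical placeholders, not our actual configuration.

```python
import json

# Hypothetical label; it is what ties the Deployment, its Pods and the Service together.
APP_LABELS = {"app": "preferences"}


def build_deployment(image: str, replicas: int = 2) -> dict:
    """A Deployment keeps `replicas` copies of the Pod template running."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "preferences"},
        "spec": {
            "replicas": replicas,
            # The Deployment manages Pods whose labels match this selector.
            "selector": {"matchLabels": APP_LABELS},
            # The Pod template: the containers that are scheduled together.
            "template": {
                "metadata": {"labels": APP_LABELS},
                "spec": {
                    "containers": [
                        {
                            "name": "web",
                            "image": image,
                            "ports": [{"containerPort": 8000}],
                        }
                    ]
                },
            },
        },
    }


def build_service() -> dict:
    """A Service gives the matching Pods one stable name and port."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "preferences"},
        "spec": {
            # Traffic is routed to any Pod carrying these labels.
            "selector": APP_LABELS,
            "ports": [{"port": 80, "targetPort": 8000}],
        },
    }


if __name__ == "__main__":
    # Wrap both objects in a List so they form a single JSON document.
    manifest_list = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [build_deployment("example/preferences:latest"), build_service()],
    }
    print(json.dumps(manifest_list, indent=2))
```

In short: a Pod is the running unit, a Deployment decides how many Pods exist and replaces them when they change, and a Service is the stable address in front of them.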
As the day unfolded and each team had their infrastructure running, we met some new challenges. The first was a difficulty with making the application Kubernetes "ready", which involved new configuration, variables, and strategies for managing static content. Then there were difficulties with learning new tools, like Helm, for composing a Kubernetes deployment.
We found that examples and documentation could be lacking or outdated. There was a big jump in translating our existing docker-compose configuration to something that Kubernetes would understand, e.g. passing secrets to apps automatically and getting the application to communicate with the database.
There were also some difficulties in getting a cluster running. The GCP team, as expected, had an easier time, considering that they were able to run the Kubernetes controller out of the box on GCP. The AWS team spent quite some time configuring the VPC and other AWS services to accommodate a Kubernetes cluster.
Due to these challenges, by the end of the day the GCP team had managed to set up their cluster, but had difficulties composing the configuration necessary to get the application running. The AWS team had managed to get the apps deployed locally with Minikube, but had a harder time translating that configuration to the AWS cluster.
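To make the secrets part of that jump more concrete, here is a minimal sketch of one common approach: store the values in a Kubernetes Secret and expose them to the container as environment variables. It is an illustration rather than the configuration either team ended up with, and the secret name `app-secrets`, the key `DATABASE_URL` and its value are made up for the example.

```python
import base64
import json


def build_secret(name: str, values: dict) -> dict:
    """Build a Secret manifest; values under `data` must be base64-encoded."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


def env_from_secret(secret_name: str, keys: list) -> list:
    """Build container `env` entries that read each key from the Secret."""
    return [
        {
            "name": key,
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
        }
        for key in keys
    ]


if __name__ == "__main__":
    # Hypothetical value; a real one would never be hard-coded like this.
    values = {"DATABASE_URL": "postgres://user:password@db:5432/app"}
    secret = build_secret("app-secrets", values)
    env = env_from_secret("app-secrets", list(values))
    print(json.dumps(secret, indent=2))
    # This list would go under a container's `env` field in the Pod template.
    print(json.dumps(env, indent=2))
```

Note that base64 is only an encoding, not encryption, so a manifest like this still needs to be kept out of version control or generated at deploy time.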
Conclusions
Even though we did not manage to deploy our app end-to-end with Kubernetes, we gained a significant amount of experience as a team. Everyone was involved in the nitty-gritty details of configuring Kubernetes and, albeit at various levels, every team member became aware of Kubernetes' basic concepts and tooling.
The hackday was a giant leap forward on our long-term goal of converting our production apps to running on Kubernetes. We also acquired valuable information on the pros and cons of each cloud platform. Overall, we are confident in saying that the goals we originally set out had been achieved by the end of the day.
Stay tuned for part 2, where we will discuss the technical advantages and disadvantages of both the GCP and AWS platforms, and what we have learned as a whole from a DevOps standpoint.