I’ve been using Amazon Elastic Container Service (Amazon ECS) since 2016, and my team manages our EC2 resources through our internal PaaS. We started using ECS when it was still called EC2 Container Service; with features like Fargate, that name no longer made sense, so AWS renamed it.
At YipitData we have an internal PaaS where developers specify the processes they want to run (i.e., Docker image, command, number of processes, web/worker tier), and the system figures out the ECS and EC2 parts. It’s annoying to manage a pool of EC2 instances: dealing with instance replacement as processes change, updating operating system images once in a while, handling EC2 instance retirements and hardware failures, and everything else anybody working with EC2 has to worry about. When AWS Fargate came out, I thought it would be AWS’s answer to all of the aforementioned problems, and I also thought it was moving the container ecosystem in a different, better direction.
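To make the idea concrete, a developer-facing spec and its translation to ECS might look like the sketch below. The field names and the mapping are hypothetical, not YipitData’s actual PaaS API:

```python
# Hypothetical sketch of a PaaS process spec being translated into
# ECS task-definition parameters (the shape boto3's
# register_task_definition expects). All names here are illustrative.

def to_task_definition(spec):
    """Turn a developer-facing process spec into ECS task-definition
    parameters; the PaaS handles the EC2/ECS details from there."""
    return {
        "family": spec["name"],
        "containerDefinitions": [
            {
                "name": spec["name"],
                "image": spec["image"],
                "command": spec["command"].split(),
                "memory": spec.get("memory", 512),  # MiB, arbitrary default
            }
        ],
    }

spec = {
    "name": "email-worker",
    "image": "registry.example.com/email-worker:latest",
    "command": "python worker.py",
    "count": 4,        # desired processes -> the service's desiredCount
    "tier": "worker",  # a "web" tier would also get a load balancer
}
task_def = to_task_definition(spec)
```

The point is that developers only ever touch `spec`; everything below that line is the platform’s concern.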
I’ve been managing infrastructure solely on AWS since 2013; thus, my views are biased towards my personal experience. My infrastructure team is tiny (1-3 people over the years), and we often pick services that will give us the least headaches and the least maintenance. We have optimized our AWS bill heavily and we are aware of our vendor lock-in. Teams of different sizes, with workloads on different cloud providers, will have a different reality. Please keep this in mind throughout this blog post.
Kubernetes vs. ECS
We’ve been using ECS since 2016. At that time, Kubernetes wasn’t as mature as it is today and fewer integrations existed. Also, because Kubernetes is meant to be cloud-agnostic, it will never integrate with AWS as easily as ECS does (e.g., IAM, CloudWatch, ALB). From what I’ve heard from friends running Kubernetes in production, their infrastructure seems much more complex than what I’ve accomplished with ECS: Kubernetes has too many separate pieces to manage.
A container orchestration system is only part of the puzzle and it can be abstracted out for the most part. Your engineers don’t care if you’re running Kubernetes, ECS, Docker Swarm, Mesos or anything else, they only care if what they need to run is running at the specified capacity. Unless you’re an infrastructure engineer/DevOps, you should not care about the details of container orchestration!
Regardless of the container orchestration system you use, one problem is inevitable: there must be a pool of compute resources to run containers. Most companies have dedicated teams managing those clusters, dealing with OS updates, and making sure there are enough resources available at all times. Most of this management is at the instance level, which means that each instance runs multiple containers. If any instance has to be replaced, more than one container is disturbed; a container from an entirely different system may have to shut down because it happens to be on the same instance. Reasoning about containers at the instance level seems like the wrong approach; there could be a better way.
With AWS Fargate, you specify how to run containers and AWS figures out the compute part for you. You don’t need to spin up instances to meet capacity or worry about OS upgrades, Fargate’s got your back — for a price. I think this is the right direction for most teams that don’t need to go too far to optimize their EC2 usage or are underwater with infrastructure/DevOps demand.
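In API terms, the difference is mostly the launch type: with Fargate there is no instance to place the task on, so you hand ECS the task and the network and AWS provisions the compute. A minimal boto3-shaped sketch (the cluster name, subnet, and task definition below are placeholders, not real resources):

```python
# Sketch of launching a container on Fargate. The cluster, subnets, and
# task definition are made-up placeholders.

def fargate_run_task_params(cluster, task_definition, subnets):
    """Build the parameters for ecs.run_task() with the Fargate launch
    type. No EC2 instances are involved; AWS provisions the compute."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "assignPublicIp": "DISABLED",
            }
        },
    }

params = fargate_run_task_params("main", "email-worker:3", ["subnet-0abc"])
# With AWS credentials configured, this would actually start the task:
# import boto3
# boto3.client("ecs").run_task(**params)
```

Note what is absent: no AMI, no instance type, no autoscaling group. That is the abstraction you are paying for.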
Systems like Fargate abstract one more aspect of the container ecosystem: Docker abstracts the build & execution phase, ECS abstracts the orchestration, and Fargate abstracts the servers. Most teams don’t care how containers are orchestrated or how compute resources are managed, as long as the system meets their requirements. We’ve been running AWS Fargate in production since last year, and we knew one day we’d hit a wall and have to go back to our EC2 optimizations; but if Fargate were (a lot) cheaper, I don’t think we’d go back to EC2. At YipitData, the bulk of our container processes are workers/batch jobs, which we’re happy to run on spot instances, saving 80-90% of the bill. If your projects aren’t ready to run on spot instances, take a look at Fargate; it may help you.
Kubernetes may lose market share
I think Kubernetes may be to containers what Xen was to virtualization. At one point everybody cared about managing Xen, and then came the public cloud providers offering virtual machines for a reasonable price. Do new engineers even know about Xen? They don’t care, and they shouldn’t. Are my machines running on Xen, KVM, or Nitro? As long as AWS doesn’t mess up, I don’t care. Most companies want to deploy projects faster and outsource everything that is irrelevant to them, they don’t care about how the cloud providers do what they’re paid to do.
Will Kubernetes lose ground to ECS because of technologies like Fargate? Maybe. I can hear some people say “but Kubernetes is cloud-agnostic,” and I understand. But remember: multi-cloud isn’t all sunshine and rainbows.
Fargate is a black box that you don’t have much control over, like all other managed services from AWS, and it doesn’t support a few things we’d like to see (e.g., custom volumes and custom Docker capabilities), but it’s a great step towards better abstractions. Recently we had to disrupt our services and replace all of our Fargate tasks when the runc vulnerability (CVE-2019-5736) came out, but it was far less painful than the work we had to do to replace all of our EC2 instances.
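For what it’s worth, rolling every Fargate task onto patched infrastructure came down to roughly one API call per service, since ECS can force a service to replace its tasks without changing the task definition. A hedged sketch (the cluster and service names are invented):

```python
# Sketch of the rollout: force each ECS service to replace its running
# tasks, which lands them on patched Fargate infrastructure. Cluster and
# service names are invented for illustration.

def force_redeploy_params(cluster, services):
    """Yield ecs.update_service() parameters that replace every task in
    each service while keeping its current task definition."""
    for service in services:
        yield {
            "cluster": cluster,
            "service": service,
            "forceNewDeployment": True,
        }

calls = list(force_redeploy_params("main", ["web", "email-worker"]))
# With AWS credentials configured:
# import boto3
# ecs = boto3.client("ecs")
# for params in calls:
#     ecs.update_service(**params)
```

Compare that to the EC2 version of the same incident: build a patched AMI, rotate every instance, and drain containers as you go.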
There’ll always be issues with technology choices, but you have to decide which ones are worth dealing with. If one day Fargate supports spot tasks, we may switch entirely and never look back at EC2.