But what happens when containers are seen as “typical pieces of infrastructure” that do not add any business value? Then the discussion shifts from the original perspective to a new one. The previously mentioned benefits compared to Virtual Machines become blurry. You still have to manage infrastructure and middleware-related resources, especially when compared to serverless. This is why some people have coined the phrase “Containers are the new Virtual Machines”. Let’s explore this concept from various perspectives.
Every container requires a so-called template or image that specifies the operating system it uses together with the desired OS-level packages. Think of using an Alpine-based image with OS-level packages and utilities such as curl, wget, cat, cp, chmod, chown, etc. to carry out specific operations. All of these packages come with a specific version, mostly dictated by what the Operating System supports at a given time. From time to time, you need to upgrade the Operating System (base image) of the template. This brings new versions of the OS-level packages as well. No big deal if these are all compatible with the Operating System itself.
It becomes a bit more painful if you require specific versions to be installed. You need to delete or disable the package for which you require a replacement and (manually) install the desired version of the package. That sounds like a boring infrastructure-related task, and one that also requires proper testing. Besides this, it might also affect your application if it strictly depends on this specific version. Will you still benefit from containers, given this perspective?
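To make the pinned-version problem a bit more concrete, here is a minimal sketch in Python. It assumes you can export the package versions that a new base image ships into a plain dictionary. The names `find_pin_conflicts`, `pins`, and `shipped_by_new_base` are hypothetical, and the version numbers are made up for illustration.

```python
from typing import Dict, Optional, Tuple


def find_pin_conflicts(
    pins: Dict[str, str], shipped: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return packages whose pinned version differs from what the image ships.

    The value is (wanted_version, shipped_version); shipped_version is None
    when the package is missing from the image altogether.
    """
    conflicts = {}
    for name, wanted in pins.items():
        actual = shipped.get(name)
        if actual != wanted:
            conflicts[name] = (wanted, actual)
    return conflicts


# Hypothetical data: versions are illustrative only.
pins = {"curl": "8.5.0", "wget": "1.21.4"}
shipped_by_new_base = {"curl": "8.9.1", "wget": "1.21.4"}

print(find_pin_conflicts(pins, shipped_by_new_base))
```

Every entry in the result is a package you would have to replace by hand and retest after the base image upgrade.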
Containers need to talk to each other if your application is split into multiple microservices that each run in their own isolated environment. You need to route internal traffic from an API gateway or load balancer towards your containers. Additionally, you need to control the outgoing traffic. Perhaps your containers are only allowed to reach specific URLs and nothing else. This requires a corporate egress firewall and perhaps specific rules per application or container. You need to maintain all of this, and that is no easy task. It also requires a design for “container networking” as explained on the webpage of Suse.
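The egress rule described above ("only specific URLs and nothing else") can be sketched as a simple allowlist check. This is only an illustration in Python, not how a real egress firewall is configured. The host names and the function name `is_egress_allowed` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts this container may call.
ALLOWED_HOSTS = {"api.example.com", "registry.example.com"}


def is_egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only when the URL's host is on the allowlist."""
    host = urlsplit(url).hostname  # None when the URL has no host part
    return host is not None and host in allowed_hosts


for url in ("https://api.example.com/v1/orders", "https://evil.example.net/"):
    print(url, "->", "allow" if is_egress_allowed(url) else "deny")
```

Someone has to own that allowlist per application and keep it up to date, which is exactly the maintenance burden this section is about.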
From a “plain Docker” perspective, you already have five networking types to choose from: Host, Bridge, Custom Bridge, Container defined, and also No Networking at all. For every application, you need to select the best networking option for your solution. You also require a networking driver that implements the given option. Just like with Virtual Machines, you need to maintain the network and troubleshoot it once in a while. With serverless, you don’t care about the networking since that is abstracted away from you and offered by the service provider. Either way, it is important to choose the right specs. Think of speed, security, acceptable latency, etc.
Container-to-container communication needs to be secure to avoid leaking information “in-between” different services that work together in harmony. It’s easy to make a mistake in a container image that introduces a security misconfiguration that needs to be fixed. For example: allowing traffic from the entire internet to your application instead of from a specific network segment (CIDR block) or Security Group (in AWS). Or what about runtime issues that can easily pop up if you have not defined your runtime rules correctly? A developer might not be aware of it and run a malicious package that has a backdoor in it.
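As a small illustration of the CIDR example above, the following Python sketch flags ingress rules that are open to the entire internet. The rule format (a dictionary with `port` and `source`) is an assumption made for this example and does not match any specific cloud provider's API.

```python
import ipaddress


def is_open_to_internet(cidr: str) -> bool:
    """True when the CIDR block covers every address (0.0.0.0/0 or ::/0)."""
    network = ipaddress.ip_network(cidr, strict=False)
    return network.prefixlen == 0


def risky_rules(rules):
    """Return the rules whose source is the whole internet."""
    return [rule for rule in rules if is_open_to_internet(rule["source"])]


# Hypothetical ingress rules for one application.
rules = [
    {"port": 443, "source": "10.20.0.0/16"},  # internal network segment
    {"port": 8080, "source": "0.0.0.0/0"},    # open to everyone
]

print(risky_rules(rules))
```

A check like this could run in a pipeline before a deployment, but somebody still has to write it, maintain it, and act on the findings.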
Other problems are more on the infrastructure side of things, such as containers that do not have any CPU or memory limits specified. If your workload misbehaves, it can cause the entire system to crash since it will eat up precious resources from the underlying system. Auto-scaling in the cloud might prevent the system from crashing, but then you face another big risk: huge cloud bills. You need to control and maintain your infrastructure aspects to remain secure and “in control”. Once you forget those aspects, your application and thus your business is at risk. This all adds to the number of responsibilities of the cloud or (container) platform team(s). Again, these are the same kinds of problems you have with Virtual Machines. When focusing on delivering business features, you would want to avoid all of this altogether.
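To show what "no CPU or memory limits specified" looks like in practice, here is a minimal Python sketch that inspects a Kubernetes Pod manifest that has already been loaded into a dictionary. The field names (`spec.containers[].resources.limits`) follow the Kubernetes Pod specification. The Pod itself and the function name `containers_missing_limits` are hypothetical.

```python
def containers_missing_limits(pod: dict) -> dict:
    """Map container name -> list of limits ('cpu', 'memory') that are not set."""
    missing = {}
    for container in pod.get("spec", {}).get("containers", []):
        limits = (container.get("resources") or {}).get("limits") or {}
        absent = [key for key in ("cpu", "memory") if key not in limits]
        if absent:
            missing[container["name"]] = absent
    return missing


# Hypothetical Pod manifest, as it would look after parsing the YAML.
pod = {
    "spec": {
        "containers": [
            {"name": "web", "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
            {"name": "worker", "resources": {"limits": {"cpu": "1"}}},
            {"name": "sidecar"},
        ]
    }
}

print(containers_missing_limits(pod))
```

In this example the `worker` container has no memory limit and the `sidecar` has no limits at all, so both could starve the other workloads on the same node.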
Virtual Machines require a root disk for the Operating System, and sometimes another disk or network drive to store large amounts of data. Both disks should be encrypted (at rest) to remain secure. You need to carefully select which types of disks you prefer. Solid State Drives (SSDs) are faster than traditional hard disks (HDDs). However, they are also more expensive. Containers require a so-called “storage driver” to obtain storage solutions for persistent storage.
Example storage provisioners for Kubernetes-based workloads are Azure Disk, Local, and AWSElasticBlockStore to name a few. These provisioners translate to a “storage class” and (lower level) a storage driver. Often, they come in the form of an additional container to run the desired software package.
An example would be the installation of the LVM flexVolume driver for Kubernetes. You need to install this software tool and run it as a Pod in your existing Kubernetes cluster. It enables your other containers to mount a data volume inside it so you can use this volume to store persistent data that remains intact even when your container is deleted and recreated. This driver also needs to be updated, upgraded, and patched. Every time, you need to assess the upcoming changes with the new version and carefully test if all works fine after the upgrade. Not only from a functional point of view but also from a technical point of view by conducting speed tests, data reliability tests, etc. This all requires additional maintenance you would not have when running your application without the use of containers.
Base images/master templates
Containers are built from so-called images or templates, and every image consists of layers. Each layer contains an instruction, and these instructions are carried out one after another. Layers can be “stacked”, and an image inherits the layers of the image it builds upon. Images can build on one another across many levels. This leads to a so-called “parent-child” relationship, and images become dependent on each other. Parent images are also called “base images”. From a functional point of view, these can also be called “master templates” or “golden templates”, just like templates for Virtual Machines. You need to carefully watch any changes in the entire “chain” of images when it comes to patching or other changes. Since all depend on each other, a version upgrade in a parent image has a direct effect on the child images that depend on it.
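The ripple effect in such an image “chain” can be sketched in a few lines of Python. The example assumes you keep a simple mapping from each image to its parent image. All image names and the function name `affected_images` are hypothetical.

```python
def affected_images(parent_of: dict, changed: str) -> set:
    """Return every image that directly or indirectly builds on `changed`."""
    # Invert the child -> parent mapping into parent -> [children].
    children = {}
    for child, parent in parent_of.items():
        children.setdefault(parent, []).append(child)

    affected, stack = set(), [changed]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in affected:
                affected.add(child)
                stack.append(child)
    return affected


# Hypothetical image chain: child image -> parent (base) image.
parent_of = {
    "corp-base": "alpine",
    "corp-python": "corp-base",
    "orders-api": "corp-python",
    "billing-api": "corp-python",
    "legacy-tool": "debian",
}

print(sorted(affected_images(parent_of, "corp-base")))
```

A single patch to `corp-base` means three other images need to be rebuilt and retested, while `legacy-tool` is untouched because it hangs off a different parent.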
Often, maintaining those base images is the responsibility of a “platform team”. Those teams handle infrastructure and middleware, so there is a dependency from an organizational point of view as well. That is the same as with Virtual Machines, for which that same platform team creates and maintains the “master templates”. Change requests might pop up if a team wants something else. Or what about hardening an image? The purpose is to make the system more secure, but at the same time functionality can be broken in a matter of minutes. Teams that work in isolation and do not depend on those templates do not face these problems. Serverless setups do not require (base) images and master templates. Discard the burden of these infrastructure-related aspects and focus on your business features. A big win for everyone in the organization.
Logging and monitoring
Logging and monitoring are essential for your application so you can spot and troubleshoot issues at an early stage. No news so far. Virtual Machines require logging of Operating System (components) as well as logging for the application(s) running on them. The same is true for containers. Consider a container that runs a single process (best practice): by default, there is only one log stream. Container logs spit out information about every aspect of the Operating System, which makes up the infrastructure layer, as well as the application itself.
Perhaps the platform team creates and maintains the (base) image. Problems in these components need to be addressed by them. On the other hand, the application-related logs are the responsibility of the DevOps team. Sometimes the line between these responsibilities is a bit blurry. It goes without saying that this situation can lead to a lot of overhead and communication between those teams.
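One way to picture this split is a small router that decides which team should look at a log line coming from that single stream. The Python sketch below assumes the logs are JSON lines with a `component` field. That field, the component labels, and the team names are all hypothetical and only serve to illustrate the idea.

```python
import json

# Hypothetical labels for components owned by the platform team.
PLATFORM_COMPONENTS = {"kernel", "init", "base-image"}


def route_log_line(line: str) -> str:
    """Return the team that should handle this log line, or 'unrouted'."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return "unrouted"  # plain-text lines cannot be attributed
    if not isinstance(record, dict):
        return "unrouted"
    component = record.get("component")
    if not isinstance(component, str):
        return "unrouted"
    return "platform-team" if component in PLATFORM_COMPONENTS else "devops-team"


stream = [
    '{"component": "base-image", "msg": "package index out of date"}',
    '{"component": "orders-api", "msg": "payment failed"}',
    "segfault at 0 ip 0000",
]
for line in stream:
    print(route_log_line(line), "<-", line)
```

The “unrouted” lines are where the blurry responsibilities show up: nobody clearly owns them until someone investigates.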
Apart from the different levels of logging and monitoring, you also require a logging driver or log collector to generate the correct logs and process them accordingly. In one of my previous articles, the main features of FluentD are explained. Once you run that in a Kubernetes cluster, you also need to maintain this piece of infrastructure since it runs as a separate component alongside your other applications. Who is responsible for that? As with Virtual Machines, more maintenance and patching is needed to keep everything running smoothly and securely.
“Containers are the new Virtual Machines” is a rather bold statement that might make you think twice. In this article, I highlighted some of the aspects that require additional infrastructure (pieces) to set up and maintain. It’s not just about building and packaging your application into a container, running it, and then forgetting about it. Nope, it requires a lot more, even more than we can cover in this article. I hope this article helped you to think about containers (again) before you jump into them. Perhaps other solutions such as serverless are a better fit for your applications when they are already modern and do not require the specific (cloud-)native solutions that containers need. Remember, it all depends on your situation, knowledge, budget, and the amount of effort an organization wants to put into its application deployment methods.