16 min read
Based on MapR Academy course, BUS – Application Containers and Kubernetes
This post discusses application containers – what they are, what they do, and why they matter.
We’ll start with an overview of what application containers are and how they are used in enterprises to help improve deployment time, consistency, and the efficiency and availability of applications. Along the way, we will cover key characteristics of what containers can and cannot do, and how they compare to virtual machines.
In future related posts, we will cover how Kubernetes is used to orchestrate containers and associated resources. We’ll discuss how Kubernetes schedules the deployment of containers, scales container resources, manages communication between applications in containers, and is used to monitor the health and availability of containers. Combining the MapR Data Platform with application containers and Kubernetes makes applications fully capable of consuming and processing live, enterprise-level data.
An application container is a stand-alone, all-in-one package for a software application. Containers include the application binaries, plus the software dependencies and the hardware requirements needed to run, all wrapped up into an independent, self-contained unit.
The application container is dropped into a system, then runs using the local hardware and operating system. Since it includes all of the necessary dependencies, the container functions exactly the same when deployed on a laptop, on a server, on a virtual machine, in the cloud, or on any other compatible system.
As a self-contained package, an application container can easily be moved to a different system, or even be uploaded and downloaded using a software hub, to be shared with others.
Let’s take a look at an example of how and why you might containerize an application in an enterprise environment.
This example tracks how containers are used in the development cycle of an app designed to monitor the health sensors on wearable devices. The sensor data is streamed to the customer’s health care provider network, which performs machine learning on the data, to look for warning signs of health issues.
Development of the health monitoring sensor tracking app is done by a team of developers, working on laptops. They commit their work to a software hub every day. For reasons that will become clear in a moment, let’s assume that the development laptops are all 3 - 4 years old, use spinning hard drives, have 16 GB of RAM, and each has its own, specialized version of some flavor of Linux and an IDE. The application works flawlessly in their development environment.
Because the health monitoring application is large and complex, it gets broken into microservices, in order to run more efficiently. Each service is then packaged into a separate application container.
Each container includes the final code binaries for the individual services.
Included are the dependencies for each service, such as the libraries and APIs needed to communicate with the sensors on any device the app supports, geolocation to know where the data is coming from, data streaming, web protocols, authentication with the health care system, and anything else needed for the app to run.
Also included is a YAML (YAML Ain't Markup Language)file, which defines the CPU, RAM, network, and storage needs for each service.
Each container includes only the dependencies it needs for the single service in it. The microservices approach allows each container to specialize for its service. Two services can even use different versions of the same library if needed, because the container environment allows them to function independently of each other.
Once application development is complete, the containers are tested by the QA team.
The QA environment uses a dozen nodes with CentOS, running a stripped-down version of the live environment. The servers are a few years old; they vary greatly with CPU speed and cores and the amount of RAM and network cards on each server. Some have spinning disks while others use SSDs.
Because containers include all the software dependencies each service needs, the application runs exactly the same on the QA servers as it did on the development laptops cited earlier.
Containers define what hardware resources are required for each service to run. The server carves these resources out when the service is needed and provides them for the container. The service runs inside the containerized environment, which is spared from having its resources cannibalized by other applications on the server. When the service is no longer needed, the container is shut down and the resources are released back to the system.
When the application clears QA, it is ready to be released on the live environment, which, in this example, is made up of 100 nodes in a distributed cluster. These nodes use CentOS on the very latest hardware with 12 cores, 256 GB of RAM, SSDs, and gigabit network cards.
Again, because the services are containerized with all of the dependencies they need, they run just as they did on the development laptops and QA servers, albeit quite a bit faster. The application services run happily in their containers, safe from disturbance by any other applications on the system, pulling resources when needed and releasing them when not.
To create high availability for the application, each service is scaled by spinning up multiple instances of the container image and distributing them across the cluster. Each container only includes a single service and its dependencies, which makes them very lightweight. Cloning and launching new copies of a container takes just seconds. Quickly, several copies of the services are spread out across the cluster, each performing the work sent to it and returning results, independent of the other copies in the environment.
To share containers, you could, for example, just upload the containers to the cloud where they can be downloaded. Since the containers include everything the app needs to run, it will run functions the same as it does on your own clusters.
An application container is similar to, but not the same as, a virtual machine. A container is a package of the application and its dependencies. It will run on any compatible OS and hardware. Containers are designed to be streamlined application packages and nothing else. In the case of the example, the containers were streamlined down to microservices, so that each container is as lean and efficient as possible.
A virtual machine, on the other hand, includes a complete copy of an operating system, along with any applications and software dependencies running on it. In addition, a virtual machine requires a hypervisor layer to talk to the host server. A VM can be capable of performing more complex tasks, but requires a much larger package, plus more overhead in time and resources from a server.
An application container is a clean and effective package for running a single application as efficiently as possible.
Application containers consist of just an application, its software dependencies, and a small YAML file with its hardware requirements. They use the OS and infrastructure native to the system they are deployed onto. Therefore, a container is very lightweight when compared to other virtualization techniques, like VMs. They can be ported quickly to other environments and take just seconds to deploy, clone, or even relaunch in the case of a problem.
While a container can include more than one application, that is not generally considered a best practice. An application container is a streamlined environment, designed to run an application efficiently. Adding more applications adds complexity and potential conflict to the package. In some cases, tightly coupled applications may share a container. In the vast majority of cases, however, an application will have its own specialized container.
Since a container only includes an application and its dependencies, it has a very small footprint on the server. Because of this small size, several copies of a container can be launched on each server.
A higher application density leads to more efficient processing, as there are more copies of the application to distribute the workload.
Availability of the app is also greatly improved. The loss of a single container will not affect the overall function of the service, when there are several others to pick up the lost production.
The small size of an application container makes it quick to move between servers, up to a cloud, or to mirror to another cluster. In addition, a container is completely self-sufficient and has all of the resources that the application needs to run.
When moving a container, it can just be dropped onto its new home and launched in a plug-and-play fashion. You do not need to test for compatibility or load additional software.
The containerized app can even run directly off of an instance in the cloud, giving you absolute flexibility over availability and the run-time environment.
The application container ecosystem is very large and diverse. Thousands of containerized applications are available on hubs to use as templates or foundations for proprietary apps. A simple container with Apache Web Server can be downloaded, pre-built, saving the development time and resources needed to create common services over and over.
Because each application container creates an isolated environment for its application, the resources allocated to it are the entire machine. Other copies of the same container are "unaware" of each other.
As a result, it’s easy to have multiple different applications on the same server, all running simultaneously. Each application uses the resources assigned to it. There are no concerns about an application taking resources from another application and causing it to crash. When an application completes its work, resources are released back to the system.
You know, now, that application containers are completely self-contained. Any application in a container runs in the same exact manner on any hardware. Because the container contains all dependencies, it doesn’t require the OS to supply any code. As long as the hardware can match the needs of the application, it will happily chug along no matter where you put it.
When an application is containerized, it is saved as an image. As the application grows, and as changes are made, these changes are saved as a new image on top of the original one. Therefore, any time you need to push or pull an upgrade of a containerized application, all you need to move and install is the new image. You do not need the entire container each time.
Containers are a highly specialized runtime environment, designed to run a single application as efficiently as possible. Because of this specialization, they are perfectly designed to break up large applications into microservices.
Each microservice can run in its container when called upon, then release the resources back to the system when it is done. You can also define a different number of clones for each container, based on the use of each service, allowing for more copies of services that are used more frequently in your app.
Read more about microservices and containers in our FREE eBook, A Practical Guide to Microservices and Containers.
When a container is shut down, either intentionally or due to some failure, everything in the container goes down with it. You can save out results as part of your application, but other progress and data, like logs, are lost. Each time you need to use the application, or spin up a new copy, it starts from a new beginning state. The state of the application cannot be saved and built upon with a new iteration.
A container does not know where its data comes from, nor where it is going. The container is completely enclosed in its environment, and cannot see data sources for work or storage. External data can be accessed, and results from work can be saved out of the container environment, but the container has no idea what or where that data is.
Therefore, the app in the container cannot take advantage of data locality to make sure the work and data are on the same node. This may result in extra movement of data and reduce the efficiency of the service in the container.
Containers are not aware that the hardware that is provisioned to it is the entire server. This is helpful in that the containerized application is protected, but this also means that the application is not aware of, nor can it take advantage of, common services for enterprise applications, like networking with other apps, scheduling, distribution, and load balancing.
You have seen how application containers speed up the pipeline from production to delivery. They solve the problem of having an app working reliably in different environments, making development processes much more efficient and flexible.
Breaking up large applications into microservices and putting them into containers provides fast and flexible services that can quickly be deployed and moved between systems.
Containers are not a stand-alone solution, however, and have a few inherent limitations.
In part two of this blog series, we explore how Kubernetes is used with containers in an enterprise level environment, to orchestrate the deployment and management of hundreds or thousands of services across multiple clusters.
Stay ahead of the bleeding edge...get the best of Big Data in your inbox.