A Practical Guide to Microservices and Containers

by James A. Scott

Infrastructure Agility

Application agility depends upon an equally agile infrastructure. Both have similar characteristics. As we’ve seen, agile applications are deconstructed, distributed and dynamically assembled as needed. The whole point of microservices and containers is to eliminate as much of the need for vertically integrated and hard-coded logic as possible. It is also to remove dependencies on specific servers and hardware components.

Today, organizations have an unprecedented variety of options for deploying infrastructure. There’s traditional on-premises infrastructure – the data center – public cloud, private cloud and hybrid cloud. There is also a huge selection of software-as-a-service (SaaS) providers who expose their services through APIs that their customers can use to extend their own applications. The downside of choice, of course, is complexity. That’s why agile infrastructure should be designed to be deployed on as broad a combination of platforms as possible. Resources should be shared. All platforms should support multi-tenancy for the greatest deployment flexibility. This is even desirable within the on-premises data center. As much as possible, platforms should have common operating systems, containers, automation tools, permissions, security and even pathnames. If something is in a home directory on-premises, there should be a duplicate home directory and path on a cloud platform so applications don’t come to a halt over things like simple naming conventions.

Deployment Options

Deployment options fall into four basic models, which can be mixed and matched as needed.

On-Premises Infrastructure

This has been the dominant enterprise computing model for more than 50 years. Organizations maintain their own equipment and software in a captive data center with full control over all aspects of processing, scheduling, administration and maintenance. Many organizations in regulated industries have no choice but to use an on-premises model because of the need to tightly control data and document processes. However, the cost and significant capital expense involved with building and maintaining on-premises architecture is prompting many organizations to shift some or all of their workloads to more-flexible cloud options. On-premises computing won’t go away anytime soon, however. Legacy equipment and applications may be incompatible with a cloud environment, and organizations that want to protect investments in hardware and software may choose to maintain on-premises investments for years until depreciation cycles have run their course and applications can be redeveloped.

Public Cloud

Public cloud makes resources, such as processors, memory, operating systems, applications and storage, available over the public internet on a pay-per-usage basis. Think of it as computers in the sky. Public cloud is like using a local server, but the server is virtualized and managed elsewhere by a cloud provider with a high degree of automation.

Organizations use public cloud for a variety of reasons, but the most popular are flexibility, scalability and ease of administration. Public cloud instances can be launched with a few mouse clicks and just as easily taken down when no longer needed. Developers and end-users can, in many cases, deploy their own cloud instances without approval from IT and its accompanying delays. Billing is usually based upon usage, which gives organizations accountability and flexibility to pay only for the resources they use. Public cloud instances can be scaled up or down with relative ease, and many cloud providers offer best-of-breed automation tools to make administration easy. Public cloud is also an excellent platform for developing applications that will “live” in the cloud, such as those meant for use on mobile devices or with services that are exposed via APIs.
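
This kind of API-driven provisioning is easy to see in code. The sketch below, which assumes the AWS SDK for Python (boto3) and uses placeholder values for the machine image, region and tags, launches a small instance and tears it down again when it is no longer needed.

    # Sketch: launching and tearing down a public cloud instance through an
    # API using boto3. The AMI ID, region, instance type and tag values are
    # placeholders, not recommendations.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch a single small instance, tagged so it is easy to find and bill.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder machine image
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "team", "Value": "dev-sandbox"}],
        }],
    )
    instance_id = response["Instances"][0]["InstanceId"]
    print("Launched", instance_id)

    # When the environment is no longer needed, it is taken down just as easily.
    ec2.terminate_instances(InstanceIds=[instance_id])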

Private Cloud

For organizations that want the flexible automation benefits of public cloud but need to keep resources on premises for control or compliance reasons, private cloud is a popular alternative. This model provides the same scalability, automation and flexibility advantages as public cloud, but in an on-premises environment that can be physically secured and tightly managed. Private clouds can be built using existing data center equipment and architecture, or licensed from public cloud providers, which can deliver what is essentially a version of their existing services in a dedicated, secure environment. True private cloud is more than just virtualization. The research firm Wikibon defines it as encompassing converged architecture, virtualized software and hardware, self-service provisioning, orchestration/automation and a single point of control.

Hybrid Cloud

When you combine a public and private cloud, you get a hybrid cloud. This architecture combines both models in a manner that is seamless and that permits workloads to move back and forth easily. This gives organizations a combination of control and flexibility that can be adjusted to the situation. Hybrid architecture preserves existing hardware and software investments while giving companies the flexibility to move applications to the cloud as resources and budgets permit. Not all applications can be moved easily, and some may continue to live for a long time in private data centers. In those cases, organizations may opt for a “cloud bursting” approach in which demand spills over to a duplicate or compatible cloud application as needed. This reduces the need to add on-premises infrastructure that sits idle much of the time. There are even cloud-to-cloud options, in which applications move back and forth between multiple public clouds.

Containers and Clouds

One of the most compelling advantages of cloud computing is developer productivity. As noted above, developers can quickly spin up their own cloud instances, provision the tools they want and scale up and down easily.

Containers are an ideal tool for developers to use when shifting between on-premises, private cloud and public cloud architectures. Because containers carry their dependencies with them and are largely independent of the underlying host and infrastructure, they can be moved quickly and with minimal disruption. Some organizations even use multiple public clouds, and shift workloads back and forth depending upon price and special offers from the service providers. Containers make this process simple.
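
As a minimal sketch of that portability, the snippet below uses the Docker SDK for Python to start the same image on whichever Docker host the environment points at; the image name, container name and port mapping are illustrative.

    # Sketch: the same container image runs unchanged on a laptop, an
    # on-premises host or a cloud VM. Requires the Docker SDK for Python
    # ("pip install docker"); image, name and port values are illustrative.
    import docker

    client = docker.from_env()   # talks to whichever Docker host is configured

    # Pull the image and start it; nothing here is tied to a particular platform.
    container = client.containers.run(
        "nginx:1.25",                 # example image
        detach=True,
        ports={"80/tcp": 8080},       # expose container port 80 on host port 8080
        name="portable-web",
    )
    print(container.status)

    # The identical call works against a different DOCKER_HOST, which is what
    # makes shifting a workload between environments so straightforward.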

Orchestration

Agile infrastructure should minimize the need for human intervention in routine tasks, such as resource deployment and management. Overworked IT administrators and paperwork can introduce significant delay that undermines the value of cloud environments. Automation makes cloud computing fast and efficient by using software tools to handle these tasks.

For example, automation can enable the setup of multiple virtual machines with identical configurations using a single script written in Puppet, an open-source configuration tool that enables applications and infrastructure to be defined using English-like commands. Puppet scripts can be shared and used to enforce changes across data center and cloud platforms.

Ansible is an open-source automation platform that can be used for tasks like configuration management, application deployment and task automation. It can also be used to automate cloud provisioning and intra-service orchestration using a “playbook” metaphor that permits multiple automation tasks to be combined to powerful effect.
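
Playbooks are written in Ansible’s own YAML format, but they slot neatly into larger automation pipelines. The hedged sketch below simply drives a playbook from Python using the standard ansible-playbook command; the playbook and inventory file names are assumptions for illustration.

    # Sketch: driving an Ansible playbook from a small Python wrapper.
    # "site.yml" and "inventory.ini" are assumed file names for a playbook
    # that combines several automation tasks (packages, config, services).
    import subprocess

    def run_playbook(playbook="site.yml", inventory="inventory.ini", check=False):
        cmd = ["ansible-playbook", playbook, "-i", inventory]
        if check:
            cmd.append("--check")    # dry run: report what would change
        return subprocess.run(cmd, check=True)

    # Preview the changes first, then apply them for real.
    run_playbook(check=True)
    run_playbook()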

As noted earlier, Kubernetes is bringing these same kinds of automation and orchestration capabilities to containers, with features that are customized for the unique stateless, self-contained characteristics of those vehicles. Kubernetes is optimized for orchestrating large numbers of containers, ensuring that each has the resources it needs and providing for things like health monitoring, restart and load balancing.
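
The monitoring side of this is easy to tap into programmatically. The sketch below uses the official Kubernetes Python client to report readiness and restart counts for the pods in one namespace; the “production” namespace is a placeholder.

    # Sketch: reading the health signals Kubernetes already tracks for each
    # container, using the Kubernetes Python client ("pip install kubernetes").
    # The "production" namespace is a placeholder.
    from kubernetes import client, config

    config.load_kube_config()   # or config.load_incluster_config() inside a pod
    v1 = client.CoreV1Api()

    for pod in v1.list_namespaced_pod("production").items:
        for status in (pod.status.container_statuses or []):
            print(
                pod.metadata.name,
                status.name,
                "ready" if status.ready else "not ready",
                "restarts=%d" % status.restart_count,
            )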

Kubernetes isn’t a replacement for Puppet and Ansible, but is another resource that works specifically at the container layer and that can be managed by those automation tools. The combination of VM automation and Kubernetes gives IT organizations unprecedented productivity advantages compared to manual systems administration.

Edge Computing

The internet of things will create vast new volumes of data, a flood that International Data Corp. (IDC) expects will reach 44 zettabytes annually by 2020. To help visualize that, if you covered a football field with 32 gigabyte iPhones and kept stacking layers on top of each other, by the time you got to 44 zettabytes the stack would reach 14.4 miles into the air. At that altitude, the temperature is -65° and the barometric pressure is 1/30 of that at the earth’s surface. IDC further estimates that machine-generated data will account for 40 percent of the digital universe in 2020, up from 11 percent a decade ago.

These unprecedented data volumes will require a new approach to processing, since traditional server, storage and network models won’t scale enough. This is why edge computing is rapidly emerging as a new architecture.

Edge computing distributes resources to the far reaches of the network and close to the devices that generate data. Edge servers collect streaming data, analyze it and make decisions as necessary. These servers can pass selected or summary data to the cloud over the network, but most of the processing takes place locally.
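
To make the “process locally, forward only summaries” pattern concrete, here is a hedged sketch of an edge aggregator; the sensor read function and the cloud ingest URL are hypothetical placeholders.

    # Sketch: an edge node that summarizes raw sensor readings locally and
    # forwards only compact summaries to the cloud. read_sensor() and the
    # CLOUD_ENDPOINT URL are hypothetical placeholders.
    import json
    import random
    import statistics
    import time
    import urllib.request

    CLOUD_ENDPOINT = "https://cloud.example.com/ingest"   # placeholder

    def read_sensor():
        # Stand-in for reading a real device.
        return 20.0 + random.random()

    def summarize(window):
        return {
            "count": len(window),
            "mean": statistics.fmean(window),
            "min": min(window),
            "max": max(window),
        }

    def forward(summary):
        req = urllib.request.Request(
            CLOUD_ENDPOINT,
            data=json.dumps(summary).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)   # raw readings never leave the edge

    while True:
        window = []
        for _ in range(60):           # roughly a minute of readings
            window.append(read_sensor())
            time.sleep(1)
        forward(summarize(window))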

Edge computing has some important implications for IT infrastructure and application development. Many applications will need to be restructured to distribute logic across the network. Storage will likewise need to be decentralized. This will surface the reliability and data integrity issues that are inherent in broadly decentralized networks. Cloud servers will become control nodes for intelligent edge devices, performing summary analytics while leaving real-time decision making to edge servers.

Containerized microservices will be an important technology in the construction of IoT backplanes. Distributed processing frameworks will require federated, multi-domain management with intelligence moving fluidly to the places it’s most needed. Automation and orchestration tools like Kubernetes will evolve to meet this demand.

Serverless Computing

Cloud computing has made servers transparent, and serverless computing – also called event-driven computing, or Function-as-a-Service (FaaS) – takes this to another level. It reimagines application design and deployment with computing resources provided only as needed from the cloud. Instead of being deployed to a discrete server, containerized, microservices-based routines are launched in the cloud and call upon server resources only as needed.

The idea is to remove infrastructure concerns from the code, thereby enabling microservices to interact more freely with each other and to scale as needed. The user pays only for server resources as they are provisioned, without any costs associated with idle capacity.

Amazon Web Services’ Lambda is an example of serverless computing. It’s used to extend other AWS services with custom logic that runs in response to events, such as API calls, storage updates and database updates.
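
A function that reacts to a storage update can be as small as the hedged Python sketch below; the handler name is the conventional default and the per-object logic is purely illustrative.

    # Sketch: a minimal AWS Lambda handler, in Python, reacting to an S3
    # ("storage update") event. What you do with each object is illustrative.
    def lambda_handler(event, context):
        # An S3 event delivers one or more records describing changed objects.
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            print("Object updated: s3://%s/%s" % (bucket, key))
            # ...custom logic here: validate, transform, notify, and so on.
        return {"statusCode": 200}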

While still a fledgling technology, serverless computing has great potential to enable the development of applications that are far more scalable and flexible than those that are bound by servers or VMs. Containers and microservices will be key to the development of this new model.

Security

Having a robust but flexible security architecture is integral to the success of these technologies. The use of containers and microservices may significantly increase the number of instances running in your organization compared to virtual machines. This requires attention to security policies and the physical location of containers. For example, without proper security you would want to avoid running a public web server and an internal financial application in containers on the same server, because someone who compromises the web server to gain administrative privileges might be able to access data in the financial application.

Containers increase the complexity of the computing infrastructure because they can be dynamically orchestrated across services or even across multiple clouds. Self-provisioning means that administrators don’t necessarily know which containers are running, and a container’s IP address may be invisible outside of the local host.

Containers are different from virtual machines in the area of security. They use similar security features to LXC containers, which are an operating-system-level virtualization method for running multiple isolated Linux systems on a control host using a single Linux kernel. When a container is started, it creates a set of namespaces and control groups. Namespaces ensure that processes running within a container cannot see or interfere with processes running in other containers. Each container also has its own network stack, which prevents privileged access to the sockets or interfaces of another container.

Containers can interact with each other through specified ports for actions like pinging, sending and receiving packets and establishing TCP connections. All of this can be regulated by security policies. In effect, containers are just like physical machines connected through a common Ethernet switch.
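
As a hedged sketch of that kind of regulation, the snippet below uses the Docker SDK for Python to place two containers on separate user-defined bridge networks so the web container has no direct path to the finance container; all names and images are illustrative.

    # Sketch: segmenting containers with separate user-defined networks, so
    # that only containers on the same network can reach each other. Uses the
    # Docker SDK for Python; network, container and image names are illustrative.
    import docker

    client = docker.from_env()

    client.networks.create("frontend-net", driver="bridge")
    client.networks.create("backend-net", driver="bridge")

    # The public web container and the internal finance container land on
    # different networks, so the web container cannot reach the finance one.
    client.containers.run("nginx:1.25", detach=True,
                          name="web", network="frontend-net")
    client.containers.run("internal/finance-app:latest", detach=True,
                          name="finance", network="backend-net")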

Stateful containers present somewhat more complicated security considerations because they connect directly to underlying storage. This presents the possibility that a rogue container could intercept read or write operations from a neighbor and compromise privileges.

Using a persistent data store with security built-in can minimize risk. The data store should have the following features:

  • A pre-built, certified container image with pre-defined permissions. This image should be used as a template for any new containers so that new security issues aren’t introduced.
  • Security tickets. Users can pass a MapR ticket file into the container at runtime with all data access authorized and audited according to the authenticated identity of the ticket file. This ensures that operations are performed as the authenticated user. A different ticket should be created for each container that is launched (see the sketch after this list).
  • Secure authentication at the container level. This ensures that containerized applications only have access to data for which they are authorized.
  • Encryption. Any storage- or network-related communications should be encrypted.
  • Configuration via Dockerfile scripts. This can be used as a basis for defining security privileges with the flexibility to customize the image for specific application needs.
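
As a hedged illustration of the ticket-passing item above, the snippet below mounts a per-container ticket file read-only and points the application at it through an environment variable; the image, paths and variable name are assumptions for illustration rather than documented MapR values.

    # Sketch: passing a per-container ticket file into a container at runtime
    # with the Docker SDK for Python. The image name, file paths and the
    # environment variable are illustrative assumptions.
    import docker

    client = docker.from_env()

    container = client.containers.run(
        "analytics-app:latest",                    # placeholder image
        detach=True,
        volumes={"/secure/tickets/app1.ticket":    # one ticket per container
                 {"bind": "/tmp/app1.ticket", "mode": "ro"}},
        environment={"TICKETFILE_LOCATION": "/tmp/app1.ticket"},
    )
    print(container.id)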

Microservices bring their own brand of security challenges. Instead of protecting a few monolithic applications, administrators must attend to a much larger number of federated services, all communicating with one another and creating a large amount of network traffic. Service discovery is a capability that enables administrators to automatically identify new services by pinpointing real-time service interactions and performance.

Cluster-based micro-segmentation is another useful tool. Network segments can be set up with their own security policies at a high level – for example, separating the production environment from the development environment – or in a more granular fashion, such as governing interactions between a CRM system and customer financial information. These policies are enforced at the cluster level.
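
As a hedged sketch of the more granular case, the snippet below uses the Kubernetes Python client to create a NetworkPolicy that only lets pods labeled as the CRM application reach the customer-finance pods; the namespace and label values are placeholders.

    # Sketch: a cluster-enforced segmentation rule built with the Kubernetes
    # Python client. Only pods labeled app=crm may reach pods labeled
    # app=customer-finance; the namespace and labels are placeholders.
    from kubernetes import client, config

    config.load_kube_config()

    policy = client.V1NetworkPolicy(
        metadata=client.V1ObjectMeta(name="allow-crm-to-finance"),
        spec=client.V1NetworkPolicySpec(
            pod_selector=client.V1LabelSelector(
                match_labels={"app": "customer-finance"}),
            policy_types=["Ingress"],
            ingress=[client.V1NetworkPolicyIngressRule(
                _from=[client.V1NetworkPolicyPeer(
                    pod_selector=client.V1LabelSelector(
                        match_labels={"app": "crm"}))],
            )],
        ),
    )

    client.NetworkingV1Api().create_namespaced_network_policy(
        namespace="production", body=policy)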

Automation is also the security administrator’s friend. The complexity of a containerized microservices environment naturally lends itself to human error. By using automation for tasks such as defining policies and managing SSL certificates, that risk is significantly reduced.

In the early days of the container wave, security was considered to be a weak point of the technology. Much progress has been made in just the past two years, though. By using the techniques noted above, your containerized microservices environment should be no less secure than your existing VMs.