Mastering Kubernetes: Large scale container deployment and management

Gigi Sayfan
Paperback May 2017 426 pages 1st Edition

Chapter 1. Understanding Kubernetes Architecture

Kubernetes is a big open source project with a lot of code and a lot of functionality. You have probably read about Kubernetes, and maybe even dipped your toes in and used it in a side project or maybe even at work. But to understand what Kubernetes is all about, how to use it effectively, and what the best practices are, requires much more. In this chapter, we will build together the foundation necessary to utilize Kubernetes to its full potential. We will start by understanding what container orchestration means. Then we will cover important Kubernetes concepts that will form the vocabulary we will use throughout the book. After that, we will dive into the architecture of Kubernetes proper and look at how it enables all the capabilities Kubernetes provides to its users. Then, we will discuss the various runtimes and container engines that Kubernetes supports (Docker is just one option), and finally, we will discuss the role of Kubernetes in the full continuous integration and deployment pipeline.

At the end of this chapter, you will have a solid understanding of container orchestration, what problems Kubernetes addresses, the rationale for Kubernetes design and architecture, and the different runtimes it supports. You'll also be familiar with the overall structure of the open source repository and be ready to jump in and find answers to any question.

Understanding container orchestration

The primary responsibility of Kubernetes is container orchestration. That means making sure that all the containers that execute various workloads are scheduled to run on physical or virtual machines. The containers must be packed efficiently following the constraints of the deployment environment and the cluster configuration. In addition, Kubernetes must keep an eye on all running containers and replace dead, unresponsive, or otherwise unhealthy containers. Kubernetes provides many more capabilities that you will learn about in the following chapters. In this section, the focus is on containers and their orchestration.

Physical machines, virtual machines, and containers

It all starts and ends with hardware. In order to run your workloads, you need some real hardware provisioned. That includes actual physical machines, with certain compute capabilities (CPUs or cores), memory, and some local persistent storage (spinning disks or SSDs). In addition, you will need some shared persistent storage and to hook up all these machines using networking so they can find and talk to each other. At this point, you can either run multiple virtual machines on the physical machines or stay at the bare-metal level (no virtual machines). Kubernetes can be deployed on a bare-metal cluster (real hardware) or on a cluster of virtual machines. Kubernetes in turn can orchestrate the containers it manages directly on bare-metal or on virtual machines. In theory, a Kubernetes cluster can be composed of a mix of bare-metal and virtual machines, but this is not very common.

Containers in the cloud

Containers are ideal to package microservices because, while providing isolation to the microservice, they are very lightweight and you don't incur a lot of overhead when deploying many microservices as you do with virtual machines. That makes containers ideal for cloud deployment, where allocating a whole virtual machine for each microservice would be cost prohibitive.

All major cloud providers, such as AWS, GCE, and Azure, provide container hosting services these days. Some of them, such as Google's GKE, are based on Kubernetes. Others, such as Microsoft Azure's container service, are based on other solutions (Apache Mesos). By the way, AWS has the ECS (the containers service over EC2), which uses their own orchestration solution. The great thing about Kubernetes is that it can be deployed on all those clouds. Kubernetes has a cloud provider interface that allows any cloud provider to implement it and integrate Kubernetes seamlessly.

Cattle versus pets

In the olden days, when systems were small, each server had a name. Developers and users knew exactly what software was running on each machine. I remember that, in many of the companies I worked for, we had multi-day discussions to decide on a naming theme for our servers. For example, composers and Greek mythology characters were popular choices. Everything was very cozy. You treated your servers like beloved pets. When a server died it was a major crisis. Everybody scrambled to try to figure out where to get another server, what was even running on the dead server, and how to get it working on the new server. If the server stored some important data, then hopefully you had an up-to-date backup and maybe you'd even be able to recover it.

Obviously, that approach doesn't scale. When you have a few tens or hundreds of servers, you must start treating them like cattle. You think about the collective and not individuals. You may still have some pets (for example, your build machines), but your web servers are just cattle.

Kubernetes takes the cattle approach to the extreme and takes full responsibility for allocating containers to specific machines. You don't need to interact with individual machines (nodes) most of the time. This works best for stateless workloads. For stateful applications, the situation is a little different, but Kubernetes provides a solution called StatefulSet, which we'll discuss soon.

In this section, we covered the idea of container orchestration and discussed the relationships between hosts (physical or virtual) and containers, as well as the benefits of running containers in the cloud, and finished with a discussion about cattle versus pets. In the following section, we will get to know the world of Kubernetes and learn its concepts and terminology.

Kubernetes concepts

In this section, I'll briefly introduce many important Kubernetes concepts and give you some context as to why they are needed and how they interact with other concepts. The goal is to get familiar with these terms and concepts. Later, we will see how these concepts are woven together to achieve awesomeness. You can consider many of these concepts as building blocks. Some of the concepts, such as node and master, are implemented as a set of Kubernetes components. These components are at a different abstraction level, and I discuss them in detail in a dedicated section, Kubernetes components.

Here is the famous Kubernetes architecture diagram:

[Figure: Kubernetes concepts]

Cluster

A cluster is a collection of hosts, storage, and networking resources that Kubernetes uses to run the various workloads that comprise your system. Note that your entire system may consist of multiple clusters. We will discuss this advanced use case of federation in detail later.

Node

A node is a single host. It may be a physical or virtual machine. Its job is to run pods. Each Kubernetes node runs several Kubernetes components, such as a kubelet and a kube proxy. Nodes are managed by a Kubernetes master. The nodes are the worker bees of Kubernetes and shoulder all the heavy lifting. In the past, they were called minions. If you read some old documentation or articles, don't get confused. Minions are nodes.

Master

The master is the control plane of Kubernetes. It consists of several components, such as an API server, a scheduler, and a controller manager. The master is responsible for the global, cluster-level scheduling of pods and handling of events. Usually, all the master components are set up on a single host. When considering high-availability scenarios or very large clusters, you will want to have master redundancy. I will discuss highly available clusters in detail in Chapter 4, High Availability and Scaling.

Pod

A pod is the unit of work in Kubernetes. Each pod contains one or more containers. The containers in a pod are always scheduled together (they always run on the same machine). All the containers in a pod have the same IP address and port space; they can communicate using localhost or standard inter-process communication. In addition, all the containers in a pod can have access to shared local storage on the node hosting the pod. The shared storage will be mounted on each container. Pods are an important feature of Kubernetes. It is possible to run multiple applications inside a single Docker container by having something like supervisor as the main Docker application that runs multiple processes, but this practice is often frowned upon, for the following reasons:

  • Transparency: Making the containers within the pod visible to the infrastructure enables the infrastructure to provide services to those containers, such as process management and resource monitoring. This facilitates a number of conveniences for users.
  • Decoupling software dependencies: The individual containers may be versioned, rebuilt, and redeployed independently. Kubernetes may even support live updates of individual containers someday.
  • Ease of use: Users don't need to run their own process managers, worry about signal and exit-code propagation, and so on.
  • Efficiency: Because the infrastructure takes on more responsibility, containers can be more lightweight.

Pods provide a great solution for managing groups of closely related containers that depend on each other and need to co-operate on the same host to accomplish their purpose. It's important to remember that pods are considered ephemeral, throwaway entities that can be discarded and replaced at will. Any pod storage is destroyed with its pod. Each pod gets a unique ID (UID), so you can still distinguish between them if necessary.

Label

Labels are key-value pairs that are used to group together sets of objects, very often pods. This is important for several other concepts, such as replication controllers, replica sets, and services, which operate on dynamic groups of objects and need to identify the members of the group. There is a many-to-many (N×N) relationship between objects and labels. Each object may have multiple labels, and each label may be applied to different objects. There are certain restrictions by design on labels. Each label on an object must have a unique key. The label key must adhere to a strict syntax. It has two parts: prefix and name. The prefix is optional. If it exists, then it is separated from the name by a forward slash (/) and it must be a valid DNS sub-domain. The prefix must be 253 characters long at most. The name is mandatory and must be 63 characters long at most. Names must start and end with an alphanumeric character (a-z, A-Z, 0-9) and contain only alphanumeric characters, dots, dashes, and underscores. Values follow the same restrictions as names. Note that labels are dedicated to identifying objects and not to attaching arbitrary metadata to objects. This is what annotations are for (see the following section).
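The label key rules above can be expressed as a small check. The following Go sketch is only an illustration: validLabelKey is a hypothetical helper, not a Kubernetes library function, and it checks only the length of the prefix rather than validating it as a DNS sub-domain.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namePattern encodes the name rule from the text: start and end with an
// alphanumeric character, with dots, dashes, and underscores allowed inside.
var namePattern = regexp.MustCompile(`^[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?$`)

// validLabelKey is a hypothetical helper (an assumption for illustration).
// It enforces the length limits and the name syntax. The prefix is checked
// for length only, not for DNS sub-domain validity.
func validLabelKey(key string) bool {
	name := key
	if i := strings.Index(key, "/"); i >= 0 {
		prefix := key[:i]
		name = key[i+1:]
		if len(prefix) == 0 || len(prefix) > 253 {
			return false
		}
	}
	return len(name) <= 63 && namePattern.MatchString(name)
}

func main() {
	for _, k := range []string{"role", "example.com/role", "-bad", "also_bad_"} {
		fmt.Printf("%-20s valid=%v\n", k, validLabelKey(k))
	}
}
```

Because values follow the same restrictions as names, the same pattern and 63-character limit could be reused to check label values.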

Annotation

Annotations let you associate arbitrary metadata with Kubernetes objects. Kubernetes just stores the annotations and makes their metadata available. Unlike labels, they don't have strict restrictions about allowed characters and size limits. In my experience, you always need such metadata for complicated systems, and it is nice that Kubernetes recognizes this need and provides it out of the box so you don't have to come up with your own separate metadata store and mapping of objects to their metadata.

We've covered most, if not all, of Kubernetes' concepts; there are a few more I mentioned briefly. In the next section, we will continue our journey into Kubernetes architecture by looking into its design motivations, the internals and implementation, and even peek at the source code.

Label selector

Label selectors are used to select objects based on their labels. Equality-based selectors specify a key name and a value. There are two operators, = (or ==) and !=, for equality or inequality based on the value. For example:

role = webserver

This will select all objects that have that label key and value.

Label selectors can have multiple requirements separated by a comma. For example:

role = webserver, application != foo

Set-based selectors extend the capabilities and allow selection based on multiple values:

role in (webserver, backend)
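To make the selector semantics concrete, here is a minimal Go sketch of how equality-based and set-based requirements could be evaluated against an object's labels. The requirement type and its functions are hypothetical and simplified; they are not the Kubernetes implementation. Comma-separated requirements are modeled as a slice in which every entry must match.

```go
package main

import "fmt"

// requirement is a hypothetical, simplified model of one selector clause.
type requirement struct {
	key    string
	op     string   // "=", "!=", or "in"
	values []string // one value for "=" and "!=", one or more for "in"
}

// matches reports whether the labels satisfy this single requirement.
func (r requirement) matches(labels map[string]string) bool {
	v, ok := labels[r.key]
	switch r.op {
	case "=":
		return ok && v == r.values[0]
	case "!=":
		// An object that lacks the key also satisfies an inequality.
		return !ok || v != r.values[0]
	case "in":
		for _, want := range r.values {
			if ok && v == want {
				return true
			}
		}
	}
	return false
}

// matchesAll treats the requirements as comma-separated: all must hold.
func matchesAll(labels map[string]string, reqs []requirement) bool {
	for _, r := range reqs {
		if !r.matches(labels) {
			return false
		}
	}
	return true
}

func main() {
	pod := map[string]string{"role": "webserver", "application": "bar"}
	selector := []requirement{
		{key: "role", op: "=", values: []string{"webserver"}},
		{key: "application", op: "!=", values: []string{"foo"}},
	}
	fmt.Println(matchesAll(pod, selector))
}
```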

Replication controller and replica set

Replication controllers and replica sets both manage a group of pods identified by a label selector and ensure that a certain number is always up and running. The main difference between them is that replication controllers test for membership using equality-based selectors only, whereas replica sets can also use set-based selection. Replica sets are newer and designated as the next-generation replication controllers. They are still in beta and are not fully supported by all the tools at the time of writing. Hopefully, by the time you read this, they will be full-fledged members.

Kubernetes guarantees that you will always have the same number of pods running as you specified in a replication controller or a replica set. Whenever the number drops due to a problem with the hosting node or the pod itself, Kubernetes will fire up new instances. Note that, if you manually start pods and exceed the specified number, the replication controller will kill some extra pods.
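The behavior described above boils down to comparing the desired count with the number of running pods and acting on the difference. The following Go sketch shows that decision only; plan is a hypothetical function, and a real controller also has to select which pods to remove and handle pods that are still starting.

```go
package main

import "fmt"

// plan is a hypothetical helper: given the desired replica count and the
// number of matching pods currently running, it returns how many pods to
// create and how many to remove. At most one of the two is non-zero.
func plan(desired, running int) (create, remove int) {
	switch {
	case running < desired:
		return desired - running, 0
	case running > desired:
		return 0, running - desired
	}
	return 0, 0
}

func main() {
	// A node failure took out two of five pods.
	create, remove := plan(5, 3)
	fmt.Printf("create=%d remove=%d\n", create, remove)

	// Someone manually started two extra pods with matching labels.
	create, remove = plan(5, 7)
	fmt.Printf("create=%d remove=%d\n", create, remove)
}
```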

Replication controllers used to be central to many workflows, such as rolling updates and running one-off jobs. As Kubernetes evolved, it introduced direct support for many of these workflows, with dedicated objects such as Deployment, Job, and DaemonSet. We will meet them all later.

Service

Services are used to expose some functionality to users or other services. They usually encompass a group of pods, usually identified by – you guessed it – a label. You can have services that provide access to external resources, or to pods you control directly at the virtual IP level. Native Kubernetes services are exposed through convenient endpoints. Note that services operate at layer 4 (TCP/UDP). Kubernetes 1.2 added the Ingress object, which provides access to HTTP objects. More on that later. Services are published or discovered via one of two mechanisms: DNS or environment variables. Services can be load-balanced by Kubernetes. However, developers can choose to manage load balancing themselves in the case of services that use external resources or require special treatment.

There are many gory details associated with IP addresses, virtual IP addresses, and port spaces. We will discuss them in depth in a future chapter.

Volume

Local storage on the pod is ephemeral and goes away with the pod. Sometimes that's all you need, if the goal is just to exchange data between containers of the pod, but sometimes it's important for the data to outlive the pod, or it's necessary to share data between pods. The volume concept supports that need. Note that, while Docker has a volume concept too, it is quite limited (although getting more powerful). Kubernetes uses its own separate volumes. Kubernetes also supports additional container runtimes such as rkt, so it couldn't rely on Docker volumes even in principle.

There are many volume types. Kubernetes currently directly supports each volume type. In the future, another layer of indirection may be added and an abstract volume plugin may be developed. The emptyDir volume type mounts a volume on each container that is backed by default by whatever is available on the hosting machine. You can request a memory medium if you want. This storage is deleted when the pod is terminated for any reason. There are many volume types for specific cloud environments, various networked filesystems, and even Git repositories. An interesting volume type is the persistentVolumeClaim, which abstracts the details a little bit and uses the default persistent storage in your environment (typically in a cloud provider).

StatefulSet

Pods come and go, and if you care about their data then you can use persistent storage. That's all good. But sometimes you want Kubernetes to manage a distributed data store such as MySQL Galera. Such clustered stores keep the data distributed across uniquely identified nodes. You can't model that with regular pods and services. Enter StatefulSet. If you remember, earlier I discussed pets versus cattle and how cattle is the way to go. Well, StatefulSet sits somewhere in the middle. StatefulSet ensures (similar to a replication controller) that a given number of pets with unique identities are running at any given time. Pets have the following properties:

  • A stable hostname, available in DNS
  • An ordinal index
  • Stable storage linked to the ordinal and hostname

StatefulSet can help with peer discovery as well as adding or removing pets.
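As a rough illustration of stable identities, the Go sketch below derives one hostname per ordinal index. It assumes the name-ordinal convention (for example, db-0, db-1); the petHostnames function and the "db" name are hypothetical.

```go
package main

import "fmt"

// petHostnames is a hypothetical helper that derives a stable hostname for
// each ordinal index, assuming the "<set name>-<ordinal>" convention.
func petHostnames(setName string, replicas int) []string {
	names := make([]string, 0, replicas)
	for ordinal := 0; ordinal < replicas; ordinal++ {
		names = append(names, fmt.Sprintf("%s-%d", setName, ordinal))
	}
	return names
}

func main() {
	// Scaling from 3 to 4 replicas adds one name and leaves the first three
	// unchanged, which is what lets storage stay linked to each identity.
	fmt.Println(petHostnames("db", 3))
	fmt.Println(petHostnames("db", 4))
}
```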

Secret

Secrets are small objects that contain sensitive info such as credentials and tokens. They are stored as plaintext in etcd, accessible by the Kubernetes API server, and can be mounted as files into pods (using dedicated secret volumes that piggyback on regular data volumes) that need access to them. The same secret can be mounted into multiple pods. Kubernetes itself creates secrets for its components, and you can create your own secrets. Another approach is to use secrets as environment variables. Note that secrets in a pod are always stored in memory (tmpfs in the case of mounted secrets) for better security.

Name

Each object in Kubernetes is identified by a UID and a name. The name is used to refer to the object in API calls. Names should be up to 253 characters long and use lowercase alphanumeric characters, dash (-) and dot (.). If you delete an object, you can create another object with the same name as the deleted object, but the UIDs must be unique across the lifetime of the cluster. The UIDs are generated by Kubernetes, so you don't have to worry about it.

Namespace

A namespace is a virtual cluster. You can have a single physical cluster that contains multiple virtual clusters segregated by namespaces. Each virtual cluster is totally isolated from other virtual clusters, and they can only communicate through public interfaces. Note that Node objects and persistent volumes don't live in a namespace. Kubernetes may schedule pods from different namespaces to run on the same node. Likewise, pods from different namespaces can use the same persistent storage.

When using namespaces, you have to consider network policies and resource quotas to ensure proper access and distribution of the physical cluster resources.

Diving into Kubernetes architecture in depth

Kubernetes has very ambitious goals. It aims to manage and simplify the orchestration, deployment, and management of distributed systems across a wide range of environments and cloud providers. It provides many capabilities and services that should work across all that diversity, while evolving and remaining simple enough for mere mortals to use. This is a tall order. Kubernetes achieves this by following a crystal-clear, high-level design and well-thought-out architecture that promotes extensibility and pluggability. Many parts of Kubernetes are still hard-coded or environment-aware, but the trend is to refactor them into plugins and keep the core generic and abstract. In this section, we will peel Kubernetes like an onion, starting with the various distributed systems design patterns and how Kubernetes supports them, then go over the surface of Kubernetes, which is its set of APIs, and then take a look at the actual components that comprise Kubernetes. Finally, we will take a quick tour of the source-code tree to gain even better insight into the structure of Kubernetes itself.

At the end of this section, you will have a solid understanding of Kubernetes architecture and implementation, and why certain design decisions were made.

Distributed systems design patterns

All happy (working) distributed systems are alike, to paraphrase Tolstoy in Anna Karenina. That means that, to function properly, all well-designed distributed systems must follow some best practices and principles. Kubernetes doesn't want to be just a management system. It wants to support and enable these best practices and provide high-level services to developers and administrators. Let's look at some of those described as design patterns.

Sidecar pattern

The sidecar pattern is about co-locating another container in a pod in addition to the main application container. The application container is unaware of the sidecar container and just goes about its business. A great example is a central logging agent. Your main container can just log to stdout, but the sidecar container will send all logs to a central logging service where they will be aggregated with the logs from the entire system. The benefits of using a sidecar container versus adding central logging to the main application container are enormous. First, applications are not burdened anymore with central logging, which could be a nuisance. If you want to upgrade or change your central logging policy or switch to a totally new provider, you just need to update the sidecar container and deploy it. None of your application containers change, so you can't break them by accident.

Ambassador pattern

The ambassador pattern is about representing a remote service as if it were local and possibly enforcing some policy. A good example of the ambassador pattern is if you have a Redis cluster with one master for writes and many replicas for reads. A local ambassador container can serve as a proxy and expose Redis to the main application container on the localhost. The main application container simply connects to Redis on localhost:6379 (Redis default port), but it connects to the ambassador running in the same pod, which filters the requests, and sends write requests to the real Redis master and read requests randomly to one of the read replicas. Just like with the sidecar pattern, the main application has no idea what's going on. That can help a lot when testing against a real local Redis. Also, if the Redis cluster configuration changes, only the ambassador needs to be modified; the main application remains blissfully unaware.
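The routing decision an ambassador makes can be sketched in a few lines of Go. This is a simplified illustration, not a working Redis proxy: the ambassador type, the backend addresses, and the short list of write commands are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// writeCommands is a small illustrative subset of Redis commands that
// modify data; a real ambassador would need the complete list.
var writeCommands = map[string]bool{"SET": true, "DEL": true, "INCR": true, "EXPIRE": true}

// ambassador holds hypothetical backend addresses. pick chooses a replica
// index in [0, n) and is injectable so the routing can be tested.
type ambassador struct {
	master   string
	replicas []string
	pick     func(n int) int
}

// route returns the backend address that should receive the given command:
// writes go to the master, reads go to one of the replicas.
func (a ambassador) route(command string) string {
	if writeCommands[strings.ToUpper(command)] || len(a.replicas) == 0 {
		return a.master
	}
	return a.replicas[a.pick(len(a.replicas))]
}

func main() {
	a := ambassador{
		master:   "redis-master:6379",
		replicas: []string{"redis-replica-0:6379", "redis-replica-1:6379"},
		pick:     rand.Intn,
	}
	for _, cmd := range []string{"SET", "get", "DEL"} {
		fmt.Printf("%s -> %s\n", cmd, a.route(cmd))
	}
}
```

The main application never sees these addresses; it keeps talking to localhost:6379, and only the ambassador's configuration changes when the cluster topology does.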

Adapter pattern

The adapter pattern is about standardizing output from the main application container. Consider the case of a service that is being rolled out incrementally: it may generate reports in a format that doesn't conform to the previous version. Other services and applications that consume that output haven't been upgraded yet. An adapter container can be deployed in the same pod with the new application container and massage its output to match the old version until all consumers have been upgraded. The adapter container shares the filesystem with the main application container, so it can watch the local filesystem, and whenever the new application writes something, it immediately adapts it.

Multi-node patterns

The single-node patterns are all supported directly by Kubernetes via pods. Multi-node patterns such as leader election, work queues, and scatter-gather are not supported directly, but composing pods with standard interfaces to accomplish them is a viable approach with Kubernetes.

The Kubernetes APIs

If you want to understand the capabilities of a system and what it provides, you must pay a lot of attention to its API. The API provides a comprehensive view of what you can do with the system as a user. Kubernetes exposes several sets of REST APIs for different purposes and audiences. Some of the APIs are used primarily by tools and some can be used directly by developers. An important aspect of the APIs is that they are under constant development. The Kubernetes developers keep this manageable by extending the APIs (adding new objects and new fields to existing objects) and avoiding renaming or dropping existing objects and fields. In addition, all API endpoints are versioned, and often have an alpha or beta notation too. For example:

/api/v1
/api/v2alpha1

You can access the API through the kubectl CLI, via client libraries, or directly through REST API calls. There are elaborate authentication and authorization mechanisms that we will explore in a later chapter. At this point, let's get a glimpse into the surface area of the APIs.

Kubernetes API

This is the main API of Kubernetes. It is huge. All the concepts we discussed before, and many auxiliary concepts, have corresponding API objects and operations. If you have the right permissions, you can list, get, create, and update objects. Here is the documentation of one of the most common operations, getting a list of all the pods:

GET /api/v1/pods

It accepts various query parameters (all optional):

  • pretty: If true, the output is pretty printed
  • labelSelector: A selector expression to limit the result
  • watch: If true, watch for changes and return a stream of events
  • resourceVersion: With watch, returns only events that occurred after that version
  • timeoutSeconds: Timeout for the list or watch operation
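As a sketch of how these query parameters combine into a request, the following Go code builds the URL for the list operation. The listPodsURL function and the API server address are hypothetical, and the sketch stops short of sending the request, since that requires the authentication discussed later.

```go
package main

import (
	"fmt"
	"net/url"
)

// listPodsURL is a hypothetical helper that builds the URL for
// GET /api/v1/pods with an optional label selector and watch flag.
func listPodsURL(apiServer, labelSelector string, watch bool) string {
	query := url.Values{}
	if labelSelector != "" {
		query.Set("labelSelector", labelSelector)
	}
	if watch {
		query.Set("watch", "true")
	}
	u := apiServer + "/api/v1/pods"
	// Encode escapes the values, so "=" inside the selector becomes "%3D".
	if encoded := query.Encode(); encoded != "" {
		u += "?" + encoded
	}
	return u
}

func main() {
	// The server address is a placeholder, not a real endpoint.
	fmt.Println(listPodsURL("https://k8s.example.com", "role=webserver", true))
	fmt.Println(listPodsURL("https://k8s.example.com", "", false))
}
```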

Autoscaling API

The autoscaling API is very focused and lets you control the horizontal pod autoscaler, which manages a group of pods based on CPU utilization and even application-specific metrics. You can list, query, create, update, and destroy autoscaler objects using the /apis/autoscaling/v1 endpoint.
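To give a feel for what scaling on CPU utilization means, here is a minimal Go sketch of the ratio rule described in the Kubernetes autoscaler documentation: the desired replica count is the current count scaled by the ratio of observed to target utilization, rounded up. The desiredReplicas function is a hypothetical name, and the sketch ignores tolerances, minimum and maximum bounds, and stabilization.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas is a hypothetical helper implementing only the basic
// ratio rule: ceil(current * observed / target). Utilization values are
// percentages, and target must be greater than zero.
func desiredReplicas(current int, observed, target float64) int {
	return int(math.Ceil(float64(current) * observed / target))
}

func main() {
	// Four pods at 90% CPU with a 50% target: scale up.
	fmt.Println(desiredReplicas(4, 90, 50))
	// Four pods at 25% CPU with a 50% target: scale down.
	fmt.Println(desiredReplicas(4, 25, 50))
}
```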

Batch API

The batch API lets you manage jobs. Jobs are pods that perform some activity and terminate. Unlike regular pods managed by a replication controller, they are supposed to terminate when the job is done. The batch API uses the pod template to specify jobs and then allows you, as usual, to list, query, create, and delete jobs through the /apis/batch/v1 endpoint.

Kubernetes components

A Kubernetes cluster has several master components used to control the cluster, as well as node components that run on each cluster node. Let's get to know all these components and how they work together.

Master components

The master components typically run on one node, but in a highly available or very large cluster, they may be spread across multiple nodes.

API server

The kube API server exposes the Kubernetes REST API. It can easily scale horizontally as it is stateless and stores all the data in the etcd cluster. The API server is the embodiment of the Kubernetes control plane.

Etcd

Etcd is a highly reliable distributed data store. Kubernetes uses it to store the entire cluster state. In a small, transient cluster, a single instance of etcd can run on the same node with all the other master components. But for more substantial clusters, it is typical to have a 3-node or even 5-node etcd cluster for redundancy and high availability.
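The preference for 3-node and 5-node clusters follows from majority arithmetic. The Go sketch below assumes a majority quorum, which is how etcd's Raft-based consensus works; the function names are hypothetical.

```go
package main

import "fmt"

// quorum returns the number of members that must agree, assuming a simple
// majority of the cluster.
func quorum(members int) int {
	return members/2 + 1
}

// tolerated returns how many members can fail while a majority remains.
func tolerated(members int) int {
	return members - quorum(members)
}

func main() {
	for _, n := range []int{1, 2, 3, 4, 5} {
		fmt.Printf("members=%d quorum=%d tolerated failures=%d\n", n, quorum(n), tolerated(n))
	}
}
```

Under this assumption, an even-sized cluster tolerates no more failures than the odd-sized cluster one member smaller, which is why odd sizes are the usual choice.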

Controller manager

The controller manager is a collection of various managers rolled up into one binary. It contains the replication controller, the pod controller, the services controller, the endpoints controller, and others. All these managers watch over the state of the cluster via the API and their job is to steer the cluster into the desired state.

Scheduler

The kube-scheduler is responsible for scheduling pods onto nodes. This is a very complicated task as it needs to consider multiple interacting factors, such as the following:

  • Resource requirements
  • Service requirements
  • Hardware/software policy constraints
  • Affinity and anti-affinity specifications
  • Data locality
  • Deadlines
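To illustrate only the first factor in the list, resource requirements, here is a toy Go sketch that filters out nodes that cannot fit a pod and then prefers the node with the most free CPU. This is not the kube-scheduler algorithm; the types, the pickNode function, and the scoring rule are assumptions made for the example.

```go
package main

import "fmt"

// node and podRequest are hypothetical, simplified types.
type node struct {
	name      string
	freeCPU   int // millicores
	freeMemMB int
}

type podRequest struct {
	cpu   int // millicores
	memMB int
}

// pickNode filters out nodes that cannot satisfy the request, then returns
// the remaining node with the most free CPU. The boolean is false when no
// node fits.
func pickNode(nodes []node, req podRequest) (string, bool) {
	best := -1
	for i, n := range nodes {
		if n.freeCPU < req.cpu || n.freeMemMB < req.memMB {
			continue // filtered out: not enough resources
		}
		if best == -1 || n.freeCPU > nodes[best].freeCPU {
			best = i
		}
	}
	if best == -1 {
		return "", false
	}
	return nodes[best].name, true
}

func main() {
	nodes := []node{
		{name: "node-a", freeCPU: 500, freeMemMB: 1024},
		{name: "node-b", freeCPU: 2000, freeMemMB: 512},
		{name: "node-c", freeCPU: 1500, freeMemMB: 4096},
	}
	name, ok := pickNode(nodes, podRequest{cpu: 1000, memMB: 2048})
	fmt.Println(name, ok)
}
```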

DNS

Starting with Kubernetes 1.3, a DNS service is part of the standard Kubernetes cluster. It is scheduled as a regular pod. Every service (except headless services) receives a DNS name. Pods can receive a DNS name too. This is very useful for automatic discovery.

Node components

Nodes in the cluster need a couple of components to interact with the cluster master components, receive workloads to execute, and update the cluster on their status.

Proxy

The kube proxy does low-level network housekeeping on each node. It reflects the Kubernetes services locally and can do TCP and UDP forwarding. It finds cluster IPs via environment variables or DNS.

Kubelet

The kubelet is the Kubernetes representative on the node. It oversees communication with the master components and manages the running pods. That includes the following:

  • Download pod secrets from the API server
  • Mount volumes
  • Run the pod's containers (Docker or Rkt)
  • Report the status of the node and each pod
  • Run container liveness probes

In this section, we dug into the guts of Kubernetes and explored its architecture from a very high level of vision and supported design patterns, through its APIs and the components used to control and manage the cluster. In the next section, we will take a quick look at the various runtimes that Kubernetes supports.

Kubernetes runtimes

Kubernetes originally only supported Docker as a container runtime engine. But that is no longer the case. Rkt is another supported runtime engine and there are interesting attempts to work with Hyper.sh containers via Hypernetes. A major design policy is that Kubernetes itself should be completely decoupled from specific runtimes. The interaction between Kubernetes and the runtime is through a relatively generic interface that runtime engines must implement. Most of the communication is using the pod and container concepts and the operations that can be performed on a container. Each runtime engine is responsible for implementing the Kubernetes runtime interface to be compatible.

In this section, you'll get a closer look at the runtime interface and get to know the individual runtime engines. At the end of this section, you'll be able to make a well-informed decision about which runtime engine is appropriate for your use case and under what circumstances you may switch or even combine multiple runtimes in the same system.

The runtime interface

The runtime interface for containers is specified in the Kubernetes project on GitHub. Kubernetes is open source, so we can look at it at the following URL:

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/container/runtime.go

I'll present here snippets from this file without the elaborate comments. Even if you're not a full-fledged programmer and know nothing about the Go language, you should be able to grasp the scope and responsibilities of a runtime engine from the viewpoint of Kubernetes:

Note

A quick note about Go to help you parse the code: The method name comes first, followed by the method's parameters in parentheses. Each parameter is a pair, consisting of a name followed by its type. Finally, the return values are specified. Go allows multiple return values. It is very common to return an error object in addition to the actual result. If everything is OK, the error object will be nil.
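To make the note concrete, here is a minimal sketch of the same signature shape in a standalone program. The Version type and the lookupVersion function are hypothetical and exist only for this illustration; they are not part of the Kubernetes code base.

```go
package main

import (
	"errors"
	"fmt"
)

// Version is a hypothetical stand-in for a runtime's version type.
type Version string

// lookupVersion has one parameter (name, of type string) and two return
// values: the actual result and an error object.
func lookupVersion(name string) (Version, error) {
	if name == "" {
		return "", errors.New("runtime name must not be empty")
	}
	// Everything is OK, so the error object is nil.
	return Version("1.0.0"), nil
}

func main() {
	v, err := lookupVersion("demo")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("version:", v)
}
```

Callers typically check the error first and only then use the result, which is the pattern you will see around most of the methods in the runtime interface.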

type Runtime interface {
  Type() string

  Version() (Version, error)

  APIVersion() (Version, error)

  Status() error

  GetPods(all bool) ([]*Pod, error)
}

The fact that it is an interface means that Kubernetes doesn't provide an implementation. The first group of methods provides general information about the runtime: Type, Version, APIVersion, and Status. You can also get all the pods:

SyncPod(pod *api.Pod, apiPodStatus api.PodStatus, podStatus *PodStatus, pullSecrets []api.Secret, backOff *flowcontrol.Backoff) PodSyncResult

KillPod(pod *api.Pod, runningPod Pod, gracePeriodOverride *int64) error

GetPodStatus(uid types.UID, name, namespace string) (*PodStatus, error)

GetNetNS(containerID ContainerID) (string, error)

GetPodContainerID(*Pod) (ContainerID, error)

GetContainerLogs(pod *api.Pod, containerID ContainerID, logOptions *api.PodLogOptions, stdout, stderr io.Writer) (err error)

DeleteContainer(containerID ContainerID) error

The next group of methods deals mostly with pods, appropriately, as the pod is the main abstraction in the Kubernetes conceptual model. Then there is GetPodContainerID(), which gets you from a pod to a container ID, and a few more container-related methods. The interface ends with three more entries:

  • ContainerCommandRunner
  • ContainerAttacher
  • ImageService

The last three items, ContainerCommandRunner, ContainerAttacher, and ImageService, are interfaces that the runtime interface embeds (Go's form of interface inheritance). This means that whoever implements the runtime interface also needs to implement the methods of these interfaces. The interfaces are defined in the same file. The interface names alone say a lot about what they do: Kubernetes needs to run commands in containers, attach to running containers, and pull container images. I encourage you to peruse this file and get familiar with the code.
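If interface composition in Go is new to you, the following sketch may help. It uses simplified, hypothetical stand-ins (CommandRunner, ImagePuller, and fakeRuntime are invented for this example and are much smaller than the real Kubernetes types) to show that a type satisfies the outer interface only when it also implements the methods of the interfaces listed inside it.

```go
package main

import "fmt"

// All names below are simplified, hypothetical stand-ins, not the real
// Kubernetes types.

// CommandRunner plays the role of ContainerCommandRunner.
type CommandRunner interface {
	RunInContainer(containerID string, cmd []string) (string, error)
}

// ImagePuller plays the role of ImageService.
type ImagePuller interface {
	PullImage(image string) error
}

// Runtime lists the two interfaces above, so an implementation must
// provide their methods in addition to Type and Status.
type Runtime interface {
	Type() string
	Status() error
	CommandRunner
	ImagePuller
}

// fakeRuntime is an in-memory implementation used only for illustration.
type fakeRuntime struct {
	images map[string]bool
}

func (f *fakeRuntime) Type() string  { return "fake" }
func (f *fakeRuntime) Status() error { return nil }

func (f *fakeRuntime) PullImage(image string) error {
	if image == "" {
		return fmt.Errorf("image name must not be empty")
	}
	f.images[image] = true
	return nil
}

func (f *fakeRuntime) RunInContainer(containerID string, cmd []string) (string, error) {
	if len(cmd) == 0 {
		return "", fmt.Errorf("no command given for container %s", containerID)
	}
	return fmt.Sprintf("ran %v in %s", cmd, containerID), nil
}

// newFakeRuntime compiles only because fakeRuntime implements every
// method that Runtime requires, including the listed interfaces' methods.
func newFakeRuntime() Runtime {
	return &fakeRuntime{images: map[string]bool{}}
}

func main() {
	rt := newFakeRuntime()
	if err := rt.PullImage("nginx"); err != nil {
		fmt.Println("pull failed:", err)
		return
	}
	out, err := rt.RunInContainer("c1", []string{"ls"})
	fmt.Println(rt.Type(), out, err)
}
```

Removing any one of the methods on fakeRuntime makes newFakeRuntime fail to compile, which is how Go enforces that a runtime engine covers the whole interface.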

Now that you are familiar at the code level with what Kubernetes considers a runtime engine, let's look at the individual runtime engines briefly.

Docker

Docker is, of course, the 800-pound gorilla of containers. Kubernetes was originally designed to manage only Docker containers; the multi-runtime capability was first introduced in Kubernetes 1.3.

I assume you're very familiar with Docker and what it brings to the table if you are reading this book. Docker enjoys tremendous popularity and growth, but there is also a lot of criticism toward it. Critics often mention the following concerns:

  • Security
  • Difficulty setting up multi-container applications (in particular, networking)
  • Development, monitoring, and logging
  • Limitations of Docker containers running one command
  • Releasing half-baked features too fast

Docker is aware of the criticisms and has addressed some of these concerns. In particular, Docker invested in its Docker swarm product. Docker swarm is a Docker-native orchestration solution that competes with Kubernetes. It is simpler to use than Kubernetes, but it's not as powerful or mature.

Note

Starting with Docker 1.12, swarm mode is included in the Docker daemon natively, which upset some people due to bloat and scope creep. That in turn made more people turn to CoreOS rkt as an alternative solution.

Starting with Docker 1.11, released in April 2016, Docker has changed the way it runs containers. The runtime now uses containerd and runC to run Open Container Initiative (OCI) images in containers:

[Figure: Docker]

Rkt

Rkt is a new container manager from CoreOS (developers of the CoreOS Linux distro, etcd, flannel, and more). The rkt runtime prides itself on its simplicity and strong emphasis on security and isolation. It doesn't have a daemon like the Docker engine and relies on the OS init system, such as systemd, to launch the rkt executable. Rkt can download images (both App Container (appc) images and OCI images), verify them, and run them in containers. Its architecture is much simpler than Docker's.

App container

CoreOS started a standardization effort in December 2014 called appc. It includes a standard image format (ACI), runtime, signing, and discovery. A few months later, Docker started its own standardization effort with OCI. At this point it seems these efforts will converge. This is a great thing as tools, images, and runtimes will be able to interoperate freely. We're not there yet.

Rktnetes

Rktnetes is Kubernetes plus rkt as the runtime engine. Kubernetes is still in the process of abstracting away the runtime engine. Rktnetes is not really a separate product. From the outside, all it takes is running the kubelet on each node with a couple of command-line switches. But, since there are fundamental differences between Docker and rkt, you may run into a variety of issues.

Is rkt ready for production usage?

The integration between rkt and Kubernetes is not totally seamless; there are still some rough spots. My recommendation at this stage (late 2016) is to prefer Docker unless you have a very specific reason to use rkt. If you decide that it's important for your use case to use rkt then you should base your cluster on CoreOS. It is most likely that you will find the best integration with the CoreOS cluster, as well as the best documentation and online support.

Hyper containers

Hyper containers are another option. A Hyper container has a lightweight VM (its own guest kernel) and it runs on bare metal. Instead of relying on Linux cgroups for isolation, it relies on a hypervisor. This approach offers an interesting middle ground between standard bare-metal clusters, which are difficult to set up, and public clouds, where containers are deployed on heavyweight VMs.

Hypernetes

Hypernetes is a multi-tenant Kubernetes distribution that uses Hyper containers as well as some OpenStack components for authentication, persistent storage, and networking. Since containers don't share the host kernel, it is safe to run containers of different tenants on the same physical host:

[Figure: Hypernetes]

In this section, we've covered the various runtime engines that Kubernetes supports as well as the trend toward standardization and convergence. In the next section, we'll take a step back and look at the big picture, and how Kubernetes fits into the CI/CD pipeline.

Continuous integration and deployment

Kubernetes is a great platform for running your microservice-based applications. But, at the end of the day, it is an implementation detail. Users, and often most developers, may not be aware that the system is deployed on Kubernetes. But Kubernetes can change the game and make possible things that were previously too difficult.

In this section, we'll explore the CI/CD pipeline and what Kubernetes brings to the table. At the end of this section you'll be able to design CI/CD pipelines that take advantage of Kubernetes properties such as easy scaling and development-production parity to improve the productivity and robustness of day-to-day development and deployment.

What is a CI/CD pipeline?

A CI/CD pipeline is a set of steps that takes a set of changes by developers or operators that modify the code, data, or configuration of a system, tests them, and deploys them to production. Some pipelines are fully automated and some are semi-automated with human checks. In large organizations, there may be test and staging environments to which changes are deployed automatically, while release to production requires manual intervention. The following diagram describes a typical pipeline.

It may be worth mentioning that developers can be completely isolated from production infrastructure. Their interface is just a Git workflow; a good example is Deis Workflow (a PaaS on Kubernetes, similar to Heroku):

[Figure: What is a CI/CD pipeline?]
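The idea of a semi-automated pipeline can be sketched in a few lines of Go. This is only an illustration under simple assumptions: the Step type, the step names, and the approved flag are all hypothetical, and a real pipeline would run actual build, test, and deploy commands instead of the no-op function used here.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is one stage of a hypothetical pipeline. Manual marks stages that
// need human approval before they run.
type Step struct {
	Name   string
	Manual bool
	Run    func() error
}

// errAwaitingApproval signals that the pipeline paused at a manual gate.
var errAwaitingApproval = errors.New("awaiting manual approval")

// runPipeline executes steps in order and returns the names of the steps
// that completed. It stops at the first failing step or at the first
// manual step that has not been approved.
func runPipeline(steps []Step, approved bool) ([]string, error) {
	var done []string
	for _, s := range steps {
		if s.Manual && !approved {
			return done, fmt.Errorf("%s: %w", s.Name, errAwaitingApproval)
		}
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

// noop stands in for real work such as building an image or running tests.
func noop() error { return nil }

// defaultSteps returns an example pipeline whose last step is gated.
func defaultSteps() []Step {
	return []Step{
		{Name: "build", Run: noop},
		{Name: "test", Run: noop},
		{Name: "deploy-staging", Run: noop},
		{Name: "deploy-production", Manual: true, Run: noop},
	}
}

func main() {
	done, err := runPipeline(defaultSteps(), false)
	fmt.Println("completed:", done, "error:", err)
}
```

Without approval, the run stops before the production step and reports which steps completed. Passing true for approved lets the same pipeline run to the end, which mirrors the automatic staging and manual production release described above.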

Designing a CI/CD pipeline for Kubernetes

When your deployment target is a Kubernetes cluster, you should rethink some traditional practices. For starters, packaging is different. You need to bake images for your containers. Reverting code changes is easy and nearly instantaneous when you use smart labeling. It gives you a lot of confidence that, if a bad change somehow slips through the testing net, you'll be able to revert to the previous version immediately. But you want to be careful there. Schema changes and data migrations can't be automatically rolled back. Another unique capability of Kubernetes is that developers can run a whole cluster locally. That takes some work when you design your cluster, but since the microservices that comprise your system run in containers, and those containers interact via APIs, it is possible and practical to do. As always, if your system is very data-driven, you will need to accommodate that and provide data snapshots and synthetic data that your developers can use.
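Here is one way label-based reverting might look, modeled in plain Go rather than against a real cluster. The sketch assumes that pods for both the old and new versions are still running and that a service picks its backends with a version label; the Pod and Service types and the label names are hypothetical simplifications of the Kubernetes objects.

```go
package main

import "fmt"

// Pod and Service are hypothetical, minimal stand-ins for the Kubernetes
// objects of the same name.
type Pod struct {
	Name   string
	Labels map[string]string
}

type Service struct {
	Selector map[string]string
}

// matches reports whether the pod carries every label in the selector.
func matches(selector map[string]string, pod Pod) bool {
	for k, v := range selector {
		got, ok := pod.Labels[k]
		if !ok || got != v {
			return false
		}
	}
	return true
}

// backends returns the names of the pods the service currently routes to.
func backends(svc Service, pods []Pod) []string {
	var names []string
	for _, p := range pods {
		if matches(svc.Selector, p) {
			names = append(names, p.Name)
		}
	}
	return names
}

func main() {
	pods := []Pod{
		{Name: "web-v1", Labels: map[string]string{"app": "web", "version": "v1"}},
		{Name: "web-v2", Labels: map[string]string{"app": "web", "version": "v2"}},
	}
	svc := Service{Selector: map[string]string{"app": "web", "version": "v2"}}
	fmt.Println("serving:", backends(svc, pods))

	// Revert by pointing the selector back at the previous version's label.
	svc.Selector["version"] = "v1"
	fmt.Println("after revert:", backends(svc, pods))
}
```

The revert is a single change to the selector, and no image needs to be rebuilt or redeployed. Note that this only switches which code serves traffic; as mentioned above, schema changes and data migrations are not undone by it.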

Summary

In this chapter, we covered a lot of ground, and you got to understand the design and architecture of Kubernetes. Kubernetes is an orchestration platform for microservice-based applications running as containers. Kubernetes clusters have master and worker nodes. Containers run within pods. Each pod runs on a single physical or virtual machine. Kubernetes directly supports many concepts, such as services, labels, and persistent storage. You can implement various distributed systems design patterns on Kubernetes. The containers themselves may be Docker, rkt, or Hyper containers.

In Chapter 2, Creating Kubernetes Clusters, we will explore the various ways to create Kubernetes clusters, discuss when to use different options, and build a multi-node cluster.


Key benefits

  • This practical guide demystifies Kubernetes and ensures that your clusters are always available, scalable, and up to date
  • Discover new features such as autoscaling, rolling updates, resource quotas, and cluster size
  • Master the skills of designing and deploying large clusters on various cloud platforms

Description

Kubernetes is an open source system to automate the deployment, scaling, and management of containerized applications. If you are running more than just a few containers or want automated management of your containers, you need Kubernetes. This book mainly focuses on the advanced management of Kubernetes clusters. It covers problems that arise when you start using container orchestration in production. We start by giving you an overview of the guiding principles in Kubernetes design and show you the best practices in the fields of security, high availability, and cluster federation. You will discover how to run complex stateful microservices on Kubernetes, including advanced features such as horizontal pod autoscaling, rolling updates, resource quotas, and persistent storage back ends. Using real-world use cases, we explain the options for network configuration and provide guidelines on how to set up, operate, and troubleshoot various Kubernetes networking plugins. Finally, we cover custom resource development and utilization in automation and maintenance workflows. By the end of this book, you’ll know everything you need to know to go from intermediate to advanced level.

Who is this book for?

The book is for system administrators and developers who have an intermediate level of knowledge of Kubernetes and now want to master its advanced features. You should also have basic networking knowledge. This advanced-level book provides a pathway to master Kubernetes.

What you will learn

  • Architect a robust Kubernetes cluster for long-time operation
  • Discover the advantages of running Kubernetes on GCE, AWS, Azure, and bare metal
  • See the identity model of Kubernetes and options for cluster federation
  • Monitor and troubleshoot Kubernetes clusters and run a highly available Kubernetes
  • Create and configure custom Kubernetes resources and use third-party resources in your automation workflows
  • Discover the art of running complex stateful applications in your container environment
  • Deliver applications as standard packages

Product Details

Publication date: May 25, 2017
Length: 426 pages
Edition: 1st
Language: English
ISBN-13: 9781786461001
Vendor: Google


Table of Contents

15 Chapters
1. Understanding Kubernetes Architecture
2. Creating Kubernetes Clusters
3. Monitoring, Logging, and Troubleshooting
4. High Availability and Reliability
5. Configuring Kubernetes Security, Limits, and Accounts
6. Using Critical Kubernetes Resources
7. Handling Kubernetes Storage
8. Running Stateful Applications with Kubernetes
9. Rolling Updates, Scalability, and Quotas
10. Advanced Kubernetes Networking
11. Running Kubernetes on Multiple Clouds and Cluster Federation
12. Customizing Kubernetes - API and Plugins
13. Handling the Kubernetes Package Manager
14. The Future of Kubernetes
Index

Customer reviews

Rating distribution
4 out of 5 (9 Ratings)
5 star 55.6%
4 star 11.1%
3 star 11.1%
2 star 22.2%
1 star 0%

Top Reviews

Hosam Al Ali Aug 09, 2017
Rating: 5 out of 5
This is an excellent book, it contains hands on sample and practice
Amazon Verified review
Kathleen O'Reilly Dec 01, 2017
Rating: 5 out of 5
great book to learn Kubernetes if you are a novice
Amazon Verified review
Y. Pant Feb 17, 2019
Rating: 5 out of 5
Coming from an architectural background, I wanted to see the overall architecture of Kubernetes at 5000 ft in diagrams (in addition to word description). I have read many books over the last few weeks - most of them provided great description of the capabilities of Kubernetes. However this book provided a great contextual diagram about how these individual pieces fitted together (a diagram is worth many thousand words).
Amazon Verified review
Ali Marshal Jul 14, 2017
Rating: 5 out of 5
Mastering Kubernetes is a great book. It is both comprehensive and approachable. It covers Kubernetes from the very basics of working with pods and containers, and all the way through very advanced concepts like cluster federation, API usage, plugins and extensions. This is the only Kubernetes book that is up to date with the most recent additions to Kubernetes. I appreciate the attention to detail and how the author explains the concepts and the big picture first and then dives in into low-level detail and examples.
Amazon Verified review
Hetz Ben Hamo Aug 15, 2017
Rating: 5 out of 5
This is one (if not THE one) of the best books to learn and familiarize yourself with Kubernetes. It's not only talking about Kubernetes which runs on a cloud like GCP, it also teaches you how to set it on your desktop, how to build it with few VM's, how to expose services (on places that you're not getting the cloud's provider external IP automatically), what's the competition, it also talks about other container formats (rkt, etc), how to monitor Kubernetes, how to write the YAML/JSON files etc. Looking forward for 2nd edition for this book (it talks about Kubernetes 1.4.x, we're in 1.7.x) ;)
Amazon Verified review