Exploring options for transformation
In developing and migrating workloads to the cloud, there are a number of options that architects must consider from the beginning. In this section, we will elaborate on these choices.
From monolith to microservices
A lot of companies carry technical debt, including monolithic applications: applications in which services are tightly coupled and deployed as one environment. It’s extremely hard to update or upgrade these applications, since updating a single service means that the whole application must be updated. Monolithic applications are neither very scalable nor agile. Microservices, in which services are loosely coupled, might be a solution.
Transforming a monolithic application into microservices is a very cumbersome process. First of all, the question that must be answered is: is it worthwhile? Do the benefits of the transformation outweigh the effort and, thus, the costs? It might be better to leave the application as-is, perhaps lift-and-shift it to the cloud, and in parallel design, build, and deploy a replacement application using a microservices architecture.
With that, we have identified the possible cloud transformation strategies that we discussed in Chapter 2, Collecting Business Requirements:
- Rehost
- Replatform
- Repurchase
- Refactor
- Rearchitect
- Rebuild
- Retire
- Retain
Moving to microservices typically involves rearchitecting and refactoring: the functionality of the application and the underlying workloads is adapted to run in microservices, which includes rearchitecting the original monolithic application and refactoring the code. Developing cloud-native applications to replace the old application can sometimes be the better option, especially when the original application would consume a lot of heavy (and thus costly) resources in the cloud or prevent future updates and upgrades.
Enterprises can decide to retain the old environment for backup or disaster recovery reasons. This will definitely lead to extra costs: the costs of managing the old environment while also investing in the development of a new one. Part of the strategy can also be “make or buy,” with either in-house development or buying software “off the shelf.” It all needs to be considered when drawing up the plans.
New technologies keep emerging to create cloud-native architectures, moving applications away from classical VMs to containers and serverless. We’ll review these in the next sections.
From machines to serverless
Public clouds started as a copy of existing data centers, with servers, storage, and network connectivity, just like companies had in their own data centers. The only difference was that this data center was operated by a different company that “rented” equipment to its customers. To enable many customers to make use of the data center, most of the equipment was virtualized: software divided a physical server into multiple software-defined servers, so that one server hosted multiple customers in a multi-tenant mode. The same principle was applied to storage and network equipment, enabling very efficient usage of all available resources.
Virtual machines were the original model used by customers in the public cloud. A service was hosted on a machine with a fixed number of CPUs, memory, and attached disks. The issue was that, in some cases, services didn’t need the full machine all of the time, but the whole machine was nonetheless charged to the customer for that time. Some services were even short-lived: rapidly scaled up and scaled down again as soon as the service was no longer needed. In microservices especially, this is becoming common practice. In event-based architectures, a service is triggered by a customer action and stopped as soon as the action has been executed. Full-blown virtual machines are too heavy for these services.
Serverless options in the public cloud are a solution for this. In the traditional sense, you use your own server on which your own software runs. With a serverless architecture, the software runs in the cloud only when necessary. There’s no need to reserve servers, which saves costs. Don’t be fooled by the term “serverless”: there are still servers involved, but the service only uses a particular part of a server, and only for a short amount of time. Serverless is a good solution when an organization doesn’t want to be bothered with managing the infrastructure.
But the biggest advantage of serverless options is that they can help organizations become event-driven. With microservices, there are software components that each focus on one specific task, such as payments or orders. These are transactions that follow a process. Serverless functions each perform their own step in the process, consuming only the resources the function needs to execute that specific task. In such cases, serverless is a good option to include in the architecture.
All major public clouds offer these solutions: Azure Functions, AWS Lambda, and Google Cloud Functions.
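To make this concrete, the following is a minimal sketch of such an event-driven function, written as a Python handler for AWS Lambda. The event fields (order_id, amount) are hypothetical, since the actual event shape depends on the service that triggers the function; Azure Functions and Google Cloud Functions follow the same pattern with different entry-point conventions.

```python
import json

# Minimal AWS Lambda handler sketch: the function is started when an
# event arrives (for example, a customer placing an order), performs
# one task, and stops. Compute is billed only while it runs.
def lambda_handler(event, context):
    # Hypothetical event fields; real payloads depend on the trigger.
    order_id = event.get("order_id")
    amount = event.get("amount", 0)

    # Execute the single task this function is responsible for,
    # e.g., registering a payment for the order.
    print(f"Processing payment of {amount} for order {order_id}")

    return {
        "statusCode": 200,
        "body": json.dumps({"order_id": order_id, "status": "paid"}),
    }
```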
Containers and multi-cloud container orchestration
Serverless is a great solution for running specific functions in the cloud, but it is not suitable for hosting entire applications or application components. Still, companies want to get rid of heavy virtual machines. VMs are generally an expensive solution in the cloud: the machine has a fixed configuration and runs a guest operating system of its own. The hosting machine is virtualized, allowing multiple workloads to run on one physical machine, but each workload that runs in a VM still requires its own operating system. Each VM runs its own binaries, libraries, and applications, causing the VM to become quite big.
Containers are a way to use infrastructure more efficiently to host workloads. Containers run on a container engine, and only that engine requires an operating system. Containers share the host operating system’s kernel, as well as binaries and libraries. This makes the containers themselves quite light and much faster to start than VMs. When an architecture is built around microservices, containers are a good solution.
Containers are a natural solution to run microservices, but there are other scenarios for containers. A lot of applications can be migrated to containers easily and with that, moved to the cloud quickly—making containers a good migration tactic.
Each container might run a specific service or application component, so in the case of upgrades or updates, only a few containers are impacted.
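As a quick illustration of the container engine at work, here is a hedged sketch using the Docker SDK for Python (pip install docker); it assumes a local Docker engine is running and uses the public alpine image.

```python
# Start a container, run one command in it, and print the output.
# The container starts in seconds because no guest OS has to boot:
# it shares the host operating system's kernel.
import docker

client = docker.from_env()  # connects to the local Docker engine
output = client.containers.run("alpine:latest", ["echo", "hello from a container"])
print(output.decode())
```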
The following diagram explains the difference between a VM and a container:
Figure 3.5: Virtual machine (left) versus containers (right)
This is a good point to explain something else about working with a container architecture: sidecars. Sidecar containers run alongside the main container, which holds specific functionality of the application. If we only want to change that functionality and nothing else, we can use sidecar containers: the sidecar container holds the functionality that shouldn’t be changed, while the functionality in the main container is updated. The following diagram shows a simple example of a sidecar architecture:
Figure 3.6: Simple architecture for a sidecar
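As a sketch of how this looks in practice, the following builds such a Pod with the official Kubernetes Python client (pip install kubernetes). The container names and images are hypothetical placeholders, and a configured kubeconfig is assumed.

```python
from kubernetes import client, config

def build_sidecar_pod() -> client.V1Pod:
    # Main container: holds the application functionality that is updated.
    main = client.V1Container(
        name="webshop",
        image="registry.example.com/webshop:2.0",  # hypothetical image
    )
    # Sidecar container: runs alongside the main container and holds the
    # functionality that should stay unchanged, e.g., log shipping.
    sidecar = client.V1Container(
        name="log-shipper",
        image="registry.example.com/log-shipper:1.0",  # hypothetical image
    )
    # Both containers live in one Pod, sharing network and volumes.
    return client.V1Pod(
        metadata=client.V1ObjectMeta(name="webshop-pod"),
        spec=client.V1PodSpec(containers=[main, sidecar]),
    )

if __name__ == "__main__":
    config.load_kube_config()  # assumes a configured kubeconfig
    client.CoreV1Api().create_namespaced_pod(
        namespace="default", body=build_sidecar_pod()
    )
```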
There’s a catch to using containers, and that’s the aforementioned container engine: you need a platform that is able to run the containers. The de facto industry standard has become Kubernetes.
With Kubernetes, containers are operated on compute clusters with a management layer that enables the sharing of resources and the scheduling of tasks to the workloads that reside within the containers. The resources are compute clusters: groups of servers, commonly referred to as nodes, that host the containers. The management or orchestration layer makes sure that these nodes work as one unit to run the containers and execute the processes (the tasks) that are built into the containers.
The cluster management layer tracks the usage of resources in the cluster, such as memory, processing power, and storage, and then assigns containers to these resources so that the cluster nodes are utilized in an optimized way and applications run well.
In other words, scaling containers is not so much about the containers themselves but more about scaling the underlying infrastructure. Kubernetes groups containers into Pods, enabling the sharing of data and application code among the containers in a Pod, which act as one environment. Pods work on the shared fate principle, meaning that if one container in the Pod dies, all containers go with it.
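The following short sketch, again using the Kubernetes Python client, asks the control plane what each node can offer; this allocatable capacity is exactly the kind of information the orchestration layer uses when it assigns containers to nodes.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a configured kubeconfig
for node in client.CoreV1Api().list_node().items:
    alloc = node.status.allocatable  # resources left for Pods on this node
    print(f"{node.metadata.name}: cpu={alloc['cpu']}, memory={alloc['memory']}")
```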
All major cloud providers offer solutions to run containers using Kubernetes:
- Azure Kubernetes Service (AKS): The managed service to deploy and run Kubernetes clusters in Azure. Azure also offers Azure Container Apps and Azure Container Instances as serverless options.
- Elastic Kubernetes Service (EKS): The AWS-managed service for Kubernetes platforms in AWS. EKS Anywhere uses EKS Distro, the open-source distribution of Kubernetes. Since it’s a managed service, AWS takes care of testing and tracking Kubernetes updates, dependencies, and patches; the same applies to AKS. AWS also offers a serverless solution to run container environments: Fargate. This removes the need to provision and manage servers; it simply allocates the right amount of compute, eliminating the need to choose instances and scale cluster capacity.
- Google Kubernetes Engine (GKE): The managed service of Google Cloud to deploy and run Kubernetes clusters in GCP.
- Alibaba Cloud Container Service for Kubernetes (ACK): The managed service of Alibaba Cloud to deploy and run Kubernetes clusters on Alibaba Cloud.
- Oracle Container Engine for Kubernetes (OKE): The managed container service that we can use in OCI. This service also includes serverless options with OKE Virtual Nodes.
All the mentioned providers also offer unmanaged services to run containers and container orchestration, but the advantage of managed services is that functionality such as scaling and load balancing across clusters is completely automated and taken care of by the provider.
The Kubernetes clusters with Pods and nodes must be configured on each specific platform, using one of the services mentioned above. There are technologies that provide tools to manage Kubernetes clusters across various cloud platforms, such as VMware Tanzu, NetApp Astra, and Azure Arc:
- VMware Tanzu: This is the suite of products that VMware launched to manage Kubernetes workloads and containers across various cloud platforms. The offering was launched with Tanzu Mission Control but has evolved over the past years with offerings that allow for application transformation (Tanzu Application Platform) and cloud-native development (Tanzu Labs).
- NetApp Astra: NetApp started as a company that specialized in storage solutions, specifically network-attached storage (NAS), but over the years, NetApp has evolved into a cloud management company with a suite of products, including Astra, that allows the management of various Kubernetes environments.
- Azure Arc: Azure Arc-enabled Kubernetes allows you to attach and configure Kubernetes clusters running anywhere. You can connect your clusters running on other public cloud providers (such as GCP or AWS) or clusters running on your on-premises data center (such as VMware vSphere or Azure Stack HCI) to Azure Arc.
The interesting part of these products is that they are also able to manage lightweight Kubernetes, known as K3s, in environments that are hosted on on-premises private stacks, allowing for seamless integration of Kubernetes and container management through one console.
Keeping the infrastructure consistent
Microservices, serverless, containers, and legacy environments that run virtual machines in a more traditional way all need to be operated from the cloud. The big challenge is to keep the infrastructure consistent. In this section, we will briefly discuss methodologies and tools to achieve this.
The preferable way of keeping infrastructure consistent is by working with templates. Such a template contains all the configurations with which an infrastructure component should comply. Take a virtual machine as an example. A VM can be deployed straight from the marketplace of a cloud provider, but typically, companies have specific demands for servers: they must be configured with specific settings that define the configuration of the operating system, the level of access, and the security parameters of the server. First, we don’t want to do this manually every time we roll out a server, and second, if we do it manually, chances are that there will be deviations. Hence, we use templates to automate the rollout of the desired state of servers and to keep all servers consistent with policies.
Let’s look at an example of what a template for a VM could look like. Be aware that this list is not meant to be exhaustive (a schematic code sketch follows the list):
- Sizing of the VM
- Operating system
- Configuration parameters of the operating system
- Access policies
- Network settings
- Workgroup or domain settings
- Security parameters
- Boot sequence
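As a schematic sketch, such a template can be captured as code so that every server is rolled out from the same definition. The field names and default values below are illustrative only and not tied to any specific cloud provider.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class VmTemplate:
    size: str                                       # sizing of the VM
    os_image: str                                   # operating system
    os_config: dict = field(default_factory=dict)   # OS configuration parameters
    access_policy: str = "least-privilege"          # access policies
    network: str = "vnet-default"                   # network settings
    domain: str = "corp.example.com"                # workgroup or domain settings
    security_baseline: str = "cis-level-1"          # security parameters
    boot_sequence: tuple = ("disk",)                # boot sequence

# Every server is rolled out from the same template, so deviations
# between servers are avoided by construction.
web_server = VmTemplate(size="medium", os_image="ubuntu-22.04")
```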
This can, of course, be done for every component in our infrastructure: storage, databases, routers, gateways, and firewalls, for example. There are a couple of methods to create templates. The two common ones are:
- Manual configuration and saving the template in a repository
- Cloning a template from an existing resource
There are a number of tools that can help in maintaining templates and keeping the infrastructure consistent:
- Terraform: This is an open-source tool by HashiCorp that has become the industry standard for Infrastructure as Code (IaC). Terraform allows you to create, deploy, and manage infrastructure across various cloud platforms. Users define and deliver data center infrastructure using a declarative configuration language known as HashiCorp Configuration Language (HCL) or, optionally, JavaScript Object Notation (JSON).
- Bicep: Bicep files let users define infrastructure in Azure in a declarative way. These files can be used multiple times so that resources are deployed in a consistent way. The advantage of Bicep is that it has an easy syntax compared to JSON templates. Bicep addresses Azure Resources directly through Azure Resource Manager (ARM), whereas in JSON, these resources must first be defined. Quickstart templates for Bicep are available through https://github.com/Azure/azure-quickstart-templates/tree/master/quickstarts.
- CloudFormation: What Bicep does for Azure, CloudFormation does for AWS. It provisions IaC to AWS using CloudFormation templates, which are available on GitHub: https://github.com/awslabs/aws-cloudformation-templates. A minimal deployment sketch follows this list.
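To make the template idea concrete, here is a hedged sketch that deploys a minimal CloudFormation template with AWS’s boto3 SDK (pip install boto3). The stack and bucket names are hypothetical, and valid AWS credentials are assumed; Terraform and Bicep offer equivalent workflows through their own tooling.

```python
import json
import boto3

# The smallest possible template: one S3 bucket as the desired state.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "ExampleBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"BucketName": "example-consistent-bucket"},
        }
    },
}

cfn = boto3.client("cloudformation")
cfn.create_stack(
    StackName="example-stack",          # hypothetical stack name
    TemplateBody=json.dumps(template),
)
# CloudFormation now converges the actual infrastructure to the
# desired state described in the template.
```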
All these technologies evolve at lightning speed. The challenge that every company faces is how to keep up with all these developments. We will try to give some guidance in the next section.