DevOps for Databases: A practical guide to applying DevOps best practices to data-persistent technologies

Author: Jambor
eBook, Dec 2023, 446 pages, 1st Edition

Data at Scale with DevOps

Welcome to the first chapter! In this book, you will learn the fundamentals of DevOps, its impact on the industry, and how to apply it to modern data persistence technologies.

When I first encountered the term DevOps years ago, I initially saw it as a way to grant development teams unrestricted access to production environments. This made me nervous, especially because there seemed to be a lack of clear accountability at that time, making the move toward DevOps appear risky.

At the time (around 2010), the roles of developers and operations were divided by a very strict line. Developers could gain read-only privileges, but that’s about it. What I did not see back then was that this was the first step in blurring the lines between development and operation teams. We already had many siloed teams pointing fingers at one another. This made the work slow, segmented, and frustrating. I was worried this would just increase complexity and cause an even greater challenge. Luckily, today’s world of DevOps is very different, and we can all improve it together even further!

There are no more dividing lines between the development and operations teams – they are one team with a common objective. This improves quality, speed, and agility! This also means that traditional roles such as database admin are changing as well. We now have site reliability engineers (SREs) or DevOps engineers who are experts at using databases and able to perform operational and development tasks alike. Blurring the line means you increase the responsibilities, and in a high-performing DevOps team, this means you are responsible for everything from end to end. Modern tooling and orchestration frameworks can help you do way more than ever before, but it’s a very different landscape than it was many years ago.

This book will introduce you to this amazing new world, walk you through the journey that leads us to this ever-changing world of DevOps today, and give some indications as to where we might go next.

By the end of this book, you will be able to not only demonstrate your theoretical knowledge but also design, build, and operate complex systems with a heavy focus on data persistence technologies.

DevOps and data persistence technologies have a love-hate relationship, which makes this topic even more interesting.

In this chapter, we will take a deep dive into the following topics:

  • The modern data landscape
  • Why speed matters
  • Data management strategies
  • The early days of DevOps
  • SRE versus DevOps
  • Engineering principles
  • Objectives – SLOs/SLIs

The modern data landscape

Have you ever wondered how much data we generate every single day? Or the effort required to store and access your data on demand? What about the infrastructure or the services required to make all of this happen? Not to mention the engineering effort put in to make all of this happen. If you have, you are in the right place. These questions inspired me to dive deep into the realms of DevOps and SRE and inspired the creation of this book.

Technology impacts almost every aspect of our lives. We are more connected than ever, with access to more information and services than we even realize. It’s not just our computers, phones, or tablets that are connected to the internet, but our cars, cameras, watches, televisions, speakers, and more. The more digitally native we become, the bigger our digital footprint grows.

A digital footprint, also known as a digital shadow, is a collection of data that represents an individual’s interactions and activities across digital platforms and the internet. This data can be categorized as either passive, where it’s generated without direct interaction – such as browsing history – or active, resulting from deliberate online actions such as social media posts or emails. Your digital footprint serves as an online record of your digital presence, and it can have lasting implications for your privacy and reputation.

As of 2022, researchers estimate that out of 8 billion people (the world’s population as of 2022), approximately 5 billion utilize the internet daily. Compared to the 2 billion measured in 2012, this is a 150% increase (2.5 times as many users) over 10 years. See the following figure for reference:

Figure 1.1 – Daily internet users (in billions)

Each person who has a digital presence generates digital footprints in two ways.

The first is actively. When you browse a website, upload a picture, send an email, or make a video call, you generate data that will be utilized and stored for some time. The other, less obvious way is passive data generation. If you, like me, utilize digital services with push notifications on or have GPS enabled on your phone with a timeline, for example, you are generating data every minute of the day – even if you do not use these services actively. Prime examples are Internet of Things (IoT) devices, such as an internet-enabled security camera – even if you are not actively using it, it’s still generating data and constantly uploading it to your service provider for safekeeping. IoT devices are the second-largest source of data, right after us active internet surfers. Researchers estimate that approximately 13 billion IoT devices were connected and in daily use as of 2022, with the expectation that this figure will be close to 30 billion by the end of 2030. See the following figure for reference:

Figure 1.2 – Connected IoT devices (in billions)

Combining the 5 billion active internet users with the 13 billion connected IoT devices, it is easy to guess that our combined digital footprint must be ginormous. Yet trying to guess the exact number is much harder than you might think. Give it a try.

As of 2023, it is estimated that we generate approximately 3.5 exabytes of data every single day. This is about 1 exabyte more than what was estimated in 2021. To help visualize how much data we are talking about, let me try to put this into perspective. Let’s say you have a notebook (or one of the latest phones) with 1 TB of storage capacity. If you were to use this 1 TB storage to store all this information, it would be full in less than 0.025 seconds. An alternative way to think about it is that we can fill 3,670,016 devices with 1 TB storage within 24 hours.
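The arithmetic behind these figures can be checked with a short sketch. It assumes binary units (1 EB = 1,024 × 1,024 TB), which is what the 3,670,016 figure implies; the constant and function names are illustrative only.

```python
# Back-of-the-envelope check of the daily data volume figures.
# Assumption: binary units, so 1 EB = 1024 * 1024 TB.
TB_PER_EB = 1024 * 1024
SECONDS_PER_DAY = 24 * 60 * 60


def devices_filled_per_day(exabytes_per_day: float, device_tb: float = 1.0) -> float:
    """Number of devices of `device_tb` capacity filled in one day."""
    return exabytes_per_day * TB_PER_EB / device_tb


def seconds_to_fill(exabytes_per_day: float, device_tb: float = 1.0) -> float:
    """Seconds needed to fill one device at the average daily rate."""
    return SECONDS_PER_DAY / devices_filled_per_day(exabytes_per_day, device_tb)


if __name__ == "__main__":
    print(devices_filled_per_day(3.5))     # 3670016.0
    print(round(seconds_to_fill(3.5), 4))  # 0.0235
```

At 3.5 exabytes per day, a single 1 TB device fills in roughly 0.024 seconds, which is consistent with the "less than 0.025 seconds" estimate.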

How do we generate data today?

Well, for starters, we collectively send approximately 333.2 billion emails per day. This means that more than 3.5 million emails are sent per second. We also make over 0.5 billion hours of video calls, stream more than 200 million hours of media content, and share more than 5 billion videos and photos every single day.

So, yes, that’s a lot of us armed with many devices (on average, one active internet user had about 2.6 IoT devices in 2022) generating an unbelievable amount of data every single day. But the challenge does not stop at the amount of data alone. The speed and reliability of interacting with it are just as important as, if not more important than, the storage itself. Have you ever searched for one of your photos to show someone, but it was slow and took forever to find, so you gave up? We have all been there, but can you remember how long you waited before you decided to abandon your search?

As technology advances, we gain quicker access to information and multitask more efficiently, which may be contributing to a gradual decline in our attention spans. Research shows that in 2000, the average attention span was 12 seconds. Since then, significant technological milestones have occurred: the advent of the iPhone, YouTube, various generations of mobile networks, Wikipedia, and Spotify, to name a few. Internet speed has also soared, moving from an average of 127 kilobits per second in 2000 to 4.4 Mbps by 2010, and hitting an average of 50.8 Mbps by 2020 – with some areas experiencing speeds well over 200 Mbps today.

As the digital landscape accelerates, so do our expectations, resulting in further erosion of our attention spans. By 2015, that 12-second average had fallen to just 8.25 seconds and dropped slightly below 8 seconds by 2022.

Why speed matters

If you consider your attention span the full amount of time you would consider spending to complete a simple task, such as showing photos or videos to a friend, this means searching for it is just a small percentage of your total time. Let’s say you are using a type of cloud service to search for your photo or video. What would you consider to be an acceptable amount of time between you hitting search and receiving your content?

I still remember the time when “buffering” was a given thing, but if you see something similar today, you would find it unacceptable. According to multiple studies, the ideal load time for “average content,” such as photos or videos, is somewhere between 1 and 2 seconds. 53% of mobile site visits are abandoned if pages take longer than three seconds to load. A further two-second delay in load time results in abandonment rates of up to 87%.

This shows us that storing our data is not enough – making it accessible reliably and with blazing speed is not only nice to have but an absolute necessity in today’s world.

Data management strategies

There are many strategies out there, and we will need to use most of them to meet and hopefully exceed our customers’ expectations. Reading this book, you will learn about some of the key data management strategies at length. For now, however, I would like to bring six of these techniques to your attention. We will take a much closer look at each of these in the upcoming chapters:

  • Bring your data closer: The closer the data is to users, the faster they can access it. Yes, it may sound obvious, but users can be anywhere in the world, and they might even be traveling while trying to access their data. For them, these details do not matter, but the expectation will remain the same.

    There are many different ways to keep data physically close. One of the most successful strategies is called edge computing, which is a distributed computing paradigm that brings computation and data storage closer to the sources of data. This is expected to improve response times and save bandwidth. Edge computing is an architecture rather than a specific technology, and it is a topology- and location-sensitive form of distributed computing.

    The other very obvious strategy is to utilize the closest data center possible when utilizing a cloud provider. AWS, for example, spans 96 Availability Zones within 30 geographic Regions around the world as of 2022. Google Cloud offers a very similar 106 zones and 35 regions as of 2023.

    Leveraging the nearest physical location can greatly decrease your latency and therefore improve your customer experience.

  • Reduce the length of your data journey: Again, this is a very obvious one. Try to avoid any unnecessary steps to create the shortest journey between the end user and their data. Usually, the shortest will be the fastest (obviously it’s not that simple, but as a best practice, it can be applied). The more actions you perform to retrieve the required information, the more computational power you use, which directly increases the cost of the operation. Each extra step also adds complexity and, most of the time, latency.
  • Choose the right database solutions: There are many database solutions out there that you can categorize by type (relational versus non-relational, or NoSQL), by distribution (centralized versus distributed), and so on. Each category has a high number of sub-categories and each can offer a unique set of solutions to your particular use case. It’s really hard to find the right tool for the job, considering that requirements are always changing. We will dive deeper into each type of system and their pros and cons a bit later in this book.
  • Apply clever analytics: Analytical systems, if applied correctly, can be a real game changer in terms of optimization, speed, and security. Analytics tools are there to help develop insights and understand trends and can be the basis of many business and operational decisions. Analytical services are well placed to provide the best performance and cost for each analytics job. They also automate many of the manual and time-consuming tasks involved in running analytics, all with high performance, so that customers can quickly gain insights.
  • Leverage machine learning (ML) and artificial intelligence (AI) to try to predict the future: ML and AI are critical for a modern data strategy to help businesses and customers predict what will happen in the future and build intelligence into their systems and applications. With the right security and governance controls combined with AI and ML capabilities, you can take automated actions regarding where data is physically located, who has access to it, and what can be done with it at every step of the data journey. This will enable you to stick with the highest standards and greatest performance when it comes to data management.
  • Scale on demand: The aforementioned strategies are underpinned by the method you choose to operate your systems. This is where DevOps (and SRE) plays a crucial part and can be the deciding factor between success and failure. All major cloud providers provide you with literally hundreds of platform choices for virtually every workload (AWS offered 475 instance types at the end of 2022). Most major businesses have a very “curvy” utilization trend, which is why they find the on-demand offering of the cloud very attractive from a financial point of view.

You should only pay for resources when you need them and pay nothing when you don’t. This is one of the big benefits of using cloud services. However, this model only works in practice if the correct design and operational practices and the right automation and compatible tooling are utilized.

A real-life example

A leading telecommunications company was set to unveil their most anticipated device of the year at precisely 2 P.M., a detail well publicized to all customers. As noon approached, their online store saw typical levels of traffic. By 1 P.M., it was slightly above average. However, a surge of customers flooded the site just 10 minutes before the launch, aiming to be among the first to secure the new phone. By the time the clock struck 2 P.M., the website had shattered previous records for unique visitors. In the 20 minutes from 1:50 P.M. to 2:10 P.M., the visitor count skyrocketed, increasing twelvefold.

This influx triggered an automated scaling event that expanded the company’s infrastructure from its baseline (designated as 1x) to an unprecedented 32x. Remarkably, this massive scaling was needed only for the initial half-hour. After that, it scaled down to 12x by 2:30 P.M., further reduced to 4x by 3 P.M., and returned to its baseline of 1x by 10 P.M.

This seamless adaptability was made possible through a strategic blend of declarative orchestration frameworks, infrastructure as code (IaC) methodologies, and fully automated CI/CD pipelines. To summarize, the challenge is big. To be able to operate reliably yet cost-effectively, with consistent speed and security, all the while automatically scaling these services up and down on demand without human interaction in a matter of minutes, you need a set of best practices on how to design, build, test, and operate these systems. This sounds like DevOps.
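To make the financial side of the real-life example concrete, the following sketch compares capacity-hours for the launch day with and without on-demand scaling. It is a rough illustration only: it assumes capacity changed in steps at exactly the times quoted above and stayed at 1x before 2 P.M., since the actual scaling curve is not given. The schedule and function names are hypothetical.

```python
# Capacity-hours for the launch day, in multiples of the baseline (1x).
# Assumption: capacity changes in steps at the times quoted in the text
# and stays at 1x before 2 P.M.; the real scaling curve is not given.
SCHEDULE = [
    # (start_hour, end_hour, capacity_multiplier)
    (0.0, 14.0, 1),    # normal traffic until the launch
    (14.0, 14.5, 32),  # peak during the first half-hour
    (14.5, 15.0, 12),
    (15.0, 22.0, 4),
    (22.0, 24.0, 1),   # back to baseline
]


def capacity_hours(schedule) -> float:
    """Capacity-hours consumed when scaling follows the schedule."""
    return sum((end - start) * mult for start, end, mult in schedule)


def static_capacity_hours(schedule) -> float:
    """Capacity-hours consumed if the peak size were provisioned all day."""
    peak = max(mult for _, _, mult in schedule)
    hours = sum(end - start for start, end, _ in schedule)
    return peak * hours


if __name__ == "__main__":
    print(capacity_hours(SCHEDULE))         # 66.0
    print(static_capacity_hours(SCHEDULE))  # 768.0
```

Under these assumptions, on-demand scaling consumes 66 baseline-hours against 768 for infrastructure sized permanently for the peak, or roughly 9% of the static footprint.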

The early days of DevOps

I first came across DevOps around 2014 or so, just after the first annual State of DevOps report was published. At the time, the idea sounded great, but I had no idea how it worked. It felt like – at least to me – it was still in its infancy or I was not knowledgeable and experienced enough to see the big picture just yet. Probably the latter. Anyway, a lot has happened since then, and the industry picked up the pace. Agile, CI/CD, DevSecOps, GitOps, and other approaches emerged on the back of the original idea, which was to bring software developers and operations together.

DevOps emerged as a response to longstanding frictions between developers (Devs) and operations (Ops) within the IT industry. The word frictions seems apt here because, for anyone involved in IT during that period, the tension was palpable and constant. Devs traditionally focused solely on creating or fixing features, handing them off to Ops for deployment and ongoing management. Conversely, Ops prioritized maintaining a stable production environment, often without the expertise to fully comprehend the code they were implementing.

This set up an inherent conflict: introducing new elements into a production environment is risky, so operational stability usually involves minimizing changes. This gave rise to a “Devs versus Ops” culture, a divide that DevOps sought to bridge. However, achieving this required both sides to evolve and adapt.

In the past, traditional operational roles such as system administrators, network engineers, and monitoring teams largely relied on manual processes. I can recall my initial stint at IBM, where the pinnacle of automation was a Bash script. Much of the work in those days – such as setting up physical infrastructure, configuring routing and firewalls, or manually handling failovers – was done by hand.

While SysAdmin and networking roles remain essential, even in the cloud era, the trend is clearly toward automation. This shift enhances system reliability as automated configurations are both traceable and reproducible. If systems fail, they can be swiftly and accurately rebuilt.

Though foundational knowledge of network and systems engineering is irreplaceable, the push toward automation necessitates software skills – a proficiency often lacking among traditional operational engineers. What began with simple Bash scripts has evolved to include more complex programming languages such as Perl and Python, and specialized automation languages such as Puppet, Ansible, and Terraform.

In terms of the development side, the development team worked with very long development life cycles. They performed risky and infrequent “big-bang” releases that almost every time caused massive headaches for the Ops teams and posed a reliability/stability risk to the business. Slowly but steadily, Dev teams moved to a more frequent, gradual approach that tolerated failures better. Today, we call this Agile development.

If you look at it from this point of view, you can say that a set of common practices designed to reduce friction between Dev and Ops teams is the basis of DevOps. However, simple common practices could not solve the Dev versus Ops mentality that the industry possessed at the time. Shared responsibility between Devs and Ops was necessary to drive this movement to success. Automation that enables the promotion of new features into production rapidly and safely in a repeatable manner could only be achieved if the two teams worked together, shared a common objective, and were accountable (and responsible) for the outcome together. This is where SRE came into the picture.

SRE versus DevOps

SRE originated at Google. In the words of Ben Treynor (VP of engineering at Google), “SRE is what happens when you ask a software engineer to design an operations function.”

If you want to put it simply (again, I am quoting Google here), “Class SRE implements DevOps.”

SRE is the (software) engineering discipline that aims to bridge the gap between Devs and Ops by treating all aspects of operations (infrastructure, monitoring, testing, and so on) as software, therefore implementing DevOps in its ultimate form. This is fully automated, with zero manual interaction, treating every single change to any of its components (again referring to any changes to infrastructure, monitoring, testing, and so on) as a release. Every change is done via a pipeline, in a version-controlled and tested manner. If a release fails, or a production issue is observed and traced back to a change, you can simply roll back your changes to the previously known, healthy state.

The fact that it is treated as any other software release allows the Dev teams to take on more responsibility and take part in Ops, almost fully blurring the line between the Dev and Ops functions. Ultimately, this creates a “You build it, you run it” culture – which makes “end-to-end” ownership possible.

So, are SRE and DevOps the same thing? No, they are not. SRE is an engineering function that can also be described as a specific implementation of DevOps that focuses specifically on building and running reliable systems, whereas DevOps is a set of practices that is more broadly focused on bringing the traditional Dev and Ops functions closer together.

Regardless of which way you go, you want to ensure that you set an objective, engineering principles, and a tooling strategy that can help you make consistent decisions as you embark on your journey as a DevOps/SRE professional.

Engineering principles

I offer the following engineering principles to start with:

  • Zero-touch automation for everything (if it’s manual – and you have to do it multiple times a month – it should be automated)
  • Project-agnostic solutions (defined in configuration to avoid re-development for new projects; any tool/module should be reusable)
  • IaC (infrastructure should be immutable where possible and defined as code; provisioning tools should be reusable)
  • Continuous delivery (CD) with continuous integration (CI) (common approaches and environments across your delivery cycle; any service should be deployable immediately)
  • Reliability and security validated at every release (penetration testing, chaos testing, and more should be added to the CI/CD pipeline; always identify points of failure at the earliest opportunity)
  • Be data-driven (real-time data should be utilized to make decisions)

To fully realize your engineering goals and adhere to your principles without compromise, you should make “immutable IaC” a priority objective.

To enable this, I would recommend the following IaC principles:

  • Systems can be easily reproduced
  • Systems are immutable
  • Systems are disposable
  • Systems are consistent
  • Processes are repeatable
  • Code/config are version-controlled

Once you have defined your goals, it’s time for you to choose the right tools for the job. To do that, you must ensure these tools support the following:

  • Declarative orchestration framework(s):
    • The declarative orchestration approach uses structural models that describe the desired application structure and state. These are interpreted by a deployment engine to enforce this state.
    • It enables us to define the end state and interact in a declarative manner, thus making managing the application less resource-intensive (faster speed to market and cheaper costs).

    The following is an example Terraform file (main.tf):

    provider "aws" {
      region = "us-west-2"
    }
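    # Create an S3 bucket
    # Note: AWS provider v4 and later deprecate the inline acl argument
    # in favor of a separate aws_s3_bucket_acl resource.
    resource "aws_s3_bucket" "my_bucket" {
      bucket = "my-unique-bucket-name"
      acl    = "private"
    }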
    # Create an EC2 instance
    resource "aws_instance" "my_instance" {
      ami           = "ami-0c55b159cbfafe1f0" # This is an example Amazon Linux 2 AMI ID; use the appropriate AMI ID for your needs
      instance_type = "t2.micro"
      tags = {
        Name = "MyInstance"
      }
    }
  • Declarative resource definition:
    • In a declarative style, you simply say what resources you would like, and any properties they should have so that you can create and deploy an entire infrastructure declaratively. For example, you can deploy not only agents (or sidecars) but also the network infrastructure, storage systems, and any other resources you may need.
    • This enables us to define what our infrastructure resources should look like and force the orchestrator to create it (we focus on the what, while the declarative orchestrator takes care of the how).

    The following is an example that uses Kubernetes, which is a popular container orchestration platform that exemplifies the concept of declarative resource definition. In this example, we’ll define a Deployment for a simple web server and a Service to expose it.

    Here’s a YAML file (deployment-and-service.yaml) for Kubernetes:

    # Deployment definition to create a web server pod
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-web-server
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: web-server
      template:
        metadata:
          labels:
            app: web-server
        spec:
          containers:
          - name: nginx
            image: nginx:1.17
            ports:
            - containerPort: 80
    ---
    # Service definition to expose the web server
    apiVersion: v1
    kind: Service
    metadata:
      name: my-web-service
    spec:
      selector:
        app: web-server
      ports:
        - protocol: TCP
          port: 80
  • Idempotency:
    • Idempotency is the property that an operation can be applied multiple times without the result differing from the first application. Restated, multiple identical requests should have the same effect as a single request.
    • Idempotency enables the same request to be sent multiple times but the result given is always the same (same as declared, never different).
  • No secrets and environment config in code:
    • The main cloud providers all have a secure way to manage secrets. These solutions provide a good way to store secrets or environment config values for the application you host on their services.
    • Everything should be self-served and manageable in a standardized manner and therefore secrets and configs must be declarative and well defined to work with the aforementioned requirements.
  • Convention over configuration:
    • Also known as environment tag-based convention over configuration, convention over configuration is a simple concept that is primarily used in programming. It means that the environment in which you work (systems, libraries, languages, and so on) assumes many logical situations by default, so if you adapt to them rather than creating your own rules each time, programming becomes an easier and more productive task.
    • This means that developers have to make fewer decisions when they’re developing and there are always logical default options. These logical default options have been created out of convention, not configuration.
  • Automation scripts packaged into an image:
    • This enables immutability and encourages sharing. A script no longer lives on one server from which it has to be copied to others. Instead, it can be shipped just like the rest of our code and made available in a registry, rather than depending on other servers.
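Two of the tooling properties listed above, idempotency and keeping secrets out of code, can be illustrated together with a minimal sketch. It does not reflect any particular orchestrator: `apply`, `resolve_secret`, the resource names, and the `EXAMPLE_DB_PASSWORD` variable are all hypothetical, and the "infrastructure" is just an in-memory dictionary.

```python
import os


def apply(desired: dict, actual: dict) -> list:
    """Converge `actual` toward `desired` and return the changes made.

    Calling it again with the same `desired` state makes no further
    changes, which is the idempotency property.
    """
    changes = []
    for name, spec in desired.items():
        if actual.get(name) != spec:
            actual[name] = dict(spec)  # create or update the resource
            changes.append(f"set {name}")
    for name in list(actual):
        if name not in desired:
            del actual[name]  # remove anything that is no longer declared
            changes.append(f"delete {name}")
    return changes


def resolve_secret(env_var: str) -> str:
    """Read a secret from the environment instead of hard-coding it."""
    value = os.environ.get(env_var)
    if value is None:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return value


if __name__ == "__main__":
    # The declared state names the secret's variable, never its value.
    desired = {"db": {"size": "small", "password_env": "EXAMPLE_DB_PASSWORD"}}
    actual = {"legacy-cache": {"size": "large"}}
    print(apply(desired, actual))  # ['set db', 'delete legacy-cache']
    print(apply(desired, actual))  # []
```

The second call returns an empty list because the actual state already matches the declaration, so repeating the request has the same effect as sending it once.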

Thanks to the amazing progress in this field in the past 10+ years, customer expectations are sky-high when it comes to modern solutions. As we established earlier, if content does not load in under two seconds, it is considered to be slow. If you have to wait longer than 3 to 5 seconds, you are likely to abandon it. This is very similar to availability and customer happiness. When we talk about customer happiness (which evolved from customer experience), a concept that cannot be measured directly and is therefore hard to be data-driven about, setting the right goals/objectives can be crucial to how you design your solutions.

Objectives – SLOs/SLIs

Service-level objectives (SLOs), which is a concept that’s referenced many times in Google’s SRE handbook, can be a great help to set your direction from the start. Choosing the right objective, however, can be trickier than you might think.

My personal experience aligns with Google’s recommendation, which suggests that an SLO – which sets a target level of reliability for a service’s customers – should be under 100%.

This is due to multiple reasons. Achieving 100% is not just very hard and extremely expensive, but almost impossible given that almost all services have soft/hard dependencies on other services. If just one of your hard dependencies offers less than 100% availability, a 100% SLO cannot be met. Also, even with every precaution you can take, and every redundancy in place, there is a non-zero probability that something (or many things) will fail, resulting in less than 100% availability. More importantly, even if you could achieve 100% reliability of your services, the customers would very likely not experience that. The path your customers must take (the systems they have to use) to access your services is likely to have an SLO of less than 100%.
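The effect of dependencies on achievable availability can be sketched with a simple product. This is a simplification that assumes every dependency is a hard dependency and that failures are independent; the function name and the example figures are illustrative only.

```python
from math import prod


def composite_availability(own: float, dependencies: list) -> float:
    """Best-case availability of a service with hard dependencies.

    Assumption: every dependency must be up for a request to succeed
    and failures are independent, so the availabilities multiply.
    """
    return own * prod(dependencies)


if __name__ == "__main__":
    # A 99.99% service that relies on two 99.9% dependencies.
    print(round(composite_availability(0.9999, [0.999, 0.999]), 6))  # 0.997901
```

Even a service that is flawless on its own (`own = 1.0`) ends up below 100% as soon as a single hard dependency does.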

Most commercial internet providers, for example, offer 99% availability. This also means that as you go higher and higher, let’s say from 99% to 99.9% or IBM’s extreme five nines (99.999%), each additional “nine” makes the availability significantly more expensive to achieve and maintain, while your customers experience less and less of your efforts, which makes the objective questionable.
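To see what each additional "nine" means in practice, the following sketch converts an availability target into the downtime it allows. It assumes a 365-day year by default; the function name is illustrative only.

```python
def allowed_downtime_minutes(slo_percent: float, period_days: float = 365.0) -> float:
    """Minutes of downtime permitted by an availability target over a period."""
    return (1 - slo_percent / 100) * period_days * 24 * 60


if __name__ == "__main__":
    for slo in (99.0, 99.9, 99.999):
        print(slo, round(allowed_downtime_minutes(slo), 3))
```

Over a year, 99% allows about 3.65 days of downtime, 99.9% about 8.76 hours, and 99.999% only a little over 5 minutes, which shows how quickly the margin for error shrinks.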

Above the selected SLO threshold, almost all users should be “happy,” and below this threshold, users are likely to be unhappy, raise concerns, or just stop using the service.

Once you’ve agreed that you should look for an SLO less than 100%, but likely somewhere above or around 99%, how do you define the right baseline?

This is where service-level indicators (SLIs), service-level agreements (SLAs), and error budgets come into play. I will not detail all of these here, but if you are interested, please refer to Google’s SRE book (https://sre.google/books/) for more details on the subject.

Let’s say you picked an SLO of 99.9% – which is, based on my personal experience, the most common go-to for businesses these days. You now have to consider your core operational metrics. DevOps Research and Assessment (DORA) suggests four key metrics that indicate the performance of a DevOps team, ranking them from “low” to “elite,” where “elite” teams are more likely to meet or even exceed their goals and delight their customers compared to “low” ranking teams.

These four metrics are as follows:

  • Lead time for change, a metric that quantifies the duration from code commit to production deployment, is in my view one of the most crucial indicators. It serves as a measure of your team’s agility and responsiveness. How swiftly can you resolve a bug? Think about it this way:
    • Low-performing: 1 month to 6 months of lead time
    • Medium-performing: 1 week to 1 month of lead time
    • High-performing: 1 day to 1 week of lead time
    • Elite-performing: Less than 1 day of lead time
  • Deployment frequency measures how often your team successfully releases to production. The key word here is successful, as a Dev team that constantly pushes broken code through the pipeline is not performing well:
    • Low-performing: 1 month to 6 months between deployments
    • Medium-performing: 1 week to 1 month between deployments
    • High-performing: 1 day to 1 week between deployments
    • Elite-performing: Multiple deployments per day/less than 1 day between deployments
  • Change failure rate (CFR) measures the percentage of deployments that result in a failure in production that requires a bug fix or rollback. The goal is to release as frequently as possible, but what is the point if your team is constantly rolling back those changes, or causing an incident by releasing a bad update? By tracking it, you can see how often your team is fixing something that could have been avoided:
    • Low-performing: 45% to 60% CFR
    • Medium-performing: 15% to 45% CFR
    • High-performing: 0% to 15% CFR
    • Elite-performing: 0% to 15% CFR
  • Mean time to restore (MTTR) measures how long it takes an organization to recover from a failure. This is measured from the initial moment of an outage until the incident team has recovered all services and operations. Another key and related metric is mean time to acknowledge (MTTA), which measures the time it takes to be aware of and confirm an issue in production:
    • Low-performing: 1 week to 1 month of downtime
    • Medium- and high-performing: Less than 24 hours of downtime
    • Elite-performing: Less than 1 hour of downtime
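The tiers above translate naturally into simple threshold checks. The following Python sketch classifies a lead time using the bands listed for that metric and computes a change failure rate from deployment counts. It is an illustration only: the function names are hypothetical, a month is assumed to be 30 days, and any lead time longer than a month is treated as low-performing.

```python
from datetime import timedelta


def classify_lead_time(lead_time: timedelta) -> str:
    """Map a commit-to-production lead time onto the performance tiers listed above."""
    if lead_time < timedelta(days=1):
        return "elite"
    if lead_time <= timedelta(weeks=1):
        return "high"
    if lead_time <= timedelta(days=30):  # assumption: 1 month = 30 days
        return "medium"
    return "low"  # assumption: anything slower than a month counts as low


def change_failure_rate(failed_deployments: int, total_deployments: int) -> float:
    """Percentage of deployments that needed a bug fix or rollback."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    return 100 * failed_deployments / total_deployments


if __name__ == "__main__":
    # Hypothetical team: changes take 3 days to reach production, 3 of 20 deployments failed.
    print(classify_lead_time(timedelta(days=3)))
    print(f"CFR: {change_failure_rate(3, 20):.1f}%")
```

The same pattern applies to deployment frequency and MTTR; only the thresholds differ.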

In conclusion, SLOs are crucial in setting reliability targets for a service, with a recommendation for these to be under 100% to account for dependencies and potential service failures. Utilizing tools such as SLIs, SLAs, and error budgets is essential in defining the appropriate SLO baseline, usually around or above 99%. We have also highlighted the importance of core operational metrics, as suggested by DORA, in assessing the performance of a DevOps team. These metrics, including lead time for change, deployment frequency, change failure rate, and MTTR, provide tangible criteria to measure and improve a team’s efficiency and effectiveness in service delivery and incident response.

Summary

DevOps presents challenges; introduce data and those challenges intensify. This book aims to explore that intricate landscape.

Consider this: immutable objects and IaC with declarative orchestration frameworks often yield secure, dependable, and repeatable results. But what happens when you must manage entities that resist immutability? Think about databases or message queues that house data that can’t be replicated easily. These technologies are integral to production but demand unique attention.

Picture this: a Formula 1 car swaps out an entire tire assembly in mere seconds during a pit stop. Similarly, with immutable objects such as load balancers, a quick destroy-and-recreate action often solves issues. It’s convenient and rapid, but try applying this quick-swap approach to databases and you risk data corruption. You must exercise caution when dealing with mutable, data-persistent technologies.

Fast forward to recent years, and you’ll find attempts to facilitate database automation via custom resource definitions (CRDs) or operators. However, such methods have proven costly and complex, shifting the trend toward managed services. Yet, for many, outsourcing data operations isn’t the ideal solution, given the priority of data security.

Navigating DevOps and SRE best practices reveals the looming complexities in managing data-centric technologies. Despite the valuable automation tools at our disposal, maintaining the highest DevOps standards while capitalizing on this automation is anything but straightforward. We’ll delve into these challenges and potential solutions in the chapters to come.

Key benefits

  • Implement core operational capabilities via automated pipelines, including testing and rollbacks
  • Create infrastructure, deploy software, execute tests, and monitor operations using the as-code strategy
  • Automate common implementation patterns for databases with declarative orchestration frameworks
  • Purchase of the print or Kindle book includes a free PDF eBook

Description

In today's rapidly evolving world of DevOps, traditional silos are a thing of the past. Database administrators are no longer the only experts; site reliability engineers (SREs) and DevOps engineers are database experts as well. This blurring of the lines has led to increased responsibilities, making members of high-performing DevOps teams responsible for end-to-end ownership. This book helps you master DevOps for databases, making it a must-have resource for achieving success in the ever-changing world of DevOps. You’ll begin by exploring real-world examples of DevOps implementation and its significance in modern data-persistent technologies, before progressing into the various types of database technologies and recognizing their strengths, weaknesses, and commonalities. As you advance, the chapters will teach you about design, implementation, testing, and operations using practical examples, as well as common design patterns, combining them with tooling, technology, and strategies for different types of data-persistent technologies. You’ll also learn how to create complex end-to-end implementation, deployment, and cloud infrastructure strategies defined as code. By the end of this book, you’ll be equipped with the knowledge and tools to design, build, and operate complex systems efficiently.

Who is this book for?

This book is for newcomers as well as seasoned SREs, DevOps engineers, and system engineers who are interested in large-scale systems with a heavy focus on data-persistent technologies. Database administrators looking to level up in the world of DevOps will also find this book helpful. Experience with cloud infrastructure, basic development, and operations will help you get the most out of this book.

What you will learn

  • Apply DevOps best practices to data-persistent technologies
  • Get to grips with architectural-level design and implementation
  • Explore the modern data journey and data modeling with database technology
  • Master the operation of large-scale systems with zero-touch automation
  • Achieve speed, resilience, security, and operability at different scales
  • Design DevOps teams with end-to-end ownership models

Product Details

Publication date: Dec 29, 2023
Length: 446 pages
Edition: 1st
Language: English
ISBN-13: 9781837637898
Vendor: Couchbase Inc

Table of Contents

Part 1: Database DevOps
Chapter 1: Data at Scale with DevOps
Chapter 2: Large-Scale Data-Persistent Systems
Chapter 3: DBAs in the World of DevOps
Part 2: Persisting Data in the Cloud
Chapter 4: Cloud Migration and Modern Data(base) Evolution
Chapter 5: RDBMS with DevOps
Chapter 6: Non-Relational DMSs with DevOps
Chapter 7: AI, ML, and Big Data
Part 3: The Right Tool for the Job
Chapter 8: Zero-Touch Operations
Chapter 9: Design and Implementation
Chapter 10: Database Automation
Part 4: Build and Operate
Chapter 11: End-to-End Ownership Model – a Theoretical Case Study
Chapter 12: Immutable and Idempotent Logic – A Theoretical Case Study
Chapter 13: Operators and Self-Healing Data Persistent Systems
Chapter 14: Bringing Them Together
Part 5: The Future of Data
Chapter 15: Specializing in Data
Chapter 16: The Exciting New World of Data
Index
Other Books You May Enjoy

Customer reviews

Overall rating: 5 out of 5 stars (7 ratings)
5 star 100%
4 star 0%
3 star 0%
2 star 0%
1 star 0%

Sangita Mahala Jan 11, 2024
5 out of 5 stars
I highly recommend this book to anyone interested in the intersection of DevOps and database management. It's an insightful, informative, and essential read for anyone involved in building, managing, or utilizing data-persistent systems.
Amazon Verified review
Gabor Gerencser Feb 05, 2024
5 out of 5 stars
The DevOps for Databases book presents an invaluable resource for professionals engaged in database management within contemporary DevOps/SRE frameworks. This comprehensive tome adeptly blends practical insights with theoretical underpinnings, offering a well-rounded understanding of the subject matter. Drawing from extensive real-world experience and best practices, it meticulously explores various facets of database management, ranging from fundamental concepts to advanced AI applications, encompassing both relational and non-relational databases. With illustrative examples, the book delves into essential tooling, self-healing mechanisms, and a plethora of other compelling topics. I wholeheartedly recommend this publication to individuals at all career levels, as it serves as a catalyst for enhancing one's proficiency in database management.
Amazon Verified review
Michael Cade May 05, 2024
5 out of 5 stars
I received my copy of the book at the turn of the new year and took to it in some of my spare time. My specific interest was obviously around databases, but also how in 2024 we should consider the database, a stateful workload, when it comes to automation, orchestration and code. I thought the book kicked off nicely with the brief overview, without going too deep into the overarching DevOps mindset, principles and process. The ability to cover the concepts throughout and make the book engaging is the key metric for me and why it has to be a 5 star rating. If you are easily distracted like me, the best way to attack this book and similar is to set aside some time each day or week to get through a chapter or 10 pages. This will help with that concentration. As the author of 90DaysOfDevOps, I was also intrigued by how this could be used to spread the use cases to the community. In 2023 there was a section on Databases where we cover the fundamentals, but this book gave me a much bigger insight into the next level of databases and how we can really bring that level of automation and control to our mission critical data. It has also given me an idea regarding DevOps for Data Management, which would include data protection, recovery, security, intelligence and mobility when it comes to databases but also other data services we find in our environments today. I have always found that Packt books, and mostly authors, provide a great human touch and perspective to the subject matter being covered.
Amazon Verified review
Sandor Szabo Jan 07, 2024
5 out of 5 stars
I have bought this book as a means to get more information about how DevOps practices can be applied for everyday use in a corporate environment, mostly concerning databases, but also touching on best practices, security, team build-out and much more. I was very happy with the content! It's a well built, step-by-step explanation of different steps, procedures and practices, backed up with lots of demonstrable content as well. I especially loved the code segments that I could use effortlessly by slightly changing them to fit my needs to try out new settings or demonstrate for my team how the new capabilities work. An overwhelming amount of information - in a good way! - yet still so clearly being guided through the book. It's a must have and I encourage everyone to read it over and over again to discover all the important bits and pieces of how DevOps can (and should) be applied in practice!
Amazon Verified review
David Jambor Dec 30, 2023
5 out of 5 stars
Amazon Verified review