Scalable Data Streaming with Amazon Kinesis

Design and secure highly available, cost-effective data streaming applications with Amazon Kinesis

Product type: Paperback
Published in: Mar 2021
Publisher: Packt
ISBN-13: 9781800565401
Length: 314 pages
Edition: 1st Edition
Authors (4): Rajeev Chakrabarti, Tarik Makota, Brian Maguire, Danny Gagne
Table of Contents

Preface
Section 1: Introduction to Data Streaming and Amazon Kinesis
Chapter 1: What Are Data Streams?
Chapter 2: Messaging and Data Streaming in AWS
Chapter 3: The SmartCity Bike-Sharing Service
Section 2: Deep Dive into Kinesis
Chapter 4: Kinesis Data Streams
Chapter 5: Kinesis Firehose
Chapter 6: Kinesis Data Analytics
Chapter 7: Amazon Kinesis Video Streams
Section 3: Integrations
Chapter 8: Kinesis Integrations
Other Books You May Enjoy

Decoupling systems

A distributed system is composed of multiple networked servers that work together by sending messages to one another. Such systems allow applications to be built that require more compute, storage, or resiliency than is available on a single instance. Some common distributed systems are the World Wide Web, distributed databases, and scientific computing clusters. Distributed systems are often fractal: a large distributed system is typically built from smaller ones. For example, the three-tier web application, perhaps the most common architecture you will see in the wild, is often constructed of distributed databases, log analysis systems, and payment providers, each of which is a distributed system in its own right.

The need for distributed systems has increased dramatically over the past 10 years. There are three primary drivers for this: data scale, computational requirements, and organizational design and coordination. At first, these systems were brittle and challenging to manage, but over time, certain key patterns emerged that have enabled them to scale by reducing complexity.

The first key to managing complexity was the adoption of standardized interfaces and common data formats and encodings. This allowed the development of microservice-based architectures, in which different teams manage a piece of functionality and provide it as a service to the rest of the organization. This reduced the amount of coordination needed among teams and allowed each of them to iterate and release at its own pace, thereby acknowledging and leveraging Conway's Law.
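As a rough illustration of a shared data format, the sketch below defines a small message contract that a producing team and a consuming team could both code against. The `BikeEvent` type, its field names, and the choice of UTF-8 JSON with a `schema_version` field are assumptions made for this example, not something prescribed by the text.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BikeEvent:
    """Hypothetical message contract shared by producer and consumer teams."""
    schema_version: int
    station_id: str
    bikes_available: int


def encode(event: BikeEvent) -> bytes:
    """Serialize to UTF-8 JSON, the common encoding assumed in this sketch."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


def decode(payload: bytes) -> BikeEvent:
    """Parse a payload and reject schema versions this consumer does not know."""
    fields = json.loads(payload.decode("utf-8"))
    if fields.get("schema_version") != 1:
        raise ValueError(f"unsupported schema_version: {fields.get('schema_version')}")
    return BikeEvent(**fields)


# The producer and consumer only need to agree on the contract above.
payload = encode(BikeEvent(schema_version=1, station_id="station-42", bikes_available=7))
event = decode(payload)
```

Because both sides depend only on the contract, either team can change its internals and release independently, provided the encoded message stays compatible.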

Conway's Law

In 1967, Melvin Conway stated: "Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization's communication structure." This is based on the observation that people need to communicate in order to design and develop systems. Applied to microservices, this allows each group to own its services directly and to model explicitly the correspondence between the organization, its communication paths, and the software architecture.

The second key was to separate the program into different fault domains by moving to a loosely coupled architecture. This is often achieved by having one system send another system a message. However, messages sent directly from one fault domain to another made it difficult to reason about and understand the complex failure modes of these systems. By introducing asynchronous message brokers, we can define clear boundaries between different fault domains, making it possible to reason about them. The message queue acts as an invariant in the system. It provides a clean interface through which systems can send messages and retrieve them. If a receiving system is unavailable, the message broker holds the undelivered messages, known as a backlog, and that system is responsible for processing them when it resumes service.
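The backlog behavior described above can be sketched with a toy in-memory broker. This is a minimal illustration under stated assumptions, not a real message broker: `InMemoryBroker`, `Consumer`, and the `available` flag are hypothetical names, and nothing here is durable, networked, or highly available.

```python
from collections import deque


class InMemoryBroker:
    """Toy stand-in for a message broker: a FIFO buffer between two systems."""

    def __init__(self):
        self._backlog = deque()

    def send(self, message):
        # The producer depends only on the broker, never on the consumer.
        self._backlog.append(message)

    def receive(self):
        # Return the oldest message, or None when the backlog is empty.
        return self._backlog.popleft() if self._backlog else None

    def backlog_size(self):
        return len(self._backlog)


class Consumer:
    """Hypothetical downstream system that may be temporarily unavailable."""

    def __init__(self, broker):
        self.broker = broker
        self.available = True
        self.processed = []

    def poll(self):
        """Drain the backlog while available; return how many messages were handled."""
        if not self.available:
            return 0
        count = 0
        while (message := self.broker.receive()) is not None:
            self.processed.append(message)
            count += 1
        return count


broker = InMemoryBroker()
consumer = Consumer(broker)

consumer.available = False            # simulate an outage in the consumer's fault domain
for i in range(3):
    broker.send(f"event-{i}")         # the producer keeps working during the outage
outage_backlog = broker.backlog_size()

consumer.available = True             # the consumer resumes service
drained = consumer.poll()             # and works through the backlog in order
```

The producer never observes the consumer's outage; the failure stays inside one fault domain, and the only shared state is the backlog held by the broker.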

There are still many challenges to the design, deployment, and orchestration of these decoupled systems. However, the introduction of modern highly available message brokers has been key in reducing their complexity.

Now that we've seen how asynchronous messaging can separate fault domains, let's learn how it fits into distributed systems.
