In this chapter, we covered a wide range of topics: self-healing, autoscaling, logging, metrics, and distributed tracing. Monitoring a distributed system is hard. Just installing and configuring monitoring services such as Fluentd, Prometheus, and Jaeger is a non-trivial project. Managing the interactions between them, and making sure your services support logging, instrumentation, and tracing, adds another level of complexity. We've seen how Go-kit, with its middleware concept, makes it somewhat easier to add those operational concerns while keeping them decoupled from the core business logic. Once monitoring is in place for those systems, a new set of challenges arises: how do you gain insights from all the data? How can you integrate it into your alerting and incident response process? How can you continuously...