Issue 151
If you’re a tools person, this is the week you’ve been waiting for. Lots of unique and interesting stuff to learn about. Enjoy! 😎⛄
This issue is sponsored by:
Raygun + Flutter: Build more resilient mobile applications ⚡
Raygun is expanding its powerful Error Monitoring and Crash Reporting solution to Flutter. Now you can get complete visibility into the health of your Flutter applications, with rich diagnostics that take you to the root cause of errors and crashes. Read the blog today.
Articles & News on monitoring.love
Observability & Monitoring Community Slack
It’s been amazing to see the community grow throughout 2021 and into 2022. We’d love to have you join us and share what you’ve been working on.
From The Community
Managing Availability in Service Based Deployments with Continuous Testing
I think there are a lot of different names for what Salesforce refers to here as Continuous Testing, but it’s a good practice regardless, and the author does a good job explaining its role in deployment validation.
Getting visibility into your container images
This article introduces a new (to me) tool that looks super helpful for creating an inventory of all the software versions running in a container. I know that you can sort of do this with Prometheus already, but a standalone tool for audits makes a lot of sense too.
Deep Dive into Cortex Metrics
A two-part series on getting started with Cortex and Prometheus as a highly available metrics aggregation solution.
Let service teams own the service operations instead of the SRE
This is more of a DevOps/SRE-related article, but it affects many of us who own monitoring or observability services.
5 Dashboard Design Best Practices
Most teams I’ve worked with will slap a bunch of metrics and graphs together without really understanding how to use the data effectively. This is a thoughtful look at how to design a dashboard with your users in mind.
SysAdvent Day 23 - What is eBPF?
Heard a lot about eBPF but not sure how to try it out? SysAdvent brings us some examples for getting started with the BCC framework and a little Python.
Obvious Ownership: A Sensible Humane Registry
Discoverability of systems is a huge deal, but tools like Prometheus will only get you so far. How do you track ownership and relationships between seemingly disparate systems? A humane registry might be just the ticket.
Chronosphere is the only observability platform that puts you back in control by taming rampant data growth and cloud-native complexity, delivering increased business confidence. Learn how DoorDash is “no longer flying blind” with increased visibility and reliability from Chronosphere’s end-to-end solution. Learn more here. (SPONSORED)
How to Debug Ruby Performance Issues Using Profiling
This feels more like a “showcase” of the Pyroscope open source project, but it’s a nifty-looking profiling tool nonetheless.
Top AWS Lambda metrics to monitor
A quick recap of the metrics you’re likely to want to keep an eye on when dealing with Lambdas.
Five tricks for logging at scale in a Kubernetes environment with Grafana Loki
It’s a little buried below the marketing speak (not to mention the actual video hidden behind a registration form), but here’s a quick and dirty list of useful takeaways for anyone running Grafana Loki.
Tools
“A CLI tool and Go library for generating a Software Bill of Materials (SBOM) from container images and filesystems.”
“Pyroscope is an open source continuous profiling platform.”
Job Opportunities
Site Reliability Engineer at MaxMind (NA Remote)
Senior Cloud Engineer at Nebulaworks (Remote)
Infrastructure Engineer at Verishop (Remote)
Principal Site Reliability Engineer at Cribl (US Remote)
Ready to lower your AWS bill? Now might be the perfect time for an AWS Cost Optimization project with The Duckbill Group. The Duckbill Group aims for a 15-20% cost reduction in identified savings opportunities through tweaks to your architecture–or your money back. (SPONSORED)
See you next week!
– Jason (@obfuscurity) Monitoring Weekly Editor