Issue 128

Great to hear so much positive feedback on the newsletter lately. If there are any particular topics or tools you’d like to hear more about, please drop me a line. In the meantime, enjoy a ton of variety in this week’s issue, along with some fresh job postings (including one working with Tesla’s supercomputer team!). ⚡🚗🚀

This issue is sponsored by:

Get visibility into your complex applications, no matter where they’re deployed. Consolidate tools and eliminate blind spots by combining infrastructure and application monitoring, logging, RUM, and more. Get alerts in real-time, based on all your data without sampling. Check out a free trial of Splunk Observability Cloud today.

Articles & News on monitoring.love

Observability & Monitoring Community Slack

Come hang out with all your fellow Monitoring Weekly readers. I mean, I’m also there, but I’m sure everyone else is way cooler.

From The Community

Sometimes It’s Not Always DNS

You know I love a good debugging story. Proof that monitoring data is only as good as your ability to derive meaning from it.

Monitoring Alerts That Don’t Suck

Healthy alerting practices, yes please and thank you.

Building Faster Indexing with Apache Kafka and Elasticsearch

Doordash redesigned their search platform to optimize for Elasticsearch reindexing.

How to Establish a Culture of Secure DevOps

This article is largely focused on DevOps (as it applies to a Security-aware culture) but so many of these principles ring true for building an Observability-aware culture as well.

Kubernetes Dashboards: Headlamp

Walkthrough of the Headlamp dashboard for Kubernetes. Looks like a really easy tool to test out in development.

4 Key Observability Metrics for Distributed Applications

A hybrid look at the USE and RED monitoring methods, applied specifically to distributed services on Kubernetes.

Definitive guide for performance improvement tools and solutions for web applications

I don’t know that I’d call it a definitive guide, but there are some useful bits if your team is responsible for web application performance (or even if you just run your own website and want to keep tabs on things).

Monitoring New Relic Subscription Usage

Frankly, I’m kind of surprised that New Relic doesn’t expose this data natively, but it’s nice to see that NRQL can be leveraged in this manner.

What CI Observability Means for DevOps

Please share this one with your CI/CD engineers.

Re-considering Observability’s principles

I don’t necessarily agree with the author on all points, but it feels like a useful exercise to re-examine our assumptions around observability in practice.

Unpacking Observability: Understanding Logs, Events, Spans, and Traces

A friendly primer on some key Observability primitives. Moo.

Modern Incident Management for IT Teams

I’d love to see the author drill into the postmortem aspects of incident management, but this is possibly still a good reference for engineers and support folks who interface with SRE teams.

Real Time Linux Server monitoring with GLANCES

Getting some real htop vibes here. NGL, this feels like a throwback to named servers, but I’m also kind of shocked to see that it has almost 20k stars on GitHub.

Intro to K8s Monitoring — Part 1

A quick overview of some basic Kubernetes monitoring use cases and resources.

This issue is sponsored by:

Want to shape the future of an enterprise observability and AIOps platform?

Volunteer to test the new OpsRamp Free Trial monitoring application and earn some cash while helping to improve our product roadmap. If you’re a cloud engineer, monitoring nerd, or just an ops specialist with strong vision, sign up today.

Tools

kinvolk/headlamp

“Headlamp can be used in-cluster, where it’s accessed through a web browser, or as a desktop application (using the information defined in the user’s kubeconfig).”

nicolargo/glances

“Glances is a cross-platform monitoring tool which aims to present a large amount of monitoring information through a curses or Web based interface.”

Events

Monitorama PDX 2021 - September 13-15 (Portland, OR)

One of the first technical conferences to resume in-person events, Monitorama is returning to Portland, OR this fall. It looks like a return to form for one of our favorite events (ok, we might be biased). Hope to see you there!

Job Opportunities

Site Reliability Engineer at DigitalOcean (Remote)

Site Reliability Engineer at SageSure (Remote)

Site Reliability Engineer - Supercomputing at Tesla (Fremont, CA, USA)

Senior Software Engineer - DevOps at Angel Studios (Remote)

Ready to lower your AWS bill? Now might be the perfect time for an AWS Cost Optimization project with The Duckbill Group. The Duckbill Group aims for a 15-20% cost reduction in identified savings opportunities through tweaks to your architecture–or your money back. (SPONSORED)

See you next week!

– Jason (@obfuscurity) Monitoring Weekly Editor