Monitoring checklist
Episode 7

Nikolay takes us through a checklist of important things to monitor, while Michael tries to keep up.

Postgres FM · Nikolay Samokhvalov and Michael Christofides

August 19, 2022 · 27m 1s

Show Notes

Monitoring checklist (dashboard 1):

  1. TPS and (optional but also desired) QPS
  2. Latency (query duration) — at least average. Better: histogram, percentiles
  3. Connections (sessions) — stacked graph of session counts by state (first of all: active and idle-in-transaction; also interesting: idle, others) and how far the sum is from max_connections (plus pool size for PgBouncer).
  4. Longest transactions (max transaction age or top-n transactions by age), excluding autovacuum activity
  5. Commits vs rollbacks — how many transactions are rolled back
  6. Transactions left till transaction ID wraparound
  7. Replication lags / bytes in replication slot / unused replication slots
  8. Count of WALs waiting to be archived (archiving lag)
  9. WAL generation rates
  10. Locks and deadlocks
  11. Basic query analysis graph (top-n by total_time or by mean_time?)
  12. Basic wait event analysis (a.k.a. “active session analysis” or “performance insights”)
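
Several of the items above (1, 3, 5 and 6) come down to simple arithmetic over values Postgres already exposes. The following is a minimal sketch, not something taken from the episode: the function names are hypothetical, and the SQL strings only suggest one plausible way to fetch the inputs from `pg_stat_database`, `pg_stat_activity` and `pg_database`. They are not executed here, so the block runs without a database.

```python
# Hypothetical helpers for checklist items 1, 3, 5 and 6.
# The SQL constants are illustrative sources for the inputs, not executed here.

# Items 1 and 5: cumulative counters; sample twice and diff to get rates.
TXN_COUNTERS_SQL = """
select sum(xact_commit) as commits, sum(xact_rollback) as rollbacks
from pg_stat_database
"""

# Item 3: session counts by state (client backends only).
SESSIONS_BY_STATE_SQL = """
select coalesce(state, 'other') as state, count(*)
from pg_stat_activity
where backend_type = 'client backend'
group by 1
"""

# Item 6: age of the oldest unfrozen transaction ID, per database.
XID_AGE_SQL = "select datname, age(datfrozenxid) from pg_database"

# Transaction IDs are 32-bit and compared circularly, so about 2^31 are usable.
# Assumption: Postgres stops handing out XIDs somewhat before this limit,
# so treat the result of xids_left() as an upper bound.
XID_SPACE = 2**31


def tps(before, after, seconds):
    """Transactions per second between two (commits, rollbacks) samples."""
    delta = (after[0] - before[0]) + (after[1] - before[1])
    return delta / seconds


def rollback_ratio(before, after):
    """Share of transactions rolled back between two samples (0.0 if none ran)."""
    commits = after[0] - before[0]
    rollbacks = after[1] - before[1]
    total = commits + rollbacks
    return rollbacks / total if total else 0.0


def connection_headroom(counts_by_state, max_connections):
    """How far the sum of sessions in all states is from max_connections."""
    return max_connections - sum(counts_by_state.values())


def xids_left(oldest_xid_age):
    """Approximate transactions left before transaction ID wraparound."""
    return XID_SPACE - oldest_xid_age
```

For example, two samples taken 10 seconds apart, `(100, 10)` and `(700, 20)`, give `tps(...)` of 61 transactions per second, of which 10 out of 610 were rolled back. In a real dashboard these values would be plotted over time rather than computed once.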

And links to a few things we mentioned: 


------------------------


What did you like or not like? What should we discuss next time? Let us know by tweeting us at @samokhvalov and @michristofides.

If you would like to share this episode, here's a good link (and thank you!)


Postgres FM is brought to you by:


With special thanks to:

  • Jessie Draws for the elephant artwork

Topics

Postgres, PostgreSQL, Databases, SQL, technology