Alerting with Time Series


In a cloud-native infrastructure, failure is normal and expected. The loss of a single node or a dozen hard drives is handled gracefully by the systems running a datacenter, and there is no reason to page someone at 4 a.m. This calls for an alerting system that understands service availability at a global scope, yet can still give detailed reports when a service-impacting incident occurs. This talk explores how time-series-based alerting solves this problem, the Prometheus architecture behind it, and how practical anomaly detection can be implemented.

Language: English

Level: Intermediate

Fabian Reinartz

Software Engineer - CoreOS

Fabian Reinartz is a software engineer at CoreOS and one of the core developers of Prometheus, a monitoring system and time series database. Previously, he was a production engineer at SoundCloud and worked on information retrieval during his time at Saarland University.
