Ongoing Code Execution Downtime
Incident Report for CoderPad
Postmortem

We’re very sorry for the downtime.

Turns out, collectd ate all our disk space. The problem was exacerbated by a failed rollover last night, and today at about noon our worker nodes went down because the boot drive ran out of space. We will probably drop collectd for metrics-gathering entirely. Thank you for your patience.
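
For illustration only, here is a minimal sketch of the kind of free-space check that could flag a filling boot drive before worker nodes go down. It is not our actual monitoring setup: the function names, the `/` path, and the 90% threshold are all assumptions made for the example.

```python
import shutil


def disk_usage_fraction(path="/"):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    if usage.total == 0:
        # Some pseudo-filesystems report zero size; treat them as empty.
        return 0.0
    return usage.used / usage.total


def is_disk_nearly_full(path="/", threshold=0.9):
    """Return True when the used fraction meets or exceeds `threshold`.

    The 0.9 default is a hypothetical alerting threshold, not a recommendation.
    """
    return disk_usage_fraction(path) >= threshold


if __name__ == "__main__":
    fraction = disk_usage_fraction("/")
    print(f"boot drive is {fraction:.0%} full")
    if is_disk_nearly_full("/"):
        print("warning: boot drive is nearly full")
```

A check like this could run from cron or a health endpoint, so that a node reports itself unhealthy while there is still room to clean up.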

Posted Sep 28, 2018 - 12:31 PDT

Resolved
It looks like our nodes managed to run out of disk space. We're not sure why, but the service should be okay for now.
Posted Sep 28, 2018 - 12:21 PDT
Identified
We've identified unhealthy hosts in our code execution backend and are redeploying now.
Posted Sep 28, 2018 - 12:13 PDT
This incident affected: Execution Tier.