Ongoing Code Execution Downtime
Incident Report for CoderPad
Postmortem

We’re very sorry for the downtime.

It turns out collectd ate all our disk. Today at about noon, our worker nodes went down because the boot drive ran out of space, a problem made worse by a failed rollover last night. We’re probably going to drop collectd for metrics gathering entirely. Thank you for your patience.
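
A free-space check on the boot drive is one way to catch this kind of failure before nodes go down. The sketch below is illustrative only and is not CoderPad's actual monitoring: the function names, the 90% threshold, and the use of "/" as the boot drive's mount point are all assumptions.

```python
import shutil


def disk_usage_fraction(path="/"):
    """Return the fraction (0.0 to 1.0) of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def is_disk_nearly_full(used_fraction, threshold=0.9):
    """Return True when usage is at or above `threshold` (hypothetical 90% default)."""
    return used_fraction >= threshold


if __name__ == "__main__":
    # "/" stands in for the boot drive here; adjust for the real mount point.
    fraction = disk_usage_fraction("/")
    if is_disk_nearly_full(fraction):
        print(f"WARNING: boot drive is {fraction:.0%} full")
    else:
        print(f"OK: boot drive is {fraction:.0%} full")
```

Run periodically (for example from cron), a check like this gives some warning while there is still room to clean up or redeploy.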

Posted Sep 28, 2018 - 12:31 PDT

Resolved
It looks like our nodes managed to run out of disk space. We're not sure why, but the service should be okay for now.
Posted Sep 28, 2018 - 12:21 PDT
Identified
We've identified unhealthy hosts in our code execution backend and are redeploying now.
Posted Sep 28, 2018 - 12:13 PDT
This incident affected: Execution Tier.