Understanding Delivery Failures
Our delivery timeout is currently 10 seconds. We retry only on a connection failure (connection refused or socket timeout), because in those cases we can't be sure we actually reached your host. This also means you can shut down your systems for maintenance, and we'll resume delivering alerts where we left off once they are back up.
However, we will not retry if we receive an HTTP response, regardless of the HTTP status code. At that point we consider the alert delivered, even if your host had an error processing it. We do this so that one problematic alert (whatever the cause) doesn't make us retry over and over and hold up all future alert deliveries.
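The retry rule described above can be sketched from the sender's side. This is a minimal illustration in Python, not our actual delivery code: the names should_retry and deliver_once are hypothetical, and treating every non-HTTP URLError (for example, a DNS failure) as retryable is an assumption made for the sketch. Only the 10-second timeout and the "retry on connection failure, never on an HTTP response" rule come from this article.

```python
import socket
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 10  # the timeout stated in this article


def should_retry(error: BaseException) -> bool:
    """Decide whether a failed delivery attempt is worth retrying."""
    if isinstance(error, urllib.error.HTTPError):
        # The host answered with an HTTP status (even 4xx/5xx): delivered.
        return False
    if isinstance(error, urllib.error.URLError):
        # No HTTP response at all, e.g. connection refused (assumption:
        # any other URLError is treated the same way).
        return True
    # Low-level socket timeout or connection error outside urllib's wrapper.
    return isinstance(error, (socket.timeout, ConnectionError))


def deliver_once(url: str, body: bytes) -> str:
    """Make one delivery attempt and return 'delivered' or 'retry'."""
    request = urllib.request.Request(url, data=body, method="POST")
    try:
        with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS):
            return "delivered"
    except (urllib.error.URLError, socket.timeout, ConnectionError) as error:
        # HTTPError is a subclass of URLError, so it is caught here too.
        return "retry" if should_retry(error) else "delivered"
```

The practical consequence for your endpoint: a 500 response counts as a completed delivery, so an alert that fails during processing on your side will not be sent again.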
We recommend that customers, particularly high-volume customers, store the POSTed alert string in a fast storage system (e.g. a simple one-column database table) and return a 200 response immediately. Then process those alerts asynchronously on your end. This ensures that you can process the alerts on your own schedule.
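As a rough sketch of the store-then-process pattern recommended above, the following Python example writes each POSTed alert to a one-column SQLite table and returns 200 right away. Everything specific here is an assumption for illustration: the file name alerts.db, port 8080, the table and function names, and the choice of SQLite and Python's built-in http.server. A production receiver would likely use your own web framework and database.

```python
import sqlite3
from contextlib import closing
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_PATH = "alerts.db"  # hypothetical location of the alert queue


def init_db(path=DB_PATH):
    """Create the one-column table that holds raw alert strings."""
    with closing(sqlite3.connect(path)) as conn, conn:
        conn.execute("CREATE TABLE IF NOT EXISTS alerts (payload TEXT NOT NULL)")


def store_alert(payload, path=DB_PATH):
    """Save one raw alert string without parsing or validating it."""
    with closing(sqlite3.connect(path)) as conn, conn:
        conn.execute("INSERT INTO alerts (payload) VALUES (?)", (payload,))


def take_alerts(path=DB_PATH):
    """Return all stored alerts, oldest first, and delete them from the table."""
    with closing(sqlite3.connect(path)) as conn, conn:
        rows = conn.execute(
            "SELECT rowid, payload FROM alerts ORDER BY rowid"
        ).fetchall()
        conn.executemany(
            "DELETE FROM alerts WHERE rowid = ?", [(row[0],) for row in rows]
        )
    return [row[1] for row in rows]


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the body, store it as-is, and acknowledge immediately.
        length = int(self.headers.get("Content-Length", 0))
        payload = self.rfile.read(length).decode("utf-8", errors="replace")
        store_alert(payload)
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()


def main():
    init_db()
    HTTPServer(("", 8080), AlertHandler).serve_forever()
```

Calling main() starts the receiver. A separate worker (a cron job or background process) can then call take_alerts() on its own schedule and do the slow processing there, so a problem with one alert never delays the 200 response.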