Use the Errors section of the docs to make your AWS integrations more robust

If you’re implementing a cloud integration in AWS, you likely want to make it robust so that it can gracefully handle errors and recover from failures. To do this, you first need to understand the different ways in which your integration might fail.

My starting point for this is usually the AWS API docs for the service I’m integrating with. The reference page for almost every API action of every AWS service includes an “Errors” section that lists the types of errors the action can return.

Using this as the starting point, I can then:

  • Understand what validation or sanitisation I need to perform on any request parameters I provide
  • Decide if I need to add compensating logic to handle a particular error type, such as a catch block in a Lambda function or a fork in a Step Functions state machine
  • Decide what edge-case automated tests I need to write in order to verify that any compensating logic I write is correct
  • Determine if there are any operational concerns, such as throughput-exceeded or throttling errors, that I need to consider in the code that invokes the API, e.g. ensuring I don’t flood an API with a ton of parallel requests
  • See whether a particular error type is transient, and so worth retrying, or permanent. If it’s retryable, I can then decide whether to let the error propagate out of a Lambda function unhandled (which triggers an automatic retry if the function was invoked asynchronously) or whether I need to handle retrying in my own code
  • Know if this API call could produce a partial failure that I need to handle in my code (e.g. DynamoDB’s BatchWriteItem, which reports unwritten items in UnprocessedItems). Note: partial failures are typically reported in the API response object and not as an error type
  • Know what specific metrics I need to alert on in CloudWatch so I can monitor any operational issues
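
As a rough illustration of the retry and partial-failure points above, here is a minimal Python sketch. It assumes a boto3 DynamoDB client, whose batch_write_item call returns any unwritten items under UnprocessedItems. The names is_transient, batch_write_with_retry and TRANSIENT_ERROR_CODES are hypothetical, and the set of transient error codes is my own assumption of what the Errors section might lead you to, so check it against the docs for the API you are actually calling.

```python
import time

# Assumed classification for illustration only; confirm each code against
# the "Errors" section of the API you are integrating with.
TRANSIENT_ERROR_CODES = {
    "ProvisionedThroughputExceededException",
    "RequestLimitExceeded",
    "ThrottlingException",
    "InternalServerError",
}


def is_transient(error_code):
    """Return True if the error code is one we consider worth retrying."""
    return error_code in TRANSIENT_ERROR_CODES


def batch_write_with_retry(client, request_items, max_attempts=5,
                           base_delay=0.1, sleep=time.sleep):
    """Call batch_write_item, resubmitting UnprocessedItems with backoff.

    `client` is assumed to behave like boto3.client("dynamodb").
    Returns whatever is still unprocessed after the final attempt
    (an empty dict means everything was written).
    """
    pending = request_items
    for attempt in range(max_attempts):
        response = client.batch_write_item(RequestItems=pending)
        # A partial failure is not raised as an error: it shows up here.
        pending = response.get("UnprocessedItems") or {}
        if not pending:
            return {}
        if attempt < max_attempts - 1:
            # Exponential backoff before resubmitting the leftovers.
            sleep(base_delay * 2 ** attempt)
    return pending
```

If the function returns a non-empty dict, the caller still has to decide what to do with the leftover items, for example raising so that an asynchronous Lambda invocation is retried, or sending them to a queue. When catching botocore's ClientError, the error code to pass to is_transient is available at err.response["Error"]["Code"].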

Trying to understand all the failure modes of a distributed cloud system can be overwhelming, but the simple step of RTFM, specifically the errors section of the API docs, is a great place to start.
