How to wait for an async task to complete inside an E2E test

When writing an integration or E2E test, you sometimes need to wait for an asynchronous cloud-side action to complete before continuing to the next step. Writing reliable tests around this kind of waiting can be difficult.

Examples of such actions could be:

  • Any Lambda function that is triggered asynchronously (e.g. from an EventBridge event, S3 upload, DynamoDB stream, SQS message)
  • A Step Functions execution that is triggered from an async event
  • Waiting on an eventually consistent data store to be queryable (e.g. Algolia)

The amount of time to wait for such actions can be hard to gauge. E2E tests are often run immediately after a deployment, so factors such as cold starts come into play.

Consider a test case for verifying that an S3 file upload event correctly triggers a Step Functions execution.

A naive solution (using the Jest framework here) is to hardcode a wait time into your tests like so:

// ...
describe('Uploading a valid file to S3 bucket folder', () => {
  let s3SourceKey: string;
  const fileId = uuid();

  beforeAll(async () => {
    s3SourceKey = await uploadSampleFileToS3(fileId);
  });

  it('starts StepFunction execution with fileId as the execution name prefix', async () => {
    // Wait 15 seconds to allow execution time to trigger
    await new Promise((r) => setTimeout(r, 15000));

    // Now query StepFunctions API to find the execution
    const executionsResponse = await stepfunctions.listExecutions({
      stateMachineArn: STATE_MACHINE_ARN,
    }).promise();
    const foundExecution = executionsResponse.executions.find(
      e => e.name.startsWith(fileId),
    )!;
    expect(foundExecution).toBeTruthy();
    expect(foundExecution.status).not.toEqual('FAILED');
  }, 30000); // Per-test timeout: Jest's default of 5 seconds is shorter than the 15-second wait

  // ...
});

However, while this will probably work for most test runs, it can result in non-deterministic flakiness, or in a test that is always slow because the wait time is set high enough to cover the slowest of runs.

A better solution is to use a wait-and-retry polling strategy that attempts to fetch the data using a configurable number of attempts before giving up.
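To make the idea concrete before reaching for a library, here is a minimal, dependency-free sketch of such a polling loop. The `pollUntil` helper and its `maxAttempts` and `delayMs` options are hypothetical names chosen for illustration and are not part of any library mentioned here.

```typescript
// Hypothetical helper: names and options are illustrative only.
interface PollOptions {
  maxAttempts: number; // Total number of attempts, including the first
  delayMs: number; // Fixed wait between attempts
}

async function pollUntil<T>(
  fetchOnce: () => Promise<T | undefined>,
  { maxAttempts, delayMs }: PollOptions,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchOnce();
    if (result !== undefined) {
      return result; // Data has arrived, so stop polling straight away
    }
    if (attempt < maxAttempts) {
      // Nothing yet: wait before the next attempt
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`No result after ${maxAttempts} attempts`);
}
```

In the S3 test case, `fetchOnce` would be a function that lists the executions and returns the one whose name starts with `fileId`, or `undefined` if it hasn't appeared yet. The test finishes as soon as the execution shows up rather than always paying the full wait.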

The async-retry NPM module makes this easy in Node.js. Here’s how it could be used to test the same S3 to StepFunctions async use case:

import retry from 'async-retry';
// ...

describe('Uploading a valid file to S3 bucket folder', () => {
  let s3SourceKey: string;
  const fileId = uuid();

  beforeAll(async () => {
    s3SourceKey = await uploadSampleFileToS3(fileId);
  });

  it('starts StepFunction execution with fileId as the execution name prefix', async () => {
    await retry(async () => {
      const executionsResponse = await stepfunctions.listExecutions({
        stateMachineArn: STATE_MACHINE_ARN,
      }).promise();
      const foundExecution = executionsResponse.executions.find(
        e => e.name.startsWith(fileId),
      )!;
      // A failed assertion throws, which tells async-retry to try again
      expect(foundExecution).toBeTruthy();
      expect(foundExecution.status).not.toEqual('FAILED');
    }, {
      retries: 15, // Retry up to 15 times after the first attempt
      factor: 1, // Keep the wait constant (no exponential backoff)
      minTimeout: 1000, // Base wait of 1 second between attempts (async-retry randomizes this up to 2x by default)
    });
  }, 60000); // Per-test timeout: Jest's default of 5 seconds is shorter than the retry window

  // ...
});
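When choosing these options, it helps to know the worst-case time a test might spend waiting between attempts. The sketch below estimates it, assuming the backoff behaviour described in the async-retry README: each wait is `minTimeout * factor^attempt`, multiplied by a random factor between 1 and 2 when `randomize` is on (the default). It also assumes `maxTimeout` is never reached. `worstCaseWaitMs` is a hypothetical helper, not part of async-retry.

```typescript
// Hypothetical helper, not part of async-retry.
// Assumes maxTimeout is never hit and ignores the time taken by the calls themselves.
interface BackoffOptions {
  retries: number;
  factor: number;
  minTimeout: number;
  randomize: boolean;
}

function worstCaseWaitMs({ retries, factor, minTimeout, randomize }: BackoffOptions): number {
  let total = 0;
  for (let i = 0; i < retries; i++) {
    total += minTimeout * Math.pow(factor, i); // Wait before retry number i + 1
  }
  // With randomization, each wait can be up to twice its base value
  return randomize ? total * 2 : total;
}

// The options from the test above, with async-retry's default randomization
console.log(worstCaseWaitMs({ retries: 15, factor: 1, minTimeout: 1000, randomize: true })); // 30000
```

The time taken by the API calls themselves comes on top of this figure, so the Jest timeout for the test (the optional third argument to `it`, 5 seconds by default) needs to be set comfortably higher.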

There are a few AWS-specific Node.js modules which have this polling pattern built into them:

  • aws-testing-library — Jest and Chai extension functions for checking presence of data in several AWS services
  • sls-test-tools — new Jest extension library which currently verifies delivery of events to EventBridge (but I expect will support more AWS services soon)

Hope you find this useful.

—Paul.
