Common bus factor pitfalls in serverless projects

The bus factor is a measurement of the risk resulting from information and capabilities not being shared among team members, derived from the phrase “in case they get hit by a bus”.

Knowledge silos within a software development team carry a lot of risk. While I do think the nature of serverless architectures encourages a broader set of responsibilities and knowledge for individual developers on a team, there are still areas where such silos can emerge. Here are a few that I’ve noticed:

  • Infrastructure-as-Code. Some developers avoid IaC because they see their job as writing only the application code, so they need someone else to provision and configure the dependent infra for them.
  • Complex cross-cutting features that use less common cloud services or architectural patterns that deviate from the more common synchronous CRUDL commands, e.g. choreographed async event flows or Step Functions workflows.
  • IAM configuration.
  • CI/CD pipeline configuration and deployment scripts.
  • AWS account setup.
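One rough way to spot silos like the ones above is to check how many different people have touched each area of the repo. The sketch below is only an illustration under a few assumptions: the (file path, author) pairs would come from something like parsed `git log --name-only` output (parsing not shown), top-level directories stand in for "areas", and the sample paths and author names are made up.

```python
from collections import defaultdict


def authors_by_area(changes):
    """Group distinct authors by top-level directory.

    `changes` is an iterable of (file_path, author) pairs.
    """
    areas = defaultdict(set)
    for path, author in changes:
        area = path.split("/", 1)[0]  # treat the top-level directory as the area
        areas[area].add(author)
    return dict(areas)


def silo_areas(changes, min_authors=2):
    """Return the sorted areas touched by fewer than `min_authors` people."""
    return sorted(
        area
        for area, authors in authors_by_area(changes).items()
        if len(authors) < min_authors
    )


# Hypothetical sample data: directory names and authors are made up.
sample_changes = [
    ("src/orders/create.py", "alice"),
    ("src/orders/list.py", "bob"),
    ("infra/iam-roles.yml", "carol"),
    ("infra/pipeline.yml", "carol"),
    ("workflows/refund.asl.json", "alice"),
]

print(silo_areas(sample_changes))  # ['infra', 'workflows']
```

Commit counts are a crude proxy for knowledge, so treat the result as a prompt for a conversation rather than a score.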

So how can you mitigate these bus factor risks in your serverless team? Here are a few suggestions:

  • Encourage developers to deliver end-to-end vertical feature slices, rather than having separate developers write the application code and the infrastructure code.
  • Continuously deploy to production so that developers are pushed to become familiar with the CI/CD pipeline and release process from an early stage, and to learn the observability tools (if their feature breaks something in production, they’ll be responsible for debugging it there).
  • Minimise the number of different tools, languages and frameworks that are used across the application code as well as IaC and deployment scripts.
  • Practise code reviews or pair programming, especially on more complex features that use less commonly touched cloud services.
  • Document key areas such as your high-level architecture, AWS org setup and delivery process, which everyone should be able to understand or reference if they need to.
  • Create design documents and diagrams for complex use cases such as choreographed event fan-outs where even after reading through the code it would still be difficult to understand all the touchpoints in the flow.
  • Write automated tests. Even if I’ve never edited or even read a certain part of the codebase, I’ll be much less reluctant to change it if I know it is covered by tests which are run regularly.
  • If possible, use Infrastructure-as-Code (with a tool like OrgFormation) even for baseline cloud landing zone configuration. This acts as a form of documentation, because engineers needing to debug an access control issue can always check this code to see how a particular account has been set up or locked down (e.g. SSO Permission Sets and SCPs).
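To make the automated tests suggestion a bit more concrete, here is a minimal sketch of the kind of test that lets someone change code they didn't write. Everything in it is hypothetical: the `handler` function, the `orderId` path parameter and the response shape are illustrative stand-ins, not taken from any real project.

```python
import json


def handler(event, context):
    """Hypothetical Lambda-style handler that echoes an order ID from the path."""
    order_id = (event.get("pathParameters") or {}).get("orderId")
    if not order_id:
        return {
            "statusCode": 400,
            "body": json.dumps({"message": "orderId is required"}),
        }
    return {"statusCode": 200, "body": json.dumps({"orderId": order_id})}


def test_returns_400_when_order_id_missing():
    response = handler({"pathParameters": None}, None)
    assert response["statusCode"] == 400


def test_returns_order_id_when_present():
    response = handler({"pathParameters": {"orderId": "42"}}, None)
    assert response["statusCode"] == 200
    assert json.loads(response["body"]) == {"orderId": "42"}
```

A test runner such as pytest would pick up the `test_` functions automatically. The tests double as documentation of the expected behaviour for whoever touches the handler next.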

— Paul

P.S. I’m still looking for startup CTOs to interview for some research I’m doing. So if you know any early-stage tech startup teams who: 1) Have seed funding AND 2) Are NOT using a serverless architecture for their product, then I’d really appreciate you introducing me to their CTO.
