“Functions calling other functions is an anti-pattern. There are a very few edge cases where this is a valid pattern, but they are not easily broken down.”
This is a quote from Paul Johnston in his article on Serverless Best Practices. If you don’t know Paul, he’s the co-founder of the ServerlessDays worldwide conferences and all-round serverless community star. I’ve learnt a ton from reading his articles.
While I agree with Paul’s point here, if you’ve read my recent newsletter emails you’ll know that I’m skeptical of best practice recommendations without full context, so today I want to look at some of the edge cases for when this approach is a valid pattern.
But let’s first look at the main reasons for Paul’s statement…
Let’s say we have 2 Lambda functions, apiHandler1 and dataProcessor1.
apiHandler1 is triggered from an API Gateway endpoint and needs to get some data from an external API and store it in a database. Several other endpoints also require this functionality, so this logic has been separated out into its own function, dataProcessor1, which apiHandler1 invokes synchronously using the invoke operation of the Lambda SDK.
The main problem with this approach is that while dataProcessor1 is in-flight, apiHandler1’s execution is idle waiting for a response. So you are paying for the execution time of both functions, both rounded up to the nearest 100ms. Also, if dataProcessor1 fails, your code is responsible for retrying it.
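To make the cost of this concrete, here is a minimal sketch of what the synchronous call might look like in Python. It assumes a client that behaves like `boto3.client("lambda")`; the `itemId` event field, the `invoke_sync` helper and the retry count are hypothetical names chosen for illustration, not taken from any real project.

```python
import json


def invoke_sync(lambda_client, function_name, payload, max_attempts=3):
    """Invoke a function synchronously and return its decoded JSON result.

    lambda_client is expected to behave like boto3.client("lambda").
    """
    last_error = None
    for _ in range(max_attempts):
        # The caller blocks here (and keeps being billed) until the callee returns.
        response = lambda_client.invoke(
            FunctionName=function_name,
            InvocationType="RequestResponse",
            Payload=json.dumps(payload).encode("utf-8"),
        )
        body = json.loads(response["Payload"].read())
        # "FunctionError" is only present when the invoked function failed.
        if "FunctionError" not in response:
            return body
        # Retrying is our job: Lambda does not retry synchronous invocations.
        last_error = body
    raise RuntimeError(
        f"{function_name} failed after {max_attempts} attempts: {last_error}"
    )


def api_handler_1(event, context, lambda_client=None):
    """Hypothetical handler for apiHandler1; 'itemId' is an assumed event field."""
    if lambda_client is None:
        import boto3  # available in the Lambda Python runtime

        lambda_client = boto3.client("lambda")
    result = invoke_sync(lambda_client, "dataProcessor1", {"itemId": event["itemId"]})
    return {"statusCode": 200, "body": json.dumps(result)}
```

Note how the retry loop and the error check both live in the caller, which is exactly the responsibility described above.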
Paul recommends this solution:
“Functions should push data to a data store or queue, which should trigger another function if more work is needed.”
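As a rough illustration of Paul's suggestion, the sketch below uses an SQS queue between the two functions. It assumes a client that behaves like `boto3.client("sqs")` and an SQS trigger on the consuming function; the queue URL, the `enqueue_work` helper and the `process_item` placeholder are hypothetical.

```python
import json

# Hypothetical queue URL, for illustration only.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/data-processor-queue"


def enqueue_work(sqs_client, queue_url, payload):
    """Producer side: push the work item onto the queue and return immediately."""
    response = sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps(payload),
    )
    return response["MessageId"]


def process_item(item):
    """Placeholder for the real work (call the external API, write to the database)."""
    return {"itemId": item["itemId"], "stored": True}


def data_processor_1(event, context):
    """Consumer side: an SQS-triggered handler receives messages in event["Records"]."""
    return [process_item(json.loads(record["body"])) for record in event["Records"]]
```

The producer never waits on the consumer, so only one function is running (and being billed) at any given moment for a piece of work.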
I would also add that if your calling function is user-facing and needs to wait on the response, then you should simply keep any reused logic (such as that performed by dataProcessor1 in this example) as a simple shared library package rather than deploy it as a standalone Lambda function.
When you add a Lambda function to a VPC, it loses internet access by default. This means you cannot call other AWS APIs (S3, DynamoDB, etc.) from the function without adding a pretty expensive always-on NAT Gateway or setting up VPC endpoints for the services you need.
You can avoid this by using the VPC Proxy Lambda Function pattern, which involves one function outside a VPC synchronously invoking another function that’s inside the VPC.
This pattern still suffers from the same issue of being charged twice for concurrently executing functions, but in many cases this will cost less than paying for a NAT Gateway. Another plus is that you avoid all the overhead that goes with managing network config.
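Here is one way the proxy arrangement could be sketched, under some loud assumptions: the outer function needs data that only an in-VPC function can reach (say, a private database) and then writes it to S3 itself. The function name `vpcDataReader`, the bucket name and the `reportId` event field are all hypothetical, and the clients are assumed to behave like `boto3.client("lambda")` and `boto3.client("s3")`.

```python
import json


def proxy_handler(event, context, lambda_client=None, s3_client=None):
    """Runs OUTSIDE the VPC, so it can reach AWS APIs such as S3 directly."""
    if lambda_client is None or s3_client is None:
        import boto3  # available in the Lambda Python runtime

        lambda_client = lambda_client or boto3.client("lambda")
        s3_client = s3_client or boto3.client("s3")

    # Synchronously invoke the function that lives INSIDE the VPC
    # (hypothetical name) to read from the private resource.
    response = lambda_client.invoke(
        FunctionName="vpcDataReader",
        InvocationType="RequestResponse",
        Payload=json.dumps({"reportId": event["reportId"]}).encode("utf-8"),
    )
    body = json.loads(response["Payload"].read())
    if "FunctionError" in response:
        raise RuntimeError(f"vpcDataReader failed: {body}")

    # The AWS API call happens here, outside the VPC, so no NAT Gateway is needed.
    s3_client.put_object(
        Bucket="example-report-bucket",  # hypothetical bucket
        Key=f"exports/{event['reportId']}.json",
        Body=json.dumps(body).encode("utf-8"),
    )
    return {"statusCode": 200, "body": json.dumps({"rowsExported": len(body)})}
```

Both functions are running while the inner one works, so the double billing is still there; the trade-off is that cost against an always-on NAT Gateway.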
The invoke operation of the Lambda API also supports asynchronously invoking a function.
By doing this, apiHandler1 from our earlier example would asynchronously invoke dataProcessor1 and receive an immediate acknowledgement, allowing the API call to return quickly to the user and stop the billing clock. Also, if dataProcessor1 fails, then AWS will automatically retry the invocation twice and then route it to a dead letter queue if you have configured one.
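A minimal sketch of the asynchronous variant follows, again assuming a client that behaves like `boto3.client("lambda")`. The handler name `api_handler_1_async` and the `itemId` event field are hypothetical; the only real change from a synchronous call is `InvocationType="Event"`.

```python
import json


def api_handler_1_async(event, context, lambda_client=None):
    """Fire-and-forget: hand the work to dataProcessor1 and return straight away."""
    if lambda_client is None:
        import boto3  # available in the Lambda Python runtime

        lambda_client = boto3.client("lambda")

    response = lambda_client.invoke(
        FunctionName="dataProcessor1",
        InvocationType="Event",  # asynchronous invocation
        Payload=json.dumps({"itemId": event["itemId"]}).encode("utf-8"),
    )
    # Lambda answers with 202 once the event has been accepted for processing;
    # no result from dataProcessor1 is available to return to the client.
    accepted = response["StatusCode"] == 202
    return {
        "statusCode": 202 if accepted else 500,
        "body": json.dumps({"accepted": accepted}),
    }
```

The retries and any dead letter queue are configured on dataProcessor1 itself rather than in this code, so the caller stays this small.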
Of course, this pattern only works if the calling function doesn’t need to return the data supplied by the invoked function to the client. If that’s OK for your use case, you don’t need the pub/sub capability that SNS gives you, and you don’t want to configure another AWS resource for a simple async flow, then I think this is a perfectly valid use.
I would add that I would probably only use this pattern between 2 functions within the same microservice; otherwise you introduce configuration inter-dependencies that make deployment more complex.
Do you know of any other valid use cases for functions calling other functions? If so, I’d love to hear them. Hit reply and let me know.