Tool calls with AWS Bedrock are easier than you think


I had a use case come up recently where we wanted to keep all of the project's data inside its own AWS account.

Lots of people have a fairly rational fear of handing their data over to big tech products like ChatGPT. Despite OpenAI claiming it does NOT train on that data, the requirement was to keep everything in AWS, since AWS already has all of our data anyway. I am not a lawyer, and I don't play one on the internet, so double-check your terms of service.

I chose to give AWS Bedrock a spin, specifically its Converse API.

I was surprised to see there wasn't anything that needed to be provisioned. Converse's serverless inference just worked out of the box. That blew my mind a bit, but then again, why would you need to provision anything?

AWS can just charge per invocation; it's not like the model holds persistent data or stores your code the way a Lambda does.

Setup:

It's super simple: you use the AWS SDK v3 and send ConverseCommand requests to the Bedrock Runtime client. Include the tool definition in the request, and the model will respond just like you would expect.
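Here's a minimal sketch of what that looks like in TypeScript. The region and the get_weather tool are placeholders I made up for illustration; the ConverseCommand and toolConfig shapes come from @aws-sdk/client-bedrock-runtime.

```typescript
import {
  BedrockRuntimeClient,
  ConverseCommand,
} from "@aws-sdk/client-bedrock-runtime";

// Region is an assumption; use whichever region you enabled the model in.
const client = new BedrockRuntimeClient({ region: "us-east-1" });

// A made-up weather tool, just to show the shape of a tool definition.
const toolConfig = {
  tools: [
    {
      toolSpec: {
        name: "get_weather",
        description: "Get the current weather for a city.",
        inputSchema: {
          json: {
            type: "object",
            properties: {
              city: { type: "string", description: "City name" },
            },
            required: ["city"],
          },
        },
      },
    },
  ],
};

const response = await client.send(
  new ConverseCommand({
    // Depending on your region you may need the cross-region inference
    // profile ID instead, e.g. "us.amazon.nova-lite-v1:0".
    modelId: "amazon.nova-lite-v1:0",
    messages: [
      { role: "user", content: [{ text: "What's the weather in Denver?" }] },
    ],
    toolConfig,
  })
);

// When the model decides to call the tool, stopReason comes back as
// "tool_use" and the call appears as a toolUse block in the message content.
console.log(response.stopReason);
console.log(JSON.stringify(response.output?.message?.content, null, 2));
```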

How did it perform?

I was able to get Amazon's Nova Lite to make simple tool calls, no problem. I decided to try my luck with Nova Micro to see how that ran, and it correctly made the same tool call with exactly the right parameters.
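If you're curious what the second half of the loop looks like, here's a sketch that continues from the snippet above: pull the toolUse block out of the response, run the tool yourself, and hand the result back as a toolResult block. The runGetWeather stub is hypothetical; the toolUse/toolResult shapes are the Converse API's own.

```typescript
// Hypothetical local implementation of the tool; yours would do real work.
async function runGetWeather(input: { city: string }) {
  return { city: input.city, tempF: 72, conditions: "sunny" };
}

// Find the toolUse block the model emitted in the first response.
const toolUse = response.output?.message?.content?.find(
  (block) => block.toolUse
)?.toolUse;

if (toolUse?.toolUseId) {
  const result = await runGetWeather(toolUse.input as { city: string });

  const followUp = await client.send(
    new ConverseCommand({
      // Nova Micro handled the same call correctly; swap model IDs to compare.
      modelId: "amazon.nova-micro-v1:0",
      messages: [
        { role: "user", content: [{ text: "What's the weather in Denver?" }] },
        response.output!.message!, // the assistant turn containing the toolUse
        {
          role: "user",
          content: [
            {
              toolResult: {
                toolUseId: toolUse.toolUseId,
                content: [{ json: result }],
              },
            },
          ],
        },
      ],
      toolConfig,
    })
  );

  // The model folds the tool result into a normal text answer.
  console.log(followUp.output?.message?.content?.[0]?.text);
}
```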

What did it cost?

I can't go into too much detail about what I was using it for right now, but I was able to get each inference to run for about $0.00002. If this were running on a website with 100,000 executions a day, we are talking about $2 per day.
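As a sanity check, that figure lines up with token-based pricing. The per-token rates and token counts below are assumptions on my part (roughly Nova Micro's published on-demand rates when I looked), so verify against the current Bedrock pricing page:

```typescript
// Back-of-the-envelope cost check. Rates and token counts are assumptions,
// not measurements -- check the Bedrock pricing page for current numbers.
const INPUT_RATE = 0.035 / 1_000_000; // ~$0.035 per 1M input tokens (assumed)
const OUTPUT_RATE = 0.14 / 1_000_000; // ~$0.14 per 1M output tokens (assumed)

const inputTokens = 500; // prompt + tool definitions (assumed)
const outputTokens = 50; // a short tool call (assumed)

const perInference = inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;

console.log(perInference);           // ~0.0000245, i.e. about $0.00002
console.log(perInference * 100_000); // ~2.45/day at 100k runs, near the $2 above
```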

Now that is without any caching or high-performance tuning. Add that in, and we could cut that down a bit more.

My plan is to dig into this a bit deeper in future posts.

If you want to get access to hands-on workshops about how to do serverless inference at scale on AWS, check out the Schematical Group Coaching Community.