Host your LLM in the browser with WebLLM

Are you paying too much to host your LLMs on AWS?
In a previous post I talked about creative (and possibly extreme) ways to save on hosting costs for an image classifier. Today I want to talk about a wild way to save on hosting an LLM.
Meet WebLLM, an open-source engine that runs LLMs directly in the browser. How? It uses WebGPU for hardware acceleration: the browser exposes a WebGPU API that gives the page access to the GPU already sitting in the user's laptop. This means no need for you to spin up expensive GPUs server side.
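Here is a minimal sketch of what that looks like, using the @mlc-ai/web-llm package. The model ID below is one of WebLLM's prebuilt models, but the exact list changes between releases, so treat it as a placeholder:

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Bail out early if the browser has no WebGPU support.
  if (!("gpu" in navigator)) {
    throw new Error("This browser does not support WebGPU.");
  }

  // Download the model weights and compile the WebGPU kernels.
  // All of this happens on the user's machine, not on your server.
  const engine = await CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC", {
    initProgressCallback: (p) => console.log(p.text),
  });

  // The chat API mirrors OpenAI's, so existing client code ports over easily.
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: "What bird has the loudest call?" }],
  });
  console.log(reply.choices[0].message.content);
}

main();
```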
Sounds amazing, right? Well, the drawback is that it's a little slow compared to server-hosted LLMs, and the speed depends entirely on the hardware in the user's machine.
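One way to soften that latency is streaming: WebLLM supports the OpenAI-style stream flag, so tokens show up as they're generated instead of all at once. A sketch, reusing the engine from the snippet above:

```ts
// Stream the completion so the user sees tokens immediately.
const chunks = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Describe a heron's hunting style." }],
  stream: true,
});

let text = "";
for await (const chunk of chunks) {
  text += chunk.choices[0]?.delta?.content ?? "";
  console.log(text); // Update your UI here instead.
}
```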
Is it worth it? You tell me. If you were charging people money for it, probably not. But if you're a hobbyist who trained an LLM on a bunch of niche knowledge about your favorite hobby (for some reason bird watching comes to mind) and want to give it away for free to people who aren't that technical, then I would strongly consider this to save you $$$.
If that's not your situation and you want to host LLMs cost-efficiently on AWS, reach out to me and let's get you set up.
Question For You: How are you hosting your LLMs?