Moore's Law Applied to LLM Context Windows
If you're not familiar with Moore's Law, it basically states that the compute power of electronic devices doubles about every two years. Gordon Moore's original observation was specifically about the number of transistors on an integrated circuit, but as the technology evolved the law has been extrapolated to CPUs, GPUs, and memory.
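Just to make the doubling concrete, here's a quick back-of-the-envelope sketch in Python; the starting value and the two-year doubling period are illustrative assumptions, not exact historical figures:

```python
def moores_law(start: float, years: float, doubling_period: float = 2.0) -> float:
    """Project a quantity forward assuming it doubles every `doubling_period` years."""
    return start * 2 ** (years / doubling_period)

# Example: something that starts at 1,000 units grows ~32x over 10 years.
print(moores_law(1_000, 10))  # 32000.0
```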
To put it in perspective, when I built my first computer I used a hard drive with something like 256MB of storage. Nowadays even your watch or doorbell has more than 10x that, while my current desktop has 3TB of storage.
My theory is that we will see a similar trajectory in the context windows of large models like LLMs, or really any generic model. Both input and output sizes will likely grow at a similar rate, possibly even faster.
If you're not familiar with the term, a "context window" is the amount of information you can feed back into an LLM so that it has "context" for the problem you are trying to solve.
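If context windows really do follow a similar doubling curve, a toy projection might look something like this; the 128,000-token starting point and the two-year doubling period are purely assumptions for illustration, not a forecast for any particular model:

```python
# Toy projection of context window growth under a Moore's-Law-style doubling
# assumption. The starting size and doubling period are assumptions, not data.

start_tokens = 128_000       # assumed present-day context window, in tokens
doubling_period_years = 2.0  # assumed doubling cadence

for years_out in (2, 4, 6, 8, 10):
    projected = start_tokens * 2 ** (years_out / doubling_period_years)
    print(f"+{years_out:>2} years: ~{projected:,.0f} tokens")
```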
I am sure someone has already made similar proclamations but if not feel free to call this “Lea’s Law” (jk).
I am curious if anyone disagrees. Let me know your thoughts!