LLMs are restricted by a context window that limits how much data they can process per interaction. GPT-4's extended variant, for instance, offers a context window of roughly 32,000 tokens, far too small to hold enterprise-scale datasets in a single prompt. An LLM Scratchpad works around this limitation with a dynamic temporary memory model, letting the LLM tackle large-data problems it otherwise could not handle.
The two most important engineering components of a Scratchpad are Variables and Tools. Variables hold data in memory where the LLM agent can reference them by name, giving the model access to data without that data ever being written into its limited context window. Tools are simply Python functions built to perform a specific task on one or more Variables. A minimal sketch of both appears below.
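Here is a minimal sketch of these two components, assuming Variables live in a plain dict and Tools are registered as ordinary Python functions. The names used here (Scratchpad, set_variable, register_tool, run_tool) are illustrative, not from the video or any particular library:

```python
class Scratchpad:
    """Illustrative sketch: named Variables plus named Tools."""

    def __init__(self):
        self.variables = {}  # named data kept out of the context window
        self.tools = {}      # named Python functions that operate on Variables

    def set_variable(self, name, value):
        self.variables[name] = value

    def register_tool(self, name, fn):
        self.tools[name] = fn

    def run_tool(self, tool_name, *variable_names, **kwargs):
        # Look up the Tool, resolve Variable names to their stored values,
        # then apply the function. The LLM only ever handles the names.
        fn = self.tools[tool_name]
        args = [self.variables[v] for v in variable_names]
        return fn(*args, **kwargs)
```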
Justin Pounders walks through both in Cook a Scrumptious Solution with LLM Scratchpads. In the video, he offers an analogy that drives home the key points:
It’s kind of like cooking. The Variables are the ingredients (onions, peppers, meat, stock), and the Tools are the techniques (chopping, boiling, roasting). The LLM comes up with a “recipe” that applies the techniques to the ingredients, or in our case, it applies the Tools to the Variables to answer a question.
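To make the recipe analogy concrete, here is a hypothetical run against the Scratchpad sketch above. The sales data, Variable names, and Tool names are invented for illustration; the point is that the bulky Variable never enters the context window, and only the small final answer does:

```python
# Hypothetical usage: a large dataset stays in a Variable; the LLM sees
# only the names "sales", "filter_region", and "total", never the raw rows.
pad = Scratchpad()
pad.set_variable("sales", [("east", 120.0), ("west", 340.0), ("east", 75.5)])

# Two Tools (the "techniques"): filter rows by region, then sum the amounts.
pad.register_tool("filter_region",
                  lambda rows, region: [r for r in rows if r[0] == region])
pad.register_tool("total", lambda rows: sum(amount for _, amount in rows))

# The "recipe" the LLM might produce: filter, store the intermediate
# result as a new Variable, then total it.
pad.set_variable("east_sales", pad.run_tool("filter_region", "sales", region="east"))
answer = pad.run_tool("total", "east_sales")
print(answer)  # 195.5 -- only this small result needs to enter the context
```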
While this is an advanced topic, Justin does a great job of breaking it down into digestible pieces. Additionally, we created a supplemental report, Conquering LLM Context Constraints, that summarizes key points from Episodes 5, 12, and 13 of our Generative AI and LLM strategy series.
I hope it helps accelerate your company’s Generative AI transformation.