Building a demonstration application based on sample data sets is relatively straightforward; however, it is much more challenging to build and deploy an application that solves problems in the enterprise. Today, I am going to share the top five challenges we've seen teams face when beginning this journey. I drew inspiration from Chip Huyen's fantastic article on building LLM (Large Language Model) applications for production.
- Scaling prompt engineering. Prompt engineering is an approach to customizing large language models by optimizing the content fed into the prompt. While it's straightforward for simple use cases, production applications surface many challenges: ambiguity in how large language models interpret prompts and generate results, prompt incompatibility between different large language models, and the maintenance burden as prompts accumulate. These challenges are further exacerbated when the prompts require interaction and input from users.
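One common way teams tame the compatibility and maintenance problems is to keep prompts out of application code, in a versioned registry keyed by model. The sketch below illustrates that pattern; the model names, version tags, and template wording are illustrative assumptions, not any particular vendor's format.

```python
# Minimal sketch of a versioned prompt registry. The registry pattern is
# the point; the models and template text are hypothetical examples.
from string import Template

PROMPT_REGISTRY = {
    # (model, version) -> template; different models often need
    # different phrasing or delimiters for the same task.
    ("gpt-4", "v2"): Template(
        "Summarize the text between triple quotes in one sentence.\n"
        '"""$text"""'
    ),
    ("llama-2", "v1"): Template(
        "[INST] Summarize in one sentence: $text [/INST]"
    ),
}

def build_prompt(model: str, version: str, **fields) -> str:
    """Render the registered template, failing loudly on unknown models."""
    template = PROMPT_REGISTRY[(model, version)]
    return template.substitute(**fields)

prompt = build_prompt("gpt-4", "v2", text="LLMs are hard to productionize.")
print(prompt)
```

Because each (model, version) pair is explicit, swapping models or rolling back a prompt change becomes a data change rather than a code change.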
- Latency. This one is straightforward: GPT-4, and even GPT-3.5, are simply too slow for many applications. There are some impressive demos on the internet, such as Khan Academy's TED talk with large language models responding in real time. In reality, delays can range from 5 to 30 seconds depending on the query, which is far too slow for any real-time application.
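Before committing to a real-time UX, it is worth measuring percentile latency rather than trusting a single demo run. A rough sketch, in which `fake_llm_call` is a stand-in for a real API client (an assumption, not an actual SDK):

```python
# Hedged sketch: profile p50/p95 latency of a model call. Replace
# fake_llm_call with your real client; the sleep simulates the
# 5-30 second delays seen in practice (scaled down here).
import statistics
import time

def fake_llm_call(prompt: str) -> str:
    time.sleep(0.01)  # stand-in for real model latency
    return "response"

def latency_profile(n: int = 20) -> dict:
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fake_llm_call("hello")
        samples.append(time.perf_counter() - start)
    samples.sort()
    return {
        "p50": statistics.median(samples),
        "p95": samples[int(0.95 * (n - 1))],
    }

profile = latency_profile()
print(f"p50={profile['p50']:.3f}s p95={profile['p95']:.3f}s")
```

The p95 figure usually matters more than the median: a chat interface that is snappy half the time and stalls for 30 seconds the other half still feels broken.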
- Optimization patterns. Customizing large language models for specific business problems is challenging because there are so many options, and the approaches and terminology are often inconsistent. Broadly speaking, the choice comes down to optimizing the prompt for a general model or fine-tuning a model on your data. But each approach has many sub-variants, as well as ways to integrate other LLMs into the workflow.
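The prompt-vs-fine-tune decision can at least be made explicit. The heuristic below is my own illustrative framing with invented thresholds, not an established rule; the point is to encode the trade-off somewhere inspectable rather than deciding ad hoc.

```python
# Illustrative decision heuristic (assumed thresholds, not a standard):
# few labeled examples favor prompting; large labeled datasets or strong
# domain-style requirements favor fine-tuning.
def choose_approach(labeled_examples: int, needs_domain_style: bool) -> str:
    if labeled_examples >= 1000 or needs_domain_style:
        return "fine-tune a model on your data"
    if labeled_examples >= 10:
        return "prompt engineering with few-shot examples"
    return "prompt engineering (zero-shot), collect data toward fine-tuning"

print(choose_approach(labeled_examples=50, needs_domain_style=False))
```

Even a crude rule like this forces a team to state how much labeled data it actually has, which is often the deciding constraint in practice.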
- Tooling. The tools are either too early in development or nonexistent. I've covered some of the more popular ones recently, such as LangChain and LlamaIndex, and many more are coming. But for the moment, building an application requires a lot of customization.
- Agent policies. As I've discussed before, the major obstacle to achieving Enterprise AGI is finding a way to bring a human into the loop of training the control agent to perform specific tasks. So far, I haven't seen this problem solved convincingly, even on simple projects.
None of these challenges is a showstopper, and every day teams are discovering and sharing workarounds. However, any roadmap for deploying an LLM system should allow for an enormous amount of variability, to account both for these challenges and for unknown ones that may arise.