Performance-Driven Development™
Our proprietary approach for creating transparency and getting LLM projects on track
How Performance-Driven Development™ works
01
We capture representative data & tasks
We develop a representative set of data and tasks in close collaboration with your team. Data security concerns? Don’t worry, we can generate data to get started.
02
We build & optimize the LLM solution
We select the highest-impact LLM optimization techniques to get the performance you need. Techniques include selecting models, improving prompts, adding context, and using agents or tools. Expensive, high-risk options (like fine-tuning) are a last resort.
03
We build your Performance Evaluation Framework
The foundation of Performance-Driven Development is the solution’s Performance Evaluation Framework. It generates the metrics you need for transparency.
04
We demonstrate progress with performance reports
Each week we generate performance reports and review them closely with you. You see progress across tasks, solution effectiveness, cost, speed, and any other critical metrics.
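As a rough illustration of the four steps above, here is a minimal Python sketch of an evaluation harness that runs a set of tasks against a solution and rolls the results up into report metrics. Everything in it is an assumption made for illustration: the `Task` and `Result` types, the `(answer, cost)` return shape, the substring-match pass criterion, and the `stub_solution` stand-in are hypothetical, not the actual Performance Evaluation Framework.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Task:
    prompt: str    # e.g. a customer question
    expected: str  # text a correct answer must contain (assumed pass criterion)


@dataclass
class Result:
    passed: bool
    seconds: float
    cost: float


# Hypothetical interface: a solution takes a prompt and returns (answer, cost).
Solution = Callable[[str], Tuple[str, float]]


def evaluate(tasks: List[Task], solution: Solution) -> List[Result]:
    """Run every task once, recording correctness, latency, and cost."""
    results = []
    for task in tasks:
        start = time.perf_counter()
        answer, cost = solution(task.prompt)
        seconds = time.perf_counter() - start
        passed = task.expected.lower() in answer.lower()
        results.append(Result(passed, seconds, cost))
    return results


def summarize(results: List[Result]) -> Dict[str, float]:
    """Aggregate per-task results into report-level metrics."""
    if not results:
        raise ValueError("no results to summarize")
    n = len(results)
    return {
        "tasks": n,
        "pass_rate": sum(r.passed for r in results) / n,
        "avg_seconds": sum(r.seconds for r in results) / n,
        "avg_cost": sum(r.cost for r in results) / n,
    }


def stub_solution(prompt: str) -> Tuple[str, float]:
    """Stand-in for a real LLM call; returns a canned answer and a fixed cost."""
    answer = "Paris" if "France" in prompt else "unknown"
    return answer, 0.002


tasks = [
    Task("What is the capital of France?", "paris"),
    Task("What is the capital of Peru?", "lima"),
]
report = summarize(evaluate(tasks, stub_solution))
print(report)
```

In practice the pass criterion would be replaced by whatever measure of effectiveness fits your goals, and the summary would be generated each week so the numbers can be compared over time.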
EXAMPLE PERFORMANCE REPORT
Measure progress with customized metrics.
Tasks are based on your goals, such as customer questions on a RAG solution. Each week we review progress with your team and take action based on your business goals.
| Week | Tasks | Performance Confidence | Average Time | Average Cost | Recommendation |
| --- | --- | --- | --- | --- | --- |
| 9/9/24 | 15 | Medium | 1.6s | .002 | Baseline data parsing and prompts working as expected. LLM getting confused because it lacks context on the 3 complex questions. |
| 9/16/24 | 15 | High | 1.4s | .002 | Adding context worked as discussed, confidence high on all tasks. Will add more customer tasks. |
| 9/23/24 | 36 | Medium | 1.4s | .002 | Total tasks increased, but performance degraded on multi-step analysis. Will test a larger LLM. |
| 9/30/24 | 36 | High | 8.4s | .015 | Achieved high performance on all tasks. Next steps are adding high-risk scenarios identified by legal and attempting to improve latency and cost. |
| 10/17/24 | 45 | High | 3.2s | .011 | Successfully handled all of legal’s high-risk scenarios. Task filtering reduced cost and latency. |
Performance-Driven Development creates transparency and gets your AI initiatives on track
| | WITHOUT PERFORMANCE-DRIVEN DEVELOPMENT™ | WITH PERFORMANCE-DRIVEN DEVELOPMENT™ |
| --- | --- | --- |
| Schedule | You're stuck in experimentation mode. | You make steady improvements and hit your milestones. |
| Performance | You lack metrics and rely on subjective user feedback. | You have granular, relevant metrics and get specific user feedback. |
| Robustness | You inadvertently degrade your solution while trying to make improvements. | You instantly detect and prevent potential errors with your performance evaluation workflow. |
| Alignment | Your organization doesn’t have a shared understanding of your AI goals. | Everyone understands your AI goals and how your solution supports them. |
| Risk | Progress stalls because you cannot address legal and security concerns. | You build trust with legal and security by overcoming their objections through metrics. |
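To make the Robustness point concrete, one way to catch an inadvertent degradation is to compare per-task outcomes between two evaluation runs and flag anything that used to pass and no longer does. The sketch below assumes each run is stored as a mapping from a task ID to a pass/fail flag; the function name `find_regressions` and the task IDs are hypothetical.

```python
from typing import Dict, List


def find_regressions(previous: Dict[str, bool], current: Dict[str, bool]) -> List[str]:
    """Return the IDs of tasks that passed in the previous run but not in the current one.

    A task missing from the current run is treated as a regression, on the
    assumption that silently dropping a task should not hide a failure.
    """
    return sorted(
        task_id
        for task_id, passed in previous.items()
        if passed and not current.get(task_id, False)
    )


last_week = {"q1": True, "q2": True, "q3": False}
this_week = {"q1": True, "q2": False, "q3": True}
regressions = find_regressions(last_week, this_week)
print(regressions)
```

A check like this can run after every change, so a prompt tweak that fixes one task but breaks another is surfaced before it reaches users.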