# LLM Routing Engine - Integration Checklist

**Owner:** Development Team
**Status:** ✅ Implementation Complete | ⏳ Integration In Progress
**Last Updated:** October 19, 2025
## 🎯 Phase 1: Fix Pre-Existing Build Issues

### ✅ Identified Issues
| File | Issue | Line | Severity | Fix |
|---|---|---|---|---|
| FileUploadService.java | Type mismatch: Map passed to setMetadata(String) | 91, 111 | 🔴 HIGH | Convert Map to JSON string or change method signature |
### 📋 Action Items

- [ ] **Fix FileUploadService.java:91**

  ```java
  // BEFORE:
  record.setMetadata(fileMetadata); // fileMetadata is Map<String,Object>

  // AFTER (Option 1 - Recommended):
  record.setMetadata(objectMapper.writeValueAsString(fileMetadata));

  // OR AFTER (Option 2):
  // Update the FileRecord model to accept a Map in setMetadata()
  ```

- [ ] **Fix FileUploadService.java:111** (same issue, same fix)

- [ ] **Verify build passes**

  ```bash
  mvn clean compile -DskipTests
  ```
## 🚀 Phase 2: Deploy LLM Routing Services

### ✅ Files Ready

- ✅ `LLMRoutingService.java` (588 lines)
- ✅ `LLMCostCalculatorService.java` (305 lines)
- ✅ `LLMRoutingController.java` (417 lines)
### 📋 Action Items

- [ ] **Verify Spring bean registration**

  ```bash
  grep -n "@Service\|@RestController\|@Component" valkyrai/src/main/java/com/valkyrlabs/valkyrai/service/LLM*
  grep -n "@Service\|@RestController\|@Component" valkyrai/src/main/java/com/valkyrlabs/valkyrai/controller/LLM*
  ```

- [ ] **Run unit tests**

  ```bash
  mvn test -pl valkyrai -Dtest=LLMRoutingServiceTests
  ```

- [ ] **Add to Spring Security configuration (if needed)**
  - Determine which endpoints require which roles
  - Add `@PreAuthorize` annotations if not already present (see the sketch after this list)
  - Update the security context if needed
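If these endpoints need locking down, a minimal sketch of what the annotation could look like. The endpoint path mirrors the `/stats` endpoint tested in Phase 6; the `ADMIN` role name and the `id` property on the authenticated principal are assumptions, not the project's actual security model:

```java
import java.util.UUID;

import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class LLMRoutingSecuritySketch {

    // Admins may read anyone's stats; other callers only their own.
    // ROLE_ADMIN and the principal's `id` property are assumptions.
    @PreAuthorize("hasRole('ADMIN') or #principalId == authentication.principal.id")
    @GetMapping("/v1/llm/stats/{principalId}")
    public ResponseEntity<String> getUsageStats(@PathVariable UUID principalId) {
        return ResponseEntity.ok("usage stats for " + principalId);
    }
}
```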
## 🔌 Phase 3: Integrate with WorkflowService

### 📋 Action Items

#### 3.1 Update ValkyrWorkflowService

- [ ] **Inject routing service**

  ```java
  @Autowired
  private LLMRoutingService routingService;
  ```
- [ ] **Add routing logic to executeTask()**

  Location: `valkyrai/src/main/java/com/valkyrlabs/workflow/ValkyrWorkflowService.java`

  ```java
  // In the executeTask() method, before executing the ExecModule.
  // Declare selectedModel outside the if-block so it is still in scope
  // when usage is recorded after execution:
  String selectedModel = null;
  if (isLLMTask(task)) {
      selectedModel = routingService.routeRequest(
          principal.getId(),
          task.getDescription(),
          task.getEstimatedInputTokens(),
          task.getEstimatedOutputTokens()
      );

      // Update the ExecModule config with the selected model
      task.getExecModule().getConfig().put("modelId", selectedModel);
      logger.info("Routed LLM task to model: {}", selectedModel);
  }

  // Continue with normal execution
  Map<String, Object> result = executeExecModule(task, execModule, input);

  // After execution, record usage (actual token counts and timing
  // come from the execution result):
  if (isLLMTask(task)) {
      routingService.recordUsage(
          principal.getId(),
          selectedModel,
          actualInputTokens,
          actualOutputTokens,
          executionTimeMs
      );
  }
  ```
- [ ] **Add helper method**

  ```java
  private boolean isLLMTask(Task task) {
      String className = task.getExecModule().getClassName();
      return className != null &&
             (className.contains("LLM") ||
              className.contains("ChatModule") ||
              className.contains("CompletionModule"));
  }
  ```
#### 3.2 Update ExecModule Base Classes

- [ ] **Update VModule.execute() signature or create an LLMModule subclass** (a concrete subclass sketch follows this block)

  ```java
  public abstract class LLMModule extends VModule {

      protected String modelId;

      @Override
      public final Map<String, Object> execute(Workflow w, Task t, ExecModule m, Map<String, Object> input) {
          this.modelId = (String) m.getConfig().getOrDefault("modelId", getDefaultModel());
          return executeLLM(w, t, m, input);
      }

      protected abstract Map<String, Object> executeLLM(Workflow w, Task t, ExecModule m, Map<String, Object> input);

      protected abstract String getDefaultModel();
  }
  ```
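For reference, a minimal sketch of what a concrete subclass could look like. The class body is illustrative only: `callModel()` is a hypothetical placeholder for the real LLM client call, and the default model id is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class ChatModule extends LLMModule {

    @Override
    protected Map<String, Object> executeLLM(Workflow w, Task t, ExecModule m, Map<String, Object> input) {
        // this.modelId was already resolved by LLMModule.execute() from the ExecModule config
        String prompt = (String) input.get("prompt");

        Map<String, Object> result = new HashMap<>();
        result.put("modelId", this.modelId);
        result.put("completion", callModel(this.modelId, prompt));
        return result;
    }

    @Override
    protected String getDefaultModel() {
        return "gpt-4o-mini"; // assumed fallback; use the project's configured default
    }

    // Hypothetical hook for the actual LLM client integration
    private String callModel(String modelId, String prompt) {
        return "";
    }
}
```

The template method keeps model selection in one place: subclasses never read `modelId` from config themselves, so routing overrides apply uniformly.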
## 🌐 Phase 4: Integrate with LLMController (Existing)

### 📋 Action Items

#### 4.1 Update LLMController

- [ ] **Inject routing service**

  ```java
  @Autowired
  private LLMRoutingService routingService;
  ```
- [ ] **Update sendChatRequest() to use routing**

  Location: `valkyrai/src/main/java/com/valkyrlabs/valkyrai/controller/LLMController.java`

  ```java
  @PostMapping("/chat")
  public ResponseEntity<?> sendChatRequest(@RequestBody ChatRequest request) {
      Principal principal = (Principal) SecurityContextHolder.getContext().getAuthentication().getPrincipal();

      // ROUTE REQUEST BEFORE EXECUTION, using the latest message as the task description
      String selectedModel = routingService.routeRequest(
          principal.getId(),
          request.getMessages().get(request.getMessages().size() - 1).getContent(),
          request.getEstimatedInputTokens(),
          request.getEstimatedOutputTokens()
      );

      // Use the selected model
      LlmDetails llmDetails = llmDetailsRepository.findByModelId(selectedModel);

      // Execute with the selected model
      ChatResponse response = llmService.sendRequest(llmDetails, request);

      // Record usage
      routingService.recordUsage(
          principal.getId(),
          selectedModel,
          response.getInputTokens(),
          response.getOutputTokens(),
          response.getLatencyMs()
      );

      return ResponseEntity.ok(response);
  }
  ```
#### 4.2 Add Cost Estimation Endpoint

- [ ] **Add /estimate endpoint** (a possible request shape is sketched below)

  ```java
  @PostMapping("/estimate")
  public ResponseEntity<?> estimateCost(@RequestBody CostEstimateRequest request) {
      // Delegate to LLMRoutingController's estimation logic (already implemented there)
  }
  ```
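If `CostEstimateRequest` does not exist yet, one possible shape. The field names below mirror the `route()` call used in Phase 5 and are assumptions, not the shipped DTO:

```java
// Hypothetical request DTO for /estimate; adjust field names to the actual API schema.
public record CostEstimateRequest(
        String taskDescription,
        int estimatedInputTokens,
        int estimatedOutputTokens) {
}
```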
## 💻 Phase 5: Update Frontend (ValorIDE)

### 📋 Action Items

#### 5.1 Generate TypeScript Client

- [ ] **Update OpenAPI spec if needed**
  - Ensure LLM routing endpoints are documented
  - Add request/response schemas

- [ ] **Regenerate TypeScript client**

  ```bash
  # From the ValorIDE workspace
  npm run generate-api-client
  ```
#### 5.2 Update ValorIDE Task Loop

- [ ] **Inject routing service**

  ```typescript
  // In ValorIDE service initialization
  private llmRoutingService = new LLMRoutingService(this.httpClient);
  ```
- [ ] **Query routing before LLM requests**

  Location: ValorIDE task execution loop

  ```typescript
  // Before making an LLM request:
  const routing = await this.llmRoutingService.route({
    taskDescription: currentTask.description,
    estimatedInputTokens: this.estimateTokens(context),
    estimatedOutputTokens: 500,
  });

  const { modelId, estimatedCost } = routing;

  // Log the estimated cost to the user
  this.notifyUser(`Estimated cost: $${estimatedCost.toFixed(4)}`);

  // Make the LLM request with the selected model
  const response = await this.llmService.query({
    model: modelId,
    messages: this.buildMessages(context),
  });
  ```
#### 5.3 Add Cost Display in UI

- [ ] **Add cost tracker to IDE status bar**

  ```typescript
  // Show a running cost total
  updateCostDisplay(totalCost: number) {
    const statusBarItem = this.statusBar.createStatusBarItem();
    statusBarItem.text = `💰 Session: $${totalCost.toFixed(2)}`;
    statusBarItem.show();
  }
  ```

- [ ] **Add model selector dropdown (optional)**

  ```typescript
  // Allow the user to override the routing strategy
  private userSelectedModel?: string;

  selectLLMModel() {
    // Show a quick pick of available models
  }
  ```
## 📊 Phase 6: Testing & Validation

### 📋 Action Items

#### 6.1 Unit Tests

- [ ] **Run existing tests**

  ```bash
  mvn test -pl valkyrai -Dtest=LLMRoutingServiceTests
  ```

- [ ] **Verify all tests pass** (a test sketch follows this list)
  - Token estimation
  - Cost calculation
  - Model selection
  - Budget enforcement
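As a shape for new coverage, a minimal routing test sketch. The `routeRequest()` signature follows the integration examples above; the UUID principal id and the task values are placeholders, and real tests should assert the strategy-specific model choice and budget behavior:

```java
import static org.junit.jupiter.api.Assertions.assertNotNull;

import java.util.UUID;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
class LLMRoutingServiceSketchTest {

    @Autowired
    private LLMRoutingService routingService;

    @Test
    void routeRequestReturnsAConfiguredModel() {
        // Placeholder principal and task description
        String modelId = routingService.routeRequest(
                UUID.randomUUID(), "summarize a short document", 1200, 300);
        assertNotNull(modelId);
    }
}
```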
#### 6.2 Integration Tests

- [ ] **Test WorkflowService integration**

  ```bash
  mvn test -pl valkyrai -Dtest=WorkflowIntegrationTests
  ```

- [ ] **Test REST endpoints**

  ```bash
  # Start the app locally
  mvn spring-boot:run -pl valkyrai

  # Test each endpoint
  curl -X POST http://localhost:8080/v1/llm/route ...
  curl http://localhost:8080/v1/llm/pricing
  curl http://localhost:8080/v1/llm/stats/{principalId}
  ```
#### 6.3 End-to-End Tests

- [ ] **Test ValorIDE integration**
  - Execute an IDE task that triggers an LLM request
  - Verify routing was applied
  - Verify cost was tracked

- [ ] **Test cost accuracy** (a worked example follows this list)
  - Make requests with known token counts
  - Verify actual cost matches the estimate
  - Check daily aggregation
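To make the accuracy check concrete, a worked example of the expected-cost arithmetic. The per-1K-token prices are illustrative placeholders, not the real pricing table:

```java
public class CostCheckExample {
    public static void main(String[] args) {
        double inputPricePer1K = 0.0025;   // assumed input price per 1K tokens
        double outputPricePer1K = 0.0100;  // assumed output price per 1K tokens
        int inputTokens = 2000;            // known counts from the test request
        int outputTokens = 500;

        // 2.0 * 0.0025 + 0.5 * 0.0100 = 0.0050 + 0.0050 = 0.0100
        double expected = (inputTokens / 1000.0) * inputPricePer1K
                        + (outputTokens / 1000.0) * outputPricePer1K;

        System.out.printf("Expected cost: $%.4f%n", expected);
    }
}
```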
## 📈 Phase 7: Monitoring & Analytics

### 📋 Action Items

- [ ] **Add Prometheus metrics (optional)** (a wiring sketch follows this list)

  ```java
  // In LLMRoutingService:
  private MeterRegistry meterRegistry;

  public void recordUsage(...) {
      meterRegistry.counter("llm.requests", "model", modelId).increment();
      // Cost is a value distribution, not a duration, so record it with
      // a summary rather than a timer
      meterRegistry.summary("llm.cost", "model", modelId).record(cost);
  }
  ```
- [ ] **Create Grafana dashboard (optional)**
  - Cost over time
  - Models used
  - Average latency
  - Budget utilization

- [ ] **Add alerts (optional)**
  - Budget threshold alerts
  - Model performance degradation
  - Routing failures
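On the Prometheus item above: the snippet leaves `meterRegistry` uninitialized. A minimal wiring sketch, assuming `spring-boot-starter-actuator` plus `micrometer-registry-prometheus` are on the classpath so Spring can auto-configure the registry:

```java
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class LLMRoutingMetrics {

    private final MeterRegistry meterRegistry;

    // Spring injects the auto-configured registry via the constructor
    public LLMRoutingMetrics(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    public void record(String modelId, double cost) {
        meterRegistry.counter("llm.requests", "model", modelId).increment();
        meterRegistry.summary("llm.cost", "model", modelId).record(cost);
    }
}
```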
## 📝 Phase 8: Documentation

### 📋 Action Items

- [ ] **Update API documentation**
  - Add LLM routing endpoints to Swagger
  - Document request/response schemas
  - Add usage examples

- [ ] **Update README**
  - Reference LLM routing capabilities
  - Link to the quick reference guide
  - Add an architecture diagram

- [ ] **Create deployment guide**
  - Configuration options
  - Environment variables needed
  - Performance tuning tips
## 🎯 Phase 9: Deployment

### 📋 Action Items

- [ ] **Create deployment checklist**
  - Database migrations (if any)
  - Configuration files
  - Environment variables

- [ ] **Deploy to staging**

  ```bash
  # Tag the release
  git tag -a vllm-routing-1.0 -m "LLM Routing Engine v1.0"

  # Deploy to staging
  ./deploy.sh staging
  ```

- [ ] **Run smoke tests on staging**
  - Verify all endpoints work
  - Check cost calculations
  - Validate budget enforcement

- [ ] **Deploy to production**

  ```bash
  ./deploy.sh production
  ```

- [ ] **Monitor production**
  - Watch error rates
  - Monitor cost trends
  - Check request latency
## 📊 Success Criteria

### ✅ Phase 1 (Build Fix)
- [ ] FileUploadService compiles without errors
- [ ] Full Maven build succeeds

### ✅ Phase 2 (Services Deployed)
- [ ] Services start without errors
- [ ] Spring beans registered correctly
- [ ] Unit tests pass

### ✅ Phases 3-4 (Workflow Integration)
- [ ] Workflow tasks route correctly
- [ ] LLMController uses routing
- [ ] Cost tracking works

### ✅ Phase 5 (Frontend Integration)
- [ ] ValorIDE queries the routing API
- [ ] Cost display shows correctly
- [ ] Model selection works

### ✅ Phase 6 (Testing)
- [ ] All integration tests pass
- [ ] End-to-end tests succeed
- [ ] Cost calculations verified

### ✅ Phases 7-9 (Production)
- [ ] Monitoring working
- [ ] Deployed to production
- [ ] No critical issues
## 🚨 Blockers & Risks
| Risk | Impact | Mitigation |
|---|---|---|
| FileUploadService build error | 🔴 HIGH | Fix immediately (lines 91, 111) |
| Model pricing changes | 🟡 MEDIUM | Add versioning to pricing model |
| Budget enforcement edge cases | 🟡 MEDIUM | Comprehensive testing |
| Latency of routing decision | 🟢 LOW | Cache routing decisions, implement async |
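On the routing-latency mitigation: caching can be as simple as memoizing decisions for similar requests. A minimal sketch, assuming requests with the same description and similar token budgets can safely share one decision (the key bucketing scheme is an assumption):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class RoutingDecisionCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String getOrRoute(String taskDescription, int inputTokens, int outputTokens,
                             Supplier<String> router) {
        // Bucket token counts so near-identical requests share one cache entry
        String key = taskDescription.hashCode() + ":" + (inputTokens / 500) + ":" + (outputTokens / 500);
        return cache.computeIfAbsent(key, k -> router.get());
    }
}
```

In production this would also need a TTL or explicit eviction so pricing or budget changes invalidate stale decisions.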
## 📞 Questions & Contacts

- **Routing Logic:** See the `LLMRoutingService.java` javadoc
- **Cost Calculation:** See `LLMCostCalculatorService.java`
- **Integration:** See the integration examples in this document
- **Support:** Check `LLM_ROUTING_ENGINE_IMPLEMENTATION.md`

---

**Version:** 1.0 | **Status:** Active | **Last Updated:** October 19, 2025