LLM Routing Engine - Integration Checklist

Owner: Development Team
Status: ✅ Implementation Complete | ⏳ Integration In Progress
Last Updated: October 19, 2025


🎯 Phase 1: Fix Pre-Existing Build Issues

✅ Identified Issues

| File | Issue | Line | Severity | Fix |
| --- | --- | --- | --- | --- |
| FileUploadService.java | Type mismatch: Map passed to setMetadata(String) | 91, 111 | 🔴 HIGH | Convert Map to JSON string or change the method signature |

📋 Action Items

  • Fix FileUploadService.java:91 (a fuller sketch follows this list)

    // BEFORE:
    record.setMetadata(fileMetadata); // fileMetadata is Map<String,Object>

    // AFTER (Option 1 - Recommended):
    record.setMetadata(objectMapper.writeValueAsString(fileMetadata));

    // OR AFTER (Option 2):
    // Update FileRecord model to accept Map in setMetadata()
  • Fix FileUploadService.java:111 (same issue, same fix)

  • Verify build passes

    mvn clean compile -DskipTests
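
For reference, a fuller sketch of the Option 1 fix, assuming Jackson's ObjectMapper is available to the service (writeValueAsString throws the checked JsonProcessingException, so it must be handled or rethrown; the helper method name is illustrative only):

    import java.util.Map;
    import com.fasterxml.jackson.core.JsonProcessingException;
    import com.fasterxml.jackson.databind.ObjectMapper;

    // Hypothetical sketch - field and method names are illustrative only
    private final ObjectMapper objectMapper = new ObjectMapper();

    private void applyMetadata(FileRecord record, Map<String, Object> fileMetadata) {
        try {
            // Serialize the metadata map to a JSON string before persisting
            record.setMetadata(objectMapper.writeValueAsString(fileMetadata));
        } catch (JsonProcessingException e) {
            throw new IllegalStateException("Failed to serialize file metadata", e);
        }
    }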

🚀 Phase 2: Deploy LLM Routing Services

✅ Files Ready

  • LLMRoutingService.java (588 lines)
  • LLMCostCalculatorService.java (305 lines)
  • LLMRoutingController.java (417 lines)

📋 Action Items

  • Verify Spring bean registration

    grep -n "@Service\|@RestController\|@Component" valkyrai/src/main/java/com/valkyrlabs/valkyrai/service/LLM*
    grep -n "@Service\|@RestController\|@Component" valkyrai/src/main/java/com/valkyrlabs/valkyrai/controller/LLM*
  • Run unit tests

    mvn test -pl valkyrai -Dtest=LLMRoutingServiceTests
  • Add to Spring Security configuration (if needed)

    • Determine which endpoints require which roles
    • Add @PreAuthorize annotations if not already present
    • Update security context if needed
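
A minimal sketch of what the endpoint protection could look like, assuming method-level security is enabled; the paths, role names, and service methods below are placeholders to adapt to the real LLMRoutingController:

    // Hypothetical sketch - adapt paths and roles to the actual controller
    @PreAuthorize("isAuthenticated()")
    @PostMapping("/route")
    public ResponseEntity<RouteResponse> route(@RequestBody RouteRequest request) {
        return ResponseEntity.ok(routingService.route(request));
    }

    @PreAuthorize("hasRole('ADMIN')")
    @GetMapping("/stats/{principalId}")
    public ResponseEntity<UsageStats> stats(@PathVariable UUID principalId) {
        return ResponseEntity.ok(routingService.getStats(principalId));
    }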

🔌 Phase 3: Integrate with WorkflowService

📋 Action Items

3.1 Update ValkyrWorkflowService

  • Inject routing service

    @Autowired
    private LLMRoutingService routingService;
  • Add routing logic to executeTask()

    Location: valkyrai/src/main/java/com/valkyrlabs/workflow/ValkyrWorkflowService.java

    // In executeTask(), before executing the ExecModule:

    String selectedModel = null; // declared outside the if so recordUsage() can see it
    if (isLLMTask(task)) {
        selectedModel = routingService.routeRequest(
                principal.getId(),
                task.getDescription(),
                task.getEstimatedInputTokens(),
                task.getEstimatedOutputTokens());

        // Update the ExecModule config with the selected model
        task.getExecModule().getConfig().put("modelId", selectedModel);

        logger.info("Routed LLM task to model: {}", selectedModel);
    }

    // Continue with normal execution
    Map<String, Object> result = executeExecModule(task, execModule, input);

    // After execution, record actual usage:
    if (isLLMTask(task)) {
        routingService.recordUsage(
                principal.getId(),
                selectedModel,
                actualInputTokens,
                actualOutputTokens,
                executionTimeMs);
    }
  • Add helper method

    private boolean isLLMTask(Task task) {
        String className = task.getExecModule().getClassName();
        return className != null &&
                (className.contains("LLM") ||
                 className.contains("ChatModule") ||
                 className.contains("CompletionModule"));
    }

3.2 Update ExecModule Base Classes

  • Update the VModule.execute() signature or create an LLMModule subclass

    public abstract class LLMModule extends VModule {
        protected String modelId;

        @Override
        public final Map<String, Object> execute(Workflow w, Task t, ExecModule m, Map<String, Object> input) {
            this.modelId = (String) m.getConfig().getOrDefault("modelId", getDefaultModel());
            return executeLLM(w, t, m, input);
        }

        protected abstract Map<String, Object> executeLLM(Workflow w, Task t, ExecModule m, Map<String, Object> input);
        protected abstract String getDefaultModel();
    }
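
For illustration, a hypothetical concrete subclass showing how a chat module would pick up the routed model (the class name, default model, and client call are placeholders):

    // Hypothetical example: a chat module that honors the routed modelId
    public class ChatModule extends LLMModule {

        @Override
        protected Map<String, Object> executeLLM(Workflow w, Task t, ExecModule m, Map<String, Object> input) {
            // this.modelId was set by LLMModule.execute() from the ExecModule config
            Map<String, Object> result = new HashMap<>();
            result.put("modelUsed", this.modelId);
            // ... invoke the chat client with this.modelId and the task input ...
            return result;
        }

        @Override
        protected String getDefaultModel() {
            return "gpt-4o-mini"; // placeholder default
        }
    }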

🌐 Phase 4: Integrate with LLMController (Existing)

📋 Action Items

4.1 Update LLMController

  • Inject routing service

    @Autowired
    private LLMRoutingService routingService;
  • Update sendChatRequest() to use routing

    Location: valkyrai/src/main/java/com/valkyrlabs/valkyrai/controller/LLMController.java

    @PostMapping("/chat")
    public ResponseEntity<?> sendChatRequest(@RequestBody ChatRequest request) {
    Principal principal = SecurityContextHolder.getContext().getAuthentication().getPrincipal();

    // ROUTE REQUEST BEFORE EXECUTION
    String selectedModel = routingService.routeRequest(
    principal.getId(),
    request.getMessages().get(request.getMessages().size()-1).getContent(),
    request.getEstimatedInputTokens(),
    request.getEstimatedOutputTokens()
    );

    // Use selected model
    LlmDetails llmDetails = lmDetailsRepository.findByModelId(selectedModel);

    // Execute with selected model
    ChatResponse response = llmService.sendRequest(llmDetails, request);

    // Record usage
    routingService.recordUsage(
    principal.getId(),
    selectedModel,
    response.getInputTokens(),
    response.getOutputTokens(),
    response.getLatencyMs()
    );

    return ResponseEntity.ok(response);
    }

4.2 Add Cost Estimation Endpoint

  • Add /estimate endpoint

    @PostMapping("/estimate")
    public ResponseEntity<?> estimateCost(@RequestBody CostEstimateRequest request) {
        // Delegate to LLMRoutingController (already implemented)
    }
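
A possible shape for the delegation, assuming LLMCostCalculatorService exposes an estimate method; the method name, request fields, and CostEstimate type below are hypothetical:

    @Autowired
    private LLMCostCalculatorService costCalculatorService;

    @PostMapping("/estimate")
    public ResponseEntity<?> estimateCost(@RequestBody CostEstimateRequest request) {
        // Hypothetical method name - align with the actual service API
        CostEstimate estimate = costCalculatorService.estimateCost(
                request.getModelId(),
                request.getEstimatedInputTokens(),
                request.getEstimatedOutputTokens());
        return ResponseEntity.ok(estimate);
    }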

💻 Phase 5: Update Frontend (ValorIDE)

📋 Action Items

5.1 Generate TypeScript Client

  • Update OpenAPI spec if needed

    • Ensure LLM routing endpoints are documented
    • Add request/response schemas
  • Regenerate TypeScript client

    # From ValorIDE workspace
    npm run generate-api-client

5.2 Update ValorIDE Task Loop

  • Inject routing service

    // In ValorIDE service initialization
    private llmRoutingService = new LLMRoutingService(this.httpClient);
  • Query routing before LLM requests

    Location: ValorIDE task execution loop

    // Before making the LLM request:
    const routing = await this.llmRoutingService.route({
      taskDescription: currentTask.description,
      estimatedInputTokens: this.estimateTokens(context),
      estimatedOutputTokens: 500,
    });

    const { modelId, estimatedCost } = routing;

    // Surface the estimated cost to the user
    this.notifyUser(`Estimated cost: $${estimatedCost.toFixed(4)}`);

    // Make the LLM request with the selected model
    const response = await this.llmService.query({
      model: modelId,
      messages: this.buildMessages(context),
    });

5.3 Add Cost Display in UI

  • Add cost tracker to IDE status bar

    // Show a running cost total
    updateCostDisplay(totalCost: number) {
      const statusBarItem = this.statusBar.createStatusBarItem();
      statusBarItem.text = `💰 Session: $${totalCost.toFixed(2)}`;
      statusBarItem.show();
    }
  • Add model selector dropdown (optional)

    // Allow the user to override the routing strategy
    private userSelectedModel?: string;

    selectLLMModel() {
      // Show a quick pick of available models
    }

📊 Phase 6: Testing & Validation

📋 Action Items

6.1 Unit Tests

  • Run existing tests

    mvn test -pl valkyrai -Dtest=LLMRoutingServiceTests
  • Verify all tests pass

    • Token estimation
    • Cost calculation
    • Model selection
    • Budget enforcement
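
If budget enforcement lacks a direct test, a hypothetical JUnit sketch (UUID principal ids and the fallback expectation are assumptions; align the calls with the real LLMRoutingService API):

    import static org.junit.jupiter.api.Assertions.assertNotEquals;

    import java.util.UUID;
    import org.junit.jupiter.api.Test;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.test.context.SpringBootTest;

    @SpringBootTest
    class BudgetEnforcementTests {

        @Autowired
        private LLMRoutingService routingService;

        @Test
        void fallsBackToCheaperModelWhenBudgetNearlyExhausted() {
            UUID principalId = UUID.randomUUID();
            // Hypothetical setup: consume most of the daily budget on an expensive model
            routingService.recordUsage(principalId, "gpt-4o", 900_000, 100_000, 1200L);

            String model = routingService.routeRequest(
                    principalId, "summarize a short paragraph", 200, 100);

            // Expect a low-cost model once the budget is nearly spent
            assertNotEquals("gpt-4o", model);
        }
    }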

6.2 Integration Tests

  • Test WorkflowService integration

    mvn test -pl valkyrai -Dtest=WorkflowIntegrationTests
  • Test REST endpoints

    # Start app locally
    mvn spring-boot:run -pl valkyrai

    # Test each endpoint
    curl -X POST http://localhost:8080/v1/llm/route ...
    curl http://localhost:8080/v1/llm/pricing
    curl http://localhost:8080/v1/llm/stats/{principalId}
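
For example, a routing call with a hypothetical JSON body (field names are assumptions; check the actual request schema):

    curl -X POST http://localhost:8080/v1/llm/route \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $TOKEN" \
      -d '{
            "taskDescription": "Summarize the attached document",
            "estimatedInputTokens": 1200,
            "estimatedOutputTokens": 300
          }'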

6.3 End-to-End Tests

  • Test ValorIDE integration

    • Execute IDE task that triggers LLM request
    • Verify routing was applied
    • Verify cost was tracked
  • Test cost accuracy

    • Make requests with known token counts
    • Verify actual cost matches estimate
    • Check daily aggregation
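
As a concrete check, expected cost is linear in token counts. At a hypothetical $3.00 per 1M input tokens and $15.00 per 1M output tokens, a request with 10,000 input and 2,000 output tokens should cost 10,000/1,000,000 × 3.00 + 2,000/1,000,000 × 15.00 = $0.06. A minimal assertion sketch (pricing values and the calculateCost method name are placeholders):

    // Hypothetical pricing - replace with values from LLMCostCalculatorService
    double inputPricePerMillion = 3.00;
    double outputPricePerMillion = 15.00;

    long inputTokens = 10_000;
    long outputTokens = 2_000;

    double expected = (inputTokens / 1_000_000.0) * inputPricePerMillion
            + (outputTokens / 1_000_000.0) * outputPricePerMillion; // 0.06

    // Compare against the service's figure within a small tolerance
    assertEquals(expected, costCalculatorService.calculateCost("some-model", inputTokens, outputTokens), 1e-9);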

📈 Phase 7: Monitoring & Analytics

📋 Action Items

  • Add Prometheus metrics (optional)

    // In LLMRoutingService (inject a MeterRegistry):
    private final MeterRegistry meterRegistry;

    public void recordUsage(...) {
        meterRegistry.counter("llm.requests", "model", modelId).increment();
        // Cost is a value, not a duration: use a DistributionSummary, not a Timer
        meterRegistry.summary("llm.cost", "model", modelId).record(cost);
        // Latency is a duration, so a Timer fits
        meterRegistry.timer("llm.latency", "model", modelId).record(latencyMs, TimeUnit.MILLISECONDS);
    }
  • Create Grafana dashboard (optional)

    • Cost over time
    • Models used
    • Average latency
    • Budget utilization
  • Add alerts (optional)

    • Budget threshold alerts
    • Model performance degradation
    • Routing failures

📝 Phase 8: Documentation

📋 Action Items

  • Update API documentation

    • Add LLM routing endpoints to Swagger
    • Document request/response schemas
    • Add usage examples
  • Update README

    • Reference LLM routing capabilities
    • Link to quick reference guide
    • Add architecture diagram
  • Create deployment guide

    • Configuration options
    • Environment variables needed
    • Performance tuning tips

🎯 Phase 9: Deployment

📋 Action Items

  • Create deployment checklist

    • Database migrations (if any)
    • Configuration files
    • Environment variables
  • Deploy to staging

    # Tag release
    git tag -a vllm-routing-1.0 -m "LLM Routing Engine v1.0"

    # Deploy to staging
    ./deploy.sh staging
  • Run smoke tests on staging

    • Verify all endpoints work
    • Check cost calculations
    • Validate budget enforcement
  • Deploy to production

    ./deploy.sh production
  • Monitor production

    • Watch error rates
    • Monitor cost trends
    • Check request latency

📊 Success Criteria

✅ Phase 1 (Build Fix)

  • FileUploadService compiles without errors
  • Full Maven build succeeds

✅ Phase 2 (Services Deployed)

  • Services start without errors
  • Spring beans registered correctly
  • Unit tests pass

✅ Phase 3-4 (Workflow Integration)

  • Workflow tasks route correctly
  • LLMController uses routing
  • Cost tracking works

✅ Phase 5 (Frontend Integration)

  • ValorIDE queries routing API
  • Cost display shows correctly
  • Model selection works

✅ Phase 6 (Testing)

  • All integration tests pass
  • End-to-end tests succeed
  • Cost calculations verified

✅ Phase 7-9 (Production)

  • Monitoring working
  • Deployed to production
  • No critical issues

🚨 Blockers & Risks

| Risk | Impact | Mitigation |
| --- | --- | --- |
| FileUploadService build error | 🔴 HIGH | Fix immediately (lines 91, 111) |
| Model pricing changes | 🟡 MEDIUM | Add versioning to the pricing model |
| Budget enforcement edge cases | 🟡 MEDIUM | Comprehensive testing |
| Latency of routing decision | 🟢 LOW | Cache routing decisions; make routing async |

📞 Questions & Contacts

  • Routing Logic: See LLMRoutingService.java javadoc
  • Cost Calculation: See LLMCostCalculatorService.java
  • Integration: See integration examples in this document
  • Support: Check LLM_ROUTING_ENGINE_IMPLEMENTATION.md

Version: 1.0 | Status: Active | Last Updated: October 19, 2025