ValkyrAI Command Processor System

The ValkyrAI Command Processor lets LLMs execute both frontend (browser-based) and backend (server-based) commands through a structured XML syntax. With it, a model can call internal APIs, read URLs, capture screen content, and run server operations as part of a chat workflow.

Architecture Overview

Frontend Command Processor

Location: web/typescript/valkyr_labs_com/src/utils/commandProcessor.ts
Purpose: Handles browser-based commands such as screen capture, navigation, and UI interactions
Execution: Runs in the user's browser with SageChat integration

Backend Command Processor

Location: valkyrai/src/main/java/com/valkyrlabs/valkyrai/service/CommandProcessor.java
Purpose: Handles server-side commands such as API calls, URL reading, and system operations
Execution: Runs on the ValkyrAI server with security controls
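
The split between the two processors can be pictured as a lookup from command name to execution side. The following is a minimal sketch, assuming the command names documented later in this guide; `routeCommand` and the two lists are hypothetical names for illustration, not the actual ValkyrAI implementation.

```typescript
type Side = "frontend" | "backend";

// Command names as documented in this guide, grouped by where they execute.
const FRONTEND_COMMANDS = ["screencapture", "websocket", "navigate", "ui_action"];
const BACKEND_COMMANDS = ["get_api_data", "read_url", "post_api_data", "server_op"];

// Decide which processor should handle a command; unknown names return null.
function routeCommand(commandName: string): Side | null {
  if (FRONTEND_COMMANDS.indexOf(commandName) !== -1) {
    return "frontend";
  }
  if (BACKEND_COMMANDS.indexOf(commandName) !== -1) {
    return "backend";
  }
  return null;
}
```

Returning null for unknown names keeps the caller in charge of rejecting anything that neither processor recognizes.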

Command Formats

All commands use XML syntax for consistent parsing across frontend and backend systems.
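
To illustrate what consistent parsing could look like, here is a rough sketch that pulls command blocks out of a model response with regular expressions. It is an assumption that a regex-based scan is acceptable; the real processors may use a proper XML parser. `ParsedCommand` and `extractCommands` are hypothetical names.

```typescript
// Hypothetical shape for a parsed command; not taken from the ValkyrAI source.
interface ParsedCommand {
  name: string;
  params: { [key: string]: string };
}

// Command names documented in this guide.
const KNOWN_COMMANDS = [
  "screencapture", "websocket", "navigate", "ui_action",
  "get_api_data", "read_url", "post_api_data", "server_op",
];

// Find every known command block in a response and collect its child elements.
function extractCommands(response: string): ParsedCommand[] {
  const commands: ParsedCommand[] = [];
  const blockPattern = new RegExp(
    "<(" + KNOWN_COMMANDS.join("|") + ")>([\\s\\S]*?)</\\1>", "g");
  let block: RegExpExecArray | null;
  while ((block = blockPattern.exec(response)) !== null) {
    const params: { [key: string]: string } = {};
    // Each direct child such as <url>...</url> becomes one parameter.
    const childPattern = /<(\w+)>([\s\S]*?)<\/\1>/g;
    let child: RegExpExecArray | null;
    while ((child = childPattern.exec(block[2])) !== null) {
      params[child[1]] = child[2].trim();
    }
    commands.push({ name: block[1], params: params });
  }
  return commands;
}
```

Only names in the known list are matched, so unrelated angle-bracket text in a response is left alone.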

Frontend Commands

Screen Capture

Captures current page content for AI analysis.

<screencapture>
<type>screenshot</type>
</screencapture>

Parameters:

type: screenshot (full page capture) or screenscrape (text-only)

Example Usage: "Please capture what's currently on the screen so I can help you with this page."

WebSocket Communication

Sends messages through the WebSocket connection.

<websocket>
<message>{"type": "command", "payload": "Hello WebSocket"}</message>
</websocket>

Parameters:

message: JSON object containing WebSocket message data
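
Because the message element carries JSON as text, a handler would likely decode and check it before sending. This is a small sketch under that assumption; `parseWsMessage` and `WsMessageResult` are hypothetical names.

```typescript
// Hypothetical result type for a decoded <message> element.
interface WsMessageResult {
  ok: boolean;
  message?: { [key: string]: unknown };
  error?: string;
}

// Decode the text of a <message> element, accepting only JSON objects.
function parseWsMessage(raw: string): WsMessageResult {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch (err) {
    return { ok: false, error: "message is not valid JSON" };
  }
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return { ok: false, error: "message must be a JSON object" };
  }
  return { ok: true, message: value as { [key: string]: unknown } };
}
```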

Navigation

Navigate to a different URL within the application.

<navigate>
<url>/dashboard</url>
</navigate>

Parameters:

url: Target URL (relative or absolute)
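
Since the url parameter accepts both relative and absolute values, a handler needs some rule for which targets to follow. The sketch below assumes that an acceptable target is either an app-relative path or an absolute URL on the application's own origin; that rule, the function name `isSafeNavTarget`, and the `appOrigin` parameter are all assumptions made for illustration.

```typescript
// Assumption: a target is acceptable if it is an app-relative path
// or an absolute URL on the application's own origin.
function isSafeNavTarget(target: string, appOrigin: string): boolean {
  const second = target.charAt(1);
  // App-relative paths such as /dashboard; "//host" and "/\host" are not relative paths.
  if (target.charAt(0) === "/" && second !== "/" && second !== "\\") {
    return true;
  }
  try {
    // new URL throws for strings that are not absolute URLs.
    return new URL(target).origin === appOrigin;
  } catch (err) {
    return false;
  }
}
```

For example, `isSafeNavTarget("/dashboard", "https://app.example.com")` accepts the path from the sample command, while a URL on another origin is refused.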

UI Interactions

Perform browser UI actions such as clicking, scrolling, and typing.

<ui_action>
<action>click</action>
<target>.btn-primary</target>
<value>optional text for type actions</value>
</ui_action>

Parameters:

action: click, scroll, type
target: CSS selector for the target element
value: Text to type (only for type action)
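
The parameter rules above can be checked before any DOM work happens. This is a minimal sketch; it assumes that target is required for every action and that value is required for type, neither of which the guide states outright. `validateUiAction` is a hypothetical name.

```typescript
// Actions listed in this guide.
const UI_ACTIONS = ["click", "scroll", "type"];

// Return a list of problems with a <ui_action> command; an empty list means it looks valid.
function validateUiAction(params: { [key: string]: string }): string[] {
  const problems: string[] = [];
  const action = params["action"];
  if (UI_ACTIONS.indexOf(action) === -1) {
    problems.push("action must be one of: " + UI_ACTIONS.join(", "));
  }
  // Assumption: every action needs a CSS selector.
  if (!params["target"]) {
    problems.push("target (CSS selector) is required");
  }
  // Assumption: typing without a value is an error rather than a no-op.
  if (action === "type" && params["value"] === undefined) {
    problems.push("value is required when action is type");
  }
  return problems;
}
```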

Backend Commands

API Data Retrieval

Fetch data from internal APIs with optional filtering.

<get_api_data>
<api>ContentData</api>
<qbe>{"type": "BLOG"}</qbe>
<params>limit=5&sort=createdDate,desc</params>
</get_api_data>

Parameters:

api: API endpoint name (e.g., ContentData, Product, Customer)
qbe: Query By Example JSON for filtering (optional)
params: Additional query parameters like limit, sort (optional)

Example Usage: "Show me the latest 5 blog posts from the content system."
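
One way to picture how api, qbe, and params combine is as a request URL. The sketch below is only illustrative: the `/api/` path prefix and the `qbe` query parameter name are assumptions, as is the function name `buildGetApiDataUrl`; the real backend may map these fields differently.

```typescript
// Hypothetical: the "/api/" prefix and the "qbe" query parameter name are assumptions.
function buildGetApiDataUrl(api: string, qbe?: string, params?: string): string {
  // Restrict api to a plain name so it cannot alter the path.
  if (!/^[A-Za-z][A-Za-z0-9]*$/.test(api)) {
    throw new Error("api must be a plain endpoint name such as ContentData");
  }
  const query: string[] = [];
  if (qbe) {
    JSON.parse(qbe); // throws if the filter is not valid JSON
    query.push("qbe=" + encodeURIComponent(qbe));
  }
  if (params) {
    query.push(params); // already in key=value&key=value form
  }
  return "/api/" + api + (query.length > 0 ? "?" + query.join("&") : "");
}
```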

URL Reading

Read content from external URLs with security restrictions.

<read_url>
<url>https://api.example.com/data</url>
<format>json</format>
</read_url>

Parameters:

url: Target URL to read
format: text or json (optional, defaults to text)
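
The format default can be expressed in a few lines. This sketch shows only how a fetched body might be interpreted once it has been read; `decodeReadUrlBody` is a hypothetical helper, and fetching itself is left out.

```typescript
// Interpret a response body according to the <format> element; format defaults to "text".
function decodeReadUrlBody(body: string, format?: string): unknown {
  const chosen = format === undefined || format === "" ? "text" : format;
  if (chosen === "text") {
    return body;
  }
  if (chosen === "json") {
    return JSON.parse(body); // throws SyntaxError on malformed JSON
  }
  throw new Error("format must be text or json");
}
```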

API Data Posting

Create new records through internal APIs.

<post_api_data>
<api>ContentData</api>
<data>{"title": "New Blog Post", "content": "Hello World", "type": "BLOG"}</data>
</post_api_data>

Parameters:

api: API endpoint name
data: JSON data to create

Server Operations

Execute safe server operations (read-only by default).

<server_op>
<operation>status</operation>
<target>valkyrai</target>
</server_op>

Parameters:

operation: status, logs (extensible for more operations)
target: Service name or "system" (optional)
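
A read-only default is easiest to enforce with an explicit list of permitted operations. The following sketch assumes that an omitted target falls back to "system", which the guide does not confirm; `planServerOp` and `ServerOpRequest` are hypothetical names.

```typescript
// Read-only operations listed in this guide.
const READ_ONLY_OPS = ["status", "logs"];

// Hypothetical request shape produced after validation.
interface ServerOpRequest {
  operation: string;
  target: string;
}

// Accept only listed operations; anything else is refused.
function planServerOp(operation: string, target?: string): ServerOpRequest {
  if (READ_ONLY_OPS.indexOf(operation) === -1) {
    throw new Error("operation not allowed: " + operation);
  }
  // Assumption: an omitted target means the whole system.
  return { operation: operation, target: target ? target : "system" };
}
```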

Usage Examples

Content Analysis Workflow

User: "What blog content do we have about AI?"

LLM Response: "Let me check our content database for AI-related blog posts.

<get_api_data>
<api>ContentData</api>
<qbe>{"type": "BLOG", "content": "*AI*"}</qbe>
<params>limit=10</params>
</get_api_data>

I found several AI-related blog posts in your content system..."

Screen Assistance Workflow

User: "I'm confused by this page, can you help?"

LLM Response: "I'll take a look at what you're seeing to better assist you.

<screencapture>
<type>screenscrape</type>
</screencapture>

Based on the current page content, I can see you're on the dashboard..."

Data Integration Workflow

User: "Create a summary of our products and check external pricing data."

LLM Response: "I'll gather your product data and check external pricing.

<get_api_data>
<api>Product</api>
<params>limit=20</params>
</get_api_data>

<read_url>
<url>https://api.competitor.com/prices</url>
<format>json</format>
</read_url>

Based on your 20 products and the external pricing data..."

Security & Permissions

Frontend Security

Commands execute within browser sandbox
Screen capture limited to current domain
Navigation restricted to safe URLs
UI interactions limited to visible elements

Backend Security

URL reading restricted to whitelisted domains
API calls use user's authentication context
Server operations limited to read-only by default
All command execution is logged and audited
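
The domain whitelist for URL reading might be checked along these lines. This is a sketch written in TypeScript for consistency with the other examples, although the backend itself is Java; the host list and the name `isUrlAllowed` are hypothetical, and restricting schemes to http and https is an assumption.

```typescript
// Hypothetical allow-list; the real list of whitelisted domains is not given in this guide.
const ALLOWED_HOSTS = ["api.example.com"];

// Deny by default: only http(s) URLs whose exact hostname is on the list pass.
function isUrlAllowed(rawUrl: string, allowedHosts: string[]): boolean {
  let protocol: string;
  let hostname: string;
  try {
    const parsed = new URL(rawUrl);
    protocol = parsed.protocol;
    hostname = parsed.hostname;
  } catch (err) {
    return false; // not a parseable absolute URL
  }
  if (protocol !== "https:" && protocol !== "http:") {
    return false;
  }
  return allowedHosts.indexOf(hostname) !== -1;
}
```

Matching the exact hostname, rather than a substring or suffix, avoids accepting look-alike hosts such as api.example.com.evil.net.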

Permission Levels

Anonymous Users: Limited to screen capture and safe navigation
Authenticated Users: API data access based on user permissions
Admin Users: Extended server operations (when implemented)
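
These levels could be modeled as cumulative command lists. The mapping below is an assumption drawn from the three descriptions above; commands the guide does not assign to a level (websocket, ui_action, read_url) are left out and therefore denied in this sketch. `canRun` and the list names are hypothetical.

```typescript
type Role = "anonymous" | "authenticated" | "admin";

// Assumed mapping: each level adds to the one before it.
const ANONYMOUS_COMMANDS = ["screencapture", "navigate"];
const AUTHENTICATED_COMMANDS = ANONYMOUS_COMMANDS.concat(["get_api_data", "post_api_data"]);
const ADMIN_COMMANDS = AUTHENTICATED_COMMANDS.concat(["server_op"]);

function commandsFor(role: Role): string[] {
  if (role === "admin") {
    return ADMIN_COMMANDS;
  }
  if (role === "authenticated") {
    return AUTHENTICATED_COMMANDS;
  }
  return ANONYMOUS_COMMANDS;
}

// A command runs only if the role's list contains it.
function canRun(role: Role, command: string): boolean {
  return commandsFor(role).indexOf(command) !== -1;
}
```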

Integration with SageChat

SageChat automatically processes commands in LLM responses:

Command Detection: XML commands parsed from LLM response
Context Injection: Command results added to conversation context
User Feedback: Success/failure messages shown in chat
WebSocket Integration: Real-time command execution updates
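
The context injection and user feedback steps might look roughly like this. The message and result shapes are hypothetical, since SageChat's actual types are not shown in this guide, and using a "system" role for injected results is an assumption.

```typescript
// Hypothetical shapes; SageChat's real message and result types are not shown in this guide.
interface CommandOutcome {
  command: string;
  success: boolean;
  output: string;
}

interface ChatMessage {
  role: string;
  content: string;
}

// Append one context message per outcome and return short feedback lines for the chat UI.
function applyOutcomes(
  messages: ChatMessage[],
  outcomes: CommandOutcome[],
): { messages: ChatMessage[]; feedback: string[] } {
  const next = messages.slice(); // leave the caller's array untouched
  const feedback: string[] = [];
  for (const outcome of outcomes) {
    next.push({
      role: "system",
      content: "[" + outcome.command + " result] " + outcome.output,
    });
    feedback.push(outcome.command + (outcome.success ? " succeeded" : " failed"));
  }
  return { messages: next, feedback: feedback };
}
```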

Development Guidelines

Adding New Commands

Frontend Commands

Add command handler in FrontendCommandProcessor
Implement execution method with proper error handling
Update command documentation
Add tests for new functionality

Backend Commands

Add command pattern in CommandProcessor
Implement execution method with security checks
Add to security whitelist if needed
Document command format and usage
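
As a sketch of what adding a handler with error handling could involve, the registry below maps command names to handler functions. It does not reflect the actual structure of FrontendCommandProcessor or CommandProcessor; `HandlerRegistry` and `CommandHandler` are hypothetical, and handlers are synchronous here only to keep the example short.

```typescript
// Hypothetical handler signature; the real processors may use a different one.
type CommandHandler = (params: { [key: string]: string }) => string;

class HandlerRegistry {
  private handlers: { [commandName: string]: CommandHandler } = {};

  register(commandName: string, handler: CommandHandler): void {
    this.handlers[commandName] = handler;
  }

  // Unknown commands are rejected; errors thrown by a handler become failure results.
  run(commandName: string, params: { [key: string]: string }): { success: boolean; output: string } {
    // hasOwnProperty avoids treating inherited names such as toString as handlers.
    if (!Object.prototype.hasOwnProperty.call(this.handlers, commandName)) {
      return { success: false, output: "unknown command: " + commandName };
    }
    try {
      return { success: true, output: this.handlers[commandName](params) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { success: false, output: message };
    }
  }
}
```

A new command then needs only a `register` call plus its tests, and callers always receive a result object instead of an exception.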

Best Practices

Always validate command parameters
Implement proper error handling and user feedback
Use security whitelist approach (deny by default)
Log all command execution for audit purposes
Keep commands atomic and idempotent where possible

Troubleshooting

Common Issues

Command not executing: Check XML syntax and parameter validation
Security errors: Verify URL/operation is whitelisted
API errors: Check user permissions and API endpoint availability
WebSocket issues: Ensure connection is established before sending commands
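
For the WebSocket case, one common remedy is to hold messages until the socket reports that it is open. The sketch below works against a minimal interface that the browser WebSocket satisfies (readyState and send); `QueuedSender` and `SocketLike` are hypothetical names, and the approach is a suggestion rather than how SageChat is known to behave.

```typescript
// Minimal view of a socket; the browser WebSocket has these two members.
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

const SOCKET_OPEN = 1; // numeric value of WebSocket.OPEN

class QueuedSender {
  private pending: string[] = [];
  private socket: SocketLike;

  constructor(socket: SocketLike) {
    this.socket = socket;
  }

  // Send immediately if the socket is open, otherwise hold the message.
  send(data: string): void {
    if (this.socket.readyState === SOCKET_OPEN) {
      this.socket.send(data);
    } else {
      this.pending.push(data);
    }
  }

  // Call from the socket's open event handler to deliver held messages in order.
  flush(): void {
    while (this.pending.length > 0 && this.socket.readyState === SOCKET_OPEN) {
      this.socket.send(this.pending.shift() as string);
    }
  }
}
```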

Debugging

Enable debug logging in both frontend and backend processors
Check browser console for frontend command issues
Review server logs for backend command execution
Use WebSocket debug messages for real-time feedback

Future Enhancements

File Operations: Upload/download file commands
Database Queries: Direct SQL execution with proper security
External Integrations: Commands for third-party services
Workflow Orchestration: Multi-step command sequences
Advanced Security: Role-based command permissions
