ValkyrAI Command Processor System
The ValkyrAI Command Processor enables LLMs to execute both frontend (browser-based) and backend (server-based) commands through structured XML syntax. By embedding these commands in their responses, models can call APIs, read URLs, capture screen data, and perform server operations as part of an AI workflow.
Architecture Overview
Frontend Command Processor
- Location: web/typescript/valkyr_labs_com/src/utils/commandProcessor.ts
- Purpose: Handles browser-based commands like screen capture, navigation, UI interactions
- Execution: Runs in the user's browser with SageChat integration
Backend Command Processor
- Location: valkyrai/src/main/java/com/valkyrlabs/valkyrai/service/CommandProcessor.java
- Purpose: Handles server-side commands like API calls, URL reading, system operations
- Execution: Runs on the ValkyrAI server with security controls
Command Formats
All commands use XML syntax for consistent parsing across frontend and backend systems.
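As a rough illustration of how such commands could be pulled out of an LLM response, here is a minimal sketch. The `ParsedCommand` shape and the `extractCommands` function are hypothetical and are not taken from the ValkyrAI source. The sketch uses lenient pattern matching rather than a strict XML parser, because the documented `params` example contains a raw `&`, which a strict XML parser would reject.

```typescript
// Hypothetical shape for a parsed command; not taken from the ValkyrAI source.
interface ParsedCommand {
  name: string;
  params: Record<string, string>;
}

// Command names documented in this section.
const KNOWN_COMMANDS = [
  "screencapture", "websocket", "navigate", "ui_action",
  "get_api_data", "read_url", "post_api_data", "server_op",
];

// Finds every known command element in the text and collects its child
// elements as string parameters. Unknown elements are ignored.
function extractCommands(text: string): ParsedCommand[] {
  const commands: ParsedCommand[] = [];
  const outer = new RegExp(`<(${KNOWN_COMMANDS.join("|")})>([\\s\\S]*?)</\\1>`, "g");
  let match: RegExpExecArray | null;
  while ((match = outer.exec(text)) !== null) {
    const params: Record<string, string> = {};
    const inner = /<(\w+)>([\s\S]*?)<\/\1>/g;
    let child: RegExpExecArray | null;
    while ((child = inner.exec(match[2])) !== null) {
      params[child[1]] = child[2].trim();
    }
    commands.push({ name: match[1], params });
  }
  return commands;
}
```

Because every parameter is returned as a plain string, JSON-valued parameters such as `qbe` or `message` still need to be parsed and validated by whichever handler receives them.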
Frontend Commands
Screen Capture
Captures current page content for AI analysis.
<screencapture>
<type>screenshot</type>
</screencapture>
Parameters:
- type: screenshot (full page capture) or screenscrape (text-only)
Example Usage: "Please capture what's currently on the screen so I can help you with this page."
WebSocket Communication
Sends messages through the WebSocket connection.
<websocket>
<message>{"type": "command", "payload": "Hello WebSocket"}</message>
</websocket>
Parameters:
- message: JSON object containing WebSocket message data
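Since the `message` parameter arrives as text inside the XML, a handler would probably want to confirm it really is a JSON object before sending it. The following is a minimal sketch; `parseWebSocketMessage` is a hypothetical helper name, not part of the documented API.

```typescript
// Parses the body of a <message> element.
// Throws if the body is not valid JSON or is not a JSON object.
function parseWebSocketMessage(body: string): Record<string, unknown> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    throw new Error("message is not valid JSON");
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("message must be a JSON object");
  }
  return parsed as Record<string, unknown>;
}
```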
Navigation
Navigate to different URLs within the application.
<navigate>
<url>/dashboard</url>
</navigate>
Parameters:
- url: Target URL (relative or absolute)
UI Interactions
Perform browser UI actions like clicking, scrolling, typing.
<ui_action>
<action>click</action>
<target>.btn-primary</target>
<value>optional text for type actions</value>
</ui_action>
Parameters:
- action: click, scroll, or type
- target: CSS selector for the target element
- value: Text to type (only for the type action)
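The parameter rules above could be checked before any DOM interaction happens. This is a minimal sketch under the assumption that parameters arrive as a string map; the `UiAction`, `UiActionCheck`, and `validateUiAction` names are hypothetical.

```typescript
const UI_ACTION_KINDS = ["click", "scroll", "type"] as const;
type UiActionKind = (typeof UI_ACTION_KINDS)[number];

interface UiAction {
  action: UiActionKind;
  target: string;
  value?: string;
}

type UiActionCheck =
  | { ok: true; action: UiAction }
  | { ok: false; error: string };

function isUiActionKind(v: string | undefined): v is UiActionKind {
  return v !== undefined && (UI_ACTION_KINDS as readonly string[]).indexOf(v) !== -1;
}

// Validates raw <ui_action> parameters and reports the first problem found.
function validateUiAction(params: Record<string, string | undefined>): UiActionCheck {
  const kind = params["action"];
  if (!isUiActionKind(kind)) {
    return { ok: false, error: `unsupported action: ${String(kind)}` };
  }
  const target = (params["target"] ?? "").trim();
  if (target === "") {
    return { ok: false, error: "target is required" };
  }
  if (kind === "type") {
    const value = params["value"];
    if (value === undefined) {
      return { ok: false, error: "value is required for type actions" };
    }
    return { ok: true, action: { action: kind, target, value } };
  }
  // value is only meaningful for type actions, so it is dropped here.
  return { ok: true, action: { action: kind, target } };
}
```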
Backend Commands
API Data Retrieval
Fetch data from internal APIs with optional filtering.
<get_api_data>
<api>ContentData</api>
<qbe>{"type": "BLOG"}</qbe>
<params>limit=5&sort=createdDate,desc</params>
</get_api_data>
Parameters:
- api: API endpoint name (e.g., ContentData, Product, Customer)
- qbe: Query By Example JSON for filtering (optional)
- params: Additional query parameters like limit, sort (optional)
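To make the three parameters concrete, here is a sketch of how they might be combined into a single request path. Two things here are assumptions, not documented behavior: that the endpoint lives at `/<api>`, and that each `qbe` field is sent as an ordinary query parameter. `buildApiRequestPath` is a hypothetical helper.

```typescript
// Combines api, qbe, and params into a request path.
// ASSUMPTIONS: the "/<api>" path shape and the mapping of qbe fields
// to query parameters are illustrative only.
function buildApiRequestPath(api: string, qbe?: string, params?: string): string {
  if (!/^[A-Za-z][A-Za-z0-9]*$/.test(api)) {
    throw new Error(`invalid api name: ${api}`);
  }
  const query = new URLSearchParams(params ?? "");
  if (qbe !== undefined && qbe.trim() !== "") {
    const example: unknown = JSON.parse(qbe);
    if (typeof example !== "object" || example === null || Array.isArray(example)) {
      throw new Error("qbe must be a JSON object");
    }
    const fields = example as Record<string, unknown>;
    for (const field of Object.keys(fields)) {
      query.set(field, String(fields[field]));
    }
  }
  const queryString = query.toString();
  return queryString === "" ? `/${api}` : `/${api}?${queryString}`;
}
```

Restricting `api` to alphanumeric names keeps a crafted value such as `../admin` from changing the request path.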
Example Usage: "Show me the latest 5 blog posts from the content system."
URL Reading
Read content from external URLs with security restrictions.
<read_url>
<url>https://api.example.com/data</url>
<format>json</format>
</read_url>
Parameters:
- url: Target URL to read
- format: text or json (optional, defaults to text)
API Data Posting
Create new records through internal APIs.
<post_api_data>
<api>ContentData</api>
<data>{"title": "New Blog Post", "content": "Hello World", "type": "BLOG"}</data>
</post_api_data>
Parameters:
- api: API endpoint name
- data: JSON data to create
Server Operations
Execute safe server operations (read-only by default).
<server_op>
<operation>status</operation>
<target>valkyrai</target>
</server_op>
Parameters:
- operation: status, logs (extensible for more operations)
- target: Service name or "system" (optional)
Usage Examples
Content Analysis Workflow
User: "What blog content do we have about AI?"
LLM Response: "Let me check our content database for AI-related blog posts.
<get_api_data>
<api>ContentData</api>
<qbe>{"type": "BLOG", "content": "*AI*"}</qbe>
<params>limit=10</params>
</get_api_data>
I found several AI-related blog posts in your content system..."
Screen Assistance Workflow
User: "I'm confused by this page, can you help?"
LLM Response: "I'll take a look at what you're seeing to better assist you.
<screencapture>
<type>screenscrape</type>
</screencapture>
Based on the current page content, I can see you're on the dashboard..."
Data Integration Workflow
User: "Create a summary of our products and check external pricing data."
LLM Response: "I'll gather your product data and check external pricing.
<get_api_data>
<api>Product</api>
<params>limit=20</params>
</get_api_data>
<read_url>
<url>https://api.competitor.com/prices</url>
<format>json</format>
</read_url>
Based on your 20 products and the external pricing data..."
Security & Permissions
Frontend Security
- Commands execute within browser sandbox
- Screen capture limited to current domain
- Navigation restricted to safe URLs
- UI interactions limited to visible elements
Backend Security
- URL reading restricted to whitelisted domains
- API calls use user's authentication context
- Server operations limited to read-only by default
- All command execution is logged and audited
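The domain restriction for URL reading could look roughly like the check below. The host list is a placeholder, since the actual whitelist is not given in this document, and limiting requests to https is an added assumption. `isUrlAllowed` is a hypothetical helper.

```typescript
// Placeholder allowlist; the real list of permitted domains is not documented here.
const ALLOWED_HOSTS = ["api.example.com"];

// Returns true only when the URL parses, uses https (an assumption of this
// sketch), and its hostname exactly matches an allowed host.
function isUrlAllowed(rawUrl: string, allowedHosts: string[] = ALLOWED_HOSTS): boolean {
  let parsed: URL;
  try {
    parsed = new URL(rawUrl);
  } catch {
    return false; // unparseable input is denied by default
  }
  if (parsed.protocol !== "https:") {
    return false;
  }
  return allowedHosts.indexOf(parsed.hostname) !== -1;
}
```

Comparing the parsed hostname, rather than searching the raw string, avoids lookalike URLs such as `https://api.example.com.evil.io/` or `https://api.example.com@evil.io/` slipping through.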
Permission Levels
- Anonymous Users: Limited to screen capture and safe navigation
- Authenticated Users: API data access based on user permissions
- Admin Users: Extended server operations (when implemented)
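One way to picture these levels is as a table from role to permitted command names. The table below is a sketch only: the document states which broad capabilities each level has, but the exact command assignments (for example, placing `server_op` under admin only, and leaving `websocket`, `ui_action`, and `read_url` unassigned) are assumptions made for illustration.

```typescript
type UserRole = "anonymous" | "authenticated" | "admin";

// ASSUMPTION: command-to-role assignments are illustrative, not documented.
// Commands not listed for a role are denied.
const ALLOWED_COMMANDS: Record<UserRole, string[]> = {
  anonymous: ["screencapture", "navigate"],
  authenticated: ["screencapture", "navigate", "get_api_data", "post_api_data"],
  admin: ["screencapture", "navigate", "get_api_data", "post_api_data", "server_op"],
};

// Deny-by-default lookup: anything missing from the role's list is rejected.
function isCommandAllowed(role: UserRole, command: string): boolean {
  return ALLOWED_COMMANDS[role].indexOf(command) !== -1;
}
```

A role check like this only decides whether a command may run at all. Per-record API permissions would still be enforced by the user's authentication context on the server.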
Integration with SageChat
SageChat automatically processes commands in LLM responses:
- Command Detection: XML commands parsed from LLM response
- Context Injection: Command results added to conversation context
- User Feedback: Success/failure messages shown in chat
- WebSocket Integration: Real-time command execution updates
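The first three steps above can be sketched as a single pass over the LLM response. This is an illustration, not SageChat's actual code: `processResponse`, `ProcessedResponse`, and the synchronous `Handler` signature are hypothetical, and real handlers would likely be asynchronous.

```typescript
// Simplified handler: receives the command body and returns its result text.
type Handler = (body: string) => string;

interface ProcessedResponse {
  visibleText: string;       // response with executed command XML removed
  contextMessages: string[]; // results to append to the conversation context
  feedback: string[];        // success/failure lines to show in chat
}

function processResponse(response: string, handlers: Record<string, Handler>): ProcessedResponse {
  const contextMessages: string[] = [];
  const feedback: string[] = [];
  const pattern = /<([a-z_]+)>([\s\S]*?)<\/\1>/g;
  const stripped = response.replace(pattern, (whole: string, cmd: string, body: string) => {
    // Only own properties count, so names like "constructor" never match.
    if (!Object.prototype.hasOwnProperty.call(handlers, cmd)) {
      return whole; // not a registered command; leave the text untouched
    }
    try {
      contextMessages.push(`[${cmd} result] ${handlers[cmd](body)}`);
      feedback.push(`${cmd}: success`);
    } catch (err) {
      feedback.push(`${cmd}: failed (${err instanceof Error ? err.message : String(err)})`);
    }
    return "";
  });
  return { visibleText: stripped.trim(), contextMessages, feedback };
}
```

Keeping the context messages separate from the visible text lets the chat UI show the model's prose to the user while feeding command results back to the model on the next turn.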
Development Guidelines
Adding New Commands
Frontend Commands
- Add command handler in FrontendCommandProcessor
- Implement execution method with proper error handling
- Update command documentation
- Add tests for new functionality
Backend Commands
- Add command pattern in CommandProcessor
- Implement execution method with security checks
- Add to security whitelist if needed
- Document command format and usage
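The internals of FrontendCommandProcessor and CommandProcessor are not shown in this document, so the following is only a sketch of the general pattern these steps describe: register a handler, validate parameters, execute with error handling, and reject anything unregistered. `CommandRegistry`, `CommandHandler`, and the `server_op` handler body are hypothetical, and defaulting `target` to "system" is an assumption.

```typescript
type Params = Record<string, string | undefined>;

interface CommandHandler {
  validate(params: Params): string | null; // error message, or null when valid
  execute(params: Params): string;
}

class CommandRegistry {
  // Prototype-free object so only explicitly registered names resolve.
  private handlers: Record<string, CommandHandler> = Object.create(null);

  register(commandName: string, handler: CommandHandler): void {
    if (this.handlers[commandName] !== undefined) {
      throw new Error(`already registered: ${commandName}`);
    }
    this.handlers[commandName] = handler;
  }

  run(commandName: string, params: Params): { ok: boolean; output: string } {
    const handler = this.handlers[commandName];
    if (handler === undefined) {
      return { ok: false, output: `unknown command: ${commandName}` };
    }
    const problem = handler.validate(params);
    if (problem !== null) {
      return { ok: false, output: problem };
    }
    try {
      return { ok: true, output: handler.execute(params) };
    } catch (err) {
      return { ok: false, output: err instanceof Error ? err.message : String(err) };
    }
  }
}

// Example registration for the documented server_op command.
const registry = new CommandRegistry();
registry.register("server_op", {
  validate: (p) =>
    p["operation"] === "status" || p["operation"] === "logs"
      ? null
      : `unsupported operation: ${String(p["operation"])}`,
  // Placeholder body; a real handler would query the named service.
  execute: (p) => `${p["operation"]} requested for ${p["target"] ?? "system"}`,
});
```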
Best Practices
- Always validate command parameters
- Implement proper error handling and user feedback
- Use security whitelist approach (deny by default)
- Log all command execution for audit purposes
- Keep commands atomic and idempotent where possible
Troubleshooting
Common Issues
- Command not executing: Check XML syntax and parameter validation
- Security errors: Verify URL/operation is whitelisted
- API errors: Check user permissions and API endpoint availability
- WebSocket issues: Ensure connection is established before sending commands
Debugging
- Enable debug logging in both frontend and backend processors
- Check browser console for frontend command issues
- Review server logs for backend command execution
- Use WebSocket debug messages for real-time feedback
Future Enhancements
- File Operations: Upload/download file commands
- Database Queries: Direct SQL execution with proper security
- External Integrations: Commands for third-party services
- Workflow Orchestration: Multi-step command sequences
- Advanced Security: Role-based command permissions