Memory Graph 4NF Plan
Subtitle: Relational semantic graph for durable memory, execution, and CRM state
Executive Summary
ValkyrAI already contains the raw ingredients for a semantic graph implemented on top of a normalized relational database:
MemoryEntry, Tag, Principal, Organization, Customer, Opportunity, SalesActivity, ContentData, Goal, StrategicPriority, Task, Workflow, Run, Agent, GrayMatter
What is still missing is a formal graph strategy that treats these objects as first-class nodes in a connected memory and execution web.
This plan defines the 4NF direction:
- keep reusable facts in their own tables
- eliminate repeating semantic groups from blob fields
- use explicit relations for important links
- use tags and graph edges as normalized join structures
- preserve ownership, provenance, and indexability
The result is an explainable, queryable, ACL-friendly semantic web imposed on an RDB.
Problem Statement
Today, memory can be written into MemoryEntry, but many relationships still live implicitly in text, source URLs, or ad hoc workflow knowledge.
That creates several problems:
- memory is durable but not fully relational
- semantic lookup depends too much on text scanning
- ownership and provenance exist, but entity-to-entity meaning is under-modeled
- repeated dimensions such as tags, related entities, and cross-domain links are not consistently normalized
- graph traversal is possible in principle, but not explicit enough in schema design
To make GrayMatter and api-0 truly useful as durable machine memory, the system needs a stronger relational graph discipline.
Design Goal
Model ValkyrAI memory as a 4NF semantic graph where:
- core entities remain independently queryable
- multi-valued relationships become explicit join structures
- important semantic edges are first-class and indexed
- ownership and ACL remain compatible with every node and edge
- the system supports both deterministic business workflows and semantic retrieval
Existing Graph-Capable Nodes
Identity spine
Principal, Organization, Address, Role, Authority
Memory spine
MemoryEntry, Tag, Note, ContentData, GrayMatter
Commercial spine
Customer, Opportunity, SalesActivity, SalesPipeline
Execution spine
Goal, StrategicPriority, Task, Workflow, Run, Agent
These are already enough to define a meaningful semantic graph. The remaining work is formalization and normalization.
4NF Rules For The Memory Graph
Rule 1: Keep stable entities first-class
If a concept has independent lifecycle, permissions, or query value, it must be its own model.
Examples:
Principal, Organization, Customer, ContentData, MemoryEntry, Goal, Workflow
Rule 2: Multi-valued dimensions must not live as repeated text blobs
If an entity can have many tags, many related nodes, many references, or many semantic labels, model those as join structures.
Examples:
- tags
- related memories
- linked customers
- linked goals
- linked workflows
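As a minimal sketch of Rule 2, the snippet below uses Python's sqlite3 module to keep two independent multi-valued dimensions (tags and linked goals) in separate join tables instead of a blob field or one combined table. Fourth normal form asks that independent multi-valued facts not share a table, which is why each dimension gets its own join structure here. All table names, column names, and sample rows are hypothetical and chosen only for illustration; they are not the actual ValkyrAI schema.

```python
import sqlite3

# Hypothetical schema: names are illustrative assumptions, not the real model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memory_entry (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE goal (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One join table per independent multi-valued dimension.
CREATE TABLE memory_entry_tag (
    memory_entry_id INTEGER NOT NULL REFERENCES memory_entry(id),
    tag_id INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (memory_entry_id, tag_id)
);
CREATE TABLE memory_entry_goal (
    memory_entry_id INTEGER NOT NULL REFERENCES memory_entry(id),
    goal_id INTEGER NOT NULL REFERENCES goal(id),
    PRIMARY KEY (memory_entry_id, goal_id)
);
""")

# Sample data (invented for the example).
conn.execute("INSERT INTO memory_entry (id, text) VALUES (1, 'Discovery call notes')")
conn.executemany(
    "INSERT INTO tag (id, name) VALUES (?, ?)",
    [(1, "domain:crm"), (2, "signal:high"), (3, "sensitivity:private")],
)
conn.executemany(
    "INSERT INTO goal (id, name) VALUES (?, ?)",
    [(1, "Example goal A"), (2, "Example goal B")],
)
conn.executemany("INSERT INTO memory_entry_tag VALUES (1, ?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO memory_entry_goal VALUES (1, ?)", [(1,), (2,)])


def memories_with_tag(name):
    """Return ids of memory entries carrying the given tag, via the join table."""
    rows = conn.execute(
        """
        SELECT m.id
        FROM memory_entry m
        JOIN memory_entry_tag mt ON mt.memory_entry_id = m.id
        JOIN tag t ON t.id = mt.tag_id
        WHERE t.name = ?
        ORDER BY m.id
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]
```

With three tags and two goals, the two join tables hold five rows in total; a single combined (memory, tag, goal) table would need all six combinations to say the same thing, and lookup by tag stays a plain indexed join rather than a text scan.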
Rule 3: Provenance must remain normalized
Source metadata is itself a structured dimension and should remain queryable:
- source channel
- source message id
- source URL
- creator / owner
- creation timestamp
Rule 4: Execution and memory must be linked
Important memory should be able to point to execution objects and vice versa.
Examples:
- a memory entry about a strategy should link to a Goal
- an implementation note should link to a Task or Workflow
- a CRM discovery memory should link to Customer and Opportunity
Rule 5: Public projection must be separate from private memory
Internal semantic nodes should remain private by default. Published content should be projected deliberately through explicit public surfaces.
Proposed Schema Enhancements
1) Upgrade MemoryEntry to support explicit semantic links
MemoryEntry should remain the durable freeform memory nucleus, but it should gain structured link fields for the most important entity families.
Proposed additions
MemoryEntry:
  properties:
    principal:
      $ref: '#/components/schemas/Principal'
    organization:
      $ref: '#/components/schemas/Organization'
    customer:
      $ref: '#/components/schemas/Customer'
    opportunity:
      $ref: '#/components/schemas/Opportunity'
    contentData:
      $ref: '#/components/schemas/ContentData'
    goal:
      $ref: '#/components/schemas/Goal'
    task:
      $ref: '#/components/schemas/Task'
    workflow:
      $ref: '#/components/schemas/Workflow'
    agent:
      $ref: '#/components/schemas/Agent'
    run:
      $ref: '#/components/schemas/Run'
Why
These fields cover the most important single-valued relationships, where a memory entry has at most one primary principal, customer, goal, and so on, without forcing everything into generic edge traversal.
2) Introduce GraphLink as a generic semantic edge model
For many-to-many and cross-domain graphing, use a dedicated edge table.
Purpose
Allow arbitrary semantic edges between node types without denormalizing the nodes themselves.
Proposed shape
GraphLink:
  type: object
  properties:
    id:
      type: string
      format: uuid
    ownerId:
      type: string
      format: uuid
    fromType:
      type: string
    fromId:
      type: string
      format: uuid
    toType:
      type: string
    toId:
      type: string
      format: uuid
    relationType:
      type: string
      description: semantic relation, e.g. relates_to, derived_from, supports, blocks, references
    weight:
      type: number
      format: double
    sourceChannel:
      type: string
    sourceMessageId:
      type: string
    sourceUrl:
      type: string
    createdDate:
      type: string
      format: date-time
    lastModifiedDate:
      type: string
      format: date-time
Example edges
- MemoryEntry -> Goal via supports
- Task -> StrategicPriority via implements
- Opportunity -> ContentData via uses_asset
- MemoryEntry -> MemoryEntry via derived_from
- Customer -> Organization via belongs_to
Why
This gives the system a true graph layer without sacrificing relational integrity.
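To make the edge model concrete, here is a rough sketch of a GraphLink-style table and an owner-scoped neighbor lookup, again using sqlite3. The snake_case column names mirror the proposed shape above, but the table name, the helper functions add_link and outgoing, the uniqueness constraint, and the short ids such as mem-1 (used in place of UUIDs for readability) are all assumptions made for this example.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE graph_link (
    id TEXT PRIMARY KEY,
    owner_id TEXT NOT NULL,
    from_type TEXT NOT NULL,
    from_id TEXT NOT NULL,
    to_type TEXT NOT NULL,
    to_id TEXT NOT NULL,
    relation_type TEXT NOT NULL,
    weight REAL,
    -- Assumed constraint: the same edge is stored once per owner.
    UNIQUE (owner_id, from_type, from_id, to_type, to_id, relation_type)
)
""")


def add_link(owner_id, from_type, from_id, to_type, to_id, relation_type, weight=1.0):
    """Insert one typed, owned edge and return its generated id."""
    link_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO graph_link VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (link_id, owner_id, from_type, from_id, to_type, to_id, relation_type, weight),
    )
    return link_id


def outgoing(owner_id, from_type, from_id, relation_type=None):
    """Return (to_type, to_id, relation_type) for edges the owner can see."""
    sql = (
        "SELECT to_type, to_id, relation_type FROM graph_link "
        "WHERE owner_id = ? AND from_type = ? AND from_id = ?"
    )
    params = [owner_id, from_type, from_id]
    if relation_type is not None:
        sql += " AND relation_type = ?"
        params.append(relation_type)
    sql += " ORDER BY relation_type, to_type, to_id"
    return conn.execute(sql, params).fetchall()


# Two edges for one owner, plus an edge owned by someone else.
add_link("owner-1", "MemoryEntry", "mem-1", "Goal", "goal-1", "supports")
add_link("owner-1", "MemoryEntry", "mem-1", "MemoryEntry", "mem-0", "derived_from")
add_link("owner-2", "MemoryEntry", "mem-1", "Goal", "goal-9", "supports")
```

Filtering on owner_id inside every traversal query is one simple way to keep edges ACL-compatible: the edge owned by owner-2 never appears in owner-1's neighborhood, even though it starts from the same node.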
3) Normalize tags as reusable graph labels
Tags already exist and should become a first-class semantic indexing system.
Rules
- Never rely on raw string arrays when reusable Tag objects are available
- Standardize high-value tag families:
  - memory type
  - domain
  - priority
  - workflow area
  - sensitivity
  - publication state
  - commercial stage
Example tag families
- domain:crm
- domain:memory
- domain:workflow
- state:draft
- state:published
- sensitivity:private
- signal:high
- graph:strategic
- graph:execution
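The tag families above read as namespace:name pairs, so one way to normalize them is to split each raw tag into those two parts before storing it. The sketch below assumes that convention plus lowercase normalization; TagKey, parse_tag, and group_by_namespace are hypothetical helpers, not existing ValkyrAI code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagKey:
    """Normalized tag identity: a namespace (family) and a name within it."""
    namespace: str
    name: str

    @property
    def slug(self) -> str:
        return f"{self.namespace}:{self.name}"


def parse_tag(raw: str) -> TagKey:
    """Split 'namespace:name' into a TagKey, trimming and lowercasing (assumed rules)."""
    namespace, sep, name = raw.strip().lower().partition(":")
    if not sep or not namespace or not name:
        raise ValueError(f"expected 'namespace:name', got {raw!r}")
    return TagKey(namespace, name)


def group_by_namespace(raw_tags):
    """Deduplicate raw tags and group their names under each namespace."""
    grouped = {}
    unique_keys = {parse_tag(tag) for tag in raw_tags}
    for key in sorted(unique_keys, key=lambda k: k.slug):
        grouped.setdefault(key.namespace, []).append(key.name)
    return grouped
```

Storing namespace and name as separate columns would let a query ask for every tag in a family, such as all sensitivity values, without pattern matching on strings.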
4) Link execution entities more explicitly
Execution objects already form a partial graph, but should be treated as a semantic layer.
Important existing edges
- StrategicPriority -> Goal[]
- Workflow -> Task[]
- Task -> Workflow
- Run -> Task
- Agent -> Workflow[]
Desired additions
- Task -> Goal
- Task -> StrategicPriority
- Workflow -> Goal
- Run -> Workflow
- Agent -> StrategicPriority
This supports better reasoning over intent, action, and outcome.
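A small type-level sketch shows what the desired additions buy. It treats the existing and desired edges listed above as a directed graph between entity types and runs a breadth-first search from Run (outcome) to StrategicPriority (intent). The simplification that edges are followed only in their stated direction is an assumption of this example; a real traversal service could also walk foreign keys in reverse.

```python
from collections import deque

# Edges copied from the lists above, as type-level adjacency maps.
EXISTING = {
    "StrategicPriority": ["Goal"],
    "Workflow": ["Task"],
    "Task": ["Workflow"],
    "Run": ["Task"],
    "Agent": ["Workflow"],
}
DESIRED = {
    "Task": ["Goal", "StrategicPriority"],
    "Workflow": ["Goal"],
    "Run": ["Workflow"],
    "Agent": ["StrategicPriority"],
}


def merge(*graphs):
    """Combine adjacency maps without mutating the inputs."""
    merged = {}
    for graph in graphs:
        for source, targets in graph.items():
            merged.setdefault(source, []).extend(targets)
    return merged


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of node types or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


before = shortest_path(EXISTING, "Run", "StrategicPriority")
after = shortest_path(merge(EXISTING, DESIRED), "Run", "StrategicPriority")
```

Under these assumptions, before is None because the existing forward edges only cycle between Task and Workflow, while after is the two-hop path Run, Task, StrategicPriority.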
5) Link CRM and memory directly
CRM state should not float separately from memory.
Desired memory/CRM links
- MemoryEntry -> Customer
- MemoryEntry -> Opportunity
- MemoryEntry -> Organization
- SalesActivity -> ContentData when an asset was used
- Opportunity -> ContentData for linked collateral
This gives sales memory a proper graph substrate rather than a pile of notes.
Indexing Strategy
The graph only becomes useful if traversal is cheap.
Required indexes
Ownership / security
owner_id
Identity and business traversal
principal_id, organization_id, customer_id, opportunity_id, goal_id, task_id, workflow_id, agent_id, run_id
Provenance
source_message_id, source_url, source_channel
Time
created_date, last_modified_date
Public/content access
slug, title, status
Graph traversal
For GraphLink:
- (from_type, from_id)
- (to_type, to_id)
- relation_type
- (owner_id, relation_type)
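The four GraphLink indexes above can be checked against a query planner. The following sketch creates them on an in-memory SQLite table and uses EXPLAIN QUERY PLAN to see which index serves a forward and a backward lookup. The index names, the trimmed column set, and the choice of SQLite are assumptions for illustration; the production database may plan differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE graph_link (
    id TEXT PRIMARY KEY,
    owner_id TEXT NOT NULL,
    from_type TEXT NOT NULL,
    from_id TEXT NOT NULL,
    to_type TEXT NOT NULL,
    to_id TEXT NOT NULL,
    relation_type TEXT NOT NULL
);
-- Hypothetical index names; the column lists follow the plan.
CREATE INDEX idx_graph_link_from ON graph_link (from_type, from_id);
CREATE INDEX idx_graph_link_to ON graph_link (to_type, to_id);
CREATE INDEX idx_graph_link_relation ON graph_link (relation_type);
CREATE INDEX idx_graph_link_owner_relation ON graph_link (owner_id, relation_type);
""")


def plan(sql, params=()):
    """Return SQLite's query plan detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail string.
    return " | ".join(row[-1] for row in rows)


forward_plan = plan(
    "SELECT to_type, to_id FROM graph_link WHERE from_type = ? AND from_id = ?",
    ("MemoryEntry", "mem-1"),
)
backward_plan = plan(
    "SELECT from_type, from_id FROM graph_link WHERE to_type = ? AND to_id = ?",
    ("Goal", "goal-1"),
)
print(forward_plan)
print(backward_plan)
```

Having both the (from_type, from_id) and (to_type, to_id) indexes matters because edges are stored once but are usually traversed in both directions.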
Tagging
- tag join tables on both sides
- normalized tag name / slug / namespace
Memory Graph Example
A single strategic memory should be able to connect across multiple domains.
Example
MemoryEntry: “6D-A is the organizational operating method”, linked to:
- StrategicPriority: Governance
- Goal: Compounding autonomous execution
- ContentData: 6D-A Governance Engine doc
- Agent: Valor
- Tag: domain:memory, graph:strategic, sensitivity:private
- GraphLink: supports, codifies, relates_to
This is not just storage. It becomes machine-reasonable structure.
Public vs Private Graph Surfaces
Private-by-default remains mandatory.
Private by default
- MemoryEntry
- GraphLink
- CRM entities unless explicitly exposed
- internal execution state
Public by projection
- selected ContentData
- published marketing assets
- public MCP/catalog objects
- explicitly public APIs
The semantic graph should be rich internally and selectively projected externally.
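One way to express projection in code is an allowlist: a node is exposed only if its type is projectable, it is explicitly flagged public, and it is published, and even then only allowlisted fields leave the private graph. The sketch below is an assumption about how such a rule could look; Node, PUBLIC_TYPES, PUBLIC_FIELDS, and project_public are hypothetical names, and the status and is_public fields are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    node_type: str
    node_id: str
    owner_id: str
    title: str
    status: str = "draft"    # assumed values: "draft" or "published"
    is_public: bool = False  # private by default


# Only these types may ever be projected, and only these fields are exposed.
PUBLIC_TYPES = {"ContentData"}
PUBLIC_FIELDS = ("node_type", "node_id", "title")


def project_public(node: Node) -> Optional[dict]:
    """Return the public view of a node, or None if it must stay private."""
    if node.node_type not in PUBLIC_TYPES:
        return None
    if not (node.is_public and node.status == "published"):
        return None
    return {field: getattr(node, field) for field in PUBLIC_FIELDS}


memory = Node("MemoryEntry", "mem-1", "owner-1", "Internal note", "published", True)
draft = Node("ContentData", "doc-1", "owner-1", "Draft asset")
asset = Node("ContentData", "doc-2", "owner-1", "Launch post", "published", True)
```

Here memory stays private even though its flags say public, because MemoryEntry is not a projectable type, and the projected view of asset omits owner_id.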
Phase Rollout
Phase 1
- Treat MemoryEntry + Tag + source metadata as canonical durable memory
- Standardize normalized tag usage
- Enforce post-write GET verification in the correct auth scope
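Post-write verification can be read as: after every write, read the entry back through the same auth scope and fail loudly if it is missing or differs. The sketch below illustrates that reading with an in-memory stand-in; ScopedStore and write_verified are hypothetical and do not represent the real api-0 client or its endpoints.

```python
class ScopedStore:
    """In-memory stand-in for a memory API; entries are visible only to their owner."""

    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def post(self, owner_id, payload):
        """Store a payload under the given owner and return the new entry id."""
        entry_id = self._next_id
        self._next_id += 1
        self._rows[entry_id] = {"id": entry_id, "owner_id": owner_id, **payload}
        return entry_id

    def get(self, owner_id, entry_id):
        """Return a copy of the entry, or None if absent or owned by someone else."""
        row = self._rows.get(entry_id)
        if row is None or row["owner_id"] != owner_id:
            return None
        return dict(row)


def write_verified(store, owner_id, payload):
    """Write, then GET in the same scope and compare every written field."""
    entry_id = store.post(owner_id, payload)
    fetched = store.get(owner_id, entry_id)
    if fetched is None:
        raise RuntimeError(f"entry {entry_id} not readable in scope {owner_id!r}")
    mismatched = sorted(k for k, v in payload.items() if fetched.get(k) != v)
    if mismatched:
        raise RuntimeError(f"entry {entry_id} differs after write: {mismatched}")
    return fetched
```

The check catches the case where a write succeeds but lands under the wrong owner, which would otherwise surface only later as a missing memory.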
Phase 2
- Extend MemoryEntry with first-class links to key node families
- Add missing indexes for traversal and provenance
Phase 3
- Introduce GraphLink
- Use it for semantic cross-domain edges
- Build graph traversal services and UI inspectors
Phase 4
- Integrate GrayMatter explicitly as graph-native memory orchestration
- Add reasoning and recommendation flows over relational graph neighborhoods
Canonical Direction
The ValkyrAI memory system should become:
a private-by-default, ACL-aware, highly normalized relational semantic graph that unifies identity, memory, execution, and commercial state.
This is the practical semantic web: not vague embeddings replacing structure, but a graph of typed objects and indexed relations living inside a disciplined RDB.