Memory Graph 4NF Plan
Subtitle: Relational semantic graph for durable memory, execution, and CRM state
Executive Summary
ValkyrAI already contains the raw ingredients for a semantic graph implemented on top of a normalized relational database:
- MemoryEntry
- Tag
- Principal
- Organization
- Customer
- Opportunity
- SalesActivity
- ContentData
- Goal
- StrategicPriority
- Task
- Workflow
- Run
- Agent
- GrayMatter
What is still missing is a formal graph strategy that treats these objects as first-class nodes in a connected memory and execution web.
This plan defines the 4NF direction:
- keep reusable facts in their own tables
- eliminate repeating semantic groups from blob fields
- use explicit relations for important links
- use tags and graph edges as normalized join structures
- preserve ownership, provenance, and indexability
The result is an explainable, queryable, ACL-friendly semantic web imposed on an RDB.
Problem Statement
Today, memory can be written into MemoryEntry, but many relationships still live implicitly in text, source URLs, or ad hoc workflow knowledge.
That creates several problems:
- memory is durable but not fully relational
- semantic lookup depends too much on text scanning
- ownership and provenance exist, but entity-to-entity meaning is under-modeled
- repeated dimensions such as tags, related entities, and cross-domain links are not consistently normalized
- graph traversal is possible in principle, but not explicit enough in schema design
To make GrayMatter and api-0 truly useful as durable machine memory, the system needs a stronger relational graph discipline.
Design Goal
Model ValkyrAI memory as a 4NF semantic graph where:
- core entities remain independently queryable
- multi-valued relationships become explicit join structures
- important semantic edges are first-class and indexed
- ownership and ACL remain compatible with every node and edge
- the system supports both deterministic business workflows and semantic retrieval
Existing Graph-Capable Nodes
Identity spine
- Principal
- Organization
- Address
- Role
- Authority
Memory spine
- MemoryEntry
- Tag
- Note
- ContentData
- GrayMatter
Commercial spine
- Customer
- Opportunity
- SalesActivity
- SalesPipeline
Execution spine
- Goal
- StrategicPriority
- Task
- Workflow
- Run
- Agent
These are already enough to define a meaningful semantic graph. The remaining work is formalization and normalization.
4NF Rules For The Memory Graph
Rule 1: Keep stable entities first-class
If a concept has an independent lifecycle, its own permissions, or independent query value, it must be its own model.
Examples:
- Principal
- Organization
- Customer
- ContentData
- MemoryEntry
- Goal
- Workflow
Rule 2: Multi-valued dimensions must not live as repeated text blobs
If an entity can have many tags, many related nodes, many references, or many semantic labels, model those as join structures.
Examples:
- tags
- related memories
- linked customers
- linked goals
- linked workflows
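As a minimal sketch of Rule 2, assuming a SQLite store and hypothetical table names (memory_entry, tag, memory_entry_tag) that are not taken from the ValkyrAI schema, a multi-valued tag dimension can be held in a join table rather than in a text field:

```python
import sqlite3

# Hypothetical table and column names, for illustration only.
SCHEMA = """
CREATE TABLE memory_entry (
    id   TEXT PRIMARY KEY,
    body TEXT NOT NULL
);
CREATE TABLE tag (
    id   TEXT PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- One row per (entry, tag) pair: the multi-valued fact lives here,
-- not inside memory_entry.body.
CREATE TABLE memory_entry_tag (
    memory_entry_id TEXT NOT NULL REFERENCES memory_entry(id),
    tag_id          TEXT NOT NULL REFERENCES tag(id),
    PRIMARY KEY (memory_entry_id, tag_id)
);
"""


def entries_with_tag(conn, tag_name):
    """Return ids of memory entries carrying the given tag, via the join table."""
    rows = conn.execute(
        """
        SELECT me.id
        FROM memory_entry AS me
        JOIN memory_entry_tag AS met ON met.memory_entry_id = me.id
        JOIN tag AS t ON t.id = met.tag_id
        WHERE t.name = ?
        ORDER BY me.id
        """,
        (tag_name,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO memory_entry VALUES (?, ?)",
    [("m1", "Discovery call notes"), ("m2", "Governance strategy")],
)
conn.executemany(
    "INSERT INTO tag VALUES (?, ?)",
    [("t1", "domain:crm"), ("t2", "domain:memory")],
)
conn.executemany(
    "INSERT INTO memory_entry_tag VALUES (?, ?)",
    [("m1", "t1"), ("m2", "t1"), ("m2", "t2")],
)

crm_entries = entries_with_tag(conn, "domain:crm")
```

The same pattern would apply to related memories, linked customers, goals, and workflows: each multi-valued relationship gets its own two-column join table with a composite primary key.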
Rule 3: Provenance must remain normalized
Source metadata is itself a structured dimension and should remain queryable:
- source channel
- source message id
- source URL
- creator / owner
- creation timestamp
Rule 4: Execution and memory must be linked
Important memory should be able to point to execution objects and vice versa.
Examples:
- a memory entry about a strategy should link to a Goal
- an implementation note should link to a Task or Workflow
- a CRM discovery memory should link to Customer and Opportunity
Rule 5: Public projection must be separate from private memory
Internal semantic nodes should remain private by default. Published content should be projected deliberately through explicit public surfaces.
Proposed Schema Enhancements
1) Upgrade MemoryEntry to support explicit semantic links
MemoryEntry should remain the durable freeform memory nucleus, but it should gain structured link fields for the most important entity families.
Proposed additions
    MemoryEntry:
      properties:
        principal:
          $ref: '#/components/schemas/Principal'
        organization:
          $ref: '#/components/schemas/Organization'
        customer:
          $ref: '#/components/schemas/Customer'
        opportunity:
          $ref: '#/components/schemas/Opportunity'
        contentData:
          $ref: '#/components/schemas/ContentData'
        goal:
          $ref: '#/components/schemas/Goal'
        task:
          $ref: '#/components/schemas/Task'
        workflow:
          $ref: '#/components/schemas/Workflow'
        agent:
          $ref: '#/components/schemas/Agent'
        run:
          $ref: '#/components/schemas/Run'
Why
These fields cover the most important single-valued relationships, where a memory entry points at one primary context of a given type, without forcing every lookup through generic edge traversal.
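One way to picture these link fields in application code is a record with one optional reference per entity family. This is a sketch under stated assumptions: the class name MemoryEntryLinks, the *_id field naming, and the linked_families helper are all hypothetical and only mirror the proposed property names.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MemoryEntryLinks:
    """Hypothetical holder for the single-valued links proposed on MemoryEntry."""

    principal_id: Optional[str] = None
    organization_id: Optional[str] = None
    customer_id: Optional[str] = None
    opportunity_id: Optional[str] = None
    content_data_id: Optional[str] = None
    goal_id: Optional[str] = None
    task_id: Optional[str] = None
    workflow_id: Optional[str] = None
    agent_id: Optional[str] = None
    run_id: Optional[str] = None

    def linked_families(self) -> List[str]:
        """Names of the entity families this entry points at, in field order."""
        return [
            f.name[: -len("_id")]
            for f in fields(self)
            if getattr(self, f.name) is not None
        ]


# A CRM discovery memory that also supports a goal.
links = MemoryEntryLinks(customer_id="cust-1", opportunity_id="opp-7", goal_id="goal-3")
families = links.linked_families()
```

Each field holds at most one reference. Anything that needs several targets of the same type belongs in a join structure instead.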
2) Introduce GraphLink as a generic semantic edge model
For many-to-many and cross-domain graphing, use a dedicated edge table.
Purpose
Allow arbitrary semantic edges between node types without denormalizing the nodes themselves.
Proposed shape
    GraphLink:
      type: object
      properties:
        id:
          type: string
          format: uuid
        ownerId:
          type: string
          format: uuid
        fromType:
          type: string
        fromId:
          type: string
          format: uuid
        toType:
          type: string
        toId:
          type: string
          format: uuid
        relationType:
          type: string
          description: semantic relation, e.g. relates_to, derived_from, supports, blocks, references
        weight:
          type: number
          format: double
        sourceChannel:
          type: string
        sourceMessageId:
          type: string
        sourceUrl:
          type: string
        createdDate:
          type: string
          format: date-time
        lastModifiedDate:
          type: string
          format: date-time
Example edges
- MemoryEntry -> Goal via supports
- Task -> StrategicPriority via implements
- Opportunity -> ContentData via uses_asset
- MemoryEntry -> MemoryEntry via derived_from
- Customer -> Organization via belongs_to
Why
This gives the system a true graph layer without sacrificing relational integrity.
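The edge model can be sketched as a single relational table with an owner-scoped one-hop lookup. This is an illustration only: it assumes SQLite, snake_case column names derived from the GraphLink properties, and hypothetical helpers add_link and outgoing. The provenance and timestamp columns are omitted for brevity.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE graph_link (
        id            TEXT PRIMARY KEY,
        owner_id      TEXT NOT NULL,
        from_type     TEXT NOT NULL,
        from_id       TEXT NOT NULL,
        to_type       TEXT NOT NULL,
        to_id         TEXT NOT NULL,
        relation_type TEXT NOT NULL,
        weight        REAL
    )
    """
)


def add_link(conn, owner_id, from_node, relation_type, to_node, weight=1.0):
    """Insert one typed edge; nodes are (type, id) pairs. Returns the new edge id."""
    link_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO graph_link VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (link_id, owner_id, from_node[0], from_node[1],
         to_node[0], to_node[1], relation_type, weight),
    )
    return link_id


def outgoing(conn, owner_id, node, relation_type=None):
    """One-hop traversal: edges leaving `node` that belong to `owner_id`."""
    sql = (
        "SELECT relation_type, to_type, to_id FROM graph_link "
        "WHERE owner_id = ? AND from_type = ? AND from_id = ?"
    )
    params = [owner_id, node[0], node[1]]
    if relation_type is not None:
        sql += " AND relation_type = ?"
        params.append(relation_type)
    sql += " ORDER BY relation_type, to_type, to_id"
    return conn.execute(sql, params).fetchall()


memory = ("MemoryEntry", "m1")
add_link(conn, "owner-1", memory, "supports", ("Goal", "g1"))
add_link(conn, "owner-1", memory, "derived_from", ("MemoryEntry", "m0"))
add_link(conn, "owner-2", memory, "supports", ("Goal", "g9"))  # another owner's edge

edges = outgoing(conn, "owner-1", memory)
```

Filtering on owner_id inside the traversal query keeps edge reads inside the same ownership boundary as the nodes they connect. Because fromType/fromId and toType/toId are polymorphic, the database cannot enforce them as foreign keys, so a real implementation would need to validate the endpoints in application code.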
3) Normalize tags as reusable graph labels
Tags already exist and should become a first-class semantic indexing system.
Rules
- Never rely on raw string arrays when reusable Tag objects are available
- Standardize high-value tag families:
  - memory type
  - domain
  - priority
  - workflow area
  - sensitivity
  - publication state
  - commercial stage
Example tag families
- domain:crm
- domain:memory
- domain:workflow
- state:draft
- state:published
- sensitivity:private
- signal:high
- graph:strategic
- graph:execution
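The namespace:name convention in the example families lends itself to a small normalization step before a tag is stored or looked up. The sketch below is hypothetical: the TagKey type, the parse_tag helper, and the choice to trim and lowercase input are assumptions, not rules stated in this plan.

```python
from typing import NamedTuple


class TagKey(NamedTuple):
    """Normalized tag identity: a family (namespace) plus a value (name)."""

    namespace: str
    name: str


def parse_tag(raw: str) -> TagKey:
    """Split 'namespace:name' into a normalized TagKey.

    Assumption: tags are case-insensitive and surrounding whitespace is noise.
    """
    cleaned = raw.strip().lower()
    namespace, sep, name = cleaned.partition(":")
    if not sep or not namespace or not name:
        raise ValueError(f"expected 'namespace:name', got {raw!r}")
    return TagKey(namespace, name)


parsed = [parse_tag(t) for t in ["domain:crm", " State:Draft ", "sensitivity:private"]]
```

Storing namespace and name as separate columns lets a query select a whole family (every domain tag, for example) without pattern matching on strings.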
4) Link execution entities more explicitly
Execution objects already form a partial graph, but should be treated as a semantic layer.
Important existing edges
- StrategicPriority -> Goal[]
- Workflow -> Task[]
- Task -> Workflow
- Run -> Task
- Agent -> Workflow[]
Desired additions
- Task -> Goal
- Task -> StrategicPriority
- Workflow -> Goal
- Run -> Workflow
- Agent -> StrategicPriority
This supports better reasoning over intent, action, and outcome.
5) Link CRM and memory directly
CRM state should not float separately from memory.
Desired memory/CRM links
- MemoryEntry -> Customer
- MemoryEntry -> Opportunity
- MemoryEntry -> Organization
- SalesActivity -> ContentData when an asset was used
- Opportunity -> ContentData for linked collateral
This gives sales memory a proper graph substrate rather than a pile of notes.
Indexing Strategy
The graph only becomes useful if traversal is cheap.
Required indexes
Ownership / security
- owner_id
Identity and business traversal
- principal_id
- organization_id
- customer_id
- opportunity_id
- goal_id
- task_id
- workflow_id
- agent_id
- run_id
Provenance
- source_message_id
- source_url
- source_channel
Time
- created_date
- last_modified_date
Public/content access
- slug
- title
- status
Graph traversal
For GraphLink:
- (from_type, from_id)
- (to_type, to_id)
- relation_type
- (owner_id, relation_type)
Tagging
- tag join tables on both sides
- normalized tag name / slug / namespace
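To check that the GraphLink indexes listed above actually serve traversal queries, the query plan can be inspected. The sketch below assumes SQLite and hypothetical index names. The output format of EXPLAIN QUERY PLAN varies between SQLite versions, so the code only looks for the index name in the plan text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE graph_link (
        id            TEXT PRIMARY KEY,
        owner_id      TEXT NOT NULL,
        from_type     TEXT NOT NULL,
        from_id       TEXT NOT NULL,
        to_type       TEXT NOT NULL,
        to_id         TEXT NOT NULL,
        relation_type TEXT NOT NULL
    );
    -- Index names are illustrative; the column lists follow the plan.
    CREATE INDEX idx_graph_link_from ON graph_link (from_type, from_id);
    CREATE INDEX idx_graph_link_to ON graph_link (to_type, to_id);
    CREATE INDEX idx_graph_link_relation ON graph_link (relation_type);
    CREATE INDEX idx_graph_link_owner_relation ON graph_link (owner_id, relation_type);
    """
)


def plan_for(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


forward_plan = plan_for(
    conn,
    "SELECT to_type, to_id FROM graph_link WHERE from_type = ? AND from_id = ?",
    ("MemoryEntry", "m1"),
)
backward_plan = plan_for(
    conn,
    "SELECT from_type, from_id FROM graph_link WHERE to_type = ? AND to_id = ?",
    ("Goal", "g1"),
)
```

A plan that reports a full table scan for either direction would suggest that the corresponding composite index is missing or that its column order does not match the query.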
Memory Graph Example
A single strategic memory should be able to connect across multiple domains.
Example
- MemoryEntry: "6D-A is the organizational operating method"
- linked to:
  - StrategicPriority: Governance
  - Goal: Compounding autonomous execution
  - ContentData: 6D-A Governance Engine doc
  - Agent: Valor
  - Tag: domain:memory, graph:strategic, sensitivity:private
  - GraphLink: supports, codifies, relates_to
This is not just storage. It becomes machine-reasonable structure.
Public vs Private Graph Surfaces
Private-by-default remains mandatory.
Private by default
- MemoryEntry
- GraphLink
- CRM entities unless explicitly exposed
- internal execution state
Public by projection
- selected ContentData
- published marketing assets
- public MCP/catalog objects
- explicitly public APIs
The semantic graph should be rich internally and selectively projected externally.
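A projection rule of this kind can be sketched as an explicit allow-list combined with a per-node publish flag. Everything here is hypothetical: the Node type, the published flag, and an allow-list containing only ContentData are assumptions chosen to mirror the lists above, not an existing ValkyrAI mechanism.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical allow-list: only these node types may ever be projected publicly.
PROJECTABLE_TYPES = frozenset({"ContentData"})


@dataclass(frozen=True)
class Node:
    node_type: str
    node_id: str
    published: bool = False  # private unless explicitly published


def project_public(nodes: Iterable[Node]) -> List[Node]:
    """Keep only nodes that are both of a projectable type and explicitly published."""
    return [n for n in nodes if n.node_type in PROJECTABLE_TYPES and n.published]


nodes = [
    Node("MemoryEntry", "m1", published=True),  # never projected, even if flagged
    Node("ContentData", "c1"),                  # private by default
    Node("ContentData", "c2", published=True),  # deliberately published
]
public_nodes = project_public(nodes)
```

Requiring both conditions means a mistakenly set flag on a private node type does not leak it, and a projectable type still stays private until someone publishes it.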
Phase Rollout
Phase 1
- Treat MemoryEntry + Tag + source metadata as canonical durable memory
- Standardize normalized tag usage
- Enforce post-write GET verification in the correct auth scope
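The post-write verification step can be illustrated with a toy store. This is a sketch only: InMemoryStore stands in for whatever memory API is actually used, and its post/get methods, the scope argument, and write_verified are hypothetical names rather than the real client interface.

```python
class InMemoryStore:
    """Hypothetical stand-in for a memory API, keyed by (auth scope, entry id)."""

    def __init__(self):
        self._rows = {}

    def post(self, scope, entry_id, payload):
        self._rows[(scope, entry_id)] = dict(payload)

    def get(self, scope, entry_id):
        row = self._rows.get((scope, entry_id))
        return dict(row) if row is not None else None


def write_verified(store, scope, entry_id, payload):
    """Write an entry, then read it back in the same scope and compare."""
    store.post(scope, entry_id, payload)
    stored = store.get(scope, entry_id)
    if stored != payload:
        raise RuntimeError(
            f"post-write verification failed for {entry_id!r} in scope {scope!r}"
        )
    return stored


store = InMemoryStore()
saved = write_verified(store, "org-a", "m1", {"text": "6D-A operating method"})
other_scope_view = store.get("org-b", "m1")  # a different scope sees nothing
```

Reading back in the same scope that performed the write is the point of the check: a read in a different scope could return nothing, or a different record, even though the write succeeded.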
Phase 2
- Extend MemoryEntry with first-class links to key node families
- Add missing indexes for traversal and provenance
Phase 3
- Introduce GraphLink
- Use it for semantic cross-domain edges
- Build graph traversal services and UI inspectors
Phase 4
- Integrate GrayMatter explicitly as graph-native memory orchestration
- Add reasoning and recommendation flows over relational graph neighborhoods
Canonical Direction
The ValkyrAI memory system should become:
a private-by-default, ACL-aware, highly normalized relational semantic graph that unifies identity, memory, execution, and commercial state.
This is the practical semantic web: not vague embeddings replacing structure, but a graph of typed objects and indexed relations living inside a disciplined RDB.