Memory Graph 4NF Plan

Subtitle: Relational semantic graph for durable memory, execution, and CRM state

Executive Summary​

ValkyrAI already contains the raw ingredients for a semantic graph implemented on top of a normalized relational database:

  • MemoryEntry
  • Tag
  • Principal
  • Organization
  • Customer
  • Opportunity
  • SalesActivity
  • ContentData
  • Goal
  • StrategicPriority
  • Task
  • Workflow
  • Run
  • Agent
  • GrayMatter

What is still missing is a formal graph strategy that treats these objects as first-class nodes in a connected memory and execution web.

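This plan defines the direction, guided by fourth normal form (4NF), in which independent multi-valued facts about an entity are stored in separate tables: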

  • keep reusable facts in their own tables
  • eliminate repeating semantic groups from blob fields
  • use explicit relations for important links
  • use tags and graph edges as normalized join structures
  • preserve ownership, provenance, and indexability

The result is an explainable, queryable, ACL-friendly semantic web imposed on an RDB.


Problem Statement​

Today, memory can be written into MemoryEntry, but many relationships still live implicitly in text, source URLs, or ad hoc workflow knowledge.

That creates several problems:

  • memory is durable but not fully relational
  • semantic lookup depends too much on text scanning
  • ownership and provenance exist, but entity-to-entity meaning is under-modeled
  • repeated dimensions such as tags, related entities, and cross-domain links are not consistently normalized
  • graph traversal is possible in principle, but not explicit enough in schema design

To make GrayMatter and api-0 truly useful as durable machine memory, the system needs a stronger relational graph discipline.


Design Goal​

Model ValkyrAI memory as a 4NF semantic graph where:

  • core entities remain independently queryable
  • multi-valued relationships become explicit join structures
  • important semantic edges are first-class and indexed
  • ownership and ACL remain compatible with every node and edge
  • the system supports both deterministic business workflows and semantic retrieval

Existing Graph-Capable Nodes​

Identity spine​

  • Principal
  • Organization
  • Address
  • Role
  • Authority

Memory spine​

  • MemoryEntry
  • Tag
  • Note
  • ContentData
  • GrayMatter

Commercial spine​

  • Customer
  • Opportunity
  • SalesActivity
  • SalesPipeline

Execution spine​

  • Goal
  • StrategicPriority
  • Task
  • Workflow
  • Run
  • Agent

These are already enough to define a meaningful semantic graph. The remaining work is formalization and normalization.


4NF Rules For The Memory Graph​

Rule 1: Keep stable entities first-class​

If a concept has an independent lifecycle, its own permissions, or standalone query value, it must be its own model.

Examples:

  • Principal
  • Organization
  • Customer
  • ContentData
  • MemoryEntry
  • Goal
  • Workflow

Rule 2: Multi-valued dimensions must not live as repeated text blobs​

If an entity can have many tags, many related nodes, many references, or many semantic labels, model those as join structures.

Examples:

  • tags
  • related memories
  • linked customers
  • linked goals
  • linked workflows
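Rule 2 can be sketched with Python's built-in sqlite3 module. The table and column names here (memory_entry, tag, memory_entry_tag) are illustrative assumptions, not the actual ValkyrAI schema. The point is that tags live in a join table, so lookup by tag is a join rather than a text scan.

```python
import sqlite3

# Hypothetical table names for illustration; not the real ValkyrAI schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memory_entry (id INTEGER PRIMARY KEY, text TEXT NOT NULL);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- One row per (entry, tag) pair: the multi-valued fact lives in its own table.
CREATE TABLE memory_entry_tag (
    memory_entry_id INTEGER NOT NULL REFERENCES memory_entry(id),
    tag_id INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (memory_entry_id, tag_id)
);
""")

def tag_entry(entry_id: int, tag_name: str) -> None:
    """Attach a reusable tag row to an entry, creating the tag if needed."""
    conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (tag_name,))
    (tag_id,) = conn.execute(
        "SELECT id FROM tag WHERE name = ?", (tag_name,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO memory_entry_tag VALUES (?, ?)", (entry_id, tag_id)
    )

def entries_with_tag(tag_name: str) -> list[int]:
    """Return ids of entries carrying the tag, resolved through the join table."""
    rows = conn.execute(
        """SELECT met.memory_entry_id FROM memory_entry_tag met
           JOIN tag t ON t.id = met.tag_id
           WHERE t.name = ? ORDER BY met.memory_entry_id""",
        (tag_name,),
    )
    return [r[0] for r in rows]

conn.execute("INSERT INTO memory_entry (id, text) VALUES (1, 'CRM discovery call notes')")
conn.execute("INSERT INTO memory_entry (id, text) VALUES (2, 'Workflow retry policy')")
tag_entry(1, "domain:crm")
tag_entry(2, "domain:workflow")
tag_entry(2, "domain:crm")
```

Because each tag is stored once and referenced by id, renaming a tag or listing every entry in a tag family touches the tag table only, not the memory text.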

Rule 3: Provenance must remain normalized​

Source metadata is itself a structured dimension and should remain queryable:

  • source channel
  • source message id
  • source URL
  • creator / owner
  • creation timestamp

Rule 4: Execution and memory must be linked​

Important memory should be able to point to execution objects and vice versa.

Examples:

  • a memory entry about a strategy should link to a Goal
  • an implementation note should link to a Task or Workflow
  • a CRM discovery memory should link to Customer and Opportunity

Rule 5: Public projection must be separate from private memory​

Internal semantic nodes should remain private by default. Published content should be projected deliberately through explicit public surfaces.


Proposed Schema Enhancements​

1) Extend MemoryEntry with structured links

MemoryEntry should remain the durable freeform memory nucleus, but it should gain structured link fields for the most important entity families.

Proposed additions​

MemoryEntry:
  properties:
    principal:
      $ref: '#/components/schemas/Principal'
    organization:
      $ref: '#/components/schemas/Organization'
    customer:
      $ref: '#/components/schemas/Customer'
    opportunity:
      $ref: '#/components/schemas/Opportunity'
    contentData:
      $ref: '#/components/schemas/ContentData'
    goal:
      $ref: '#/components/schemas/Goal'
    task:
      $ref: '#/components/schemas/Task'
    workflow:
      $ref: '#/components/schemas/Workflow'
    agent:
      $ref: '#/components/schemas/Agent'
    run:
      $ref: '#/components/schemas/Run'

Why​

These fields cover the most important single-valued links, where a memory entry has at most one primary principal, customer, goal, and so on, without forcing every lookup through generic edge traversal.
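A minimal sketch of what such direct link fields buy, using sqlite3. The relational mapping (nullable customer_id and goal_id columns) and the sample rows are assumptions for illustration; only two of the proposed links are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE goal (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Nullable foreign keys: an entry may have a primary customer,
-- a primary goal, both, or neither.
CREATE TABLE memory_entry (
    id INTEGER PRIMARY KEY,
    text TEXT NOT NULL,
    customer_id INTEGER REFERENCES customer(id),
    goal_id INTEGER REFERENCES goal(id)
);
CREATE INDEX idx_memory_entry_customer ON memory_entry (customer_id);
-- Hypothetical sample data.
INSERT INTO customer VALUES (1, 'Acme');
INSERT INTO goal VALUES (1, 'Expand pilot');
INSERT INTO memory_entry VALUES (1, 'Discovery call: budget approved', 1, 1);
INSERT INTO memory_entry VALUES (2, 'General note', NULL, NULL);
""")

def memories_for_customer(customer_id: int) -> list[tuple]:
    """Direct lookup on the link column; no edge traversal needed."""
    rows = conn.execute(
        "SELECT id, text FROM memory_entry WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return rows.fetchall()
```

The common question "what do we remember about this customer" becomes a single indexed equality lookup.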


2) Add a GraphLink edge table

For many-to-many and cross-domain graphing, use a dedicated edge table, GraphLink.

Purpose​

Allow arbitrary semantic edges between node types without denormalizing the nodes themselves.

Proposed shape​

GraphLink:
  type: object
  properties:
    id:
      type: string
      format: uuid
    ownerId:
      type: string
      format: uuid
    fromType:
      type: string
    fromId:
      type: string
      format: uuid
    toType:
      type: string
    toId:
      type: string
      format: uuid
    relationType:
      type: string
      description: semantic relation, e.g. relates_to, derived_from, supports, blocks, references
    weight:
      type: number
      format: double
    sourceChannel:
      type: string
    sourceMessageId:
      type: string
    sourceUrl:
      type: string
    createdDate:
      type: string
      format: date-time
    lastModifiedDate:
      type: string
      format: date-time

Example edges​

  • MemoryEntry -> Goal via supports
  • Task -> StrategicPriority via implements
  • Opportunity -> ContentData via uses_asset
  • MemoryEntry -> MemoryEntry via derived_from
  • Customer -> Organization via belongs_to

Why​

This gives the system a true graph layer without sacrificing relational integrity.
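The edge table can be sketched in sqlite3 as follows. Column names mirror the proposed GraphLink shape, but the SQL mapping, the uniqueness constraint on duplicate edges, and the helper names link and outgoing are assumptions; provenance and timestamp columns are omitted for brevity.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# The UNIQUE constraint is an assumption: it stops the same typed edge
# from being stored twice between the same pair of nodes.
conn.execute("""
CREATE TABLE graph_link (
    id TEXT PRIMARY KEY,
    owner_id TEXT NOT NULL,
    from_type TEXT NOT NULL,
    from_id TEXT NOT NULL,
    to_type TEXT NOT NULL,
    to_id TEXT NOT NULL,
    relation_type TEXT NOT NULL,
    weight REAL,
    UNIQUE (from_type, from_id, to_type, to_id, relation_type)
)""")

def link(owner_id, from_type, from_id, to_type, to_id, relation_type, weight=1.0):
    """Insert one typed edge and return its id."""
    link_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO graph_link VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (link_id, owner_id, from_type, from_id, to_type, to_id, relation_type, weight),
    )
    return link_id

def outgoing(owner_id, from_type, from_id):
    """Edges leaving a node, filtered by owner so ACL scoping stays in the query."""
    rows = conn.execute(
        """SELECT relation_type, to_type, to_id FROM graph_link
           WHERE owner_id = ? AND from_type = ? AND from_id = ?
           ORDER BY relation_type""",
        (owner_id, from_type, from_id),
    )
    return rows.fetchall()

# Hypothetical ids standing in for real rows.
owner, memory, goal, doc = (str(uuid.uuid4()) for _ in range(4))
link(owner, "MemoryEntry", memory, "Goal", goal, "supports")
link(owner, "MemoryEntry", memory, "ContentData", doc, "references")
```

Since fromType and toType are plain strings, the database cannot enforce foreign keys on the endpoints; in this sketch, checking that both nodes exist would be the job of the service layer.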


3) Normalize tags as reusable graph labels​

Tags already exist and should become a first-class semantic indexing system.

Rules​

  • Never rely on raw string arrays when reusable Tag objects are available
  • Standardize high-value tag families:
    • memory type
    • domain
    • priority
    • workflow area
    • sensitivity
    • publication state
    • commercial stage

Example tag families​

  • domain:crm
  • domain:memory
  • domain:workflow
  • state:draft
  • state:published
  • sensitivity:private
  • signal:high
  • graph:strategic
  • graph:execution
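The namespace:name convention in these examples lends itself to a small normalizer, sketched below. The TagKey type and normalize_tag function are hypothetical names; lowercasing and whitespace trimming are assumed normalization rules, not documented behavior.

```python
from typing import NamedTuple

class TagKey(NamedTuple):
    namespace: str
    name: str

def normalize_tag(raw: str) -> TagKey:
    """Split 'namespace:name' into a lowercase, whitespace-trimmed key."""
    namespace, sep, name = raw.strip().lower().partition(":")
    namespace, name = namespace.strip(), name.strip()
    if not (sep and namespace and name):
        raise ValueError(f"expected 'namespace:name', got {raw!r}")
    return TagKey(namespace, name)
```

Storing namespace and name as separate columns would let a query select a whole family, for example every state tag, without string matching.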

4) Treat execution objects as a semantic layer

Execution objects already form a partial graph, but should be treated as a semantic layer.

Important existing edges​

  • StrategicPriority -> Goal[]
  • Workflow -> Task[]
  • Task -> Workflow
  • Run -> Task
  • Agent -> Workflow[]

Desired additions​

  • Task -> Goal
  • Task -> StrategicPriority
  • Workflow -> Goal
  • Run -> Workflow
  • Agent -> StrategicPriority

This supports better reasoning over intent, action, and outcome.
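To illustrate what the added edges enable, here is a small breadth-first traversal that answers "which goal does this run serve?". The node ids and the in-memory adjacency map are hypothetical; a real service would read the edges from the database.

```python
from collections import deque

# Hypothetical node ids. Edges combine existing links (Run -> Task,
# Task -> Workflow) with desired ones (Run -> Workflow, Task -> Goal,
# Workflow -> Goal).
EDGES = {
    ("Run", "r1"): [("Task", "t1"), ("Workflow", "w1")],
    ("Task", "t1"): [("Workflow", "w1"), ("Goal", "g1")],
    ("Workflow", "w1"): [("Goal", "g1")],
    ("Goal", "g1"): [],
}

def reachable(start, target_type):
    """Breadth-first search: ids of target_type nodes reachable from start."""
    seen, queue, found = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        if node[0] == target_type and node != start:
            found.append(node[1])
        for nxt in EDGES.get(node, []):
            if nxt not in seen:  # visit each node once, even via multiple paths
                seen.add(nxt)
                queue.append(nxt)
    return found
```

Without the desired Task -> Goal and Workflow -> Goal edges, the same traversal from a Run would stop at the Workflow and never reach intent.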


5) Link CRM state to memory

CRM state should not float separately from memory.

  • MemoryEntry -> Customer
  • MemoryEntry -> Opportunity
  • MemoryEntry -> Organization
  • SalesActivity -> ContentData when an asset was used
  • Opportunity -> ContentData for linked collateral

This gives sales memory a proper graph substrate rather than a pile of notes.


Indexing Strategy​

The graph only becomes useful if traversal is cheap.

Required indexes​

Ownership / security​

  • owner_id

Identity and business traversal​

  • principal_id
  • organization_id
  • customer_id
  • opportunity_id
  • goal_id
  • task_id
  • workflow_id
  • agent_id
  • run_id

Provenance​

  • source_message_id
  • source_url
  • source_channel

Time​

  • created_date
  • last_modified_date

Public/content access​

  • slug
  • title
  • status

Graph traversal​

For GraphLink:

  • (from_type, from_id)
  • (to_type, to_id)
  • relation_type
  • (owner_id, relation_type)
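The GraphLink indexes above can be checked against a query planner. This sketch uses sqlite3 and its EXPLAIN QUERY PLAN statement; the index names are made up, and another database engine would report its plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE graph_link (
    id TEXT PRIMARY KEY, owner_id TEXT, from_type TEXT, from_id TEXT,
    to_type TEXT, to_id TEXT, relation_type TEXT
);
-- Hypothetical index names, one per index listed for GraphLink.
CREATE INDEX idx_graph_link_from ON graph_link (from_type, from_id);
CREATE INDEX idx_graph_link_to ON graph_link (to_type, to_id);
CREATE INDEX idx_graph_link_relation ON graph_link (relation_type);
CREATE INDEX idx_graph_link_owner_relation ON graph_link (owner_id, relation_type);
""")

def plan(sql, params=()):
    """Return the detail column of SQLite's query plan for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

# Forward traversal: edges leaving a node.
forward = plan(
    "SELECT to_type, to_id FROM graph_link WHERE from_type = ? AND from_id = ?",
    ("MemoryEntry", "m1"),
)
# Backward traversal: edges arriving at a node.
backward = plan(
    "SELECT from_type, from_id FROM graph_link WHERE to_type = ? AND to_id = ?",
    ("Goal", "g1"),
)
```

Each composite index serves one traversal direction, which is why both (from_type, from_id) and (to_type, to_id) are listed.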

Tagging​

  • tag join tables, indexed on both foreign keys (entity id and tag id)
  • normalized tag name / slug / namespace

Memory Graph Example​

A single strategic memory should be able to connect across multiple domains.

Example​

  • MemoryEntry: “6D-A is the organizational operating method”
  • linked to:
    • StrategicPriority: Governance
    • Goal: Compounding autonomous execution
    • ContentData: 6D-A Governance Engine doc
    • Agent: Valor
    • Tag: domain:memory, graph:strategic, sensitivity:private
    • GraphLink: supports, codifies, relates_to

This is not just storage. It becomes machine-reasonable structure.


Public vs Private Graph Surfaces​

Private-by-default remains mandatory.

Private by default​

  • MemoryEntry
  • GraphLink
  • CRM entities unless explicitly exposed
  • internal execution state

Public by projection​

  • selected ContentData
  • published marketing assets
  • public MCP/catalog objects
  • explicitly public APIs

The semantic graph should be rich internally and selectively projected externally.
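One way to express private-by-default is an allow-list projection, sketched below. The Node type, the PROJECTABLE_TYPES set, and the rule that publication is driven by the state:published and sensitivity:private tags are all assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    node_type: str
    title: str
    tags: frozenset = field(default_factory=frozenset)

# Assumption: only these types may ever be projected, and only when published.
PROJECTABLE_TYPES = {"ContentData"}

def is_public(node: Node) -> bool:
    """Private by default: a node is public only if every condition holds."""
    return (
        node.node_type in PROJECTABLE_TYPES
        and "state:published" in node.tags
        and "sensitivity:private" not in node.tags
    )

def project(nodes):
    """Return only the titles that are safe to expose externally."""
    return [n.title for n in nodes if is_public(n)]
```

The check is an allow-list: a new node type stays private until someone adds it to the projectable set on purpose.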


Phase Rollout​

Phase 1​

  • Treat MemoryEntry + Tag + source metadata as canonical durable memory
  • Standardize normalized tag usage
  • Enforce post-write GET verification in the correct auth scope
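Post-write GET verification can be sketched as a write followed by a read-back through the same client, and therefore the same auth scope. The client interface (post and get), the InMemoryClient stand-in, and the field names are hypothetical; this does not describe the real api-0 client.

```python
class VerificationError(RuntimeError):
    pass

def write_and_verify(client, entry: dict) -> dict:
    """Create an entry, then read it back through the same scoped client."""
    created = client.post(entry)
    fetched = client.get(created["id"])
    if fetched is None:
        raise VerificationError("entry not visible in the writing auth scope")
    mismatched = [k for k, v in entry.items() if fetched.get(k) != v]
    if mismatched:
        raise VerificationError(f"fields differ after write: {mismatched}")
    return fetched

class InMemoryClient:
    """Stand-in for an HTTP client bound to one owner; purely illustrative."""

    def __init__(self, store: dict, owner_id: str):
        self.store, self.owner_id = store, owner_id

    def post(self, entry: dict) -> dict:
        record = {**entry, "id": str(len(self.store) + 1), "ownerId": self.owner_id}
        self.store[record["id"]] = record
        return record

    def get(self, entry_id: str):
        record = self.store.get(entry_id)
        # Only the owner can see the record, mimicking ACL scoping.
        return record if record and record["ownerId"] == self.owner_id else None
```

Reading back with a differently scoped client would return nothing even when the write succeeded, which is why the verification has to run in the writer's scope.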

Phase 2​

  • Extend MemoryEntry with first-class links to key node families
  • Add missing indexes for traversal and provenance

Phase 3​

  • Introduce GraphLink
  • Use it for semantic cross-domain edges
  • Build graph traversal services and UI inspectors

Phase 4​

  • Integrate GrayMatter explicitly as graph-native memory orchestration
  • Add reasoning and recommendation flows over relational graph neighborhoods

Canonical Direction​

The ValkyrAI memory system should become:

a private-by-default, ACL-aware, highly normalized relational semantic graph that unifies identity, memory, execution, and commercial state.

This is the practical semantic web: not vague embeddings replacing structure, but a graph of typed objects and indexed relations living inside a disciplined RDB.