Memory Graph 4NF Plan

Subtitle: Relational semantic graph for durable memory, execution, and CRM state

Executive Summary

ValkyrAI already contains the raw ingredients for a semantic graph implemented on top of a normalized relational database:

  • MemoryEntry
  • Tag
  • Principal
  • Organization
  • Customer
  • Opportunity
  • SalesActivity
  • ContentData
  • Goal
  • StrategicPriority
  • Task
  • Workflow
  • Run
  • Agent
  • GrayMatter

What is still missing is a formal graph strategy that treats these objects as first-class nodes in a connected memory and execution web.

This plan defines the 4NF direction:

  • keep reusable facts in their own tables
  • eliminate repeating semantic groups from blob fields
  • use explicit relations for important links
  • use tags and graph edges as normalized join structures
  • preserve ownership, provenance, and indexability

The result is an explainable, queryable, ACL-friendly semantic web imposed on an RDB.


Problem Statement

Today, memory can be written into MemoryEntry, but many relationships still live implicitly in text, source URLs, or ad hoc workflow knowledge.

That creates several problems:

  • memory is durable but not fully relational
  • semantic lookup depends too much on text scanning
  • ownership and provenance exist, but entity-to-entity meaning is under-modeled
  • repeated dimensions such as tags, related entities, and cross-domain links are not consistently normalized
  • graph traversal is possible in principle, but not explicit enough in schema design

To make GrayMatter and api-0 truly useful as durable machine memory, the system needs a stronger relational graph discipline.


Design Goal

Model ValkyrAI memory as a 4NF semantic graph where:

  • core entities remain independently queryable
  • multi-valued relationships become explicit join structures
  • important semantic edges are first-class and indexed
  • ownership and ACL remain compatible with every node and edge
  • the system supports both deterministic business workflows and semantic retrieval

Existing Graph-Capable Nodes

Identity spine

  • Principal
  • Organization
  • Address
  • Role
  • Authority

Memory spine

  • MemoryEntry
  • Tag
  • Note
  • ContentData
  • GrayMatter

Commercial spine

  • Customer
  • Opportunity
  • SalesActivity
  • SalesPipeline

Execution spine

  • Goal
  • StrategicPriority
  • Task
  • Workflow
  • Run
  • Agent

These are already enough to define a meaningful semantic graph. The remaining work is formalization and normalization.


4NF Rules For The Memory Graph

Rule 1: Keep stable entities first-class

If a concept has an independent lifecycle, its own permissions, or standalone query value, it must be its own model.

Examples:

  • Principal
  • Organization
  • Customer
  • ContentData
  • MemoryEntry
  • Goal
  • Workflow

Rule 2: Multi-valued dimensions must not live as repeated text blobs

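If an entity can have many tags, many related nodes, many references, or many semantic labels, model each of those as its own join structure. This is the core of 4NF: independent multi-valued facts about the same entity belong in separate tables, so adding a tag never forces extra rows for every linked goal.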

Examples:

  • tags
  • related memories
  • linked customers
  • linked goals
  • linked workflows
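
As a rough illustration of Rule 2, the sketch below uses Python's standard-library sqlite3 module to store tags and linked goals in two separate join tables, then compares the row count with what a single combined (entry, tag, goal) table would need. The table and column names are assumptions made for this example, not the actual ValkyrAI schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE memory_entry (id TEXT PRIMARY KEY, text TEXT NOT NULL);
CREATE TABLE tag (id TEXT PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- One join table per independent multi-valued fact (hypothetical names).
CREATE TABLE memory_entry_tag (
    memory_entry_id TEXT NOT NULL REFERENCES memory_entry(id),
    tag_id TEXT NOT NULL REFERENCES tag(id),
    PRIMARY KEY (memory_entry_id, tag_id)
);
-- The goal table itself is omitted for brevity.
CREATE TABLE memory_entry_goal (
    memory_entry_id TEXT NOT NULL REFERENCES memory_entry(id),
    goal_id TEXT NOT NULL,
    PRIMARY KEY (memory_entry_id, goal_id)
);
""")

conn.execute("INSERT INTO memory_entry VALUES ('m1', 'Discovery call notes')")
conn.executemany("INSERT INTO tag VALUES (?, ?)",
                 [("t1", "domain:crm"), ("t2", "signal:high")])
conn.executemany("INSERT INTO memory_entry_tag VALUES ('m1', ?)",
                 [("t1",), ("t2",)])
conn.executemany("INSERT INTO memory_entry_goal VALUES ('m1', ?)",
                 [("g1",), ("g2",), ("g3",)])

tag_rows = conn.execute("SELECT COUNT(*) FROM memory_entry_tag").fetchone()[0]
goal_rows = conn.execute("SELECT COUNT(*) FROM memory_entry_goal").fetchone()[0]
# A single combined table would have to hold every tag x goal pairing.
combined_rows = tag_rows * goal_rows
print(tag_rows + goal_rows, "normalized rows vs", combined_rows, "combined rows")
```

The gap widens with every additional multi-valued dimension, which is why each one gets its own join structure.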

Rule 3: Provenance must remain normalized

Source metadata is itself a structured dimension and should remain queryable:

  • source channel
  • source message id
  • source URL
  • creator / owner
  • creation timestamp

Rule 4: Execution and memory must be linked

Important memory should be able to point to execution objects and vice versa.

Examples:

  • a memory entry about a strategy should link to a Goal
  • an implementation note should link to a Task or Workflow
  • a CRM discovery memory should link to Customer and Opportunity

Rule 5: Public projection must be separate from private memory

Internal semantic nodes should remain private by default. Published content should be projected deliberately through explicit public surfaces.


Proposed Schema Enhancements

MemoryEntry should remain the durable freeform memory nucleus, but it should gain structured link fields for the most important entity families.

Proposed additions

MemoryEntry:
  properties:
    principal:
      $ref: '#/components/schemas/Principal'
    organization:
      $ref: '#/components/schemas/Organization'
    customer:
      $ref: '#/components/schemas/Customer'
    opportunity:
      $ref: '#/components/schemas/Opportunity'
    contentData:
      $ref: '#/components/schemas/ContentData'
    goal:
      $ref: '#/components/schemas/Goal'
    task:
      $ref: '#/components/schemas/Task'
    workflow:
      $ref: '#/components/schemas/Workflow'
    agent:
      $ref: '#/components/schemas/Agent'
    run:
      $ref: '#/components/schemas/Run'

Why

These fields capture the single primary-context reference a memory entry has for each entity family (many memory entries may point at the same Customer or Goal), without forcing every lookup through generic edge traversal.
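
One way to picture these link fields on the application side is a record with one optional reference per entity family. The following is a minimal sketch, assuming string ids and snake_case field names; neither is confirmed by the schema fragment above, and the helper `primary_context` is hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class MemoryEntry:
    id: str
    text: str
    # At most one primary-context reference per entity family (all optional).
    principal_id: Optional[str] = None
    organization_id: Optional[str] = None
    customer_id: Optional[str] = None
    opportunity_id: Optional[str] = None
    content_data_id: Optional[str] = None
    goal_id: Optional[str] = None
    task_id: Optional[str] = None
    workflow_id: Optional[str] = None
    agent_id: Optional[str] = None
    run_id: Optional[str] = None


def primary_context(entry: MemoryEntry) -> Dict[str, str]:
    """Return only the link fields that are actually set."""
    return {
        f.name: getattr(entry, f.name)
        for f in fields(entry)
        if f.name.endswith("_id") and getattr(entry, f.name) is not None
    }


entry = MemoryEntry(id="m1", text="Discovery call notes",
                    customer_id="c1", opportunity_id="o1")
context = primary_context(entry)
print(context)
```

Anything an entry needs more than one of (several goals, several related memories) does not fit this shape and belongs in a join or edge table instead.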


For many-to-many and cross-domain graphing, use a dedicated edge table.

Purpose

Allow arbitrary semantic edges between node types without denormalizing the nodes themselves.

Proposed shape

GraphLink:
  type: object
  properties:
    id:
      type: string
      format: uuid
    ownerId:
      type: string
      format: uuid
    fromType:
      type: string
    fromId:
      type: string
      format: uuid
    toType:
      type: string
    toId:
      type: string
      format: uuid
    relationType:
      type: string
      description: semantic relation, e.g. relates_to, derived_from, supports, blocks, references
    weight:
      type: number
      format: double
    sourceChannel:
      type: string
    sourceMessageId:
      type: string
    sourceUrl:
      type: string
    createdDate:
      type: string
      format: date-time
    lastModifiedDate:
      type: string
      format: date-time

Example edges

  • MemoryEntry -> Goal via supports
  • Task -> StrategicPriority via implements
  • Opportunity -> ContentData via uses_asset
  • MemoryEntry -> MemoryEntry via derived_from
  • Customer -> Organization via belongs_to

Why

This gives the system a true graph layer without sacrificing relational integrity.
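
A minimal sketch of how such an edge table might behave, again using sqlite3. The snake_case column names mirror the proposed GraphLink properties; the uniqueness constraint, the short string ids (the proposal specifies UUIDs), and the helper functions `add_link` and `outgoing` are assumptions added for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE graph_link (
    id TEXT PRIMARY KEY,
    owner_id TEXT NOT NULL,
    from_type TEXT NOT NULL,
    from_id TEXT NOT NULL,
    to_type TEXT NOT NULL,
    to_id TEXT NOT NULL,
    relation_type TEXT NOT NULL,
    weight REAL,
    -- Assumption: the same typed edge is stored only once.
    UNIQUE (from_type, from_id, to_type, to_id, relation_type)
)""")


def add_link(owner_id, from_type, from_id, to_type, to_id, relation_type, weight=1.0):
    """Insert one typed edge and return its generated id."""
    link_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO graph_link VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (link_id, owner_id, from_type, from_id, to_type, to_id, relation_type, weight),
    )
    return link_id


def outgoing(owner_id, from_type, from_id):
    """List (relation_type, to_type, to_id) for edges the owner can see."""
    return conn.execute(
        "SELECT relation_type, to_type, to_id FROM graph_link "
        "WHERE owner_id = ? AND from_type = ? AND from_id = ? "
        "ORDER BY relation_type",
        (owner_id, from_type, from_id),
    ).fetchall()


add_link("owner-1", "MemoryEntry", "m1", "Goal", "g1", "supports")
add_link("owner-1", "MemoryEntry", "m1", "MemoryEntry", "m0", "derived_from")

links = outgoing("owner-1", "MemoryEntry", "m1")
other_owner_links = outgoing("owner-2", "MemoryEntry", "m1")
print(links)
```

Filtering on owner_id in every traversal query keeps edges subject to the same ownership rules as the nodes they connect.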


Normalize tags as reusable graph labels

Tags already exist and should become a first-class semantic indexing system.

Rules

  • Never rely on raw string arrays when reusable Tag objects are available
  • Standardize high-value tag families:
    • memory type
    • domain
    • priority
    • workflow area
    • sensitivity
    • publication state
    • commercial stage

Example tag families

  • domain:crm
  • domain:memory
  • domain:workflow
  • state:draft
  • state:published
  • sensitivity:private
  • signal:high
  • graph:strategic
  • graph:execution
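
The namespace:name convention in the examples above could be enforced at write time. The sketch below shows one possible normalization step; the function names and the exact rules (lowercasing, trimming, rejecting tags without a namespace) are assumptions, not documented behavior.

```python
from typing import Iterable, List, NamedTuple


class TagKey(NamedTuple):
    namespace: str
    name: str


def parse_tag(raw: str) -> TagKey:
    """Split 'namespace:name' into its parts, lowercased and trimmed."""
    namespace, sep, name = raw.strip().lower().partition(":")
    if not sep or not namespace or not name:
        raise ValueError(f"expected 'namespace:name', got {raw!r}")
    return TagKey(namespace, name)


def normalize_tags(raw_tags: Iterable[str]) -> List[TagKey]:
    """Deduplicate raw tag strings into a sorted list of reusable tag keys."""
    return sorted({parse_tag(raw) for raw in raw_tags})


tags = normalize_tags(["Domain:CRM", "domain:crm ", "state:draft"])
print(tags)
```

Each resulting key would map to a single Tag row, so "Domain:CRM" and "domain:crm" resolve to the same reusable label rather than two near-duplicates.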

Execution objects already form a partial graph, but should be treated as a semantic layer.

Important existing edges

  • StrategicPriority -> Goal[]
  • Workflow -> Task[]
  • Task -> Workflow
  • Run -> Task
  • Agent -> Workflow[]

Desired additions

  • Task -> Goal
  • Task -> StrategicPriority
  • Workflow -> Goal
  • Run -> Workflow
  • Agent -> StrategicPriority

This supports better reasoning over intent, action, and outcome.
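
To make the "intent, action, outcome" reasoning concrete, here is a small breadth-first traversal over a hand-written adjacency map that mixes existing edges (Run -> Task, Task -> Workflow) with desired ones (Task -> Goal, Task -> StrategicPriority, Workflow -> Goal). The ids and the `reachable` helper are illustrative assumptions.

```python
from collections import deque
from typing import Dict, List, Tuple

Node = Tuple[str, str]  # (node type, node id)

# Hypothetical sample edges; in practice these would come from the database.
EDGES: Dict[Node, List[Node]] = {
    ("Run", "run-1"): [("Task", "task-1")],
    ("Task", "task-1"): [
        ("Workflow", "wf-1"),
        ("Goal", "goal-1"),
        ("StrategicPriority", "sp-1"),
    ],
    ("Workflow", "wf-1"): [("Goal", "goal-1")],
}


def reachable(start: Node, target_type: str) -> List[Node]:
    """Breadth-first search returning every reachable node of target_type."""
    seen = {start}
    queue = deque([start])
    found: List[Node] = []
    while queue:
        node = queue.popleft()
        for nxt in EDGES.get(node, []):
            if nxt in seen:
                continue  # already visited via another path
            seen.add(nxt)
            if nxt[0] == target_type:
                found.append(nxt)
            queue.append(nxt)
    return found


goals = reachable(("Run", "run-1"), "Goal")
priorities = reachable(("Run", "run-1"), "StrategicPriority")
print(goals, priorities)
```

Without the desired Task -> Goal and Task -> StrategicPriority edges, answering "which goal did this run serve?" would need an indirect path or text inspection.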


CRM state should not float separately from memory.

  • MemoryEntry -> Customer
  • MemoryEntry -> Opportunity
  • MemoryEntry -> Organization
  • SalesActivity -> ContentData when an asset was used
  • Opportunity -> ContentData for linked collateral

This gives sales memory a proper graph substrate rather than a pile of notes.


Indexing Strategy

The graph only becomes useful if traversal is cheap.

Required indexes

Ownership / security

  • owner_id

Identity and business traversal

  • principal_id
  • organization_id
  • customer_id
  • opportunity_id
  • goal_id
  • task_id
  • workflow_id
  • agent_id
  • run_id

Provenance

  • source_message_id
  • source_url
  • source_channel

Time

  • created_date
  • last_modified_date

Public/content access

  • slug
  • title
  • status

Graph traversal

For GraphLink:

  • (from_type, from_id)
  • (to_type, to_id)
  • relation_type
  • (owner_id, relation_type)

Tagging

  • tag join tables indexed on both sides (entity id and tag id)
  • normalized tag name / slug / namespace
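
The GraphLink indexes listed above can be checked against the query planner. This sketch creates them on a hypothetical `graph_link` table in SQLite and asks EXPLAIN QUERY PLAN which index serves a forward and a reverse neighbor lookup. Index names are invented for the example, and other database engines report plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE graph_link (
    id TEXT PRIMARY KEY,
    owner_id TEXT NOT NULL,
    from_type TEXT NOT NULL,
    from_id TEXT NOT NULL,
    to_type TEXT NOT NULL,
    to_id TEXT NOT NULL,
    relation_type TEXT NOT NULL
);
CREATE INDEX idx_graph_link_from ON graph_link (from_type, from_id);
CREATE INDEX idx_graph_link_to ON graph_link (to_type, to_id);
CREATE INDEX idx_graph_link_relation ON graph_link (relation_type);
CREATE INDEX idx_graph_link_owner_relation ON graph_link (owner_id, relation_type);
""")


def plan(sql, params=()):
    """Return the planner's detail column(s) for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


forward_plan = plan(
    "SELECT to_type, to_id FROM graph_link WHERE from_type = ? AND from_id = ?",
    ("MemoryEntry", "m1"),
)
reverse_plan = plan(
    "SELECT from_type, from_id FROM graph_link WHERE to_type = ? AND to_id = ?",
    ("Goal", "g1"),
)
print(forward_plan)
print(reverse_plan)
```

Having both the (from_type, from_id) and (to_type, to_id) indexes means an edge can be followed in either direction without a full table scan.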

Memory Graph Example

A single strategic memory should be able to connect across multiple domains.

Example

  • MemoryEntry: “6D-A is the organizational operating method”
  • linked to:
    • StrategicPriority: Governance
    • Goal: Compounding autonomous execution
    • ContentData: 6D-A Governance Engine doc
    • Agent: Valor
    • Tag: domain:memory, graph:strategic, sensitivity:private
    • GraphLink: supports, codifies, relates_to

This is not just storage. It becomes structure a machine can reason over.


Public vs Private Graph Surfaces

Private-by-default remains mandatory.

Private by default

  • MemoryEntry
  • GraphLink
  • CRM entities unless explicitly exposed
  • internal execution state

Public by projection

  • selected ContentData
  • published marketing assets
  • public MCP/catalog objects
  • explicitly public APIs

The semantic graph should be rich internally and selectively projected externally.
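
One way to read "public by projection" is that public consumers never see the internal record, only a narrower type built from it on purpose. The sketch below assumes a ContentData-like record with slug, title, and status fields (taken from the index list earlier) plus two invented internal fields; the "published" status value and all type names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContentRecord:
    """Internal view: includes fields that must never leave the private graph."""
    slug: str
    title: str
    status: str
    owner_id: str
    internal_notes: str


@dataclass(frozen=True)
class PublicContent:
    """Public view: only the fields deliberately exposed."""
    slug: str
    title: str


def project_public(record: ContentRecord) -> Optional[PublicContent]:
    """Return a public projection, or None when the record is not published."""
    if record.status != "published":
        return None
    return PublicContent(slug=record.slug, title=record.title)


draft = ContentRecord("plan", "Memory Graph Plan", "draft", "owner-1", "wip")
live = ContentRecord("intro", "Intro", "published", "owner-1", "reviewed")

draft_view = project_public(draft)
live_view = project_public(live)
print(draft_view, live_view)
```

Because the public type has no owner or notes fields at all, a private value cannot leak through a forgotten filter.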


Phase Rollout

Phase 1

  • Treat MemoryEntry + Tag + source metadata as canonical durable memory
  • Standardize normalized tag usage
  • Enforce post-write GET verification in the correct auth scope
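
The post-write verification step in Phase 1 might look like the following. The in-memory `ScopedMemoryStore` is a hypothetical stand-in for the real memory API, used only to show the pattern: write, then read back under the same owner scope, and fail loudly if the entry is not visible there.

```python
from typing import Dict, Optional, Tuple


class ScopedMemoryStore:
    """Hypothetical stand-in for the memory API; rows are keyed by owner scope."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], str] = {}

    def put(self, owner_id: str, entry_id: str, text: str) -> None:
        self._rows[(owner_id, entry_id)] = text

    def get(self, owner_id: str, entry_id: str) -> Optional[str]:
        return self._rows.get((owner_id, entry_id))


def write_verified(store: ScopedMemoryStore, owner_id: str,
                   entry_id: str, text: str) -> str:
    """Write an entry, then confirm it reads back in the same auth scope."""
    store.put(owner_id, entry_id, text)
    if store.get(owner_id, entry_id) != text:
        raise RuntimeError(f"entry {entry_id!r} not readable as {owner_id!r}")
    return entry_id


store = ScopedMemoryStore()
written_id = write_verified(store, "owner-1", "m1", "6D-A is the operating method")
# Reading under a different scope finds nothing, which is the failure the
# verification step is meant to surface early.
wrong_scope_read = store.get("owner-2", "m1")
print(written_id, wrong_scope_read)
```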

Phase 2

  • Extend MemoryEntry with first-class links to key node families
  • Add missing indexes for traversal and provenance

Phase 3

  • Introduce GraphLink
  • Use it for semantic cross-domain edges
  • Build graph traversal services and UI inspectors

Phase 4

  • Integrate GrayMatter explicitly as graph-native memory orchestration
  • Add reasoning and recommendation flows over relational graph neighborhoods

Canonical Direction

The ValkyrAI memory system should become:

a private-by-default, ACL-aware, highly normalized relational semantic graph that unifies identity, memory, execution, and commercial state.

This is the practical semantic web: not vague embeddings replacing structure, but a graph of typed objects and indexed relations living inside a disciplined RDB.