A memory grain arrives at your storage layer. It might contain a user's email address. It might contain a medical diagnosis. It might contain the temperature reading from a server room sensor. Each of these demands different handling: PII needs encryption at rest, PHI needs HIPAA-compliant storage, and public sensor data can go anywhere.
The question is: how do you know which is which without parsing the entire payload?
Section 13 of the OMS v1.0 specification defines a sensitivity classification system that answers this question in two layers. The first layer is a 2-bit field in the fixed header --- readable in O(1) time without any deserialization. The second layer is a structured tag vocabulary in the payload that provides fine-grained classification. Together, these layers enable fast routing decisions at the infrastructure level while preserving detailed metadata for policy engines.
Header-Level Sensitivity: Two Bits, Four Levels
Section 13.1 defines the sensitivity field as bits 6-7 of byte 1 (the flags byte) in the 9-byte fixed header:
```
Byte 1 (flags):
+---+---+---+---+---+---+---+---+
| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
+---+---+---+---+---+---+---+---+
  |   |   |   |   |   |   |   |
  |   |   |   |   |   |   |   +-- signed (COSE Sign1)
  |   |   |   |   |   |   +------ encrypted (AES-256-GCM)
  |   |   |   |   |   +---------- compressed (zstd)
  |   |   |   |   +-------------- has_content_refs
  |   |   |   +------------------ has_embedding_refs
  |   |   +---------------------- cbor_encoding
  +---+-------------------------- sensitivity (2 bits)
```
The two sensitivity bits encode four classification levels:
| Binary | Value | Level | Meaning |
|---|---|---|---|
| 00 | 0 | Public | No sensitivity constraints |
| 01 | 1 | Internal | Organization-internal data, not PII |
| 10 | 2 | PII | Contains personally identifiable information |
| 11 | 3 | PHI | Contains protected health information (HIPAA) |
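As a minimal sketch of the layout above, the flags byte can be decoded into named fields in one pass. The `Sensitivity` enum, the `Flags` dataclass, and `decode_flags` are hypothetical names chosen for illustration; only the bit positions and level values come from the diagram and table.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four levels encoded in bits 6-7 of the flags byte."""
    PUBLIC = 0b00
    INTERNAL = 0b01
    PII = 0b10
    PHI = 0b11


@dataclass(frozen=True)
class Flags:
    """Decoded view of byte 1 of the fixed header (illustrative type)."""
    signed: bool
    encrypted: bool
    compressed: bool
    has_content_refs: bool
    has_embedding_refs: bool
    cbor_encoding: bool
    sensitivity: Sensitivity


def decode_flags(flags: int) -> Flags:
    """Split the flags byte into its six single-bit flags and the 2-bit level."""
    return Flags(
        signed=bool(flags & 0x01),              # bit 0
        encrypted=bool(flags & 0x02),           # bit 1
        compressed=bool(flags & 0x04),          # bit 2
        has_content_refs=bool(flags & 0x08),    # bit 3
        has_embedding_refs=bool(flags & 0x10),  # bit 4
        cbor_encoding=bool(flags & 0x20),       # bit 5
        sensitivity=Sensitivity((flags >> 6) & 0x03),  # bits 6-7
    )


decoded = decode_flags(0b11000010)
print(decoded.sensitivity.name, decoded.encrypted)  # PHI True
```

Using an `IntEnum` keeps the levels comparable as integers, so ordering checks such as `level >= Sensitivity.PII` still work.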
This is a routing hint, not a security boundary. But it is an extremely efficient one. A storage router can read a single byte --- byte 1 of the fixed header --- extract bits 6-7, and immediately decide where to send the grain. No MessagePack deserialization. No field parsing. No string comparison. Just a bit shift and a mask:
```python
def get_sensitivity(header_bytes: bytes) -> int:
    """Return the 2-bit sensitivity level from bits 6-7 of the flags byte."""
    flags = header_bytes[1]
    return (flags >> 6) & 0x03  # Extract bits 6-7


# Route based on sensitivity. `grain_blob` and the four store objects are
# placeholders supplied by the surrounding application.
sensitivity = get_sensitivity(grain_blob)
if sensitivity == 0b11:    # PHI
    store = hipaa_compliant_store
elif sensitivity == 0b10:  # PII
    store = encrypted_store
elif sensitivity == 0b01:  # Internal
    store = internal_store
else:                      # Public
    store = default_store
```

This O(1) routing is the key benefit. In a system processing millions of grains per second, the ability to route without deserialization means compliance-aware storage can operate at wire speed.
Standard Tag Vocabulary
While header bits provide fast routing, the structural_tags field in the payload provides detailed classification. Section 13.2 defines five standard prefix categories:
pii: --- Personal Data
Tags identifying personally identifiable information:
| Tag | Description |
|---|---|
| pii:email | Email address |
| pii:phone | Phone number |
| pii:ssn | Social Security number |
| pii:name | Personal name |
These tags identify data that falls under GDPR's definition of "personal data" (any information relating to an identified or identifiable natural person) and CCPA's definition of "personal information" (information that identifies or could reasonably be linked to a consumer).
phi: --- Health Data
Tags identifying protected health information:
| Tag | Description |
|---|---|
| phi:diagnosis | Medical diagnosis |
| phi:medication | Medication records |
| phi:lab_result | Laboratory test results |
PHI tags correspond to HIPAA's regulatory category of "protected health information" under 45 CFR. Any grain tagged with a phi: prefix triggers the highest sensitivity level (11) in the header.
reg: --- Regulatory Jurisdiction
Tags identifying which regulatory storage or retention rules apply:
| Tag | Description |
|---|---|
| reg:pci-dss | PCI-compliant storage required |
| reg:sox | 7-year immutable audit retention (Sarbanes-Oxley) |
| reg:basel-iii | Regulatory capital data |
| reg:gdpr-art17 | Erasure-eligible under GDPR Article 17 |
sec: --- Security Data
Tags identifying security-sensitive credentials:
| Tag | Description |
|---|---|
| sec:credential | Authentication credential |
| sec:api_key | API key or secret |
| sec:token | Authentication or session token |
Security tags trigger the PII sensitivity level (10) in the header. While credentials are not personal data in the GDPR sense, they require the same level of encryption and access control.
legal: --- Legal Data
Tags identifying legally sensitive material:
| Tag | Description |
|---|---|
| legal:privilege | Attorney-client privileged information |
| legal:litigation_hold | Data subject to litigation hold (must not be deleted) |
Legal tags also trigger the PII sensitivity level (10). A grain tagged legal:litigation_hold demands careful handling: it must be preserved even if a deletion request arrives, because legal hold obligations may override erasure rights.
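The precedence of a litigation hold over a deletion request can be sketched as a simple guard in the erasure path. The function name `may_erase` is hypothetical, and this sketch assumes the simplest policy, in which a hold always blocks deletion; whether a hold really overrides an erasure right in a given case is a legal question the format does not answer.

```python
def may_erase(structural_tags: list[str]) -> bool:
    """Return True if an erasure request may proceed for this grain.

    Assumed policy for this sketch: a litigation hold always wins,
    even when the grain is also tagged as erasure-eligible.
    """
    if "legal:litigation_hold" in structural_tags:
        return False
    return True


print(may_erase(["pii:email", "reg:gdpr-art17"]))                           # True
print(may_erase(["pii:email", "reg:gdpr-art17", "legal:litigation_hold"]))  # False
```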
Automatic Sensitivity Setting at Write Time
The tag vocabulary is not just metadata --- it drives the header sensitivity bits. Section 13.2 states: "At write time, serializer scans tags and sets header sensitivity bits to highest classification present."
This means the serializer is responsible for consistency between tags and header bits. A grain with structural_tags: ["phi:diagnosis", "pii:name"] must have its header sensitivity bits set to 11 (PHI), because phi: is the highest classification present. The serializer does not require manual configuration of the header bits; it derives them from the tags.
Here is what this looks like in practice:
```python
def compute_sensitivity(structural_tags: list[str]) -> int:
    """Return the minimum header sensitivity that the given tags require."""
    sensitivity = 0b00  # Default: public
    for tag in structural_tags:
        if tag.startswith("phi:"):
            return 0b11  # PHI is highest; short-circuit
        elif tag.startswith(("pii:", "sec:", "legal:")):
            sensitivity = max(sensitivity, 0b10)
        elif tag.startswith("reg:"):
            sensitivity = max(sensitivity, 0b01)
    return sensitivity
```

Sensitivity Consistency Validation
Section 13.4 formalizes the relationship between tags and header bits with two rules --- one for serializers, one for parsers.
Serializer Rule
At write time, the serializer MUST scan all structural_tags values and set the header sensitivity bits to the highest classification present, using this mapping:
| Tag Prefix Present | Minimum Header Sensitivity |
|---|---|
| phi:* | 11 (PHI) |
| pii:*, sec:*, legal:* | 10 (PII) |
| reg:* | 01 (internal) minimum; policy engine determines actual tier |
| No sensitive tags | 00 or 01 at writer's discretion |
Note the asymmetry for reg: tags. A reg:pci-dss tag sets the minimum to 01 (internal), but the policy engine may determine a higher tier is needed. The other prefixes have deterministic mappings.
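One way to picture the serializer rule, including the reg: asymmetry, is a helper that takes the larger of the tag-derived minimum and a tier chosen by the policy engine, then writes it into bits 6-7 without disturbing the other flags. This is a sketch under assumptions: `required_sensitivity`, `apply_sensitivity`, and the `policy_floor` parameter are illustrative names, not part of the specification.

```python
def required_sensitivity(structural_tags: list[str]) -> int:
    """Minimum header sensitivity implied by the tag prefixes."""
    level = 0b00
    for tag in structural_tags:
        if tag.startswith("phi:"):
            level = max(level, 0b11)
        elif tag.startswith(("pii:", "sec:", "legal:")):
            level = max(level, 0b10)
        elif tag.startswith("reg:"):
            level = max(level, 0b01)
    return level


def apply_sensitivity(flags: int, structural_tags: list[str],
                      policy_floor: int = 0b00) -> int:
    """Return the flags byte with bits 6-7 set to the effective level.

    policy_floor is a hypothetical hook for the tier a policy engine picks;
    it can raise the level above the tag-derived minimum but never lower it.
    """
    if not 0b00 <= policy_floor <= 0b11:
        raise ValueError("policy_floor must fit in two bits")
    level = max(required_sensitivity(structural_tags), policy_floor)
    return (flags & 0x3F) | (level << 6)  # keep bits 0-5, replace bits 6-7


signed_and_encrypted = 0b00000011
print(format(apply_sensitivity(signed_and_encrypted, ["reg:pci-dss"]), "08b"))
# 01000011
print(format(apply_sensitivity(signed_and_encrypted, ["reg:pci-dss"], policy_floor=0b10), "08b"))
# 10000011
```

Masking with `0x3F` before OR-ing in the new level means stale sensitivity bits from a reused flags value cannot leak through.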
Parser Rule
At parse time, if structural_tags is present, the parser MUST validate that the header sensitivity bits are not lower than the highest classification the tags require. If they are lower, the parser MUST reject with ERR_SENSITIVITY_MISMATCH.
```python
def validate_sensitivity(header_sensitivity: int, structural_tags: list[str]) -> None:
    """Raise if the header claims a lower level than the tags require."""
    required = compute_sensitivity(structural_tags)
    if header_sensitivity < required:
        raise ValueError(
            f"ERR_SENSITIVITY_MISMATCH: header sensitivity {header_sensitivity} "
            f"is lower than tags require ({required}). "
            f"Possible serializer defect or header tampering."
        )
```

This validation creates a one-way ratchet. Header sensitivity can be higher than tags require (a writer may choose 01 for a grain with no sensitive tags), but it can never be lower. The highest-classified tag present sets the floor.
Header Sensitivity Limitations
Section 13.3 is explicit about what header sensitivity bits are and what they are not. They are advisory routing metadata, not a compliance guarantee. This distinction matters.
The limitation is fundamental: tag-based sensitivity assignment depends on the writer correctly identifying and tagging sensitive fields at creation time. If a grain contains a user's Social Security number but the writer fails to tag it with pii:ssn, the header bits will read 00 (public) and the grain will be routed to unencrypted storage. The header cannot catch what the writer did not declare.
The specification defines four practices that systems processing regulated content SHOULD follow:
- Treat header sensitivity bits as a fast-path routing hint, not a classification guarantee. The header enables efficient routing, but routing decisions should not be the end of the compliance story.
- Perform payload inspection for sensitive decisions. Before routing or sharing a grain, deserialize the payload and validate structural_tags. The header is the fast path; payload inspection is the verification.
- Enforce writer responsibility. Establish clear tagging protocols for regulated workflows. If an agent writes grains containing PHI, it must be configured to tag them with phi: prefixes. The specification provides the tagging mechanism; the organization provides the tagging discipline.
- Apply layered defense. Combine header-level filtering with payload inspection. Never gate compliance decisions solely on header bits. The header catches correctly tagged grains at wire speed; payload inspection catches everything else.
This layered approach mirrors how security works in other domains. A firewall rule (fast, header-based) provides the first line of defense. Deep packet inspection (slower, payload-based) provides the second. Neither alone is sufficient.
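To make the second layer concrete, here is a deliberately small sketch of payload inspection that looks for content the writer did not tag. It assumes the payload has already been decoded into a dict; the `DETECTORS` table and `missing_tags` function are hypothetical, and the two regular expressions are toy patterns, far short of a production PII scanner.

```python
import re

# Illustrative patterns only: a US SSN-like shape and a simple email shape.
DETECTORS = {
    "pii:ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "pii:email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def missing_tags(payload: dict) -> set[str]:
    """Return tags whose pattern appears in the payload but was not declared."""
    declared = set(payload.get("structural_tags", []))
    # Scan every field value except the tag list itself.
    text = " ".join(str(v) for k, v in payload.items() if k != "structural_tags")
    found = {tag for tag, pattern in DETECTORS.items() if pattern.search(text)}
    return found - declared


untagged = {"subject": "alice-42", "object": "SSN is 123-45-6789", "structural_tags": []}
print(missing_tags(untagged))  # {'pii:ssn'}
```

A non-empty result is a signal that the header bits understate the grain's contents, which is exactly the case the header alone cannot detect.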
Legal Neutrality
Section 13.5 contains an important statement: the sensitivity classifications in the specification (public, internal, PII, PHI) are technical routing and storage metadata. They are not legal definitions.
Different legal regimes define regulated data differently:
| Jurisdiction | Term | Scope |
|---|---|---|
| GDPR (EU) | "personal data" | Any information relating to an identified or identifiable natural person |
| CCPA (California) | "personal information" | Information that identifies or could reasonably be linked to a consumer |
| LGPD (Brazil) | "dados pessoais" | Similar scope to GDPR |
| HIPAA (USA) | "protected health information" | A specific regulatory category under 45 CFR |
The specification states: "Implementations MUST determine sensitivity classification according to applicable jurisdictional law and organizational policy." The .mg tags and header bits are a compliance-aware tagging mechanism to facilitate routing and policy enforcement. The legal determination of what constitutes regulated data is outside the scope of the format.
This neutrality is deliberate. A grain tagged pii:email is asserting a technical classification, not making a legal claim. Whether an email address constitutes "personal data" under a specific jurisdiction depends on context that the format cannot capture. The format provides the tagging infrastructure; legal counsel provides the classification rules.
Use Cases
Routing PHI to HIPAA-Compliant Storage
A health assistant agent creates a grain recording a patient's medication:
```json
{
  "type": "belief",
  "subject": "patient-789",
  "relation": "takes",
  "object": "metformin 500mg twice daily",
  "confidence": 0.99,
  "source_type": "user_explicit",
  "created_at": 1739980800000,
  "namespace": "health-assistant",
  "user_id": "patient-789",
  "author_did": "did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK",
  "structural_tags": ["phi:medication", "pii:name"]
}
```

The serializer scans structural_tags, finds phi:medication, and sets header sensitivity to 11 (PHI). The storage router reads byte 1 of the header, extracts bits 6-7 (11), and routes the grain to the HIPAA-compliant storage tier. The per-user encryption pattern from Section 20.3 encrypts the grain with a key derived from "patient-789". Both the routing decision and the decision to encrypt are made from the header alone, without parsing the MessagePack payload.
Filtering PII for Encryption at Rest
A customer service agent stores a user's contact preferences:
```json
{
  "type": "belief",
  "subject": "alice-42",
  "relation": "prefers",
  "object": "email for shipping notifications",
  "confidence": 0.95,
  "source_type": "user_explicit",
  "created_at": 1739980800000,
  "namespace": "customer-service",
  "user_id": "alice-42",
  "author_did": "did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK",
  "structural_tags": ["pii:name", "pii:email", "preference"]
}
```

The serializer finds pii:name and pii:email, setting header sensitivity to 10 (PII). The preference tag has no sensitive prefix and does not affect the sensitivity level. At the storage layer, the grain is routed to an encrypted tier. The presence of user_id triggers per-user encryption via HKDF-SHA256 key derivation. Even without deserializing the payload, the system knows this grain needs encryption.
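The per-user key derivation mentioned here can be sketched with the standard library alone. The `hkdf_sha256` function below follows the extract-then-expand construction of RFC 5869; everything specific to the grain format is an assumption made for illustration, including the `per_user_key` name, the empty salt, and the `"user:"` info label. Section 20.3's actual parameters are not reproduced here.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF with SHA-256 (RFC 5869): extract a PRK, then expand it."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested key material is too long")
    # Extract: an empty salt is replaced by HASH_LEN zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated until long enough.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def per_user_key(master_key: bytes, user_id: str) -> bytes:
    """Derive a 32-byte key bound to one user (hypothetical info label)."""
    return hkdf_sha256(master_key, salt=b"", info=b"user:" + user_id.encode("utf-8"))


master = bytes(range(32))  # placeholder; a real master key comes from a KMS or HSM
key_alice = per_user_key(master, "alice-42")
key_patient = per_user_key(master, "patient-789")
print(len(key_alice), key_alice != key_patient)  # 32 True
```

Binding the user identifier into the `info` input gives each user an independent key from one master secret, which is what makes destroying a single user's key meaningful.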
Tagging Financial Data for PCI Compliance
A financial agent records a transaction detail:
```json
{
  "type": "belief",
  "subject": "transaction-9182",
  "relation": "involves",
  "object": "card ending 4242",
  "confidence": 1.0,
  "source_type": "system_generated",
  "created_at": 1739980800000,
  "namespace": "payments",
  "author_did": "did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK",
  "structural_tags": ["reg:pci-dss", "sec:credential"]
}
```

Two tags are present: reg:pci-dss and sec:credential. The sec: prefix maps to PII level (10), which is higher than reg:'s minimum of 01. The serializer sets header sensitivity to 10. The reg:pci-dss tag acts as a routing directive: the policy engine sees it and routes the grain to PCI-DSS-compliant storage infrastructure. The sec:credential classification ensures the grain is encrypted.
Detecting Sensitivity Mismatch
Consider a grain that arrives with header sensitivity 00 (public) but contains structural_tags: ["phi:diagnosis"]. The parser computes the required sensitivity: phi:* maps to 11 (PHI). The header says 00. This is a mismatch:
```
ERR_SENSITIVITY_MISMATCH: header sensitivity 0 is lower than
tags require (3). Possible serializer defect or header tampering.
```
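The parser rejects the grain. This check blocks a class of attacks in which the header bits are lowered, whether by a malicious writer or by tampering in transit, to slip tagged sensitive data past header-based access controls. It also catches serializer bugs before they result in compliance violations. It cannot catch a writer that omits the tags altogether; that gap is the limitation described under Header Sensitivity Limitations.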
Sensitivity in the Broader Architecture
The sensitivity classification system connects to the other compliance features in OMS:
- Per-user encryption (Section 20.3): Grains with user_id and sensitivity bits 10 or 11 are candidates for per-user key derivation and encrypted storage.
- Crypto-erasure (Section 20.6): When a user's key is destroyed, all grains encrypted with that key become unrecoverable, regardless of their sensitivity level.
- Selective disclosure (Section 10): For grains that need to be partially shared, selective disclosure can hide specific fields while revealing others, with the sensitivity tags indicating which fields are sensitive.
- Provenance chain (Section 14.1): Every grain's derivation history is tracked, providing an audit trail that can support record-keeping obligations such as those in GDPR Article 30 and HIPAA Section 164.308.
The header sensitivity bits are the entry point to this system. They provide the fast path for routing decisions. The tag vocabulary provides the detailed classification. The per-user encryption pattern provides the cryptographic enforcement. And the consistency validation ensures that the header never claims a lower sensitivity than the tags declare.
Summary
| Layer | Mechanism | Speed | Accuracy |
|---|---|---|---|
| Header bits (13.1) | 2-bit field, byte 1 bits 6-7 | O(1) --- no deserialization | Advisory --- depends on writer |
| Tag vocabulary (13.2) | structural_tags prefixes: pii:, phi:, reg:, sec:, legal: | Requires payload parsing | Detailed --- per-field classification |
| Consistency validation (13.4) | Serializer sets, parser verifies | Automatic at read/write | Catches mismatches and tampering |
| Legal neutrality (13.5) | Technical metadata, not legal definitions | N/A | Jurisdiction-dependent |
The two-layer design reflects a practical reality: infrastructure needs to make fast decisions, but compliance needs to make correct decisions. Header bits handle the first case. Payload inspection handles the second. Together, they provide a sensitivity classification system that operates at wire speed for the common case while maintaining full accuracy for the cases that matter most.
For the per-user encryption pattern that acts on these sensitivity classifications --- including HKDF-SHA256 key derivation, blind indexes, and crypto-erasure --- see GDPR-Ready Agent Memory.