
User Journey Hash-Chain Tracking

ujht

User Journey Hash-Chain Tracking (UJHT) defines a hash-chain-based, privacy-preserving mechanism for measuring a visitor’s in-session navigation path ("user journey") without cookies, local persistent identifiers, or device fingerprinting. UJHT relies on an ephemeral, client-held secret seed to compute a one-way hash chain across sequential events. Each transmitted event contains only the current hash, the previous hash, and a minimal payload. Servers stitch events into journeys by matching prev → curr edges without learning or inferring a stable user identifier.

This draft additionally specifies a Journey Catalog control plane. The catalog defines step keys and allowed transitions (edges). Clients emit human-readable step hints via namespaced x-* payload members; servers validate against the catalog while remaining agnostic to the client’s secret seed.

This is an Editor’s Draft and may change at any time. Feedback is welcome via the issue tracker.

Introduction

Traditional web analytics depend on cookies or fingerprinting to recognize visitors across page loads; these approaches raise privacy and regulatory concerns. UJHT provides a session-scoped alternative that avoids persistent identifiers while enabling ordered in-session path analysis for MPAs and SPAs. The hybrid control-plane model lets design and product stakeholders evolve a journey graph in parallel with implementation while the client independently computes the cryptographic chain.

Conformance

The key words MUST, MUST NOT, SHOULD, and MAY are to be interpreted as described in RFC 2119 and RFC 8174.

Conformance classes:

Terminology

Client
User agent executing site JavaScript.
Server
First-party collection endpoint receiving UJHT events.
Session
Single, ephemeral browsing context (tab/window lifetime).
Seed (S)
Cryptographically random secret generated per session and retained only client-side.
Event
Minimal record describing a step in the journey.
Hash chain
Sequence where each event hash commits to its payload and the previous hash.
Journey Catalog
A versioned server-side registry of step keys and allowed edges (control plane).
Step key
Human-readable identifier for a journey step (e.g., onboarding.enter_email@v3).

Data Model

Event Object

Each event sent by the client MUST be a JSON object:

{
  "v": 1,
  "curr": "<hex-encoded hash>",
  "prev": "<hex-encoded hash or empty string>",
  "payload": {
    "path": "/example",
    "event": "view",
    "ts": 1734373200000,
    "ref": "https://example.org",
    "x-step": "onboarding.enter_email@v3",
    "x-journey": "onboarding",
    "x-catalog": "2025-08-04T12:00Z",
    "x-variant": "A"
  }
}

Fields

  • v — protocol version (this spec defines 1).
  • curr — lowercase hex of current event hash h_i. For v = 1, this MUST be 64 hexadecimal characters (HMAC-SHA-256 output).
  • prev — lowercase hex of previous event hash h_{i-1} or empty string for root events; for v = 1 this MUST be empty or 64 hexadecimal characters.
  • payload — minimal event data. Required members:
    • path — normalized page path or route identifier; MUST NOT include PII.
    • event — event type token (e.g., "view", "action:signup").
    • ts — Unix epoch ms; SHOULD be rounded (see § Privacy Requirements).
    • ref — referrer origin or empty string. Clients MUST NOT include path, query, or fragment components.

Additional namespaced members under payload whose keys begin with x- are permitted; servers MUST ignore unknown members. When a Journey Catalog is enabled (§ Journey Catalog), clients SHOULD include:

  • x-step — step key for this event. MUST be included if the deployment enforces catalog validation.
  • x-journey — optional journey/funnel name.
  • x-catalog — catalog version identifier (e.g., timestamp or semantic version).
  • x-variant — optional experiment or variant label.

Values of x-* members SHOULD be short, low-cardinality ASCII tokens (max 256 chars) and MUST NOT contain PII, hashed or encrypted PII, or per-user stable identifiers.

Canonicalization

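Clients MUST canonicalize payload using the JSON Canonicalization Scheme (JCS) or an equivalent scheme that produces a stable byte sequence for the same semantic JSON object. The hash input for an event is the canonical payload, a literal "|" separator, and the previous event hash as transmitted in prev (the empty string for a root event), concatenated in that order (|| denotes concatenation):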

CANON(payload) || "|" || prev
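As a rough illustration of the hash input, the sketch below builds CANON(payload) || "|" || prev in Python (browser clients would do the equivalent in JavaScript). It assumes a payload limited to ASCII keys, string values, and integer values, where sorted-key compact JSON is expected to match JCS output; a full JCS implementation is needed for floating-point numbers and other edge cases. The names canon and hash_input are illustrative and are not defined by this specification.

```python
import json


def canon(payload: dict) -> bytes:
    """Simplified stand-in for JCS: sorted keys, no extra whitespace, UTF-8."""
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def hash_input(payload: dict, prev: str) -> bytes:
    """Return CANON(payload) || "|" || prev as bytes."""
    return canon(payload) + b"|" + prev.encode("ascii")


payload = {"path": "/example", "event": "view", "ts": 1734373200000, "ref": ""}
print(hash_input(payload, "").decode("utf-8"))
# {"event":"view","path":"/example","ref":"","ts":1734373200000}|
```

Because member order and whitespace are fixed, two clients that build the same payload in a different key order produce the same bytes.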

Hash Algorithm

For protocol version v = 1, clients MUST compute h_i = HMAC_SHA256(S, CANON(payload) || "|" || h_{i-1}) and encode the 32-byte output as lowercase hex (64 characters). The initial predecessor h_{-1} is defined as the empty string (""), and the corresponding root event MUST set prev to "".

Seeds MUST be generated with at least 128 bits of entropy (256 bits recommended) using a cryptographically secure random source (for example, crypto.getRandomValues() in Web Crypto), and MUST NOT be transmitted, persisted, or shared across sessions.
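A minimal sketch of the v = 1 chain in Python, for illustration only (browser clients would use Web Crypto). It assumes a simplified sorted-key JSON encoding as a stand-in for JCS and a seed held in memory. The names canon, next_hash, and build_chain, and the sample paths, are hypothetical.

```python
import hashlib
import hmac
import json
import secrets


def canon(payload: dict) -> bytes:
    # Stand-in for JCS; adequate only for string and integer members.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def next_hash(seed: bytes, payload: dict, prev: str) -> str:
    """HMAC-SHA-256 over CANON(payload) || "|" || prev, as lowercase hex."""
    message = canon(payload) + b"|" + prev.encode("ascii")
    return hmac.new(seed, message, hashlib.sha256).hexdigest()


def build_chain(seed: bytes, payloads: list) -> list:
    """Return event objects whose prev/curr members form one chain."""
    events, prev = [], ""  # the root event uses prev = ""
    for payload in payloads:
        curr = next_hash(seed, payload, prev)
        events.append({"v": 1, "curr": curr, "prev": prev, "payload": payload})
        prev = curr
    return events


seed = secrets.token_bytes(32)  # 256 bits of entropy, never transmitted
chain = build_chain(seed, [
    {"path": "/signup", "event": "view", "ts": 1734373200000, "ref": ""},
    {"path": "/verify", "event": "view", "ts": 1734373210000, "ref": ""},
])
```

Each event's prev equals the preceding event's curr, which is the only link a server needs for stitching; the seed itself never leaves the client.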

This specification fixes the hash construction for v = 1 to HMAC-SHA-256. Future protocol versions MAY define additional constructions with distinct v values and corresponding schema updates.

Client Processing Model

  1. On session start, generate a fresh seed S and derive an HMAC key. A session is the lifetime of a tab/window (see § Session semantics and edge cases).
  2. Step discovery (hybrid control-plane): determine step metadata using one or more of:
    • HTML meta tags (ujht-step, ujht-journey, ujht-catalog, ujht-variant),
    • server response headers (UJHT-Step, UJHT-Journey, UJHT-Catalog, UJHT-Variant),
    • application configuration (e.g., route table constants).
    Clients MUST NOT fetch the catalog; step keys are opaque tokens for the client.
  3. For each step, construct and canonicalize payload, compute curr, transmit the event, and update prev ← curr.
  4. Implementations MAY store prev in sessionStorage to bridge navigations; they MUST NOT write cookies, localStorage, IndexedDB, or other persistent state for UJHT.
  5. Clients MUST suppress all UJHT transmission when a Do Not Track signal is enabled (see § User Preference Signaling).

Implementation note: Clients SHOULD emit event: "view" on route changes (SPA) and significant page loads (MPA), and event: "action:*" for key user actions.

Session semantics and edge cases

Clients define a UJHT session as the lifetime of a single browsing context (for example, a tab or window). In practice, user agents and frameworks may perform back/forward cache restores, reloads, and process restarts that affect client state.

  • If sessionStorage is unavailable or a client cannot recover a valid prev value after navigation or restart, it MUST start a new chain by setting prev = "" and computing a new root event.
  • If a page is restored from a back/forward cache with stale or inconsistent state, clients SHOULD prefer starting a new chain rather than attempting to reuse an old prev that no longer reflects the actual journey.
  • Implementations MUST NOT attempt to bridge chains across independent browsing contexts (for example, separate tabs or windows) by sharing seeds or prev values.

Transport

TLS and HTTP Versions

  • All UJHT transmissions MUST use TLS (HTTPS) with TLS 1.2 or newer. Clear-text HTTP MUST NOT be used in production. For development, http://localhost is permitted.
  • HTTP/1.1, HTTP/2, and HTTP/3 are supported. Implementations SHOULD prefer HTTP/2 or HTTP/3 where available.

HTTP Interface

  • Method: POST
  • Headers: Content-Type: application/ujht+json
  • Body: Event object (see § Event Object).

Servers MUST treat curr as an idempotency key within a given collection endpoint. Responses: 204 on success; 400, 413, 429, 5xx as appropriate.

For reliability on unload, clients SHOULD use navigator.sendBeacon(). Fallbacks include asynchronous fetch() with keepalive.

Batching

Clients MAY send a batch envelope with media type application/ujht-batch+json containing an array of events, ordered or unordered. Servers MUST accept out-of-order events and deduplicate by curr.

{
  "v": 1,
  "events": [ { /* event */ }, { /* event */ } ]
}
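One way a server might ingest a batch envelope while deduplicating by curr is sketched below. The in-memory dictionary stands in for whatever store a deployment actually uses, and ingest_batch is a hypothetical name.

```python
def ingest_batch(envelope: dict, store: dict) -> int:
    """Add unseen events to store (keyed by curr); return how many were new."""
    if envelope.get("v") != 1:
        raise ValueError("unsupported batch version")
    accepted = 0
    for event in envelope.get("events", []):
        curr = event["curr"]
        if curr not in store:  # duplicates are ignored, in any arrival order
            store[curr] = event
            accepted += 1
    return accepted


store = {}
batch = {"v": 1, "events": [
    {"curr": "b" * 64, "prev": "a" * 64},
    {"curr": "a" * 64, "prev": ""},
    {"curr": "a" * 64, "prev": ""},  # retransmitted duplicate
]}
new_events = ingest_batch(batch, store)
```

Resubmitting the same batch leaves the store unchanged, which is the behavior the idempotency requirement on curr calls for.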

Streaming

Clients MAY stream events using NDJSON over HTTP with Content-Type: application/ujht-ndjson. For HTTP/1.1, Transfer-Encoding: chunked MAY be used; HTTP/2 and HTTP/3 provide native data framing. Each line MUST contain a single event JSON object.
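The one-event-per-line framing can be sketched as follows. The helper names to_ndjson and from_ndjson are illustrative; the sketch relies on the fact that a JSON serializer escapes newlines inside strings, so each serialized event occupies exactly one line.

```python
import json


def to_ndjson(events: list) -> str:
    """Serialize each event as compact JSON followed by a newline."""
    return "".join(json.dumps(e, separators=(",", ":")) + "\n" for e in events)


def from_ndjson(body: str) -> list:
    """Parse one event per non-empty line."""
    return [json.loads(line) for line in body.split("\n") if line.strip()]


events = [
    {"v": 1, "curr": "a" * 64, "prev": "", "payload": {"path": "/a"}},
    {"v": 1, "curr": "b" * 64, "prev": "a" * 64, "payload": {"path": "/b"}},
]
body = to_ndjson(events)
```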

Optional Server Hints (Non-authoritative)

Servers MAY include response headers that provide non-authoritative hints to clients:

  • UJHT-Step, UJHT-Journey, UJHT-Catalog, UJHT-Variant — suggest step metadata for the next client emission.
  • UJHT-Expected-Next — a comma-separated list of likely next step keys for QA/observability.

Clients MUST NOT rely on hints for hashing or identity; they are advisory only.

WebSocket Subprotocol

Implementations MAY use secure WebSockets (wss://) with a subprotocol token ujht.v1. Messages are UTF-8 JSON; both single events and batches are allowed. Clear-text ws:// MUST NOT be used in production. Clients MUST NOT open a WebSocket when a Do Not Track or equivalent opt-out signal is enabled.

User Preference Signaling

  • Clients MUST suppress UJHT when navigator.doNotTrack === "1" or the UA signals a comparable setting.
  • Clients MUST NOT transmit an explicit "DNT notice" event solely to announce Do Not Track status.
  • Servers SHOULD also respect the HTTP DNT: 1 header to disable any server-side analytics logic associated with UJHT.
  • Deployments SHOULD honor additional user preference and privacy signals exposed by the user agent or applicable regulation when they are clearly equivalent to an opt-out of tracking.
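A server-side check for these signals might look like the sketch below. Treating the Sec-GPC request header (Global Privacy Control) as equivalent to an opt-out is an assumption made here for illustration; which signals count as equivalent is a deployment decision. The function name tracking_opted_out is hypothetical.

```python
def tracking_opted_out(headers: dict) -> bool:
    """Return True when request headers carry a tracking opt-out signal."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    normalized = {name.lower(): str(value).strip()
                  for name, value in headers.items()}
    # Sec-GPC is included as an assumed equivalent signal.
    return normalized.get("dnt") == "1" or normalized.get("sec-gpc") == "1"
```

A collection endpoint could call this before any UJHT-related processing and skip that processing when it returns True.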

Server Processing Model

Validation

  • Servers MUST validate version, types, and formats.
  • Servers SHOULD detect and mitigate payloads containing obvious PII patterns (for example, email addresses, telephone numbers, or account identifiers) by rejecting the event (e.g., with 400) or by sanitizing the offending fields and annotating them.
  • Servers MUST index prev for stitching.
  • Servers MUST NOT derive or require access to the session seed, or mint surrogate hashes intended to replace curr/prev. The construction of the hash chain is a client responsibility.
  • Servers MAY normalize payload.ref to origin-only when misconfigured clients send full URLs.
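The validation rules above can be sketched as a single function that returns a list of problems. The problem tokens (bad_curr, pii_suspected, and so on) and the deliberately crude email pattern are assumptions for illustration; a real deployment would likely use broader PII heuristics.

```python
import re

HEX64 = re.compile(r"[0-9a-f]{64}")
# Crude email detector, used only to illustrate an "obvious PII pattern".
EMAIL = re.compile(r"[^@\s/]+@[^@\s/]+\.[A-Za-z]{2,}")


def validate_event(event: dict) -> list:
    """Return a list of problem tokens; an empty list means the event passed."""
    problems = []
    if event.get("v") != 1:
        problems.append("bad_version")
    curr, prev = event.get("curr"), event.get("prev")
    if not (isinstance(curr, str) and HEX64.fullmatch(curr)):
        problems.append("bad_curr")
    if not (isinstance(prev, str) and (prev == "" or HEX64.fullmatch(prev))):
        problems.append("bad_prev")
    payload = event.get("payload")
    if not isinstance(payload, dict):
        return problems + ["bad_payload"]
    for key in ("path", "event"):
        if not (isinstance(payload.get(key), str) and payload[key]):
            problems.append("bad_" + key)
    if not isinstance(payload.get("ref"), str):
        problems.append("bad_ref")
    ts = payload.get("ts")
    if isinstance(ts, bool) or not isinstance(ts, int) or ts < 0:
        problems.append("bad_ts")
    if any(isinstance(payload.get(k), str) and EMAIL.search(payload[k])
           for k in ("path", "event")):
        problems.append("pii_suspected")
    return problems
```

A server could answer 400 when the list is non-empty, or sanitize and annotate the event instead, as the requirements permit.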

Storage vs. Streaming

  • Servers MAY store raw events as edges (curr, prev, path, event, ts, ref, received_at).
  • Servers MAY stream events to downstream processors (for example, OTLP/HTTP, Kafka). Controllers MUST ensure downstream processors adhere to § Privacy Requirements and do not perform cross-session or cross-site re-identification using UJHT data.
  • When streaming without storage, servers SHOULD implement back-pressure and transient buffering to avoid data loss.

Stitching

To reconstruct a journey, start from any event with prev = "" and follow successive events where prev equals the predecessor’s curr, ordered by ts (tie-break by received_at). Servers MUST tolerate out-of-order arrival and ignore duplicates by curr.

If a predecessor event is missing (that is, there is no event whose curr matches a given prev), servers MUST treat the first seen event as the start of a new reconstructable chain for analytics purposes and MAY annotate such events with an anomaly token (for example, missing_predecessor).
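The stitching procedure can be sketched as below. This is a simplified model: it assumes events are already validated, keeps the first-seen copy of each curr, and follows only the earliest successor when more than one event names the same prev. The function name stitch and the returned dictionary shape are hypothetical.

```python
def stitch(events: list) -> list:
    """Group events into journeys by following prev -> curr edges."""
    by_curr = {}
    for event in events:
        by_curr.setdefault(event["curr"], event)  # ignore duplicates by curr

    successors = {}
    for event in by_curr.values():
        successors.setdefault(event["prev"], []).append(event)

    def order(event):
        # Order by ts, tie-break by received_at when present.
        return (event["payload"]["ts"], event.get("received_at", 0))

    # A chain starts at a root (prev == "") or where the predecessor is missing.
    starts = [e for e in by_curr.values()
              if e["prev"] == "" or e["prev"] not in by_curr]

    journeys = []
    for start in sorted(starts, key=order):
        chain, node = [start], start
        while successors.get(node["curr"]):
            node = min(successors[node["curr"]], key=order)
            chain.append(node)
        anomaly = None if start["prev"] == "" else "missing_predecessor"
        journeys.append({"anomaly": anomaly, "events": chain})
    return journeys
```

Because chains are rebuilt from the prev index rather than from arrival order, late or shuffled deliveries produce the same journeys.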

Catalog Validation & Anomalies

When a Journey Catalog is enabled and x-step is present:

  • Servers MUST verify that x-step exists in the active catalog version.
  • For non-root events, servers SHOULD verify that the transition from the predecessor’s x-step to the current x-step is allowed by the catalog. If the predecessor x-step is unknown or missing, validation MAY be skipped for that edge.
  • Servers MUST NOT drop events solely due to catalog mismatches; instead they SHOULD annotate with an anomaly flag and proceed.

Recommended anomaly tokens (attached as an implementation-defined note, for example, x-anomaly): catalog_unknown_step, invalid_transition, catalog_version_mismatch, missing_predecessor.
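A sketch of catalog validation that annotates rather than drops events is shown below. It assumes a catalog dictionary with version, steps, and allowNext members, and that the caller has already looked up the predecessor event (or passes None when it is missing). The function name catalog_anomalies and the sample catalog are illustrative.

```python
CATALOG = {
    "version": "2025-08-04T12:00Z",
    "steps": {
        "onboarding.enter_email@v3": {"allowNext": ["onboarding.verify_email@v2"]},
        "onboarding.verify_email@v2": {"allowNext": ["onboarding.welcome@v1"]},
        "onboarding.welcome@v1": {},
    },
}


def catalog_anomalies(event: dict, predecessor, catalog: dict) -> list:
    """Return anomaly tokens for one event; never rejects the event."""
    payload = event["payload"]
    step = payload.get("x-step")
    if step is None:
        return []  # nothing to validate
    anomalies = []
    if "x-catalog" in payload and payload["x-catalog"] != catalog["version"]:
        anomalies.append("catalog_version_mismatch")
    if step not in catalog["steps"]:
        anomalies.append("catalog_unknown_step")
        return anomalies
    if event["prev"] == "":
        return anomalies  # root events have no incoming edge
    if predecessor is None:
        anomalies.append("missing_predecessor")
        return anomalies
    prev_step = predecessor["payload"].get("x-step")
    if prev_step in catalog["steps"]:  # skip the edge check otherwise
        allowed = catalog["steps"][prev_step].get("allowNext", [])
        if step not in allowed:
            anomalies.append("invalid_transition")
    return anomalies
```

The returned tokens can be attached to the stored event, for example under an implementation-defined x-anomaly note.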

Journey Catalog (Control Plane)

The Journey Catalog defines the semantic model of a journey independent of client hashing.

Structure

{
  "version": "2025-08-04T12:00Z",
  "steps": {
    "onboarding.enter_email@v3": {
      "name": "Enter Email",
      "allowNext": ["onboarding.verify_email@v2"]
    },
    "onboarding.verify_email@v2": {
      "name": "Verify Email",
      "allowNext": ["onboarding.welcome@v1"]
    },
    "onboarding.welcome@v1": {
      "name": "Welcome"
    }
  }
}

Requirements

  • Controllers MUST maintain a version identifier and SHOULD publish signed snapshots via CI.
  • Servers MAY load the catalog from code, config, or a service. Clients MUST NOT fetch or rely on the catalog directly.
  • If catalog validation is enforced, clients MUST include x-step and SHOULD include x-catalog.
  • Step keys, journey names, catalog versions, and related control-plane identifiers MUST NOT embed application-specific user identifiers, hashed or encrypted PII, or per-user unique tokens. They SHOULD be low-cardinality tokens defined at design or release time, not per user.

Interoperability and OpenTelemetry Mapping

Media Types

  • application/ujht+json — single event.
  • application/ujht-batch+json — batch envelope.
  • application/ujht-ndjson — line-delimited stream of events.

OpenTelemetry (OTel) Mapping (Informative)

UJHT events can be mapped to OTel spans for path analysis:

  • trace_id — derived from the root event’s curr: first 16 bytes (32 hex chars).
  • span_id — first 8 bytes (16 hex chars) of curr.
  • parent_span_id — first 8 bytes of prev (empty for root).
  • name — payload.event or payload.path.
  • start/end time — from ts and optional dwell (x-dwellMs).
  • attributes — ujht.path, ujht.ref, and any x-* keys (for example, ujht.x-step, ujht.x-journey, ujht.x-catalog, ujht.x-variant).

All spans belonging to a single reconstructed journey MUST reuse the same trace_id derived from that journey’s root event.

When exporting to OTLP, controllers MUST ensure the truncation does not create cross-session linkability beyond the session chain and MUST NOT enrich UJHT-derived traces with stable user identifiers or identifiers derived from PII.
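The mapping above can be sketched as a function that produces a span-like dictionary. The output keys mirror OTLP span field names, but this is not an OTLP exporter; it assumes hex-string identifiers, a ts in epoch milliseconds, and an x-dwellMs value that parses as an integer. The name to_span is hypothetical.

```python
def to_span(event: dict, root_curr: str) -> dict:
    """Map one UJHT event to span fields, given the journey root's curr."""
    payload = event["payload"]
    start_ns = payload["ts"] * 1_000_000  # epoch ms -> epoch ns
    dwell_ms = int(payload.get("x-dwellMs", 0))
    attributes = {"ujht.path": payload["path"], "ujht.ref": payload["ref"]}
    attributes.update({"ujht." + key: value
                       for key, value in payload.items()
                       if key.startswith("x-")})
    return {
        "trace_id": root_curr[:32],             # first 16 bytes of root curr
        "span_id": event["curr"][:16],          # first 8 bytes of curr
        "parent_span_id": event["prev"][:16],   # "" for a root event
        "name": payload["event"],
        "start_time_unix_nano": start_ns,
        "end_time_unix_nano": start_ns + dwell_ms * 1_000_000,
        "attributes": attributes,
    }
```

Every event in a journey is passed the same root_curr, so all of its spans share one trace_id.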

Privacy Requirements

  1. No persistent identifiers and no fingerprinting.
    • Clients MUST NOT use cookies, localStorage, IndexedDB, or other persistent state to store UJHT seeds, hashes, or identifiers. Session bridging MAY use sessionStorage.
    • Controllers and processors MUST NOT use UJHT events, alone or in combination with other signals, to construct cross-session or cross-site device fingerprints or stable user identifiers.
  2. Seed confidentiality.
    • The session seed S MUST remain client-confidential. Servers MUST NOT derive, request, or compute event hashes on behalf of the client.
    • Seeds MUST be fresh per session and MUST NOT be reused across independent sessions or sites.
  3. Data minimization.
    • ts values SHOULD be rounded to a coarser granularity (for example, whole seconds or tens of seconds) appropriate to the analytics use case.
    • ref MUST be origin-only or empty (no path, query, or fragment).
    • path MUST NOT encode PII or application-specific account identifiers.
    • Values of x-* MUST NOT contain PII, hashed or encrypted PII, or globally unique identifiers, and SHOULD be short, low-cardinality tokens.
  4. Transparency and control.
    • Deployments MUST disclose UJHT usage in an appropriate privacy notice and SHOULD honor opt-out mechanisms, including Do Not Track and equivalent user preference signals.
    • This specification does not determine the legal classification of UJHT data; controllers are responsible for ensuring that deployments comply with applicable data protection and privacy laws.
  5. Retention & purpose limitation.
    • Servers SHOULD define short retention periods for raw edges and favor aggregates.
    • UJHT data MUST be used solely for privacy-preserving journey analytics and UX improvement.
    • Controllers and processors MUST NOT repurpose UJHT events for authentication, advertising profiles, or cross-session tracking by joining them with stable identifiers (for example, account IDs, email addresses, or loyalty IDs) except in forms that are demonstrably aggregated and non-identifying.
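The data-minimization rules for ts and ref can be illustrated with two small helpers. The sketch floors timestamps to a chosen granularity (one possible reading of "rounded") and reduces a referrer URL to scheme, host, and optional port. The names round_ts, origin_only, and minimal_payload, and the 10-second default, are assumptions for illustration.

```python
from urllib.parse import urlsplit


def round_ts(ts_ms: int, granularity_ms: int = 1000) -> int:
    """Floor an epoch-millisecond timestamp to the given granularity."""
    return ts_ms - ts_ms % granularity_ms


def origin_only(url: str) -> str:
    """Reduce a URL to its origin; return "" when it cannot be parsed."""
    if not url:
        return ""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        return ""
    port = ":" + str(parts.port) if parts.port else ""
    return parts.scheme + "://" + parts.hostname + port


def minimal_payload(path: str, event: str, ts_ms: int, referrer: str) -> dict:
    """Build the required payload members with coarse ts and origin-only ref."""
    return {"path": path, "event": event,
            "ts": round_ts(ts_ms, 10_000), "ref": origin_only(referrer)}
```

For example, a referrer of https://example.org/a/b?q=1#f is reduced to https://example.org before it enters the payload.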

Security Considerations

  • Collision and forgery resistance. With a 256-bit HMAC-SHA-256 output, accidental collisions between curr values are negligible, and a party without the seed cannot compute the hash that extends or rewrites a chain.
  • Integrity and trust. Servers cannot verify client honesty; UJHT is observational. Implementations should apply anomaly detection and rate limiting.
  • Replay. Treat curr as unique; ignore duplicates.
  • Transport. Use HTTPS/WSS; enforce modern TLS; set CSP to reduce script injection risk.
  • Control plane separation. The Journey Catalog defines semantics but MUST NOT introduce identifiers that enable cross-session re-identification. Step keys, journey names, and related metadata MUST NOT embed account IDs, email addresses, telephone numbers, or other direct identifiers, nor hashed or encrypted forms of such identifiers intended to be stable per user.

JSON Schema (Informative)

The wire format is unchanged. The schema below remains valid for v = 1; deployments may additionally lint x-* fields to token patterns and max lengths.

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.org/schemas/ujht-event.json",
  "title": "UJHT Event v1",
  "type": "object",
  "required": ["v", "curr", "prev", "payload"],
  "properties": {
    "v": { "const": 1 },
    "curr": { "type": "string", "pattern": "^[0-9a-f]{64}$" },
    "prev": { "type": "string", "pattern": "^$|^[0-9a-f]{64}$" },
    "payload": {
      "type": "object",
      "required": ["path", "event", "ts", "ref"],
      "properties": {
        "path": { "type": "string", "minLength": 1 },
        "event": { "type": "string", "minLength": 1 },
        "ts": { "type": "integer", "minimum": 0 },
        "ref": { "type": "string" }
      },
      "additionalProperties": true
    }
  },
  "additionalProperties": false
}

Recommended lint (non-normative): keys starting with x- have values of type string, ≤256 chars, matching ^[A-Za-z0-9._:@-]+$ (the "@" is needed for versioned step keys such as onboarding.enter_email@v3).
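A sketch of such a lint is shown below. The character class used here admits "@" so that versioned step keys such as onboarding.enter_email@v3 pass; treat the exact pattern as an assumption to be adjusted per deployment. The name lint_x_members is hypothetical.

```python
import re

# Assumed token pattern; "@" is included for versioned step keys.
TOKEN = re.compile(r"[A-Za-z0-9._:@-]+")
MAX_LEN = 256


def lint_x_members(payload: dict) -> list:
    """Return the x-* keys whose values fail the recommended lint."""
    failing = []
    for key, value in payload.items():
        if not key.startswith("x-"):
            continue
        ok = (isinstance(value, str)
              and len(value) <= MAX_LEN
              and TOKEN.fullmatch(value) is not None)
        if not ok:
            failing.append(key)
    return failing
```

Applied to the example event in § Event Object, this returns an empty list; a value containing spaces or exceeding 256 characters would be reported by key.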