OpenUJI Specs
UJHST 1.0-draft

User Journey Hashed Step Tracking (UJHST) 1.0 — Editor’s Draft (Hybrid Control-Plane Update)

Short name: ujhst • Status: Editor’s Draft

User Journey Hashed Step Tracking (UJHST) defines a privacy-preserving mechanism for measuring a visitor’s in-session navigation path ("user journey") without cookies, local persistent identifiers, or device fingerprinting. UJHST relies on an ephemeral, client-held secret seed to compute a one-way hash chain across sequential events. Each transmitted event contains only the current hash, the previous hash, and a minimal payload. Servers stitch events into journeys by matching prev → curr edges without learning or inferring a stable user identifier.

This draft additionally specifies a Journey Catalog control plane. The catalog defines step keys and allowed transitions (edges). Clients emit human-readable step hints via namespaced x-* payload members; servers validate against the catalog while remaining agnostic to the client’s secret seed.

This is an Editor’s Draft and may change at any time. Feedback is welcome via the issue tracker.

Introduction

Traditional web analytics depend on cookies or fingerprinting to recognize visitors across page loads; these approaches raise privacy and regulatory concerns. UJHST provides a session-scoped alternative that avoids persistent identifiers while still enabling ordered in-session path analysis for multi-page applications (MPAs) and single-page applications (SPAs). The hybrid control-plane model lets design and product stakeholders evolve a journey graph in parallel with implementation, while the client independently computes the cryptographic chain.

Conformance

The key words MUST, MUST NOT, SHOULD, and MAY are to be interpreted as described in RFC 2119 and RFC 8174.

Conformance classes:

Terminology

Client
User agent executing site JavaScript.
Server
First-party collection endpoint receiving UJHST events.
Session
Single, ephemeral browsing context (tab/window lifetime).
Seed (S)
Cryptographically random secret generated per session and retained only client-side.
Event
Minimal record describing a step in the journey.
Hash chain
Sequence where each event hash commits to its payload and the previous hash.
Journey Catalog
A versioned server-side registry of step keys and allowed edges (control plane).
Step key
Human-readable identifier for a journey step (e.g., onboarding.enter_email@v3).

Data Model

Event Object

Each event sent by the client MUST be a JSON object:

{
  "v": 1,
  "curr": "<hex-encoded hash>",
  "prev": "<hex-encoded hash or empty string>",
  "payload": {
    "path": "/example",
    "event": "view",
    "ts": 1734373200000,
    "ref": "https://example.org",
    "x-step": "onboarding.enter_email@v3",
    "x-journey": "onboarding",
    "x-catalog": "2025-08-04T12:00Z",
    "x-variant": "A"
  }
}

Fields

  • v — protocol version (this spec defines 1).
  • curr — lowercase hex of the current event hash h_i.
  • prev — lowercase hex of the previous event hash h_{i-1}, or the empty string for the first event of a session.
  • payload — minimal event data. Required members:
    • path — normalized page path or route identifier; MUST NOT include PII.
    • event — event type token (e.g., "view", "action:signup").
    • ts — Unix epoch ms; SHOULD be rounded (see § Privacy Requirements).
    • ref — referrer origin or empty string.

Additional namespaced members under payload whose keys begin with x- are permitted; servers MUST ignore unknown members. When a Journey Catalog is enabled (§ Journey Catalog), clients SHOULD include:

  • x-step — step key for this event. MUST be included if the deployment enforces catalog validation.
  • x-journey — optional journey/funnel name.
  • x-catalog — catalog version identifier (e.g., timestamp or semantic version).
  • x-variant — optional experiment or variant label.

Values of x-* members SHOULD be short ASCII tokens (max 256 chars) and MUST NOT contain PII.

Canonicalization

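Clients MUST canonicalize payload using the JSON Canonicalization Scheme (JCS, RFC 8785) or an equivalent scheme that produces a stable byte sequence. The hash input for each event is the canonical payload, a literal "|" separator, and the previous hash (the empty string for the first event):

CANON(payload) || "|" || prev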

Hash Algorithm

Clients MUST compute h_i = HMAC_SHA256(S, CANON(payload) || "|" || h_{i-1}) and encode the result as lowercase hex. Seeds MUST be generated with at least 128 bits of entropy (256 bits recommended) and MUST NOT be transmitted, persisted, or shared across sessions.

Algorithm agility: Clients MAY support additional algorithms (e.g., HMAC-SHA-512, BLAKE2) while continuing to support HMAC-SHA-256.
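
The chain computation above can be sketched as follows. This is a minimal illustration in Python, not a reference implementation: the helper names canon and next_hash are hypothetical, the canonicalization only approximates JCS (sorted keys, compact separators, UTF-8), and it assumes prev is concatenated in its lowercase-hex form, which the draft does not state explicitly.

```python
import hashlib
import hmac
import json
import secrets


def canon(payload: dict) -> bytes:
    # Approximation of JCS: sorted keys, no insignificant whitespace, UTF-8.
    # Adequate for ASCII keys, strings and integers; not a full RFC 8785 implementation.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


def next_hash(seed: bytes, payload: dict, prev: str) -> str:
    # h_i = HMAC_SHA256(S, CANON(payload) || "|" || h_{i-1}), as lowercase hex.
    # Assumption: prev enters the message as its hex string, not as raw bytes.
    message = canon(payload) + b"|" + prev.encode("ascii")
    return hmac.new(seed, message, hashlib.sha256).hexdigest()


seed = secrets.token_bytes(32)  # 256 bits; never transmitted or persisted
p1 = {"path": "/signup", "event": "view", "ts": 1734373200000, "ref": ""}
p2 = {"path": "/verify", "event": "view", "ts": 1734373260000, "ref": ""}
h1 = next_hash(seed, p1, "")   # root event: prev is the empty string
h2 = next_hash(seed, p2, h1)   # commits to p2 and to h1
```

Because h2 depends on h1, changing either payload or the seed changes every later hash in the chain.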

Client Processing Model

  1. On session start, generate a fresh seed S and derive an HMAC key. A session is the lifetime of a tab/window.
  2. Step discovery (hybrid control-plane): determine step metadata using one or more of:
    • HTML meta tags (ujhst-step, ujhst-journey, ujhst-catalog, ujhst-variant),
    • server response headers (UJHST-Step, UJHST-Journey, UJHST-Catalog, UJHST-Variant),
    • application configuration (e.g., route table constants).
    Clients MUST NOT fetch the catalog; step keys are opaque tokens for the client.
  3. For each step, construct and canonicalize payload, compute curr, transmit the event, and update prev ← curr.
  4. Implementations MAY store prev in sessionStorage to bridge navigations; they MUST NOT write cookies, localStorage, IndexedDB, or other persistent state for UJHST.
  5. Clients MUST suppress all UJHST transmission when a Do Not Track signal is enabled (see § User Preference Signaling).

Implementation note: Clients SHOULD emit event: "view" on route changes (SPA) and significant page loads (MPA), and event: "action:*" for key user actions.
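
The processing steps above might be modeled as below. The sketch is in Python for brevity, although a real client runs as site JavaScript; the JourneyClient class, its emit method, and the do_not_track flag are hypothetical names, the canonicalization only approximates JCS, and the in-memory _prev field stands in for the optional sessionStorage bridge.

```python
import hashlib
import hmac
import json
import secrets
from typing import Optional


class JourneyClient:
    """Hypothetical in-memory model of a UJHST client session."""

    def __init__(self, do_not_track: bool = False) -> None:
        self._seed = secrets.token_bytes(32)  # step 1: fresh per-session seed
        self._prev = ""                       # the root event has an empty prev
        self._dnt = do_not_track

    def emit(self, path: str, event: str, ts: int, ref: str = "",
             step: Optional[str] = None) -> Optional[dict]:
        if self._dnt:
            return None  # step 5: suppress all transmission under Do Not Track
        payload = {"path": path, "event": event, "ts": ts, "ref": ref}
        if step is not None:
            payload["x-step"] = step  # step 2: opaque key from meta tag, header, or config
        # Step 3: canonicalize (JCS approximation), hash, then advance prev.
        canonical = json.dumps(payload, sort_keys=True,
                               separators=(",", ":")).encode("utf-8")
        message = canonical + b"|" + self._prev.encode("ascii")
        curr = hmac.new(self._seed, message, hashlib.sha256).hexdigest()
        record = {"v": 1, "curr": curr, "prev": self._prev, "payload": payload}
        self._prev = curr
        return record


client = JourneyClient()
first = client.emit("/signup", "view", 1734373200000,
                    step="onboarding.enter_email@v3")
second = client.emit("/verify", "view", 1734373260000,
                     step="onboarding.verify_email@v2")
```

The returned records are what would be handed to the transport layer; the seed never leaves the object.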

Transport

TLS and HTTP Versions

  • All UJHST transmissions MUST use TLS (HTTPS) with TLS 1.2 or newer. Clear-text HTTP MUST NOT be used in production. For development, http://localhost is permitted.
  • HTTP/1.1, HTTP/2, and HTTP/3 are supported. Implementations SHOULD prefer HTTP/2 or HTTP/3 where available.

HTTP Interface

Events are submitted as follows:

  • Method: POST.
  • Headers: Content-Type: application/ujhst+json.
  • Body: Event object (see § Event Object).
  • Responses: 204 on success; 400, 413, 429, 5xx as appropriate.

Servers MUST treat curr as an idempotency key.

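For reliability on unload, clients SHOULD use navigator.sendBeacon(), passing a Blob whose type is the UJHST media type so that the Content-Type header is set correctly. Fallbacks include asynchronous fetch() with keepalive.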
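
On the receiving side, the status codes and the idempotency rule could be handled roughly as follows. This is a sketch under stated assumptions: the receive function, the in-memory seen set, and the MAX_BODY_BYTES limit are hypothetical (the draft does not fix a size limit), and rate limiting (429) is left out.

```python
import json
import re

HEX64 = re.compile(r"[0-9a-f]{64}")
MAX_BODY_BYTES = 8192  # hypothetical limit; the draft does not specify one


def receive(body: bytes, seen: set) -> int:
    """Return the HTTP status for one POSTed event; `seen` holds accepted curr values."""
    if len(body) > MAX_BODY_BYTES:
        return 413
    try:
        event = json.loads(body)
    except ValueError:  # covers malformed JSON and invalid UTF-8
        return 400
    if not isinstance(event, dict):
        return 400
    if type(event.get("v")) is not int or event["v"] != 1:
        return 400
    curr, prev = event.get("curr"), event.get("prev")
    if not isinstance(curr, str) or not HEX64.fullmatch(curr):
        return 400
    if not isinstance(prev, str) or (prev != "" and not HEX64.fullmatch(prev)):
        return 400
    if not isinstance(event.get("payload"), dict):
        return 400
    seen.add(curr)  # idempotent: a repeated curr is acknowledged but kept once
    return 204
```

A retried beacon with the same curr therefore yields 204 again without creating a second record.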

Batching

Clients MAY send a batch envelope with media type application/ujhst-batch+json containing an array of events, ordered or unordered. Servers MUST accept out-of-order events and deduplicate by curr.

{
  "v": 1,
  "events": [ { /* event */ }, { /* event */ } ]
}

Streaming

Clients MAY stream events using NDJSON over HTTP with Content-Type: application/ujhst-ndjson. For HTTP/1.1, Transfer-Encoding: chunked MAY be used; HTTP/2 and HTTP/3 provide native data framing. Each line MUST contain a single event JSON object.
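
The one-event-per-line framing can be illustrated with a small encoder and decoder. The function names encode_ndjson and decode_ndjson are hypothetical, and the sketch operates on an in-memory byte string rather than a live HTTP stream.

```python
import json
from typing import Iterable, Iterator


def encode_ndjson(events: Iterable[dict]) -> bytes:
    # One compact JSON object per line; json.dumps escapes any newline inside
    # string values, so an event can never span two lines.
    return b"".join(
        json.dumps(e, separators=(",", ":")).encode("utf-8") + b"\n"
        for e in events
    )


def decode_ndjson(stream: bytes) -> Iterator[dict]:
    for line in stream.split(b"\n"):
        if line.strip():  # skip the empty piece after the final newline
            yield json.loads(line)
```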

Optional Server Hints (Non-authoritative)

Servers MAY include response headers that provide non-authoritative hints to clients:

  • UJHST-Step, UJHST-Journey, UJHST-Catalog, UJHST-Variant — suggest step metadata for the next client emission.
  • UJHST-Expected-Next — a comma-separated list of likely next step keys for QA/observability.

Clients MUST NOT rely on hints for hashing or identity; they are advisory only.

WebSocket Subprotocol

Implementations MAY use secure WebSockets (wss://) with a subprotocol token ujhst.v1. Messages are UTF-8 JSON; both single events and batches are allowed. Clear-text ws:// MUST NOT be used in production. Clients MUST NOT open a WebSocket when a Do Not Track signal is enabled.

User Preference Signaling

  • Clients MUST suppress UJHST when navigator.doNotTrack === "1" or the UA signals a comparable setting.
  • Clients MUST NOT transmit an explicit "DNT notice" event solely to announce Do Not Track status.
  • Servers SHOULD also respect the HTTP DNT: 1 header to disable any server-side analytics logic.

Server Processing Model

Validation

  • Servers MUST validate version, types, and formats; MUST reject payloads containing obvious PII patterns; and MUST index prev for stitching.
  • Servers MUST NOT compute client event hashes, derive or require access to the session seed, or mint surrogate hashes intended to replace curr/prev.
  • Servers MAY normalize payload.ref to origin-only.

Storage vs. Streaming

  • Servers MAY store raw events as edges (curr, prev, path, event, ts, ref, received_at).
  • Servers MAY stream events to downstream processors (e.g., OTLP/HTTP, Kafka). Controllers MUST ensure downstream processors adhere to § Privacy Requirements and do not perform cross-session re-identification.
  • When streaming without storage, servers SHOULD implement back-pressure and transient buffering to avoid data loss.

Stitching

To reconstruct a journey, start from any event with prev = "" and follow successive events where prev equals the predecessor’s curr, ordered by ts (tie-break by received_at). Servers MUST tolerate out-of-order arrival and ignore duplicates by curr.
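
A minimal in-memory version of this stitching procedure is sketched below. The stitch function is a hypothetical name, received_at is assumed to be a field the server stamps on each stored event, and a chain that forks (two events sharing one prev) is simply walked breadth-first, which the draft does not prescribe.

```python
from collections import defaultdict


def stitch(events: list) -> list:
    """Group events into journeys (lists ordered root-first) via prev -> curr edges."""
    by_curr = {}
    for e in events:
        by_curr.setdefault(e["curr"], e)  # ignore duplicates by curr
    children = defaultdict(list)
    for e in by_curr.values():
        children[e["prev"]].append(e)

    def order(e):
        # Order by ts, tie-break by received_at (assumed server-side stamp).
        return (e["payload"]["ts"], e.get("received_at", 0))

    journeys = []
    for root in sorted(children[""], key=order):  # roots have prev == ""
        journey, frontier = [], [root]
        while frontier:
            e = frontier.pop(0)
            journey.append(e)
            frontier.extend(sorted(children[e["curr"]], key=order))
        journeys.append(journey)
    return journeys
```

Events whose predecessor never arrives are not reachable from any root and are left out of the result; a production server would likely hold them for late joins.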

Catalog Validation & Anomalies

When a Journey Catalog is enabled and x-step is present:

  • Servers MUST verify that x-step exists in the active catalog version.
  • For non-root events, servers SHOULD verify that the transition from the predecessor’s x-step to the current x-step is allowed by the catalog. If the predecessor x-step is unknown or missing, validation MAY be skipped for that edge.
  • Servers MUST NOT drop events solely due to catalog mismatches; instead they SHOULD annotate with an anomaly flag and proceed.

Recommended anomaly tokens (attached as an implementation-defined note, e.g., x-anomaly): catalog_unknown_step, invalid_transition, catalog_version_mismatch.
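
The validation rules above could be applied per event as in the following sketch. It assumes the catalog structure shown in § Journey Catalog; the anomalies function and its prev_step argument (the predecessor's x-step, looked up by the server during stitching) are hypothetical, and a step without allowNext is assumed to be terminal.

```python
from typing import Optional

CATALOG = {
    "version": "2025-08-04T12:00Z",
    "steps": {
        "onboarding.enter_email@v3": {"allowNext": ["onboarding.verify_email@v2"]},
        "onboarding.verify_email@v2": {"allowNext": ["onboarding.welcome@v1"]},
        "onboarding.welcome@v1": {},
    },
}


def anomalies(payload: dict, prev_step: Optional[str],
              catalog: dict = CATALOG) -> list:
    """Return anomaly tokens for one event; the event itself is never dropped."""
    found = []
    steps = catalog["steps"]
    step = payload.get("x-step")
    if step is None:
        return found  # no x-step: nothing to validate
    claimed = payload.get("x-catalog")
    if claimed is not None and claimed != catalog["version"]:
        found.append("catalog_version_mismatch")
    if step not in steps:
        found.append("catalog_unknown_step")
        return found
    # The edge check is skipped for roots and for unknown or missing predecessors.
    if prev_step in steps and step not in steps[prev_step].get("allowNext", []):
        found.append("invalid_transition")
    return found
```

The returned tokens would then be attached to the stored event, for example under x-anomaly.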

Journey Catalog (Control Plane)

The Journey Catalog defines the semantic model of a journey independent of client hashing.

Structure

{
  "version": "2025-08-04T12:00Z",
  "steps": {
    "onboarding.enter_email@v3": {
      "name": "Enter Email",
      "allowNext": ["onboarding.verify_email@v2"]
    },
    "onboarding.verify_email@v2": {
      "name": "Verify Email",
      "allowNext": ["onboarding.welcome@v1"]
    },
    "onboarding.welcome@v1": {
      "name": "Welcome"
    }
  }
}

Requirements

  • Controllers MUST maintain a version identifier and SHOULD publish signed snapshots via CI.
  • Servers MAY load the catalog from code, config, or a service. Clients MUST NOT fetch or rely on the catalog directly.
  • If catalog validation is enforced, clients MUST include x-step and SHOULD include x-catalog.

Interoperability and OpenTelemetry Mapping

Media Types

  • application/ujhst+json — single event.
  • application/ujhst-batch+json — batch envelope.
  • application/ujhst-ndjson — line-delimited stream of events.

OpenTelemetry (OTel) Mapping (Informative)

UJHST events can be mapped to OTel spans for path analysis:

  • trace_id — derived from the root event’s curr: first 16 bytes (32 hex chars).
  • span_id — first 8 bytes (16 hex chars) of curr.
  • parent_span_id — first 8 bytes of prev (empty for root).
  • name — payload.event or payload.path.
  • start/end time — from ts and optional dwell (x-dwellMs).
  • attributes — ujhst.path, ujhst.ref, and any x-* keys (e.g., ujhst.x-step, ujhst.x-journey, ujhst.x-catalog, ujhst.x-variant).

When exporting to OTLP, controllers MUST ensure the truncation does not create cross-session linkability beyond the session chain and MUST NOT enrich with stable user identifiers.
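
The mapping above can be expressed as a plain dictionary transform. This sketch deliberately avoids any OpenTelemetry SDK calls; the to_span function and the output key names are hypothetical and only mirror common span field names, and x-dwellMs is assumed to be an integer or a numeric string.

```python
def to_span(event: dict, root_curr: str) -> dict:
    """Map one UJHST event to a span-like dict; root_curr is the root event's curr."""
    payload = event["payload"]
    attributes = {"ujhst.path": payload["path"], "ujhst.ref": payload["ref"]}
    attributes.update(
        {f"ujhst.{k}": v for k, v in payload.items() if k.startswith("x-")}
    )
    start_ns = payload["ts"] * 1_000_000               # epoch ms -> epoch ns
    dwell_ms = int(payload.get("x-dwellMs", 0))        # optional dwell time
    return {
        "trace_id": root_curr[:32],                    # first 16 bytes of root curr
        "span_id": event["curr"][:16],                 # first 8 bytes of curr
        "parent_span_id": event["prev"][:16],          # "" for the root event
        "name": payload.get("event") or payload["path"],
        "start_time_unix_nano": start_ns,
        "end_time_unix_nano": start_ns + dwell_ms * 1_000_000,
        "attributes": attributes,
    }
```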

Privacy Requirements

  1. No persistent identifiers. Clients MUST NOT use cookies, localStorage, IndexedDB, or other persistent state. Session bridging MAY use sessionStorage.
  2. Seed confidentiality. The session seed S MUST remain client-confidential. Servers MUST NOT derive, request, or compute event hashes on behalf of the client.
  3. Data minimization. ts SHOULD be rounded; ref MUST be origin-only; paths MUST NOT encode PII; values of x-* MUST NOT contain PII.
  4. Transparency and control. Deployments MUST disclose UJHST usage and SHOULD honor opt-out mechanisms, including Do Not Track.
  5. Retention & purpose limitation. Servers SHOULD define short retention for raw edges and favor aggregates. Data MUST be used solely for anonymous journey analytics and UX improvement.
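
The data-minimization rules in item 3 might be applied before hashing as sketched below. The helper names round_ts and origin_only are hypothetical, and the one-second bucket is an assumption, since the draft only says that ts SHOULD be rounded without fixing a granularity.

```python
from urllib.parse import urlsplit

TS_BUCKET_MS = 1000  # assumed granularity; not specified by the draft


def round_ts(ts_ms: int, bucket_ms: int = TS_BUCKET_MS) -> int:
    # Round down to the start of the bucket.
    return ts_ms - (ts_ms % bucket_ms)


def origin_only(referrer: str) -> str:
    # Keep scheme, host and explicit port; drop userinfo, path, query and fragment.
    try:
        parts = urlsplit(referrer)
        port = parts.port
    except ValueError:
        return ""  # malformed referrer
    if not parts.scheme or not parts.hostname:
        return ""  # empty or relative referrer
    return f"{parts.scheme}://{parts.hostname}" + (f":{port}" if port else "")
```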

Security Considerations

  • Collision resistance. HMAC-SHA-256 produces 256-bit outputs, so accidental collisions between curr values are negligible; without the seed, a third party cannot feasibly compute a valid successor hash or rewrite a chain.
  • Integrity and trust. Servers cannot verify client honesty; UJHST is observational. Implement anomaly detection and rate limiting.
  • Replay. Treat curr as unique; ignore duplicates.
  • Transport. Use HTTPS/WSS; enforce modern TLS; set CSP to reduce script injection risk.
  • Control plane separation. The Journey Catalog defines semantics but MUST NOT introduce identifiers that enable cross-session re-identification.

JSON Schema (Informative)

The control-plane update does not change the wire format, so the schema below remains valid. Deployments may additionally lint x-* fields against token patterns and maximum lengths.

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://example.org/schemas/ujhst-event.json",
  "title": "UJHST Event v1",
  "type": "object",
  "required": ["v", "curr", "prev", "payload"],
  "properties": {
    "v": { "const": 1 },
    "curr": { "type": "string", "pattern": "^[0-9a-f]{64}$" },
    "prev": { "type": "string", "pattern": "^$|^[0-9a-f]{64}$" },
    "payload": {
      "type": "object",
      "required": ["path", "event", "ts", "ref"],
      "properties": {
        "path": { "type": "string", "minLength": 1 },
        "event": { "type": "string", "minLength": 1 },
        "ts": { "type": "integer", "minimum": 0 },
        "ref": { "type": "string" }
      },
      "additionalProperties": true
    }
  },
  "additionalProperties": false
}

Recommended lint (non-normative): keys starting with x- have values of type string, ≤256 chars, matching ^[A-Za-z0-9._:@-]+$. The @ is included so that versioned step keys such as onboarding.enter_email@v3 pass.
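
The lint could be implemented along these lines. The lint_extensions function is a hypothetical name, and the pattern used here includes @ as an assumption so that the draft's own example step key (onboarding.enter_email@v3) is accepted.

```python
import re

# Token pattern for x-* values; "@" is included by assumption (see lead-in).
TOKEN = re.compile(r"[A-Za-z0-9._:@-]+")
MAX_LEN = 256


def lint_extensions(payload: dict) -> list:
    """Return the x-* keys whose values fail the recommended lint."""
    bad = []
    for key, value in payload.items():
        if not key.startswith("x-"):
            continue  # required members are covered by the schema, not this lint
        if (not isinstance(value, str) or len(value) > MAX_LEN
                or not TOKEN.fullmatch(value)):
            bad.append(key)
    return bad
```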