Parrhesia Relay Sync

1. Purpose

This document defines the Parrhesia proposal for relay-to-relay event synchronization.

It is intentionally transport-focused:

  • manage remote relay peers,
  • catch up on matching events,
  • keep a live stream open,
  • expose health and basic stats.

It does not define application data semantics.

Parrhesia syncs Nostr events. Callers decide which events matter and how to apply them.


2. Boundary

Parrhesia is responsible for:

  • storing and validating events,
  • querying and streaming events,
  • running outbound sync workers against remote relays,
  • tracking peer configuration, worker health, and sync counters,
  • exposing peer management through Parrhesia.API.Sync.

Parrhesia is not responsible for:

  • resource mapping,
  • trusted node allowlists for an app profile,
  • mutation payload validation beyond normal event validation,
  • conflict resolution,
  • replay winner selection,
  • database upsert/delete semantics.

For Tribes, those remain in TRIBES-NOSTRSYNC and AshNostrSync.


3. Security Foundation

Default posture

The baseline posture for sync traffic is:

  • no access to sync events by default,
  • no implicit trust from ordinary relay usage,
  • no reliance on plaintext confidentiality from public relays.

For the first implementation, Parrhesia should protect sync data primarily with:

  • authenticated server identities,
  • ACL-gated read and write access,
  • TLS with certificate pinning for outbound peers.

Server identity

Parrhesia owns a low-level server identity used for relay-to-relay authentication.

This identity is separate from:

  • TLS endpoint identity,
  • application event author pubkeys.

Recommended model:

  • Parrhesia has one local server-auth pubkey,
  • sync peers authenticate as server-auth pubkeys,
  • ACL grants are bound to those authenticated server-auth pubkeys,
  • application-level writer trust remains outside Parrhesia.

Identity lifecycle:

  1. use configured/imported key if provided,
  2. otherwise use persisted local identity,
  3. otherwise generate once during initial startup and persist it.

Private key export should not be supported.
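The lifecycle above can be sketched as a simple resolution order. The module, function, and option names here are illustrative assumptions, not a fixed API; only the three-step precedence comes from this document:

```elixir
defmodule Parrhesia.Sync.IdentitySketch do
  @moduledoc false
  # Sketch of the identity lifecycle: a configured/imported key wins,
  # then a persisted local identity, then a freshly generated key.
  # `opts`, `persisted`, and the return shapes are all illustrative.

  def resolve(opts, persisted) do
    cond do
      # 1. use a configured/imported key if provided
      key = opts[:server_auth_key] -> {:configured, key}
      # 2. otherwise use the persisted local identity
      persisted != nil -> {:persisted, persisted}
      # 3. otherwise generate once and persist (persistence elided here)
      true -> {:generated, generate_key()}
    end
  end

  # Stand-in for real secp256k1 key generation.
  defp generate_key, do: :crypto.strong_rand_bytes(32)
end
```

The real implementation would persist the generated key as part of step 3 and never expose it afterwards.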

ACLs

Sync traffic should use a real ACL layer, not moderation allowlists.

Current implementation note:

  • Parrhesia already has storage-backed moderation state such as allowed_pubkeys and blocked_ips,
  • that is not the sync ACL model,
  • sync protection must be enforced in the active websocket/query/count/negentropy/write path, not inferred from management tables alone.

Initial ACL model:

  • principal: authenticated pubkey,
  • capabilities: sync_read, sync_write,
  • match: event/filter shape such as kinds: [5000] and namespace tags.

This is enough for now; we do not yet need separate user and server ACL models.

A sync peer is simply an authenticated principal with sync capabilities.
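A minimal sketch of that rule shape and check follows. The rule fields mirror the principal/capabilities/match bullets above; the module name, error-free boolean return, and matching details are assumptions:

```elixir
defmodule Parrhesia.Sync.AclSketch do
  @moduledoc false
  # A rule binds an authenticated principal (pubkey) to sync
  # capabilities, optionally scoped to an event shape (kinds + "#r"
  # namespace tags, as in the match example above).

  def allow?(rules, pubkey, capability, event) do
    Enum.any?(rules, fn rule ->
      rule.principal == pubkey and
        capability in rule.capabilities and
        matches?(rule.match, event)
    end)
  end

  defp matches?(nil, _event), do: true

  defp matches?(match, event) do
    kind_ok = match["kinds"] == nil or event["kind"] in match["kinds"]

    ns_ok =
      match["#r"] == nil or
        Enum.any?(event["tags"], fn
          ["r", value | _] -> value in match["#r"]
          _ -> false
        end)

    kind_ok and ns_ok
  end
end
```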

TLS pinning

Each outbound sync peer must include pinned TLS material.

Recommended pin type:

  • SPKI SHA-256 pins

Multiple pins should be allowed to support certificate rotation.
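Checking a peer certificate against SPKI SHA-256 pins can be sketched with OTP's :public_key. Treat this as an untested sketch: the record positions follow RFC 5280 field order and the pin encoding (base64 of the SHA-256 over the DER-encoded SubjectPublicKeyInfo) is the conventional one, but both should be verified against the deployed OTP release:

```elixir
defmodule Parrhesia.Sync.PinSketch do
  @moduledoc false
  # Untested sketch: extract the DER-encoded SubjectPublicKeyInfo from
  # a peer certificate, hash it, and compare against configured pins.

  def spki_sha256(cert_der) do
    {:Certificate, tbs, _sig_alg, _sig} =
      :public_key.pkix_decode_cert(cert_der, :plain)

    # subjectPublicKeyInfo is the 7th field of TBSCertificate (RFC 5280).
    spki = elem(tbs, 7)
    spki_der = :public_key.der_encode(:SubjectPublicKeyInfo, spki)
    Base.encode64(:crypto.hash(:sha256, spki_der))
  end

  # Any configured pin may match, which leaves room for rotation.
  def pinned?(cert_der, pins) do
    hash = spki_sha256(cert_der)
    Enum.any?(pins, fn %{type: :spki_sha256, value: value} -> value == hash end)
  end
end
```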


4. Sync Model

Each configured sync server represents one outbound worker managed by Parrhesia.

Implementation note:

  • Khatru-style relay designs benefit from explicit runtime stages,
  • Parrhesia sync should therefore plug into clear internal phases for connection admission, auth, query/count, subscription, negentropy, publish, and fanout,
  • this should stay a runtime refactor, not become extra sync semantics.

Minimum behavior:

  1. connect to the remote relay,
  2. run an initial catch-up query for the configured filters,
  3. ingest received events into the local relay through the normal API path,
  4. switch to a live subscription for the same filters,
  5. reconnect with backoff when disconnected.

The worker treats filters as opaque Nostr filters. It does not interpret app payloads.
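Step 5's reconnect policy is not fixed by this document; a capped exponential backoff is one reasonable sketch (the constants here are assumptions):

```elixir
defmodule Parrhesia.Sync.BackoffSketch do
  @moduledoc false
  @base_ms 1_000
  @max_ms 60_000

  # Delay before reconnect attempt `n` (0-based): 1s, 2s, 4s, ...
  # capped at 60s. Jitter could be added on top.
  def delay_ms(attempt) when attempt >= 0 do
    min(@base_ms * Integer.pow(2, attempt), @max_ms)
  end
end
```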

Sync modes

Parrhesia supports two catch-up modes:

  • :req_stream — catch-up via REQ + overlap window, then live REQ subscription.
  • :negentropy_first — attempt NIP-77 negentropy catch-up first, then fetch missing event ids via REQ, then switch to live REQ subscription. Falls back to :req_stream behavior when negentropy is unavailable or fails.

This keeps deployment flexibility while allowing bandwidth-efficient catch-up on trusted links.
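For :req_stream, the catch-up REQ derives its "since" from the last observed sync point minus the overlap window. A sketch (overlap_window_seconds follows section 6; the function name and checkpoint representation are assumptions):

```elixir
defmodule Parrhesia.Sync.CatchupSketch do
  @moduledoc false
  # Widen the catch-up window backwards by `overlap_window_seconds` so
  # events near the last checkpoint are re-requested; duplicate-event
  # handling downstream keeps this idempotent.
  def catchup_filter(filter, last_synced_at_unix, overlap_window_seconds) do
    Map.put(filter, "since", max(last_synced_at_unix - overlap_window_seconds, 0))
  end
end
```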

Topology and convergence semantics

Parrhesia sync is intentionally a relay sync foundation, not an application convergence engine.

Operationally:

  • use sync peers to define topology (mesh, hub/spoke, or staged rollout),
  • keep per-peer filters narrow and explicit,
  • treat sync health as transport/control-plane health, not proof of app-level convergence.

Delivery expectations:

  • persisted events: practical eventual convergence via reconnect + catch-up,
  • ephemeral events: best-effort only,
  • no global total ordering guarantee across nodes.

NIP-77

Parrhesia now has a reusable relay-side NIP-77 engine:

  • proper NEG-OPEN / NEG-MSG / NEG-CLOSE / NEG-ERR framing,
  • a reusable negentropy codec and reconciliation engine,
  • bounded local (created_at, id) snapshot enumeration for matching filters,
  • connection/session integration with policy checks and resource limits.

That means NIP-77 can be used for bandwidth-efficient catch-up between trusted nodes.

The sync worker now exposes this as configuration (mode: :req_stream | :negentropy_first) so deployments can choose the operational tradeoff per peer.


5. API Surface

Primary control plane: Parrhesia.API.Sync.

Its functions (put_server/2, server_stats/2, sync_stats/1, sync_health/1) are in-process. HTTP management may expose them through Parrhesia.API.Admin or by routing directly to Parrhesia.API.Sync.


6. Server Specification

put_server/2 is an upsert.

Suggested server shape:

%{
  id: "tribes-primary",
  url: "wss://relay-a.example/relay",
  enabled?: true,
  auth_pubkey: "<remote-server-auth-pubkey>",
  mode: :negentropy_first,
  filters: [
    %{
      "kinds" => [5000],
      "#r" => ["tribes.accounts.user", "tribes.chat.tribe"]
    }
  ],
  overlap_window_seconds: 300,
  relay_info_mode: :diagnostic,
  auth: %{
    type: :nip42,
    mode: :on_challenge
  },
  tls: %{
    mode: :required,
    hostname: "relay-a.example",
    ca_certfile: "/etc/tribes/sync-ca.pem",
    client_certfile: "/etc/tribes/node.crt",
    client_keyfile: "/etc/tribes/node.key",
    pins: [
      %{type: :spki_sha256, value: "<pin-a>"}
    ]
  },
  metadata: %{}
}

Required fields:

  • id
  • url
  • auth_pubkey
  • filters
  • tls

Recommended fields:

  • enabled?
  • mode
  • overlap_window_seconds
  • relay_info_mode
  • auth
  • metadata

Rules:

  • id must be stable and unique locally.
  • url is the remote relay websocket URL.
  • auth_pubkey is the expected remote server-auth pubkey.
  • filters must be valid NIP-01 filters.
  • filters are owned by the caller; Parrhesia only validates filter shape.
  • mode supports :req_stream and :negentropy_first; it defaults to :req_stream.
  • relay_info_mode supports :required, :diagnostic, and :disabled; it defaults to :required.
  • auth.mode supports :on_challenge and :disabled; it defaults to :on_challenge.
  • tls.mode defaults to :required.
  • tls.pins are optional and may be combined with dedicated CA trust and client certs.
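The required fields and documented defaults above can be sketched as a normalization step. The error shapes and module name are assumptions; only mode and relay_info_mode defaults are fixed by the rules above (enabled?, overlap_window_seconds, and the nested auth/tls defaults are left to the real implementation):

```elixir
defmodule Parrhesia.Sync.ServerSpecSketch do
  @moduledoc false
  @required [:id, :url, :auth_pubkey, :filters, :tls]
  # Top-level defaults documented in the rules above.
  @defaults %{mode: :req_stream, relay_info_mode: :required}

  def normalize(server) do
    case Enum.reject(@required, &Map.has_key?(server, &1)) do
      [] -> {:ok, Map.merge(@defaults, server)}
      missing -> {:error, {:missing_fields, missing}}
    end
  end
end
```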

7. Runtime State

Each server should have both configuration and runtime status.

Suggested runtime fields:

%{
  server_id: "tribes-primary",
  state: :running,
  connected?: true,
  last_connected_at: ~U[2026-03-16 10:00:00Z],
  last_disconnected_at: nil,
  last_sync_started_at: ~U[2026-03-16 10:00:00Z],
  last_sync_completed_at: ~U[2026-03-16 10:00:02Z],
  last_event_received_at: ~U[2026-03-16 10:12:45Z],
  last_eose_at: ~U[2026-03-16 10:00:02Z],
  reconnect_attempts: 0,
  last_error: nil
}

Parrhesia should keep this state generic. It is about relay sync health, not app state convergence.
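Keeping that snapshot current is a matter of small pure updates. For example, recording a disconnect (field names follow the shape above; the :reconnecting state value and timestamp injection are assumptions):

```elixir
defmodule Parrhesia.Sync.RuntimeSketch do
  @moduledoc false
  # Record a disconnect on the runtime snapshot shown above. `now` is
  # passed in rather than read from the clock for testability.
  def record_disconnect(runtime, reason, now) do
    %{runtime |
      state: :reconnecting,
      connected?: false,
      last_disconnected_at: now,
      reconnect_attempts: runtime.reconnect_attempts + 1,
      last_error: reason}
  end
end
```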


8. Stats and Health

Per-server stats

server_stats/2 should return basic counters such as:

  • events_received
  • events_accepted
  • events_duplicate
  • events_rejected
  • query_runs
  • subscription_restarts
  • reconnects
  • last_remote_eose_at
  • last_error

Aggregate sync stats

sync_stats/1 should summarize:

  • total configured servers,
  • enabled servers,
  • running servers,
  • connected servers,
  • aggregate event counters,
  • aggregate reconnect count.
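The aggregate counters are a fold over the per-server stats. A sketch covering a subset of the counters named above (the module name and map shapes are assumptions):

```elixir
defmodule Parrhesia.Sync.StatsSketch do
  @moduledoc false
  # A subset of the per-server counters listed above.
  @counters [:events_received, :events_accepted, :events_duplicate,
             :events_rejected, :reconnects]

  # Sum the numeric counters across all configured servers, treating
  # missing counters as zero.
  def aggregate(per_server_stats) do
    Enum.reduce(per_server_stats, Map.from_keys(@counters, 0), fn stats, acc ->
      Map.new(acc, fn {k, total} -> {k, total + Map.get(stats, k, 0)} end)
    end)
  end
end
```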

Health

sync_health/1 should be operator-oriented, for example:

%{
  "status" => "degraded",
  "servers_total" => 3,
  "servers_connected" => 2,
  "servers_failing" => [
    %{"id" => "tribes-secondary", "reason" => "connection_refused"}
  ]
}

This is intentionally simple. It should answer “is sync working?” without pretending to prove application convergence.
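Deriving that status can stay equally simple. The labels mirror the example above; the exact rule is an assumption:

```elixir
defmodule Parrhesia.Sync.HealthSketch do
  @moduledoc false
  # "ok" when every configured server is connected, "failing" when none
  # is, "degraded" otherwise.
  def status(servers_total, servers_connected) do
    cond do
      servers_total == 0 or servers_connected == servers_total -> "ok"
      servers_connected == 0 -> "failing"
      true -> "degraded"
    end
  end
end
```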


9. Event Ingest Path

Events received from a remote sync worker should enter Parrhesia through the same ingest path as any other accepted event.

That means:

  1. validate the event,
  2. run normal write policy,
  3. persist or reject,
  4. fan out locally,
  5. rely on duplicate-event behavior for idempotency.

This avoids a second ingest path with divergent behavior.

Before normal event acceptance, the sync worker should enforce:

  1. pinned TLS validation for the remote endpoint,
  2. remote server-auth identity match,
  3. local ACL grant permitting the peer to perform sync reads and/or writes.
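These pre-checks compose naturally in the order listed. In this sketch the real TLS-pin, identity, and ACL checks are reduced to trivial stand-ins; everything here is illustrative:

```elixir
defmodule Parrhesia.Sync.AdmissionSketch do
  @moduledoc false
  # Run the three pre-checks in order; each yields :ok or a tagged
  # error, and the first failure wins.
  def admit(%{pins_ok?: pins_ok?, auth_pubkey: pk}, server, acl_ok?) do
    with :ok <- check(pins_ok?, :tls_pin_mismatch),
         :ok <- check(pk == server.auth_pubkey, :auth_pubkey_mismatch),
         :ok <- check(acl_ok?, :acl_denied) do
      :ok
    end
  end

  defp check(true, _reason), do: :ok
  defp check(false, reason), do: {:error, reason}
end
```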

The sync worker may attach request-context metadata such as:

%Parrhesia.API.RequestContext{
  caller: :sync,
  peer_id: "tribes-primary",
  metadata: %{sync_server_id: "tribes-primary"}
}

Recommended additional context when available:

  • remote_ip
  • subscription_id

This context is for telemetry, policy, and audit only. It must not become app sync semantics.


10. Persistence

Parrhesia persists enough sync control-plane state to survive restart:

  • local server identity reference,
  • configured ACL rules for sync principals,
  • configured sync servers (sync_servers table),
  • per-server sync runtime snapshot (sync_server_runtime table), including cursor/watermark and basic health counters.

This persistence is controlled by :sync.persist_state? (PARRHESIA_SYNC_PERSIST_STATE) and is enabled by default.
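As a config fragment this might look like the following. The application name and keyword shape are assumptions; only the :sync.persist_state? key and the PARRHESIA_SYNC_PERSIST_STATE variable come from this document:

```elixir
# config/runtime.exs (sketch)
config :parrhesia, :sync,
  persist_state?:
    System.get_env("PARRHESIA_SYNC_PERSIST_STATE", "true") in ~w(true 1)
```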

Parrhesia does not persist application replay heads or winner state. That remains in the embedding application.


11. Relationship to Runtime Features

Cross-node sync data plane

Parrhesia provides cross-node relay sync primitives through Parrhesia.API.Sync workers.

Local in-node fanout remains process-local (Parrhesia.Fanout.Dispatcher + subscription index). Cross-node event convergence is handled by authenticated relay-to-relay sync.

Management stats

The current admin stats surface is relay-global and minimal.

Sync adds a new dimension:

  • peer config,
  • worker state,
  • per-peer counters,
  • sync health summary.

That should be exposed without coupling it to app-specific sync semantics.


12. Tribes Usage

For Tribes, AshNostrSync should be able to:

  1. rely on Parrhesia’s local server identity,
  2. register one or more remote relays with Parrhesia.API.Sync.put_server/2,
  3. grant sync ACLs for trusted server-auth pubkeys,
  4. provide narrow Nostr filters for kind: 5000,
  5. observe sync health and counters,
  6. consume events via the normal local Parrhesia ingest/query/stream surface.
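Steps 2 and 4 together might look like the sketch below, reusing the section 6 server shape. The `ctx` argument stands in for whatever in-process context Parrhesia.API.Sync expects, and the bound variables are placeholders:

```elixir
# Sketch: AshNostrSync registering one remote relay for kind-5000 sync.
# `ctx`, `remote_server_auth_pubkey`, and `pin` are illustrative.
Parrhesia.API.Sync.put_server(ctx, %{
  id: "tribes-primary",
  url: "wss://relay-a.example/relay",
  auth_pubkey: remote_server_auth_pubkey,
  filters: [%{"kinds" => [5000], "#r" => ["tribes.accounts.user"]}],
  tls: %{mode: :required, pins: [%{type: :spki_sha256, value: pin}]}
})
```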

Tribes should not need Parrhesia to know:

  • what a resource namespace means,
  • which node pubkeys are trusted for Tribes,
  • how to resolve conflicts,
  • how to apply an upsert or delete.

That is the key boundary.