# `EtherCAT`
[🔗](https://github.com/sid2baker/ethercat/blob/main/lib/ethercat.ex#L1)

Public API for the EtherCAT master runtime.

`EtherCAT` is the entry point for a master-owned session lifecycle:

- `Master` owns startup, activation-blocked startup, and recovery policy
- `Domain` owns cyclic LRW exchange and the logical PDO image
- `Slave` owns ESM transitions and slave-local configuration
- `DC` owns distributed-clock initialization and runtime maintenance
- `Bus` serializes all frame I/O

The runtime footprint is intentionally small: no NIFs and no kernel module.
The bus uses raw sockets, sysfs, and OTP directly, and `:telemetry` is the
only runtime Hex dependency.

Public session lifecycle is exposed through `state/0`:

- `:idle`
- `:discovering`
- `:awaiting_preop`
- `:preop_ready`
- `:deactivated`
- `:operational`
- `:activation_blocked`
- `:recovering`

`await_running/1` waits for a usable session. Startup/activation paths drain
startup bus traffic before reporting ready, so the first mailbox or OP
exchange starts from a clean transport state. `await_operational/1` waits for
cyclic OP. Per-slave health is exposed through `slaves/0`.

## Runtime Lifecycle

This is the user-facing `state/0` lifecycle. It shows the main session flow
without trying to encode every exact `Master` state transition in one dense
graph.

```mermaid
flowchart TD
    A[start/1] --> B[discovering]
    B --> C[awaiting_preop]
    B -->|startup fails or stop/0| Z[idle]
    C -->|configured slaves become usable in PREOP| D{activation requested?}
    C -->|timeout, fatal startup failure, or stop/0| Z

    D -->|no| E[preop_ready]
    D -->|yes, target reached| F[operational]
    D -->|yes, transition incomplete| G[activation_blocked]

    E -->|activate/0 succeeds| F
    E -->|activate/0 is incomplete| G

    F -->|deactivate/0 to SAFEOP| H[deactivated]
    F -->|deactivate/0 to PREOP| E

    E -->|critical runtime fault| I[recovering]
    H -->|critical runtime fault| I
    F -->|critical runtime fault| I
    G -->|runtime faults remain after activation retry| I

    G -->|retry reaches OP| F
    G -->|retry settles in SAFEOP| H
    G -->|retry settles in PREOP| E

    I -->|faults cleared, target OP| F
    I -->|faults cleared, target SAFEOP| H
    I -->|faults cleared, target PREOP| E

    E -->|stop/0 or fatal exit| Z
    H -->|stop/0 or fatal exit| Z
    F -->|stop/0 or fatal exit| Z
    G -->|stop/0 or fatal exit| Z
    I -->|recovery fails or stop/0| Z
```

Physical link loss normally appears here as a runtime `:recovering` transition,
not an immediate return to `:idle`. `:idle` is reserved for explicit stop,
startup failure, bus-process exit, or fatal policy. For the exact master-side
state semantics, read `EtherCAT.Master`.
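
The lifecycle above can be folded into a few coarse buckets for supervision or UI code. The following is a minimal sketch, not part of the library: `SessionStateSketch` and its bucket names are hypothetical, and treating `:activation_blocked` as not usable is an assumption drawn from the `await_running/1` description.

```elixir
defmodule SessionStateSketch do
  # Hypothetical helper; not part of the EtherCAT API.

  # States assumed to count as a usable session (see await_running/1).
  @usable [:preop_ready, :deactivated, :operational]
  # States that are expected to move on without caller action.
  @in_progress [:discovering, :awaiting_preop, :recovering]

  def classify(state) when state in @usable, do: :usable
  def classify(state) when state in @in_progress, do: :in_progress
  def classify(:activation_blocked), do: :blocked
  def classify(:idle), do: :stopped

  # Unwraps the {:ok, state} | {:error, reason} shape returned by state/0.
  def classify_result({:ok, state}), do: classify(state)
  def classify_result({:error, reason}), do: {:query_failed, reason}
end
```

With a running master, `SessionStateSketch.classify_result(EtherCAT.state())` would give one of the buckets or `{:query_failed, reason}`.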

## Startup Sequence

```mermaid
sequenceDiagram
    autonumber
    participant App
    participant EtherCAT
    participant Bus
    participant Slaves
    participant DC

    App->>EtherCAT: start/1
    EtherCAT->>Bus: discover ring, assign stations, verify link
    opt DC is configured
        EtherCAT->>DC: initialize clocks
    end
    EtherCAT->>Slaves: start configured slaves
    Slaves->>Bus: reach PREOP through SII, mailbox, and PDO setup
    Slaves-->>EtherCAT: report ready at PREOP
    opt activation is requested and possible
        EtherCAT->>Bus: start cyclic domains
        opt DC runtime is available
            EtherCAT->>DC: start runtime maintenance
            opt DC lock is required
                EtherCAT->>DC: wait for lock
            end
        end
        EtherCAT->>Slaves: request SAFEOP then OP
    end
    EtherCAT-->>App: preop_ready, activation_blocked, or operational
```

## Usage

    EtherCAT.start(
      interface: "eth0",
      dc: %EtherCAT.DC.Config{
        cycle_ns: 1_000_000,
        await_lock?: true,
        lock_policy: :recovering
      },
      domains: [
        %EtherCAT.Domain.Config{id: :main, cycle_time_us: 1_000}
      ],
      slaves: [
        %EtherCAT.Slave.Config{name: :coupler},
        %EtherCAT.Slave.Config{
          name: :sensor,
          driver: MyApp.EL1809,
          process_data: {:all, :main}
        },
        %EtherCAT.Slave.Config{
          name: :valve,
          driver: MyApp.EL2809,
          process_data: {:all, :main}
        }
      ]
    )

    :ok = EtherCAT.await_running()

    EtherCAT.subscribe(:sensor, :ch1)   # receive {:ethercat, :signal, :sensor, :ch1, value}
    EtherCAT.write_output(:valve, :ch1, 1)

    EtherCAT.deactivate()
    EtherCAT.stop()

## Dynamic PREOP Configuration

    EtherCAT.start(
      interface: "eth0",
      domains: [%EtherCAT.Domain.Config{id: :main, cycle_time_us: 1_000}]
    )

    :ok = EtherCAT.await_running()

    :ok =
      EtherCAT.configure_slave(
        :slave_1,
        driver: MyApp.EL1809,
        process_data: {:all, :main},
        target_state: :op
      )

    :ok = EtherCAT.activate()
    :ok = EtherCAT.await_operational()

## Sub-modules

`EtherCAT.Master`, `EtherCAT.Slave`, `EtherCAT.Domain`, `EtherCAT.DC`,
`EtherCAT.Bus` — public runtime boundaries plus direct frame transactions.

# `domain_cycle_health`

```elixir
@type domain_cycle_health() :: :healthy | {:invalid, term()}
```

Cycle health reported by `domain_info/1`.

# `domain_freshness_info`

```elixir
@type domain_freshness_info() :: %{
  state: :not_ready | :fresh | :stale,
  refreshed_at_us: integer() | nil,
  age_us: non_neg_integer() | nil,
  stale_after_us: pos_integer()
}
```

Domain freshness snapshot reported in `domain_info/1`.

# `domain_info`

```elixir
@type domain_info() :: %{
  id: EtherCAT.Domain.domain_id(),
  cycle_time_us: pos_integer(),
  state: domain_runtime_state(),
  cycle_count: non_neg_integer(),
  miss_count: non_neg_integer(),
  total_miss_count: non_neg_integer(),
  cycle_health: domain_cycle_health(),
  logical_base: non_neg_integer(),
  image_size: non_neg_integer(),
  expected_wkc: non_neg_integer(),
  freshness: domain_freshness_info(),
  last_cycle_started_at_us: integer() | nil,
  last_cycle_completed_at_us: integer() | nil,
  last_valid_cycle_at_us: integer() | nil,
  last_invalid_cycle_at_us: integer() | nil,
  last_invalid_reason: term() | nil
}
```

Detailed snapshot returned by `domain_info/1`.

# `domain_runtime_state`

```elixir
@type domain_runtime_state() :: :open | :cycling | :stopped
```

Runtime state reported in `domain_info/1`.

# `domain_summary`

```elixir
@type domain_summary() :: {EtherCAT.Domain.domain_id(), pos_integer(), pid()}
```

Compact domain summary returned by `domains/0`.

# `master_query_error`

```elixir
@type master_query_error() ::
  {:error, :not_started | :timeout | {:server_exit, term()}}
```

Local wrapper errors returned when a synchronous master query cannot complete.

These are transport-level API failures, not session lifecycle states.

# `master_query_result`

```elixir
@type master_query_result(value) :: {:ok, value} | master_query_error()
```

Successful query value wrapped with `:ok`, or a local master query error.
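
Because every synchronous master query shares this shape, callers can handle the three local errors in one place. A minimal sketch follows; `QueryResultSketch` and its message strings are hypothetical and only illustrate pattern matching on the documented tuples.

```elixir
defmodule QueryResultSketch do
  # Hypothetical helper; not part of the EtherCAT API.

  # Turns a master_query_result into :ok or a human-readable explanation.
  def explain({:ok, _value}), do: :ok
  def explain({:error, :not_started}), do: "master process does not exist"
  def explain({:error, :timeout}), do: "local master call timed out"

  def explain({:error, {:server_exit, reason}}) do
    "master exited mid-call: #{inspect(reason)}"
  end
end
```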

# `session_state`

```elixir
@type session_state() ::
  :idle
  | :discovering
  | :awaiting_preop
  | :preop_ready
  | :deactivated
  | :operational
  | :activation_blocked
  | :recovering
```

Public master session states returned by `state/0` once the local query
succeeds.

# `signal_direction`

```elixir
@type signal_direction() :: :input | :output
```

Direction of a registered process-data signal.

# `slave_al_state`

```elixir
@type slave_al_state() :: :init | :preop | :safeop | :op
```

AL state reported in `slave_info/1`.

# `slave_attachment_summary`

```elixir
@type slave_attachment_summary() :: %{
  domain: EtherCAT.Domain.domain_id(),
  sm_index: non_neg_integer(),
  direction: signal_direction(),
  logical_address: non_neg_integer() | nil,
  sm_size: non_neg_integer() | nil,
  signal_count: non_neg_integer(),
  signals: [atom()]
}
```

SyncManager attachment summary reported in `slave_info/1`.

# `slave_esc_info`

```elixir
@type slave_esc_info() :: %{
  fmmu_count: non_neg_integer(),
  sm_count: non_neg_integer()
}
```

ESC capability snapshot reported in `slave_info/1`.

# `slave_identity`

```elixir
@type slave_identity() :: %{
  vendor_id: non_neg_integer(),
  product_code: non_neg_integer(),
  revision: non_neg_integer(),
  serial_number: non_neg_integer()
}
```

Identity snapshot reported in `slave_info/1`.

# `slave_info`

```elixir
@type slave_info() :: %{
  name: atom(),
  station: non_neg_integer(),
  al_state: slave_al_state(),
  identity: slave_identity() | nil,
  esc: slave_esc_info() | nil,
  driver: module() | nil,
  coe: boolean(),
  available_fmmus: non_neg_integer() | nil,
  used_fmmus: non_neg_integer(),
  attachments: [slave_attachment_summary()],
  pdo_health: map(),
  signals: [slave_signal_summary()],
  configuration_error: term() | nil
}
```

Detailed snapshot returned by `slave_info/1`.

# `slave_signal_summary`

```elixir
@type slave_signal_summary() :: %{
  name: atom(),
  domain: EtherCAT.Domain.domain_id(),
  direction: signal_direction(),
  sm_index: non_neg_integer(),
  bit_offset: non_neg_integer(),
  bit_size: pos_integer()
}
```

Signal registration summary reported in `slave_info/1`.

# `slave_summary`

```elixir
@type slave_summary() :: %{
  name: atom(),
  station: non_neg_integer(),
  server: :gen_statem.server_ref(),
  pid: pid() | nil,
  fault: term() | nil
}
```

Compact slave summary returned by `slaves/0`.
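
Since per-slave health is surfaced through the `:fault` field, a caller may want to pull out only the slaves that report one. This is a sketch under assumptions: `SlaveFaultSketch` is a hypothetical helper, and the fault term used in the usage note is invented for illustration.

```elixir
defmodule SlaveFaultSketch do
  # Hypothetical helper; works on the result shape of slaves/0.

  # Returns [{name, fault}] for every summary whose :fault is not nil.
  def faulted({:ok, summaries}) do
    for %{name: name, fault: fault} <- summaries, fault != nil do
      {name, fault}
    end
  end

  # Local query errors are passed through unchanged.
  def faulted({:error, _reason} = error), do: error
end
```

For example, a list where only `:valve` carries a made-up fault such as `{:down, :no_response}` would yield `[valve: {:down, :no_response}]`.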

# `activate`

```elixir
@spec activate() :: :ok | {:error, term()}
```

Start cyclic operation after dynamic PREOP configuration.

This starts the DC runtime, starts all domains cycling, and advances all slaves
whose `target_state` is `:op`.

# `await_dc_locked`

```elixir
@spec await_dc_locked(timeout_ms :: pos_integer()) :: :ok | {:error, term()}
```

Wait for DC lock.

Returns `:ok` once the active DC runtime reports `:locked`.

# `await_operational`

```elixir
@spec await_operational(timeout_ms :: pos_integer()) :: :ok | {:error, term()}
```

Block until the master reaches operational cyclic runtime, then return `:ok`.

This is stricter than `await_running/1`: `:preop_ready` and `:deactivated`
are not enough.

# `await_running`

```elixir
@spec await_running(timeout_ms :: pos_integer()) :: :ok | {:error, term()}
```

Block until the master reaches a usable session state, then return `:ok`.

Returns `{:error, :timeout}` if startup does not complete within `timeout_ms` milliseconds.
Returns `{:error, :not_started}` if `start/1` has not been called.
Returns startup degradation or runtime recovery errors if the session is not
currently usable.

# `bus`

```elixir
@spec bus() :: master_query_result(EtherCAT.Bus.server() | nil)
```

Return the stable bus server reference for direct frame transactions.

Returns `{:ok, bus}` while the session owns a running bus process.
Returns `{:ok, nil}` if the master process exists but the bus subsystem is
not currently running, such as after the session has settled back to `:idle`.

Returns `{:error, :not_started}` if the master does not exist,
`{:error, :timeout}` if the local master call itself times out,
and `{:error, {:server_exit, reason}}` if the local master dies mid-call.

# `configure_slave`

```elixir
@spec configure_slave(atom(), keyword() | EtherCAT.Slave.Config.t()) ::
  :ok | {:error, term()}
```

Configure a discovered slave while the session is still in PREOP.

This is the dynamic counterpart to providing `%EtherCAT.Slave.Config{}` entries
up front in `start/1`.

# `dc_status`

```elixir
@spec dc_status() :: master_query_result(EtherCAT.DC.Status.t())
```

Return a Distributed Clocks status snapshot for the current session.

Returns `{:ok, status}` on success.

Returns `{:error, :not_started}` if the master process does not exist.
Returns `{:error, :timeout}` if the local master call itself times out.

# `deactivate`

```elixir
@spec deactivate(:safeop | :preop) :: :ok | {:error, term()}
```

Leave OP while keeping the session alive.

`deactivate/0` settles the runtime in SAFEOP by default. Use
`deactivate(:preop)` when you need to re-enter PREOP for reconfiguration.

# `domain_info`

```elixir
@spec domain_info(atom()) ::
  {:ok, domain_info()}
  | {:error, :not_found | :timeout | {:server_exit, term()}}
```

Return a diagnostic snapshot for a domain.

Keys:
  - `:id` — domain atom identifier
  - `:cycle_time_us` — current cycle period in microseconds
  - `:state` — `:open | :cycling | :stopped`
  - `:cycle_count` — successful LRW cycles since last start
  - `:miss_count` — consecutive missed cycles (resets on success)
  - `:total_miss_count` — cumulative missed cycles since last start
  - `:logical_base` — current LRW logical start address for this domain image
  - `:image_size` — PDO image size in bytes
  - `:expected_wkc` — expected working counter for a healthy bus
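  - `:freshness` — cached-input freshness window derived from the domain cycle time

The snapshot also carries `:cycle_health` and the `:last_*` cycle diagnostics listed in the `domain_info()` type.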

## Example

    iex> EtherCAT.domain_info(:main)
    {:ok, %{
      id: :main,
      cycle_time_us: 1_000,
      state: :cycling,
      cycle_count: 12345,
      miss_count: 0,
      total_miss_count: 2,
      logical_base: 0,
      image_size: 4,
      expected_wkc: 3,
      freshness: %{state: :fresh, refreshed_at_us: 1_234_000, age_us: 250, stale_after_us: 3_000}
    }}
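
The counters in this snapshot can be reduced to a couple of quick health indicators. The sketch below is illustrative only: `DomainHealthSketch` is a hypothetical module, and it assumes `:cycle_count` and `:total_miss_count` count disjoint sets of cycles.

```elixir
defmodule DomainHealthSketch do
  # Hypothetical helper; operates on a map shaped like the domain_info/1 snapshot.

  # Fraction of cycles missed since the last start.
  # Assumption: successful and missed cycles are counted separately.
  def miss_ratio(%{cycle_count: 0, total_miss_count: 0}), do: 0.0

  def miss_ratio(%{cycle_count: good, total_miss_count: missed}) do
    missed / (good + missed)
  end

  # True when any documented indicator is off nominal.
  def needs_attention?(%{state: state, miss_count: misses, freshness: %{state: fresh}}) do
    state != :cycling or misses > 0 or fresh != :fresh
  end
end
```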

# `domains`

```elixir
@spec domains() :: master_query_result([domain_summary()])
```

Return `{:ok, [{id, cycle_time_us, pid}]}` for all running domains.

Returns `{:error, :not_started}` if the master does not exist and
`{:error, :timeout}` if the local master call itself times out.

# `download_sdo`

```elixir
@spec download_sdo(atom(), non_neg_integer(), non_neg_integer(), binary()) ::
  :ok | {:error, term()}
```

Download a CoE SDO value to a slave mailbox object entry.

This is a blocking acyclic mailbox transfer and is only valid once the slave
mailbox is configured, typically from PREOP onward.
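
`download_sdo/4` takes the payload as a raw binary, so the caller has to encode the value first. CoE basic data types are little-endian, which Elixir's bitstring syntax expresses directly. The sketch below is a hypothetical helper (`SdoEncodeSketch` is not part of the library), and the object address in the usage note is made up.

```elixir
defmodule SdoEncodeSketch do
  # Hypothetical helper; encodes unsigned integers as little-endian binaries.

  def u8(value) when value in 0..0xFF, do: <<value::unsigned-8>>
  def u16(value) when value in 0..0xFFFF, do: <<value::little-unsigned-16>>
  def u32(value) when value in 0..0xFFFF_FFFF, do: <<value::little-unsigned-32>>
end
```

A call might then look like `EtherCAT.download_sdo(:valve, 0x8000, 0x01, SdoEncodeSketch.u16(500))`, assuming the slave exposes a 16-bit entry at that (invented) index and subindex.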

# `last_failure`

```elixir
@spec last_failure() :: master_query_result(map() | nil)
```

Return the last terminal startup/runtime failure retained after the master
returned to `:idle`.

Returns `{:ok, failure}` on success, where `failure` may be `nil`.

Returns `{:error, :not_started}` if the master does not exist and
`{:error, :timeout}` if the local master call itself times out.

# `read_input`

```elixir
@spec read_input(atom(), atom()) :: {:ok, {term(), integer()}} | {:error, term()}
```

Read the latest decoded input sample for a slave input signal.

Returns `{:ok, {value, refreshed_at_us}}` where `refreshed_at_us` is the last
valid domain refresh time for the cached process-image sample, not a
hardware-edge timestamp.

Returns `{:error, :not_ready}` until the first domain cycle completes and
`{:error, {:stale, details}}` once the cached sample is older than the
domain freshness window.
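
The result variants described above lend themselves to a single pattern match. The following sketch uses a hypothetical `InputReadSketch` module and invented tag names; the shape of the `details` term is not specified here, so it is passed through untouched.

```elixir
defmodule InputReadSketch do
  # Hypothetical helper; maps read_input/2 results to caller-friendly tags.

  def interpret({:ok, {value, refreshed_at_us}}), do: {:fresh, value, refreshed_at_us}
  def interpret({:error, :not_ready}), do: :waiting_for_first_cycle
  def interpret({:error, {:stale, details}}), do: {:stale, details}
  # Any other error (unknown slave, timeout, ...) is reported as a failure.
  def interpret({:error, reason}), do: {:failed, reason}
end
```

In application code this would wrap the real call, for example `InputReadSketch.interpret(EtherCAT.read_input(:sensor, :ch1))`.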

# `reference_clock`

```elixir
@spec reference_clock() ::
  {:ok, %{name: atom() | nil, station: non_neg_integer()}} | {:error, term()}
```

Return the current DC reference clock as `{:ok, %{name: name, station: station}}`.

# `slave_info`

```elixir
@spec slave_info(atom()) ::
  {:ok, slave_info()} | {:error, :not_found | :timeout | {:server_exit, term()}}
```

Return a diagnostic snapshot for a slave.

Keys:
  - `:name` — slave atom name
  - `:station` — assigned bus station address
  - `:al_state` — current ESM state: `:init | :preop | :safeop | :op`
  - `:identity` — `%{vendor_id, product_code, revision, serial_number}` from SII, or `nil`
  - `:esc` — `%{fmmu_count, sm_count}` from ESC base registers, or `nil`
  - `:driver` — driver module in use
  - `:coe` — `true` if the slave has a mailbox (CoE-capable)
  - `:available_fmmus` — FMMUs supported by the ESC, or `nil`
  - `:used_fmmus` — count of active `{domain, SyncManager}` attachments
  - `:attachments` — list of `%{domain, sm_index, direction, logical_address, sm_size, signal_count, signals}`
  - `:signals` — list of `%{name, domain, direction, sm_index, bit_offset, bit_size}` for registered signals
  - `:configuration_error` — last configuration failure term, or `nil`
    Common values are structured tuples such as
    `{:mailbox_config_failed, index, subindex, reason}`.

## Example

    iex> EtherCAT.slave_info(:sensor)
    {:ok, %{
      name: :sensor,
      station: 0x1001,
      al_state: :op,
      identity: %{vendor_id: 0x2, product_code: 0x07111389, revision: 0x00190000, serial_number: 0},
      esc: %{fmmu_count: 3, sm_count: 4},
      driver: MyApp.EL1809,
      coe: false,
      available_fmmus: 3,
      used_fmmus: 1,
      attachments: [
        %{domain: :main, sm_index: 3, direction: :input, logical_address: 0x0000, sm_size: 2, signal_count: 2, signals: [:ch1, :ch2]}
      ],
      pdo_health: %{state: :fresh, domains: [%{id: :main, state: :fresh, refreshed_at_us: 1_234_000, age_us: 250, stale_after_us: 3_000}]},
      signals: [
        %{name: :ch1, domain: :main, direction: :input, sm_index: 3, bit_offset: 0, bit_size: 1},
        ...
      ],
      configuration_error: nil
    }}
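
Two derived figures are often useful when reading this snapshot: how many FMMUs are still free, and how many process-data bits are registered per direction. The sketch below is illustrative; `SlaveInfoSketch` is a hypothetical module that only relies on the documented keys.

```elixir
defmodule SlaveInfoSketch do
  # Hypothetical helper; operates on a map shaped like the slave_info/1 snapshot.

  # Remaining FMMUs, or :unknown when the ESC capability was not read.
  def free_fmmus(%{available_fmmus: nil}), do: :unknown
  def free_fmmus(%{available_fmmus: available, used_fmmus: used}), do: available - used

  # Sums :bit_size over the registered signals, grouped by direction.
  def bits_by_direction(%{signals: signals}) do
    Enum.reduce(signals, %{input: 0, output: 0}, fn %{direction: dir, bit_size: bits}, acc ->
      Map.update!(acc, dir, &(&1 + bits))
    end)
  end
end
```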

# `slaves`

```elixir
@spec slaves() :: master_query_result([slave_summary()])
```

Return `{:ok, [%{name:, station:, server:, pid:, fault:}]}` for all running
slaves.

Returns `{:error, :not_started}` if the master does not exist and
`{:error, :timeout}` if the local master call itself times out.

# `start`

```elixir
@spec start(keyword()) :: :ok | {:error, term()}
```

Start the master: open the interface, discover slaves, and begin
self-driving configuration.

Returns `:ok` once discovery has started. Call `await_running/1` to block
until startup finishes. If the master falls back to `:idle`, inspect
`last_failure/0` for the retained reason. For dynamic PREOP workflows, call
`activate/0` afterwards to enter cyclic operation.

Options:
  - `:interface` (required) — network interface, e.g. `"eth0"`
  - `:domains` — list of `%EtherCAT.Domain.Config{}` structs describing
    domain intent. High-level domain configs do not take `:logical_base`;
    the master allocates logical windows automatically.
  - `:slaves` — list of `%EtherCAT.Slave.Config{}` structs
    (position matters — station address = `base_station + index`).
    `nil` entries are rejected; use `%EtherCAT.Slave.Config{name: :coupler}` for
    unnamed couplers. If omitted or empty, one default slave process is started
    per discovered station and held in `:preop` for dynamic configuration.
  - `:base_station` — first station address, default `0x1000`
  - `:dc` — `%EtherCAT.DC.Config{}` for master-wide Distributed Clocks, or
    `nil` to disable DC. `await_lock?` gates startup activation; `lock_policy`
    controls the runtime reaction to DC lock loss.
  - `:frame_timeout_ms` — optional fixed bus frame response timeout in ms
    (otherwise auto-tuned from slave count and cycle time)
  - `:scan_poll_ms` — optional discovery poll interval in ms
  - `:scan_stable_ms` — optional identical-count stability window in ms before startup begins

# `state`

```elixir
@spec state() :: master_query_result(session_state())
```

Return the current public session state.

Values:
  - `:idle`
  - `:discovering`
  - `:awaiting_preop`
  - `:preop_ready`
  - `:deactivated` — session is live but intentionally settled below OP, typically SAFEOP
  - `:operational` — cyclic OP path is healthy; inspect `slaves/0` for non-critical per-slave faults
  - `:activation_blocked` — the transition to the desired runtime target is blocked
  - `:recovering` — runtime fault recovery in progress

Returns `{:ok, state}` on success.

`{:ok, :idle}` means the master process exists and the session is idle.
`{:error, :not_started}` means there is no local master process at all.

Returns `{:error, :timeout}` if the local master call itself times out.

# `stop`

```elixir
@spec stop() :: :ok | {:error, :already_stopped | :timeout | {:server_exit, term()}}
```

Stop the master: tear the session down completely.

Returns `{:error, :already_stopped}` if the master is not running.

# `subscribe`

```elixir
@spec subscribe(atom(), atom(), pid()) ::
  :ok
  | {:error,
     {:not_registered, atom()} | :not_found | :timeout | {:server_exit, term()}}
```

Subscribe to named slave events.

`name` may refer to:

  - a registered process-data signal, delivered as
    `{:ethercat, :signal, slave_name, name, value}`
  - a named latch configured through `sync.latches`, delivered as
    `{:ethercat, :latch, slave_name, name, timestamp_ns}`

Unknown names are rejected with `{:error, {:not_registered, name}}`.
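
The two message shapes above can be consumed with plain pattern matching in a `receive` block or a `handle_info/2` callback. The sketch below keeps the matching in a pure function so it can be exercised without a bus; `EventSketch` and its output strings are hypothetical.

```elixir
defmodule EventSketch do
  # Hypothetical helper; formats the documented subscription messages.

  def describe({:ethercat, :signal, slave, name, value}) do
    "#{slave}.#{name} = #{inspect(value)}"
  end

  def describe({:ethercat, :latch, slave, name, timestamp_ns}) do
    "#{slave}.#{name} latched at #{timestamp_ns} ns"
  end

  # Anything else in the mailbox is not an EtherCAT event.
  def describe(_other), do: :ignored
end
```

Inside a subscribed process, `receive do msg -> EventSketch.describe(msg) end` would handle the next delivered event.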

# `update_domain_cycle_time`

```elixir
@spec update_domain_cycle_time(atom(), pos_integer()) :: :ok | {:error, term()}
```

Update the live cycle period of a running domain.

This changes the `Domain` runtime directly. The master keeps its initial
domain plan; `domains/0` and `domain_info/1` reflect the live period owned by
the `Domain` process.
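
When the desired rate is known in hertz, it has to be converted to a period first. The sketch below assumes the second argument is a period in microseconds, as the `cycle_time_us` field name suggests; `CyclePeriodSketch` is a hypothetical helper.

```elixir
defmodule CyclePeriodSketch do
  # Hypothetical helper; converts a target rate in Hz to a period in microseconds.
  # Rates above 1 MHz are rejected because the period would round down to zero.
  def period_us(hz) when is_integer(hz) and hz > 0 and hz <= 1_000_000 do
    div(1_000_000, hz)
  end
end
```

Under that assumption, switching `:main` to 2 kHz could be written as `EtherCAT.update_domain_cycle_time(:main, CyclePeriodSketch.period_us(2_000))`.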

# `upload_sdo`

```elixir
@spec upload_sdo(atom(), non_neg_integer(), non_neg_integer()) ::
  {:ok, binary()} | {:error, term()}
```

Upload a CoE SDO value from a slave mailbox object entry.

This is a blocking acyclic mailbox transfer and is only valid once the slave
mailbox is configured, typically from PREOP onward.
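
The uploaded value arrives as a raw binary that the caller decodes. CoE basic data types are little-endian, so an unsigned 16-bit entry can be decoded as sketched below. `SdoDecodeSketch` and its error tuple are hypothetical.

```elixir
defmodule SdoDecodeSketch do
  # Hypothetical helper; decodes a little-endian unsigned 16-bit SDO payload.

  def u16(<<value::little-unsigned-16>>), do: {:ok, value}

  # Any other payload size is reported instead of raising.
  def u16(other) when is_binary(other), do: {:error, {:unexpected_size, byte_size(other)}}
end
```

Combined with the real call, this could read as `with {:ok, bin} <- EtherCAT.upload_sdo(:sensor, index, subindex), do: SdoDecodeSketch.u16(bin)`, where `index` and `subindex` depend on the device's object dictionary.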

# `write_output`

```elixir
@spec write_output(atom(), atom(), term()) :: :ok | {:error, term()}
```

Stage `value` into a slave output PDO for the next domain cycle.

This confirms the value was staged into the master's domain output buffer for
the next cycle. It does not prove the slave has already applied the value on
hardware.

