Simulated EtherCAT slave segment for deep integration tests, virtual hardware, and simulator-backed tooling.
EtherCAT.Simulator executes EtherCAT datagrams against one or more in-memory
slaves with protocol-faithful ESC register, AL-state, mailbox, and logical
process-data behavior. It is the public process boundary for the simulator
runtime; device authorship lives in EtherCAT.Simulator.Slave, and the real
transport endpoints live in EtherCAT.Simulator.Transport.Udp and
EtherCAT.Simulator.Transport.Raw.
What This Is Not
This is not a hardware EtherCAT slave controller or a kernel-bypass slave NIC.
The simulator can now expose two host-side ingress styles, UDP and raw, with raw available in single-link or redundant form:
- udp: [...] through EtherCAT.Simulator.Transport.Udp
- raw: [interface: ...] through EtherCAT.Simulator.Transport.Raw
- raw: [primary: [...], secondary: [...]] for redundant raw ingress against one shared slave segment
In both cases, the slave segment is still userspace Elixir code that decodes EtherCAT datagrams, executes them against in-memory slaves, and encodes the reply. The raw mode is a host raw-socket endpoint, not a claim that the simulator is acting like a physical ESC.
Purpose
The simulator exists for:
- deep integration tests without physical hardware
- local virtual hardware during development
- higher-level tooling such as a future simulator widget in kino_ethercat
Real hardware is not required for most tests because the code under test is still the real master, bus, link handling, and UDP transport. What gets virtualized is the slave segment. That is exactly where determinism helps: disconnects, bad WKCs, mailbox faults, retries, and recovery timing are easier to reproduce and assert in the simulator than on a physical bench.
Hardware runs still matter, but mainly as a complement:
- smoke validation on a real ring
- capture generation
- simulator-drift checks
Runtime Flow
The exchange path is intentionally simple. The simulator core is the same in both modes; only the outer transport wrapper changes.
flowchart TD
A{Master transport}
A -- :udp --> B[Bus.Transport.UdpSocket sends UDP payload]
A -- raw --> C[Bus.Transport.RawSocket sends EtherCAT Ethernet frame]
B --> D[Simulator.Transport.Udp receives UDP payload]
C --> E[Simulator.Transport.Raw.Endpoint receives EtherType 0x88A4 frame]
D --> F[Frame.decode converts payload into EtherCAT datagrams]
E --> F
F --> G[EtherCAT.Simulator executes datagrams against in-memory slaves]
G --> H[Simulated slaves update ESC state, AL state, mailbox, and PDO images]
H --> I[Simulator builds reply datagrams and WKC]
I --> J{Transport wrapper}
J -- UDP --> K[Frame.encode builds UDP reply payload]
J -- Raw --> L[EtherCAT payload is wrapped in Ethernet reply frame]
K --> M[Master receives reply and continues processing]
L --> M
The important boundary is that only the master-side EtherCAT logic is "real" here. On the simulator side, both endpoints are just transport adapters around the same in-memory slave segment.
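The decode step in the flow above can be illustrated with a small, self-contained sketch. It is not the library's Frame module: the module name FrameSketch, the returned map shape, and the error atoms are assumptions made for illustration. Only the wire layout (a 2-byte little-endian frame header followed by datagrams with a 10-byte header and a trailing 2-byte WKC) comes from the EtherCAT frame format.

```elixir
defmodule FrameSketch do
  @moduledoc "Toy decoder for one EtherCAT frame payload; not the library's Frame module."
  import Bitwise

  # Subset of the datagram command codes from the EtherCAT header definition.
  @commands %{1 => :aprd, 2 => :apwr, 4 => :fprd, 5 => :fpwr, 7 => :brd,
              8 => :bwr, 10 => :lrd, 11 => :lwr, 12 => :lrw}

  # Frame header (little-endian 16 bit): bits 0..10 length, bits 12..15 type.
  def decode(<<header::little-16, rest::binary>>) do
    len = header &&& 0x07FF
    type = header >>> 12

    case {type, rest} do
      # Type 1 means the payload is a chain of EtherCAT datagrams.
      {1, <<datagrams::binary-size(len), _pad::binary>>} -> decode_datagrams(datagrams, [])
      _ -> {:error, :invalid_frame}
    end
  end

  def decode(_), do: {:error, :invalid_frame}

  defp decode_datagrams(<<>>, acc), do: {:ok, Enum.reverse(acc)}

  defp decode_datagrams(
         <<cmd, idx, address::binary-size(4), len_flags::little-16, _irq::little-16,
           rest::binary>>,
         acc
       ) do
    len = len_flags &&& 0x07FF
    # Bit 15 of the length word signals that another datagram follows.
    more? = (len_flags &&& 0x8000) != 0

    case rest do
      <<data::binary-size(len), wkc::little-16, tail::binary>> ->
        datagram = %{cmd: Map.get(@commands, cmd, cmd), idx: idx, address: address,
                     data: data, wkc: wkc}

        if more? do
          decode_datagrams(tail, [datagram | acc])
        else
          {:ok, Enum.reverse([datagram | acc])}
        end

      _ ->
        {:error, :truncated_datagram}
    end
  end

  defp decode_datagrams(_, _acc), do: {:error, :truncated_datagram}
end
```

Feeding it a single broadcast read of two bytes at register 0x0130 returns one decoded datagram with cmd: :brd and wkc: 0; the simulator side would then execute it and write the updated WKC into the reply.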
Architecture
EtherCAT.Simulator is intentionally a small process boundary over the
multi-slave segment state.
It owns:
- the simulated slave list
- datagram execution across that list
- WKC accumulation
- injected runtime faults
- signal subscriptions and snapshots for tooling
- optional supervision of UDP or raw transport endpoints
It does not own device-profile logic inline. That lives in the simulator's
private slave runtime and profile modules under lib/ethercat/simulator/slave/.
Runtime implementation shape:
lib/ethercat/
├── simulator.ex
└── simulator/
    ├── driver_adapter.ex
    ├── fault.ex
    ├── runtime/
    │   ├── faults.ex
    │   ├── milestones.ex
    │   ├── router.ex
    │   ├── snapshot.ex
    │   ├── subscriptions.ex
    │   └── wiring.ex
    ├── transport.ex
    ├── transport/
    │   ├── raw.ex
    │   ├── raw/
    │   │   ├── endpoint.ex
    │   │   └── fault.ex
    │   ├── udp.ex
    │   └── udp/
    │       └── fault.ex
    └── slave/
        ├── behaviour.ex
        ├── definition.ex
        ├── driver.ex
        ├── object.ex
        ├── profile.ex
        ├── signals.ex
        ├── value.ex
        └── reference/
Unlike SOES, there is no embedded polling loop equivalent to ecat_slv().
Incoming EtherCAT datagrams drive the simulator state:
- register reads and writes
- AL control and status transitions
- EEPROM/SII reads
- SyncManager and FMMU programming
- logical process-data access
That is deliberate. The simulator preserves the observable protocol boundary, not the C control flow.
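As an illustration of state that is driven purely by incoming register writes, here is a minimal sketch of AL control handling. It is a toy model, not the simulator's slave runtime: the module name AlSketch, the map fields, and the exact refusal behavior are assumptions. The register offsets (AL Control at 0x0120, AL Status at 0x0130), the state codes, and the two AL status codes used follow the standard ESC register model.

```elixir
defmodule AlSketch do
  @moduledoc "Toy AL-state handling driven by register writes; not the simulator's slave runtime."

  @al_control 0x0120
  @states %{0x01 => :init, 0x02 => :preop, 0x04 => :safeop, 0x08 => :op}
  @order [:init, :preop, :safeop, :op]

  def new, do: %{al_state: :init, al_error?: false, al_status_code: 0x0000}

  # A write to AL Control requests a state; the low 4 bits carry the state code.
  def write(slave, @al_control, <<request::little-16>>) do
    case Map.fetch(@states, Bitwise.band(request, 0x0F)) do
      {:ok, target} -> transition(slave, target)
      # 0x0012: unknown requested state.
      :error -> refuse(slave, 0x0012)
    end
  end

  # Writes to any other register are ignored by this sketch.
  def write(slave, _offset, _data), do: slave

  # AL Status (0x0130) as the master would read it: state code plus error bit 4.
  def al_status(slave) do
    code = Enum.find_value(@states, fn {code, state} -> state == slave.al_state && code end)
    error_bit = if slave.al_error?, do: 0x10, else: 0x00
    <<Bitwise.bor(code, error_bit)::little-16>>
  end

  # Upward transitions may only advance one step; downward ones are always accepted.
  defp transition(slave, target) do
    current_rank = Enum.find_index(@order, &(&1 == slave.al_state))
    target_rank = Enum.find_index(@order, &(&1 == target))

    if target_rank <= current_rank + 1 do
      %{slave | al_state: target, al_error?: false, al_status_code: 0x0000}
    else
      # 0x0011: invalid requested state change.
      refuse(slave, 0x0011)
    end
  end

  defp refuse(slave, code), do: %{slave | al_error?: true, al_status_code: code}
end
```

There is no loop anywhere in the sketch: the slave only changes when a write arrives, which is the same shape the text describes for the simulator as a whole.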
Fidelity Boundary
These protocol-facing parts should stay aligned with the spec model and any local simulator reference notes kept outside the tracked repo:
- datagram routing:
  - broadcast
  - auto-increment
  - fixed-address
  - logical
- register reads and writes
- AL control and status behavior
- EEPROM/SII read behavior
- SyncManager and FMMU state
- logical process-data read and write behavior
- WKC accounting
Intentionally simplified:
- embedded polling-loop shape from SOES
- HAL and firmware-driver structure
- hardware interrupt behavior
- link-carrier modeling below the protocol layer
- full DC behavior
The rule is: preserve protocol behavior, not firmware structure.
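To make the WKC accounting item concrete, the following sketch counts working-counter increments over a list of in-memory slaves. It is an assumption-laden toy, not the simulator's router: the module name WkcSketch, the command tuples, and the slave map fields (station, registers, inputs, outputs) are all hypothetical. The increment rules themselves are the standard ones: a served read or write adds 1, and a logical read/write adds 1 for the read part and 2 for the write part.

```elixir
defmodule WkcSketch do
  @moduledoc "Toy WKC accounting over a list of in-memory slaves; not the simulator's router."

  # Broadcast read: every slave that can serve the register range counts once.
  def execute({:brd, offset, size}, slaves) do
    Enum.count(slaves, &readable?(&1, offset, size))
  end

  # Fixed-address read: only the slave with the matching station address responds.
  def execute({:fprd, station, offset, size}, slaves) do
    Enum.count(slaves, &(&1.station == station and readable?(&1, offset, size)))
  end

  # Logical read/write: +1 per slave with mapped inputs in the window (read),
  # +2 per slave with mapped outputs in the window (write).
  def execute({:lrw, logical_start, size}, slaves) when size > 0 do
    window = logical_start..(logical_start + size - 1)

    Enum.reduce(slaves, 0, fn slave, wkc ->
      read = if overlaps?(slave.inputs, window), do: 1, else: 0
      write = if overlaps?(slave.outputs, window), do: 2, else: 0
      wkc + read + write
    end)
  end

  defp readable?(slave, offset, size), do: offset + size <= byte_size(slave.registers)

  # A slave without a mapping for that direction never contributes.
  defp overlaps?(nil, _window), do: false
  defp overlaps?(mapped, window), do: not Range.disjoint?(mapped, window)
end
```

A master that expects a WKC of 3 for a cyclic LRW over one input slave and one output slave will see 1 if the output slave stops answering, which is the kind of mismatch that WKC-skew faults reproduce deterministically.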
Public API
Main entry points:
- start/1: start the supervised simulator runtime, including udp: [...] or raw endpoint config when you want a real transport endpoint
- child_spec/1: supervisor-friendly form of start/1
- start_link/1: low-level in-memory simulator core only
- stop/0: stop the singleton simulator runtime
- process_datagrams/1: execute EtherCAT datagrams directly
- process_datagrams/2: execute EtherCAT datagrams with simulator-local options such as raw ingress side
- inject_fault/1 / clear_faults/0: deterministic runtime fault injection
- set_topology/1: switch the simulator between linear and redundant topology modes, including a deterministic single break
- info/0, device_snapshot/1, signal_snapshot/2, connections/0: stable runtime snapshots for tooling
- signals/1, signal_definitions/1, get_value/2, set_value/3
- connect/2, disconnect/2: cross-slave signal wiring
- subscribe/3 / unsubscribe/3: widget-friendly signal observation
Use EtherCAT.Simulator.Slave to build devices such as:
- digital I/O
- couplers
- mailbox-capable demo slaves
- analog and temperature devices
- servo and drive profiles
- simulated devices hydrated from a real EtherCAT.Slave.Driver through from_driver/2
EtherCAT.Simulator.Slave.Definition is the public opaque authored device type
used by those builders and optional driver hydration.
Capabilities
The simulator is already strong enough to exercise the real master through:
- startup to :operational
- cyclic I/O roundtrips
- PREOP mailbox diagnostics
- recovery from realistic runtime faults
Implemented and validated surface:
- one or more simulated slaves behind one named simulator instance
- real UDP transport path through EtherCAT.Bus.Transport.UdpSocket
- single-link raw transport path through EtherCAT.Bus.Transport.RawSocket
- dual raw ingress endpoints for redundant master tests
- redundant topology modeling:
  - healthy secondary passthrough
  - deterministic single break through set_topology({:redundant, break_after: n})
- startup addressing modes:
  - broadcast
  - auto-increment
  - fixed-address
  - logical
- AL transition discipline:
  - INIT -> PREOP -> SAFEOP -> OP
- SII/EEPROM reads through the normal master path
- SyncManager and FMMU programming
- cyclic LRW process-data exchange
- expedited and segmented CoE upload/download for mailbox-capable devices
- signal-level get/set, subscriptions, and snapshots for tooling
- cross-slave signal wiring
- real-device hydration through simulator companions on real drivers
The preferred public device story is driver-backed simulation:
coupler = EtherCAT.Simulator.Slave.from_driver(MyApp.EK1100, name: :coupler)
inputs = EtherCAT.Simulator.Slave.from_driver(MyApp.EL1809, name: :inputs)
outputs = EtherCAT.Simulator.Slave.from_driver(MyApp.EL2809, name: :outputs)
Profile modules still exist, but they are implementation detail. The public story is: simulate real devices through real drivers and keep identity, PDO naming, and simulator hydration aligned.
Fault Model
The simulator has three fault boundaries:
- EtherCAT.Simulator for datagram/runtime behavior
- EtherCAT.Simulator.Transport.Udp for malformed, stale, or mismatched UDP replies
- EtherCAT.Simulator.Transport.Raw for raw endpoint behavior such as delayed egress on one or both raw legs
Runtime fault injection supports:
- exchange-scoped faults such as dropped replies, WKC skew, and disconnects
- slave-local faults such as SAFEOP retreat, power-cycle resets, AL error latch, mailbox aborts, and mailbox protocol faults
- queued windows through Fault.next/2
- scripted sequences through Fault.script/1
- delayed activation through Fault.after_ms/2
- milestone activation through Fault.after_milestone/2
Current exchange-scoped runtime faults:
- :drop_responses
- {:wkc_offset, delta}
- {:command_wkc_offset, command_name, delta}
- {:logical_wkc_offset, slave_name, delta}
- {:disconnect, slave_name}
Current milestones:
- {:healthy_exchanges, count}
- {:healthy_polls, slave_name, count}
- {:mailbox_step, slave_name, step, count}
Current slave-local fault injections include:
- {:power_cycle, slave_name}: reset the slave to INIT, clear volatile runtime state, and clear its fixed station address so the slave reconnect path must reclaim or restore it before PREOP rebuild can continue
- {:mailbox_abort, slave_name, index, subindex, abort_code}
- {:mailbox_abort, slave_name, index, subindex, abort_code, stage}
- {:mailbox_protocol_fault, slave_name, index, subindex, stage, fault_kind}
Direct slave-local injections stay active until clear_faults/0. The same
mailbox protocol fault injected as a step inside Fault.script/1 is consumed
on first match so reconnect/retry scenarios can fail once and self-heal on a
later master retry.
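The difference between a persistent injection and a queued window can be sketched with the tuple shapes from the fault() type ({:next_exchange, fault} and {:next_exchanges, count, fault}). This is a toy model only: the module name FaultWindowSketch is hypothetical, and treating a bare exchange fault as sticky until cleared is an assumption of the sketch, not a statement about the simulator's internals.

```elixir
defmodule FaultWindowSketch do
  @moduledoc "Toy model of queued fault windows; tuple shapes follow the fault() type, behavior is assumed."

  # step/1 returns {fault applied to this exchange | nil, fault still pending | nil}.
  def step(nil), do: {nil, nil}
  def step({:next_exchange, fault}), do: {fault, nil}
  def step({:next_exchanges, 1, fault}), do: {fault, nil}

  # A window of n exchanges shrinks by one each time it is applied.
  def step({:next_exchanges, n, fault}) when n > 1 do
    {fault, {:next_exchanges, n - 1, fault}}
  end

  # Assumption: a bare fault stays active on every exchange until it is cleared.
  def step(fault), do: {fault, fault}

  # Runs the pending fault across a number of exchanges and lists what each one saw.
  def run(fault, exchanges) do
    {applied, _pending} =
      Enum.map_reduce(1..exchanges, fault, fn _exchange, pending -> step(pending) end)

    applied
  end
end
```

Running a two-exchange :drop_responses window across four exchanges applies the fault to the first two and leaves the last two healthy, which is the "fail briefly, then self-heal" shape that retry and reconnect scenarios rely on.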
Example runtime, UDP-edge, and raw-endpoint faults:
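alias EtherCAT.Simulator.Fault
alias EtherCAT.Simulator.Transport.Raw.Fault, as: RawFault
alias EtherCAT.Simulator.Transport.Udp.Fault, as: UdpFault
# Runtime: queue dropped replies as a window of 10.
EtherCAT.Simulator.inject_fault(Fault.drop_responses() |> Fault.next(10))
# Runtime: retreat :outputs to SAFEOP once 10 healthy polls have been observed.
EtherCAT.Simulator.inject_fault(
  Fault.retreat_to_safeop(:outputs)
  |> Fault.after_milestone(Fault.healthy_polls(:outputs, 10))
)
# Runtime: toggle mismatch on an upload segment of object 0x2003:01 on :mailbox.
EtherCAT.Simulator.inject_fault(
  Fault.mailbox_protocol_fault(:mailbox, 0x2003, 0x01, :upload_segment, :toggle_mismatch)
)
# UDP edge: scripted reply faults, an unsupported type followed by a replayed reply.
EtherCAT.Simulator.Transport.Udp.inject_fault(
  UdpFault.script([UdpFault.unsupported_type(), UdpFault.replay_previous()])
)
# Raw edge: delay responses on the secondary endpoint for primary ingress.
EtherCAT.Simulator.Transport.Raw.inject_fault(
  RawFault.delay_response(200, endpoint: :secondary, from_ingress: :primary)
)
Delay Semantics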
The simulator currently supports delayed fault scheduling, not general transport-latency simulation.
What exists today:
- Fault.after_ms/2 delays when a fault becomes active
- Fault.after_milestone/2 delays activation until a deterministic simulator milestone is observed
- Transport.Raw.Fault.delay_response/2 delays raw response emission on selected endpoints for selected ingress directions
- the DC register model carries system_time_delay_ns so DC reads can expose realistic-looking delay values during clock setup and diagnostics
What does not exist today:
- no random jitter model
- no per-port or per-hop wire propagation model
That is deliberate. Most master regressions here are about missing replies, wrong WKCs, malformed mailbox exchanges, reconnect sequencing, and retained fault state. The raw transport delay control exists because raw redundant-path regressions need an honest endpoint-level seam; broader latency models would still be less useful than deterministic fault windows.
Testing Strategy
Repository integration coverage keeps two maintained variants built around the same real drivers:
- test/integration/simulator/ring_test.exs
- test/integration/hardware/ring_test.exs
The simulator suite is the primary place for deterministic fault matrices:
- transient timeouts and dropped replies
- UDP reply corruption, replay, and stale-frame behavior
- WKC mismatch and logical-slave-targeted skew
- slave disconnect/reconnect and SAFEOP retreat
- startup mailbox failures during PREOP configuration
- public SDO upload/download mailbox protocol faults
- reconnect-time PREOP rebuild failures
- telemetry-triggered chained recovery follow-ups
- captured real-device cases such as EL3202
Use fixture tiers deliberately:
- synthetic fixtures for protocol-isolated mailbox and reconnect matrices
- captured or curated real-device fixtures such as EL3202 for realistic startup and decode behavior
- hardware tests as a final complement, not the only integration path
Prefer one simulator scenario per behavioral regression. Share helpers and ring builders aggressively, but keep distinct fault stories in separate files so failures localize cleanly.
Reference Material
When you need deeper simulator design notes, use your local helper material outside the tracked repo.
Relevant repo integration guides:
- test/integration/simulator/README.md
- test/integration/hardware/README.md
Historical planning material may exist in local helper notes outside the tracked repo, but the maintained sources here are the current module docs, tests, and integration guides.
Types
@type call_error_reason() :: :not_found | :timeout | {:server_exit, term()}
@type connection() :: %{source: signal_ref(), target: signal_ref()}
@type fault() :: schedulable_fault()
@type fault_script_step() :: exchange_fault() | slave_fault() | {:wait_for_milestone, milestone()}
@type immediate_fault() :: exchange_fault() | {:next_exchange, exchange_fault()} | {:next_exchanges, pos_integer(), exchange_fault()} | {:fault_script, [fault_script_step(), ...]} | slave_fault()
@type milestone() :: {:healthy_exchanges, pos_integer()} | {:healthy_polls, atom(), pos_integer()} | {:mailbox_step, atom(), :upload_init | :upload_segment | :download_init | :download_segment, pos_integer()}
@type schedulable_fault() :: immediate_fault() | {:after_ms, non_neg_integer(), schedulable_fault()} | {:after_milestone, milestone(), schedulable_fault()}
@type slave_fault() :: {:retreat_to_safeop, atom()} | {:power_cycle, atom()} | {:latch_al_error, atom(), non_neg_integer()} | {:mailbox_abort, atom(), non_neg_integer(), non_neg_integer(), non_neg_integer()} | {:mailbox_abort, atom(), non_neg_integer(), non_neg_integer(), non_neg_integer(), :request | :upload_segment | :download_segment} | {:mailbox_protocol_fault, atom(), non_neg_integer(), non_neg_integer(), :request | :upload_init | :upload_segment | :download_init | :download_segment, :drop_response | :counter_mismatch | :toggle_mismatch | {:mailbox_type, 0..15} | {:coe_service, 0..15} | :invalid_coe_payload | {:sdo_command, 0..255} | :invalid_segment_padding | {:segment_command, 0..255}}
Functions
@spec child_spec(keyword()) :: Supervisor.child_spec()
Returns a specification to start this module under a supervisor.
See Supervisor.
@spec clear_faults() :: :ok | {:error, call_error_reason()}
@spec connect(signal_ref(), signal_ref()) :: :ok | {:error, :unknown_signal | :invalid_value | call_error_reason()}
@spec connections() :: {:ok, [connection()]} | {:error, call_error_reason()}
@spec device_snapshot(atom()) :: {:ok, map()} | {:error, call_error_reason()}
@spec disconnect(signal_ref(), signal_ref()) :: :ok | {:error, call_error_reason()}
@spec get_value(atom(), atom()) :: {:ok, term()} | {:error, :unknown_signal | call_error_reason()}
@spec info() :: {:ok, map()} | {:error, call_error_reason()}
@spec inject_fault(EtherCAT.Simulator.Fault.t() | fault()) :: :ok | {:error, :invalid_fault | call_error_reason()}
@spec output_image(atom()) :: {:ok, binary()} | {:error, call_error_reason()}
@spec process_datagrams([EtherCAT.Bus.Datagram.t()]) :: {:ok, [EtherCAT.Bus.Datagram.t()]} | {:error, :no_response | call_error_reason()}
@spec set_topology(:linear | :redundant | {:redundant, keyword()}) :: :ok | {:error, :invalid_topology | call_error_reason()}
@spec set_value(atom(), atom(), term()) :: :ok | {:error, :unknown_signal | :invalid_value | call_error_reason()}
@spec signal_definitions(atom()) :: {:ok, %{optional(atom()) => map()}} | {:error, call_error_reason()}
@spec signal_snapshot(atom(), atom()) :: {:ok, map()} | {:error, :unknown_signal | call_error_reason()}
@spec signals(atom()) :: {:ok, [atom()]} | {:error, call_error_reason()}
@spec start(keyword()) :: Supervisor.on_start() | {:error, term()}
@spec start_link(keyword()) :: GenServer.on_start()
@spec stop() :: :ok
@spec subscribe(atom(), atom() | :all, pid()) :: :ok | {:error, call_error_reason()}
@spec unsubscribe(atom(), atom() | :all, pid()) :: :ok | {:error, call_error_reason()}