Instructions for AI coding assistants working with this codebase.
Project Overview
Pure Elixir DICOM P10 parser and writer. Zero runtime dependencies. Parses and serializes medical imaging files per the DICOM standard (PS3.5, PS3.6, PS3.10).
Build and Test
mix deps.get # Install dev/test dependencies
mix compile # Compile
mix test # Run all tests
mix test --cover # Run with coverage
mix format --check-formatted # Check formatting
mix docs # Generate documentation
Architecture
lib/dicom.ex -- Public API: parse/1, parse_file/1, write/1, write_file/2
lib/dicom/
  data_set.ex -- DataSet struct: Access, Enumerable, Inspect protocols
  data_element.ex -- DataElement struct: tag + VR + value + length, Inspect
  tag.ex -- Tag constants, parse/1, from_keyword/1, repeating?/1
  vr.ex -- VR types, metadata (all/0, description/1, max_length/1)
  uid.ex -- UID constants, generate/0, valid?/1, transfer_syntax?/1
  value.ex -- VR-aware encode/decode, date/time conversion
  transfer_syntax.ex -- 49 transfer syntax registry, encoding/1 dispatch
  sop_class.ex -- 232 SOP class registry
  json.ex -- DICOM JSON encode/decode (PS3.18 Annex F.2)
  pixel_data.ex -- Frame extraction (native + encapsulated)
  de_identification.ex -- Best-effort de-identification helpers (PS3.15 subset)
  character_set.ex -- Specific Character Set decoding (PS3.5 6.1)
  p10/
    reader.ex -- Binary parser: preamble -> file meta -> data set
    writer.ex -- Binary serializer: iodata pipeline -> IO.iodata_to_binary
    file_meta.ex -- Preamble validation, skip_preamble/1, sanitize_preamble/1
    stream.ex -- Streaming lazy event parser
  dictionary/
    registry.ex -- PS3.6 lookup: 5,035 tags, find_by_keyword/1
Conventions
- Return {:ok, result} or {:error, reason} from all public functions
- Tags are {group, element} tuples: {0x0010, 0x0010} = Patient Name
- VR types are atoms: :PN, :DA, :UI, :OB, :SQ, etc.
- Binary parsing uses Elixir pattern matching exclusively
- @spec on all public functions
- @moduledoc and @doc on all public modules and functions
- Reference DICOM standard sections in docs (e.g., "PS3.5 Section 6.2")
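A minimal sketch of how these conventions can combine in one function. ConventionSketch and read_tag/1 are hypothetical names used for illustration only; they are not part of this library.

```elixir
defmodule ConventionSketch do
  @moduledoc """
  Hypothetical module illustrating the conventions above.
  Not part of the library's API.
  """

  @typedoc "A DICOM tag as a {group, element} tuple."
  @type tag :: {0..0xFFFF, 0..0xFFFF}

  @doc """
  Reads one little-endian tag from the front of `binary`.

  Tag layout follows PS3.5 Section 7.1 (group, then element, 16 bits each).
  """
  @spec read_tag(binary()) :: {:ok, {tag(), binary()}} | {:error, :truncated}
  # Pattern matching in the function head does all of the binary parsing.
  def read_tag(<<group::little-16, element::little-16, rest::binary>>) do
    {:ok, {{group, element}, rest}}
  end

  # Fewer than four bytes: report the failure as a tagged tuple.
  def read_tag(_other), do: {:error, :truncated}
end
```

Callers can then match on the result, for example `{:ok, {tag, rest}} = ConventionSketch.read_tag(bytes)`, and handle `{:error, :truncated}` in a separate clause.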
Code Style
- Run mix format before committing
- Use @compile {:inline, ...} for hot-path functions
- Prefer iodata over binary concatenation in serialization paths
- Use list accumulation + Map.new/1 over incremental Map.put in parsing loops
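The sketch below illustrates the last three points together. CodeStyleSketch and its functions are hypothetical. The framing (tag, 32-bit length, value) is a simplification that resembles Implicit VR Little Endian with defined lengths only, and is not the library's writer or reader.

```elixir
defmodule CodeStyleSketch do
  @moduledoc "Hypothetical example; not part of the library."

  # Inline the small helper that runs once per element.
  @compile {:inline, encode_tag: 1}

  defp encode_tag({group, element}), do: <<group::little-16, element::little-16>>

  @doc "Serializes {tag, value} pairs by building iodata and flattening once."
  @spec serialize([{{non_neg_integer(), non_neg_integer()}, binary()}]) :: binary()
  def serialize(elements) do
    elements
    |> Enum.map(fn {tag, value} ->
      # Nested list of binaries: no intermediate concatenation.
      [encode_tag(tag), <<byte_size(value)::little-32>>, value]
    end)
    |> IO.iodata_to_binary()
  end

  @doc "Parses the framing written by serialize/1 into a tag-keyed map."
  @spec index(binary()) :: {:ok, map()} | {:error, :truncated}
  def index(binary), do: index(binary, [])

  # Accumulate pairs in a list; tags are assumed unique, so order is irrelevant.
  defp index(
         <<group::little-16, element::little-16, len::little-32,
           value::binary-size(len), rest::binary>>,
         acc
       ) do
    index(rest, [{{group, element}, value} | acc])
  end

  # Build the map once at the end instead of calling Map.put per element.
  defp index(<<>>, acc), do: {:ok, Map.new(acc)}
  defp index(_other, _acc), do: {:error, :truncated}
end
```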
Testing
- Property-based tests with StreamData for encode/decode roundtrips
- Shared test helpers in test/support/dicom_test_helpers.ex
- Benchmark tests in test/dicom/benchmark_test.exs
- Maintain or improve coverage in the areas you touch
- Run mix test --cover and check the HTML report in cover/
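A roundtrip property might look like the sketch below. It assumes the file lives under test/ in a Mix project that has stream_data as a test dependency. TagCodecSketch is a hypothetical stand-in codec defined inline so the example does not guess at the library's own encode/decode signatures.

```elixir
defmodule TagCodecSketch do
  @moduledoc "Hypothetical codec used only to demonstrate a roundtrip property."

  def encode({group, element}), do: <<group::little-16, element::little-16>>

  def decode(<<group::little-16, element::little-16>>), do: {:ok, {group, element}}
  def decode(_other), do: {:error, :invalid_tag}
end

defmodule TagCodecSketchTest do
  use ExUnit.Case, async: true
  use ExUnitProperties

  property "decode/1 inverts encode/1 for every 16-bit group and element" do
    check all group <- StreamData.integer(0..0xFFFF),
              element <- StreamData.integer(0..0xFFFF) do
      tag = {group, element}
      # The pin operator requires the decoded tag to equal the generated one.
      assert {:ok, ^tag} = TagCodecSketch.decode(TagCodecSketch.encode(tag))
    end
  end
end
```

The same shape applies to any encode/decode pair: generate valid inputs, encode, decode, and assert that the original value comes back.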
DICOM Domain
Key concepts for working with this codebase:
- P10: File format (PS3.10) = 128-byte preamble + "DICM" + File Meta Info + Data Set
- Data Set: Ordered map of {tag => DataElement} pairs
- Tag: {group, element} pair, e.g., {0x0010, 0x0010} = Patient Name
- VR: Value Representation = data type (PN = Person Name, DA = Date, UI = UID)
- Transfer Syntax: Encoding rules (byte order + VR explicit/implicit + compression)
- File Meta Info: Group 0002 elements, always Explicit VR Little Endian
- Sequence (SQ): Nested data -- a list of item maps, each item is a %{tag => DataElement}
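To make the P10 layout and the File Meta encoding concrete, here is a minimal sketch. P10LayoutSketch is hypothetical and far simpler than a real reader: it only handles Explicit VR Little Endian elements whose VR uses a 16-bit length field (such as UI), and it returns the VR as a two-character string rather than an atom.

```elixir
defmodule P10LayoutSketch do
  @moduledoc "Hypothetical illustration of the P10 layout; not the library's reader."

  @doc "Skips the 128-byte preamble and the DICM prefix (PS3.10 Section 7.1)."
  @spec split_header(binary()) :: {:ok, binary()} | {:error, :not_p10}
  def split_header(<<_preamble::binary-size(128), "DICM", rest::binary>>), do: {:ok, rest}
  def split_header(_other), do: {:error, :not_p10}

  @doc """
  Reads one Explicit VR Little Endian element with a 16-bit length field.

  VRs with a 32-bit length (OB, OW, SQ, UN, and others) are out of scope here.
  """
  @spec read_short_element(binary()) ::
          {:ok, %{tag: {non_neg_integer(), non_neg_integer()}, vr: binary(), value: binary()},
           binary()}
          | {:error, :truncated}
  def read_short_element(
        <<group::little-16, element::little-16, vr::binary-size(2), len::little-16,
          value::binary-size(len), rest::binary>>
      ) do
    {:ok, %{tag: {group, element}, vr: vr, value: value}, rest}
  end

  def read_short_element(_other), do: {:error, :truncated}
end
```

For example, a binary made of 128 zero bytes, "DICM", and a (0002,0010) Transfer Syntax UID element passes split_header/1, and read_short_element/1 then yields the tag {0x0002, 0x0010} with VR "UI".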
Security
- Never execute DICOM file content as code
- Preamble bytes (first 128 bytes) can contain arbitrary data -- use sanitize_preamble/1
- Validate UIDs with Dicom.UID.valid?/1 before using them in file paths or URLs
- Do not hardcode credentials or PHI (Protected Health Information) in tests
- DICOM files may contain patient data -- handle with care in examples and fixtures
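The sketch below shows why UID validation matters before building a path. UidPathSketch is hypothetical, and its valid_uid?/1 is a stand-in written from the general UID rules in PS3.5 Section 9.1 (digits and dots, at most 64 characters, no leading zeros in a component). The exact checks performed by Dicom.UID.valid?/1 are not shown in this document and may differ.

```elixir
defmodule UidPathSketch do
  @moduledoc "Hypothetical example; in this codebase use Dicom.UID.valid?/1 instead."

  @doc "Stand-in UID check: dotted numeric components, at most 64 bytes."
  @spec valid_uid?(term()) :: boolean()
  def valid_uid?(uid) when is_binary(uid) do
    byte_size(uid) <= 64 and
      Regex.match?(~r/\A(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*\z/, uid)
  end

  def valid_uid?(_other), do: false

  @doc "Builds a storage path only when the UID cannot carry path separators."
  @spec storage_path(Path.t(), term()) :: {:ok, Path.t()} | {:error, :invalid_uid}
  def storage_path(root, uid) do
    if valid_uid?(uid) do
      {:ok, Path.join(root, uid <> ".dcm")}
    else
      # Rejects values such as "../../etc/passwd" taken from untrusted files.
      {:error, :invalid_uid}
    end
  end
end
```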
PR Guidelines
- Keep changes focused on a single concern
- Include tests for new functionality
- Maintain or improve coverage for the changed area
- Update @doc and @moduledoc for public API changes