Instructions for AI coding assistants working with this codebase.

Project Overview

Pure Elixir DICOM P10 parser and writer. Zero runtime dependencies. Parses and serializes medical imaging files per the DICOM standard (PS3.5, PS3.6, PS3.10).

Build and Test

mix deps.get                     # Install dev/test dependencies
mix compile                      # Compile
mix test                         # Run all tests
mix test --cover                 # Run with coverage
mix format --check-formatted     # Check formatting
mix docs                         # Generate documentation

Architecture

lib/dicom.ex              -- Public API: parse/1, parse_file/1, write/1, write_file/2
lib/dicom/
  data_set.ex             -- DataSet struct: Access, Enumerable, Inspect protocols
  data_element.ex         -- DataElement struct: tag + VR + value + length, Inspect
  tag.ex                  -- Tag constants, parse/1, from_keyword/1, repeating?/1
  vr.ex                   -- VR types, metadata (all/0, description/1, max_length/1)
  uid.ex                  -- UID constants, generate/0, valid?/1, transfer_syntax?/1
  value.ex                -- VR-aware encode/decode, date/time conversion
  transfer_syntax.ex      -- Registry of 49 transfer syntaxes, encoding/1 dispatch
  sop_class.ex            -- Registry of 232 SOP classes
  json.ex                 -- DICOM JSON encode/decode (PS3.18 Annex F.2)
  pixel_data.ex           -- Frame extraction (native + encapsulated)
  de_identification.ex    -- Best-effort de-identification helpers (PS3.15 subset)
  character_set.ex        -- Specific Character Set decoding (PS3.5 6.1)
  p10/
    reader.ex             -- Binary parser: preamble -> file meta -> data set
    writer.ex             -- Binary serializer: iodata pipeline -> IO.iodata_to_binary
    file_meta.ex          -- Preamble validation, skip_preamble/1, sanitize_preamble/1
    stream.ex             -- Streaming lazy event parser
  dictionary/
    registry.ex           -- PS3.6 lookup: 5,035 tags, find_by_keyword/1

Conventions

  • Return {:ok, result} or {:error, reason} from public functions that can fail (predicates such as valid?/1 return booleans)
  • Tags are {group, element} tuples: {0x0010, 0x0010} = Patient Name
  • VR types are atoms: :PN, :DA, :UI, :OB, :SQ, etc.
  • Binary parsing uses Elixir pattern matching exclusively
  • @spec on all public functions
  • @moduledoc and @doc on all public modules and functions
  • Reference DICOM standard sections in docs (e.g., "PS3.5 Section 6.2")
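
As a rough illustration of how these conventions combine, the sketch below reads one little-endian tag from a binary using pattern matching only and returns a tagged tuple. TagSketch and read_tag/1 are hypothetical names chosen for this example; they are not part of the library's API.

```elixir
defmodule TagSketch do
  @moduledoc """
  Hypothetical illustration of the conventions above; not part of the Dicom library.
  """

  @type tag :: {0..0xFFFF, 0..0xFFFF}

  @doc """
  Reads one little-endian {group, element} tag from the front of a binary.

  Returns the tag together with the unread remainder.
  """
  @spec read_tag(binary()) :: {:ok, {tag(), binary()}} | {:error, :unexpected_end}
  def read_tag(<<group::little-16, element::little-16, rest::binary>>) do
    {:ok, {{group, element}, rest}}
  end

  # Fewer than four bytes left: report an error instead of raising.
  def read_tag(_too_short), do: {:error, :unexpected_end}
end
```

Calling TagSketch.read_tag(<<0x10, 0x00, 0x10, 0x00, "PN">>) yields the Patient Name tag {0x0010, 0x0010} plus the remaining bytes, while a binary shorter than four bytes yields {:error, :unexpected_end}.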

Code Style

  • Run mix format before committing
  • Use @compile {:inline, ...} for hot-path functions
  • Prefer iodata over binary concatenation in serialization paths
  • Use list accumulation + Map.new/1 over incremental Map.put in parsing loops
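
The last two points can be sketched as follows. The wire format here (tag, 32-bit length, value) is a toy chosen for brevity, not a real DICOM encoding, and StyleSketch with its functions is a hypothetical module rather than code from this repository.

```elixir
defmodule StyleSketch do
  @moduledoc """
  Hypothetical illustration of the style rules above; not part of the Dicom library.
  Uses a toy layout: 16-bit group, 16-bit element, 32-bit length, value bytes.
  """

  @type tag :: {non_neg_integer(), non_neg_integer()}

  @doc "Serializes elements as nested iodata and flattens once at the end."
  @spec encode_all([{tag(), binary()}]) :: binary()
  def encode_all(elements) do
    elements
    |> Enum.map(fn {{group, element}, value} ->
      # A list of binaries is valid iodata; no intermediate concatenation.
      [<<group::little-16, element::little-16>>, <<byte_size(value)::little-32>>, value]
    end)
    |> IO.iodata_to_binary()
  end

  @doc "Parses the toy layout, accumulating pairs in a list and building the map once."
  @spec collect(binary()) :: {:ok, %{optional(tag()) => binary()}} | {:error, :malformed}
  def collect(binary), do: collect(binary, [])

  defp collect(<<>>, acc), do: {:ok, Map.new(acc)}

  defp collect(
         <<g::little-16, e::little-16, len::little-32, value::binary-size(len), rest::binary>>,
         acc
       ) do
    # Prepending to a list is O(1); the map is built in a single pass at the end.
    collect(rest, [{{g, e}, value} | acc])
  end

  defp collect(_truncated, _acc), do: {:error, :malformed}
end
```

Whether this beats incremental Map.put in a given loop is worth confirming with the benchmark tests before relying on it.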

Testing

  • Property-based tests with StreamData for encode/decode roundtrips
  • Shared test helpers in test/support/dicom_test_helpers.ex
  • Benchmark tests in test/dicom/benchmark_test.exs
  • Maintain or improve coverage in the areas you touch
  • Run mix test --cover and check the HTML report in cover/

DICOM Domain

Key concepts for working with this codebase:

  • P10: File format (PS3.10) = 128-byte preamble + "DICM" + File Meta Info + Data Set
  • Data Set: Ordered map of {tag => DataElement} pairs
  • Tag: {group, element} pair, e.g., {0x0010, 0x0010} = Patient Name
  • VR: Value Representation = data type (PN = Person Name, DA = Date, UI = UID)
  • Transfer Syntax: Encoding rules (byte order + VR explicit/implicit + compression)
  • File Meta Info: Group 0002 elements, always Explicit VR Little Endian
  • Sequence (SQ): Nested data -- a list of items, each a %{tag => DataElement} map
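
To make the P10 layout concrete, here is a minimal sketch that splits the 128-byte preamble and the "DICM" marker from the rest of a file. P10Sketch and split_preamble/1 are hypothetical names for illustration; the library's own reader and file_meta modules may be structured differently.

```elixir
defmodule P10Sketch do
  @moduledoc """
  Hypothetical illustration of the P10 layout; not part of the Dicom library.
  """

  @doc """
  Splits a P10 binary into its 128-byte preamble and everything after "DICM".
  """
  @spec split_preamble(binary()) :: {:ok, {binary(), binary()}} | {:error, :not_p10}
  def split_preamble(<<preamble::binary-size(128), "DICM", rest::binary>>) do
    # rest begins with the File Meta Info (group 0002) elements.
    {:ok, {preamble, rest}}
  end

  # Too short, or the four bytes after the preamble are not "DICM".
  def split_preamble(_other), do: {:error, :not_p10}
end
```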

Security

  • Never execute DICOM file content as code
  • Preamble bytes (first 128 bytes) can contain arbitrary data -- use sanitize_preamble/1
  • Validate UIDs with Dicom.UID.valid?/1 before using them in file paths or URLs
  • Do not hardcode credentials or PHI (Protected Health Information) in tests
  • DICOM files may contain patient data -- handle with care in examples and fixtures
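
The UID rule matters because a UID read from an untrusted file is attacker-controlled text. The sketch below shows the general shape of a check-then-use helper. It re-implements a plausibility check from the PS3.5 UID rules (at most 64 characters, dot-separated numeric components, no leading zeros) so that it runs standalone. UidSketch, plausible?/1, and safe_path/2 are hypothetical; in this codebase the check itself should be Dicom.UID.valid?/1.

```elixir
defmodule UidSketch do
  @moduledoc """
  Hypothetical illustration of validating a UID before building a path;
  not part of the Dicom library and not its validation logic.
  """

  @doc "Returns true when the value looks like a well-formed DICOM UID."
  @spec plausible?(term()) :: boolean()
  def plausible?(uid) when is_binary(uid) and byte_size(uid) <= 64 do
    uid |> String.split(".") |> Enum.all?(&component?/1)
  end

  def plausible?(_other), do: false

  @doc "Builds a file path from a UID only after the UID passes the check."
  @spec safe_path(Path.t(), term()) :: {:ok, Path.t()} | {:error, :invalid_uid}
  def safe_path(base_dir, uid) do
    if plausible?(uid) do
      {:ok, Path.join(base_dir, uid <> ".dcm")}
    else
      {:error, :invalid_uid}
    end
  end

  # A component is "0" or a digit string without a leading zero.
  defp component?("0"), do: true
  defp component?(<<first, rest::binary>>) when first in ?1..?9, do: digits?(rest)
  defp component?(_other), do: false

  defp digits?(<<>>), do: true
  defp digits?(<<d, rest::binary>>) when d in ?0..?9, do: digits?(rest)
  defp digits?(_other), do: false
end
```

Because only digits and dots survive the check, path separators and ".." sequences are rejected before Path.join/2 is ever called.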

PR Guidelines

  • Keep changes focused on a single concern
  • Include tests for new functionality
  • Maintain or improve coverage for the changed area
  • Update @doc and @moduledoc for public API changes