Beaver.MLIR.Dialect.ROCDL (beaver v0.4.7)

Summary

Functions

rocdl.ballot - Vote across thread group

rocdl.barrier

rocdl.cvt.f32.bf8 - Convert bf8 to f32

rocdl.cvt.f32.fp8 - Convert fp8 to f32

rocdl.cvt.pk.bf8.f32 - Convert two f32's to bf8

rocdl.cvt.pk.f32.bf8 - Convert packed bf8 to packed f32

rocdl.cvt.pk.f32.fp8 - Convert packed fp8 to packed f32

rocdl.cvt.pk.fp8.f32 - Convert two f32's to fp8

rocdl.cvt.pkrtz - Convert two f32 inputs into a vector<2xf16>

rocdl.cvt.scale.pk8.bf16.bf8 - Scales 8 bf8 and converts them to 8 bf16.

rocdl.cvt.scale.pk8.bf16.fp4 - Scales 8 fp4 and converts them to 8 bf16.

rocdl.cvt.scale.pk8.bf16.fp8 - Scales 8 fp8 and converts them to 8 bf16.

rocdl.cvt.scale.pk8.f16.bf8 - Scales 8 bf8 and converts them to 8 f16.

rocdl.cvt.scale.pk8.f16.fp4 - Scales 8 fp4 and converts them to 8 f16.

rocdl.cvt.scale.pk8.f16.fp8 - Scales 8 fp8 and converts them to 8 f16.

rocdl.cvt.scale.pk8.f32.bf8 - Scales 8 bf8 and converts them to 8 f32.

rocdl.cvt.scale.pk8.f32.fp4 - Scales 8 fp4 and converts them to 8 f32.

rocdl.cvt.scale.pk8.f32.fp8 - Scales 8 fp8 and converts them to 8 f32.

rocdl.cvt.scale.pk16.bf16.bf6 - Scales 16 bf6 and converts them to 16 bf16.

rocdl.cvt.scale.pk16.bf16.fp6 - Scales 16 fp6 and converts them to 16 bf16.

rocdl.cvt.scale.pk16.f16.bf6 - Scales 16 bf6 and converts them to 16 f16.

rocdl.cvt.scale.pk16.f16.fp6 - Scales 16 fp6 and converts them to 16 f16.

rocdl.cvt.scale.pk16.f32.bf6 - Scales 16 bf6 and converts them to 16 f32.

rocdl.cvt.scale.pk16.f32.fp6 - Scales 16 fp6 and converts them to 16 f32.

rocdl.cvt.scalef32.2xpk16.bf6.f32 - Scale and convert two vector<16xf32> to 32 packed bf6

rocdl.cvt.scalef32.2xpk16.fp6.f32 - Scale and convert two vector<16xf32> to 32 packed fp6

rocdl.cvt.scalef32.f16.bf8 - Scaled convert bf8 from packed vector to f16, updating tied result

rocdl.cvt.scalef32.f16.fp8 - Scaled convert fp8 from packed vector to f16, updating tied result

rocdl.cvt.scalef32.f32.bf8 - Scaled convert bf8 from packed vector to f32

rocdl.cvt.scalef32.f32.fp8 - Scaled convert fp8 from packed vector to f32

rocdl.cvt.scalef32.pk32.bf6.bf16 - Scale and convert packed bf16 to packed bf6

rocdl.cvt.scalef32.pk32.bf6.f16 - Scale and convert packed f16 to packed bf6

rocdl.cvt.scalef32.pk32.bf16.bf6 - Scale and convert packed bf6 to packed bf16

rocdl.cvt.scalef32.pk32.bf16.fp6 - Scale and convert packed fp6 to packed bf16

rocdl.cvt.scalef32.pk32.f16.bf6 - Scale and convert packed bf6 to packed f16

rocdl.cvt.scalef32.pk32.f16.fp6 - Scale and convert packed fp6 to packed f16

rocdl.cvt.scalef32.pk32.f32.bf6 - Scale and convert packed bf6 to packed f32

rocdl.cvt.scalef32.pk32.f32.fp6 - Scale and convert packed fp6 to packed f32

rocdl.cvt.scalef32.pk32.fp6.bf16 - Scale and convert packed bf16 to packed fp6

rocdl.cvt.scalef32.pk32.fp6.f16 - Scale and convert packed f16 to packed fp6

rocdl.cvt.scalef32.pk.bf8.bf16 - Scaled convert two bf16 to two bf8, updating packed vector

rocdl.cvt.scalef32.pk.bf8.f16 - Scaled convert two f16 to two bf8, updating packed vector

rocdl.cvt.scalef32.pk.bf8.f32 - Scaled convert two f32 to two bf8, updating packed vector

rocdl.cvt.scalef32.pk.bf16.bf8 - Scaled convert two bf8 to two bf16

rocdl.cvt.scalef32.pk.bf16.fp4 - Scale and convert two packed fp4 to packed bf16

rocdl.cvt.scalef32.pk.bf16.fp8 - Scaled convert two fp8 to two bf16

rocdl.cvt.scalef32.pk.f16.bf8 - Scaled convert two bf8 to two f16

rocdl.cvt.scalef32.pk.f16.fp4 - Scale and convert two packed fp4 to packed f16

rocdl.cvt.scalef32.pk.f16.fp8 - Scaled convert two fp8 to two f16

rocdl.cvt.scalef32.pk.f32.bf8 - Scaled convert two bf8 to two f32

rocdl.cvt.scalef32.pk.f32.fp4 - Scale and convert two packed fp4 to packed f32

rocdl.cvt.scalef32.pk.f32.fp8 - Scaled convert two fp8 to two f32

rocdl.cvt.scalef32.pk.fp4.bf16 - Scale and convert two bf16 to packed fp4, updating tied vector

rocdl.cvt.scalef32.pk.fp4.f16 - Scale and convert two f16 to packed fp4, updating tied vector

rocdl.cvt.scalef32.pk.fp4.f32 - Scale and convert two f32 values to two packed fp4, updating tied vector

rocdl.cvt.scalef32.pk.fp8.bf16 - Scaled convert two bf16 to two fp8, updating packed vector

rocdl.cvt.scalef32.pk.fp8.f16 - Scaled convert two f16 to two fp8, updating packed vector

rocdl.cvt.scalef32.pk.fp8.f32 - Scaled convert two f32 to two fp8, updating packed vector

rocdl.cvt.scalef32.sr.bf8.bf16 - Scaled convert bf16 to bf8 with stochastic rounding, updating packed vector

rocdl.cvt.scalef32.sr.bf8.f16 - Scaled convert f16 to bf8 with stochastic rounding, updating packed vector

rocdl.cvt.scalef32.sr.bf8.f32 - Scaled convert f32 to bf8 with stochastic rounding, updating packed vector

rocdl.cvt.scalef32.sr.fp8.bf16 - Scaled convert bf16 to fp8 with stochastic rounding, updating packed vector

rocdl.cvt.scalef32.sr.fp8.f16 - Scaled convert f16 to fp8 with stochastic rounding, updating packed vector

rocdl.cvt.scalef32.sr.fp8.f32 - Scaled convert f32 to fp8 with stochastic rounding, updating packed vector

rocdl.cvt.scalef32.sr.pk32.bf6.bf16 - Scale and convert packed bf16 to packed bf6 with stochastic rounding

rocdl.cvt.scalef32.sr.pk32.bf6.f16 - Scale and convert packed f16 to packed bf6 with stochastic rounding

rocdl.cvt.scalef32.sr.pk32.bf6.f32 - Scale and convert packed f32 to packed bf6 with stochastic rounding

rocdl.cvt.scalef32.sr.pk32.fp6.bf16 - Scale and convert packed bf16 to packed fp6 with stochastic rounding

rocdl.cvt.scalef32.sr.pk32.fp6.f16 - Scale and convert packed f16 to packed fp6 with stochastic rounding

rocdl.cvt.scalef32.sr.pk32.fp6.f32 - Scale and convert packed f32 to packed fp6 with stochastic rounding

rocdl.cvt.scalef32.sr.pk.fp4.bf16 - Scale and convert two bf16 to packed fp4 with stochastic rounding, updating tied vector

rocdl.cvt.scalef32.sr.pk.fp4.f16 - Scale and convert two f16 to packed fp4 with stochastic rounding, updating tied vector

rocdl.cvt.scalef32.sr.pk.fp4.f32 - Scale and convert two f32 to packed fp4 with stochastic rounding, updating tied vector

rocdl.cvt.sr.bf8.f32 - Convert f32 to bf8, stochastic rounding

rocdl.cvt.sr.fp8.f32 - Convert f32 to fp8, stochastic rounding

rocdl.ds_bpermute

rocdl.ds.read.tr4.b64

rocdl.ds.read.tr6.b96

rocdl.ds.read.tr8.b64

rocdl.ds.read.tr16.b64

rocdl.ds_swizzle

rocdl.fmed3 - Median of three float/half values

rocdl.global.load.lds

rocdl.grid.dim.x

rocdl.grid.dim.y

rocdl.grid.dim.z

rocdl.iglp.opt

rocdl.load.to.lds

rocdl.make.buffer.rsrc

rocdl.mbcnt.hi

rocdl.mbcnt.lo

rocdl.mfma.f32.4x4x1f32

rocdl.mfma.f32.4x4x2bf16

rocdl.mfma.f32.4x4x4bf16.1k

rocdl.mfma.f32.4x4x4f16

rocdl.mfma.f32.16x16x1f32

rocdl.mfma.f32.16x16x2bf16

rocdl.mfma.f32.16x16x4bf16.1k

rocdl.mfma.f32.16x16x4f16

rocdl.mfma.f32.16x16x4f32

rocdl.mfma.f32.16x16x8.xf32

rocdl.mfma.f32.16x16x8bf16

rocdl.mfma.f32.16x16x16bf16.1k

rocdl.mfma.f32.16x16x16f16

rocdl.mfma.f32.16x16x32.bf8.bf8

rocdl.mfma.f32.16x16x32.bf8.fp8

rocdl.mfma.f32.16x16x32.bf16

rocdl.mfma.f32.16x16x32.f16

rocdl.mfma.f32.16x16x32.fp8.bf8

rocdl.mfma.f32.16x16x32.fp8.fp8

rocdl.mfma.f32.32x32x1f32

rocdl.mfma.f32.32x32x2bf16

rocdl.mfma.f32.32x32x2f32

rocdl.mfma.f32.32x32x4.xf32

rocdl.mfma.f32.32x32x4bf16

rocdl.mfma.f32.32x32x4bf16.1k

rocdl.mfma.f32.32x32x4f16

rocdl.mfma.f32.32x32x8bf16.1k

rocdl.mfma.f32.32x32x8f16

rocdl.mfma.f32.32x32x16.bf8.bf8

rocdl.mfma.f32.32x32x16.bf8.fp8

rocdl.mfma.f32.32x32x16.bf16

rocdl.mfma.f32.32x32x16.f16

rocdl.mfma.f32.32x32x16.fp8.bf8

rocdl.mfma.f32.32x32x16.fp8.fp8

rocdl.mfma.f64.4x4x4f64

rocdl.mfma.f64.16x16x4f64

rocdl.mfma.i32.4x4x4i8

rocdl.mfma.i32.16x16x4i8

rocdl.mfma.i32.16x16x16i8

rocdl.mfma.i32.16x16x32.i8

rocdl.mfma.i32.16x16x64.i8

rocdl.mfma.i32.32x32x4i8

rocdl.mfma.i32.32x32x8i8

rocdl.mfma.i32.32x32x16.i8

rocdl.mfma.i32.32x32x32.i8

rocdl.mfma.scale.f32.16x16x128.f8f6f4

rocdl.mfma.scale.f32.32x32x64.f8f6f4

rocdl.permlane16.swap

rocdl.permlane32.swap

rocdl.permlanex16

rocdl.raw.buffer.atomic.cmpswap

rocdl.raw.buffer.atomic.fadd

rocdl.raw.buffer.atomic.fmax

rocdl.raw.buffer.atomic.smax

rocdl.raw.buffer.atomic.umin

rocdl.raw.buffer.load

rocdl.raw.buffer.store

rocdl.raw.ptr.buffer.atomic.cmpswap

rocdl.raw.ptr.buffer.atomic.fadd

rocdl.raw.ptr.buffer.atomic.fmax

rocdl.raw.ptr.buffer.atomic.smax

rocdl.raw.ptr.buffer.atomic.umin

rocdl.raw.ptr.buffer.load

rocdl.raw.ptr.buffer.load.lds

rocdl.raw.ptr.buffer.store

rocdl.readfirstlane - Get the value in first active lane.

rocdl.readlane - Get the value in the specific lane.

rocdl.s.barrier

rocdl.s.barrier.signal

rocdl.s.barrier.wait

rocdl.s.setprio

rocdl.s.sleep

rocdl.s.wait.dscnt

rocdl.s.wait.expcnt

rocdl.s.wait.loadcnt

rocdl.s.wait.storecnt

rocdl.s.waitcnt

rocdl.sched.barrier

rocdl.sched.group.barrier

rocdl.smfmac.f32.16x16x32.bf16

rocdl.smfmac.f32.16x16x32.f16

rocdl.smfmac.f32.16x16x64.bf8.bf8

rocdl.smfmac.f32.16x16x64.bf8.fp8

rocdl.smfmac.f32.16x16x64.fp8.bf8

rocdl.smfmac.f32.16x16x64.fp8.fp8

rocdl.smfmac.f32.32x32x16.bf16

rocdl.smfmac.f32.32x32x16.f16

rocdl.smfmac.f32.32x32x32.bf8.bf8

rocdl.smfmac.f32.32x32x32.bf8.fp8

rocdl.smfmac.f32.32x32x32.fp8.bf8

rocdl.smfmac.f32.32x32x32.fp8.fp8

rocdl.smfmac.i32.16x16x64.i8

rocdl.smfmac.i32.32x32x32.i8

rocdl.update.dpp

rocdl.wavefrontsize

rocdl.wmma.bf16.16x16x16.bf16

rocdl.wmma.f16.16x16x16.f16

rocdl.wmma.f32.16x16x16.bf8_bf8

rocdl.wmma.f32.16x16x16.bf8_fp8

rocdl.wmma.f32.16x16x16.bf16

rocdl.wmma.f32.16x16x16.f16

rocdl.wmma.f32.16x16x16.fp8_bf8

rocdl.wmma.f32.16x16x16.fp8_fp8

rocdl.wmma.i32.16x16x16.iu4

rocdl.wmma.i32.16x16x16.iu8

rocdl.wmma.i32.16x16x32.iu4

rocdl.workgroup.dim.x

rocdl.workgroup.dim.y

rocdl.workgroup.dim.z

rocdl.workgroup.id.x

rocdl.workgroup.id.y

rocdl.workgroup.id.z

rocdl.workitem.id.x

rocdl.workitem.id.y

rocdl.workitem.id.z

Functions

ballot(ssa)

rocdl.ballot - Vote across thread group

Operands

  • pred - Single, I1, 1-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Ballot provides a bit mask containing the 1-bit predicate value from each lane. The nth bit of the result contains the 1 bit contributed by the nth warp lane.
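
The mask construction described above can be modelled on the host. This is a minimal Python sketch rather than Beaver or MLIR code; the function name ballot and the 8-lane example group are hypothetical and chosen only for illustration.

```python
def ballot(preds):
    """Build a ballot mask: bit n holds the predicate of lane n."""
    mask = 0
    for lane, pred in enumerate(preds):
        if pred:
            mask |= 1 << lane
    return mask


# Hypothetical 8-lane group for illustration; real wavefronts are wider.
lane_preds = [True, False, True, True, False, False, False, True]
mask = ballot(lane_preds)
print(bin(mask))
```

Lanes 0, 2, 3 and 7 vote true here, so only those four bits are set in the mask.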

barrier(ssa)

rocdl.barrier

cvt_f32_bf8(ssa)

rocdl.cvt.f32.bf8 - Convert bf8 to f32

Attributes

  • byteSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • srcA - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert the 8-bit bf8 value in the byte of srcA selected by byteSel to fp32.
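
The selection step can be sketched in Python as below. It assumes, without confirmation from this page, that byteSel = 0 picks the least significant byte of srcA; the helper name select_byte is hypothetical, and decoding the 8-bit pattern to a float is deliberately left out.

```python
def select_byte(src_a, byte_sel):
    """Return the byte of a 32-bit value chosen by byte_sel (0..3).

    Assumption: index 0 is the least significant byte.
    """
    if not 0 <= byte_sel <= 3:
        raise ValueError("byte_sel must be in 0..3")
    return (src_a >> (8 * byte_sel)) & 0xFF


print(hex(select_byte(0xAABBCCDD, 0)))
print(hex(select_byte(0xAABBCCDD, 3)))
```

The same selection applies to the fp8 variant; only the interpretation of the selected byte differs.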

cvt_f32_fp8(ssa)

rocdl.cvt.f32.fp8 - Convert fp8 to f32

Attributes

  • byteSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • srcA - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert the 8-bit fp8 value in the byte of srcA selected by byteSel to fp32.

cvt_pk_bf8_f32(ssa)

rocdl.cvt.pk.bf8.f32 - Convert two f32's to bf8

Attributes

  • wordSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • srcA - Single, F32, 32-bit float
  • srcB - Single, F32, 32-bit float
  • old - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert srcA and srcB to bf8 and store them into the low or high word of old, as selected by wordSel, preserving the other word.
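
The "preserving the other word" behaviour can be illustrated with a small Python model. It takes the two converted bytes as an already packed 16-bit value, so it makes no claim about byte order or the bf8 encoding. The name insert_word is hypothetical, and mapping wordSel = false to the low word is an assumption.

```python
def insert_word(old, word16, word_sel):
    """Replace one 16-bit word of a 32-bit value, keeping the other word.

    Assumption: word_sel False targets bits 0..15, True targets bits 16..31.
    """
    shift = 16 if word_sel else 0
    keep_mask = 0xFFFFFFFF & ~(0xFFFF << shift)
    return (old & keep_mask) | ((word16 & 0xFFFF) << shift)


print(hex(insert_word(0x11223344, 0xABCD, False)))
print(hex(insert_word(0x11223344, 0xABCD, True)))
```

Two such operations with opposite wordSel values fill all four bytes of a 32-bit register.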

cvt_pk_f32_bf8(ssa)

rocdl.cvt.pk.f32.bf8 - Convert packed bf8 to packed f32

Attributes

  • wordSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert the packed bf8 values in the half of src selected by wordSel to packed fp32.

cvt_pk_f32_fp8(ssa)

rocdl.cvt.pk.f32.fp8 - Convert packed fp8 to packed f32

Attributes

  • wordSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert the packed fp8 values in the half of src selected by wordSel to packed fp32.
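
As a rough illustration of the unpacking direction, the Python sketch below extracts the two 8-bit patterns that such a conversion would read. The name unpack_word is hypothetical. Two things are assumptions: that wordSel = false selects the low 16 bits, and that the first value sits in the low byte of the selected word.

```python
def unpack_word(src, word_sel):
    """Return the two bytes held in the selected 16-bit word of src.

    Assumptions: word_sel False selects bits 0..15, and the first value
    is the low byte of that word.
    """
    word = (src >> (16 if word_sel else 0)) & 0xFFFF
    return word & 0xFF, word >> 8


print([hex(b) for b in unpack_word(0x11223344, False)])
print([hex(b) for b in unpack_word(0x11223344, True)])
```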

cvt_pk_fp8_f32(ssa)

rocdl.cvt.pk.fp8.f32 - Convert two f32's to fp8

Attributes

  • wordSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • srcA - Single, F32, 32-bit float
  • srcB - Single, F32, 32-bit float
  • old - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert srcA and srcB to fp8 and store them into the low or high word of old, as selected by wordSel, preserving the other word.

cvt_pkrtz(ssa)

rocdl.cvt.pkrtz - Convert two f32 inputs into a vector<2xf16>

Operands

  • srcA - Single, F32, 32-bit float
  • srcB - Single, F32, 32-bit float

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert two f32 values into a packed vector<2xf16>.

cvt_scale_pk8_bf16_bf8(ssa)

rocdl.cvt.scale.pk8.bf16.bf8 - Scales 8 bf8 and converts them to 8 bf16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V2I32Type, fixed-length vector of 32-bit signless integer values of length 2
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8BF16Type, fixed-length vector of bfloat16 type values of length 8

Description

Available on gfx1250+.

cvt_scale_pk8_bf16_fp4(ssa)

rocdl.cvt.scale.pk8.bf16.fp4 - Scales 8 fp4 and converts them to 8 bf16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8BF16Type, fixed-length vector of bfloat16 type values of length 8

Description

Available on gfx1250+.

cvt_scale_pk8_bf16_fp8(ssa)

rocdl.cvt.scale.pk8.bf16.fp8 - Scales 8 fp8 and converts them to 8 bf16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V2I32Type, fixed-length vector of 32-bit signless integer values of length 2
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8BF16Type, fixed-length vector of bfloat16 type values of length 8

Description

Available on gfx1250+.

cvt_scale_pk8_f16_bf8(ssa)

rocdl.cvt.scale.pk8.f16.bf8 - Scales 8 bf8 and converts them to 8 f16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V2I32Type, fixed-length vector of 32-bit signless integer values of length 2
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8F16Type, fixed-length vector of 16-bit float values of length 8

Description

Available on gfx1250+.

cvt_scale_pk8_f16_fp4(ssa)

rocdl.cvt.scale.pk8.f16.fp4 - Scales 8 fp4 and converts them to 8 f16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8F16Type, fixed-length vector of 16-bit float values of length 8

Description

Available on gfx1250+.

cvt_scale_pk8_f16_fp8(ssa)

rocdl.cvt.scale.pk8.f16.fp8 - Scales 8 fp8 and converts them to 8 f16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V2I32Type, fixed-length vector of 32-bit signless integer values of length 2
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8F16Type, fixed-length vector of 16-bit float values of length 8

Description

Available on gfx1250+.

cvt_scale_pk8_f32_bf8(ssa)

rocdl.cvt.scale.pk8.f32.bf8 - Scales 8 bf8 and converts them to 8 f32.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V2I32Type, fixed-length vector of 32-bit signless integer values of length 2
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8F32Type, fixed-length vector of 32-bit float values of length 8

Description

Available on gfx1250+.

cvt_scale_pk8_f32_fp4(ssa)

rocdl.cvt.scale.pk8.f32.fp4 - Scales 8 fp4 and converts them to 8 f32.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8F32Type, fixed-length vector of 32-bit float values of length 8

Description

Available on gfx1250+.

cvt_scale_pk8_f32_fp8(ssa)

rocdl.cvt.scale.pk8.f32.fp8 - Scales 8 fp8 and converts them to 8 f32.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V2I32Type, fixed-length vector of 32-bit signless integer values of length 2
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V8F32Type, fixed-length vector of 32-bit float values of length 8

Description

Available on gfx1250+.

cvt_scale_pk16_bf16_bf6(ssa)

rocdl.cvt.scale.pk16.bf16.bf6 - Scales 16 bf6 and converts them to 16 bf16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V3I32Type, fixed-length vector of 32-bit signless integer values of length 3
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V16BF16Type, fixed-length vector of bfloat16 type values of length 16

Description

Available on gfx1250+.

cvt_scale_pk16_bf16_fp6(ssa)

rocdl.cvt.scale.pk16.bf16.fp6 - Scales 16 fp6 and converts them to 16 bf16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V3I32Type, fixed-length vector of 32-bit signless integer values of length 3
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V16BF16Type, fixed-length vector of bfloat16 type values of length 16

Description

Available on gfx1250+.

cvt_scale_pk16_f16_bf6(ssa)

rocdl.cvt.scale.pk16.f16.bf6 - Scales 16 bf6 and converts them to 16 f16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V3I32Type, fixed-length vector of 32-bit signless integer values of length 3
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V16F16Type, fixed-length vector of 16-bit float values of length 16

Description

Available on gfx1250+.

cvt_scale_pk16_f16_fp6(ssa)

rocdl.cvt.scale.pk16.f16.fp6 - Scales 16 fp6 and converts them to 16 f16.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V3I32Type, fixed-length vector of 32-bit signless integer values of length 3
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V16F16Type, fixed-length vector of 16-bit float values of length 16

Description

Available on gfx1250+.

cvt_scale_pk16_f32_bf6(ssa)

rocdl.cvt.scale.pk16.f32.bf6 - Scales 16 bf6 and converts them to 16 f32.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V3I32Type, fixed-length vector of 32-bit signless integer values of length 3
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V16F32Type, fixed-length vector of 32-bit float values of length 16

Description

Available on gfx1250+.

cvt_scale_pk16_f32_fp6(ssa)

rocdl.cvt.scale.pk16.f32.fp6 - Scales 16 fp6 and converts them to 16 f32.

Attributes

  • scaleSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, ROCDL_V3I32Type, fixed-length vector of 32-bit signless integer values of length 3
  • scale - Single, I32, 32-bit signless integer

Results

  • res - Single, ROCDL_V16F32Type, fixed-length vector of 32-bit float values of length 16

Description

Available on gfx1250+.
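
The src operand types of the pk8 and pk16 conversions above follow from simple bit counting, which the Python check below makes explicit. The helper name words_needed is hypothetical. The element widths (8 bits for fp8/bf8, 6 bits for fp6/bf6, 4 bits for fp4) are taken from the format names.

```python
def words_needed(count, bits_per_value):
    """Number of 32-bit words needed to hold `count` packed values."""
    total_bits = count * bits_per_value
    if total_bits % 32:
        raise ValueError("values do not fill whole 32-bit words")
    return total_bits // 32


print(words_needed(8, 8))   # 8 x fp8/bf8, src is a vector of 2 x i32
print(words_needed(8, 4))   # 8 x fp4, src is a single i32
print(words_needed(16, 6))  # 16 x fp6/bf6, src is a vector of 3 x i32
```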

cvt_scalef32_2xpk16_bf6_f32(ssa)

rocdl.cvt.scalef32.2xpk16.bf6.f32 - Scale and convert two vector<16xf32> to 32 packed bf6

Operands

  • src0 - Single, ROCDL_V16F32Type, fixed-length vector of 32-bit float values of length 16
  • src1 - Single, ROCDL_V16F32Type, fixed-length vector of 32-bit float values of length 16
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 single-precision float values, packed into two length-16 vectors that will be logically concatenated, to packed bf6, dividing by the exponent part of scale before doing so.
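
The result type has six 32-bit words because 32 values of 6 bits each occupy 192 bits. The Python sketch below packs and unpacks 6-bit codes under an assumed layout in which value i occupies bits 6*i to 6*i+5 of the concatenated words. That layout is an illustration, not something this page specifies, and the names pack_6bit and unpack_6bit are hypothetical.

```python
def pack_6bit(codes):
    """Pack 32 six-bit codes into six 32-bit words (assumed contiguous layout)."""
    if len(codes) != 32:
        raise ValueError("expected exactly 32 codes")
    bits = 0
    for i, code in enumerate(codes):
        bits |= (code & 0x3F) << (6 * i)
    return [(bits >> (32 * w)) & 0xFFFFFFFF for w in range(6)]


def unpack_6bit(words):
    """Inverse of pack_6bit under the same assumed layout."""
    bits = 0
    for w, word in enumerate(words):
        bits |= (word & 0xFFFFFFFF) << (32 * w)
    return [(bits >> (6 * i)) & 0x3F for i in range(32)]


codes = list(range(32))
words = pack_6bit(codes)
print(len(words), unpack_6bit(words) == codes)
```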

cvt_scalef32_2xpk16_fp6_f32(ssa)

rocdl.cvt.scalef32.2xpk16.fp6.f32 - Scale and convert two vector<16xf32> to 32 packed fp6

Operands

  • src0 - Single, ROCDL_V16F32Type, fixed-length vector of 32-bit float values of length 16
  • src1 - Single, ROCDL_V16F32Type, fixed-length vector of 32-bit float values of length 16
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 single-precision float values, packed into two length-16 vectors that will be logically concatenated, to packed fp6, dividing by the exponent part of scale before doing so.

cvt_scalef32_f16_bf8(ssa)

rocdl.cvt.scalef32.f16.bf8 - Scaled convert bf8 from packed vector to f16, updating tied result

Attributes

  • srcSelIndex - Single, I32Attr, 32-bit signless integer attribute
  • dstLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • oldVdst - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2
  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2

Description

Convert a bf8 byte from src, selected by srcSelIndex, to f16, multiplying it by the exponent of scale, and place it into the element of oldVdst selected by dstLoHiSel, preserving the other element of that vector in the return value.

The bytes are stored as an i32 and not a <4 x i8>.

cvt_scalef32_f16_fp8(ssa)

rocdl.cvt.scalef32.f16.fp8 - Scaled convert fp8 from packed vector to f16, updating tied result

Attributes

  • srcSelIndex - Single, I32Attr, 32-bit signless integer attribute
  • dstLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • oldVdst - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2
  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2

Description

Convert an fp8 byte from src, selected by srcSelIndex, to f16, multiplying it by the exponent of scale, and place it into the element of oldVdst selected by dstLoHiSel, preserving the other element of that vector in the return value.

The bytes are stored as an i32 and not a <4 x i8>.

cvt_scalef32_f32_bf8(ssa)

rocdl.cvt.scalef32.f32.bf8 - Scaled convert bf8 from packed vector to f32

Attributes

  • srcSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, F32, 32-bit float

Description

Convert a bf8 byte from src, selected by srcSelIndex, to f32, multiplying it by the exponent of scale.

The bytes are stored in an i32, not a <4 x i8>.
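
One plausible reading of "the exponent of scale" is that only the 8-bit exponent field of the f32 scale is used, so that the effective factor is a power of two. The Python sketch below implements that reading as an assumption, not as a statement of the hardware behaviour. The helper name scale_factor is hypothetical, and special cases (zero, denormal, infinity and NaN scales) are not modelled.

```python
import struct


def scale_factor(scale):
    """Power of two given by the exponent field of an f32 (assumed reading).

    The sign and mantissa bits of `scale` are ignored.
    """
    bits = struct.unpack("<I", struct.pack("<f", scale))[0]
    biased_exp = (bits >> 23) & 0xFF
    return 2.0 ** (biased_exp - 127)


print(scale_factor(8.0))
print(scale_factor(12.0))  # 12.0 = 1.5 * 2**3, so the mantissa is dropped
print(scale_factor(0.25))
```

Under this reading, the conversions that widen a value multiply by this factor, and the conversions that narrow a value divide by it.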

cvt_scalef32_f32_fp8(ssa)

rocdl.cvt.scalef32.f32.fp8 - Scaled convert fp8 from packed vector to f32

Attributes

  • srcSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, F32, 32-bit float

Description

Convert a fp8 byte from src, selected by srcSelIndex, to f32, multiplying it by the exponent of scale.

The bytes are stored in an i32, not a <4 x i8>.

cvt_scalef32_pk32_bf6_bf16(ssa)

rocdl.cvt.scalef32.pk32.bf6.bf16 - Scale and convert packed bf16 to packed bf6

Operands

  • src - Single, ROCDL_V32BF16Type, fixed-length vector of bfloat16 type values of length 32
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed bf16 values to packed bf6, dividing by the exponent part of scale before doing so.

cvt_scalef32_pk32_bf6_f16(ssa)

rocdl.cvt.scalef32.pk32.bf6.f16 - Scale and convert packed f16 to packed bf6

Operands

  • src - Single, ROCDL_V32F16Type, fixed-length vector of 16-bit float values of length 32
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed f16 values to packed bf6, dividing by the exponent part of scale before doing so.

cvt_scalef32_pk32_bf16_bf6(ssa)

rocdl.cvt.scalef32.pk32.bf16.bf6 - Scale and convert packed bf6 to packed bf16

Operands

  • src - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V32BF16Type, fixed-length vector of bfloat16 type values of length 32

Description

Convert 32 packed bf6 values to packed bf16, multiplying by the exponent part of scale before doing so.

cvt_scalef32_pk32_bf16_fp6(ssa)

rocdl.cvt.scalef32.pk32.bf16.fp6 - Scale and convert packed fp6 to packed bf16

Operands

  • src - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V32BF16Type, fixed-length vector of bfloat16 type values of length 32

Description

Convert 32 packed fp6 values to packed bf16, multiplying by the exponent part of scale before doing so.

cvt_scalef32_pk32_f16_bf6(ssa)

rocdl.cvt.scalef32.pk32.f16.bf6 - Scale and convert packed bf6 to packed f16

Operands

  • src - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V32F16Type, fixed-length vector of 16-bit float values of length 32

Description

Convert 32 packed bf6 values to packed f16, multiplying by the exponent part of scale before doing so.

cvt_scalef32_pk32_f16_fp6(ssa)

rocdl.cvt.scalef32.pk32.f16.fp6 - Scale and convert packed fp6 to packed f16

Operands

  • src - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V32F16Type, fixed-length vector of 16-bit float values of length 32

Description

Convert 32 packed fp6 values to packed f16, multiplying by the exponent part of scale before doing so.

cvt_scalef32_pk32_f32_bf6(ssa)

rocdl.cvt.scalef32.pk32.f32.bf6 - Scale and convert packed bf6 to packed f32

Operands

  • src - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V32F32Type, fixed-length vector of 32-bit float values of length 32

Description

Convert 32 packed bf6 values to packed f32, multiplying by the exponent part of scale before doing so.

cvt_scalef32_pk32_f32_fp6(ssa)

rocdl.cvt.scalef32.pk32.f32.fp6 - Scale and convert packed fp6 to packed f32

Operands

  • src - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V32F32Type, fixed-length vector of 32-bit float values of length 32

Description

Convert 32 packed fp6 values to packed f32, multiplying by the exponent part of scale before doing so.

cvt_scalef32_pk32_fp6_bf16(ssa)

rocdl.cvt.scalef32.pk32.fp6.bf16 - Scale and convert packed bf16 to packed fp6

Operands

  • src - Single, ROCDL_V32BF16Type, fixed-length vector of bfloat16 type values of length 32
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed bf16 values to packed fp6, dividing by the exponent part of scale before doing so.

cvt_scalef32_pk32_fp6_f16(ssa)

rocdl.cvt.scalef32.pk32.fp6.f16 - Scale and convert packed f16 to packed fp6

Operands

  • src - Single, ROCDL_V32F16Type, fixed-length vector of 16-bit float values of length 32
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed f16 values to packed fp6, dividing by the exponent part of scale before doing so.

cvt_scalef32_pk_bf8_bf16(ssa)

rocdl.cvt.scalef32.pk.bf8.bf16 - Scaled convert two bf16 to two bf8, updating packed vector

Attributes

  • dstLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • oldVdst - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2
  • src0 - Single, ROCDL_V2BF16Type, fixed-length vector of bfloat16 type values of length 2
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2

Description

Convert two bf16 values in src0 to two bf8 bytes, dividing by the exponent in scale. The bytes are packed into a 16-bit value which is inserted into oldVdst at the dstLoHiSel position, with the entire updated vector being returned.
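
The packing and insertion step can be modelled in Python as follows. The two converted bytes are passed in directly, so the sketch says nothing about the bf8 encoding or the scaling. Three things are assumptions: the name update_packed, placing the first byte in the low half of the 16-bit value, and mapping dstLoHiSel = false to element 0.

```python
def update_packed(old_vdst, byte0, byte1, dst_lo_hi_sel):
    """Insert two bytes as one 16-bit element of a 2-element vector.

    Assumptions: byte0 goes to the low 8 bits, and dst_lo_hi_sel False
    targets element 0. The other element is copied from old_vdst.
    """
    half = (byte0 & 0xFF) | ((byte1 & 0xFF) << 8)
    result = list(old_vdst)
    result[1 if dst_lo_hi_sel else 0] = half
    return result


print([hex(v) for v in update_packed([0x1111, 0x2222], 0xAB, 0xCD, False)])
print([hex(v) for v in update_packed([0x1111, 0x2222], 0xAB, 0xCD, True)])
```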

cvt_scalef32_pk_bf8_f16(ssa)

rocdl.cvt.scalef32.pk.bf8.f16 - Scaled convert two f16 to two bf8, updating packed vector

Attributes

  • dstLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • oldVdst - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2
  • src0 - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2

Description

Convert two f16 values in src0 to two bf8 bytes, dividing by the exponent in scale. The bytes are packed into a 16-bit value which is inserted into oldVdst at the dstLoHiSel position, with the entire updated vector being returned.

cvt_scalef32_pk_bf8_f32(ssa)

rocdl.cvt.scalef32.pk.bf8.f32 - Scaled convert two f32 to two bf8, updating packed vector

Attributes

  • dstLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • oldVdst - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2
  • src0 - Single, F32, 32-bit float
  • src1 - Single, F32, 32-bit float
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2

Description

Convert two f32 values in src0 and src1 to two bf8 bytes, dividing by the exponent in scale. The bytes are packed into a 16-bit value which is inserted into oldVdst at the dstLoHiSel position, with the entire updated vector being returned.

cvt_scalef32_pk_bf16_bf8(ssa)

rocdl.cvt.scalef32.pk.bf16.bf8 - Scaled convert two bf8 to two bf16

Attributes

  • srcLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2BF16Type, fixed-length vector of bfloat16 type values of length 2

Description

Convert two packed bf8 values in src to two bf16 values, multiplying by the exponent in scale. The two values to be converted are selected from the low or high half of src (a packed vector represented as an i32) on the basis of srcLoHiSel.

cvt_scalef32_pk_bf16_fp4(ssa)

rocdl.cvt.scalef32.pk.bf16.fp4 - Scale and convert two packed fp4 to packed bf16

Attributes

  • srcSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2BF16Type, fixed-length vector of bfloat16 type values of length 2

Description

Convert two packed fp4 (f4E2M1) values stored as one byte of a 32-bit integer to packed bf16, multiplying by the exponent part of scale before doing so.

The byte to convert is chosen by srcSelIndex.
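To make the byte selection and fp4 decoding concrete, here is a small Python reference model. The value table is the standard E2M1 encoding (1 sign bit, 2 exponent bits, 1 mantissa bit). The function names are hypothetical, the low nibble is assumed to hold the first value, and the scale is passed in as an already extracted power-of-two factor (`scale_factor`) rather than derived from an f32 here.

```python
# Magnitudes of the eight non-negative E2M1 codes (sign bit cleared).
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def decode_fp4(nibble):
    """Decode one 4-bit E2M1 code into a Python float."""
    magnitude = FP4_E2M1_MAGNITUDES[nibble & 0x7]
    return -magnitude if nibble & 0x8 else magnitude


def decode_fp4_pair(src, src_sel_index, scale_factor=1.0):
    """Decode the two fp4 values in byte `src_sel_index` of a 32-bit word.

    Assumption: the low nibble is the first value, the high nibble the second.
    """
    if not 0 <= src_sel_index <= 3:
        raise ValueError("src_sel_index must be in 0..3")
    byte = (src >> (8 * src_sel_index)) & 0xFF
    low, high = byte & 0xF, byte >> 4
    return decode_fp4(low) * scale_factor, decode_fp4(high) * scale_factor
```

For instance, with byte 1 of src equal to 0xF2, the model yields 1.0 for the low nibble and -6.0 for the high nibble before scaling.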

cvt_scalef32_pk_bf16_fp8(ssa)

rocdl.cvt.scalef32.pk.bf16.fp8 - Scaled convert two fp8 to two bf16

Attributes

  • srcLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2BF16Type, fixed-length vector of bfloat16 type values of length 2

Description

Convert two packed fp8 values in src to two bf16 values, multiplying by the exponent in scale. The two values to be converted are selected from the low or high half of src (a packed vector represented as an i32) on the basis of srcLoHiSel.

cvt_scalef32_pk_f16_bf8(ssa)

rocdl.cvt.scalef32.pk.f16.bf8 - Scaled convert two bf8 to two f16

Attributes

  • srcLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2

Description

Convert two packed bf8 values in src to two f16 values, multiplying by the exponent in scale. The two values to be converted are selected from the low or high half of src (a packed vector represented as an i32) on the basis of srcLoHiSel.

cvt_scalef32_pk_f16_fp4(ssa)

rocdl.cvt.scalef32.pk.f16.fp4 - Scale and convert two packed fp4 to packed f16

Attributes

  • srcSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2

Description

Convert two packed fp4 (f4E2M1) values stored as one byte of a 32-bit integer to packed f16, multiplying by the exponent part of scale before doing so.

The byte to convert is chosen by srcSelIndex.

cvt_scalef32_pk_f16_fp8(ssa)

rocdl.cvt.scalef32.pk.f16.fp8 - Scaled convert two fp8 to two f16

Attributes

  • srcLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2

Description

Convert two packed fp8 values in src to two f16 values, multiplying by the exponent in scale. The two values to be converted are selected from the low or high half of src (a packed vector represented as an i32) on the basis of srcLoHiSel.

cvt_scalef32_pk_f32_bf8(ssa)

rocdl.cvt.scalef32.pk.f32.bf8 - Scaled convert two bf8 to two f32

Attributes

  • srcLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2F32Type, fixed-length vector of 32-bit float values of length 2

Description

Convert two packed bf8 values in src to two f32 values, multiplying by the exponent in scale. The two values to be converted are selected from the low or high half of src (a packed vector represented as an i32) on the basis of srcLoHiSel.

cvt_scalef32_pk_f32_fp4(ssa)

rocdl.cvt.scalef32.pk.f32.fp4 - Scale and convert two packed fp4 to packed f32

Attributes

  • srcSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2F32Type, fixed-length vector of 32-bit float values of length 2

Description

Convert two packed fp4 (f4E2M1) values stored as one byte of a 32-bit integer to packed f32, multiplying by the exponent part of scale before doing so.

The byte to convert is chosen by srcSelIndex.

cvt_scalef32_pk_f32_fp8(ssa)

rocdl.cvt.scalef32.pk.f32.fp8 - Scaled convert two fp8 to two f32

Attributes

  • srcLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • src - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2F32Type, fixed-length vector of 32-bit float values of length 2

Description

Convert two packed fp8 values in src to two f32 values, multiplying by the exponent in scale. The two values to be converted are selected from the low or high half of src (a packed vector represented as an i32) on the basis of srcLoHiSel.
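The phrase "the exponent in scale" can be read as the power of two encoded in the exponent field of the f32 scale operand, with the sign and mantissa ignored. Under that reading, which is an interpretation and not something this page spells out, the factor could be computed as in the Python sketch below. The name `scale_exponent_factor` is hypothetical, and only normal (non-zero, finite) scales are modelled.

```python
import struct


def scale_exponent_factor(scale):
    """Return 2**e, where e is the unbiased exponent of `scale` as an f32.

    Assumption: the sign and mantissa of `scale` are ignored. Zero,
    denormal, infinite and NaN inputs are not modelled here.
    """
    bits = struct.unpack("<I", struct.pack("<f", scale))[0]
    biased_exponent = (bits >> 23) & 0xFF
    return 2.0 ** (biased_exponent - 127)
```

With this model a scale of 12.0 (1.5 × 2^3) contributes a factor of 8.0, the same as a scale of exactly 8.0.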

cvt_scalef32_pk_fp4_bf16(ssa)

rocdl.cvt.scalef32.pk.fp4.bf16 - Scale and convert two bf16 to packed fp4, updating tied vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src - Single, ROCDL_V2BF16Type, fixed-length vector of bfloat16 type values of length 2
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert two packed bf16 values to packed fp4, dividing by the exponent part of scale before doing so.

The two scaled values are packed into a byte. That byte replaces the byte of oldVdst selected by dstSelIndex, and the updated oldVdst is returned in its entirety.
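The tied-destination update can be pictured as a masked byte insertion into a 32-bit word. The following Python sketch assumes that dstSelIndex counts bytes from the least significant end (index 0 is bits 0..7); the function name `insert_byte` is hypothetical.

```python
def insert_byte(old_vdst, byte, dst_sel_index):
    """Replace one byte of a 32-bit word and return the updated word.

    Assumption: index 0 is the least significant byte.
    """
    if not 0 <= dst_sel_index <= 3:
        raise ValueError("dst_sel_index must be in 0..3")
    shift = 8 * dst_sel_index
    cleared = old_vdst & ~(0xFF << shift) & 0xFFFFFFFF
    return cleared | ((byte & 0xFF) << shift)
```

Because the three unselected bytes are preserved, four such operations with indices 0 through 3 can fill one i32 with eight fp4 values.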

cvt_scalef32_pk_fp4_f16(ssa)

rocdl.cvt.scalef32.pk.fp4.f16 - Scale and convert two f16 to packed fp4, updating tied vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert two packed f16 values to packed fp4, dividing by the exponent part of scale before doing so.

The two scaled values are packed into a byte. That byte replaces the byte of oldVdst selected by dstSelIndex, and the updated oldVdst is returned in its entirety.

cvt_scalef32_pk_fp4_f32(ssa)

rocdl.cvt.scalef32.pk.fp4.f32 - Scale and convert two f32 values to two packed fp4, updating tied vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src0 - Single, F32, 32-bit float
  • src1 - Single, F32, 32-bit float
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert two single-precision float values, passed in src0 and src1, into two fp4 values, dividing them by the exponent part of scale before doing so.

The two scaled values are packed into a byte. That byte replaces the byte of oldVdst selected by dstSelIndex, and the updated oldVdst is returned in its entirety.

cvt_scalef32_pk_fp8_bf16(ssa)

rocdl.cvt.scalef32.pk.fp8.bf16 - Scaled convert two bf16 to two fp8, updating packed vector

Attributes

  • dstLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • oldVdst - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2
  • src0 - Single, ROCDL_V2BF16Type, fixed-length vector of bfloat16 type values of length 2
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2

Description

Convert two bf16 values in src0 to two fp8 bytes, dividing by the exponent in scale. The bytes are packed into a 16-bit value which is inserted into oldVdst at the dstLoHiSel position, with the entire updated vector being returned.

cvt_scalef32_pk_fp8_f16(ssa)

rocdl.cvt.scalef32.pk.fp8.f16 - Scaled convert two f16 to two fp8, updating packed vector

Attributes

  • dstLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • oldVdst - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2
  • src0 - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2

Description

Convert two f16 values in src0 to two fp8 bytes, dividing by the exponent in scale. The bytes are packed into a 16-bit value which is inserted into oldVdst at the dstLoHiSel position, with the entire updated vector being returned.

cvt_scalef32_pk_fp8_f32(ssa)

rocdl.cvt.scalef32.pk.fp8.f32 - Scaled convert two f32 to two fp8, updating packed vector

Attributes

  • dstLoHiSel - Single, I1Attr, 1-bit signless integer attribute

Operands

  • oldVdst - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2
  • src0 - Single, F32, 32-bit float
  • src1 - Single, F32, 32-bit float
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V2I16Type, fixed-length vector of 16-bit signless integer values of length 2

Description

Convert two f32 values in src0 and src1 to two fp8 bytes, dividing by the exponent in scale. The bytes are packed into a 16-bit value which is inserted into oldVdst at the dstLoHiSel position, with the entire updated vector being returned.

cvt_scalef32_sr_bf8_bf16(ssa)

rocdl.cvt.scalef32.sr.bf8.bf16 - Scaled convert bf16 to bf8 with stochastic rounding, updating packed vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src0 - Single, BF16, bfloat16 type
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert a bf16 value in src0 to a bf8 byte, dividing by the exponent in scale and using seed for stochastic rounding. Store the resulting byte at the byte position of oldVdst selected by dstSelIndex and return the entire packed vector, which is stored as an i32.

cvt_scalef32_sr_bf8_f16(ssa)

rocdl.cvt.scalef32.sr.bf8.f16 - Scaled convert f16 to bf8 with stochastic rounding, updating packed vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src0 - Single, F16, 16-bit float
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert an f16 value in src0 to a bf8 byte, dividing by the exponent in scale and using seed for stochastic rounding. Store the resulting byte at the byte position of oldVdst selected by dstSelIndex and return the entire packed vector, which is stored as an i32.

cvt_scalef32_sr_bf8_f32(ssa)

rocdl.cvt.scalef32.sr.bf8.f32 - Scaled convert f32 to bf8 with stochastic rounding, updating packed vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src0 - Single, F32, 32-bit float
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert an f32 value in src0 to a bf8 byte, dividing by the exponent in scale and using seed for stochastic rounding. Store the resulting byte at the byte position of oldVdst selected by dstSelIndex and return the entire packed vector, which is stored as an i32.

cvt_scalef32_sr_fp8_bf16(ssa)

rocdl.cvt.scalef32.sr.fp8.bf16 - Scaled convert bf16 to fp8 with stochastic rounding, updating packed vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src0 - Single, BF16, bfloat16 type
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert a bf16 value in src0 to an fp8 byte, dividing by the exponent in scale and using seed for stochastic rounding. Store the resulting byte at the byte position of oldVdst selected by dstSelIndex and return the entire packed vector, which is stored as an i32.

cvt_scalef32_sr_fp8_f16(ssa)

rocdl.cvt.scalef32.sr.fp8.f16 - Scaled convert f16 to fp8 with stochastic rounding, updating packed vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src0 - Single, F16, 16-bit float
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert an f16 value in src0 to an fp8 byte, dividing by the exponent in scale and using seed for stochastic rounding. Store the resulting byte at the byte position of oldVdst selected by dstSelIndex and return the entire packed vector, which is stored as an i32.

cvt_scalef32_sr_fp8_f32(ssa)

rocdl.cvt.scalef32.sr.fp8.f32 - Scaled convert f32 to fp8 with stochastic rounding, updating packed vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src0 - Single, F32, 32-bit float
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert an f32 value in src0 to an fp8 byte, dividing by the exponent in scale and using seed for stochastic rounding. Store the resulting byte at the byte position of oldVdst selected by dstSelIndex and return the entire packed vector, which is stored as an i32.

cvt_scalef32_sr_pk32_bf6_bf16(ssa)

rocdl.cvt.scalef32.sr.pk32.bf6.bf16 - Scale and convert packed bf16 to packed bf6 with stochastic rounding

Operands

  • src - Single, ROCDL_V32BF16Type, fixed-length vector of bfloat16 type values of length 32
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed bf16 values to packed bf6, dividing by the exponent part of scale before doing so and applying random rounding derived from seed.
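The result type follows from the bit count: 32 values of 6 bits each occupy 192 bits, which is exactly six 32-bit words. The Python sketch below packs 32 six-bit codes into six words to illustrate that arithmetic. The name `pack_6bit_codes` is hypothetical, and the little-endian bit order (value 0 in the lowest bits of word 0) is an assumption, not something stated here.

```python
def pack_6bit_codes(codes):
    """Pack 32 six-bit codes into six 32-bit words.

    Assumption: code i occupies bits 6*i .. 6*i+5 of the 192-bit result,
    and word w holds bits 32*w .. 32*w+31.
    """
    if len(codes) != 32:
        raise ValueError("expected exactly 32 codes")
    bits = 0
    for index, code in enumerate(codes):
        bits |= (code & 0x3F) << (6 * index)
    return [(bits >> (32 * word)) & 0xFFFFFFFF for word in range(6)]
```

Note that codes can straddle word boundaries under this layout. Code 5, for example, covers bits 30..35 and so spans words 0 and 1.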

cvt_scalef32_sr_pk32_bf6_f16(ssa)

rocdl.cvt.scalef32.sr.pk32.bf6.f16 - Scale and convert packed f16 to packed bf6 with stochastic rounding

Operands

  • src - Single, ROCDL_V32F16Type, fixed-length vector of 16-bit float values of length 32
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed f16 values to packed bf6, dividing by the exponent part of scale before doing so and applying random rounding derived from seed.

cvt_scalef32_sr_pk32_bf6_f32(ssa)

rocdl.cvt.scalef32.sr.pk32.bf6.f32 - Scale and convert packed f32 to packed bf6 with stochastic rounding

Operands

  • src - Single, ROCDL_V32F32Type, fixed-length vector of 32-bit float values of length 32
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed f32 values to packed bf6, dividing by the exponent part of scale before doing so and applying random rounding derived from seed.

cvt_scalef32_sr_pk32_fp6_bf16(ssa)

rocdl.cvt.scalef32.sr.pk32.fp6.bf16 - Scale and convert packed bf16 to packed fp6 with stochastic rounding

Operands

  • src - Single, ROCDL_V32BF16Type, fixed-length vector of bfloat16 type values of length 32
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed bf16 values to packed fp6, dividing by the exponent part of scale before doing so and applying random rounding derived from seed.

cvt_scalef32_sr_pk32_fp6_f16(ssa)

rocdl.cvt.scalef32.sr.pk32.fp6.f16 - Scale and convert packed f16 to packed fp6 with stochastic rounding

Operands

  • src - Single, ROCDL_V32F16Type, fixed-length vector of 16-bit float values of length 32
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed f16 values to packed fp6, dividing by the exponent part of scale before doing so and applying random rounding derived from seed.

cvt_scalef32_sr_pk32_fp6_f32(ssa)

rocdl.cvt.scalef32.sr.pk32.fp6.f32 - Scale and convert packed f32 to packed fp6 with stochastic rounding

Operands

  • src - Single, ROCDL_V32F32Type, fixed-length vector of 32-bit float values of length 32
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, ROCDL_V6I32Type, fixed-length vector of 32-bit signless integer values of length 6

Description

Convert 32 packed f32 values to packed fp6, dividing by the exponent part of scale before doing so and applying random rounding derived from seed.

cvt_scalef32_sr_pk_fp4_bf16(ssa)

rocdl.cvt.scalef32.sr.pk.fp4.bf16 - Scale and convert two bf16 to packed fp4 with stochastic rounding, updating tied vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src - Single, ROCDL_V2BF16Type, fixed-length vector of bfloat16 type values of length 2
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert two packed bf16 values to packed fp4, dividing by the exponent part of scale before doing so and using seed as the random seed for stochastic rounding.

The two scaled values are packed (little-endian) into a byte. That byte replaces the byte of oldVdst selected by dstSelIndex, and the updated oldVdst is returned in its entirety.
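One plausible reading of "packed (little-endian) into a byte" is that the first converted value occupies the low nibble and the second the high nibble. The Python sketch below models that reading only. The names `pack_fp4_pair` and `unpack_fp4_pair` are hypothetical, and the inputs are already encoded 4-bit codes, so no rounding is performed here.

```python
def pack_fp4_pair(first_code, second_code):
    """Pack two 4-bit codes into one byte.

    Assumption: "little-endian" means the first value goes in bits 0..3
    and the second in bits 4..7.
    """
    return (first_code & 0xF) | ((second_code & 0xF) << 4)


def unpack_fp4_pair(byte):
    """Inverse of pack_fp4_pair: return (first_code, second_code)."""
    return byte & 0xF, (byte >> 4) & 0xF
```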

cvt_scalef32_sr_pk_fp4_f16(ssa)

rocdl.cvt.scalef32.sr.pk.fp4.f16 - Scale and convert two f16 to packed fp4 with stochastic rounding, updating tied vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src - Single, ROCDL_V2F16Type, fixed-length vector of 16-bit float values of length 2
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert two packed f16 values to packed fp4, dividing by the exponent part of scale before doing so and using seed as the random seed for stochastic rounding.

The two scaled values are packed (little-endian) into a byte. That byte replaces the byte of oldVdst selected by dstSelIndex, and the updated oldVdst is returned in its entirety.

cvt_scalef32_sr_pk_fp4_f32(ssa)

rocdl.cvt.scalef32.sr.pk.fp4.f32 - Scale and convert two f32 to packed fp4 with stochastic rounding, updating tied vector

Attributes

  • dstSelIndex - Single, I32Attr, 32-bit signless integer attribute

Operands

  • oldVdst - Single, I32, 32-bit signless integer
  • src - Single, ROCDL_V2F32Type, fixed-length vector of 32-bit float values of length 2
  • seed - Single, I32, 32-bit signless integer
  • scale - Single, F32, 32-bit float

Results

  • res - Single, I32, 32-bit signless integer

Description

Convert two packed f32 values to packed fp4, dividing by the exponent part of scale before doing so and using seed as the random seed for stochastic rounding.

The two scaled values are packed (little-endian) into a byte. That byte replaces the byte of oldVdst selected by dstSelIndex, and the updated oldVdst is returned in its entirety.

cvt_sr_bf8_f32(ssa)

rocdl.cvt.sr.bf8.f32 - Convert f32 to bf8, stochastic rounding

Attributes

  • byteSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • srcA - Single, F32, 32-bit float
  • srcB - Single, I32, 32-bit signless integer
  • old - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert srcA to bf8, adding the rounding factor from srcB, and store the result into the byte of old selected by byteSel, preserving the other bytes.
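Stochastic rounding by "adding a rounding factor" is commonly implemented by adding random bits below the retained precision and then truncating. The Python sketch below shows that general technique on an f32 bit pattern. It is not a description of the hardware: the helper names are hypothetical, the choice of which bits of srcB are used is not modelled, and exponent range limits, overflow and NaN handling are ignored.

```python
import struct


def f32_bits(value):
    """Return the IEEE-754 binary32 bit pattern of `value` as an int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def stochastic_truncate(bits, kept_mantissa_bits, rounding_bits):
    """Keep `kept_mantissa_bits` of the 23-bit f32 mantissa.

    Random bits are added below the kept precision, then the low bits are
    cleared, so a value rounds up with probability proportional to the
    fraction that would otherwise be discarded.
    """
    dropped = 23 - kept_mantissa_bits
    mask = (1 << dropped) - 1
    return ((bits + (rounding_bits & mask)) & ~mask) & 0xFFFFFFFF
```

As an example, with two kept mantissa bits the value 1.125 lies halfway between the representable neighbours 1.0 and 1.25, so about half of all possible rounding values push it up to 1.25 and the rest leave it at 1.0.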

cvt_sr_fp8_f32(ssa)

rocdl.cvt.sr.fp8.f32 - Convert f32 to fp8, stochastic rounding

Attributes

  • byteSel - Single, I32Attr, 32-bit signless integer attribute

Operands

  • srcA - Single, F32, 32-bit float
  • srcB - Single, I32, 32-bit signless integer
  • old - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Convert srcA to fp8, adding the rounding factor from srcB, and store the result into the byte of old selected by byteSel, preserving the other bytes.

ds_bpermute(ssa)

rocdl.ds_bpermute

Operands

  • index - Single, I32, 32-bit signless integer
  • src - Single, I32, 32-bit signless integer

Results

  • res - Single, I32, 32-bit signless integer

ds_read_tr4_b64(ssa)

rocdl.ds.read.tr4.b64

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • ptr - Single, ROCDLBufferLDS, LLVM pointer in address space 3

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

ds_read_tr6_b96(ssa)

rocdl.ds.read.tr6.b96

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • ptr - Single, ROCDLBufferLDS, LLVM pointer in address space 3

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

ds_read_tr8_b64(ssa)

rocdl.ds.read.tr8.b64

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • ptr - Single, ROCDLBufferLDS, LLVM pointer in address space 3

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

ds_read_tr16_b64(ssa)

rocdl.ds.read.tr16.b64

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • ptr - Single, ROCDLBufferLDS, LLVM pointer in address space 3

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

ds_swizzle(ssa)

rocdl.ds_swizzle

Operands

  • src - Single, I32, 32-bit signless integer
  • offset - Single, I32, 32-bit signless integer

Results

  • res - Single, I32, 32-bit signless integer

fmed3(ssa)

rocdl.fmed3 - Median of three float/half values

Operands

  • src0 - Single, anonymous/composite constraint, floating point LLVM type or LLVM dialect-compatible vector of floating point LLVM type
  • src1 - Single, anonymous/composite constraint, floating point LLVM type or LLVM dialect-compatible vector of floating point LLVM type
  • src2 - Single, anonymous/composite constraint, floating point LLVM type or LLVM dialect-compatible vector of floating point LLVM type

Results

  • res - Single, anonymous/composite constraint, floating point LLVM type or LLVM dialect-compatible vector of floating point LLVM type

Description

Computes the median of three floating-point values using the AMDGPU fmed3 intrinsic. This operation is equivalent to max(min(a, b), min(max(a, b), c)) but uses the hardware-accelerated V_MED3_F16/V_MED3_F32 instruction for better performance.

The operation supports both scalar and vector floating-point types (f16, f32).

Example:

// Scalar f32 median
%result = rocdl.fmed3 %a, %b, %c : f32

// Vector f16 median
%result = rocdl.fmed3 %va, %vb, %vc : vector<4xf16>
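The min/max identity quoted in the description can be checked with a small Python reference model. This is only a sketch for ordinary finite inputs: the names `fmed3` and `fmed3_vector` are hypothetical stand-ins for the op, and NaN or signed-zero behaviour of the hardware instruction is not modelled.

```python
def fmed3(a, b, c):
    """Median of three values via max(min(a, b), min(max(a, b), c))."""
    return max(min(a, b), min(max(a, b), c))


def fmed3_vector(va, vb, vc):
    """Element-wise median, mirroring the vector form of the op."""
    if not (len(va) == len(vb) == len(vc)):
        raise ValueError("vectors must have the same length")
    return [fmed3(a, b, c) for a, b, c in zip(va, vb, vc)]
```

The identity gives the same result for every ordering of the three inputs, so the operand order does not matter for finite values.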

global_load_lds(ssa)

rocdl.global.load.lds

Attributes

  • size - Single, I32Attr, 32-bit signless integer attribute
  • offset - Single, I32Attr, 32-bit signless integer attribute
  • aux - Single, I32Attr, 32-bit signless integer attribute
  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • globalPtr - Single, ROCDLGlobalBuffer, LLVM pointer in address space 1
  • ldsPtr - Single, ROCDLBufferLDS, LLVM pointer in address space 3

grid_dim_x(ssa)

rocdl.grid.dim.x

grid_dim_y(ssa)

rocdl.grid.dim.y

grid_dim_z(ssa)

rocdl.grid.dim.z

iglp_opt(ssa)

rocdl.iglp.opt

load_to_lds(ssa)

rocdl.load.to.lds

Attributes

  • size - Single, I32Attr, 32-bit signless integer attribute
  • offset - Single, I32Attr, 32-bit signless integer attribute
  • aux - Single, I32Attr, 32-bit signless integer attribute
  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • globalPtr - Single, LLVM_AnyPointer, LLVM pointer type
  • ldsPtr - Single, ROCDLBufferLDS, LLVM pointer in address space 3

make_buffer_rsrc(ssa)

rocdl.make.buffer.rsrc

Operands

  • base - Single, LLVM_AnyPointer, LLVM pointer type
  • stride - Single, I16, 16-bit signless integer
  • numRecords - Single, I64, 64-bit signless integer
  • flags - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_AnyPointer, LLVM pointer type

mbcnt_hi(ssa)

rocdl.mbcnt.hi

Attributes

  • arg_attrs - Optional, DictArrayAttr, Array of dictionary attributes
  • res_attrs - Optional, DictArrayAttr, Array of dictionary attributes

Operands

  • in0 - Single, I32, 32-bit signless integer
  • in1 - Single, I32, 32-bit signless integer

Results

  • res - Single, I32, 32-bit signless integer

mbcnt_lo(ssa)

rocdl.mbcnt.lo

Attributes

  • arg_attrs - Optional, DictArrayAttr, Array of dictionary attributes
  • res_attrs - Optional, DictArrayAttr, Array of dictionary attributes

Operands

  • in0 - Single, I32, 32-bit signless integer
  • in1 - Single, I32, 32-bit signless integer

Results

  • res - Single, I32, 32-bit signless integer

mfma_f32_4x4x1f32(ssa)

rocdl.mfma.f32.4x4x1f32

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_4x4x2bf16(ssa)

rocdl.mfma.f32.4x4x2bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_4x4x4bf16_1k(ssa)

rocdl.mfma.f32.4x4x4bf16.1k

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_4x4x4f16(ssa)

rocdl.mfma.f32.4x4x4f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x1f32(ssa)

rocdl.mfma.f32.16x16x1f32

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x2bf16(ssa)

rocdl.mfma.f32.16x16x2bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x4bf16_1k(ssa)

rocdl.mfma.f32.16x16x4bf16.1k

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x4f16(ssa)

rocdl.mfma.f32.16x16x4f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x4f32(ssa)

rocdl.mfma.f32.16x16x4f32

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x8_xf32(ssa)

rocdl.mfma.f32.16x16x8.xf32

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x8bf16(ssa)

rocdl.mfma.f32.16x16x8bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x16bf16_1k(ssa)

rocdl.mfma.f32.16x16x16bf16.1k

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x16f16(ssa)

rocdl.mfma.f32.16x16x16f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x32_bf8_bf8(ssa)

rocdl.mfma.f32.16x16x32.bf8.bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x32_bf8_fp8(ssa)

rocdl.mfma.f32.16x16x32.bf8.fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x32_bf16(ssa)

rocdl.mfma.f32.16x16x32.bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x32_f16(ssa)

rocdl.mfma.f32.16x16x32.f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x32_fp8_bf8(ssa)

rocdl.mfma.f32.16x16x32.fp8.bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_16x16x32_fp8_fp8(ssa)

rocdl.mfma.f32.16x16x32.fp8.fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x1f32(ssa)

rocdl.mfma.f32.32x32x1f32

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x2bf16(ssa)

rocdl.mfma.f32.32x32x2bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x2f32(ssa)

rocdl.mfma.f32.32x32x2f32

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x4_xf32(ssa)

rocdl.mfma.f32.32x32x4.xf32

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x4bf16(ssa)

rocdl.mfma.f32.32x32x4bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x4bf16_1k(ssa)

rocdl.mfma.f32.32x32x4bf16.1k

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x4f16(ssa)

rocdl.mfma.f32.32x32x4f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x8bf16_1k(ssa)

rocdl.mfma.f32.32x32x8bf16.1k

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x8f16(ssa)

rocdl.mfma.f32.32x32x8f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x16_bf8_bf8(ssa)

rocdl.mfma.f32.32x32x16.bf8.bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x16_bf8_fp8(ssa)

rocdl.mfma.f32.32x32x16.bf8.fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x16_bf16(ssa)

rocdl.mfma.f32.32x32x16.bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x16_f16(ssa)

rocdl.mfma.f32.32x32x16.f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x16_fp8_bf8(ssa)

rocdl.mfma.f32.32x32x16.fp8.bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f32_32x32x16_fp8_fp8(ssa)

rocdl.mfma.f32.32x32x16.fp8.fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f64_4x4x4f64(ssa)

rocdl.mfma.f64.4x4x4f64

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_f64_16x16x4f64(ssa)

rocdl.mfma.f64.16x16x4f64

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_4x4x4i8(ssa)

rocdl.mfma.i32.4x4x4i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_16x16x4i8(ssa)

rocdl.mfma.i32.16x16x4i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_16x16x16i8(ssa)

rocdl.mfma.i32.16x16x16i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_16x16x32_i8(ssa)

rocdl.mfma.i32.16x16x32.i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_16x16x64_i8(ssa)

rocdl.mfma.i32.16x16x64.i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_32x32x4i8(ssa)

rocdl.mfma.i32.32x32x4i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_32x32x8i8(ssa)

rocdl.mfma.i32.32x32x8i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_32x32x16_i8(ssa)

rocdl.mfma.i32.32x32x16.i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_i32_32x32x32_i8(ssa)

rocdl.mfma.i32.32x32x32.i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_scale_f32_16x16x128_f8f6f4(ssa)

rocdl.mfma.scale.f32.16x16x128.f8f6f4

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

mfma_scale_f32_32x32x64_f8f6f4(ssa)

rocdl.mfma.scale.f32.32x32x64.f8f6f4

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type
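
The function names in this listing appear to be derived mechanically from the op names: the rocdl. prefix is dropped and every dot becomes an underscore. The sketch below illustrates that observed pattern only; it is not taken from Beaver's generator, and the helper name beaver_function_name is hypothetical.

```python
def beaver_function_name(op_name: str) -> str:
    """Map a ROCDL op name to the function name used in this listing.

    Inferred from the listing itself, not from Beaver's source.
    """
    prefix = "rocdl."
    if not op_name.startswith(prefix):
        raise ValueError(f"not a rocdl op: {op_name!r}")
    # Dots become underscores; underscores already in the op name are kept.
    return op_name[len(prefix):].replace(".", "_")


if __name__ == "__main__":
    for op in (
        "rocdl.mfma.f32.32x32x4bf16.1k",
        "rocdl.mfma.scale.f32.16x16x128.f8f6f4",
        "rocdl.wmma.f32.16x16x16.bf8_fp8",
    ):
        print(op, "->", beaver_function_name(op))
```

Because some op names already contain underscores (for example rocdl.wmma.f32.16x16x16.bf8_fp8), the mapping cannot be reversed reliably; use the op name printed under each function as the authoritative spelling.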

permlane16_swap(ssa)

rocdl.permlane16.swap

Attributes

  • fi - Single, I1Attr, 1-bit signless integer attribute
  • boundControl - Single, I1Attr, 1-bit signless integer attribute

Operands

  • old - Single, I32, 32-bit signless integer
  • src - Single, I32, 32-bit signless integer

Results

  • res - Single, anonymous/composite constraint, LLVM dialect-compatible struct of 32-bit signless integer and 32-bit signless integer

Description

Performs a permlane16.swap operation on the given operands, applying the permutation specified by the fi attribute to the provided inputs.

permlane32_swap(ssa)

rocdl.permlane32.swap

Attributes

  • fi - Single, I1Attr, 1-bit signless integer attribute
  • boundControl - Single, I1Attr, 1-bit signless integer attribute

Operands

  • old - Single, I32, 32-bit signless integer
  • src - Single, I32, 32-bit signless integer

Results

  • res - Single, anonymous/composite constraint, LLVM dialect-compatible struct of 32-bit signless integer and 32-bit signless integer

Description

Performs a permlane32.swap operation on the given operands, applying the permutation specified by the fi attribute to the provided inputs.

permlanex16(ssa)

rocdl.permlanex16

Attributes

  • fi - Single, I1Attr, 1-bit signless integer attribute
  • boundControl - Single, I1Attr, 1-bit signless integer attribute

Operands

  • old - Single, LLVM_Type, LLVM dialect-compatible type
  • src0 - Single, LLVM_Type, LLVM dialect-compatible type
  • src1 - Single, LLVM_Type, LLVM dialect-compatible type
  • src2 - Single, LLVM_Type, LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Performs a permlanex16 operation on the given operands, applying the permutation specified by the fi attribute to the provided inputs.

raw_buffer_atomic_cmpswap(ssa)

rocdl.raw.buffer.atomic.cmpswap

Operands

  • src - Single, LLVM_Type, LLVM dialect-compatible type
  • cmp - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, LLVM_Type, LLVM dialect-compatible type
  • offset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

raw_buffer_atomic_fadd(ssa)

rocdl.raw.buffer.atomic.fadd

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, LLVM_Type, LLVM dialect-compatible type
  • offset - Single, LLVM_Type, LLVM dialect-compatible type
  • soffset - Single, LLVM_Type, LLVM dialect-compatible type
  • aux - Single, LLVM_Type, LLVM dialect-compatible type

raw_buffer_atomic_fmax(ssa)

rocdl.raw.buffer.atomic.fmax

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, LLVM_Type, LLVM dialect-compatible type
  • offset - Single, LLVM_Type, LLVM dialect-compatible type
  • soffset - Single, LLVM_Type, LLVM dialect-compatible type
  • aux - Single, LLVM_Type, LLVM dialect-compatible type

raw_buffer_atomic_smax(ssa)

rocdl.raw.buffer.atomic.smax

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, LLVM_Type, LLVM dialect-compatible type
  • offset - Single, LLVM_Type, LLVM dialect-compatible type
  • soffset - Single, LLVM_Type, LLVM dialect-compatible type
  • aux - Single, LLVM_Type, LLVM dialect-compatible type

raw_buffer_atomic_umin(ssa)

rocdl.raw.buffer.atomic.umin

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, LLVM_Type, LLVM dialect-compatible type
  • offset - Single, LLVM_Type, LLVM dialect-compatible type
  • soffset - Single, LLVM_Type, LLVM dialect-compatible type
  • aux - Single, LLVM_Type, LLVM dialect-compatible type

raw_buffer_load(ssa)

rocdl.raw.buffer.load

Operands

  • rsrc - Single, LLVM_Type, LLVM dialect-compatible type
  • offset - Single, LLVM_Type, LLVM dialect-compatible type
  • soffset - Single, LLVM_Type, LLVM dialect-compatible type
  • aux - Single, LLVM_Type, LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

raw_buffer_store(ssa)

rocdl.raw.buffer.store

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, LLVM_Type, LLVM dialect-compatible type
  • offset - Single, LLVM_Type, LLVM dialect-compatible type
  • soffset - Single, LLVM_Type, LLVM dialect-compatible type
  • aux - Single, LLVM_Type, LLVM dialect-compatible type

raw_ptr_buffer_atomic_cmpswap(ssa)

rocdl.raw.ptr.buffer.atomic.cmpswap

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • src - Single, LLVM_Type, LLVM dialect-compatible type
  • cmp - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, ROCDLBufferRsrc, LLVM pointer in address space 8
  • offset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

raw_ptr_buffer_atomic_fadd(ssa)

rocdl.raw.ptr.buffer.atomic.fadd

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, ROCDLBufferRsrc, LLVM pointer in address space 8
  • offset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

raw_ptr_buffer_atomic_fmax(ssa)

rocdl.raw.ptr.buffer.atomic.fmax

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, ROCDLBufferRsrc, LLVM pointer in address space 8
  • offset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

raw_ptr_buffer_atomic_smax(ssa)

rocdl.raw.ptr.buffer.atomic.smax

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, ROCDLBufferRsrc, LLVM pointer in address space 8
  • offset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

raw_ptr_buffer_atomic_umin(ssa)

rocdl.raw.ptr.buffer.atomic.umin

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, ROCDLBufferRsrc, LLVM pointer in address space 8
  • offset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

raw_ptr_buffer_load(ssa)

rocdl.raw.ptr.buffer.load

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • rsrc - Single, ROCDLBufferRsrc, LLVM pointer in address space 8
  • offset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

raw_ptr_buffer_load_lds(ssa)

rocdl.raw.ptr.buffer.load.lds

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • rsrc - Single, ROCDLBufferRsrc, LLVM pointer in address space 8
  • ldsPtr - Single, ROCDLBufferLDS, LLVM pointer in address space 3
  • size - Single, I32, 32-bit signless integer
  • voffset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • offset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

raw_ptr_buffer_store(ssa)

rocdl.raw.ptr.buffer.store

Attributes

  • alias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • noalias_scopes - Optional, LLVM_AliasScopeArrayAttr, LLVM dialect alias scope array
  • tbaa - Optional, LLVM_TBAATagArrayAttr, LLVM dialect TBAA tag metadata array

Operands

  • vdata - Single, LLVM_Type, LLVM dialect-compatible type
  • rsrc - Single, ROCDLBufferRsrc, LLVM pointer in address space 8
  • offset - Single, I32, 32-bit signless integer
  • soffset - Single, I32, 32-bit signless integer
  • aux - Single, I32, 32-bit signless integer

readfirstlane(ssa)

rocdl.readfirstlane - Get the value in the first active lane.

Operands

  • src - Single, LLVM_Type, LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Returns the value in the lowest active lane of the input operand.
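
As a rough mental model of "lowest active lane", the sketch below treats a wavefront as a list of per-lane values plus a list of active flags. It is a software illustration only, not how the op is implemented; the function signature and the error raised when no lane is active are assumptions of this sketch.

```python
from typing import Sequence


def readfirstlane(src: Sequence[int], active: Sequence[bool]) -> int:
    """Return src's value in the lowest-numbered active lane."""
    if len(src) != len(active):
        raise ValueError("src and active must have one entry per lane")
    for lane, is_active in enumerate(active):
        if is_active:
            return src[lane]
    # The description above does not cover this case; the sketch rejects it.
    raise ValueError("no active lane")


if __name__ == "__main__":
    # Lanes 0 and 1 are inactive, so lane 2 supplies the result.
    print(readfirstlane([10, 11, 12, 13], [False, False, True, True]))
```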

readlane(ssa)

rocdl.readlane - Get the value in the specified lane.

Operands

  • src0 - Single, LLVM_Type, LLVM dialect-compatible type
  • src1 - Single, I32, 32-bit signless integer

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

Description

Get the value in lane src1 from input src0.
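
Read as a software model, the description amounts to indexing the per-lane values of src0 with the lane number src1. The sketch below shows just that; the list representation and the bounds check are assumptions of this sketch, and out-of-range lane numbers are not described in this listing.

```python
from typing import Sequence


def readlane(src0: Sequence[int], src1: int) -> int:
    """Return the value that lane number src1 holds in src0."""
    if not 0 <= src1 < len(src0):
        # Out-of-range lane indices are not described in this listing.
        raise IndexError(f"lane {src1} out of range for {len(src0)} lanes")
    return src0[src1]


if __name__ == "__main__":
    print(readlane([5, 6, 7, 8], 2))
```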

s_barrier(ssa)

rocdl.s.barrier

s_barrier_signal(ssa)

rocdl.s.barrier.signal

s_barrier_wait(ssa)

rocdl.s.barrier.wait

s_setprio(ssa)

rocdl.s.setprio

s_sleep(ssa)

rocdl.s.sleep

s_wait_dscnt(ssa)

rocdl.s.wait.dscnt

s_wait_expcnt(ssa)

rocdl.s.wait.expcnt

s_wait_loadcnt(ssa)

rocdl.s.wait.loadcnt

s_wait_storecnt(ssa)

rocdl.s.wait.storecnt

s_waitcnt(ssa)

rocdl.s.waitcnt

sched_barrier(ssa)

rocdl.sched.barrier

sched_group_barrier(ssa)

rocdl.sched.group.barrier

smfmac_f32_16x16x32_bf16(ssa)

rocdl.smfmac.f32.16x16x32.bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_16x16x32_f16(ssa)

rocdl.smfmac.f32.16x16x32.f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_16x16x64_bf8_bf8(ssa)

rocdl.smfmac.f32.16x16x64.bf8.bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_16x16x64_bf8_fp8(ssa)

rocdl.smfmac.f32.16x16x64.bf8.fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_16x16x64_fp8_bf8(ssa)

rocdl.smfmac.f32.16x16x64.fp8.bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_16x16x64_fp8_fp8(ssa)

rocdl.smfmac.f32.16x16x64.fp8.fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_32x32x16_bf16(ssa)

rocdl.smfmac.f32.32x32x16.bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_32x32x16_f16(ssa)

rocdl.smfmac.f32.32x32x16.f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_32x32x32_bf8_bf8(ssa)

rocdl.smfmac.f32.32x32x32.bf8.bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_32x32x32_bf8_fp8(ssa)

rocdl.smfmac.f32.32x32x32.bf8.fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_32x32x32_fp8_bf8(ssa)

rocdl.smfmac.f32.32x32x32.fp8.bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_f32_32x32x32_fp8_fp8(ssa)

rocdl.smfmac.f32.32x32x32.fp8.fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_i32_16x16x64_i8(ssa)

rocdl.smfmac.i32.16x16x64.i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

smfmac_i32_32x32x32_i8(ssa)

rocdl.smfmac.i32.32x32x32.i8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

update_dpp(ssa)

rocdl.update.dpp

Attributes

  • dppCtrl - Single, I32Attr, 32-bit signless integer attribute
  • rowMask - Single, I32Attr, 32-bit signless integer attribute
  • bankMask - Single, I32Attr, 32-bit signless integer attribute
  • boundCtrl - Single, I1Attr, 1-bit signless integer attribute

Operands

  • old - Single, LLVM_Type, LLVM dialect-compatible type
  • src - Single, LLVM_Type, LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wavefrontsize(ssa)

rocdl.wavefrontsize

wmma_bf16_16x16x16_bf16(ssa)

rocdl.wmma.bf16.16x16x16.bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_f16_16x16x16_f16(ssa)

rocdl.wmma.f16.16x16x16.f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_f32_16x16x16_bf8_bf8(ssa)

rocdl.wmma.f32.16x16x16.bf8_bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_f32_16x16x16_bf8_fp8(ssa)

rocdl.wmma.f32.16x16x16.bf8_fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_f32_16x16x16_bf16(ssa)

rocdl.wmma.f32.16x16x16.bf16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_f32_16x16x16_f16(ssa)

rocdl.wmma.f32.16x16x16.f16

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_f32_16x16x16_fp8_bf8(ssa)

rocdl.wmma.f32.16x16x16.fp8_bf8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_f32_16x16x16_fp8_fp8(ssa)

rocdl.wmma.f32.16x16x16.fp8_fp8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_i32_16x16x16_iu4(ssa)

rocdl.wmma.i32.16x16x16.iu4

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_i32_16x16x16_iu8(ssa)

rocdl.wmma.i32.16x16x16.iu8

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type

wmma_i32_16x16x32_iu4(ssa)

rocdl.wmma.i32.16x16x32.iu4

Operands

  • args - Variadic, LLVM_Type, variadic of LLVM dialect-compatible type

Results

  • res - Single, LLVM_Type, LLVM dialect-compatible type
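
The mfma, smfmac and wmma entries carry no description, but their names encode the operation. Following the usual AMD naming (which this listing does not spell out, so treat it as an assumption), the first type is the result element type, MxNxK is the tile shape, and the suffix names the input element types plus any variant marker such as 1k. The parser below is a hypothetical helper that splits a name along those lines.

```python
import re
from typing import NamedTuple


class MatrixOpName(NamedTuple):
    family: str   # "mfma", "smfmac" or "wmma"
    scaled: bool  # True for the "mfma.scale" variants
    result: str   # result element type, e.g. "f32"
    m: int
    n: int
    k: int
    inputs: str   # input type suffix, e.g. "f16", "bf8_fp8", "bf16.1k"


_PATTERN = re.compile(
    r"^rocdl\.(mfma|smfmac|wmma)\.(scale\.)?"  # family, optional "scale."
    r"([a-z0-9]+)\."                            # result element type
    r"(\d+)x(\d+)x(\d+)"                        # M x N x K
    r"\.?(.+)$"                                 # inputs, fused or dot-separated
)


def parse_matrix_op(op_name: str) -> MatrixOpName:
    """Split a ROCDL matrix op name into its assumed components."""
    match = _PATTERN.match(op_name)
    if match is None:
        raise ValueError(f"unrecognised matrix op name: {op_name!r}")
    family, scale, result, m, n, k, inputs = match.groups()
    return MatrixOpName(family, scale is not None, result,
                        int(m), int(n), int(k), inputs)


if __name__ == "__main__":
    for op in (
        "rocdl.mfma.f32.32x32x8f16",
        "rocdl.smfmac.i32.16x16x64.i8",
        "rocdl.wmma.f32.16x16x16.bf8_fp8",
    ):
        print(parse_matrix_op(op))
```

The pattern accepts both spellings that occur in this listing: the fused form (32x32x8f16) and the dot-separated form (16x16x32.f16).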

workgroup_dim_x(ssa)

rocdl.workgroup.dim.x

workgroup_dim_y(ssa)

rocdl.workgroup.dim.y

workgroup_dim_z(ssa)

rocdl.workgroup.dim.z

workgroup_id_x(ssa)

rocdl.workgroup.id.x

workgroup_id_y(ssa)

rocdl.workgroup.id.y

workgroup_id_z(ssa)

rocdl.workgroup.id.z

workitem_id_x(ssa)

rocdl.workitem.id.x

workitem_id_y(ssa)

rocdl.workitem.id.y

workitem_id_z(ssa)

rocdl.workitem.id.z
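
The workgroup and workitem ops above are typically combined to form a global work-item index along one axis. The arithmetic below is a plain-Python sketch of that combination, assuming workgroup.dim is the number of work-items per workgroup along the axis; the function name global_id is hypothetical.

```python
def global_id(workgroup_id: int, workgroup_dim: int, workitem_id: int) -> int:
    """Global index along one axis from the three per-axis values.

    Assumes workgroup_dim is the workgroup size along that axis.
    """
    if not 0 <= workitem_id < workgroup_dim:
        raise ValueError("workitem_id must lie within the workgroup")
    return workgroup_id * workgroup_dim + workitem_id


if __name__ == "__main__":
    # Work-item 5 of workgroup 3, with 64 work-items per workgroup.
    print(global_id(workgroup_id=3, workgroup_dim=64, workitem_id=5))
```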