HuggingfaceClient.Hub.Kernels (huggingface_client v0.1.0)


HuggingFace Kernels API — load and run custom compute kernels from the Hub.

Kernels are optimized computation primitives (GPU kernels, custom ops) shared on the Hub. They can be loaded and used in training/inference pipelines.

See: https://huggingface.co/docs/kernels

Example

# List available kernels
{:ok, kernels} = HuggingfaceClient.list_kernels(access_token: token)

# Get kernel info
{:ok, kernel} = HuggingfaceClient.kernel_info(
  "kernels/flash-attention-2",
  access_token: token
)

# Get kernel download URLs
{:ok, files} = HuggingfaceClient.kernel_files(
  "kernels/flash-attention-2",
  backend: "cuda",
  access_token: token
)

Summary

Functions

download_url(kernel_id, opts \\ [])

Returns the download URL for a kernel file.

files(kernel_id, opts \\ [])

Returns the files available for a kernel (binaries, headers, etc.).

info(kernel_id, opts \\ [])

Gets detailed information about a specific kernel.

list(opts \\ [])

Lists available kernels on the Hub.

search(query, opts \\ [])

Searches for kernels compatible with a specific operation or architecture.

Functions

download_url(kernel_id, opts \\ [])

@spec download_url(
  String.t(),
  keyword()
) :: String.t()

Returns the download URL for a kernel file.

Example

url = HuggingfaceClient.kernel_download_url(
  "flash-attention/flash-attention-2-cuda",
  filename: "flash_attention_2_cuda.so",
  access_token: token
)

files(kernel_id, opts \\ [])

@spec files(
  String.t(),
  keyword()
) :: {:ok, [map()]} | {:error, Exception.t()}

Returns the files available for a kernel (binaries, headers, etc.).

Options

  • :backend — filter to a specific backend: "cuda", "triton", "metal"
  • :version — specific version/revision (default: "main")
  • :access_token

Example

{:ok, files} = HuggingfaceClient.kernel_files(
  "flash-attention/flash-attention-2-cuda",
  backend: "cuda",
  access_token: token
)
Enum.each(files, fn f -> IO.puts(f["rfilename"]) end)
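
The file listing pairs naturally with download_url/2: list a kernel's files for one backend, then build a URL for each. A hypothetical sketch combining the two calls documented in this module (the "rfilename" key is taken from the example above):

    {:ok, files} =
      HuggingfaceClient.kernel_files(
        "flash-attention/flash-attention-2-cuda",
        backend: "cuda",
        access_token: token
      )

    # Build one download URL per listed file
    urls =
      Enum.map(files, fn f ->
        HuggingfaceClient.kernel_download_url(
          "flash-attention/flash-attention-2-cuda",
          filename: f["rfilename"],
          access_token: token
        )
      end)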

info(kernel_id, opts \\ [])

@spec info(
  String.t(),
  keyword()
) :: {:ok, map()} | {:error, Exception.t()}

Gets detailed information about a specific kernel.

Example

{:ok, kernel} = HuggingfaceClient.kernel_info("flash-attention/flash-attention-2-cuda",
  access_token: token
)
IO.puts("Supported backends: #{inspect(kernel["tags"])}")

list(opts \\ [])

@spec list(keyword()) :: {:ok, [map()]} | {:error, Exception.t()}

Lists available kernels on the Hub.

Options

  • :search — filter by name
  • :backend — filter by backend: "cuda", "triton", "metal", "cpu"
  • :sort — sort field: "downloads", "likes", "lastModified"
  • :limit — max results
  • :access_token

Example

{:ok, kernels} = HuggingfaceClient.list_kernels(
  backend: "cuda",
  sort: "downloads",
  access_token: token
)
Enum.each(kernels, fn k -> IO.puts(k["id"]) end)

search(query, opts \\ [])

@spec search(
  String.t(),
  keyword()
) :: {:ok, [map()]} | {:error, Exception.t()}

Searches for kernels compatible with a specific operation or architecture.

Example

{:ok, matches} = HuggingfaceClient.search_kernels("flash attention",
  backend: "cuda",
  access_token: token
)
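
A common follow-up is to take the top search hit and fetch its full metadata with info/2. A hedged sketch (the "id" key and the :no_matches atom are illustrative assumptions, not part of this module's API):

    {:ok, matches} =
      HuggingfaceClient.search_kernels("flash attention",
        backend: "cuda",
        access_token: token
      )

    case matches do
      # Look up full details for the first result
      [first | _rest] -> HuggingfaceClient.kernel_info(first["id"], access_token: token)
      # No kernels matched the query
      [] -> {:error, :no_matches}
    end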