Image.Generation (image v0.62.0)

Implements image generation functions using Axon machine learning models managed by Bumblebee.

Configuration

The machine learning model is configurable; however, only Stable Diffusion is currently supported.

The default configuration is:

# runtime.exs
config :image, :generator,
  repository_id: "CompVis/stable-diffusion-v1-4",
  scheduler: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "scheduler"},
  featurizer: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "feature_extractor"},
  safety_checker: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "safety_checker"},
  autostart: false

Autostart

If autostart: true is configured (the default is false), a process is started under a supervisor to execute generation requests. To run the process under your own application supervision tree instead, set autostart: false and use Image.Generation.generator/2 to return a Supervisor.child_spec/0.

Adding an image generation server to an application supervision tree

To add image generation to an application supervision tree, use Image.Generation.generator/2 to return a child spec. For example:

# Application.ex
def start(_type, _args) do
  children = [
    # default generator configuration
    Image.Generation.generator()
  ]

  Supervisor.start_link(
    children,
    strategy: :one_for_one
  )
end

Starting a supervised image generation process

If a dynamically started image generation process is required, one can be started under a supervisor with:

iex> Supervisor.start_link([Image.Generation.generator()], strategy: :one_for_one)

Summary

Functions

generator(generator \\ Application.get_env(:image, :generator, []), options \\ [])

Returns a child spec for a service that generates images from text using Stable Diffusion implemented in Bumblebee.

text_to_image(prompt, options \\ [])

Generates an image from a textual description using Bumblebee's support of the Stable Diffusion model.

Functions

generator(generator \\ Application.get_env(:image, :generator, []), options \\ [])

Returns a child spec for a service that generates images from text using Stable Diffusion implemented in Bumblebee.

Arguments

generator is a keyword list of configuration options for an image generator or :default.
options is a keyword list of options.

Options

:num_steps determines the number of steps to execute in the generation model. The default is 20. Changing this to 40 may increase image quality.
:num_images_per_prompt determines how many image alternatives are returned. The default is 1.
:name is the name given to the child process. The default is Image.Generation.Server.
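As an illustration of the options above, the following sketch builds a child spec with more generation steps and two alternatives per prompt. It assumes the :default generator configuration and that Bumblebee and its dependencies are installed; the particular option values are only examples.

```elixir
# Sketch only: the option values here are illustrative, not recommendations.
children = [
  # :default selects the built-in Stable Diffusion configuration
  Image.Generation.generator(:default, num_steps: 40, num_images_per_prompt: 2)
]

# Start the generation service under a supervisor
Supervisor.start_link(children, strategy: :one_for_one)
```

The :name option is left at its default here so that the process keeps the documented name Image.Generation.Server.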

Default configuration

If generator is set to :default the following configuration is used:

[
  repository_id: "CompVis/stable-diffusion-v1-4",
  scheduler: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "scheduler"},
  featurizer: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "feature_extractor"},
  safety_checker: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "safety_checker"},
  autostart: false
]

If no generator is specified (or it is set to :default), the configuration is read from runtime.exs and merged into the default configuration. In runtime.exs the configuration would be specified as follows:

config :image, :generator,
  repository_id: "CompVis/stable-diffusion-v1-4",
  scheduler: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "scheduler"},
  featurizer: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "feature_extractor"},
  safety_checker: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "safety_checker"},
  autostart: false
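The merge described above can be pictured with a small sketch. It assumes the merge behaves like Keyword.merge/2, with configured keys overriding the defaults; the module GeneratorConfigSketch and its resolve/1 function are hypothetical and are not part of the library.

```elixir
# Hypothetical module, for illustration only. It assumes the library merges
# user configuration over the defaults in the manner of Keyword.merge/2.
defmodule GeneratorConfigSketch do
  @default [
    repository_id: "CompVis/stable-diffusion-v1-4",
    scheduler: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "scheduler"},
    featurizer: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "feature_extractor"},
    safety_checker: {:hf, "CompVis/stable-diffusion-v1-4", subdir: "safety_checker"},
    autostart: false
  ]

  # Keys present in `configured` replace the defaults; all others are kept.
  def resolve(configured) when is_list(configured) do
    Keyword.merge(@default, configured)
  end
end

# Only :autostart is overridden; the repository and sub-models stay at their defaults.
config = GeneratorConfigSketch.resolve(autostart: true)
IO.inspect(Keyword.take(config, [:repository_id, :autostart]))
```

Under this assumption, a runtime.exs entry needs to list only the keys that differ from the defaults.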

Automatically starting the service

The :autostart configuration option determines if the image generation service is started when the :image application is started. The default is false. To cause the service to be started at application start, add the following to your runtime.exs:

config :image, :generator,
  autostart: true

text_to_image(prompt, options \\ [])

@spec text_to_image(prompt :: String.t(), options :: Keyword.t()) :: [
  Vix.Vips.Image.t()
]

Generates an image from a textual description using Bumblebee's support of the Stable Diffusion model.

Arguments

prompt is a String.t/0 description of the scene to be generated.
options is a keyword list of options. The default is negative_prompt: "".

Options

:negative_prompt is a String.t/0 that tells Stable Diffusion what you don't want to see in the generated images. When specified, it steers the generation process away from the content described by the given text.

Example

iex> Image.Generation.text_to_image "impressionist purple numbat in the style of monet"
[%Vix.Vips.Image{ref: #Reference<0.1281915998.296878104.76045>}]
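A further sketch shows the :negative_prompt option together with saving the result. It assumes the generation service is already running (through autostart: true or a supervision tree) and that :num_images_per_prompt is left at its default of 1, so the returned list holds a single image. The negative prompt text and the output path are illustrative.

```elixir
# Assumes the image generation service has already been started.
# The match on a one-element list relies on num_images_per_prompt being 1.
[image] =
  Image.Generation.text_to_image(
    "impressionist purple numbat in the style of monet",
    negative_prompt: "blurry, low quality"
  )

# Write the generated image to disk; "numbat.png" is a hypothetical path.
{:ok, _image} = Image.write(image, "numbat.png")
```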