LangChain.ChatModels.ChatBumblebee (LangChain v0.2.0)
Represents a chat model hosted by Bumblebee and accessed through an Nx.Serving.
Many types of models can be hosted through Bumblebee, so this attempts to represent the most common features and provide a single implementation where possible.
For streaming responses, the Bumblebee serving must be configured with stream: true and should include stream_done: true as well.
Example:
Bumblebee.Text.generation(model_info, tokenizer, generation_config,
# ...
stream: true,
stream_done: true
)
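For reference, a fuller serving setup might look like the following sketch. The Hugging Face repo, compile options, and the EXLA compiler are illustrative assumptions; adjust them for your model and hardware.

repo = {:hf, "meta-llama/Llama-2-7b-chat-hf"}

{:ok, model_info} = Bumblebee.load_model(repo)
{:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
{:ok, generation_config} = Bumblebee.load_generation_config(repo)

serving =
  Bumblebee.Text.generation(model_info, tokenizer, generation_config,
    # compile shapes and the compiler choice are assumptions; tune for your hardware
    compile: [batch_size: 1, sequence_length: 1024],
    defn_options: [compiler: EXLA],
    stream: true,
    stream_done: true
  )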
This supports a non-streaming response as well, in which case a completed LangChain.Message is returned once generation finishes.
The stream_done option sends a final message to signal when the stream is complete; that final message also includes some token information.
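A minimal non-streaming sketch, assuming a serving named Llama2ChatModel that was not configured with stream: true:

alias LangChain.Chains.LLMChain
alias LangChain.ChatModels.ChatBumblebee
alias LangChain.Message

model =
  ChatBumblebee.new!(%{
    serving: Llama2ChatModel,
    template_format: :llama_2,
    stream: false
  })

{:ok, _chain, %Message{} = message} =
  LLMChain.new!(%{llm: model})
  |> LLMChain.add_message(Message.new_user!("Hello!"))
  |> LLMChain.run()

# the complete response is available once generation finishes
message.content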
The chat model can be created like this and provided to an LLMChain:
ChatBumblebee.new!(%{
serving: @serving_name,
template_format: @template_format,
receive_timeout: @receive_timeout,
stream: true
})
The serving is the module name of the Nx.Serving that is hosting the model.
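For example, a serving built with Bumblebee.Text.generation can be started under the application's supervision tree with a name, and that name is what gets passed as serving:. This is a sketch; Llama2ChatModel and the batch_timeout value are illustrative.

# in your application's start/2 callback
children = [
  {Nx.Serving, serving: serving, name: Llama2ChatModel, batch_timeout: 100}
]

Supervisor.start_link(children, strategy: :one_for_one)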
The supported values for template_format are provided by LangChain.Utils.ChatTemplates.
Chat models are trained against specific content formats for the messages.
Some models have no special concept of a system message. See the LangChain.Utils.ChatTemplates documentation for specific format examples.
Using the wrong format with a model may result in poor performance or hallucinations. It will not result in an error.
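To see what a given format produces, the messages can be rendered directly. This sketch assumes ChatTemplates exposes apply_chat_template!/2 for rendering a list of messages to a prompt string.

alias LangChain.Message
alias LangChain.Utils.ChatTemplates

messages = [
  Message.new_system!("You are a helpful assistant."),
  Message.new_user!("Hello!")
]

# renders the conversation to a single prompt string in the Llama 2 format
ChatTemplates.apply_chat_template!(messages, :llama_2)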
Full example of chat through Bumblebee
Here's a full example of having a streaming conversation with Llama 2 through Bumblebee.
defmodule MyApp.BumblebeeChat do
  @doc false
  alias LangChain.Message
  alias LangChain.ChatModels.ChatBumblebee
  alias LangChain.Chains.LLMChain

  def run_chat do
    # Used when streaming responses. The function fires as data is received.
    callback_fn = fn
      %LangChain.MessageDelta{} = delta ->
        # write to the console as the response is streamed back
        IO.write(delta.content)

      %LangChain.Message{} = message ->
        # inspect the fully finished message that was assembled from all the deltas
        IO.inspect(message, label: "FULLY ASSEMBLED MESSAGE")
    end

    # create and run the chain
    {:ok, _updated_chain, %Message{} = message} =
      LLMChain.new!(%{
        llm:
          ChatBumblebee.new!(%{
            serving: Llama2ChatModel,
            template_format: :llama_2,
            stream: true
          }),
        verbose: true
      })
      |> LLMChain.add_message(Message.new_system!("You are a helpful assistant."))
      |> LLMChain.add_message(
        Message.new_user!(
          "What is the capital of Taiwan? And share up to 5 interesting facts about the city."
        )
      )
      |> LLMChain.run(callback_fn: callback_fn)

    # print the LLM's fully assembled answer
    IO.puts("\n\n")
    IO.puts(message.content)
    IO.puts("\n\n")
  end
end
Then run the code in IEx:
recompile; MyApp.BumblebeeChat.run_chat
Summary
Functions
new(attrs \\ %{})
Setup a ChatBumblebee client configuration.
new!(attrs \\ %{})
Setup a ChatBumblebee client configuration and return it or raise an error if invalid.
Types
@type callback_fn() :: (LangChain.Message.t() | LangChain.MessageDelta.t() -> any())
Functions
new(attrs \\ %{})
@spec new(attrs :: map()) :: {:ok, t()} | {:error, Ecto.Changeset.t()}
Setup a ChatBumblebee client configuration.
new!(attrs \\ %{})
@spec new!(attrs :: map()) :: t() | no_return()
Setup a ChatBumblebee client configuration and return it or raise an error if invalid.
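A short usage sketch of both constructors (the serving name is an assumption):

# returns {:ok, struct} or {:error, changeset}
{:ok, chat} = ChatBumblebee.new(%{serving: Llama2ChatModel})

# returns the struct directly, raising if the attributes are invalid
chat = ChatBumblebee.new!(%{serving: Llama2ChatModel, stream: true})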