LLMs#

Module for handling processing logic using LLMs.

This module provides classes and utilities for interacting with LLMs in document processing workflows. It includes functionality for managing LLM configurations, handling API calls, processing text and image inputs, tracking token usage and costs, and managing rate limits for LLM requests.

The module supports various LLM providers through the litellm library, enabling both text-only and multimodal (vision) capabilities. It implements efficient asynchronous processing patterns and provides detailed usage statistics for monitoring and cost management.

class contextgem.public.llms.DocumentLLMGroup(**data)[source]#

Bases: _GenericLLMProcessor

Represents a group of DocumentLLMs with unique roles for processing document content.

This class manages multiple LLMs assigned to specific roles for text and vision processing. It ensures role compliance and facilitates extraction of aspects and concepts from documents.

Variables:
  • llms – A list of DocumentLLM instances, each with a unique role (e.g., extractor_text, reasoner_text, extractor_vision, reasoner_vision). At least 2 instances with distinct roles are required.

  • output_language – Language for produced output text (justifications, explanations). Values: “en” (always English) or “adapt” (matches document/image language). All LLMs in the group must share the same output_language setting.

Note:

Refer to the DocumentLLM class for more information on constructing LLMs for the group.

Example:
LLM group definition#
from contextgem import DocumentLLM, DocumentLLMGroup

# Create a text extractor LLM with a fallback
text_extractor = DocumentLLM(
    model="openai/gpt-4o-mini",
    api_key="your-openai-api-key",  # Replace with your actual API key
    role="extractor_text",
)

# Create a fallback LLM for the text extractor
text_extractor_fallback = DocumentLLM(
    model="anthropic/claude-3-5-haiku",
    api_key="your-anthropic-api-key",  # Replace with your actual API key
    role="extractor_text",  # Must have the same role as the primary LLM
    is_fallback=True,
)

# Assign the fallback LLM to the primary text extractor
text_extractor.fallback_llm = text_extractor_fallback

# Create a text reasoner LLM
text_reasoner = DocumentLLM(
    model="openai/o3-mini",
    api_key="your-openai-api-key",  # Replace with your actual API key
    role="reasoner_text",  # For more complex tasks that require reasoning
)

# Create a vision extractor LLM
vision_extractor = DocumentLLM(
    model="openai/gpt-4o-mini",
    api_key="your-openai-api-key",  # Replace with your actual API key
    role="extractor_vision",  # For handling images
)

# Create a vision reasoner LLM
vision_reasoner = DocumentLLM(
    model="openai/gpt-4o",
    api_key="your-openai-api-key",
    role="reasoner_vision",  # For more complex vision tasks that require reasoning
)

# Create a DocumentLLMGroup with all four LLMs
llm_group = DocumentLLMGroup(
    llms=[text_extractor, text_reasoner, vision_extractor, vision_reasoner],
    output_language="en",  # All LLMs must have the same output language ("en" is default)
)
# The group now covers five LLMs: the four primary LLMs with distinct roles,
# plus the fallback LLM attached to the text extractor. Each primary LLM can
# have its own fallback LLM.

# Get usage statistics for the whole group or for a specific role
group_usage = llm_group.get_usage()
text_extractor_usage = llm_group.get_usage(llm_role="extractor_text")

# Get cost statistics for the whole group or for a specific role
all_costs = llm_group.get_cost()
text_extractor_cost = llm_group.get_cost(llm_role="extractor_text")

# Reset usage and cost statistics for the whole group or for a specific role
llm_group.reset_usage_and_cost()
llm_group.reset_usage_and_cost(llm_role="extractor_text")

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

llms: list[DocumentLLM]#
output_language: LanguageRequirement#
property is_group: bool#

Abstract property, to be implemented by subclasses.

Whether the LLM is a single instance or a group.

property list_roles: list[Literal['extractor_text', 'reasoner_text', 'extractor_vision', 'reasoner_vision']]#

Returns a list of all roles assigned to the LLMs in this group.

Returns:

A list of LLM role identifiers

Return type:

list[LLMRoleAny]
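For instance, list_roles can be used to check that a group covers the roles a pipeline relies on before starting extraction. A minimal sketch, reusing the llm_group from the example above:

# Verify that the group covers the roles required by the pipeline
required_roles = {"extractor_text", "reasoner_text"}
missing = required_roles - set(llm_group.list_roles)
if missing:
    raise ValueError(f"LLM group is missing required roles: {missing}")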

group_update_output_language(output_language)[source]#

Updates the output language for all LLMs in the group.

Parameters:

output_language (LanguageRequirement) – The new output language to set for all LLMs

Return type:

None
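A minimal usage sketch, reusing the llm_group from the example above:

# Switch all LLMs in the group to adapt the output language to the document
llm_group.group_update_output_language("adapt")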

_eq_deserialized_llm_config(other)[source]#

Custom config equality method to compare this DocumentLLMGroup with a deserialized instance.

Uses the _eq_deserialized_llm_config method of the DocumentLLM class to compare each LLM in the group, including fallbacks, if any.

Parameters:

other (DocumentLLMGroup) – Another DocumentLLMGroup instance to compare with

Returns:

True if the instances are equal, False otherwise

Return type:

bool

get_usage(llm_role=None)[source]#

Retrieves the usage information of the LLMs in the group, filtered by the specified LLM role if provided.

Parameters:

llm_role (Optional[str]) – Optional; A string representing the role of the LLM to filter the usage data. If None, returns usage for all LLMs in the group.

Returns:

A list of usage statistics containers for the specified LLMs and their fallbacks.

Return type:

list[_LLMUsageOutputContainer]

Raises:

ValueError – If no LLM with the specified role exists in the group.

get_cost(llm_role=None)[source]#

Retrieves the accumulated cost information of the LLMs in the group, filtered by the specified LLM role if provided.

Parameters:

llm_role (Optional[str]) – Optional; A string representing the role of the LLM to filter the cost data. If None, returns cost for all LLMs in the group.

Returns:

A list of cost statistics containers for the specified LLMs and their fallbacks.

Return type:

list[_LLMCostOutputContainer]

Raises:

ValueError – If no LLM with the specified role exists in the group.

reset_usage_and_cost(llm_role=None)[source]#

Resets the usage and cost statistics for LLMs in the group.

This method clears accumulated usage and cost data, which is useful when processing multiple documents sequentially and tracking metrics for each document separately.

Parameters:

llm_role (Optional[str]) – Optional; A string representing the role of the LLM to reset statistics for. If None, resets statistics for all LLMs in the group.

Raises:

ValueError – If no LLM with the specified role exists in the group.

Return type:

None

Returns:

None
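For example, when several documents are processed in sequence, resetting the statistics between runs keeps each document's metrics separate. A minimal sketch, where process_document is a hypothetical placeholder for the group's actual extraction call:

# Track usage and cost per document by resetting statistics between runs
for document in documents:
    process_document(llm_group, document)  # hypothetical extraction step
    print(llm_group.get_usage())  # usage accumulated for this document only
    print(llm_group.get_cost())  # cost accumulated for this document only
    llm_group.reset_usage_and_cost()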

class contextgem.public.llms.DocumentLLM(**data)[source]#

Bases: _GenericLLMProcessor

Handles processing documents with a specific LLM.

This class serves as an abstraction for interacting with an LLM. It provides functionality for querying the LLM with text or image inputs, and manages prompt preparation and token usage tracking. The class can be configured with different roles depending on the document processing task.

Variables:
  • model – Model identifier in format {model_provider}/{model_name}. See https://docs.litellm.ai/docs/providers for supported providers.

  • deployment_id – Deployment ID for the LLM. Primarily used with Azure OpenAI.

  • api_key – API key for LLM authentication. Not required for local models (e.g., Ollama).

  • api_base – Base URL of the API endpoint.

  • api_version – API version. Primarily used with Azure OpenAI.

  • role – Role type for the LLM (e.g., “extractor_text”, “reasoner_text”, “extractor_vision”, “reasoner_vision”). Defaults to “extractor_text”.

  • system_message – Preparatory system-level message to set context for LLM responses.

  • temperature – Sampling temperature (0.0 to 1.0) controlling response creativity. Lower values produce more predictable outputs, higher values generate more varied responses. Defaults to 0.3.

  • max_tokens – Maximum tokens allowed in the generated response. Defaults to 4096.

  • max_completion_tokens – Maximum number of tokens for output completions in o1 models. Defaults to 16000.

  • top_p – Nucleus sampling value (0.0 to 1.0) controlling output focus/randomness. Lower values make output more deterministic, higher values produce more diverse outputs. Defaults to 0.3.

  • num_retries_failed_request – Number of retries when LLM request fails. Defaults to 3.

  • max_retries_failed_request – LLM provider-specific retry count for failed requests. Defaults to 0.

  • max_retries_invalid_data – Number of retries when LLM returns invalid data. Defaults to 3.

  • timeout – Timeout in seconds for LLM API calls. Defaults to 120 seconds.

  • pricing_details – LLMPricing object with pricing details for cost calculation.

  • is_fallback – Indicates whether the LLM is a fallback model. Defaults to False.

  • fallback_llm – DocumentLLM to use as fallback if current one fails. Must have the same role as the current LLM.

  • output_language – Language for produced output text (justifications, explanations). Can be “en” (English) or “adapt” (adapts to document/image language). Defaults to “en”.

  • async_limiter – Controls frequency of async LLM API requests for concurrent tasks. Defaults to allowing 3 acquisitions per 10-second period to prevent rate limit issues. See mjpieters/aiolimiter for configuration details.

  • seed – Seed for random number generation to help produce more consistent outputs across multiple runs. When set to a specific integer value, the LLM will attempt to use this seed for sampling operations. However, deterministic output is still not guaranteed even with the same seed, as other factors may influence the model’s response. Defaults to None.

Parameters:
  • model (NonEmptyStr)

  • deployment_id (Optional[NonEmptyStr])

  • api_key (Optional[NonEmptyStr])

  • api_base (Optional[NonEmptyStr])

  • api_version (Optional[NonEmptyStr])

  • role (LLMRoleAny)

  • system_message (Optional[NonEmptyStr])

  • temperature (Optional[float])

  • max_tokens (Optional[int])

  • max_completion_tokens (Optional[int])

  • top_p (Optional[float])

  • num_retries_failed_request (Optional[int])

  • max_retries_failed_request (Optional[int])

  • max_retries_invalid_data (Optional[int])

  • timeout (Optional[int])

  • pricing_details (Optional[dict[NonEmptyStr, float]])

  • is_fallback (bool)

  • fallback_llm (Optional[DocumentLLM])

  • output_language (LanguageRequirement)

  • seed (Optional[StrictInt])

Note:

  • LLM groups

    Refer to the DocumentLLMGroup class for more information on constructing LLM groups, which are a collection of LLMs with unique roles, used for complex document processing tasks.

  • LLM role

    The role of an LLM is an abstraction to differentiate between tasks of different complexity. For example, if an aspect/concept is assigned llm_role="extractor_text", it means that the aspect/concept is extracted from the document using the LLM with the role set to “extractor_text”. This helps to channel different tasks to different LLMs, ensuring that the task is handled by the most appropriate model. Usually, domain expertise is required to determine the most appropriate role for a specific aspect/concept. But for simple use cases, you can skip the role assignment completely, in which case the role will default to “extractor_text”.

Example:
LLM definition#
from contextgem import DocumentLLM, LLMPricing

# Create a single LLM for text extraction
text_extractor = DocumentLLM(
    model="openai/gpt-4o-mini",
    api_key="your-api-key",  # Replace with your actual API key
    role="extractor_text",  # Role for text extraction
    pricing_details=LLMPricing(  # optional
        input_per_1m_tokens=0.150, output_per_1m_tokens=0.600
    ),
)

# Create a fallback LLM in case the primary model fails
fallback_text_extractor = DocumentLLM(
    model="anthropic/claude-3-7-sonnet",
    api_key="your-anthropic-api-key",  # Replace with your actual API key
    role="extractor_text",  # must be the same as the role of the primary LLM
    is_fallback=True,
    pricing_details=LLMPricing(  # optional
        input_per_1m_tokens=3.00, output_per_1m_tokens=15.00
    ),
)
# Assign the fallback LLM to the primary LLM
text_extractor.fallback_llm = fallback_text_extractor

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model: NonEmptyStr#
deployment_id: Optional[NonEmptyStr]#
api_key: Optional[NonEmptyStr]#
api_base: Optional[NonEmptyStr]#
api_version: Optional[NonEmptyStr]#
role: LLMRoleAny#
system_message: Optional[NonEmptyStr]#
temperature: Optional[StrictFloat]#
max_tokens: Optional[StrictInt]#
max_completion_tokens: Optional[StrictInt]#
top_p: Optional[StrictFloat]#
num_retries_failed_request: Optional[StrictInt]#
max_retries_failed_request: Optional[StrictInt]#
max_retries_invalid_data: Optional[StrictInt]#
timeout: Optional[StrictInt]#
pricing_details: Optional[LLMPricing]#
is_fallback: StrictBool#
fallback_llm: Optional[DocumentLLM]#
output_language: LanguageRequirement#
seed: Optional[StrictInt]#
property async_limiter: AsyncLimiter#
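The limiter controls how frequently concurrent async requests are issued and can be tuned per LLM. A minimal sketch using aiolimiter's AsyncLimiter, assuming the limiter can be assigned after construction as implied by the async_limiter variable described above:

from aiolimiter import AsyncLimiter

from contextgem import DocumentLLM

llm = DocumentLLM(
    model="openai/gpt-4o-mini",
    api_key="your-openai-api-key",  # Replace with your actual API key
)
# Allow at most 10 request acquisitions per 5-second window
# (assumes async_limiter is assignable post-construction)
llm.async_limiter = AsyncLimiter(10, 5)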
property is_group: bool#

Abstract property, to be implemented by subclasses.

Whether the LLM is a single instance or a group.

property list_roles: list[Literal['extractor_text', 'reasoner_text', 'extractor_vision', 'reasoner_vision']]#

Returns a list containing the role of this LLM.

(For a single LLM, this returns a list with just one element - the LLM’s role. For LLM groups, the method implementation returns roles of all LLMs in the group.)

Returns:

A list containing the role of this LLM.

Return type:

list[LLMRoleAny]

_update_default_prompt(prompt_path, prompt_type)[source]#

For advanced users only!

Update the default Jinja2 prompt template for the LLM.

This method allows you to replace the built-in prompt templates with custom ones for specific extraction types. The framework uses these templates to guide the LLM in extracting structured information from documents.

The custom prompt must be a valid Jinja2 template and include all the necessary variables that are present in the default prompt. Otherwise, the extraction may fail. Default prompts are located under contextgem/internal/prompts/

IMPORTANT NOTES:

The default prompts are complex and specifically designed for various steps of LLM extraction with the framework. Such prompts include the necessary instructions, template variables, nested structures and loops, etc.

Only use custom prompts if you must customize and adapt the default prompts to your specific use case; otherwise, the default prompts should be sufficient for most use cases.

Use at your own risk!

Parameters:
  • prompt_path (str | Path) – Path to the Jinja2 template file (.j2 extension required)

  • prompt_type (DefaultPromptType) – Type of prompt to update (“aspect” or “concept”)

Return type:

None
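A minimal sketch of replacing the default aspect prompt with a custom Jinja2 template; the file path is hypothetical, and the template must define every variable used by the default prompt:

from pathlib import Path

# Replace the built-in aspect-extraction prompt with a custom template
# (custom_aspect_prompt.j2 is a hypothetical path)
llm._update_default_prompt(Path("custom_aspect_prompt.j2"), "aspect")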

_eq_deserialized_llm_config(other)[source]#

Custom config equality method to compare this DocumentLLM with a deserialized instance.

Compares the __dict__ of both instances and performs specific checks for certain attributes that require special handling.

Note that, by default, the reconstructed deserialized DocumentLLM will be only partially equal (==) to the original one, as the API credentials are redacted, and the attached prompt templates, async limiter, and async lock are not serialized and point to different objects in memory post-initialization. Also, usage and cost are reset by default pre-serialization.

Parameters:

other (DocumentLLM) – Another DocumentLLM instance to compare with

Returns:

True if the instances are equal, False otherwise

Return type:

bool

get_usage()[source]#

Retrieves the usage information of the LLM and its fallback LLM if configured.

This method collects token usage statistics for the current LLM instance and its fallback LLM (if configured), providing insights into API consumption.

Returns:

A list of usage statistics containers for the LLM and its fallback.

Return type:

list[_LLMUsageOutputContainer]

get_cost()[source]#

Retrieves the accumulated cost information of the LLM and its fallback LLM if configured.

This method collects cost statistics for the current LLM instance and its fallback LLM (if configured), providing insights into API usage expenses.

Returns:

A list of cost statistics containers for the LLM and its fallback.

Return type:

list[_LLMCostOutputContainer]

reset_usage_and_cost()[source]#

Resets the usage and cost statistics for the LLM and its fallback LLM (if configured).

This method clears accumulated usage and cost data, which is useful when processing multiple documents sequentially and tracking metrics for each document separately.

Return type:

None

Returns:

None
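A minimal sketch of inspecting and resetting statistics for a single LLM, reusing text_extractor from the example above; the returned containers are printed as-is, since their fields are internal:

# Inspect accumulated usage and cost for the LLM and its fallback
for usage_container in text_extractor.get_usage():
    print(usage_container)
for cost_container in text_extractor.get_cost():
    print(cost_container)

# Clear accumulated statistics, e.g. before processing the next document
text_extractor.reset_usage_and_cost()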