gen_ai_hub.orchestration_v2 package

async arun(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None)

Executes an orchestration request asynchronously (non-streaming).

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the CompletionPostResponse object

Return type:

CompletionPostResponse

async arun_with_retries(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None, max_retries=10, base_delay=1.0)

Executes an orchestration request asynchronously with automatic retry on rate limits (429) and server errors. Uses exponential backoff with jitter to handle rate limiting gracefully.

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None
max_retries (int, optional) -- the maximum number of retry attempts, defaults to 10
base_delay (float, optional) -- the initial delay between retries in seconds, defaults to 1.0

Returns:

the OrchestrationResponseWithRetries with retry count information

Return type:

OrchestrationResponseWithRetries

Raises:

ValueError -- if no configuration is provided.
OrchestrationError -- if request fails after all retries (includes retry count).
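
The retry behaviour described above can be pictured with a small, self-contained sketch. It is not the library's implementation: `TransientError`, `call_with_retries`, and `make_flaky_request` are hypothetical names, and the delay formula (base delay doubled per attempt, plus up to 10% random jitter) is only an assumption about what "exponential backoff with jitter" means here.

```python
import asyncio
import random


class TransientError(Exception):
    """Hypothetical stand-in for a retryable error (e.g. HTTP 429 or 5xx)."""

    def __init__(self, code: int):
        super().__init__(f"transient failure with status {code}")
        self.code = code


async def call_with_retries(request, max_retries: int = 10, base_delay: float = 1.0):
    """Await `request()` and retry on TransientError with backoff plus jitter.

    Returns a (result, retries) tuple so the caller can see the retry count.
    """
    retries = 0
    while True:
        try:
            return await request(), retries
        except TransientError:
            if retries >= max_retries:
                raise  # give up and propagate the last error
            retries += 1
            # Assumed formula: base_delay * 2^(attempt - 1), plus up to 10% jitter.
            delay = base_delay * (2 ** (retries - 1))
            delay += random.uniform(0, 0.1 * delay)
            await asyncio.sleep(delay)


def make_flaky_request(failures: int):
    """Build a hypothetical coroutine function that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    async def flaky_request():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TransientError(429)
        return "ok"

    return flaky_request


result, retries = asyncio.run(
    call_with_retries(make_flaky_request(2), max_retries=5, base_delay=0.001)
)
```

Returning the retry count next to the result mirrors the idea of a response that carries "retry count information"; the actual response type may expose it differently.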

async astream(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None)

Executes an orchestration streaming request asynchronously.

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the AsyncSSEClient object

Return type:

AsyncSSEClient

close_http_connection(): Closes the httpx synchronous client.

embed(config, input, timeout=None)

Executes an embeddings request synchronously.

Parameters:

config (EmbeddingsOrchestrationConfig) -- the embeddings orchestration configuration
input (EmbeddingsInput) -- the input text to embed
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the EmbeddingsPostResponse object

Return type:

EmbeddingsPostResponse

handle_retry(retry_count, base_delay, error, max_retries)

Handles retry logic with exponential backoff and jitter. If a Retry-After header is present, its value is used as the minimum delay and jitter is added on top.

Parameters:

retry_count (int) -- the incremented retry attempt number
base_delay (float) -- the initial delay between retries in seconds
error (OrchestrationError) -- the exception that occurred
max_retries (int) -- the maximum number of retry attempts

Raises:

error -- the original error, re-raised if no retry should be attempted

Returns:

the number of seconds to wait before next retry

Return type:

float
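
As an illustration of the delay calculation described for handle_retry, the following sketch computes a wait time from the retry number, the base delay, and an optional Retry-After value. The function name `compute_retry_delay`, the `jitter_ratio` parameter, the doubling formula, and the choice to take the larger of the backoff and Retry-After are all assumptions made for this example, not details confirmed by the library.

```python
import random
from typing import Optional


def compute_retry_delay(
    retry_count: int,
    base_delay: float,
    retry_after: Optional[float] = None,
    jitter_ratio: float = 0.25,
) -> float:
    """Return the seconds to wait before retry number `retry_count` (1-based)."""
    # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
    backoff = base_delay * (2 ** (retry_count - 1))
    # Assumption: a Retry-After value acts as a floor for the delay.
    min_delay = backoff if retry_after is None else max(backoff, retry_after)
    # Jitter goes on top so that concurrent clients do not retry in lockstep.
    return min_delay + random.uniform(0, jitter_ratio * min_delay)


plain_delay = compute_retry_delay(3, base_delay=1.0)
server_hinted_delay = compute_retry_delay(1, base_delay=1.0, retry_after=30.0)
```

With these assumptions, the third retry waits between 4 and 5 seconds, and a Retry-After of 30 seconds yields a wait between 30 and 37.5 seconds.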

run(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None)

Executes an orchestration request synchronously (non-streaming).

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None; if not provided, the default configuration is used
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the CompletionPostResponse object

Return type:

CompletionPostResponse

run_with_retries(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None, max_retries=10, base_delay=1.0)

Executes an orchestration request with automatic retry on rate limits (429) and server errors.

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None
max_retries (int, optional) -- the maximum number of retry attempts, defaults to 10
base_delay (float, optional) -- the initial delay between retries in seconds, defaults to 1.0

Returns:

the OrchestrationResponseWithRetries with retry count information

Return type:

OrchestrationResponseWithRetries

See https://api.sap.com/api/ORCHESTRATION_API_v2/overview

Raises:

ValueError -- if no configuration is provided.
OrchestrationError -- if request fails after all retries (includes retry count).

stream(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None)

Executes an orchestration streaming request synchronously.

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None; if not provided, the default configuration is used
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

An Iterable[StreamCompletionPostResponse] object

Return type:

Iterable[StreamCompletionPostResponse]
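
A streamed response arrives as a sequence of chunks, each carrying a small delta. The sketch below shows one way to reassemble the text. The dataclasses are simplified stand-ins that mirror only a few fields of StreamCompletionPostResponse, StreamLLMModuleResult, StreamLLMChoice, and StreamDelta as listed in this reference; `collect_text` and the fake stream are hypothetical and no request is made.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class Delta:  # stand-in for StreamDelta
    content: str = ""


@dataclass
class Choice:  # stand-in for StreamLLMChoice
    delta: Delta
    index: int = 0
    finish_reason: Optional[str] = None


@dataclass
class Result:  # stand-in for StreamLLMModuleResult
    choices: List[Choice] = field(default_factory=list)


@dataclass
class Chunk:  # stand-in for StreamCompletionPostResponse
    request_id: str
    final_result: Optional[Result] = None


def collect_text(chunks: Iterable[Chunk]) -> str:
    """Concatenate the delta content of choice 0 across all chunks."""
    parts = []
    for chunk in chunks:
        if chunk.final_result is None:
            continue  # a chunk may carry no LLM output
        for choice in chunk.final_result.choices:
            if choice.index == 0:
                parts.append(choice.delta.content)
    return "".join(parts)


# Hypothetical chunks standing in for what an iterable of responses might yield.
fake_stream = [
    Chunk("r1"),
    Chunk("r1", Result([Choice(Delta("Hel"))])),
    Chunk("r1", Result([Choice(Delta("lo"), finish_reason="stop")])),
]
text = collect_text(fake_stream)
```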

class OutputFiltering

Bases: ABCBaseModel

Module for managing and applying output content filters.

Args:

filters: List of ContentFilter objects to be applied to output content.

stream_options: Module-specific streaming options.

filters: List[AzureContentSafetyOutputFilterConfig | LlamaGuard38bFilterConfig | ContentFilter]

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

stream_options: FilteringStreamOptions | None

class OutputTranslationConfig

Bases: TranslationConfig

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

target_language: str | SAPDocumentTranslationApplyToSelector

class ProfileEntity

Bases: str, Enum

Enumerates the entity categories that can be masked by the SAP Data Privacy Integration service.

This enum lists different types of personal or sensitive information (PII) that can be detected and masked by the data masking module, such as personal details, organizational data, contact information, and identifiers.

Values:

PERSON: Represents personal names.

ORG: Represents organizational names.

UNIVERSITY: Represents educational institutions.

LOCATION: Represents geographical locations.

EMAIL: Represents email addresses.

PHONE: Represents phone numbers.

ADDRESS: Represents physical addresses.

SAP_IDS_INTERNAL: Represents internal SAP identifiers.

SAP_IDS_PUBLIC: Represents public SAP identifiers.

URL: Represents URLs.

USERNAME_PASSWORD: Represents usernames and passwords.

NATIONAL_ID: Represents national identification numbers.

IBAN: Represents International Bank Account Numbers.

SSN: Represents Social Security Numbers.

CREDIT_CARD_NUMBER: Represents credit card numbers.

PASSPORT: Represents passport numbers.

DRIVING_LICENSE: Represents driving license numbers.

NATIONALITY: Represents nationality information.

ETHNICITY: Represents ethnicity information.

RELIGIOUS_GROUP: Represents religious group affiliation.

POLITICAL_GROUP: Represents political group affiliation.

PRONOUNS_GENDER: Represents pronouns and gender identity.

GENDER: Represents gender information.

SEXUAL_ORIENTATION: Represents sexual orientation.

TRADE_UNION: Represents trade union membership.

SENSITIVE_DATA: Represents any other sensitive information.

__new__(value)

ADDRESS = 'profile-address'

CREDIT_CARD_NUMBER = 'profile-credit-card-number'

DRIVING_LICENSE = 'profile-driverlicense'

EMAIL = 'profile-email'

ETHNICITY = 'profile-ethnicity'

GENDER = 'profile-gender'

IBAN = 'profile-iban'

LOCATION = 'profile-location'

NATIONALITY = 'profile-nationality'

NATIONAL_ID = 'profile-nationalid'

ORG = 'profile-org'

PASSPORT = 'profile-passport'

PERSON = 'profile-person'

PHONE = 'profile-phone'

POLITICAL_GROUP = 'profile-political-group'

PRONOUNS_GENDER = 'profile-pronouns-gender'

RELIGIOUS_GROUP = 'profile-religious-group'

SAP_IDS_INTERNAL = 'profile-sapids-internal'

SAP_IDS_PUBLIC = 'profile-sapids-public'

SENSITIVE_DATA = 'profile-sensitive-data'

SEXUAL_ORIENTATION = 'profile-sexual-orientation'

SSN = 'profile-ssn'

TRADE_UNION = 'profile-trade-union'

UNIVERSITY = 'profile-university'

URL = 'profile-url'

USERNAME_PASSWORD = 'profile-username-password'
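
Because ProfileEntity derives from both str and Enum, its members behave like their string values. The sketch below reproduces that pattern with a hypothetical three-member enum (`ProfileEntitySketch`) whose values are copied from the listing above; it does not import the real class.

```python
from enum import Enum


class ProfileEntitySketch(str, Enum):
    """Three members copied from the listing above; the real enum has more."""

    PERSON = "profile-person"
    EMAIL = "profile-email"
    PHONE = "profile-phone"


# Look up a member from its string value.
entity = ProfileEntitySketch("profile-email")

# Because the enum mixes in str, members compare equal to their values.
same_as_string = entity == "profile-email"

# .value gives the plain string, e.g. for building a request payload.
wire_values = [e.value for e in (ProfileEntitySketch.PERSON, ProfileEntitySketch.PHONE)]
```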

class PromptTemplatingModuleConfig

Bases: ABCBaseModel

model: LLMModelDetails

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

prompt: Template | TemplateRef

class PromptTokensDetails

Bases: ABCBaseModel

Represents the details of prompt tokens used in a specific operation.

Attributes:

audio_tokens (Optional[int]): Audio input tokens present in the prompt.

cached_tokens (Optional[int]): Cached tokens present in the prompt.

audio_tokens: int | None

cached_tokens: int | None

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class ResponseChatMessage

Bases: ABCBaseModel

Represents a response message in a conversation.

Args:

role: The role of the entity sending the message.

content: The text content of the assistant message.

refusal: A string indicating refusal reason.

tool_calls: A list of tool call objects.

content: str

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

refusal: str | None

role: Role

tool_calls: List[MessageToolCall] | None

class ResponseFormatJsonObject

Bases: ABCBaseModel

Response format JSON Object that the model output should adhere to.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type_: ResponseFormatType

class ResponseFormatJsonSchema

Bases: ABCBaseModel

json_schema: JSONResponseSchema

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type_: ResponseFormatType

class ResponseFormatText

Bases: ABCBaseModel

Response format that the model output should adhere to.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type_: ResponseFormatType

class SAPAPIError

Bases: ABCBaseModel

Represents an error returned from an SAP API.

Attributes:

request_id (str): The unique identifier of the request associated with the error.

code (int): The HTTP error code.

message (str): A detailed message describing the error.

location (str): The location where the error occurred.

intermediate_results (Optional[ModuleResults]): Optional attribute to store any processing results, if available or applicable.

code: int

headers: dict[str, str] | None

intermediate_results: ModuleResults | None

location: str

message: str

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

request_id: str

class SAPAPIErrorStreaming

Bases: ABCBaseModel

Represents an error returned from an SAP API.

Attributes:

request_id (str): The unique identifier of the request associated with the error.

code (int): The HTTP error code.

message (str): A detailed message describing the error.

location (str): The location where the error occurred.

intermediate_results (Optional[ModuleResultsStreaming]): Optional attribute to store any processing results, if available or applicable.

code: int

headers: dict[str, str] | None

intermediate_results: ModuleResultsStreaming | None

location: str

message: str

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

request_id: str

class SAPDocumentTranslation

Bases: ABCBaseModel

Configuration for translation module.

Args:

type: The type of translation module (e.g., 'sap_document_translation').

config: Configuration object for the translation module.

config: TranslationConfig

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type_: TranslationType

class SAPDocumentTranslationApplyToSelector

Bases: ABCBaseModel

This selector allows you to define the scope of translation, such as specific placeholders or messages with specific roles. For example, {"category": "placeholders", "items": ["user_input"], "source_language": "de-DE"} targets the value of "user_input" in the placeholder_values specified in the request payload and treats that value as German.

category: Literal['placeholders', 'template_roles']

items: list[str]

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

source_language: str
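
To make the selector semantics concrete, the sketch below builds the example selector as a plain dictionary and works out which placeholder values it would target. Both helper functions (`make_selector`, `targeted_placeholders`) and the sample placeholder values are hypothetical; the real class validates its fields itself and the service performs the actual selection.

```python
from typing import Dict, List

ALLOWED_CATEGORIES = ("placeholders", "template_roles")


def make_selector(category: str, items: List[str], source_language: str) -> Dict[str, object]:
    """Build a selector dict with the three fields listed above."""
    if category not in ALLOWED_CATEGORIES:
        raise ValueError(f"category must be one of {ALLOWED_CATEGORIES}, got {category!r}")
    return {"category": category, "items": list(items), "source_language": source_language}


def targeted_placeholders(
    selector: Dict[str, object], placeholder_values: Dict[str, str]
) -> Dict[str, str]:
    """Return the placeholder values that a 'placeholders' selector names."""
    if selector["category"] != "placeholders":
        return {}
    return {k: v for k, v in placeholder_values.items() if k in selector["items"]}


selector = make_selector("placeholders", ["user_input"], "de-DE")
targeted = targeted_placeholders(
    selector, {"user_input": "Guten Morgen", "product": "SAP AI Core"}
)
```

Only "user_input" is selected here, so only that value would be treated as German text to translate.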

class SAPDocumentTranslationInput

Bases: SAPDocumentTranslation

Configuration for input translation

Args:

type: The type of translation module (e.g., 'sap_document_translation').

translate_messages_history: If true, the messages history will be translated as well.

config: Configuration object for the translation module.

config: InputTranslationConfig | TranslationConfig

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

translate_messages_history: bool | None

class SAPDocumentTranslationOutput

Bases: SAPDocumentTranslation

Configuration for output translation

Args:

type: The type of translation module (e.g., 'sap_document_translation').

config: Configuration object for the translation module.

config: OutputTranslationConfig | TranslationConfig

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class StreamCompletionPostResponse

Bases: ABCBaseModel

final_result: StreamLLMModuleResult | None

intermediate_failures: List[SAPAPIError] | None

intermediate_results: StreamModuleResults | None

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

request_id: str

class StreamDelta

Bases: ABCBaseModel

content: str

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

role: str | None

tool_calls: List[StreamToolCall] | None

class StreamFunctionObject

Bases: FunctionCall

Represents a function call with its name and arguments.

Attributes:

name: str: The name of the function to call.
arguments: str: The arguments to call the function with, as generated by the model in JSON format. Note that the model does not always generate valid JSON, and may hallucinate parameters not defined by your function schema. Validate the arguments in your code before calling your function.

arguments: str | None

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None
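
The note above recommends validating model-generated arguments before calling a function. A minimal sketch of such a check, using only the standard library, could look like this; `parse_tool_arguments` and the sample parameter names are hypothetical and not part of the SDK.

```python
import json
from typing import Any, Dict, Optional, Set


def parse_tool_arguments(
    arguments: Optional[str], allowed: Set[str], required: Set[str]
) -> Dict[str, Any]:
    """Parse a JSON argument string and reject anything the function does not expect."""
    if not arguments:
        raise ValueError("no arguments supplied")
    try:
        parsed = json.loads(arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("arguments must be a JSON object")
    unknown = set(parsed) - allowed
    if unknown:
        raise ValueError(f"unexpected parameters: {sorted(unknown)}")
    missing = required - set(parsed)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    return parsed


args = parse_tool_arguments(
    '{"city": "Walldorf"}', allowed={"city", "unit"}, required={"city"}
)
```

A stricter variant could validate value types against the function's JSON schema; this sketch only checks parameter names.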

class StreamLLMChoice

Bases: ABCBaseModel

delta: StreamDelta

finish_reason: str | None

index: int

logprobs: ChoiceLogprobs | None

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class StreamLLMModuleResult

Bases: LLMModuleResult

choices: List[StreamLLMChoice]

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

usage: TokenUsage | None

class StreamModuleResults

Bases: ModuleResults

llm: StreamLLMModuleResult | None

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

output_unmasking: List[StreamLLMChoice] | None

class StreamToolCall

Bases: ABCBaseModel

function: StreamFunctionObject | None

id: str | None

index: int

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type_: Literal['function']
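
Since id, function.name, and function.arguments are all optional on a streamed tool call while index is required, one plausible reading is that a tool call arrives in fragments that are grouped by index. The sketch below merges fragments under that assumption; the concatenation of argument pieces is not confirmed by this reference, and `ToolCallFragment` and `merge_fragments` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class ToolCallFragment:  # flattened stand-in for StreamToolCall and StreamFunctionObject
    index: int
    id: Optional[str] = None
    name: Optional[str] = None
    arguments: Optional[str] = None


def merge_fragments(fragments: Iterable[ToolCallFragment]) -> Dict[int, ToolCallFragment]:
    """Group fragments by index, keeping the first id/name and joining the argument pieces."""
    merged: Dict[int, ToolCallFragment] = {}
    for frag in fragments:
        call = merged.setdefault(frag.index, ToolCallFragment(index=frag.index, arguments=""))
        call.id = call.id or frag.id
        call.name = call.name or frag.name
        call.arguments += frag.arguments or ""
    return merged


# Hypothetical fragments for a single tool call split across two chunks.
merged = merge_fragments(
    [
        ToolCallFragment(0, id="call_1", name="get_weather", arguments='{"ci'),
        ToolCallFragment(0, arguments='ty": "Walldorf"}'),
    ]
)
```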

class SystemMessage

Bases: ABCBaseModel

Represents a system message in a prompt or conversation template.

System messages typically provide context or instructions to the AI model.

Args:

role: The role of the entity sending the message.

content: The text content of the system message.

content: str | List[TextPart]

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

role: Role

class Template

Bases: ABCBaseModel

Represents a configurable template for generating prompts or conversations.

Args:

defaults: A dict of default values for template variables.

tools: A list of tool definitions.

response_format: A response format that the model output should adhere to.

defaults: dict | None

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

response_format: ResponseFormatText | ResponseFormatJsonObject | ResponseFormatJsonSchema | None

template: List[SystemMessage | UserMessage | AssistantMessage | ToolChatMessage | DeveloperChatMessage | ResponseChatMessage]

tools: List[dict | FunctionTool] | None

class TemplateRef

Bases: ABCBaseModel

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

template_ref: TemplateRefByID | TemplateRefByScenarioNameVersion

class TemplateRefByID

Bases: ABCBaseModel

Represents a prompt template reference for generating prompts or conversations.

Args:

id (str): ID of the template in the prompt registry.

scope (Optional[Literal["resource_group", "tenant"]]): Defines the scope that is searched for the referenced template. 'tenant' indicates the template is shared across all resource groups within the tenant, while 'resource_group' indicates the template is only accessible within the specific resource group. Defaults to 'tenant'.

id: str

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

scope: Literal['resource_group', 'tenant'] | None

class TemplateRefByScenarioNameVersion

Bases: ABCBaseModel

Represents a prompt template reference for generating prompts or conversations.

Args:

scenario (str): Scenario name.

name (str): Name of the template.

version (str): Version of the template.

scope (Optional[Literal["resource_group", "tenant"]]): Defines the scope that is searched for the referenced template. 'tenant' indicates the template is shared across all resource groups within the tenant, while 'resource_group' indicates the template is only accessible within the specific resource group. Defaults to 'tenant'.

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

scenario: str

scope: Literal['resource_group', 'tenant'] | None

version: str

class TextPart

Bases: ABCBaseModel

Represents a text segment within a multimodal content block.

Args:

text: The string content of the text part.

type: The type identifier, defaulting to "text".

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

text: str

type_: Literal['text']

class TokenUsage

Bases: ABCBaseModel

Usage of tokens in the response

completion_tokens: int

completion_tokens_details: CompletionTokensDetails | None

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

prompt_tokens: int

prompt_tokens_details: PromptTokensDetails | None

total_tokens: int

class ToolChatMessage

Bases: ABCBaseModel

content: str | List[TextPart]

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

role: Role

tool_call_id: str

class TopLogprob

Bases: ABCBaseModel

Represents one of the most likely tokens and its log probability at a given token position.

Attributes:

token: The token.

logprob: The log probability of this token.

bytes: UTF-8 bytes of the token, if applicable.

bytes: List[int] | None

logprob: float

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

token: str

class TranslationConfig

Bases: ABCBaseModel

Configuration for sap_document_translation translation provider.

Args:

source_language: Language of the text to be translated. Example: de-DE

target_language: Language to which the text should be translated. Example: en-US

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

source_language: str | None

target_language: str

class TranslationModuleConfig

Bases: ABCBaseModel

Configuration for translation module

Args:

input: Configuration for input translation

output: Configuration for output translation

input: SAPDocumentTranslationInput | SAPDocumentTranslation | None

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

output: SAPDocumentTranslationOutput | SAPDocumentTranslation | None

class UserMessage

Bases: ABCBaseModel

Represents a user message in a prompt or conversation template.

User messages typically contain queries or inputs from the user.

Args:

role: The role of the entity sending the message.

content: The message content, which may be plain text or a sequence of text and images.

classmethod content_validation(content): Validates and maps the content field to the appropriate types.

content: str | TextPart | ImagePart | List[str | TextPart | ImagePart | ImageItem]

model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'frozen': False}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

role: Role

function_tool(func=None, *, description=None, strict=False)

Decorator that converts a function into a FunctionTool.

Usage:

@function_tool
def my_func(...): ...

@function_tool()
def my_func(...): ...

Parameters:

func (Callable | None)
description (str | None)
strict (bool)

Return type:

Callable[[Callable], FunctionTool] | FunctionTool
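
The two usage forms (with and without parentheses) follow a common decorator pattern in which the first positional argument is either the function or None. The sketch below shows that pattern in isolation. `ToolSketch` and `tool_sketch` are hypothetical stand-ins for FunctionTool and function_tool, and falling back to the docstring for the description is an assumption of this example.

```python
import inspect
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToolSketch:
    """Hypothetical stand-in for FunctionTool."""

    name: str
    description: str
    strict: bool
    func: Callable

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


def tool_sketch(
    func: Optional[Callable] = None,
    *,
    description: Optional[str] = None,
    strict: bool = False,
):
    def wrap(f: Callable) -> ToolSketch:
        return ToolSketch(
            name=f.__name__,
            # Assumption: use the docstring when no description is given.
            description=description or inspect.getdoc(f) or "",
            strict=strict,
            func=f,
        )

    # Bare form: @tool_sketch passes the function directly.
    if func is not None:
        return wrap(func)
    # Called form: @tool_sketch(...) returns the actual decorator.
    return wrap


@tool_sketch
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


@tool_sketch(description="Multiply two integers.", strict=True)
def mul(a: int, b: int) -> int:
    return a * b
```

This explains the documented return type: the bare form returns the tool object directly, while the called form returns a callable that produces it.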

python_type_to_json_type(py_type)

Convert a Python type to a JSON Schema type.

Parameters:

py_type (any) -- the Python type to convert

Returns:

A dictionary representing the JSON Schema type.

Return type:

dict
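
For orientation, a conversion of this kind can be sketched as follows. This is not the library's implementation: `to_json_type`, the set of supported types, and the empty-schema fallback for unknown types are assumptions chosen for the example.

```python
from typing import Any, Dict, List, get_args, get_origin

# Python primitives and their JSON Schema type names.
_PRIMITIVES = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    type(None): "null",
}


def to_json_type(py_type: Any) -> Dict[str, Any]:
    """Map a Python type annotation to a JSON Schema fragment."""
    if py_type in _PRIMITIVES:
        return {"type": _PRIMITIVES[py_type]}
    origin = get_origin(py_type)
    if py_type is list or origin is list:
        schema: Dict[str, Any] = {"type": "array"}
        args = get_args(py_type)
        if args:
            # Describe the element type, e.g. List[str] -> items: string.
            schema["items"] = to_json_type(args[0])
        return schema
    if py_type is dict or origin is dict:
        return {"type": "object"}
    # Assumed fallback: an empty schema, which places no constraint on the value.
    return {}


int_schema = to_json_type(int)
list_schema = to_json_type(List[str])
```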

Subpackages

gen_ai_hub.orchestration_v2.models package

Submodules

gen_ai_hub.orchestration_v2.exceptions module

Exceptions for the orchestration service module.

exception OrchestrationError

Bases: Exception

__init__(request_id, headers, message, code, location, intermediate_results, retries=0)

Initializes the OrchestrationError with detailed context.

Parameters:

request_id (str) -- unique identifier for the request that encountered the error.
headers (httpx.Headers) -- HTTP headers associated with the request, useful for example in case of rate limiting.
message (str) -- Detailed error message describing the issue.
code (int) -- Error code associated with the specific type of failure.
location (str) -- Specific component or step in the orchestration process where the error occurred.
intermediate_results (ModuleResults) -- State information and partial results from various modules at the time of the error, useful for debugging.
retries (int, optional) -- Number of retries attempted before the error was raised.
errors (Optional[list[dict[str, Any]]]) -- Raw error payload(s) from the API. Can contain multiple errors.
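
The context carried by the exception lends itself to structured error reporting. The sketch below uses a hypothetical stand-in class (`OrchestrationErrorSketch`) with a subset of the parameters listed above and a plain dict in place of httpx.Headers; the helper names and the sample values such as "req-123" and "LLM Module" are invented for illustration.

```python
from typing import Mapping, Optional


class OrchestrationErrorSketch(Exception):
    """Hypothetical stand-in carrying a subset of the fields listed above."""

    def __init__(self, request_id: str, headers: Mapping[str, str], message: str,
                 code: int, location: str, retries: int = 0):
        super().__init__(message)
        self.request_id = request_id
        self.headers = headers
        self.message = message
        self.code = code
        self.location = location
        self.retries = retries


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[float]:
    """Read a numeric Retry-After header, matching the name case-insensitively."""
    for key, value in headers.items():
        if key.lower() == "retry-after":
            try:
                return float(value)
            except ValueError:
                return None  # the HTTP-date form is not handled in this sketch
    return None


def describe(error: OrchestrationErrorSketch) -> str:
    """Build a one-line summary suitable for a log message."""
    wait = retry_after_seconds(error.headers)
    hint = f", retry after {wait:g}s" if wait is not None else ""
    return (
        f"[{error.code}] {error.message} at {error.location} "
        f"(request {error.request_id}, {error.retries} retries{hint})"
    )


try:
    raise OrchestrationErrorSketch(
        "req-123", {"Retry-After": "2"}, "Too many requests", 429, "LLM Module", retries=3
    )
except OrchestrationErrorSketch as exc:
    summary = describe(exc)
```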

exception OrchestrationErrorList

Bases: Exception

__init__(errors)

Parameters:

errors (list[OrchestrationError])

gen_ai_hub.orchestration_v2.service module

Module for orchestration service handling requests and responses.

Provides synchronous and asynchronous methods to run orchestration pipelines.

class OrchestrationService

Bases: object

A service for executing orchestration requests, allowing for the generation of LLM-generated content through a pipeline of configured modules.

This service supports both synchronous and asynchronous request execution. For streaming responses, special care is taken to not close the underlying HTTP stream prematurely.

Args:

api_url: The base URL for the orchestration API.

config: The default orchestration configuration.

config_ref: The reference to the default orchestration configuration.

proxy_client: A GenAIHubProxyClient instance.

deployment_id: Optional deployment ID.

config_name: Optional configuration name.

config_id: Optional configuration ID.

timeout: Optional timeout for HTTP requests.

__init__(api_url=None, config=None, config_ref=None, proxy_client=None, deployment_id=None, config_name=None, config_id=None, timeout=None)

Initializes the OrchestrationService.

Parameters:

api_url (Optional[str], optional) -- the base URL for the orchestration API, defaults to None
config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
proxy_client (Optional[GenAIHubProxyClient], optional) -- the GenAIHubProxyClient instance, defaults to None
deployment_id (Optional[str], optional) -- the deployment ID, defaults to None
config_name (Optional[str], optional) -- the configuration name, defaults to None
config_id (Optional[str], optional) -- the configuration ID, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout for HTTP requests, defaults to None

Raises:

ValueError -- if both config and config_ref are provided.

async aclose_http_connection(): Closes the httpx asynchronous client.

async aembed(config, input, timeout=None)

Executes an embeddings request asynchronously.

Parameters:

config (EmbeddingsOrchestrationConfig) -- the embeddings orchestration configuration
input (EmbeddingsInput) -- the input text to embed
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the EmbeddingsPostResponse object

Return type:

EmbeddingsPostResponse

async arun(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None)

Executes an orchestration request asynchronously (non-streaming).

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the CompletionPostResponse object

Return type:

CompletionPostResponse

async arun_with_retries(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None, max_retries=10, base_delay=1.0)

Executes an orchestration request asynchronously with automatic retry on rate limits (429) and server errors. Uses exponential backoff with jitter to handle rate limiting gracefully.

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None
max_retries (int, optional) -- the maximum number of retry attempts, defaults to 10
base_delay (float, optional) -- the initial delay between retries in seconds, defaults to 1.0

Returns:

the OrchestrationResponseWithRetries with retry count information

Return type:

OrchestrationResponseWithRetries

Raises:

ValueError -- if no configuration is provided.
OrchestrationError -- if request fails after all retries (includes retry count).

async astream(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None)

Executes an orchestration streaming request asynchronously.

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the AsyncSSEClient object

Return type:

AsyncSSEClient

close_http_connection(): Closes the httpx synchronous client.

embed(config, input, timeout=None)

Executes an embeddings request synchronously.

Parameters:

config (EmbeddingsOrchestrationConfig) -- the embeddings orchestration configuration
input (EmbeddingsInput) -- the input text to embed
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the EmbeddingsPostResponse object

Return type:

EmbeddingsPostResponse

handle_retry(retry_count, base_delay, error, max_retries)

Handles retry logic with exponential backoff and jitter. If a Retry-After header is present, its value is used as the minimum delay and jitter is added on top.

Parameters:

retry_count (int) -- the incremented retry attempt number
base_delay (float) -- the initial delay between retries in seconds
error (OrchestrationError) -- the exception that occurred
max_retries (int) -- the maximum number of retry attempts

Raises:

error -- the original error, re-raised if no retry should be attempted

Returns:

the number of seconds to wait before next retry

Return type:

float

run(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None)

Executes an orchestration request synchronously (non-streaming).

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None; if not provided, the default configuration is used
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None

Returns:

the CompletionPostResponse object

Return type:

CompletionPostResponse

run_with_retries(config=None, config_ref=None, placeholder_values=None, history=None, timeout=None, max_retries=10, base_delay=1.0)

Executes an orchestration request with automatic retry on rate limits (429) and server errors.

Parameters:

config (Optional[OrchestrationConfig], optional) -- the orchestration configuration, defaults to None
config_ref (Optional[OrchestrationConfigReference], optional) -- the orchestration configuration reference, defaults to None
placeholder_values (Optional[dict], optional) -- the template values, defaults to None
history (Optional[List[ChatMessage]], optional) -- the message history, defaults to None
timeout (Union[int, float, httpx.Timeout, None], optional) -- the timeout overwrite per request, defaults to None
max_retries (int, optional) -- the maximum number of retry attempts, defaults to 10
base_delay (float, optional) -- the initial delay between retries in seconds, defaults to 1.0

Returns:

the OrchestrationResponseWithRetries with retry count information

Return type: