Our SDK offers a developer-friendly way to consume the foundation models available in the SAP generative AI hub. It provides integrations that act as drop-in replacements for the native client SDKs and LangChain, so developers can keep using familiar interfaces and workflows. The sections below show typical usage.

Native Client Integrations

Completions

OpenAI

Completions is equivalent to openai.Completions. The example below shows how to use Completions in the generative AI hub SDK. All models that support the legacy completion endpoint can be used.

from gen_ai_hub.proxy.native.openai import completions

response = completions.create(
    model_name="gpt-4o-mini",
    prompt="The Answer to the Ultimate Question of Life, the Universe, and Everything is",
    max_tokens=20,
    temperature=0
)
print(response)

ChatCompletions is equivalent to openai.ChatCompletions. The example below shows how to use ChatCompletions in the generative AI hub SDK.

from gen_ai_hub.proxy.native.openai import chat

messages = [{"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
            {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
            {"role": "user", "content": "Do other Azure Cognitive Services support this too?"}]

kwargs = dict(model_name='gpt-4o-mini', messages=messages)
response = chat.completions.create(**kwargs)

print(response)

# Example where deployment_id is passed instead of the model_name parameter

from gen_ai_hub.proxy.native.openai import chat

messages = [{"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
            {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
            {"role": "user", "content": "Do other Azure Cognitive Services support this too?"}]

response = chat.completions.create(deployment_id="dcef02e219ae4916", messages=messages)
print(response)
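The messages list in both examples follows the usual chat format of role/content dictionaries. As a minimal sketch, the helper below assembles such a list from a system prompt and alternating user/assistant turns. Note that build_messages is a hypothetical helper introduced here for illustration; it is not part of the SDK. The resulting list can be passed as messages with either model_name or deployment_id.

```python
from typing import Dict, List


def build_messages(system_prompt: str, turns: List[str]) -> List[Dict[str, str]]:
    """Build a chat message list; turns alternate user, assistant, user, ..."""
    messages = [{"role": "system", "content": system_prompt}]
    for index, content in enumerate(turns):
        # Even positions are user turns, odd positions are assistant turns.
        role = "user" if index % 2 == 0 else "assistant"
        messages.append({"role": role, "content": content})
    return messages


messages = build_messages(
    "You are a helpful assistant.",
    [
        "Does Azure OpenAI support customer managed keys?",
        "Yes, customer managed keys are supported by Azure OpenAI.",
        "Do other Azure Cognitive Services support this too?",
    ],
)
for message in messages:
    print(f"{message['role']}: {message['content']}")
```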

Structured model outputs

Returning LLM output as JSON objects is a powerful feature that lets you define the structure of the output you expect from the model.

See https://platform.openai.com/docs/guides/structured-outputs/examples for more examples.

from pydantic import BaseModel
from gen_ai_hub.proxy.native.openai import chat

class Person(BaseModel):
    name: str
    age: int

response = chat.completions.parse(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me about John Doe, aged 30."}],
    response_format=Person
)
person = response.choices[0].message.parsed  # Fully typed Person
print(person)
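Conceptually, structured output means the model returns JSON that matches a schema, which is then validated and turned into a typed object. The sketch below illustrates that last step with the standard library only. It assumes a hard-coded JSON string in place of a real model response; PersonRecord and parse_person are hypothetical names, and the SDK example above relies on pydantic rather than on this code.

```python
import json
from dataclasses import dataclass


@dataclass
class PersonRecord:
    name: str
    age: int


def parse_person(raw_json: str) -> PersonRecord:
    """Parse a JSON string and check the field types before building the record."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("name")
    age = data.get("age")
    if not isinstance(name, str):
        raise ValueError("'name' must be a string")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("'age' must be an integer")
    return PersonRecord(name=name, age=age)


# Stand-in for the JSON text a model might return (invented for illustration).
raw_output = '{"name": "John Doe", "age": 30}'
person = parse_person(raw_output)
print(person)
```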

Google GenAI

Generate Content

from gen_ai_hub.proxy.native.google_genai.clients import Client
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')
client = Client(proxy_client=proxy_client)

response = client.models.generate_content(model="gemini-2.5-flash",
    contents="How many paws are there for a dog?"
)

print(response)

# Using another model
response = client.models.generate_content(model="gemini-2.0-flash",
                                          contents="Explain the theory of relativity in simple terms.")
print(response)

Generate Content streaming

from gen_ai_hub.proxy.native.google_genai.clients import Client
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')

client = Client(
    proxy_client=proxy_client,
)


response_stream = client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Explain singularity in short terms.",
)

for chunk in response_stream:
    print("Chunk: ", chunk.text)
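When streaming, the answer arrives in pieces, so it is often useful to collect the chunks into one string. The following sketch shows one way to do that. It assumes chunks expose a text attribute as in the loop above and that a chunk may occasionally carry no text; fake_stream is a stand-in generator invented for this example, not an SDK function.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def fake_stream() -> Iterator[SimpleNamespace]:
    """Stand-in for a streaming response: yields objects with a .text attribute."""
    for piece in ["A singularity is ", None, "a point where ", "a model breaks down."]:
        yield SimpleNamespace(text=piece)


def collect_text(stream: Iterable) -> str:
    """Concatenate chunk texts, skipping chunks that carry no text."""
    parts = []
    for chunk in stream:
        if chunk.text:
            parts.append(chunk.text)
    return "".join(parts)


full_text = collect_text(fake_stream())
print(full_text)
```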

Function Calling with Google GenAI

from google.genai import types
from gen_ai_hub.proxy.native.google_genai.clients import Client
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

def get_current_weather(location: str) -> str:
  """Returns the current weather.

  Args:
    location: The city and state, e.g. San Francisco, CA
  """
  return 'sunny'


proxy_client = get_proxy_client('gen-ai-hub')

client = Client(
    proxy_client=proxy_client,
)
response = client.models.generate_content(
  model='gemini-2.5-flash',
  contents='What is the weather like in Boston?',
  config=types.GenerateContentConfig(tools=[get_current_weather]),
)
print(response)
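Passing a plain Python function as a tool works because the function's name, type hints, and docstring describe it well enough for the model to decide when to call it. The sketch below uses only the standard library to show which pieces of information can be read from get_current_weather. The helper describe_function is hypothetical and is not meant to reproduce how the google-genai SDK builds its function declarations.

```python
import inspect
from typing import Any, Callable, Dict


def get_current_weather(location: str) -> str:
    """Returns the current weather.

    Args:
      location: The city and state, e.g. San Francisco, CA
    """
    return "sunny"


def describe_function(func: Callable[..., Any]) -> Dict[str, Any]:
    """Collect the name, docstring summary, and parameter types of a function."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        # The first docstring line serves as a short description.
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


description = describe_function(get_current_weather)
print(description)
```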

Amazon

Invoke Model

import json
from gen_ai_hub.proxy.native.amazon.clients import Session

bedrock = Session().client(model_name="amazon--nova-premier")
body = json.dumps(
    {
        "inputText": "Explain black holes in astrophysics to 8th graders.",
        "textGenerationConfig": {
            "maxTokenCount": 3072,
            "stopSequences": [],
            "temperature": 0.7,
            "topP": 0.9,
        },
    }
)
response = bedrock.invoke_model(body=body)
response_body = json.loads(response.get("body").read())
print(response_body)
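The example above sends the request body as a JSON string and reads the response body from a stream. As a rough sketch of that round trip, the code below substitutes an in-memory byte stream for the real response. The fake_response dictionary and its payload are invented for illustration, and read_json_body is a hypothetical helper.

```python
import io
import json
from typing import Any, Dict

# Request side: the body is a JSON string built with json.dumps.
request_body = json.dumps(
    {"inputText": "Explain black holes in astrophysics to 8th graders."}
)

# Response side: a stand-in whose "body" is a byte stream.
# The payload is made up for this sketch.
fake_payload = {"outputText": "A black hole is a region of very strong gravity."}
fake_response = {"body": io.BytesIO(json.dumps(fake_payload).encode("utf-8"))}


def read_json_body(response: Dict[str, Any]) -> Dict[str, Any]:
    """Read the streamed body once and decode it as JSON."""
    raw_bytes = response["body"].read()
    return json.loads(raw_bytes)


response_body = read_json_body(fake_response)
print(response_body)

# The stream is consumed after the first read, so keep the parsed result around.
leftover = fake_response["body"].read()
print(len(leftover))
```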

Converse

from gen_ai_hub.proxy.native.amazon.clients import Session

bedrock = Session().client(model_name="anthropic--claude-4-sonnet")
conversation = [
    {
        "role": "user",
        "content": [
            {
                "text": "Describe the purpose of a 'hello world' program in one line."
            }
        ],
    }
]
response = bedrock.converse(
    messages=conversation,
    inferenceConfig={"maxTokens": 512, "temperature": 0.5, "topP": 0.9},
)
print(response)
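Printing the whole response is useful for inspection, but most callers only need the generated text. The sketch below assumes the response follows the Amazon Bedrock Converse layout, with the reply under output, message, content. The sample_response dictionary is hard-coded with invented values, and extract_text is a hypothetical helper.

```python
from typing import Any, Dict

# Hand-written sample in the Converse response layout; all values are invented.
sample_response: Dict[str, Any] = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"text": "It confirms that a toolchain works by printing a greeting."}
            ],
        }
    },
    "stopReason": "end_turn",
    "usage": {"inputTokens": 21, "outputTokens": 14, "totalTokens": 35},
}


def extract_text(response: Dict[str, Any]) -> str:
    """Join the text of all text blocks in the assistant message."""
    blocks = response["output"]["message"]["content"]
    return "".join(block["text"] for block in blocks if "text" in block)


reply = extract_text(sample_response)
print(reply)
print(sample_response["usage"]["totalTokens"])
```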

Embeddings

OpenAI

Embeddings is equivalent to openai.Embeddings. The examples below show how to use Embeddings in the generative AI hub SDK.

from gen_ai_hub.proxy.native.openai import embeddings

response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002"
)
print(response.data)

from gen_ai_hub.proxy.native.openai import embeddings

# Example with the encoding format passed as a parameter
response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002",
    encoding_format='base64'
)
print(response.data)
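With encoding_format='base64', an embedding is delivered as a base64 string instead of a list of floats. The sketch below shows how such a string could be decoded, assuming the payload is a sequence of little-endian 32-bit floats. That layout is an assumption to verify against the service you call. Both encode_embedding and decode_embedding are hypothetical helpers, and the vector is made up.

```python
import base64
import struct
from typing import List


def encode_embedding(values: List[float]) -> str:
    """Pack floats as little-endian float32 and base64-encode the bytes."""
    packed = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(packed).decode("ascii")


def decode_embedding(encoded: str) -> List[float]:
    """Reverse encode_embedding: base64-decode, then unpack float32 values."""
    raw = base64.b64decode(encoded)
    count = len(raw) // 4  # 4 bytes per float32
    return list(struct.unpack(f"<{count}f", raw))


# Values chosen to be exactly representable as float32.
original = [0.5, -0.25, 0.125]
encoded = encode_embedding(original)
decoded = decode_embedding(encoded)
print(encoded)
print(decoded)
```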

Amazon

import json
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="amazon--nova-premier")
body = json.dumps(
    {
        "inputText": "Please recommend books with a theme similar to the movie 'Inception'.",
    }
)
response = bedrock.invoke_model(
    body=body,
)
response_body = json.loads(response.get("body").read())
print(response_body)

LangChain Integration

LangChain abstracts provider-specific details behind a common interface, so classes like Chat and Embeddings are interchangeable across providers.

The list of available models can be found under Supported Models.

Harmonized Model Initialization

The init_llm and init_embedding_model functions initialize LangChain model interfaces in a harmonized way in the generative AI hub SDK.

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from gen_ai_hub.proxy.langchain.init_models import init_llm

template = """Question: {question}
    Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=['question'])
question = 'What is a supernova?'

llm = init_llm('gpt-5-nano', max_tokens=300)
chain = prompt | llm | StrOutputParser()
response = chain.invoke({'question': question})
print(response)

from gen_ai_hub.proxy.langchain.init_models import init_embedding_model

text = 'Every decoding is another encoding.'

embeddings = init_embedding_model('text-embedding-ada-002')
response = embeddings.embed_query(text)
print(response)

LLM

from langchain.prompts import PromptTemplate

from gen_ai_hub.proxy.langchain.openai import OpenAI  # langchain class representing the AICore OpenAI models
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')
# non-chat model
model_name = "mistralai--mistral-small-instruct"

llm = OpenAI(proxy_model_name=model_name, proxy_client=proxy_client)  # can be used as usual with langchain

template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = prompt | llm

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"

print(llm_chain.invoke({'question': question}))
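Before the model is called, the prompt template fills the {question} placeholder with the value passed to invoke. As a simplified illustration, the sketch below uses Python's built-in str.format as a stand-in for PromptTemplate to show the text that results. The placeholders helper is hypothetical. Keep in mind that any indentation inside a triple-quoted template becomes part of the prompt.

```python
from string import Formatter
from typing import List

template = """Question: {question}

Answer: Let's think step by step."""


def placeholders(text: str) -> List[str]:
    """Return the placeholder names found in a format-style template."""
    return [field for _, field, _, _ in Formatter().parse(text) if field]


question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
prompt_text = template.format(question=question)

print(placeholders(template))
print(prompt_text)
```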

Chat model

from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)

from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')

chat_llm = ChatOpenAI(proxy_model_name='gpt-4o-mini', proxy_client=proxy_client)
template = 'You are a helpful assistant that translates english to pirate.'

system_message_prompt = SystemMessagePromptTemplate.from_template(template)

example_human = HumanMessagePromptTemplate.from_template('Hi')
example_ai = AIMessagePromptTemplate.from_template('Ahoy!')
human_template = '{text}'

human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
    [system_message_prompt, example_human, example_ai, human_message_prompt])

chain = chat_prompt | chat_llm

response = chain.invoke({'text': 'I love planking.'})
print(response.content)

Structured model outputs

from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
from langchain.schema import HumanMessage
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int


chat_model = ChatOpenAI(proxy_model_name="gpt-4o-mini", proxy_client=get_proxy_client())
chat_model = chat_model.with_structured_output(method="json_schema", schema=Person, strict=True)

message = HumanMessage(content="Tell me about a person named John who is 30")
print(chat_model.invoke([message]))

Embeddings

from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')

embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002', proxy_client=proxy_client)

response = embedding_model.embed_query('Every decoding is another encoding.')
print(response)

# Call without passing proxy_client

embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002')

response = embedding_model.embed_query('Every decoding is another encoding.')
print(response)
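The embed_query call returns a list of floats, and a common next step is to compare two such vectors by cosine similarity. The sketch below implements that comparison with the standard library. The short vectors are made-up stand-ins for real embeddings, and cosine_similarity is a helper written for this example rather than an SDK function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for embedding results.
query_vector = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]
orthogonal = [0.0, 1.0, 0.0]

similar_score = cosine_similarity(query_vector, same_direction)
unrelated_score = cosine_similarity(query_vector, orthogonal)
print(similar_score, unrelated_score)
```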

Using New Models Before Official SDK Support

You can use models via Gen AI Hub even before they are officially listed, provided their provider family (e.g., Google, Amazon Bedrock) is supported.

  1. Native SDK Clients:

    If you use the provider's native SDK (such as boto3 or google-genai) through the Gen AI Hub proxy, you can often pass the new model name/ID directly to the existing client methods.

  2. LangChain Integration (init_llm):

    The init_llm helper simplifies creating LangChain LLM objects configured for the proxy.

    • Alternative: You can always bypass init_llm and instantiate the LangChain classes (e.g., ChatGoogleGenerativeAI, ChatBedrock, ChatBedrockConverse) directly.

    • Bedrock Specifics:

      • Requires model_id in addition to model_name. Find IDs here. init_llm automatically selects the appropriate Bedrock API (older Invoke via ChatBedrock or newer Converse via ChatBedrockConverse) based on known models.

      • Crucially: For new Bedrock models or to force a specific API (Invoke/Converse), you must pass the corresponding initialization function (init_chat_model or init_chat_converse_model) to the init_func argument of init_llm.

from gen_ai_hub.proxy.langchain.init_models import init_llm
# Import specific init functions for overriding Bedrock behavior
from gen_ai_hub.proxy.langchain.amazon import (
    init_chat_model as amazon_init_invoke_model,
    init_chat_converse_model as amazon_init_converse_model
)
from gen_ai_hub.proxy.langchain.google_genai import init_chat_model as google_genai_init_chat_model

# --- Google Example ---
llm_google = init_llm(model_name='gemini-newer-version', init_func=google_genai_init_chat_model) # Often just needs model_name

# --- Bedrock Example (New Model requiring Converse API) ---
model_name_amazon = 'anthropic--claude-newer-version'
model_id_amazon = 'anthropic.claude-newer-version-v1:0' # Use actual ID

llm_amazon = init_llm(
    model_name_amazon,
    model_id=model_id_amazon,
    init_func=amazon_init_converse_model # Explicitly select Converse API
)

# --- Bedrock Example (Explicitly using older Invoke API) ---
# llm_amazon_invoke = init_llm(
#     'some-model-name',
#     model_id='some-model-id',
#     init_func=amazon_init_invoke_model # Explicitly select Invoke API
# )
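The init_func override can be pictured as a lookup with an escape hatch: known model names map to an initialization function, and an explicit init_func takes precedence. The sketch below illustrates only that dispatch idea. Every name in it (KNOWN_MODELS, init_llm_sketch, and the stub functions) is hypothetical, and it does not reflect the SDK's actual implementation.

```python
from typing import Callable, Dict, Optional


def init_invoke_stub(model_name: str, **kwargs) -> str:
    """Stand-in for an Invoke-style initializer."""
    return f"invoke:{model_name}"


def init_converse_stub(model_name: str, **kwargs) -> str:
    """Stand-in for a Converse-style initializer."""
    return f"converse:{model_name}"


# Hypothetical registry of models the helper already knows about.
KNOWN_MODELS: Dict[str, Callable[..., str]] = {
    "known-invoke-model": init_invoke_stub,
    "known-converse-model": init_converse_stub,
}


def init_llm_sketch(
    model_name: str,
    init_func: Optional[Callable[..., str]] = None,
    **kwargs,
) -> str:
    """Use init_func if given, otherwise fall back to the registry."""
    chosen = init_func or KNOWN_MODELS.get(model_name)
    if chosen is None:
        raise ValueError(f"Unknown model '{model_name}': pass init_func explicitly")
    return chosen(model_name, **kwargs)


print(init_llm_sketch("known-invoke-model"))
print(init_llm_sketch("brand-new-model", init_func=init_converse_stub))
```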