Our SDK offers a developer-friendly way to consume the foundation models available in the SAP generative AI hub. It facilitates seamless interaction with these models through integrations that act as drop-in replacements for the native client SDKs and for LangChain, so developers can keep using familiar interfaces and workflows. Usage is as follows.
Native Client Integrations
Completions
OpenAI
The completions module is equivalent to openai.Completions.
Below is an example of using completions in the generative AI hub SDK.
Any model that supports the legacy completions endpoint can be used.
from gen_ai_hub.proxy.native.openai import completions
response = completions.create(
model_name="gpt-4o-mini",
prompt="The Answer to the Ultimate Question of Life, the Universe, and Everything is",
max_tokens=20,
temperature=0
)
print(response)
The chat.completions module is equivalent to openai.ChatCompletions.
Below is an example of using chat completions in the generative AI hub SDK.
from gen_ai_hub.proxy.native.openai import chat
messages = [{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
{"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
{"role": "user", "content": "Do other Azure Cognitive Services support this too?"}]
kwargs = dict(model_name='gpt-4o-mini', messages=messages)
response = chat.completions.create(**kwargs)
print(response)
# Example: passing deployment_id instead of the model_name parameter
from gen_ai_hub.proxy.native.openai import chat
messages = [{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
{"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
{"role": "user", "content": "Do other Azure Cognitive Services support this too?"}]
response = chat.completions.create(deployment_id="dcef02e219ae4916", messages=messages)
print(response)
Structured model outputs
Structured outputs are a powerful feature that lets you define the JSON structure you expect from the model; the response is then parsed into a typed object instead of free-form text.
See https://platform.openai.com/docs/guides/structured-outputs/examples
from pydantic import BaseModel
from gen_ai_hub.proxy.native.openai import chat
class Person(BaseModel):
name: str
age: int
response = chat.completions.parse(
model="gpt-4o-mini",
messages=[{"role": "user", "content": "Tell me about John Doe, aged 30."}],
response_format=Person
)
person = response.choices[0].message.parsed # Fully typed Person
print(person)
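Conceptually, the parse step above constrains the model to emit JSON matching the Pydantic schema and then validates that JSON into a typed object. The validation half can be sketched locally without hub access; the JSON string below is a stand-in for a model response, not real output:

```python
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

# Stand-in for the raw JSON a schema-constrained model call would return
raw = '{"name": "John Doe", "age": 30}'
person = Person.model_validate_json(raw)  # typed Person, like response.choices[0].message.parsed
print(person)
```

If the JSON does not match the schema (missing field, wrong type), model_validate_json raises a ValidationError, which is the same failure mode you should handle around structured-output calls.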
Google GenAI
Generate Content
from gen_ai_hub.proxy.native.google_genai.clients import Client
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
client = Client(proxy_client=proxy_client)
response = client.models.generate_content(model="gemini-2.5-flash",
contents="How many paws are there for a dog?"
)
print(response)
# Using another model
response = client.models.generate_content(model="gemini-2.0-flash",
contents="Explain the theory of relativity in simple terms.")
print(response)
Generate Content streaming
from gen_ai_hub.proxy.native.google_genai.clients import Client
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
client = Client(
proxy_client=proxy_client,
)
response_stream = client.models.generate_content_stream(model="gemini-2.5-flash",
contents="Explain singularity in short terms.")
for chunk in response_stream:
print("Chunk: ", chunk.text)
Function Calling with Google GenAI
from google.genai import types
from gen_ai_hub.proxy.native.google_genai.clients import Client
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
def get_current_weather(location: str) -> str:
"""Returns the current weather.
Args:
location: The city and state, e.g. San Francisco, CA
"""
return 'sunny'
proxy_client = get_proxy_client('gen-ai-hub')
client = Client(
proxy_client=proxy_client,
)
response = client.models.generate_content(
model='gemini-2.5-flash',
contents='What is the weather like in Boston?',
config=types.GenerateContentConfig(tools=[get_current_weather]),
)
print(response)
Amazon
Invoke Model
import json
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="amazon--nova-premier")
body = json.dumps(
{
"inputText": "Explain black holes in astrophysics to 8th graders.",
"textGenerationConfig": {
"maxTokenCount": 3072,
"stopSequences": [],
"temperature": 0.7,
"topP": 0.9,
},
}
)
response = bedrock.invoke_model(body=body)
response_body = json.loads(response.get("body").read())
print(response_body)
Converse
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="anthropic--claude-4-sonnet")
conversation = [
{
"role": "user",
"content": [
{
"text": "Describe the purpose of a 'hello world' program in one line."
}
],
}
]
response = bedrock.converse(
messages=conversation,
inferenceConfig={"maxTokens": 512, "temperature": 0.5, "topP": 0.9},
)
print(response)
Embeddings
OpenAI
The embeddings module is equivalent to openai.Embeddings. The examples below show how to use embeddings in the generative AI hub SDK.
from gen_ai_hub.proxy.native.openai import embeddings
response = embeddings.create(
input="Every decoding is another encoding.",
model_name="text-embedding-ada-002"
)
print(response.data)
from gen_ai_hub.proxy.native.openai import embeddings
# Example: passing the encoding_format parameter
response = embeddings.create(
input="Every decoding is another encoding.",
model_name="text-embedding-ada-002",
encoding_format='base64'
)
print(response.data)
Amazon
import json
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="amazon--titan-embed-text")
body = json.dumps(
{
"inputText": "Please recommend books with a theme similar to the movie 'Inception'.",
}
)
response = bedrock.invoke_model(
body=body,
)
response_body = json.loads(response.get("body").read())
print(response_body)
LangChain Integration
LangChain abstracts provider-specific details behind a common interface; classes such as ChatOpenAI wrap the proxied models so they can be used with standard LangChain components.
The list of available models can be found here: Supported Models
Harmonized Model Initialization
The init_llm and init_embedding_model functions allow easy, harmonized initialization of LangChain model interfaces in the generative AI hub SDK.
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from gen_ai_hub.proxy.langchain.init_models import init_llm
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=['question'])
question = 'What is a supernova?'
llm = init_llm('gpt-5-nano', max_tokens=300)
chain = prompt | llm | StrOutputParser()
response = chain.invoke({'question': question})
print(response)
from gen_ai_hub.proxy.langchain.init_models import init_embedding_model
text = 'Every decoding is another encoding.'
embeddings = init_embedding_model('text-embedding-ada-002')
response = embeddings.embed_query(text)
print(response)
LLM
from langchain.prompts import PromptTemplate
from gen_ai_hub.proxy.langchain.openai import OpenAI # langchain class representing the AICore OpenAI models
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
# non-chat model
model_name = "mistralai--mistral-small-instruct"
llm = OpenAI(proxy_model_name=model_name, proxy_client=proxy_client) # can be used as usual with langchain
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = prompt | llm
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
print(llm_chain.invoke({'question': question}))
Chat model
from langchain.prompts.chat import (
AIMessagePromptTemplate,
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
chat_llm = ChatOpenAI(proxy_model_name='gpt-4o-mini', proxy_client=proxy_client)
template = 'You are a helpful assistant that translates english to pirate.'
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
example_human = HumanMessagePromptTemplate.from_template('Hi')
example_ai = AIMessagePromptTemplate.from_template('Ahoy!')
human_template = '{text}'
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
[system_message_prompt, example_human, example_ai, human_message_prompt])
chain = chat_prompt | chat_llm
response = chain.invoke({'text': 'I love planking.'})
print(response.content)
Structured model outputs
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
from langchain.schema import HumanMessage
from pydantic import BaseModel
class Person(BaseModel):
name: str
age: int
chat_model = ChatOpenAI(proxy_model_name="gpt-4o-mini", proxy_client=get_proxy_client())
chat_model = chat_model.with_structured_output(method="json_schema", schema=Person, strict=True)
message = HumanMessage(content="Tell me about a person named John who is 30")
print(chat_model.invoke([message]))
Embeddings
from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002', proxy_client=proxy_client)
response = embedding_model.embed_query('Every decoding is another encoding.')
# Call without passing proxy_client
embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002')
response = embedding_model.embed_query('Every decoding is another encoding.')
print(response)
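Embedding vectors such as the one returned by embed_query are usually compared with cosine similarity. A minimal pure-Python sketch, where the short hand-written vectors stand in for real embeddings:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Stand-ins for embeddings of two similar sentences
v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.25, 0.55]
print(round(cosine_similarity(v1, v2), 4))
```

Values close to 1.0 indicate semantically similar texts; identical vectors score exactly 1.0. In practice you would pass the vectors from embed_query (or embed_documents) instead of the hand-written lists.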
Using New Models Before Official SDK Support
You can use models via Gen AI Hub even before they are officially listed, provided their provider family (e.g., Google, Amazon Bedrock) is supported.
Native SDK Clients: If you use the provider's native SDK (like boto3 or google-genai) through the Gen AI Hub proxy, you can often use the new model name/ID directly with the existing client methods.
LangChain Integration (init_llm): The init_llm helper simplifies creating LangChain LLM objects configured for the proxy. Alternative: you can always bypass init_llm and instantiate the LangChain classes (e.g., ChatGoogleGenerativeAI, ChatBedrock, ChatBedrockConverse) directly.
Bedrock Specifics: Bedrock requires model_id in addition to model_name. Find IDs here. init_llm automatically selects the appropriate Bedrock API (the older Invoke API via ChatBedrock or the newer Converse API via ChatBedrockConverse) based on known models. Crucially: for new Bedrock models, or to force a specific API (Invoke/Converse), you must pass the corresponding initialization function (init_chat_model or init_chat_converse_model) to the init_func argument of init_llm.
from gen_ai_hub.proxy.langchain.init_models import init_llm
# Import specific init functions for overriding Bedrock behavior
from gen_ai_hub.proxy.langchain.amazon import (
init_chat_model as amazon_init_invoke_model,
init_chat_converse_model as amazon_init_converse_model
)
from gen_ai_hub.proxy.langchain.google_genai import init_chat_model as google_genai_init_chat_model
# --- Google Example ---
llm_google = init_llm(model_name='gemini-newer-version', init_func=google_genai_init_chat_model) # Often just needs model_name
# --- Bedrock Example (New Model requiring Converse API) ---
model_name_amazon = 'anthropic--claude-newer-version'
model_id_amazon = 'anthropic.claude-newer-version-v1:0' # Use actual ID
llm_amazon = init_llm(
model_name_amazon,
model_id=model_id_amazon,
init_func=amazon_init_converse_model # Explicitly select Converse API
)
# --- Bedrock Example (Explicitly using older Invoke API) ---
# llm_amazon_invoke = init_llm(
# 'some-model-name',
# model_id='some-model-id',
# init_func=amazon_init_invoke_model # Explicitly select Invoke API
# )