Our SDK offers a developer-friendly way to consume the foundation models available in the SAP generative AI hub. It facilitates seamless interaction with these models by providing integrations that act as drop-in replacements for the native client SDKs and for LangChain, so developers can keep using familiar interfaces and workflows. Usage is shown below.

Native Client Integrations

There are currently integrations with three types of native client SDKs (OpenAI, Google, Amazon), and the sections below contain at least one example per SDK. Note: Some providers share the same interface and can be consumed using the same SDK; for example, Anthropic Claude and Amazon Titan models can both be used with the Amazon SDK.

Chat

Applicable SDK Portion                                  | Provider  | Model Name
--------------------------------------------------------|-----------|-----------------------------------------------------------
gen_ai_hub.proxy.native.amazon.clients.Session          | Amazon    | amazon--nova-lite, amazon--nova-micro, amazon--nova-pro, amazon--titan-text-express, amazon--titan-text-lite
                                                        | Anthropic | anthropic--claude-3-haiku, anthropic--claude-3-opus, anthropic--claude-3-sonnet, anthropic--claude-3.5-sonnet
gen_ai_hub.proxy.native.google.clients.GenerativeModel  | Google    | gemini-1.0-pro, gemini-1.5-flash, gemini-1.5-pro
gen_ai_hub.proxy.native.openai                          | Meta      | meta--llama3-70b-instruct, meta--llama3.1-70b-instruct
                                                        | MistralAI | mistralai--mixtral-8x7b-instruct-v01
                                                        | OpenAI    | gpt-4, gpt-4-32k, gpt-4-turbo, gpt-4o, gpt-4o-mini, o1, o3-mini

Embedding

Applicable SDK Portion                          | Provider | Model Name
------------------------------------------------|----------|-----------------------------------------------------------
gen_ai_hub.proxy.native.amazon.clients.Session  | Amazon   | amazon--titan-embed-text
gen_ai_hub.proxy.native.openai                  | OpenAI   | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002

Completions

OpenAI

Completions is equivalent to openai.Completions. Below is an example of using Completions in the generative AI hub SDK. All models that support the legacy completions endpoint can be used.

from gen_ai_hub.proxy.native.openai import completions

response = completions.create(
    model_name="meta--llama3.1-70b-instruct",
    prompt="The Answer to the Ultimate Question of Life, the Universe, and Everything is",
    max_tokens=20,
    temperature=0
)
print(response)
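
Since the proxy is meant to mirror the OpenAI SDK's response objects, the generated text itself can be read from the first choice. A minimal sketch, assuming the standard OpenAI completion response structure:

# the completion text of the first (and here only) choice
print(response.choices[0].text)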

ChatCompletions is equivalent to openai.ChatCompletions. Below is an example of using ChatCompletions in the generative AI hub SDK.

from gen_ai_hub.proxy.native.openai import chat

messages = [{"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
            {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
            {"role": "user", "content": "Do other Azure Cognitive Services support this too?"}]

kwargs = dict(model_name='gpt-4o-mini', messages=messages)
response = chat.completions.create(**kwargs)

print(response)

# example where deployment_id is passed instead of the model_name parameter
from gen_ai_hub.proxy.native.openai import chat

messages = [{"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
            {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
            {"role": "user", "content": "Do other Azure Cognitive Services support this too?"}]

response = chat.completions.create(deployment_id="dcef02e219ae4916", messages=messages)
print(response)
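
As with the native OpenAI SDK, the assistant's reply can be extracted from the first choice of the response. A minimal sketch, assuming the standard OpenAI chat completion response structure:

# the assistant message of the first choice
print(response.choices[0].message.content)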

Google Vertex AI

Generate Content

from gen_ai_hub.proxy.native.google_vertexai.clients import GenerativeModel
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')
kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(proxy_client=proxy_client, **kwargs)
content = [{
    "role": "user",
    "parts": [{
        "text": "Write a short story about a magic kingdom."
    }]
}]
model_response = model.generate_content(content)
print(model_response)
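
The full response object is printed above; the generated text alone is usually available through the text accessor. A minimal sketch, assuming the response mirrors the Vertex AI GenerationResponse object:

# convenience accessor for the text of the first candidate
print(model_response.text)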

Function calling with a Gemini model using start_chat

from gen_ai_hub.proxy.native.google_vertexai.clients import GenerativeModel
# The Gemini API recommends using function calling via the chat interface, as this captures the back-and-forth interaction between user and model.

# example 1 of function calling using start_chat
def multiply(a:float, b:float):
    """returns a * b."""
    return a*b

kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(**kwargs)
chat = model.start_chat(enable_automatic_function_calling=True)
prompt = 'I have 6 cats, each owns 2 mittens, how many mittens is that in total?'
response = chat.send_message(prompt, tools=[multiply])

print(response)
for content in chat.history:
    part = content.parts[0]
    print(content.role, "->", type(part).to_dict(part))
    print('-'*80)

# example 2 of function calling using start_chat

def start_music(energetic: bool, loud: bool, bpm: int) -> str:
    """Play some music matching the specified parameters.

    Args:
      energetic: Whether the music is energetic or not.
      loud: Whether the music is loud or not.
      bpm: The beats per minute of the music.

    Returns: The name of the song being played.
    """
    print(f"Starting music! {energetic=} {loud=}, {bpm=}")
    return "Never gonna give you up."


def dim_lights(brightness: float) -> bool:
    """Dim the lights.

    Args:
      brightness: The brightness of the lights, 0.0 is off, 1.0 is full.
    """
    print(f"Lights are now set to {brightness:.0%}")
    return True

tools = [start_music, dim_lights]
kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(**kwargs)
chat = model.start_chat()

prompt = "Turn this place into a party!"
response = chat.send_message(prompt, tools=tools)
print(response)
prompt = "Music played should be energetic"
response = chat.send_message(prompt, tools=tools)
print(response)
prompt = "Light should dim"
response = chat.send_message(prompt, tools=tools)
print(response)
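
Because automatic function calling is not enabled in example 2, the model's reply contains the requested tool call rather than a final answer. A minimal sketch for inspecting it, assuming the response follows the Gemini candidates/parts structure used above:

# inspect the function call the model wants to make
part = response.candidates[0].content.parts[0]
if part.function_call:
    print(part.function_call.name, dict(part.function_call.args))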

Amazon

Invoke Model

import json
from gen_ai_hub.proxy.native.amazon.clients import Session

bedrock = Session().client(model_name="amazon--titan-text-express")
body = json.dumps(
    {
        "inputText": "Explain black holes in astrophysics to 8th graders.",
        "textGenerationConfig": {
            "maxTokenCount": 3072,
            "stopSequences": [],
            "temperature": 0.7,
            "topP": 0.9,
        },
    }
)
response = bedrock.invoke_model(body=body)
response_body = json.loads(response.get("body").read())
print(response_body)
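
The generated text can then be read from the parsed response body. A minimal sketch, assuming the usual Amazon Titan text response layout with a results list:

# Titan text models return a list of results containing the generated text
print(response_body["results"][0]["outputText"])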

Converse

import json
from gen_ai_hub.proxy.native.amazon.clients import Session

bedrock = Session().client(model_name="anthropic--claude-3-haiku")
conversation = [
    {
        "role": "user",
        "content": [
            {
                "text": "Describe the purpose of a 'hello world' program in one line."
            }
        ],
    }
]
response = bedrock.converse(
    messages=conversation,
    inferenceConfig={"maxTokens": 512, "temperature": 0.5, "topP": 0.9},
)
print(response)
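
The assistant's reply sits inside the output message of the Converse response. A minimal sketch, assuming the standard Bedrock Converse response layout:

# first content block of the returned assistant message
print(response["output"]["message"]["content"][0]["text"])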

Embeddings

OpenAI

Embeddings is equivalent to openai.Embeddings. The examples below show how to use Embeddings in the generative AI hub SDK.

from gen_ai_hub.proxy.native.openai import embeddings

response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002"
)
print(response.data)

# example with encoding_format passed as a parameter
from gen_ai_hub.proxy.native.openai import embeddings
response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002",
    encoding_format='base64'
)
print(response.data)
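
The embedding for each input is available on the corresponding data entry. A minimal sketch, assuming the standard OpenAI embeddings response structure; with encoding_format='base64' (as in the second example) the value is a base64-encoded string rather than a list of floats:

# embedding of the first input
print(response.data[0].embedding)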

Amazon

import json
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="amazon--titan-embed-text")
body = json.dumps(
    {
        "inputText": "Please recommend books with a theme similar to the movie 'Inception'.",
    }
)
response = bedrock.invoke_model(
    body=body,
)
response_body = json.loads(response.get("body").read())
print(response_body)
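
The embedding vector is returned directly in the response body. A minimal sketch, assuming the usual Amazon Titan embeddings response layout:

# Titan embedding models return the vector under the "embedding" key
embedding = response_body["embedding"]
print(len(embedding), embedding[:5])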

LangChain Integration

LangChain abstracts provider-specific details behind a common interface, so the chat and embeddings classes are interchangeable across providers. The corresponding classes are listed below:

Chat Classes

LangChain Class                                                 | Provider  | Model Name
----------------------------------------------------------------|-----------|-----------------------------------------------------------
gen_ai_hub.proxy.langchain.amazon.ChatBedrock                   | Amazon    | amazon--titan-text-express, amazon--titan-text-lite
                                                                | Anthropic | anthropic--claude-3-haiku, anthropic--claude-3-opus, anthropic--claude-3-sonnet
gen_ai_hub.proxy.langchain.amazon.ChatBedrockConverse           | Amazon    | amazon--nova-lite, amazon--nova-micro, amazon--nova-pro, amazon--titan-text-express, amazon--titan-text-lite
                                                                | Anthropic | anthropic--claude-3-haiku, anthropic--claude-3-opus, anthropic--claude-3-sonnet
gen_ai_hub.proxy.langchain.google_gemini.ChatGoogleGenerativeAI | Google    | gemini-1.0-pro, gemini-1.5-flash, gemini-1.5-pro
gen_ai_hub.proxy.langchain.openai.ChatOpenAI                    | Meta      | meta--llama3-70b-instruct, meta--llama3.1-70b-instruct
                                                                | MistralAI | mistralai--mixtral-8x7b-instruct-v01
                                                                | OpenAI    | gpt-4, gpt-4-32k, gpt-4-turbo, gpt-4o, gpt-4o-mini, o1, o3-mini

Note: The ChatBedrockConverse LangChain class does not support system prompts for Amazon Titan models.
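
For Titan models, instructions that would otherwise go into a system prompt can be folded into the user message instead. A minimal sketch using the harmonized init_llm helper described below; the model choice is illustrative, and init_llm is assumed to resolve it to one of the Bedrock chat classes above:

from gen_ai_hub.proxy.langchain.init_models import init_llm
from langchain_core.messages import HumanMessage

# Titan models do not accept a SystemMessage via ChatBedrockConverse,
# so the instruction is placed in the user turn instead.
llm = init_llm('amazon--titan-text-lite', max_tokens=100)
response = llm.invoke([HumanMessage(content='You are a concise assistant. Explain a vector database in one sentence.')])
print(response.content)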

Embeddings Classes

LangChain Class                                     | Provider | Model Name
----------------------------------------------------|----------|-----------------------------------------------------------
gen_ai_hub.proxy.langchain.amazon.BedrockEmbeddings | Amazon   | amazon--titan-embed-text
gen_ai_hub.proxy.langchain.openai.OpenAIEmbeddings  | OpenAI   | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002

Harmonized Model Initialization

The init_llm and init_embedding_model functions allow easy, harmonized initialization of LangChain model interfaces in the generative AI hub SDK.

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from gen_ai_hub.proxy.langchain.init_models import init_llm

template = """Question: {question}
    Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=['question'])
question = 'What is a supernova?'

llm = init_llm('meta--llama3.1-70b-instruct', max_tokens=300)
chain = prompt | llm | StrOutputParser()
response = chain.invoke({'question': question})
print(response)

from gen_ai_hub.proxy.langchain.init_models import init_embedding_model

text = 'Every decoding is another encoding.'

embeddings = init_embedding_model('text-embedding-ada-002')
response = embeddings.embed_query(text)
print(response)

LLM

from langchain.prompts import PromptTemplate

from gen_ai_hub.proxy.langchain.openai import OpenAI  # LangChain class representing the AI Core OpenAI models
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')
# non-chat model
model_name = "meta--llama3.1-70b-instruct"

llm = OpenAI(proxy_model_name=model_name, proxy_client=proxy_client)  # can be used as usual with langchain

template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = prompt | llm

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"

print(llm_chain.invoke({'question': question}))

Chat model

from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)

from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')

chat_llm = ChatOpenAI(proxy_model_name='gpt-4o-mini', proxy_client=proxy_client)
template = 'You are a helpful assistant that translates English to pirate.'

system_message_prompt = SystemMessagePromptTemplate.from_template(template)

example_human = HumanMessagePromptTemplate.from_template('Hi')
example_ai = AIMessagePromptTemplate.from_template('Ahoy!')
human_template = '{text}'

human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
    [system_message_prompt, example_human, example_ai, human_message_prompt])

chain = chat_prompt | chat_llm

response = chain.invoke({'text': 'I love planking.'})
print(response.content)

Embeddings

from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')

embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002', proxy_client=proxy_client)

response = embedding_model.embed_query('Every decoding is another encoding.')

# call without passing proxy_client

embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002')

response = embedding_model.embed_query('Every decoding is another encoding.')
print(response)
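
Multiple texts can be embedded in one call through LangChain's standard embed_documents method, e.g.:

texts = ['Every decoding is another encoding.', 'Every encoding is another decoding.']
vectors = embedding_model.embed_documents(texts)
print(len(vectors), len(vectors[0]))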

Using New Models in Generative AI Hub Before They Are Added to the SDK

This works only if the model belongs to a model family whose native API is already supported in the SDK (e.g., openai, google_vertexai). No additional steps are needed to query such models with the native clients; refer to the examples for supported models. To use them with the LangChain integration, refer to the following example.

from gen_ai_hub.proxy.langchain.amazon import init_chat_model as amazon_init_chat_model
from gen_ai_hub.proxy.langchain.google_vertexai import init_chat_model as google_vertexai_init_chat_model
from gen_ai_hub.proxy.langchain.init_models import init_llm

# usage of a new model that has not been added to the SDK yet
model_name = 'gemini-newer-version'
init_func = google_vertexai_init_chat_model
llm = init_llm(model_name, init_func=init_func)

# usage of a new Amazon model that has not been added to the SDK yet

# In the case of Amazon models, you additionally need to provide the model_id.
# The full list of Amazon model IDs is available here: https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
model_name = 'anthropic--claude-newer-version'
model_id = 'anthropic.claude-newer-version-202401220-v1:0'
init_func = amazon_init_chat_model
llm = init_llm(model_name, model_id=model_id, init_func=init_func)
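
The returned object behaves like any other LangChain chat model, so it can be invoked directly or composed into chains, for example:

# hypothetical invocation of the newly registered model
print(llm.invoke('Hello!').content)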