Our SDK offers a developer-friendly way to consume the foundation models available in the SAP generative AI hub. It facilitates seamless interaction with these models by providing integrations that act as drop-in replacements for the native client SDKs and for LangChain, so developers can keep using familiar interfaces and workflows. Usage is shown below.

Native Client Integrations

There are currently integrations with three types of native client SDKs (OpenAI, Google, Amazon), and the sections below contain at least one example per SDK. Note: Some providers share the same interface and can be consumed using the same SDK; for example, Anthropic Claude and Amazon Titan models can both be used with the Amazon SDK.

Chat

Applicable SDK Portion                                  | Provider  | Model Name
--------------------------------------------------------|-----------|-----------------------------------------------------------
gen_ai_hub.proxy.native.amazon.clients.Session          | Amazon    | amazon--nova-lite, amazon--nova-micro, amazon--nova-pro, amazon--titan-text-express, amazon--titan-text-lite
                                                        | Anthropic | anthropic--claude-3-haiku, anthropic--claude-3-opus, anthropic--claude-3-sonnet, anthropic--claude-3.5-sonnet
gen_ai_hub.proxy.native.google.clients.GenerativeModel  | Google    | gemini-1.0-pro, gemini-1.5-flash, gemini-1.5-pro
gen_ai_hub.proxy.native.openai                          | Meta      | meta--llama3-70b-instruct, meta--llama3.1-70b-instruct
                                                        | MistralAI | mistralai--mixtral-8x7b-instruct-v01
                                                        | OpenAI    | gpt-4, gpt-4-32k, gpt-4-turbo, gpt-4o, gpt-4o-mini, o1, o3-mini

Embedding

Applicable SDK Portion                          | Provider | Model Name
------------------------------------------------|----------|-----------------------------------------------------------
gen_ai_hub.proxy.native.amazon.clients.Session  | Amazon   | amazon--titan-embed-text
gen_ai_hub.proxy.native.openai                  | OpenAI   | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002

Completions

OpenAI

Completions is equivalent to openai.Completions. Below is an example of using Completions in the generative AI hub SDK. All models that support the legacy completions endpoint can be used.

from gen_ai_hub.proxy.native.openai import completions

response = completions.create(
    model_name="meta--llama3.1-70b-instruct",
    prompt="The Answer to the Ultimate Question of Life, the Universe, and Everything is",
    max_tokens=20,
    temperature=0
)
print(response)
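
Since the proxy is meant to mirror the OpenAI SDK's response objects, the generated text itself can be read from the first choice. A minimal sketch, assuming the standard OpenAI completion response structure:

# the completion text of the first (and here only) choice
print(response.choices[0].text)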

ChatCompletions is equivalent to openai.ChatCompletions. Below is an example of using ChatCompletions in the generative AI hub SDK.

from gen_ai_hub.proxy.native.openai import chat

messages = [{"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
            {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
            {"role": "user", "content": "Do other Azure Cognitive Services support this too?"}]

kwargs = dict(model_name='gpt-4o-mini', messages=messages)
response = chat.completions.create(**kwargs)

print(response)

# example where deployment_id is passed instead of the model_name parameter
from gen_ai_hub.proxy.native.openai import chat

messages = [{"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
            {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
            {"role": "user", "content": "Do other Azure Cognitive Services support this too?"}]

response = chat.completions.create(deployment_id="dcef02e219ae4916", messages=messages)
print(response)
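
As with the native OpenAI SDK, the assistant's reply can be extracted from the first choice of the response. A minimal sketch, assuming the standard OpenAI chat completion response structure:

# the assistant message of the first choice
print(response.choices[0].message.content)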

Google Vertex AI

Generate Content

from gen_ai_hub.proxy.native.google_vertexai.clients import GenerativeModel
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')
kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(proxy_client=proxy_client, **kwargs)
content = [{
    "role": "user",
    "parts": [{
        "text": "Write a short story about a magic kingdom."
    }]
}]
model_response = model.generate_content(content)
print(model_response)
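
The full response object is printed above; the generated text alone is usually available through the text accessor. A minimal sketch, assuming the response mirrors the Vertex AI GenerationResponse object:

# convenience accessor for the text of the first candidate
print(model_response.text)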

Function calling with a Gemini model using start_chat

from gen_ai_hub.proxy.native.google_vertexai.clients import GenerativeModel
# The Gemini API recommends using function calling via the chat interface, as this captures the back-and-forth interaction between user and model.

# example 1 of function calling using start_chat
def multiply(a:float, b:float):
    """returns a * b."""
    return a*b

kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(**kwargs)
chat = model.start_chat(enable_automatic_function_calling=True)
prompt = 'I have 6 cats, each owns 2 mittens, how many mittens is that in total?'
response = chat.send_message(prompt, tools=[multiply])

print(response)
for content in chat.history:
    part = content.parts[0]
    print(content.role, "->", type(part).to_dict(part))
    print('-'*80)

# example 2 of function calling using start_chat

def start_music(energetic: bool, loud: bool, bpm: int) -> str:
    """Play some music matching the specified parameters.

    Args:
      energetic: Whether the music is energetic or not.
      loud: Whether the music is loud or not.
      bpm: The beats per minute of the music.

    Returns: The name of the song being played.
    """
    print(f"Starting music! {energetic=} {loud=}, {bpm=}")
    return "Never gonna give you up."


def dim_lights(brightness: float) -> bool:
    """Dim the lights.

    Args:
      brightness: The brightness of the lights, 0.0 is off, 1.0 is full.
    """
    print(f"Lights are now set to {brightness:.0%}")
    return True

tools = [start_music, dim_lights]
kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(**kwargs)
chat = model.start_chat()

prompt = "Turn this place into a party!"
response = chat.send_message(prompt, tools=tools)
print(response)
prompt = "Music played should be energetic"
response = chat.send_message(prompt, tools=tools)
print(response)
prompt = "Light should dim"
response = chat.send_message(prompt, tools=tools)
print(response)
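
Because automatic function calling is not enabled in example 2, the model's reply contains the requested tool call rather than a final answer. A minimal sketch for inspecting it, assuming the response follows the Gemini candidates/parts structure used above:

# inspect the function call the model wants to make
part = response.candidates[0].content.parts[0]
if part.function_call:
    print(part.function_call.name, dict(part.function_call.args))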

Amazon

Invoke Model

import json
from gen_ai_hub.proxy.native.amazon.clients import Session

bedrock = Session().client(model_name="amazon--titan-text-express")
body = json.dumps(
    {
        "inputText": "Explain black holes in astrophysics to 8th graders.",
        "textGenerationConfig": {
            "maxTokenCount": 3072,
            "stopSequences": [],
            "temperature": 0.7,
            "topP": 0.9,
        },
    }
)
response = bedrock.invoke_model(body=body)
response_body = json.loads(response.get("body").read())
print(response_body)
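
The generated text can then be read from the parsed response body. A minimal sketch, assuming the usual Amazon Titan text response layout with a results list:

# Titan text models return a list of results containing the generated text
print(response_body["results"][0]["outputText"])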

Converse

import json
from gen_ai_hub.proxy.native.amazon.clients import Session

bedrock = Session().client(model_name="anthropic--claude-3-haiku")
conversation = [
    {
        "role": "user",
        "content": [
            {
                "text": "Describe the purpose of a 'hello world' program in one line."
            }
        ],
    }
]
response = bedrock.converse(
    messages=conversation,
    inferenceConfig={"maxTokens": 512, "temperature": 0.5, "topP": 0.9},
)
print(response)
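
The assistant's reply sits inside the output message of the Converse response. A minimal sketch, assuming the standard Bedrock Converse response layout:

# first content block of the returned assistant message
print(response["output"]["message"]["content"][0]["text"])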

Embeddings

OpenAI

Embeddings is equivalent to openai.Embeddings. The examples below show how to use Embeddings in the generative AI hub SDK.

from gen_ai_hub.proxy.native.openai import embeddings

response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002"
)
print(response.data)

# example with encoding_format passed as a parameter
from gen_ai_hub.proxy.native.openai import embeddings
response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002",
    encoding_format='base64'
)
print(response.data)
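
The embedding for each input is available on the corresponding data entry. A minimal sketch, assuming the standard OpenAI embeddings response structure; with encoding_format='base64' (as in the second example) the value is a base64-encoded string rather than a list of floats:

# embedding of the first input
print(response.data[0].embedding)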

Amazon

import json
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="amazon--titan-embed-text")
body = json.dumps(
    {
        "inputText": "Please recommend books with a theme similar to the movie 'Inception'.",
    }
)
response = bedrock.invoke_model(
    body=body,
)
response_body = json.loads(response.get("body").read())
print(response_body)
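
The embedding vector is returned directly in the response body. A minimal sketch, assuming the usual Amazon Titan embeddings response layout:

# Titan embedding models return the vector under the "embedding" key
embedding = response_body["embedding"]
print(len(embedding), embedding[:5])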

LangChain Integration

LangChain abstracts provider-specific details behind a common interface, so the chat and embeddings classes are interchangeable across providers. The corresponding classes are listed below:

Chat Classes

LangChain Class                                                 | Provider  | Model Name
----------------------------------------------------------------|-----------|-----------------------------------------------------------
gen_ai_hub.proxy.langchain.amazon.ChatBedrock                   | Amazon    | amazon--titan-text-express, amazon--titan-text-lite
                                                                | Anthropic | anthropic--claude-3-haiku, anthropic--claude-3-opus, anthropic--claude-3-sonnet
gen_ai_hub.proxy.langchain.amazon.ChatBedrockConverse           | Amazon    | amazon--nova-lite, amazon--nova-micro, amazon--nova-pro, amazon--titan-text-express, amazon--titan-text-lite
                                                                | Anthropic | anthropic--claude-3-haiku, anthropic--claude-3-opus, anthropic--claude-3-sonnet
gen_ai_hub.proxy.langchain.google_gemini.ChatGoogleGenerativeAI | Google    | gemini-1.0-pro, gemini-1.5-flash, gemini-1.5-pro
gen_ai_hub.proxy.langchain.openai.ChatOpenAI                    | Meta      | meta--llama3-70b-instruct, meta--llama3.1-70b-instruct
                                                                | MistralAI | mistralai--mixtral-8x7b-instruct-v01
                                                                | OpenAI    | gpt-4, gpt-4-32k, gpt-4-turbo, gpt-4o, gpt-4o-mini, o1, o3-mini

Note: The ChatBedrockConverse LangChain class does not support system prompts for Amazon Titan models.
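
For Titan models, instructions that would otherwise go into a system prompt can be folded into the user message instead. A minimal sketch using the harmonized init_llm helper described below; the model choice is illustrative, and init_llm is assumed to resolve it to one of the Bedrock chat classes above:

from gen_ai_hub.proxy.langchain.init_models import init_llm
from langchain_core.messages import HumanMessage

# Titan models do not accept a SystemMessage via ChatBedrockConverse,
# so the instruction is placed in the user turn instead.
llm = init_llm('amazon--titan-text-lite', max_tokens=100)
response = llm.invoke([HumanMessage(content='You are a concise assistant. Explain a vector database in one sentence.')])
print(response.content)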

Embeddings Classes

LangChain Class                                     | Provider | Model Name
----------------------------------------------------|----------|-----------------------------------------------------------
gen_ai_hub.proxy.langchain.amazon.BedrockEmbeddings | Amazon   | amazon--titan-embed-text
gen_ai_hub.proxy.langchain.openai.OpenAIEmbeddings  | OpenAI   | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002

Harmonized Model Initialization

The init_llm and init_embedding_model functions allow easy, harmonized initialization of LangChain model interfaces in the generative AI hub SDK.

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from gen_ai_hub.proxy.langchain.init_models import init_llm

template = """Question: {question}
    Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=['question'])
question = 'What is a supernova?'

llm = init_llm('meta--llama3.1-70b-instruct', max_tokens=300)
chain = prompt | llm | StrOutputParser()
response = chain.invoke({'question': question})
print(response)

from gen_ai_hub.proxy.langchain.init_models import init_embedding_model

text = 'Every decoding is another encoding.'

embeddings = init_embedding_model('text-embedding-ada-002')
response = embeddings.embed_query(text)
print(response)

LLM

from langchain.prompts import PromptTemplate

from gen_ai_hub.proxy.langchain.openai import OpenAI  # LangChain class representing the AI Core OpenAI models
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')
# non-chat model
model_name = "meta--llama3.1-70b-instruct"

llm = OpenAI(proxy_model_name=model_name, proxy_client=proxy_client)  # can be used as usual with langchain

template = """Question: {question}

Answer: Let's think step by step."""

prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = prompt | llm

question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"

print(llm_chain.invoke({'question': question}))

Chat model

from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)

from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')

chat_llm = ChatOpenAI(proxy_model_name='gpt-4o-mini', proxy_client=proxy_client)
template = 'You are a helpful assistant that translates English to pirate.'

system_message_prompt = SystemMessagePromptTemplate.from_template(template)

example_human = HumanMessagePromptTemplate.from_template('Hi')
example_ai = AIMessagePromptTemplate.from_template('Ahoy!')
human_template = '{text}'

human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
    [system_message_prompt, example_human, example_ai, human_message_prompt])

chain = chat_prompt | chat_llm

response = chain.invoke({'text': 'I love planking.'})
print(response.content)

Embeddings

from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')

embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002', proxy_client=proxy_client)

response = embedding_model.embed_query('Every decoding is another encoding.')

# call without passing proxy_client

embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002')

response = embedding_model.embed_query('Every decoding is another encoding.')
print(response)
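
Multiple texts can be embedded in one call through LangChain's standard embed_documents method, e.g.:

texts = ['Every decoding is another encoding.', 'Every encoding is another decoding.']
vectors = embedding_model.embed_documents(texts)
print(len(vectors), len(vectors[0]))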

Using New Models in Generative AI Hub Before They Are Added to the SDK

This works only if the model belongs to a model family whose native API is already supported in the SDK (e.g., openai, google_vertexai). No additional steps are needed to query such models with the native clients; refer to the examples for supported models. To use them with the LangChain integration, refer to the following example.

from gen_ai_hub.proxy.langchain.amazon import init_chat_model as amazon_init_chat_model
from gen_ai_hub.proxy.langchain.google_vertexai import init_chat_model as google_vertexai_init_chat_model
from gen_ai_hub.proxy.langchain.init_models import init_llm

# usage of a new model that has not been added to the SDK yet
model_name = 'gemini-newer-version'
init_func = google_vertexai_init_chat_model
llm = init_llm(model_name, init_func=init_func)

# usage of a new Amazon model that has not been added to the SDK yet

# In the case of Amazon models, you additionally need to provide the model_id.
# The full list of Amazon model IDs is available here: https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
model_name = 'anthropic--claude-newer-version'
model_id = 'anthropic.claude-newer-version-202401220-v1:0'
init_func = amazon_init_chat_model
llm = init_llm(model_name, model_id=model_id, init_func=init_func)
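
The returned object behaves like any other LangChain chat model, so it can be invoked directly or composed into chains, for example:

# hypothetical invocation of the newly registered model
print(llm.invoke('Hello!').content)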