Our SDK offers a developer-friendly way to consume foundational models available in the SAP generative AI hub. We strive to facilitate seamless interactions with these models by providing integrations that act as drop-in replacements for the native client SDKs and LangChain. This allows developers to use familiar interfaces and workflows. Usage is as follows.
Native Client Integrations
As of now, there are integrations with three types of native client SDKs (OpenAI, Google, Amazon). The following sections contain at least one example per SDK. Note: Some providers share the same interface and can be consumed using the same SDK. For example, Anthropic Claude and Amazon Titan models can both be used with the Amazon SDK.
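As a minimal sketch of this, the same Session client (shown in detail in the Amazon examples below) serves both an Amazon Titan and an Anthropic Claude model:
from gen_ai_hub.proxy.native.amazon.clients import Session
# The same Amazon SDK client interface is used for both providers
titan_client = Session().client(model_name="amazon--titan-text-lite")
claude_client = Session().client(model_name="anthropic--claude-3-haiku")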
Chat
| Applicable SDK Portion | Provider | Model Name |
|---|---|---|
| gen_ai_hub.proxy.native.amazon.clients.Session | Amazon | amazon--nova-lite |
| | | amazon--nova-micro |
| | | amazon--nova-pro |
| | | amazon--titan-text-express |
| | | amazon--titan-text-lite |
| | Anthropic | anthropic--claude-3-haiku |
| | | anthropic--claude-3-opus |
| | | anthropic--claude-3-sonnet |
| | | anthropic--claude-3.5-sonnet |
| gen_ai_hub.proxy.native.google.clients.GenerativeModel | Google | gemini-1.0-pro |
| | | gemini-1.5-flash |
| | | gemini-1.5-pro |
| gen_ai_hub.proxy.native.openai | Meta | meta--llama3-70b-instruct |
| | | meta--llama3.1-70b-instruct |
| | MistralAI | mistralai--mixtral-8x7b-instruct-v01 |
| | OpenAI | gpt-4 |
| | | gpt-4-32k |
| | | gpt-4-turbo |
| | | gpt-4o |
| | | gpt-4o-mini |
| | | o1 |
| | | o3-mini |
Embedding
| Applicable SDK Portion | Provider | Model Name |
|---|---|---|
| gen_ai_hub.proxy.native.amazon.clients.Session | Amazon | amazon--titan-embed-text |
| gen_ai_hub.proxy.native.openai | OpenAI | text-embedding-3-small |
| | | text-embedding-3-large |
| | | text-embedding-ada-002 |
Completions
OpenAI
Completions
equivalent to openai.Completions
.
Below is an example usage of Completions in generative AI hub sdk.
All models that support the legacy completion endpoint can be used.
from gen_ai_hub.proxy.native.openai import completions
response = completions.create(
    model_name="meta--llama3.1-70b-instruct",
    prompt="The Answer to the Ultimate Question of Life, the Universe, and Everything is",
    max_tokens=20,
    temperature=0
)
print(response)
ChatCompletions
ChatCompletions is equivalent to openai.ChatCompletions. Below is an example of using ChatCompletions in the generative AI hub SDK.
from gen_ai_hub.proxy.native.openai import chat
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
    {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
    {"role": "user", "content": "Do other Azure Cognitive Services support this too?"},
]
kwargs = dict(model_name='gpt-4o-mini', messages=messages)
response = chat.completions.create(**kwargs)
print(response)
# Example where deployment_id is passed instead of the model_name parameter
from gen_ai_hub.proxy.native.openai import chat
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
    {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
    {"role": "user", "content": "Do other Azure Cognitive Services support this too?"},
]
response = chat.completions.create(deployment_id="dcef02e219ae4916", messages=messages)
print(response)
Google Vertex AI
Generate Content
from gen_ai_hub.proxy.native.google_vertexai.clients import GenerativeModel
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(proxy_client=proxy_client, **kwargs)
content = [{
    "role": "user",
    "parts": [{
        "text": "Write a short story about a magic kingdom."
    }]
}]
model_response = model.generate_content(content)
print(model_response)
Function calling of a Gemini model using start_chat
from gen_ai_hub.proxy.native.google_vertexai.clients import GenerativeModel
# According to the Gemini API documentation, it is recommended to use function calling via the chat interface, as this captures the user-model back-and-forth interaction.
# Example 1 of function calling using start_chat
def multiply(a: float, b: float):
    """Returns a * b."""
    return a * b
kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(**kwargs)
chat = model.start_chat(enable_automatic_function_calling=True)
prompt = 'I have 6 cats, each owns 2 mittens, how many mittens is that in total?'
response = chat.send_message(prompt, tools=[multiply])
print(response)
for content in chat.history:
    part = content.parts[0]
    print(content.role, "->", type(part).to_dict(part))
    print('-' * 80)
# Example 2 of function calling using start_chat
def start_music(energetic: bool, loud: bool, bpm: int) -> str:
    """Play some music matching the specified parameters.

    Args:
        energetic: Whether the music is energetic or not.
        loud: Whether the music is loud or not.
        bpm: The beats per minute of the music.

    Returns: The name of the song being played.
    """
    print(f"Starting music! {energetic=} {loud=}, {bpm=}")
    return "Never gonna give you up."
def dim_lights(brightness: float) -> bool:
    """Dim the lights.

    Args:
        brightness: The brightness of the lights, 0.0 is off, 1.0 is full.
    """
    print(f"Lights are now set to {brightness:.0%}")
    return True
tools = [start_music, dim_lights]
kwargs = {'model_name': 'gemini-1.0-pro'}
model = GenerativeModel(**kwargs)
chat = model.start_chat()
prompt = "Turn this place into a party!"
response = chat.send_message(prompt, tools=[tools])
print(response)
prompt = "Music played should be energetic"
response = chat.send_message(prompt, tools=[tools])
print(response)
prompt = "Light should dim"
response = chat.send_message(prompt, tools=[tools])
print(response)
Amazon
Invoke Model
import json
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="amazon--titan-text-express")
body = json.dumps(
    {
        "inputText": "Explain black holes in astrophysics to 8th graders.",
        "textGenerationConfig": {
            "maxTokenCount": 3072,
            "stopSequences": [],
            "temperature": 0.7,
            "topP": 0.9,
        },
    }
)
response = bedrock.invoke_model(body=body)
response_body = json.loads(response.get("body").read())
print(response_body)
Converse
import json
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="anthropic--claude-3-haiku")
conversation = [
    {
        "role": "user",
        "content": [
            {
                "text": "Describe the purpose of a 'hello world' program in one line."
            }
        ],
    }
]
response = bedrock.converse(
    messages=conversation,
    inferenceConfig={"maxTokens": 512, "temperature": 0.5, "topP": 0.9},
)
print(response)
Embeddings
OpenAI
Embeddings is equivalent to openai.Embeddings. The examples below show how to use Embeddings in the generative AI hub SDK.
from gen_ai_hub.proxy.native.openai import embeddings
response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002"
)
print(response.data)
from gen_ai_hub.proxy.native.openai import embeddings
# example with encoding format passed as parameter
response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002",
    encoding_format='base64'
)
print(response.data)
Amazon
import json
from gen_ai_hub.proxy.native.amazon.clients import Session
bedrock = Session().client(model_name="amazon--titan-embed-text")
body = json.dumps(
    {
        "inputText": "Please recommend books with a theme similar to the movie 'Inception'.",
    }
)
response = bedrock.invoke_model(
    body=body,
)
response_body = json.loads(response.get("body").read())
print(response_body)
LangChain Integration
LangChain provides an interface that abstracts provider-specific details into a common interface. The SDK offers LangChain-compatible chat and embeddings classes that act as drop-in replacements while routing requests through the generative AI hub; the available classes and models are listed below.
Chat Classes
| LangChain Class | Provider | Model Name |
|---|---|---|
| gen_ai_hub.proxy.langchain.amazon.ChatBedrock | Amazon | amazon--titan-text-express |
| | | amazon--titan-text-lite |
| | Anthropic | anthropic--claude-3-haiku |
| | | anthropic--claude-3-opus |
| | | anthropic--claude-3-sonnet |
| gen_ai_hub.proxy.langchain.amazon.ChatBedrockConverse | Amazon | amazon--nova-lite |
| | | amazon--nova-micro |
| | | amazon--nova-pro |
| | | amazon--titan-text-express |
| | | amazon--titan-text-lite |
| | Anthropic | anthropic--claude-3-haiku |
| | | anthropic--claude-3-opus |
| | | anthropic--claude-3-sonnet |
| gen_ai_hub.proxy.langchain.google_gemini.ChatGoogleGenerativeAI | Google | gemini-1.0-pro |
| | | gemini-1.5-flash |
| | | gemini-1.5-pro |
| gen_ai_hub.proxy.langchain.openai.ChatOpenAI | Meta | meta--llama3-70b-instruct |
| | | meta--llama3.1-70b-instruct |
| | MistralAI | mistralai--mixtral-8x7b-instruct-v01 |
| | OpenAI | gpt-4 |
| | | gpt-4-32k |
| | | gpt-4-turbo |
| | | gpt-4o |
| | | gpt-4o-mini |
| | | o1 |
| | | o3-mini |
Note: The ChatBedrockConverse LangChain class does not support system prompts for Amazon Titan models.
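As a usage illustration, the sketch below obtains a chat model through the harmonized init_llm helper (described in the Harmonized Model Initialization section below) and invokes it without a system prompt; the mapping of amazon--nova-lite to ChatBedrockConverse is an assumption based on the table above.
from langchain_core.messages import HumanMessage
from gen_ai_hub.proxy.langchain.init_models import init_llm
# Assumption: init_llm resolves 'amazon--nova-lite' to ChatBedrockConverse (see the table above)
llm = init_llm('amazon--nova-lite', max_tokens=256)
# No system prompt is sent; per the note above, this also keeps the pattern valid for Amazon Titan models
response = llm.invoke([HumanMessage(content='Describe the purpose of a "hello world" program in one line.')])
print(response.content)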
Embeddings Classes
| LangChain Class | Provider | Model Name |
|---|---|---|
| gen_ai_hub.proxy.langchain.amazon.BedrockEmbeddings | Amazon | amazon--titan-embed-text |
| gen_ai_hub.proxy.langchain.openai.OpenAIEmbeddings | OpenAI | text-embedding-3-small |
| | | text-embedding-3-large |
| | | text-embedding-ada-002 |
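A minimal sketch for the Amazon embeddings class, assuming the harmonized init_embedding_model helper (described in the next section) resolves amazon--titan-embed-text to BedrockEmbeddings:
from gen_ai_hub.proxy.langchain.init_models import init_embedding_model
# Assumption: init_embedding_model maps 'amazon--titan-embed-text' to BedrockEmbeddings
embeddings = init_embedding_model('amazon--titan-embed-text')
vector = embeddings.embed_query('Every decoding is another encoding.')
print(len(vector))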
Harmonized Model Initialization
The init_llm and init_embedding_model functions allow easy initialization of LangChain model interfaces in a harmonized way in the generative AI hub SDK:
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from gen_ai_hub.proxy.langchain.init_models import init_llm
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=['question'])
question = 'What is a supernova?'
llm = init_llm('meta--llama3.1-70b-instruct', max_tokens=300)
chain = prompt | llm | StrOutputParser()
response = chain.invoke({'question': question})
print(response)
from gen_ai_hub.proxy.langchain.init_models import init_embedding_model
text = 'Every decoding is another encoding.'
embeddings = init_embedding_model('text-embedding-ada-002')
response = embeddings.embed_query(text)
print(response)
LLM
from langchain import PromptTemplate
from gen_ai_hub.proxy.langchain.openai import OpenAI  # LangChain class representing the AI Core OpenAI models
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
# non-chat model
model_name = "meta--llama3.1-70b-instruct"
llm = OpenAI(proxy_model_name=model_name, proxy_client=proxy_client) # can be used as usual with langchain
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = prompt | llm
question = "What NFL team won the Super Bowl in the year Justin Bieber was born?"
print(llm_chain.invoke({'question': question}))
Chat model
from langchain.prompts.chat import (
    AIMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
chat_llm = ChatOpenAI(proxy_model_name='gpt-4o-mini', proxy_client=proxy_client)
template = 'You are a helpful assistant that translates english to pirate.'
system_message_prompt = SystemMessagePromptTemplate.from_template(template)
example_human = HumanMessagePromptTemplate.from_template('Hi')
example_ai = AIMessagePromptTemplate.from_template('Ahoy!')
human_template = '{text}'
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages(
    [system_message_prompt, example_human, example_ai, human_message_prompt])
chain = chat_prompt | chat_llm
response = chain.invoke({'text': 'I love planking.'})
print(response.content)
Embeddings
from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
proxy_client = get_proxy_client('gen-ai-hub')
embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002', proxy_client=proxy_client)
response = embedding_model.embed_query('Every decoding is another encoding.')
# Call without passing proxy_client
embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002')
response = embedding_model.embed_query('Every decoding is another encoding.')
print(response)
Using New Models in Gen AI Hub Before They Are Added to the SDK
This works only if the model belongs to a model family for which the corresponding native API is already supported in the SDK (e.g., openai, google_vertexai). For querying these models with the native clients, no additional steps are needed; you can refer to the examples for supported models. For using them with the LangChain integration, refer to the following example.
from gen_ai_hub.proxy.langchain.amazon import init_chat_model as amazon_init_chat_model
from gen_ai_hub.proxy.langchain.google_vertexai import init_chat_model as google_vertexai_init_chat_model
from gen_ai_hub.proxy.langchain.init_models import init_llm
# Usage of a new model that is not yet added to the SDK
model_name = 'gemini-newer-version'
init_func = google_vertexai_init_chat_model
llm = init_llm(model_name, init_func=init_func)
# Usage of a new Amazon model that is not yet added to the SDK
# For Amazon models, you additionally need to provide model_id.
# The full list of Amazon model IDs is available here: https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
model_name = 'anthropic--claude-newer-version'
model_id = 'anthropic.claude-newer-version-202401220-v1:0'
init_func = amazon_init_chat_model
llm = init_llm(model_name, model_id=model_id, init_func=init_func)