gen_ai_hub.document_grounding.clients package

Parameters:

pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID
document_id (str) -- Document ID

get_execution_documents(pipeline_id, execution_id, top=None, skip=None, count=None)

Get Documents for a Pipeline Execution

Parameters:

pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID
top (Optional[int], optional) -- the maximum number of documents to return, defaults to None
skip (Optional[int], optional) -- number of documents to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total documents, defaults to None

Returns:

Documents for the Pipeline Execution

Return type:

get_pipeline_by_id(pipeline_id)

Get details of a pipeline by pipeline id.

Parameters:: pipeline_id (str) -- Pipeline ID
Returns:: Details of the pipeline
Return type:: BasePipelineResponse

get_pipeline_document_by_id(pipeline_id, document_id)

Get Document by ID for a Pipeline

Parameters:

pipeline_id (str) -- Pipeline ID
document_id (str) -- Document ID

Returns:

Document for the Pipeline

Return type:

get_pipeline_documents(pipeline_id, top=None, skip=None, count=None)

Get Documents for a Pipeline

Parameters:

pipeline_id (str) -- Pipeline ID
top (Optional[int], optional) -- the maximum number of documents to return, defaults to None
skip (Optional[int], optional) -- number of documents to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total documents, defaults to None

Returns:

Documents for the Pipeline

Return type:

GetPipelineExecutionsResponse

get_pipeline_execution_by_id(pipeline_id, execution_id)

Get Pipeline Execution by ID

Parameters:

pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID

Returns:

Pipeline Execution

Return type:

PipelineExecution

get_pipeline_executions(pipeline_id, last_execution=None, top=None, skip=None, count=None)

Get Pipeline Executions

Parameters:

pipeline_id (str) -- Pipeline ID
last_execution (Optional[bool], optional) -- flag to get only the last execution, defaults to None
top (Optional[int], optional) -- number of executions to retrieve, defaults to None
skip (Optional[int], optional) -- number of executions to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total executions, defaults to None

Returns:

Pipeline Executions

Return type:

get_pipeline_status(pipeline_id)

Get pipeline status by pipeline id

Parameters:: pipeline_id (str) -- Pipeline ID
Returns:: Status of the pipeline
Return type:: GetPipelineStatusResponse

get_pipelines(top=None, skip=None, count=None)

Get all pipelines.

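Parameters:

top (Optional[int], optional) -- number of pipelines to retrieve, defaults to None
skip (Optional[int], optional) -- number of pipelines to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total pipelines, defaults to None

Returns:

All pipelines

Return type:

GetPipelinesResponse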

search_pipelines(body)

Pipeline Search by Metadata

Parameters:: body (SearchPipelineRequest) -- The search request object containing metadata filters.
Returns:: Search results containing matching pipelines.
Return type:: SearchPipelinesResponse

trigger_pipeline(request)

Trigger Pipeline Manually

Parameters:: request (ManualPipelineTrigger) -- The manual trigger request object.
Returns:: Response of the trigger operation
Return type:: requests.Response

class RetrievalAPIClient

Bases: object

The Retrieval API enables querying and retrieving relevant content from configured data repositories, such as vector or external document sources (e.g., help.sap.com).

Retrieval combines semantic search with repository metadata filtering and supports custom retrieval configurations for chunk/document granularity.

Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Retrieval

__init__(proxy_client=None)

Initialize the RetrievalAPIClient.

Parameters:: proxy_client (Optional[GenAIHubProxyClient], optional) -- Optional proxy client for making API requests.

get_data_repositories(top=None, skip=None, count=None)

List all data repositories available to the tenant.

Parameters:

top (Optional[int], optional) -- the number of items to return, defaults to None
skip (Optional[int], optional) -- the number of items to skip, defaults to None
count (Optional[bool], optional) -- whether to include a count of total items, defaults to None

Returns:

DataRepositories model containing the list of data repositories

Return type:

DataRepositories

get_data_repository_by_id(repository_id)

Get a single data repository by its unique ID.

Parameters:: repository_id (str) -- the unique identifier of the data repository
Returns:: DataRepository model representing the data repository
Return type:: DataRepository

search(search_input)

Perform a retrieval search for relevant content.

Parameters:: search_input (RetrievalSearchInput) -- RetrievalSearchInput model defining the query and filters.
Returns:: RetrievalSearchResults model containing repositories, documents, and chunks.
Return type:: RetrievalSearchResults

class VectorAPIClient

Bases: object

The Vector API provides management and search capabilities for vector-based document collections.

It enables creating, retrieving, updating, and deleting collections, as well as managing documents and performing semantic vector searches within those collections.

Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Vector

__init__(proxy_client=None)

Initializes the VectorAPIClient

Parameters:: proxy_client (Optional[GenAIHubProxyClient], optional) -- Optional proxy client to use for requests

create_collection(collection_request)

Create a new collection.

Parameters:: collection_request (CollectionCreateRequest) -- The object containing the collection configuration.
Returns:: Empty requests.Response object with a 202 status code
Return type:: requests.Response

create_documents(collection_id, request)

Create documents in a collection.

Parameters:

collection_id (str) -- The ID of the collection to add documents to.
request (DocumentsCreateRequest) -- The object containing the documents to create.

Returns:

A DocumentsListResponse object containing the created documents

Return type:

DocumentsListResponse

delete_collection(collection_id)

Delete collection by ID.

Parameters:: collection_id (str) -- The ID of the collection to delete.
Returns:: Empty requests.Response object with a 204 status code
Return type:: requests.Response

delete_document(collection_id, document_id)

Delete a document from a collection.

Parameters:

collection_id (str) -- The ID of the collection to delete the document from.
document_id (str) -- The ID of the document to delete.

Returns:

Empty requests.Response object with a 204 status code

Return type:

requests.Response

get_collection_by_id(collection_id)

Get collection details by ID.

Parameters:: collection_id (str) -- The ID of the collection to retrieve.
Returns:: A Collection object containing the collection details
Return type:: Collection

get_collection_creation_status(collection_id)

Get creation status for a collection.

Parameters:: collection_id (str) -- The ID of the collection to retrieve the creation status for.
Returns:: A CollectionCreationStatusResponse object containing the creation status
Return type:: CollectionCreationStatusResponse

get_collection_deletion_status(collection_id)

Get deletion status for a collection.

Parameters:: collection_id (str) -- The ID of the collection to retrieve the deletion status for.
Returns:: A CollectionDeletionStatusResponse object containing the deletion status
Return type:: CollectionDeletionStatusResponse

get_collections(top=None, skip=None, count=None)

Get all collections.

Parameters:

top (Optional[int], optional) -- the number of collections to retrieve, defaults to None
skip (Optional[int], optional) -- the number of collections to skip, defaults to None
count (Optional[bool], optional) -- whether to include the total count of collections, defaults to None

Returns:

A CollectionsListResponse object containing the list of collections

Return type:

CollectionsListResponse

get_document_by_id(collection_id, document_id)

Get a document by ID from a collection.

Parameters:

collection_id (str) -- The ID of the collection to retrieve the document from.
document_id (str) -- The ID of the document to retrieve.

Returns:

A Document object containing the document details

Return type:

get_documents(collection_id, top=None, skip=None, count=None)

Get documents from a collection.

Parameters:

collection_id (str) -- The ID of the collection to retrieve documents from.
top (Optional[int], optional) -- the number of documents to retrieve, defaults to None
skip (Optional[int], optional) -- the number of documents to skip, defaults to None
count (Optional[bool], optional) -- whether to include the total count of documents, defaults to None

Returns:

A DocumentsResponse object containing the list of documents

Return type:

DocumentsResponse

search(request)

Perform semantic search in vector collections.

Parameters:: request (TextSearchRequest) -- The object containing the search parameters.
Returns:: A VectorSearchResults object containing the search results
Return type:: VectorSearchResults

update_documents(collection_id, request)

Update documents in a collection.

Parameters:

collection_id (str) -- The ID of the collection to update documents in.
request (DocumentsUpdateRequest) -- The object containing the documents to update.

Returns:

A DocumentsListResponse object containing the updated documents

Return type:

DocumentsListResponse

See https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Pipelines

Submodules

gen_ai_hub.document_grounding.clients.pipeline_api_client module

Pipeline API client for Document Grounding.

This module provides the PipelineAPIClient class for managing document vectorization pipelines. Pipelines automate the process of fetching documents from data repositories, preprocessing and chunking content, generating semantic embeddings, and storing them in HANA Vector Store.

Supported data repositories:

Microsoft SharePoint
AWS S3
SFTP

API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Pipelines

class PipelineAPIClient

Bases: object

Client for managing document vectorization pipelines. A pipeline performs the following steps:

Fetches documents from a supported data source
Preprocesses and chunks the document content, and generates semantic embeddings. Semantic embeddings are multidimensional representations of textual information.
Stores semantic embeddings into the HANA Vector Store

The Pipeline API is compatible with the following data repositories:

Microsoft SharePoint
AWS S3
SFTP
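
To make the "chunks the document content" step above more concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: the function name `chunk_text`, the character-based window, and the default sizes are assumptions made for this example, and nothing here describes how the Document Grounding service actually splits documents.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters.

    Hypothetical helper for illustration; not part of gen_ai_hub.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlapping windows are a common way to avoid cutting a sentence's context exactly at a chunk boundary; each chunk would then be embedded and stored separately.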

__init__(proxy_client=None)

Initializes the PipelineAPIClient

Parameters:: proxy_client (Optional[GenAIHubProxyClient], optional) -- proxy client to use for requests, defaults to None

create_pipeline(pipeline_request)

Create a document vectorization pipeline

Parameters:: pipeline_request (CreatePipelineRequest) -- The object containing the pipeline configuration.
Returns:: ID of the created pipeline
Return type:: PipelineIdResponse

delete_pipeline_by_id(pipeline_id)

Delete a pipeline by pipeline id

Parameters:: pipeline_id (str) -- ID of the pipeline to delete
Returns:: Response of the delete operation
Return type:: requests.Response

get_execution_document_by_id(pipeline_id, execution_id, document_id)

Get Document by ID for a Pipeline Execution

Parameters:

pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID
document_id (str) -- Document ID

Returns:

Document for the Pipeline Execution

Return type:

get_execution_documents(pipeline_id, execution_id, top=None, skip=None, count=None)

Get Documents for a Pipeline Execution

Parameters:

pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID
top (Optional[int], optional) -- the maximum number of documents to return, defaults to None
skip (Optional[int], optional) -- number of documents to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total documents, defaults to None

Returns:

Documents for the Pipeline Execution

Return type:

get_pipeline_by_id(pipeline_id)

Get details of a pipeline by pipeline id.

Parameters:: pipeline_id (str) -- Pipeline ID
Returns:: Details of the pipeline
Return type:: BasePipelineResponse

get_pipeline_document_by_id(pipeline_id, document_id)

Get Document by ID for a Pipeline

Parameters:

pipeline_id (str) -- Pipeline ID
document_id (str) -- Document ID

Returns:

Document for the Pipeline

Return type:

get_pipeline_documents(pipeline_id, top=None, skip=None, count=None)

Get Documents for a Pipeline

Parameters:

pipeline_id (str) -- Pipeline ID
top (Optional[int], optional) -- the maximum number of documents to return, defaults to None
skip (Optional[int], optional) -- number of documents to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total documents, defaults to None

Returns:

Documents for the Pipeline

Return type:

GetPipelineExecutionsResponse
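
The `top` and `skip` parameters above suggest offset-style paging. The sketch below shows one way to walk through all pages. The helper `iter_pages` is hypothetical and not part of the SDK. Because this page does not describe the fields of the response model, the caller supplies an `extract_items` function; the helper also assumes that a page shorter than `page_size` marks the end of the data.

```python
from typing import Any, Callable, Iterator, List


def iter_pages(
    fetch_page: Callable[..., Any],
    extract_items: Callable[[Any], List[Any]],
    page_size: int = 50,
) -> Iterator[Any]:
    """Yield items page by page using `top` and `skip` keyword arguments.

    fetch_page:    callable accepting `top` and `skip`, e.g. a lambda that
                   wraps client.get_pipeline_documents for one pipeline.
    extract_items: pulls the list of items out of one response (assumption:
                   the response shape is not documented on this page).
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")

    skip = 0
    while True:
        response = fetch_page(top=page_size, skip=skip)
        items = extract_items(response)
        yield from items
        # Assumed stop condition: a short or empty page means no more data.
        if len(items) < page_size:
            return
        skip += page_size
```

With a real client, `fetch_page` could be `lambda **kw: client.get_pipeline_documents(pipeline_id, **kw)`. The same helper would apply to any other method on this page that accepts `top` and `skip`.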

get_pipeline_execution_by_id(pipeline_id, execution_id)

Get Pipeline Execution by ID

Parameters:

pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID

Returns:

Pipeline Execution

Return type:

PipelineExecution

get_pipeline_executions(pipeline_id, last_execution=None, top=None, skip=None, count=None)

Get Pipeline Executions

Parameters:

pipeline_id (str) -- Pipeline ID
last_execution (Optional[bool], optional) -- flag to get only the last execution, defaults to None
top (Optional[int], optional) -- number of executions to retrieve, defaults to None
skip (Optional[int], optional) -- number of executions to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total executions, defaults to None

Returns:

Pipeline Executions

Return type:

get_pipeline_status(pipeline_id)

Get pipeline status by pipeline id

Parameters:: pipeline_id (str) -- Pipeline ID
Returns:: Status of the pipeline
Return type:: GetPipelineStatusResponse
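
A common use of `get_pipeline_status` is to poll until a pipeline has finished. The following is a minimal sketch under stated assumptions: `wait_for_pipeline` is a hypothetical helper, and since the fields of `GetPipelineStatusResponse` are not listed on this page, the decision whether a status counts as finished is left to a caller-supplied `is_finished` predicate.

```python
import time
from typing import Any, Callable


def wait_for_pipeline(
    client: Any,
    pipeline_id: str,
    is_finished: Callable[[Any], bool],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll client.get_pipeline_status(pipeline_id) until is_finished(status).

    Returns the last status response, or raises TimeoutError once
    `timeout_s` seconds have passed without a finished status.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = client.get_pipeline_status(pipeline_id)
        if is_finished(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"pipeline {pipeline_id!r} not finished after {timeout_s} s"
            )
        # Wait before asking again to avoid hammering the service.
        sleep(interval_s)
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real waiting.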

get_pipelines(top=None, skip=None, count=None)

Get all pipelines.

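Parameters:

top (Optional[int], optional) -- number of pipelines to retrieve, defaults to None
skip (Optional[int], optional) -- number of pipelines to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total pipelines, defaults to None

Returns:

All pipelines

Return type:

GetPipelinesResponse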

search_pipelines(body)

Pipeline Search by Metadata

Parameters:: body (SearchPipelineRequest) -- The search request object containing metadata filters.
Returns:: Search results containing matching pipelines.
Return type:: SearchPipelinesResponse

trigger_pipeline(request)

Trigger Pipeline Manually

Parameters:: request (ManualPipelineTrigger) -- The manual trigger request object.
Returns:: Response of the trigger operation
Return type:: requests.Response

gen_ai_hub.document_grounding.clients.retrieval_api_client module

Retrieval API client for Document Grounding.

This module provides the RetrievalAPIClient class for querying and retrieving relevant content from configured data repositories. The Retrieval API combines semantic search with repository metadata filtering and supports custom retrieval configurations for chunk/document granularity.

Supported repository types:

Vector stores
External document sources (e.g., help.sap.com)

API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Retrieval

class RetrievalAPIClient

Bases: object

The Retrieval API enables querying and retrieving relevant content from configured data repositories, such as vector or external document sources (e.g., help.sap.com).

Retrieval combines semantic search with repository metadata filtering and supports custom retrieval configurations for chunk/document granularity.

Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Retrieval

__init__(proxy_client=None)

Initialize the RetrievalAPIClient.

Parameters:: proxy_client (Optional[GenAIHubProxyClient], optional) -- Optional proxy client for making API requests.

get_data_repositories(top=None, skip=None, count=None)

List all data repositories available to the tenant.

Parameters:

top (Optional[int], optional) -- the number of items to return, defaults to None
skip (Optional[int], optional) -- the number of items to skip, defaults to None
count (Optional[bool], optional) -- whether to include a count of total items, defaults to None

Returns:

DataRepositories model containing the list of data repositories

Return type:

DataRepositories

get_data_repository_by_id(repository_id)

Get a single data repository by its unique ID.

Parameters:: repository_id (str) -- the unique identifier of the data repository
Returns:: DataRepository model representing the data repository
Return type:: DataRepository

search(search_input)

Perform a retrieval search for relevant content.

Parameters:: search_input (RetrievalSearchInput) -- RetrievalSearchInput model defining the query and filters.
Returns:: RetrievalSearchResults model containing repositories, documents, and chunks.
Return type:: RetrievalSearchResults

gen_ai_hub.document_grounding.clients.vector_api_client module

Vector API client for Document Grounding.

This module provides the VectorAPIClient class for managing vector-based document collections and performing semantic searches. The Vector API enables creating, retrieving, updating, and deleting collections, as well as managing documents within those collections.

Key capabilities:

Collection management (create, read, update, delete)
Document management within collections
Semantic vector search across collections
Collection status tracking (creation/deletion)

API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Vector

class VectorAPIClient

Bases: object

The Vector API provides management and search capabilities for vector-based document collections.

It enables creating, retrieving, updating, and deleting collections, as well as managing documents and performing semantic vector searches within those collections.

Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Vector

__init__(proxy_client=None)

Initializes the VectorAPIClient

Parameters:: proxy_client (Optional[GenAIHubProxyClient], optional) -- Optional proxy client to use for requests

create_collection(collection_request)

Create a new collection.

Parameters:: collection_request (CollectionCreateRequest) -- The object containing the collection configuration.
Returns:: Empty requests.Response object with a 202 status code
Return type:: requests.Response

create_documents(collection_id, request)

Create documents in a collection.

Parameters:

collection_id (str) -- The ID of the collection to add documents to.
request (DocumentsCreateRequest) -- The object containing the documents to create.

Returns:

A DocumentsListResponse object containing the created documents

Return type:

DocumentsListResponse

delete_collection(collection_id)

Delete collection by ID.

Parameters:: collection_id (str) -- The ID of the collection to delete.
Returns:: Empty requests.Response object with a 204 status code
Return type:: requests.Response

delete_document(collection_id, document_id)

Delete a document from a collection.

Parameters:

collection_id (str) -- The ID of the collection to delete the document from.
document_id (str) -- The ID of the document to delete.

Returns:

Empty requests.Response object with a 204 status code

Return type:

requests.Response
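
Since `delete_document` returns a raw `requests.Response` rather than a model object, callers may want to verify the status code themselves. The sketch below wraps the call in a hypothetical helper, `delete_document_checked`, and treats anything other than the documented 204 as an error. Whether the client already raises on error responses is not stated on this page, so the explicit check is a defensive assumption.

```python
from typing import Any


def delete_document_checked(client: Any, collection_id: str, document_id: str) -> None:
    """Delete a document and raise if the response is not the documented 204."""
    response = client.delete_document(collection_id, document_id)
    # `status_code` is a standard attribute of requests.Response.
    if response.status_code != 204:
        raise RuntimeError(
            f"deleting document {document_id!r} from collection "
            f"{collection_id!r} returned status {response.status_code}"
        )
```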

get_collection_by_id(collection_id)

Get collection details by ID.

Parameters:: collection_id (str) -- The ID of the collection to retrieve.
Returns:: A Collection object containing the collection details
Return type:: Collection

get_collection_creation_status(collection_id)

Get creation status for a collection.

Parameters:: collection_id (str) -- The ID of the collection to retrieve the creation status for.
Returns:: A CollectionCreationStatusResponse object containing the creation status
Return type:: CollectionCreationStatusResponse

get_collection_deletion_status(collection_id)

Get deletion status for a collection.

Parameters:: collection_id (str) -- The ID of the collection to retrieve the deletion status for.
Returns:: A CollectionDeletionStatusResponse object containing the deletion status
Return type:: CollectionDeletionStatusResponse

get_collections(top=None, skip=None, count=None)

Get all collections.

Parameters:

top (Optional[int], optional) -- the number of collections to retrieve, defaults to None
skip (Optional[int], optional) -- the number of collections to skip, defaults to None
count (Optional[bool], optional) -- whether to include the total count of collections, defaults to None

Returns:

A CollectionsListResponse object containing the list of collections

Return type:

CollectionsListResponse

get_document_by_id(collection_id, document_id)

Get a document by ID from a collection.

Parameters:

collection_id (str) -- The ID of the collection to retrieve the document from.
document_id (str) -- The ID of the document to retrieve.

Returns:

A Document object containing the document details

Return type:

get_documents(collection_id, top=None, skip=None, count=None)

Get documents from a collection.

Parameters:

collection_id (str) -- The ID of the collection to retrieve documents from.
top (Optional[int], optional) -- the number of documents to retrieve, defaults to None
skip (Optional[int], optional) -- the number of documents to skip, defaults to None
count (Optional[bool], optional) -- whether to include the total count of documents, defaults to None

Returns:

A DocumentsResponse object containing the list of documents

Return type:

DocumentsResponse

search(request)

Perform semantic search in vector collections.

Parameters:: request (TextSearchRequest) -- The object containing the search parameters.
Returns:: A VectorSearchResults object containing the search results
Return type:: VectorSearchResults

update_documents(collection_id, request)

Update documents in a collection.

Parameters:

collection_id (str) -- The ID of the collection to update documents in.
request (DocumentsUpdateRequest) -- The object containing the documents to update.

Returns:

A DocumentsListResponse object containing the updated documents

Return type:

DocumentsListResponse