gen_ai_hub.document_grounding.clients package

Clients subpackage for Document Grounding API.

This subpackage contains the API client implementations for interacting with the SAP Generative AI Hub Document Grounding services.

Available clients:
  • PipelineAPIClient: Manages document vectorization pipelines from various data sources

  • RetrievalAPIClient: Performs retrieval operations across configured data repositories

  • VectorAPIClient: Manages vector collections and performs semantic searches

Submodules

gen_ai_hub.document_grounding.clients.pipeline_api_client module

Pipeline API client for Document Grounding.

This module provides the PipelineAPIClient class for managing document vectorization pipelines. Pipelines automate the process of fetching documents from data repositories, preprocessing and chunking content, generating semantic embeddings, and storing them in HANA Vector Store.

Supported data repositories:
  • Microsoft SharePoint

  • AWS S3

  • SFTP

API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Pipelines

class PipelineAPIClient

Bases: object

The Pipelines API creates and manages vector stores based on documents from user data repositories: S3, SFTP, and Microsoft SharePoint. Each pipeline represents a configured end-to-end process including the following steps:

  • Fetches documents from a supported data source

  • Preprocesses and chunks the document content, and generates semantic embeddings. Semantic embeddings are multidimensional representations of textual information.

  • Stores semantic embeddings into the HANA Vector Store

The Pipeline API is compatible with the following data repositories:

  • Microsoft SharePoint

  • AWS S3

  • SFTP

See https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Pipelines

__init__(proxy_client=None)

Initializes the PipelineAPIClient

Parameters:

proxy_client (Optional[GenAIHubProxyClient], optional) -- proxy client to use for requests, defaults to None

create_pipeline(pipeline_request)

Create a document vectorization pipeline

Parameters:

pipeline_request (CreatePipelineRequest) -- The object containing the pipeline configuration.

Returns:

ID of the created pipeline

Return type:

PipelineIdResponse

delete_pipeline_by_id(pipeline_id)

Delete a pipeline by pipeline id

Parameters:

pipeline_id (str) -- ID of the pipeline to delete

Returns:

Response of the delete operation

Return type:

requests.Response

get_execution_document_by_id(pipeline_id, execution_id, document_id)

Get Document by ID for a Pipeline Execution

Parameters:
  • pipeline_id (str) -- Pipeline ID

  • execution_id (str) -- Execution ID

  • document_id (str) -- Document ID

Returns:

Document for the Pipeline Execution

Return type:

Document

get_execution_documents(pipeline_id, execution_id, top=None, skip=None, count=None)

Get Documents for a Pipeline Execution

Parameters:
  • pipeline_id (str) -- Pipeline ID

  • execution_id (str) -- Execution ID

  • top (Optional[int], optional) -- the maximum number of documents to return, defaults to None

  • skip (Optional[int], optional) -- number of documents to skip, defaults to None

  • count (Optional[bool], optional) -- flag to include count of total documents, defaults to None

Returns:

Documents for the Pipeline Execution

Return type:

DocumentsStatusResponse

get_pipeline_by_id(pipeline_id)

Get details of a pipeline by pipeline id.

Parameters:

pipeline_id (str) -- Pipeline ID

Returns:

Details of the pipeline

Return type:

BasePipelineResponse

get_pipeline_document_by_id(pipeline_id, document_id)

Get Document by ID for a Pipeline

Parameters:
  • pipeline_id (str) -- Pipeline ID

  • document_id (str) -- Document ID

Returns:

Document for the Pipeline

Return type:

Document

get_pipeline_documents(pipeline_id, top=None, skip=None, count=None)

Get Documents for a Pipeline

Parameters:
  • pipeline_id (str) -- Pipeline ID

  • top (Optional[int], optional) -- the maximum number of documents to return, defaults to None

  • skip (Optional[int], optional) -- number of documents to skip, defaults to None

  • count (Optional[bool], optional) -- flag to include count of total documents, defaults to None

Returns:

Documents for the Pipeline

Return type:

DocumentsStatusResponse
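
The top and skip parameters above can be combined to walk through a large result set page by page. The following is a minimal sketch, not part of the SDK: iter_pages and FakePipelineClient are hypothetical names, and the assumption that a response exposes its items in a resources attribute should be checked against the actual DocumentsStatusResponse model. The fake client only exists so that the example runs without a service connection.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterator


def iter_pages(
    fetch_page: Callable[..., Any],
    page_size: int = 50,
    # Assumption: the response keeps its items in a `resources` attribute.
    get_items: Callable[[Any], list] = lambda page: page.resources,
) -> Iterator[Any]:
    """Yield items from a top/skip-paginated call, one page at a time."""
    skip = 0
    while True:
        page = fetch_page(top=page_size, skip=skip)
        items = list(get_items(page))
        yield from items
        # A short (or empty) page means there is nothing left to fetch.
        if len(items) < page_size:
            return
        skip += page_size


class FakePipelineClient:
    """Hypothetical stand-in for PipelineAPIClient so the sketch runs offline."""

    def __init__(self, documents):
        self._documents = documents

    def get_pipeline_documents(self, pipeline_id, top=None, skip=None, count=None):
        start = skip or 0
        end = None if top is None else start + top
        return SimpleNamespace(resources=self._documents[start:end])


client = FakePipelineClient([f"doc-{i}" for i in range(5)])
docs = list(
    iter_pages(
        lambda top, skip: client.get_pipeline_documents(
            "my-pipeline", top=top, skip=skip
        ),
        page_size=2,
    )
)
print(docs)
```

Because the helper only takes a callable that accepts top and skip, the same pattern should carry over to the other list methods that share these parameters, with get_items adjusted to the shape of the respective response.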

get_pipeline_execution_by_id(pipeline_id, execution_id)

Get Pipeline Execution by ID

Parameters:
  • pipeline_id (str) -- Pipeline ID

  • execution_id (str) -- Execution ID

Returns:

Pipeline Execution

Return type:

PipelineExecution

get_pipeline_executions(pipeline_id, last_execution=None, top=None, skip=None, count=None)

Get Pipeline Executions

Parameters:
  • pipeline_id (str) -- Pipeline ID

  • last_execution (Optional[bool], optional) -- flag to get only the last execution, defaults to None

  • top (Optional[int], optional) -- number of executions to retrieve, defaults to None

  • skip (Optional[int], optional) -- number of executions to skip, defaults to None

  • count (Optional[bool], optional) -- flag to include count of total executions, defaults to None

Returns:

Pipeline Executions

Return type:

GetPipelineExecutionsResponse

get_pipeline_status(pipeline_id)

Get pipeline status by pipeline id

Parameters:

pipeline_id (str) -- Pipeline ID

Returns:

Status of the pipeline

Return type:

GetPipelineStatusResponse
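
Since a pipeline runs asynchronously, a caller will often want to poll get_pipeline_status until the pipeline reaches a final state. The sketch below illustrates one way to do that. It rests on assumptions that are not confirmed by this reference: that GetPipelineStatusResponse exposes a status attribute, and that the listed status names mark a finished pipeline. wait_for_pipeline and FakeStatusClient are hypothetical helpers, and the fake client replaces the real one only to keep the example runnable offline.

```python
import time
from types import SimpleNamespace

# Assumed terminal status names; verify them against the API reference.
TERMINAL_STATUSES = frozenset({"FINISHED", "FINISHEDWITHERRORS", "TIMEOUT"})


def wait_for_pipeline(
    client,
    pipeline_id,
    poll_seconds=5.0,
    timeout_seconds=600.0,
    terminal_statuses=TERMINAL_STATUSES,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll the pipeline status until it is terminal or the timeout expires."""
    deadline = clock() + timeout_seconds
    while True:
        # Assumption: the status response has a `status` attribute.
        status = client.get_pipeline_status(pipeline_id).status
        if status in terminal_statuses:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"pipeline {pipeline_id} still in status {status!r} "
                f"after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)


class FakeStatusClient:
    """Hypothetical stand-in that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.calls = 0

    def get_pipeline_status(self, pipeline_id):
        self.calls += 1
        return SimpleNamespace(status=next(self._statuses))


status_client = FakeStatusClient(["NEW", "INPROGRESS", "FINISHED"])
final_status = wait_for_pipeline(
    status_client, "my-pipeline", sleep=lambda seconds: None
)
print(final_status, status_client.calls)
```

Passing sleep and clock as parameters keeps the helper easy to test; in real use the defaults apply and only the client and pipeline ID need to be supplied.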

get_pipelines(top=None, skip=None, count=None)

Get all pipelines.

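Parameters:
  • top (Optional[int], optional) -- the maximum number of pipelines to return, defaults to None

  • skip (Optional[int], optional) -- number of pipelines to skip, defaults to None

  • count (Optional[bool], optional) -- flag to include count of total pipelines, defaults to None

Returns:

List of pipelines

Return type:

GetPipelinesResponse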

search_pipelines(body)

Pipeline Search by Metadata

Parameters:

body (SearchPipelineRequest) -- The search request object containing metadata filters.

Returns:

Search results containing matching pipelines.

Return type:

SearchPipelinesResponse

trigger_pipeline(request)

Trigger Pipeline Manually

Parameters:

request (ManualPipelineTrigger) -- The manual trigger request object.

Returns:

Response of the trigger operation

Return type:

requests.Response

gen_ai_hub.document_grounding.clients.retrieval_api_client module

Retrieval API client for Document Grounding.

This module provides the RetrievalAPIClient class for querying and retrieving relevant content from configured data repositories. The Retrieval API combines semantic search with repository metadata filtering and supports custom retrieval configurations for chunk/document granularity.

Supported repository types:
  • Vector stores

  • External document sources (e.g., help.sap.com)

API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Retrieval

class RetrievalAPIClient

Bases: object

The Retrieval API enables querying and retrieving relevant content from configured data repositories, such as vector stores or external document sources (e.g., help.sap.com).

Retrieval combines semantic search with repository metadata filtering and supports custom retrieval configurations for chunk/document granularity.

Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Retrieval

__init__(proxy_client=None)

Initialize the RetrievalAPIClient.

Parameters:

proxy_client (Optional[GenAIHubProxyClient], optional) -- Optional proxy client for making API requests.

get_data_repositories(top=None, skip=None, count=None)

List all data repositories available to the tenant.

Parameters:
  • top (Optional[int], optional) -- the number of items to return, defaults to None

  • skip (Optional[int], optional) -- the number of items to skip, defaults to None

  • count (Optional[bool], optional) -- whether to include a count of total items, defaults to None

Returns:

DataRepositories model containing the list of data repositories

Return type:

DataRepositories

get_data_repository_by_id(repository_id)

Get a single data repository by its unique ID.

Parameters:

repository_id (str) -- the unique identifier of the data repository

Returns:

DataRepository model representing the data repository

Return type:

DataRepository

search(search_input)

Perform a retrieval search for relevant content.

Parameters:

search_input (RetrievalSearchInput) -- RetrievalSearchInput model defining the query and filters.

Returns:

RetrievalSearchResults model containing repositories, documents, and chunks.

Return type:

RetrievalSearchResults

gen_ai_hub.document_grounding.clients.vector_api_client module

Vector API client for Document Grounding.

This module provides the VectorAPIClient class for managing vector-based document collections and performing semantic searches. The Vector API enables creating, retrieving, updating, and deleting collections, as well as managing documents within those collections.

Key capabilities:
  • Collection management (create, read, update, delete)

  • Document management within collections

  • Semantic vector search across collections

  • Collection status tracking (creation/deletion)

API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Vector

class VectorAPIClient

Bases: object

The Vector API provides management and search capabilities for vector-based document collections.

It enables creating, retrieving, updating, and deleting collections, as well as managing documents and performing semantic vector searches within those collections.

Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Vector

__init__(proxy_client=None)

Initializes the VectorAPIClient

Parameters:

proxy_client (Optional[GenAIHubProxyClient], optional) -- Optional proxy client to use for requests

create_collection(collection_request)

Create a new collection.

Parameters:

collection_request (CollectionCreateRequest) -- The object containing the collection configuration.

Returns:

Empty requests.Response object with a 202 status code

Return type:

requests.Response

create_documents(collection_id, request)

Create documents in a collection.

Parameters:
  • collection_id (str) -- The ID of the collection to add documents to.

  • request (DocumentsCreateRequest) -- The object containing the documents to create.

Returns:

A DocumentsListResponse object containing the created documents

Return type:

DocumentsListResponse

delete_collection(collection_id)

Delete collection by ID.

Parameters:

collection_id (str) -- The ID of the collection to delete.

Returns:

Empty requests.Response object with a 204 status code

Return type:

requests.Response

delete_document(collection_id, document_id)

Delete a document from a collection.

Parameters:
  • collection_id (str) -- The ID of the collection to delete the document from.

  • document_id (str) -- The ID of the document to delete.

Returns:

Empty requests.Response object with a 204 status code

Return type:

requests.Response
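
Methods that return a bare requests.Response carry their outcome in the status code rather than in a parsed model. A small guard such as the one sketched below makes the expected code explicit at the call site. require_status and FakeVectorClient are hypothetical names introduced for illustration; whether the SDK already raises on error responses is not stated in this reference, so treat the guard as optional.

```python
from types import SimpleNamespace


def require_status(response, expected):
    """Return the response if its status code matches, otherwise raise."""
    if response.status_code != expected:
        raise RuntimeError(
            f"expected HTTP {expected}, got {response.status_code}"
        )
    return response


class FakeVectorClient:
    """Hypothetical stand-in for VectorAPIClient so the sketch runs offline."""

    def __init__(self):
        self.deleted = []

    def delete_document(self, collection_id, document_id):
        self.deleted.append((collection_id, document_id))
        # Mimics the documented empty response with a 204 status code.
        return SimpleNamespace(status_code=204)


fake_client = FakeVectorClient()
response = require_status(
    fake_client.delete_document("my-collection", "my-document"), 204
)
print(response.status_code)
```

The same guard can be applied with 202 for create_collection and with 204 for delete_collection, matching the status codes documented for those methods.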

get_collection_by_id(collection_id)

Get collection details by ID.

Parameters:

collection_id (str) -- The ID of the collection to retrieve.

Returns:

A Collection object containing the collection details

Return type:

Collection

get_collection_creation_status(collection_id)

Get creation status for a collection.

Parameters:

collection_id (str) -- The ID of the collection to retrieve the creation status for.

Returns:

A CollectionCreationStatusResponse object containing the creation status

Return type:

CollectionCreationStatusResponse

get_collection_deletion_status(collection_id)

Get deletion status for a collection.

Parameters:

collection_id (str) -- The ID of the collection to retrieve the deletion status for.

Returns:

A CollectionDeletionStatusResponse object containing the deletion status

Return type:

CollectionDeletionStatusResponse

get_collections(top=None, skip=None, count=None)

Get all collections.

Parameters:
  • top (Optional[int], optional) -- the number of collections to retrieve, defaults to None

  • skip (Optional[int], optional) -- the number of collections to skip, defaults to None

  • count (Optional[bool], optional) -- whether to include the total count of collections, defaults to None

Returns:

A CollectionsListResponse object containing the list of collections

Return type:

CollectionsListResponse

get_document_by_id(collection_id, document_id)

Get a document by ID from a collection.

Parameters:
  • collection_id (str) -- The ID of the collection to retrieve the document from.

  • document_id (str) -- The ID of the document to retrieve.

Returns:

A Document object containing the document details

Return type:

Document

get_documents(collection_id, top=None, skip=None, count=None)

Get documents from a collection.

Parameters:
  • collection_id (str) -- The ID of the collection to retrieve documents from.

  • top (Optional[int], optional) -- the number of documents to retrieve, defaults to None

  • skip (Optional[int], optional) -- the number of documents to skip, defaults to None

  • count (Optional[bool], optional) -- whether to include the total count of documents, defaults to None

Returns:

A DocumentsResponse object containing the list of documents

Return type:

DocumentsResponse

search(request)

Perform semantic search in vector collections.

Parameters:

request (TextSearchRequest) -- The object containing the search parameters.

Returns:

A VectorSearchResults object containing the search results

Return type:

VectorSearchResults

update_documents(collection_id, request)

Update documents in a collection.

Parameters:
  • collection_id (str) -- The ID of the collection to update documents in.

  • request (DocumentsUpdateRequest) -- The object containing the documents to update.

Returns:

A DocumentsListResponse object containing the updated documents

Return type:

DocumentsListResponse