gen_ai_hub.document_grounding.clients package
Clients subpackage for Document Grounding API.
This subpackage contains the API client implementations for interacting with the SAP Generative AI Hub Document Grounding services.
- Available clients:
PipelineAPIClient: Manages document vectorization pipelines from various data sources
RetrievalAPIClient: Performs retrieval operations across configured data repositories
VectorAPIClient: Manages vector collections and performs semantic searches
Submodules
gen_ai_hub.document_grounding.clients.pipeline_api_client module
Pipeline API client for Document Grounding.
This module provides the PipelineAPIClient class for managing document vectorization pipelines. Pipelines automate the process of fetching documents from data repositories, preprocessing and chunking content, generating semantic embeddings, and storing them in HANA Vector Store.
- Supported data repositories:
Microsoft SharePoint
AWS S3
SFTP
API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Pipelines
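To make the fetch, chunk, embed, and store steps described above more concrete, here is a toy sketch in plain Python. It is not the service's implementation: `chunk_text`, `toy_embed`, and `run_toy_pipeline` are hypothetical names, the "embedding" is a simple character sum rather than a real model, and an in-memory list stands in for the HANA Vector Store.

```python
from typing import Dict, List


def chunk_text(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into fixed-size character chunks with optional overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def toy_embed(chunk: str, dims: int = 4) -> List[float]:
    """Stand-in for an embedding model: one fixed-length vector per chunk."""
    vector = [0.0] * dims
    for index, char in enumerate(chunk):
        vector[index % dims] += ord(char)
    return vector


def run_toy_pipeline(documents: Dict[str, str], chunk_size: int) -> List[dict]:
    """Chunk and embed each document, then 'store' the records in a list."""
    store = []
    for doc_id, text in documents.items():
        for position, chunk in enumerate(chunk_text(text, chunk_size)):
            store.append(
                {
                    "doc": doc_id,
                    "pos": position,
                    "chunk": chunk,
                    "embedding": toy_embed(chunk),
                }
            )
    return store


# Hypothetical input: documents already fetched from a repository.
records = run_toy_pipeline({"guide.txt": "abcdefgh"}, chunk_size=4)
```

In the real service these steps run on the server side; the client only configures and triggers them.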
- class PipelineAPIClient
Bases:
object
The Pipelines API creates and manages vector stores based on documents from user data repositories: S3, SFTP, and Microsoft SharePoint. Each pipeline represents a configured end-to-end process including the following steps:
Fetches documents from a supported data source
Preprocesses and chunks the document content, and generates semantic embeddings. Semantic embeddings are multidimensional representations of textual information.
Stores semantic embeddings into the HANA Vector Store
The Pipeline API is compatible with the following data repositories:
Microsoft SharePoint
AWS S3
SFTP
See https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Pipelines
- __init__(proxy_client=None)
Initializes the PipelineAPIClient
- Parameters:
proxy_client (Optional[GenAIHubProxyClient], optional) -- proxy client to use for requests, defaults to None
- create_pipeline(pipeline_request)
Create a document vectorization pipeline
- Parameters:
pipeline_request (CreatePipelineRequest) -- The object containing the pipeline configuration.
- Returns:
ID of the created pipeline
- Return type:
- delete_pipeline_by_id(pipeline_id)
Delete a pipeline by pipeline id
- Parameters:
pipeline_id (str) -- ID of the pipeline to delete
- Returns:
Response of the delete operation
- Return type:
requests.Response
- get_execution_document_by_id(pipeline_id, execution_id, document_id)
Get Document by ID for a Pipeline Execution
- Parameters:
pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID
document_id (str) -- Document ID
- Returns:
Document for the Pipeline Execution
- Return type:
- get_execution_documents(pipeline_id, execution_id, top=None, skip=None, count=None)
Get Documents for a Pipeline Execution
- Parameters:
pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID
top (Optional[int], optional) -- the maximum number of documents to return, defaults to None
skip (Optional[int], optional) -- number of documents to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total documents, defaults to None
- Returns:
Documents for the Pipeline Execution
- Return type:
- get_pipeline_by_id(pipeline_id)
Get details of a pipeline by pipeline id.
- Parameters:
pipeline_id (str) -- Pipeline ID
- Returns:
Details of the pipeline
- Return type:
- get_pipeline_document_by_id(pipeline_id, document_id)
Get Document by ID for a Pipeline
- Parameters:
pipeline_id (str) -- Pipeline ID
document_id (str) -- Document ID
- Returns:
Document for the Pipeline
- Return type:
- get_pipeline_documents(pipeline_id, top=None, skip=None, count=None)
Get Documents for a Pipeline
- Parameters:
pipeline_id (str) -- Pipeline ID
top (Optional[int], optional) -- the maximum number of documents to return, defaults to None
skip (Optional[int], optional) -- number of documents to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total documents, defaults to None
- Returns:
Documents for the Pipeline
- Return type:
- get_pipeline_execution_by_id(pipeline_id, execution_id)
Get Pipeline Execution by ID
- Parameters:
pipeline_id (str) -- Pipeline ID
execution_id (str) -- Execution ID
- Returns:
Pipeline Execution
- Return type:
- get_pipeline_executions(pipeline_id, last_execution=None, top=None, skip=None, count=None)
Get Pipeline Executions
- Parameters:
pipeline_id (str) -- Pipeline ID
last_execution (Optional[bool], optional) -- flag to get only the last execution, defaults to None
top (Optional[int], optional) -- number of executions to retrieve, defaults to None
skip (Optional[int], optional) -- number of executions to skip, defaults to None
count (Optional[bool], optional) -- flag to include count of total executions, defaults to None
- Returns:
Pipeline Executions
- Return type:
- get_pipeline_status(pipeline_id)
Get pipeline status by pipeline id
- Parameters:
pipeline_id (str) -- Pipeline ID
- Returns:
Status of the pipeline
- Return type:
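A status endpoint is typically polled until the pipeline reaches a final state. The following is a minimal sketch of such a loop, not part of the SDK: `wait_for_status` is a hypothetical helper, and the status strings (`"NEW"`, `"INPROGRESS"`, `"FINISHED"`, `"FAILED"`) are placeholders, since the actual values and the response shape of `get_pipeline_status` are not listed here.

```python
import time
from typing import Any, Callable, Iterable


def wait_for_status(
    get_status: Callable[[], Any],
    terminal_states: Iterable[Any],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call get_status repeatedly until it returns one of terminal_states."""
    terminal = set(terminal_states)
    for attempt in range(max_attempts):
        status = get_status()
        if status in terminal:
            return status
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")


# Hypothetical status sequence standing in for repeated client calls.
_statuses = iter(["NEW", "INPROGRESS", "FINISHED"])
result = wait_for_status(
    lambda: next(_statuses),
    terminal_states={"FINISHED", "FAILED"},
    sleep=lambda _seconds: None,  # skip real waiting in this demo
)
```

With a real client, `get_status` would wrap `client.get_pipeline_status(pipeline_id)` and pull the status value out of whatever object that call returns.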
- get_pipelines(top=None, skip=None, count=None)
Get all pipelines.
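- Parameters:
top (int | None) -- the maximum number of pipelines to return, defaults to None
skip (int | None) -- number of pipelines to skip, defaults to None
count (bool | None) -- flag to include count of total pipelines, defaults to None
- Returns:
Pipelines
- Return type: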
- search_pipelines(body)
Pipeline Search by Metadata
- Parameters:
body (SearchPipelineRequest) -- The search request object containing metadata filters.
- Returns:
Search results containing matching pipelines.
- Return type:
- trigger_pipeline(request)
Trigger Pipeline Manually
- Parameters:
request (ManualPipelineTrigger) -- The manual trigger request object.
- Returns:
Response of the trigger operation
- Return type:
requests.Response
gen_ai_hub.document_grounding.clients.retrieval_api_client module
Retrieval API client for Document Grounding.
This module provides the RetrievalAPIClient class for querying and retrieving relevant content from configured data repositories. The Retrieval API combines semantic search with repository metadata filtering and supports custom retrieval configurations for chunk/document granularity.
- Supported repository types:
Vector stores
External document sources (e.g., help.sap.com)
API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Retrieval
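The idea of combining semantic search with metadata filtering can be illustrated with a small, self-contained sketch. This is a conceptual toy, not how the Retrieval API is implemented: `cosine_similarity` and `retrieve` are hypothetical functions, the vectors are hand-written, and the `"lang"` metadata key is invented for the example.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vector: Sequence[float],
    chunks: List[dict],
    metadata_filter: Dict[str, str],
    top_k: int,
) -> List[dict]:
    """Keep chunks whose metadata matches, then rank them by similarity."""
    candidates = [
        chunk
        for chunk in chunks
        if all(chunk["metadata"].get(k) == v for k, v in metadata_filter.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda chunk: cosine_similarity(query_vector, chunk["vector"]),
        reverse=True,
    )
    return ranked[:top_k]


# Invented sample data: three chunks with 2-dimensional vectors.
chunks = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"lang": "en"}},
    {"id": "b", "vector": [0.0, 1.0], "metadata": {"lang": "en"}},
    {"id": "c", "vector": [1.0, 0.0], "metadata": {"lang": "de"}},
]
hits = retrieve([1.0, 0.1], chunks, {"lang": "en"}, top_k=2)
hit_ids = [chunk["id"] for chunk in hits]
```

Filtering before ranking keeps chunk `c` out of the results even though its vector is as close to the query as chunk `a`.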
- class RetrievalAPIClient
Bases:
object
The Retrieval API enables querying and retrieving relevant content from configured data repositories, such as vector stores or external document sources (e.g., help.sap.com).
Retrieval combines semantic search with repository metadata filtering and supports custom retrieval configurations for chunk/document granularity.
Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Retrieval
- __init__(proxy_client=None)
Initialize the RetrievalAPIClient.
- Parameters:
proxy_client (Optional[GenAIHubProxyClient], optional) -- Optional proxy client for making API requests.
- get_data_repositories(top=None, skip=None, count=None)
List all data repositories available to the tenant.
- Parameters:
top (Optional[int], optional) -- the number of items to return, defaults to None
skip (Optional[int], optional) -- the number of items to skip, defaults to None
count (Optional[bool], optional) -- whether to include a count of total items, defaults to None
- Returns:
DataRepositories model containing the list of data repositories
- Return type:
- get_data_repository_by_id(repository_id)
Get a single data repository by its unique ID.
- Parameters:
repository_id (str) -- the unique identifier of the data repository
- Returns:
DataRepository model representing the data repository
- Return type:
- search(search_input)
Perform a retrieval search for relevant content.
- Parameters:
search_input (RetrievalSearchInput) -- RetrievalSearchInput model defining the query and filters.
- Returns:
RetrievalSearchResults model containing repositories, documents, and chunks.
- Return type:
gen_ai_hub.document_grounding.clients.vector_api_client module
Vector API client for Document Grounding.
This module provides the VectorAPIClient class for managing vector-based document collections and performing semantic searches. The Vector API enables creating, retrieving, updating, and deleting collections, as well as managing documents within those collections.
- Key capabilities:
Collection management (create, read, update, delete)
Document management within collections
Semantic vector search across collections
Collection status tracking (creation/deletion)
API Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Vector
- class VectorAPIClient
Bases:
object
The Vector API provides management and search capabilities for vector-based document collections.
It enables creating, retrieving, updating, and deleting collections, as well as managing documents and performing semantic vector searches within those collections.
Reference: https://api.sap.com/api/DOCUMENT_GROUNDING_API/resource/Vector
- __init__(proxy_client=None)
Initializes the VectorAPIClient
- Parameters:
proxy_client (Optional[GenAIHubProxyClient], optional) -- Optional proxy client to use for requests
- create_collection(collection_request)
Create a new collection.
- Parameters:
collection_request (CollectionCreateRequest) -- The object containing the collection configuration.
- Returns:
Empty requests.Response with status code 202
- Return type:
requests.Response
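Because `create_collection` returns a raw `requests.Response`, callers may want to verify the status code themselves. Below is a minimal sketch of such a check. `expect_status` is a hypothetical helper, and a `SimpleNamespace` stands in for the response so the example runs without network access; it relies only on the `status_code` attribute that `requests.Response` provides.

```python
from types import SimpleNamespace


def expect_status(response, expected_status: int):
    """Return the response if its status code matches, else raise RuntimeError."""
    actual = response.status_code
    if actual != expected_status:
        raise RuntimeError(f"expected HTTP {expected_status}, got {actual}")
    return response


# Stand-in for the response returned by create_collection (hypothetical value).
accepted = SimpleNamespace(status_code=202)
checked = expect_status(accepted, 202)
```

HTTP 202 means the request was accepted for processing, not that the collection already exists, so a follow-up call to `get_collection_creation_status` is a reasonable next step.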
- create_documents(collection_id, request)
Create documents in a collection.
- Parameters:
collection_id (str) -- The ID of the collection to add documents to.
request (DocumentsCreateRequest) -- The object containing the documents to create.
- Returns:
A DocumentsListResponse object containing the created documents
- Return type:
- delete_collection(collection_id)
Delete collection by ID.
- Parameters:
collection_id (str) -- The ID of the collection to delete.
- Returns:
Empty requests.Response with status code 204
- Return type:
requests.Response
- delete_document(collection_id, document_id)
Delete a document from a collection.
- Parameters:
collection_id (str) -- The ID of the collection to delete the document from.
document_id (str) -- The ID of the document to delete.
- Returns:
Empty requests.Response with status code 204
- Return type:
requests.Response
- get_collection_by_id(collection_id)
Get collection details by ID.
- Parameters:
collection_id (str) -- The ID of the collection to retrieve.
- Returns:
A Collection object containing the collection details
- Return type:
- get_collection_creation_status(collection_id)
Get creation status for a collection.
- Parameters:
collection_id (str) -- The ID of the collection to retrieve the creation status for.
- Returns:
A CollectionCreationStatusResponse object containing the creation status
- Return type:
CollectionCreationStatusResponse
- get_collection_deletion_status(collection_id)
Get deletion status for a collection.
- Parameters:
collection_id (str) -- The ID of the collection to retrieve the deletion status for.
- Returns:
A CollectionDeletionStatusResponse object containing the deletion status
- Return type:
CollectionDeletionStatusResponse
- get_collections(top=None, skip=None, count=None)
Get all collections.
- Parameters:
top (Optional[int], optional) -- the number of collections to retrieve, defaults to None
skip (Optional[int], optional) -- the number of collections to skip, defaults to None
count (Optional[bool], optional) -- whether to include the total count of collections, defaults to None
- Returns:
A CollectionsListResponse object containing the list of collections
- Return type:
- get_document_by_id(collection_id, document_id)
Get a document by ID from a collection.
- Parameters:
collection_id (str) -- The ID of the collection to retrieve the document from.
document_id (str) -- The ID of the document to retrieve.
- Returns:
A Document object containing the document details
- Return type:
- get_documents(collection_id, top=None, skip=None, count=None)
Get documents from a collection.
- Parameters:
collection_id (str) -- The ID of the collection to retrieve documents from.
top (Optional[int], optional) -- the number of documents to retrieve, defaults to None
skip (Optional[int], optional) -- the number of documents to skip, defaults to None
count (Optional[bool], optional) -- whether to include the total count of documents, defaults to None
- Returns:
A DocumentsResponse object containing the list of documents
- Return type:
- search(request)
Perform semantic search in vector collections.
- Parameters:
request (TextSearchRequest) -- The object containing the search parameters.
- Returns:
A VectorSearchResults object containing the search results
- Return type:
- update_documents(collection_id, request)
Update documents in a collection.
- Parameters:
collection_id (str) -- The ID of the collection to update documents in.
request (DocumentsUpdateRequest) -- The object containing the documents to update.
- Returns:
A DocumentsListResponse object containing the updated documents
- Return type: