gen_ai_hub.document_grounding.models package

Models subpackage for Document Grounding API.

This subpackage contains Pydantic model definitions for all Document Grounding API requests and responses. Models are organized by API domain:

  • pipeline: Models for Pipeline API (document vectorization pipelines)
  • retrieval: Models for Retrieval API (content retrieval from repositories)
  • vector: Models for Vector API (vector collection management and search)

These models provide type-safe data structures for the Document Grounding APIs and validate request and response data when it is parsed.

Submodules

gen_ai_hub.document_grounding.models.pipeline module

Pydantic models for Pipeline API.

This module defines data models for the Pipeline API, which manages document vectorization pipelines from various data sources (Microsoft SharePoint, AWS S3, SFTP).

Model categories:

  • Pipeline configuration models (create/get requests and responses)
  • Pipeline execution models (tracking pipeline runs)
  • Document models (tracking document processing status)
  • Search and metadata models (filtering pipelines by metadata)
  • Trigger models (manual pipeline execution)

All models use Pydantic for validation and serialization.

class BasePipelineResponse

Bases: BaseModel

id: str
metadata: MetaData | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: str
class CommonConfiguration

Bases: BaseModel

destination: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class DataRepositoryMetadataItem

Bases: BaseModel

key: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

value: List[str]
class Document

Bases: BaseModel

absoluteUrl: str | None
createdTimestamp: datetime | None
downloadLocation: str | None
id: str
lastUpdatedTimestamp: datetime | None
metadataId: str | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: DocumentStatus | None
title: str | None
viewLocation: str | None
class DocumentStatus

Bases: str, Enum

__new__(value)
DEINDEXED = 'DEINDEXED'
FAILED = 'FAILED'
FAILED_TO_BE_RETRIED = 'FAILED_TO_BE_RETRIED'
INDEXED = 'INDEXED'
REINDEXED = 'REINDEXED'
TO_BE_PROCESSED = 'TO_BE_PROCESSED'
TO_BE_SCHEDULED = 'TO_BE_SCHEDULED'
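
Because DocumentStatus derives from both str and Enum, its members compare equal to their plain string values, which helps when a status arrives as raw JSON text. The sketch below re-declares the enum locally so that it runs without the SDK; in real code it would be imported from this module. The FAILURE_STATUSES grouping and the needs_attention helper are illustrative assumptions, not part of the package.

```python
from enum import Enum


# Local copy of the members listed above, so the sketch runs on its own.
class DocumentStatus(str, Enum):
    DEINDEXED = "DEINDEXED"
    FAILED = "FAILED"
    FAILED_TO_BE_RETRIED = "FAILED_TO_BE_RETRIED"
    INDEXED = "INDEXED"
    REINDEXED = "REINDEXED"
    TO_BE_PROCESSED = "TO_BE_PROCESSED"
    TO_BE_SCHEDULED = "TO_BE_SCHEDULED"


# Hypothetical grouping: statuses a caller might want to flag for review.
FAILURE_STATUSES = {DocumentStatus.FAILED, DocumentStatus.FAILED_TO_BE_RETRIED}


def needs_attention(raw_status: str) -> bool:
    """Return True if a raw status string maps to one of the failure statuses."""
    return DocumentStatus(raw_status) in FAILURE_STATUSES


# Lookup by value returns the enum member; the str mixin keeps it comparable to text.
status = DocumentStatus("INDEXED")
print(status is DocumentStatus.INDEXED)
print(status == "INDEXED")
print(needs_attention("FAILED_TO_BE_RETRIED"))
```

Passing a string that is not one of the listed values to DocumentStatus(...) raises ValueError, so unknown statuses surface immediately rather than being silently accepted.
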
class DocumentsStatusResponse

Bases: BaseModel

count: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resources: List[Document]
class GetPipelineExecutionsResponse

Bases: BaseModel

count: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resources: List[PipelineExecution]
class GetPipelineStatusResponse

Bases: BaseModel

lastStarted: str | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: str | None
class GetPipelinesResponse

Bases: BaseModel

count: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resources: List[Annotated[MSSharePointPipelineGetResponse | S3PipelineGetResponse | SFTPPipelineGetResponse, FieldInfo(annotation=NoneType, required=True, discriminator='type')]]
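
The discriminator='type' entry in the annotation above means that the value of each resource's type field decides which of the three response models it is parsed into. As a rough illustration of that selection, here is a hand-written sketch that uses stdlib dataclass stand-ins instead of the real Pydantic models. The stand-in classes, the parse_pipeline helper, and the sample ids are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type


@dataclass
class PipelineStandIn:
    """Simplified stand-in for BasePipelineResponse (id and type only)."""
    id: str
    type: str


class MSSharePointStandIn(PipelineStandIn):
    pass


class S3StandIn(PipelineStandIn):
    pass


class SFTPStandIn(PipelineStandIn):
    pass


# Maps each documented Literal value of `type` to its stand-in class.
BY_TYPE: Dict[str, Type[PipelineStandIn]] = {
    "MSSharePoint": MSSharePointStandIn,
    "S3": S3StandIn,
    "SFTP": SFTPStandIn,
}


def parse_pipeline(raw: Dict[str, Any]) -> PipelineStandIn:
    """Pick the class from the `type` tag and reject unknown or missing tags."""
    try:
        cls = BY_TYPE[raw["type"]]
    except KeyError:
        raise ValueError(f"unknown pipeline type: {raw.get('type')!r}")
    return cls(id=raw["id"], type=raw["type"])


resources = [{"id": "p-1", "type": "S3"}, {"id": "p-2", "type": "SFTP"}]
parsed = [parse_pipeline(r) for r in resources]
print([type(p).__name__ for p in parsed])
```

With the real models, Pydantic performs this dispatch during validation, so callers can branch on isinstance checks or on the type attribute of each item in resources.
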
class MSSharePointConfiguration

Bases: BaseModel

destination: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

sharePoint: SharePointConfig
class MSSharePointConfigurationGetResponse

Bases: BaseModel

destination: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

sharePoint: SharePointConfig
class MSSharePointPipelineCreateRequest

Bases: BaseModel

configuration: MSSharePointConfiguration
metadata: MetaData | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal['MSSharePoint']
class MSSharePointPipelineGetResponse

Bases: BasePipelineResponse

configuration: MSSharePointConfigurationGetResponse
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal['MSSharePoint']
class ManualPipelineTrigger

Bases: BaseModel

metadataOnly: bool | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pipelineId: str
class MetaData

Bases: BaseModel

destination: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class PipelineExecution

Bases: BaseModel

createdAt: datetime | None
id: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

modifiedAt: datetime | None
status: PipelineExecutionStatus | None
class PipelineExecutionStatus

Bases: str, Enum

__new__(value)
FINISHED = 'FINISHED'
FINISHED_WITH_ERRORS = 'FINISHEDWITHERRORS'
INPROGRESS = 'INPROGRESS'
NEW = 'NEW'
TIMEOUT = 'TIMEOUT'
UNKNOWN = 'UNKNOWN'
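
One detail in the listing above is easy to miss: the member name FINISHED_WITH_ERRORS contains underscores, while its value 'FINISHEDWITHERRORS' does not. Lookup by value and lookup by name therefore use different spellings. The sketch below re-declares the enum locally so it runs without the SDK; in real code it would be imported from this module.

```python
from enum import Enum


# Local copy of the members listed above, so the sketch runs on its own.
class PipelineExecutionStatus(str, Enum):
    FINISHED = "FINISHED"
    FINISHED_WITH_ERRORS = "FINISHEDWITHERRORS"
    INPROGRESS = "INPROGRESS"
    NEW = "NEW"
    TIMEOUT = "TIMEOUT"
    UNKNOWN = "UNKNOWN"


# Lookup by value uses the string without underscores.
by_value = PipelineExecutionStatus("FINISHEDWITHERRORS")

# Lookup by name uses the Python member name, with underscores.
by_name = PipelineExecutionStatus["FINISHED_WITH_ERRORS"]

print(by_value is by_name)

# The member name is not a valid value, so this lookup raises ValueError.
try:
    PipelineExecutionStatus("FINISHED_WITH_ERRORS")
except ValueError as exc:
    print(f"rejected: {exc}")
```
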
class PipelineIdResponse

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pipelineId: str
class S3PipelineCreateRequest

Bases: BaseModel

configuration: CommonConfiguration
metadata: MetaData | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal['S3']
class S3PipelineGetResponse

Bases: BasePipelineResponse

configuration: CommonConfiguration
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal['S3']
class SFTPPipelineCreateRequest

Bases: BaseModel

configuration: CommonConfiguration
metadata: MetaData | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal['SFTP']
class SFTPPipelineGetResponse

Bases: BasePipelineResponse

configuration: CommonConfiguration
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal['SFTP']
class SearchPipelineData

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pipelineId: str
class SearchPipelineRequest

Bases: BaseModel

dataRepositoryMetadata: List[DataRepositoryMetadataItem]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class SearchPipelinesResponse

Bases: BaseModel

count: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resources: List[SearchPipelineData]
class SharePointConfig

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

site: SharePointSite
class SharePointSite

Bases: BaseModel

includePaths: List[str] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str
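
The SharePoint create request nests four of the models above: MSSharePointPipelineCreateRequest holds an MSSharePointConfiguration, which holds a SharePointConfig, which holds a SharePointSite. The sketch below builds the equivalent structure as plain dictionaries to make the nesting visible. It assumes the serialized field names match the attribute names shown in this listing (no aliases), and every value is a placeholder.

```python
import json

# Placeholder values; the key names follow the attributes documented above.

# SharePointSite: required name, optional includePaths.
site = {"name": "example-site", "includePaths": ["/Shared Documents"]}

# SharePointConfig: wraps the site.
share_point = {"site": site}

# MSSharePointConfiguration: destination plus the SharePoint settings.
configuration = {"destination": "example-destination", "sharePoint": share_point}

# MSSharePointPipelineCreateRequest: the type tag plus the configuration.
# The optional metadata field is omitted here.
request_body = {"type": "MSSharePoint", "configuration": configuration}

print(json.dumps(request_body, indent=2))
```

The S3 and SFTP create requests follow the same outer shape, but their configuration is a CommonConfiguration, which carries only destination.
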

gen_ai_hub.document_grounding.models.retrieval module

Pydantic models for Retrieval API.

This module defines data models for the Retrieval API, which enables querying and retrieving relevant content from configured data repositories (vector stores and external document sources).

Model categories:

  • Data repository models (repository information and metadata)
  • Chunk and document models (content structure)
  • Search filter and configuration models (query parameters)
  • Search input and result models (request/response structures)

The Retrieval API supports semantic search combined with metadata filtering for precise content retrieval across multiple repository types.

class DataRepositories

Bases: BaseModel

count: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resources: List[DataRepository]
class DataRepository

Bases: BaseModel

id: str
metadata: List[RetrievalKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

title: str
type: Literal['vector', 'help.sap.com'] | str
class DataRepositoryWithDocuments

Bases: BaseModel

documents: List[RetrievalDocument]
id: str
metadata: List[RetrievalKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

title: str
class RetrievalChunk

Bases: BaseModel

content: str
id: str
metadata: List[RetrievalKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class RetrievalDataRepositorySearchResult

Bases: BaseModel

dataRepository: DataRepositoryWithDocuments
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class RetrievalDocument

Bases: BaseModel

chunks: List[RetrievalChunk]
id: str
metadata: List[RetrievalDocumentKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class RetrievalDocumentKeyValueListPair

Bases: RetrievalKeyValueListPair

matchMode: str | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class RetrievalKeyValueListPair

Bases: BaseModel

key: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

value: List[str]
class RetrievalPerFilterSearchResult

Bases: BaseModel

filterId: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

results: List[RetrievalDataRepositorySearchResult]
class RetrievalPerFilterSearchResultError

Bases: BaseModel

message: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class RetrievalPerFilterSearchResultWithError

Bases: BaseModel

error: RetrievalPerFilterSearchResultError
filterId: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class RetrievalSearchConfiguration

Bases: BaseModel

maxChunkCount: int | None
maxDocumentCount: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class RetrievalSearchDocumentKeyValueListPair

Bases: BaseModel

key: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

selectMode: List[str] | None
value: List[str]
class RetrievalSearchFilter

Bases: BaseModel

chunkMetadata: List[RetrievalKeyValueListPair] | None
dataRepositories: List[str] | None
dataRepositoryMetadata: List[RetrievalKeyValueListPair] | None
dataRepositoryType: Literal['vector', 'help.sap.com'] | str
documentMetadata: List[RetrievalSearchDocumentKeyValueListPair] | None
id: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

searchConfiguration: RetrievalSearchConfiguration | None
class RetrievalSearchInput

Bases: BaseModel

filters: List[RetrievalSearchFilter]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

query: str
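
A RetrievalSearchInput pairs one query string with a list of RetrievalSearchFilter entries. The sketch below assembles that shape from plain dictionaries, assuming the serialized field names match the attribute names listed above. The build_filter helper, the repository id, and the query text are hypothetical.

```python
import json
from typing import Any, Dict, List, Optional


def build_filter(filter_id: str, repository_ids: List[str],
                 max_chunks: Optional[int] = None) -> Dict[str, Any]:
    """Build one filter dict using the RetrievalSearchFilter field names."""
    search_filter: Dict[str, Any] = {
        "id": filter_id,
        "dataRepositoryType": "vector",
        "dataRepositories": repository_ids,
    }
    if max_chunks is not None:
        # Optional fields are simply left out when they are not set.
        search_filter["searchConfiguration"] = {"maxChunkCount": max_chunks}
    return search_filter


search_input = {
    "query": "How are invoices approved?",
    "filters": [build_filter("filter-1", ["example-repository-id"], max_chunks=5)],
}
print(json.dumps(search_input, indent=2))
```

Each filter carries its own id, which is the value the per-filter results refer back to through filterId.
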
class RetrievalSearchResults

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

results: List[RetrievalPerFilterSearchResult | RetrievalPerFilterSearchResultWithError]
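
Each entry in results is either a successful per-filter result or a per-filter error, and successful entries nest repository, document, and chunk levels. The sketch below walks a hand-written payload of that shape using plain dictionaries. The sample data and the iter_chunks and collect_errors helpers are made up for illustration, and the sketch assumes the serialized keys match the attribute names in this listing.

```python
from typing import Any, Dict, Iterator, Tuple

# Hand-written sample shaped like RetrievalSearchResults; all values are made up.
sample: Dict[str, Any] = {
    "results": [
        {
            "filterId": "filter-1",
            "results": [
                {
                    "dataRepository": {
                        "id": "repo-1",
                        "title": "Example repository",
                        "documents": [
                            {
                                "id": "doc-1",
                                "chunks": [
                                    {"id": "c-1", "content": "first chunk"},
                                    {"id": "c-2", "content": "second chunk"},
                                ],
                            }
                        ],
                    }
                }
            ],
        },
        # Shaped like RetrievalPerFilterSearchResultWithError.
        {"filterId": "filter-2", "error": {"message": "example error message"}},
    ]
}


def iter_chunks(payload: Dict[str, Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (filterId, document id, chunk content) for successful filters."""
    for per_filter in payload["results"]:
        if "error" in per_filter:
            continue  # error entries carry no results list
        for repo_result in per_filter["results"]:
            for doc in repo_result["dataRepository"]["documents"]:
                for chunk in doc["chunks"]:
                    yield per_filter["filterId"], doc["id"], chunk["content"]


def collect_errors(payload: Dict[str, Any]) -> Dict[str, str]:
    """Map filterId to error message for the filters that failed."""
    return {
        pf["filterId"]: pf["error"]["message"]
        for pf in payload["results"]
        if "error" in pf
    }


for filter_id, doc_id, content in iter_chunks(sample):
    print(filter_id, doc_id, content)
print(collect_errors(sample))
```

With the parsed Pydantic models, the same distinction can be made with isinstance checks against RetrievalPerFilterSearchResultWithError instead of testing for an error key.
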

gen_ai_hub.document_grounding.models.vector module

Pydantic models for Vector API.

This module defines data models for the Vector API, which provides management and search capabilities for vector-based document collections.

Model categories:

  • Collection models (collection configuration and management)
  • Document and chunk models (content structure with embeddings)
  • Embedding configuration models (embedding model settings)
  • Search models (semantic search requests and results)
  • Status models (collection creation/deletion tracking)

The Vector API enables semantic search across document collections using vector embeddings for similarity-based retrieval.

class BaseDocument

Bases: BaseModel

chunks: List[TextOnlyBaseChunk]
metadata: List[VectorKeyValueListPair]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class Collection

Bases: BaseModel

embeddingConfig: EmbeddingConfig
id: str
metadata: List[VectorKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

title: str | None
class CollectionCreateRequest

Bases: BaseModel

embeddingConfig: EmbeddingConfig
metadata: List[VectorKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

title: str | None
class CollectionCreatedResponse

Bases: BaseModel

collectionURL: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: Literal['CREATED']
class CollectionDeletedResponse

Bases: BaseModel

collectionURL: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: Literal['DELETED']
class CollectionPendingResponse

Bases: BaseModel

Location: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: Literal['PENDING']
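
The three status responses above differ in one field name: the created and deleted responses expose collectionURL, while the pending response exposes Location with a capital L. The sketch below reads the right field based on status, using plain dictionaries and assuming the serialized keys match the attribute names shown. The status_target helper and the sample values are hypothetical.

```python
from typing import Any, Dict


def status_target(response: Dict[str, Any]) -> str:
    """Return the URL-like field carried by a collection status response."""
    status = response["status"]
    if status == "PENDING":
        # The pending response exposes `Location` (capital L), not `collectionURL`.
        return response["Location"]
    if status in ("CREATED", "DELETED"):
        return response["collectionURL"]
    raise ValueError(f"unexpected status: {status!r}")


# Placeholder payloads; the paths are made up for illustration.
pending = {"status": "PENDING", "Location": "/example/status/123"}
created = {"status": "CREATED", "collectionURL": "/example/collections/123"}

print(status_target(pending))
print(status_target(created))
```
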
class CollectionsListResponse

Bases: BaseModel

count: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resources: List[Collection]
class Document

Bases: BaseDocument

id: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class DocumentOutput

Bases: BaseModel

chunks: List[VectorChunk]
id: str
metadata: List[VectorKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class DocumentWithoutChunks

Bases: BaseModel

id: str
metadata: List[VectorKeyValueListPair]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class DocumentsChunk

Bases: BaseModel

documents: List[DocumentOutput]
id: str
metadata: List[VectorKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

title: str
class DocumentsCreateRequest

Bases: BaseModel

documents: List[BaseDocument]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class DocumentsListResponse

Bases: BaseModel

documents: List[DocumentWithoutChunks]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class DocumentsResponse

Bases: BaseModel

count: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

resources: List[DocumentWithoutChunks]
class DocumentsUpdateRequest

Bases: BaseModel

documents: List[Document]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class EmbeddingConfig

Bases: BaseModel

modelName: str | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class TextOnlyBaseChunk

Bases: BaseModel

content: str
metadata: List[VectorKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class TextSearchRequest

Bases: BaseModel

filters: List[VectorSearchFilter]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

query: str
class VectorChunk

Bases: BaseModel

content: str
id: str
metadata: List[VectorKeyValueListPair] | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class VectorKeyValueListPair

Bases: BaseModel

key: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

value: List[str]
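
Metadata throughout the Vector API is expressed as a list of key/value-list pairs rather than as a mapping. The sketch below converts between a plain dictionary and that list form using dictionaries shaped like VectorKeyValueListPair. The to_pairs and from_pairs helpers and the sample keys are hypothetical conveniences, not part of the package.

```python
from typing import Any, Dict, List


def to_pairs(metadata: Dict[str, List[str]]) -> List[Dict[str, Any]]:
    """Turn {'key': [values]} into a list of {'key': ..., 'value': [...]} items."""
    return [{"key": key, "value": list(values)} for key, values in metadata.items()]


def from_pairs(pairs: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Collapse key/value-list pairs into a dict, merging repeated keys."""
    merged: Dict[str, List[str]] = {}
    for pair in pairs:
        merged.setdefault(pair["key"], []).extend(pair["value"])
    return merged


pairs = to_pairs({"language": ["en"], "topic": ["billing", "invoices"]})
print(pairs)
print(from_pairs(pairs))
```

The same key/value shape appears in DataRepositoryMetadataItem and RetrievalKeyValueListPair, so the helpers would apply there as well under the same assumption about field names.
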
class VectorPerFilterSearchResult

Bases: BaseModel

filterId: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

results: List[DocumentsChunk]
class VectorSearchConfiguration

Bases: BaseModel

maxChunkCount: int | None
maxDocumentCount: int | None
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class VectorSearchDocumentKeyValueListPair

Bases: BaseModel

key: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

selectMode: List[str] | None
value: List[str]
class VectorSearchFilter

Bases: BaseModel

chunkMetadata: List[VectorKeyValueListPair] | None
collectionIds: List[str]
collectionMetadata: List[VectorKeyValueListPair] | None
configuration: VectorSearchConfiguration
documentMetadata: List[VectorSearchDocumentKeyValueListPair] | None
id: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class VectorSearchResults

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

results: List[VectorPerFilterSearchResult]