Module - Document Information Extraction Activities

A set of activities for the Document Information Extraction and Business OCR services.

Author:
  • SAP Intelligent RPA R&D team

Activities

Extract Data (Template) Deprecated

Extract data with the Document Information Extraction service using the chosen document template and the given PDF file.


Status | Substitute Activity
Deprecated | irpa_sapdox.dox.extractDataWithTemplateDetection

Comment:

This activity is deprecated. Use the substitute activity irpa_sapdox.dox.extractDataWithTemplateDetection instead, which supports automatic template detection.



Technical Name | Type | Minimal Agent Version
extractData | synchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
templateArtifact | any | mandatory | | An object representing a dox template artifact.
readOnlyDataType | any | mandatory | | An object representing response JSON schema.
documentPath | string | mandatory | | Path to the document.
filterSenderEnrichmentBy | irpa_sapdox.enums.entitySubType | optional | None | Filter sender business entity by this sub-type.
filterReceiverEnrichmentBy | irpa_sapdox.enums.entitySubType | optional | None | Filter receiver business entity by this sub-type.

Output Parameters:

Name | Type | Description
extractedData | any | Object based on readOnlyDataType containing extracted information from the document.

Errors:

Error Class | Package | Description
InvalidArgument | irpa_core | Invalid document path


Extract Data (Template)

Extract data with the Document Information Extraction service using the chosen schema or document template and given file.


Technical Name | Type | Minimal Agent Version
extractDataWithTemplateDetection | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
schemaUid | string | mandatory | | UUID of the selected schema.
isDetectMode | boolean | mandatory | | Parameter to enable the automatic detection.
templateArtifact | any | mandatory | | An object representing a dox template artifact.
readOnlyDataType | any | mandatory | | An object representing response JSON schema.
documentPath | string | mandatory | | Path to the document.
filterSenderEnrichmentBy | irpa_sapdox.enums.entitySubType | optional | None | Filter sender business entity by this sub-type.
filterReceiverEnrichmentBy | irpa_sapdox.enums.entitySubType | optional | None | Filter receiver business entity by this sub-type.

Output Parameters:

Name | Type | Description
extractedData | any | The data extracted by the Document Information Extraction service.

Errors:

Error Class | Package | Description
InvalidArgument | irpa_core | Invalid document path
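
The mandatory inputs listed for extractDataWithTemplateDetection can be checked before the activity is invoked. The following is a minimal sketch, not SDK code: the helper `missingExtractParams` and the sample values are hypothetical, and only the parameter names are taken from the input table.

```javascript
// Hypothetical helper, not part of irpa_sapdox: lists the mandatory inputs of
// extractDataWithTemplateDetection that are missing from a parameter object.
const MANDATORY_PARAMS = [
  "schemaUid",
  "isDetectMode",
  "templateArtifact",
  "readOnlyDataType",
  "documentPath",
];

function missingExtractParams(params) {
  // A parameter counts as missing when it is undefined or null.
  // `false` is a valid value for isDetectMode and must not be flagged.
  return MANDATORY_PARAMS.filter(
    (name) => params[name] === undefined || params[name] === null
  );
}

// Sample values are placeholders for illustration only.
const params = {
  schemaUid: "00000000-0000-0000-0000-000000000000",
  isDetectMode: false,
  templateArtifact: {},
  readOnlyDataType: {},
  // documentPath intentionally omitted
};

console.log(missingExtractParams(params)); // documentPath is reported as missing
```

Checking for undefined or null, rather than for falsy values, keeps a deliberate `isDetectMode: false` from being treated as missing.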


Extract Data (Pre-trained Model)

Extract data with the Document Information Extraction service using pre-trained models for different document types.


Technical Name | Type | Minimal Agent Version
extractDataWithoutTemplate | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
documentType | irpa_sapdox.enums.doxDocumentType | mandatory | | Type of document to extract.
documentPath | string | mandatory | | Path to the document.
filterSenderEnrichmentBy | irpa_sapdox.enums.entitySubType | optional | None | Filter sender business entity by this sub-type.
filterReceiverEnrichmentBy | irpa_sapdox.enums.entitySubType | optional | None | Filter receiver business entity by this sub-type.

Output Parameters:

Name | Type | Description
extractedData | any | The data extracted by the Document Information Extraction service. The extracted data is returned as an object based on the previously defined 'readOnlyDataType' parameter.

Errors:

Error Class | Package | Description
InvalidArgument | irpa_core | Invalid document path


Open Document (Online OCR)

Extract the content of an image or PDF document using the OCR provided by the Document Information Extraction service. Once the document is open, other PDF activities can be used on it.


Technical Name | Type | Minimal Agent Version
doxOCR | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
documentPath | string | mandatory | | Full path of the existing document.

Errors:

Error Class | Package | Description
SequenceError | irpa_core | Another PDF file is already open
InvalidArgument | irpa_core | Invalid document path
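
The two errors listed for doxOCR suggest that only one document can be open at a time. A minimal sketch of caller-side bookkeeping follows. `DocumentSession` is a hypothetical class that only mirrors the documented error conditions; it does not call the SDK, and its `close` method is an assumption about how a flow would record that the document has been released.

```javascript
// Hypothetical caller-side bookkeeping; it does not call irpa_sapdox.
class DocumentSession {
  constructor() {
    this.openPath = null; // path of the currently open document, if any
  }

  open(documentPath) {
    if (typeof documentPath !== "string" || documentPath.trim() === "") {
      // Mirrors the documented InvalidArgument case.
      throw new Error("InvalidArgument: Invalid document path");
    }
    if (this.openPath !== null) {
      // Mirrors the documented SequenceError case.
      throw new Error("SequenceError: Another PDF file is already open");
    }
    this.openPath = documentPath;
  }

  close() {
    // Assumed step: record that the open document has been released.
    this.openPath = null;
  }
}

const session = new DocumentSession();
session.open("invoice.pdf");
try {
  session.open("receipt.pdf"); // second open without a close in between
} catch (err) {
  console.log(err.message);
}
```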


Create Employee Entity Enrichment Data

Create employee entity master data in the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.createEmployeeEntityEnrichmentData | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
employeeDataEntities | Array.<irpa_sapdox.employeeEntity> | mandatory | | List of Employee entity objects.

Output Parameters:

Name | Type | Description
requestId | any | Request ID of the created record in the Document Information Extraction service.


Create Business Entity Enrichment Data

Create business entity master data in the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.createBusinessEntityEnrichmentData | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
businessDataEntities | Array.<irpa_sapdox.businessEntity> | mandatory | | List of Business entity objects.
assignEntitySubtype | irpa_sapdox.enums.entitySubType | optional | None | Create enrichment business entity of this sub-type.

Output Parameters:

Name | Type | Description
requestId | any | Request ID of the created record in the Document Information Extraction service.


Create Product Entity Enrichment Data

Activity to create product entity master data in the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.createProductEntityEnrichmentData | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
productDataEntities | Array.<irpa_sapdox.productEntity> | mandatory | | List of Product entity objects.

Output Parameters:

Name | Type | Description
requestId | string | Request ID of the created record in the Document Information Extraction service.


Activate Master Data

Activate master data in the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.activateMasterData | asynchronous | WIN-2.0.0 (WIN for Windows)

Output Parameters:

Name | Type | Description
requestId | any | Request ID of the activation job in the Document Information Extraction service.


Delete Master Data

Delete all master data from the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.deleteAllMasterData | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
entityType | irpa_sapdox.enums.entityType | mandatory | | Entity type for which the master data needs to be deleted.
entitySubType | irpa_sapdox.enums.entitySubType | optional | None | Filter deletion of business entity by this sub-type.

Output Parameters:

Name | Type | Description
requestId | any | Request ID of the deletion job in the Document Information Extraction service.


Delete Master Data Records

Activity to delete single or multiple master data records from the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.deleteMasterDataRecord | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
recordIds | Array. | mandatory | | IDs of the records to delete.
entityType | irpa_sapdox.enums.entityType | mandatory | | Entity type for which the master data needs to be deleted.
entitySubType | irpa_sapdox.enums.entitySubType | optional | None | Filter deletion of business entity by this sub-type.

Output Parameters:

Name | Type | Description
deletedRecords | any | Number of records deleted from the Document Information Extraction service.

Errors:

Error Class | Package | Description
InvalidArgument | irpa_core | Invalid record IDs
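
Since deleteMasterDataRecord reports invalid record IDs as an InvalidArgument error, it can help to check the list before calling the activity. The sketch below assumes that record IDs are non-empty strings, which the table does not state; `invalidRecordIds` is a hypothetical helper and not part of the SDK.

```javascript
// Hypothetical pre-check, not part of irpa_sapdox.
// Assumption: record IDs are non-empty strings.
function invalidRecordIds(recordIds) {
  if (!Array.isArray(recordIds) || recordIds.length === 0) {
    throw new TypeError("recordIds must be a non-empty array");
  }
  // Return the entries that are not non-empty strings.
  return recordIds.filter((id) => typeof id !== "string" || id.trim() === "");
}

// Two of the four sample entries fail the check: the empty string and the number.
const rejected = invalidRecordIds(["rec-001", "", 42, "rec-002"]);
console.log(rejected.length); // 2
```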


Get Enrichment Data

Retrieve one or more enrichment data entities from the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.getEnrichmentData | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
entityType | irpa_sapdox.enums.entityType | mandatory | | Entity type of the enrichment data.
filterEntityEnrichmentBy | irpa_sapdox.enums.entitySubType | optional | None | Filter business entity data by this sub-type.

Output Parameters:

Name | Type | Description
enrichmentData | any | List of records from the Document Information Extraction service, filtered by entity type and sub-type.


Get Enrichment Data Status (Creation/Deletion)

Retrieve the creation or deletion status of a master data record from the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.getEnrichmentDataCreationOrDeletionStatus | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
requestId | string | mandatory | | Job ID of the record.

Output Parameters:

Name | Type | Description
status | any | Status of the creation or deletion job in the Document Information Extraction service.

Errors:

Error Class | Package | Description
InvalidArgument | irpa_core | Invalid job ID.
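
Because the creation and deletion activities return a request ID rather than a final result, a caller may need to query this activity repeatedly until the job settles. The sketch below shows such a loop under stated assumptions: `waitForJob` is a hypothetical helper, `getStatus` stands in for a call to getEnrichmentDataCreationOrDeletionStatus, and the status strings `PENDING`, `SUCCESS`, and `FAILED` are assumed, since the activity documents its output only as type any.

```javascript
// Hypothetical polling helper. `getStatus` is any async function that takes a
// request ID and resolves to a status string; the status values are assumed.
async function waitForJob(getStatus, requestId, options = {}) {
  const { intervalMs = 1000, maxAttempts = 10 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus(requestId);
    if (status !== "PENDING") {
      return status; // job settled (for example SUCCESS or FAILED)
    }
    if (attempt < maxAttempts) {
      // Wait before asking again.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job ${requestId} still pending after ${maxAttempts} attempts`);
}

// Stub standing in for the real activity: pending twice, then successful.
function makeStubStatus() {
  let calls = 0;
  return async () => (++calls < 3 ? "PENDING" : "SUCCESS");
}

waitForJob(makeStubStatus(), "req-1", { intervalMs: 10 }).then((status) => {
  console.log(status);
});
```

Capping the number of attempts keeps a flow from waiting indefinitely if a job never leaves the pending state. Suitable values for the interval and the cap depend on the volume of master data.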


Get Data Activation Details

Retrieve information on a master data activation record from the Document Information Extraction service.


Technical Name | Type | Minimal Agent Version
enrichment.getDataActivationDetails | asynchronous | WIN-2.0.0 (WIN for Windows)

Input Parameters:

Name | Type | Attributes | Default | Description
requestId | string | mandatory | | Data activation job record ID.

Output Parameters:

Name | Type | Description
status | any | Status of the activation job in the Document Information Extraction service.

Errors:

Error Class | Package | Description
InvalidArgument | irpa_core | Invalid ID.