public interface IXCrawlerParameters
Copyright (c) SAP AG 2003
Modifier and Type | Interface and Description |
---|---|
static class | IXCrawlerParameters.LogLevel: Log levels for crawler log files. |
static class | IXCrawlerParameters.ModificationCheckMode: Modes for checking whether a resource was modified. |
Modifier and Type | Method and Description |
---|---|
String | getConfigurableName(): Get the name of the configurable from which the CrawlerParameters were created (may be null). |
boolean | getCrawlHidden(): Check whether hidden resources are included in the crawl. |
boolean | getCrawlSystem(): Check whether system resources are included in the crawl. |
boolean | getCrawlVariants(): Check whether variants of resources are included in the crawl. |
boolean | getCrawlVersions(): Check whether versions of resources are included in the crawl. |
String | getDescription(): Get the description of the parameter set. |
long | getDocumentTimeoutInSeconds(): Get the document timeout in seconds. |
int | getErrorCacheCapacity(): Get the capacity of the cache for the error-set. |
IPropertyName | getExcludedHrefPropertyName(): Get the name of the property holding those HREFs of a web-repository resource that are restricted by robot-rules. |
int | getFilteredCacheCapacity(): Get the capacity of the cache for the filtered-set. |
boolean | getFindAllDocsInDepth(): Deprecated. Not used anymore (always returns true). |
int | getFinishedCacheCapacity(): Get the capacity of the cache for the finished-set. |
boolean | getFollowLinks(): Check whether links are followed. |
boolean | getFollowRedirects(): Check whether redirects on web-sites are followed. |
int | getFoundCacheCapacity(): Get the capacity of the cache for the found-set. |
IPropertyName | getHrefPropertyName(): Get the name of the property which holds the HREFs of a resource from a web-repository. |
String | getLogFilePath(): Get the path to the crawler log file. |
int | getMaxBacklogFiles(): Get the maximum number of old crawler log files. |
int | getMaxDepth(): Get the maximum depth of the crawl process (0 is unlimited). |
long | getMaxLogFileSizeInBytes(): Get the maximum size of the crawler log file in bytes. |
IXCrawlerParameters.LogLevel | getMaxLogLevel(): Get the maximum log level. |
IXCrawlerParameters.ModificationCheckMode | getModificationCheckMode(): Get the mode for checking whether a resource was modified. |
int | getOldCacheCapacity(): Get the capacity of the cache for the old-set. |
int | getPostprocessedCacheCapacity(): Get the capacity of the cache for the postprocessed-set. |
int | getPostprocessingCacheCapacity(): Get the capacity of the cache for the postprocessing-set. |
int | getProviderCount(): Get the number of provider threads. |
int | getProvidingCacheCapacity(): Get the capacity of the cache for the providing-set. |
long | getRequestDelayInMilliseconds(): Get the number of milliseconds each crawler thread waits after retrieving a resource from a repository, which reduces the load on the underlying persistence (e.g. database) or channel (e.g. network). |
boolean | getRespectNoFollow(): Check whether the http://sapportals.com/xmlns/cm/follow-links property should be respected. Added in 7.X. |
boolean | getRespectNoIndex(): Check whether the http://sapportals.com/xmlns/cm/index-content property should be respected. |
boolean | getRespectRobots(): Check whether the robot-rules of web-servers are respected. |
IResourceFilter[] | getResultFilters(): Get the resource filters which are applied to the result of the crawl but do not narrow the scope. |
int | getRetrieverCount(): Get the number of retriever threads. |
int | getRetrievingCacheCapacity(): Get the capacity of the cache for the retrieving-set. |
IResourceFilter[] | getScopeFilters(): Get the resource filters which narrow the scope of the crawl. |
long | getSleepDistanceInMilliseconds(): Deprecated. Not used anymore (always returns 0). |
long | getSleepDurationInMilliseconds(): Deprecated. Not used anymore (always returns 0). |
boolean | getTest(): Check whether the crawler runs in test-mode (no passing of results to the result receivers). |
int | getTodoCacheCapacity(): Get the capacity of the cache for the todo-set. |
boolean | getUseACL(): Check whether the ACL version number is used to determine whether a resource has changed. |
boolean | getUseChecksum(): Check whether a checksum is used to determine whether a resource has changed. |
boolean | getUseETag(): Check whether the ETag is used to determine whether a resource has changed. |
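
As a rough illustration of how a caller might read two of these getters, the sketch below interprets getMaxDepth() (where 0 is documented as unlimited) and getRequestDelayInMilliseconds(). The SAP KM classes are not reproduced here, so CrawlSettings is a hypothetical stand-in interface that copies only those two method names. The helpers withinDepth, pauseAfterRetrieval and fixed are invented for the example, and treating a positive maximum depth as an inclusive limit is an assumption, not something this reference states.

```java
// Hypothetical stand-in for two IXCrawlerParameters getters; the real
// interface belongs to the SAP KM API and is not reproduced here.
interface CrawlSettings {
    int getMaxDepth();                     // 0 is documented as "unlimited"
    long getRequestDelayInMilliseconds();  // wait after each retrieval
}

final class CrawlSettingsSketch {
    private CrawlSettingsSketch() {}

    // Assumption: a positive maximum depth is an inclusive limit.
    static boolean withinDepth(CrawlSettings settings, int depth) {
        int maxDepth = settings.getMaxDepth();
        return maxDepth == 0 || depth <= maxDepth;
    }

    // Sleeps for the configured delay; a non-positive delay means no pause.
    static void pauseAfterRetrieval(CrawlSettings settings) throws InterruptedException {
        long delay = settings.getRequestDelayInMilliseconds();
        if (delay > 0) {
            Thread.sleep(delay);
        }
    }

    // Illustrative factory returning settings with fixed values.
    static CrawlSettings fixed(final int maxDepth, final long delayMillis) {
        return new CrawlSettings() {
            @Override public int getMaxDepth() { return maxDepth; }
            @Override public long getRequestDelayInMilliseconds() { return delayMillis; }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        CrawlSettings limited = fixed(3, 10L);
        System.out.println("depth 3 allowed: " + withinDepth(limited, 3));
        System.out.println("depth 4 allowed: " + withinDepth(limited, 4));
        pauseAfterRetrieval(limited); // waits 10 ms in this example
    }
}
```

Keeping the "0 means unlimited" rule in one helper avoids repeating the special case wherever a depth is compared.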
String getConfigurableName()
String getDescription()
int getMaxDepth()
int getRetrieverCount()
int getProviderCount()
boolean getUseChecksum()
boolean getUseETag()
boolean getUseACL()
boolean getFollowLinks()
boolean getFollowRedirects()
boolean getCrawlVersions()
boolean getCrawlVariants()
boolean getCrawlHidden()
boolean getCrawlSystem()
IXCrawlerParameters.ModificationCheckMode getModificationCheckMode()
long getRequestDelayInMilliseconds()
boolean getFindAllDocsInDepth()
boolean getRespectRobots()
boolean getRespectNoIndex()
boolean getRespectNoFollow()
boolean getTest()
IResourceFilter[] getScopeFilters()
IResourceFilter[] getResultFilters()
IPropertyName getHrefPropertyName()
IPropertyName getExcludedHrefPropertyName()
int getTodoCacheCapacity()
int getRetrievingCacheCapacity()
int getFoundCacheCapacity()
int getProvidingCacheCapacity()
int getFinishedCacheCapacity()
int getOldCacheCapacity()
int getPostprocessingCacheCapacity()
int getPostprocessedCacheCapacity()
int getErrorCacheCapacity()
int getFilteredCacheCapacity()
long getSleepDistanceInMilliseconds()
long getSleepDurationInMilliseconds()
long getMaxLogFileSizeInBytes()
int getMaxBacklogFiles()
String getLogFilePath()
IXCrawlerParameters.LogLevel getMaxLogLevel()
long getDocumentTimeoutInSeconds()
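
The three change-detection getters listed above (getUseChecksum, getUseETag, getUseACL) can be read together to see which signals are enabled. The following is a minimal sketch, assuming a hypothetical ChangeDetectionFlags interface that mirrors only those three method names. The helper enabledSignals, the factory of, and the label strings are inventions for illustration; how the real crawler combines these signals is not stated in this reference.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for three IXCrawlerParameters getters; not the SAP type.
interface ChangeDetectionFlags {
    boolean getUseChecksum();
    boolean getUseETag();
    boolean getUseACL();
}

final class ChangeDetectionSketch {
    private ChangeDetectionSketch() {}

    // Lists the enabled change-detection signals in a fixed order.
    static List<String> enabledSignals(ChangeDetectionFlags flags) {
        List<String> signals = new ArrayList<String>();
        if (flags.getUseChecksum()) {
            signals.add("checksum");
        }
        if (flags.getUseETag()) {
            signals.add("ETag");
        }
        if (flags.getUseACL()) {
            signals.add("ACL version number");
        }
        return signals;
    }

    // Illustrative factory returning flags with fixed values.
    static ChangeDetectionFlags of(final boolean checksum, final boolean etag, final boolean acl) {
        return new ChangeDetectionFlags() {
            @Override public boolean getUseChecksum() { return checksum; }
            @Override public boolean getUseETag() { return etag; }
            @Override public boolean getUseACL() { return acl; }
        };
    }

    public static void main(String[] args) {
        System.out.println(enabledSignals(of(true, false, true)));
    }
}
```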
Access Rights |
---|
SC | DC | Public Part | ACH |
---|---|---|---|
[sap.com] KMC-CM | [sap.com] tc/km/frwk | api | EP-KM-CM |
[sap.com] KMC-WPC | [sap.com] tc/kmc/wpc/wpcfacade | api | EP-PIN-WPC-WCM |
Copyright 2018 SAP AG