Interface CollectionProcessingEngine
CollectionProcessingEngine (CPE) processes a collection of artifacts (for text
analysis applications, this will be a collection of documents) and produces collection-level
results.
A CPE consists of a CollectionReader, zero or more
AnalysisEngines and zero or more
CasConsumers. The Collection Reader is responsible for reading
artifacts from a collection and setting up the CAS. The AnalysisEngines analyze each CAS and the
results are passed on to the CAS Consumers. CAS Consumers perform analysis over multiple CASes
and generally produce collection-level results in some application-specific data structure.
Processing is started by calling the process() method. Processing can be controlled via
the pause(), resume(), and stop() methods.
Listeners can register with the CPE by calling the
addStatusCallbackListener(StatusCallbackListener) method. These listeners receive status
callbacks during the processing. At any time, performance and progress reports are available from
the getPerformanceReport() and getProgress() methods.
A CPE implementation may choose to implement parallelization of the processing, but this is not a requirement of the architecture.
Note that a CPE only supports processing one collection at a time. Attempting to start a new processing job while a previous processing job is running will result in an exception. Processing multiple collections simultaneously is done by instantiating and configuring multiple instances of the CPE.
A CollectionProcessingEngine instance can be obtained by calling
UIMAFramework.produceCollectionProcessingEngine(CpeDescription).
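The lifecycle rules described above (one job at a time, with pause, resume, and stop only valid in certain states) can be modeled as a small state machine. The following is a minimal sketch of those rules, not the UIMA implementation: ToyCpeLifecycle is a hypothetical class, and it throws java.lang.IllegalStateException where the real interface documents UIMA_IllegalStateException, purely so the example compiles with the JDK alone.

```java
// Toy model of the documented lifecycle rules. NOT the UIMA implementation:
// the class name and the use of java.lang.IllegalStateException are assumptions.
final class ToyCpeLifecycle {
    private boolean processing;
    private boolean paused;

    // Starts a job; rejected while a previous job is still active.
    synchronized void process() {
        if (processing) {
            throw new IllegalStateException("already processing");
        }
        processing = true;
        paused = false;
    }

    // Valid only while a job is active.
    synchronized void pause() {
        if (!processing) {
            throw new IllegalStateException("no processing is occurring");
        }
        paused = true;
    }

    // Valid only while the job is paused.
    synchronized void resume() {
        if (!paused) {
            throw new IllegalStateException("processing is not paused");
        }
        paused = false;
    }

    // Valid only while a job is active; frees the engine for a new job.
    synchronized void stop() {
        if (!processing) {
            throw new IllegalStateException("no processing is occurring");
        }
        processing = false;
        paused = false;
    }

    synchronized boolean isProcessing() {
        return processing;
    }

    synchronized boolean isPaused() {
        return paused;
    }

    public static void main(String[] args) {
        ToyCpeLifecycle cpe = new ToyCpeLifecycle();
        cpe.process();
        cpe.pause();
        // A paused job still counts as processing.
        System.out.println("processing while paused: " + cpe.isProcessing());
        cpe.resume();
        try {
            cpe.process(); // second job while the first is active
        } catch (IllegalStateException e) {
            System.out.println("second process() rejected: " + e.getMessage());
        }
        cpe.stop();
        System.out.println("processing after stop: " + cpe.isProcessing());
    }
}
```

To process several collections at once under these rules, a caller would create one engine instance per collection rather than reuse a busy one.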
-
Method Summary
Modifier and Type | Method | Description
void | addStatusCallbackListener(StatusCallbackListener aListener) | Registers a listener to receive status callbacks.
CasProcessor[] | getCasProcessors() | Gets the CasProcessors in this CPE, in the order in which they will be executed.
BaseCollectionReader | getCollectionReader() | Gets the Collection Reader for this CPE.
ProcessTrace | getPerformanceReport() | Gets a performance report for the processing that is currently occurring or has just completed.
Progress[] | getProgress() | Gets a progress report for the processing that is currently occurring or has just completed.
void | initialize(CpeDescription aCpeDescription, Map<String, Object> aAdditionalParams) | Initializes this CPE from a cpeDescription. Applications do not need to call this method.
boolean | isPaused() | Determines whether this CPE's processing is currently paused.
boolean | isProcessing() | Determines whether this CPE is currently processing.
void | kill() | Kill CPM hard.
void | pause() | Pauses processing.
void | process() | Initiates processing of a collection.
void | removeStatusCallbackListener(StatusCallbackListener aListener) | Unregisters a status callback listener.
void | resume() | Resumes processing that has been paused.
void | stop() | Stops processing.
-
Method Details
-
initialize
void initialize(CpeDescription aCpeDescription, Map<String, Object> aAdditionalParams) throws ResourceInitializationException
Initializes this CPE from a cpeDescription. Applications do not need to call this method. It is called automatically by the framework and cannot be called a second time.
Parameters:
aCpeDescription - CPE description, generally parsed from an XML file
aAdditionalParams - a Map containing additional parameters. May be null if there are no parameters. Each class that implements this interface can decide what additional parameters it supports.
Throws:
ResourceInitializationException - if a failure occurs during initialization.
UIMA_IllegalStateException - if this method is called more than once on a single instance.
-
addStatusCallbackListener
void addStatusCallbackListener(StatusCallbackListener aListener)
Registers a listener to receive status callbacks.
Parameters:
aListener - the listener to add
-
removeStatusCallbackListener
void removeStatusCallbackListener(StatusCallbackListener aListener)
Unregisters a status callback listener.
Parameters:
aListener - the listener to remove
-
process
void process() throws ResourceInitializationException
Initiates processing of a collection. This method starts the processing in another thread and returns immediately. Status of the processing can be obtained by registering a listener with the addStatusCallbackListener(StatusCallbackListener) method.
A CPE can only process one collection at a time. If this method is called while a previous processing request has not yet completed, a UIMA_IllegalStateException will result. To find out whether a CPE is free to begin another processing request, call the isProcessing() method.
Throws:
ResourceInitializationException - if an error occurs during initialization
UIMA_IllegalStateException - if this CPE is currently processing
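Because process() returns immediately, a caller that needs to block until the collection is finished typically waits on a signal raised from a status callback. The sketch below illustrates that pattern with JDK classes only. ToyAsyncCpe and ToyStatusListener are hypothetical stand-ins written for this example, not UIMA types, and the single collectionProcessComplete() callback is an assumed simplification of the real listener interface.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a status listener (assumption: one completion callback).
interface ToyStatusListener {
    void collectionProcessComplete();
}

// Hypothetical stand-in for a CPE whose process() call is asynchronous.
final class ToyAsyncCpe {
    private final List<ToyStatusListener> listeners = new CopyOnWriteArrayList<>();
    private final AtomicBoolean processing = new AtomicBoolean(false);

    void addStatusCallbackListener(ToyStatusListener listener) {
        listeners.add(listener);
    }

    boolean isProcessing() {
        return processing.get();
    }

    // Starts the work on another thread and returns immediately.
    void process() {
        if (!processing.compareAndSet(false, true)) {
            throw new IllegalStateException("already processing");
        }
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50); // simulated collection processing
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                processing.set(false);
                for (ToyStatusListener l : listeners) {
                    l.collectionProcessComplete();
                }
            }
        });
        worker.start();
    }
}

final class WaitForCompletionDemo {
    // Returns true if the completion callback fired within the timeout.
    static boolean runAndWait(long timeoutMillis) {
        ToyAsyncCpe cpe = new ToyAsyncCpe();
        CountDownLatch done = new CountDownLatch(1);
        cpe.addStatusCallbackListener(done::countDown); // register before starting
        cpe.process();
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("completed in time: " + runAndWait(5000));
    }
}
```

Registering the listener before calling process() avoids missing a completion callback that fires very quickly.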
-
isProcessing
boolean isProcessing()
Determines whether this CPE is currently processing. This means that a processing request has been submitted and has not yet completed or been stop()ped. If processing is paused, this method will still return true.
Returns:
true if and only if this CPE is currently processing.
-
pause
void pause()
Pauses processing. Processing can later be resumed by calling the resume() method.
Throws:
UIMA_IllegalStateException - if no processing is currently occurring
-
isPaused
boolean isPaused()
Determines whether this CPE's processing is currently paused.
Returns:
true if and only if this CPE's processing is currently paused.
-
resume
void resume()
Resumes processing that has been paused.
Throws:
UIMA_IllegalStateException - if processing is not currently paused
-
stop
void stop()
Stops processing.
Throws:
UIMA_IllegalStateException - if no processing is currently occurring
-
getPerformanceReport
ProcessTrace getPerformanceReport()
Gets a performance report for the processing that is currently occurring or has just completed.
Returns:
an object containing performance statistics
-
getProgress
Progress[] getProgress()
Gets a progress report for the processing that is currently occurring or has just completed.
Returns:
an array of Progress objects, each of which represents the progress in a different set of units (for example number of entities or bytes)
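Since each array element reports progress in its own unit, a caller usually formats the entries one by one. The sketch below shows one way to do that. ToyProgress is a hypothetical stand-in for a single progress entry, and its fields (completed, total, unit) and the convention that a non-positive total means "unknown" are assumptions made for illustration. Consult the Progress interface for the actual accessors.

```java
// Hypothetical stand-in for one progress entry (assumed fields, not the UIMA type).
final class ToyProgress {
    final long completed;
    final long total; // assumption: a value <= 0 means the total is unknown
    final String unit;

    ToyProgress(long completed, long total, String unit) {
        this.completed = completed;
        this.total = total;
        this.unit = unit;
    }
}

final class ProgressFormatter {
    // Renders one entry as "completed/total unit (NN%)", or without a
    // percentage when the total is unknown.
    static String describe(ToyProgress p) {
        if (p.total <= 0) {
            return p.completed + " " + p.unit + " (total unknown)";
        }
        long percent = (100 * p.completed) / p.total;
        return p.completed + "/" + p.total + " " + p.unit + " (" + percent + "%)";
    }

    public static void main(String[] args) {
        ToyProgress[] report = {
            new ToyProgress(25, 100, "entities"),
            new ToyProgress(4096, -1, "bytes")
        };
        for (ToyProgress p : report) {
            System.out.println(describe(p));
        }
    }
}
```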
-
getCollectionReader
BaseCollectionReader getCollectionReader()
Gets the Collection Reader for this CPE.
Returns:
the collection reader
-
getCasProcessors
CasProcessor[] getCasProcessors()
Gets the CasProcessors in this CPE, in the order in which they will be executed.
Returns:
an array of CasProcessors
-
kill
void kill()
Kill CPM hard.
-