Interface AnnotationIndex<T extends AnnotationFS>
- Type Parameters:
T - The topmost Java cover class (usually a JCas class) specified for the underlying index.
- All Superinterfaces:
Collection<T>,FSIndex<T>,Iterable<T>
- All Known Implementing Classes:
FsIndex_annotation
An annotation index provides additional iterator functionality that applies only to instances of
uima.tcas.Annotation (or its subtypes). You can obtain an AnnotationIndex by
calling:
AnnotationIndex idx = cas.getAnnotationIndex();
or
AnnotationIndex<SomeJCasType> idx = jcas.getAnnotationIndex(SomeJCasType.class);
Note that the AnnotationIndex defines the following sort order between two annotations:
- Annotations are sorted in increasing order of their start offset. That is, for any
annotations a and b, if a.start < b.start then a < b.
- Annotations whose start offsets are equal are next sorted by decreasing order of their
end offsets. That is, if a.start = b.start and a.end > b.end, then a < b. This causes annotations with larger spans to be sorted before annotations with smaller spans, which produces an iteration order similar to a preorder tree traversal.
- Annotations whose start offsets are equal and whose end offsets are equal are sorted based on
TypePriorities, if type priorities are specified. (The type priorities specification is an optional element of the component descriptor.) When type priorities are in use, if a.start = b.start, a.end = b.end, and the type of a is defined before the type of b in the type priorities, then a < b.
- If none of the above rules apply, then the ordering is arbitrary. This will occur if you have two annotations of the exact same type that also have the same span. It will also occur if you have not defined any type priority between two annotations that have the same span.
In the method descriptions below, the notation a < b, where a and
b are annotations, should be taken to mean a comes before
b in the index, according to the above rules.
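The sort rules above can be sketched as a plain Java comparator. This is only an illustrative model, not the UIMA implementation: the Span class, the rank helper, and the list-based type priorities are hypothetical stand-ins, and unlisted types are assumed to share the last rank so that ties among them remain unresolved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class AnnotationOrderSketch {

    /** Hypothetical stand-in for an annotation: offsets plus a type name. */
    static final class Span {
        final int begin;
        final int end;
        final String type;

        Span(int begin, int end, String type) {
            this.begin = begin;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + ")";
        }
    }

    /** Position of a type in the priority list; unlisted types share the last rank (assumption). */
    static int rank(List<String> typePriorities, String type) {
        int i = typePriorities.indexOf(type);
        return i >= 0 ? i : typePriorities.size();
    }

    /** Comparator following the documented rules: begin ascending, end descending, then type priority. */
    static Comparator<Span> order(List<String> typePriorities) {
        return Comparator
            .comparingInt((Span s) -> s.begin)
            .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed())
            .thenComparingInt(s -> rank(typePriorities, s.type));
    }

    /** Sorts a small sample of spans with Paragraph before Sentence before Token. */
    static List<Span> demo() {
        List<Span> spans = new ArrayList<>(Arrays.asList(
            new Span(6, 10, "Token"),
            new Span(0, 5, "Token"),
            new Span(0, 20, "Sentence"),
            new Span(0, 20, "Paragraph")));
        spans.sort(order(Arrays.asList("Paragraph", "Sentence", "Token")));
        return spans;
    }

    public static void main(String[] args) {
        // prints [Paragraph[0,20), Sentence[0,20), Token[0,5), Token[6,10)]
        System.out.println(demo());
    }
}
```

The larger spans come first among annotations that start at the same offset, which is what gives the preorder-like iteration order.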
-
Field Summary
Fields inherited from interface org.apache.uima.cas.FSIndex
BAG_INDEX, DEFAULT_BAG_INDEX, SET_INDEX, SORTED_INDEX -
Method Summary
Method and Description
- iterator(boolean ambiguous): Return an iterator over annotations that can be constrained to be unambiguous.
- subiterator(AnnotationFS annot): Return a subiterator whose bounds are defined by the input annotation.
- subiterator(AnnotationFS annot, boolean ambiguous, boolean strict): Return a subiterator whose bounds are defined by the annot.
- tree(T annot): Create an annotation tree with annot as root node.
Methods inherited from interface java.util.Collection
add, addAll, clear, contains, containsAll, equals, hashCode, isEmpty, parallelStream, remove, removeAll, removeIf, retainAll, spliterator, toArray, toArray, toArray
-
Method Details
-
iterator
Return an iterator over annotations that can be constrained to be unambiguous.
A disambiguated iterator is defined as follows. The first annotation returned is the same as would be returned by the corresponding ambiguous iterator. If the unambiguous iterator has returned a previously, it will next return the smallest b such that a < b and a.getEnd() <= b.getBegin(). In other words, the start of b will be large enough to not overlap the span of a.
An unambiguous iterator makes a snapshot copy of the index containing just the disambiguated items, and iterates over that. It doesn't check for concurrent index modifications (the ambiguous iterator does check for this).
- Parameters:
ambiguous - If set to false, the iterator will be unambiguous.
- Returns:
An annotation iterator.
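The disambiguation rule can be illustrated with a short sketch over a list that is assumed to be already in index order. The Span class and the unambiguous method are hypothetical names used only for this illustration; the real iterator works on the index itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class UnambiguousSketch {

    /** Hypothetical stand-in for an annotation: offsets plus a type name. */
    static final class Span {
        final int begin;
        final int end;
        final String type;

        Span(int begin, int end, String type) {
            this.begin = begin;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + ")";
        }
    }

    /**
     * Keeps the first span, then repeatedly keeps the next span b with
     * a.end <= b.begin, where a is the span kept most recently.
     * The input is assumed to be in index order already.
     */
    static List<Span> unambiguous(List<Span> sortedIndex) {
        List<Span> result = new ArrayList<>();
        Span last = null;
        for (Span s : sortedIndex) {
            if (last == null || last.end <= s.begin) {
                result.add(s);
                last = s;
            }
        }
        return result;
    }

    /** Sample index: two sentences with tokens nested inside them. */
    static List<Span> demo() {
        return unambiguous(Arrays.asList(
            new Span(0, 20, "Sentence"),
            new Span(0, 5, "Token"),
            new Span(6, 10, "Token"),
            new Span(21, 40, "Sentence"),
            new Span(21, 25, "Token")));
    }

    public static void main(String[] args) {
        // prints [Sentence[0,20), Sentence[21,40)]
        System.out.println(demo());
    }
}
```

Because the sentences sort before the tokens they contain, the tokens overlap the previously returned span and are dropped.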
-
subiterator
Return a subiterator whose bounds are defined by the input annotation.
The annot is used for 3 purposes:
- It is used to compute the position in the index where the iteration starts.
- It is used to compute the end point where the iterator stops when moving forward.
- It is used to specify which annotations will be skipped while iterating.
The starting position is computed by first finding a position whose annotation compares equal with the annot (this might be one of several), and then advancing until reaching a position where the annotation is not equal to the annot. If no item in the index is equal (meaning it has the same begin, the same end, and is the same type as the annot), then the iterator is positioned at the first annotation which is greater than the annot, or, if there are no annotations greater than the annot, the iterator is marked invalid.
The iterator will stop (become invalid) when
- it runs out of items in the index going forward or backwards, or
- while moving forward, it reaches a point where the annotation at that position has a start that is beyond the annot's end position, or
- while moving backwards, it reaches a position in front of its original starting position.
While iterating, it operates like a strict iterator; annotations whose end positions are > the end position of annot are skipped.
This is equivalent to returning annotations b such that annot < b and annot.getEnd() >= b.getBegin(), skipping any b whose end position is > annot.getEnd().
For annotations x, y, x < y here is to be interpreted as "x comes before y in the index", according to the rules defined in the description of this class.
This definition implies that annotations b that have the same span as annot may or may not be returned by the subiterator. This is determined by the type priorities; the subiterator will only return such an annotation b if the type of annot precedes the type of b in the type priorities definition. If you have not specified the priority, or if annot and b are of the same type, then the behavior is undefined.
For example, if you have an annotation S of type Sentence and an annotation P of type Paragraph that have the same span, and you have defined Paragraph before Sentence in your type priorities, then subiterator(P) will give you an iterator that will return S, but subiterator(S) will give you an iterator that will NOT return P. The intuition is that a Paragraph is conceptually larger than a Sentence, as defined by the type priorities.
Calling subiterator(a) is equivalent to calling subiterator(a, true, true). See subiterator(AnnotationFS, boolean, boolean).
- Parameters:
annot - Defines the boundaries of the subiterator.
- Returns:
A subiterator.
-
subiterator
Return a subiterator whose bounds are defined by the annot.
The annot is used in 2 or 3 ways.
- It specifies the left-most position in the index where the iteration starts.
- It specifies an end point where the iterator stops.
- If strict is specified, the end point also specifies which annotations will be skipped while iterating.
The starting position is computed by first finding the position whose annotation compares equal with the annot, and then advancing until reaching a position where the annotation is not equal to the annot. If no item in the index is equal (meaning it has the same begin, the same end, and is the same type as the annot), then the iterator is positioned at the first annotation which is greater than the annot, or, if there are no annotations greater than the annot, the iterator is marked invalid.
The iterator will stop (become invalid) when
- it runs out of items in the index going forward or backwards, or
- while moving forward, it reaches a point where the annotation at that position has a start that is beyond the annot's end position, or
- while moving backwards, it reaches a position in front of its original starting position.
Ignoring strict and ambiguous for a moment, this is equivalent to returning annotations b such that annot < b using the standard annotation comparator, and annot.getEnd() >= b.getBegin(), and also bounded by the index itself.
A strict subiterator skips annotations where annot.getEnd() < b.getEnd().
An ambiguous = false specification produces an unambiguous iterator, which computes a subset of the annotations, going forward, such that annotations whose begin is contained within the previously returned annotation's span are skipped.
For annotations x, y, x < y here is to be interpreted as "x comes before y in the index", according to the rules defined in the description of this class.
If strict = true then annotations whose end is > annot.getEnd() are skipped.
These definitions imply that annotations b that have the same span as annot may or may not be returned by the subiterator. This is determined by the type priorities; the subiterator will only return such an annotation b if the type of annot precedes the type of b in the type priorities definition. If you have not specified the priority, or if annot and b are of the same type, then the behavior is undefined.
For example, if you have an annotation S of type Sentence and an annotation P of type Paragraph that have the same span, and you have defined Paragraph before Sentence in your type priorities, then subiterator(P) will give you an iterator that will return S, but subiterator(S) will give you an iterator that will NOT return P. The intuition is that a Paragraph is conceptually larger than a Sentence, as defined by the type priorities.
- Parameters:
annot - Annotation setting boundary conditions for subiterator.
ambiguous - If set to false, resulting iterator will be unambiguous.
strict - Controls if annotations that overlap to the right are considered in or out.
- Returns:
A subiterator.
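The following sketch models the forward behavior described for subiterator(annot, ambiguous, strict), including the Paragraph/Sentence example. It is a simplified stand-in under stated assumptions rather than the UIMA implementation: Span, order, and subiterate are hypothetical names, type priorities are modeled as a list of type names, and backward iteration is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SubiteratorSketch {

    /** Hypothetical stand-in for an annotation: offsets plus a type name. */
    static final class Span {
        final int begin;
        final int end;
        final String type;

        Span(int begin, int end, String type) {
            this.begin = begin;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + ")";
        }
    }

    /** Begin ascending, end descending, then position in the priority list (unlisted types last). */
    static Comparator<Span> order(List<String> priorities) {
        return Comparator
            .comparingInt((Span s) -> s.begin)
            .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed())
            .thenComparingInt(s -> {
                int i = priorities.indexOf(s.type);
                return i >= 0 ? i : priorities.size();
            });
    }

    /** Collects what a forward pass of the modeled subiterator would return. */
    static List<Span> subiterate(List<Span> index, Comparator<Span> order,
                                 Span annot, boolean ambiguous, boolean strict) {
        List<Span> sorted = new ArrayList<>(index);
        sorted.sort(order);
        List<Span> out = new ArrayList<>();
        Span last = null;
        for (Span b : sorted) {
            if (order.compare(annot, b) >= 0) continue;   // keep only annot < b
            if (b.begin > annot.end) break;               // start is beyond annot's end: stop
            if (strict && b.end > annot.end) continue;    // overlaps to the right: skip
            if (!ambiguous && last != null && b.begin < last.end) continue; // overlaps previous result
            out.add(b);
            last = b;
        }
        return out;
    }

    static final Span P = new Span(0, 20, "Paragraph");
    static final Span S = new Span(0, 20, "Sentence");
    static final List<Span> INDEX = Arrays.asList(
        P, S,
        new Span(0, 5, "Token"), new Span(3, 8, "Token"),
        new Span(18, 25, "Token"), new Span(30, 35, "Token"));
    static final Comparator<Span> ORDER =
        order(Arrays.asList("Paragraph", "Sentence", "Token"));

    public static void main(String[] args) {
        System.out.println(subiterate(INDEX, ORDER, P, true, true));  // includes S
        System.out.println(subiterate(INDEX, ORDER, S, true, true));  // does not include P
        System.out.println(subiterate(INDEX, ORDER, S, true, false)); // non-strict
        System.out.println(subiterate(INDEX, ORDER, S, false, true)); // unambiguous
    }
}
```

In this model, the same-span Sentence is returned for the Paragraph only because Paragraph precedes Sentence in the assumed priority list.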
-
tree
Create an annotation tree with annot as root node. The tree is defined as follows: for each node in the tree, the children are the sequence of annotations that would be obtained from a strict, unambiguous subiterator of the node's annotation.
- Parameters:
annot - The annotation at the root of the tree. This must be of type T or a subtype.
- Returns:
The annotation tree rooted at annot.
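The tree definition can be sketched recursively: each node's children are whatever a strict, unambiguous subiterator would return for that node. The code below is a minimal model under stated assumptions, not the UIMA AnnotationTree: Span, Node, strictUnambiguous, and tree are hypothetical names, and the comparator uses only the begin and end rules (no type priorities).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class AnnotationTreeSketch {

    /** Hypothetical stand-in for an annotation: offsets plus a type name. */
    static final class Span {
        final int begin;
        final int end;
        final String type;

        Span(int begin, int end, String type) {
            this.begin = begin;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + begin + "," + end + ")";
        }
    }

    /** Hypothetical tree node: a span plus its ordered children. */
    static final class Node {
        final Span span;
        final List<Node> children = new ArrayList<>();

        Node(Span span) {
            this.span = span;
        }
    }

    /** Begin ascending, then end descending; type priorities are left out of this model. */
    static final Comparator<Span> ORDER = Comparator
        .comparingInt((Span s) -> s.begin)
        .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed());

    /** Forward results of a modeled strict, unambiguous subiterator over a sorted list. */
    static List<Span> strictUnambiguous(List<Span> sorted, Span annot) {
        List<Span> out = new ArrayList<>();
        Span last = null;
        for (Span b : sorted) {
            if (ORDER.compare(annot, b) >= 0) continue;            // keep only annot < b
            if (b.begin > annot.end) break;                        // past the end of annot
            if (b.end > annot.end) continue;                       // strict: skip right overlaps
            if (last != null && b.begin < last.end) continue;      // unambiguous: skip overlaps
            out.add(b);
            last = b;
        }
        return out;
    }

    /** Builds the tree rooted at annot by recursing into each child. */
    static Node tree(List<Span> sorted, Span annot) {
        Node node = new Node(annot);
        for (Span child : strictUnambiguous(sorted, annot)) {
            node.children.add(tree(sorted, child));
        }
        return node;
    }

    /** Prints one node per line, indented by depth. */
    static void print(Node node, String indent) {
        System.out.println(indent + node.span);
        for (Node child : node.children) {
            print(child, indent + "  ");
        }
    }

    static Node demo() {
        List<Span> index = new ArrayList<>(Arrays.asList(
            new Span(0, 3, "Token"), new Span(4, 9, "Token"),
            new Span(10, 14, "Token"), new Span(15, 20, "Token"),
            new Span(0, 9, "NP"), new Span(10, 20, "VP"),
            new Span(0, 20, "Sentence")));
        index.sort(ORDER);
        return tree(index, index.get(0)); // the Sentence sorts first
    }

    public static void main(String[] args) {
        print(demo(), "");
    }
}
```

In this sample the Sentence has the NP and VP as children, and each of those has two Token children; the tokens are leaves.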
-