public class CompoundCharacterTokenizer extends Object
| Constructor and Description |
|---|
| CompoundCharacterTokenizer(Pattern pattern) |
| CompoundCharacterTokenizer(Set<String> compoundWords) - Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| List<String> | tokenize(String text) - Tokenize a string into tokens. |
public CompoundCharacterTokenizer(Set<String> compoundWords)
It is assumed the compound words are sorted in descending order of length.
compoundWords - A set of strings like _79_99_, _80_99_ or _92_99_.
public CompoundCharacterTokenizer(Pattern pattern)
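The length-ordering assumption stated for the Set<String> constructor can be illustrated with a small standalone sketch. This is not the class's actual implementation. It assumes the compound words are joined into a single regex alternation, where Java's regex engine tries alternatives from left to right and takes the first one that matches. The names CompoundTokenizerSketch and buildPattern, and the word _79_99_80_, are hypothetical and used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; not the library's implementation.
class CompoundTokenizerSketch {

    // Hypothetical helper: joins the words into one alternation, in iteration order.
    static Pattern buildPattern(Set<String> compoundWords) {
        List<String> quoted = new ArrayList<>();
        for (String word : compoundWords) {
            quoted.add(Pattern.quote(word)); // treat each word as a literal
        }
        return Pattern.compile(String.join("|", quoted));
    }

    // Splits text into matched compound words and the unmatched text between them.
    static List<String> tokenize(Pattern pattern, String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        int last = 0;
        while (matcher.find()) {
            if (matcher.start() > last) {
                tokens.add(text.substring(last, matcher.start()));
            }
            tokens.add(matcher.group());
            last = matcher.end();
        }
        if (last < text.length()) {
            tokens.add(text.substring(last));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // LinkedHashSet keeps insertion order, so the longest word is tried first.
        Set<String> longestFirst = new LinkedHashSet<>();
        longestFirst.add("_79_99_80_");
        longestFirst.add("_79_99_");
        System.out.println(tokenize(buildPattern(longestFirst), "_12_79_99_80_13_"));

        // Shortest first: the shorter word matches and hides the longer one.
        Set<String> shortestFirst = new LinkedHashSet<>();
        shortestFirst.add("_79_99_");
        shortestFirst.add("_79_99_80_");
        System.out.println(tokenize(buildPattern(shortestFirst), "_12_79_99_80_13_"));
    }
}
```

In this sketch the first call prints [_12, _79_99_80_, 13_] and the second prints [_12, _79_99_, 80_13_]: when a shorter word that is a prefix of a longer one comes first, the longer word is never matched. A plain HashSet does not guarantee iteration order, which is why the sketch uses a LinkedHashSet.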
Copyright © 2008–2025 The Apache Software Foundation. All rights reserved.