Package weka.core.tokenizers
Class AlphabeticTokenizer
java.lang.Object
weka.core.tokenizers.Tokenizer
weka.core.tokenizers.AlphabeticTokenizer
- All Implemented Interfaces:
Serializable, Enumeration<String>, OptionHandler, RevisionHandler
Alphabetic string tokenizer. Tokens are formed only from
contiguous alphabetic sequences.
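To make "contiguous alphabetic sequences" concrete, here is a minimal standalone sketch. It is not the Weka implementation: the class name AlphabeticSketch and the method tokens are hypothetical, and it assumes that "alphabetic" means the ASCII letters a-z and A-Z. Every maximal run of letters becomes one token; digits, punctuation, and whitespace only act as separators.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in, not Weka code: illustrates the token definition only.
final class AlphabeticSketch {

    // Assumption: "alphabetic" means the ASCII letters a-z and A-Z.
    private static final Pattern LETTER_RUN = Pattern.compile("[A-Za-z]+");

    // Collects every maximal run of letters; all other characters separate tokens.
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = LETTER_RUN.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // The apostrophe and the digits split or drop parts of the input.
        System.out.println(tokens("Don't panic: 42 towels!"));
        // prints [Don, t, panic, towels]
    }
}
```

Under this assumption, a word containing an apostrophe or a digit is split at that character, and purely numeric input yields no tokens at all.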
- Version:
- $Revision: 10203 $
- Author:
- Ashraf M. Kibriya (amk14@cs.waikato.ac.nz), FracPete (fracpete at waikato dot ac dot nz)
Constructor Summary
- AlphabeticTokenizer()
Method Summary
Modifier and Type, Method, Description
- String getRevision(): Returns the revision string.
- String globalInfo(): Returns a string describing the tokenizer.
- boolean hasMoreElements(): Returns whether there are more elements.
- static void main(String[] args): Runs the tokenizer with the given options and strings to tokenize.
- String nextElement(): Returns the next element.
- void tokenize(String): Sets the string to tokenize.
Methods inherited from class weka.core.tokenizers.Tokenizer
getOptions, listOptions, runTokenizer, setOptions, tokenize
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.util.Enumeration
asIterator
Constructor Details
AlphabeticTokenizer
public AlphabeticTokenizer()
Method Details
globalInfo
Returns a string describing the tokenizer.
- Specified by:
- globalInfo in class Tokenizer
- Returns:
- a description suitable for displaying in the Explorer/Experimenter GUI
hasMoreElements
public boolean hasMoreElements()
Returns whether there are more elements.
- Specified by:
- hasMoreElements in interface Enumeration<String>
- Specified by:
- hasMoreElements in class Tokenizer
- Returns:
- true if there are still more elements
nextElement
Returns the next element.
- Specified by:
- nextElement in interface Enumeration<String>
- Specified by:
- nextElement in class Tokenizer
- Returns:
- the next element
tokenize
Sets the string to tokenize. Tokenization happens immediately.
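The three methods above form a small protocol: call tokenize once with the input, then poll hasMoreElements and pull tokens with nextElement. The sketch below mirrors that call sequence with a hypothetical stand-in class, EnumerationSketch, so that it runs without Weka on the classpath. It is an illustration under assumptions, not the library's code: it treats only ASCII letters as alphabetic, scans lazily rather than tokenizing everything up front, and throws NoSuchElementException when exhausted, which the documentation above does not specify.

```java
import java.util.Enumeration;
import java.util.NoSuchElementException;

// Hypothetical stand-in that mirrors the documented call sequence; not the Weka class.
final class EnumerationSketch implements Enumeration<String> {
    private char[] chars = new char[0];
    private int pos = 0;

    // Assumption: only ASCII letters count as alphabetic.
    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Sets the string to tokenize and rewinds the cursor.
    public void tokenize(String s) {
        chars = s.toCharArray();
        pos = 0;
    }

    // Skips separator characters, then reports whether a letter remains.
    @Override
    public boolean hasMoreElements() {
        while (pos < chars.length && !isLetter(chars[pos])) {
            pos++;
        }
        return pos < chars.length;
    }

    // Returns the next maximal run of letters.
    @Override
    public String nextElement() {
        if (!hasMoreElements()) {
            throw new NoSuchElementException(); // this sketch's choice
        }
        int start = pos;
        while (pos < chars.length && isLetter(chars[pos])) {
            pos++;
        }
        return new String(chars, start, pos - start);
    }

    public static void main(String[] args) {
        EnumerationSketch t = new EnumerationSketch();
        t.tokenize("one, two; 3 three");
        while (t.hasMoreElements()) {
            System.out.println(t.nextElement()); // prints one, two, three on separate lines
        }
    }
}
```

With Weka available, the same loop should apply to AlphabeticTokenizer itself: construct it with the no-argument constructor, call tokenize with the input string, and iterate until hasMoreElements returns false.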
getRevision
Returns the revision string.
- Returns:
- the revision
main
Runs the tokenizer with the given options and strings to tokenize. The tokens are printed to stdout.
- Parameters:
- args - the command-line options and strings to tokenize