Package weka.classifiers.meta
Class RandomSubSpace
- All Implemented Interfaces:
Serializable, Cloneable, Classifier, BatchPredictor, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class RandomSubSpace
extends RandomizableParallelIteratedSingleClassifierEnhancer
implements WeightedInstancesHandler, TechnicalInformationHandler
This method constructs a decision-tree-based classifier that maintains the highest accuracy on training data and improves in generalization accuracy as it grows in complexity. The classifier consists of multiple trees, each constructed systematically on a pseudorandomly selected subset of the components of the feature vector, that is, trees constructed in randomly chosen subspaces.
For more information, see
Tin Kam Ho (1998). The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence. 20(8):832-844. URL http://citeseer.ist.psu.edu/ho98random.html. BibTeX:
@article{Ho1998,
author = {Tin Kam Ho},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
number = {8},
pages = {832-844},
title = {The Random Subspace Method for Constructing Decision Forests},
volume = {20},
year = {1998},
ISSN = {0162-8828},
URL = {http://citeseer.ist.psu.edu/ho98random.html}
}
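The subspace idea described above can be sketched in plain Java. This is an illustration only, not Weka's implementation: the class and method names (`SubspaceSketch`, `randomSubspace`, `subspaces`) are hypothetical, and drawing the subsets by shuffling attribute indices is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; hypothetical names, not part of Weka. */
final class SubspaceSketch {

    /** Picks k distinct attribute indices out of numAttributes, pseudorandomly. */
    static List<Integer> randomSubspace(int numAttributes, int k, Random rng) {
        if (k < 1 || k > numAttributes) {
            throw new IllegalArgumentException("k must be in [1, numAttributes]");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) {
            indices.add(i);
        }
        // Shuffle all indices and keep the first k of them.
        Collections.shuffle(indices, rng);
        List<Integer> chosen = new ArrayList<>(indices.subList(0, k));
        Collections.sort(chosen);
        return chosen;
    }

    /** One subspace per ensemble member, all drawn from a single seeded generator. */
    static List<List<Integer>> subspaces(int numAttributes, int k, int members, long seed) {
        Random rng = new Random(seed);
        List<List<Integer>> result = new ArrayList<>();
        for (int m = 0; m < members; m++) {
            result.add(randomSubspace(numAttributes, k, rng));
        }
        return result;
    }

    public static void main(String[] args) {
        // 10 members, each restricted to 4 of 8 attributes, with seed 1.
        for (List<Integer> subspace : subspaces(8, 4, 10, 1L)) {
            System.out.println(subspace);
        }
    }
}
```

Each ensemble member would then be trained only on the attributes in its own subspace. Because the generator is seeded, the same seed reproduces the same subspaces.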
Valid options are:
-P Size of each subspace:
   < 1: percentage of the number of attributes
   >= 1: absolute number of attributes
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
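The two readings of the -P value can be made concrete with a small calculation. The sketch below is a guess at the arithmetic, not the class's actual code: the name `attributeCount` is hypothetical, and both the rounding and the clamping to the range 1..total are assumptions.

```java
/** Illustrative sketch only; hypothetical names, not part of Weka. */
final class SubspaceSize {

    /**
     * Interprets a -P style value: below 1 it is a fraction of the
     * attributes, otherwise an absolute number of attributes.
     * Rounding and clamping to [1, totalAttributes] are assumptions.
     */
    static int attributeCount(int totalAttributes, double size) {
        double raw = (size < 1.0) ? totalAttributes * size : size;
        int k = (int) Math.round(raw);
        return Math.max(1, Math.min(totalAttributes, k));
    }

    public static void main(String[] args) {
        System.out.println(attributeCount(20, 0.5)); // prints 10
        System.out.println(attributeCount(20, 5));   // prints 5
    }
}
```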
- Version:
- $Revision: 15801 $
- Author:
- Bernhard Pfahringer (bernhard@cs.waikato.ac.nz), Peter Reutemann (fracpete@cs.waikato.ac.nz)
Field Summary
Fields inherited from class weka.classifiers.AbstractClassifier
BATCH_SIZE_DEFAULT, NUM_DECIMAL_PLACES_DEFAULT
Constructor Summary
Constructors
Constructor                Description
RandomSubSpace()           Constructor.
Method Summary
Modifier and Type          Method                                        Description
String                     batchSizeTipText()                            Tool tip text for this property.
void                       buildClassifier(Instances data)               Builds the classifier.
double[]                   distributionForInstance(Instance instance)    Calculates the class membership probabilities for the given test instance.
double[][]                 distributionsForInstances(Instances insts)    Batch scoring method.
String                     getBatchSize()                                Gets the preferred batch size from the base learner if it implements BatchPredictor.
String[]                   getOptions()                                  Gets the current settings of the Classifier.
String                     getRevision()                                 Returns the revision string.
double                     getSubSpaceSize()                             Gets the size of each subspace: a value below 1 is a percentage of the number of attributes, a value of 1 or more is an absolute number of attributes.
TechnicalInformation       getTechnicalInformation()                     Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
String                     globalInfo()                                  Returns a string describing the classifier.
boolean                    implementsMoreEfficientBatchPrediction()      Returns true if the base classifier implements BatchPredictor and is able to generate batch predictions efficiently.
Enumeration<Option>        listOptions()                                 Returns an enumeration describing the available options.
static void                main(String[] args)                           Main method for testing this class.
void                       setBatchSize(String size)                     Set the batch size to use.
void                       setOptions(String[] options)                  Parses a given list of options.
void                       setSubSpaceSize(double value)                 Sets the size of each subspace: a value below 1 is a percentage of the number of attributes, a value of 1 or more is an absolute number of attributes.
String                     subSpaceSizeTipText()                         Returns the tip text for this property.
String                     toString()                                    Returns a description of the classifier.
Methods inherited from class weka.classifiers.RandomizableParallelIteratedSingleClassifierEnhancer
getSeed, seedTipText, setSeed
Methods inherited from class weka.classifiers.ParallelIteratedSingleClassifierEnhancer
getNumExecutionSlots, numExecutionSlotsTipText, setNumExecutionSlots
Methods inherited from class weka.classifiers.IteratedSingleClassifierEnhancer
getNumIterations, numIterationsTipText, setNumIterations
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, getCapabilities, getClassifier, postExecution, preExecution, setClassifier
Methods inherited from class weka.classifiers.AbstractClassifier
classifyInstance, debugTipText, doNotCheckCapabilitiesTipText, forName, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
-
Constructor Details
-
RandomSubSpace
public RandomSubSpace()
Constructor.
-
-
Method Details
-
globalInfo
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the explorer/experimenter GUI
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class RandomizableParallelIteratedSingleClassifierEnhancer
- Returns:
- an enumeration of all the available options.
-
setOptions
Parses a given list of options. Valid options are:
-P Size of each subspace:
   < 1: percentage of the number of attributes
   >= 1: absolute number of attributes
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class RandomizableParallelIteratedSingleClassifierEnhancer
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
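Because -S and -P appear in both option groups with different meanings, the position of each flag relative to `--` decides which classifier receives it. The sketch below shows that split in plain Java. It is only an illustration of the convention described above; the helper `splitAtDoubleDash` is hypothetical and not a Weka method.

```java
import java.util.Arrays;

/** Illustrative sketch only; hypothetical names, not part of Weka. */
final class OptionSplit {

    /** Returns {ensembleOptions, baseClassifierOptions}, split at the first "--". */
    static String[][] splitAtDoubleDash(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(options, 0, i),
                    Arrays.copyOfRange(options, i + 1, options.length)
                };
            }
        }
        // No separator: everything belongs to the ensemble itself.
        return new String[][] { options.clone(), new String[0] };
    }

    public static void main(String[] args) {
        String[] options = {
            "-P", "0.5", "-S", "1", "-I", "10",
            "-W", "weka.classifiers.trees.REPTree",
            "--", "-M", "2", "-N", "3"
        };
        String[][] parts = splitAtDoubleDash(options);
        System.out.println("ensemble: " + Arrays.toString(parts[0]));
        System.out.println("base:     " + Arrays.toString(parts[1]));
    }
}
```

In this example `-P 0.5` before the separator sets the subspace size, whereas a `-P` after it would switch off pruning in REPTree.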
-
getOptions
Gets the current settings of the Classifier.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class RandomizableParallelIteratedSingleClassifierEnhancer
- Returns:
- an array of strings suitable for passing to setOptions
-
subSpaceSizeTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getSubSpaceSize
public double getSubSpaceSize()
Gets the size of each subspace: a value below 1 is a percentage of the number of attributes, a value of 1 or more is an absolute number of attributes.
- Returns:
- the subspace size
-
setSubSpaceSize
public void setSubSpaceSize(double value)
Sets the size of each subspace: a value below 1 is a percentage of the number of attributes, a value of 1 or more is an absolute number of attributes.
- Parameters:
- value - the subspace size
-
buildClassifier
Builds the classifier.
- Specified by:
- buildClassifier in interface Classifier
- Overrides:
- buildClassifier in class ParallelIteratedSingleClassifierEnhancer
- Parameters:
- data - the training data to be used for generating the classifier.
- Throws:
- Exception - if the classifier could not be built successfully
-
distributionForInstance
Calculates the class membership probabilities for the given test instance.
- Specified by:
- distributionForInstance in interface Classifier
- Overrides:
- distributionForInstance in class AbstractClassifier
- Parameters:
- instance - the instance to be classified
- Returns:
- predicted class probability distribution
- Throws:
- Exception - if distribution can't be computed successfully
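This page does not say how the ensemble members' predictions are combined into one distribution. A common choice for ensembles, assumed here purely for illustration, is to sum the members' class distributions and renormalize. The class and method names below are hypothetical.

```java
/** Illustrative sketch only; hypothetical names, not part of Weka. */
final class AverageDistributions {

    /**
     * Sums per-member class distributions and rescales the result to sum to 1.
     * Assumes at least one member and equal-length distributions.
     */
    static double[] combine(double[][] memberDistributions) {
        int numClasses = memberDistributions[0].length;
        double[] sums = new double[numClasses];
        for (double[] distribution : memberDistributions) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] += distribution[c];
            }
        }
        double total = 0.0;
        for (double s : sums) {
            total += s;
        }
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] /= total;
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        // Two members that disagree completely on a two-class problem.
        double[] combined = combine(new double[][] {{1.0, 0.0}, {0.0, 1.0}});
        System.out.println(combined[0] + " " + combined[1]); // prints 0.5 0.5
    }
}
```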
-
batchSizeTipText
Tool tip text for this property.
- Overrides:
- batchSizeTipText in class AbstractClassifier
- Returns:
- the tool tip for this property
-
setBatchSize
Sets the batch size to use. The value is passed through to the base learner if it implements BatchPredictor; otherwise it is ignored.
- Specified by:
- setBatchSize in interface BatchPredictor
- Overrides:
- setBatchSize in class AbstractClassifier
- Parameters:
- size - the batch size to use
-
getBatchSize
Gets the preferred batch size from the base learner if it implements BatchPredictor. Returns 1 as the preferred batch size otherwise.
- Specified by:
- getBatchSize in interface BatchPredictor
- Overrides:
- getBatchSize in class AbstractClassifier
- Returns:
- the batch size to use
-
distributionsForInstances
Batch scoring method. Calls the appropriate method for the base learner if it implements BatchPredictor. Otherwise it simply calls the distributionForInstance() method repeatedly.
- Specified by:
- distributionsForInstances in interface BatchPredictor
- Overrides:
- distributionsForInstances in class AbstractClassifier
- Parameters:
- insts - the instances to get predictions for
- Returns:
- an array of probability distributions, one for each instance
- Throws:
- Exception - if a problem occurs
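The fallback described above (use the batch method when the base learner offers one, otherwise loop over instances) follows a common dispatch pattern. The sketch below shows that pattern with hypothetical stand-in interfaces, `Scorer` and `BatchScorer`. These are not Weka types, and instances are simplified to `double[]` arrays.

```java
/** Illustrative sketch only; hypothetical names, not part of Weka. */
final class BatchFallback {

    /** Hypothetical stand-in for a per-instance classifier. */
    interface Scorer {
        double[] distributionFor(double[] instance);
    }

    /** Hypothetical stand-in for a classifier that can also score in batches. */
    interface BatchScorer extends Scorer {
        double[][] distributionsFor(double[][] instances);
    }

    /** Uses the batch method when available, otherwise scores one instance at a time. */
    static double[][] score(Scorer base, double[][] instances) {
        if (base instanceof BatchScorer) {
            return ((BatchScorer) base).distributionsFor(instances);
        }
        double[][] result = new double[instances.length][];
        for (int i = 0; i < instances.length; i++) {
            result[i] = base.distributionFor(instances[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy two-class scorer: the probability of class 1 is the first feature value.
        Scorer toy = x -> new double[] {1.0 - x[0], x[0]};
        double[][] out = score(toy, new double[][] {{0.25}, {0.75}});
        System.out.println(out.length + " distributions");
    }
}
```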
-
implementsMoreEfficientBatchPrediction
public boolean implementsMoreEfficientBatchPrediction()
Returns true if the base classifier implements BatchPredictor and is able to generate batch predictions efficiently.
- Specified by:
- implementsMoreEfficientBatchPrediction in interface BatchPredictor
- Overrides:
- implementsMoreEfficientBatchPrediction in class AbstractClassifier
- Returns:
- true if the base classifier can generate batch predictions efficiently
-
toString
Returns a description of the classifier.
-
getRevision
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class AbstractClassifier
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
- args - the options
-