Package weka.attributeSelection
Class ClassifierSubsetEval
java.lang.Object
weka.attributeSelection.ASEvaluation
weka.attributeSelection.HoldOutSubsetEvaluator
weka.attributeSelection.ClassifierSubsetEval
- All Implemented Interfaces:
Serializable, ErrorBasedMeritEvaluator, SubsetEvaluator, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler
public class ClassifierSubsetEval
extends HoldOutSubsetEvaluator
implements OptionHandler, ErrorBasedMeritEvaluator
Classifier subset evaluator:
Evaluates attribute subsets on training data or a separate hold out testing set. Uses a classifier to estimate the 'merit' of a set of attributes.
Valid options are:
-B <classifier> Class name of the classifier to use for accuracy estimation. Place any classifier options LAST on the command line, following a "--". E.g.: -B weka.classifiers.bayes.NaiveBayes ... -- -K (default: weka.classifiers.rules.ZeroR)
-T Use the training data to estimate accuracy.
-H <filename> Name of the hold out/test set to estimate accuracy on.
-percentage-split Perform a percentage split on the training data. Use in conjunction with -T.
-P Split percentage to use (default = 90).
-S Random seed for percentage split (default = 1).
-E <DEFAULT|ACC|RMSE|MAE|F-MEAS|AUC|AUPRC|CORR-COEFF> Performance evaluation measure to use for selecting attributes. (Default = default: accuracy for discrete class and rmse for numeric class)
-IRclass <label | index> Optional class value (label or 1-based index) to use in conjunction with IR statistics (f-meas, auc or auprc). Omitting this option will use the class-weighted average.
Options specific to scheme weka.classifiers.rules.ZeroR:
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
-batch-size The desired batch size for batch prediction (default 100).
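The "--" convention described for -B can be illustrated with plain JDK code. The following is a minimal sketch, not Weka's own option parser: the class and method names (OptionSplitSketch, evaluatorOptions, classifierOptions) are hypothetical, and it assumes that everything after the first "--" token belongs to the base classifier.

```java
import java.util.Arrays;

/** Illustrative only: splits a command line at the first "--" token. */
final class OptionSplitSketch {

    /** Options before the first "--" (assumed to belong to the evaluator). */
    static String[] evaluatorOptions(String[] args) {
        int sep = Arrays.asList(args).indexOf("--");
        return sep < 0 ? args.clone() : Arrays.copyOfRange(args, 0, sep);
    }

    /** Options after the first "--" (assumed to belong to the classifier); empty if there is no "--". */
    static String[] classifierOptions(String[] args) {
        int sep = Arrays.asList(args).indexOf("--");
        return sep < 0 ? new String[0] : Arrays.copyOfRange(args, sep + 1, args.length);
    }

    public static void main(String[] argv) {
        String[] args = {"-B", "weka.classifiers.bayes.NaiveBayes", "-T", "--", "-K"};
        System.out.println("evaluator:  " + Arrays.toString(evaluatorOptions(args)));
        System.out.println("classifier: " + Arrays.toString(classifierOptions(args)));
    }
}
```

In this sketch -T stays with the evaluator while -K is passed on to the classifier, which is why the classifier options have to come last.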
- Version:
- $Revision: 10332 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
Field Summary
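Fields
Modifier and type, field, description:
static final int EVAL_ACCURACY
static final int EVAL_AUC
static final int EVAL_AUPRC
static final int EVAL_CORRELATION
static final int EVAL_DEFAULT
static final int EVAL_FMEASURE
static final int EVAL_MAE
static final int EVAL_PLUGIN
static final int EVAL_RMSE
static final Tag[] TAGS_EVALUATION: Holds all tags for metrics
-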
Constructor Summary
Constructors
ClassifierSubsetEval()
-
Method Summary
Modifier and type, method, description:
void buildEvaluator(Instances data): Generates an attribute evaluator.
classifierTipText: Returns the tip text for this property.
double evaluateSubset(BitSet subset): Evaluates a subset of attributes.
double evaluateSubset(BitSet subset, Instance holdOut, boolean retrain): Evaluates a subset of attributes with respect to a single instance.
double evaluateSubset(BitSet subset, Instances holdOut): Evaluates a subset of attributes with respect to a set of instances.
evaluationMeasureTipText: Returns the tip text for this property.
getCapabilities: Returns the capabilities of this evaluator.
getClassifier: Get the classifier used as the base learner.
getEvaluationMeasure: Gets the currently set performance evaluation measure used for selecting attributes.
getHoldOutFile: Gets the file that holds hold out/test instances.
getIRClassValue: Get the class value (label or index) to use with IR metric evaluation of subsets.
String[] getOptions: Gets the current settings of ClassifierSubsetEval.
getRevision: Returns the revision string.
int getSeed(): Get the random seed used to randomize the data before performing a percentage split.
getSplitPercent: Get the split percentage to use.
boolean getUsePercentageSplit: Get whether to perform a percentage split on the training data for evaluation.
boolean getUseTraining: Get if training data is to be used instead of hold out/test data.
globalInfo: Returns a string describing this attribute evaluator.
holdOutFileTipText: Returns the tip text for this property.
IRClassValueTipText: Returns the tip text for this property.
listOptions: Returns an enumeration describing the available options.
static void main: Main method for testing this class.
seedTipText: Returns the tip text for this property.
void setClassifier(Classifier newClassifier): Set the classifier to use for accuracy estimation.
void setEvaluationMeasure(SelectedTag newMethod): Sets the performance evaluation measure to use for selecting attributes.
void setHoldOutFile: Set the file that contains hold out/test instances.
void setIRClassValue(String val): Set the class value (label or index) to use with IR metric evaluation of subsets.
void setOptions(String[] options): Parses a given list of options.
void setSeed(int s): Set the random seed used to randomize the data before performing a percentage split.
void setSplitPercent: Set the split percentage to use.
void setUsePercentageSplit(boolean p): Set whether to perform a percentage split on the training data for evaluation.
void setUseTraining(boolean t): Set if training data is to be used instead of hold out/test data.
splitPercentTipText: Returns the tip text for this property.
toString(): Returns a string describing classifierSubsetEval.
usePercentageSplitTipText: Returns the tip text for this property.
useTrainingTipText: Returns the tip text for this property.
Methods inherited from class weka.attributeSelection.ASEvaluation
clean, doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, postExecution, postProcess, preExecution, run, runEvaluator, setDoNotCheckCapabilities
-
Field Details
-
EVAL_DEFAULT
public static final int EVAL_DEFAULT
-
EVAL_ACCURACY
public static final int EVAL_ACCURACY
-
EVAL_RMSE
public static final int EVAL_RMSE
-
EVAL_MAE
public static final int EVAL_MAE
-
EVAL_FMEASURE
public static final int EVAL_FMEASURE
-
EVAL_AUC
public static final int EVAL_AUC
-
EVAL_AUPRC
public static final int EVAL_AUPRC
-
EVAL_CORRELATION
public static final int EVAL_CORRELATION
-
EVAL_PLUGIN
public static final int EVAL_PLUGIN
-
TAGS_EVALUATION
Holds all tags for metrics
-
-
Constructor Details
-
ClassifierSubsetEval
public ClassifierSubsetEval()
-
-
Method Details
-
globalInfo
Returns a string describing this attribute evaluator.
- Returns:
- a description of the evaluator suitable for displaying in the explorer/experimenter GUI
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class ASEvaluation
- Returns:
- an enumeration of all the available options.
-
setOptions
Parses a given list of options. Valid options are:
-B <classifier> Class name of the classifier to use for accuracy estimation. Place any classifier options LAST on the command line, following a "--". E.g.: -B weka.classifiers.bayes.NaiveBayes ... -- -K (default: weka.classifiers.rules.ZeroR)
-T Use the training data to estimate accuracy.
-H <filename> Name of the hold out/test set to estimate accuracy on.
-percentage-split Perform a percentage split on the training data. Use in conjunction with -T.
-P Split percentage to use (default = 90).
-S Random seed for percentage split (default = 1).
-E <DEFAULT|ACC|RMSE|MAE|F-MEAS|AUC|AUPRC|CORR-COEFF> Performance evaluation measure to use for selecting attributes. (Default = default: accuracy for discrete class and rmse for numeric class)
-IRclass <label | index> Optional class value (label or 1-based index) to use in conjunction with IR statistics (f-meas, auc or auprc). Omitting this option will use the class-weighted average.
Options specific to scheme weka.classifiers.rules.ZeroR:
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
-num-decimal-places The number of decimal places for the output of numbers in the model (default 2).
-batch-size The desired batch size for batch prediction (default 100).
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class ASEvaluation
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
-
seedTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setSeed
public void setSeed(int s)
Set the random seed used to randomize the data before performing a percentage split.
- Parameters:
- s - the seed to use
-
getSeed
public int getSeed()
Get the random seed used to randomize the data before performing a percentage split.
- Returns:
- the seed to use
-
usePercentageSplitTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setUsePercentageSplit
public void setUsePercentageSplit(boolean p)
Set whether to perform a percentage split on the training data for evaluation.
- Parameters:
- p - true if a percentage split is to be performed
-
getUsePercentageSplit
public boolean getUsePercentageSplit()
Get whether to perform a percentage split on the training data for evaluation.
- Returns:
- true if a percentage split is to be performed
-
splitPercentTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setSplitPercent
Set the split percentage to use.
- Parameters:
- sp - the split percentage to use
-
getSplitPercent
Get the split percentage to use.
- Returns:
- the split percentage to use
-
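The seed and split percentage settings work together: the data is randomized with the seed and then cut at the split percentage. The sketch below illustrates that idea with plain JDK collections rather than Weka's Instances class. The names (PercentageSplitSketch, shuffledIndices, trainSize) are hypothetical, and rounding the training size to the nearest row is an assumption, not something this page specifies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative only: a seeded shuffle followed by a percentage split of row indices. */
final class PercentageSplitSketch {

    /** Returns the row indices 0..n-1 shuffled with the given seed. */
    static List<Integer> shuffledIndices(int n, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            idx.add(i);
        }
        Collections.shuffle(idx, new Random(seed));
        return idx;
    }

    /** Number of rows in the training part; rounding to nearest is an assumption. */
    static int trainSize(int n, double percent) {
        return (int) Math.round(n * percent / 100.0);
    }

    public static void main(String[] args) {
        int n = 150;                                   // hypothetical dataset size
        List<Integer> idx = shuffledIndices(n, 1L);    // seed 1, the documented default
        int cut = trainSize(n, 90.0);                  // 90 percent, the documented default
        List<Integer> train = idx.subList(0, cut);
        List<Integer> test = idx.subList(cut, n);
        System.out.println(train.size() + " training rows, " + test.size() + " test rows");
    }
}
```

Because the shuffle is seeded, repeating the call with the same seed yields the same partition, which keeps subset evaluations comparable across runs.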
setIRClassValue
Set the class value (label or index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class-weighted average for the IR metric being used.
- Parameters:
- val - the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
-
getIRClassValue
Get the class value (label or index) to use with IR metric evaluation of subsets. Leaving this unset will result in the class-weighted average for the IR metric being used.
- Returns:
- the class label or 1-based index of the class label to use when evaluating subsets with an IR metric
-
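Since the IR class value may be given either as a label or as a 1-based index, a caller has to map it to a concrete class. The following sketch shows one way such a lookup could work. It is an assumption for illustration, not Weka's actual resolution logic: the class name IrClassValueSketch, the method resolve, the choice to try the label first, and the -1 return value for "unset or not resolvable" are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: maps a label or 1-based index string to a 0-based class index. */
final class IrClassValueSketch {

    /**
     * Returns the 0-based index of the class, or -1 if the value is unset or
     * cannot be resolved (a caller would then use the class-weighted average).
     */
    static int resolve(String value, List<String> labels) {
        if (value == null || value.trim().isEmpty()) {
            return -1; // unset
        }
        String v = value.trim();
        int byLabel = labels.indexOf(v); // assumption: a matching label wins over an index
        if (byLabel >= 0) {
            return byLabel;
        }
        try {
            int oneBased = Integer.parseInt(v);
            return (oneBased >= 1 && oneBased <= labels.size()) ? oneBased - 1 : -1;
        } catch (NumberFormatException e) {
            return -1; // neither a known label nor a number
        }
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("yes", "no"); // hypothetical class labels
        System.out.println(resolve("no", labels));
        System.out.println(resolve("1", labels));
        System.out.println(resolve("", labels));
    }
}
```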
IRClassValueTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
evaluationMeasureTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getEvaluationMeasure
Gets the currently set performance evaluation measure used for selecting attributes.
- Returns:
- the performance evaluation measure
-
setEvaluationMeasure
Sets the performance evaluation measure to use for selecting attributes.
- Parameters:
- newMethod - the new performance evaluation metric to use
-
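For orientation, three of the measures selectable through the evaluation measure setting (ACC, RMSE and MAE) have short textbook definitions. The sketch below computes them over plain arrays. It is not Weka's evaluation code, and the class and method names (MeasureSketch, accuracy, rmse, mae) are hypothetical.

```java
/** Illustrative only: textbook definitions of accuracy, RMSE and MAE. */
final class MeasureSketch {

    /** Fraction of predictions that equal the actual class index. */
    static double accuracy(int[] actual, int[] predicted) {
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] == predicted[i]) {
                correct++;
            }
        }
        return (double) correct / actual.length;
    }

    /** Root mean squared error between actual and predicted numeric values. */
    static double rmse(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            double d = predicted[i] - actual[i];
            sum += d * d;
        }
        return Math.sqrt(sum / actual.length);
    }

    /** Mean absolute error between actual and predicted numeric values. */
    static double mae(double[] actual, double[] predicted) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]);
        }
        return sum / actual.length;
    }

    public static void main(String[] args) {
        System.out.println("ACC  = " + accuracy(new int[] {0, 1, 1, 0}, new int[] {0, 1, 0, 0}));
        System.out.println("RMSE = " + rmse(new double[] {1, 2, 3}, new double[] {1, 2, 5}));
        System.out.println("MAE  = " + mae(new double[] {1, 2, 3}, new double[] {1, 2, 5}));
    }
}
```

Accuracy is a score where higher is better, while RMSE and MAE are errors where lower is better, so a search that compares subsets has to account for the direction of the chosen measure.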
classifierTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setClassifier
Set the classifier to use for accuracy estimation.
- Parameters:
- newClassifier - the Classifier to use.
-
getClassifier
Get the classifier used as the base learner.
- Returns:
- the classifier used as the base learner
-
holdOutFileTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getHoldOutFile
Gets the file that holds hold out/test instances.
- Returns:
- File that contains hold out instances
-
setHoldOutFile
Set the file that contains hold out/test instances.
- Parameters:
- h - the hold out file
-
useTrainingTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getUseTraining
public boolean getUseTraining()
Get if training data is to be used instead of hold out/test data.
- Returns:
- true if training data is to be used instead of hold out data
-
setUseTraining
public void setUseTraining(boolean t)
Set if training data is to be used instead of hold out/test data.
- Parameters:
- t - true if training data is to be used instead of hold out data
-
getOptions
Gets the current settings of ClassifierSubsetEval.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class ASEvaluation
- Returns:
- an array of strings suitable for passing to setOptions()
-
getCapabilities
Returns the capabilities of this evaluator.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Overrides:
- getCapabilities in class ASEvaluation
- Returns:
- the capabilities of this evaluator
-
buildEvaluator
Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options.
- Specified by:
- buildEvaluator in class ASEvaluation
- Parameters:
- data - set of instances serving as training data
- Throws:
- Exception - if the evaluator has not been generated successfully
-
evaluateSubset
Evaluates a subset of attributes.
- Specified by:
- evaluateSubset in interface SubsetEvaluator
- Parameters:
- subset - a bitset representing the attribute subset to be evaluated
- Returns:
- the error rate
- Throws:
- Exception - if the subset could not be evaluated
-
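The subset parameter is a java.util.BitSet. A minimal sketch of building and reading such a subset follows. It assumes that bit i stands for the attribute at 0-based index i, which this page does not state explicitly, and the helper names (SubsetBitSetSketch, toBitSet, toIndices) are hypothetical.

```java
import java.util.BitSet;

/** Illustrative only: encodes an attribute subset as a BitSet and decodes it again. */
final class SubsetBitSetSketch {

    /** Builds a BitSet with one bit set per selected attribute index (assumed 0-based). */
    static BitSet toBitSet(int... attributeIndices) {
        BitSet subset = new BitSet();
        for (int i : attributeIndices) {
            subset.set(i);
        }
        return subset;
    }

    /** Lists the selected attribute indices in ascending order. */
    static int[] toIndices(BitSet subset) {
        return subset.stream().toArray();
    }

    public static void main(String[] args) {
        BitSet subset = toBitSet(0, 2, 3); // select the first, third and fourth attribute
        System.out.println(subset + " selects " + subset.cardinality() + " attributes");
    }
}
```

A BitSet built this way is the kind of argument the evaluateSubset methods accept. Whether the class attribute's bit has to be left clear is not covered on this page.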
evaluateSubset
Evaluates a subset of attributes with respect to a set of instances. Calling this function overrides any test/hold out instances set from setHoldOutFile.
- Specified by:
- evaluateSubset in class HoldOutSubsetEvaluator
- Parameters:
- subset - a bitset representing the attribute subset to be evaluated
- holdOut - a set of instances (possibly separate and distinct from those used to build/train the evaluator) with which to evaluate the merit of the subset
- Returns:
- the "merit" of the subset on the holdOut data
- Throws:
- Exception - if the subset cannot be evaluated
-
evaluateSubset
Evaluates a subset of attributes with respect to a single instance. Calling this function overrides any hold out/test instances set through setHoldOutFile.
- Specified by:
- evaluateSubset in class HoldOutSubsetEvaluator
- Parameters:
- subset - a bitset representing the attribute subset to be evaluated
- holdOut - a single instance (possibly not one of those used to build/train the evaluator) with which to evaluate the merit of the subset
- retrain - true if the classifier should be retrained with respect to the new subset before testing on the holdOut instance.
- Returns:
- the "merit" of the subset on the holdOut instance
- Throws:
- Exception - if the subset cannot be evaluated
-
toString
Returns a string describing classifierSubsetEval.
-
getRevision
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class ASEvaluation
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
- args - the options
-