Package weka.attributeSelection
Class CfsSubsetEval
java.lang.Object
weka.attributeSelection.ASEvaluation
weka.attributeSelection.CfsSubsetEval
- All Implemented Interfaces:
Serializable, SubsetEvaluator, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, RevisionHandler, TechnicalInformationHandler, ThreadSafe
public class CfsSubsetEval
extends ASEvaluation
implements SubsetEvaluator, ThreadSafe, OptionHandler, TechnicalInformationHandler
CfsSubsetEval :
Evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them.
Subsets of features that are highly correlated with the class while having low intercorrelation are preferred.
For more information see:
M. A. Hall (1998). Correlation-based Feature Subset Selection for Machine Learning. Hamilton, New Zealand. BibTeX:
@phdthesis{Hall1998,
address = {Hamilton, New Zealand},
author = {M. A. Hall},
school = {University of Waikato},
title = {Correlation-based Feature Subset Selection for Machine Learning},
year = {1998}
}
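The preference stated above (high correlation with the class, low intercorrelation among features) is usually expressed in the cited thesis as a merit score: for a subset of k features, merit = k * mean(r_cf) / sqrt(k + k(k-1) * mean(r_ff)), where r_cf are feature-class correlations and r_ff are feature-feature correlations. The following is a minimal sketch of that formula in plain Java, not the Weka implementation. The class name CfsMeritSketch, the method merit, and the assumption that all correlations are precomputed values in [0, 1] are illustrative only.

```java
import java.util.BitSet;

// Illustrative sketch only; names and inputs are hypothetical, not Weka API.
class CfsMeritSketch {

    /**
     * Merit of a subset: the sum of feature-class correlations divided by
     * sqrt(k + 2 * sum of pairwise feature-feature correlations). This is
     * algebraically the same as k*mean(r_cf) / sqrt(k + k(k-1)*mean(r_ff)).
     */
    static double merit(double[] classCorr, double[][] featCorr, BitSet subset) {
        int k = subset.cardinality();
        if (k == 0) {
            return 0.0; // an empty subset has no merit
        }
        double sumClass = 0.0;
        double sumPairs = 0.0;
        for (int i = subset.nextSetBit(0); i >= 0; i = subset.nextSetBit(i + 1)) {
            sumClass += classCorr[i];
            // visit each unordered pair (i, j) with i < j exactly once
            for (int j = subset.nextSetBit(i + 1); j >= 0; j = subset.nextSetBit(j + 1)) {
                sumPairs += featCorr[i][j];
            }
        }
        double radicand = k + 2.0 * sumPairs;
        if (radicand <= 0.0) {
            return 0.0; // guard against inputs outside the assumed [0, 1] range
        }
        return sumClass / Math.sqrt(radicand);
    }

    public static void main(String[] args) {
        double[] classCorr = {0.8, 0.6, 0.1};
        double[][] featCorr = {
            {1.0, 0.5, 0.0},
            {0.5, 1.0, 0.0},
            {0.0, 0.0, 1.0}
        };
        BitSet single = new BitSet();
        single.set(0);
        BitSet pair = new BitSet();
        pair.set(0);
        pair.set(1);
        System.out.println("merit {0}    = " + merit(classCorr, featCorr, single));
        System.out.println("merit {0, 1} = " + merit(classCorr, featCorr, pair));
    }
}
```

In this toy data, adding feature 1 raises the merit only slightly because it is moderately correlated with feature 0. With a higher intercorrelation the pair would score below the single best feature.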
Valid options are:
-M Treat missing values as a separate value.
-L Don't include locally predictive attributes.
-Z Precompute the full correlation matrix at the outset, rather than compute correlations lazily (as needed) during the search. Use this in conjunction with parallel processing in order to speed up a backward search.
-P <int> The size of the thread pool, for example, the number of cores in the CPU. (default 1)
-E <int> The number of threads to use, which should be >= size of thread pool. (default 1)
-D Output debugging info.
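As a rough illustration of how these flags fit together, the sketch below assembles an option array in plain Java. The class CfsOptionsSketch and the method buildOptions are hypothetical helpers, not part of Weka. Only the flag letters and their meanings are taken from the list above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles the flags listed above into a String[]
// of the shape that setOptions(String[] options) accepts.
class CfsOptionsSketch {

    static String[] buildOptions(boolean missingSeparate,
                                 boolean excludeLocallyPredictive,
                                 boolean preComputeMatrix,
                                 int poolSize,
                                 int numThreads,
                                 boolean debug) {
        List<String> opts = new ArrayList<>();
        if (missingSeparate) {
            opts.add("-M"); // treat missing values as a separate value
        }
        if (excludeLocallyPredictive) {
            opts.add("-L"); // don't include locally predictive attributes
        }
        if (preComputeMatrix) {
            opts.add("-Z"); // precompute the full correlation matrix
        }
        opts.add("-P");     // size of the thread pool
        opts.add(Integer.toString(poolSize));
        opts.add("-E");     // number of threads to use
        opts.add(Integer.toString(numThreads));
        if (debug) {
            opts.add("-D"); // output debugging info
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Precomputed matrix with a pool of 4 and 4 threads.
        String[] opts = buildOptions(false, false, true, 4, 4, false);
        System.out.println(String.join(" ", opts)); // prints -Z -P 4 -E 4
    }
}
```

With Weka on the classpath, the resulting array could be passed to setOptions on a CfsSubsetEval instance. That call is left out here so the sketch runs without Weka.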
- Version:
- $Revision: 15520 $
- Author:
- Mark Hall (mhall@cs.waikato.ac.nz)
-
Constructor Summary
Constructors
CfsSubsetEval() | Constructor
-
Method Summary
Modifier and Type | Method | Description
void | buildEvaluator(Instances data) | Generates an attribute evaluator.
void | clean() | Tells the evaluator that the attribute selection process is complete.
String | debugTipText() | Returns the tip text for this property.
double | evaluateSubset(BitSet subset) | Evaluates a subset of attributes.
Capabilities | getCapabilities() | Returns the capabilities of this evaluator.
boolean | getDebug() | Get whether to output debugging info.
boolean | getLocallyPredictive() | Return true if including locally predictive attributes.
boolean | getMissingSeparate() | Return true if missing is treated as a separate value.
int | getNumThreads() | Gets the number of threads.
String[] | getOptions() | Gets the current settings of CfsSubsetEval.
int | getPoolSize() | Gets the size of the thread pool.
boolean | getPreComputeCorrelationMatrix() | Get whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
String | getRevision() | Returns the revision string.
TechnicalInformation | getTechnicalInformation() | Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
String | globalInfo() | Returns a string describing this attribute evaluator.
Enumeration<Option> | listOptions() | Returns an enumeration describing the available options.
String | locallyPredictiveTipText() | Returns the tip text for this property.
static void | main(String[] args) | Main method for testing this class.
String | missingSeparateTipText() | Returns the tip text for this property.
String | numThreadsTipText() | Returns a string to describe the option.
String | poolSizeTipText() | Returns a string to describe the option.
int[] | postProcess(int[] attributeSet) | Calls locallyPredictive in order to include locally predictive attributes (if requested).
String | preComputeCorrelationMatrixTipText() | Returns a string to describe the option.
void | setDebug(boolean d) | Set whether to output debugging info.
void | setLocallyPredictive(boolean b) | Include locally predictive attributes.
void | setMissingSeparate(boolean b) | Treat missing as a separate value.
void | setNumThreads(int nT) | Sets the number of threads.
void | setOptions(String[] options) | Parses and sets a given list of options.
void | setPoolSize(int nT) | Sets the size of the thread pool.
void | setPreComputeCorrelationMatrix(boolean p) | Set whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
String | toString() | Returns a string describing CFS.
Methods inherited from class
weka.attributeSelection.ASEvaluation
doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, postExecution, preExecution, run, runEvaluator, setDoNotCheckCapabilities
-
Constructor Details
-
CfsSubsetEval
public CfsSubsetEval()
Constructor
-
-
Method Details
-
globalInfo
Returns a string describing this attribute evaluator.
- Returns:
- a description of the evaluator suitable for displaying in the explorer/experimenter gui
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
listOptions in interface OptionHandler
- Overrides:
listOptions in class ASEvaluation
- Returns:
- an enumeration of all the available options.
-
setOptions
Parses and sets a given list of options. Valid options are:
-M Treat missing values as a separate value.
-L Don't include locally predictive attributes.
-Z Precompute the full correlation matrix at the outset, rather than compute correlations lazily (as needed) during the search. Use this in conjunction with parallel processing in order to speed up a backward search.
-P <int> The size of the thread pool, for example, the number of cores in the CPU. (default 1)
-E <int> The number of threads to use, which should be >= size of thread pool. (default 1)
-D Output debugging info.
- Specified by:
setOptions in interface OptionHandler
- Overrides:
setOptions in class ASEvaluation
- Parameters:
options - the list of options as an array of strings
- Throws:
Exception - if an option is not supported
-
preComputeCorrelationMatrixTipText
- Returns:
- a string to describe the option
-
setPreComputeCorrelationMatrix
public void setPreComputeCorrelationMatrix(boolean p)
Set whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
- Parameters:
p - true if the correlation matrix is to be pre-computed at the outset
-
getPreComputeCorrelationMatrix
public boolean getPreComputeCorrelationMatrix()
Get whether to pre-compute the full correlation matrix at the outset, rather than computing individual correlations lazily (as needed) during the search.
- Returns:
- true if the correlation matrix is to be pre-computed at the outset
-
numThreadsTipText
- Returns:
- a string to describe the option
-
getNumThreads
public int getNumThreads()
Gets the number of threads.
-
setNumThreads
public void setNumThreads(int nT)
Sets the number of threads.
-
poolSizeTipText
- Returns:
- a string to describe the option
-
getPoolSize
public int getPoolSize()
Gets the size of the thread pool.
-
setPoolSize
public void setPoolSize(int nT)
Sets the size of the thread pool.
-
locallyPredictiveTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setLocallyPredictive
public void setLocallyPredictive(boolean b)
Include locally predictive attributes.
- Parameters:
b - true or false
-
getLocallyPredictive
public boolean getLocallyPredictive()
Return true if including locally predictive attributes.
- Returns:
- true if locally predictive attributes are to be used
-
missingSeparateTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
setMissingSeparate
public void setMissingSeparate(boolean b)
Treat missing as a separate value.
- Parameters:
b - true or false
-
getMissingSeparate
public boolean getMissingSeparate()
Return true if missing is treated as a separate value.
- Returns:
- true if missing is to be treated as a separate value
-
setDebug
public void setDebug(boolean d)
Set whether to output debugging info.
- Parameters:
d - true if debugging info is to be output
-
getDebug
public boolean getDebug()
Get whether to output debugging info.
- Returns:
- true if debugging info is to be output
-
debugTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter gui
-
getOptions
Gets the current settings of CfsSubsetEval.
- Specified by:
getOptions in interface OptionHandler
- Overrides:
getOptions in class ASEvaluation
- Returns:
- an array of strings suitable for passing to setOptions()
-
getCapabilities
Returns the capabilities of this evaluator.
- Specified by:
getCapabilities in interface CapabilitiesHandler
- Overrides:
getCapabilities in class ASEvaluation
- Returns:
- the capabilities of this evaluator
-
buildEvaluator
Generates an attribute evaluator. Has to initialize all fields of the evaluator that are not being set via options. CFS also discretises attributes (if necessary) and initializes the correlation matrix.
- Specified by:
buildEvaluator in class ASEvaluation
- Parameters:
data - set of instances serving as training data
- Throws:
Exception - if the evaluator has not been generated successfully
-
evaluateSubset
Evaluates a subset of attributes.
- Specified by:
evaluateSubset in interface SubsetEvaluator
- Parameters:
subset - a bitset representing the attribute subset to be evaluated
- Returns:
- the merit
- Throws:
Exception - if the subset could not be evaluated
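The subset parameter is a java.util.BitSet. A minimal sketch of building one from attribute indices, and converting it back to an index array of the kind postProcess accepts, might look like the following. It assumes that bit i stands for the attribute at 0-based index i, which this page does not state explicitly. The class SubsetBitSetSketch and its methods are hypothetical helpers, not Weka API.

```java
import java.util.BitSet;

// Hypothetical helpers for moving between index arrays and BitSet subsets.
// Assumption: bit i represents the attribute with 0-based index i.
class SubsetBitSetSketch {

    /** Sets one bit per attribute index; duplicate indices are harmless. */
    static BitSet toSubset(int[] attributeIndices) {
        BitSet subset = new BitSet();
        for (int idx : attributeIndices) {
            subset.set(idx);
        }
        return subset;
    }

    /** Returns the indices of all set bits in ascending order. */
    static int[] toIndices(BitSet subset) {
        return subset.stream().toArray();
    }

    public static void main(String[] args) {
        BitSet subset = toSubset(new int[] {5, 0, 2});
        System.out.println(subset); // prints {0, 2, 5}
        // With Weka on the classpath and a built evaluator, this BitSet is
        // the kind of argument evaluateSubset(BitSet subset) expects.
    }
}
```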
-
toString
Returns a string describing CFS.
-
postProcess
Calls locallyPredictive in order to include locally predictive attributes (if requested).
- Overrides:
postProcess in class ASEvaluation
- Parameters:
attributeSet - the set of attributes found by the search
- Returns:
- a possibly ranked list of postprocessed attributes
- Throws:
Exception - if postprocessing fails for some reason
-
clean
public void clean()
Description copied from class: ASEvaluation
Tells the evaluator that the attribute selection process is complete. It can then clean up data structures and references to training data as necessary in order to save memory.
- Overrides:
clean in class ASEvaluation
-
getRevision
Returns the revision string.
- Specified by:
getRevision in interface RevisionHandler
- Overrides:
getRevision in class ASEvaluation
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
args - the options
-