Package weka.clusterers
Class FarthestFirst
java.lang.Object
weka.clusterers.AbstractClusterer
weka.clusterers.RandomizableClusterer
weka.clusterers.FarthestFirst
- All Implemented Interfaces:
Serializable, Cloneable, Clusterer, CapabilitiesHandler, CapabilitiesIgnorer, CommandlineRunnable, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler
Cluster data using the FarthestFirst algorithm.
For more information see:
Hochbaum, Shmoys (1985). A best possible heuristic for the k-center problem. Mathematics of Operations Research. 10(2):180-184.
Sanjoy Dasgupta: Performance Guarantees for Hierarchical Clustering. In: 15th Annual Conference on Computational Learning Theory, 351-363, 2002.
Notes:
- works as a fast simple approximate clusterer
- modelled after SimpleKMeans, might be a useful initializer for it
BibTeX:
@article{Hochbaum1985,
author = {Hochbaum and Shmoys},
journal = {Mathematics of Operations Research},
number = {2},
pages = {180-184},
title = {A best possible heuristic for the k-center problem},
volume = {10},
year = {1985}
}
@inproceedings{Dasgupta2002,
author = {Sanjoy Dasgupta},
booktitle = {15th Annual Conference on Computational Learning Theory},
pages = {351-363},
publisher = {Springer},
title = {Performance Guarantees for Hierarchical Clustering},
year = {2002}
}
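The farthest-first traversal described in the references above can be sketched in plain Java as follows. This is a minimal illustration, not Weka's implementation: the class name FarthestFirstSketch, the method chooseCenters, and the use of squared Euclidean distance on raw double arrays are all assumptions made for the example, and Weka's own handling of attributes (for example nominal or missing values) is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical stand-alone sketch; not part of weka.clusterers.
final class FarthestFirstSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Chooses up to k centers: the first at random (seeded), each later one
     * the point farthest from all centers chosen so far.
     */
    static List<double[]> chooseCenters(double[][] points, int k, long seed) {
        int n = points.length;
        k = Math.min(k, n);
        List<double[]> centers = new ArrayList<>();
        if (k <= 0) {
            return centers;
        }
        // minDist[i] holds the distance from point i to its nearest chosen center.
        double[] minDist = new double[n];
        Arrays.fill(minDist, Double.POSITIVE_INFINITY);
        int next = new Random(seed).nextInt(n);
        for (int c = 0; c < k; c++) {
            double[] center = points[next];
            centers.add(center);
            int farthest = 0;
            for (int i = 0; i < n; i++) {
                minDist[i] = Math.min(minDist[i], squaredDistance(points[i], center));
                if (minDist[i] > minDist[farthest]) {
                    farthest = i;
                }
            }
            next = farthest; // becomes the next center
        }
        return centers;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {1, 0}, {10, 0}, {10, 1}};
        for (double[] center : chooseCenters(points, 2, 1L)) {
            System.out.println(Arrays.toString(center));
        }
    }
}
```

Each pass over the data both updates the nearest-center distances and finds the next center, so choosing k centers takes k passes. The seed only affects which point is picked first, which mirrors the role of the -S option listed below.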
Valid options are:
-N <num> number of clusters. (default = 2).
-S <num> Random number seed. (default 1)
- Version:
- $Revision: 15520 $
- Author:
- Bernhard Pfahringer (bernhard@cs.waikato.ac.nz)
Constructor Summary
Constructors
- FarthestFirst()
Method Summary
Modifier and Type, Method, Description:
- void buildClusterer(Instances data): Generates a clusterer.
- int clusterInstance(Instance instance): Assigns a given instance to a cluster.
- Capabilities getCapabilities(): Returns default capabilities of the clusterer.
- Instances getClusterCentroids(): Gets the centroids found by FarthestFirst.
- int getNumClusters(): Gets the number of clusters to generate.
- String[] getOptions(): Gets the current settings of FarthestFirst.
- String getRevision(): Returns the revision string.
- TechnicalInformation getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- String globalInfo(): Returns a string describing this clusterer.
- Enumeration<Option> listOptions(): Returns an enumeration describing the available options.
- static void main(String[] argv): Main method for testing this class.
- int numberOfClusters(): Returns the number of clusters.
- String numClustersTipText(): Returns the tip text for this property.
- void setNumClusters(int n): Sets the number of clusters to generate.
- void setOptions(String[] options): Parses a given list of options.
- String toString(): Returns a string describing this clusterer.
Methods inherited from class weka.clusterers.RandomizableClusterer
getSeed, seedTipText, setSeed
Methods inherited from class weka.clusterers.AbstractClusterer
debugTipText, distributionForInstance, doNotCheckCapabilitiesTipText, forName, getDebug, getDoNotCheckCapabilities, makeCopies, makeCopy, postExecution, preExecution, run, runClusterer, setDebug, setDoNotCheckCapabilities
-
Constructor Details
-
FarthestFirst
public FarthestFirst()
-
-
Method Details
-
globalInfo
Returns a string describing this clusterer.
- Returns:
- a description of the clusterer suitable for displaying in the explorer/experimenter GUI
-
getTechnicalInformation
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
- getTechnicalInformation in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
getCapabilities
Returns default capabilities of the clusterer.
- Specified by:
- getCapabilities in interface CapabilitiesHandler
- Specified by:
- getCapabilities in interface Clusterer
- Overrides:
- getCapabilities in class AbstractClusterer
- Returns:
- the capabilities of this clusterer
-
buildClusterer
Generates a clusterer. Has to initialize all fields of the clusterer that are not being set via options.
- Specified by:
- buildClusterer in interface Clusterer
- Specified by:
- buildClusterer in class AbstractClusterer
- Parameters:
- data - set of instances serving as training data
- Throws:
- Exception - if the clusterer has not been generated successfully
-
clusterInstance
Assigns a given instance to a cluster.
- Specified by:
- clusterInstance in interface Clusterer
- Overrides:
- clusterInstance in class AbstractClusterer
- Parameters:
- instance - the instance to be assigned to a cluster
- Returns:
- the index of the cluster the instance is assigned to, as an integer
- Throws:
- Exception - if the instance could not be clustered successfully
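As an illustration of what assigning an instance to a cluster involves, the sketch below returns the index of the nearest center. It is a simplified stand-in, not the Weka code: the class NearestCenterSketch and the method nearestCenter are hypothetical names, and plain squared Euclidean distance on numeric arrays is an assumption of the example.

```java
// Hypothetical stand-alone sketch; not part of weka.clusterers.
final class NearestCenterSketch {

    /**
     * Returns the index of the center closest to the given point,
     * or -1 if there are no centers. Ties go to the lower index.
     */
    static int nearestCenter(double[][] centers, double[] point) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0.0;
            for (int i = 0; i < point.length; i++) {
                double d = point[i] - centers[c][i];
                dist += d * d;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centers = {{0, 0}, {10, 0}};
        System.out.println(nearestCenter(centers, new double[] {2, 1})); // prints 0
        System.out.println(nearestCenter(centers, new double[] {9, 0})); // prints 1
    }
}
```

The returned value plays the same role as the integer returned by clusterInstance: an index into the list of centers, in the range 0 to numberOfClusters() - 1.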
-
numberOfClusters
Returns the number of clusters.
- Specified by:
- numberOfClusters in interface Clusterer
- Specified by:
- numberOfClusters in class AbstractClusterer
- Returns:
- the number of clusters generated for a training dataset.
- Throws:
- Exception - if number of clusters could not be returned successfully
-
getClusterCentroids
Gets the centroids found by FarthestFirst.
- Returns:
- the centroids found by FarthestFirst
-
listOptions
Returns an enumeration describing the available options.
- Specified by:
- listOptions in interface OptionHandler
- Overrides:
- listOptions in class RandomizableClusterer
- Returns:
- an enumeration of all the available options.
-
numClustersTipText
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setNumClusters
Sets the number of clusters to generate.
- Parameters:
- n - the number of clusters to generate
- Throws:
- Exception - if number of clusters is negative
-
getNumClusters
public int getNumClusters()
Gets the number of clusters to generate.
- Returns:
- the number of clusters to generate
-
setOptions
Parses a given list of options. Valid options are:
-N <num> number of clusters. (default = 2).
-S <num> Random number seed. (default 1)
- Specified by:
- setOptions in interface OptionHandler
- Overrides:
- setOptions in class RandomizableClusterer
- Parameters:
- options - the list of options as an array of strings
- Throws:
- Exception - if an option is not supported
-
getOptions
Gets the current settings of FarthestFirst.
- Specified by:
- getOptions in interface OptionHandler
- Overrides:
- getOptions in class RandomizableClusterer
- Returns:
- an array of strings suitable for passing to setOptions()
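The contract between getOptions and setOptions (the array returned by one can be passed to the other) can be illustrated with a small stand-in class. Everything here is hypothetical: OptionsRoundTripSketch is not a Weka class, its parsing logic is an assumption made for the example, and only the -N and -S flags and their documented defaults (2 and 1) are taken from this page.

```java
import java.util.Arrays;

// Hypothetical stand-in that mimics the documented -N / -S options.
final class OptionsRoundTripSketch {
    private int numClusters = 2; // documented default for -N
    private int seed = 1;        // documented default for -S

    /** Parses flag/value pairs; rejecting unknown flags is an assumption of this sketch. */
    void setOptions(String[] options) {
        for (int i = 0; i < options.length; i++) {
            String flag = options[i];
            if (!flag.equals("-N") && !flag.equals("-S")) {
                throw new IllegalArgumentException("Unsupported option: " + flag);
            }
            if (i + 1 >= options.length) {
                throw new IllegalArgumentException("Missing value for " + flag);
            }
            int value = Integer.parseInt(options[++i]);
            if (flag.equals("-N")) {
                numClusters = value;
            } else {
                seed = value;
            }
        }
    }

    /** Returns the current settings in a form accepted by setOptions. */
    String[] getOptions() {
        return new String[] {"-N", String.valueOf(numClusters), "-S", String.valueOf(seed)};
    }

    public static void main(String[] args) {
        OptionsRoundTripSketch a = new OptionsRoundTripSketch();
        a.setOptions(new String[] {"-N", "5"});
        OptionsRoundTripSketch b = new OptionsRoundTripSketch();
        b.setOptions(a.getOptions()); // copy a's settings into b
        System.out.println(Arrays.toString(b.getOptions())); // prints [-N, 5, -S, 1]
    }
}
```

Copying settings this way is one reason the option array includes values that are still at their defaults: the receiving object ends up in the same state regardless of what it held before.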
-
toString
Returns a string describing this clusterer.
-
getRevision
Returns the revision string.
- Specified by:
- getRevision in interface RevisionHandler
- Overrides:
- getRevision in class AbstractClusterer
- Returns:
- the revision
-
main
Main method for testing this class.
- Parameters:
- argv - should contain the following arguments: -t training file [-N number of clusters]
-