A famous dataset from R.A. Fisher (1936), simplified to predict only the virginica class (i.e., as a binary classification problem).
Format
A data frame containing 150 rows and 5 columns.
- sep.len
sepal length in cm
- sep.wid
sepal width in cm
- pet.len
petal length in cm
- pet.wid
petal width in cm
- virginica
Criterion: Does an iris belong to the class "virginica"?
Values:
TRUE vs. FALSE (33.33% vs. 66.67%).
Details
To improve usability, we made the following changes:
The criterion was binarized from a factor variable with three levels (Iris-setosa, Iris-versicolor, Iris-virginica) into a logical variable (i.e., TRUE for all instances of Iris-virginica and FALSE for the two other levels).
Other than that, the data remains consistent with the original dataset.
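The binarization described above can be sketched in a few lines of R. This is only an approximate reconstruction, not the code used to build this dataset: it assumes R's built-in iris data as a stand-in for the original file, whose Species levels carry no "Iris-" prefix, and the object name iris_v is hypothetical.

```r
# Assumption: R's built-in `iris` stands in for the original data file.
# Its Species levels are "setosa", "versicolor", "virginica"
# (no "Iris-" prefix), and individual values may differ from the source.
iris_v <- data.frame(
  sep.len   = iris$Sepal.Length,
  sep.wid   = iris$Sepal.Width,
  pet.len   = iris$Petal.Length,
  pet.wid   = iris$Petal.Width,
  # Logical criterion: TRUE for virginica, FALSE for the two other levels
  virginica = iris$Species == "virginica"
)

# Class balance of the criterion, in percent
round(100 * prop.table(table(iris_v$virginica)), 2)
```

Comparing the factor against a single level yields a logical vector directly, so no explicit recoding step is needed.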
References
Fisher, R.A. (1936): The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, pp. 179–188.
See also
Other datasets: blood, breastcancer, car, contraceptive, creditapproval, fertility, forestfires, heart.cost, heart.test, heart.train, heartdisease, mushrooms, sonar, titanic, voting, wine
