Physiological data of patients tested for breast cancer.
Format
A data frame containing 699 patients (rows) and 10 variables (columns): 9 predictors and the criterion diagnosis.
- thickness
Clump Thickness
- cellsize.unif
Uniformity of Cell Size
- cellshape.unif
Uniformity of Cell Shape
- adhesion
Marginal Adhesion
- epithelial
Single Epithelial Cell Size
- nuclei.bare
Bare Nuclei
- chromatin
Bland Chromatin
- nucleoli
Normal Nucleoli
- mitoses
Mitoses
- diagnosis
Criterion: Absence/presence of breast cancer.
Values: FALSE vs. TRUE (65.0% vs. 35.0%).
Source
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Original)
Original creator:
Dr. William H. Wolberg (physician), University of Wisconsin Hospitals, Madison, Wisconsin, USA
Details
We made the following enhancements to the original data for improved usability:
- The ID number of the cases was excluded.
- The numeric criterion with value 2 for benign and 4 for malignant was converted to logical (i.e., TRUE/FALSE).
- 16 cases were excluded because they contained NA values.
Other than that, the data remains consistent with the original dataset.
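The preprocessing steps listed above can be sketched in base R. This is only an illustration under stated assumptions, not the code used to build the dataset: the toy data frame raw and its column names id and class are hypothetical stand-ins for the original file, and only two of the nine predictors are shown.

```r
# Hypothetical raw records mimicking the layout of the original data.
# The column names `id` and `class` are assumptions for illustration.
raw <- data.frame(
  id          = c(1001, 1002, 1003, 1004, 1005),
  thickness   = c(5, 3, 8, 1, 4),
  nuclei.bare = c(1, NA, 10, 1, 2),
  class       = c(2, 2, 4, 2, 4)
)

# 1. Exclude the case ID.
dat <- raw[, setdiff(names(raw), "id")]

# 2. Convert the numeric criterion (2 = benign, 4 = malignant) to logical.
dat$diagnosis <- dat$class == 4
dat$class <- NULL

# 3. Exclude cases that contain NA values.
dat <- dat[complete.cases(dat), ]

# Proportions of FALSE vs. TRUE in the criterion.
prop.table(table(dat$diagnosis))
```

In this toy example, the second record is dropped because nuclei.bare is NA, leaving four complete cases with a logical diagnosis column.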
See also
Other datasets:
blood,
car,
contraceptive,
creditapproval,
fertility,
forestfires,
heart.cost,
heart.test,
heart.train,
heartdisease,
iris.v,
mushrooms,
sonar,
titanic,
voting,
wine
