Datasets for classification:
1) Adult Census Income:
48,842 samples with 14 features like age, education, and occupation.
A binary classification dataset predicting whether annual income exceeds $50,000.
2) Breast Cancer Wisconsin (Diagnostic):
569 samples with 30 features.
Classification of tumors as benign or malignant based on features like radius,
texture, and smoothness.
3) Titanic Survival Prediction Dataset:
1,309 samples with features like age, sex, and ticket class.
A binary classification dataset predicting survival on the Titanic.
4) Iris Dataset:
Size: 150 samples.
Features: 4 (sepal length, sepal width, petal length, petal width).
Task: Multi-class classification to classify flowers into one of three species:
Setosa, Versicolor, or Virginica.
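A quick way to sanity-check the sizes listed above is to load the copies that ship with scikit-learn. This is only a sketch: it assumes scikit-learn is installed, it covers only Iris and Breast Cancer Wisconsin (the two datasets in this list that are bundled with the library), and the helper name summarize is made up for illustration.

```python
from sklearn.datasets import load_breast_cancer, load_iris

# Both datasets are bundled with scikit-learn, so nothing is downloaded.
iris = load_iris()
cancer = load_breast_cancer()


def summarize(name, bunch):
    """Return (name, n_samples, n_features, class names) for a loaded dataset.

    `summarize` is a hypothetical helper, not part of scikit-learn.
    """
    n_samples, n_features = bunch.data.shape
    return name, n_samples, n_features, [str(c) for c in bunch.target_names]


iris_summary = summarize("Iris", iris)
cancer_summary = summarize("Breast Cancer Wisconsin (Diagnostic)", cancer)

for row in (iris_summary, cancer_summary):
    print(row)
```

Adult Census Income and Titanic are not bundled this way and would need to be downloaded separately (for example from UCI or Kaggle).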
Datasets for regression:
1) Concrete Compressive Strength Dataset:
1,030 samples with 8 features related to the properties of concrete materials.
Predicts the compressive strength of concrete based on material proportions.
2) Medical Cost Personal Dataset:
1,338 samples with 7 columns: 6 features (age, sex, BMI, number of children,
smoker status, region) plus the target, charges.
Predicts individual insurance charges from demographic and health features.
3) California Housing Prices:
Size: 20,640 samples with 8 features.
Features: Median income, house age, latitude, longitude, etc.
Task: Predict the median house value of California districts.
4) Auto MPG Dataset (UCI):
Size: 398 samples with 8 features.
Features: Cylinders, horsepower, weight, etc.
Task: Predict miles per gallon (MPG) for various cars.
5) Boston Housing:
Size: 506 samples.
Features: 13 features (e.g., crime rate, average number of rooms, distance to
employment centers).
Task: Predict the median house price in Boston neighborhoods.
/////////////////////////////////////////////// The New York City Airbnb Open Data
/////////////////////////////////////////
New York City Airbnb Open Data:
Size: 48,895 samples with 16 columns (Kaggle release, AB_NYC_2019.csv).
Features: Neighborhood, room type, price, minimum nights, etc.
-> The New York City Airbnb Open Data can also be used for:
-----> Regression: To predict rental prices based on features like location, room
type, and number of reviews.
-----> Classification: To predict if a listing is booked frequently (e.g., high/low
demand) based on similar features.
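The two uses above differ only in which column becomes the target. Below is a minimal sketch of that split using plain Python and a few made-up rows. The column names, the sample values, and the choice of number_of_reviews as a stand-in for "booked frequently" are all assumptions for illustration; the real file may need a different demand proxy.

```python
from statistics import median

# Hypothetical sample rows; names and values are invented for illustration.
listings = [
    {"neighbourhood": "Harlem", "room_type": "Private room",
     "price": 80, "number_of_reviews": 45},
    {"neighbourhood": "Midtown", "room_type": "Entire home/apt",
     "price": 150, "number_of_reviews": 3},
    {"neighbourhood": "Bushwick", "room_type": "Shared room",
     "price": 60, "number_of_reviews": 120},
    {"neighbourhood": "Chelsea", "room_type": "Entire home/apt",
     "price": 200, "number_of_reviews": 10},
]


def regression_xy(rows):
    """Regression view: predict the continuous price from the other columns."""
    X = [{k: v for k, v in r.items() if k != "price"} for r in rows]
    y = [r["price"] for r in rows]
    return X, y


def classification_xy(rows):
    """Classification view: 1 = high demand, 0 = low demand.

    Assumption: demand is approximated by number_of_reviews, split at the
    median. That column is dropped from X so the label cannot leak in.
    """
    threshold = median(r["number_of_reviews"] for r in rows)
    X = [{k: v for k, v in r.items() if k != "number_of_reviews"} for r in rows]
    y = [1 if r["number_of_reviews"] > threshold else 0 for r in rows]
    return X, y, threshold


X_reg, y_reg = regression_xy(listings)
X_cls, y_cls, threshold = classification_xy(listings)
print(y_reg)
print(y_cls, threshold)
```

Splitting at the median keeps the two classes roughly balanced, which is convenient for a first experiment; any other threshold could be substituted.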
//////////////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////
Final datasets:
For classification (apart from the wine data):
1) Adult Census
2) Cerebral one
3) Vehicle one (backup)
For regression:
1) House sales, county USA (backup)