"Цель: Предсказать цену автомобиля (Price) на основе других характеристик.\n",
"\n",
"Применение: Это может быть полезно для автосалонов, онлайн-площадок по продаже автомобилей, а также для частных лиц, которые хотят оценить рыночную стоимость своего автомобиля.\n",
"\n",
"Задача классификации:\n",
"\n",
"Цель: Классифицировать автомобили по категориям (например, \"Эконом\", \"Средний\", \"Премиум\") на основе цены и других характеристик.\n",
"\n",
"Применение: Это может быть полезно для маркетинговых кампаний, определения целевой аудитории, а также для анализа рынка автомобилей."
"C:\\Users\\Egor\\AppData\\Local\\Temp\\ipykernel_18436\\3209090058.py:21: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.\n",
"The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.\n",
"\n",
"For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\numpy\\lib\\_nanfunctions_impl.py:1241: RuntimeWarning: Mean of empty slice\n",
"C:\\Users\\Egor\\AppData\\Local\\Temp\\ipykernel_18436\\3209090058.py:22: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.\n",
"The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.\n",
"\n",
"For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.\n",
"C:\\Users\\Egor\\AppData\\Local\\Temp\\ipykernel_18436\\3209090058.py:23: FutureWarning: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.\n",
"The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.\n",
"\n",
"For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.\n",
"### 5. Построение конвейера и обучение моделей\n",
"#### 5.1. Задача регрессии"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"Вывод: Линейная регрессия показала относительно низкое качество предсказаний. RMSE и MAE достаточно высоки, что указывает на то, что модель плохо предсказывает цены автомобилей.\n",
"\n",
"Дерево решений (DecisionTreeRegressor):\n",
"\n",
"RMSE: 141914.29349587928\n",
"\n",
"MAE: 9887.588955657844\n",
"\n",
"Вывод: Дерево решений показало значительно более высокое значение RMSE по сравнению с линейной регрессией, что указывает на то, что модель сильно переобучилась. Однако MAE ниже, чем у линейной регрессии, что может указывать на то, что модель лучше предсказывает средние значения цен.\n",
"\n",
"Случайный лес (RandomForestRegressor):\n",
"\n",
"RMSE: 173537.46233609488\n",
"\n",
"MAE: 12656.846663315797\n",
"\n",
"Вывод: Случайный лес показал еще более высокое значение RMSE, что указывает на то, что модель также сильно переобучилась. MAE выше, чем у линейной регрессии, что говорит о том, что модель предсказывает цены хуже, чем линейная регрессия.\n",
"Вывод: Логистическая регрессия показала низкое качество классификации. F1-score и точность близки к 0.5, что указывает на то, что модель почти не лучше случайного угадывания. ROC-AUC также низкий, что говорит о плохой способности модели различать классы.\n",
"\n",
"Дерево решений (DecisionTreeClassifier):\n",
"\n",
"F1-score: 0.6836168013771631\n",
"\n",
"Accuracy: 0.6876299376299376\n",
"\n",
"ROC-AUC: 0.8222065250497814\n",
"\n",
"Вывод: Дерево решений показало значительно лучшее качество классификации по сравнению с логистической регрессией. F1-score и точность выше, а ROC-AUC значительно лучше, что указывает на то, что модель хорошо различает классы.\n",
"\n",
"Случайный лес (RandomForestClassifier):\n",
"\n",
"F1-score: 0.6943295967769952\n",
"\n",
"Accuracy: 0.6993243243243243\n",
"\n",
"ROC-AUC: 0.856645400908623\n",
"\n",
"Вывод: Случайный лес показал лучшее качество классификации среди всех моделей. F1-score, точность и ROC-AUC выше, чем у дерева решений, что указывает на то, что модель хорошо обобщает данные и различает классы.\n",
"\n",
"Общие выводы:\n",
"Задача регрессии: Линейная регрессия показала лучшее качество предсказаний цен по сравнению с деревьями решений и случайным лесом, несмотря на высокие значения RMSE и MAE. Деревья решений и случайный лес показали сильное переобучение, что привело к очень высоким значениям RMSE.\n",
"\n",
"Задача классификации: Случайный лес показал лучшее качество классификации по сравнению с логистической регрессией и деревом решений. Логистическая регрессия показала низкое качество, в то время как дерево решений и случайный лес показали хорошие результаты, причем случайный лес показал наилучшие результаты"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
"c:\\Users\\Egor\\AppData\\Local\\Programs\\Python\\Python312\\Lib\\site-packages\\sklearn\\impute\\_base.py:598: UserWarning: Skipping features without any observed values: ['Mileage']. At least one non-missing value is needed for imputation with strategy='median'.\n",
" warnings.warn(\n"
]
}
],
"source": [
"# Оценка смещения и дисперсии для задачи регрессии\n",
"Вывод: Дерево решений показало очень высокое значение RMSE при кросс-валидации, что указывает на сильное переобучение. Стандартное отклонение также высокое, что говорит о нестабильности модели. Это означает, что модель плохо обобщает данные и имеет высокую дисперсию.\n",
"\n",
"Случайный лес (RandomForestRegressor):\n",
"\n",
"Cross-Validation RMSE: 181627.2578040142\n",
"\n",
"Std: 137879.8905706371\n",
"\n",
"Вывод: Случайный лес также показал высокое значение RMSE при кросс-валидации, хотя и немного ниже, чем у дерева решений. Стандартное отклонение также высокое, что указывает на нестабильность модели. Это говорит о том, что модель также переобучена и имеет высокую дисперсию.\n",
"Вывод: Дерево решений показало хороший F1-score при кросс-валидации, но стандартное отклонение относительно высокое. Это указывает на некоторую нестабильность модели, хотя и не такую высокую, как в случае регрессии. Модель имеет умеренную дисперсию.\n",
"\n",
"Случайный лес (RandomForestClassifier):\n",
"\n",
"Cross-Validation F1-score: 0.692567227648008\n",
"\n",
"Std: 0.004169193228958696\n",
"\n",
"#### Вывод: Случайный лес показал лучший F1-score при кросс-валидации по сравнению с деревом решений. Стандартное отклонение также ниже, что указывает на более стабильную модель. Это говорит о том, что случайный лес лучше обобщает данные и имеет меньшую дисперсию по сравнению с деревом решений.\n",
"\n",
"Общие выводы:\n",
"Задача регрессии: И дерево решений, и случайный лес показали высокие значения RMSE и высокое стандартное отклонение при кросс-валидации. Это указывает на сильное переобучение и высокую дисперсию. Модели плохо обобщают данные и нестабильны.\n",
"\n",
"Задача классификации: Дерево решений показало хороший F1-score, но с высоким стандартным отклонением, что указывает на некоторую нестабильность. Случайный лес показал лучший F1-score и более низкое стандартное отклонение, что говорит о более стабильной и обобщающей способности модели."
"1) Плохое качество предсказаний в первой диаграмме: \n",
"Модель регрессии плохо предсказывает цены автомобилей, так как точки на диаграмме рассеяния распределены хаотично и далеко от диагонали. Значительные ошибки: Ошибки предсказаний значительны, что указывает на то, что модель не может точно предсказать цены автомобилей. Необходимость улучшения модели: Для улучшения качества предсказаний стоит рассмотреть другие модели, такие как градиентный бустинг или нейронные сети, а также улучшить предобработку данных.\n",
"\n",
"2) Выводы по второй диаграмме:\n",
"Хорошее качество классификации: Матрица ошибок показывает высокие значения на диагонали и низкие значения вне диагонали, что указывает на хорошее качество классификации.\n",
"Правильные предсказания: Большинство предсказаний модели являются правильными, что говорит оее способности хорошо различать классы.\n",
"Низкие ошибки: Низкие значения вне диагонали указывают на то, что модель допускает мало ошибок при классификации."