2024-10-19 17:27:50 +04:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3><b>Phew... here begins a long, tough lab...</b></h3>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<b>The 3 datasets used in this work:</b>\n",
"<ol>\n",
" <li>\n",
" <p>Near-Earth objects</p>\n",
" <a href=\"https://www.kaggle.com/datasets/sameepvani/nasa-nearest-earth-objects\">Link</a>\n",
" </li>\n",
" <li>\n",
" <p>Student exam scores</p>\n",
" <a href=\"https://www.kaggle.com/datasets/spscientist/students-performance-in-exams\">Link</a>\n",
" </li>\n",
" <li>\n",
" <p>Mobile phone price prediction</p>\n",
" <a href=\"https://www.kaggle.com/datasets/dewangmoghe/mobile-phone-price-prediction\">Link</a>\n",
" </li>\n",
"</ol>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div style=\"margin: 40px;\">\n",
"<h4>About the first dataset:</h4>\n",
"\n",
"<p style=\"margin: 40px;\"><b>About the dataset</b><br/>\n",
"Context<br/>\n",
"There is an endless number of objects in outer space, and some of them are closer than we think. A distance of 70,000 km may seem too large to pose any danger, but on an astronomical scale it is very small and can disrupt many natural phenomena. Such objects and asteroids can therefore cause harm, so it makes sense to know what surrounds us and what could threaten us. This dataset contains a list of NASA-certified asteroids that are classified as nearest-Earth objects.</p>\n",
"\n",
"<br/>\n",
"<h4>About the second dataset:</h4>\n",
"<p style=\"margin: 40px;\"><b>About the dataset</b><br/>\n",
"Context<br/>\n",
"Marks secured by the students<br/>\n",
"Content<br/>\n",
"This dataset consists of the marks secured by students in various subjects.<br/>\n",
"Acknowledgements<br/>\n",
"http://roycekimmons.com/tools/generated_data/exams<br/>\n",
"Inspiration<br/>\n",
"To understand the influence of parental background, test preparation, etc. on student performance.</p>\n",
"<br/>\n",
"\n",
"<h4>About the third dataset:</h4>\n",
"<p style=\"margin: 40px;\"><b>About the dataset</b><br/>\n",
"This dataset was collected by scraping data from online websites.\n",
"The columns are as follows.</p>\n",
"\n",
"Name: the name of the mobile phone.\n",
"\n",
"Rating: the rating given to the phone. The minimum rating is 0 and the maximum is 5.\n",
"\n",
"Spec_score: the phone's score based on its specifications. The minimum value is 0 and the maximum is 100.\n",
"\n",
"No_of_sim: whether the phone supports dual SIM, 3G, 4G, 5G, LTE.\n",
"\n",
"RAM: information about the phone's RAM.\n",
"\n",
"Battery: information about the phone's battery specifications.\n",
"\n",
"Display: information about the phone's screen size.\n",
"\n",
"Camera: information about the rear and front cameras.\n",
"\n",
"External_memory: whether the device supports external memory and\n",
"how much.\n",
"\n",
"Android_version: the Android version on the phone.\n",
"\n",
"Price: the price of the phone.\n",
"\n",
"Company: the company the phone belongs to.\n",
"\n",
"Inbuilt_memory: information about the phone's built-in memory.\n",
"\n",
"Fast_charging: whether the device supports fast charging and, if so, at what power.\n",
"\n",
"Screen_resolution: the phone's screen resolution.\n",
"\n",
"Processor: information about the phone's processor.\n",
"\n",
"Processor_name: the name of the processor.\n",
"<br/>\n",
"</div>\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">Let's begin...<br>First dataset...<br>Problem domain: data on near-Earth objects (asteroids and comets) that could threaten our planet. Analyzing their trajectories, sizes, and velocities matters for preventing potential catastrophes.<br>Objects of observation: asteroids, comets, and other objects.<br>Attributes: 'id', 'name', 'est_diameter_min', 'est_diameter_max', 'relative_velocity', 'miss_distance', 'orbiting_body', 'sentry_object', 'absolute_magnitude', 'hazardous'<br>Relationships between objects: there are no explicit relationships between objects, but correlations between an object's size, velocity, and distance can be studied.</p>"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"number of columns: 10\n",
"columns: ['id', 'name', 'est_diameter_min', 'est_diameter_max', 'relative_velocity', 'miss_distance', 'orbiting_body', 'sentry_object', 'absolute_magnitude', 'hazardous']\n"
]
}
],
"source": [
"import pandas as pd\n",
"\n",
"data = pd.read_csv(\"./csv/1.csv\", sep=\",\")\n",
"print(\"number of columns:\", data.columns.size)\n",
"print(\"columns:\", data.columns.tolist())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">\n",
"Checking for missing data<br>Types of missing data:<br>None: how Python represents a missing value<br>NaN: how pandas represents a missing value<br>'': an empty string\n",
"</p>"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"id 0\n",
"name 0\n",
"est_diameter_min 0\n",
"est_diameter_max 0\n",
"relative_velocity 0\n",
"miss_distance 0\n",
"orbiting_body 0\n",
"sentry_object 0\n",
"absolute_magnitude 0\n",
"hazardous 0\n",
"dtype: int64 \n",
"\n",
"id False\n",
"name False\n",
"est_diameter_min False\n",
"est_diameter_max False\n",
"relative_velocity False\n",
"miss_distance False\n",
"orbiting_body False\n",
"sentry_object False\n",
"absolute_magnitude False\n",
"hazardous False\n",
"dtype: bool \n",
"\n"
]
}
],
"source": [
"# Count the missing values in each column\n",
"print(data.isnull().sum(), \"\\n\")\n",
"\n",
"# Check whether any column contains missing values\n",
"print(data.isnull().any(), \"\\n\")"
]
},
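{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">A minimal sketch, not part of the original analysis: <code>isnull()</code> flags <code>None</code> and <code>NaN</code> but not empty strings, so empty strings need a separate comparison. The toy frame <code>demo</code> and its values are made up purely for illustration.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy data: one empty string, one None, one NaN\n",
"demo = pd.DataFrame({\"name\": [\"a\", \"\", None], \"value\": [1.0, np.nan, 3.0]})\n",
"\n",
"# isnull() counts None and NaN only\n",
"null_counts = demo.isnull().sum()\n",
"\n",
"# Empty strings have to be counted with an explicit comparison\n",
"empty_counts = (demo == \"\").sum()\n",
"\n",
"print(null_counts, \"\\n\")\n",
"print(empty_counts)"
]
},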
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">There are clearly no missing values, so nothing needs to be filled in.<br>The dataset page also shows that the \"orbiting_body\" and \"sentry_object\" columns contain no values other than \"Earth\" and \"false\" respectively, so we drop them.</p>"
]
},
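{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">The observation above was read off the dataset page. A possible way to confirm it in code is to count the distinct values per column. This is only a sketch on a made-up frame <code>demo</code>, not on the real CSV.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data standing in for the real CSV\n",
"demo = pd.DataFrame({\n",
"    \"orbiting_body\": [\"Earth\", \"Earth\", \"Earth\"],\n",
"    \"sentry_object\": [False, False, False],\n",
"    \"hazardous\": [False, True, False],\n",
"})\n",
"\n",
"# A column with a single distinct value carries no information\n",
"constant_columns = [col for col in demo.columns if demo[col].nunique(dropna=False) == 1]\n",
"print(constant_columns)"
]
},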
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Index(['id', 'name', 'est_diameter_min', 'est_diameter_max',\n",
" 'relative_velocity', 'miss_distance', 'absolute_magnitude',\n",
" 'hazardous'],\n",
" dtype='object')\n"
]
}
],
"source": [
"data = data.drop(columns=['sentry_object'])\n",
"data = data.drop(columns=['orbiting_body'])\n",
"print(data.columns)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px\">Checking the data types on the Kaggle page shows that the numeric columns are columns 3-7. These are the ones we will inspect for outliers and smooth by capping extreme values.</p>"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Column est_diameter_min:\n",
"  Has outliers: Yes\n",
"  Outlier count: 8306\n",
"  Minimum value: 0.0006089126\n",
"  Maximum value: 0.32962154705\n",
"  1st quartile (Q1): 0.0192555078\n",
"  3rd quartile (Q3): 0.1434019235\n",
"\n",
"Column est_diameter_max:\n",
"  Has outliers: Yes\n",
"  Outlier count: 8306\n",
"  Minimum value: 0.00136157\n",
"  Maximum value: 0.7370561859\n",
"  1st quartile (Q1): 0.0430566244\n",
"  3rd quartile (Q3): 0.320656449\n",
"\n",
"Column relative_velocity:\n",
"  Has outliers: Yes\n",
"  Outlier count: 1574\n",
"  Minimum value: 203.34643253\n",
"  Maximum value: 114380.48061454494\n",
"  1st quartile (Q1): 28619.02064490995\n",
"  3rd quartile (Q3): 62923.60463276395\n",
"\n",
"Column miss_distance:\n",
"  Has outliers: No\n",
"Column absolute_magnitude:\n",
"  Has outliers: Yes\n",
"  Outlier count: 101\n",
"  Minimum value: 14.8\n",
"  Maximum value: 32.239999999999995\n",
"  1st quartile (Q1): 21.34\n",
"  3rd quartile (Q3): 25.7\n",
"\n"
]
}
],
"source": [
"numeric_columns = ['est_diameter_min', 'est_diameter_max', 'relative_velocity', 'miss_distance', 'absolute_magnitude']\n",
"for column in numeric_columns:\n",
"    if pd.api.types.is_numeric_dtype(data[column]):  # Only process numeric columns\n",
"        q1 = data[column].quantile(0.25)  # 1st quartile (Q1)\n",
"        q3 = data[column].quantile(0.75)  # 3rd quartile (Q3)\n",
"        iqr = q3 - q1  # Interquartile range (IQR)\n",
"\n",
"        # Bounds beyond which a value counts as an outlier\n",
"        lower_bound = q1 - 1.5 * iqr  # Lower bound\n",
"        upper_bound = q3 + 1.5 * iqr  # Upper bound\n",
"\n",
"        # Count the outliers\n",
"        outliers = data[(data[column] < lower_bound) | (data[column] > upper_bound)]\n",
"        outlier_count = outliers.shape[0]\n",
"\n",
"        # Cap the outliers: values below the lower bound become the lower bound, values above the upper bound become the upper bound\n",
"        data[column] = data[column].apply(lambda x: lower_bound if x < lower_bound else upper_bound if x > upper_bound else x)\n",
"\n",
"        # The minimum and maximum printed below are taken after capping\n",
"        print(f\"Column {column}:\")\n",
"        print(f\"  Has outliers: {'Yes' if outlier_count > 0 else 'No'}\")\n",
"        if outlier_count > 0:\n",
"            print(f\"  Outlier count: {outlier_count}\")\n",
"            print(f\"  Minimum value: {data[column].min()}\")\n",
"            print(f\"  Maximum value: {data[column].max()}\")\n",
"            print(f\"  1st quartile (Q1): {q1}\")\n",
"            print(f\"  3rd quartile (Q3): {q3}\\n\")"
]
},
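{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">As a side note, the same IQR capping can probably be written more compactly with <code>Series.clip</code>, which caps values at both bounds in one vectorized call. The sketch below uses a made-up series <code>values</code> rather than the real columns.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy series with one obvious outlier\n",
"values = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])\n",
"\n",
"q1 = values.quantile(0.25)\n",
"q3 = values.quantile(0.75)\n",
"iqr = q3 - q1\n",
"\n",
"# clip() replaces anything outside [lower, upper] with the nearest bound\n",
"clipped = values.clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)\n",
"print(clipped.tolist())"
]
},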
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">Now let's build plots to see how the \"hazardous\" flag depends on the other columns.</p>"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 400x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 400x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 400x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 400x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 400x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"# Numeric columns to plot against the target\n",
"numeric_columns = ['est_diameter_min', 'est_diameter_max', 'relative_velocity', 'miss_distance', 'absolute_magnitude']\n",
"\n",
"# One scatter plot per numeric feature, with the hazardous flag on the x-axis\n",
"for column in numeric_columns:\n",
"    plt.figure(figsize=(4, 8))  # Set the figure size\n",
"    plt.scatter(data['hazardous'], data[column], alpha=0.5)  # Draw the scatter plot\n",
"    plt.title(f'{column} vs. hazardous')\n",
"    plt.xlabel('hazardous (0 = not hazardous, 1 = hazardous)')\n",
"    plt.ylabel(column)\n",
"    plt.xticks([0, 1])  # Tick marks for the two classes\n",
"    plt.grid()  # Add a grid for readability\n",
"    plt.show()  # Display the plot"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Helper that builds the train/validation/test splits\n",
"import math\n",
"\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"def split_stratified_into_train_val_test(\n",
"    data_input,\n",
"    stratify_colname=\"y\",\n",
"    frac_train=0.6,\n",
"    frac_val=0.15,\n",
"    frac_test=0.25,\n",
"    random_state=None,\n",
"):\n",
"    \"\"\"\n",
"    Splits a Pandas dataframe into three subsets (train, val, and test)\n",
"    following fractional ratios provided by the user, where each subset is\n",
"    stratified by the values in a specific column (that is, each subset has\n",
"    the same relative frequency of the values in the column). It performs this\n",
"    splitting by running train_test_split() twice.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    data_input : Pandas dataframe\n",
"        Input dataframe to be split.\n",
"    stratify_colname : str\n",
"        The name of the column that will be used for stratification. Usually\n",
"        this column would be for the label.\n",
"    frac_train : float\n",
"    frac_val : float\n",
"    frac_test : float\n",
"        The ratios with which the dataframe will be split into train, val, and\n",
"        test data. The values should be expressed as float fractions and should\n",
"        sum to 1.0.\n",
"    random_state : int, None, or RandomStateInstance\n",
"        Value to be passed to train_test_split().\n",
"\n",
"    Returns\n",
"    -------\n",
"    data_train, data_val, data_test :\n",
"        Dataframes containing the three splits.\n",
"    \"\"\"\n",
"\n",
"    # Compare with a tolerance: an exact float check can reject valid fractions.\n",
"    if not math.isclose(frac_train + frac_val + frac_test, 1.0):\n",
"        raise ValueError(\n",
"            \"fractions %f, %f, %f do not add up to 1.0\"\n",
"            % (frac_train, frac_val, frac_test)\n",
"        )\n",
"\n",
"    if stratify_colname not in data_input.columns:\n",
"        raise ValueError(\"%s is not a column in the dataframe\" % (stratify_colname))\n",
"\n",
"    X = data_input  # Contains all columns.\n",
"    y = data_input[\n",
"        [stratify_colname]\n",
"    ]  # Dataframe of just the column on which to stratify.\n",
"\n",
"    # Split original dataframe into train and temp dataframes.\n",
"    data_train, data_temp, y_train, y_temp = train_test_split(\n",
"        X, y, stratify=y, test_size=(1.0 - frac_train), random_state=random_state\n",
"    )\n",
"\n",
"    # Split the temp dataframe into val and test dataframes.\n",
"    relative_frac_test = frac_test / (frac_val + frac_test)\n",
"    data_val, data_test, y_val, y_test = train_test_split(\n",
"        data_temp,\n",
"        y_temp,\n",
"        stratify=y_temp,\n",
"        test_size=relative_frac_test,\n",
"        random_state=random_state,\n",
"    )\n",
"\n",
"    assert len(data_input) == len(data_train) + len(data_val) + len(data_test)\n",
"\n",
"    return data_train, data_val, data_test"
]
},
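{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">A rough sanity check for the two-step split above (a sketch, not part of the original lab). The hypothetical helper <code>expected_split_sizes</code> estimates how many rows land in each subset, assuming the held-out part of every split is rounded up. For the 0.60/0.20/0.20 fractions used in this notebook, its result can be compared with the shapes of the resulting dataframes.</p>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import math\n",
"\n",
"# Hypothetical helper (an assumption for illustration, not a library function):\n",
"# predicts the subset sizes of the two-step split, assuming the held-out part\n",
"# of each split is rounded up to a whole number of rows.\n",
"def expected_split_sizes(n_rows, frac_train, frac_val, frac_test):\n",
"    n_temp = math.ceil((1.0 - frac_train) * n_rows)  # rows set aside for val + test\n",
"    n_train = n_rows - n_temp\n",
"    relative_frac_test = frac_test / (frac_val + frac_test)  # share of temp that becomes test\n",
"    n_test = math.ceil(relative_frac_test * n_temp)\n",
"    n_val = n_temp - n_test\n",
"    return n_train, n_val, n_test\n",
"\n",
"# 81996 non-hazardous + 8840 hazardous rows, as counted in this dataset\n",
"print(expected_split_sizes(81996 + 8840, 0.60, 0.20, 0.20))"
]
},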
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hazardous\n",
"False    81996\n",
"True      8840\n",
"Name: count, dtype: int64\n",
"\n",
"Training set: (54501, 6)\n",
"hazardous\n",
"False    49197\n",
"True      5304\n",
"Name: count, dtype: int64\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAgkAAADECAYAAAAVi7K7AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA9TklEQVR4nO3dd1xT1/sH8E8SIGHvLbLBgQNxtFbBjavUVVu3WBxVa7X6tdXWgdWidddtHVDxa61StVonbqxVceBWkOFmbwiQ5Pz+4Jd8CUkQELwgz/v1yktzOffc5557c/PknDt4jDEGQgghhJBy+FwHQAghhJC6iZIEQgghhKhFSQIhhBBC1KIkgRBCCCFqUZJACCGEELUoSSCEEEKIWpQkEEIIIUQtShIIIYQQopYW1wEQQkhDUFxcjIyMDMhkMtjZ2XEdDqlBYrEYGRkZ0NLSgpWVFdfh1CjqSSCkDhg7diwMDAy4DqPGLFy4EDwej+swOBcdHY3hw4fDwsICQqEQtra2GDx4MNdh1Rvr169HVlaW4v2aNWuQn5/PXUBlREZGIiAgACYmJtDV1YW9vT2+/vprrsOqcVXqSQgNDUVgYKDivVAoROPGjdGrVy/MmzcP1tbWNR4gIYTUR4cOHcJnn32GJk2aYMmSJXB1dQWA9+6XZm06fPgw4uLiMHPmTFy4cAHz5s3DtGnTuA4LGzduxFdffYVOnTph7dq1sLe3BwA4OjpyHFnNq9Zww6JFi+Ds7AyxWIyoqChs2rQJR48exd27d6Gnp1fTMRJCSL2SkZGBoKAg+Pv7Y9++fdDR0eE6pHpp7ty5CAgIwNq1a8Hn87Fy5Urw+dx2gMfGxuKbb77BhAkTsHHjxve+x6xaSUKfPn3Qtm1bAEBQUBDMzc2xatUqHDp0CMOGDavRAAkhdY9EIoFMJqMvPw127twJsViM0NBQaqO34Ofnh6SkJDx48AAODg5o1KgR1yHhl19+gY2NDX755Zf3PkEAauichG7dugEAEhISAJRm0bNmzUKLFi1gYGAAIyMj9OnTBzExMSrzisViLFy4EB4eHhCJRLC1tcWgQYPw5MkTAEBiYiJ4PJ7GV5cuXRR1nTt3DjweD3v37sXcuXNhY2MDfX19BAQE4NmzZyrLvnLlCnr37g1jY2Po6enBz88Ply5dUruOXbp0Ubv8hQsXqpQNDw+Hj48PdHV1YWZmhs8//1zt8itat7JkMhnWrFmD5s2bQyQSwdraGhMnTkRmZqZSOScnJ/Tv319lOVOnTlWpU13sy5cvV2lTACgqKsKCBQvg5uYGoVAIBwcHzJ49G0VFRWrbqqwuXbqo1LdkyRLw+Xz897//rVZ7rFixAh07doS5uTl0dXXh4+OD/fv3q11+eHg42rdvDz09PZiamsLX1xcnT55UKnPs2DH4+fnB0NAQRkZGaNeunUps+/btU2xTCwsLjBw5Ei9evFAqM3bsWKWYTU1N0aVLF1y8ePGN7ST34sULDBgwAAYGBrC0tMSsWbMglUqrvP7lY1G3zxYXF2P+/Pnw8fGBsbEx9PX10blzZ5w9e1apLvl2WbFiBdasWQNXV1cIhULcv38fABAVFYV27dpBJBLB1dUVW7ZsUbtuEokEP/74o2J+JycnzJ07V2U/0vS5cnJywtixYxXvS0pKEBwcDHd3d4hEIpibm6NTp044depUhW0cGhqq1B56enpo0aIFtm3bVuF8cvHx8fj0009hZmYGPT09fPDBB/j777+Vyvz7779o3bo1fvrpJzg4OEAoFMLd3R1Lly6FTCZTlPPz80OrVq3ULsfT0xP+/v5KMScmJiqVKf/5quw2BVTb+fXr1xg9ejQsLS0hFArh5eWFX3/9VWmesvtCWV5eXiqf8xUrVqiN+cWLFxg3bhysra0hFArRvHlz7NixQ6mM/Fh+7tw5mJiY4MMPP0SjRo3Qr18/jfuHuvnlL6FQCA8PD4SEhKDsg4/l586kpaVprKv8fv
fvv//Cx8cHkydPVqyDurYCgPz8fMycOVOxD3h6emLFihUo//BlHo+HqVOnYvfu3fD09IRIJIKPjw8uXLigVE7duT5nz56FUCjEpEmTlKZXpp0ro0aubpB/oZubmwMo/RAdPHgQn376KZydnZGcnIwtW7bAz88P9+/fV5zZK5VK0b9/f5w+fRqff/45vv76a+Tm5uLUqVO4e/euYgwPAIYNG4a+ffsqLXfOnDlq41myZAl4PB6+/fZbpKSkYM2aNejRowdu3boFXV1dAMCZM2fQp08f+Pj4YMGCBeDz+di5cye6deuGixcvon379ir1NmrUCCEhIQCAvLw8fPnll2qXPW/ePAwdOhRBQUFITU3FunXr4Ovri5s3b8LExERlngkTJqBz584AgD///BMHDhxQ+vvEiRMV54NMmzYNCQkJWL9+PW7evIlLly5BW1tbbTtURVZWlmLdypLJZAgICEBUVBQmTJiApk2b4s6dO1i9ejUeP36MgwcPVmk5O3fuxA8//ICVK1di+PDhasu8qT3Wrl2LgIAAjBgxAsXFxfj999/x6aef4siRI+jXr5+iXHBwMBYuXIiOHTti0aJF0NHRwZUrV3DmzBn06tULQOnBd9y4cWjevDnmzJkDExMT3Lx5E8ePH1fEJ2/7du3aISQkBMnJyVi7di0uXbqksk0tLCywevVqAMDz58+xdu1a9O3bF8+ePVO77cuSSqXw9/dHhw4dsGLFCkRGRmLlypVwdXVV2tcqs/4TJ05Ejx49lOo/fvw4du/erRgTz8nJwbZt2zBs2DCMHz8eubm52L59O/z9/XH16lW0bt1aZduJxWJMmDABQqEQZmZmuHPnDnr16gVLS0ssXLgQEokECxYsUHt+UlBQEMLCwjBkyBDMnDkTV65cQUhICB48eKCyjStj4cKFCAkJQVBQENq3b4+cnBxER0fjxo0b6Nmz5xvnX716NSwsLJCTk4MdO3Zg/PjxcHJyUmm3spKTk9GxY0cUFBRg2rRpMDc3R1hYGAICArB//34MHDgQAJCeno6oqChERUVh3Lhx8PHxwenTpzFnzhwkJiZi8+bNAIBRo0Zh/PjxuHv3Lry8vBTLuXbtGh4/fowffvihSm1S1W0qV1xcjB49euDhw4f48ssv4enpiYMHD2LChAlIT0/Hd999V6U4NElOTsYHH3yg+FK0tLTEsWPH8MUXXyAnJwfTp0/XOO+FCxdw9OjRKi1v7ty5aNq0KQoLCxU/Hq2srPDFF19Uex3S09MRHR0NLS0tTJkyBa6urmrbijGGgIAAnD17Fl988QVat26NEydO4D//+Q9evHihOE7InT9/Hnv37sW0adMgFAqxceNG9O7dG1evXlXaN8qKiYnBgAED0LdvX2zYsEEx/W3aWQWrgp07dzIALDIykqWmprJnz56x33//nZmbmzNdXV32/PlzxhhjYrGYSaVSpXkTEhKYUChkixYtUkzbsWMHA8BWrVqlsiyZTKaYDwBbvny5SpnmzZszPz8/xfuzZ88yAMze3p7l5OQopv/xxx8MAFu7dq2ibnd3d+bv769YDmOMFRQUMGdnZ9azZ0+VZXXs2JF5eXkp3qempjIAbMGCBYppiYmJTCAQsCVLlijNe+fOHaalpaUyPTY2lgFgYWFhimkLFixgZTfLxYsXGQC2e/dupXmPHz+uMt3R0ZH169dPJfYpU6aw8pu6fOyzZ89mVlZWzMfHR6lNd+3axfh8Prt48aLS/Js3b2YA2KVLl1SWV5afn5+ivr///ptpaWmxmTNnqi1bmfZgrHQ7lVVcXMy8vLxYt27dlOri8/ls4MCBKvuifJtnZWUxQ0ND1qFDB1ZYWKi2THFxMbOysmJeXl5KZY4cOcIAsPnz5yumjRkzhjk6OirVs3XrVgaAXb16Ve06l50XgNLngzHGvL29mY+PT5XXv7zY2FhmbGzMevbsySQSCWOMMYlEwoqKipTKZWZmMmtrazZu3DjFNPln0MjIiKWkpC
iVHzBgABOJRCwpKUkx7f79+0wgEChtt1u3bjEALCgoSGn+WbNmMQDszJkzimnl9005R0dHNmbMGMX7Vq1aqd3f30R
"text/plain": [
"<Figure size 200x200 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Validation set: (18167, 6)\n",
"hazardous\n",
"False    16399\n",
"True      1768\n",
"Name: count, dtype: int64\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAhUAAADECAYAAAAoGdPdAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA8hUlEQVR4nO3dd1wT9/8H8FcSQgJhLxm1iAxRUaso1oG4EbEWZ11V6Re1tWhttba2PxWtLXXUUXdbB47aCrbSasW9UOsGtyJLRURA2QRI8vn9wTf5EgJIIHCA7+fjkYfmuPvc++5yl3c+447HGGMghBBCCKklPtcBEEIIIaRpoKSCEEIIITpBSQUhhBBCdIKSCkIIIYToBCUVhBBCCNEJSioIIYQQohOUVBBCCCFEJyipIIQQQohOUFJBCCGE1LOsrCw8fPgQMpmM61B0ipIKQhqAyZMnw8jIiOswdCYkJAQ8Ho/rMMhr5smTJ9i+fbvqfVJSEnbv3s1dQGWUlJRg2bJl6NChA0QiEczNzeHq6orjx49zHZpOaZVUbN++HTweT/USi8Vwc3NDcHAw0tLS6ipGQggh5JV4PB4+/vhjHD58GElJSZg7dy7Onj3LdVgoKipC//79MX/+fPTu3Rvh4eE4evQoTpw4gW7dunEdnk7p1WShxYsXw8nJCVKpFNHR0di4cSP++ecf3Lp1C4aGhrqOkRBCCHklBwcHTJkyBYMGDQIA2NnZ4dSpU9wGBWDp0qW4ePEiDh8+jN69e3MdTp2qUVLh5+eHzp07AwCCgoJgaWmJlStXIjIyEmPHjtVpgISQhkcmk0GhUEBfX5/rUAhRs3r1asyYMQMZGRnw8PCARCLhNB6ZTIbVq1dj9uzZTT6hAHTUp6Jv374AgMTERADAixcvMGfOHLRr1w5GRkYwMTGBn58fYmNjNZaVSqUICQmBm5sbxGIx7OzsMHz4cMTHxwMobRMr2+RS/lX2IJ06dQo8Hg+///47vvrqK9ja2kIikWDo0KF4/PixxrovXryIQYMGwdTUFIaGhvDx8cG5c+cq3MbevXtXuP6QkBCNeXft2gVPT08YGBjAwsICY8aMqXD9VW1bWQqFAqtXr0bbtm0hFovRrFkzTJs2DS9fvlSbr0WLFhgyZIjGeoKDgzXKrCj25cuXa+xToLTqbuHChXBxcYFIJELz5s0xd+5cFBUVVbivyurdu7dGed9++y34fD5+/fXXGu2PFStWoHv37rC0tISBgQE8PT0RERFR4fp37doFLy8vGBoawtzcHL169cKRI0fU5jl06BB8fHxgbGwMExMTdOnSRSO28PBw1TG1srLChAkTkJKSojbP5MmT1WI2NzdH7969tap+TUlJQUBAAIyMjGBtbY05c+ZALpdrvf3lY6noM1tcXIwFCxbA09MTpqamkEgk8Pb2xsmTJ9XKUh6XFStWYPXq1XB2doZIJMKdO3cAANHR0ejSpQvEYjGcnZ2xefPmCrdNJpPhm2++US3fokULfPXVVxqfo8rOqxYtWmDy5Mmq9yUlJVi0aBFcXV0hFothaWmJnj174ujRo1Xu4/LNuIaGhmjXrh1++eWXKpcru2xSUpJq2u3bt2Fubo4hQ4aodbpLSEjAqFGjYGFhAUNDQ7z99ts4ePCgWnnKa1ZFn18jIyPV9paPuaKXsi+Bsn9OQkICfH19IZFIYG9vj8WLF6P8Q6nz8/Mxe/ZsNG/eHCKRCK1atcKKFSs05qsqhrLnt3KeK1euVLkfK+tDFBERAR6Pp1G7UN3zr0WLFgAAZ2dndO3aFS9evICBgYHGMasspuqcv5VdZ5WUx1S5Dffv38fLly9hbGwMHx8fGBoawtTUFEOGDMGtW7c0lr9+/Tr8/PxgYmICIyMj9OvXD//++6/aPMr9fObMGUybNg2WlpYwMTHBxIkTK/xeKHveAMDUqVMhFos19vOhQ4fg7e0NiUQCY2Nj+Pv74/bt21Xut/JqVFNRnjIBsL
S0BFB6Mu3fvx+jRo2Ck5MT0tLSsHnzZvj4+ODOnTuwt7cHAMjlcgwZMgTHjx/HmDFj8MknnyA3NxdHjx7FrVu34OzsrFrH2LFjMXjwYLX1zps3r8J4vv32W/B4PHzxxRd4/vw5Vq9ejf79+yMmJgYGBgYAgBMnTsDPzw+enp5YuHAh+Hw+tm3bhr59++Ls2bPw8vLSKPeNN95AaGgoACAvLw8fffRRheueP38+Ro8ejaCgIKSnp2Pt2rXo1asXrl+/DjMzM41lpk6dCm9vbwDAH3/8gT///FPt79OmTcP27dsRGBiImTNnIjExEevWrcP169dx7tw5CIXCCveDNrKyslTbVpZCocDQoUMRHR2NqVOnonXr1rh58yZWrVqFBw8eYP/+/VqtZ9u2bfi///s//PDDDxg3blyF87xqf6xZswZDhw7F+PHjUVxcjN9++w2jRo3CgQMH4O/vr5pv0aJFCAkJQffu3bF48WLo6+vj4sWLOHHiBAYOHAig9OT84IMP0LZtW8ybNw9mZma4fv06oqKiVPEp932XLl0QGhqKtLQ0rFmzBufOndM4plZWVli1ahWA0k5ja9asweDBg/H48eMKj31Zcrkcvr6+6Nq1K1asWIFjx47hhx9+gLOzs9pnrTrbP23aNPTv31+t/KioKOzevRs2NjYAgJycHPzyyy8YO3YspkyZgtzcXGzZsgW+vr64dOkS3nrrLY1jJ5VKMXXqVIhEIlhYWODmzZsYOHAgrK2tERISAplMhoULF6JZs2Ya2xcUFISwsDCMHDkSs2fPxsWLFxEaGoq7d+9qHOPqCAkJQWhoKIKCguDl5YWcnBxcuXIF165dw4ABA165/KpVq2BlZYWcnBxs3boVU6ZMQYsWLTT2W1UeP36MQYMGwd3dHXv37oWeXuklNS0tDd27d0dBQQFmzpwJS0tLhIWFYejQoYiIiMCwYcO02tZevXph586dqvfffvstAODrr79WTevevbvq/3K5HIMGDcLbb7+NZcuWISoqCgsXLoRMJsPixYsBAIwxDB06FCdPnsR//vMfvPXWWzh8+DA+//xzpKSkqD7H5Sn3W9k46pI25195CxYsgFQqrfa6anP+ViYzMxNA6feVq6srFi1aBKlUivXr16NHjx64fPky3NzcAJQmqN7e3jAxMcHcuXMhFAqxefNm9O7dG6dPn0bXrl3Vyg4ODoaZmRlCQkJw//59bNy4EcnJyarEpiILFy7Eli1b8Pvvv6slhDt37sSkSZPg6+uLpUuXoqCgABs3bkTPnj1x/fp1VcL2SkwL27ZtYwDYsWPHWHp6Onv8+DH77bffmKWlJTMwMGBPnjxhjDEmlUqZXC5XWzYxMZGJRCK2ePFi1bStW7cyAGzlypUa61IoFKrlALDly5drzNO2bVvm4+Ojen/y5EkGgDk4OLCcnBzV9L179zIAbM2aNaqyXV1dma+vr2o9jDFWUFDAnJyc2IABAzTW1b17d+bh4aF6n56ezgCwhQsXqqYlJSUxgUDAvv32W7Vlb968yfT09DSmx8XFMQAsLCxMNW3hwoWs7GE5e/YsA8B2796ttmxUVJTGdEdHR+bv768R+8cff8zKH+rysc+dO5fZ2NgwT09PtX26c+dOxufz2dmzZ9WW37RpEwPAzp07p7G+snx8fFTlHTx4kOnp6bHZs2dXOG919gdjpceprOLiYubh4cH69u2rVhafz2fDhg3T+Cwqj3lWVhYzNjZmXbt2ZYWFhRXOU1xczGxsbJiHh4faPAcOHGAA2IIFC1TTJk2axBwdHdXK+emnnxgAdunSpQq3ueyyANTOD8YY69ixI/P09NR6+8uLi4tjpqambMCAAUwmkzHGGJPJZKyoqEhtvpcvX7JmzZqxDz74QDVNeQ6amJiw58+fq80fEBDAxGIxS05OVk27c+cOEwgEasctJiaGAWBBQUFqy8+ZM4cBYCdOnFBNK//ZVHJ0dGSTJk1Sve/QoUOFn/dXUV7HEhMTVd
MePHjAALBly5ZVe9kXL16wNm3asFatWrGMjAy1+WbNmsUAqJ03ubm5zMnJibVo0UL1mVRes8LDwzXWJZFI1La3rLL
"text/plain": [
"<Figure size 200x200 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Test set: (18168, 6)\n",
"hazardous\n",
"False    16400\n",
"True      1768\n",
"Name: count, dtype: int64\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAfQAAADECAYAAABp29OTAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA5wElEQVR4nO3dd1xT1/sH8E8SQgJhhi0OkKEoThTrQNCqiFjFqrRq6/riqLWtVmtr+1VxtPxad922VVGsrava2jqrVrFW6wBFUZHhqrIUhEAISc7vD5p8CWELXEye9+uVl+Zy7rnPvbk5zx3n3PAYYwyEEEIIeanxuQ6AEEIIIS+OEjohhBBiACihE0IIIQaAEjohhBBiACihE0IIIQaAEjohhBBiACihE0IIIQaAEjohhBBiACihE0IIMRp5eXlIS0uDTCbjOpQ6RwmdkEZg/PjxsLCw4DqMOhMZGQkej8d1GKSB5OfnY9WqVdr3OTk5WLduHXcBlcIYw+bNm/HKK6/A3NwcVlZWcHd3R0xMDNeh1bkaJfRt27aBx+NpX2KxGN7e3pg+fTrS09PrK0ZCCCGNmJmZGf773/9i586dePDgASIjI/HLL79wHRYAYPTo0Zg6dSp8fHywY8cOHD9+HCdOnMDrr7/OdWh1zqQ2My1atAju7u6Qy+WIjY3Fhg0b8NtvvyEhIQHm5uZ1HSMhhJBGTCAQYOHChRg7dizUajWsrKzw66+/ch0Wtm/fjh9//BExMTEYPXo01+HUu1ol9JCQEHTp0gUAEBERATs7O6xYsQIHDx7EqFGj6jRAQkjjo1QqoVarYWpqynUopJGYNWsW3njjDTx48AA+Pj6wsbHhOiQsXboUo0aNMopkDtTRPfS+ffsCAFJTUwEAT58+xezZs9GuXTtYWFjAysoKISEhiI+P15tXLpcjMjIS3t7eEIvFcHFxweuvv47k5GQAQFpams5l/rKvoKAgbV2nT58Gj8fDjz/+iE8//RTOzs6QSCQYMmQIHjx4oLfsCxcuYODAgbC2toa5uTkCAwNx7ty5ctcxKCio3OVHRkbqlY2JiYGfnx/MzMwglUrx5ptvlrv8ytatNLVajVWrVqFt27YQi8VwcnLClClT8OzZM51ybm5uGDx4sN5ypk+frldnebEvXbpUb5sCQFFRERYsWABPT0+IRCI0a9YMc+bMQVFRUbnbqrSgoCC9+j7//HPw+Xx8//33tdoey5YtQ48ePWBnZwczMzP4+flh79695S4/JiYG/v7+MDc3h62tLXr37o1jx47plDl8+DACAwNhaWkJKysrdO3aVS+2PXv2aD9Te3t7vPXWW3j06JFOmfHjx+vEbGtri6CgIJw9e7bK7aTx6NEjhIWFwcLCAg4ODpg9ezZUKlWN179sLOXtswqFAvPnz4efnx+sra0hkUgQEBCAU6dO6dSl+VyWLVuGVatWwcPDAyKRCDdv3gQAxMbGomvXrhCLxfDw8MCmTZvKXTelUonFixdr53dzc8Onn36qtx9V9L1yc3PD+PHjte+Li4uxcOFCeHl5QSwWw87ODr169cLx48cr3cZlbx2am5ujXbt2+Pbbb2s0X3mvbdu2acvfunULI0aMgFQqhVgsRpcuXfDzzz/r1ZuTk4OZM2fCzc0NIpEITZs2xdixY5GVlaVt0yp7ld5WV69eRUhICKysrGBhYYFXX30Vf/31V63X/+TJkwgICIBEIoGNjQ2GDh2KxMREnTKl+0s0bdoU3bt3h4mJCZydncHj8XD69OlKt6tmfs3L0tIS/v7+OHDggE65oKAg+Pr6VliPZj/VfAYymQwJCQlo1qwZQkNDYWVlBYlEUuF3MiUlBSNHjoRUKoW5uTleeeUVvasMNckxNWn7apKLKlOrM/SyNMnXzs4OQMmGOXDgAEaOHAl3d3ekp6dj06ZNCAwMxM2bN9GkSRMAgEqlwuDBg/H777/jzTffxAcffIC8vDwcP34cCQkJ8PDw0C
5j1KhRGDRokM5y586dW248n3/+OXg8Hj7++GNkZGRg1apV6NevH+Li4mBmZgagZEcNCQmBn58fFixYAD6fj61bt6Jv3744e/Ys/P399ept2rQpoqKiAJR0AnnnnXfKXfa8efMQHh6OiIgIZGZmYs2aNejduzeuXr1a7lHr5MmTERAQAADYv38/fvrpJ52/T5kyBdu2bcOECRPw/vvvIzU1FWvXrsXVq1dx7tw5CIXCcrdDTeTk5GjXrTS1Wo0hQ4YgNjYWkydPho+PD65fv46VK1fizp07el+6qmzduhX//e9/sXz58gqPmqvaHqtXr8aQIUMwZswYKBQK/PDDDxg5ciQOHTqE0NBQbbmFCxciMjISPXr0wKJFi2BqaooLFy7g5MmTGDBgAICSxm3ixIlo27Yt5s6dCxsbG1y9ehVHjhzRxqfZ9l27dkVUVBTS09OxevVqnDt3Tu8ztbe3x8qVKwEADx8+xOrVqzFo0CA8ePCgyjMWlUqF4OBgdOvWDcuWLcOJEyewfPlyeHh46Oxr1Vn/KVOmoF+/fjr1HzlyBDt37oSjoyMA4Pnz5/j2228xatQoTJo0CXl5efjuu+8QHByMixcvomPHjnqfnVwux+TJkyESiSCVSnH9+nUMGDAADg4OiIyMhFKpxIIFC+Dk5KS3fhEREYiOjsaIESMwa9YsXLhwAVFRUUhMTNT7jKsjMjISUVFRiIiIgL+/P54/f45Lly7hypUr6N+/f5Xzr1y5Evb29nj+/Dm2bNmCSZMmwc3NTW+7afTu3Rs7duzQvv/8888BAJ999pl2Wo8ePQAAN27cQM+ePeHq6opPPvkEEokEu3fvRlhYGPbt24dhw4YBKGlHAgICkJiYiIkTJ6Jz587IysrCzz//jIcPH2rv+2ps3rwZiYmJ2n0MANq3b69dZkBAAKysrDBnzhwIhUJs2rQJQUFB+OOPP9CtW7carf+JEycQEhKCli1bIjIyEoWFhVizZg169uyJK1euwM3NrcJtu3z58hr3q9KsZ1ZWFtavX4+RI0ciISEBrVq1qlE9GtnZ2QCAL7/8Es7Ozvjoo48gFovxzTffoF+/fjh+/Dh69+4NAEhPT0ePHj1QUFCA999/H3Z2doiOjsaQIUOwd+9e7eelUZ0cU1ZFbV9tclGFWA1s3bqVAWAnTpxgmZmZ7MGDB+yHH35gdnZ2zMzMjD18+JAxxphcLmcqlUpn3tTUVCYSidiiRYu007Zs2cIAsBUrVugtS61Wa+cDwJYuXapXpm3btiwwMFD7/tSpUwwAc3V1Zc+fP9dO3717NwPAVq9era3by8uLBQcHa5fDGGMFBQXM3d2d9e/fX29ZPXr0YL6+vtr3mZmZDABbsGCBdlpaWhoTCATs888/15n3+vXrzMTERG96UlISA8Cio6O10xYsWMBKfyxnz55lANjOnTt15j1y5Ije9BYtWrDQ0FC92N99911W9qMuG/ucOXOYo6Mj8/Pz09mmO3bsYHw+n509e1Zn/o0bNzIA7Ny5c3rLKy0wMFBb36+//spMTEzYrFmzyi1bne3BWMnnVJpCoWC+vr6sb9++OnXx+Xw2bNgwvX1R85nn5OQwS0tL1q1bN1ZYWFhuGYVCwRwdHZmvr69OmUOHDjEAbP78+dpp48aNYy1atNCpZ/PmzQwAu3jxYrnrXHpeADrfD8YY69SpE/Pz86vx+peVlJTErK2tWf/+/ZlSqWSMMaZUKllRUZFOuWfPnjEnJyc2ceJE7TTNd9DKyoplZGTolA8LC2NisZjdu3dPO+3mzZtMIBDofG5xcXEMAIuIiNCZf/bs2QwAO3nypHZa2X1To0WLFmzcuHHa9x06dCh3f6+Kph1LTU3VTrtz5w4DwL766qtq11N63y7r1VdfZe3atWNyuVw7Ta1Wsx49ejAvLy/ttPnz5zMAbP/+/Xp1lG6bNMrbxzTCwsKYqakpS05O1k77559/mKWlJevdu7d2WnXXv2PHjszR0ZFlZ2drp8
XHxzM+n8/Gjh2rnVb2O5qRkcEsLS1ZSEgIA8BOnTpVbrwVzc8YY8eOHWMA2O7du7XTAgMDWdu2bSusR7Ofbt26Vee
"text/plain": [
"<Figure size 200x200 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Вывод распределения количества наблюдений по меткам (классам)\n",
"print(data.hazardous.value_counts())\n",
"print()\n",
"\n",
"\n",
"data = data[['est_diameter_min', 'est_diameter_max', 'relative_velocity', 'miss_distance', 'absolute_magnitude', 'hazardous']].copy()\n",
"\n",
"data_train, data_val, data_test = split_stratified_into_train_val_test(\n",
" data, stratify_colname=\"hazardous\", frac_train=0.60, frac_val=0.20, frac_test=0.20\n",
")\n",
"\n",
"print(\"Обучающая выборка: \", data_train.shape)\n",
"print(data_train.hazardous.value_counts())\n",
"hazardous_counts = data_train['hazardous'].value_counts()\n",
"plt.figure(figsize=(2, 2))# Установка размера графика\n",
"plt.pie(hazardous_counts, labels=hazardous_counts.index, autopct='%1.1f%%', startangle=90)# Построение круговой диаграммы\n",
"plt.title('Распределение классов hazardous в обучающей выборке')# Добавление заголовка\n",
"plt.show()# Отображение графика\n",
"\n",
"print(\"Контрольная выборка: \", data_val.shape)\n",
"print(data_val.hazardous.value_counts())\n",
"hazardous_counts = data_val['hazardous'].value_counts()\n",
"plt.figure(figsize=(2, 2))\n",
"plt.pie(hazardous_counts, labels=hazardous_counts.index, autopct='%1.1f%%', startangle=90)\n",
"plt.title('Распределение классов hazardous в контрольной выборке')\n",
"plt.show()\n",
"\n",
"print(\"Тестовая выборка: \", data_test.shape)\n",
"print(data_test.hazardous.value_counts())\n",
"hazardous_counts = data_test['hazardous'].value_counts()\n",
"plt.figure(figsize=(2, 2))\n",
"plt.pie(hazardous_counts, labels=hazardous_counts.index, autopct='%1.1f%%', startangle=90)\n",
"plt.title('Распределение классов hazardous в тестовой выборке')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Training set after oversampling: (100249, 6)\n",
"hazardous\n",
"True     51052\n",
"False    49197\n",
"Name: count, dtype: int64\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAqwAAADECAYAAABa+nMuAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAABA6klEQVR4nO3dd1gU1/oH8O/uArtUASmiUUSwYomgJkEROxpNYo/GWH+2GJOIGo0aFVu4XpPYS5olajRqLDcaazSJeo29tyBq7FTpZWH3/P7g7oZllyowC3w/z8OjOztz5p3ZM7PvnjlzRiaEECAiIiIiMlNyqQMgIiIiIsoPE1YiIiIiMmtMWImIiIjIrDFhJSIiIiKzxoSViIiIiMwaE1YiIiIiMmtMWImIiIjIrDFhJSIiIiKzxoSViIionNNqtYiJicHdu3elDoWoVDBhJTIDw4YNg52dndRhlJjQ0FDIZDKpwyB6YSdOnMBvv/2mf/3bb7/h5MmT0gWUw7NnzzBhwgR4enrCysoKrq6uaNSoERITE6UOjajEFSlhXb9+PWQymf5PpVKhXr16GD9+PCIjI0srRiIiIkk8fPgQ48aNw9WrV3H16lWMGzcODx8+lDos3LlzBy1btsTWrVsxZswY7N27F4cPH8avv/4KW1tbqcOjIvjll18gk8lQvXp1aLVak/PUrl1bn3vJ5XI4OjqiSZMmGD16NE6fPv3C5avVaixduhTNmzeHg4MDHB0d4evri9GjR+PWrVsAgG7dusHJyclkvpeQkAAPDw+88sor0Gq1+O233/Txnj9/3mj+4jTSWBRp7v+ZO3cuvLy8kJ6ejhMnTmD16tX45ZdfcO3aNdjY2BSnSCIiIrPTu3dvLFmyBE2bNgUAvPbaa+jdu7fEUQFjxoyBlZUV/vzzT9SoUUPqcOgFbN68GbVr18b9+/dx9OhRdOrUyeR8L7/8MiZNmgQASEpKws2bN7F9+3Z88803CAkJwZdfflns8vv06YP9+/dj4MCBGDVqFDIzM3Hr1i3s3bsXAQEBaNCgAVatWoXGjRsjJCQEP/zwg8Hy06dPR0xMDA4cOAC53LAtNDQ0FD///HNxdo0hUQTr1q0TAMTZs2cNpk+cOFEAED/88ENRiiOi/xk6dKiwtbWVOoxCy8zMFBkZGXm+P3v2bFHE0wuR2crKyhKXLl0Sly5dEllZWVKHI86dOycAiEOHDkkdCr2g5ORkYWtrK5YtWyaaN28uhg0bZnI+T09P0b17d6PpqampomfPngKAWLVqVbHKP3PmjAAgFixYYPReVlaWiImJ0b9euHChACAOHjxosLxcLhdTpkzRTzt27JgAIF5++WUBQJw/f96g3OJ855VIH9YOHToAAO7duwcAiIuLw+TJk9GkSRPY2dnBwcEB3bp1w+XLl42WTU9PR2hoKOrVqweVSgUPDw/07t0bERERAID79+8bdEPI/deuXTt9Wbom6B9//BHTp09HtWrVYGtrizfffNPkJZzTp0+ja9euqFKlCmxsbBAUFJRn36R27dqZXH9oaKjRvJs2bYK/vz+sra3h7OyMAQMGmFx/ftuWk1arxZIlS+Dr6wuVSgV3d3eMGTMGz58/N5ivdu3a6NGjh9F6xo8fb1SmqdgXLVpktE8BICMjA7Nnz4aPjw+USiVq1qyJKVOmICMjw+S+yqldu3ZG5S1YsAByudzoF1ph98fnn3+OgIAAVK1aFdbW1vD398eOHTtMrn/Tpk1o1aoVbGxs4OTkhLZt2+LQoUMG8+zfvx9BQUGwt7eHg4MDWrZsaRTb9u3b9Z+pi4sL3n33XTx+/NhgnmHDhhnE7OTkhHbt2uH48eMF7iedx48fo2fPnrCzs4OrqysmT54MjUZT5O3PHYupOqtWqzFr1iz4+/ujSpUqsLW1RWBgII4dO2ZQlu5z+fzzz7FkyRJ4e3tDqVTixo0bALL7+LVs2RIqlQre3t746quvTG5bVlYW5s2bp1
++du3amD59ulE9yuu4ql27NoYNG6Z/nZmZiTlz5qBu3bpQqVSoWrUq2rRpg8OHD+e7j3N3bbKxsUGTJk3w7bffFmk5U3/r168H8M/lrrt37yI4OBi2traoXr065s6dCyGEQblSHt9FPWeW5HGwZ88edO/eHdWrV4dSqYS3tzfmzZtnVN9NbYvus7h//36x9k9h66KuzikUCjRr1gzNmjXDzp07IZPJULt2baN15Zb7Mm61atXw9ttv48GDB/p5ch5fecndJ/zPP/+ESqVCREQEfH19oVQqUa1aNYwZMwZxcXFGyxf2cytMndXFq6vrQHZrn7+/P7y8vPD06VP99MLWbVPyO4fJZDKDfsWF3UYAuHXrFvr37w9XV1dYW1ujfv36mDFjhtF8OT+7/Na7f/9+BAYGwtbWFvb29ujevTuuX79e4Pbp7Nq1C2lpaejXrx8GDBiAnTt3Ij09vdDLW1tbY+PGjXB2dsaCBQuMzi+FKV+Xb7Vu3dqofIVCgapVq+pfT5w4EU2bNsW4ceOQnp4OjUaDsWPHwtPTE7NnzzZa/oMPPoCTk5PJc3pRFatLQG66jdVt1N27d7F7927069cPXl5eiIyMxFdffYWgoCDcuHED1atXBwBoNBr06NEDv/76KwYMGICPPvoISUlJOHz4MK5duwZvb2/9OgYOHIjXX3/dYL3Tpk0zGc+CBQsgk8kwdepUREVFYcmSJejUqRMuXboEa2trAMDRo0fRrVs3+Pv7Y/bs2ZDL5Vi3bh06dOiA48ePo1WrVkblvvTSSwgLCwMAJCcn47333jO57pkzZ6J///4YOXIkoqOjsXz5crRt2xYXL16Eo6Oj0TKjR49GYGAgAGDnzp3YtWuXwftjxozB+vXrMXz4cHz44Ye4d+8eVqxYgYsXL+LkyZOwtLQ0uR+KIj4+Xr9tOWm1Wrz55ps4ceIERo8ejYYNG+Lq1atYvHgx/vrrL+zevbtI61m3bh0+/fRTfPHFF3jnnXdMzlPQ/li6dCnefPNNDBo0CGq1Glu3bkW/fv2wd+9edO/eXT/fnDlzEBoaioCAAMydOxdWVlY4ffo0jh49ii5dugDI/uIbMWIEfH19MW3aNDg6OuLixYs4cOCAPj7dvm/ZsiXCwsIQGRmJpUuX4uTJk0afqYuLCxYvXgwAePToEZYuXYrXX38dDx8+NPnZ56TRaBAcHIxXXnkFn3/+OY4cOYIvvvgC3t7eBnWtMNs/ZswYo8s+Bw4cwObNm+Hm5gYASExMxLfffqu/BJSUlITvvvsOwcHBOHPmDF5++WWjzy49PR2jR4+GUqmEs7Mzrl69ii5dusDV1RWhoaHIysrC7Nmz4e7ubrR9I0eOxIYNG9C3b19MmjQJp0+fRlhYGG7evGn0GRdGaGgowsLCMHLkSLRq1QqJiYk4d+4cLly4gM6dOxe4/OLFi+Hi4oLExESsXbsWo0aNQu3atfO8HNe2bVts3LhR/3rBggUAYPBlFxAQoP+/RqNB165d8eqrr+Lf//43Dhw4gNmzZyMrKwtz587Vzyfl8Z1zWwo6Z5b0cbB+/XrY2dlh4sSJsLOzw9GjRzFr1iwkJiZi0aJFL7zN+SluXczKyjKZ3OQnMDAQo0ePhlarxbVr17BkyRI8efKkSD9kc4uNjUV6ejree+89dOjQAWPHjkVERARWrlyJ06dP4/Tp01AqlQCK9rkVts7mlJmZiT59+uDBgwc4efIkPDw89O+9aN1WKpVGPyTPnj2LZcuWGUwr7DZeuXIFgYGBsLS0xOjRo1G7dm1ERETg559/1h/POek+OwC4efMmPvvsM4P3N27ciKFDhyI4OBgLFy5EamoqVq9ejTZt2uDixYuF+lGzefNmtG/fHtWqVcOAAQPwySef4Oeff0a/fv0KXFbHzs4OvXr1wnfffYcbN27A19e3SOV7enrq523dujUsLPJODS0sLPD1118jICAA8+bNg5ubGy5cuIADBw6Y7BLq4OCAkJAQzJo1Cx
cuXICfn1+ht8tIUZpjdV0Cjhw5IqKjo8XDhw/F1q1bRdWqVYW1tbV49OiREEKI9PR0odFoDJa9d++eUCqVYu7cufp
"text/plain": [
"<Figure size 200x200 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from imblearn.over_sampling import ADASYN\n",
"\n",
"# Create the ADASYN sampler\n",
"ada = ADASYN()\n",
"\n",
"# Oversample the minority class in the training set only\n",
"X_resampled, y_resampled = ada.fit_resample(data_train.drop(columns=['hazardous']), data_train['hazardous'])\n",
"\n",
"# Build a new DataFrame from the resampled features\n",
"data_train_adasyn = pd.DataFrame(X_resampled)\n",
"data_train_adasyn['hazardous'] = y_resampled  # Add the target column back\n",
"\n",
"# Print a summary of the resampled training set\n",
"print(\"Training set after oversampling:\", data_train_adasyn.shape)\n",
"print(data_train_adasyn['hazardous'].value_counts())\n",
"hazardous_counts = data_train_adasyn['hazardous'].value_counts()\n",
"plt.figure(figsize=(2, 2))\n",
"plt.pie(hazardous_counts, labels=hazardous_counts.index, autopct='%1.1f%%', startangle=90)\n",
"plt.title('Distribution of hazardous classes in the training set after ADASYN')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<p style=\"margin: 40px;\">We also balance the data by undersampling. This method balances the training set by reducing the number of majority-class instances until it matches the size of the minority class.</p>"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Training set after undersampling: (10608, 6)\n",
"hazardous\n",
"False    5304\n",
"True     5304\n",
"Name: count, dtype: int64\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAuYAAADECAYAAADTYuRHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAABAH0lEQVR4nO3dd1gUV9sG8Ht3gaWrFClqFEFBxIolsWFHxc8aNUZjiy3G5NXEFDXGEpXXmNhbEmM3mthfNdZYYo8aCzZE7I0iUgQW2N3z/UF2w7JLdXEQ7t917aU7O3PmmTNnZp6dPXOQCSEEiIiIiIhIUnKpAyAiIiIiIibmRERERETFAhNzIiIiIqJigIk5EREREVExwMSciIiIiKgYYGJORERERFQMMDEnIiIiIioGmJgTERERERUDTMyJiIhec1qtFrGxsbh9+7bUoRDRS2BiTlQMDBo0CPb29lKHYTZTpkyBTCaTOgyil3b8+HEcOXJE//7IkSM4ceKEdAFl8fTpU4wZMwaVK1eGlZUVXF1d4e/vj8TERKlDo2Jo0KBBqFKlitRhFAtHjhyBTCYzOLaLS/0UKDFftWoVZDKZ/mVtbY3q1atj9OjRiIqKKqoYiYiIJPHgwQOMGjUKYWFhCAsLw6hRo/DgwQOpw8KtW7fQsGFDbNy4ESNGjMCuXbtw4MAB/PHHH7Czs5M6PMqDLp86d+6cyc87d+5cLJJEevUsCrPQtGnT4OXlBZVKhePHj2Pp0qX4/fffceXKFdja2po7RiIiIkn06NED8+bNQ+3atQEAb731Fnr06CFxVMCIESNgZWWF06dPo0KFClKHQ/Ta++mnn6DVaqUOo3CJeceOHdGgQQMAwNChQ+Hs7Iw5c+Zgx44d6Nu3r1kDJKLiR61WQ6vVwsrKSupQiIqUUqnEyZMnceXKFQBAQEAAFAqFpDGdP38ehw4dwv79+5mUkySEEFCpVLCxsZE6FLOxtLSUOgQAZupj3rp1awDAnTt3AABxcXEYN24catWqBXt7ezg6OqJjx464dOmS0bIqlQpTpkxB9erVYW1tDQ8PD/To0QORkZEAgLt37xp0n8n+atmypb4sXZ+hX3/9FRMmTIC7uzvs7OzQpUsXkz89njlzBh06dECZMmVga2uLoKCgHPsOtmzZ0uT6p0yZYjTvunXrEBgYCBsbGzg5OeGdd94xuf7cti0rrVaLefPmoWbNmrC2toabmxtGjBiB58+fG8xXpUoVdO7c2Wg9o0ePNirTVOyzZ882qlMASEtLw+TJk+Hj4wOlUolKlSrh888/R1pamsm6yqply5ZG5c2YMQNyuRy//PJLoerju+++Q5MmTeDs7AwbGxsEBgZi8+bNJte/bt06NGrUCLa2tihXrhxatGiB/fv3G8yzZ88eBAUFwcHBAY6OjmjYsKFRbJs2bdLvUxcXF/Tv3x+PHj0ymGfQoEEGMZcrVw4tW7bEsWPH8qwnnUePHqFbt26wt7eHq6srxo0bB41GU+Dtzx6LqTabnp6Or7/+GoGBgShTpgzs7OzQvHlzHD582KAs3X757rvvMG/ePHh7e0OpVOLatWsAMvvgNmzYENbW1vD29sYPP/xgctvUajW++eYb/fJVqlTBhAkTjNpRTsdVlSpVMGjQIP37jIwMTJ06FdWqVYO1tTWcnZ3RrFkzHDhwINc6zt4lz9bWFrVq1cLy5csLtJyp16pVqwD8+8zA7du3ERwcDDs7O3h6emLatGkQQhiUK+XxXdBzprmPgyVLlqBmzZpQKpXw9PTEhx9+iPj4+Dy3Rbcv7t69W6j6yW9b1LU5hUKBOnXqoE6dOti6dStkMlm+uhlUqVJFXw9yuRzu7u7o06cP7t+/r58n6/GVk+zPbJw+fRrW1taIjIzU15+7uztGjBiBuLg4o+Xzu9/y02Z18eraOgAkJSUhMDAQXl5eePLkiX56ftu2Kbmdw7L3Dc7vNgLAjRs30Lt3b7i6usLGxga+vr
6YOHGi0XxZ911u692zZw+aN28OOzs7ODg4ICQkBFevXs1z+woqazv58ccf9W23YcOGOHv2rNH827dvR0BAAKytrREQEIBt27aZLLeg5599+/ahQYMGsLGx0Z/rDxw4gGbNmqFs2bKwt7eHr68vJkyYoF+2MNeaxYsXo2rVqrC1tUX79u3x4MEDCCHwzTffoGLFirCxsUHXrl2N2rsuzv3796Nu3bqwtraGv78/tm7dmmcdZ+9jXtA637RpE/z9/Q3qvDD91gt1xzw7XRLt7OwMALh9+za2b9+OXr16wcvLC1FRUfjhhx8QFBSEa9euwdPTEwCg0WjQuXNn/PHHH3jnnXfwn//8B0lJSThw4ACuXLkCb29v/Tr69u2LTp06Gax3/PjxJuOZMWMGZDIZvvjiC0RHR2PevHlo27YtLl68qP92d+jQIXTs2BGBgYGYPHky5HI5Vq5cidatW+PYsWNo1KiRUbkVK1ZEaGgoAODFixf44IMPTK570qRJ6N27N4YOHYqYmBgsXLgQLVq0wIULF1C2bFmjZYYPH47mzZsDALZu3Wp0AI0YMQKrVq3C4MGD8fHHH+POnTtYtGgRLly4gBMnTpjlW158fLx+27LSarXo0qULjh8/juHDh6NGjRoICwvD3LlzcfPmTWzfvr1A61m5ciW++uorfP/993j33XdNzpNXfcyfPx9dunRBv379kJ6ejo0bN6JXr17YtWsXQkJC9PNNnToVU6ZMQZMmTTBt2jRYWVnhzJkzOHToENq3bw8g8wI/ZMgQ1KxZE+PHj0fZsmVx4cIF7N27Vx+fru4bNmyI0NBQREVFYf78+Thx4oTRPnVxccHcuXMBAA8fPsT8+fPRqVMnPHjwwOS+z0qj0SA4OBiNGzfGd999h4MHD+L777+Ht7e3QVvLz/aPGDECbdu2NSh/7969WL9+PcqXLw8ASExMxPLly9G3b18MGzYMSUlJ+PnnnxEcHIy//voLdevWNdp3KpUKw4cPh1KphJOTE8LCwtC+fXu4urpiypQpUKvVmDx5Mtzc3Iy2b+jQoVi9ejXefvttfPrppzhz5gxCQ0Nx/fr1HC8auZkyZQpCQ0MxdOhQNGrUCImJiTh37hz+/vtvtGvXLs/l586dCxcXFyQmJmLFihUYNmwYqlSpYlRvOi1atMDatWv172fMmAEABhf1Jk2a6P+v0WjQoUMHvPnmm/j222+xd+9eTJ48GWq1GtOmTdPPJ+XxnXVb8jpnmvs4mDJlCqZOnYq2bdvigw8+QHh4OJYuXYqzZ8+abbtzUti2qFarTSZxuWnevDmGDx8OrVaLK1euYN68eXj8+HGBvrBn9+zZM6hUKnzwwQdo3bo1Ro4cicjISCxevBhnzpzBmTNnoFQqARRsv+W3zWaVkZGBnj174v79+zhx4gQ8PDz0n71s21YqlUZfmM+ePYsFCxYYTMvvNl6+fBnNmzeHpaUlhg8fjipVqiAyMhI7d+7UH89Z6fYdAFy/fh0zZ840+Hzt2rUYOHAggoODMWvWLKSkpGDp0qVo1qwZLly4UCR9xH/55RckJSVhxIgRkMlk+Pbbb9GjRw/cvn1bX5/79+9Hz5494e/vj9DQUDx79gyDBw9GxYoVjcoryD4KDw9H3759MWLECAwbNgy+vr64evUqOnfujNq1a2PatGlQKpW4deuWwU3Ogl5r1q9fj/T0dHz00UeIi4vDt99+i969e6N169Y4cuQIvvjiC9y6dQsLFy7EuHHjsGLFCoPlIyIi0KdPH4wcORIDBw7EypUr0atXL+zduzdf14bC1Pnu3bvRp08f1KpVC6GhoXj+/Dnef//9wv2iJQpg5cqVAoA4ePCgiImJEQ8ePBAbN24Uzs7OwsbGRjx8+FAIIYRKpRIajcZg2Tt37gilUimmTZumn7ZixQoBQMyZM8doXVqtVr8cADF79myjeWrWrCmCgoL07w8fPiwAiAoVKojExET99N9++00AEPPnz9eXXa1aNREcHK
xfjxBCpKSkCC8vL9GuXTujdTVp0kQEBATo38fExAgAYvLkyfppd+/eFQqFQsyYMcNg2bCwMGFhYWE0PSIiQgAQq1e
"text/plain": [
"<Figure size 200x200 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from imblearn.under_sampling import RandomUnderSampler\n",
"\n",
"rus = RandomUnderSampler()  # Create the RandomUnderSampler\n",
"\n",
"# Undersample the majority class in the training set only\n",
"X_resampled, y_resampled = rus.fit_resample(data_train.drop(columns=['hazardous']), data_train['hazardous'])\n",
"\n",
"# Build a new DataFrame from the resampled features\n",
"data_train_undersampled = pd.DataFrame(X_resampled)\n",
"data_train_undersampled['hazardous'] = y_resampled  # Add the target column back\n",
"\n",
"# Print a summary of the resampled training set\n",
"print(\"Training set after undersampling:\", data_train_undersampled.shape)\n",
"print(data_train_undersampled['hazardous'].value_counts())\n",
"\n",
"# Visualize the class distribution\n",
"hazardous_counts = data_train_undersampled['hazardous'].value_counts()\n",
"plt.figure(figsize=(2, 2))\n",
"plt.pie(hazardous_counts, labels=hazardous_counts.index, autopct='%1.1f%%', startangle=90)\n",
"plt.title('Distribution of hazardous classes in the training set after undersampling')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}