PIbd42NevaevaCourses/lec5.ipynb

2025-01-20 03:36:42 +04:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Lab 5\n",
"\n",
"Defining the business goal for the clustering task\n",
"\n",
"Business goal: identify time periods with similar market conditions based on historical stock price data.\n",
"\n",
"Problem statement: group time periods (for example, trading days) by similar characteristics of market activity.\n",
" \n",
"Dataset columns and their meaning:\n",
"\n",
"Date - the date the record refers to, i.e. the specific day on which Starbucks shares were traded.\n",
"\n",
"Open - opening price: the price of Starbucks shares at the start of the trading day. It shows the price at which trading began on that day and is often compared with the closing price to determine the daily trend.\n",
"\n",
"High - daily high: the highest price Starbucks shares reached during the trading day.\n",
"\n",
"Low - daily low: the lowest price at which Starbucks shares traded during the day.\n",
"\n",
"Close - closing price: the price of Starbucks shares at the end of the trading day. It is one of the main indicators used in stock analysis, because it reflects the final value of the shares for the day and is often used to compute daily changes and long-term trends.\n",
"\n",
"Adj Close - adjusted closing price: the closing price adjusted for all corporate actions.\n",
"\n",
"Volume - trading volume: the number of Starbucks shares bought and sold during the day."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Loading the dataset"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Open</th>\n",
" <th>High</th>\n",
" <th>Low</th>\n",
" <th>Close</th>\n",
" <th>Adj Close</th>\n",
" <th>Volume</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1992-06-26</td>\n",
" <td>0.328125</td>\n",
" <td>0.347656</td>\n",
" <td>0.320313</td>\n",
" <td>0.335938</td>\n",
" <td>0.260703</td>\n",
" <td>224358400</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1992-06-29</td>\n",
" <td>0.339844</td>\n",
" <td>0.367188</td>\n",
" <td>0.332031</td>\n",
" <td>0.359375</td>\n",
" <td>0.278891</td>\n",
" <td>58732800</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1992-06-30</td>\n",
" <td>0.367188</td>\n",
" <td>0.371094</td>\n",
" <td>0.343750</td>\n",
" <td>0.347656</td>\n",
" <td>0.269797</td>\n",
" <td>34777600</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1992-07-01</td>\n",
" <td>0.351563</td>\n",
" <td>0.359375</td>\n",
" <td>0.339844</td>\n",
" <td>0.355469</td>\n",
" <td>0.275860</td>\n",
" <td>18316800</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1992-07-02</td>\n",
" <td>0.359375</td>\n",
" <td>0.359375</td>\n",
" <td>0.347656</td>\n",
" <td>0.355469</td>\n",
" <td>0.275860</td>\n",
" <td>13996800</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8031</th>\n",
" <td>2024-05-17</td>\n",
" <td>75.269997</td>\n",
" <td>78.000000</td>\n",
" <td>74.919998</td>\n",
" <td>77.849998</td>\n",
" <td>77.849998</td>\n",
" <td>14436500</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8032</th>\n",
" <td>2024-05-20</td>\n",
" <td>77.680000</td>\n",
" <td>78.320000</td>\n",
" <td>76.709999</td>\n",
" <td>77.540001</td>\n",
" <td>77.540001</td>\n",
" <td>11183800</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8033</th>\n",
" <td>2024-05-21</td>\n",
" <td>77.559998</td>\n",
" <td>78.220001</td>\n",
" <td>77.500000</td>\n",
" <td>77.720001</td>\n",
" <td>77.720001</td>\n",
" <td>8916600</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8034</th>\n",
" <td>2024-05-22</td>\n",
" <td>77.699997</td>\n",
" <td>81.019997</td>\n",
" <td>77.440002</td>\n",
" <td>80.720001</td>\n",
" <td>80.720001</td>\n",
" <td>22063400</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8035</th>\n",
" <td>2024-05-23</td>\n",
" <td>80.099998</td>\n",
" <td>80.699997</td>\n",
" <td>79.169998</td>\n",
" <td>79.260002</td>\n",
" <td>79.260002</td>\n",
" <td>4651418</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>8036 rows × 7 columns</p>\n",
"</div>"
],
"text/plain": [
" Date Open High Low Close Adj Close \\\n",
"0 1992-06-26 0.328125 0.347656 0.320313 0.335938 0.260703 \n",
"1 1992-06-29 0.339844 0.367188 0.332031 0.359375 0.278891 \n",
"2 1992-06-30 0.367188 0.371094 0.343750 0.347656 0.269797 \n",
"3 1992-07-01 0.351563 0.359375 0.339844 0.355469 0.275860 \n",
"4 1992-07-02 0.359375 0.359375 0.347656 0.355469 0.275860 \n",
"... ... ... ... ... ... ... \n",
"8031 2024-05-17 75.269997 78.000000 74.919998 77.849998 77.849998 \n",
"8032 2024-05-20 77.680000 78.320000 76.709999 77.540001 77.540001 \n",
"8033 2024-05-21 77.559998 78.220001 77.500000 77.720001 77.720001 \n",
"8034 2024-05-22 77.699997 81.019997 77.440002 80.720001 80.720001 \n",
"8035 2024-05-23 80.099998 80.699997 79.169998 79.260002 79.260002 \n",
"\n",
" Volume \n",
"0 224358400 \n",
"1 58732800 \n",
"2 34777600 \n",
"3 18316800 \n",
"4 13996800 \n",
"... ... \n",
"8031 14436500 \n",
"8032 11183800 \n",
"8033 8916600 \n",
"8034 22063400 \n",
"8035 4651418 \n",
"\n",
"[8036 rows x 7 columns]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from scipy.cluster.hierarchy import dendrogram, linkage, fcluster\n",
"from sklearn.cluster import KMeans\n",
"from sklearn.decomposition import PCA\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.metrics import silhouette_score\n",
"\n",
"df = pd.read_csv(\"data/starbucks.csv\")\n",
"df "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Data preprocessing"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Load and preprocess the data\n",
"data = pd.read_csv(\"data/starbucks.csv\")\n",
"features = ['Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume']\n",
"\n",
"# Standardize the numeric features (zero mean, unit variance)\n",
"scaler = StandardScaler()\n",
"data_scaled = scaler.fit_transform(data[features])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Dimensionality reduction with PCA\n",
"\n",
"We use principal component analysis (PCA) to reduce the data to two dimensions. This makes it possible to visualize the data on a plane and inspect its structure. We also plot the objects in the space of the first two principal components."
]
},
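{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of what a two-component projection retains, using synthetic data rather than the Starbucks dataset. The array `X_demo` and the way it is built are assumptions made only for illustration; `explained_variance_ratio_` is the scikit-learn attribute that reports the share of variance captured by each component. When several columns move together, as the four price-like columns do here, the first component should be expected to carry most of the variance."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.decomposition import PCA\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"# Synthetic stand-in for the price columns (an assumption for illustration):\n",
"# four noisy copies of one random walk plus an independent volume-like column.\n",
"rng = np.random.default_rng(0)\n",
"trend = np.cumsum(rng.normal(size=500))\n",
"prices = np.column_stack([trend + rng.normal(scale=0.1, size=500) for _ in range(4)])\n",
"volume = rng.normal(size=(500, 1))\n",
"X_demo = StandardScaler().fit_transform(np.hstack([prices, volume]))\n",
"\n",
"pca_demo = PCA(n_components=2)\n",
"X_demo_2d = pca_demo.fit_transform(X_demo)\n",
"\n",
"# Share of the total variance captured by each of the two components\n",
"print(pca_demo.explained_variance_ratio_)\n",
"print('retained:', pca_demo.explained_variance_ratio_.sum())"
]
},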
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collecting seaborn\n",
" Using cached seaborn-0.13.2-py3-none-any.whl.metadata (5.4 kB)\n",
"Requirement already satisfied: numpy!=1.24.0,>=1.20 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from seaborn) (2.1.3)\n",
"Requirement already satisfied: pandas>=1.2 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from seaborn) (2.2.3)\n",
"Requirement already satisfied: matplotlib!=3.6.1,>=3.4 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from seaborn) (3.9.3)\n",
"Requirement already satisfied: contourpy>=1.0.1 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (1.3.1)\n",
"Requirement already satisfied: cycler>=0.10 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (0.12.1)\n",
"Requirement already satisfied: fonttools>=4.22.0 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (4.55.1)\n",
"Requirement already satisfied: kiwisolver>=1.3.1 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (1.4.7)\n",
"Requirement already satisfied: packaging>=20.0 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (24.2)\n",
"Requirement already satisfied: pillow>=8 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (11.0.0)\n",
"Requirement already satisfied: pyparsing>=2.3.1 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (3.2.0)\n",
"Requirement already satisfied: python-dateutil>=2.7 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (2.9.0.post0)\n",
"Requirement already satisfied: pytz>=2020.1 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from pandas>=1.2->seaborn) (2024.2)\n",
"Requirement already satisfied: tzdata>=2022.7 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from pandas>=1.2->seaborn) (2024.2)\n",
"Requirement already satisfied: six>=1.5 in c:\\users\\ateks\\courses\\courses\\.venv\\lib\\site-packages (from python-dateutil>=2.7->matplotlib!=3.6.1,>=3.4->seaborn) (1.16.0)\n",
"Using cached seaborn-0.13.2-py3-none-any.whl (294 kB)\n",
"Installing collected packages: seaborn\n",
"Successfully installed seaborn-0.13.2\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install seaborn ## installing from the console did not work :("
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import seaborn as sns\n",
"# Dimensionality reduction with PCA\n",
"pca = PCA(n_components=2)\n",
"data_pca = pca.fit_transform(data_scaled)\n",
"\n",
"# Visualize the data after PCA\n",
"plt.figure(figsize=(10, 6))\n",
"sns.scatterplot(x=data_pca[:, 0], y=data_pca[:, 1], alpha=0.7, edgecolor=None)\n",
"plt.title(\"PCA: data after dimensionality reduction\", fontsize=16)\n",
"plt.xlabel(\"Principal component 1\")\n",
"plt.ylabel(\"Principal component 2\")\n",
"plt.grid()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Determining the number of clusters\n",
"\n",
"We use two methods:\n",
"\n",
"Elbow method: plot inertia against the number of clusters. The method helps to find the number of clusters beyond which inertia stops decreasing substantially.\n",
"\n",
"Silhouette coefficient: for each number of clusters, compute the mean silhouette coefficient, which measures clustering quality. The plot helps to choose the number of clusters with the highest silhouette value."
]
},
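{
"cell_type": "markdown",
"metadata": {},
"source": [
"Both criteria can be tried first on data with a known answer. The sketch below assumes synthetic data from `make_blobs` with three groups placed at hand-picked centers (not the Starbucks data); the names `X_blobs`, `inertia_demo` and `silhouette_demo` are illustrative. Inertia keeps shrinking as k grows, so only the bend in the curve is informative, whereas the mean silhouette coefficient can be compared directly across values of k."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.cluster import KMeans\n",
"from sklearn.datasets import make_blobs\n",
"from sklearn.metrics import silhouette_score\n",
"\n",
"# Synthetic data with three groups at fixed centers (an assumption for illustration)\n",
"centers = [[-5, -5], [0, 5], [5, -5]]\n",
"X_blobs, _ = make_blobs(n_samples=300, centers=centers, cluster_std=0.6, random_state=42)\n",
"\n",
"inertia_demo = {}\n",
"silhouette_demo = {}\n",
"for k in range(2, 7):\n",
"    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_blobs)\n",
"    # Inertia: within-cluster sum of squared distances to the centroids\n",
"    inertia_demo[k] = model.inertia_\n",
"    # Mean silhouette coefficient over all samples (defined only for k >= 2)\n",
"    silhouette_demo[k] = silhouette_score(X_blobs, model.labels_)\n",
"\n",
"# Pick the k with the highest mean silhouette coefficient\n",
"best_k = max(silhouette_demo, key=silhouette_demo.get)\n",
"print('inertia:', inertia_demo)\n",
"print('silhouette:', silhouette_demo)\n",
"print('best k by silhouette:', best_k)"
]
},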
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Elbow method\n",
"inertia = []\n",
"for k in range(1, 11):\n",
"    kmeans = KMeans(n_clusters=k, random_state=42)\n",
"    kmeans.fit(data_scaled)\n",
"    inertia.append(kmeans.inertia_)\n",
"\n",
"plt.figure(figsize=(10, 6))\n",
"plt.plot(range(1, 11), inertia, marker='o', color='blue', linestyle='--')\n",
"plt.title(\"Elbow method for choosing the number of clusters\", fontsize=16)\n",
"plt.xlabel(\"Number of clusters\")\n",
"plt.ylabel(\"Inertia\")\n",
"plt.grid()\n",
"plt.show()\n",
"\n",
"# Silhouette coefficient (defined only for k >= 2)\n",
"silhouette_scores = []\n",
"for k in range(2, 11):\n",
"    kmeans = KMeans(n_clusters=k, random_state=42)\n",
"    kmeans.fit(data_scaled)\n",
"    score = silhouette_score(data_scaled, kmeans.labels_)\n",
"    silhouette_scores.append(score)\n",
"\n",
"plt.figure(figsize=(10, 6))\n",
"plt.plot(range(2, 11), silhouette_scores, marker='o', color='green', linestyle='-')\n",
"plt.title(\"Silhouette coefficient for different numbers of clusters\", fontsize=16)\n",
"plt.xlabel(\"Number of clusters\")\n",
"plt.ylabel(\"Silhouette coefficient\")\n",
"plt.grid()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Кластеризация с помощью KMeans\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"optimal_k = 3\n",
"kmeans = KMeans(n_clusters=optimal_k, random_state=42)\n",
"data['KMeans Cluster'] = kmeans.fit_predict(data_scaled)\n",
"\n",
"# Visualize the KMeans clusters\n",
"plt.figure(figsize=(10, 6))\n",
"sns.scatterplot(x=data_pca[:, 0], y=data_pca[:, 1], hue=data['KMeans Cluster'], palette='viridis', alpha=0.8, edgecolor=None)\n",
"plt.title(\"KMeans clusters visualized with PCA\", fontsize=16)\n",
"plt.xlabel(\"Principal component 1\")\n",
"plt.ylabel(\"Principal component 2\")\n",
"plt.legend(title='Clusters')\n",
"plt.grid()\n",
"plt.show()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Иерархическая кластеризация\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA+0AAALpCAYAAADGocexAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/GU6VOAAAACXBIWXMAAA9hAAAPYQGoP6dpAACKZUlEQVR4nOzdd3hUZd7G8XuSTCopBFIIJYIgTTouVUQQUERRWCur2MsLIuIqulbWVdR1LSjq6u6KXVAXd3WRIgqKNAUBBUQINUAo6T2Tmef9A2fMkEnIhAk5kO/nunJl5tTfmXJm7jnnPI/NGGMEAAAAAAAsJ6i+CwAAAAAAAL4R2gEAAAAAsChCOwAAAAAAFkVoBwAAAADAogjtAAAAAABYFKEdAAAAAACLIrQDAAAAAGBRhHYAAAAAACyK0A4AAHAS2rhxo5555hmvYeXl5Zo2bZo2btxYT1UBAAKN0A6cQnbu3CmbzXbMv0cffbS+SwUAHKecnBzdc889WrlypWfY3Llz9eijjyo3N7ceKwMABFJIfRcAIPBsNpuuvfbaSsPnz5+vAwcO1ENFAIBAO+uss3TGGWfo3HPP1YgRI1RWVqZFixapQ4cOOuuss+q7PABAgBDagVNQUFCQZs2aVWn44MGDCe0AcIoIDQ3VZ599prvvvlvLli2TJI0cOVLPPvus7HZ7PVcHAAgUQjtwCnG5XJKOHGkHAJz62rVrp//+97/1XQYAoA5xTTtwCiktLZUkhYTU/ve4NWvWaNy4cWrVqpXCwsIUHx+vESNGaN68eT6nP+2002Sz2bRz506f42fNmiWbzabrrruuynUe6xr8JUuW+JwvPT1dd9xxh9q1a6fw8HDFxsZqwIAB+vvf/y6n01llLdX9rVu3zjP94MGDPetfunSphg8frvj4eEVGRup3v/ud3n77bZ91HTp0SDNmzNDIkSPVunVrRUREKCYmRr1799ZTTz2lkpKSah+HoKAgbdu2zec07777rme6wYMHe41bsmSJZ1zr1q09P+Ic7eabb66yfYP8/Hy9/vrrGjNmjNq1a6eoqChFRUWpS5cueuCBB5STk+NzmTVxrMf/tNNOq3LeittW1V9V6zznnHMUHx+v4ODgSvP4OiOlKtddd12V82zZskVhYWE+nxe3ffv2acqUKerYsaMiIyMVHR2ts846Sy+99JLKy8urXd/69es1ZswYJSQkKCIiQl27dtULL7zg83Ve2+ew4nt57ty5GjhwoGJiYhQdHa3Bgwf73Ad89NFHstlsSkhIUHp6eqXxCxYsUHBwsGJjY7V169bjfiyrm8/dpkdVr6Ps7Gw98sgj6t69u6KjoxUZGakuXbroL3/5i4qKinzOIx3ZJ44fP16tW7dWeHi44uPj1a1bN91zzz3atWuXZzr3a9TX879+/Xo1bdpUISEhevfddyuNz8rK0p/+9Cd17tzZ89ro1auXnn76aRUXF1eavrp1Sb/tu6rbN1c339H7XGOMbrnlFtlsNvXt29fn9fK1eY9+8cUXuuOOO9S9e3c1bdpUYWFhatGiha644gp999131dZak+fF/Xqp6d/RfvnlF9166606/fTTPZ8xgwYN0jvvvHPMx8+fz4zavh9WrVqlsWPHqn379oqLi1NYWJhSU1N10UUX+Xy/OhwOvfPOOxo3bpw6dOigmJgYRUREqH379po0aZL27dt3zO3ypbrXY223ra5e48DJiiPtwCnE4XBIksLCwmo1/wsvvKApU6bI5XKpe/fu6tOnjzIyMrRkyRItXLhQ06ZN08MPPxzIkr2MHz/e63511+B/9913Ov/885WVlaVWrVrpkksuUW5urpYsWaLly5dr7ty5+u9//6vQ0NBK855++ukaOHCgz+XGx8dXGjZ37ly99NJL6tChg0aMGKF9+/Zp2bJluvbaa7Vu3Tr97W9/85p+wYIFuvPOO9W8eXO1bdtWffv21aFDh7Rq1Srdd999+s9//qOvvvqqyu
fJGKOXXnpJzz//fKVxL7zwgs95jrZz507997//1SWXXOI1PDMz02docFu/fr1uueUWJSQkqH379urVq5eys7O1Zs0aPfHEE5ozZ45WrlypJk2a1KgOX45+/AsKCvTxxx/XaN6kpCSdf/75XsPefPNNn9M++OCDevzxxyVJvXv3Vrt27Tyvh2XLliktLa025fs0ceJElZWVVTn+66+/1iWXXKLs7GyddtppGjZsmEpLS7V69Wrdcccd+vTTT/XZZ5/5PKV59erVuv3225WcnKyhQ4cqOztbS5Ys0eTJk7Vs2TLNmTPHK3Ac73M4Y8YMPffcc+rdu7dGjRqltLQ0LV26VEuXLtWMGTN0xx13eKb9/e9/rzvuuEMvvviirrrqKn311VeeHw337t2ra665Ri6XS6+//rratWsXkMeyNjZt2qTzzz9fe/bsUbNmzTRw4EDZ7XatXr1aDz30kD7++GMtWbJEsbGxXvP99a9/1X333SeXy6UzzjhDo0ePVnFxsbZt26ZnnnlGnTt3rvYHSenI8zF06FDl5OTorbfe0tVXX+01fvv27RoyZIh27dqlhIQEjRw5Ug6HQ1999ZWmTp2q2bNn64svvlDjxo1rtK3vvPOOli5d6tfjUx1jjG699Va9/vrr6tu3rxYsWKCYmJgqp/fnPXrbbbdpz5496ty5swYMGKCQkBD9/PPPmjNnjv7973/rgw8+0NixYyvNV9Pnxdd+3v25MmLECCUnJ1e5HR9++KGuvfZalZSUqEOHDho5cqRyc3O1atUqXXPNNfryyy/1r3/9y+e8/n5mVKe698OPP/6or776Sl26dFGXLl0UEhKitLQ0ffbZZ/rss880ffp03XfffZ7pDxw4oGuuuUaxsbHq2LGjunbtqsLCQq1bt04vvviiPvjgAy1fvlxt27atcX3Ho7bv9UC/xoGTggFwyli6dKmRZJKSknyOP+ecc4wk88gjj1QaN3/+fGOz2UzTpk3N0qVLvcZt2LDBtGjRwkgyS5Ys8RqXmppqJJkdO3b4XOcbb7xhJJnx48f7HO90Oo0k42t35K73q6++8hpeUlLiWe9tt91mysrKPOPS0tLMaaedZiSZP/3pT37VUtX6JZknnnjCa9ySJUtMRESEkWTmz5/vNW7Tpk1mxYoVlZaXlZVlhg8fbiSZp59+utJ497qGDBliYmJiTH5+vtf45cuXG0lm6NChRpI555xzvMZ/9dVXRpLp1auXiY2NNeeee26ldTzxxBNeyzj6tbBnzx7zxRdfGKfT6TW8sLDQXHvttUaS+b//+79Ky62Jf/7znz4f/x07dhhJJjU1tcp5v/jiCyPJDB48uNI4X6+f0tJSExkZaSSZN998s9I848ePN5LMG2+8UeP6q5pn9uzZRpJp1aqVz+dl//79pkmTJsZms5mXX37Z67E9fPiwGTJkiJFkpk2b5nN97sfc4XB4xv30008mISHBSDKvvvqq13y1fQ7d7ymbzWbeeecdr3EffPCBsdlsJiQkxPz4449e40pLS83vfvc7I8lMnTrVGGOMw+EwAwcONJLMhAkTAvZYVve8VfU6KioqMqeffrqRZB588EFTWlrq9ZhcddVVRpK5/vrrveb7z3/+YySZ8PBwM3v27Err27hxo9m0aZPnvvv9V7HmdevWmSZNmpjg4GDz3nvvVVqGMcb06dPHSDIXX3yxKSgo8Aw/ePCg6dmzp5Fkrr76aq95fK3LGGNyc3NNcnKyiY6ONnFxcdXum305ep/rcrnMzTffbCSZvn37mtzc3Crn9fc9aowxc+fONVlZWT6Hh4SEmCZNmpiioiKvcf4+L8faRl82bNhgwsLCTHh4uPn444+9xu3cudN06dLF576ltp8ZtX0/lJSU+Kz/888/N5JM06ZNvYbn5eWZ//znP17vAWOMKSsrM/fff7+RZEaOHFlpecd6zKp6PR7PttXVaxw4WRHagVPIxx9/bCSZzp07+xxfXWh3f3H86KOPfM47Z84cI8
mMHTvWa/jxhvbCwkIjyYSEhFRZ79FfFN5++20jyaSkpPj80vLRRx8ZSSY6OtoUFxfXuJaq1t+jRw+f4++++24jyQw
"text/plain": [
"<Figure size 1200x800 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAA3MAAAImCAYAAAD5fdOKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/GU6VOAAAACXBIWXMAAA9hAAAPYQGoP6dpAAC8b0lEQVR4nOzdd3gU5doG8Htm+6aSnkDoICUU6cUDSAfpCAIWwI4oCiqKHRviUazY+QQLR1FBQVRAkCYiAtKL9FDSCelb5/3+WHbNZndDNqSw4f5dF+eYmdmZZ96d2Z1n3yYJIQSIiIiIiIgooMjVHQARERERERH5j8kcERERERFRAGIyR0REREREFICYzBEREREREQUgJnNEREREREQBiMkcERERERFRAGIyR0REREREFICYzBEREREREQUgJnNEFJB+//13fPLJJ27LsrKy8Nxzz+Hs2bPVFBURERFR1amQZK5+/fqQJAkLFy70uc2PP/4InU4HSZLw2GOPVcRhiegqdvLkSUybNg0nTpxwLfu///s/zJ49G5IkVWNkREREdKVauHAhJEly+yfLMsLCwtCpUye89NJLyM/P9/l6IQS+/fZbjB8/Hg0aNEBQUBD0ej0SExMxZMgQfPTRR8jLyys1hnnz5rmO/c4771zW+UhCCHFZe4AjmTt16hQ+/fRTTJo0yWP9ypUrMWrUKFgsFjzyyCP473//e7mHJKKr3Pnz59GkSRMoioI+ffrg/PnzWL9+PXr37o1ff/21usMjIiKiK9DChQsxefJkBAUF4cYbbwQA2O12HD9+HFu3boWiKGjatCk2btyI2NhYt9ceP34cN954I/7++28AQPPmzXHNNddAp9Ph7Nmz2LZtGywWC6KiorB9+3bUq1fPawwtW7bEgQMHAABt27Z17a881OV+ZRn9/PPPGD16NCwWC6ZPn85EjogqREREBFatWoWZM2fi119/hV6vx80334zXXnutukMjIiKiK1xUVJRHq8Jt27ahT58++Oeff/Doo4/is88+c61LTk5G165dkZ6ejq5du+KDDz5A69at3V6fl5eH999/Hy+99BKys7O9JnNbt27FgQMHEB4eDqvVil27dmHnzp1o165duc6jUvvM/fLLLxg5ciTMZjMefPBBzJs3rzIPR0RXmQ4dOmDdunW4cOECUlNT8fnnn3v8ikZERERUFp06dcLDDz8MAFi6dClsNptr3S233IL09HR06tQJ69at80jkACAkJAQzZ87Ejh07fD6PLFiwAAAwfvx4jBkzxm1ZeVRaMrd69WpXIvfAAw/gzTffLHX7kydPerRfLfnv5MmTbq/Ztm0bZs6ciU6dOiEuLg5arRaxsbEYOnToJZtZ/fPPP7jvvvtwzTXXwGg0IjQ0FC1atMB9992Hffv2AQCee+65S8ZUWnznzp3DjBkz0Lx5cxiNRoSEhKBjx45499133S4Op0mTJrn6Hu7evRujRo1CdHQ0DAYDWrdujbfeegt2u93jdc44n3vuuVLP2cnZx7FkvKXp1asXJEnC+vXrPdb99ttvrjLw1swWcJT3Pffcg0aNGkGv1yMsLAw9evTAF198ccnjbdiwAf3790dERASMRiM6deqEzz//3OvrMjIy8Pbbb2Pw4MFo0KABDAYDQkND0aFDB8ydOxcmk8nr65zxA8DHH3+M9u3bIygoCOHh4Rg8eDC2bt3q8ZrXX38dkiShadOmXttGf/zxx5AkCYmJicjMzPR6biWVVpalvW79+vWQJAm9evXyen7+XotO69atw5gxY1CnTh3odDpER0ejY8eOePbZZ5GVleXaztn+3Nv7v2bNGhiNRgQFBWHdunUe68+cOYMHHngATZo0cV0b3bt3x4cffuj1ei/tWGazGU2bNnV7P8vK131hNpsxZMgQSJKE4cOHw2Kx+IzJ17/69et7vGbp0qW48847kZSUhFq1akGv16NBgwa4/fbbcfjw4VJjLcv74rxeyvLPW3
w7duzAzTffjLp160Kn0yEiIgIDBgzATz/9dMnyW7ZsGa677jqEhoYiJCQEvXr18vm68t4PP//8M4YNG4ZGjRohNDQUBoMBDRs2xLhx4/D777977CsvLw8ff/wxRo0ahSZNmiAoKAhBQUFo1aoVnnzySVy4cOGS5+VNaddjec+tsq7x0jjPs/i/WrVqoVWrVh73u1NVvHcTJ06EJEmYM2eOz9iXLFkCSZLQqVMn1zKr1YovvvgCN998M5o1a+Y6zjXXXINp06bh3LlzpZaH8/vY17+S51Ta93BmZiYiIyO93mtHjx7F+PHjkZSUhIiICGi1WtSuXRt9+vTB4sWLUbInzKU+64vf9yWv2fLeowBQWFiIV155Be3atUNISAiMRiNatmyJp556CtnZ2R7be3umk2UZsbGx6NatGz766COv3z3lea4r7X7xdu5lWX6pfZflmMWVdq8AwLfffouBAwciOjradQ3ccsstrmZ4ZXWpuEo7X5vNhk8++QS9evVCREQEdDodGjRogClTpuD06dMe2xe/FgsLC/HEE0+gcePG0Ov1SEhIwB133FHqYGTZ2dl49tln0bZtW9c11apVK7z44osoLCz02P7zzz/HwIEDUb9+fQQHByMoKAhNmjTBnXfeib1795a5jMqiffv2AICCggLX89uGDRuwadMmAMAHH3wAvV5f6j4aN26M+Ph4j+UFBQX4+uuvAQB33HEH7rjjDgDA4sWLfT6jXpKoAPXq1RMAxKeffiqEEGL16tVCr9cLAOK+++4r0z5OnDghAIigoCAxceJEt39BQUECgDhx4oTba/r06SNkWRatWrUSgwcPFmPGjBHt2rUTAAQA8eabb3o91pdffil0Op0AIOrWrStGjx4tRo4cKdq0aSMkSRLPPvusEEKIZcuWecTSqFEjAUB0797dY11GRobrGBs2bBC1atUSAET9+vXFsGHDxIABA1zL+vfvLywWi1tcEydOFADElClThF6vF/Xr1xc33XST6N+/v9BqtQKAuPHGG4WiKG6ve/bZZwUAV9yX4ny/SpZnaXr27CkAiN9++81tucViEc2bN3eV+cSJEz1eu2TJEtf10KxZMzFy5EjRu3dv1/s6efJkn8ebNm2akGVZtGjRQowbN0706NFDyLIsAIgZM2Z4vO7zzz8XAETt2rVFz549xbhx40SfPn1EcHCwACC6du0qTCaTx+uc8U+fPl1IkiSuu+46MX78eJGUlCQACLVaLZYuXerxumHDhgkAYty4cW7Ld+3aJfR6vVCr1eL333+vkLL09TohhPjtt98EANGzZ0+PdeW5FoUQ4oEHHnDF0rZtWzFu3DgxaNAg0bBhQ484Pv30U68xr169WhgMBmE0GsW6des8jrFt2zYRERHhuhdvuukmMXDgQNf1MmDAAGE2m91e4+tYQgjxwgsvuGL29+PN231hMpnEDTfcIACI4cOHey2n4jE1atTI7TNh9OjRAoCoV6+ex2tUKpUwGo2iQ4cOYtSoUWLYsGGusg0KCvK4bpzK+r7MmTPH52fp6NGj3ZY//PDDbsd48803XfdZ27ZtxY033iiuu+461+fQ7NmzfZbf9OnTBQDRoUMHMX78eNGpUydXvG+//bbH68p7Pzz55JMiJiZG9O7dW4wZM0aMGTNGtGzZUgAQkiSJxYsXu22/adMmAUBER0eL6667zvXZGhkZKQCIxo0bi8zMTJ/n5evzsrTrsbznVlnXeGmc51n82rjhhhtc10zTpk1FYWFhhZyfP+/djh07XJ8PNpvNa+w9evQQAMSiRYtcy06fPi0AiLCwMNGlSxcxZswYMXjwYJGQkOC6Do4cOeKzPJzfxyW/67t37+71nEr7Hr7jjjtcZVHys2DNmjUiODhYdOvWTYwcOVKMHz9edO/eXahUKgFA3HPPPW7bl/ZZ7/z+c/4rec2W9x7NysoSbdu2FQBEaGioGDZsmBg9erSIiooSAESDBg08juXtme6WW25x+w6/5ZZbPI
5Vnue60u6Xkufuq0z8vb/LcszifN0rVqtVjB07VgAQOp1OdOvWTYwZM0a0adNGABAGg0H8/PPPZTpGWeLydb65ubm
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.cluster import KMeans, AgglomerativeClustering\n",
"\n",
"Z = linkage(data_scaled, method='ward')\n",
"\n",
"plt.figure(figsize=(12, 8))\n",
"dendrogram(Z, truncate_mode='lastp', p=optimal_k, leaf_rotation=90., leaf_font_size=12., show_contracted=True)\n",
"plt.title(\"Дендограмма для иерархической кластеризации\", fontsize=16)\n",
"plt.xlabel(\"Объекты\")\n",
"plt.ylabel(\"Евклидово расстояние\")\n",
"plt.grid()\n",
"plt.show()\n",
"\n",
"# Применение иерархической кластеризации\n",
"hierarchical = AgglomerativeClustering(n_clusters=optimal_k)\n",
"data['Hierarchical Cluster'] = hierarchical.fit_predict(data_scaled)\n",
"\n",
"# Визуализация кластеров иерархической кластеризации\n",
"plt.figure(figsize=(10, 6))\n",
"sns.scatterplot(x=data_pca[:, 0], y=data_pca[:, 1], hue=data['Hierarchical Cluster'], palette='coolwarm', alpha=0.8, edgecolor=None)\n",
"plt.title(\"Кластеры иерархической кластеризации, визуализированные через PCA\", fontsize=16)\n",
"plt.xlabel(\"Главная компонента 1\")\n",
"plt.ylabel(\"Главная компонента 2\")\n",
"plt.legend(title='Кластеры')\n",
"plt.grid()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Оценка качества кластеризации\n",
"\n",
"Подводя итоги оцениваем качество кластеризации. Для этого были вычислены средние коэффициенты силуэта для:\n",
"\n",
"Кластеризации KMeans.\n",
"Иерархической кластеризации.\n",
"Эти метрики показывают, насколько хорошо объекты внутри одного кластера схожи и насколько различаются между кластерами. "
]
},
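{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough illustration of what the silhouette coefficient measures, the sketch below computes it by hand on a tiny made-up dataset and prints it next to the value returned by scikit-learn's `silhouette_score`. The toy array `X`, its `labels`, and the helper `mean_silhouette` are hypothetical and are not part of the lab's dataset or code. For each point, `a` is the mean distance to the other points in its own cluster, `b` is the smallest mean distance to the points of any other cluster, and the score is `(b - a) / max(a, b)`. The helper assumes every cluster contains at least two points."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.metrics import silhouette_score\n",
"\n",
"# Toy data (hypothetical, not the Starbucks dataset): two well-separated groups\n",
"X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])\n",
"labels = np.array([0, 0, 1, 1])\n",
"\n",
"def mean_silhouette(X, labels):\n",
"    # Pairwise Euclidean distances between all points\n",
"    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)\n",
"    scores = []\n",
"    for i in range(len(X)):\n",
"        same = labels == labels[i]\n",
"        same[i] = False  # exclude the point itself from its own cluster\n",
"        a = dist[i, same].mean()  # mean intra-cluster distance\n",
"        # smallest mean distance to the points of any other cluster\n",
"        b = min(dist[i, labels == c].mean() for c in set(labels) if c != labels[i])\n",
"        scores.append((b - a) / max(a, b))\n",
"    return float(np.mean(scores))\n",
"\n",
"# The two printed values should agree up to floating-point rounding\n",
"print(mean_silhouette(X, labels))\n",
"print(silhouette_score(X, labels))"
]
},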
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Коэффициент силуэта для KMeans: 0.5469\n",
"Коэффициент силуэта для иерархической кластеризации: 0.5783\n"
]
}
],
"source": [
"silhouette_kmeans = silhouette_score(data_scaled, data['KMeans Cluster'])\n",
"silhouette_hierarchical = silhouette_score(data_scaled, data['Hierarchical Cluster'])\n",
"\n",
"print(f\"Коэффициент силуэта для KMeans: {silhouette_kmeans:.4f}\")\n",
"print(f\"Коэффициент силуэта для иерархической кластеризации: {silhouette_hierarchical:.4f}\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}