2024-10-18 21:30:59 +04:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1.\n",
"The following datasets were selected:\n",
"1) Car data (17)\n",
"2) Mobile device data (18)\n",
"3) Billionaire data (19)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"cars_df = pd.read_csv(\"./car_price_prediction.csv\")\n",
"phones_df = pd.read_csv(\"./mobile phone price prediction.csv\")\n",
"rich_df = pd.read_csv(\"./Forbes Billionaires.csv\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2.\n",
"Problem domains:\n",
"car_price_prediction.csv - car prices\n",
"mobile phone price prediction.csv - mobile phone prices\n",
"Forbes Billionaires.csv - data on billionaires"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3.\n",
"Objects of observation:\n",
"car_price_prediction.csv: cars;\n",
"mobile phone price prediction.csv: phones;\n",
"Forbes Billionaires.csv: billionaires;\n",
"\n",
"Attributes:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Index(['Unnamed: 0', 'Name', 'Rating', 'Spec_score', 'No_of_sim', 'Ram',\n",
" 'Battery', 'Display', 'Camera', 'External_Memory', 'Android_version',\n",
" 'Price', 'company', 'Inbuilt_memory', 'fast_charging',\n",
" 'Screen_resolution', 'Processor', 'Processor_name'],\n",
" dtype='object')\n",
"Index(['Unnamed: 0', 'Name', 'Rating', 'Spec_score', 'No_of_sim', 'Ram',\n",
" 'Battery', 'Display', 'Camera', 'External_Memory', 'Android_version',\n",
" 'Price', 'company', 'Inbuilt_memory', 'fast_charging',\n",
" 'Screen_resolution', 'Processor', 'Processor_name'],\n",
" dtype='object')\n",
"Index(['Rank ', 'Name', 'Networth', 'Age', 'Country', 'Source', 'Industry'], dtype='object')\n"
]
}
],
"source": [
"print(cars_df.columns)\n",
"print(phones_df.columns)\n",
"print(rich_df.columns)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"No relationships between the objects are apparent.\n",
"\n",
"4.\n",
"For car_price_prediction.csv and mobile phone price prediction.csv, the business goal is to set a price that matches the current market and the attributes of the item.\n",
"Forbes Billionaires.csv - identifying the most profitable types of business and proven ways of building capital.\n",
"\n",
"5. \n",
"Price setting: the inputs are the product characteristics; the target feature is the price.\n",
"Identifying...: the inputs are the type of business, country, and sources of income; the target feature is the Forbes rank.\n",
"\n",
"6, 7. \n",
"Problems of the datasets\n",
"Noise:\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"19237\n",
"1370\n",
"2600\n"
]
}
],
"source": [
"print(cars_df.shape[0])\n",
"print(phones_df.shape[0])\n",
"print(rich_df.shape[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since the datasets are fairly large (more than 1000 rows), noise should not strongly affect quality: it averages out.\n",
"\n",
"Data bias and timeliness cannot be checked, because a ready-made dataset was used."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"before 19237\n",
"ID 45576535.886104904 936591.4227992407\n",
"Price 18581.7495915248 191880.3101852926\n",
"Prod. year 2010.9471797575118 5.560374489543753\n",
"Mileage 1448962.4091998083 46414350.64710781\n",
"Cylinders 4.579505488649685 1.195311870131117\n",
"Airbags 6.622327917913639 4.306232340768934\n",
"after 18712\n",
"\n",
"------------------\n",
"\n",
"before 1313\n",
"Unnamed: 0 689.4889565879665 393.9451859329142\n",
"Rating 4.374714394516375 0.2306630122648432\n",
"Spec_score 79.78217821782178 8.203848320388786\n",
"Price 29.234423195084485 21.790674983254675\n",
"after 1282\n",
"\n",
"------------------\n",
"\n",
"before 2600\n",
"Rank 1269.5707692307692 728.1463636959434\n",
"Networth 4.8607499999999995 10.659670683623453\n",
"Age 64.25370226032736 13.195277077997176\n",
"after 2565\n"
]
}
],
"source": [
"# Convert mileage to float so that it can be plotted later\n",
"cars_df['Mileage'] = cars_df['Mileage'].str.replace(r'\\D+', '', regex=True).astype(float)\n",
"# Treat the comma as a decimal separator; prices that still fail to parse (e.g. with two commas) become NaN and are dropped\n",
"phones_df['Price'] = pd.to_numeric(phones_df['Price'].str.replace(',', '.'), errors='coerce')\n",
"phones_df = phones_df.dropna(subset=['Price'])\n",
"\n",
"# Three-sigma rule: for each numeric column, keep only rows within mean +/- 3 standard deviations\n",
"print(\"before \", cars_df.shape[0])\n",
"for column in cars_df.select_dtypes(include=['int', 'float']).columns:\n",
" mean = cars_df[column].mean()\n",
" std_dev = cars_df[column].std()\n",
" print(column, mean, std_dev)\n",
" \n",
" lower_bound = mean - 3 * std_dev\n",
" upper_bound = mean + 3 * std_dev\n",
" \n",
" cars_df = cars_df[(cars_df[column] <= upper_bound) & (cars_df[column] >= lower_bound)]\n",
" \n",
"print(\"after \", cars_df.shape[0])\n",
"\n",
"print(\"\\n------------------\\n\")\n",
"\n",
"print(\"before \", phones_df.shape[0])\n",
"for column in phones_df.select_dtypes(include=['int', 'float']).columns:\n",
" mean = phones_df[column].mean()\n",
" std_dev = phones_df[column].std()\n",
" print(column, mean, std_dev)\n",
" \n",
" lower_bound = mean - 3 * std_dev\n",
" upper_bound = mean + 3 * std_dev\n",
" \n",
" phones_df = phones_df[(phones_df[column] <= upper_bound) & (phones_df[column] >= lower_bound)]\n",
" \n",
"print(\"after \", phones_df.shape[0])\n",
"\n",
"print(\"\\n------------------\\n\")\n",
"\n",
"print(\"before \", rich_df.shape[0])\n",
"for column in rich_df.select_dtypes(include=['int', 'float']).columns:\n",
" mean = rich_df[column].mean()\n",
" std_dev = rich_df[column].std()\n",
" print(column, mean, std_dev)\n",
" \n",
" lower_bound = mean - 3 * std_dev\n",
" upper_bound = mean + 3 * std_dev\n",
" \n",
" rich_df = rich_df[(rich_df[column] <= upper_bound) & (rich_df[column] >= lower_bound)]\n",
" \n",
"print(\"after \", rich_df.shape[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The outliers that could have affected data quality were removed above. The sample is still large enough to work with.\n",
"The phone data contained malformed prices (with two commas; I do not know what they mean, so I removed those rows)."
]
},
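{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the same three-sigma rule written as a reusable function, so that the loop does not have to be repeated for every dataset. The name `drop_3sigma_outliers` and the toy data are assumptions made for illustration; they are not part of the datasets used here."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical helper (illustration only): drop rows outside mean +/- 3*std,\n",
"# one numeric column at a time, recomputing the statistics after each filter\n",
"def drop_3sigma_outliers(df, columns=None):\n",
"    if columns is None:\n",
"        columns = df.select_dtypes(include='number').columns\n",
"    for column in columns:\n",
"        mean = df[column].mean()\n",
"        std_dev = df[column].std()\n",
"        # between() is inclusive on both ends, like the <= / >= comparisons above\n",
"        df = df[df[column].between(mean - 3 * std_dev, mean + 3 * std_dev)]\n",
"    return df\n",
"\n",
"# Made-up data: twenty ordinary values and one extreme value\n",
"toy_df = pd.DataFrame({'value': [10.0] * 20 + [1000.0]})\n",
"print(len(toy_df), '->', len(drop_3sigma_outliers(toy_df)))"
]
},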
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Data leakage. Let us check the dependencies between the parameters graphically. "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: title={'center': 'Price'}, xlabel='Airbags'>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlIAAAHNCAYAAADVB5V4AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAACeP0lEQVR4nOzdeVhUdfs/8PcwwLAo4AaIIpC4g3siLqApIopBQKmZqVk+mVgIiuFTbpmkAu5LPZWWWwsiFeIyLiAmmGtCLoGBmgq4grIzc35/+Dvny4ERhuHMDDNzv67LS+ace+Y+s9/zOZ9FxDAMA0IIIYQQ0mhG2j4AQgghhBBdRYUUIYQQQoiKqJAihBBCCFERFVKEEEIIISqiQooQQgghREVUSBFCCCGEqIgKKUIIIYQQFVEhRQghhBCiIiqkCCGEEEJURIUUIQZMJBJh6dKl2j4MvZeSkgKRSISUlJR645YuXQqRSIQHDx5o5sCUsGPHDohEIuTl5SkVLxKJEBoaqt6DIqQZoUKKEDVgv3xq/rO1tcXIkSNx8OBBbR9ek125cgVLly5V+suVNF+DBg2CSCTC1q1btX0ohOgkKqQIUaPly5dj586d+P777xEZGYn79+9j3LhxSEpK0vahNcmVK1ewbNkyKqR0XHZ2Ns6ePQtnZ2fs3r1bYczUqVNRVlYGJycnDR8dIbrBWNsHQIg+8/Pzw8CBA7nLM2fOhJ2dHfbu3Qt/f38tHpnuqK6uhlwuh6mpqbYPRe/s2rULtra2iI2NRUhICPLy8uDs7MyLEYvFEIvF9d4OwzAoLy+Hubm5Go+WkOaJWqQI0SAbGxuYm5vD2Jj/G6akpAQRERFwdHSERCJBt27dEBMTA4ZhAABlZWXo3r07unfvjrKyMu56jx49Qvv27TFkyBDIZDIAwPTp09GiRQv8888/8PX1haWlJRwcHLB8+XLu9upz8eJF+Pn5wcrKCi1atMCoUaOQkZHB7d+xYwdef/11AMDIkSO5U5cN9f/5+eef0bNnT5iZmcHNzQ379+/H9OnTeV/ceXl5EIlEiImJwbp169C5c2dIJBJcuXIFAHD8+HEMHz4clpaWsLGxQUBAAK5evcrLU/s2WWz/o5rY/jy7d+9Gt27dYGZmhgEDBuDkyZN1rn/nzh288847sLOzg0QiQa9evfDtt9/Wifv3338RGBgIS0tL2NraYt68eaioqKj3santwYMHeOONN2BlZYU2bdrgo48+Qnl5Obff29sbffr0UXjdbt26wdfXV6k8e/bsQUhICPz9/WFtbY09e/bUiVHUR8rZ2Rn+/v44fPgwBg4cCHNzc3z55Ze86zX0mN68eRMffPABunXrBnNzc7Rp0wavv/66wlbOy5cvw9vbG+bm5ujYsSNWrFiB7du31zmuc+fOwdfXF23btoW5uTlcXFzwzjvvKPVYEKIqapEiRI2Kiorw4MEDMAyDwsJCbNy4Ec+ePcNbb73FxTAMg1dffRUnTpzAzJkz0bdvXxw+fBgLFizAnTt3sHbtWpibm+O7777D0KFD8d///hdxcXEAgDlz5qCoqAg7duzgtRrIZDKMHTsWgwcPxurVq3Ho0CEsWbIE1dXVWL58+QuP96+//sLw4cNhZWWFyMhImJiY4Msvv8SIESOQmpoKDw8PeHl54cMPP8SGDRuwaNEi9OjRAwC4/xU5cOAAJk6cCHd3d0RHR+Px48eYOXMmOnTooDB++/btKC8vx6xZsyCRSNC6dWscPXoUfn5+eOmll7B06VKUlZVh48aNGDp0KC5cuKCweFJGamoqfvzxR3z44YeQSCTYsmULxo4diz/++ANubm4AgIKCAgwePJgrvNq1a4eDBw9i5syZKC4uRlhYGIDnBe+oUaNw69YtfPjhh3BwcMDOnTtx/PjxRh3TG2+8AWdnZ0RHRyMjIwMbNmzA48eP8f333wN4frrtvffeQ1ZWFneMAHD27Fn8/fff+OSTTxrMcebMGe
Tk5GD79u0wNTVFUFAQdu/ejUWLFil1jNevX8fkyZPxn//8B++99x66devG7VPmMT179ixOnz6NSZMmoWPHjsjLy8PWrVsxYsQIXLlyBRYWFgCeF7BswR4VFQVLS0t8/fXXkEgkvOMpLCzEmDFj0K5dO3z88cewsbFBXl4eEhISlLo/hKiMIYQIbvv27QyAOv8kEgmzY8cOXmxiYiIDgFmxYgVve0hICCMSiZicnBxuW1RUFGNkZMScPHmS+fnnnxkAzLp163jXmzZtGgOAmTt3LrdNLpcz48ePZ0xNTZn79+9z2wEwS5Ys4S4HBgYypqamzI0bN7htd+/eZVq2bMl4eXlx29jcJ06cUOrxcHd3Zzp27Mg8ffqU25aSksIAYJycnLhtubm5DADGysqKKSws5N1G3759GVtbW+bhw4fctj///JMxMjJi3n77bd79r3mbrCVLljC1P/LY5+XcuXPctps3bzJmZmbMa6+9xm2bOXMm0759e+bBgwe860+aNImxtrZmSktLGYZhmHXr1jEAmJ9++omLKSkpYVxdXZV6vNhjfPXVV3nbP/jgAwYA8+effzIMwzBPnjxhzMzMmIULF/LiPvzwQ8bS0pJ59uxZvXkYhmFCQ0MZR0dHRi6XMwzDMEeOHGEAMBcvXuTFsa/l3NxcbpuTkxMDgDl06FCd21X2MWUfs5rS09MZAMz333/PbZs7dy4jEol4x/Xw4UOmdevWvOPav38/A4A5e/Zsg/edECHRqT1C1Gjz5s2QSqWQSqXYtWsXRo4ciXfffZf3Kzk5ORlisRgffvgh77oRERFgGIY3ym/p0qXo1asXpk2bhg8++ADe3t51rseqOQSdbUmprKzE0aNHFcbLZDIcOXIEgYGBeOmll7jt7du3x5tvvolTp06huLi40Y/B3bt3kZmZibfffhstWrTgtnt7e8Pd3V3hdYKDg9GuXTvu8r1793Dp0iVMnz4drVu35rb37t0bPj4+SE5ObvRxsTw9PTFgwADucqdOnRAQEIDDhw9DJpOBYRjs27cPEyZMAMMwePDgAffP19cXRUVFuHDhAoDnz2X79u0REhLC3Z6FhQVmzZrVqGOaM2cO7/LcuXO52wcAa2trBAQEYO/evdzpWplMhh9//JE7rVif6upq/Pjjj5g4cSJ3uvOVV16Bra3tCzud1+bi4vLCU4gNPaYAeP2pqqqq8PDhQ7i6usLGxoZ7PAHg0KFD8PT0RN++fbltrVu3xpQpU3g5bWxsAABJSUmoqqpS6j4QIgQqpAhRo0GDBmH06NEYPXo0pkyZggMHDqBnz55cUQM87yvi4OCAli1b8q7Lniq7efMmt83U1BTffvstcnNz8fTpU66fSG1GRka8YggAunbtCgAvHGl3//59lJaW8k7R1DwWuVyO27dvK3/n/z/2+F1dXevsU7QNeP4lreg2XnRsDx48QElJSaOPDQC6dOlSZ1vXrl1RWlqK+/fv4/79+3jy5Am++uortGvXjvdvxowZAJ6fVmKP09XVtc5zoui4G3NMnTt3hpGREe+5e/vtt3Hr1i2kpaUBAI4ePYqCggJMnTq1wds/cuQI7t+/j0GDBiEnJwc5OTnIzc3FyJEjsXfvXsjl8gZvo/ZzVN/xA/zHFHh+GnTx4sVcv8C2bduiXbt2ePLkCYqKirjrsY9pbbW3eXt7Izg4GMuWLUPbtm0REBCA7du3N7p/GiGNRX2kCNEgIyMjjBw5EuvXr0d2djZ69erV6Ns4fPgwAKC8vBzZ2dn1fqHpqqaM/lJUWALgWkIaiy0q3nrrLUybNk1hTO/evVW6bWUpuk++vr6ws7PDrl274OXlhV27dsHe3h6jR49u8PbYVqc33nhD4f7U1FSMHDmy3tto6gi9uXPnYvv27QgLC4Onpyesra0hEokwadIkpQq52kQiEeLj45GRkYHffvsNhw8fxjvvvIPY2FhkZGTwWkMJERIVUoRoWHV1NQDg2bNnAAAnJyccPXoUT58+5b
VKXbt2jdvPunz5MpYvX44ZM2bg0qVLePfdd5GZmQlra2teDrlcjn/++YdrhQKAv//+GwBe2Cm7Xbt2sLCwwPXr1+v
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"cars_df.boxplot(column = 'Price', by = 'Airbags')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Although the plot still shows outliers (in fact it is much better than it was), the medians show that the price tends to grow with the number of airbags. "
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: xlabel='Price', ylabel='Mileage'>"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjcAAAHACAYAAABeV0mSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA6GklEQVR4nO3df1xW9f3/8ecF8kNEEOWXGIoIaqYpmiKZlsVSa1aftc2P+U1zzcpptahlrNJqK9Klc5nLfcpyffaZujb7sXQ6R/6YxtJU0koRFcVKEFBAQEHh/f3Dcc1LfuN1ccHhcb/drtv0fd7nnNd5xyXPnR/vYzPGGAEAAFiEh7sLAAAAcCbCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsJR2HW62bt2qiRMnKiIiQjabTe+//36Tt7FhwwaNHDlSnTt3VkhIiO6++24dPXrU6bUCAIDGadfhprS0VIMHD9bSpUubtX5WVpbuvPNO3XzzzUpPT9eGDRuUn5+v733ve06uFAAANJaNF2deZLPZ9N577+muu+6yt5WXl+vpp5/WypUrVVhYqIEDB2r+/Pm66aabJEl//vOfNXnyZJWXl8vD42JO/Otf/6o777xT5eXl8vLycsORAADQvrXrMzcNmT17ttLS0rRq1Srt3btXP/jBDzR+/HhlZmZKkoYNGyYPDw+9/fbbqqysVFFRkf73f/9XiYmJBBsAANyEMzf/dvmZm+zsbEVHRys7O1sRERH2fomJiRoxYoReeuklSdKWLVv0wx/+UAUFBaqsrFRCQoLWrVunLl26uOEoAAAAZ27qsG/fPlVWVqpv377y9/e3f7Zs2aLDhw9LknJycjRjxgxNmzZNO3fu1JYtW+Tt7a3vf//7IjMCAOAeHdxdQGtVUlIiT09P7dq1S56eng7L/P39JUlLly5VYGCgFixYYF/2hz/8QZGRkfr00081cuTIFq0ZAAAQbuoUFxenyspKnTx5UqNHj661T1lZmf1G4mrVQaiqqsrlNQIAgJra9WWpkpISpaenKz09XdLFR7vT09OVnZ2tvn37asqUKZo6darWrFmjrKws7dixQykpKVq7dq0k6fbbb9fOnTv1wgsvKDMzU7t379b06dPVq1cvxcXFufHIAABov9r1DcWbN2/W2LFja7RPmzZNK1as0Pnz5/XLX/5S77zzjr755hsFBwdr5MiRev755zVo0CBJ0qpVq7RgwQIdPHhQfn5+SkhI0Pz589W/f/+WPhwAAKB2Hm4AAID1tOvLUgAAwHoINwAAwFLa3dNSVVVV+vbbb9W5c2fZbDZ3lwMAABrBGKMzZ84oIiKixpPKl2t34ebbb79VZGSku8sAAADNcPz4cV111VX19ml34aZz586SLg5OQECAm6sBAACNUVxcrMjISPvv8fq0u3BTfSkqICCAcAMAQBvTmFtKuKEYAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYSrt7/UJrdiSvRMdOlSmqWyf1Du7k7nIAAGiTCDetQGFZhR5Zma6tmXn2tjGxIVoyOU6Bfl5urAwAgLaHy1KtwCMr07X9UL5D2/ZD+Xp45R43VQQAQNtFuHGzI3kl2pqZp0pjHNorjdHWzDxl5Ze6qTIAANomwo2bHTtVVu/yowWEGwAAmoJw42a9uvrVuzyqGzcWAwDQFIQbN4sO8deY2BB52mwO7Z42m8bEhvDUFAAATUS4aQWWTI7TqJhgh7ZRMcFaMjnOTRUBANB28Sh4KxDo56V37h+hrPxSHS0oZZ4bAACuAOGmFekdTKgBAOBKcVkKAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCu
EGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYCuEGAABYilvDzdatWzVx4kRFRETIZrPp/fffb3CdzZs3a+jQofLx8VFMTIxWrFjh8joBAEDb4dZwU1paqsGDB2vp0qWN6p+VlaXbb79dY8eOVXp6un7605/qxz/+sTZs2ODiSgEAQFvRwZ07nzBhgiZMmNDo/suWLVPv3r21cOFCSdLVV1+tbdu26de//rXGjRvnqjIBAEAb0qbuuUlLS1NiYqJD27hx45SWluamigAAQGvj1jM3TZWTk6OwsDCHtrCwMBUXF+vs2bPq2LFjjXXKy8tVXl5u/3txcbHL6wQAAO7Tps7cNEdKSooCAwPtn8jISHeXBAAAXKhNhZvw8HDl5uY6tOXm5iogIKDWszaSlJycrKKiIvvn+PHjLVEqAABwkzZ1WSohIUHr1q1zaNu4caMSEhLqXMfHx0c+Pj6uLg0AALQSbj1zU1JSovT0dKWnp0u6+Kh3enq6srOzJV086zJ16lR7/4ceekhHjhzRk08+qQMHDui3v/2t/vSnP+mxxx5zR/kAAKAVcmu4+eyzzxQXF6e4uDhJUlJSkuLi4jR37lxJ0okTJ+xBR5J69+6ttWvXauPGjRo8eLAWLlyoN998k8fAAQCAnc0YY9xdREsqLi5WYGCgioqKFBAQ4O5yAABAIzTl93ebuqEYAACgIYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKYQbAABgKW4PN0uXLlVUVJR8fX0VHx+vHTt21Nt/8eLF6tevnzp27KjIyEg99thjOnfuXAtVCwAAWju3hpvVq1crKSlJ8+bN0+7duzV48GCNGzdOJ0+erLX/H//4Rz311FOaN2+e9u/fr+XLl2v16tX6+c9/3sKVAwCA1sqt4WbRokWaMWOGpk+frgEDBmjZsmXy8/PTW2+9VWv/Tz75RKNGjdI999yjqKgo3XrrrZo8eXKDZ3sAAED74bZwU1FRoV27dikxMfE/xXh4KDExUWlpabWuc/3112vXrl32MHPkyBGtW7dOt912W537KS8vV3FxscMHAABYVwd37Tg/P1+VlZUKCwtzaA8LC9OBAwdqXeeee+5Rfn6+brjhBhljdOHCBT300EP1XpZKSUnR888/79TaAQBA6+X2G4qbYvPmzXrppZf029/+Vrt379aaNWu0du1a/eIXv6hzneTkZBUVFdk/x48fb8GKAQBAS3PbmZvg4GB5enoqNzfXoT03N1fh4eG1rvPss8/q3nvv1Y9//GNJ0qBBg1RaWqoHHnhATz/9tDw8amY1Hx8f+fj4OP8AAABAq+S2Mzfe3t4aNmyYUlNT7W1VVVVKTU1VQkJCreuUlZXVCDCenp6SJGOM64oFAABthtvO3EhSUlKSpk2bpuuuu04jRozQ4sWLVVpaqunTp0uSpk6dqh49eiglJUWSNHHiRC1atEhxcXGKj4/XoUOH9Oyzz2rixIn2kAMAANo3t4abSZMmKS8vT3PnzlVOTo6GDBmi9evX228yzs7OdjhT88wzz8hms+mZZ57RN998o5CQEE2cOFEvvviiuw4BAAC0MjbTzq7nFBcXKzAwUEVFRQoICHB3OQAAoBGa8vu7TT0tBQAA0B
DCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBTCDQAAsBT
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"cars_df.plot.scatter(x='Price', y='Mileage')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The plot shows an inverse relationship between the price of a car and its mileage. It also shows that none of the expensive cars has a high mileage."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: xlabel='Prod. year', ylabel='Price'>"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAlUAAAGwCAYAAACAZ5AeAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAAByCElEQVR4nO3de1hVVf4/8PcBueMB5CqJCmLe8p4gaaZGUuNUlt++5jRpRjkZ2qil5Uw/a5pmbGqcxszq21SaTRdzZnJKG83xgpPiDaG8kgqKJlcVEFCu+/dHc04cPGevBSzO2Rzfr+fhefSs7WLtDXI+rPVZn2XSNE0DEREREbWJh6sHQEREROQOGFQRERERKcCgioiIiEgBBlVERERECjCoIiIiIlKAQRURERGRAgyqiIiIiBTo5OoBXEsaGxtx7tw5dO7cGSaTydXDISIiIgmapuHSpUuIjo6Gh4fj+SgGVU507tw5xMTEuHoYRERE1ApnzpxBt27dHLYzqHKizp07A/jhi2I2m108GiIiIpJRUVGBmJgY6/u4IwyqnMiy5Gc2mxlUERERdTCi1B0mqhMREREpwKCKiIiISAEGVUREREQKMKgiIiIiUoBBFREREZECDKqIiIiIFGBQRURERKQAgyoiIiIiBRhUERERESnAoIqIiIhIAR5TQ0RE1AHkllTi9IVq9AwNQGxYgKuHQ3YwqCIiIjKwsupaPPFxNnYcL7G+NqZ3OJZPHYogfy8Xjoya4/IfERGRgT3xcTZ2nii1eW3niVLM+TjLRSMiRxhUERERGVRuSSV2HC9Bg6bZvN6gadhxvAR5pVUuGhnZw6CKiIjIoE5fqNZtP3WeQZWRMKgiIiIyqB5d/HXbe4YyYd1IGFQREREZVFx4IMb0DoenyWTzuqfJhDG9w7kL0GAYVBERERnY8qlDMSo+zOa1UfFhWD51qItGRI6wpAIREZGBBfl7YXVqAvJKq3DqfBXrVBkYgyoiIqIOIDaMwZTRuXT5780338SgQYNgNpthNpuRlJSEf/3rX9b2K1euIC0tDaGhoQgMDMTkyZNRVFRk00d+fj4mTpwIf39/REREYMGCBaivr7e5Zvv27Rg2bBh8fHwQHx+PVatWXTWWFStWoGfPnvD19UViYiL27t1r0y4zFiIiIrp2uTSo6tatG1566SVkZmZi//79GD9+PO6++24cPnwYADBv3jx88cUXWLt2LdLT03Hu3Dnce++91n/f0NCAiRMnora2Frt27cL777+PVatWYfHixdZr8vLyMHHiRIwbNw7Z2dmYO3cuHnnkEWzatMl6zZo1azB//nw899xzOHDgAAYPHoyUlBQUFxdbrxGNhYiIiK5xmsGEhIRo77zzjlZWVqZ5eXlpa9eutbYdPXpUA6BlZGRomqZpX375pebh4aEVFhZar3nzzTc1s9ms1dTUaJqmaQsXLtQGDBhg8zmmTJmipaSkWP+ekJCgpaWlWf/e0NCgRUdHa0uWLNE0TZMai4zy8nINgFZeXi79b4iIqGM7WXxJ23qsSMstqXT1UAypIzwf2fdvw+RUNTQ0YO3ataiqqkJSUhIyMzNRV1eH5ORk6zV9+/ZF9+7dkZGRgZEjRyIjIwMDBw5EZGSk9ZqUlBTMmjULhw8fxtChQ5GRkWHTh+WauXPnAgBqa2uRmZmJRYsWWds9PDyQnJyMjIwMAJAaiz01NTWoqamx/r2ioqL1D4iIiDoUntmnzx2fj8tLKhw8eBCBgYHw8fHBY489hs8++wz9+/dHYWEhvL29ERwcbHN9ZGQkCgsLAQCFhYU2AZWl3dKmd01FRQUuX76M0tJSNDQ02L2maR+isdizZMkSBAUFWT9iYmLkHgoREXV4PLNPnzs+H5cHVX369EF2djb27NmDWbNmYfr06Thy5Iirh6XEokWLUF5ebv04c+aMq4dEREROwDP79Lnr83H58p+3tzfi4+MBAMOHD8e+ffuwbNkyTJkyBbW1tS
grK7OZISoqKkJUVBQAICoq6qpdepYdeU2vab5Lr6ioCGazGX5+fvD09ISnp6fda5r2IRqLPT4+PvDx8WnB0yAiIncgc2bftVwewV2fj8tnqpprbGxETU0Nhg8fDi8vL2zZssXalpOTg/z8fCQlJQEAkpKScPDgQZtdeps3b4bZbEb//v2t1zTtw3KNpQ9vb28MHz7c5prGxkZs2bLFeo3MWIiIiCx4Zp8+d30+Lp2pWrRoEe644w50794dly5dwkcffYTt27dj06ZNCAoKQmpqKubPn48uXbrAbDZjzpw5SEpKsiaGT5gwAf3798eDDz6Il19+GYWFhXj22WeRlpZmnSF67LHH8Prrr2PhwoV4+OGHsXXrVnz66afYsGGDdRzz58/H9OnTceONNyIhIQF//vOfUVVVhRkzZgCA1FiIiIgsLGf27TxRarPE5WkyYVR8WIechVHJXZ+PS4Oq4uJiTJs2DQUFBQgKCsKgQYOwadMm3HbbbQCAV199FR4eHpg8eTJqamqQkpKCN954w/rvPT09sX79esyaNQtJSUkICAjA9OnT8cILL1iviY2NxYYNGzBv3jwsW7YM3bp1wzvvvIOUlBTrNVOmTEFJSQkWL16MwsJCDBkyBBs3brRJXheNhYiIqKnlU4dizsdZNrvbeGbfj9zx+Zg0rVmWGLWbiooKBAUFoby8HGaz2dXDISIiJ+CZffo6wvORff92eaI6ERGRO+OZffrc6fkwqCIiIqIWyy2pxOkL1YaeYXI2BlVEREQkzR0roatiuJIKREREZFzuWAldFQZVREREJMVdK6GrwqCKiIiIpMhUQr+WMagiIiIiKe5aCV0VBlVEREQkxVIJ3dNksnnd02TCmN7h1/wuQAZVREREJG351KEYFR9m81pHr4SuCksqEBERkbQgfy+sTk3oEJXQnY1BFREREbWYO1VCV4XLf0REREQKMKgiIiIiUoBBFREREZECDKqIiIiIFGBQRURERKQAgyoiIiIiBRhUERERESnAoIqIiIhIAQZVRERERAowqCIiIiJSgEEVERERkQIMqoiIiIgUYFBFREREpACDKiIiIiIFGFQRERERKcCgioiIiEgBBlVERERECjCoIiIiIlKAQRURERGRAgyqiIiIiBRgUEVERESkQCdXD4CIiIg6ntySSpy+UI2eoQGIDQtw9XAMgUEVERERSSurrsUTH2djx/ES62tjeodj+dShCPL3cuHIXI/Lf0RERCTtiY+zsfNEqc1rO0+UYs7HWS4akXEwqCIiIiIpuSWV2HG8BA2aZvN6g6Zhx/ES5JVWuWhkxsCgioiIiKScvlCt237qPIMqIiIiIqEeXfx123uGXtsJ6wyqiIiISEpceCDG9A6Hp8lk87qnyYQxvcOv+V2ADKqIiIg6gNySSmzLKXZ53tLyqUMxKj7M5rVR8WFYPnWoi0ZkHCypQEREZGBGK2EQ5O+F1akJyCutwqnzVaxT1QRnqoiIiAzMqCUMYsMCMK5PBAOqJhhUERERGVR7lDAwyjKiO3JpULVkyRKMGDECnTt3RkREBCZNmoScnByba8aOHQuTyWTz8dhjj9lck5+fj4kTJ8Lf3x8RERFYsGAB6uvrba7Zvn07hg0bBh8fH8THx2PVqlVXjWfFihXo2bMnfH19kZiYiL1799q0X7lyBWlpaQgNDUVgYCAmT56MoqIiNQ+DiIioGZUlDMqqazHt3b0YvzQdM1buw7g/bse0d/eivLqurcOk/3JpUJWeno60tDTs3r0bmzdvRl1dHSZMmICqKttvkkcffRQFBQXWj5dfftna1tDQgIkTJ6K2tha7du3C+++/j1WrVmHx4sXWa/Ly8jBx4kSMGzcO2dnZmDt3Lh555BFs2rTJes2aNWswf/58PPfcczhw4AAGDx6MlJQUFBcXW6+ZN28evvjiC6xduxbp6ek4d+4c7r333nZ8QkREdC1TWcLAqMuI7sSkac3mFF2opKQEERERSE9Px5gxYw
D8MFM1ZMgQ/PnPf7b7b/71r3/hpz/9Kc6dO4fIyEgAwFtvvYWnn34aJSUl8Pb2xtNPP40NGzbg0KFD1n93//33o6y
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"cars_df.plot.scatter(x ='Prod. year', y='Price')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The plot shows that the price depends directly on the production year (that is, the later the year, the higher the price)."
]
},
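{
"cell_type": "markdown",
"metadata": {},
"source": [
"The visual impression from the scatter plots can be backed by a number. This is a minimal sketch, assuming the Pearson correlation coefficient is an acceptable summary of a roughly linear relationship. `toy_cars` holds made-up values for illustration, not rows from car_price_prediction.csv."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Made-up values that mimic the column names of the car dataset\n",
"toy_cars = pd.DataFrame({\n",
"    'Price': [5000, 9000, 12000, 20000, 26000],\n",
"    'Mileage': [220000, 180000, 150000, 60000, 30000],\n",
"    'Prod. year': [2004, 2008, 2011, 2016, 2019],\n",
"})\n",
"\n",
"# Pearson correlation of every column with Price:\n",
"# values near +1 suggest a direct relationship, values near -1 an inverse one\n",
"print(toy_cars.corr()['Price'])"
]
},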
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAjIAAAHjCAYAAAA5RvVOAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAACuTElEQVR4nOydeVxU1fvHP8OOIItsbqiAiHvuprhnmktqmpmZmppm4Z6WVlqppZZL7pmZW6VmXxeyvmoaoiLuuCuyKbiCCxC4Auf3h7+ZLwMzd+5lDjNnrs/79ZrXC+5hDufee865zz3Pcz6PhjHGQBAEQRAEYYPYWbsBBEEQBEEQJYUMGYIgCIIgbBYyZAiCIAiCsFnIkCEIgiAIwmYhQ4YgCIIgCJuFDBmCIAiCIGwWMmQIgiAIgrBZyJAhCIIgCMJmcbB2A0qbgoIC3LhxA2XLloVGo7F2cwiCIAiCkAFjDP/++y8qVqwIOzvj6y6qN2Ru3LiBwMBAazeDIAiCIIgSkJaWhsqVKxstV70hU7ZsWQDPLoSHh4eVW0MQBEEQhByys7MRGBioe44bQ/WGjNad5OHhQYYMQRAEQdgYpsJCKNiXIAiCIAibhQwZgiAIgiBsFjJkCIIgCIKwWciQIQiCIAjCZiFDhiAIgiAIm4UMGYIgCIIgbBYyZAiCIAiCsFnIkCEIgiAIwmYhQ4YgCIIgCJuFDBmCIAiCIGwW1acoIAiCIAhrk5yRg6v3HqCajxuCfN2s3RxVQYYMQRAEQZQSmQ+eYMyGU9ifkKE71ibUD4v7N4RnGUcrtkw9kGuJIAiCIEqJMRtOISbxjt6xmMQ7GL0hzkotUh9kyBAEQRBEKZCckYP9CRnIZ0zveD5j2J+QgZQ7uVZqmbogQ4YgCIIgSoGr9x5Ill+5S4YMD8iQIQiCIIhSoGq5MpLl1Xwo6JcHZMgQBEEQBGGzkCFDEARBEKUAuZYsAxkyBEEQBFEKkGvJMpAhQxAEQRClQLCfO9qE+sFeo9E7bq/RoE2oHwnjcYIMGYIgCIIoJRb3b4jw6r56x8Kr+2Jx/4ZWapH6IGVfgiAIgiglPMs4Yt2wZki5k4srd3MpRUEpQIYMQRAEQZQyQb5kwJQW5FoiCIIgCMJmIUOGIAiCIAibhVxLViY6Ph2nrmWiURVvtA71M6suXmniebaJKH143S9e/YdXPaJB40KaTUdTEZtyF+EhvujbJNDazeEG3Xfx0TBWJJuVysjOzoanpyeysrLg4eHBrV5zJ+urd3PRa2kM7j94qjvmXcYRkRGtEOgjrT1QFF5p4nm2iSh9eN0vXv2HVz2iQeNCmrPXMvHaskPIK/jfo8TBToPIiHDUruRpxZaZB9136yP3+U2GjEJ4TdYNp+/WGyBavMs4Im5aJ0VtGrTqKGIS7+hlWLXXaBBe3RfrhjWzSpuI0ofX/eLVf3jVIxo0LqSp/slfekaMFgc7DRK/7mqFFvGB7rv1kfv8phgZhYzZcAoxiXf0jsUk3sHoDXGy64iOTzc4QADg/oOnOFDISDIFrzTxPNtElD687hev/sOrHtGgcSHNpqOpBo0YAMgrYNh8PM3CLeID3XfbggwZBfCarE9dy5QsP5l6X3abeOXy4NkmovThdb949R+15pShcSFNbMpdyfKYpDuS5aJC9922IENGAbwm6waVvSTLG1Xxltskbrk8eLaJKH143S9e/UetOWVoXEjTIshHsjw8xFeyXFTovtsWZMgogNdk3TbMH95G4mm8yzgqioznlcuDZ5uI0ofX/eLVf9SaU4bGhTT9mlWRLLfV3Ut0320LMmQUwHOyjoxoVWygaCPilcIrlwfPNhGlD6/7xav/qDWnDI0L4yRn5EiW22psFED33ZagXUsKyXrwFKM3xHHbYnogIQMnU+9z0SjglcuDZ5uI0ofX/eLVf9SaU4bGRXGi4tMxZPUxo+WrhzRF+zB/C7aIP3TfrQdtv/5/SktHRq2TNUEQhF
ySM3LQYV600fKoie1ofiRKjNznNyn7lhBKAEYQxPOO1t1uTD+I5kjCElCMDEEQBFFi1BobRdgOtCJDEARBlBjPMo5YN6wZudsJq0GGDEEQBGE25G4nrAW5lgiCIAiCsFnIkCEIgiAIwmYhQ4YgCIIgCJuFDBmCIAiCIGwWMmQIgiAIgrBZaNcSUYzkjBxcvffA7G2UvOohCIKwdWg+LD3IkCF0ZD54gjEbTpmdR4pXPQRBELYOzYelD7mWCB1jNpxCTOIdvWMxiXcwekOcVeohCIKwdWg+LH3IkCEAPFv23J+QoZcvBQDyGcP+hAyk3Mm1aD0EQRC2Ds2HloEMGQIAcPXeA8nyK3flDThe9RAEQdg6NB9aBjJkCABA1XJlJMur+cgLTuNVD0EQhK1D86FlIEOGAAAE+7mjTagf7DUaveP2Gg3ahPrJjrLnVQ9BEIStE+znDm8jAb3eZRxpPuQEGTKEjsX9GyK8uq/esfDqvljcv6FV6iEIgrBlkjNycP/BU4Nl9x88pRgZTtD2aysTHZ+OU9cy0aiKN1qH+lm1LZ5lHLFuWDOk3MnFlbu5JdY74FUPQYgE6SsRSpETI0N9wHzIkLESV+/motfSGD1r3buMIyIjWiHQR9qvWtoE+fKZYHnVQxDWhPSViJJCMTKWgVxLVqKoEQM8W2rssfSglVpEEIQhSF+JKCkUI2MZyJCxAtHx6ZJ+0wOF3tgIgrAepK9EmAPFyFgGMmRKSHJGDqLi00vUEU9dy5QsP5l6v4StIgiCJ6SvRJgD3XfLQDEyCuHh525Q2UuyvFEVb3OaSBAEJ0hfiTAHuu+WgVZkFMLDz902zF/Sb2rt3UsEQTyDV4wD6Ss9n9B9twxkyCiAp587MqJVsQlSu2uJIAgx4BnjQPpKzyd030sfci0pgKcmQKBPGcRN64TfjqXiUPJdhIf4om+TQB7NJAiCEzzHPOkrPZ/QfS99yJBRAE9/Z9FYm21xN/DH6ZukKUEQAlEaMQ6kr/R8Qve99CDXkgJ4+jtJU4IgxIdiHAhCfKxqyOTn52Pq1KkICgqCq6srQkJCMGPGDLBCMSiMMUybNg0VKlSAq6srOnbsiISEBKu1mYe/kzQlCMJ2oBgHghAbq7qW5syZg+XLl2Pt2rWoU6cOjh8/jiFDhsDT0xNjxowBAHzzzTdYtGgR1q5di6CgIEydOhWdO3fGhQsX4OLiYvE2a/2d+y+nIy6tZDmSnpf8GyLlkVIzm46mIjaF4qxKC+2Y33QsFbEc4tnUmmuJ13in60MoRcNYkWUBC9K9e3cEBARg1apVumN9+vSBq6srfv75ZzDGULFiRXz44YeYOHEiACArKwsBAQFYs2YN3nzzTZP/Izs7G56ensjKyoKHh4fZbeahI5OckYMO86KNlkdNbGfTA1jkPFJq4uy1TLy27BDyCv43hB3sNIiMCEftSp5WbJm64NWf1Zpria6PNDQflhy5z2+rupZatmyJvXv34vLlywCA06dP4+DBg+jSpQsAICUlBbdu3ULHjh113/H09ETz5s0RGxtrlTbziG1Re/4NyiNlGYoaMQCQV8DQY2mMlVqkTnj1Z7XGxdH1kYbmw9LHqobM5MmT8eabb6JmzZpwdHREw4YNMW7cOAwYMAAAcOvWLQBAQECA3vcCAgJ0ZUV5/PgxsrOz9T684Jl3Ra35NyiPlGXYdDS1mBGjJa+AYfPxNAu3SJ3w6s9qjYuj6yMNzYeWwaqGzG+//YZffvkFv/76K06ePIm1a9di7ty5WLt2bYnrnDVrFjw9PXWfwEB+MQOUd8U0lEfKMsSm3JUsj0m6I1lOyINXf1brmKfrIw3Nh5bBqobMpEmTdKsy9erVw8CBAzF+/HjMmjULAFC+fHkAwO3bt/W+d/v2bV1ZUaZMmYKsrCzdJy2N35sp5V0xDeWRsgwtgnwky8NDfCXLCXnw6s9qHfN0faSh+dAyWNWQefDgAezs9Jtgb2+PgoICAEBQUBDKly+PvX
v36sqzs7Nx5MgRtGjRwmCdzs7O8PDw0PvwgpemhJq1KQJNTEiVvSm4jQf9mlWBg53GYJmDnYZ2L3GCV140tY55uj7
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"phones_df.plot.scatter(x='company', y='Price')\n",
"plt.xticks(rotation=90)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The plot shows how the price depends on the manufacturer."
]
},
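{
"cell_type": "markdown",
"metadata": {},
"source": [
"With many manufacturers on one axis, a per-manufacturer summary can be easier to read than a scatter plot. A minimal sketch, assuming the median is a reasonable summary of the price level. `toy_phones` and its values are made up for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Made-up values that mimic the 'company' and 'Price' columns of the phone dataset\n",
"toy_phones = pd.DataFrame({\n",
"    'company': ['A', 'A', 'B', 'B', 'B'],\n",
"    'Price': [10.0, 14.0, 30.0, 34.0, 50.0],\n",
"})\n",
"\n",
"# Median price per manufacturer, most expensive first\n",
"median_price = toy_phones.groupby('company')['Price'].median().sort_values(ascending=False)\n",
"print(median_price)"
]
},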
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAi8AAAHeCAYAAABT8utlAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjkuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8hTgPZAAAACXBIWXMAAA9hAAAPYQGoP6dpAABws0lEQVR4nO3de1wU9f4/8NfCckfRSAFFhSRFAzXN8ApaXgmDEDtWpnb9lpmpoCVdvByPJiLeSj1m6SmTUyHhOXhJNC+YkoZZmOGFwPKGd1FAYJfP7w9/u4flolxmZ3eG1/Px4KE7Mzvvz2dndvY9n/nMZzRCCAEiIiIihbCxdAGIiIiI6oLJCxERESkKkxciIiJSFCYvREREpChMXoiIiEhRmLwQERGRojB5ISIiIkVh8kJERESKwuSFiIiIFIXJC5EV0Gg0mDVrlqWLoXq7d++GRqPB7t2777rcrFmzoNFocPnyZXkKJhMfHx+MHz/e0sUgajAmL6Rq69atg0ajMflr2bIlBg4ciK1bt1q6eA127NgxzJo1C3l5eZYuCtXDgAEDTPZNJycndOnSBUuWLEF5eXm91rl//37MmjUL169fl7awRFZEa+kCEMlhzpw58PX1hRAC+fn5WLduHUJDQ/Hf//4XYWFhli5evR07dgyzZ8/GgAED4OPjY+niUD14e3tj/vz5AIDLly9jw4YNmDJlCi5duoR//OMfdV7f/v37MXv2bIwfPx7NmjUzmXf8+HHY2PCclZSPyQs1CsOHD8cjjzxifP3SSy/Bw8MDiYmJik5e5KTT6VBeXg57e3tLF0VV3NzcMGbMGOPr1157Df7+/li+fDnmzJkDW1tbyWI5ODhIti4iS2IKTo1Ss2bN4OTkBK3WNH8vLCxEdHQ02rRpAwcHB3Ts2BHx8fEwPHy9uLgY/v7+8Pf3R3FxsfF9V69ehZeXF/r06QO9Xg8AGD9+PFxdXfHHH39g6NChcHFxQatWrTBnzhzU5mHuP//8M4YPH46mTZvC1dUVjz/+ODIyMozz161bh1GjRgEABg4caLz0cK/+HN988w06d+4MR0dHBAQE4Ntvv8X48eNNWm7y8vKg0WgQHx+PJUuWoH379nBwcMCxY8cAAN9//z369+8PFxcXNGvWDOHh4fj9999N4lRep4GhP0lFGo0GEydOxJdffomOHTvC0dERPXr0wN69e6u8/+zZs3jxxRfh4eEBBwcHPPTQQ/jss8+qLHfmzBlERETAxcUFLVu2xJQpU1BSUnLXz6ayy5cv4+mnn0bTpk3h7u6Ot956C7dv3zbODwkJQdeuXat9b8eOHTF06NA6xQMAR0dH9OzZEzdv3sTFixeN03/99VeMHz8eDzzwABwdHeHp6YkXX3wRV65cMS4za9YsTJs2DQDg6+tr3CcMlxUr93kxXFb94YcfMHXqVLRo0QIuLi546qmncOnSJZNylZeXY9asWWjVqhWcnZ0xcOBAHDt2jP1oyCLY8kKNwo0bN3D58mUIIXDx4kUsX74ct27dMjnjFULgySefxK5du/DSSy+hW7du+O677zBt2jScPXsWixcvhpOTE/71r3+hb9++ePfdd5GQkAAAeOONN3Djxg2sW7fO5ExZr9dj2LBh6NWrF+Li4rBt2zbMnDkTOp0Oc+bMqbG8v/32G/r374+mTZti+vTpsLOzwz//+U8MGDAAe/bsQVBQEIKDgzFp0iQsW7YMsbGx6NSpEwAY/63O5s2b8be//Q2BgYGYP38+rl27hpdeegmtW7eudvm1a9fi9u3bePXVV+Hg4ID77rsPO3bswPDhw/HAAw9g1qxZKC4uxvLly9G3b18cPny43pev9uzZg6+++gqTJk2Cg4MDVqxYgWHDhuHgwYMICAgAAOTn56NXr17GZKdFixbYunUrXnrpJRQUFGDy5MkA7iSZjz/+OP78809MmjQJrVq1whdffIHvv/++TmV6+umn4ePjg/nz5yMjIwPLli3DtW
vX8PnnnwMAnn/+ebzyyis4evSosYwAcOjQIZw4cQLvvfdevT4LQ/JY8bJPWloa/vjjD7zwwgvw9PTEb7/9htWrV+O3335DRkYGNBoNIiMjceLECSQmJmLx4sW4//77AQAtWrS4a7w333wTzZs3x8yZM5GXl4clS5Zg4sSJ+Oqrr4zLzJgxA3FxcRgxYgSGDh2KX375BUOHDjVJ5ohkI4hUbO3atQJAlT8HBwexbt06k2VTUlIEADF37lyT6VFRUUKj0YhTp04Zp82YMUPY2NiIvXv3im+++UYAEEuWLDF537hx4wQA8eabbxqnlZeXiyeeeELY29uLS5cuGacDEDNnzjS+joiIEPb29iInJ8c47dy5c6JJkyYiODjYOM0Qe9euXbX6PAIDA4W3t7e4efOmcdru3bsFANGuXTvjtNzcXAFANG3aVFy8eNFkHd26dRMtW7YUV65cMU775ZdfhI2NjRg7dqxJ/Suu02DmzJmi8qHHsF1++ukn47TTp08LR0dH8dRTTxmnvfTSS8LLy0tcvnzZ5P2jR48Wbm5uoqioSAghxJIlSwQA8fXXXxuXKSwsFH5+frX6vAxlfPLJJ02mT5gwQQAQv/zyixBCiOvXrwtHR0fx9ttvmyw3adIk4eLiIm7dunXXOCEhIcLf319cunRJXLp0SWRnZ4tp06YJAOKJJ54wWdZQt4oSExMFALF3717jtIULFwoAIjc3t8ry7dq1E+PGjTO+Nnw/Bg0aJMrLy43Tp0yZImxtbcX169eFEEJcuHBBaLVaERERYbK+WbNmCQAm6ySSAy8bUaPw8ccfIy0tDWlpaVi/fj0GDhyIl19+GcnJycZltmzZAltbW0yaNMnkvdHR0RBCmNydNGvWLDz00EMYN24cJkyYgJCQkCrvM5g4caLx/4YWg9LSUuzYsaPa5fV6PbZv346IiAg88MADxuleXl549tlnsW/fPhQUFNT5Mzh37hyysrIwduxYuLq6GqeHhIQgMDCw2veMHDnS5Kz9/PnzOHLkCMaPH4/77rvPOL1Lly4YPHgwtmzZUudyGfTu3Rs9evQwvm7bti3Cw8Px3XffQa/XQwiBjRs3YsSIERBC4PLly8a/oUOH4saNGzh8+DCAO9vSy8sLUVFRxvU5Ozvj1VdfrVOZ3njjDZPXb775pnH9wJ3+KuHh4UhMTDReCtTr9fjqq6+Ml6zuJTs7Gy1atECLFi3g7++PhQsX4sknn8S6detMlnNycjL+//bt27h8+TJ69eoFAMZ619err75qcimvf//+0Ov1OH36NABg586d0Ol0mDBhgsn7DJ8HkdyYvFCj8Oijj2LQoEEYNGgQnnvuOWzevBmdO3c2JhIAcPr0abRq1QpNmjQxea/hMozhQA4A9vb2+Oyzz5Cbm4ubN29i7dq1VfpxAICNjY1JAgIAHTp0AIAab2++dOkSioqK0LFjxyrzOnXqhPLycvz111+1r/z/Zyi/n59flXnVTQPu9Juobh01le3y5csoLCysc9kA4MEHH6wyrUOHDigqKsKlS5dw6dIlXL9+HatXrzb+2Bv+XnjhBQAw9hE5ffo0/Pz8qmyT6spdlzK1b98eNjY2Jttu7Nix+PPPP5Geng4A2LFjB/Lz8/H888/XKoaPjw/S0tLw3XffYcWKFWjdujUuXboER0dHk+WuXr2Kt956Cx4eHnByckKLFi2M2+fGjRt1qldlbdu2NXndvHlzAMC1a9cA1Lzv3HfffcZlieTEPi/UKNnY2GDgwIFYunQpTp48iYceeqjO6/juu+8A3DkLPnnyZJUfejWoeLZfV9UlcwCMHZrryjDuyZgxYzBu3Lhql+nSpUu91l1b1dVp6NCh8PDwwPr16xEcHIz169fD09MTgwYNqtU6XVxcTJbt27cvunfvjtjYWCxbtsw4/emnn8b+/fsxbdo0dOvWDa6urigvL8ewYcPqPSaMQU13NIladCwnsgQmL9Ro6XQ6AMCtW7cAAO3atcOOHTtw8+
ZNk9aX7Oxs43yDX3/9FXPmzMELL7yAI0eO4OWXX0ZWVhbc3NxMYpSXl+OPP/4wtrYAwIkTJwCgxo6tLVq0gLOzM44
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"phones_df.boxplot(column='Price', by='Rating')\n",
"plt.xticks(rotation=90)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Judging by the plot, the rating (how much users liked the phone) does not depend directly on the price."
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: xlabel='Price', ylabel='Ram'>"
]
},
"execution_count": 82,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"phones_df.plot.scatter(x='Price', y='Ram')"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: xlabel='Age', ylabel='Networth'>"
]
},
"execution_count": 97,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"rich_df.plot.scatter(x='Age', y='Networth')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "The scatter plot of net worth against age shows that most billionaires are over forty."
]
},
{
"cell_type": "code",
"execution_count": 146,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"country_counts = rich_df['Country'].value_counts()\n",
"plt.figure(figsize=(10, 6))\n",
"country_counts.plot(kind='bar')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "The bar chart shows how many billionaires live in each country, which hints at how the chance of becoming a billionaire depends on the country of residence."
]
},
{
"cell_type": "code",
"execution_count": 153,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: xlabel='Industry'>"
]
},
"execution_count": 153,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"text/plain": [
"<Figure size 1000x600 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
    "industry_counts = rich_df['Industry'].value_counts()\n",
    "plt.figure(figsize=(10, 6))\n",
    "industry_counts.plot(kind='bar')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "The bar chart shows how the number of billionaires varies across industries."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "8.\n",
    "In my view, the datasets are fairly informative given their number of rows and attributes. \n",
    "Coverage, agreement with real-world data, and label consistency cannot be verified here (though I trust the dataset authors)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "9.\n",
    "Checking for missing values:"
]
},
{
"cell_type": "code",
"execution_count": 154,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Series([], dtype: int64)\n",
"--------------\n",
"Series([], dtype: int64)\n",
"--------------\n",
"Series([], dtype: int64)\n"
]
}
],
"source": [
"print(cars_df.isnull().sum().loc[lambda x: x>0])\n",
"print(\"--------------\")\n",
"print(phones_df.isnull().sum().loc[lambda x: x>0])\n",
"print(\"--------------\")\n",
"print(rich_df.isnull().sum().loc[lambda x: x>0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "The phones dataset turned out to contain missing values. Unfortunately, all of them are in non-numeric columns, so we simply fill them with the mode."
]
},
{
"cell_type": "code",
"execution_count": 155,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Series([], dtype: int64)\n"
]
    }
   ],
   "source": [
    "columns = [\"Android_version\", \"Inbuilt_memory\", \"fast_charging\", \"Screen_resolution\", \"Processor\"]\n",
    "for column in columns:\n",
    "    mode = phones_df[column].mode()[0]\n",
    "    # Assign the result back: the chained inplace=True form stops working in pandas 3.0\n",
    "    phones_df[column] = phones_df[column].fillna(mode)\n",
    "\n",
    "print(phones_df.isnull().sum().loc[lambda x: x>0])"
]
},
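  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of the same mode imputation on a toy frame rather than the real dataset. The column values here are made up for illustration. Assigning the result of `fillna` back to the column is the form pandas recommends over a chained `inplace=True` call.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Toy data (hypothetical values, not taken from the phones dataset)\n",
    "toy_df = pd.DataFrame({'Processor': ['Octa Core', None, 'Octa Core', 'Hexa Core']})\n",
    "\n",
    "# mode() returns a Series because ties are possible; take the first entry\n",
    "most_common = toy_df['Processor'].mode()[0]\n",
    "toy_df['Processor'] = toy_df['Processor'].fillna(most_common)\n",
    "\n",
    "print(toy_df['Processor'].tolist())\n",
    "```\n",
    "\n",
    "Mode imputation keeps every row but makes the most frequent category even more dominant, so it is worth comparing the value counts before and after."
   ]
  },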
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "I am not sure how sound this is or how it will affect data quality, but dropping 400+ rows out of about 1300 would clearly be worse.\n",
    "\n",
    "10. \n",
    "Splitting the data into subsets"
]
},
{
"cell_type": "code",
"execution_count": 156,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"18712 13098 2807 2807\n",
"18712\n",
"\n",
" 1282 897 193 192\n",
"1282\n",
"\n",
" 2565 1795 385 385\n",
"2565\n"
]
}
],
"source": [
"from sklearn.model_selection import train_test_split\n",
"\n",
"cars_train_df, cars_temp_df = train_test_split(cars_df, test_size=0.3, random_state=52)\n",
"cars_val_df, cars_test_df = train_test_split(cars_temp_df, test_size=0.5, random_state=52)\n",
"\n",
"phones_train_df, phones_temp_df = train_test_split(phones_df, test_size=0.3, random_state=52)\n",
"phones_val_df, phones_test_df = train_test_split(phones_temp_df, test_size=0.5, random_state=52)\n",
"\n",
"rich_train_df, rich_temp_df = train_test_split(rich_df, test_size=0.3, random_state=52)\n",
"rich_val_df, rich_test_df = train_test_split(rich_temp_df, test_size=0.5, random_state=52)\n",
"\n",
"print(cars_df.shape[0], cars_train_df.shape[0], cars_test_df.shape[0], cars_val_df.shape[0])\n",
"print(cars_val_df.shape[0] + cars_test_df.shape[0] + cars_train_df.shape[0])\n",
"print('\\n', phones_df.shape[0], phones_train_df.shape[0], phones_test_df.shape[0], phones_val_df.shape[0])\n",
"print(phones_val_df.shape[0] + phones_test_df.shape[0] + phones_train_df.shape[0])\n",
"print('\\n', rich_df.shape[0], rich_train_df.shape[0], rich_test_df.shape[0], rich_val_df.shape[0])\n",
"print(rich_val_df.shape[0] + rich_test_df.shape[0] + rich_train_df.shape[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "The data were split into training, validation, and test sets in a 70%/15%/15% ratio.\n",
    "\n",
    "11. The proportions are taken from the lecture, so the split is presumably balanced."
]
},
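  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two-step split used above can be wrapped in a small helper. This is only a sketch: `split_70_15_15` and the toy frame are hypothetical names introduced for illustration, and the seed simply mirrors the one used in this notebook.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "def split_70_15_15(df, seed=52):\n",
    "    # Hypothetical helper: hold out 30% first, then halve it into validation and test\n",
    "    train_df, temp_df = train_test_split(df, test_size=0.3, random_state=seed)\n",
    "    val_df, test_df = train_test_split(temp_df, test_size=0.5, random_state=seed)\n",
    "    return train_df, val_df, test_df\n",
    "\n",
    "toy_df = pd.DataFrame({'x': range(100)})\n",
    "train_df, val_df, test_df = split_70_15_15(toy_df)\n",
    "\n",
    "# The three parts should add up to the original number of rows\n",
    "print(len(train_df), len(val_df), len(test_df))\n",
    "```\n",
    "\n",
    "If class proportions matter, `train_test_split` also accepts a `stratify` argument, which this sketch does not use."
   ]
  },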
{
"cell_type": "code",
"execution_count": 157,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"oversampling:\n",
"old_type\n",
"Old 13190\n",
"Normal 13190\n",
"New 13190\n",
"Name: count, dtype: int64\n",
"undersampling:\n",
"old_type\n",
"Old 2274\n",
"Normal 2274\n",
"New 2274\n",
"Name: count, dtype: int64\n"
]
}
],
"source": [
"from imblearn.over_sampling import RandomOverSampler\n",
"from imblearn.under_sampling import RandomUnderSampler\n",
"cars_df['old_type'] = pd.cut(cars_df['Prod. year'], bins=[1900, 2004, 2015, 2025], \n",
" labels=['Old', 'Normal', 'New'])\n",
"\n",
"y = cars_df['old_type']\n",
"x = cars_df.drop(columns=['Prod. year', 'old_type'])\n",
"\n",
"oversampler = RandomOverSampler(random_state=52)\n",
"x_resampled, y_resampled = oversampler.fit_resample(x, y)\n",
"\n",
"undersampler = RandomUnderSampler(random_state=52)\n",
"x_resampled_under, y_resampled_under = undersampler.fit_resample(x, y)\n",
"\n",
"print(\"oversampling:\")\n",
"print(pd.Series(y_resampled).value_counts())\n",
"\n",
"print(\"undersampling:\")\n",
"print(pd.Series(y_resampled_under).value_counts())"
]
},
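  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The class labels above come from `pd.cut`. A small sketch of how it assigns bins, assuming the default `right=True`; the years below are made up to hit the bin edges and are not taken from the cars dataset.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical production years, chosen to land on the bin edges\n",
    "years = pd.Series([1900, 1999, 2004, 2005, 2015, 2016])\n",
    "labels = pd.cut(years, bins=[1900, 2004, 2015, 2025], labels=['Old', 'Normal', 'New'])\n",
    "\n",
    "# Intervals are (1900, 2004], (2004, 2015], (2015, 2025]:\n",
    "# 2004 falls in 'Old', 2015 in 'Normal', and 1900 itself is left as NaN\n",
    "print(labels.tolist())\n",
    "```\n",
    "\n",
    "Values outside the bins, including the lowest edge, become NaN, so it is worth checking the label column for missing values before resampling."
   ]
  },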
{
"cell_type": "code",
"execution_count": 158,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"oversampling:\n",
"rating_type\n",
"bad 784\n",
"normal 784\n",
"good 784\n",
"Name: count, dtype: int64\n",
"undersampling:\n",
"rating_type\n",
"bad 91\n",
"normal 91\n",
"good 91\n",
"Name: count, dtype: int64\n"
]
}
],
"source": [
"phones_df['rating_type'] = pd.cut(phones_df['Rating'], bins=[0, 4.0, 4.5, 5.0], \n",
" labels=[\"bad\", \"normal\", \"good\"])\n",
"\n",
"y = phones_df['rating_type']\n",
"x = phones_df.drop(columns=['Rating', 'rating_type'])\n",
"\n",
"oversampler = RandomOverSampler(random_state=42)\n",
"x_resampled, y_resampled = oversampler.fit_resample(x, y)\n",
"\n",
"undersampler = RandomUnderSampler(random_state=42)\n",
"x_resampled_under, y_resampled_under = undersampler.fit_resample(x, y)\n",
"\n",
"print(\"oversampling:\")\n",
"print(pd.Series(y_resampled).value_counts())\n",
"\n",
"print(\"undersampling:\")\n",
"print(pd.Series(y_resampled_under).value_counts())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"oversampling:\n",
"age_type\n",
"young 1535\n",
"grown 1535\n",
"old 1535\n",
"Name: count, dtype: int64\n",
"undersampling:\n",
"age_type\n",
"young 14\n",
"grown 14\n",
"old 14\n",
"Name: count, dtype: int64\n"
]
},
{
"data": {
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"rich_df['age_type'] = pd.cut(rich_df['Age'], bins=[0, 30, 60, 100], \n",
" labels=[\"young\", \"grown\", \"old\"])\n",
"\n",
"y = rich_df['age_type']\n",
"x = rich_df.drop(columns=['Age', 'age_type'])\n",
"\n",
"oversampler = RandomOverSampler(random_state=42)\n",
"x_resampled, y_resampled = oversampler.fit_resample(x, y)\n",
"\n",
"undersampler = RandomUnderSampler(random_state=42)\n",
"x_resampled_under, y_resampled_under = undersampler.fit_resample(x, y)\n",
"\n",
"print(\"oversampling:\")\n",
"print(pd.Series(y_resampled).value_counts())\n",
"\n",
"print(\"undersampling:\")\n",
"print(pd.Series(y_resampled_under).value_counts())\n",
"\n",
"y_resampled.value_counts().plot(kind='bar')\n",
"plt.show()\n",
"y_resampled_under.value_counts().plot(kind='bar')\n",
"plt.show()\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "aimenv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}