{ "cells": [
 { "cell_type": "markdown", "metadata": {}, "source": [
  "# Lab 2. Analysis of Several Datasets" ] },
 { "cell_type": "markdown", "metadata": {}, "source": [
  "### 1. Select three datasets that do not correspond to your assignment variant\n",
  "Selected variants: Stroke data (Variant 4), House sales (Variant 6), Mobile device prices (Variant 18)" ] },
 { "cell_type": "markdown", "metadata": {}, "source": [
  "### 2. Analyze the information about each dataset from its Kaggle download page. What is the problem domain?\n",
  "\n",
  "#### Stroke data:\n",
  "- **Problem domain:** Analysis of patient data related to stroke\n",
  "- **Goals:** Analyze the patient data and identify the factors that affect the likelihood of a stroke\n",
  "- **Dataset:** 5110 records, 12 variables:\n",
  "  - id\n",
  "  - gender\n",
  "  - age\n",
  "  - hypertension\n",
  "  - heart_disease\n",
  "  - ever_married\n",
  "  - work_type\n",
  "  - Residence_type\n",
  "  - avg_glucose_level\n",
  "  - bmi\n",
  "  - smoking_status\n",
  "  - stroke\n",
  "- **Data description:** Information about patients, their lifestyle, and their medical indicators\n",
  "\n",
  "#### House sales:\n",
  "- **Problem domain:** Analysis of house sales and of how prices depend on various factors\n",
  "- **Goals:** Analyze house sales and identify the factors that affect prices\n",
  "- **Dataset:** 21613 records, 21 variables:\n",
  "  - id\n",
  "  - date\n",
  "  - price\n",
  "  - bedrooms\n",
  "  - bathrooms\n",
  "  - sqft_living\n",
  "  - sqft_lot\n",
  "  - floors\n",
  "  - waterfront\n",
  "  - view\n",
  "  - condition\n",
  "  - grade\n",
  "  - sqft_above\n",
  "  - sqft_basement\n",
  "  - yr_built\n",
  "  - yr_renovated\n",
  "  - zipcode\n",
  "  - lat\n",
  "  - long\n",
  "  - sqft_living15\n",
  "  - sqft_lot15\n",
  "- **Data description:** Information about houses sold in King County, USA\n",
  "\n",
  "#### Mobile device prices:\n",
  "- **Problem domain:** Analysis of mobile device prices\n",
  "- **Goals:** Analyze mobile device prices and identify the factors that affect them\n",
  "- **Dataset:** 1370 records, 18 variables:\n",
  "  - id\n",
  "  - name\n",
  "  - rating\n",
  "  - spec_score\n",
  "  - no_of_sim\n",
  "  - ram\n",
  "  - battery\n",
  "  - display\n",
  "  - camera\n",
  "  - external_memory\n",
  "  - android_version\n",
  "  - price\n",
  "  - company\n",
  "  - inbuilt_memory\n",
  "  - fast_charging\n",
  "  - screen_resolution\n",
  "  - processor\n",
  "  - processor_name\n",
  "- **Data description:** Information about mobile device prices as a function of device characteristics" ] },
 { "cell_type": "markdown", "metadata": {}, "source": [
  "#### Stroke data:\n",
  "Each row in the dataset holds the information recorded for one patient, which makes it possible to analyze the data and build models that predict stroke risk." ] },
 { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"| | id | gender | age | hypertension | heart_disease | ever_married | work_type | Residence_type | avg_glucose_level | bmi | smoking_status | stroke |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 9046 | Male | 67.0 | 0 | 1 | Yes | Private | Urban | 228.69 | 36.6 | formerly smoked | 1 |\n",
"| 1 | 51676 | Female | 61.0 | 0 | 0 | Yes | Self-employed | Rural | 202.21 | NaN | never smoked | 1 |\n",
"| 2 | 31112 | Male | 80.0 | 0 | 1 | Yes | Private | Rural | 105.92 | 32.5 | never smoked | 1 |\n",
"| 3 | 60182 | Female | 49.0 | 0 | 0 | Yes | Private | Urban | 171.23 | 34.4 | smokes | 1 |\n",
"| 4 | 1665 | Female | 79.0 | 1 | 0 | Yes | Self-employed | Rural | 174.12 | 24.0 | never smoked | 1 |\n",
"| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |\n",
"| 5105 | 18234 | Female | 80.0 | 1 | 0 | Yes | Private | Urban | 83.75 | NaN | never smoked | 0 |\n",
"| 5106 | 44873 | Female | 81.0 | 0 | 0 | Yes | Self-employed | Urban | 125.20 | 40.0 | never smoked | 0 |\n",
"| 5107 | 19723 | Female | 35.0 | 0 | 0 | Yes | Self-employed | Rural | 82.99 | 30.6 | never smoked | 0 |\n",
"| 5108 | 37544 | Male | 51.0 | 0 | 0 | Yes | Private | Rural | 166.29 | 25.6 | formerly smoked | 0 |\n",
"| 5109 | 44679 | Female | 44.0 | 0 | 0 | Yes | Govt_job | Urban | 85.28 | 26.2 | Unknown | 0 |\n",
"\n",
"5110 rows × 12 columns
\n",
"\n",
"| | id | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | ... | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 7129300520 | 20141013T000000 | 221900.0 | 3 | 1.00 | 1180 | 5650 | 1.0 | 0 | 0 | ... | 7 | 1180 | 0 | 1955 | 0 | 98178 | 47.5112 | -122.257 | 1340 | 5650 |\n",
"| 1 | 6414100192 | 20141209T000000 | 538000.0 | 3 | 2.25 | 2570 | 7242 | 2.0 | 0 | 0 | ... | 7 | 2170 | 400 | 1951 | 1991 | 98125 | 47.7210 | -122.319 | 1690 | 7639 |\n",
"| 2 | 5631500400 | 20150225T000000 | 180000.0 | 2 | 1.00 | 770 | 10000 | 1.0 | 0 | 0 | ... | 6 | 770 | 0 | 1933 | 0 | 98028 | 47.7379 | -122.233 | 2720 | 8062 |\n",
"| 3 | 2487200875 | 20141209T000000 | 604000.0 | 4 | 3.00 | 1960 | 5000 | 1.0 | 0 | 0 | ... | 7 | 1050 | 910 | 1965 | 0 | 98136 | 47.5208 | -122.393 | 1360 | 5000 |\n",
"| 4 | 1954400510 | 20150218T000000 | 510000.0 | 3 | 2.00 | 1680 | 8080 | 1.0 | 0 | 0 | ... | 8 | 1680 | 0 | 1987 | 0 | 98074 | 47.6168 | -122.045 | 1800 | 7503 |\n",
"| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |\n",
"| 21608 | 263000018 | 20140521T000000 | 360000.0 | 3 | 2.50 | 1530 | 1131 | 3.0 | 0 | 0 | ... | 8 | 1530 | 0 | 2009 | 0 | 98103 | 47.6993 | -122.346 | 1530 | 1509 |\n",
"| 21609 | 6600060120 | 20150223T000000 | 400000.0 | 4 | 2.50 | 2310 | 5813 | 2.0 | 0 | 0 | ... | 8 | 2310 | 0 | 2014 | 0 | 98146 | 47.5107 | -122.362 | 1830 | 7200 |\n",
"| 21610 | 1523300141 | 20140623T000000 | 402101.0 | 2 | 0.75 | 1020 | 1350 | 2.0 | 0 | 0 | ... | 7 | 1020 | 0 | 2009 | 0 | 98144 | 47.5944 | -122.299 | 1020 | 2007 |\n",
"| 21611 | 291310100 | 20150116T000000 | 400000.0 | 3 | 2.50 | 1600 | 2388 | 2.0 | 0 | 0 | ... | 8 | 1600 | 0 | 2004 | 0 | 98027 | 47.5345 | -122.069 | 1410 | 1287 |\n",
"| 21612 | 1523300157 | 20141015T000000 | 325000.0 | 2 | 0.75 | 1020 | 1076 | 2.0 | 0 | 0 | ... | 7 | 1020 | 0 | 2008 | 0 | 98144 | 47.5941 | -122.299 | 1020 | 1357 |\n",
"\n",
"21613 rows × 21 columns
\n",
"\n",
"| | Unnamed: 0 | Name | Rating | Spec_score | No_of_sim | Ram | Battery | Display | Camera | External_Memory | Android_version | Price | company | Inbuilt_memory | fast_charging | Screen_resolution | Processor | Processor_name |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 0 | Samsung Galaxy F14 5G | 4.65 | 68 | Dual Sim, 3G, 4G, 5G, VoLTE, | 4 GB RAM | 6000 mAh Battery | 6.6 inches | 50 MP + 2 MP Dual Rear & 13 MP Front Camera | Memory Card Supported, upto 1 TB | 13 | 9,999 | Samsung | 128 GB inbuilt | 25W Fast Charging | 2408 x 1080 px Display with Water Drop Notch | Octa Core Processor | Exynos 1330 |\n",
"| 1 | 1 | Samsung Galaxy A11 | 4.20 | 63 | Dual Sim, 3G, 4G, VoLTE, | 2 GB RAM | 4000 mAh Battery | 6.4 inches | 13 MP + 5 MP + 2 MP Triple Rear & 8 MP Fro... | Memory Card Supported, upto 512 GB | 10 | 9,990 | Samsung | 32 GB inbuilt | 15W Fast Charging | 720 x 1560 px Display with Punch Hole | 1.8 GHz Processor | Octa Core |\n",
"| 2 | 2 | Samsung Galaxy A13 | 4.30 | 75 | Dual Sim, 3G, 4G, VoLTE, | 4 GB RAM | 5000 mAh Battery | 6.6 inches | 50 MP Quad Rear & 8 MP Front Camera | Memory Card Supported, upto 1 TB | 12 | 11,999 | Samsung | 64 GB inbuilt | 25W Fast Charging | 1080 x 2408 px Display with Water Drop Notch | 2 GHz Processor | Octa Core |\n",
"| 3 | 3 | Samsung Galaxy F23 | 4.10 | 73 | Dual Sim, 3G, 4G, VoLTE, | 4 GB RAM | 6000 mAh Battery | 6.4 inches | 48 MP Quad Rear & 13 MP Front Camera | Memory Card Supported, upto 1 TB | 12 | 11,999 | Samsung | 64 GB inbuilt | NaN | 720 x 1600 px | Octa Core | Helio G88 |\n",
"| 4 | 4 | Samsung Galaxy A03s (4GB RAM + 64GB) | 4.10 | 69 | Dual Sim, 3G, 4G, VoLTE, | 4 GB RAM | 5000 mAh Battery | 6.5 inches | 13 MP + 2 MP + 2 MP Triple Rear & 5 MP Fro... | Memory Card Supported, upto 1 TB | 11 | 11,999 | Samsung | 64 GB inbuilt | 15W Fast Charging | 720 x 1600 px Display with Water Drop Notch | Octa Core | Helio P35 |\n",
"| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |\n",
"| 1365 | 1365 | TCL 40R | 4.05 | 75 | Dual Sim, 3G, 4G, 5G, VoLTE, | 4 GB RAM | 5000 mAh Battery | 6.6 inches | 50 MP + 2 MP + 2 MP Triple Rear & 8 MP Fro... | Memory Card (Hybrid) | 12 | 18,999 | TCL | 64 GB inbuilt | 15W Fast Charging | 720 x 1612 px | Octa Core | Dimensity 700 5G |\n",
"| 1366 | 1366 | TCL 50 XL NxtPaper 5G | 4.10 | 80 | Dual Sim, 3G, 4G, VoLTE, | 8 GB RAM | 5000 mAh Battery | 6.8 inches | 50 MP + 2 MP Dual Rear & 16 MP Front Camera | Memory Card (Hybrid) | 14 | 24,990 | TCL | 128 GB inbuilt | 33W Fast Charging | 1200 x 2400 px | Octa Core | Dimensity 7050 |\n",
"| 1367 | 1367 | TCL 50 XE NxtPaper 5G | 4.00 | 80 | Dual Sim, 3G, 4G, 5G, VoLTE, | 6 GB RAM | 5000 mAh Battery | 6.6 inches | 50 MP + 2 MP Dual Rear & 16 MP Front Camera | Memory Card Supported, upto 1 TB | 13 | 23,990 | TCL | 256 GB inbuilt | 18W Fast Charging | 720 x 1612 px | Octa Core | Dimensity 6080 |\n",
"| 1368 | 1368 | TCL 40 NxtPaper 5G | 4.50 | 79 | Dual Sim, 3G, 4G, 5G, VoLTE, | 6 GB RAM | 5000 mAh Battery | 6.6 inches | 50 MP + 2 MP + 2 MP Triple Rear & 8 MP Fro... | Memory Card Supported, upto 1 TB | 13 | 22,499 | TCL | 256 GB inbuilt | 15W Fast Charging | 720 x 1612 px | Octa Core | Dimensity 6020 |\n",
"| 1369 | 1369 | TCL Trifold | 4.65 | 93 | Dual Sim, 3G, 4G, 5G, VoLTE, Vo5G, | 12 GB RAM | 4600 mAh Battery | 10 inches | Foldable Display, Dual Display | 50 MP + 48 MP + 8 MP Triple Rear & 32 MP F... | 13 | 1,19,990 | TCL | 256 GB inbuilt | 67W Fast Charging | 1916 x 2160 px | Octa Core | Snapdragon 8 Gen2 |\n",
"\n",
"1370 rows × 18 columns
\n", "