{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "LR_vs_LSTM_on_PIMA_without_skin.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true,
"authorship_tag": "ABX9TyNe4S5uNRpxMSdW51WlML+l",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "sYD4qX7ik4xw",
"colab_type": "text"
},
"source": [
"# Overview #\n",
"\n",
"Two diabetes datasets can be explored:\n",
"\n",
"1. UCI\n",
"\n",
"2. PIMA: \"This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to predict based on diagnostic measurements whether a patient has diabetes.\"\n",
"\n",
"Adapted from:\n",
"\n",
"- [Colab notebook](https://github.com/1UC1F3R616/myGoogleCollabNotebooks/blob/master/Pima_Indians_Diabetes.ipynb)\n",
"\n",
"- [MDPI 2019](https://www.mdpi.com/2076-3417/9/17/3532/pdf)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BtG7k2Y0k80_",
"colab_type": "text"
},
"source": [
"# A) Mount and download datasets #"
]
},
{
"cell_type": "code",
"metadata": {
"id": "_rup1_Ybj5jh",
"colab_type": "code",
"outputId": "0a1bd4f9-6713-462e-ed71-715cbb2daf23",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 122
}
},
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')\n",
"\n",
"from pydrive.auth import GoogleAuth\n",
"from pydrive.drive import GoogleDrive\n",
"from google.colab import auth\n",
"from oauth2client.client import GoogleCredentials"
],
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": [
"Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly\n",
"\n",
"Enter your authorization code:\n",
"··········\n",
"Mounted at /content/drive\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2d2ACaXRIjsD",
"colab_type": "text"
},
"source": [
"## Download UCI-Diabetes ##"
]
},
{
"cell_type": "code",
"metadata": {
"id": "QwGqLC0I8nsQ",
"colab_type": "code",
"colab": {}
},
"source": [
"import os\n",
"\n",
"# Create the dataset folder on Drive (including any missing parents), then switch to it.\n",
"data_dir = '/content/drive/My Drive/Colab Notebooks/opensource_datasets/UCI-diabetes'\n",
"os.makedirs(data_dir, exist_ok=True)\n",
"os.chdir(data_dir)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "xqOOoKP38mrZ",
"colab_type": "code",
"colab": {}
},
"source": [
"! wget -O diabetes2.Z https://archive.ics.uci.edu/ml/machine-learning-databases/diabetes/diabetes-data.tar.Z\n",
"! tar xvf diabetes2.Z"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "4PLAuZJL9WcK",
"colab_type": "text"
},
"source": [
"## Download PIMA ##"
]
},
{
"cell_type": "code",
"metadata": {
"id": "lADZ3VR7kENs",
"colab_type": "code",
"colab": {}
},
"source": [
"import os\n",
"\n",
"# Create the dataset folder on Drive (including any missing parents), then switch to it.\n",
"data_dir = '/content/drive/My Drive/Colab Notebooks/opensource_datasets/PIMA'\n",
"os.makedirs(data_dir, exist_ok=True)\n",
"os.chdir(data_dir)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "sP9ypL6QjsXL",
"colab_type": "code",
"outputId": "b35cfea7-5280-4a0e-9707-2087c2b89cc2",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 102
}
},
"source": [
"! git clone https://github.com/lisatwyw/GlucoseLevels.git\n",
"! ls"
],
"execution_count": 5,
"outputs": [
{
"output_type": "stream",
"text": [
"Cloning into 'GlucoseLevels'...\n",
"remote: Enumerating objects: 42, done.\u001b[K\n",
"remote: Total 42 (delta 0), reused 0 (delta 0), pack-reused 42\u001b[K\n",
"Unpacking objects: 100% (42/42), done.\n",
"GlucoseLevels\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Q8ooVLDdlJLC",
"colab_type": "code",
"outputId": "822b7ae6-2979-4159-9f22-f835c3ba3fb3",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
}
},
"source": [
"os.chdir('/content/drive/My Drive/Colab Notebooks/opensource_datasets/PIMA/GlucoseLevels')\n",
"! ls"
],
"execution_count": 6,
"outputs": [
{
"output_type": "stream",
"text": [
"ann_BGL.ipynb diabetes3.csv diabetes.csv README.md\n",
"diabetes2.csv diabetes4.csv glucose_RF.R\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2fhi-8rSIqY4",
"colab_type": "text"
},
"source": [
"# B) Load data #"
]
},
{
"cell_type": "code",
"metadata": {
"id": "DQO8W9hklSn_",
"colab_type": "code",
"outputId": "1bffdeaa-ed1e-449d-a9a1-46aff9654555",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 238
}
},
"source": [
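"import pandas as pd\n",
"\n",
"# Column names for the PIMA CSV; 'skin' is deliberately absent from feature_cols.\n",
"col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']\n",
"feature_cols = ['pregnant', 'insulin', 'bmi', 'age', 'glucose', 'bp', 'pedigree']\n",
"\n",
"# With header=None the file's own header line is read as data row 0, so it is dropped below.\n",
"pima = pd.read_csv('diabetes.csv', header=None, names=col_names)\n",
"print(pima.shape)\n",
"pima.drop(pima.index[0], inplace=True)\n",
"print(pima.shape)\n",
"\n",
"# The header row forced every column to be parsed as strings; convert back to numbers.\n",
"pima = pima.apply(pd.to_numeric)\n",
"pima.head()"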
],
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": [
"(769, 9)\n",
"(768, 9)\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>pregnant</th><th>glucose</th><th>bp</th><th>skin</th><th>insulin</th><th>bmi</th><th>pedigree</th><th>age</th><th>label</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>6</td><td>148</td><td>72</td><td>35</td><td>0</td><td>33.6</td><td>0.627</td><td>50</td><td>1</td></tr>\n",
"    <tr><th>2</th><td>1</td><td>85</td><td>66</td><td>29</td><td>0</td><td>26.6</td><td>0.351</td><td>31</td><td>0</td></tr>\n",
"    <tr><th>3</th><td>8</td><td>183</td><td>64</td><td>0</td><td>0</td><td>23.3</td><td>0.672</td><td>32</td><td>1</td></tr>\n",
"    <tr><th>4</th><td>1</td><td>89</td><td>66</td><td>23</td><td>94</td><td>28.1</td><td>0.167</td><td>21</td><td>0</td></tr>\n",
"    <tr><th>5</th><td>0</td><td>137</td><td>40</td><td>35</td><td>168</td><td>43.1</td><td>2.288</td><td>33</td><td>1</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"