{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "LR_vs_LSTM_vs_MLP_on_PIMA.ipynb",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true,
"authorship_tag": "ABX9TyOgPuu0jxe0vRGeXEezeyNo",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "sYD4qX7ik4xw",
"colab_type": "text"
},
"source": [
"# Overview #\n",
"\n",
"Two diabetic datasets can be explored:\n",
"\n",
"1. UCI\n",
"\n",
"2. PIMA: \"This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to predict based on diagnostic measurements whether a patient has diabetes.\"\n",
"\n",
"Adopted from:\n",
"\n",
"- [Collab notebook](https://github.com/1UC1F3R616/myGoogleCollabNotebooks/blob/master/Pima_Indians_Diabetes.ipynb)\n",
"\n",
"- [MDPI 2019](https://www.mdpi.com/2076-3417/9/17/3532/pdf)\n",
"\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "BtG7k2Y0k80_",
"colab_type": "text"
},
"source": [
"# A) Mount and download datasets #"
]
},
{
"cell_type": "code",
"metadata": {
"id": "_rup1_Ybj5jh",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 122
},
"outputId": "2b8ca93c-3906-4b33-87a0-c5b602087b01"
},
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')\n",
"\n",
"from pydrive.auth import GoogleAuth\n",
"from pydrive.drive import GoogleDrive\n",
"from google.colab import auth\n",
"from oauth2client.client import GoogleCredentials"
],
"execution_count": 1,
"outputs": [
{
"output_type": "stream",
"text": [
"Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly\n",
"\n",
"Enter your authorization code:\n",
"··········\n",
"Mounted at /content/drive\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2d2ACaXRIjsD",
"colab_type": "text"
},
"source": [
"## Download UCI-Diabetes ##"
]
},
{
"cell_type": "code",
"metadata": {
"id": "QwGqLC0I8nsQ",
"colab_type": "code",
"colab": {}
},
"source": [
"import os\n",
"if os.path.isdir('/content/drive/My Drive/Colab Notebooks/opensource_datasets/' )==False:\n",
" try:\n",
" ! mkdir '/content/drive/My Drive/Colab Notebooks/opensource_datasets/'\n",
" except e as Exception:\n",
" pass \n",
"\n",
"if os.path.isdir( '/content/drive/My Drive/Colab Notebooks/opensource_datasets/UCI-diabetes' )==False:\n",
" try:\n",
" ! mkdir '/content/drive/My Drive/Colab Notebooks/opensource_datasets/UCI-diabetes'\n",
" except e as Exception:\n",
" pass \n",
" \n",
"os.chdir('/content/drive/My Drive/Colab Notebooks/opensource_datasets/UCI-diabetes')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "xqOOoKP38mrZ",
"colab_type": "code",
"colab": {}
},
"source": [
"! wget -O diabetes2.Z https://archive.ics.uci.edu/ml/machine-learning-databases/diabetes/diabetes-data.tar.Z\n",
"! tar xvf diabetes2.Z"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "4PLAuZJL9WcK",
"colab_type": "text"
},
"source": [
"## Download PIMA ##"
]
},
{
"cell_type": "code",
"metadata": {
"id": "lADZ3VR7kENs",
"colab_type": "code",
"colab": {}
},
"source": [
"import os\n",
"if os.path.isdir('/content/drive/My Drive/Colab Notebooks/opensource_datasets/' )==False:\n",
" try:\n",
" ! mkdir '/content/drive/My Drive/Colab Notebooks/opensource_datasets/'\n",
" except e as Exception:\n",
" pass \n",
"\n",
"if os.path.isdir( '/content/drive/My Drive/Colab Notebooks/opensource_datasets/PIMA' )==False:\n",
" try:\n",
" ! mkdir '/content/drive/My Drive/Colab Notebooks/opensource_datasets/PIMA'\n",
" except e as Exception:\n",
" pass \n",
" \n",
"os.chdir('/content/drive/My Drive/Colab Notebooks/opensource_datasets/PIMA')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "sP9ypL6QjsXL",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
},
"outputId": "5814b33f-a508-409d-e0e3-bc4a55279589"
},
"source": [
"! git clone https://github.com/lisatwyw/GlucoseLevels.git\n",
"! ls"
],
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"text": [
"fatal: destination path 'GlucoseLevels' already exists and is not an empty directory.\n",
"GlucoseLevels\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Q8ooVLDdlJLC",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
},
"outputId": "64e92963-757b-4ae2-c5ae-097bde8ab731"
},
"source": [
"os.chdir('/content/drive/My Drive/Colab Notebooks/opensource_datasets/PIMA/GlucoseLevels')\n",
"! ls"
],
"execution_count": 239,
"outputs": [
{
"output_type": "stream",
"text": [
"accuracy.png diabetes2.csv diabetes3.csv diabetes.csv loss.png\n",
"ann_BGL.ipynb diabetes2.Z diabetes4.csv glucose_RF.R README.md\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2fhi-8rSIqY4",
"colab_type": "text"
},
"source": [
"# B) Load data #"
]
},
{
"cell_type": "code",
"metadata": {
"id": "DQO8W9hklSn_",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 238
},
"outputId": "50647cee-4759-4c06-c97a-5c5db10569d3"
},
"source": [
"import pandas as pd\n",
"col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']\n",
"feature_cols=['pregnant','insulin', 'bmi', 'skin', 'age','glucose','bp','pedigree']\n",
"\n",
"pima = pd.read_csv('diabetes.csv', header=None, names=col_names)\n",
"print(pima.shape)\n",
"pima.drop(pima.index[0], inplace=True)\n",
"print(pima.shape)\n",
"pima.head()"
],
"execution_count": 272,
"outputs": [
{
"output_type": "stream",
"text": [
"(769, 9)\n",
"(768, 9)\n"
],
"name": "stdout"
},
{
"output_type": "execute_result",
"data": {
"text/html": [
"
\n", " | pregnant | \n", "glucose | \n", "bp | \n", "skin | \n", "insulin | \n", "bmi | \n", "pedigree | \n", "age | \n", "label | \n", "
---|---|---|---|---|---|---|---|---|---|
1 | \n", "6 | \n", "148 | \n", "72 | \n", "35 | \n", "0 | \n", "33.6 | \n", "0.627 | \n", "50 | \n", "1 | \n", "
2 | \n", "1 | \n", "85 | \n", "66 | \n", "29 | \n", "0 | \n", "26.6 | \n", "0.351 | \n", "31 | \n", "0 | \n", "
3 | \n", "8 | \n", "183 | \n", "64 | \n", "0 | \n", "0 | \n", "23.3 | \n", "0.672 | \n", "32 | \n", "1 | \n", "
4 | \n", "1 | \n", "89 | \n", "66 | \n", "23 | \n", "94 | \n", "28.1 | \n", "0.167 | \n", "21 | \n", "0 | \n", "
5 | \n", "0 | \n", "137 | \n", "40 | \n", "35 | \n", "168 | \n", "43.1 | \n", "2.288 | \n", "33 | \n", "1 | \n", "