{ "cells": [ { "cell_type": "markdown", "id": "dd8f4a87", "metadata": {}, "source": [ "# Graded lab 2: Simpson's paradox" ] }, { "cell_type": "markdown", "id": "63a48569", "metadata": {}, "source": [ "### Name:" ] }, { "cell_type": "markdown", "id": "d5a124be", "metadata": {}, "source": [ "This graded lab will help you better understand Simpson's paradox and resolve it using a causal graph. In the following, fill in only the cells starting with TODO." ] }, { "cell_type": "code", "execution_count": null, "id": "9659d044", "metadata": {}, "outputs": [], "source": [ "import networkx as nx\n", "import numpy as np\n", "import pandas as pd\n", "\n", "from tools import plot_graph" ] }, { "cell_type": "markdown", "id": "09d7d0de", "metadata": {}, "source": [ "### Task 1: Causal stories and statistical paradoxes" ] }, { "cell_type": "markdown", "id": "7327745b", "metadata": {}, "source": [ "### Experiment 1\n", "We record the recovery rates and the gender of 700 sick patients who were given access to a drug. A total of 350 patients chose to take the drug and 350 patients did not.\n", "The results of the study are shown in the following table:\n", "\n", "
|  | Drug | No drug |\n", 
"| --- | --- | --- |\n", 
"| Men | 81 out of 87 recovered (93%) | 234 out of 270 recovered (87%) |\n", 
"| Women | 192 out of 263 recovered (73%) | 55 out of 80 recovered (69%) |\n", 
"| Combined data | 273 out of 350 recovered (78%) | 289 out of 350 recovered (83%) |\n", "
\n", "* Given the results of this study, should a doctor prescribe the drug for women? For men? For any patient? Explain." ] }, { "cell_type": "markdown", "id": "70a1bc99", "metadata": {}, "source": [ "TODO: " ] }, { "cell_type": "markdown", "id": "74ce5df5", "metadata": {}, "source": [ "Suppose now that we know an additional fact: estrogen has a negative effect on recovery, so women are less likely to recover than men, regardless of the drug.\n", "* Draw the causal graph representing the causal relations of \"gender\", \"drug taking\" and \"recovery\"." ] }, { "cell_type": "code", "execution_count": null, "id": "7ce8c233", "metadata": {}, "outputs": [], "source": [ "G1 = nx.DiGraph()\n", "#TODO\n", "plot_graph(G1)" ] }, { "cell_type": "markdown", "id": "8baccf54", "metadata": {}, "source": [ "* Given the results of this study and the causal graph, should a doctor prescribe the drug for women? For men? For any patient? Explain." ] }, { "cell_type": "markdown", "id": "2921ffc7", "metadata": {}, "source": [ "TODO:" ] }, { "cell_type": "markdown", "id": "7ca5dc40", "metadata": {}, "source": [ "### Experiment 2\n", "Again, we record the recovery rates of 700 sick patients who were given access to a drug. As before, a total of 350 patients chose to take the drug and 350 patients did not. However, now, instead of recording the gender of patients, we record their blood pressure after they have taken the drug.\n", "The results of the study are shown in the following table:\n", "\n", "
|  | Drug | No drug |\n", 
"| --- | --- | --- |\n", 
"| Low blood pressure | 81 out of 87 recovered (93%) | 234 out of 270 recovered (87%) |\n", 
"| High blood pressure | 192 out of 263 recovered (73%) | 55 out of 80 recovered (69%) |\n", 
"| Combined data | 273 out of 350 recovered (78%) | 289 out of 350 recovered (83%) |\n", "
\n", "* Given the results of this study, should a doctor prescribe the drug for patients with low blood pressure? For patients with high blood pressure? For any patient? Explain." ] }, { "cell_type": "markdown", "id": "1ed1c87e", "metadata": {}, "source": [ "TODO:" ] }, { "cell_type": "markdown", "id": "41dfe6ad", "metadata": {}, "source": [ "Suppose now that we know an additional fact: the drug lowers blood pressure, which in turn affects recovery.\n", "* Draw the causal graph representing the causal relations of \"blood pressure\", \"drug taking\" and \"recovery\"." ] }, { "cell_type": "code", "execution_count": null, "id": "3aff0da5", "metadata": {}, "outputs": [], "source": [ "G2 = nx.DiGraph()\n", "#TODO\n", "plot_graph(G2)" ] }, { "cell_type": "markdown", "id": "ea9880c9", "metadata": {}, "source": [ "* Given the results of this study and the causal graph, should a doctor prescribe the drug for any patient? Explain." ] }, { "cell_type": "markdown", "id": "9f1fcb22", "metadata": {}, "source": [ "TODO: " ] }, { "cell_type": "markdown", "id": "c449c281", "metadata": {}, "source": [ "### Experiment 3\n", "We record the recovery rates of 700 sick patients who were given access to a drug. As before, a total of 350 patients chose to take the drug and 350 patients did not. However, now, instead of recording the gender of patients, we record their weight.\n", "The results of the study are shown in the following table:\n", "\n", "
|  | Drug | No drug |\n", 
"| --- | --- | --- |\n", 
"| Low weight | 81 out of 87 recovered (93%) | 234 out of 270 recovered (87%) |\n", 
"| High weight | 192 out of 263 recovered (73%) | 55 out of 80 recovered (69%) |\n", 
"| Combined data | 273 out of 350 recovered (78%) | 289 out of 350 recovered (83%) |\n", "
\n", "* Given the results of this study, should a doctor prescribe the drug for patients with low weight? For patients with high weight? For any patient? Explain." ] }, { "cell_type": "markdown", "id": "25cf9cb2", "metadata": {}, "source": [ "TODO: " ] }, { "cell_type": "markdown", "id": "7bae7b9f", "metadata": {}, "source": [ "Suppose now that we know two additional facts: socioeconomic status affects both the choice of taking the drug and the weight of the patient, and weight has an effect on recovery.\n", "* Draw the causal graph representing the causal relations of \"socioeconomic status\", \"weight\", \"drug taking\" and \"recovery\"." ] }, { "cell_type": "code", "execution_count": null, "id": "96651413", "metadata": {}, "outputs": [], "source": [ "#TODO" ] }, { "cell_type": "markdown", "id": "b77d164c", "metadata": {}, "source": [ "* Given the results of this study and the causal graph, should a doctor prescribe the drug for patients with low weight? For patients with high weight? For any patient? Explain." ] }, { "cell_type": "markdown", "id": "0bed280c", "metadata": {}, "source": [ "TODO:" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.1" } }, "nbformat": 4, "nbformat_minor": 5 }