{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "e67d0edb",
   "metadata": {},
   "source": [
    "# Class 13: Introduction to data\n",
    "In this class we will review how to import data, the different file formats we can use, and some typical problems that come up when handling data.\n",
    "\n",
    "The library we will use for data handling is `pandas`.\n",
    "- The `pandas` documentation is available at link 1: https://pandas.pydata.org/docs/\n",
    "- The `read_csv` documentation is available at link 2: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html\n",
    "\n",
    "Key concepts:\n",
    "- pandas\n",
    "- DataFrame\n",
    "- Thousands and decimal separators\n",
    "- Variable type\n",
    "- Index"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fc3d179d",
   "metadata": {},
   "source": [
    "## 1. Introduction"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "c68d9523",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5ac7f25e",
   "metadata": {},
   "source": [
    "To import a dataset in `csv` format we will use `pd.read_csv(path/file)`, where the path is passed as a string.\n",
    "\n",
    "On Windows we write the path with doubled backslashes, \"$\\backslash \\backslash$\", because a single backslash starts an escape sequence in a Python string. For example: \"C:$\\backslash \\backslash$Users$\\backslash \\backslash$...\"\n",
    "\n",
    "We will use a csv dataset from the central bank with employment information. The first thing that stands out is that the dataset is imported incorrectly because of the delimiter: the file separates fields with \";\", while `read_csv` uses \",\" by default. Since the values use \",\" as the decimal separator, each row ends up split at the decimal commas."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "03924724",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>Periodo;1.Total;2.Empleadores;3.Cuenta Propia;4.Asalariados;5.Personal de servicio;6.Familiar no remunerado</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>mar.2010;7.156</th>\n",
       "      <th>21;318</th>\n",
       "      <th>32;1.289</th>\n",
       "      <th>68;5.141</th>\n",
       "      <th>76;325</th>\n",
       "      <th>38;81</th>\n",
       "      <td>8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>abr.2010;7.198</th>\n",
       "      <th>78;324</th>\n",
       "      <th>94;1.332</th>\n",
       "      <th>33;5.114</th>\n",
       "      <th>80;331</th>\n",
       "      <th>31;95</th>\n",
       "      <td>39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>may.2010;7.181</th>\n",
       "      <th>90;326</th>\n",
       "      <th>95;1.346</th>\n",
       "      <th>54;5.080</th>\n",
       "      <th>65;328</th>\n",
       "      <th>56;99</th>\n",
       "      <td>21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>jun.2010;7.221</th>\n",
       "      <th>58;328</th>\n",
       "      <th>03;1.384</th>\n",
       "      <th>28;5.074</th>\n",
       "      <th>00;327</th>\n",
       "      <th>60;107</th>\n",
       "      <td>68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>jul.2010;7.256</th>\n",
       "      <th>52;333</th>\n",
       "      <th>82;1.390</th>\n",
       "      <th>03;5.081</th>\n",
       "      <th>93;339</th>\n",
       "      <th>44;111</th>\n",
       "      <td>29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <th>...</th>\n",
       "      <th>...</th>\n",
       "      <th>...</th>\n",
       "      <th>...</th>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>nov.2020;7.916</th>\n",
       "      <th>72;248</th>\n",
       "      <th>89;1.568</th>\n",
       "      <th>05;5.833</th>\n",
       "      <th>68;188</th>\n",
       "      <th>43;77</th>\n",
       "      <td>66</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>dic.2020;8.026</th>\n",
       "      <th>22;234</th>\n",
       "      <th>57;1.588</th>\n",
       "      <th>14;5.927</th>\n",
       "      <th>28;194</th>\n",
       "      <th>91;81</th>\n",
       "      <td>32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>ene.2021;8.121</th>\n",
       "      <th>42;237</th>\n",
       "      <th>25;1.610</th>\n",
       "      <th>63;6.000</th>\n",
       "      <th>74;197</th>\n",
       "      <th>43;75</th>\n",
       "      <td>36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>feb.2021;8.167</th>\n",
       "      <th>62;245</th>\n",
       "      <th>25;1.634</th>\n",
       "      <th>08;6.018</th>\n",
       "      <th>35;198</th>\n",
       "      <th>73;71</th>\n",
       "      <td>20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mar.2021;8.148</th>\n",
       "      <th>21;246</th>\n",
       "      <th>92;1.646</th>\n",
       "      <th>38;5.978</th>\n",
       "      <th>29;204</th>\n",
       "      <th>48;72</th>\n",
       "      <td>14</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>133 rows × 1 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                       Periodo;1.Total;2.Empleadores;3.Cuenta Propia;4.Asalariados;5.Personal de servicio;6.Familiar no remunerado\n",
       "mar.2010;7.156 21;318 32;1.289 68;5.141 76;325 38;81                                                   8                                                          \n",
       "abr.2010;7.198 78;324 94;1.332 33;5.114 80;331 31;95                                                  39                                                          \n",
       "may.2010;7.181 90;326 95;1.346 54;5.080 65;328 56;99                                                  21                                                          \n",
       "jun.2010;7.221 58;328 03;1.384 28;5.074 00;327 60;107                                                 68                                                          \n",
       "jul.2010;7.256 52;333 82;1.390 03;5.081 93;339 44;111                                                 29                                                          \n",
       "...                                                                                                  ...                                                          \n",
       "nov.2020;7.916 72;248 89;1.568 05;5.833 68;188 43;77                                                  66                                                          \n",
       "dic.2020;8.026 22;234 57;1.588 14;5.927 28;194 91;81                                                  32                                                          \n",
       "ene.2021;8.121 42;237 25;1.610 63;6.000 74;197 43;75                                                  36                                                          \n",
       "feb.2021;8.167 62;245 25;1.634 08;6.018 35;198 73;71                                                  20                                                          \n",
       "mar.2021;8.148 21;246 92;1.646 38;5.978 29;204 48;72                                                  14                                                          \n",
       "\n",
       "[133 rows x 1 columns]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.read_csv(\"/home/felix/Dropbox/Computational_Economics/Intro_python/2021_S2/Clases/clase13_base1.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "629a0a98",
   "metadata": {},
   "source": [
    "To change the delimiter we will use the `delimiter` parameter of `read_csv` (an alias for `sep`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "45603e41",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Periodo</th>\n",
       "      <th>1.Total</th>\n",
       "      <th>2.Empleadores</th>\n",
       "      <th>3.Cuenta Propia</th>\n",
       "      <th>4.Asalariados</th>\n",
       "      <th>5.Personal de servicio</th>\n",
       "      <th>6.Familiar no remunerado</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>mar.2010</td>\n",
       "      <td>7.156,21</td>\n",
       "      <td>318,32</td>\n",
       "      <td>1.289,68</td>\n",
       "      <td>5.141,76</td>\n",
       "      <td>325,38</td>\n",
       "      <td>81,08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>abr.2010</td>\n",
       "      <td>7.198,78</td>\n",
       "      <td>324,94</td>\n",
       "      <td>1.332,33</td>\n",
       "      <td>5.114,80</td>\n",
       "      <td>331,31</td>\n",
       "      <td>95,39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>may.2010</td>\n",
       "      <td>7.181,90</td>\n",
       "      <td>326,95</td>\n",
       "      <td>1.346,54</td>\n",
       "      <td>5.080,65</td>\n",
       "      <td>328,56</td>\n",
       "      <td>99,21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>jun.2010</td>\n",
       "      <td>7.221,58</td>\n",
       "      <td>328,03</td>\n",
       "      <td>1.384,28</td>\n",
       "      <td>5.074,00</td>\n",
       "      <td>327,60</td>\n",
       "      <td>107,68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>jul.2010</td>\n",
       "      <td>7.256,52</td>\n",
       "      <td>333,82</td>\n",
       "      <td>1.390,03</td>\n",
       "      <td>5.081,93</td>\n",
       "      <td>339,44</td>\n",
       "      <td>111,29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>128</th>\n",
       "      <td>nov.2020</td>\n",
       "      <td>7.916,72</td>\n",
       "      <td>248,89</td>\n",
       "      <td>1.568,05</td>\n",
       "      <td>5.833,68</td>\n",
       "      <td>188,43</td>\n",
       "      <td>77,66</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129</th>\n",
       "      <td>dic.2020</td>\n",
       "      <td>8.026,22</td>\n",
       "      <td>234,57</td>\n",
       "      <td>1.588,14</td>\n",
       "      <td>5.927,28</td>\n",
       "      <td>194,91</td>\n",
       "      <td>81,32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>130</th>\n",
       "      <td>ene.2021</td>\n",
       "      <td>8.121,42</td>\n",
       "      <td>237,25</td>\n",
       "      <td>1.610,63</td>\n",
       "      <td>6.000,74</td>\n",
       "      <td>197,43</td>\n",
       "      <td>75,36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>131</th>\n",
       "      <td>feb.2021</td>\n",
       "      <td>8.167,62</td>\n",
       "      <td>245,25</td>\n",
       "      <td>1.634,08</td>\n",
       "      <td>6.018,35</td>\n",
       "      <td>198,73</td>\n",
       "      <td>71,20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>132</th>\n",
       "      <td>mar.2021</td>\n",
       "      <td>8.148,21</td>\n",
       "      <td>246,92</td>\n",
       "      <td>1.646,38</td>\n",
       "      <td>5.978,29</td>\n",
       "      <td>204,48</td>\n",
       "      <td>72,14</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>133 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      Periodo   1.Total 2.Empleadores 3.Cuenta Propia 4.Asalariados  \\\n",
       "0    mar.2010  7.156,21        318,32        1.289,68      5.141,76   \n",
       "1    abr.2010  7.198,78        324,94        1.332,33      5.114,80   \n",
       "2    may.2010  7.181,90        326,95        1.346,54      5.080,65   \n",
       "3    jun.2010  7.221,58        328,03        1.384,28      5.074,00   \n",
       "4    jul.2010  7.256,52        333,82        1.390,03      5.081,93   \n",
       "..        ...       ...           ...             ...           ...   \n",
       "128  nov.2020  7.916,72        248,89        1.568,05      5.833,68   \n",
       "129  dic.2020  8.026,22        234,57        1.588,14      5.927,28   \n",
       "130  ene.2021  8.121,42        237,25        1.610,63      6.000,74   \n",
       "131  feb.2021  8.167,62        245,25        1.634,08      6.018,35   \n",
       "132  mar.2021  8.148,21        246,92        1.646,38      5.978,29   \n",
       "\n",
       "    5.Personal de servicio 6.Familiar no remunerado  \n",
       "0                   325,38                    81,08  \n",
       "1                   331,31                    95,39  \n",
       "2                   328,56                    99,21  \n",
       "3                   327,60                   107,68  \n",
       "4                   339,44                   111,29  \n",
       "..                     ...                      ...  \n",
       "128                 188,43                    77,66  \n",
       "129                 194,91                    81,32  \n",
       "130                 197,43                    75,36  \n",
       "131                 198,73                    71,20  \n",
       "132                 204,48                    72,14  \n",
       "\n",
       "[133 rows x 7 columns]"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.read_csv(\"/home/felix/Dropbox/Computational_Economics/Intro_python/2021_S2/Clases/clase13_base1.csv\", delimiter=\";\")"
   ]
  },
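  {
   "cell_type": "markdown",
   "id": "delimiter-sketch",
   "metadata": {},
   "source": [
    "The effect of the delimiter can be reproduced without the class file. The following is a minimal sketch, assuming a small in-memory sample: the names `sample`, `wrong`, and `right` and the use of `io.StringIO` are illustrative and not part of the class material, and the values copy the first three columns of the first two rows shown above.\n",
    "\n",
    "```python\n",
    "import io\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "# In-memory sample with the same layout as the class file (first three columns only).\n",
    "sample = (\n",
    "    \"Periodo;1.Total;2.Empleadores\\n\"\n",
    "    \"mar.2010;7.156,21;318,32\\n\"\n",
    "    \"abr.2010;7.198,78;324,94\\n\"\n",
    ")\n",
    "\n",
    "# Default delimiter \",\": each row is split at the decimal commas, not at \";\".\n",
    "wrong = pd.read_csv(io.StringIO(sample))\n",
    "\n",
    "# delimiter=\";\" puts each field in its own column.\n",
    "right = pd.read_csv(io.StringIO(sample), delimiter=\";\")\n",
    "\n",
    "print(wrong.shape)\n",
    "print(right.shape)\n",
    "```\n",
    "\n",
    "With `delimiter=\";\"` the sample is read as 2 rows and 3 columns; with the default delimiter the result does not have one column per field."
   ]
  },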
  {
   "cell_type": "markdown",
   "id": "62f460b5",
   "metadata": {},
   "source": [
    "If the file is in the same folder as the Jupyter notebook, it can be loaded using only the name of the csv file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "a582bb2f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Periodo</th>\n",
       "      <th>1.Total</th>\n",
       "      <th>2.Empleadores</th>\n",
       "      <th>3.Cuenta Propia</th>\n",
       "      <th>4.Asalariados</th>\n",
       "      <th>5.Personal de servicio</th>\n",
       "      <th>6.Familiar no remunerado</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>mar.2010</td>\n",
       "      <td>7.156,21</td>\n",
       "      <td>318,32</td>\n",
       "      <td>1.289,68</td>\n",
       "      <td>5.141,76</td>\n",
       "      <td>325,38</td>\n",
       "      <td>81,08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>abr.2010</td>\n",
       "      <td>7.198,78</td>\n",
       "      <td>324,94</td>\n",
       "      <td>1.332,33</td>\n",
       "      <td>5.114,80</td>\n",
       "      <td>331,31</td>\n",
       "      <td>95,39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>may.2010</td>\n",
       "      <td>7.181,90</td>\n",
       "      <td>326,95</td>\n",
       "      <td>1.346,54</td>\n",
       "      <td>5.080,65</td>\n",
       "      <td>328,56</td>\n",
       "      <td>99,21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>jun.2010</td>\n",
       "      <td>7.221,58</td>\n",
       "      <td>328,03</td>\n",
       "      <td>1.384,28</td>\n",
       "      <td>5.074,00</td>\n",
       "      <td>327,60</td>\n",
       "      <td>107,68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>jul.2010</td>\n",
       "      <td>7.256,52</td>\n",
       "      <td>333,82</td>\n",
       "      <td>1.390,03</td>\n",
       "      <td>5.081,93</td>\n",
       "      <td>339,44</td>\n",
       "      <td>111,29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>128</th>\n",
       "      <td>nov.2020</td>\n",
       "      <td>7.916,72</td>\n",
       "      <td>248,89</td>\n",
       "      <td>1.568,05</td>\n",
       "      <td>5.833,68</td>\n",
       "      <td>188,43</td>\n",
       "      <td>77,66</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129</th>\n",
       "      <td>dic.2020</td>\n",
       "      <td>8.026,22</td>\n",
       "      <td>234,57</td>\n",
       "      <td>1.588,14</td>\n",
       "      <td>5.927,28</td>\n",
       "      <td>194,91</td>\n",
       "      <td>81,32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>130</th>\n",
       "      <td>ene.2021</td>\n",
       "      <td>8.121,42</td>\n",
       "      <td>237,25</td>\n",
       "      <td>1.610,63</td>\n",
       "      <td>6.000,74</td>\n",
       "      <td>197,43</td>\n",
       "      <td>75,36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>131</th>\n",
       "      <td>feb.2021</td>\n",
       "      <td>8.167,62</td>\n",
       "      <td>245,25</td>\n",
       "      <td>1.634,08</td>\n",
       "      <td>6.018,35</td>\n",
       "      <td>198,73</td>\n",
       "      <td>71,20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>132</th>\n",
       "      <td>mar.2021</td>\n",
       "      <td>8.148,21</td>\n",
       "      <td>246,92</td>\n",
       "      <td>1.646,38</td>\n",
       "      <td>5.978,29</td>\n",
       "      <td>204,48</td>\n",
       "      <td>72,14</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>133 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      Periodo   1.Total 2.Empleadores 3.Cuenta Propia 4.Asalariados  \\\n",
       "0    mar.2010  7.156,21        318,32        1.289,68      5.141,76   \n",
       "1    abr.2010  7.198,78        324,94        1.332,33      5.114,80   \n",
       "2    may.2010  7.181,90        326,95        1.346,54      5.080,65   \n",
       "3    jun.2010  7.221,58        328,03        1.384,28      5.074,00   \n",
       "4    jul.2010  7.256,52        333,82        1.390,03      5.081,93   \n",
       "..        ...       ...           ...             ...           ...   \n",
       "128  nov.2020  7.916,72        248,89        1.568,05      5.833,68   \n",
       "129  dic.2020  8.026,22        234,57        1.588,14      5.927,28   \n",
       "130  ene.2021  8.121,42        237,25        1.610,63      6.000,74   \n",
       "131  feb.2021  8.167,62        245,25        1.634,08      6.018,35   \n",
       "132  mar.2021  8.148,21        246,92        1.646,38      5.978,29   \n",
       "\n",
       "    5.Personal de servicio 6.Familiar no remunerado  \n",
       "0                   325,38                    81,08  \n",
       "1                   331,31                    95,39  \n",
       "2                   328,56                    99,21  \n",
       "3                   327,60                   107,68  \n",
       "4                   339,44                   111,29  \n",
       "..                     ...                      ...  \n",
       "128                 188,43                    77,66  \n",
       "129                 194,91                    81,32  \n",
       "130                 197,43                    75,36  \n",
       "131                 198,73                    71,20  \n",
       "132                 204,48                    72,14  \n",
       "\n",
       "[133 rows x 7 columns]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.read_csv(\"clase13_base1.csv\", delimiter=\";\")"
   ]
  },
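  {
   "cell_type": "markdown",
   "id": "paths-sketch",
   "metadata": {},
   "source": [
    "On the topic of file paths: on Windows, a raw string or forward slashes avoid typing doubled backslashes. This is a minimal sketch, assuming a hypothetical folder and file name (`ana` and `datos.csv` are made up for illustration and are not part of the class material):\n",
    "\n",
    "```python\n",
    "from pathlib import PureWindowsPath\n",
    "\n",
    "# Hypothetical Windows path written three ways.\n",
    "escaped = \"C:\\\\Users\\\\ana\\\\datos.csv\"  # doubled backslashes\n",
    "raw = r\"C:\\Users\\ana\\datos.csv\"  # raw string: backslashes are taken literally\n",
    "forward = \"C:/Users/ana/datos.csv\"  # forward slashes, which Python also accepts on Windows\n",
    "\n",
    "# The first two spellings produce exactly the same string.\n",
    "print(escaped == raw)\n",
    "\n",
    "# PureWindowsPath normalizes forward slashes to backslashes.\n",
    "print(str(PureWindowsPath(forward)) == raw)\n",
    "```\n",
    "\n",
    "Any of the three spellings can be passed to `pd.read_csv` as the file path."
   ]
  },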
  {
   "cell_type": "markdown",
   "id": "a067be00",
   "metadata": {},
   "source": [
    "The result is a row-column structure, called a `DataFrame`, that we can store in a variable.\n",
    "\n",
    "A `DataFrame` is a row-column data structure (similar to an Excel sheet) in which we can store information of different `types`. Every DataFrame has an index (the row labels), displayed to the left of the first column, which by default starts at 0.\n",
    "\n",
    "Our DataFrame has 133 rows and 7 columns, and its index runs from 0 to 132.\n"
   ]
  },
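  {
   "cell_type": "markdown",
   "id": "dataframe-shape-sketch",
   "metadata": {},
   "source": [
    "As a small illustration of rows, columns, and the index, here is a minimal sketch that builds a tiny DataFrame by hand. The name `small` is illustrative, and the values are copied from the first rows of the class data:\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Tiny DataFrame built by hand: each dictionary key becomes a column.\n",
    "small = pd.DataFrame({\n",
    "    \"Periodo\": [\"mar.2010\", \"abr.2010\", \"may.2010\"],\n",
    "    \"2.Empleadores\": [\"318,32\", \"324,94\", \"326,95\"],\n",
    "})\n",
    "\n",
    "print(small.shape)  # (number of rows, number of columns)\n",
    "print(list(small.index))  # default row labels, starting at 0\n",
    "```\n",
    "\n",
    "Here `shape` is `(3, 2)` and the index is `[0, 1, 2]`, the same pattern as the 133 rows indexed from 0 to 132 in the class data."
   ]
  },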
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "3d93a8f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"clase13_base1.csv\", delimiter=\";\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "475b4393",
   "metadata": {},
   "source": [
    "## 2. Accessing a DataFrame\n",
    "We take a first look at our data with the `head()` method, which shows a preview of the table. To call it, write the name of the DataFrame, a dot, and `head()`, for example `df.head()`.\n",
    "\n",
    "This displays the column names and the first 5 rows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "ac3a0641",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Periodo</th>\n",
       "      <th>1.Total</th>\n",
       "      <th>2.Empleadores</th>\n",
       "      <th>3.Cuenta Propia</th>\n",
       "      <th>4.Asalariados</th>\n",
       "      <th>5.Personal de servicio</th>\n",
       "      <th>6.Familiar no remunerado</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>mar.2010</td>\n",
       "      <td>7.156,21</td>\n",
       "      <td>318,32</td>\n",
       "      <td>1.289,68</td>\n",
       "      <td>5.141,76</td>\n",
       "      <td>325,38</td>\n",
       "      <td>81,08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>abr.2010</td>\n",
       "      <td>7.198,78</td>\n",
       "      <td>324,94</td>\n",
       "      <td>1.332,33</td>\n",
       "      <td>5.114,80</td>\n",
       "      <td>331,31</td>\n",
       "      <td>95,39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>may.2010</td>\n",
       "      <td>7.181,90</td>\n",
       "      <td>326,95</td>\n",
       "      <td>1.346,54</td>\n",
       "      <td>5.080,65</td>\n",
       "      <td>328,56</td>\n",
       "      <td>99,21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>jun.2010</td>\n",
       "      <td>7.221,58</td>\n",
       "      <td>328,03</td>\n",
       "      <td>1.384,28</td>\n",
       "      <td>5.074,00</td>\n",
       "      <td>327,60</td>\n",
       "      <td>107,68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>jul.2010</td>\n",
       "      <td>7.256,52</td>\n",
       "      <td>333,82</td>\n",
       "      <td>1.390,03</td>\n",
       "      <td>5.081,93</td>\n",
       "      <td>339,44</td>\n",
       "      <td>111,29</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Periodo   1.Total 2.Empleadores 3.Cuenta Propia 4.Asalariados  \\\n",
       "0  mar.2010  7.156,21        318,32        1.289,68      5.141,76   \n",
       "1  abr.2010  7.198,78        324,94        1.332,33      5.114,80   \n",
       "2  may.2010  7.181,90        326,95        1.346,54      5.080,65   \n",
       "3  jun.2010  7.221,58        328,03        1.384,28      5.074,00   \n",
       "4  jul.2010  7.256,52        333,82        1.390,03      5.081,93   \n",
       "\n",
       "  5.Personal de servicio 6.Familiar no remunerado  \n",
       "0                 325,38                    81,08  \n",
       "1                 331,31                    95,39  \n",
       "2                 328,56                    99,21  \n",
       "3                 327,60                   107,68  \n",
       "4                 339,44                   111,29  "
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ef5a33e8",
   "metadata": {},
   "source": [
    "We can specify exactly how many rows we want to see by passing the number inside the parentheses, for example `head(10)`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "7de37363",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Periodo</th>\n",
       "      <th>1.Total</th>\n",
       "      <th>2.Empleadores</th>\n",
       "      <th>3.Cuenta Propia</th>\n",
       "      <th>4.Asalariados</th>\n",
       "      <th>5.Personal de servicio</th>\n",
       "      <th>6.Familiar no remunerado</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>mar.2010</td>\n",
       "      <td>7.156,21</td>\n",
       "      <td>318,32</td>\n",
       "      <td>1.289,68</td>\n",
       "      <td>5.141,76</td>\n",
       "      <td>325,38</td>\n",
       "      <td>81,08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>abr.2010</td>\n",
       "      <td>7.198,78</td>\n",
       "      <td>324,94</td>\n",
       "      <td>1.332,33</td>\n",
       "      <td>5.114,80</td>\n",
       "      <td>331,31</td>\n",
       "      <td>95,39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>may.2010</td>\n",
       "      <td>7.181,90</td>\n",
       "      <td>326,95</td>\n",
       "      <td>1.346,54</td>\n",
       "      <td>5.080,65</td>\n",
       "      <td>328,56</td>\n",
       "      <td>99,21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>jun.2010</td>\n",
       "      <td>7.221,58</td>\n",
       "      <td>328,03</td>\n",
       "      <td>1.384,28</td>\n",
       "      <td>5.074,00</td>\n",
       "      <td>327,60</td>\n",
       "      <td>107,68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>jul.2010</td>\n",
       "      <td>7.256,52</td>\n",
       "      <td>333,82</td>\n",
       "      <td>1.390,03</td>\n",
       "      <td>5.081,93</td>\n",
       "      <td>339,44</td>\n",
       "      <td>111,29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>ago.2010</td>\n",
       "      <td>7.289,22</td>\n",
       "      <td>333,77</td>\n",
       "      <td>1.430,97</td>\n",
       "      <td>5.080,20</td>\n",
       "      <td>339,49</td>\n",
       "      <td>104,79</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>sep.2010</td>\n",
       "      <td>7.389,47</td>\n",
       "      <td>339,06</td>\n",
       "      <td>1.477,55</td>\n",
       "      <td>5.130,21</td>\n",
       "      <td>338,02</td>\n",
       "      <td>104,63</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>oct.2010</td>\n",
       "      <td>7.414,43</td>\n",
       "      <td>343,68</td>\n",
       "      <td>1.485,61</td>\n",
       "      <td>5.150,80</td>\n",
       "      <td>331,05</td>\n",
       "      <td>103,29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>nov.2010</td>\n",
       "      <td>7.503,09</td>\n",
       "      <td>347,12</td>\n",
       "      <td>1.486,07</td>\n",
       "      <td>5.216,86</td>\n",
       "      <td>341,96</td>\n",
       "      <td>111,08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>dic.2010</td>\n",
       "      <td>7.572,32</td>\n",
       "      <td>340,32</td>\n",
       "      <td>1.486,32</td>\n",
       "      <td>5.294,66</td>\n",
       "      <td>342,20</td>\n",
       "      <td>108,82</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Periodo   1.Total 2.Empleadores 3.Cuenta Propia 4.Asalariados  \\\n",
       "0  mar.2010  7.156,21        318,32        1.289,68      5.141,76   \n",
       "1  abr.2010  7.198,78        324,94        1.332,33      5.114,80   \n",
       "2  may.2010  7.181,90        326,95        1.346,54      5.080,65   \n",
       "3  jun.2010  7.221,58        328,03        1.384,28      5.074,00   \n",
       "4  jul.2010  7.256,52        333,82        1.390,03      5.081,93   \n",
       "5  ago.2010  7.289,22        333,77        1.430,97      5.080,20   \n",
       "6  sep.2010  7.389,47        339,06        1.477,55      5.130,21   \n",
       "7  oct.2010  7.414,43        343,68        1.485,61      5.150,80   \n",
       "8  nov.2010  7.503,09        347,12        1.486,07      5.216,86   \n",
       "9  dic.2010  7.572,32        340,32        1.486,32      5.294,66   \n",
       "\n",
       "  5.Personal de servicio 6.Familiar no remunerado  \n",
       "0                 325,38                    81,08  \n",
       "1                 331,31                    95,39  \n",
       "2                 328,56                    99,21  \n",
       "3                 327,60                   107,68  \n",
       "4                 339,44                   111,29  \n",
       "5                 339,49                   104,79  \n",
       "6                 338,02                   104,63  \n",
       "7                 331,05                   103,29  \n",
       "8                 341,96                   111,08  \n",
       "9                 342,20                   108,82  "
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head(10)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "17389080",
   "metadata": {},
   "source": [
    "La función `dtypes` nos va a describir la información dentro de la base de datos."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "bd42a4c4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Periodo                     object\n",
       "1.Total                     object\n",
       "2.Empleadores               object\n",
       "3.Cuenta Propia             object\n",
       "4.Asalariados               object\n",
       "5.Personal de servicio      object\n",
       "6.Familiar no remunerado    object\n",
       "dtype: object"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f3370db9",
   "metadata": {},
   "source": [
    "El type `object` corresponde a un dato del tipo texto, como una palabra. En este caso es poco intuitivo frente al tipo de datos que estamos usando. Deberíamos esperar que la base fuese en su mayoría del tipo numérico (Float, Int).  \n",
    "\n",
    "Para esto podemos especificar dos cosas: \n",
    "- Decimal: usamos el argumento `decimal=\"separador\"`. \n",
    "- Separador de miles: usamos el argumento `thousands=\"separador\"`\n",
    "\n",
    "Esto es relevante porque según la configuración del computador e idioma las bases pueden venir con separadores \".\" o \",\". En nuestro caso la base viene con separador de decimal \",\" y con separador de miles \".\". "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "31f8de1a",
   "metadata": {},
   "outputs": [],
   "source": [
    "#1. Guarda el DataFrame, muestra las columnas y la cantidad de filas y columnas"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "d56722e3",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Periodo                      object\n",
      "1.Total                     float64\n",
      "2.Empleadores               float64\n",
      "3.Cuenta Propia             float64\n",
      "4.Asalariados               float64\n",
      "5.Personal de servicio      float64\n",
      "6.Familiar no remunerado    float64\n",
      "dtype: object\n",
      "Index(['Periodo', '1.Total', '2.Empleadores', '3.Cuenta Propia',\n",
      "       '4.Asalariados', '5.Personal de servicio', '6.Familiar no remunerado'],\n",
      "      dtype='object')\n",
      "(133, 7)\n"
     ]
    }
   ],
   "source": [
    "#IMportar datos\n",
    "df = pd.read_csv(\"clase13_base1.csv\", delimiter=\";\", decimal=\",\", thousands='.')\n",
    "#Muestra los tipos\n",
    "print(df.dtypes)\n",
    "#Muestra columnas\n",
    "print(df.columns)\n",
    "#Mostrar N fila- M columna\n",
    "print(df.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0a50faa9",
   "metadata": {},
   "source": [
    "Para ver las columnas del DataFrame usamos `columns`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "72cfb751",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Periodo', '1.Total', '2.Empleadores', '3.Cuenta Propia',\n",
       "       '4.Asalariados', '5.Personal de servicio', '6.Familiar no remunerado'],\n",
       "      dtype='object')"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e402d915",
   "metadata": {},
   "source": [
    "Las dimensiones fila-columna las podemos ver mediante `shape`. Esta viene en formato tupla (fila,columna)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "6cf31910",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(133, 7)"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1e9ca181",
   "metadata": {},
   "source": [
    "Para ver el final de la tabla podemos usar `tail()`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "1f8481d9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Periodo</th>\n",
       "      <th>1.Total</th>\n",
       "      <th>2.Empleadores</th>\n",
       "      <th>3.Cuenta Propia</th>\n",
       "      <th>4.Asalariados</th>\n",
       "      <th>5.Personal de servicio</th>\n",
       "      <th>6.Familiar no remunerado</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>128</th>\n",
       "      <td>nov.2020</td>\n",
       "      <td>7916.72</td>\n",
       "      <td>248.89</td>\n",
       "      <td>1568.05</td>\n",
       "      <td>5833.68</td>\n",
       "      <td>188.43</td>\n",
       "      <td>77.66</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129</th>\n",
       "      <td>dic.2020</td>\n",
       "      <td>8026.22</td>\n",
       "      <td>234.57</td>\n",
       "      <td>1588.14</td>\n",
       "      <td>5927.28</td>\n",
       "      <td>194.91</td>\n",
       "      <td>81.32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>130</th>\n",
       "      <td>ene.2021</td>\n",
       "      <td>8121.42</td>\n",
       "      <td>237.25</td>\n",
       "      <td>1610.63</td>\n",
       "      <td>6000.74</td>\n",
       "      <td>197.43</td>\n",
       "      <td>75.36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>131</th>\n",
       "      <td>feb.2021</td>\n",
       "      <td>8167.62</td>\n",
       "      <td>245.25</td>\n",
       "      <td>1634.08</td>\n",
       "      <td>6018.35</td>\n",
       "      <td>198.73</td>\n",
       "      <td>71.20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>132</th>\n",
       "      <td>mar.2021</td>\n",
       "      <td>8148.21</td>\n",
       "      <td>246.92</td>\n",
       "      <td>1646.38</td>\n",
       "      <td>5978.29</td>\n",
       "      <td>204.48</td>\n",
       "      <td>72.14</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      Periodo  1.Total  2.Empleadores  3.Cuenta Propia  4.Asalariados  \\\n",
       "128  nov.2020  7916.72         248.89          1568.05        5833.68   \n",
       "129  dic.2020  8026.22         234.57          1588.14        5927.28   \n",
       "130  ene.2021  8121.42         237.25          1610.63        6000.74   \n",
       "131  feb.2021  8167.62         245.25          1634.08        6018.35   \n",
       "132  mar.2021  8148.21         246.92          1646.38        5978.29   \n",
       "\n",
       "     5.Personal de servicio  6.Familiar no remunerado  \n",
       "128                  188.43                     77.66  \n",
       "129                  194.91                     81.32  \n",
       "130                  197.43                     75.36  \n",
       "131                  198.73                     71.20  \n",
       "132                  204.48                     72.14  "
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tail()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f902b5c",
   "metadata": {},
   "source": [
    "Para revisar una columna en específico podemos usar diferentes mecanismos"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "56a047d5",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      mar.2010\n",
       "1      abr.2010\n",
       "2      may.2010\n",
       "3      jun.2010\n",
       "4      jul.2010\n",
       "         ...   \n",
       "128    nov.2020\n",
       "129    dic.2020\n",
       "130    ene.2021\n",
       "131    feb.2021\n",
       "132    mar.2021\n",
       "Name: Periodo, Length: 133, dtype: object"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Caso 1\n",
    "df['Periodo']\n",
    "#Caso 2\n",
    "df.Periodo"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a754cbf2",
   "metadata": {},
   "source": [
    "¿Qué pasa cuando el nombre de nuestra columna viene con espacios? ¿Podemos usar el caso 2? "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "e6cf0e5b",
   "metadata": {},
   "outputs": [
    {
     "ename": "SyntaxError",
     "evalue": "invalid syntax (<ipython-input-29-3e2cc8594c50>, line 4)",
     "output_type": "error",
     "traceback": [
      "\u001b[0;36m  File \u001b[0;32m\"<ipython-input-29-3e2cc8594c50>\"\u001b[0;36m, line \u001b[0;32m4\u001b[0m\n\u001b[0;31m    df.'3.Cuenta Propia'\u001b[0m\n\u001b[0m       ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n"
     ]
    }
   ],
   "source": [
    "#Funciona\n",
    "df['3.Cuenta Propia']\n",
    "#No Funciona\n",
    "df.'3.Cuenta Propia'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c06f4841",
   "metadata": {},
   "source": [
    "Por esta razón es fundamental que los nombres sean simples, en caso que tengan más de una palabra separar con \"_\". "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fe135172",
   "metadata": {},
   "source": [
    "## 3. Manipulando el DataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab493bf9",
   "metadata": {},
   "source": [
    "Lo primero que haremos es modificar el nombre de las variables. Para esto podemos usar el la función `rename()` y un `diccionario` con {'nombre_antiguo':'nombre_nuevo'}."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "c8fd9c5f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Periodo</th>\n",
       "      <th>TOT</th>\n",
       "      <th>EMP</th>\n",
       "      <th>CP</th>\n",
       "      <th>ASA</th>\n",
       "      <th>PdS</th>\n",
       "      <th>FnR</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>mar.2010</td>\n",
       "      <td>7156.21</td>\n",
       "      <td>318.32</td>\n",
       "      <td>1289.68</td>\n",
       "      <td>5141.76</td>\n",
       "      <td>325.38</td>\n",
       "      <td>81.08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>abr.2010</td>\n",
       "      <td>7198.78</td>\n",
       "      <td>324.94</td>\n",
       "      <td>1332.33</td>\n",
       "      <td>5114.80</td>\n",
       "      <td>331.31</td>\n",
       "      <td>95.39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>may.2010</td>\n",
       "      <td>7181.90</td>\n",
       "      <td>326.95</td>\n",
       "      <td>1346.54</td>\n",
       "      <td>5080.65</td>\n",
       "      <td>328.56</td>\n",
       "      <td>99.21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>jun.2010</td>\n",
       "      <td>7221.58</td>\n",
       "      <td>328.03</td>\n",
       "      <td>1384.28</td>\n",
       "      <td>5074.00</td>\n",
       "      <td>327.60</td>\n",
       "      <td>107.68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>jul.2010</td>\n",
       "      <td>7256.52</td>\n",
       "      <td>333.82</td>\n",
       "      <td>1390.03</td>\n",
       "      <td>5081.93</td>\n",
       "      <td>339.44</td>\n",
       "      <td>111.29</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>128</th>\n",
       "      <td>nov.2020</td>\n",
       "      <td>7916.72</td>\n",
       "      <td>248.89</td>\n",
       "      <td>1568.05</td>\n",
       "      <td>5833.68</td>\n",
       "      <td>188.43</td>\n",
       "      <td>77.66</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129</th>\n",
       "      <td>dic.2020</td>\n",
       "      <td>8026.22</td>\n",
       "      <td>234.57</td>\n",
       "      <td>1588.14</td>\n",
       "      <td>5927.28</td>\n",
       "      <td>194.91</td>\n",
       "      <td>81.32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>130</th>\n",
       "      <td>ene.2021</td>\n",
       "      <td>8121.42</td>\n",
       "      <td>237.25</td>\n",
       "      <td>1610.63</td>\n",
       "      <td>6000.74</td>\n",
       "      <td>197.43</td>\n",
       "      <td>75.36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>131</th>\n",
       "      <td>feb.2021</td>\n",
       "      <td>8167.62</td>\n",
       "      <td>245.25</td>\n",
       "      <td>1634.08</td>\n",
       "      <td>6018.35</td>\n",
       "      <td>198.73</td>\n",
       "      <td>71.20</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>132</th>\n",
       "      <td>mar.2021</td>\n",
       "      <td>8148.21</td>\n",
       "      <td>246.92</td>\n",
       "      <td>1646.38</td>\n",
       "      <td>5978.29</td>\n",
       "      <td>204.48</td>\n",
       "      <td>72.14</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>133 rows × 7 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      Periodo      TOT     EMP       CP      ASA     PdS     FnR\n",
       "0    mar.2010  7156.21  318.32  1289.68  5141.76  325.38   81.08\n",
       "1    abr.2010  7198.78  324.94  1332.33  5114.80  331.31   95.39\n",
       "2    may.2010  7181.90  326.95  1346.54  5080.65  328.56   99.21\n",
       "3    jun.2010  7221.58  328.03  1384.28  5074.00  327.60  107.68\n",
       "4    jul.2010  7256.52  333.82  1390.03  5081.93  339.44  111.29\n",
       "..        ...      ...     ...      ...      ...     ...     ...\n",
       "128  nov.2020  7916.72  248.89  1568.05  5833.68  188.43   77.66\n",
       "129  dic.2020  8026.22  234.57  1588.14  5927.28  194.91   81.32\n",
       "130  ene.2021  8121.42  237.25  1610.63  6000.74  197.43   75.36\n",
       "131  feb.2021  8167.62  245.25  1634.08  6018.35  198.73   71.20\n",
       "132  mar.2021  8148.21  246.92  1646.38  5978.29  204.48   72.14\n",
       "\n",
       "[133 rows x 7 columns]"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.rename(columns={'1.Total': 'TOT', '2.Empleadores':'EMP', '3.Cuenta Propia':'CP', '4.Asalariados':'ASA', '5.Personal de servicio':'PdS', '6.Familiar no remunerado':'FnR'})"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65268ef6",
   "metadata": {},
   "source": [
    "Si hacemos sólo `df.rename()` no se modifica el DataFrame, entonces tenemos dos opciones: 1) creamos uno nuevo o 2) modificamos el que ya existe.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "1e08da99",
   "metadata": {},
   "outputs": [],
   "source": [
    "#1. Creamos lun DF nuevo \n",
    "df2 = df.rename(columns={'1.Total': 'TOT', '2.Empleadores':'EMP', '3.Cuenta Propia':'CP', '4.Asalariados':'ASA', '5.Personal de servicio':'PdS', '6.Familiar no remunerado':'FnR'})\n",
    "\n",
    "#2. Modificamos el que existe\n",
    "df = df.rename(columns={'1.Total': 'TOT', '2.Empleadores':'EMP', '3.Cuenta Propia':'CP', '4.Asalariados':'ASA', '5.Personal de servicio':'PdS', '6.Familiar no remunerado':'FnR'})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "d9fb6db7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Periodo</th>\n",
       "      <th>TOT</th>\n",
       "      <th>EMP</th>\n",
       "      <th>CP</th>\n",
       "      <th>ASA</th>\n",
       "      <th>PdS</th>\n",
       "      <th>FnR</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>mar.2010</td>\n",
       "      <td>7156.21</td>\n",
       "      <td>318.32</td>\n",
       "      <td>1289.68</td>\n",
       "      <td>5141.76</td>\n",
       "      <td>325.38</td>\n",
       "      <td>81.08</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>abr.2010</td>\n",
       "      <td>7198.78</td>\n",
       "      <td>324.94</td>\n",
       "      <td>1332.33</td>\n",
       "      <td>5114.80</td>\n",
       "      <td>331.31</td>\n",
       "      <td>95.39</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>may.2010</td>\n",
       "      <td>7181.90</td>\n",
       "      <td>326.95</td>\n",
       "      <td>1346.54</td>\n",
       "      <td>5080.65</td>\n",
       "      <td>328.56</td>\n",
       "      <td>99.21</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>jun.2010</td>\n",
       "      <td>7221.58</td>\n",
       "      <td>328.03</td>\n",
       "      <td>1384.28</td>\n",
       "      <td>5074.00</td>\n",
       "      <td>327.60</td>\n",
       "      <td>107.68</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>jul.2010</td>\n",
       "      <td>7256.52</td>\n",
       "      <td>333.82</td>\n",
       "      <td>1390.03</td>\n",
       "      <td>5081.93</td>\n",
       "      <td>339.44</td>\n",
       "      <td>111.29</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Periodo      TOT     EMP       CP      ASA     PdS     FnR\n",
       "0  mar.2010  7156.21  318.32  1289.68  5141.76  325.38   81.08\n",
       "1  abr.2010  7198.78  324.94  1332.33  5114.80  331.31   95.39\n",
       "2  may.2010  7181.90  326.95  1346.54  5080.65  328.56   99.21\n",
       "3  jun.2010  7221.58  328.03  1384.28  5074.00  327.60  107.68\n",
       "4  jul.2010  7256.52  333.82  1390.03  5081.93  339.44  111.29"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3fba8e83",
   "metadata": {},
   "source": [
    "Una mirada inicial a una variable podemos darla con la función `describe()`. Esta función nos entrega una resumen estadístico de la variable. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "9ce90f2c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "count     133.000000\n",
       "mean     8187.411504\n",
       "std       504.204366\n",
       "min      7073.190000\n",
       "25%      7844.780000\n",
       "50%      8202.890000\n",
       "75%      8535.210000\n",
       "max      9118.180000\n",
       "Name: TOT, dtype: float64"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#1. Llamamos una variable df.variable\n",
    "df.TOT.describe()\n",
    "\n",
    "#2. Llamamos df['variable']\n",
    "df['TOT'].describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "518e6902",
   "metadata": {},
   "source": [
    "Podemos sacar una estadística en particular"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "a18493f9",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "min: 7073.19\n",
      "max: 9118.18\n",
      "mean: 8187.411503759398\n",
      "std: 504.20436565714255\n",
      "count: 133\n"
     ]
    }
   ],
   "source": [
    "print(\"min:\", df.TOT.min())\n",
    "print(\"max:\", df.TOT.max())\n",
    "print(\"mean:\", df.TOT.mean())\n",
    "print(\"std:\", df.TOT.std())\n",
    "print(\"count:\", df.TOT.count())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c3673349",
   "metadata": {},
   "source": [
    "La variable `Periodo` sigue siendo del tipo Objeto (texto), podemos crear una variable del tipo fecha. Para esto vamos a hacer dos cosas: 1) crear una variable en formato fecha y 2) agregar esta variable al DataFrame. \n",
    "\n",
    "- Para crear una variable del tipo fecha podemos usar la función `date_range(fecha_inicio, periodos, frecuencia)`. En el siguiente link (link 3) encuentran detalle de como variables del tipo fecha en un DataFrame: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html. \n",
    "- Para anexar una variable al DataFrame colocamos `df['nombre_variable'] = variable`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "4c6c2477",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Periodo            object\n",
       "TOT               float64\n",
       "EMP               float64\n",
       "CP                float64\n",
       "ASA               float64\n",
       "PdS               float64\n",
       "FnR               float64\n",
       "Date       datetime64[ns]\n",
       "dtype: object"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Creamos una variable del tipo fecha con la función\n",
    "Date = pd.date_range(\"2010-03-01\", periods=133, freq=\"M\")\n",
    "\n",
    "#Anexamos la variable nueva\n",
    "df['Date'] = Date\n",
    "\n",
    "#Vemos el resultado\n",
    "df.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1db536ae",
   "metadata": {},
   "source": [
    "Vamos a guardar los meses y años por separados en el DataFrame. Para esto utilizamos `DatetimeIndex` (abreviamos dt) que nos permite extraer el segundos/dia/mes/año de una variable del tipo `datetime`, por ejemplo usando `month` y `year`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "78b75d8c",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Periodo            object\n",
      "TOT               float64\n",
      "EMP               float64\n",
      "CP                float64\n",
      "ASA               float64\n",
      "PdS               float64\n",
      "FnR               float64\n",
      "Date       datetime64[ns]\n",
      "mes                 int64\n",
      "year                int64\n",
      "dtype: object\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Periodo</th>\n",
       "      <th>TOT</th>\n",
       "      <th>EMP</th>\n",
       "      <th>CP</th>\n",
       "      <th>ASA</th>\n",
       "      <th>PdS</th>\n",
       "      <th>FnR</th>\n",
       "      <th>Date</th>\n",
       "      <th>mes</th>\n",
       "      <th>year</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>mar.2010</td>\n",
       "      <td>7156.21</td>\n",
       "      <td>318.32</td>\n",
       "      <td>1289.68</td>\n",
       "      <td>5141.76</td>\n",
       "      <td>325.38</td>\n",
       "      <td>81.08</td>\n",
       "      <td>2010-03-31</td>\n",
       "      <td>3</td>\n",
       "      <td>2010</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>abr.2010</td>\n",
       "      <td>7198.78</td>\n",
       "      <td>324.94</td>\n",
       "      <td>1332.33</td>\n",
       "      <td>5114.80</td>\n",
       "      <td>331.31</td>\n",
       "      <td>95.39</td>\n",
       "      <td>2010-04-30</td>\n",
       "      <td>4</td>\n",
       "      <td>2010</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>may.2010</td>\n",
       "      <td>7181.90</td>\n",
       "      <td>326.95</td>\n",
       "      <td>1346.54</td>\n",
       "      <td>5080.65</td>\n",
       "      <td>328.56</td>\n",
       "      <td>99.21</td>\n",
       "      <td>2010-05-31</td>\n",
       "      <td>5</td>\n",
       "      <td>2010</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>jun.2010</td>\n",
       "      <td>7221.58</td>\n",
       "      <td>328.03</td>\n",
       "      <td>1384.28</td>\n",
       "      <td>5074.00</td>\n",
       "      <td>327.60</td>\n",
       "      <td>107.68</td>\n",
       "      <td>2010-06-30</td>\n",
       "      <td>6</td>\n",
       "      <td>2010</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>jul.2010</td>\n",
       "      <td>7256.52</td>\n",
       "      <td>333.82</td>\n",
       "      <td>1390.03</td>\n",
       "      <td>5081.93</td>\n",
       "      <td>339.44</td>\n",
       "      <td>111.29</td>\n",
       "      <td>2010-07-31</td>\n",
       "      <td>7</td>\n",
       "      <td>2010</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Periodo      TOT     EMP       CP      ASA     PdS     FnR       Date  \\\n",
       "0  mar.2010  7156.21  318.32  1289.68  5141.76  325.38   81.08 2010-03-31   \n",
       "1  abr.2010  7198.78  324.94  1332.33  5114.80  331.31   95.39 2010-04-30   \n",
       "2  may.2010  7181.90  326.95  1346.54  5080.65  328.56   99.21 2010-05-31   \n",
       "3  jun.2010  7221.58  328.03  1384.28  5074.00  327.60  107.68 2010-06-30   \n",
       "4  jul.2010  7256.52  333.82  1390.03  5081.93  339.44  111.29 2010-07-31   \n",
       "\n",
       "   mes  year  \n",
       "0    3  2010  \n",
       "1    4  2010  \n",
       "2    5  2010  \n",
       "3    6  2010  \n",
       "4    7  2010  "
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#Guardamos el mes en una variable \n",
    "df['mes'] = df['Date'].dt.month\n",
    "\n",
    "#Guardamos el año en una variable\n",
    "df['year'] = df['Date'].dt.year\n",
    "\n",
    "#Vemos el resultado\n",
    "print(df.dtypes)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3dbabc89",
   "metadata": {},
   "source": [
    "Podemos agrupar una variable mediante `groupby`. Luego podemos aplicar funciones básicas como `mean()`, `std()`, `sum()`, etc. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "8f87f1e3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>TOT</th>\n",
       "      <th>EMP</th>\n",
       "      <th>CP</th>\n",
       "      <th>ASA</th>\n",
       "      <th>PdS</th>\n",
       "      <th>FnR</th>\n",
       "      <th>mes</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>year</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2010</th>\n",
       "      <td>7318.352000</td>\n",
       "      <td>333.601000</td>\n",
       "      <td>1410.938000</td>\n",
       "      <td>5136.587000</td>\n",
       "      <td>334.501000</td>\n",
       "      <td>102.726000</td>\n",
       "      <td>7.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2011</th>\n",
       "      <td>7676.545833</td>\n",
       "      <td>349.602500</td>\n",
       "      <td>1488.466667</td>\n",
       "      <td>5382.360833</td>\n",
       "      <td>359.097500</td>\n",
       "      <td>97.016667</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012</th>\n",
       "      <td>7858.610833</td>\n",
       "      <td>323.438333</td>\n",
       "      <td>1460.165833</td>\n",
       "      <td>5626.480000</td>\n",
       "      <td>353.575000</td>\n",
       "      <td>94.954167</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2013</th>\n",
       "      <td>8023.047500</td>\n",
       "      <td>334.418333</td>\n",
       "      <td>1484.029167</td>\n",
       "      <td>5766.081667</td>\n",
       "      <td>336.561667</td>\n",
       "      <td>101.956667</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014</th>\n",
       "      <td>8150.107500</td>\n",
       "      <td>338.969167</td>\n",
       "      <td>1569.675000</td>\n",
       "      <td>5800.595833</td>\n",
       "      <td>336.685000</td>\n",
       "      <td>104.185000</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2015</th>\n",
       "      <td>8273.739167</td>\n",
       "      <td>344.422500</td>\n",
       "      <td>1593.640833</td>\n",
       "      <td>5919.330833</td>\n",
       "      <td>319.370833</td>\n",
       "      <td>96.973333</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016</th>\n",
       "      <td>8391.921667</td>\n",
       "      <td>334.027500</td>\n",
       "      <td>1686.861667</td>\n",
       "      <td>5945.020000</td>\n",
       "      <td>330.251667</td>\n",
       "      <td>95.760000</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017</th>\n",
       "      <td>8574.332500</td>\n",
       "      <td>371.210833</td>\n",
       "      <td>1777.026667</td>\n",
       "      <td>6008.260833</td>\n",
       "      <td>323.455000</td>\n",
       "      <td>94.375833</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2018</th>\n",
       "      <td>8773.940000</td>\n",
       "      <td>363.481667</td>\n",
       "      <td>1800.942500</td>\n",
       "      <td>6185.576667</td>\n",
       "      <td>323.193333</td>\n",
       "      <td>100.750000</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019</th>\n",
       "      <td>8953.679167</td>\n",
       "      <td>368.760000</td>\n",
       "      <td>1855.963333</td>\n",
       "      <td>6320.787500</td>\n",
       "      <td>319.305833</td>\n",
       "      <td>88.863333</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2020</th>\n",
       "      <td>7932.822500</td>\n",
       "      <td>274.867500</td>\n",
       "      <td>1495.655000</td>\n",
       "      <td>5878.350000</td>\n",
       "      <td>214.355000</td>\n",
       "      <td>69.597500</td>\n",
       "      <td>6.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021</th>\n",
       "      <td>8145.750000</td>\n",
       "      <td>243.140000</td>\n",
       "      <td>1630.363333</td>\n",
       "      <td>5999.126667</td>\n",
       "      <td>200.213333</td>\n",
       "      <td>72.900000</td>\n",
       "      <td>2.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              TOT         EMP           CP          ASA         PdS  \\\n",
       "year                                                                  \n",
       "2010  7318.352000  333.601000  1410.938000  5136.587000  334.501000   \n",
       "2011  7676.545833  349.602500  1488.466667  5382.360833  359.097500   \n",
       "2012  7858.610833  323.438333  1460.165833  5626.480000  353.575000   \n",
       "2013  8023.047500  334.418333  1484.029167  5766.081667  336.561667   \n",
       "2014  8150.107500  338.969167  1569.675000  5800.595833  336.685000   \n",
       "2015  8273.739167  344.422500  1593.640833  5919.330833  319.370833   \n",
       "2016  8391.921667  334.027500  1686.861667  5945.020000  330.251667   \n",
       "2017  8574.332500  371.210833  1777.026667  6008.260833  323.455000   \n",
       "2018  8773.940000  363.481667  1800.942500  6185.576667  323.193333   \n",
       "2019  8953.679167  368.760000  1855.963333  6320.787500  319.305833   \n",
       "2020  7932.822500  274.867500  1495.655000  5878.350000  214.355000   \n",
       "2021  8145.750000  243.140000  1630.363333  5999.126667  200.213333   \n",
       "\n",
       "             FnR  mes  \n",
       "year                   \n",
       "2010  102.726000  7.5  \n",
       "2011   97.016667  6.5  \n",
       "2012   94.954167  6.5  \n",
       "2013  101.956667  6.5  \n",
       "2014  104.185000  6.5  \n",
       "2015   96.973333  6.5  \n",
       "2016   95.760000  6.5  \n",
       "2017   94.375833  6.5  \n",
       "2018  100.750000  6.5  \n",
       "2019   88.863333  6.5  \n",
       "2020   69.597500  6.5  \n",
       "2021   72.900000  2.0  "
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
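    "df_agrupado = df.groupby('year')\n",
    "# numeric_only=True skips the text and date columns; pandas 2.0+ raises a TypeError on text columns without it\n",
    "df_agrupado.mean(numeric_only=True)"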
   ]
  },
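  {
   "cell_type": "markdown",
   "id": "a3f09c1e",
   "metadata": {},
   "source": [
    "As a minimal sketch of the grouping step above, the example below uses a small made-up table (the names `toy` and `summary` and all values are illustrative assumptions, not the employment data) and computes several statistics per year in a single call with `agg`.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up values for illustration only\n",
    "toy = pd.DataFrame({\n",
    "    'year': [2010, 2010, 2011, 2011],\n",
    "    'TOT': [10.0, 20.0, 30.0, 50.0],\n",
    "})\n",
    "\n",
    "# Group by year and compute several statistics of TOT at once\n",
    "summary = toy.groupby('year')['TOT'].agg(['mean', 'std', 'sum'])\n",
    "print(summary)\n",
    "```\n",
    "\n",
    "Passing a list of function names to `agg` returns one column per statistic, which avoids calling `mean()`, `std()` and `sum()` separately."
   ]
  },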
  {
   "cell_type": "markdown",
   "id": "88c9792e",
   "metadata": {},
   "source": [
    "We can also do this for a single variable or a specific group of variables: \n",
    "- Select the variables to work with by passing a list: `df[['TOT', 'year']]`. The list must include both the variable to analyze (`TOT`) and the variable to group by (`year`).  \n",
    "- Apply `groupby` with the grouping variable: `groupby('year')`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "707f372b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>TOT</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>year</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2010</th>\n",
       "      <td>7318.352000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2011</th>\n",
       "      <td>7676.545833</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2012</th>\n",
       "      <td>7858.610833</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2013</th>\n",
       "      <td>8023.047500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2014</th>\n",
       "      <td>8150.107500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2015</th>\n",
       "      <td>8273.739167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2016</th>\n",
       "      <td>8391.921667</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2017</th>\n",
       "      <td>8574.332500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2018</th>\n",
       "      <td>8773.940000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2019</th>\n",
       "      <td>8953.679167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2020</th>\n",
       "      <td>7932.822500</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2021</th>\n",
       "      <td>8145.750000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "              TOT\n",
       "year             \n",
       "2010  7318.352000\n",
       "2011  7676.545833\n",
       "2012  7858.610833\n",
       "2013  8023.047500\n",
       "2014  8150.107500\n",
       "2015  8273.739167\n",
       "2016  8391.921667\n",
       "2017  8574.332500\n",
       "2018  8773.940000\n",
       "2019  8953.679167\n",
       "2020  7932.822500\n",
       "2021  8145.750000"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_agrupado = df[['TOT', 'year']].groupby('year')\n",
    "df_agrupado.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a9be3fd",
   "metadata": {},
   "source": [
    "The result of `df_agrupado.mean()` has two elements: \n",
    "- Index: the variable the data was grouped by, accessed with `.index`.\n",
    "- Relevant variables: the ones the analysis was computed on, in this case `TOT`, accessed with `['TOT']`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "293802cb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "index: Int64Index([2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020,\n",
      "            2021],\n",
      "           dtype='int64', name='year')\n",
      "TOT: year\n",
      "2010    7318.352000\n",
      "2011    7676.545833\n",
      "2012    7858.610833\n",
      "2013    8023.047500\n",
      "2014    8150.107500\n",
      "2015    8273.739167\n",
      "2016    8391.921667\n",
      "2017    8574.332500\n",
      "2018    8773.940000\n",
      "2019    8953.679167\n",
      "2020    7932.822500\n",
      "2021    8145.750000\n",
      "Name: TOT, dtype: float64\n"
     ]
    }
   ],
   "source": [
    "#Index\n",
    "print(\"index:\", df_agrupado.mean().index)\n",
    "\n",
    "#TOT\n",
    "print(\"TOT:\", df_agrupado.mean()['TOT'])"
   ]
  },
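  {
   "cell_type": "markdown",
   "id": "5d7b2e64",
   "metadata": {},
   "source": [
    "To illustrate how the index and the columns of an aggregated result relate, here is a small self-contained sketch. The table `toy` and the names `means` and `flat` are hypothetical and only serve the example; `reset_index()` is one common way to turn the group keys back into an ordinary column.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up values for illustration only\n",
    "toy = pd.DataFrame({'year': [2010, 2010, 2011], 'TOT': [1.0, 3.0, 5.0]})\n",
    "\n",
    "means = toy.groupby('year').mean()\n",
    "\n",
    "# The group keys live in the index, the aggregated values in the columns\n",
    "print(list(means.index))\n",
    "print(means['TOT'].tolist())\n",
    "\n",
    "# reset_index() moves the keys back into a regular column\n",
    "flat = means.reset_index()\n",
    "print(list(flat.columns))\n",
    "```\n",
    "\n",
    "Keeping the keys in the index is convenient for plotting, while the flat form is often easier to merge or export."
   ]
  },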
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "f21b815f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAD4CAYAAAAAczaOAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAP0UlEQVR4nO3df6zddX3H8efLVhBkTJBCSsssyzq3QjKBhqEuixszVN1WMkdSE6VbcJ0EN12WbO2yxC3aBBdnlDnYGtmATCX1x0I3gxvpJJuGwC5ChFKRKgwqHVxdnOgWFPbeH+fT5Nje9p5Lzz23936ej+TkfM/7fL/f83lzzn2dbz/fcw6pKiRJfXjRQg9AkjQ5hr4kdcTQl6SOGPqS1BFDX5I6snyhBzCbM844o9asWbPQw5CkReXee+/9ZlWtOLR+3If+mjVrmJqaWuhhSNKikuQ/Zqo7vSNJHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR057r+RK2lpWrP1s2Pf52PXvmns+1xqPNKXpI4Y+pLUEUNfkjrinL6kH+Jc+9Lmkb4kdcTQl6SOGPqS1BHn9KVFwrl2jYNH+pLUEUNfkjpi6EtSRwx9SeqIJ3KlMRj3SVZPsGq+eKQvSR3xSF9Lmh9zlH6YR/qS1BFDX5I64vSOFoTTLlpqFsvJ/JGO9JP8XpI9SR5M8okkL0lyepI7kjzSrk8bWn9bkn1JHk5y2VD9oiQPtPuuS5L5aEqSNLNZQz/JKuB3gfVVdT6wDNgEbAV2V9VaYHe7TZJ17f7zgA3A9UmWtd3dAGwB1rbLhrF2I0k6qlGnd5YDJyX5AXAy8CSwDXhdu/9m4E7gD4GNwK1V9SzwaJJ9wMVJHgNOraq7AJLcAlwO3D6ORjQeTrtIS9usR/pV9Q3gA8DjwAHgv6vqn4GzqupAW+cAcGbbZBXwxNAu9rfaqrZ8aP0wSbYkmUoyNT09PbeOJElHNOuRfpur3wicC3wb+GSStx5tkxlqdZT64cWqHcAOgPXr18+4jiSNYrGcYJ2UUU7k/hLwaFVNV9UPgM8ArwGeSrISoF0/3dbfD5wztP1qBtNB+9vyoXVJ0oSMMqf/OHBJkpOB/wUuBaaA7wGbgWvb9W1t/V3Ax5N8EDibwQnbe6rq+STPJLkEuBu4EviLcTaz1HnEIulYzRr6VXV3kk8BXwKeA+5jMPVyCrAzyVUM3hiuaOvvSbITeKitf01VPd92dzVwE3ASgxO4nsSVpAka6dM7VfUe4D2HlJ9lcNQ/0/rbge0z1KeA8+c4RknSmPgzDJLUEX+GYQyca5e0WHikL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SerIkv5ylv9DEEn6YR7pS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6shIoZ/kZUk+leQrSfYmeXWS05PckeSRdn3a0PrbkuxL8nCSy4bqFyV5oN13XZLMR1OSpJmNeqT/YeBzVfVTwM8Ae4GtwO6qWgvsbrdJsg7YBJwHbACuT7Ks7ecGYAuwtl02jKkPSdIIZg39JKcCPw/cCFBV36+qbwMbgZvbajcDl7fljcCtVfVsVT0K7AMuTrISOLWq7qqqAm4Z2kaSNAGjHOn/ODAN/G2S+5J8NMlLgbOq6gBAuz6zrb8KeGJo+/2ttqotH1o/TJItSaaSTE1PT8+pIUnSkY0S+suBC4EbquoC4Hu0qZwjmGmevo5SP7xYtaOq1lfV+hUrVowwREnSKEYJ/f3A/qq6u93+FIM3gafalA3t+umh9c8Z2n418GSrr56hLkmakFlDv6r+E3giyStb6VLgIWAXsLnVNgO3
teVdwKYkJyY5l8EJ23vaFNAzSS5pn9q5cmgbSdIELB9xvd8BPpbkBODrwG8yeMPYmeQq4HHgCoCq2pNkJ4M3hueAa6rq+bafq4GbgJOA29tFkjQhI4V+Vd0PrJ/hrkuPsP52YPsM9Sng/DmMT5I0Rn4jV5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSMjh36SZUnuS/KP7fbpSe5I8ki7Pm1o3W1J9iV5OMllQ/WLkjzQ7rsuScbbjiTpaOZypP8uYO/Q7a3A7qpaC+xut0myDtgEnAdsAK5PsqxtcwOwBVjbLhuOafSSpDkZKfSTrAbeBHx0qLwRuLkt3wxcPlS/taqerapHgX3AxUlWAqdW1V1VVcAtQ9tIkiZg1CP9DwF/APzfUO2sqjoA0K7PbPVVwBND6+1vtVVt+dD6YZJsSTKVZGp6enrEIUqSZjNr6Cf5ZeDpqrp3xH3ONE9fR6kfXqzaUVXrq2r9ihUrRnxYSdJslo+wzmuBX03yRuAlwKlJ/g54KsnKqjrQpm6ebuvvB84Z2n418GSrr56hLkmakFmP9KtqW1Wtrqo1DE7Q/ktVvRXYBWxuq20GbmvLu4BNSU5Mci6DE7b3tCmgZ5Jc0j61c+XQNpKkCRjlSP9IrgV2JrkKeBy4AqCq9iTZCTwEPAdcU1XPt22uBm4CTgJubxdJ0oTMKfSr6k7gzrb8LeDSI6y3Hdg+Q30KOH+ug5QkjYffyJWkjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SerIrKGf5Jwkn0+yN8meJO9q9dOT3JHkkXZ92tA225LsS/JwksuG6hcleaDdd12SzE9bkqSZjHKk/xzw+1X108AlwDVJ1gFbgd1VtRbY3W7T7tsEnAdsAK5Psqzt6wZgC7C2XTaMsRdJ0ixmDf2qOlBVX2rLzwB7gVXARuDmttrNwOVteSNwa1U9W1WPAvuAi5OsBE6tqruqqoBbhraRJE3AnOb0k6wBLgDuBs6qqgMweGMAzmyrrQKeGNpsf6utasuH1iVJEzJy6Cc5Bfg08O6q+s7RVp2hVkepz/RYW5JMJZmanp4edYiSpFmMFPpJXswg8D9WVZ9p5afalA3t+ulW3w+cM7T5auDJVl89Q/0wVbWjqtZX1foVK1aM2oskaRajfHonwI3A3qr64NBdu4DNbXkzcNtQfVOSE5Ocy+CE7T1tCuiZJJe0fV45tI0kaQKWj7DOa4G3AQ8kub/V/gi4FtiZ5CrgceAKgKrak2Qn8BCDT/5cU1XPt+2uBm4CTgJubxdJ0oTMGvpV9QVmno8HuPQI22wHts9QnwLOn8sAJUnj4zdyJakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHDH1J6oihL0kdMfQlqSOGviR1xNCXpI4Y+pLUEUNfkjpi6EtSRwx9SeqIoS9JHTH0Jakjhr4kdcTQl6SOGPqS1BFDX5I6YuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjhj6ktQRQ1+SOmLoS1JHJh76STYkeTjJviRbJ/34ktSziYZ+kmXAXwJvANYBb0mybpJjkKSeTfpI/2Jg
X1V9vaq+D9wKbJzwGCSpW6mqyT1Y8uvAhqp6e7v9NuBnq+qdh6y3BdjSbr4SeHieh3YG8M15foxJWkr9LKVewH6OZ0upF4BXVNWKQ4vLJzyIzFA77F2nqnYAO+Z/OANJpqpq/aQeb74tpX6WUi9gP8ezpdTL0Ux6emc/cM7Q7dXAkxMegyR1a9Kh/+/A2iTnJjkB2ATsmvAYJKlbE53eqarnkrwT+CdgGfA3VbVnkmM4golNJU3IUupnKfUC9nM8W0q9HNFET+RKkhaW38iVpI4Y+pLUkSUZ+knOSfL5JHuT7EnyrlY/PckdSR5p16e1+svb+t9N8pFD9nVRkgfaz0Zcl2Smj50uin6SnJzks0m+0vZz7WLt5ZB97kry4CT7GHrscb7WTkiyI8lX23P05kXez1va386Xk3wuyRnHeS+vT3JvG/O9SX5xaF8LngNjU1VL7gKsBC5syz8CfJXBzz78GbC11bcC72/LLwV+DngH8JFD9nUP8GoG3zG4HXjDYu0HOBn4hbZ8AvBvk+5nnM9Nu//XgI8DDy6B19qfAu9ryy8Czlis/TD4kMjTB3to2//Jcd7LBcDZbfl84BtD+1rwHBjbf5eFHsCEnvzbgNcz+GbvyqEXxMOHrPcbh7xwVwJfGbr9FuCvF2s/M+znw8BvLdZegFOAL7Q/5AUJ/TH38wTw0oXuYRz9AC8GpoFXtKD8K2DLYuil1QN8CzjxeM2BF3pZktM7w5KsYfAOfjdwVlUdAGjXZ86y+SoGXyg7aH+rLZhj7Gd4Py8DfgXYPf5RjjyGNRxbL+8F/hz4n/ka41wcSz/t+QB4b5IvJflkkrPmcbizOpZ+quoHwNXAAwy+gLkOuHE+x3s0L6CXNwP3VdWzHIc5cCyWdOgnOQX4NPDuqvrOC9nFDLUF+4zrGPo5uJ/lwCeA66rq6+Ma3xzHcEy9JHkV8BNV9ffjHtsLMYbnZjmDb6h/saouBO4CPjDGIc7JGJ6fFzMI/QuAs4EvA9vGOsjRxzKnXpKcB7wf+O2DpRlWW7SfdV+yod9edJ8GPlZVn2nlp5KsbPevZDDneDT7GfwhHrRgPxsxpn4O2gE8UlUfGvtARzCmXl4NXJTkMQZTPD+Z5M75GfHRjamfbzH4F8vBN7FPAhfOw3BnNaZ+XgVQVV+rwZzITuA18zPiI5trL0lWM3gOrqyqr7XycZMD47AkQ7+dWb8R2FtVHxy6axewuS1vZjDHd0Ttn37PJLmk7fPK2baZD+Pqp+3rfcCPAu8e8zBHMsbn5oaqOruq1jA4kfjVqnrd+Ed8dGPsp4B/AF7XSpcCD411sCMY42vtG8C6JAd/5fH1wN5xjnU2c+2lTbF9FthWVV88uPLxkgNjs9AnFebjwiAEisE/Ke9vlzcCL2cwh/1Iuz59aJvHgP8CvsvgnX1dq68HHgS+BnyE9i3mxdgPgyOUYvDHd3A/b1+MvRyyzzUs3Kd3xvlaewXwr21fu4EfW+T9vKO91r7M4A3t5cdzL8AfA98bWvd+4Mx234LnwLgu/gyDJHVkSU7vSJJmZuhLUkcMfUnqiKEvSR0x9CWpI4a+JHXE0Jekjvw/VPyfjkeWhTIAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Compute the yearly means once and reuse them\n",
    "medias = df_agrupado.mean()\n",
    "x = medias.index      # years\n",
    "y = medias['TOT']     # mean of TOT per year\n",
    "plt.bar(x, y)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c472c936",
   "metadata": {},
   "source": [
    "**With this we can start working with a dataset and show some results!** \n",
    "- To read Excel data we can use `pd.read_excel`.\n",
    "- To read Stata data we can use `pd.read_stata`. \n"
   ]
  },
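  {
   "cell_type": "markdown",
   "id": "e18c4a97",
   "metadata": {},
   "source": [
    "A minimal sketch of reading a Stata file, assuming no real `.dta` file is at hand: the example first writes a made-up table to a temporary file with `to_stata` and then loads it back with `pd.read_stata`. The file name `toy.dta` and the names `toy`, `path` and `round_trip` are hypothetical.\n",
    "\n",
    "```python\n",
    "import os\n",
    "import tempfile\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "# Made-up values for illustration only\n",
    "toy = pd.DataFrame({'year': [2010, 2011], 'TOT': [1.5, 2.5]})\n",
    "\n",
    "with tempfile.TemporaryDirectory() as folder:\n",
    "    path = os.path.join(folder, 'toy.dta')\n",
    "    # write_index=False keeps the row index out of the file\n",
    "    toy.to_stata(path, write_index=False)\n",
    "    # Read the file back into a DataFrame\n",
    "    round_trip = pd.read_stata(path)\n",
    "\n",
    "print(round_trip)\n",
    "```\n",
    "\n",
    "`pd.read_excel` follows the same pattern, but it needs an extra Excel engine such as `openpyxl` to be installed."
   ]
  },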
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "972d2a49",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}