{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Normalisierung und Vorverarbeitung\n",
    "\n",
    "[sklearn.preprocessing](https://scikit-learn.org/stable/modules/preprocessing.html) kann in vielfältiger Form verwendet werden um Daten zu bereinigen:\n",
    "\n",
    "* Standardisierung mit [StandardScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html), [MinMaxScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html), [MaxAbsScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html) oder [RobustScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.RobustScaler.html).\n",
    "* Zentrierung von Kernel-Matrizen mit [KernelCenterer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.KernelCenterer.html).\n",
    "* Nicht-lineare Transformationen mit [QuantileTransformer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.QuantileTransformer.html), [PowerTransformer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PowerTransformer.html)\n",
    "* Normalisierung mit [normalize](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.normalize.html).\n",
    "* Kodierung kategorischer Merkmale mit [OrdinalEncoder](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OrdinalEncoder.html), [OneHotEncoder](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html).\n",
    "* [Diskretisierung](https://de.wikipedia.org/wiki/Diskretisierung) (auch bekannt als Quantisierung oder Binning) mit [KBinsDiscretizer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.KBinsDiscretizer.html).\n",
    "* Binarisierung von Merkmalen mit [Binarizer](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Binarizer.html)\n",
    "* Zurechnen (engl.: _Imputation_) fehlender Werte mit [SimpleImputer](https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html), [IterativeImputer](https://scikit-learn.org/stable/modules/generated/sklearn.impute.IterativeImputer.html) oder [KNNImputer](https://scikit-learn.org/stable/modules/generated/sklearn.impute.KNNImputer.html) wobei die zugerechneten Werte mit [MissingIndicator](https://scikit-learn.org/stable/modules/generated/sklearn.impute.MissingIndicator.html) markiert werden können.\n",
    "\n",
    "<div class=\"alert alert-block alert-info\">\n",
    "\n",
    "**Siehe auch**\n",
    "\n",
    "* [statsmodels](https://www.statsmodels.org/stable/index.html)\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Beispiel\n",
    "\n",
    "Im folgenden Beispiel füllen wir Mittelwerte auf und nehmen einige Skalierungen vor:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. Importe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from datetime import datetime\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "from sklearn import preprocessing\n",
    "from sklearn.impute import SimpleImputer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "hvac = pd.read_csv(\"https://raw.githubusercontent.com/kjam/data-cleaning-101/master/data/HVAC_with_nulls.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Datenqualität überprüfen\n",
    "\n",
    "Datentypen mit [pandas.DataFrame.dtypes](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dtypes.html) anzeigen"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Date           object\n",
       "Time           object\n",
       "TargetTemp    float64\n",
       "ActualTemp      int64\n",
       "System          int64\n",
       "SystemAge     float64\n",
       "BuildingID      int64\n",
       "10            float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hvac.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Dimensionen das DataFrame mit [pandas.DataFrame.shape](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.shape.html) als Tupel zurückgeben"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(8000, 8)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hvac.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Erste *n* Zeilen mit [pandas.DataFrame.head](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.head.html) zurückgeben"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Date</th>\n",
       "      <th>Time</th>\n",
       "      <th>TargetTemp</th>\n",
       "      <th>ActualTemp</th>\n",
       "      <th>System</th>\n",
       "      <th>SystemAge</th>\n",
       "      <th>BuildingID</th>\n",
       "      <th>10</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>6/1/13</td>\n",
       "      <td>0:00:01</td>\n",
       "      <td>66.0</td>\n",
       "      <td>58</td>\n",
       "      <td>13</td>\n",
       "      <td>20.0</td>\n",
       "      <td>4</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>6/2/13</td>\n",
       "      <td>1:00:01</td>\n",
       "      <td>NaN</td>\n",
       "      <td>68</td>\n",
       "      <td>3</td>\n",
       "      <td>20.0</td>\n",
       "      <td>17</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>6/3/13</td>\n",
       "      <td>2:00:01</td>\n",
       "      <td>70.0</td>\n",
       "      <td>73</td>\n",
       "      <td>17</td>\n",
       "      <td>20.0</td>\n",
       "      <td>18</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>6/4/13</td>\n",
       "      <td>3:00:01</td>\n",
       "      <td>67.0</td>\n",
       "      <td>63</td>\n",
       "      <td>2</td>\n",
       "      <td>NaN</td>\n",
       "      <td>15</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>6/5/13</td>\n",
       "      <td>4:00:01</td>\n",
       "      <td>68.0</td>\n",
       "      <td>74</td>\n",
       "      <td>16</td>\n",
       "      <td>9.0</td>\n",
       "      <td>3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Date     Time  TargetTemp  ActualTemp  System  SystemAge  BuildingID  10\n",
       "0  6/1/13  0:00:01        66.0          58      13       20.0           4 NaN\n",
       "1  6/2/13  1:00:01         NaN          68       3       20.0          17 NaN\n",
       "2  6/3/13  2:00:01        70.0          73      17       20.0          18 NaN\n",
       "3  6/4/13  3:00:01        67.0          63       2        NaN          15 NaN\n",
       "4  6/5/13  4:00:01        68.0          74      16        9.0           3 NaN"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hvac.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. Fehlenden Werten den Mittelwert zuschreiben\n",
    "\n",
    "Hierzu verwenden wir die `mean`-Strategie von [sklearn.impute.SimpleImputer](https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "imp = SimpleImputer(missing_values=np.nan, strategy=\"mean\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "hvac_numeric = hvac[[\"TargetTemp\", \"SystemAge\"]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "imp = imp.fit(hvac_numeric.loc[:10])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Weiter Infos zu `fit` erhaltet ihr in der [Scikit Learn-Dokumentation](https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html#sklearn.impute.SimpleImputer.fit).\n",
    "\n",
    "[fit_transform](https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html#sklearn.impute.SimpleImputer.fit_transform) wandelt dann die angepassten Daten um:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "transformed = imp.fit_transform(hvac_numeric)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[66.        , 20.        ],\n",
       "       [67.50773481, 20.        ],\n",
       "       [70.        , 20.        ],\n",
       "       ...,\n",
       "       [67.50773481,  4.        ],\n",
       "       [65.        , 23.        ],\n",
       "       [66.        , 21.        ]])"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "transformed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "hvac[\"TargetTemp\"], hvac[\"SystemAge\"] = transformed[:,0], transformed[:,1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Nun lassen wir uns die ersten Zeilen mit den geänderten Datensätzen anzeigen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Date</th>\n",
       "      <th>Time</th>\n",
       "      <th>TargetTemp</th>\n",
       "      <th>ActualTemp</th>\n",
       "      <th>System</th>\n",
       "      <th>SystemAge</th>\n",
       "      <th>BuildingID</th>\n",
       "      <th>10</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>6/1/13</td>\n",
       "      <td>0:00:01</td>\n",
       "      <td>66.000000</td>\n",
       "      <td>58</td>\n",
       "      <td>13</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>4</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>6/2/13</td>\n",
       "      <td>1:00:01</td>\n",
       "      <td>67.507735</td>\n",
       "      <td>68</td>\n",
       "      <td>3</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>17</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>6/3/13</td>\n",
       "      <td>2:00:01</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>73</td>\n",
       "      <td>17</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>18</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>6/4/13</td>\n",
       "      <td>3:00:01</td>\n",
       "      <td>67.000000</td>\n",
       "      <td>63</td>\n",
       "      <td>2</td>\n",
       "      <td>15.386643</td>\n",
       "      <td>15</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>6/5/13</td>\n",
       "      <td>4:00:01</td>\n",
       "      <td>68.000000</td>\n",
       "      <td>74</td>\n",
       "      <td>16</td>\n",
       "      <td>9.000000</td>\n",
       "      <td>3</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Date     Time  TargetTemp  ActualTemp  System  SystemAge  BuildingID  10\n",
       "0  6/1/13  0:00:01   66.000000          58      13  20.000000           4 NaN\n",
       "1  6/2/13  1:00:01   67.507735          68       3  20.000000          17 NaN\n",
       "2  6/3/13  2:00:01   70.000000          73      17  20.000000          18 NaN\n",
       "3  6/4/13  3:00:01   67.000000          63       2  15.386643          15 NaN\n",
       "4  6/5/13  4:00:01   68.000000          74      16   9.000000           3 NaN"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hvac.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. Skalieren\n",
    "\n",
    "Zur Standardisierung von Datensätzen, die wie standardnormalverteilte Daten aussehen, können wir [sklearn.preprocessing.scale](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.scale.html) verwenden. Damit lassen sich die Faktoren ermitteln, um die ein Wert sich vergrößert oder verkleinert. Dies können wir für die Skalierung der aktuellen Temperatur verwenden."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "hvac[\"ScaledTemp\"] = preprocessing.scale(hvac[\"ActualTemp\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0   -1.293272\n",
       "1    0.048732\n",
       "2    0.719733\n",
       "3   -0.622270\n",
       "4    0.853934\n",
       "Name: ScaledTemp, dtype: float64"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hvac[\"ScaledTemp\"].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[sklearn.preprocessing.MinMaxScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html) skaliert die Mermale so, dass sie zwischen einem bestimmten Minimal- und Maximalwert liegen, häufig zwischen Null und Eins. Dies hat den Vorteil, dass die Skalierung robuster gegenüber sehr kleinen Standardabweichungen von Merkmalen wird."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "min_max_scaler = preprocessing.MinMaxScaler()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "temp_minmax = min_max_scaler.fit_transform(hvac[[\"ActualTemp\"]])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0.12],\n",
       "       [0.52],\n",
       "       [0.72],\n",
       "       ...,\n",
       "       [0.56],\n",
       "       [0.32],\n",
       "       [0.44]])"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "temp_minmax"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Nun fügen wir auch `temp_minmax` noch als neue Spalte ein:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    0.12\n",
       "1    0.52\n",
       "2    0.72\n",
       "3    0.32\n",
       "4    0.76\n",
       "Name: MinMaxScaledTemp, dtype: float64"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hvac[\"MinMaxScaledTemp\"] = temp_minmax[:,0]\n",
    "hvac[\"MinMaxScaledTemp\"].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Date</th>\n",
       "      <th>Time</th>\n",
       "      <th>TargetTemp</th>\n",
       "      <th>ActualTemp</th>\n",
       "      <th>System</th>\n",
       "      <th>SystemAge</th>\n",
       "      <th>BuildingID</th>\n",
       "      <th>10</th>\n",
       "      <th>ScaledTemp</th>\n",
       "      <th>MinMaxScaledTemp</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>6/1/13</td>\n",
       "      <td>0:00:01</td>\n",
       "      <td>66.000000</td>\n",
       "      <td>58</td>\n",
       "      <td>13</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-1.293272</td>\n",
       "      <td>0.12</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>6/2/13</td>\n",
       "      <td>1:00:01</td>\n",
       "      <td>67.507735</td>\n",
       "      <td>68</td>\n",
       "      <td>3</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>17</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.048732</td>\n",
       "      <td>0.52</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>6/3/13</td>\n",
       "      <td>2:00:01</td>\n",
       "      <td>70.000000</td>\n",
       "      <td>73</td>\n",
       "      <td>17</td>\n",
       "      <td>20.000000</td>\n",
       "      <td>18</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.719733</td>\n",
       "      <td>0.72</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>6/4/13</td>\n",
       "      <td>3:00:01</td>\n",
       "      <td>67.000000</td>\n",
       "      <td>63</td>\n",
       "      <td>2</td>\n",
       "      <td>15.386643</td>\n",
       "      <td>15</td>\n",
       "      <td>NaN</td>\n",
       "      <td>-0.622270</td>\n",
       "      <td>0.32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>6/5/13</td>\n",
       "      <td>4:00:01</td>\n",
       "      <td>68.000000</td>\n",
       "      <td>74</td>\n",
       "      <td>16</td>\n",
       "      <td>9.000000</td>\n",
       "      <td>3</td>\n",
       "      <td>NaN</td>\n",
       "      <td>0.853934</td>\n",
       "      <td>0.76</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Date     Time  TargetTemp  ActualTemp  System  SystemAge  BuildingID  10  \\\n",
       "0  6/1/13  0:00:01   66.000000          58      13  20.000000           4 NaN   \n",
       "1  6/2/13  1:00:01   67.507735          68       3  20.000000          17 NaN   \n",
       "2  6/3/13  2:00:01   70.000000          73      17  20.000000          18 NaN   \n",
       "3  6/4/13  3:00:01   67.000000          63       2  15.386643          15 NaN   \n",
       "4  6/5/13  4:00:01   68.000000          74      16   9.000000           3 NaN   \n",
       "\n",
       "   ScaledTemp  MinMaxScaledTemp  \n",
       "0   -1.293272              0.12  \n",
       "1    0.048732              0.52  \n",
       "2    0.719733              0.72  \n",
       "3   -0.622270              0.32  \n",
       "4    0.853934              0.76  "
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hvac.head()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.11 Kernel",
   "language": "python",
   "name": "python311"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "autoclose": false,
   "autocomplete": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "hotkeys": {
    "equation": "Ctrl-E",
    "itemize": "Ctrl-I"
   },
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}