{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "b6d8e288",
   "metadata": {},
   "source": [
    "# Converting Python data structures to pandas\n",
    "\n",
    "Python data structures such as lists and arrays can be converted to pandas [Series](#Series) or [DataFrames](#DataFrame)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "362d58a7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "11ba3983",
   "metadata": {},
   "source": [
    "## Series\n",
    "\n",
    "Python [lists](https://docs.python.org/3/tutorial/introduction.html#lists) can easily be converted to a pandas Series:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d86c6823",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0   -0.751442\n",
       "1    0.816935\n",
       "2   -0.272546\n",
       "3   -0.268295\n",
       "4   -0.296728\n",
       "5    0.176255\n",
       "6   -0.322612\n",
       "dtype: float64"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list1 = [\n",
    "    -0.751442,\n",
    "    0.816935,\n",
    "    -0.272546,\n",
    "    -0.268295,\n",
    "    -0.296728,\n",
    "    0.176255,\n",
    "    -0.322612,\n",
    "]\n",
    "\n",
    "pd.Series(list1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "29236df2",
   "metadata": {},
   "source": [
    "Several lists can also be turned into a single pandas Series by concatenating them first:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "50689bfc",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    -0.751442\n",
       "1     0.816935\n",
       "2    -0.272546\n",
       "3    -0.268295\n",
       "4    -0.296728\n",
       "5     0.176255\n",
       "6    -0.322612\n",
       "7    -0.029608\n",
       "8    -0.277982\n",
       "9     2.693057\n",
       "10   -0.850817\n",
       "11    0.783868\n",
       "12   -1.137835\n",
       "13   -0.617132\n",
       "dtype: float64"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "list2 = [\n",
    "    -0.029608,\n",
    "    -0.277982,\n",
    "    2.693057,\n",
    "    -0.850817,\n",
    "    0.783868,\n",
    "    -1.137835,\n",
    "    -0.617132,\n",
    "]\n",
    "\n",
    "pd.Series(list1 + list2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4103ed20",
   "metadata": {},
   "source": [
    "A list can also be passed as the index:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "6511d214",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2022-01-31   -0.751442\n",
       "2022-02-01    0.816935\n",
       "2022-02-02   -0.272546\n",
       "2022-02-03   -0.268295\n",
       "2022-02-04   -0.296728\n",
       "2022-02-05    0.176255\n",
       "2022-02-06   -0.322612\n",
       "dtype: float64"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "date = [\n",
    "    \"2022-01-31\",\n",
    "    \"2022-02-01\",\n",
    "    \"2022-02-02\",\n",
    "    \"2022-02-03\",\n",
    "    \"2022-02-04\",\n",
    "    \"2022-02-05\",\n",
    "    \"2022-02-06\",\n",
    "]\n",
    "\n",
    "pd.Series(list1, index=date)"
   ]
  },
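  {
   "cell_type": "markdown",
   "id": "7c1e4a90",
   "metadata": {},
   "source": [
    "The introduction also mentions arrays. As a minimal sketch, a NumPy array can be passed to `pd.Series` in the same way as a list. The names `values`, `dates` and `series` and the shortened data are chosen here only for illustration:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f5b2d61",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "\n",
    "# Hypothetical, shortened sample data for this sketch\n",
    "values = np.array([-0.751442, 0.816935, -0.272546])\n",
    "dates = [\"2022-01-31\", \"2022-02-01\", \"2022-02-02\"]\n",
    "\n",
    "# The array supplies the values, the list supplies the index labels\n",
    "series = pd.Series(values, index=dates)\n",
    "series"
   ]
  },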
  {
   "cell_type": "markdown",
   "id": "6b2764b3",
   "metadata": {},
   "source": [
    "With a Python [dictionary](https://docs.python.org/3/tutorial/datastructures.html#dictionaries), you can pass not only the values but also the corresponding keys to a pandas Series; the keys become the index:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "cd74ecdd",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2022-01-31   -0.751442\n",
       "2022-02-01    0.816935\n",
       "2022-02-02   -0.272546\n",
       "2022-02-03   -0.268295\n",
       "2022-02-04   -0.296728\n",
       "2022-02-05    0.176255\n",
       "2022-02-06   -0.322612\n",
       "dtype: float64"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dict1 = {\n",
    "    \"2022-01-31\": -0.751442,\n",
    "    \"2022-02-01\": 0.816935,\n",
    "    \"2022-02-02\": -0.272546,\n",
    "    \"2022-02-03\": -0.268295,\n",
    "    \"2022-02-04\": -0.296728,\n",
    "    \"2022-02-05\": 0.176255,\n",
    "    \"2022-02-06\": -0.322612,\n",
    "}\n",
    "\n",
    "pd.Series(dict1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8fcb9bf5",
   "metadata": {},
   "source": [
    "When you pass a `dict`, the index of the resulting pandas Series follows the order of the keys in the dict."
   ]
  },
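  {
   "cell_type": "markdown",
   "id": "e3a98c14",
   "metadata": {},
   "source": [
    "A small sketch of this behaviour, using a hypothetical dict `unordered` whose keys are deliberately not sorted. If a different order is needed, `sort_index()` or an explicit `index` argument are two options; labels that are missing from the dict are then filled with `NaN`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d6072bf",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "\n",
    "# Hypothetical example data with unsorted keys\n",
    "unordered = {\"b\": 2.0, \"a\": 1.0, \"c\": 3.0}\n",
    "\n",
    "# The index follows the insertion order of the dict keys\n",
    "s = pd.Series(unordered)\n",
    "\n",
    "# Sort the entries by their index labels\n",
    "s_sorted = s.sort_index()\n",
    "\n",
    "# An explicit index selects and reorders by label; \"d\" is not a key, so it becomes NaN\n",
    "s_selected = pd.Series(unordered, index=[\"c\", \"a\", \"d\"])\n",
    "s_selected"
   ]
  },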
  {
   "cell_type": "markdown",
   "id": "357db634",
   "metadata": {},
   "source": [
    "With [collections.ChainMap](https://docs.python.org/3/library/collections.html#collections.ChainMap), you can also turn several dicts into a single pandas Series.\n",
    "\n",
    "First, we define a second dict for this purpose:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "e2d2be25",
   "metadata": {},
   "outputs": [],
   "source": [
    "dict2 = {\n",
    "    \"2022-02-07\": -0.029608,\n",
    "    \"2022-02-08\": -0.277982,\n",
    "    \"2022-02-09\": 2.693057,\n",
    "    \"2022-02-10\": -0.850817,\n",
    "    \"2022-02-11\": 0.783868,\n",
    "    \"2022-02-12\": -1.137835,\n",
    "    \"2022-02-13\": -0.617132,\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "36514ba2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2022-02-07   -0.029608\n",
       "2022-02-08   -0.277982\n",
       "2022-02-09    2.693057\n",
       "2022-02-10   -0.850817\n",
       "2022-02-11    0.783868\n",
       "2022-02-12   -1.137835\n",
       "2022-02-13   -0.617132\n",
       "2022-01-31   -0.751442\n",
       "2022-02-01    0.816935\n",
       "2022-02-02   -0.272546\n",
       "2022-02-03   -0.268295\n",
       "2022-02-04   -0.296728\n",
       "2022-02-05    0.176255\n",
       "2022-02-06   -0.322612\n",
       "dtype: float64"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from collections import ChainMap\n",
    "\n",
    "\n",
    "pd.Series(ChainMap(dict1, dict2))"
   ]
  },
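  {
   "cell_type": "markdown",
   "id": "b8f1e5a3",
   "metadata": {},
   "source": [
    "Two details are worth checking when chaining dicts: in the output above, the entries of `dict2` come before those of `dict1`, and if a key occurs in several dicts, `ChainMap` returns the value from the first mapping that contains it. The following sketch uses two small hypothetical dicts, `first` and `second`, and contrasts this with merging through dict unpacking, where later dicts overwrite earlier ones:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "92c47d0e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import ChainMap\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "\n",
    "# Hypothetical dicts that share the key \"2022-02-01\"\n",
    "first = {\"2022-01-31\": 1.0, \"2022-02-01\": 2.0}\n",
    "second = {\"2022-02-01\": 20.0, \"2022-02-02\": 3.0}\n",
    "\n",
    "# ChainMap: for a duplicate key, the value from the first mapping is used\n",
    "chained = pd.Series(ChainMap(first, second))\n",
    "\n",
    "# Dict unpacking: later dicts overwrite earlier ones, key order starts with `first`\n",
    "merged = pd.Series({**first, **second})\n",
    "\n",
    "chained[\"2022-02-01\"], merged[\"2022-02-01\"]"
   ]
  },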
  {
   "cell_type": "markdown",
   "id": "faf5f65b",
   "metadata": {},
   "source": [
    "## DataFrame\n",
    "\n",
    "A list of Python lists can be loaded into a pandas DataFrame, with each inner list becoming one row:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "db421b4d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>-0.751442</td>\n",
       "      <td>0.816935</td>\n",
       "      <td>-0.272546</td>\n",
       "      <td>-0.268295</td>\n",
       "      <td>-0.296728</td>\n",
       "      <td>0.176255</td>\n",
       "      <td>-0.322612</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>-0.029608</td>\n",
       "      <td>-0.277982</td>\n",
       "      <td>2.693057</td>\n",
       "      <td>-0.850817</td>\n",
       "      <td>0.783868</td>\n",
       "      <td>-1.137835</td>\n",
       "      <td>-0.617132</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          0         1         2         3         4         5         6\n",
       "0 -0.751442  0.816935 -0.272546 -0.268295 -0.296728  0.176255 -0.322612\n",
       "1 -0.029608 -0.277982  2.693057 -0.850817  0.783868 -1.137835 -0.617132"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.DataFrame([list1, list2])\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b935eaab",
   "metadata": {},
   "source": [
    "You can also pass a list as the DataFrame index:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "641b64a4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "      <th>4</th>\n",
       "      <th>5</th>\n",
       "      <th>6</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2022-01-31</th>\n",
       "      <td>-0.751442</td>\n",
       "      <td>0.816935</td>\n",
       "      <td>-0.272546</td>\n",
       "      <td>-0.268295</td>\n",
       "      <td>-0.296728</td>\n",
       "      <td>0.176255</td>\n",
       "      <td>-0.322612</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2022-02-01</th>\n",
       "      <td>-0.029608</td>\n",
       "      <td>-0.277982</td>\n",
       "      <td>2.693057</td>\n",
       "      <td>-0.850817</td>\n",
       "      <td>0.783868</td>\n",
       "      <td>-1.137835</td>\n",
       "      <td>-0.617132</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   0         1         2         3         4         5  \\\n",
       "2022-01-31 -0.751442  0.816935 -0.272546 -0.268295 -0.296728  0.176255   \n",
       "2022-02-01 -0.029608 -0.277982  2.693057 -0.850817  0.783868 -1.137835   \n",
       "\n",
       "                   6  \n",
       "2022-01-31 -0.322612  \n",
       "2022-02-01 -0.617132  "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame([list1, list2], index=[\"2022-01-31\", \"2022-02-01\"])"
   ]
  },
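  {
   "cell_type": "markdown",
   "id": "5ae0139c",
   "metadata": {},
   "source": [
    "In the same way, column labels can be set with the `columns` argument. If the lists are meant to be columns rather than rows, one option is to build a dict from them first. A minimal sketch with hypothetical, shortened data and illustrative labels:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c6d3b784",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "\n",
    "# Hypothetical, shortened sample data for this sketch\n",
    "rows = [[-0.751442, 0.816935, -0.272546], [-0.029608, -0.277982, 2.693057]]\n",
    "\n",
    "# Each inner list becomes a row; index and columns label both axes\n",
    "df_rows = pd.DataFrame(\n",
    "    rows, index=[\"2022-01-31\", \"2022-02-01\"], columns=[\"a\", \"b\", \"c\"]\n",
    ")\n",
    "\n",
    "# Each inner list becomes a column instead\n",
    "df_cols = pd.DataFrame({\"list1\": rows[0], \"list2\": rows[1]})\n",
    "df_cols"
   ]
  },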
  {
   "cell_type": "markdown",
   "id": "3279218a",
   "metadata": {},
   "source": [
    "A pandas DataFrame can be created from a dict whose values are lists; each key becomes a column:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "45527e34",
   "metadata": {},
   "outputs": [],
   "source": [
    "data = {\n",
    "    \"Code\": [\"U+0000\", \"U+0001\", \"U+0002\", \"U+0003\", \"U+0004\", \"U+0005\"],\n",
    "    \"Decimal\": [0, 1, 2, 3, 4, 5],\n",
    "    \"Octal\": [\"000\", \"001\", \"002\", \"003\", \"004\", \"005\"],\n",
    "    \"Key\": [\"NUL\", \"Ctrl-A\", \"Ctrl-B\", \"Ctrl-C\", \"Ctrl-D\", \"Ctrl-E\"],\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "76476364",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Code</th>\n",
       "      <th>Decimal</th>\n",
       "      <th>Octal</th>\n",
       "      <th>Key</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>U+0000</td>\n",
       "      <td>0</td>\n",
       "      <td>000</td>\n",
       "      <td>NUL</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>U+0001</td>\n",
       "      <td>1</td>\n",
       "      <td>001</td>\n",
       "      <td>Ctrl-A</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>U+0002</td>\n",
       "      <td>2</td>\n",
       "      <td>002</td>\n",
       "      <td>Ctrl-B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>U+0003</td>\n",
       "      <td>3</td>\n",
       "      <td>003</td>\n",
       "      <td>Ctrl-C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>U+0004</td>\n",
       "      <td>4</td>\n",
       "      <td>004</td>\n",
       "      <td>Ctrl-D</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>U+0005</td>\n",
       "      <td>5</td>\n",
       "      <td>005</td>\n",
       "      <td>Ctrl-E</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Code  Decimal Octal     Key\n",
       "0  U+0000        0   000     NUL\n",
       "1  U+0001        1   001  Ctrl-A\n",
       "2  U+0002        2   002  Ctrl-B\n",
       "3  U+0003        3   003  Ctrl-C\n",
       "4  U+0004        4   004  Ctrl-D\n",
       "5  U+0005        5   005  Ctrl-E"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame(data)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da3c83b2",
   "metadata": {},
   "source": [
    "Another common form of data is a nested dict of dicts; the outer keys become the columns and the inner keys the row index:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "56028e86",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>U+0006</th>\n",
       "      <th>U+0007</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Decimal</th>\n",
       "      <td>6</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Octal</th>\n",
       "      <td>006</td>\n",
       "      <td>007</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Key</th>\n",
       "      <td>Ctrl-F</td>\n",
       "      <td>Ctrl-G</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         U+0006  U+0007\n",
       "Decimal       6       7\n",
       "Octal       006     007\n",
       "Key      Ctrl-F  Ctrl-G"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data2 = {\n",
    "    \"U+0006\": {\"Decimal\": \"6\", \"Octal\": \"006\", \"Key\": \"Ctrl-F\"},\n",
    "    \"U+0007\": {\"Decimal\": \"7\", \"Octal\": \"007\", \"Key\": \"Ctrl-G\"},\n",
    "}\n",
    "\n",
    "df2 = pd.DataFrame(data2)\n",
    "\n",
    "df2"
   ]
  },
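  {
   "cell_type": "markdown",
   "id": "1b9f6e25",
   "metadata": {},
   "source": [
    "If the outer keys should become rows instead of columns, `pd.DataFrame.from_dict` with `orient=\"index\"` is one option. The sketch below repeats the data under the hypothetical name `nested` and also converts the `Decimal` strings to integers; whether that conversion is wanted depends on the use case:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f04a8d67",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "\n",
    "# Same structure as above, repeated so that this cell is self-contained\n",
    "nested = {\n",
    "    \"U+0006\": {\"Decimal\": \"6\", \"Octal\": \"006\", \"Key\": \"Ctrl-F\"},\n",
    "    \"U+0007\": {\"Decimal\": \"7\", \"Octal\": \"007\", \"Key\": \"Ctrl-G\"},\n",
    "}\n",
    "\n",
    "# Outer keys become the row index, inner keys the columns\n",
    "by_row = pd.DataFrame.from_dict(nested, orient=\"index\")\n",
    "\n",
    "# The values were given as strings; convert the Decimal column to integers\n",
    "by_row[\"Decimal\"] = by_row[\"Decimal\"].astype(int)\n",
    "by_row"
   ]
  },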
  {
   "cell_type": "markdown",
   "id": "5ab4b3a9",
   "metadata": {},
   "source": [
    "Dicts of Series are treated in much the same way:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "532b4f28",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>U+0006</th>\n",
       "      <th>U+0007</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Key</th>\n",
       "      <td>Ctrl-F</td>\n",
       "      <td>Ctrl-G</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     U+0006  U+0007\n",
       "Key  Ctrl-F  Ctrl-G"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data3 = {\"U+0006\": df2[\"U+0006\"].iloc[2:], \"U+0007\": df2[\"U+0007\"].iloc[2:]}\n",
    "\n",
    "pd.DataFrame(data3)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.13 Kernel",
   "language": "python",
   "name": "python313"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.0"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}