{ "cells": [ { "cell_type": "markdown", "id": "4f45ee46", "metadata": {}, "source": [ "# Indexing\n", "\n", "## Index objects\n", "\n", "The index objects in pandas hold the axis labels and other metadata such as the axis name. Any array or other sequence of labels that you use when constructing a Series or DataFrame is converted internally into an index:" ] },
{ "cell_type": "code", "execution_count": 1, "id": "ad72f627", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:11.707113Z", "iopub.status.busy": "2026-05-21T14:10:11.706632Z", "iopub.status.idle": "2026-05-21T14:10:11.948345Z", "shell.execute_reply": "2026-05-21T14:10:11.947614Z", "shell.execute_reply.started": "2026-05-21T14:10:11.707091Z" } }, "outputs": [], "source": [ "import pandas as pd\n", "\n", "\n", "obj = pd.Series(range(7), index=pd.date_range(\"2022-02-02\", periods=7))" ] },
{ "cell_type": "code", "execution_count": 2, "id": "8bd593cb", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:11.948943Z", "iopub.status.busy": "2026-05-21T14:10:11.948801Z", "iopub.status.idle": "2026-05-21T14:10:11.954248Z", "shell.execute_reply": "2026-05-21T14:10:11.953934Z", "shell.execute_reply.started": "2026-05-21T14:10:11.948934Z" } }, "outputs": [ { "data": { "text/plain": [ "DatetimeIndex(['2022-02-02', '2022-02-03', '2022-02-04', '2022-02-05',\n", "               '2022-02-06', '2022-02-07', '2022-02-08'],\n", "              dtype='datetime64[ns]', freq='D')" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obj.index" ] },
{ "cell_type": "code", "execution_count": 3, "id": "1d4f826d", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:11.954657Z", "iopub.status.busy": "2026-05-21T14:10:11.954580Z", "iopub.status.idle": "2026-05-21T14:10:11.957289Z", "shell.execute_reply": "2026-05-21T14:10:11.957061Z", "shell.execute_reply.started": "2026-05-21T14:10:11.954649Z"
} }, "outputs": [ { "data": { "text/plain": [ "DatetimeIndex(['2022-02-05', '2022-02-06', '2022-02-07', '2022-02-08'], dtype='datetime64[ns]', freq='D')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obj.index[3:]" ] },
{ "cell_type": "markdown", "id": "7fbcdc7b", "metadata": {}, "source": [ "Index objects are immutable and therefore cannot be modified:" ] },
{ "cell_type": "code", "execution_count": 4, "id": "9c4b3d00", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:11.957623Z", "iopub.status.busy": "2026-05-21T14:10:11.957565Z", "iopub.status.idle": "2026-05-21T14:10:12.166156Z", "shell.execute_reply": "2026-05-21T14:10:12.161642Z", "shell.execute_reply.started": "2026-05-21T14:10:11.957617Z" } }, "outputs": [ { "ename": "TypeError", "evalue": "Index does not support mutable operations", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[4], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mobj\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mindex\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m]\u001b[49m \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124m2022-02-03\u001b[39m\u001b[38;5;124m\"\u001b[39m\n", "File \u001b[0;32m~/cusy/trn/jupyter-tutorial/uvenvs/py313/.venv/lib/python3.13/site-packages/pandas/core/indexes/base.py:5371\u001b[0m, in \u001b[0;36mIndex.__setitem__\u001b[0;34m(self, key, value)\u001b[0m\n\u001b[1;32m 5369\u001b[0m \u001b[38;5;129m@final\u001b[39m\n\u001b[1;32m 5370\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21m__setitem__\u001b[39m(\u001b[38;5;28mself\u001b[39m, key, value) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m:\n\u001b[0;32m-> 5371\u001b[0m 
\u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mIndex does not support mutable operations\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n", "\u001b[0;31mTypeError\u001b[0m: Index does not support mutable operations" ] } ], "source": [ "obj.index[1] = \"2022-02-03\"" ] },
{ "cell_type": "markdown", "id": "2f9014ac", "metadata": {}, "source": [ "Immutability makes it safer to share index objects between data structures:" ] },
{ "cell_type": "code", "execution_count": 5, "id": "29d98eac", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.095302Z", "iopub.status.busy": "2026-05-21T14:10:22.094902Z", "iopub.status.idle": "2026-05-21T14:10:22.102808Z", "shell.execute_reply": "2026-05-21T14:10:22.102066Z", "shell.execute_reply.started": "2026-05-21T14:10:22.095280Z" } }, "outputs": [ { "data": { "text/plain": [ "Index([0, 1, 2], dtype='int64')" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "\n", "\n", "labels = pd.Index(np.arange(3))\n", "\n", "labels" ] },
{ "cell_type": "code", "execution_count": 6, "id": "05e929b2", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.104040Z", "iopub.status.busy": "2026-05-21T14:10:22.103829Z", "iopub.status.idle": "2026-05-21T14:10:22.107819Z", "shell.execute_reply": "2026-05-21T14:10:22.107306Z", "shell.execute_reply.started": "2026-05-21T14:10:22.104022Z" } }, "outputs": [], "source": [ "rng = np.random.default_rng()\n", "obj2 = pd.Series(rng.normal(size=3), index=labels)" ] },
{ "cell_type": "code", "execution_count": 7, "id": "51d97072", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.108815Z", "iopub.status.busy": "2026-05-21T14:10:22.108575Z", "iopub.status.idle": "2026-05-21T14:10:22.114281Z", "shell.execute_reply": "2026-05-21T14:10:22.113687Z", "shell.execute_reply.started": 
"2026-05-21T14:10:22.108800Z" } }, "outputs": [ { "data": { "text/plain": [ "0    1.320564\n", "1    0.750047\n", "2    1.963873\n", "dtype: float64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obj2" ] },
{ "cell_type": "code", "execution_count": 8, "id": "f5c147e3", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.116319Z", "iopub.status.busy": "2026-05-21T14:10:22.116099Z", "iopub.status.idle": "2026-05-21T14:10:22.119600Z", "shell.execute_reply": "2026-05-21T14:10:22.119140Z", "shell.execute_reply.started": "2026-05-21T14:10:22.116304Z" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obj2.index is labels" ] },
{ "cell_type": "markdown", "id": "0ab8314c", "metadata": {}, "source": [ "In addition to being array-like, an index also behaves like a fixed-size set:" ] },
{ "cell_type": "code", "execution_count": 9, "id": "b1c05fb5", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.120046Z", "iopub.status.busy": "2026-05-21T14:10:22.119970Z", "iopub.status.idle": "2026-05-21T14:10:22.122483Z", "shell.execute_reply": "2026-05-21T14:10:22.122100Z", "shell.execute_reply.started": "2026-05-21T14:10:22.120038Z" } }, "outputs": [], "source": [ "data1 = {\n", "    \"Code\": [\"U+0000\", \"U+0001\", \"U+0002\", \"U+0003\", \"U+0004\", \"U+0005\"],\n", "    \"Decimal\": [0, 1, 2, 3, 4, 5],\n", "    \"Octal\": [\"001\", \"002\", \"003\", \"004\", \"004\", \"005\"],\n", "}\n", "df1 = pd.DataFrame(data1)" ] },
{ "cell_type": "code", "execution_count": 10, "id": "f18e59bd", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.123048Z", "iopub.status.busy": "2026-05-21T14:10:22.122948Z", "iopub.status.idle": "2026-05-21T14:10:22.127418Z", "shell.execute_reply": "2026-05-21T14:10:22.126895Z", "shell.execute_reply.started": "2026-05-21T14:10:22.123040Z" } }, "outputs": [ { 
"data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>Code</th><th>Decimal</th><th>Octal</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>U+0000</td><td>0</td><td>001</td></tr>\n",
"<tr><th>1</th><td>U+0001</td><td>1</td><td>002</td></tr>\n",
"<tr><th>2</th><td>U+0002</td><td>2</td><td>003</td></tr>\n",
"<tr><th>3</th><td>U+0003</td><td>3</td><td>004</td></tr>\n",
"<tr><th>4</th><td>U+0004</td><td>4</td><td>004</td></tr>\n",
"<tr><th>5</th><td>U+0005</td><td>5</td><td>005</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "     Code  Decimal Octal\n", "0  U+0000        0   001\n", "1  U+0001        1   002\n", "2  U+0002        2   003\n", "3  U+0003        3   004\n", "4  U+0004        4   004\n", "5  U+0005        5   005" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1" ] },
{ "cell_type": "code", "execution_count": 11, "id": "1182c978", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.128174Z", "iopub.status.busy": "2026-05-21T14:10:22.127988Z", "iopub.status.idle": "2026-05-21T14:10:22.130464Z", "shell.execute_reply": "2026-05-21T14:10:22.130226Z", "shell.execute_reply.started": "2026-05-21T14:10:22.128162Z" } }, "outputs": [ { "data": { "text/plain": [ "Index(['Code', 'Decimal', 'Octal'], dtype='object')" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1.columns" ] },
{ "cell_type": "code", "execution_count": 12, "id": "2836c7ca", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.130878Z", "iopub.status.busy": "2026-05-21T14:10:22.130796Z", "iopub.status.idle": "2026-05-21T14:10:22.133062Z", "shell.execute_reply": "2026-05-21T14:10:22.132809Z", "shell.execute_reply.started": "2026-05-21T14:10:22.130870Z" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"Code\" in df1.columns" ] },
{ "cell_type": "code", "execution_count": 13, "id": "001941e1", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.133566Z", "iopub.status.busy": "2026-05-21T14:10:22.133435Z", "iopub.status.idle": "2026-05-21T14:10:22.135742Z", "shell.execute_reply": "2026-05-21T14:10:22.135310Z", "shell.execute_reply.started": "2026-05-21T14:10:22.133556Z" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"Key\" in df1.columns" ] },
{ "cell_type": "markdown", "id": "f3111d2d", "metadata": {}, "source": [ "## Axis indexes with duplicate labels\n", "\n", "Unlike Python sets, a pandas index can contain duplicate labels:" ] },
{ "cell_type": "code", "execution_count": 14, "id": "37496464", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.137652Z", "iopub.status.busy": "2026-05-21T14:10:22.137516Z", "iopub.status.idle": "2026-05-21T14:10:22.141243Z", "shell.execute_reply": "2026-05-21T14:10:22.141013Z", "shell.execute_reply.started": "2026-05-21T14:10:22.137641Z" }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>Code</th><th>Decimal</th><th>Octal</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>U+0000</td><td>0</td><td>001</td></tr>\n",
"<tr><th>1</th><td>U+0001</td><td>1</td><td>002</td></tr>\n",
"<tr><th>2</th><td>U+0002</td><td>2</td><td>003</td></tr>\n",
"<tr><th>3</th><td>U+0003</td><td>3</td><td>004</td></tr>\n",
"<tr><th>4</th><td>U+0004</td><td>4</td><td>004</td></tr>\n",
"<tr><th>5</th><td>U+0005</td><td>5</td><td>005</td></tr>\n",
"<tr><th>0</th><td>U+0006</td><td>6</td><td>006</td></tr>\n",
"<tr><th>1</th><td>U+0007</td><td>7</td><td>007</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "     Code  Decimal Octal\n", "0  U+0000        0   001\n", "1  U+0001        1   002\n", "2  U+0002        2   003\n", "3  U+0003        3   004\n", "4  U+0004        4   004\n", "5  U+0005        5   005\n", "0  U+0006        6   006\n", "1  U+0007        7   007" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data2 = {\n", "    \"Code\": [\"U+0006\", \"U+0007\"],\n", "    \"Decimal\": [6, 7],\n", "    \"Octal\": [\"006\", \"007\"],\n", "}\n", "df2 = pd.DataFrame(data2)\n", "df12 = pd.concat([df1, df2])\n", "\n", "df12" ] },
{ "cell_type": "markdown", "id": "36a5bfa8", "metadata": {}, "source": [ "[Selecting](select-filter.ipynb) a duplicated label returns all occurrences of that label:" ] },
{ "cell_type": "code", "execution_count": 15, "id": "95b47be1", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.141645Z", "iopub.status.busy": "2026-05-21T14:10:22.141576Z", "iopub.status.idle": "2026-05-21T14:10:22.144857Z", "shell.execute_reply": "2026-05-21T14:10:22.144640Z", "shell.execute_reply.started": "2026-05-21T14:10:22.141637Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>Code</th><th>Decimal</th><th>Octal</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>1</th><td>U+0001</td><td>1</td><td>002</td></tr>\n",
"<tr><th>1</th><td>U+0007</td><td>7</td><td>007</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "     Code  Decimal Octal\n", "1  U+0001        1   002\n", "1  U+0007        7   007" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df12.loc[1]" ] },
{ "cell_type": "markdown", "id": "e9855e3d", "metadata": {}, "source": [ "Data selection is one of the main things that behaves differently with duplicates. Indexing a label with multiple entries returns all matching entries (a DataFrame in the example above, a Series when indexing a Series), while a unique label returns a single row or a scalar value. This can make your code more complicated, since the output type of indexing can vary depending on whether a label is repeated or not. In addition, many pandas functions, such as `reindex`, require labels to be unique. You can use the `is_unique` property of the index to check whether its labels are unique:" ] },
{ "cell_type": "code", "execution_count": 16, "id": "ebf3d148", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.145335Z", "iopub.status.busy": "2026-05-21T14:10:22.145207Z", "iopub.status.idle": "2026-05-21T14:10:22.147543Z", "shell.execute_reply": "2026-05-21T14:10:22.147349Z", "shell.execute_reply.started": "2026-05-21T14:10:22.145324Z" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df12.index.is_unique" ] },
{ "cell_type": "markdown", "id": "f656d655", "metadata": {}, "source": [ "To avoid duplicate labels, you can, for example, pass `ignore_index=True` to `pd.concat`:" ] },
{ "cell_type": "code", "execution_count": 17, "id": "8146d059", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.147924Z", "iopub.status.busy": "2026-05-21T14:10:22.147848Z", "iopub.status.idle": "2026-05-21T14:10:22.151024Z", "shell.execute_reply": "2026-05-21T14:10:22.150746Z", "shell.execute_reply.started": "2026-05-21T14:10:22.147918Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>Code</th><th>Decimal</th><th>Octal</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>U+0000</td><td>0</td><td>001</td></tr>\n",
"<tr><th>1</th><td>U+0001</td><td>1</td><td>002</td></tr>\n",
"<tr><th>2</th><td>U+0002</td><td>2</td><td>003</td></tr>\n",
"<tr><th>3</th><td>U+0003</td><td>3</td><td>004</td></tr>\n",
"<tr><th>4</th><td>U+0004</td><td>4</td><td>004</td></tr>\n",
"<tr><th>5</th><td>U+0005</td><td>5</td><td>005</td></tr>\n",
"<tr><th>6</th><td>U+0006</td><td>6</td><td>006</td></tr>\n",
"<tr><th>7</th><td>U+0007</td><td>7</td><td>007</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "     Code  Decimal Octal\n", "0  U+0000        0   001\n", "1  U+0001        1   002\n", "2  U+0002        2   003\n", "3  U+0003        3   004\n", "4  U+0004        4   004\n", "5  U+0005        5   005\n", "6  U+0006        6   006\n", "7  U+0007        7   007" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df12 = pd.concat([df1, df2], ignore_index=True)\n", "\n", "df12" ] },
{ "cell_type": "markdown", "id": "e82888d3", "metadata": {}, "source": [ "## Some index methods and properties\n", "\n", "Each index has a number of methods and properties for set logic and for answering other common questions about the data it contains. Some useful methods and properties are:\n", "\n", "Method | Description\n", ":----- | :----------\n", "`append` | concatenates additional index objects, producing a new index\n", "`difference` | computes the set difference as an index\n", "`intersection` | computes the set intersection\n", "`union` | computes the set union\n", "`isin` | computes a boolean array indicating whether each value is contained in the passed collection\n", "`delete` | computes a new index with the element at position `i` deleted\n", "`drop` | computes a new index by deleting the passed values\n", "`insert` | computes a new index by inserting an element at position `i`\n", "`is_monotonic_increasing` | returns `True` if each element is greater than or equal to the previous element\n", "`is_unique` | returns `True` if the index has no duplicate values\n", "`unique` | computes the unique values in the index" ] },
{ "cell_type": "markdown", "id": "ae0c0c0e", "metadata": {}, "source": [ "## Reindexing with `reindex`" ] },
{ "cell_type": "markdown", "id": "04891784", "metadata": {}, "source": [ "An important method on pandas objects is reindexing, i.e. creating a new object with the values rearranged to match the new index. Consider, for example:" ] },
{ "cell_type": "code", "execution_count": 18, "id": "12a82bf7", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.151528Z", "iopub.status.busy": "2026-05-21T14:10:22.151449Z", "iopub.status.idle": "2026-05-21T14:10:22.153982Z", "shell.execute_reply": "2026-05-21T14:10:22.153595Z", "shell.execute_reply.started": "2026-05-21T14:10:22.151521Z" } }, "outputs": [], "source": [ "obj = pd.Series(range(7), index=pd.date_range(\"2022-02-02\", periods=7))" ] },
{ "cell_type": "code", "execution_count": 19, "id": "dd157c06", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.154343Z", "iopub.status.busy": "2026-05-21T14:10:22.154273Z", "iopub.status.idle": "2026-05-21T14:10:22.156701Z", "shell.execute_reply": "2026-05-21T14:10:22.156302Z", "shell.execute_reply.started": "2026-05-21T14:10:22.154336Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-02    0\n", "2022-02-03    1\n", "2022-02-04    2\n", "2022-02-05    3\n", "2022-02-06    4\n", "2022-02-07    5\n", "2022-02-08    6\n", "Freq: D, dtype: int64" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obj" ] },
{ "cell_type": "code", "execution_count": 20, "id": "50037fe8", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.157274Z", "iopub.status.busy": "2026-05-21T14:10:22.157203Z", "iopub.status.idle": "2026-05-21T14:10:22.159120Z", "shell.execute_reply": "2026-05-21T14:10:22.158804Z", "shell.execute_reply.started": "2026-05-21T14:10:22.157268Z" } }, "outputs": [], "source": [ "new_index = pd.date_range(\"2022-02-03\", periods=7)" ] },
{ "cell_type": "code", "execution_count": 21, "id": "e390296e", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.159635Z", "iopub.status.busy": "2026-05-21T14:10:22.159422Z", "iopub.status.idle": "2026-05-21T14:10:22.162731Z", "shell.execute_reply": "2026-05-21T14:10:22.162405Z", "shell.execute_reply.started": "2026-05-21T14:10:22.159611Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-03    1.0\n", "2022-02-04    2.0\n", "2022-02-05    3.0\n", "2022-02-06    4.0\n", "2022-02-07    5.0\n", "2022-02-08    6.0\n", "2022-02-09    NaN\n", "Freq: D, dtype: float64" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obj.reindex(new_index)" ] },
{ "cell_type": "markdown", "id": "d9c427cc", "metadata": {}, "source": [ "`reindex` creates a new object whose values are aligned to the new index. By default, labels in the new index that have no corresponding entry in the original object get `NaN`, which is why the integer values above were converted to `float64`." ] },
{ "cell_type": "markdown", "id": "6c0c4fe6", "metadata": {}, "source": [ "For ordered data such as time series, it can be desirable to interpolate or fill values when reindexing. The `method` option allows this with a method such as `ffill`, which fills values forward:" ] },
{ "cell_type": "code", "execution_count": 22, "id": "b09ae6bd", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.163204Z", "iopub.status.busy": "2026-05-21T14:10:22.163132Z", "iopub.status.idle": "2026-05-21T14:10:22.165724Z", "shell.execute_reply": "2026-05-21T14:10:22.165409Z", "shell.execute_reply.started": "2026-05-21T14:10:22.163197Z" } }, "outputs": [ { "data": { "text/plain": [ "2022-02-03    1\n", "2022-02-04    2\n", "2022-02-05    3\n", "2022-02-06    4\n", "2022-02-07    5\n", "2022-02-08    6\n", "2022-02-09    6\n", "Freq: D, dtype: int64" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "obj.reindex(new_index, method=\"ffill\")" ] },
{ "cell_type": "markdown", "id": "fed00121", "metadata": {}, "source": [ "For a DataFrame, `reindex` can change the (row) index, the columns, or both. If only a sequence is passed, the rows are reindexed in the result:" ] },
{ "cell_type": "code", "execution_count": 23, "id": "b0c0c5c7", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.166390Z", "iopub.status.busy": "2026-05-21T14:10:22.166216Z", "iopub.status.idle": "2026-05-21T14:10:22.169960Z", "shell.execute_reply": "2026-05-21T14:10:22.169686Z", "shell.execute_reply.started": "2026-05-21T14:10:22.166381Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>Code</th><th>Decimal</th><th>Octal</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>U+0000</td><td>0.0</td><td>001</td></tr>\n",
"<tr><th>1</th><td>U+0001</td><td>1.0</td><td>002</td></tr>\n",
"<tr><th>2</th><td>U+0002</td><td>2.0</td><td>003</td></tr>\n",
"<tr><th>3</th><td>U+0003</td><td>3.0</td><td>004</td></tr>\n",
"<tr><th>4</th><td>U+0004</td><td>4.0</td><td>004</td></tr>\n",
"<tr><th>5</th><td>U+0005</td><td>5.0</td><td>005</td></tr>\n",
"<tr><th>6</th><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "     Code  Decimal Octal\n", "0  U+0000      0.0   001\n", "1  U+0001      1.0   002\n", "2  U+0002      2.0   003\n", "3  U+0003      3.0   004\n", "4  U+0004      4.0   004\n", "5  U+0005      5.0   005\n", "6     NaN      NaN   NaN" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1.reindex(range(7))" ] },
{ "cell_type": "markdown", "id": "65df301d", "metadata": {}, "source": [ "The columns can be reindexed with the `columns` keyword:" ] },
{ "cell_type": "code", "execution_count": 24, "id": "e8837570", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.170646Z", "iopub.status.busy": "2026-05-21T14:10:22.170533Z", "iopub.status.idle": "2026-05-21T14:10:22.173932Z", "shell.execute_reply": "2026-05-21T14:10:22.173650Z", "shell.execute_reply.started": "2026-05-21T14:10:22.170635Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>Octal</th><th>Code</th><th>Description</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>001</td><td>U+0000</td><td>NaN</td></tr>\n",
"<tr><th>1</th><td>002</td><td>U+0001</td><td>NaN</td></tr>\n",
"<tr><th>2</th><td>003</td><td>U+0002</td><td>NaN</td></tr>\n",
"<tr><th>3</th><td>004</td><td>U+0003</td><td>NaN</td></tr>\n",
"<tr><th>4</th><td>004</td><td>U+0004</td><td>NaN</td></tr>\n",
"<tr><th>5</th><td>005</td><td>U+0005</td><td>NaN</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "  Octal    Code  Description\n", "0   001  U+0000          NaN\n", "1   002  U+0001          NaN\n", "2   003  U+0002          NaN\n", "3   004  U+0003          NaN\n", "4   004  U+0004          NaN\n", "5   005  U+0005          NaN" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "encoding = [\"Octal\", \"Code\", \"Description\"]\n", "\n", "df1.reindex(columns=encoding)" ] },
{ "cell_type": "markdown", "id": "170ad52f", "metadata": {}, "source": [ "### Arguments of the `reindex` method\n", "\n", "Argument | Description\n", ":------- | :----------\n", "`labels` | New sequence to use as the index. Can be an Index instance or any other sequence-like Python data structure. An Index is used exactly as it is, without being copied.\n", "`axis` | The axis to reindex, either `index` (rows) or `columns`. The default is `index`. Alternatively, you can use `reindex(index=new_labels)` or `reindex(columns=new_labels)`.\n", "`method` | Interpolation (fill) method; `ffill` fills forward, while `bfill` fills backward.\n", "`fill_value` | Substitute value to use when reindexing introduces missing data. The default is `NaN`, so labels that are missing get null values in the result unless you pass another value.\n", "`limit` | When forward or backward filling, the maximum number of consecutive elements to fill.\n", "`tolerance` | When forward or backward filling, the maximum size of the gap to fill for inexact matches.\n", "`level` | Match a simple index on a level of a `MultiIndex`; otherwise select a subset.\n", "`copy` | If `True`, always copy the underlying data, even if the new index is equivalent to the old index; if `False`, do not copy the data when the indexes are equivalent." 
] }, { "cell_type": "markdown", "id": "139a7059", "metadata": {}, "source": [ "## Renaming axis indexes\n", "\n", "The axis labels can be transformed by a function or a mapping to produce new, differently labelled objects. You can also modify the axes in place without creating a new data structure. Here is a simple example:" ] },
{ "cell_type": "code", "execution_count": 25, "id": "77e042a1", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.174378Z", "iopub.status.busy": "2026-05-21T14:10:22.174310Z", "iopub.status.idle": "2026-05-21T14:10:22.178374Z", "shell.execute_reply": "2026-05-21T14:10:22.177844Z", "shell.execute_reply.started": "2026-05-21T14:10:22.174371Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>1</th><th>2</th><th>3</th><th>4</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>Deutsch</th><td>0</td><td>1</td><td>2</td><td>3</td></tr>\n",
"<tr><th>English</th><td>4</td><td>5</td><td>6</td><td>7</td></tr>\n",
"<tr><th>Français</th><td>8</td><td>9</td><td>10</td><td>11</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "          1  2   3   4\n", "Deutsch   0  1   2   3\n", "English   4  5   6   7\n", "Français  8  9  10  11" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3 = pd.DataFrame(\n", "    np.arange(12).reshape((3, 4)),\n", "    index=[\"Deutsch\", \"English\", \"Français\"],\n", "    columns=[1, 2, 3, 4],\n", ")\n", "\n", "df3" ] },
{ "cell_type": "markdown", "id": "4fab6848", "metadata": {}, "source": [ "### Renaming axis indexes with `map`" ] },
{ "cell_type": "markdown", "id": "0a56efe9", "metadata": {}, "source": [ "Like a `Series`, the axis indexes have a `map` method:" ] },
{ "cell_type": "code", "execution_count": 26, "id": "499cd60e", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.178772Z", "iopub.status.busy": "2026-05-21T14:10:22.178686Z", "iopub.status.idle": "2026-05-21T14:10:22.181763Z", "shell.execute_reply": "2026-05-21T14:10:22.181466Z", "shell.execute_reply.started": "2026-05-21T14:10:22.178765Z" } }, "outputs": [ { "data": { "text/plain": [ "Index(['DE', 'EN', 'FR'], dtype='object')" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def transform(x):\n", "    return x[:2].upper()\n", "\n", "\n", "df3.index.map(transform)" ] },
{ "cell_type": "markdown", "id": "4ea516cd", "metadata": {}, "source": [ "You can assign the result to `index` and thereby modify the DataFrame in place:" ] },
{ "cell_type": "code", "execution_count": 27, "id": "420974c2", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.182401Z", "iopub.status.busy": "2026-05-21T14:10:22.182328Z", "iopub.status.idle": "2026-05-21T14:10:22.185810Z", "shell.execute_reply": "2026-05-21T14:10:22.185511Z", "shell.execute_reply.started": "2026-05-21T14:10:22.182394Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>1</th><th>2</th><th>3</th><th>4</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>DE</th><td>0</td><td>1</td><td>2</td><td>3</td></tr>\n",
"<tr><th>EN</th><td>4</td><td>5</td><td>6</td><td>7</td></tr>\n",
"<tr><th>FR</th><td>8</td><td>9</td><td>10</td><td>11</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "    1  2   3   4\n", "DE  0  1   2   3\n", "EN  4  5   6   7\n", "FR  8  9  10  11" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.index = df3.index.map(transform)\n", "\n", "df3" ] },
{ "cell_type": "markdown", "id": "01537fef", "metadata": {}, "source": [ "### Renaming axis indexes with `rename`" ] },
{ "cell_type": "markdown", "id": "52ff53e8", "metadata": {}, "source": [ "If you want to create a transformed version of your dataset without modifying the original, you can use `rename`:" ] },
{ "cell_type": "code", "execution_count": 28, "id": "787c59fc", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.186253Z", "iopub.status.busy": "2026-05-21T14:10:22.186171Z", "iopub.status.idle": "2026-05-21T14:10:22.189331Z", "shell.execute_reply": "2026-05-21T14:10:22.189076Z", "shell.execute_reply.started": "2026-05-21T14:10:22.186246Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>1</th><th>2</th><th>3</th><th>4</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>de</th><td>0</td><td>1</td><td>2</td><td>3</td></tr>\n",
"<tr><th>en</th><td>4</td><td>5</td><td>6</td><td>7</td></tr>\n",
"<tr><th>fr</th><td>8</td><td>9</td><td>10</td><td>11</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "    1  2   3   4\n", "de  0  1   2   3\n", "en  4  5   6   7\n", "fr  8  9  10  11" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.rename(index=str.lower)" ] },
{ "cell_type": "markdown", "id": "2d859fbb", "metadata": {}, "source": [ "In particular, `rename` can be used with a `dict`-like object that provides new values for a subset of the axis labels:" ] },
{ "cell_type": "code", "execution_count": 29, "id": "836238ff", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.189630Z", "iopub.status.busy": "2026-05-21T14:10:22.189566Z", "iopub.status.idle": "2026-05-21T14:10:22.192560Z", "shell.execute_reply": "2026-05-21T14:10:22.192267Z", "shell.execute_reply.started": "2026-05-21T14:10:22.189624Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"<thead><tr style=\"text-align: right;\"><th></th><th>0</th><th>1</th><th>2</th><th>3</th></tr></thead>\n",
"<tbody>\n",
"<tr><th>BE</th><td>0</td><td>1</td><td>2</td><td>3</td></tr>\n",
"<tr><th>DE</th><td>4</td><td>5</td><td>6</td><td>7</td></tr>\n",
"<tr><th>EN</th><td>8</td><td>9</td><td>10</td><td>11</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "    0  1   2   3\n", "BE  0  1   2   3\n", "DE  4  5   6   7\n", "EN  8  9  10  11" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.rename(\n", "    index={\"DE\": \"BE\", \"EN\": \"DE\", \"FR\": \"EN\"},\n", "    columns={1: 0, 2: 1, 3: 2, 4: 3},\n", ")" ] },
{ "cell_type": "markdown", "id": "359c3066", "metadata": {}, "source": [ "`rename` saves you from copying the DataFrame manually and assigning its index and columns attributes. If you want to modify a dataset in place, additionally pass `inplace=True`:" ] },
{ "cell_type": "code", "execution_count": 30, "id": "70d80d1f", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.193159Z", "iopub.status.busy": "2026-05-21T14:10:22.193020Z", "iopub.status.idle": "2026-05-21T14:10:22.195956Z", "shell.execute_reply": "2026-05-21T14:10:22.195733Z", "shell.execute_reply.started": "2026-05-21T14:10:22.193151Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 0 1 2 3\n", "BE 0 1 2 3\n", "DE 4 5 6 7\n", "EN 8 9 10 11" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.rename(\n", " index={\"DE\": \"BE\", \"EN\": \"DE\", \"FR\": \"EN\"},\n", " columns={1: 0, 2: 1, 3: 2, 4: 3},\n", " inplace=True,\n", ")\n", "\n", "df3" ] }, { "cell_type": "markdown", "id": "32e3c948", "metadata": {}, "source": [ "## Hierarchische Indizierung\n", "\n", "Die hierarchische Indizierung ist eine wichtige Funktion von pandas, die euch ermöglicht, mehrere Indexebenen auf einer Achse zu haben. Dies bietet euch die Möglichkeit, mit höherdimensionalen Daten in einer niedrigdimensionalen Form zu arbeiten.\n", "\n", "Beginnen wir mit einem einfachen Beispiel: Erstellen wir eine Reihe Liste von Listen als Index:" ] }, { "cell_type": "code", "execution_count": 31, "id": "4ad2466c", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.196375Z", "iopub.status.busy": "2026-05-21T14:10:22.196308Z", "iopub.status.idle": "2026-05-21T14:10:22.199689Z", "shell.execute_reply": "2026-05-21T14:10:22.199409Z", "shell.execute_reply.started": "2026-05-21T14:10:22.196369Z" } }, "outputs": [ { "data": { "text/plain": [ "Jupyter Tutorial de 83080\n", " en 20336\n", "PyViz Tutorial de 11376\n", "Python Basics de 1228\n", " en 468\n", "dtype: int64" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hits = pd.Series(\n", " [83080, 20336, 11376, 1228, 468],\n", " index=[\n", " [\n", " \"Jupyter Tutorial\",\n", " \"Jupyter Tutorial\",\n", " \"PyViz Tutorial\",\n", " \"Python Basics\",\n", " \"Python Basics\",\n", " ],\n", " [\"de\", \"en\", \"de\", \"de\", \"en\"],\n", " ],\n", ")\n", "\n", "hits" ] }, { "cell_type": "markdown", "id": "3f6fb556", "metadata": {}, "source": [ "Was ihr seht, ist eine graphische Ansicht einer Serie mit einem [pandas.MultiIndex](https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.html). 
Die „Lücken“ in der Indexanzeige bedeuten, dass die Beschriftung darüber verwendet werden soll." ] }, { "cell_type": "code", "execution_count": 32, "id": "c9d07965", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.203164Z", "iopub.status.busy": "2026-05-21T14:10:22.203046Z", "iopub.status.idle": "2026-05-21T14:10:22.205404Z", "shell.execute_reply": "2026-05-21T14:10:22.205192Z", "shell.execute_reply.started": "2026-05-21T14:10:22.203155Z" } }, "outputs": [ { "data": { "text/plain": [ "MultiIndex([('Jupyter Tutorial', 'de'),\n", " ('Jupyter Tutorial', 'en'),\n", " ( 'PyViz Tutorial', 'de'),\n", " ( 'Python Basics', 'de'),\n", " ( 'Python Basics', 'en')],\n", " )" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hits.index" ] }, { "cell_type": "markdown", "id": "b6f9ae53", "metadata": {}, "source": [ "Bei einem hierarchisch indizierten Objekt ist eine so genannte partielle Indizierung möglich, mit der ihr Teilmengen der Daten gezielt auswählen könnt:" ] }, { "cell_type": "code", "execution_count": 33, "id": "236281de", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.205745Z", "iopub.status.busy": "2026-05-21T14:10:22.205658Z", "iopub.status.idle": "2026-05-21T14:10:22.208137Z", "shell.execute_reply": "2026-05-21T14:10:22.207905Z", "shell.execute_reply.started": "2026-05-21T14:10:22.205738Z" } }, "outputs": [ { "data": { "text/plain": [ "de 83080\n", "en 20336\n", "dtype: int64" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hits[\"Jupyter Tutorial\"]" ] }, { "cell_type": "code", "execution_count": 34, "id": "9f475930", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.208655Z", "iopub.status.busy": "2026-05-21T14:10:22.208541Z", "iopub.status.idle": "2026-05-21T14:10:22.211466Z", "shell.execute_reply": "2026-05-21T14:10:22.211173Z", "shell.execute_reply.started": "2026-05-21T14:10:22.208643Z" } }, 
"outputs": [ { "data": { "text/plain": [ "Jupyter Tutorial de 83080\n", " en 20336\n", "PyViz Tutorial de 11376\n", "Python Basics de 1228\n", " en 468\n", "dtype: int64" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hits[\"Jupyter Tutorial\":\"Python Basics\"]" ] }, { "cell_type": "code", "execution_count": 35, "id": "58713388", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.211920Z", "iopub.status.busy": "2026-05-21T14:10:22.211845Z", "iopub.status.idle": "2026-05-21T14:10:22.222539Z", "shell.execute_reply": "2026-05-21T14:10:22.222317Z", "shell.execute_reply.started": "2026-05-21T14:10:22.211913Z" } }, "outputs": [ { "data": { "text/plain": [ "Jupyter Tutorial de 83080\n", " en 20336\n", "Python Basics de 1228\n", " en 468\n", "dtype: int64" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hits.loc[[\"Jupyter Tutorial\", \"Python Basics\"]]" ] }, { "cell_type": "markdown", "id": "3b0bef0a", "metadata": {}, "source": [ "Die Auswahl ist sogar von einer „inneren“ Ebene aus möglich. 
Im folgenden wähle ich alle Werte mit dem Wert `1` aus der zweiten Indexebene aus:" ] }, { "cell_type": "code", "execution_count": 36, "id": "f2e27088", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.223160Z", "iopub.status.busy": "2026-05-21T14:10:22.223077Z", "iopub.status.idle": "2026-05-21T14:10:22.225768Z", "shell.execute_reply": "2026-05-21T14:10:22.225562Z", "shell.execute_reply.started": "2026-05-21T14:10:22.223152Z" } }, "outputs": [ { "data": { "text/plain": [ "Jupyter Tutorial 83080\n", "PyViz Tutorial 11376\n", "Python Basics 1228\n", "dtype: int64" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hits.loc[:, \"de\"]" ] }, { "cell_type": "markdown", "id": "a6faf883", "metadata": {}, "source": [ "Bei einem DataFrame kann jede Achse einen hierarchischen Index haben:" ] }, { "cell_type": "code", "execution_count": 37, "id": "0e5c2c2f", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.226301Z", "iopub.status.busy": "2026-05-21T14:10:22.226168Z", "iopub.status.idle": "2026-05-21T14:10:22.231490Z", "shell.execute_reply": "2026-05-21T14:10:22.231274Z", "shell.execute_reply.started": "2026-05-21T14:10:22.226291Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " 12/2021 01/2022 02/2022 \n", " latest stable latest stable latest stable\n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", " en 4722 1825 3497 2576 4009 3707\n", "PyViz Tutorial de 2573 0 4873 0 3930 0\n", "Python Basics de 525 0 427 0 276 0\n", " en 157 0 85 0 226 0" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "version_hits = [\n", " [19651, 0, 30134, 0, 33295, 0],\n", " [4722, 1825, 3497, 2576, 4009, 3707],\n", " [2573, 0, 4873, 0, 3930, 0],\n", " [525, 0, 427, 0, 276, 0],\n", " [157, 0, 85, 0, 226, 0],\n", "]\n", "\n", "df = pd.DataFrame(\n", " version_hits,\n", " index=[\n", " [\n", " \"Jupyter Tutorial\",\n", " \"Jupyter Tutorial\",\n", " \"PyViz Tutorial\",\n", " \"Python Basics\",\n", " \"Python Basics\",\n", " ],\n", " [\"de\", \"en\", \"de\", \"de\", \"en\"],\n", " ],\n", " columns=[\n", " [\"12/2021\", \"12/2021\", \"01/2022\", \"01/2022\", \"02/2022\", \"02/2022\"],\n", " [\"latest\", \"stable\", \"latest\", \"stable\", \"latest\", \"stable\"],\n", " ],\n", ")\n", "\n", "df" ] }, { "cell_type": "markdown", "id": "699b27a8", "metadata": {}, "source": [ "Die Hierarchieebenen können Namen haben (als Zeichenketten oder beliebige Python-Objekte). Wenn dies der Fall ist, werden diese in der Konsolenausgabe angezeigt:" ] }, { "cell_type": "code", "execution_count": 38, "id": "69553366", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.232236Z", "iopub.status.busy": "2026-05-21T14:10:22.232056Z", "iopub.status.idle": "2026-05-21T14:10:22.236545Z", "shell.execute_reply": "2026-05-21T14:10:22.236281Z", "shell.execute_reply.started": "2026-05-21T14:10:22.232228Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Title Language \n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", " en 4722 1825 3497 2576 4009 3707\n", "PyViz Tutorial de 2573 0 4873 0 3930 0\n", "Python Basics de 525 0 427 0 276 0\n", " en 157 0 85 0 226 0" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index.names = [\"Title\", \"Language\"]\n", "df.columns.names = [\"Month\", \"Version\"]\n", "\n", "df" ] }, { "cell_type": "markdown", "id": "7de31d9b", "metadata": {}, "source": [ "
\n", "\n", "**Warning:**\n", "\n", "Note that the index names `Month` and `Version` are not part of the row labels (the `df.index` values); they name the levels of the column index.\n", "
" ] }, { "cell_type": "markdown", "id": "954d25b6", "metadata": {}, "source": [ "Mit [pandas.MultiIndex.from_arrays](https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.from_arrays.html) kann selbst ein `MultiIndex` erstellt und dann wiederverwendet werden; die Spalten im vorangehenden DataFrame mit Ebenennamen könnten so erstellt werden:" ] }, { "cell_type": "code", "execution_count": 39, "id": "1d1937f8", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.237036Z", "iopub.status.busy": "2026-05-21T14:10:22.236952Z", "iopub.status.idle": "2026-05-21T14:10:22.239790Z", "shell.execute_reply": "2026-05-21T14:10:22.239513Z", "shell.execute_reply.started": "2026-05-21T14:10:22.237028Z" } }, "outputs": [ { "data": { "text/plain": [ "MultiIndex([('Jupyter Tutorial', 'de'),\n", " ('Jupyter Tutorial', 'en'),\n", " ( 'PyViz Tutorial', 'de'),\n", " ( 'Python Basics', 'de'),\n", " ( 'Python Basics', 'en')],\n", " names=['Title', 'Language'])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.MultiIndex.from_arrays(\n", " [\n", " [\n", " \"Jupyter Tutorial\",\n", " \"Jupyter Tutorial\",\n", " \"PyViz Tutorial\",\n", " \"Python Basics\",\n", " \"Python Basics\",\n", " ],\n", " [\"de\", \"en\", \"de\", \"de\", \"en\"],\n", " ],\n", " names=[\"Title\", \"Language\"],\n", ")" ] }, { "cell_type": "markdown", "id": "e39fed36", "metadata": {}, "source": [ "Mit der Teilspaltenindizierung könnt ihr auf ähnliche Weise Spaltengruppen auswählen:" ] }, { "cell_type": "code", "execution_count": 40, "id": "ef2ab0ac", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.240143Z", "iopub.status.busy": "2026-05-21T14:10:22.240073Z", "iopub.status.idle": "2026-05-21T14:10:22.244097Z", "shell.execute_reply": "2026-05-21T14:10:22.243864Z", "shell.execute_reply.started": "2026-05-21T14:10:22.240136Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Language \n", "de 19651 0 30134 0 33295 0\n", "en 4722 1825 3497 2576 4009 3707" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[\"Jupyter Tutorial\"]" ] }, { "cell_type": "code", "execution_count": 41, "id": "02352536", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.244527Z", "iopub.status.busy": "2026-05-21T14:10:22.244448Z", "iopub.status.idle": "2026-05-21T14:10:22.248235Z", "shell.execute_reply": "2026-05-21T14:10:22.247997Z", "shell.execute_reply.started": "2026-05-21T14:10:22.244521Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Title Language \n", "PyViz Tutorial de 2573 0 4873 0 3930 0\n", "Python Basics de 525 0 427 0 276 0\n", " en 157 0 85 0 226 0" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[\"PyViz Tutorial\":\"Python Basics\"]" ] }, { "cell_type": "markdown", "id": "a6f861ad", "metadata": {}, "source": [ "`loc` mit Liste von Tupeln:" ] }, { "cell_type": "code", "execution_count": 42, "id": "2cd1a69b", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.248636Z", "iopub.status.busy": "2026-05-21T14:10:22.248533Z", "iopub.status.idle": "2026-05-21T14:10:22.253391Z", "shell.execute_reply": "2026-05-21T14:10:22.253128Z", "shell.execute_reply.started": "2026-05-21T14:10:22.248626Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Title Language \n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", "PyViz Tutorial de 2573 0 4873 0 3930 0" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[[(\"Jupyter Tutorial\", \"de\"), (\"PyViz Tutorial\", \"de\")]]" ] }, { "cell_type": "markdown", "id": "d05625c8", "metadata": {}, "source": [ "`loc` mit Liste des zweiten Spaltenindex:" ] }, { "cell_type": "code", "execution_count": 43, "id": "13697633", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.253913Z", "iopub.status.busy": "2026-05-21T14:10:22.253826Z", "iopub.status.idle": "2026-05-21T14:10:22.257819Z", "shell.execute_reply": "2026-05-21T14:10:22.257571Z", "shell.execute_reply.started": "2026-05-21T14:10:22.253906Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Title Language \n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", "PyViz Tutorial de 2573 0 4873 0 3930 0\n", "Python Basics de 525 0 427 0 276 0" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[:, [\"de\"], :]" ] }, { "cell_type": "markdown", "id": "789990de", "metadata": {}, "source": [ "Verwenden von `slice`:" ] }, { "cell_type": "code", "execution_count": 44, "id": "17d26226", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.258219Z", "iopub.status.busy": "2026-05-21T14:10:22.258144Z", "iopub.status.idle": "2026-05-21T14:10:22.261571Z", "shell.execute_reply": "2026-05-21T14:10:22.261351Z", "shell.execute_reply.started": "2026-05-21T14:10:22.258212Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Title Language \n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", " en 4722 1825 3497 2576 4009 3707" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[slice(\"Jupyter Tutorial\"), :]" ] }, { "cell_type": "code", "execution_count": 45, "id": "229f0ed6", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.261957Z", "iopub.status.busy": "2026-05-21T14:10:22.261890Z", "iopub.status.idle": "2026-05-21T14:10:22.265675Z", "shell.execute_reply": "2026-05-21T14:10:22.265433Z", "shell.execute_reply.started": "2026-05-21T14:10:22.261950Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Title Language \n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", " en 4722 1825 3497 2576 4009 3707\n", "PyViz Tutorial de 2573 0 4873 0 3930 0" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[slice(\"Jupyter Tutorial\", \"PyViz Tutorial\"), :]" ] }, { "cell_type": "code", "execution_count": 46, "id": "c7be5504", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.266132Z", "iopub.status.busy": "2026-05-21T14:10:22.266027Z", "iopub.status.idle": "2026-05-21T14:10:22.270134Z", "shell.execute_reply": "2026-05-21T14:10:22.269886Z", "shell.execute_reply.started": "2026-05-21T14:10:22.266122Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Version latest stable\n", "Title Language \n", "Jupyter Tutorial de 19651 0\n", " en 4722 1825\n", "PyViz Tutorial de 2573 0" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[slice(\"Jupyter Tutorial\", \"PyViz Tutorial\"), \"12/2021\"]" ] }, { "cell_type": "markdown", "id": "8af7f0ef", "metadata": {}, "source": [ "Verwenden von `slice`, und Listen:" ] }, { "cell_type": "code", "execution_count": 47, "id": "b41f4c64", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.270627Z", "iopub.status.busy": "2026-05-21T14:10:22.270532Z", "iopub.status.idle": "2026-05-21T14:10:22.274092Z", "shell.execute_reply": "2026-05-21T14:10:22.273802Z", "shell.execute_reply.started": "2026-05-21T14:10:22.270619Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Title Language \n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", "PyViz Tutorial de 2573 0 4873 0 3930 0" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[slice(\"Jupyter Tutorial\", \"PyViz Tutorial\"), [\"de\"], :]" ] }, { "cell_type": "markdown", "id": "ccf2f979", "metadata": {}, "source": [ "## Ansicht vs. Kopie\n", "\n", "In Pandas hängt es von der Struktur und den Datentypen des ursprünglichen DataFrame ab, ob ihr einen View erhaltet oder nicht – und ob Änderungen, die an einem View vorgenommen werden, in den ursprünglichen DataFrame zurück übertragen werden." ] }, { "cell_type": "markdown", "id": "c2226f83", "metadata": {}, "source": [ "
\n", " \n", "**See also:**\n", "\n", "* [Returning a view versus a copy](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy)\n", "* [Views and Copies in pandas](https://www.practicaldatascience.org/html/views_and_copies_in_pandas.html)\n", "\n", "
" ] }, { "cell_type": "markdown", "id": "0ce285b0", "metadata": {}, "source": [ "### `stack` und `unstack`" ] }, { "cell_type": "markdown", "id": "fa5bc992", "metadata": {}, "source": [ "Die hierarchische Indizierung spielt eine wichtige Rolle bei der Umformung von Daten und gruppenbasierten Operationen wie der Bildung einer [Pivot-Tabelle](https://de.wikipedia.org/wiki/Pivot-Tabelle). Zum Beispiel könnt ihr diese Daten mit der [pandas.Series.unstack](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.unstack.html)-Methode in einen DataFrame umordnen:" ] }, { "cell_type": "code", "execution_count": 48, "id": "0e0590c0", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.274473Z", "iopub.status.busy": "2026-05-21T14:10:22.274396Z", "iopub.status.idle": "2026-05-21T14:10:22.278160Z", "shell.execute_reply": "2026-05-21T14:10:22.277912Z", "shell.execute_reply.started": "2026-05-21T14:10:22.274466Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " de en\n", "Jupyter Tutorial 83080.0 20336.0\n", "PyViz Tutorial 11376.0 NaN\n", "Python Basics 1228.0 468.0" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hits.unstack()" ] }, { "cell_type": "markdown", "id": "c0bb185b", "metadata": {}, "source": [ "Die umgekehrte Operation von unstack ist [stack](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.stack.html):" ] }, { "cell_type": "code", "execution_count": 49, "id": "f574e228", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.278594Z", "iopub.status.busy": "2026-05-21T14:10:22.278527Z", "iopub.status.idle": "2026-05-21T14:10:22.281973Z", "shell.execute_reply": "2026-05-21T14:10:22.281663Z", "shell.execute_reply.started": "2026-05-21T14:10:22.278588Z" } }, "outputs": [ { "data": { "text/plain": [ "Jupyter Tutorial de 83080.0\n", " en 20336.0\n", "PyViz Tutorial de 11376.0\n", "Python Basics de 1228.0\n", " en 468.0\n", "dtype: float64" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hits.unstack().stack()" ] }, { "cell_type": "markdown", "id": "206164b3", "metadata": {}, "source": [ "### Umordnen und Sortieren von Ebenen\n", "\n", "Es kann vorkommen, dass ihr die Reihenfolge der Ebenen auf einer Achse neu anordnen oder die Daten nach den Werten in einer bestimmten Ebene sortieren wollt. 
Die Funktion [pandas.DataFrame.swaplevel](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.swaplevel.html) nimmt zwei Ebenennummern oder -namen entgegen und gibt ein neues Objekt zurück, in dem die Ebenen vertauscht sind (die Daten bleiben jedoch unverändert):" ] }, { "cell_type": "code", "execution_count": 50, "id": "aa471216", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.282499Z", "iopub.status.busy": "2026-05-21T14:10:22.282426Z", "iopub.status.idle": "2026-05-21T14:10:22.286431Z", "shell.execute_reply": "2026-05-21T14:10:22.286177Z", "shell.execute_reply.started": "2026-05-21T14:10:22.282493Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Language Title \n", "de Jupyter Tutorial 19651 0 30134 0 33295 0\n", "en Jupyter Tutorial 4722 1825 3497 2576 4009 3707\n", "de PyViz Tutorial 2573 0 4873 0 3930 0\n", " Python Basics 525 0 427 0 276 0\n", "en Python Basics 157 0 85 0 226 0" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.swaplevel(\"Language\", \"Title\")" ] }, { "cell_type": "markdown", "id": "10929420", "metadata": {}, "source": [ "[pandas.DataFrame.sort_index](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_index.html) hingegen sortiert die Daten nur nach den Werten in einer einzigen Ebene. Beim Vertauschen von Ebenen ist es nicht unüblich, auch `sort_index` zu verwenden, damit das Ergebnis lexikografisch nach der angegebenen Ebene sortiert wird:" ] }, { "cell_type": "code", "execution_count": 51, "id": "844f3d38", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.286915Z", "iopub.status.busy": "2026-05-21T14:10:22.286828Z", "iopub.status.idle": "2026-05-21T14:10:22.291134Z", "shell.execute_reply": "2026-05-21T14:10:22.290862Z", "shell.execute_reply.started": "2026-05-21T14:10:22.286908Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Title Language \n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", " en 4722 1825 3497 2576 4009 3707\n", "PyViz Tutorial de 2573 0 4873 0 3930 0\n", "Python Basics de 525 0 427 0 276 0\n", " en 157 0 85 0 226 0" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.sort_index(level=0)" ] }, { "cell_type": "code", "execution_count": 52, "id": "b741feca", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.291595Z", "iopub.status.busy": "2026-05-21T14:10:22.291521Z", "iopub.status.idle": "2026-05-21T14:10:22.295396Z", "shell.execute_reply": "2026-05-21T14:10:22.295173Z", "shell.execute_reply.started": "2026-05-21T14:10:22.291589Z" } }, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Language Title \n", "de Jupyter Tutorial 19651 0 30134 0 33295 0\n", " PyViz Tutorial 2573 0 4873 0 3930 0\n", " Python Basics 525 0 427 0 276 0\n", "en Jupyter Tutorial 4722 1825 3497 2576 4009 3707\n", " Python Basics 157 0 85 0 226 0" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.swaplevel(0, 1).sort_index(level=0)" ] }, { "cell_type": "markdown", "id": "95a5f2dc", "metadata": {}, "source": [ "
\n", "\n", "**Note:**\n", "\n", "Data selection performs much better on hierarchically indexed objects when the index is lexicographically sorted, starting with the outermost level, that is, the result of calling `sort_index(level=0)` or `sort_index()`.\n", "
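The note above can be checked with a minimal sketch. It assumes a small made-up Series; the names `idx`, `s` and the three values are illustrative only and not part of the data used in this notebook. `is_monotonic_increasing` reports whether a MultiIndex is lexicographically sorted, and `sort_index()` returns a sorted copy on which label slices over the outer level can be resolved.

```python
import pandas as pd

# Hypothetical miniature of a (Title, Language) index, deliberately unsorted
idx = pd.MultiIndex.from_tuples(
    [
        ('Python Basics', 'en'),
        ('Jupyter Tutorial', 'de'),
        ('Jupyter Tutorial', 'en'),
    ],
    names=['Title', 'Language'],
)
s = pd.Series([157, 19651, 4722], index=idx)

# The index as constructed is not in lexicographic order
unsorted_flag = s.index.is_monotonic_increasing

# sort_index() sorts by all levels, starting with the outermost one
s_sorted = s.sort_index()
sorted_flag = s_sorted.index.is_monotonic_increasing

# Label slice over the outermost level of the sorted index
subset = s_sorted.loc['Jupyter Tutorial':'Jupyter Tutorial']
```

Checking the flag before selecting many rows is cheap and makes it explicit whether a `sort_index()` call is still needed.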
" ] }, { "cell_type": "markdown", "id": "4833f262", "metadata": {}, "source": [ "### Summary statistics by level\n", "\n", "Many descriptive and summary statistics for `DataFrame` and `Series` can be computed per index level on a given axis by grouping on that level with `groupby(level=...)`. Consider the `DataFrame` above; we can aggregate either the rows or the columns by level as follows:" ] }, { "cell_type": "code", "execution_count": 53, "id": "559cc22f", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.295848Z", "iopub.status.busy": "2026-05-21T14:10:22.295760Z", "iopub.status.idle": "2026-05-21T14:10:22.300033Z", "shell.execute_reply": "2026-05-21T14:10:22.299779Z", "shell.execute_reply.started": "2026-05-21T14:10:22.295841Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th>Month</th><th colspan=\"2\">12/2021</th><th colspan=\"2\">01/2022</th><th colspan=\"2\">02/2022</th></tr>\n",
"<tr><th>Version</th><th>latest</th><th>stable</th><th>latest</th><th>stable</th><th>latest</th><th>stable</th></tr>\n",
"<tr><th>Language</th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>de</th><td>22749</td><td>0</td><td>35434</td><td>0</td><td>37501</td><td>0</td></tr>\n",
"<tr><th>en</th><td>4879</td><td>1825</td><td>3582</td><td>2576</td><td>4235</td><td>3707</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ "Month 12/2021 01/2022 02/2022 \n", "Version latest stable latest stable latest stable\n", "Language \n", "de 22749 0 35434 0 37501 0\n", "en 4879 1825 3582 2576 4235 3707" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.groupby(level=\"Language\").sum()" ] }, { "cell_type": "code", "execution_count": 54, "id": "7216d2b1", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.300431Z", "iopub.status.busy": "2026-05-21T14:10:22.300346Z", "iopub.status.idle": "2026-05-21T14:10:22.304195Z", "shell.execute_reply": "2026-05-21T14:10:22.303889Z", "shell.execute_reply.started": "2026-05-21T14:10:22.300424Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>Month</th><th>01/2022</th><th>02/2022</th><th>12/2021</th></tr>\n",
"<tr><th>Title</th><th>Language</th><th></th><th></th><th></th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th rowspan=\"2\">Jupyter Tutorial</th><th>de</th><td>30134</td><td>33295</td><td>19651</td></tr>\n",
"<tr><th>en</th><td>6073</td><td>7716</td><td>6547</td></tr>\n",
"<tr><th>PyViz Tutorial</th><th>de</th><td>4873</td><td>3930</td><td>2573</td></tr>\n",
"<tr><th rowspan=\"2\">Python Basics</th><th>de</th><td>427</td><td>276</td><td>525</td></tr>\n",
"<tr><th>en</th><td>85</td><td>226</td><td>157</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ "Month 01/2022 02/2022 12/2021\n", "Title Language \n", "Jupyter Tutorial de 30134 33295 19651\n", " en 6073 7716 6547\n", "PyViz Tutorial de 4873 3930 2573\n", "Python Basics de 427 276 525\n", " en 85 226 157" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.T.groupby(level=\"Month\").sum().T" ] }, { "cell_type": "markdown", "id": "03ba57fe", "metadata": {}, "source": [ "Internally, this uses the [pandas.DataFrame.groupby](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html) machinery of pandas, which is explained in more detail in [Group operations](group-operations.ipynb)." ] }, { "cell_type": "markdown", "id": "0a1ce4bc", "metadata": {}, "source": [ "### Indexing with the columns of a DataFrame\n", "\n", "It is not unusual to use one or more columns of a DataFrame as the row index; alternatively, you can move the row index into the columns of the DataFrame. Here is an example DataFrame:" ] }, { "cell_type": "code", "execution_count": 55, "id": "093acd84", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.304529Z", "iopub.status.busy": "2026-05-21T14:10:22.304465Z", "iopub.status.idle": "2026-05-21T14:10:22.308014Z", "shell.execute_reply": "2026-05-21T14:10:22.307804Z", "shell.execute_reply.started": "2026-05-21T14:10:22.304523Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>Jupyter Tutorial</td><td>de</td><td>19651</td><td>0</td><td>30134</td><td>0</td><td>33295</td><td>0</td></tr>\n",
"<tr><th>1</th><td>Jupyter Tutorial</td><td>en</td><td>4722</td><td>1825</td><td>3497</td><td>2576</td><td>4009</td><td>3707</td></tr>\n",
"<tr><th>2</th><td>PyViz Tutorial</td><td>de</td><td>2573</td><td>0</td><td>4873</td><td>0</td><td>3930</td><td>0</td></tr>\n",
"<tr><th>3</th><td>Python Basics</td><td>de</td><td>525</td><td>0</td><td>427</td><td>0</td><td>276</td><td>0</td></tr>\n",
"<tr><th>4</th><td>Python Basics</td><td>en</td><td>157</td><td>0</td><td>85</td><td>0</td><td>226</td><td>0</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " 0 1 2 3 4 5 6 7\n", "0 Jupyter Tutorial de 19651 0 30134 0 33295 0\n", "1 Jupyter Tutorial en 4722 1825 3497 2576 4009 3707\n", "2 PyViz Tutorial de 2573 0 4873 0 3930 0\n", "3 Python Basics de 525 0 427 0 276 0\n", "4 Python Basics en 157 0 85 0 226 0" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = [\n", "    [\"Jupyter Tutorial\", \"de\", 19651, 0, 30134, 0, 33295, 0],\n", "    [\"Jupyter Tutorial\", \"en\", 4722, 1825, 3497, 2576, 4009, 3707],\n", "    [\"PyViz Tutorial\", \"de\", 2573, 0, 4873, 0, 3930, 0],\n", "    [\"Python Basics\", \"de\", 525, 0, 427, 0, 276, 0],\n", "    [\"Python Basics\", \"en\", 157, 0, 85, 0, 226, 0],\n", "]\n", "\n", "df = pd.DataFrame(data)\n", "\n", "df" ] }, { "cell_type": "markdown", "id": "dec963df", "metadata": {}, "source": [ "The [pandas.DataFrame.set_index](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.set_index.html) method creates a new DataFrame that uses one or more of its columns as the index:" ] }, { "cell_type": "code", "execution_count": 56, "id": "073ce989", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.308643Z", "iopub.status.busy": "2026-05-21T14:10:22.308389Z", "iopub.status.idle": "2026-05-21T14:10:22.312598Z", "shell.execute_reply": "2026-05-21T14:10:22.312378Z", "shell.execute_reply.started": "2026-05-21T14:10:22.308630Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th></th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th></tr>\n",
"<tr><th>0</th><th>1</th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th rowspan=\"2\">Jupyter Tutorial</th><th>de</th><td>19651</td><td>0</td><td>30134</td><td>0</td><td>33295</td><td>0</td></tr>\n",
"<tr><th>en</th><td>4722</td><td>1825</td><td>3497</td><td>2576</td><td>4009</td><td>3707</td></tr>\n",
"<tr><th>PyViz Tutorial</th><th>de</th><td>2573</td><td>0</td><td>4873</td><td>0</td><td>3930</td><td>0</td></tr>\n",
"<tr><th rowspan=\"2\">Python Basics</th><th>de</th><td>525</td><td>0</td><td>427</td><td>0</td><td>276</td><td>0</td></tr>\n",
"<tr><th>en</th><td>157</td><td>0</td><td>85</td><td>0</td><td>226</td><td>0</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " 2 3 4 5 6 7\n", "0 1 \n", "Jupyter Tutorial de 19651 0 30134 0 33295 0\n", " en 4722 1825 3497 2576 4009 3707\n", "PyViz Tutorial de 2573 0 4873 0 3930 0\n", "Python Basics de 525 0 427 0 276 0\n", " en 157 0 85 0 226 0" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = df.set_index([0, 1])\n", "\n", "df2" ] }, { "cell_type": "markdown", "id": "5a441aa3", "metadata": {}, "source": [ "By default, the columns are removed from the DataFrame, but you can keep them by passing `drop=False` to `set_index`:" ] }, { "cell_type": "code", "execution_count": 57, "id": "b163b1bb", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.312957Z", "iopub.status.busy": "2026-05-21T14:10:22.312891Z", "iopub.status.idle": "2026-05-21T14:10:22.317033Z", "shell.execute_reply": "2026-05-21T14:10:22.316804Z", "shell.execute_reply.started": "2026-05-21T14:10:22.312950Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th></th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th></tr>\n",
"<tr><th>0</th><th>1</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th rowspan=\"2\">Jupyter Tutorial</th><th>de</th><td>Jupyter Tutorial</td><td>de</td><td>19651</td><td>0</td><td>30134</td><td>0</td><td>33295</td><td>0</td></tr>\n",
"<tr><th>en</th><td>Jupyter Tutorial</td><td>en</td><td>4722</td><td>1825</td><td>3497</td><td>2576</td><td>4009</td><td>3707</td></tr>\n",
"<tr><th>PyViz Tutorial</th><th>de</th><td>PyViz Tutorial</td><td>de</td><td>2573</td><td>0</td><td>4873</td><td>0</td><td>3930</td><td>0</td></tr>\n",
"<tr><th rowspan=\"2\">Python Basics</th><th>de</th><td>Python Basics</td><td>de</td><td>525</td><td>0</td><td>427</td><td>0</td><td>276</td><td>0</td></tr>\n",
"<tr><th>en</th><td>Python Basics</td><td>en</td><td>157</td><td>0</td><td>85</td><td>0</td><td>226</td><td>0</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " 0 1 2 3 4 5 6 \\\n", "0 1 \n", "Jupyter Tutorial de Jupyter Tutorial de 19651 0 30134 0 33295 \n", " en Jupyter Tutorial en 4722 1825 3497 2576 4009 \n", "PyViz Tutorial de PyViz Tutorial de 2573 0 4873 0 3930 \n", "Python Basics de Python Basics de 525 0 427 0 276 \n", " en Python Basics en 157 0 85 0 226 \n", "\n", " 7 \n", "0 1 \n", "Jupyter Tutorial de 0 \n", " en 3707 \n", "PyViz Tutorial de 0 \n", "Python Basics de 0 \n", " en 0 " ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.set_index([0, 1], drop=False)" ] }, { "cell_type": "markdown", "id": "40abeb1e", "metadata": {}, "source": [ "[pandas.DataFrame.reset_index](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.reset_index.html), on the other hand, does the opposite of `set_index`; the hierarchical index levels are moved into the columns:" ] }, { "cell_type": "code", "execution_count": 58, "id": "6e70c0e9", "metadata": { "execution": { "iopub.execute_input": "2026-05-21T14:10:22.317482Z", "iopub.status.busy": "2026-05-21T14:10:22.317417Z", "iopub.status.idle": "2026-05-21T14:10:22.320701Z", "shell.execute_reply": "2026-05-21T14:10:22.320451Z", "shell.execute_reply.started": "2026-05-21T14:10:22.317475Z" } }, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr><th></th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>Jupyter Tutorial</td><td>de</td><td>19651</td><td>0</td><td>30134</td><td>0</td><td>33295</td><td>0</td></tr>\n",
"<tr><th>1</th><td>Jupyter Tutorial</td><td>en</td><td>4722</td><td>1825</td><td>3497</td><td>2576</td><td>4009</td><td>3707</td></tr>\n",
"<tr><th>2</th><td>PyViz Tutorial</td><td>de</td><td>2573</td><td>0</td><td>4873</td><td>0</td><td>3930</td><td>0</td></tr>\n",
"<tr><th>3</th><td>Python Basics</td><td>de</td><td>525</td><td>0</td><td>427</td><td>0</td><td>276</td><td>0</td></tr>\n",
"<tr><th>4</th><td>Python Basics</td><td>en</td><td>157</td><td>0</td><td>85</td><td>0</td><td>226</td><td>0</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " 0 1 2 3 4 5 6 7\n", "0 Jupyter Tutorial de 19651 0 30134 0 33295 0\n", "1 Jupyter Tutorial en 4722 1825 3497 2576 4009 3707\n", "2 PyViz Tutorial de 2573 0 4873 0 3930 0\n", "3 Python Basics de 525 0 427 0 276 0\n", "4 Python Basics en 157 0 85 0 226 0" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2.reset_index()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.13 Kernel", "language": "python", "name": "python313" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.0" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }