2019-2020/SNT/Donnes_Traitement/Traitement de données.ipynb
2020-05-05 09:53:14 +02:00

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data processing with Python\n",
"\n",
"In this lab, you will first explore data on the population of every town in France, then data on festivals.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Towns of France\n",
"\n",
"Next to this document you will find a file named `villes.csv`. Copy it into your personal folder, then open it with Notepad++.\n",
"\n",
"### Describing the data\n",
"\n",
"1. Describe the format of the file.\n",
"2. What information is stored in this file?\n",
"3. What does each line correspond to?\n",
"4. How many lines will we be able to study?\n",
"5. Write two questions that could be answered using this table.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Opening and manipulating the data with Python\n",
"\n",
"Open pythonedu, save a new (still empty) script in the same folder as the `villes.csv` file, then copy and paste the program below into it.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"villes = pd.read_csv(\"villes.csv\")\n",
"print(villes)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"6. Run this program. What does it do? Comment on what was printed in the console."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Festivals in France\n",
"\n",
"1. Go to the [French government's open data platform](https://www.data.gouv.fr/fr/) and find data on festivals in France.\n",
"2. Describe the data you found.\n",
"3. Come up with two questions and try to answer them using Python."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Python and pandas cheat sheet"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Selecting information\n",
"\n",
"- Select a row (here the row at position 2, i.e. the third row, since counting starts at 0)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"dep 1\n",
"nom Plagne\n",
"cp 1130\n",
"nb_hab_2010 129\n",
"nb_hab_1999 83\n",
"nb_hab_2012 100\n",
"dens 20\n",
"surf 6.2\n",
"long 5.73333\n",
"lat 46.1833\n",
"alt_min 560\n",
"alt_max 922\n",
"Name: 2, dtype: object"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes.iloc[2]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Select a single value (here `cp` for row 2)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'1130'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes.loc[2,'cp']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Select an entire column (here `nom`)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 Ozan\n",
"1 Cormoranche-sur-Saône\n",
"2 Plagne\n",
"3 Tossiat\n",
"4 Pouillat\n",
" ... \n",
"36695 Sada\n",
"36696 Tsingoni\n",
"36697 Saint-Barthélemy\n",
"36698 Saint-Martin\n",
"36699 Saint-Pierre-et-Miquelon\n",
"Name: nom, Length: 36700, dtype: object"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes[\"nom\"]"
]
},
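{
"cell_type": "markdown",
"metadata": {},
"source": [
"A possible extension of the column selection above, as a minimal sketch: passing a list of names keeps several columns at once, and `loc` accepts a slice of row labels together with a column name. The `exemple` table is a hypothetical three-row sample rebuilt from values displayed in this notebook, so that the cell runs without `villes.csv`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical sample: three rows copied from the outputs shown in this notebook\n",
"exemple = pd.DataFrame({\n",
"    'nom': ['Ozan', 'Cormoranche-sur-Saône', 'Plagne'],\n",
"    'cp': ['1190', '1290', '1130'],\n",
"    'nb_hab_2010': [618, 1058, 129],\n",
"})\n",
"\n",
"# Several columns at once: pass a list of column names\n",
"print(exemple[['nom', 'nb_hab_2010']])\n",
"\n",
"# loc works with index labels; the slice 0:1 includes both ends (rows 0 and 1)\n",
"print(exemple.loc[0:1, 'nom'])"
]
},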
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Select the rows that match a condition (here, the towns whose minimum altitude is above 1500)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>dep</th>\n",
" <th>nom</th>\n",
" <th>cp</th>\n",
" <th>nb_hab_2010</th>\n",
" <th>nb_hab_1999</th>\n",
" <th>nb_hab_2012</th>\n",
" <th>dens</th>\n",
" <th>surf</th>\n",
" <th>long</th>\n",
" <th>lat</th>\n",
" <th>alt_min</th>\n",
" <th>alt_max</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1618</th>\n",
" <td>4</td>\n",
" <td>Larche</td>\n",
" <td>4540</td>\n",
" <td>74</td>\n",
" <td>83</td>\n",
" <td>100</td>\n",
" <td>1</td>\n",
" <td>68.86</td>\n",
" <td>6.85000</td>\n",
" <td>44.4500</td>\n",
" <td>1606.0</td>\n",
" <td>3165.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1790</th>\n",
" <td>5</td>\n",
" <td>Ristolas</td>\n",
" <td>5460</td>\n",
" <td>90</td>\n",
" <td>78</td>\n",
" <td>100</td>\n",
" <td>1</td>\n",
" <td>82.18</td>\n",
" <td>6.95000</td>\n",
" <td>44.7667</td>\n",
" <td>1571.0</td>\n",
" <td>3294.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1798</th>\n",
" <td>5</td>\n",
" <td>Saint-Véran</td>\n",
" <td>5350</td>\n",
" <td>257</td>\n",
" <td>265</td>\n",
" <td>300</td>\n",
" <td>5</td>\n",
" <td>44.75</td>\n",
" <td>6.86667</td>\n",
" <td>44.7000</td>\n",
" <td>1756.0</td>\n",
" <td>3175.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1847</th>\n",
" <td>5</td>\n",
" <td>Molines-en-Queyras</td>\n",
" <td>5350</td>\n",
" <td>315</td>\n",
" <td>322</td>\n",
" <td>300</td>\n",
" <td>5</td>\n",
" <td>53.62</td>\n",
" <td>6.85000</td>\n",
" <td>44.7333</td>\n",
" <td>1625.0</td>\n",
" <td>3160.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1904</th>\n",
" <td>5</td>\n",
" <td>Abriès</td>\n",
" <td>5460</td>\n",
" <td>365</td>\n",
" <td>358</td>\n",
" <td>400</td>\n",
" <td>4</td>\n",
" <td>77.13</td>\n",
" <td>6.93333</td>\n",
" <td>44.7833</td>\n",
" <td>1513.0</td>\n",
" <td>3305.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1923</th>\n",
" <td>5</td>\n",
" <td>Villar-d'Arêne</td>\n",
" <td>5480</td>\n",
" <td>287</td>\n",
" <td>217</td>\n",
" <td>300</td>\n",
" <td>3</td>\n",
" <td>77.51</td>\n",
" <td>6.33711</td>\n",
" <td>45.0423</td>\n",
" <td>1519.0</td>\n",
" <td>3883.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26927</th>\n",
" <td>66</td>\n",
" <td>La Llagonne</td>\n",
" <td>66210</td>\n",
" <td>242</td>\n",
" <td>264</td>\n",
" <td>300</td>\n",
" <td>10</td>\n",
" <td>23.09</td>\n",
" <td>2.11667</td>\n",
" <td>42.5333</td>\n",
" <td>1546.0</td>\n",
" <td>2196.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>26943</th>\n",
" <td>66</td>\n",
" <td>Caudiès-de-Conflent</td>\n",
" <td>66360</td>\n",
" <td>13</td>\n",
" <td>6</td>\n",
" <td>0</td>\n",
" <td>2</td>\n",
" <td>6.50</td>\n",
" <td>2.16139</td>\n",
" <td>42.5673</td>\n",
" <td>1616.0</td>\n",
" <td>2045.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27039</th>\n",
" <td>66</td>\n",
" <td>Porté-Puymorens</td>\n",
" <td>66760</td>\n",
" <td>131</td>\n",
" <td>147</td>\n",
" <td>100</td>\n",
" <td>2</td>\n",
" <td>49.42</td>\n",
" <td>1.83333</td>\n",
" <td>42.5500</td>\n",
" <td>1557.0</td>\n",
" <td>2827.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27125</th>\n",
" <td>66</td>\n",
" <td>Mont-Louis</td>\n",
" <td>66210</td>\n",
" <td>247</td>\n",
" <td>272</td>\n",
" <td>300</td>\n",
" <td>633</td>\n",
" <td>0.39</td>\n",
" <td>2.11667</td>\n",
" <td>42.5167</td>\n",
" <td>1516.0</td>\n",
" <td>1608.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>27134</th>\n",
" <td>66</td>\n",
" <td>Angles</td>\n",
" <td>66210</td>\n",
" <td>566</td>\n",
" <td>589</td>\n",
" <td>600</td>\n",
" <td>13</td>\n",
" <td>43.20</td>\n",
" <td>2.07445</td>\n",
" <td>42.5778</td>\n",
" <td>1531.0</td>\n",
" <td>2808.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29970</th>\n",
" <td>73</td>\n",
" <td>Bessans</td>\n",
" <td>73480</td>\n",
" <td>343</td>\n",
" <td>310</td>\n",
" <td>300</td>\n",
" <td>2</td>\n",
" <td>128.08</td>\n",
" <td>6.99167</td>\n",
" <td>45.3167</td>\n",
" <td>1673.0</td>\n",
" <td>3754.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30114</th>\n",
" <td>73</td>\n",
" <td>Val-d'Isère</td>\n",
" <td>73150</td>\n",
" <td>1563</td>\n",
" <td>1628</td>\n",
" <td>1600</td>\n",
" <td>16</td>\n",
" <td>94.39</td>\n",
" <td>6.98333</td>\n",
" <td>45.4500</td>\n",
" <td>1785.0</td>\n",
" <td>3599.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30127</th>\n",
" <td>73</td>\n",
" <td>Bonneval-sur-Arc</td>\n",
" <td>73480</td>\n",
" <td>241</td>\n",
" <td>239</td>\n",
" <td>200</td>\n",
" <td>2</td>\n",
" <td>82.72</td>\n",
" <td>7.05000</td>\n",
" <td>45.3667</td>\n",
" <td>1759.0</td>\n",
" <td>3642.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" dep nom cp nb_hab_2010 nb_hab_1999 nb_hab_2012 \\\n",
"1618 4 Larche 4540 74 83 100 \n",
"1790 5 Ristolas 5460 90 78 100 \n",
"1798 5 Saint-Véran 5350 257 265 300 \n",
"1847 5 Molines-en-Queyras 5350 315 322 300 \n",
"1904 5 Abriès 5460 365 358 400 \n",
"1923 5 Villar-d'Arêne 5480 287 217 300 \n",
"26927 66 La Llagonne 66210 242 264 300 \n",
"26943 66 Caudiès-de-Conflent 66360 13 6 0 \n",
"27039 66 Porté-Puymorens 66760 131 147 100 \n",
"27125 66 Mont-Louis 66210 247 272 300 \n",
"27134 66 Angles 66210 566 589 600 \n",
"29970 73 Bessans 73480 343 310 300 \n",
"30114 73 Val-d'Isère 73150 1563 1628 1600 \n",
"30127 73 Bonneval-sur-Arc 73480 241 239 200 \n",
"\n",
" dens surf long lat alt_min alt_max \n",
"1618 1 68.86 6.85000 44.4500 1606.0 3165.0 \n",
"1790 1 82.18 6.95000 44.7667 1571.0 3294.0 \n",
"1798 5 44.75 6.86667 44.7000 1756.0 3175.0 \n",
"1847 5 53.62 6.85000 44.7333 1625.0 3160.0 \n",
"1904 4 77.13 6.93333 44.7833 1513.0 3305.0 \n",
"1923 3 77.51 6.33711 45.0423 1519.0 3883.0 \n",
"26927 10 23.09 2.11667 42.5333 1546.0 2196.0 \n",
"26943 2 6.50 2.16139 42.5673 1616.0 2045.0 \n",
"27039 2 49.42 1.83333 42.5500 1557.0 2827.0 \n",
"27125 633 0.39 2.11667 42.5167 1516.0 1608.0 \n",
"27134 13 43.20 2.07445 42.5778 1531.0 2808.0 \n",
"29970 2 128.08 6.99167 45.3167 1673.0 3754.0 \n",
"30114 16 94.39 6.98333 45.4500 1785.0 3599.0 \n",
"30127 2 82.72 7.05000 45.3667 1759.0 3642.0 "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes[villes[\"alt_min\"]>1500]"
]
},
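{
"cell_type": "markdown",
"metadata": {},
"source": [
"The filter above uses a single condition. As a sketch of how two conditions might be combined, the cell below keeps the towns that are both high (minimum altitude above 1500) and have more than 200 inhabitants in 2010. The `exemple` table and the `selection` name are hypothetical; the four rows are copied from outputs shown in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical sample: four rows copied from the outputs shown in this notebook\n",
"exemple = pd.DataFrame({\n",
"    'nom': ['Larche', 'Bessans', 'Mont-Louis', 'Plagne'],\n",
"    'nb_hab_2010': [74, 343, 247, 129],\n",
"    'alt_min': [1606.0, 1673.0, 1516.0, 560.0],\n",
"})\n",
"\n",
"# Each condition goes in its own parentheses; & stands for AND, | stands for OR\n",
"selection = exemple[(exemple['alt_min'] > 1500) & (exemple['nb_hab_2010'] > 200)]\n",
"print(selection)"
]
},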
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Analysing data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Count the rows (`count()` returns, for each column, the number of values that are not missing)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"dep 36700\n",
"nom 36700\n",
"cp 36700\n",
"nb_hab_2010 36700\n",
"nb_hab_1999 36700\n",
"nb_hab_2012 36700\n",
"dens 36700\n",
"surf 36700\n",
"long 36700\n",
"lat 36700\n",
"alt_min 36568\n",
"alt_max 36568\n",
"dtype: int64"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes.count()"
]
},
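{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the output above, `alt_min` and `alt_max` show a smaller count than the other columns because `count()` skips missing values. If the goal is the total number of rows, `len()` or `shape` may be more direct. A minimal sketch on a hypothetical three-row sample (values copied from this notebook; Sada has no altitude recorded):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical sample: the last row has a missing altitude (NaN)\n",
"exemple = pd.DataFrame({\n",
"    'nom': ['Ozan', 'Plagne', 'Sada'],\n",
"    'alt_min': [170.0, 560.0, float('nan')],\n",
"})\n",
"\n",
"print(len(exemple))      # number of rows, missing values included\n",
"print(exemple.shape)     # (number of rows, number of columns)\n",
"print(exemple.count())   # number of non-missing values in each column"
]
},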
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Compute a mean (here, the mean number of inhabitants in 2010)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1768.0113079019075"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes[\"nb_hab_2010\"].mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Find a minimum (here, the smallest number of inhabitants in 2010)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes[\"nb_hab_2010\"].min()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Find a maximum (here, the largest number of inhabitants in 2010)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2243833"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes[\"nb_hab_2010\"].max()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sorting the data\n",
"\n",
"- Sort in ascending order of the 2010 population"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>dep</th>\n",
" <th>nom</th>\n",
" <th>cp</th>\n",
" <th>nb_hab_2010</th>\n",
" <th>nb_hab_1999</th>\n",
" <th>nb_hab_2012</th>\n",
" <th>dens</th>\n",
" <th>surf</th>\n",
" <th>long</th>\n",
" <th>lat</th>\n",
" <th>alt_min</th>\n",
" <th>alt_max</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>21096</th>\n",
" <td>55</td>\n",
" <td>Bezonvaux</td>\n",
" <td>55100</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>9.23</td>\n",
" <td>5.46750</td>\n",
" <td>49.2367</td>\n",
" <td>226.0</td>\n",
" <td>367.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21155</th>\n",
" <td>55</td>\n",
" <td>Louvemont-Côte-du-Poivre</td>\n",
" <td>55100</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>8.25</td>\n",
" <td>5.39834</td>\n",
" <td>49.2378</td>\n",
" <td>214.0</td>\n",
" <td>375.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21038</th>\n",
" <td>55</td>\n",
" <td>Fleury-devant-Douaumont</td>\n",
" <td>55100</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>10.27</td>\n",
" <td>5.43445</td>\n",
" <td>49.1950</td>\n",
" <td>227.0</td>\n",
" <td>390.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21051</th>\n",
" <td>55</td>\n",
" <td>Haumont-près-Samogneux</td>\n",
" <td>55100</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>10.81</td>\n",
" <td>5.35251</td>\n",
" <td>49.2728</td>\n",
" <td>194.0</td>\n",
" <td>355.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21300</th>\n",
" <td>55</td>\n",
" <td>Beaumont-en-Verdunois</td>\n",
" <td>55100</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>7.87</td>\n",
" <td>5.40778</td>\n",
" <td>49.2587</td>\n",
" <td>233.0</td>\n",
" <td>372.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2049</th>\n",
" <td>6</td>\n",
" <td>Nice</td>\n",
" <td>06000-06100-06200-06300</td>\n",
" <td>343304</td>\n",
" <td>343123</td>\n",
" <td>344900</td>\n",
" <td>4773</td>\n",
" <td>71.92</td>\n",
" <td>7.25000</td>\n",
" <td>43.7000</td>\n",
" <td>0.0</td>\n",
" <td>520.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11718</th>\n",
" <td>31</td>\n",
" <td>Toulouse</td>\n",
" <td>31000-31100-31200-31300-31400-31500</td>\n",
" <td>441802</td>\n",
" <td>390301</td>\n",
" <td>439600</td>\n",
" <td>3734</td>\n",
" <td>118.30</td>\n",
" <td>1.43333</td>\n",
" <td>43.6000</td>\n",
" <td>115.0</td>\n",
" <td>263.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>28152</th>\n",
" <td>69</td>\n",
" <td>Lyon</td>\n",
" <td>69001-69002-69003-69004-69005-69006-69007-6900...</td>\n",
" <td>484344</td>\n",
" <td>445274</td>\n",
" <td>474900</td>\n",
" <td>10117</td>\n",
" <td>47.87</td>\n",
" <td>4.84139</td>\n",
" <td>45.7589</td>\n",
" <td>162.0</td>\n",
" <td>312.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4439</th>\n",
" <td>13</td>\n",
" <td>Marseille</td>\n",
" <td>13001-13002-13003-13004-13005-13006-13007-1300...</td>\n",
" <td>850726</td>\n",
" <td>797491</td>\n",
" <td>851400</td>\n",
" <td>3535</td>\n",
" <td>240.62</td>\n",
" <td>5.37639</td>\n",
" <td>43.2967</td>\n",
" <td>0.0</td>\n",
" <td>640.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30437</th>\n",
" <td>75</td>\n",
" <td>Paris</td>\n",
" <td>75001-75002-75003-75004-75005-75006-75007-7500...</td>\n",
" <td>2243833</td>\n",
" <td>2125851</td>\n",
" <td>2211000</td>\n",
" <td>21288</td>\n",
" <td>105.40</td>\n",
" <td>2.34445</td>\n",
" <td>48.8600</td>\n",
" <td>27.0</td>\n",
" <td>127.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>36700 rows × 12 columns</p>\n",
"</div>"
],
"text/plain": [
" dep nom \\\n",
"21096 55 Bezonvaux \n",
"21155 55 Louvemont-Côte-du-Poivre \n",
"21038 55 Fleury-devant-Douaumont \n",
"21051 55 Haumont-près-Samogneux \n",
"21300 55 Beaumont-en-Verdunois \n",
"... .. ... \n",
"2049 6 Nice \n",
"11718 31 Toulouse \n",
"28152 69 Lyon \n",
"4439 13 Marseille \n",
"30437 75 Paris \n",
"\n",
" cp nb_hab_2010 \\\n",
"21096 55100 0 \n",
"21155 55100 0 \n",
"21038 55100 0 \n",
"21051 55100 0 \n",
"21300 55100 0 \n",
"... ... ... \n",
"2049 06000-06100-06200-06300 343304 \n",
"11718 31000-31100-31200-31300-31400-31500 441802 \n",
"28152 69001-69002-69003-69004-69005-69006-69007-6900... 484344 \n",
"4439 13001-13002-13003-13004-13005-13006-13007-1300... 850726 \n",
"30437 75001-75002-75003-75004-75005-75006-75007-7500... 2243833 \n",
"\n",
" nb_hab_1999 nb_hab_2012 dens surf long lat alt_min \\\n",
"21096 0 0 0 9.23 5.46750 49.2367 226.0 \n",
"21155 0 0 0 8.25 5.39834 49.2378 214.0 \n",
"21038 0 0 0 10.27 5.43445 49.1950 227.0 \n",
"21051 0 0 0 10.81 5.35251 49.2728 194.0 \n",
"21300 0 0 0 7.87 5.40778 49.2587 233.0 \n",
"... ... ... ... ... ... ... ... \n",
"2049 343123 344900 4773 71.92 7.25000 43.7000 0.0 \n",
"11718 390301 439600 3734 118.30 1.43333 43.6000 115.0 \n",
"28152 445274 474900 10117 47.87 4.84139 45.7589 162.0 \n",
"4439 797491 851400 3535 240.62 5.37639 43.2967 0.0 \n",
"30437 2125851 2211000 21288 105.40 2.34445 48.8600 27.0 \n",
"\n",
" alt_max \n",
"21096 367.0 \n",
"21155 375.0 \n",
"21038 390.0 \n",
"21051 355.0 \n",
"21300 372.0 \n",
"... ... \n",
"2049 520.0 \n",
"11718 263.0 \n",
"28152 312.0 \n",
"4439 640.0 \n",
"30437 127.0 \n",
"\n",
"[36700 rows x 12 columns]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes.sort_values(by=\"nb_hab_2010\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Sort in descending order (here, by minimum altitude)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>dep</th>\n",
" <th>nom</th>\n",
" <th>cp</th>\n",
" <th>nb_hab_2010</th>\n",
" <th>nb_hab_1999</th>\n",
" <th>nb_hab_2012</th>\n",
" <th>dens</th>\n",
" <th>surf</th>\n",
" <th>long</th>\n",
" <th>lat</th>\n",
" <th>alt_min</th>\n",
" <th>alt_max</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>30114</th>\n",
" <td>73</td>\n",
" <td>Val-d'Isère</td>\n",
" <td>73150</td>\n",
" <td>1563</td>\n",
" <td>1628</td>\n",
" <td>1600</td>\n",
" <td>16</td>\n",
" <td>94.39</td>\n",
" <td>6.98333</td>\n",
" <td>45.45000</td>\n",
" <td>1785.0</td>\n",
" <td>3599.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>30127</th>\n",
" <td>73</td>\n",
" <td>Bonneval-sur-Arc</td>\n",
" <td>73480</td>\n",
" <td>241</td>\n",
" <td>239</td>\n",
" <td>200</td>\n",
" <td>2</td>\n",
" <td>82.72</td>\n",
" <td>7.05000</td>\n",
" <td>45.36670</td>\n",
" <td>1759.0</td>\n",
" <td>3642.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1798</th>\n",
" <td>5</td>\n",
" <td>Saint-Véran</td>\n",
" <td>5350</td>\n",
" <td>257</td>\n",
" <td>265</td>\n",
" <td>300</td>\n",
" <td>5</td>\n",
" <td>44.75</td>\n",
" <td>6.86667</td>\n",
" <td>44.70000</td>\n",
" <td>1756.0</td>\n",
" <td>3175.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>29970</th>\n",
" <td>73</td>\n",
" <td>Bessans</td>\n",
" <td>73480</td>\n",
" <td>343</td>\n",
" <td>310</td>\n",
" <td>300</td>\n",
" <td>2</td>\n",
" <td>128.08</td>\n",
" <td>6.99167</td>\n",
" <td>45.31670</td>\n",
" <td>1673.0</td>\n",
" <td>3754.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1847</th>\n",
" <td>5</td>\n",
" <td>Molines-en-Queyras</td>\n",
" <td>5350</td>\n",
" <td>315</td>\n",
" <td>322</td>\n",
" <td>300</td>\n",
" <td>5</td>\n",
" <td>53.62</td>\n",
" <td>6.85000</td>\n",
" <td>44.73330</td>\n",
" <td>1625.0</td>\n",
" <td>3160.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36695</th>\n",
" <td>976</td>\n",
" <td>Sada</td>\n",
" <td>97640</td>\n",
" <td>10195</td>\n",
" <td>10195</td>\n",
" <td>10195</td>\n",
" <td>933</td>\n",
" <td>10.92</td>\n",
" <td>45.10470</td>\n",
" <td>-12.84860</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36696</th>\n",
" <td>976</td>\n",
" <td>Tsingoni</td>\n",
" <td>97680</td>\n",
" <td>10454</td>\n",
" <td>10454</td>\n",
" <td>10454</td>\n",
" <td>300</td>\n",
" <td>34.76</td>\n",
" <td>45.10700</td>\n",
" <td>-12.78970</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36697</th>\n",
" <td>971</td>\n",
" <td>Saint-Barthélemy</td>\n",
" <td>97133</td>\n",
" <td>8938</td>\n",
" <td>8938</td>\n",
" <td>8938</td>\n",
" <td>372</td>\n",
" <td>24.00</td>\n",
" <td>-62.83330</td>\n",
" <td>17.91670</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36698</th>\n",
" <td>971</td>\n",
" <td>Saint-Martin</td>\n",
" <td>97150</td>\n",
" <td>36979</td>\n",
" <td>36979</td>\n",
" <td>36979</td>\n",
" <td>695</td>\n",
" <td>53.20</td>\n",
" <td>18.09130</td>\n",
" <td>-63.08290</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36699</th>\n",
" <td>975</td>\n",
" <td>Saint-Pierre-et-Miquelon</td>\n",
" <td>97500</td>\n",
" <td>6080</td>\n",
" <td>6080</td>\n",
" <td>6080</td>\n",
" <td>25</td>\n",
" <td>242.00</td>\n",
" <td>46.71070</td>\n",
" <td>1.71819</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>36700 rows × 12 columns</p>\n",
"</div>"
],
"text/plain": [
" dep nom cp nb_hab_2010 nb_hab_1999 \\\n",
"30114 73 Val-d'Isère 73150 1563 1628 \n",
"30127 73 Bonneval-sur-Arc 73480 241 239 \n",
"1798 5 Saint-Véran 5350 257 265 \n",
"29970 73 Bessans 73480 343 310 \n",
"1847 5 Molines-en-Queyras 5350 315 322 \n",
"... ... ... ... ... ... \n",
"36695 976 Sada 97640 10195 10195 \n",
"36696 976 Tsingoni 97680 10454 10454 \n",
"36697 971 Saint-Barthélemy 97133 8938 8938 \n",
"36698 971 Saint-Martin 97150 36979 36979 \n",
"36699 975 Saint-Pierre-et-Miquelon 97500 6080 6080 \n",
"\n",
" nb_hab_2012 dens surf long lat alt_min alt_max \n",
"30114 1600 16 94.39 6.98333 45.45000 1785.0 3599.0 \n",
"30127 200 2 82.72 7.05000 45.36670 1759.0 3642.0 \n",
"1798 300 5 44.75 6.86667 44.70000 1756.0 3175.0 \n",
"29970 300 2 128.08 6.99167 45.31670 1673.0 3754.0 \n",
"1847 300 5 53.62 6.85000 44.73330 1625.0 3160.0 \n",
"... ... ... ... ... ... ... ... \n",
"36695 10195 933 10.92 45.10470 -12.84860 NaN NaN \n",
"36696 10454 300 34.76 45.10700 -12.78970 NaN NaN \n",
"36697 8938 372 24.00 -62.83330 17.91670 NaN NaN \n",
"36698 36979 695 53.20 18.09130 -63.08290 NaN NaN \n",
"36699 6080 25 242.00 46.71070 1.71819 NaN NaN \n",
"\n",
"[36700 rows x 12 columns]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes.sort_values(by=\"alt_min\", ascending=False)"
]
},
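{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sorting returns the whole table. To keep only the first few rows of a ranking, one option is to chain `head()` after the sort; `nlargest()` is a shortcut that should give the same rows. The sketch below uses a hypothetical five-row sample (the five most populated towns shown in the ascending sort above) and a hypothetical `top3` name."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical sample: five rows copied from the outputs shown in this notebook\n",
"exemple = pd.DataFrame({\n",
"    'nom': ['Nice', 'Toulouse', 'Lyon', 'Marseille', 'Paris'],\n",
"    'nb_hab_2010': [343304, 441802, 484344, 850726, 2243833],\n",
"})\n",
"\n",
"# Sort in descending order, then keep the first three rows\n",
"top3 = exemple.sort_values(by='nb_hab_2010', ascending=False).head(3)\n",
"print(top3)\n",
"\n",
"# Shortcut: the three rows with the largest values in the column\n",
"print(exemple.nlargest(3, 'nb_hab_2010'))"
]
},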
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Computing with the data\n",
"\n",
"We might want to compute the difference between the population in 2010 and in 2012 and store the result in a new column.\n",
"This can be done as follows:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>dep</th>\n",
" <th>nom</th>\n",
" <th>cp</th>\n",
" <th>nb_hab_2010</th>\n",
" <th>nb_hab_1999</th>\n",
" <th>nb_hab_2012</th>\n",
" <th>dens</th>\n",
" <th>surf</th>\n",
" <th>long</th>\n",
" <th>lat</th>\n",
" <th>alt_min</th>\n",
" <th>alt_max</th>\n",
" <th>diff_10_12</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>Ozan</td>\n",
" <td>1190</td>\n",
" <td>618</td>\n",
" <td>469</td>\n",
" <td>500</td>\n",
" <td>93</td>\n",
" <td>6.60</td>\n",
" <td>4.91667</td>\n",
" <td>46.38330</td>\n",
" <td>170.0</td>\n",
" <td>205.0</td>\n",
" <td>-118</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>Cormoranche-sur-Saône</td>\n",
" <td>1290</td>\n",
" <td>1058</td>\n",
" <td>903</td>\n",
" <td>1000</td>\n",
" <td>107</td>\n",
" <td>9.85</td>\n",
" <td>4.83333</td>\n",
" <td>46.23330</td>\n",
" <td>168.0</td>\n",
" <td>211.0</td>\n",
" <td>-58</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1</td>\n",
" <td>Plagne</td>\n",
" <td>1130</td>\n",
" <td>129</td>\n",
" <td>83</td>\n",
" <td>100</td>\n",
" <td>20</td>\n",
" <td>6.20</td>\n",
" <td>5.73333</td>\n",
" <td>46.18330</td>\n",
" <td>560.0</td>\n",
" <td>922.0</td>\n",
" <td>-29</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1</td>\n",
" <td>Tossiat</td>\n",
" <td>1250</td>\n",
" <td>1406</td>\n",
" <td>1111</td>\n",
" <td>1400</td>\n",
" <td>138</td>\n",
" <td>10.17</td>\n",
" <td>5.31667</td>\n",
" <td>46.13330</td>\n",
" <td>244.0</td>\n",
" <td>501.0</td>\n",
" <td>-6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1</td>\n",
" <td>Pouillat</td>\n",
" <td>1250</td>\n",
" <td>88</td>\n",
" <td>58</td>\n",
" <td>100</td>\n",
" <td>14</td>\n",
" <td>6.23</td>\n",
" <td>5.43333</td>\n",
" <td>46.33330</td>\n",
" <td>333.0</td>\n",
" <td>770.0</td>\n",
" <td>12</td>\n",
" </tr>\n",
" <tr>\n",
" <th>...</th>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" <td>...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36695</th>\n",
" <td>976</td>\n",
" <td>Sada</td>\n",
" <td>97640</td>\n",
" <td>10195</td>\n",
" <td>10195</td>\n",
" <td>10195</td>\n",
" <td>933</td>\n",
" <td>10.92</td>\n",
" <td>45.10470</td>\n",
" <td>-12.84860</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36696</th>\n",
" <td>976</td>\n",
" <td>Tsingoni</td>\n",
" <td>97680</td>\n",
" <td>10454</td>\n",
" <td>10454</td>\n",
" <td>10454</td>\n",
" <td>300</td>\n",
" <td>34.76</td>\n",
" <td>45.10700</td>\n",
" <td>-12.78970</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36697</th>\n",
" <td>971</td>\n",
" <td>Saint-Barthélemy</td>\n",
" <td>97133</td>\n",
" <td>8938</td>\n",
" <td>8938</td>\n",
" <td>8938</td>\n",
" <td>372</td>\n",
" <td>24.00</td>\n",
" <td>-62.83330</td>\n",
" <td>17.91670</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36698</th>\n",
" <td>971</td>\n",
" <td>Saint-Martin</td>\n",
" <td>97150</td>\n",
" <td>36979</td>\n",
" <td>36979</td>\n",
" <td>36979</td>\n",
" <td>695</td>\n",
" <td>53.20</td>\n",
" <td>18.09130</td>\n",
" <td>-63.08290</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>36699</th>\n",
" <td>975</td>\n",
" <td>Saint-Pierre-et-Miquelon</td>\n",
" <td>97500</td>\n",
" <td>6080</td>\n",
" <td>6080</td>\n",
" <td>6080</td>\n",
" <td>25</td>\n",
" <td>242.00</td>\n",
" <td>46.71070</td>\n",
" <td>1.71819</td>\n",
" <td>NaN</td>\n",
" <td>NaN</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>36700 rows × 13 columns</p>\n",
"</div>"
],
"text/plain": [
" dep nom cp nb_hab_2010 nb_hab_1999 \\\n",
"0 1 Ozan 1190 618 469 \n",
"1 1 Cormoranche-sur-Saône 1290 1058 903 \n",
"2 1 Plagne 1130 129 83 \n",
"3 1 Tossiat 1250 1406 1111 \n",
"4 1 Pouillat 1250 88 58 \n",
"... ... ... ... ... ... \n",
"36695 976 Sada 97640 10195 10195 \n",
"36696 976 Tsingoni 97680 10454 10454 \n",
"36697 971 Saint-Barthélemy 97133 8938 8938 \n",
"36698 971 Saint-Martin 97150 36979 36979 \n",
"36699 975 Saint-Pierre-et-Miquelon 97500 6080 6080 \n",
"\n",
" nb_hab_2012 dens surf long lat alt_min alt_max \\\n",
"0 500 93 6.60 4.91667 46.38330 170.0 205.0 \n",
"1 1000 107 9.85 4.83333 46.23330 168.0 211.0 \n",
"2 100 20 6.20 5.73333 46.18330 560.0 922.0 \n",
"3 1400 138 10.17 5.31667 46.13330 244.0 501.0 \n",
"4 100 14 6.23 5.43333 46.33330 333.0 770.0 \n",
"... ... ... ... ... ... ... ... \n",
"36695 10195 933 10.92 45.10470 -12.84860 NaN NaN \n",
"36696 10454 300 34.76 45.10700 -12.78970 NaN NaN \n",
"36697 8938 372 24.00 -62.83330 17.91670 NaN NaN \n",
"36698 36979 695 53.20 18.09130 -63.08290 NaN NaN \n",
"36699 6080 25 242.00 46.71070 1.71819 NaN NaN \n",
"\n",
" diff_10_12 \n",
"0 -118 \n",
"1 -58 \n",
"2 -29 \n",
"3 -6 \n",
"4 12 \n",
"... ... \n",
"36695 0 \n",
"36696 0 \n",
"36697 0 \n",
"36698 0 \n",
"36699 0 \n",
"\n",
"[36700 rows x 13 columns]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"villes[\"diff_10_12\"] = villes[\"nb_hab_2012\"] - villes[\"nb_hab_2010\"]\n",
"villes"
]
},
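{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same idea extends to other calculations between columns. As a sketch, the cell below adds a relative change in percent next to the difference. The `evol_pct` column name and the `exemple` table are hypothetical; the three rows are copied from the output above. On the full table, a town with 0 inhabitants in 2010 would lead to a division by zero, for which pandas returns `inf` or `NaN` instead of raising an error."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical sample: three rows copied from the outputs shown in this notebook\n",
"exemple = pd.DataFrame({\n",
"    'nom': ['Ozan', 'Tossiat', 'Pouillat'],\n",
"    'nb_hab_2010': [618, 1406, 88],\n",
"    'nb_hab_2012': [500, 1400, 100],\n",
"})\n",
"\n",
"# Difference between 2012 and 2010, as in the cell above\n",
"exemple['diff_10_12'] = exemple['nb_hab_2012'] - exemple['nb_hab_2010']\n",
"\n",
"# Relative change in percent, computed row by row\n",
"exemple['evol_pct'] = 100 * exemple['diff_10_12'] / exemple['nb_hab_2010']\n",
"print(exemple)"
]
},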
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}