
Commit a2835a5

committed: more exercises with the baby dataset

1 parent 9b70a04 · commit a2835a5

File tree

16 files changed: +1,016,521 additions, −22 deletions

‎.learn/assets/us_baby_names_right.csv

Lines changed: 1016396 additions & 0 deletions
Large diffs are not rendered by default.

‎.learn/exercises/05.6-iloc/README.md

Lines changed: 0 additions & 17 deletions
This file was deleted.

‎.learn/exercises/05.6-loc/README.md

Lines changed: 29 additions & 0 deletions

# Using the loc function in Pandas

You can also use the `data_frame.loc` function to filter records, passing a logical operation as the index, like this:

```python
# get people more than 18 years old
data_frame.loc[data_frame['age'] > 18]
```

## 📝 Instructions

Using the `loc` function, print on the terminal all Pokemon with an Attack of more than 80.

## 💻 Expected output

```bash
       #                       Name  Type 1  Type 2  HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
2      3                   Venusaur   Grass  Poison  80      82       83      100      100     80           1      False
3      3      VenusaurMega Venusaur   Grass  Poison  80     100      123      122      120     80           1      False
6      6                  Charizard    Fire  Flying  78      84       78      109       85    100           1      False
7      6  CharizardMega Charizard X    Fire  Dragon  78     130      111      130       85    100           1      False
8      6  CharizardMega Charizard Y    Fire  Flying  78     104       78      159      115    100           1      False
..   ...                        ...     ...     ...  ..     ...      ...      ...      ...    ...         ...        ...
795  719                    Diancie    Rock   Fairy  50     100      150      100      150     50           6       True
796  719        DiancieMega Diancie    Rock   Fairy  50     160      110      160      110    110           6       True
797  720        HoopaHoopa Confined Psychic   Ghost  80     110       60      150      130     70           6       True
798  720         HoopaHoopa Unbound Psychic    Dark  80     160       60      170      130     80           6       True
799  721                  Volcanion    Fire   Water  80     110      120      130       90     70           6       True
```
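The boolean-mask pattern the exercise describes can be sketched on a tiny made-up DataFrame (the rows below are illustrative, not the course's `pokemon_data.csv`):

```python
import pandas as pd

# Toy stand-in for the pokemon dataset used in the exercise
df = pd.DataFrame({
    "Name": ["Venusaur", "Charizard", "Squirtle"],
    "Attack": [82, 84, 48],
})

# A boolean Series as the loc index keeps only the rows where it is True
strong = df.loc[df["Attack"] > 80]
print(strong["Name"].tolist())  # ['Venusaur', 'Charizard']
```

The comparison `df["Attack"] > 80` builds a Series of `True`/`False` values, one per row, and `loc` selects the rows flagged `True`.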
Lines changed: 1 addition & 1 deletion

```diff
 import pandas as pd
 
 data_frame = pd.read_csv('.learn/assets/pokemon_data.csv')
-print(data_frame.iloc[133,6])
+print(data_frame.loc[data_frame['Attack'] > 80])
```
Lines changed: 9 additions & 0 deletions

# Filter and count

How many Pokemon are legendary?

Hint: use the `loc` function with a logical operation as the index, and use the `len` function to count.

## 💻 Expected output

`65`
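Combining the hint's two pieces — a boolean `loc` filter and `len` — looks like this on a made-up DataFrame (the names and flags are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Mewtwo", "Pidgey", "Moltres"],
    "Legendary": [True, False, True],
})

# Filter with a boolean condition, then count the matching rows
legendary_count = len(df.loc[df["Legendary"] == True])
print(legendary_count)  # 2
```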
Lines changed: 4 additions & 0 deletions

```python
import pandas as pd

data_frame = pd.read_csv('.learn/assets/pokemon_data.csv')
print(len(data_frame.loc[data_frame['Legendary'] == True]))
```

‎.learn/exercises/05.7-iterate-dataframe/README.md

Whitespace-only changes.
Lines changed: 21 additions & 0 deletions

# Clean up the Baby Names dataset

Let's start a new exercise: cleaning up a publicly known dataset of [US baby names](https://www.kaggle.com/kaggle/us-baby-names) from Kaggle.

## 📝 Instructions

First, let's understand the dataset by printing its first rows:

- Import the `.learn/assets/us_baby_names_right.csv` dataset.
- Print the first 5 records on the command line.

## 💻 Expected output

```bash
   Unnamed: 0     Id     Name  Year Gender State  Count
0       11349  11350     Emma  2004      F    AK     62
1       11350  11351  Madison  2004      F    AK     48
2       11351  11352   Hannah  2004      F    AK     46
3       11352  11353    Grace  2004      F    AK     44
4       11353  11354    Emily  2004      F    AK     41
```
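The read-then-peek pattern can be tried without the course's CSV by feeding `read_csv` an in-memory file (the three rows below are copied from the expected output above, just fewer of them):

```python
import io
import pandas as pd

# In-memory stand-in for us_baby_names_right.csv
csv_text = """Unnamed: 0,Id,Name,Year,Gender,State,Count
11349,11350,Emma,2004,F,AK,62
11350,11351,Madison,2004,F,AK,48
11351,11352,Hannah,2004,F,AK,46
"""
data_frame = pd.read_csv(io.StringIO(csv_text))

# head() shows the first 5 rows by default; pass a number to change that
print(data_frame.head())
```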
Lines changed: 4 additions & 0 deletions

```python
import pandas as pd

data_frame = pd.read_csv('.learn/assets/us_baby_names_right.csv')
print(data_frame.head())
```
Lines changed: 19 additions & 0 deletions

# Remove column

We can see that the dataset's first column is called "Unnamed: 0", and it contains a number whose meaning we don't know.

## 📝 Instructions

- Remove the first column from the dataset.
- Print the first 5 records again.

## 💻 Expected output

```bash
      Id     Name  Year Gender State  Count
0  11350     Emma  2004      F    AK     62
1  11351  Madison  2004      F    AK     48
2  11352   Hannah  2004      F    AK     46
3  11353    Grace  2004      F    AK     44
4  11354    Emily  2004      F    AK     41
```
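One way to remove such a column (an assumption — the exercise may accept other approaches) is `DataFrame.drop` with the `columns=` keyword, sketched here on a two-row toy DataFrame:

```python
import pandas as pd

df = pd.DataFrame({
    "Unnamed: 0": [11349, 11350],
    "Id": [11350, 11351],
    "Name": ["Emma", "Madison"],
})

# drop() returns a new DataFrame without the stray index column
cleaned = df.drop(columns=["Unnamed: 0"])
print(cleaned.columns.tolist())  # ['Id', 'Name']
```

Alternatively, passing `index_col=0` to `pd.read_csv` skips the column at load time instead of dropping it afterwards.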
Lines changed: 13 additions & 0 deletions

# DataFrame value_counts

Are there more male or female names in the dataset?

Hint: use the `value_counts` function to get the `Gender` value count.

## 💻 Expected output

```bash
F    558846
M    457549
Name: Gender, dtype: int64
```
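`value_counts` tallies how often each distinct value appears in a column; a minimal sketch on made-up data:

```python
import pandas as pd

df = pd.DataFrame({"Gender": ["F", "M", "F", "F", "M"]})

# value_counts() returns a Series of counts, sorted in descending order
counts = df["Gender"].value_counts()
print(counts)  # F appears 3 times, M appears 2 times
```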
Lines changed: 5 additions & 0 deletions

```python
import pandas as pd

data_frame = pd.read_csv('.learn/assets/us_baby_names_right.csv')
count = data_frame['Gender'].value_counts()
print(count)
```
Lines changed: 11 additions & 0 deletions

# Group By

How many different names exist in the dataset?

1. Use the dataframe `groupby` function to group your table by name.
2. Use the `.sum()` function available after you group by to aggregate the grouped occurrences.
3. Use the `len(result)` function to count the number of groups.

## 💻 Expected output

`17632`
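The three steps above can be sketched on a made-up DataFrame (four rows, three distinct names):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Emma", "Emma", "Madison", "Hannah"],
    "Count": [62, 10, 48, 46],
})

# 1-2. Group rows by Name and sum the numeric columns per group
names = df.groupby("Name").sum()

# 3. One row per group, so len() gives the number of distinct names
print(len(names))  # 3
```

Since only the number of groups matters here, `df["Name"].nunique()` would give the same count directly, without the intermediate `.sum()`.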
Lines changed: 5 additions & 0 deletions

```python
import pandas as pd

data_frame = pd.read_csv('.learn/assets/us_baby_names_right.csv')
names = data_frame.groupby("Name").sum()
print(len(names))
```

‎.learn/vscode_queue.json

Lines changed: 1 addition & 1 deletion

```diff
-[{"name":"initializing","time":442957258393.204},{"name":"reset","time":442957258439.738},{"name":"configuration_loaded","time":442958277103.692},{"name":"start_exercise","time":443015519309.933,"data":"00-welcome"},{"name":"start_exercise","time":443060088697.605,"data":"01-terminal"},{"name":"start_exercise","time":443074953846.711,"data":"01.2-Pipenv"},{"name":"start_exercise","time":443118267987.033,"data":"02-installation"},{"name":"start_exercise","time":443140770208.037,"data":"02.2-create-script"},{"name":"start_exercise","time":443206131153.638,"data":"02.3-import_pandas"},{"name":"start_exercise","time":443230227006.7,"data":"03-Dataset"},{"name":"start_exercise","time":443294961210.749,"data":"04-Data_Frame"},{"name":"start_exercise","time":443317656762.219,"data":"04.1-from_dict"},{"name":"start_exercise","time":443330217152.771,"data":"04.1-iloc"},{"name":"start_exercise","time":443400861084.733,"data":"04.2-head"},{"name":"start_exercise","time":443415968058.695,"data":"04.3-tail"},{"name":"start_exercise","time":443420883130.11,"data":"04.4-print-columns"},{"name":"start_exercise","time":443468150689.343,"data":"04.5-iloc"},{"name":"start_exercise","time":443588107863.337,"data":"04-Data_Frame"}]
+[{"name":"initializing","time":442957258393.204},{"name":"reset","time":442957258439.738},{"name":"configuration_loaded","time":442958277103.692},{"name":"start_exercise","time":443015519309.933,"data":"00-welcome"},{"name":"start_exercise","time":443060088697.605,"data":"01-terminal"},{"name":"start_exercise","time":443074953846.711,"data":"01.2-Pipenv"},{"name":"start_exercise","time":443118267987.033,"data":"02-installation"},{"name":"start_exercise","time":443140770208.037,"data":"02.2-create-script"},{"name":"start_exercise","time":443206131153.638,"data":"02.3-import_pandas"},{"name":"start_exercise","time":443230227006.7,"data":"03-Dataset"},{"name":"start_exercise","time":443294961210.749,"data":"04-Data_Frame"},{"name":"start_exercise","time":443317656762.219,"data":"04.1-from_dict"},{"name":"start_exercise","time":443330217152.771,"data":"04.1-iloc"},{"name":"start_exercise","time":443400861084.733,"data":"04.2-head"},{"name":"start_exercise","time":443415968058.695,"data":"04.3-tail"},{"name":"start_exercise","time":443420883130.11,"data":"04.4-print-columns"},{"name":"start_exercise","time":443468150689.343,"data":"04.5-iloc"},{"name":"start_exercise","time":443588107863.337,"data":"04-Data_Frame"},{"name":"start_exercise","time":447177537931.47,"data":"04.1-from_dict"},{"name":"start_exercise","time":447180555960.543,"data":"04-Data_Frame"},{"name":"connection_ended","time":453841679092.838}]
```

‎app.py

Lines changed: 3 additions & 3 deletions

```diff
 import pandas as pd
 
-date_series = pd.date_range(start='05-01-2021', end='05-12-2021')
-
-print(date_series)
+data_frame = pd.read_csv('.learn/assets/us_baby_names_right.csv')
+names = data_frame.groupby("Name").sum()
+print(len(names))
```

0 commit comments