Web-scraped Transfermarkt data for all soccer/football transfers in 10 European leagues over 30 seasons
emordonez/transfermarkt-transfers
All soccer/football club transfers from 1992/93–2020/21 for 10 of the top European leagues, namely
- Premier League 🏴
- La Liga 🇪🇸
- Bundesliga 🇩🇪
- Serie A 🇮🇹
- Ligue 1 🇫🇷
- Primeira Liga 🇵🇹
- Eredivisie 🇳🇱
- Premier Liga* 🇷🇺
- Jupiler Pro League* 🇧🇪
- Scottish Premiership* 🏴
Data were obtained by web scraping league transfer data from Transfermarkt.
*Transfermarkt does not provide data for the 2011/12 Premier Liga season, the 1992/93 and 1993/94 Jupiler Pro League seasons, or the 1992/93–2002/03 Scottish Premiership seasons.
All data are provided in the `data` directory and grouped into season subdirectories. Feel free to use this dataset for your own purposes! You can clone it or download it via DownGit. Consult the README for more information.
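As a rough sketch of how the season subdirectories could be read in one pass, the helper below walks a local copy of the `data` directory and collects every CSV row. It relies only on the layout described above (CSV files somewhere beneath `data`). The function name `load_transfers` and the extra `source_file` key are hypothetical and not part of this repository.

```python
import csv
from pathlib import Path


def load_transfers(data_dir):
    """Read every CSV found below data_dir into one list of row dicts."""
    root = Path(data_dir)
    rows = []
    # rglob descends into the season subdirectories; sorting keeps the order stable
    for csv_path in sorted(root.rglob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # "source_file" is a hypothetical helper key recording where the row came from
                row["source_file"] = csv_path.relative_to(root).as_posix()
                rows.append(row)
    return rows
```

Calling `load_transfers("data")` from the top of a cloned copy should return one dictionary per transfer, keyed by the CSV column names.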
If you'd like to pull the raw data directly from the source or scrape data for other countries and leagues, you can use the Python script provided by `tmtransfers`.
Clone this repository and open a terminal in the cloned folder. First ensure all dependencies are met:
pip install -r requirements.txt
The module can now be run as a script from the top directory:
python -m tmtransfers
This launches a series of text prompts. You should see the following output to start:
    Select currency (default is euro):
    [1] EUR €
    [2] GBP £
    [3] USD $
    ===>
Follow the prompts to input your desired league parameters. Scraped data will then be written to CSVs in a newly created `data` directory.
As an example, an output CSV for the Premier League's 2020/21 season with the default options and before cleaning should look like:
club | name | age | nationality | position | short_pos | market_value | dealing_club | dealing_country | fee | movement | window | league | season |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Arsenal FC | Thomas Partey | 27 | Ghana | Defensive Midfield | DM | €40.00m | Atlético Madrid | Spain | €50.00m | in | summer | premier-league | 2020 |
Arsenal FC | Gabriel | 22 | Brazil | Centre-Back | CB | €20.00m | LOSC Lille | France | €26.00m | in | summer | premier-league | 2020 |
Arsenal FC | Pablo Marí | 26 | Spain | Centre-Back | CB | €4.80m | Flamengo | Brazil | €5.00m | in | summer | premier-league | 2020 |
Arsenal FC | Rúnar Alex Rúnarsson | 25 | Iceland | Goalkeeper | GK | €1.20m | Dijon | France | €2.00m | in | summer | premier-league | 2020 |
Arsenal FC | Cédric Soares | 28 | Portugal | Right-Back | RB | €8.00m | Southampton | England | free transfer | in | summer | premier-league | 2020 |
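Before cleaning, the `market_value` and `fee` columns hold display strings such as `€40.00m` and `free transfer`. The following is a minimal sketch of turning those strings into whole currency units. It is not how `tidy_transfers` works, and the function name `parse_fee` is hypothetical. The `m` suffix and `free transfer` come from the sample above. The `k` and `Th.` suffixes for thousands are an assumption about other values Transfermarkt may display.

```python
import re

# Multipliers for the unit suffix; "k" and "th." are assumed, only "m" appears in the sample
_MULTIPLIERS = {"m": 1_000_000, "k": 1_000, "th.": 1_000, "": 1}
_FEE_PATTERN = re.compile(r"^[€£$]\s*(\d+(?:\.\d+)?)\s*(m|k|th\.)?$", re.IGNORECASE)


def parse_fee(text):
    """Convert a raw fee string to an integer amount, or None if it is not recognised."""
    cleaned = text.strip()
    if cleaned.lower() == "free transfer":
        return 0
    match = _FEE_PATTERN.match(cleaned)
    if match is None:
        # Anything else is left for the caller to handle
        return None
    amount = float(match.group(1))
    suffix = (match.group(2) or "").lower()
    # round() removes floating-point noise and yields an int
    return round(amount * _MULTIPLIERS[suffix])
```

Returning `None` for unrecognised strings, rather than guessing, keeps unknown values visible so they can be inspected before any aggregation.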
Note: If you run the script again and scrape data for the same league and season, the existing CSV will be overwritten. Move or rename any existing files you want to keep before running the script again.
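One way to protect an existing file before a re-run is to rename it with a timestamp. This is only a sketch: the helper `back_up_csv` is hypothetical, and the path passed to it has to match whatever file the script actually wrote on your machine.

```python
from datetime import datetime
from pathlib import Path


def back_up_csv(csv_path, stamp=None):
    """Rename csv_path to '<stem>.<stamp><suffix>' and return the new path.

    Returns None when there is nothing to back up.
    """
    source = Path(csv_path)
    if not source.exists():
        return None
    # Default stamp is the current local time, e.g. 20210131-154500
    stamp = stamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.stem}.{stamp}{source.suffix}")
    if target.exists():
        # Refuse to clobber an earlier backup
        raise FileExistsError(target)
    source.rename(target)
    return target
```

Because the file is renamed rather than copied, the original path is free for the next scrape to write to.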
If you'd like to use this module elsewhere, install it from the top directory with
pip install .
It provides two functions, `scrape_transfermarkt` and `tidy_transfers`. Use them like so:
    import pandas
    import tmtransfers

    # Web scrape data for a league not explicitly given in the script
    # Returns a Pandas dataframe
    df = tmtransfers.scrape_transfermarkt(
        league_name='championship',
        league_id='GB2',
        season_id='2005',
        write=True
    )

    # Clean the data
    # Returns another Pandas dataframe
    tidy_df = tmtransfers.tidy_transfers(df)
See the documentation in `tmtransfers.py` for more details.
Note: These functions have been tested for only the above leagues through the listed seasons. You'll have to browse Transfermarkt for what to input to scrape other countries and leagues.
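When browsing Transfermarkt for another league, the values for `league_name` and `league_id` can usually be read off the competition page URL. The sketch below assumes the URL has the form `https://www.transfermarkt.com/<league_name>/startseite/wettbewerb/<league_id>`. That form is an assumption about the site, not something this module documents, and the helper name `league_params_from_url` is hypothetical.

```python
from urllib.parse import urlparse


def league_params_from_url(url):
    """Extract league_name and league_id from a competition page URL.

    Assumes a path shaped like /<league_name>/startseite/wettbewerb/<league_id>.
    """
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    if "wettbewerb" not in parts:
        raise ValueError(f"not a competition URL: {url}")
    id_index = parts.index("wettbewerb") + 1
    # The name must come before 'wettbewerb' and the id directly after it
    if id_index < 2 or id_index >= len(parts):
        raise ValueError(f"unexpected URL shape: {url}")
    return {"league_name": parts[0], "league_id": parts[id_index]}
```

For the Championship example above, a URL ending in `/championship/startseite/wettbewerb/GB2` would give `league_name='championship'` and `league_id='GB2'`, which can then be passed to `scrape_transfermarkt` along with a `season_id`.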
All data are scraped from Transfermarkt according to their terms of use.