- Notifications
You must be signed in to change notification settings - Fork2
What skills and qualifications are required for a data scientist?
NotificationsYou must be signed in to change notification settings
thiyangt/DSjobtracker
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
The DSjobtraker package compiles data from 1172 job advertisements indata science and statistics, making it an invaluable tool for learningkey skills required in these areas. Here is a word cloud of 97 requestedskills mentioned in the job advertisements. It provides insights intothe most sought-after skills in the field.
You can install thedevelopment version fromGitHub:
#install.packages("devtools")devtools::install_github("thiyangt/DSjobtracker")library(DSjobtracker)
tibble::glimpse(DStidy)
Rows: 1,172Columns: 109$ ID <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, …$ Consultant <chr> "Thiyanga", "Jayani", "Jayani", "Ja…$ DateRetrieved <dttm> 2020-08-05, 2020-08-07, 2020-08-07…$ DatePublished <dttm> NA, 2020-07-31, 2020-08-06, 2020-0…$ Job_title <chr> NA, "Junior Data Scientist", "Engin…$ Company <chr> NA, "Dialog Axiata PLC", "London St…$ R <dbl> 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0,…$ SAS <dbl> 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,…$ SPSS <dbl> 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,…$ Python <dbl> 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0,…$ MAtlab <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Scala <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ `C#` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ `MS Word` <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1,…$ `Ms Excel` <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1,…$ `OLE/DB` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ `Ms Access` <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,…$ `Ms PowerPoint` <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0,…$ Spreadsheets <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Data_visualization <dbl> 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0,…$ Presentation_Skills <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Communication <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ BigData <dbl> 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0,…$ Data_warehouse <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ cloud_storage <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Google_Cloud <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ AWS <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Machine_Learning <dbl> 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0,…$ `Deep Learning` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Computer_vision <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Java <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0,…$ `C++` <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…$ C <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…$ `Linux/Unix` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ SQL <dbl> 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0,…$ NoSQL <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ RDBMS <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Oracle <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,…$ MySQL <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0,…$ PHP <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…$ SPL <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ web_design_and_development_tools <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…$ AI <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ `Natural_Language_Processing(NLP)` <dbl> 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0,…$ `Microsoft Power BI` <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Google_Analytics <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ graphics_and_design_skills <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…$ Data_marketing <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ SEO <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Content_Management <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Tableau <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,…$ D3 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,…$ Alteryx <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ KNIME <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Spotfire <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Spark <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,…$ S3 <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…$ Redshift <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…$ DigitalOcean <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…$ Javascript <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…$ Kafka <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Storm <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Bash <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Hadoop <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,…$ Data_Pipelines <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ MPP_Platforms <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Qlik <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Pig <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Hive <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,…$ Tensorflow <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ `Map/Reduce` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Impala <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Solr <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Teradata <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ MongoDB <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Elasticsearch <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ YOLO <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ `agile execution` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,…$ Data_management <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ pyspark <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Data_mining <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…$ Data_science <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,…$ Web_Analytic_tools <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ IOT <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Numerical_Analysis <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Economic <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Finance_Knowledge <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Investment_Knowledge <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Problem_Solving <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Team_Handling <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Debtor_reconcilation <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Payroll_management <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Bayesian <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Optimization <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ Knowledge_in <chr> NA, NA, "Elasticsearch, Logstash, K…$ City <chr> NA, "Colombo", "Colombo", "Colombo"…$ Educational_qualifications <chr> NA, "Degree in Engineering / IT or …$ Salary <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…$ URL <chr> NA, "https://www.google.com/search?…$ Search_Term <chr> NA, "Data Analysis Jobs in Sri Lank…$ Job_Category <chr> "Unimportant", "Data Science", "Dat…$ Experience_Category <chr> "More than 2 and less than 5 years"…$ Location <chr> NA, "Sri Lanka", "Sri Lanka", "Sri …$ `Payment Frequency` <chr> "unspecified", "unspecified", "unsp…$ BSc_needed <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1,…$ MSc_needed <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…$ PhD_needed <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…$ `English Needed` <dbl> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,…$ year <dbl> 2020, 2020, 2020, 2020, 2020, 2020,…df2021<- get_data(2021)df2021
# A tibble: 382 × 115 ID Consultant URL Search_Term DateRetrieved DatePublished Job_Field <dbl> <chr> <chr> <chr> <chr> <chr> <chr> 1 1 Abhishanya https://w… statistics… 11/10/2021 11/10/2021 Informat… 2 2 Abhishanya https://w… statistics… 11/11/2021 11/11/2021 Education 3 3 Abhishanya https://w… Data scien… 11/11/2021 11/9/2021 Other 4 4 Abhishanya https://w… Data scien… 11/11/2021 10/13/2021 Other 5 5 Abhishanya https://w… Statistics… 11/11/2021 10/14/2021 Informat… 6 6 Abhishanya https://w… Data Scien… 11/11/2021 10/14/2020 Health 7 7 Abhishanya https://c… Data engin… 11/11/2021 11/8/2021 Other 8 8 Abhishanya https://l… Data engin… 11/11/2021 11/9/2021 Other 9 9 Abhishanya https://w… Data engin… 11/11/2021 10/20/2021 Informat…10 10 Abhishanya https://w… Data engin… 11/11/2021 10/21/2021 Finance # ℹ 372 more rows# ℹ 108 more variables: Job_title <chr>, Company <chr>, Knowledge_in <chr>,# `Minimum Experience in Years` <dbl>, City <chr>, Location <chr>,# Educational_qualifications <chr>, `Payment Frequency` <chr>,# Currency <chr>, Salary <dbl>, `English Needed` <dbl>,# `English proficiency description` <chr>, Additional_languages <chr>,# AI <dbl>, `Natural_Language_Processing(NLP)` <dbl>, Data_Pipelines <dbl>, …head(DStidy)# A tibble: 6 × 109 ID Consultant DateRetrieved DatePublished Job_title Company <dbl> <chr> <dttm> <dttm> <chr> <chr> 1 1 Thiyanga 2020-08-05 00:00:00 NA <NA> <NA> 2 2 Jayani 2020-08-07 00:00:00 2020-07-31 00:00:00 Junior Data … Dialog…3 3 Jayani 2020-08-07 00:00:00 2020-08-06 00:00:00 Engineer, An… London…4 4 Jayani 2020-08-07 00:00:00 2020-07-24 00:00:00 CI-Statistic… E.D. B…5 5 Jayani 2020-08-07 00:00:00 2020-07-24 00:00:00 DA-Data Anal… E.D. B…6 6 Jayani 2020-08-07 00:00:00 2020-08-13 00:00:00 Data Scienti… Emirat…# ℹ 103 more variables: R <dbl>, SAS <dbl>, SPSS <dbl>, Python <dbl>,# MAtlab <dbl>, Scala <dbl>, `C#` <dbl>, `MS Word` <dbl>, `Ms Excel` <dbl>,# `OLE/DB` <dbl>, `Ms Access` <dbl>, `Ms PowerPoint` <dbl>,# Spreadsheets <dbl>, Data_visualization <dbl>, Presentation_Skills <dbl>,# Communication <dbl>, BigData <dbl>, Data_warehouse <dbl>,# cloud_storage <dbl>, Google_Cloud <dbl>, AWS <dbl>, Machine_Learning <dbl>,# `Deep Learning` <dbl>, Computer_vision <dbl>, Java <dbl>, `C++` <dbl>, …head(DSraw)# A tibble: 6 × 152 ID Consultant DateRetrieved DatePublished Job_title Company R SAS <dbl> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl>1 1 Thiyanga 05/08/2020 <NA> <NA> <NA> 1 12 2 Jayani 07/08/2020 31/07/2020 Junior Data … Dialog… 1 03 3 Jayani 07/08/2020 06/08/20 Engineer, An… London… 0 04 4 Jayani 07/08/2020 24/07/2020 CI-Statistic… E.D. B… 1 15 5 Jayani 07/08/2020 24/07/2020 DA-Data Anal… E.D. B… 0 16 6 Jayani 07/08/2020 13/08/2020 Data Scienti… Emirat… 1 0# ℹ 144 more variables: SPSS <dbl>, Python <dbl>, MAtlab <dbl>, Scala <dbl>,# `C#` <dbl>, `MS Word` <dbl>, `Ms Excel` <dbl>, `OLE/DB` <dbl>,# `Ms Access` <dbl>, `Ms PowerPoint` <dbl>, Spreadsheets <dbl>,# Data_visualization <dbl>, Presentation_Skills <dbl>, Communication <dbl>,# BigData <dbl>, Data_warehouse <dbl>, cloud_storage <dbl>,# Google_Cloud <dbl>, AWS <dbl>, Machine_Learning <dbl>,# `Deep Learning` <dbl>, Computer_vision <dbl>, Java <dbl>, `C++` <dbl>, …About
What skills and qualifications are required for a data scientist?
Topics
Resources
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Releases
No releases published
Packages0
No packages published
Contributors2
Uh oh!
There was an error while loading.Please reload this page.