survivoR

75 seasons. 1417 people. 1 package!

survivoR is a collection of data sets detailing events across 75 seasons of Survivor US, Australia, South Africa, New Zealand and UK. It includes castaway information, vote history, immunity and reward challenge winners, jury votes, advantage details and a lot more.

For analysis and updates you can follow me on Bluesky @danoehm.bsky.social.

For those who aren’t R users, you can get the data on Google Sheets as well, or download it as an xlsx.

You can also access the data in JSON format to feed directly into applications.

Installation

Install from CRAN (v2.3.7) or GitHub (v2.3.8).

If the GitHub version is newer than the CRAN version, I’d suggest installing from GitHub. We are constantly improving the data sets, so the GitHub version is likely to be slightly improved.

install.packages("survivoR")

devtools::install_github("doehm/survivoR")

Next release

The next release is planned for the 9th Oct for CRAN. There are a few key data updates, so I definitely recommend installing from GitHub until then.

News: survivoR 2.3.7

Survivor Australia vs. The World added

Survivor Stats Db

Survivor Stats Db is the survivoR package’s companion. It holds interactive tables and charts detailing the castaways, challenges, vote history, confessionals, ratings, and more.

Confessional timing

Included in the package is a confessional timing app to record the length of confessionals while watching the episode.

To launch the app, first install the package and then run:

library(survivoR)
launch_confessional_app()

To try it out online 👉 Confessional timing app

More info here.

Dataset overview

There are 21 data sets included in the package:

advantage_movement
advantage_details
boot_mapping
castaway_details
castaway_scores
castaways
challenge_results
challenge_description
challenge_summary
confessionals
jury_votes
season_summary
tribe_colours
tribe_mapping
episodes
vote_history
survivor_auction
auction_details
screen_time
season_palettes
journeys

See the sections below for more details on the key data sets.

Season summary

Season summary

A table containing summary details of each season of Survivor, including the winner, runners-up and location.

season_summary
#> # A tibble: 75 × 26
#>    version version_season season_name season location country tribe_setup n_cast
#>    <chr>   <chr>          <chr>        <dbl> <chr>    <chr>   <chr>        <int>
#>  1 US      US50           Survivor: …     50 <NA>     <NA>     <NA>           24
#>  2 US      US49           Survivor: …     49 <NA>     <NA>     <NA>           18
#>  3 US      US48           Survivor: …     48 Mamanuc… Fiji    "Three tri…     18
#>  4 US      US47           Survivor: …     47 Mamanuc… Fiji    "Three tri…     18
#>  5 US      US46           Survivor: …     46 Mamanuc… Fiji    "Three tri…     18
#>  6 US      US45           Survivor: …     45 Mamanuc… Fiji    "Three tri…     18
#>  7 US      US44           Survivor: …     44 Mamanuc… Fiji    "Three tri…     18
#>  8 US      US43           Survivor: …     43 Mamanuc… Fiji    "Three tri…     18
#>  9 US      US42           Survivor: …     42 Mamanuc… Fiji    "Three tri…     18
#> 10 US      US41           Survivor: …     41 Mamanuc… Fiji    "Three tri…     18
#> 11 US      US40           Survivor: …     40 Mamanuc… Fiji    "Two tribe…     20
#> 12 US      US39           Survivor: …     39 Mamanuc… Fiji    "Two tribe…     20
#> 13 US      US38           Survivor: …     38 Mamanuc… Fiji    "Two tribe…     18
#> 14 US      US37           Survivor: …     37 Mamanuc… Fiji    "Two tribe…     20
#> 15 US      US36           Survivor: …     36 Mamanuc… Fiji    "Two tribe…     20
#> 16 US      US35           Survivor: …     35 Mamanuc… Fiji    "Three tri…     18
#> 17 US      US34           Survivor: …     34 Mamanuc… Fiji    "Two tribe…     20
#> 18 US      US33           Survivor: …     33 Mamanuc… Fiji    "Two tribe…     20
#> 19 US      US32           Survivor: …     32 Koh Ron… Cambod… "Three tri…     18
#> 20 US      US31           Survivor: …     31 Koh Ron… Cambod… "Two tribe…     20
#> 21 US      US30           Survivor: …     30 San Jua… Nicara… "Three tri…     18
#> 22 US      US29           Survivor: …     29 San Jua… Nicara… "Nine pair…     18
#> 23 US      US28           Survivor: …     28 Palaui … Philip… "Three tri…     18
#> 24 US      US27           Survivor: …     27 Palaui … Philip… "Two tribe…     20
#> 25 US      US26           Survivor: …     26 Caramoa… Philip… "Two tribe…     20
#> 26 US      US25           Survivor: …     25 Caramoa… Philip… "Three tri…     18
#> 27 US      US24           Survivor: …     24 San Jua… Nicara… "Two tribe…     18
#> 28 US      US23           Survivor: …     23 San Jua… Nicara… "Upolu, Sa…     18
#> 29 US      US22           Survivor: …     22 San Jua… Nicara… "Two tribe…     18
#> 30 US      US21           Survivor: …     21 San Jua… Nicara… "Two tribe…     20
#> # ℹ 45 more rows
#> # ℹ 18 more variables: n_tribes <int>, n_finalists <int>, n_jury <int>,
#> #   full_name <chr>, winner_id <chr>, winner <chr>, runner_ups <chr>,
#> #   final_vote <chr>, timeslot <chr>, premiered <date>, ended <date>,
#> #   filming_started <date>, filming_ended <date>, viewers_reunion <dbl>,
#> #   viewers_premiere <dbl>, viewers_finale <dbl>, viewers_mean <dbl>,
#> #   rank <dbl>

Castaways

Castaways

This data set contains season and demographic information about each castaway. It is structured to view their results for each season. Castaways that have played in multiple seasons will feature more than once, with the age and location representing that point in time. Castaways that re-entered the game will feature more than once in the same season, as they technically have more than one boot order, e.g. Natalie Anderson in Winners at War.

Each castaway has a unique castaway_id which links the individual across all data sets and seasons. It also links to the following IDs found on the vote_history, jury_votes and challenges data sets.

vote_id
voted_out_id
finalist_id
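
The ID columns listed above can be used as join keys back to the castaway tables. A minimal sketch, assuming vote_id holds the castaway_id of the person a vote was cast against and that dplyr is installed; voted_for_full_name is a column name introduced here purely for illustration:

```r
library(survivoR)
library(dplyr)

# Attach the full name of the person each vote was cast against.
# Assumption: `vote_id` in vote_history is the castaway_id of the vote's target.
votes_named <- vote_history |>
  filter(version_season == "US45") |>
  left_join(
    castaway_details |>
      select(castaway_id, voted_for_full_name = full_name),
    by = c("vote_id" = "castaway_id")
  ) |>
  select(episode, castaway, vote, voted_for_full_name)

votes_named
```

The same pattern should apply to voted_out_id and finalist_id, since each is described as linking to castaway_id.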

castaways |>
  filter(season == 45)
#> # A tibble: 18 × 26
#>    version version_season season full_name      castaway_id castaway   age city
#>    <chr>   <chr>           <dbl> <chr>          <chr>       <chr>    <dbl> <chr>
#>  1 US      US45               45 Hannah Rose    US0669      Hannah      33 Balt…
#>  2 US      US45               45 Brandon Donlon US0665      Brandon     25 Sick…
#>  3 US      US45               45 Sabiyah Brode… US0677      Sabiyah     27 Jack…
#>  4 US      US45               45 Sean Edwards   US0678      Sean        34 Prov…
#>  5 US      US45               45 Brando Meyer   US0664      Brando      23 Seat…
#>  6 US      US45               45 J. Maya        US0670      J. Maya     24 Los …
#>  7 US      US45               45 Sifu Alsup     US0679      Sifu        30 O'Fa…
#>  8 US      US45               45 Kaleb Gebrewo… US0673      Kaleb       29 Vanc…
#>  9 US      US45               45 Kellie Nalban… US0675      Kellie      30 New …
#> 10 US      US45               45 Kendra McQuar… US0676      Kendra      30 Stea…
#> 11 US      US45               45 Bruce Perreau… US0657      Bruce       46 Warw…
#> 12 US      US45               45 Emily Flippen  US0668      Emily       28 Laur…
#> 13 US      US45               45 Drew Basile    US0667      Drew        23 Phil…
#> 14 US      US45               45 Julie Alley    US0672      Julie       49 Bren…
#> 15 US      US45               45 Katurah Topps  US0674      Katurah     34 Broo…
#> 16 US      US45               45 Jake O'Kane    US0671      Jake        26 Bost…
#> 17 US      US45               45 Austin Li Coon US0663      Austin      26 Chic…
#> 18 US      US45               45 Dee Valladares US0666      Dee         26 Miami
#> # ℹ 18 more variables: state <chr>, episode <dbl>, day <dbl>, order <dbl>,
#> #   result <chr>, jury_status <chr>, place <dbl>, original_tribe <chr>,
#> #   jury <lgl>, finalist <lgl>, winner <lgl>, acknowledge <lgl>,
#> #   ack_look <lgl>, ack_speak <lgl>, ack_gesture <lgl>, ack_smile <lgl>,
#> #   ack_quote <chr>, ack_score <dbl>

Castaway details

A few castaways have changed their name from season to season or have been referred to by a different name during the season, e.g. Amber Mariano; in season 8, Survivor All-Stars, there was Rob C and Rob M. That information has been retained here in the castaways data set.

castaway_details contains unique information for each castaway. It takes the full name from their most current season and their most verbose short name, which is handy for labelling.

It also includes gender, date of birth, occupation, race, ethnicity and other data. If no source was found to determine a castaway’s race and ethnicity, the data is kept as missing rather than making an assumption.

african_american, asian_american, latin_american, native_american, race, ethnicity, and bipoc data is complete only for the US. bipoc is TRUE when any of the *_american fields are TRUE. These fields have been recorded as per the Survivor wiki (https://survivor.fandom.com/wiki/Main_Page). Other versions have been left blank as the data is not complete and the term ‘people of colour’ is typically only used in the US.

I have deprecated the old field poc in order to be more inclusive and to make using the race/ethnicity fields simpler.

I have included a collar field, which is experimental and derived from a language model. I suggest caution with its use, as many occupations may not fit neatly into a classification.

castaway_details
#> # A tibble: 1,180 × 22
#>    castaway_id full_name     full_name_detailed castaway last_name date_of_birth
#>    <chr>       <chr>         <chr>              <chr>    <chr>     <date>
#>  1 US0001      Sonja Christ… Sonja Christopher  Sonja    Christop… 1937-01-28
#>  2 US0002      B.B. Andersen B.B. Andersen      B.B.     Andersen  1936-01-18
#>  3 US0003      Stacey Still… Stacey Stillman    Stacey   Stillman  1972-08-11
#>  4 US0004      Ramona Gray   Ramona Gray        Ramona   Gray      1971-01-20
#>  5 US0005      Dirk Been     Dirk Been          Dirk     Been      1976-06-15
#>  6 US0006      Joel Klug     Joel Klug          Joel     Klug      1972-04-13
#>  7 US0007      Gretchen Cor… Gretchen Cordy     Gretchen Cordy     1962-02-07
#>  8 US0008      Greg Buis     Greg Buis          Greg     Buis      1975-12-31
#>  9 US0009      Jenna Lewis   Jenna Lewis        Jenna L. Lewis     1977-07-16
#> 10 US0010      Gervase Pete… Gervase Peterson   Gervase  Peterson  1969-11-02
#> 11 US0011      Colleen Hask… Colleen Haskell    Colleen  Haskell   1976-12-06
#> 12 US0012      Sean Kenniff  Sean Kenniff       Sean     Kenniff   1969-11-27
#> 13 US0013      Susan Hawk    Susan Hawk         Sue      Hawk      1961-08-17
#> 14 US0014      Rudy Boesch   Rudy Boesch        Rudy     Boesch    1928-01-20
#> 15 US0015      Kelly Wigles… Kelly Wiglesworth  Kelly    Wigleswo… 1977-06-24
#> 16 US0016      Richard Hatch Richard Hatch      Richard  Hatch     1961-04-08
#> 17 US0017      Debb Eaton    Debb Eaton         Debb     Eaton     1955-06-11
#> 18 US0018      Kel Gleason   Kel Gleason        Kel      Gleason   1968-01-05
#> 19 US0019      Maralyn Hers… Maralyn Hershey    Maralyn  Hershey   1949-01-24
#> 20 US0020      Mitchell Ols… Mitchell Olson     Mitchell Olson     1977-03-17
#> 21 US0021      Kimmi Kappen… Kimmi Kappenberg   Kimmi    Kappenbe… 1972-11-11
#> 22 US0022      Michael Skup… Michael Skupin     Michael  Skupin    1962-01-29
#> 23 US0023      Jeff Varner   Jeff Varner        Jeff     Varner    1966-04-16
#> 24 US0024      Alicia Calaw… Alicia Calaway     Alicia   Calaway   1968-05-01
#> 25 US0025      Jerri Manthey Jerri Manthey      Jerri    Manthey   1970-09-05
#> 26 US0026      Nick Brown    Nick Brown         Nick     Brown     1977-04-02
#> 27 US0027      Amber Mariano Amber Mariano      Amber    Mariano   1978-08-11
#> 28 US0028      Rodger Bingh… Rodger Bingham     Rodger   Bingham   1947-07-05
#> 29 US0029      Elisabeth Fi… Elisabeth Filarski Elisabe… Filarski  1977-05-28
#> 30 US0030      Keith Famie   Keith Famie        Keith    Famie     1960-02-11
#> # ℹ 1,150 more rows
#> # ℹ 16 more variables: date_of_death <date>, gender <chr>, african <lgl>,
#> #   asian <lgl>, latin_american <lgl>, native_american <lgl>, bipoc <lgl>,
#> #   lgbt <lgl>, personality_type <chr>, occupation <chr>, collar <chr>,
#> #   three_words <chr>, hobbies <chr>, pet_peeves <chr>, race <chr>,
#> #   ethnicity <chr>

Castaway scores

I have created a measure for challenge success, vote history or tribal council success, and advantage success. For more details, please follow the links:

castaway_scores
#> # A tibble: 1,129 × 55
#>    version version_season season castaway castaway_id score_overall score_outwit
#>    <fct>   <chr>           <dbl> <chr>    <chr>               <dbl>        <dbl>
#>  1 US      US01                1 Sonja    US0001             0.0266  0.000000975
#>  2 US      US01                1 B.B.     US0002             0.0612  0.0120
#>  3 US      US01                1 Stacey   US0003             0.124   0.137
#>  4 US      US01                1 Ramona   US0004             0.233   0.355
#>  5 US      US01                1 Dirk     US0005             0.269   0.391
#>  6 US      US01                1 Joel     US0006             0.348   0.515
#>  7 US      US01                1 Gretchen US0007             0.555   0.688
#>  8 US      US01                1 Greg     US0008             0.556   0.423
#>  9 US      US01                1 Jenna    US0009             0.521   0.561
#> 10 US      US01                1 Gervase  US0010             0.590   0.454
#> 11 US      US01                1 Colleen  US0011             0.612   0.516
#> 12 US      US01                1 Sean     US0012             0.554   0.529
#> 13 US      US01                1 Sue      US0013             0.574   0.653
#> 14 US      US01                1 Rudy     US0014             0.559   0.503
#> 15 US      US01                1 Kelly    US0015             0.852   0.748
#> 16 US      US01                1 Richard  US0016             0.662   0.706
#> 17 US      US02                2 Debb     US0017             0.0266  0.00000527
#> 18 US      US02                2 Kel      US0018             0.0577  0.00331
#> 19 US      US02                2 Maralyn  US0019             0.205   0.318
#> 20 US      US02                2 Mitchell US0020             0.271   0.450
#> 21 US      US02                2 Kimmi    US0021             0.297   0.442
#> 22 US      US02                2 Michael  US0022             0.432   0.714
#> 23 US      US02                2 Jeff     US0023             0.516   0.582
#> 24 US      US02                2 Alicia   US0024             0.507   0.536
#> 25 US      US02                2 Jerri    US0025             0.584   0.597
#> 26 US      US02                2 Nick     US0026             0.529   0.382
#> 27 US      US02                2 Amber    US0027             0.475   0.416
#> 28 US      US02                2 Rodger   US0028             0.491   0.405
#> 29 US      US02                2 Elisabe… US0029             0.546   0.537
#> 30 US      US02                2 Keith    US0030             0.624   0.526
#> # ℹ 1,099 more rows
#> # ℹ 48 more variables: score_outplay <dbl>, score_outlast <dbl>,
#> #   score_result <dbl>, score_jury <dbl>, score_vote <dbl>, score_adv <dbl>,
#> #   score_inf <dbl>, r_score_chal_all <dbl>, r_score_chal_immunity <dbl>,
#> #   r_score_chal_reward <dbl>, r_score_chal_tribal <dbl>,
#> #   r_score_chal_tribal_immunity <dbl>, r_score_chal_tribal_reward <dbl>,
#> #   r_score_chal_individual <dbl>, r_score_chal_individual_immunity <dbl>, …

Vote history

Vote history

This data frame contains a complete history of votes cast across all seasons of Survivor. This allows you to see who voted for whom at which Tribal Council. It also includes details on who had individual immunity as well as who had their votes nullified by a hidden immunity idol. This details the key events for the season.

There is some information on split votes to help calculate if a player engaged in a split vote but ultimately hit their target. There are events which influence the vote, e.g. extra votes, safety without power, etc. These are recorded here as well.

vh <- vote_history |>
  filter(
    season == 45,
    episode == 9
  )
vh
#> # A tibble: 9 × 23
#>   version version_season season episode   day tribe_status tribe    castaway
#>   <chr>   <chr>           <dbl>   <dbl> <dbl> <chr>        <chr>    <chr>
#> 1 US      US45               45       9    17 Merged       Dakuwaqa Bruce
#> 2 US      US45               45       9    17 Merged       Dakuwaqa Jake
#> 3 US      US45               45       9    17 Merged       Dakuwaqa Katurah
#> 4 US      US45               45       9    17 Merged       Dakuwaqa Dee
#> 5 US      US45               45       9    17 Merged       Dakuwaqa Julie
#> 6 US      US45               45       9    17 Merged       Dakuwaqa Kendra
#> 7 US      US45               45       9    17 Merged       Dakuwaqa Emily
#> 8 US      US45               45       9    17 Merged       Dakuwaqa Austin
#> 9 US      US45               45       9    17 Merged       Dakuwaqa Drew
#> # ℹ 15 more variables: immunity <chr>, vote <chr>, vote_event <chr>,
#> #   vote_event_outcome <chr>, split_vote <chr>, nullified <lgl>, tie <lgl>,
#> #   voted_out <chr>, order <dbl>, vote_order <dbl>, castaway_id <chr>,
#> #   vote_id <chr>, voted_out_id <chr>, sog_id <dbl>, challenge_id <dbl>

vh |>
  count(vote)
#> # A tibble: 3 × 2
#>   vote       n
#>   <chr>  <int>
#> 1 Jake       1
#> 2 Kendra     6
#> 3 <NA>       2
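
The nullified flag makes it possible to see how often idols cancelled votes. A rough sketch, assuming dplyr is installed and that nullified is TRUE for each vote cancelled by an idol; n_votes and n_nullified are names introduced here for illustration:

```r
library(survivoR)
library(dplyr)

# Count cast votes and nullified votes per US season.
# Rows with a missing `vote` (no vote cast) are excluded from n_votes.
nullified_by_season <- vote_history |>
  filter(version == "US") |>
  group_by(version_season) |>
  summarise(
    n_votes = sum(!is.na(vote)),
    n_nullified = sum(nullified, na.rm = TRUE),
    .groups = "drop"
  ) |>
  arrange(desc(n_nullified))

nullified_by_season
```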

Challenges

Challenge results

Note: From v1.1 the challenge_results dataset has been improved but could break existing code. The old table is maintained at challenge_results_dep.

There are 3 tables: challenge_results, challenge_description, and challenge_summary.

Challenge results

A tidy data frame of immunity and reward challenge results. The winners and losers of the challenges are recorded here.

challenge_results |>
  filter(season == 45) |>
  group_by(castaway) |>
  summarise(
    won = sum(result == "Won"),
    lost = sum(result == "Lost"),
    total_challenges = n(),
    chosen_for_reward = sum(chosen_for_reward)
  )
#> # A tibble: 18 × 5
#>    castaway   won  lost total_challenges chosen_for_reward
#>    <chr>    <int> <int>            <int>             <int>
#>  1 Austin      10     7               18                 1
#>  2 Brando       4     3                7                 0
#>  3 Brandon      0     3                3                 0
#>  4 Bruce        8     5               13                 0
#>  5 Dee          9     9               18                 2
#>  6 Drew         8     8               16                 0
#>  7 Emily        3    11               14                 0
#>  8 Hannah       0     2                2                 0
#>  9 J. Maya      6     2                8                 0
#> 10 Jake         5    12               18                 2
#> 11 Julie        7     8               17                 1
#> 12 Kaleb        3     5                9                 0
#> 13 Katurah      6    11               18                 2
#> 14 Kellie       5     4               10                 0
#> 15 Kendra       5     5               11                 0
#> 16 Sabiyah      1     4                5                 0
#> 17 Sean         1     5                6                 0
#> 18 Sifu         7     2                9                 0

The challenge_id is the primary key for the challenge_description data set. The challenge_id will change as the data or descriptions change.

Challenge description

Note: This data frame is going through a massive revamp. Stay tuned.

This data set contains the name, description, and descriptive features for each challenge where it is known. Challenges can go by different names, so I have included the unique name and the recurring challenge name. These are taken directly from the Survivor Wiki. Sometimes there can be variations made on the challenge but it goes by the same name, or the challenge is integrated with a longer obstacle. In these cases the challenge may share the same recurring challenge name but have a different challenge name. Even if they share the same names the description could be different.

The features of each challenge have been determined largely through string searches of key words that describe the challenge. They may not be 100% accurate due to the different and inconsistent descriptions, but for the most part they will provide a good basis for analysis.

If any descriptive features need altering, please let me know in the issues.

challenge_description
#> # A tibble: 1,876 × 45
#>    version version_season season episode challenge_id challenge_number
#>    <fct>   <chr>           <dbl>   <dbl>        <dbl>            <dbl>
#>  1 US      US01                1       1            1                1
#>  2 US      US01                1       2            2                1
#>  3 US      US01                1       2            3                2
#>  4 US      US01                1       3            4                1
#>  5 US      US01                1       3            5                2
#>  6 US      US01                1       4            6                1
#>  7 US      US01                1       4            7                2
#>  8 US      US01                1       5            8                1
#>  9 US      US01                1       5            9                2
#> 10 US      US01                1       6           10                1
#> 11 US      US01                1       6           11                2
#> 12 US      US01                1       7           12                1
#> 13 US      US01                1       8           13                1
#> 14 US      US01                1       8           14                2
#> 15 US      US01                1       9           15                1
#> 16 US      US01                1       9           16                2
#> 17 US      US01                1      10           17                1
#> 18 US      US01                1      10           18                2
#> 19 US      US01                1      11           19                1
#> 20 US      US01                1      11           20                2
#> 21 US      US01                1      11           21                3
#> 22 US      US01                1      12           22                1
#> 23 US      US01                1      12           23                2
#> 24 US      US01                1      13           24                1
#> 25 US      US01                1      13           25                2
#> 26 US      US02                2       1            1                1
#> 27 US      US02                2       2            2                1
#> 28 US      US02                2       2            3                2
#> 29 US      US02                2       3            4                1
#> 30 US      US02                2       3            5                2
#> # ℹ 1,846 more rows
#> # ℹ 39 more variables: challenge_type <chr>, name <chr>, recurring_name <chr>,
#> #   description <chr>, reward <chr>, additional_stipulation <chr>,
#> #   balance <lgl>, balance_ball <lgl>, balance_beam <lgl>, endurance <lgl>,
#> #   fire <lgl>, food <lgl>, knowledge <lgl>, memory <lgl>, mud <lgl>,
#> #   obstacle_blindfolded <lgl>, obstacle_cargo_net <lgl>,
#> #   obstacle_chopping <lgl>, obstacle_combination_lock <lgl>, …

challenge_description |>
  summarise_if(is_logical, ~sum(.x, na.rm = TRUE)) |>
  glimpse()
#> Rows: 1
#> Columns: 33
#> $ balance                   <int> 361
#> $ balance_ball              <int> 46
#> $ balance_beam              <int> 156
#> $ endurance                 <int> 455
#> $ fire                      <int> 68
#> $ food                      <int> 24
#> $ knowledge                 <int> 77
#> $ memory                    <int> 29
#> $ mud                       <int> 49
#> $ obstacle_blindfolded      <int> 52
#> $ obstacle_cargo_net        <int> 150
#> $ obstacle_chopping         <int> 32
#> $ obstacle_combination_lock <int> 22
#> $ obstacle_digging          <int> 96
#> $ obstacle_knots            <int> 40
#> $ obstacle_padlocks         <int> 74
#> $ precision                 <int> 304
#> $ precision_catch           <int> 65
#> $ precision_roll_ball       <int> 13
#> $ precision_slingshot       <int> 54
#> $ precision_throw_balls     <int> 79
#> $ precision_throw_coconuts  <int> 23
#> $ precision_throw_rings     <int> 20
#> $ precision_throw_sandbags  <int> 65
#> $ puzzle                    <int> 409
#> $ puzzle_slide              <int> 17
#> $ puzzle_word               <int> 29
#> $ race                      <int> 1338
#> $ strength                  <int> 131
#> $ turn_based                <int> 237
#> $ water                     <int> 358
#> $ water_paddling            <int> 149
#> $ water_swim                <int> 263

See the help manual for more detailed descriptions of the features.
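
The feature flags become more useful once joined to the results. A hedged sketch, assuming challenge_results carries version_season and challenge_id columns of the same type as those in challenge_description, and that dplyr is installed; puzzle_challenges and puzzle_wins are illustrative names:

```r
library(survivoR)
library(dplyr)

# Puzzle challenge record for each castaway in US45.
# Assumption: version_season + challenge_id identifies a challenge in both tables.
puzzle_record <- challenge_results |>
  filter(version_season == "US45") |>
  left_join(
    challenge_description |>
      select(version_season, challenge_id, puzzle),
    by = c("version_season", "challenge_id")
  ) |>
  filter(puzzle) |>  # drops FALSE and missing flags
  group_by(castaway) |>
  summarise(
    puzzle_challenges = n(),
    puzzle_wins = sum(result == "Won", na.rm = TRUE),
    .groups = "drop"
  ) |>
  arrange(desc(puzzle_wins))

puzzle_record
```

Any other logical feature column, such as endurance or water_swim, could be swapped in for puzzle.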

Challenge Summary

The challenge_summary table solves an annoying problem with challenge_results and the way some challenges are constructed. You may want to count how many individual challenges someone has won, or tribal immunities, etc. To do so you’ll have to use the challenge_type, outcome_type, and results fields. Some challenges are combined, e.g. Team / Individual challenges, which means summarising the table is not straightforward.

Hence why challenge_summary exists. The category column consists of the following categories:

All: All challenge types
Reward
Immunity
Tribal
Tribal Reward
Tribal Immunity
Individual
Individual Reward
Individual Immunity
Team
Team Reward
Team Immunity
Duel

There is obviously overlap between the categories, but this structure makes it simple to summarise the table however you like, e.g.

challenge_summary |>
  group_by(category, version_season, castaway) |>
  summarise(
    n_challenges = n(),
    n_won = sum(won)
  )
#> `summarise()` has grouped output by 'category', 'version_season'. You can
#> override using the `.groups` argument.
#> # A tibble: 11,677 × 5
#> # Groups:   category, version_season [761]
#>    category version_season castaway      n_challenges n_won
#>    <chr>    <chr>          <chr>                <int> <dbl>
#>  1 All      AU01           Andrew                  17     7
#>  2 All      AU01           Barry                    9     5
#>  3 All      AU01           Bianca                   3     2
#>  4 All      AU01           Brooke                  29    20
#>  5 All      AU01           Conner                  22     8
#>  6 All      AU01           Craig                   18     7
#>  7 All      AU01           Des                      2     0
#>  8 All      AU01           El                      35    16
#>  9 All      AU01           Evan                     5     1
#> 10 All      AU01           Flick                   34    18
#> 11 All      AU01           Jennah-Louise           27    18
#> 12 All      AU01           Kat                     15     5
#> 13 All      AU01           Kate                    23     7
#> 14 All      AU01           Kristie                 35     6
#> 15 All      AU01           Kylie                   25    19
#> 16 All      AU01           Lee                     35    17
#> 17 All      AU01           Matt                    33    18
#> 18 All      AU01           Nick                    24    17
#> 19 All      AU01           Peter                    6     5
#> 20 All      AU01           Phoebe                  21     5
#> 21 All      AU01           Rohan                   14     5
#> 22 All      AU01           Sam                     32    18
#> 23 All      AU01           Sue                     26     7
#> 24 All      AU01           Tegan                   11     7
#> 25 All      AU02           AK                      21    12
#> 26 All      AU02           Adam                     5     3
#> 27 All      AU02           Aimee                   10     5
#> 28 All      AU02           Anneliese               28    13
#> 29 All      AU02           Ben                     22    11
#> 30 All      AU02           Henry                   29    15
#> # ℹ 11,647 more rows
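
Because each category is stored as its own set of rows, a single filter is enough to answer a question like "who has won the most individual immunities in a season?". A small sketch, assuming dplyr is installed and using the "Individual Immunity" category value from the list above; n_won is an illustrative name:

```r
library(survivoR)
library(dplyr)

# Individual immunity wins per castaway per season, highest first.
individual_immunity_wins <- challenge_summary |>
  filter(category == "Individual Immunity") |>
  group_by(version_season, castaway) |>
  summarise(n_won = sum(won, na.rm = TRUE), .groups = "drop") |>
  arrange(desc(n_won))

individual_immunity_wins
```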

How to add the challenge scores to the challenge summary:

# Requires dplyr, tidyr and stringr.
challenge_summary |>
  group_by(category, version_season, castaway_id, castaway) |>
  summarise(
    n_challenges = n_distinct(challenge_id),
    n_won = sum(won),
    .groups = "drop"
  ) |>
  left_join(
    castaway_scores |>
      # The challenge score columns printed for castaway_scores are prefixed
      # with r_score_chal_, so select on that prefix.
      select(version_season, castaway_id, starts_with("r_score_chal")) |>
      pivot_longer(
        c(-version_season, -castaway_id),
        names_to = "category",
        values_to = "score"
      ) |>
      # Turn e.g. r_score_chal_tribal_immunity into "Tribal Immunity"
      mutate(
        category = str_remove(category, "r_score_chal_"),
        category = str_replace_all(category, "_", " "),
        category = str_to_title(category)
      ) |>
      select(category, version_season, castaway_id, score),
    join_by(category, version_season, castaway_id)
  )

See the R docs for more details on the fields. Join to challenge_results with version_season and challenge_id.

Jury votes

Jury votes

History of jury votes. It is more verbose than it needs to be; however, having a 0-1 column indicating if a vote was placed or not makes it easier to summarise castaways that received no votes.

jury_votes |>
  filter(season == 45)
#> # A tibble: 24 × 8
#>    version version_season season castaway finalist  vote castaway_id finalist_id
#>    <chr>   <chr>           <dbl> <chr>    <chr>    <dbl> <chr>       <chr>
#>  1 US      US45               45 Bruce    Austin       1 US0657      US0663
#>  2 US      US45               45 Drew     Austin       1 US0667      US0663
#>  3 US      US45               45 Emily    Austin       0 US0668      US0663
#>  4 US      US45               45 Julie    Austin       0 US0672      US0663
#>  5 US      US45               45 Kaleb    Austin       0 US0673      US0663
#>  6 US      US45               45 Katurah  Austin       0 US0674      US0663
#>  7 US      US45               45 Kellie   Austin       0 US0675      US0663
#>  8 US      US45               45 Kendra   Austin       1 US0676      US0663
#>  9 US      US45               45 Bruce    Dee          0 US0657      US0666
#> 10 US      US45               45 Drew     Dee          0 US0667      US0666
#> 11 US      US45               45 Emily    Dee          1 US0668      US0666
#> 12 US      US45               45 Julie    Dee          1 US0672      US0666
#> 13 US      US45               45 Kaleb    Dee          1 US0673      US0666
#> 14 US      US45               45 Katurah  Dee          1 US0674      US0666
#> 15 US      US45               45 Kellie   Dee          1 US0675      US0666
#> 16 US      US45               45 Kendra   Dee          0 US0676      US0666
#> 17 US      US45               45 Bruce    Jake         0 US0657      US0671
#> 18 US      US45               45 Drew     Jake         0 US0667      US0671
#> 19 US      US45               45 Emily    Jake         0 US0668      US0671
#> 20 US      US45               45 Julie    Jake         0 US0672      US0671
#> 21 US      US45               45 Kaleb    Jake         0 US0673      US0671
#> 22 US      US45               45 Katurah  Jake         0 US0674      US0671
#> 23 US      US45               45 Kellie   Jake         0 US0675      US0671
#> 24 US      US45               45 Kendra   Jake         0 US0676      US0671

jury_votes |>
  filter(season == 45) |>
  group_by(finalist) |>
  summarise(votes = sum(vote))
#> # A tibble: 3 × 2
#>   finalist votes
#>   <chr>    <dbl>
#> 1 Austin       3
#> 2 Dee          5
#> 3 Jake         0

Advantages

Advantage Details

This dataset lists the hidden idols and advantages in the game for all seasons. It details where it was found, if there was a clue to the advantage, location and other advantage conditions. This maps to the advantage_movement table.

advantage_details |>
  filter(season == 45)
#> # A tibble: 10 × 8
#>    version version_season season advantage_id advantage_type       clue_details
#>    <chr>   <chr>           <dbl>        <dbl> <chr>                <chr>
#>  1 US      US45               45            1 Hidden Immunity Idol No clue
#>  2 US      US45               45            2 Hidden Immunity Idol No clue
#>  3 US      US45               45            3 Safety without Power No clue
#>  4 US      US45               45            4 Goodwill Advantage   No clue
#>  5 US      US45               45            5 Amulet               No clue
#>  6 US      US45               45            6 Amulet               No clue
#>  7 US      US45               45            7 Amulet               No clue
#>  8 US      US45               45            8 Hidden Immunity Idol No clue
#>  9 US      US45               45            9 Hidden Immunity Idol Found around…
#> 10 US      US45               45           10 Challenge Advantage  No clue
#> # ℹ 2 more variables: location_found <chr>, conditions <chr>

Advantage Movement

The advantage_movement table tracks who found the advantage, who they may have handed it to and who they played it for. Each step is called an event. The sequence_id tracks the logical order of those steps. For example, in season 41 JD found an Extra Vote advantage. JD gave it to Shan in good faith, who then voted him out and kept the Extra Vote. Shan gave it to Ricard in good faith, who eventually gave it back before Shan played it for Naseer. That movement is recorded in this table.

advantage_movement |>
  filter(advantage_id == "USEV4102")
#> # A tibble: 0 × 15
#> # ℹ 15 variables: version <chr>, version_season <chr>, season <dbl>,
#> #   castaway <chr>, castaway_id <chr>, advantage_id <dbl>, sequence_id <dbl>,
#> #   day <dbl>, episode <dbl>, event <chr>, played_for <chr>,
#> #   played_for_id <chr>, success <chr>, votes_nullified <dbl>, sog_id <dbl>
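A small sketch of how sequence_id orders the events for one advantage. The rows are hypothetical: they paraphrase the season 41 Extra Vote story above rather than copying values from the package, and the event labels and the advantage_id value are assumptions for illustration. It uses base R only so it runs without the package installed.

```r
# Hypothetical rows paraphrasing the season 41 Extra Vote story.
# Event labels and the advantage_id value are illustrative assumptions,
# not values taken from survivoR::advantage_movement.
moves <- data.frame(
  advantage_id = 2,
  sequence_id  = c(3, 1, 5, 2, 4),
  castaway     = c("Ricard", "JD", "Shan", "Shan", "Shan"),
  event        = c("Received", "Found", "Played", "Received", "Received"),
  played_for   = c(NA, NA, "Naseer", NA, NA),
  stringsAsFactors = FALSE
)

# Sorting by sequence_id recovers the order in which the advantage moved
path <- moves[order(moves$sequence_id), ]

# Who held the advantage at each step
holders <- path$castaway

print(path[, c("sequence_id", "castaway", "event", "played_for")])
```

With the real table the same idea would apply after filtering to one season and one advantage_id. The column listing above shows advantage_id as a number, so the lookup value would need to be numeric too.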

Confessionals

A dataset containing the number of confessionals for each castaway by season and episode. There are multiple contributors to this data. Where there are multiple sets of counts for a season, the average is taken and added to the package. The aim is to establish consistency in confessional counts in the absence of official sources. Given the subjective nature of the counts and the potential for clerical error, no single source is more valid than another, so it is reasonable to average across all sources.

Confessional time exists for a few seasons. This is the total cumulative time for each castaway in seconds. It is a much more accurate indicator of the ‘edit’ than the count.

confessionals |>
  filter(season == 45) |>
  group_by(castaway) |>
  summarise(
    count = sum(confessional_count),
    time = sum(confessional_time)
  )
#> # A tibble: 18 × 3
#>    castaway count  time
#>    <chr>    <dbl> <dbl>
#>  1 Austin      72  1436
#>  2 Brando      10   147
#>  3 Brandon     12   214
#>  4 Bruce       38   735
#>  5 Dee         67  1102
#>  6 Drew        64  1171
#>  7 Emily       62  1332
#>  8 Hannah       4    44
#>  9 J. Maya     11   210
#> 10 Jake        60  1290
#> 11 Julie       46   814
#> 12 Kaleb       45   692
#> 13 Katurah     66  1169
#> 14 Kellie      29   515
#> 15 Kendra      37   506
#> 16 Sabiyah     22   342
#> 17 Sean        16   325
#> 18 Sifu        11   236

The confessional index is available on this data set. The index is a standardised measure of the number of confessionals a player has received compared to the others. It is stratified by tribe, so it measures how many confessionals each player gets relative to an even share within their tribe, e.g. an index of 1.5 means that player has received 50% more than an even share within their tribe.

The tribe grouping is important since the tribe that attends tribal council typically gets more screen time, which is fair enough. I don’t think we should expect an even share across everyone in the pre-merge stage of the game.

The index is cumulative by episode, so a player’s final index is the index in their final episode.
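To make the idea concrete, here is a minimal sketch of one way such an index could be computed: running actual count divided by running even share within tribe. This is an assumption based on the description above, not the package's own code, and the counts, tribes and castaway names are made up. It uses base R only.

```r
# Toy confessional counts: two tribes, two episodes (made-up numbers)
conf <- data.frame(
  episode  = c(1, 1, 1, 1, 2, 2, 2, 2),
  tribe    = c("A", "A", "B", "B", "A", "A", "B", "B"),
  castaway = c("P1", "P2", "P3", "P4", "P1", "P2", "P3", "P4"),
  count    = c(3, 1, 2, 2, 6, 2, 1, 3),
  stringsAsFactors = FALSE
)

# Even share = the tribe's total confessionals in an episode / tribe size,
# i.e. the mean count within each episode and tribe
conf$expected <- ave(conf$count, conf$episode, conf$tribe, FUN = mean)

# Cumulative index = running actual count / running even share
conf <- conf[order(conf$castaway, conf$episode), ]
conf$index <- ave(conf$count, conf$castaway, FUN = cumsum) /
  ave(conf$expected, conf$castaway, FUN = cumsum)

# A castaway's final index is the index in their last episode
last_ep <- ave(conf$episode, conf$castaway, FUN = max)
final <- conf[conf$episode == last_ep, ]

print(final[, c("castaway", "index")])
```

In this toy data P1 ends on 1.5, meaning 50% more confessionals than an even share of their tribe's total.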

confessionals |>
  filter(season == 45) |>
  group_by(castaway) |>
  slice_max(episode) |>
  arrange(desc(index_time)) |>
  select(castaway, episode, confessional_count, confessional_time, index_count, index_time)
#> Error in `arrange()`:
#> ℹ In argument: `..1 = index_time`.
#> Caused by error:
#> ! object 'index_time' not found

Screen time

Screen time [EXPERIMENTAL]

This dataset contains the estimated screen time for each castaway during an episode. Please note that this is still in the early days of development. There is likely to be misclassification and other sources of error. The model will be refined over time.

An individual’s screen time is calculated, at a high level, via the following process:

Frames are sampled from episodes on a 1 second time interval
MTCNN detects the human faces within each frame
VGGFace2 converts each detected face into a 512d vector space
A training set of labelled images (1 for each contestant + 3 for Jeff Probst) is processed in the same way to determine where they sit in the vector space. TODO: This could be made more accurate by increasing the number of training images per contestant.
The Euclidean distance is calculated from each face detected in the frame to each of the contestants in the season (+ Jeff). If the minimum distance is greater than 1.2 the face is labelled as “unknown”. TODO: Review how robust this distance cutoff truly is - currently based on manual review of Season 42.
A multi-class SVM is trained on the training set to label faces. For any face not identified as “unknown”, the vector embedding is run through this model and a label is generated.
All labelled faces are aggregated together, with an assumption of 1-5 full seconds of screen time each time a face is seen and factoring in time between detections, capping at a max of 5 seconds.
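The distance cutoff step can be illustrated with a short sketch. It is not the model's code: the embeddings are made-up 3-d vectors standing in for the 512-d ones, "Alice" and "Bob" are hypothetical contestants, label_face is a hypothetical helper, and a nearest-neighbour lookup is swapped in for the multi-class SVM to keep the example in base R.

```r
# Made-up 3-d embeddings standing in for the 512-d face vectors.
# "Alice" and "Bob" are hypothetical contestants.
reference <- rbind(
  Jeff  = c(0.0, 0.0, 0.0),
  Alice = c(2.0, 0.0, 0.0),
  Bob   = c(0.0, 2.0, 0.0)
)

# Hypothetical helper: label one detected face against the reference set
label_face <- function(face, reference, cutoff = 1.2) {
  # Euclidean distance from the detected face to every labelled person
  d <- sqrt(rowSums(sweep(reference, 2, face)^2))
  # A face further than the cutoff from everyone is treated as unknown;
  # otherwise take the nearest person (the real pipeline uses an SVM here)
  if (min(d) > cutoff) "unknown" else names(which.min(d))
}

close_face <- label_face(c(1.9, 0.1, 0.0), reference)
far_face   <- label_face(c(5.0, 5.0, 5.0), reference)

print(c(close_face, far_face))
```

The first face sits close to Alice's reference vector, so it is labelled "Alice". The second is more than 1.2 away from every reference vector, so it is labelled "unknown".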

screen_time |>
  filter(version_season == "US45") |>
  group_by(castaway_id) |>
  summarise(total_mins = sum(screen_time) / 60) |>
  left_join(
    castaway_details |>
      select(castaway_id, castaway = short_name),
    by = "castaway_id"
  ) |>
  arrange(desc(total_mins))
#> Error in `select()`:
#> ! Can't select columns that don't exist.
#> ✖ Column `short_name` doesn't exist.

Currently it only includes data for season 42. More seasons will be added as they are completed.

Boot mapping

A mapping table detailing who is still alive at each stage of the game. It makes it easy to filter to, say, the final players.

# filter to season 45 and when there are 6 people left
# 18 people in the season, therefore 12 boots
# .final_n is the number of players still in the game
still_alive <- function(.version, .season, .final_n) {
  survivoR::boot_mapping |>
    filter(
      version == .version,
      season == .season,
      final_n == .final_n,
      game_status %in% c("In the game", "Returned")
    )
}

still_alive("US", 45, 6)
#> # A tibble: 6 × 13
#>   version version_season season episode order n_boots final_n sog_id castaway_id
#>   <chr>   <chr>           <dbl>   <dbl> <dbl>   <dbl>   <dbl>  <dbl> <chr>
#> 1 US      US45               45      12    12      12       6     13 US0671
#> 2 US      US45               45      12    12      12       6     13 US0674
#> 3 US      US45               45      12    12      12       6     13 US0666
#> 4 US      US45               45      12    12      12       6     13 US0672
#> 5 US      US45               45      12    12      12       6     13 US0663
#> 6 US      US45               45      12    12      12       6     13 US0667
#> # ℹ 4 more variables: castaway <chr>, tribe <chr>, tribe_status <chr>,
#> #   game_status <chr>

Episodes

Episodes is an episode-level table. It contains episode information such as the title, air date, length, IMDb rating and viewer numbers for every episode across all seasons.

episodes |>
  filter(season == 45)
#> # A tibble: 13 × 13
#>    version version_season season episode_number_overall episode episode_title
#>    <chr>   <chr>           <dbl>                  <dbl>   <dbl> <chr>
#>  1 US      US45               45                    610       1 We Can Do Hard …
#>  2 US      US45               45                    611       2 Brought a Bazoo…
#>  3 US      US45               45                    612       3 No Man Left Beh…
#>  4 US      US45               45                    613       4 Music to My Ears
#>  5 US      US45               45                    614       5 I Don't Want to…
#>  6 US      US45               45                    615       6 I'm Not Batman,…
#>  7 US      US45               45                    616       7 The Thorn In My…
#>  8 US      US45               45                    617       8 Following a Dea…
#>  9 US      US45               45                    618       9 Sword of Damocl…
#> 10 US      US45               45                    619      10 How Am I the Mo…
#> 11 US      US45               45                    620      11 This Game Rips …
#> 12 US      US45               45                    621      12 The Ex-Girlfrie…
#> 13 US      US45               45                    622      13 Living the Surv…
#> # ℹ 7 more variables: episode_label <chr>, episode_date <date>,
#> #   episode_length <dbl>, viewers <dbl>, imdb_rating <dbl>, n_ratings <dbl>,
#> #   episode_summary <chr>

Survivor Auction

There are 2 data sets, survivor_auction and auction_details. survivor_auction simply shows who attended the auction and auction_details holds the details of the auction, e.g. who bought what and at what price.

auction_details |>
  filter(season == 45)
#> # A tibble: 11 × 18
#>    version version_season season  item item_description        category castaway
#>    <chr>   <chr>           <dbl> <dbl> <chr>                   <chr>    <chr>
#>  1 US      US45               45     1 Salty Pretzels And Beer Food an… Kendra
#>  2 US      US45               45     2 French Fries, Ketchup,… Food an… Kellie
#>  3 US      US45               45     3 Cheese Platter, Deli M… Food an… Emily
#>  4 US      US45               45     4 Chocolate Milkshake     Food an… Dee
#>  5 US      US45               45     5 Two Giant Fish Eyes     Bad item Katurah
#>  6 US      US45               45     5 Two Giant Fish Eyes     Bad item Austin
#>  7 US      US45               45     6 Bowl Of Lollies And Ch… Food an… Drew
#>  8 US      US45               45     7 Slice Of Pepperoni Piz… Food an… Austin
#>  9 US      US45               45     8 Toothbrush And Toothpa… Comfort  Julie
#> 10 US      US45               45     9 Chocolate Cake          Food an… Jake
#> 11 US      US45               45    10 Pbandj Sandwich, Chips… Food an… Kellie
#> # ℹ 11 more variables: castaway_id <chr>, cost <dbl>, covered <lgl>,
#> #   money_remaining <dbl>, auction_num <dbl>, participated <chr>, notes <chr>,
#> #   alternative_offered <lgl>, alternative_accepted <lgl>, other_item <chr>,
#> #   other_item_category <chr>

Journeys

Details on Journeys in the New Era, including the advantage each castaway won and whether they lost their vote.

journeys |>
  filter(season == 45)
#> # A tibble: 10 × 12
#>    version season version_season episode sog_id castaway_id castaway reward
#>    <chr>    <dbl> <chr>            <dbl>  <dbl> <chr>       <chr>    <chr>
#>  1 US          45 US45                 2      2 US0657      Bruce    <NA>
#>  2 US          45 US45                 2      2 US0665      Brandon  Lost vote
#>  3 US          45 US45                 2      2 US0667      Drew     Safety Wit…
#>  4 US          45 US45                 5      5 US0663      Austin   Amulet
#>  5 US          45 US45                 5      5 US0675      Kellie   Amulet
#>  6 US          45 US45                 5      5 US0670      J. Maya  Amulet
#>  7 US          45 US45                 9     10 US0663      Austin   Regained v…
#>  8 US          45 US45                 9     10 US0668      Emily    Lost vote
#>  9 US          45 US45                 9     10 US0674      Katurah  Lost vote
#> 10 US          45 US45                11     12 US0668      Emily    <NA>
#> # ℹ 4 more variables: lost_vote <lgl>, game_played <chr>, chose_to_play <lgl>,
#> #   event <chr>

Issues

Given the variable nature of the game of Survivor and its changing rules, there are bound to be edge cases where the data is not quite right. Before logging an issue please install the Git version to see if it has already been corrected. If not, please log an issue and I will correct the datasets.

New features will be added, such as details on exiled castaways across the seasons. If you have a request for specific data let me know in the issues and I’ll see what I can do.

Showcase

Survivor Dashboard

Carly Levitz has developed a fantastic dashboard showcasing the data and allowing you to drill down into seasons, castaways, voting history and challenges.

Data viz

This looks at the number of immunity idols won and votes received for each winner.

Contributors

A big thank you to:

Package contributors and maintainers

Carly Levitz for ongoing data collection and curation

Data contributors

Sam for contributing to the confessional counts
Dario Mavec for developing the face detection model for estimating total screen time
Matt Stiles for collecting and contributing the acknowledgment features on the castaways data frame.
Camilla Bendetti for collating the personality type data for each castaway.
Uygar Sozer for adding the filming start and end dates for each season.
Holt Skinner for creating the castaway ID to map people across seasons and manage name changes.
Kosta Psaltis for the original race data.

References

Data was sourced from Wikipedia and the Survivor Wiki. Other data, such as the tribe colours, was manually recorded and entered by myself and contributors.
