Title: A Package for Processing Lexical Response Data
Version: 0.1.0
Description: Lexical response data is a package that can be used for processing cued-recall, free-recall, and sentence responses from memory experiments.
Depends: R (≥ 3.5.0)
Imports: stats, utils, knitr
Suggests: ggplot2, rmarkdown, reshape
License: LGPL-3
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.2
VignetteBuilder: knitr
URL: https://npm27.github.io/lrd/
NeedsCompilation: no
Packaged: 2021-12-06 21:49:29 UTC; nickm
Author: Nicholas Maxwell [aut, cre], Erin M. Buchanan [aut]
Maintainer: Nicholas Maxwell <nicholas.maxwell@usm.edu>
Repository: CRAN
Date/Publication: 2021-12-09 09:50:02 UTC

Answer Key Example Data

Description

Dataset that includes the answer key for free recall data. Pair with the wide_data dataset for examples.

Usage

data(answer_key_free)

Format

A data frame of answers for a free recall test

Answer_Key: a list of free recall answers


Answer Key Example Data

Description

Dataset that includes the answer key for free recall data. Pair with the free_data dataset for examples.

Usage

data(answer_key_free2)

Format

A data frame of answers for a free recall test

Answer_Key: a list of free recall answers


Arrange Data for Free Recall Scoring

Description

This function takes wide format free recall data where all responses are stored in the same cell and converts it to long format.

Usage

arrange_data(data, responses, sep, id, repeated = NULL)

Arguments

data

a dataframe of the variables you would like to return. Other variables will be included in the returned output in long format if they represent a one to one match with the participant ID. If you have repeated data, please use the repeated argument or run this function separately for each trial.

responses

a column name in the dataframe that contains the participant answers for each item, in quotes (e.g., "column")

sep

a character separating each response, in quotes (e.g., ",")

id

a column name containing participant ID numbers from the original dataframe

repeated

(optional) a single column name or set of columns that indicate repeated measures columns you would like to keep with the data. You should include all columns that are not a one to one match with the subject ID (i.e., participants saw multiple trials). Please see our vignette for an example.

Value

A dataframe of the participant answers including:

Sub.ID

The participant id number

response

The participant response

position

The position number of the response listed

other

Any additional columns included

Examples

# This dataset includes a subject number, set of answers, and
# experiment condition.
data(wide_data)

DF_long <- arrange_data(data = wide_data,
                        responses = "Response",
                        sep = ",",
                        id = "Sub.ID")

head(DF_long)
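
To make the wide-to-long conversion concrete, the following is a minimal base R sketch of the idea. It is not the package's implementation: the toy data frame toy_wide and the helper split_responses() are hypothetical names introduced only for illustration, and the real arrange_data() also handles repeated measures and extra columns.

```r
# Hypothetical toy data: one row per participant, all responses in one cell.
toy_wide <- data.frame(
  Sub.ID = c(1, 2),
  Response = c("apple,pear,plum", "kiwi,fig"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of lrd): split each cell on `sep` and
# stack the pieces into one row per response.
split_responses <- function(data, responses, sep, id) {
  pieces <- strsplit(data[[responses]], split = sep, fixed = TRUE)
  n_each <- lengths(pieces)                  # responses per participant
  data.frame(
    Sub.ID = rep(data[[id]], times = n_each),
    response = trimws(unlist(pieces)),
    position = sequence(n_each),             # 1, 2, ... within participant
    stringsAsFactors = FALSE
  )
}

toy_long <- split_responses(toy_wide, "Response", ",", "Sub.ID")
print(toy_long)
```

Each participant contributes one row per response, and position restarts at 1 for every participant.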

Conditional Response Probability

Description

This function calculates the conditional response probability of each lag position. Participants' lag between subsequent named items is tallied and then divided by the possible combination of subsequent lags given their response pattern.

Usage

crp(data, position, answer, id, key, scored)

Arguments

data

a dataframe of the scored free recall that you would like to calculate - use prop_correct_free() for best formatting.

position

a column name in the dataframe that contains the answered position of each response, in quotes (e.g., "column")

answer

a column name of the answer given for that position in the original dataframe.

id

a column name of the participant id in the original dataframe.

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe. We assume your answer key is in the tested position order. You should not include duplicates in your answer key.

scored

a column in the original dataframe indicating if the participant got the answer correct (1) or incorrect (0).

Details

This output can then be used to create CRP visualizations, and an example can be found in our manuscript/vignettes.

Important: The code is written assuming the data provided are for a single recall list. If repeated measures are used (i.e., there are multiple lists completed by each participant or multiple list versions), you should use this function several times, once on each list/answer key.

Value

DF_CRP

A dataframe of the proportion correct for each conditional lag position including any other between subjects variables present in the data.

Examples

data(free_data)
data(answer_key_free2)

free_data <- subset(free_data, List_Type == "Cat_Recall_L1")

DF_long <- arrange_data(data = free_data,
                        responses = "Response",
                        sep = " ",
                        id = "Username")

scored_output <- prop_correct_free(data = DF_long,
                                   responses = "response",
                                   key = answer_key_free2$Answer_Key,
                                   id = "Sub.ID",
                                   cutoff = 1,
                                   flag = TRUE,
                                   group.by = "Version")

crp_output <- crp(data = scored_output$DF_Scored,
                  position = "position",
                  answer = "Answer",
                  id = "Sub.ID",
                  key = answer_key_free2$Answer_Key,
                  scored = "Scored")

head(crp_output)
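
The tally described above can be sketched for a single participant as follows. This is one common way to compute a lag-CRP and rests on an assumption: a lag counts as "possible" at each transition if it leads to an item not yet recalled. The helper lag_crp() is hypothetical and is not the package's code, so crp() may handle edge cases (repeats, incorrect responses) differently.

```r
# Hypothetical helper (not part of lrd): lag-CRP for one participant.
# `recalled` holds the serial (study) positions of correctly recalled,
# non-repeated items in output order; `list_length` is the key length.
lag_crp <- function(recalled, list_length) {
  lags <- seq(-(list_length - 1), list_length - 1)
  lags <- lags[lags != 0]
  actual <- setNames(numeric(length(lags)), lags)
  possible <- actual
  for (i in seq_len(max(length(recalled) - 1, 0))) {
    # Items still available: everything not yet recalled at this point.
    available <- setdiff(seq_len(list_length), recalled[seq_len(i)])
    possible_lags <- as.character(available - recalled[i])
    possible[possible_lags] <- possible[possible_lags] + 1
    # The transition the participant actually made.
    made <- as.character(recalled[i + 1] - recalled[i])
    actual[made] <- actual[made] + 1
  }
  data.frame(
    lag = lags,
    actual = as.numeric(actual),
    possible = as.numeric(possible),
    crp = ifelse(possible > 0, as.numeric(actual / possible), NA_real_)
  )
}

# Study order 1..5; the participant recalled items 2, 3, 5, then 1.
crp_toy <- lag_crp(recalled = c(2, 3, 5, 1), list_length = 5)
print(crp_toy)
```

In this toy case a lag of +1 was possible at two transitions and taken once, so its CRP is 0.5; lags that were never possible are left as NA rather than 0.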

Conditional Response Probability for Multiple Lists

Description

This function calculates the conditional response probability of each lag position. Participants' lag between subsequent named items is tallied and then divided by the possible combination of subsequent lags given their response pattern. This function was designed to handle multiple or randomized lists across participants.

Usage

crp_multiple(data, position, answer, id, key, key.trial, id.trial, scored)

Arguments

data

a dataframe of the scored free recall that you would like to calculate - use prop_correct_free() for best formatting.

position

a column name in the dataframe that contains the answered position of each response, in quotes (e.g., "column")

answer

a column name of the answer given for that position in the original dataframe.

id

a column name of the participant id in the original dataframe.

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe. We assume your answer key is in the tested position order. You should not include duplicates in your answer key.

key.trial

a vector containing the trial numbers for each answer. Note: If you input long data (i.e., repeating trial-answer responses), we will take the unique combination of the responses. If a trial number is repeated, you will receive an error. Key and key.trial can also be a separate dataframe, depending on how your output data is formatted.

id.trial

a column name containing the trial numbers for the participant data from the original dataframe. Note that the free response "key" trial and this trial number should match. The trial key will be repeated for each answer a participant gave.

scored

a column in the original dataframe indicating if the participant got the answer correct (1) or incorrect (0).

Details

This output can then be used to create CRP visualizations, and an example can be found in our manuscript/vignettes.

Value

DF_CRP

A dataframe of the proportion correct for each conditional lag position including any other between subjects variables present in the data.

Examples

data("multi_data")
data("multi_answers")

DF_long <- arrange_data(data = multi_data,
                        responses = "Response",
                        sep = " ",
                        id = "Sub.ID",
                        repeated = "List.Number")

library(reshape)
multi_answers$position <- 1:nrow(multi_answers)
answer_long <- melt(multi_answers,
                    measured = colnames(multi_answers),
                    id = "position")
colnames(answer_long) <- c("position", "List.ID", "Answer")

answer_long$List.ID <- gsub(pattern = "List",
                            replacement = "",
                            x = answer_long$List.ID)

DF_long$response <- tolower(DF_long$response)
answer_long$Answer <- tolower(answer_long$Answer)
answer_long$Answer <- gsub(" ", "", answer_long$Answer)

scored_output <- prop_correct_multiple(data = DF_long,
                                       responses = "response",
                                       key = answer_long$Answer,
                                       key.trial = answer_long$List.ID,
                                       id = "Sub.ID",
                                       id.trial = "List.Number",
                                       cutoff = 1,
                                       flag = TRUE)

head(scored_output$DF_Scored)
head(scored_output$DF_Participant)

crp_output <- crp_multiple(data = scored_output$DF_Scored,
                           key = answer_long$Answer,
                           position = "position",
                           scored = "Scored",
                           answer = "Answer",
                           id = "Sub.ID",
                           key.trial = answer_long$List.ID,
                           id.trial = "List.Number")

head(crp_output)
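
The example reshapes the wide answer key (one column per list) into long format with reshape::melt(). If the reshape package is not installed, a similar long key can probably be built with base R's stack(). The sketch below uses a hypothetical toy key, toy_answers, rather than multi_answers, and is only meant to show the shape of the result.

```r
# Hypothetical wide answer key: one column per list, rows in study order.
toy_answers <- data.frame(
  List1 = c("cat", "dog"),
  List2 = c("emu", "fox"),
  stringsAsFactors = FALSE
)

# stack() returns the values in a column called `values` and the source
# column names in a factor called `ind`.
toy_key_long <- stack(toy_answers)
names(toy_key_long) <- c("Answer", "List.ID")

# Strip the "List" prefix so the IDs match numeric list numbers.
toy_key_long$List.ID <- gsub("List", "", as.character(toy_key_long$List.ID))

# Study position restarts at 1 within each list.
toy_key_long$position <- sequence(rep(nrow(toy_answers), ncol(toy_answers)))

print(toy_key_long)
```

The resulting Answer and List.ID columns play the same roles as key and key.trial in the example above.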

Cued Recall Data

Description

Dataset that includes cued recall data in long format. Participants were given a cue, and they were required to remember the response listed in the dataset. This dataset is in long format, which is required for most functions.

Usage

data(cued_data)

Format

A data frame of cued recall test data

id: the participant id
trial: the trial id
response: the response the participant gave to the cue
key: the answer for this trial id
condition: the between subjects group the participants were in


Cued Recall Data with Multiple Conditions

Description

Dataset that includes cued recall data in long format. Participants were given a cue, and they were required to remember the response listed in the dataset. This dataset is in long format, which is required for most functions.

Usage

data(cued_data_groupby)

Format

A data frame of cued recall test data

Subject: the participant id
Target: the answer for this trial id
Response: the response the participant gave to the cue
Condition: the between subjects group the participants were in
Condition2: the second between subjects group the participants were in


Cued Recall Data from Manuscript

Description

Dataset that includes cued recall data in long format. Participants were given a cue, and they were required to remember the response listed in the dataset. This dataset is in long format, which is required for most functions.

Usage

data(cued_data)

Format

A data frame of cued recall test data

Sub.ID: the participant id
Trial_num: the trial id
Cue: the cue shown to participants
Target: the answer for this trial id
Answer: the participant answer for this trial


Free Recall Data

Description

Dataset that includes free recall data. Participants were given a list of words to remember, and then asked to recall the words. This dataset is in wide format, which should be converted with arrange_data().

Usage

data(free_data)

Format

A data frame of free recall test data

Username: the participant id
List_Type: a repeated measures condition participants were in
Response: the response the participant gave
Version: the version of the List_Type given
Batch: the batch of participants that were run together


Cohen's Kappa

Description

This function returns Cohen's Kappa k for two raters. Kappa indicates the inter-rater reliability for categorical items. High scores (closer to one) indicate agreement between raters, while low scores (closer to zero) indicate agreement no better than chance. Negative values indicate agreement below what would be expected by chance.

Usage

kappa(rater1, rater2, confidence = 0.95)

Arguments

rater1

Rater 1 scores or categorical listings

rater2

Rater 2 scores or categorical listings

confidence

Confidence interval proportion for the kappa interval estimate. You must supply a value between 0 and 1.

Details

Note: All missing values will be ignored. This function calculates kappa for 0 and 1 scoring. If you pass categorical variables, the function will return a percent match score between these values.

Value

p_agree

Percent agreement between raters

kappa

Cohen's kappa for yes/no matching

se_kappa

Standard error for kappa wherein standard error is the square root of: (agree * (1 - agree)) / (N * (1 - random agreement)^2)

kappa_LL

Lower limit for the confidence interval of kappa

kappa_UL

Upper limit for the confidence interval of kappa

Examples

# This dataset includes two raters who wrote the word listed by
# the participant and rated if the word was correct in the recall
# experiment.
data(rater_data)

# Consider normalizing the text if raters used different styles
# Calculate percent match for categorical answers
kappa(rater_data$rater1_word, rater_data$rater2_word)
kappa(rater_data$rater1_score, rater_data$rater2_score)
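
For readers who want to check the arithmetic by hand, here is a small sketch of Cohen's kappa for 0/1 scores using the standard error formula given under Value. The helper kappa_sketch() and the toy ratings are hypothetical, and the normal-theory confidence interval (kappa plus or minus a z quantile times the standard error) is an assumption about how the limits are formed, not a statement about the package internals.

```r
# Hypothetical helper (not lrd's kappa()): Cohen's kappa for 0/1 scores.
kappa_sketch <- function(rater1, rater2, confidence = 0.95) {
  keep <- !is.na(rater1) & !is.na(rater2)    # ignore missing pairs
  r1 <- rater1[keep]
  r2 <- rater2[keep]
  n <- length(r1)
  agree <- mean(r1 == r2)                    # observed agreement (proportion)
  # Agreement expected by chance from each rater's marginal rates.
  random_agree <- mean(r1 == 1) * mean(r2 == 1) +
    mean(r1 == 0) * mean(r2 == 0)
  k <- (agree - random_agree) / (1 - random_agree)
  se <- sqrt((agree * (1 - agree)) / (n * (1 - random_agree)^2))
  z <- qnorm(1 - (1 - confidence) / 2)       # assumed normal-theory interval
  list(agree = agree, kappa = k, se = se,
       lower = k - z * se, upper = k + z * se)
}

toy_r1 <- c(1, 1, 0, 0, 1, 0, 1, 1, 0, 1)
toy_r2 <- c(1, 0, 0, 0, 1, 0, 1, 1, 1, 1)
result <- kappa_sketch(toy_r1, toy_r2)
print(result)
```

With these toy ratings the raters agree on 8 of 10 items, chance agreement is 0.52, and kappa works out to 0.28 / 0.48.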

Answer Key Example Data for Multiple Lists

Description

Dataset that includes the answer key for free recall data. Pair with the multi_data dataset for examples.

Usage

data(multi_answers)

Format

A data frame of answers for a free recall test

List1: a list of free recall answers
List2: a second list of free recall answers
etc.


Free Recall Data in Wide Format with Multiple Lists

Description

Dataset that includes free recall data for multiple lists. Participants were given a list of words to remember, and then asked to recall the words. This dataset is in wide format, which should be converted with arrange_data().

Usage

data(multi_data)

Format

A data frame of free recall test data

Sub.ID: the participant id
List.Type: the type of list a person saw
Response: the response the participant gave
List.Number: the number of the list they completed


Probability of First Recall

Description

This function calculates the probability of first recall for each serial position. The total number of times an item was recalled first is divided by the total number of first recalls (i.e., the number of participants who wrote anything down!).

Usage

pfr(data, position, answer, id, key, scored, group.by = NULL)

Arguments

data

a dataframe of the scored free recall that you would like to calculate - use prop_correct_free() for best formatting.

position

a column name in the dataframe that contains the answered position of each response, in quotes (e.g., "column")

answer

a column name of the answer given for that position in the original dataframe.

id

a column name of the participant id in the original dataframe.

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe. We assume your answer key is in the tested position order. You should not include duplicates in your answer key.

scored

a column in the original dataframe indicating if the participant got the answer correct (1) or incorrect (0).

group.by

an optional argument that can be used to group the output by condition columns. These columns should be in the original dataframe and concatenated with c() if there are multiple columns.

Details

This output can then be used to create PFR visualizations, and an example can be found in our manuscript/vignettes.

Important: The code is written assuming the data provided are for a single recall list. If repeated measures are used (i.e., there are multiple lists completed by each participant or multiple list versions), you should use this function several times, once on each list/answer key.

Value

DF_PFR

A dataframe of the probability of first response for each position including group by variables if indicated.

Examples

data(free_data)
data(answer_key_free2)

free_data <- subset(free_data, List_Type == "Cat_Recall_L1")

DF_long <- arrange_data(data = free_data,
                        responses = "Response",
                        sep = " ",
                        id = "Username")

scored_output <- prop_correct_free(data = DF_long,
                                   responses = "response",
                                   key = answer_key_free2$Answer_Key,
                                   id = "Sub.ID",
                                   cutoff = 1,
                                   flag = TRUE,
                                   group.by = "Version")

pfr_output <- pfr(data = scored_output$DF_Scored,
                  position = "position",
                  answer = "Answer",
                  id = "Sub.ID",
                  key = answer_key_free2$Answer_Key,
                  scored = "Scored",
                  group.by = "Version")

head(pfr_output)
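
The calculation in the Description can be illustrated with a few lines of base R. The data frame toy_scored and the key toy_key below are hypothetical, and the sketch ignores grouping and incorrect responses, so it should be read as an outline of the idea rather than the code behind pfr().

```r
# Hypothetical scored data: one row per response, in long format.
toy_scored <- data.frame(
  Sub.ID   = c(1, 1, 2, 2, 3, 4),
  position = c(1, 2, 1, 2, 1, 1),          # output position of the response
  Answer   = c("cat", "dog", "cat", "emu", "emu", "dog"),
  stringsAsFactors = FALSE
)
toy_key <- c("cat", "dog", "emu")           # answer key in study order

# Keep only each participant's first response.
first <- toy_scored[toy_scored$position == 1, ]

# Serial (study) position of each first-recalled item.
serial <- match(first$Answer, toy_key)

# Times each serial position was recalled first / number of first recalls.
counts <- tabulate(serial, nbins = length(toy_key))
pfr_toy <- data.frame(
  serial_position = seq_along(toy_key),
  pfr = counts / nrow(first)
)
print(pfr_toy)
```

Four participants gave a first response here; two of them started with the first studied item, so its probability of first recall is 0.5.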

Probability of First Recall for Multiple Lists

Description

This function calculates the probability of first recall for each serial position. The total number of times an item was recalled first is divided by the total number of first recalls (i.e., the number of participants who wrote anything down!).

Usage

pfr_multiple(
  data,
  position,
  answer,
  id,
  key,
  key.trial,
  id.trial,
  scored,
  group.by = NULL
)

Arguments

data

a dataframe of the scored free recall that you would like to calculate - use prop_correct_free() for best formatting.

position

a column name in the dataframe that contains the answered position of each response, in quotes (e.g., "column")

answer

a column name of the answer given for that position in the original dataframe.

id

a column name of the participant id in the original dataframe.

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe. We assume your answer key is in the tested position order. You should not include duplicates in your answer key.

key.trial

a vector containing the trial numbers for each answer. Note: If you input long data (i.e., repeating trial-answer responses), we will take the unique combination of the responses. If a trial number is repeated, you will receive an error. Key and key.trial can also be a separate dataframe, depending on how your output data is formatted.

id.trial

a column name containing the trial numbers for the participant data from the original dataframe. Note that the free response "key" trial and this trial number should match. The trial key will be repeated for each answer a participant gave.

scored

a column in the original dataframe indicating if the participant got the answer correct (1) or incorrect (0).

group.by

an optional argument that can be used to group the output by condition columns. These columns should be in the original dataframe and concatenated with c() if there are multiple columns.

Details

This output can then be used to create PFR visualizations, and an example can be found in our manuscript/vignettes.

Value

DF_PFR

A dataframe of the probability of first response for each position including group by variables if indicated.

Examples

data("multi_data")
data("multi_answers")

DF_long <- arrange_data(data = multi_data,
                        responses = "Response",
                        sep = " ",
                        id = "Sub.ID",
                        repeated = "List.Number")

library(reshape)
multi_answers$position <- 1:nrow(multi_answers)
answer_long <- melt(multi_answers,
                    measured = colnames(multi_answers),
                    id = "position")
colnames(answer_long) <- c("position", "List.ID", "Answer")

answer_long$List.ID <- gsub(pattern = "List",
                            replacement = "",
                            x = answer_long$List.ID)

DF_long$response <- tolower(DF_long$response)
answer_long$Answer <- tolower(answer_long$Answer)
answer_long$Answer <- gsub(" ", "", answer_long$Answer)

scored_output <- prop_correct_multiple(data = DF_long,
                                       responses = "response",
                                       key = answer_long$Answer,
                                       key.trial = answer_long$List.ID,
                                       id = "Sub.ID",
                                       id.trial = "List.Number",
                                       cutoff = 1,
                                       flag = TRUE)

head(scored_output$DF_Scored)
head(scored_output$DF_Participant)
head(scored_output$DF_Group)

pfr_output <- pfr_multiple(data = scored_output$DF_Scored,
                           key = answer_long$Answer,
                           position = "position",
                           scored = "Scored",
                           answer = "Answer",
                           id = "Sub.ID",
                           key.trial = answer_long$List.ID,
                           id.trial = "List.Number")

head(pfr_output)
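
Conceptually, handling multiple lists means pairing each list's responses with that list's own answer key before counting. The sketch below shows this with split() and lapply() on hypothetical toy data (toy_keys, toy_first). It illustrates the matching of key.trial to id.trial only and is not how pfr_multiple() is implemented.

```r
# Hypothetical long answer key: one row per item, with its list number.
toy_keys <- data.frame(
  List.ID = c(1, 1, 1, 2, 2, 2),
  Answer  = c("cat", "dog", "emu", "oak", "elm", "fir"),
  stringsAsFactors = FALSE
)

# Hypothetical data: each participant's first response on each list.
toy_first <- data.frame(
  Sub.ID      = c(1, 2, 3, 1, 2, 3),
  List.Number = c(1, 1, 1, 2, 2, 2),
  Answer      = c("cat", "cat", "emu", "fir", "elm", "fir"),
  stringsAsFactors = FALSE
)

# Compute PFR separately per list, using only that list's key.
pfr_by_list <- lapply(split(toy_first, toy_first$List.Number), function(d) {
  key <- toy_keys$Answer[toy_keys$List.ID == d$List.Number[1]]
  counts <- tabulate(match(d$Answer, key), nbins = length(key))
  data.frame(
    List.Number = d$List.Number[1],
    serial_position = seq_along(key),
    pfr = counts / nrow(d)
  )
})
pfr_all <- do.call(rbind, pfr_by_list)
print(pfr_all)
```

Because the list numbers in the key and in the participant data line up, each response is only ever compared with items from the list it belongs to.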

Proportion Correct Cued Recall

Description

This function computes the proportion of correct responses per participant. Proportions can either be separated by condition or collapsed across conditions. You will need to ensure each trial is marked with a unique id to correspond to the answer key.

Usage

prop_correct_cued(
  data,
  responses,
  key,
  key.trial,
  id,
  id.trial,
  cutoff = 0,
  flag = FALSE,
  group.by = NULL
)

Arguments

data

a dataframe of the variables you would like to return. Other variables will be included in the scored output and in the participant output if they are a one to one match with the participant id.

responses

a column name in the dataframe that contains the participant answers for each item, in quotes (e.g., "column")

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe.

key.trial

a vector containing the trial numbers for each answer. Note: If you input long data (i.e., repeating trial-answer responses), we will take the unique combination of the responses. If a trial number is repeated, you will receive an error. Key and key.trial can also be a separate dataframe, depending on how your output data is formatted.

id

a column name containing participant ID numbers from the original dataframe.

id.trial

a column name containing the trial numbers for the participant data from the original dataframe.

cutoff

a numeric value that determines the criterion for scoring (i.e., 0 = strictest, 5 = most lenient). The scoring criterion uses a Levenshtein distance measure to match participant responses to the answer key.

flag

a logical argument indicating whether to flag participant scores that are outliers, using z-scores away from the mean score for the group.

group.by

an optional argument that can be used to group the output by condition columns. These columns should be in the original dataframe and concatenated with c() if there are multiple columns.

Details

Note: other columns included in the dataframe will be found in the final scored dataset. If these other columns are between subjects data, they will also be included in the participant dataset (i.e., there's a one to one match of participant ID and column information).

Value

DF_Scored

The dataframe of the original response, answer, scoring, and any other or grouping variables. This dataframe can be used to determine if the cutoff score and scoring matched your answer key as intended. Distance measures are not perfect! Issues and suggestions for improvement are welcome.

DF_Participant

A dataframe of the proportion correct by participant, which also includes optional z-scoring, grouping, and other variables.

DF_Group

A dataframe of the summary scores by any optional grouping variables, along with overall total proportion correct scoring.

Examples

# This data contains cued recall test with responses and answers together.
# You can use a separate answer key, but this example will show you an
# embedded answer key. This example also shows how you can use different
# stimuli across participants (i.e., each person sees a randomly selected
# set of trials from a larger set).
data(cued_data)

scored_output <- prop_correct_cued(data = cued_data,
                                   responses = "response",
                                   key = "key",
                                   key.trial = "trial",
                                   id = "id",
                                   id.trial = "trial",
                                   cutoff = 1,
                                   flag = TRUE,
                                   group.by = "condition")

head(scored_output$DF_Scored)
head(scored_output$DF_Participant)
head(scored_output$DF_Group)
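
To see what the cutoff argument means in practice, the sketch below scores a toy cued recall dataset with base R's utils::adist(), which computes a generalized Levenshtein distance. Whether the package calls adist() internally is not stated in this manual, so treat that as an assumption; toy_cued and its contents are hypothetical.

```r
# Hypothetical cued recall data with an embedded answer key.
toy_cued <- data.frame(
  id       = c(1, 1, 2, 2),
  trial    = c(1, 2, 1, 2),
  response = c("table", "chiar", "tabel", "sofa"),
  key      = c("table", "chair", "table", "chair"),
  stringsAsFactors = FALSE
)

# Element-wise Levenshtein distance between each response and its own key.
toy_cued$distance <- mapply(
  function(resp, ans) as.vector(utils::adist(resp, ans)),
  toy_cued$response, toy_cued$key,
  USE.NAMES = FALSE
)

# A response is scored correct when its distance is within the cutoff.
cutoff <- 1
toy_cued$Scored <- as.numeric(toy_cued$distance <= cutoff)

# Proportion correct per participant.
toy_prop <- aggregate(Scored ~ id, data = toy_cued, FUN = mean)
print(toy_cued)
print(toy_prop)
```

Note that swapping two adjacent letters ("chiar" for "chair") costs two edits under plain Levenshtein distance, so it is not accepted at cutoff = 1. This is one reason to inspect DF_Scored before settling on a cutoff.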

Proportion Correct Free Recall

Description

This function computes the proportion of correct responses per participant. Proportions can either be separated by condition or collapsed across conditions.

Usage

prop_correct_free(
  data,
  responses,
  key,
  id,
  cutoff = 0,
  flag = FALSE,
  group.by = NULL
)

Arguments

data

a dataframe of the variables you would like to return. Other variables will be included in the scored output and in the participant output if they are a one to one match with the participant id.

responses

a column name in the dataframe that contains the participant answers for each item, in quotes (e.g., "column")

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe.

id

a column name containing participant ID numbers from the original dataframe

cutoff

a numeric value that determines the criterion for scoring (i.e., 0 = strictest, 5 = most lenient). The scoring criterion uses a Levenshtein distance measure to match participant responses to the answer key.

flag

a logical argument indicating whether to flag participant scores that are outliers, using z-scores away from the mean score for the group.

group.by

an optional argument that can be used to group the output by condition columns. These columns should be in the original dataframe and concatenated with c() if there are multiple columns.

Details

Note: other columns included in the dataframe will be found in the final scored dataset. If these other columns are between subjects data, they will also be included in the participant dataset (i.e., there's a one to one match of participant ID and column information).

Value

DF_Scored

The dataframe of the original response, answer, scoring, and any other or grouping variables. This dataframe can be used to determine if the cutoff score and scoring matched your answer key as intended. Distance measures are not perfect! Issues and suggestions for improvement are welcome.

DF_Participant

A dataframe of the proportion correct by participant, which also includes optional z-scoring, grouping, and other variables.

DF_Group

A dataframe of the summary scores by any optional grouping variables, along with overall total proportion correct scoring.

Examples

data(wide_data)
data(answer_key_free)

DF_long <- arrange_data(data = wide_data,
                        responses = "Response",
                        sep = ",",
                        id = "Sub.ID")

scored_output <- prop_correct_free(data = DF_long,
                                   responses = "response",
                                   key = answer_key_free$Answer_Key,
                                   id = "Sub.ID",
                                   cutoff = 1,
                                   flag = TRUE,
                                   group.by = "Disease.Condition")

head(scored_output$DF_Scored)
head(scored_output$DF_Participant)
head(scored_output$DF_Group)
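
In free recall there is no trial-by-trial pairing, so each response has to be compared against the whole answer key. The following sketch shows one way this could work with utils::adist(): take the closest key item and accept it if it is within the cutoff. The data (toy_key, toy_resp) are hypothetical, and both the nearest-match rule and dividing by the key length are assumptions made for illustration rather than a description of the package internals.

```r
toy_key <- c("apple", "pear", "plum", "kiwi")
toy_resp <- data.frame(
  Sub.ID   = c(1, 1, 1, 2, 2),
  response = c("aple", "pear", "grape", "kiwi", "plumb"),
  stringsAsFactors = FALSE
)
cutoff <- 1

# Distance from every response (rows) to every key item (columns).
d <- utils::adist(toy_resp$response, toy_key)

# Closest key item for each response, and how far away it is.
best <- apply(d, 1, which.min)
best_dist <- d[cbind(seq_len(nrow(d)), best)]

toy_resp$Answer <- ifelse(best_dist <= cutoff, toy_key[best], NA)
toy_resp$Scored <- as.numeric(best_dist <= cutoff)

# Proportion of the key each participant recalled (duplicates counted once).
recalled <- tapply(toy_resp$Answer, toy_resp$Sub.ID,
                   function(a) length(unique(a[!is.na(a)])))
toy_prop <- recalled / length(toy_key)
print(toy_resp)
print(toy_prop)
```

Here "aple" and "plumb" are accepted as near misses of key items, while "grape" is more than one edit away from every key item and is scored 0.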

Proportion Correct Free Recall for Multiple Lists

Description

This function computes the proportion of correct responses per participant. Proportions can either be separated by condition or collapsed across conditions. This function extends prop_correct_free() to include multiple or randomized lists for participants.

Usage

prop_correct_multiple(
  data,
  responses,
  key,
  key.trial,
  id,
  id.trial,
  cutoff = 0,
  flag = FALSE,
  group.by = NULL
)

Arguments

data

a dataframe of the variables you would like to return. Other variables will be included in the scored output and in the participant output if they are a one to one match with the participant id.

responses

a column name in the dataframe that contains the participant answers for each item, in quotes (e.g., "column")

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe.

key.trial

a vector containing the trial numbers for each answer. Note: If you input long data (i.e., repeating trial-answer responses), we will take the unique combination of the responses. If a trial number is repeated, you will receive an error. Key and key.trial can also be a separate dataframe, depending on how your output data is formatted.

id

a column name containing participant ID numbers from the original dataframe.

id.trial

a column name containing the trial numbers for the participant data from the original dataframe. Note that the free response "key" trial and this trial number should match. The trial key will be repeated for each answer a participant gave.

cutoff

a numeric value that determines the criterion for scoring (i.e., 0 = strictest, 5 = most lenient). The scoring criterion uses a Levenshtein distance measure to match participant responses to the answer key.

flag

a logical argument indicating whether to flag participant scores that are outliers, using z-scores away from the mean score for the group.

group.by

an optional argument that can be used to group the output by condition columns. These columns should be in the original dataframe and concatenated with c() if there are multiple columns.

Details

Note: other columns included in the dataframe will be found in the final scored dataset. If these other columns are between subjects data, they will also be included in the participant dataset (i.e., there's a one to one match of participant ID and column information).

Value

DF_Scored

The dataframe of the original response, answer, scoring, and any other or grouping variables. This dataframe can be used to determine if the cutoff score and scoring matched your answer key as intended. Distance measures are not perfect! Issues and suggestions for improvement are welcome.

DF_Participant

A dataframe of the proportion correct by participant, which also includes optional z-scoring, grouping, and other variables.

DF_Group

A dataframe of the summary scores by any optional grouping variables, along with overall total proportion correct scoring.

Examples

data("multi_data")
data("multi_answers")

DF_long <- arrange_data(data = multi_data,
                        responses = "Response",
                        sep = " ",
                        id = "Sub.ID",
                        repeated = "List.Number")

library(reshape)
multi_answers$position <- 1:nrow(multi_answers)
answer_long <- melt(multi_answers,
                    measured = colnames(multi_answers),
                    id = "position")
colnames(answer_long) <- c("position", "List.ID", "Answer")

answer_long$List.ID <- gsub(pattern = "List",
                            replacement = "",
                            x = answer_long$List.ID)

DF_long$response <- tolower(DF_long$response)
answer_long$Answer <- tolower(answer_long$Answer)
answer_long$Answer <- gsub(" ", "", answer_long$Answer)

scored_output <- prop_correct_multiple(data = DF_long,
                                       responses = "response",
                                       key = answer_long$Answer,
                                       key.trial = answer_long$List.ID,
                                       id = "Sub.ID",
                                       id.trial = "List.Number",
                                       cutoff = 1,
                                       flag = TRUE)

head(scored_output$DF_Scored)
head(scored_output$DF_Participant)
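
The flag argument is described as marking outlying participants by z-score relative to their group mean. A rough base R sketch of that idea follows. The data frame toy_scores, its column names, and the threshold of 2.5 are all hypothetical choices for illustration; the manual does not state which threshold the package applies.

```r
# Hypothetical participant-level proportions in two groups of ten.
toy_scores <- data.frame(
  Sub.ID = 1:20,
  Condition = rep(c("A", "B"), each = 10),
  Proportion.Correct = c(
    0.80, 0.82, 0.78, 0.81, 0.79, 0.83, 0.77, 0.80, 0.80, 0.10,
    0.50, 0.55, 0.45, 0.52, 0.48, 0.51, 0.49, 0.53, 0.47, 0.50
  )
)

# z-score each participant relative to their own group's mean and SD.
toy_scores$Z.Score <- ave(
  toy_scores$Proportion.Correct, toy_scores$Condition,
  FUN = function(x) (x - mean(x)) / sd(x)
)

# Assumed threshold, for illustration only.
threshold <- 2.5
toy_scores$Flagged <- abs(toy_scores$Z.Score) > threshold

print(toy_scores[toy_scores$Flagged, ])
```

Only the tenth participant, whose score sits far below the rest of group A, is flagged. Keep in mind that in small groups the largest attainable z-score is bounded by the group size, so a strict threshold may never be reached.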

Proportion Correct for Sentences

Description

This function computes the proportion of correct sentence responses per participant. Proportions can either be separated by condition or collapsed across conditions. You will need to ensure each trial is marked with a unique id to correspond to the answer key.

Usage

prop_correct_sentence(
  data,
  responses,
  key,
  key.trial,
  id,
  id.trial,
  cutoff = 0,
  flag = FALSE,
  group.by = NULL,
  token.split = " "
)

Arguments

data

a dataframe of the variables you would like to return.Other variables will be included in the scored output andin the participant output if they are a one to one match withthe participant id.

responses

a column name in the dataframe that containsthe participant answers for each item in quotes (i.e., "column")

key

a vector containing the scoring key or data column name.This column does not have to be included in the original dataframe.

key.trial

a vector containing the trial numbers for each answer.Note: If you input long data (i.e., repeating trial-answer responses),we will take the unique combination of the responses. If a trial numberis repeated, you will receive an error. Key and key.trial can also bea separate dataframe, depending on how your output data is formatted.

id

a column name containing participant ID numbers fromthe original dataframe

id.trial

a column name containing the trial numbersfor the participant data from the original dataframe

cutoff

a numeric value that determines the criteria for scoring (e.g., 0 = strictest, 5 = most lenient). The scoring criteria use a Levenshtein distance measure to match participant responses to the answer key.

flag

a logical argument indicating whether to flag participant scores that are outliers, using z-scores away from the mean score for the group

group.by

an optional argument that can be used to group the output by condition columns. These columns should be in the original dataframe and concatenated with c() if there are multiple columns

token.split

an optional argument that can be used to delineate how to separate tokens. The default is a space after punctuation and additional spacing is removed.
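
To make the interaction of cutoff and token.split concrete, the following is a minimal sketch in base R of token-level matching with a Levenshtein cutoff. It is an illustration under stated assumptions, not the package's implementation: score_sentence() is a hypothetical helper, and it lets several key tokens match the same response token, which a stricter scorer might not allow.

```r
# Hypothetical helper for illustration only; this is not lrd code.
score_sentence <- function(response, key, cutoff = 0, token.split = " ") {
  # Lower-case both strings and split them into tokens
  resp_tokens <- strsplit(tolower(response), token.split, fixed = TRUE)[[1]]
  key_tokens  <- strsplit(tolower(key), token.split, fixed = TRUE)[[1]]
  # Drop empty tokens produced by repeated separators
  resp_tokens <- resp_tokens[nzchar(resp_tokens)]
  key_tokens  <- key_tokens[nzchar(key_tokens)]
  if (length(resp_tokens) == 0 || length(key_tokens) == 0) return(0)
  # Levenshtein distance between every key token (rows) and response token (columns)
  d <- utils::adist(key_tokens, resp_tokens)
  # A key token counts as recalled if some response token is within the cutoff
  matched <- apply(d, 1, min) <= cutoff
  mean(matched)
}

strict  <- score_sentence("the dog chasd the cat", "The dog chased the cat", cutoff = 0)
lenient <- score_sentence("the dog chasd the cat", "The dog chased the cat", cutoff = 1)
print(c(strict = strict, lenient = lenient))
```

With cutoff = 0 the misspelled token is not credited, while cutoff = 1 tolerates a single-character edit, which is why it is worth inspecting DF_Scored before settling on a cutoff.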

Details

Note: other columns included in the dataframe will be found in the final scored dataset. If these other columns are between subjects data, they will also be included in the participant dataset (i.e., there's a one to one match of participant ID and column information).

Value

DF_Scored

The dataframe of the original response, answer, scoring, and any other or grouping variables. This dataframe can be used to determine if the cutoff score and scoring matched your answer key as intended. Distance measures are not perfect! Issues and suggestions for improvement are welcome.

DF_Participant

A dataframe of the proportion correct by participant, which also includes optional z-scoring, grouping, and other variables.

DF_Group

A dataframe of the summary scores by any optional grouping variables, along with overall total proportion correct scoring.
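
The optional z-scoring can be pictured with a short base R sketch. The data, the column names, and the threshold of 2 are all assumptions made for illustration; the manual does not state which threshold the package applies when flag = TRUE.

```r
# Hypothetical participant-level proportions; not actual lrd output.
participant <- data.frame(
  Sub.ID = 1:8,
  Proportion.Correct = c(0.80, 0.75, 0.85, 0.78, 0.82, 0.79, 0.81, 0.10)
)

# scale() centres on the mean and divides by the sample standard deviation
participant$Z.Score <- as.numeric(scale(participant$Proportion.Correct))

# Assumed threshold for illustration; pick one that suits your design
z_threshold <- 2
participant$Flagged <- abs(participant$Z.Score) >= z_threshold

print(participant)
```

When group.by is used, the same calculation would presumably be applied within each group rather than across the whole sample.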

Examples

# This data contains sentence recall test with responses and answers together.
# You can use a separate answer key, but this example will show you an
# embedded answer key. This example also shows how you can use different
# stimuli across participants (i.e., each person sees a randomly selected
# set of trials from a larger set).

data(sentence_data)

scored_output <- prop_correct_sentence(data = sentence_data,
                                       responses = "Response",
                                       key = "Sentence",
                                       key.trial = "Trial.ID",
                                       id = "Sub.ID",
                                       id.trial = "Trial.ID",
                                       cutoff = 1,
                                       flag = TRUE,
                                       group.by = "Condition",
                                       token.split = " ")

head(scored_output$DF_Scored)
head(scored_output$DF_Participant)
head(scored_output$DF_Group)

Rater Data

Description

Dataset that contains scoring and ratings for a recall test that was rated by two raters. Use with the kappa function as an example.

Usage

data(rater_data)

Format

A data frame of scored answers for inter-rater reliability

Sub.ID: the participant id
rater1_word: the word choice for the subject the rater selected
rater1_score: the score for the participant given by the rater
rater2_word: the word choice for the subject the rater selected
rater2_score: the score for the participant given by the rater
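
As a rough picture of what an inter-rater agreement check on two score columns involves, here is a minimal base R sketch of Cohen's kappa. The score vectors are made up, cohen_kappa() is a hypothetical helper, and it is an assumption that the package's kappa function reports this same statistic.

```r
# Hypothetical scores from two raters (1 = correct, 0 = incorrect); not rater_data.
rater1 <- c(1, 1, 0, 1, 0, 1, 0, 0, 1, 1)
rater2 <- c(1, 1, 0, 0, 0, 1, 0, 1, 1, 1)

# Hypothetical helper: Cohen's kappa from two vectors of categorical ratings
cohen_kappa <- function(r1, r2) {
  levels_all <- sort(unique(c(r1, r2)))
  # Cross-tabulate ratings with a shared set of levels so the table is square
  tab <- table(factor(r1, levels = levels_all), factor(r2, levels = levels_all))
  n <- sum(tab)
  p_observed <- sum(diag(tab)) / n                      # raw agreement
  p_expected <- sum(rowSums(tab) * colSums(tab)) / n^2  # agreement expected by chance
  (p_observed - p_expected) / (1 - p_expected)
}

kappa_value <- cohen_kappa(rater1, rater2)
print(kappa_value)
```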


Sentence Recall Data

Description

Dataset that includes sentence recall data in long format. Participants were given a sentence to remember, and then asked to recall the words. This dataset is in long format, which is required for these functions.

Usage

data(sentence_data)

Format

A data frame of answers for a sentence recall test

Sub.ID: the participant id
Trial.ID: the id for the trial given to participant
Sentence: the answer to the trial that the participant should have given
Response: the response the participant gave to that trial
Condition: the between subjects condition the participant was in


Serial Position Calculator

Description

This function calculates the proportion correct of each item in the serial position curve. Data should include the participant's answers in long format (use arrange_data() in this package for help), the answer key of the items in order, and a column that denotes the order a participant listed each item. The function will then calculate the items remembered within a window of 1 before or 1 after the tested position. The first and last positions must be answered in the correct place.
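
The window rule described above can be sketched in base R as follows. This is one reading of the description, not the package's code: in_window() is a hypothetical helper, and treating "first and last" as the first and last entries of the answer key is an assumption.

```r
# Hypothetical helper illustrating the +/- 1 window rule; not the package code.
in_window <- function(answer, position, key) {
  tested <- match(answer, key)   # tested position of each answer (NA if not in the key)
  n <- length(key)
  # First and last key items must be recalled in exactly their tested position
  strict <- tested %in% c(1, n)
  ok <- ifelse(strict, position == tested, abs(position - tested) <= 1)
  # Answers that are not in the key never count
  !is.na(tested) & ok
}

key <- c("apple", "bread", "chair", "dress", "eagle")
recall <- data.frame(
  answer = c("apple", "chair", "bread", "eagle"),
  position = 1:4
)
recall$serial_correct <- in_window(recall$answer, recall$position, key)
print(recall)
```

Here "chair" and "bread" are swapped but each lands within one position of where it was tested, so both count, whereas "eagle" is the last key item and is not credited because it was listed fourth rather than fifth.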

Usage

serial_position(data, position, answer, key, scored, group.by = NULL)

Arguments

data

a dataframe of the scored free recall that you would like to calculate - use prop_correct_free() for best formatting.

position

a column name in the dataframe that contains the answered position of each response in quotes (i.e., "column")

answer

a column name of the answer given for that position in the original dataframe.

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe. We assume your answer key is in the tested position order. You should not include duplicates in your answer key.

scored

a column in the original dataframe indicating if the participant got the answer correct (1) or incorrect (0).

group.by

an optional argument that can be used to group the output by condition columns. These columns should be in the original dataframe and concatenated c() if there are multiple columns

Details

This output can then be used to create serial position curve visualizations, and an example can be found in our manuscript/vignettes.

Important: The code is written assuming group.by variables are between subjects for an individual recall list. If repeated measures are used (i.e., there are multiple lists completed by each participant or multiple list versions), you should use this function several times, once on each list/answer key.

Value

DF_Serial

A dataframe of the proportion correct for each tested position by any optional grouping variables included.

Examples

data(free_data)
data(answer_key_free2)

free_data <- subset(free_data, List_Type == "Cat_Recall_L1")

DF_long <- arrange_data(data = free_data,
                        responses = "Response",
                        sep = " ",
                        id = "Username")

scored_output <- prop_correct_free(data = DF_long,
                                   responses = "response",
                                   key = answer_key_free2$Answer_Key,
                                   id = "Sub.ID",
                                   cutoff = 1,
                                   flag = TRUE,
                                   group.by = "Version")

serial_output <- serial_position(data = scored_output$DF_Scored,
                                 key = answer_key_free2$Answer_Key,
                                 position = "position",
                                 scored = "Scored",
                                 answer = "Answer",
                                 group.by = "Version")

head(serial_output)

Serial Position Calculator for Multiple Lists

Description

This function calculates the proportion correct of each item in the serial position curve. Data should include the participant's answers in long format (use arrange_data() in this package for help), the answer key of the items in order, and a column that denotes the order a participant listed each item. The function will then calculate the items remembered within a window of 1 before or 1 after the tested position. The first and last positions must be answered in the correct place. Specifically, this function is an extension of serial_position() for free recall when there are multiple lists or randomized lists.

Usage

serial_position_multiple(
  data,
  position,
  answer,
  key,
  key.trial,
  id.trial,
  scored,
  group.by = NULL
)

Arguments

data

a dataframe of the scored free recall that you would like to calculate - use prop_correct_multiple() for best formatting.

position

a column name in the dataframe that contains the answered position of each response in quotes (i.e., "column")

answer

a column name of the answer given for that position in the original dataframe.

key

a vector containing the scoring key or data column name. This column does not have to be included in the original dataframe. We assume your answer key is in the tested position order. You should not include duplicates in your answer key.

key.trial

a vector containing the trial numbers for each answer. Note: If you input long data (i.e., repeating trial-answer responses), we will take the unique combination of the responses. If a trial number is repeated, you will receive an error. Key and key.trial can also be a separate dataframe, depending on how your output data is formatted.

id.trial

a column name containing the trial numbers for the participant data from the original dataframe. Note that the free response "key" trial and this trial number should match. The trial key will be repeated for each answer a participant gave.

scored

a column in the original dataframe indicating if the participant got the answer correct (1) or incorrect (0).

group.by

an optional argument that can be used to group the output by condition columns. These columns should be in the original dataframe and concatenated c() if there are multiple columns
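
Because key and key.trial must line up with id.trial, it can help to see how a wide answer key (one column per list) becomes a long key. The sketch below uses base R stack() on a made-up data frame; answers_wide and its column names are hypothetical, and the Examples in this manual use reshape::melt() for the same step.

```r
# Hypothetical wide answer key with one column per list; not the multi_answers data.
answers_wide <- data.frame(
  List1 = c("apple", "bread", "chair"),
  List2 = c("dress", "eagle", "fence"),
  stringsAsFactors = FALSE
)

# stack() returns `values` (the answers) and `ind` (the source column name)
answer_long <- stack(answers_wide)
names(answer_long) <- c("Answer", "List.ID")

# Strip the "List" prefix so the key trial matches a numeric list column in the data
answer_long$List.ID <- gsub("List", "", as.character(answer_long$List.ID))

# Tested position within each list, following the order of the key
answer_long$position <- ave(seq_along(answer_long$Answer),
                            answer_long$List.ID,
                            FUN = seq_along)
print(answer_long)
```

The resulting Answer and List.ID columns are the kind of vectors that would be passed as key and key.trial.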

Details

This output can then be used to create serial position curve visualizations, and an example can be found in our manuscript/vignettes.

Value

DF_Serial

A dataframe of the proportion correct for each tested position by any optional grouping variables included.

Examples

data("multi_data")
data("multi_answers")

DF_long <- arrange_data(data = multi_data,
                        responses = "Response",
                        sep = " ",
                        id = "Sub.ID",
                        repeated = "List.Number")

library(reshape)
multi_answers$position <- 1:nrow(multi_answers)
answer_long <- melt(multi_answers,
                    measured = colnames(multi_answers),
                    id = "position")
colnames(answer_long) <- c("position", "List.ID", "Answer")
answer_long$List.ID <- gsub(pattern = "List",
                            replacement = "",
                            x = answer_long$List.ID)

DF_long$response <- tolower(DF_long$response)
answer_long$Answer <- tolower(answer_long$Answer)
answer_long$Answer <- gsub(" ", "", answer_long$Answer)

scored_output <- prop_correct_multiple(data = DF_long,
                                       responses = "response",
                                       key = answer_long$Answer,
                                       key.trial = answer_long$List.ID,
                                       id = "Sub.ID",
                                       id.trial = "List.Number",
                                       cutoff = 1,
                                       flag = TRUE)

head(scored_output$DF_Scored)
head(scored_output$DF_Participant)

serial_output <- serial_position_multiple(data = scored_output$DF_Scored,
                                          position = "position",
                                          answer = "Answer",
                                          key = answer_long$Answer,
                                          key.trial = answer_long$List.ID,
                                          scored = "Scored",
                                          id.trial = "List.Number")

head(serial_output)

Free Recall Data in Wide Format

Description

Dataset that includes free recall data in wide format. Participants were given a list of words to remember, and then asked to recall the words. This dataset is in wide format, which should be converted with arrange_data().

Usage

data(wide_data)

Format

A data frame of answers for a free recall test

Sub.ID: the participant id
Response: the response the participant gave to the cue
Disease.Condition: healthy or sick participant condition
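
For readers unsure what the wide-to-long conversion involves, here is a minimal base R sketch on made-up data. It only illustrates the idea; arrange_data() is the supported route, and the output column names response and position are assumptions modelled on the examples elsewhere in this manual.

```r
# Hypothetical wide-format data: all of a participant's responses sit in one cell.
wide <- data.frame(
  Sub.ID = c(1, 2),
  Response = c("apple, bread, chair", "dress, eagle"),
  stringsAsFactors = FALSE
)

# Split each cell on the separator and trim surrounding whitespace
parts <- lapply(strsplit(wide$Response, ",", fixed = TRUE), trimws)

# One row per response, keeping the order in which each response was listed
long <- data.frame(
  Sub.ID = rep(wide$Sub.ID, lengths(parts)),
  response = unlist(parts),
  position = sequence(lengths(parts)),
  stringsAsFactors = FALSE
)
print(long)
```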

