
Introduction

The main focus of stenR is the standardization and normalization of raw questionnaire or survey scores on the basis of Classical Test Theory.

In psychology and other social sciences in particular, it is very common not to interpret the raw results of a measurement in an individual context. In fact, doing so would usually be a mistake. Instead, the score of a single respondent needs to be evaluated against a larger sample. This can be done by finding the place of every individual raw score in the distribution of a representative sample. One can refer to this process as normalization. An additional step in this phase is to standardize the data even further: from the quantile to a fitting standard scale.
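As a minimal base R sketch of what "finding the place of a raw score in the distribution" can mean, the snippet below uses the empirical cumulative distribution function. The sample values and the names `sample_scores`, `place_in_sample`, `p_15` and `z_15` are hypothetical, and this is not necessarily how stenR computes it internally.

```r
# Hypothetical raw scores from a small "representative" sample
sample_scores <- c(10, 12, 12, 13, 14, 15, 15, 16, 18, 19, 20, 28)

# Empirical cumulative distribution function of the sample (base R, stats)
place_in_sample <- ecdf(sample_scores)

# Proportion of the sample scoring at or below a raw score of 15
p_15 <- place_in_sample(15)

# The same place expressed as a quantile of the standard normal distribution
z_15 <- qnorm(p_15)

print(p_15)
print(z_15)
```

Here 7 of the 12 sample scores are at or below 15, so `p_15` equals 7/12. Note that the highest score in a sample gets a proportion of 1, for which `qnorm()` returns `Inf`; practical implementations adjust the proportions to avoid this.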

It needs to be noted that one answer to one question (or item) is rarely enough to measure a latent variable. Almost always there is a need to construct a scale or factor of similar items to gather a behavioral sample. This vital preprocessing phase of transforming the item-level raw scores to scale-level ones can also be handled by functions available in this package, though this feature is not the main focus.

Factor analysis and the actual construction of scales or factors are beyond the scope of this package. There are multiple useful and solid tools available for this. Look to psych and/or lavaan for these features.

This journey from raw questionnaire data to normalized and standardized results will be presented in this vignette.

Raw questionnaire data preprocessing

We will work on the dataset available in this package: SLCS. It contains the answers of 103 people to the Polish version of the Self-Liking Self-Competence Scale.

library(stenR)
#> This is version 0.6.9 of stenR package.
#> Visit https://github.com/statismike/stenR to report an issue or contribute. If you like it - star it!
str(SLCS)
#> 'data.frame':    103 obs. of  19 variables:
#>  $ user_id: chr  "damaged_kiwi" "unilateralised_anglerfish" "technical_anemonecrab" "temperate_americancurl" ...
#>  $ sex    : chr  "M" "F" "F" "F" ...
#>  $ age    : int  30 31 22 26 22 17 27 24 20 19 ...
#>  $ SLCS_1 : int  4 5 4 5 5 5 5 4 4 5 ...
#>  $ SLCS_2 : int  2 2 4 3 2 3 1 5 2 1 ...
#>  $ SLCS_3 : int  1 2 4 2 3 1 1 4 1 2 ...
#>  $ SLCS_4 : int  2 1 4 2 4 2 1 4 4 2 ...
#>  $ SLCS_5 : int  2 2 4 1 2 2 2 4 2 2 ...
#>  $ SLCS_6 : int  4 4 5 5 5 5 1 2 5 4 ...
#>  $ SLCS_7 : int  4 4 4 5 3 5 2 3 5 3 ...
#>  $ SLCS_8 : int  4 5 4 5 4 5 5 4 4 5 ...
#>  $ SLCS_9 : int  2 3 2 1 3 1 1 4 1 1 ...
#>  $ SLCS_10: int  4 4 3 4 4 4 5 4 5 5 ...
#>  $ SLCS_11: int  1 1 2 1 1 2 1 3 1 1 ...
#>  $ SLCS_12: int  4 2 4 3 3 2 2 4 3 1 ...
#>  $ SLCS_13: int  4 5 5 4 3 4 4 4 5 5 ...
#>  $ SLCS_14: int  2 1 3 2 4 1 1 4 1 1 ...
#>  $ SLCS_15: int  5 4 4 4 4 3 3 2 5 4 ...
#>  $ SLCS_16: int  4 5 5 4 5 4 5 5 5 5 ...

As can be seen above, it contains some demographic data and each respondent's answers to 16 diagnostic items.

The authors of the measure have prepared instructions for calculating the scores for two subscales (Self-Liking and Self-Competence). The General Score is simply the sum of the subscale scores.

Self-Liking: 1R, 3, 5, 6R, 7R, 9, 11, 15R
Self-Competence: 2, 4, 8R, 10R, 12, 13R, 14, 16

Item numbers suffixed with R mean that the particular item needs to be reversed before being summed with the rest of the items to calculate the raw score for a subscale. That’s because, during the construction of the measure, the answers to these items were negatively correlated with the whole scale.
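To make the scoring key concrete, here is a base R sketch that scores the first respondent by hand. The answers are copied from the `str(SLCS)` output above; the helper `score_scale()` and the other object names are hypothetical and not part of stenR. It assumes the usual reversal rule for a 1 to 5 item: reversed answer = minimum + maximum - answer.

```r
# Answers of the first respondent ("damaged_kiwi"), copied from str(SLCS)
answers <- c(
  SLCS_1 = 4, SLCS_2 = 2, SLCS_3 = 1, SLCS_4 = 2,
  SLCS_5 = 2, SLCS_6 = 4, SLCS_7 = 4, SLCS_8 = 4,
  SLCS_9 = 2, SLCS_10 = 4, SLCS_11 = 1, SLCS_12 = 4,
  SLCS_13 = 4, SLCS_14 = 2, SLCS_15 = 5, SLCS_16 = 4
)

# Hypothetical helper: reverse the flagged items, then sum the scale
score_scale <- function(x, items, reversed, item_min = 1, item_max = 5) {
  vals <- x[items]
  vals[reversed] <- item_min + item_max - vals[reversed]
  sum(vals)
}

self_liking <- score_scale(
  answers,
  items = paste0("SLCS_", c(1, 3, 5, 6, 7, 9, 11, 15)),
  reversed = paste0("SLCS_", c(1, 6, 7, 15))
)
self_competence <- score_scale(
  answers,
  items = paste0("SLCS_", c(2, 4, 8, 10, 12, 13, 14, 16)),
  reversed = paste0("SLCS_", c(8, 10, 13))
)
general_score <- self_liking + self_competence

print(c(self_liking, self_competence, general_score))
#> [1] 13 20 33
```

These values agree with the scale-level scores of the first respondent reported in this vignette.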

All of these steps can be achieved using the item-preprocessing functions from stenR.

First, you need to create scale specification objects that refer to the items in the available data by name. They also need to list the items that need reversing (if any) and declare NA insertion strategies (by default: no insertion).

The absolute minimum and maximum score for each item also need to be provided at this step. This allows correct computation even if the absolute values are not actually present in the data that will be summed into a factor. This situation should not happen during the first computation of the score table on a full representative sample, but it is very likely to happen when summarizing scores for only a few observations.
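A small base R sketch can illustrate why the absolute bounds matter when reversing items. The function `reverse_item()` and the data are hypothetical; the sketch assumes the common reversal rule minimum + maximum - answer and does not show stenR's internals.

```r
# Hypothetical helper: reverse an item given its lower and upper bound
reverse_item <- function(x, item_min, item_max) item_min + item_max - x

# Only a few observations: nobody answered 1 or 2 on this 1-5 item
few_answers <- c(3, 4, 5)

# Correct: bounds taken from the item definition (1 to 5)
with_absolute <- reverse_item(few_answers, item_min = 1, item_max = 5)

# Incorrect: bounds guessed from the observed data (3 to 5)
with_observed <- reverse_item(
  few_answers,
  item_min = min(few_answers),
  item_max = max(few_answers)
)

print(with_absolute)
#> [1] 3 2 1
print(with_observed)
#> [1] 5 4 3
```

With bounds guessed from the data, an answer of 3 would be reversed to 5 instead of staying at 3, which is why the specification asks for the absolute values.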

# create ScaleSpec objects for sub-scales
SL_spec <- ScaleSpec(
  name = "Self-Liking",
  item_names = c("SLCS_1", "SLCS_3", "SLCS_5", "SLCS_6", "SLCS_7", "SLCS_9", "SLCS_11", "SLCS_15"),
  min = 1,
  max = 5,
  reverse = c("SLCS_1", "SLCS_6", "SLCS_7", "SLCS_15")
)
SC_spec <- ScaleSpec(
  name = "Self-Competence",
  item_names = c("SLCS_2", "SLCS_4", "SLCS_8", "SLCS_10", "SLCS_12", "SLCS_13", "SLCS_14", "SLCS_16"),
  min = 1,
  max = 5,
  reverse = c("SLCS_8", "SLCS_10", "SLCS_13")
)

# create CombScaleSpec object for general scale using single-scale
# specification
GS_spec <- CombScaleSpec(
  name = "General Score",
  SL_spec,
  SC_spec
)

print(SL_spec)
#> <ScaleSpec>: Self-Liking
#> No. items: 8 [4 reversed]
print(SC_spec)
#> <ScaleSpec>: Self-Competence
#> No. items: 8 [3 reversed]
print(GS_spec)
#> <CombScaleSpec>: General Score
#> Total items: 16
#> Underlying objects:
#> 1. <ScaleSpec> Self-Liking [No.items: 8]
#> 2. <ScaleSpec> Self-Competence [No.items: 8]

After the scale specification objects have been created, we can finally transform our item-level raw scores to scale-level ones using the sum_items_to_scale() function.

Each ScaleSpec or CombScaleSpec object provided in the call will be used to create one variable, taking into account the items that need reversing (or the sub-scales in the case of CombScaleSpec), as well as the NA imputation strategies chosen for each of the scales.

By default, only these columns will be available in the resulting data.frame, but we can keep additional columns from the input data by specifying the retain argument.

summed_data <- sum_items_to_scale(
  data = SLCS,
  SL_spec,
  SC_spec,
  GS_spec,
  retain = c("user_id", "sex")
)

str(summed_data)
#> 'data.frame':    103 obs. of  5 variables:
#>  $ user_id        : chr  "damaged_kiwi" "unilateralised_anglerfish" "technical_anemonecrab" "temperate_americancurl" ...
#>  $ sex            : chr  "M" "F" "F" "F" ...
#>  $ Self-Liking    : int  13 15 19 10 16 12 18 28 10 14 ...
#>  $ Self-Competence: int  20 15 26 19 25 17 14 28 19 13 ...
#>  $ General Score  : int  33 30 45 29 41 29 32 56 29 27 ...

At this point we have successfully prepared our data: it now describes the latent variables that we actually wanted to measure, not individual items. Everything is in place for the next step: normalization and standardization of the results.

Both ScaleSpec and CombScaleSpec objects have their own print and summary methods defined.

Normalize and standardize the results

We will take a brief look at the procedural workflow of normalization and standardization. It should be noted that it is more verbose and has fewer features than the object-oriented workflow. Nevertheless, it is recommended for useRs who don’t have much experience with R6 classes. For more information about both, read the Procedural and Object-oriented workflows of stenR vignette.

To process the data, stenR needs to compute an object of class ScoreTable. It is very similar to the regular score tables found in the documentation of many measures, though it is computed directly from the raw scores available in a representative sample. After this initial construction it can be reused on new observations.
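Conceptually, a score table is a lookup from raw scores to standard scores. The base R sketch below shows the reuse idea with a plain data.frame; the table values and the names `score_table`, `new_raw` and `new_sten` are invented for illustration, and this is not the structure of stenR's ScoreTable class.

```r
# Hypothetical score table built once from a representative sample
score_table <- data.frame(
  raw  = 10:14,
  sten = c(2, 3, 5, 6, 8)
)

# New observations are scored by looking their raw scores up in the table
new_raw <- c(12, 10, 14)
new_sten <- score_table$sten[match(new_raw, score_table$raw)]

print(new_sten)
#> [1] 5 2 8
```

A raw score that is absent from such a plain lookup would yield `NA`, which hints at why gaps in the observed raw score range need dedicated handling.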

This is a two-step process. First, we need to compute a FrequencyTable object, which does not yet involve any standard score scale, for every sub-scale and scale.

# Create the FrequencyTables
SL_ft <- FrequencyTable(summed_data$`Self-Liking`)
#> ℹ There are missing raw score values between minimum and maximum raw scores.
#>   They have been filled automatically.
#>   No. missing: 3/33 [9.09%]
SC_ft <- FrequencyTable(summed_data$`Self-Competence`)
#> ℹ There are missing raw score values between minimum and maximum raw scores.
#>   They have been filled automatically.
#>   No. missing: 1/24 [4.17%]
GS_ft <- FrequencyTable(summed_data$`General Score`)
#> ℹ There are missing raw score values between minimum and maximum raw scores.
#>   They have been filled automatically.
#>   No. missing: 13/53 [24.53%]

Some messages were printed above: they are generated when some raw score values are missing between the actual minimum and maximum raw scores. As a rule of thumb, the wider the raw score range and the smaller and less representative the sample, the more likely this is to happen. If it does happen, it is recommended to try to gather a bigger sample, unless you are sure that the current one is representative enough.
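The following base R sketch shows how such gaps can be detected in a vector of raw scores. The data and object names are hypothetical, and it only counts the gaps; how stenR fills them automatically is not shown here.

```r
# Hypothetical raw scale scores from a very small sample
raw <- c(10, 12, 12, 13, 16, 16, 18)

# Every integer value between the observed minimum and maximum
all_values <- seq(min(raw), max(raw))

# Frequency of each possible value, including those never observed
freq <- table(factor(raw, levels = all_values))

# Values inside the observed range that nobody scored
missing_values <- all_values[as.vector(freq) == 0]

print(missing_values)
#> [1] 11 14 15 17
cat("No. missing:", length(missing_values), "/", length(all_values), "\n")
```

In this toy sample 4 of the 9 values in the observed range never occur, a much higher share than in the SLCS data, as expected for such a small sample.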

Once the FrequencyTable objects are defined, they can be transformed into ScoreTable objects by providing them with a StandardScale object. Objects for some of the more popular scales in psychology are already defined. We will use the very commonly utilized Standard Ten Scale: STEN.

# Check what is the STEN *StandardScale* definition
print(STEN)
#> <StandardScale>: sten
#> `M`: 5.5 `SD`: 2 `min` 1: `max`: 10

# Calculate the ScoreTables
SL_st <- ScoreTable(SL_ft, STEN)
SC_st <- ScoreTable(SC_ft, STEN)
GS_st <- ScoreTable(GS_ft, STEN)
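To give an intuition for what the STEN definition above implies, here is a base R sketch that maps standard normal z-scores onto a scale with mean 5.5 and standard deviation 2, limited to the range 1 to 10. The function `to_sten()` is hypothetical, and its rounding rule is an assumption; stenR's exact computation may differ.

```r
# Hypothetical helper: linear transformation of z-scores to the sten scale,
# rounded to whole numbers and clamped to the scale limits
to_sten <- function(z, M = 5.5, SD = 2, min_score = 1, max_score = 10) {
  score <- round(M + SD * z)
  pmin(pmax(score, min_score), max_score)
}

z_scores <- c(-2.6, -1.1, -0.2, 0.2, 1.1, 2.6)
sten_scores <- to_sten(z_scores)

print(sten_scores)
#> [1]  1  3  5  6  8 10
```

The extreme z-scores would land below 1 and above 10 after the linear transformation, so they are clamped to the scale limits.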

At this point, the last thing that remains is to normalize the scores. This can be done using the normalize_score() or normalize_scores_df() functions.

# normalize each of the scores in one call
normalized_at_once <- normalize_scores_df(
  summed_data,
  vars = c("Self-Liking", "Self-Competence", "General Score"),
  SL_st,
  SC_st,
  GS_st,
  what = "sten",
  retain = c("user_id", "sex")
)

str(normalized_at_once)
#> 'data.frame':    103 obs. of  5 variables:
#>  $ user_id        : chr  "damaged_kiwi" "unilateralised_anglerfish" "technical_anemonecrab" "temperate_americancurl" ...
#>  $ sex            : chr  "M" "F" "F" "F" ...
#>  $ Self-Liking    : num  3 4 5 2 4 3 5 8 2 4 ...
#>  $ Self-Competence: num  5 2 7 4 7 3 2 8 4 2 ...
#>  $ General Score  : num  4 3 6 3 5 3 4 8 3 2 ...

# or normalize scores individually
SL_sten <- normalize_score(summed_data$`Self-Liking`, table = SL_st, what = "sten")
SC_sten <- normalize_score(summed_data$`Self-Competence`, table = SC_st, what = "sten")
GC_sten <- normalize_score(summed_data$`General Score`, table = GS_st, what = "sten")

# check the structure
str(list(SL_sten, SC_sten, GC_sten))
#> List of 3
#>  $ : num [1:103] 3 4 5 2 4 3 5 8 2 4 ...
#>  $ : num [1:103] 5 2 7 4 7 3 2 8 4 2 ...
#>  $ : num [1:103] 4 3 6 3 5 3 4 8 3 2 ...
