
The video-based training protocol was used as the means of training participants in dispositional assessment. The purpose of the trainings was to increase the consistency of admissions raters in evaluating the admissions interviews of applicants to a master’s-level counselor education program. Typically, participants completed the video training in small groups of approximately six to 10 people. In addition to viewing the training video, participants took part in group discussion and established a consensus of opinion on group ratings of video clips. Coming to a consensus on ratings, which also included feedback on rubric items and video clips, was an important aspect of the training.

Statistical Analysis

The PDCA-RA scores from the counselor education faculty, adjunct faculty, doctoral students, and site supervisors’ ratings of the vignettes before training were used as the pretest, or baseline, interrater reliability. The PDCA-RA scores after participants were trained in the tool were used as the posttest. The intraclass correlation coefficient (ICC) was calculated as a measure of interrater reliability. Interrater reliability correlations quantify rater subjectivity (Herman et al., 1992). The ICC was calculated for pretest and posttest scores. Cronbach’s alpha coefficients were calculated for internal consistency, and Fleiss’ kappa (κ) was calculated for absolute agreement. In addition, Fleiss’ free-marginal kappa (κfree) and percent overall agreement were calculated. Calculations were made for both the pretest and posttest ratings, and a t-test was conducted, using SPSS, to determine whether training improved interrater reliability.

Results

The ICC estimates and associated 95% confidence intervals were calculated using the SPSS statistical package, version 23, based on an individual-rating, absolute-agreement, two-way random-effects model. The single-measures ICC for absolute agreement was .53 (95% CI [0.333–0.807]) for the pretest administration of the PDCA-RA and .76 (95% CI [0.582–0.920]) for the posttest administration. Cronbach’s alpha was calculated at .99 for both pretest and posttest scores. Pretest and posttest ICCs were compared using a t-test with an a priori significance level set at .05. The test was significant (p < .05), suggesting that there was a difference between the pretest and posttest reliability, with reliability improving from the “moderate” range to the “good” range (Koo & Li, 2016) with training.

Using Excel, kappa (κ) was calculated as a measure of overall agreement for pretest and posttest scores. This kappa, extended by Fleiss (1971), accommodates multiple raters such as those rating the PDCA-RA. Assumptions underpinning Fleiss’ kappa include categorical data (i.e., nominal or ordinal) with mutually exclusive categories, symmetrical cross-tabulations, and independence of raters. The data in this study met all assumptions. The data were ordinal, with three mutually exclusive response categories for each dispositional area assessed, which resulted in all cross-tabulations being symmetrical. Although raters were trained in a collaborative setting where discussions about ratings were fostered, when the actual ratings of study participants occurred, raters did not discuss their ratings with others and were thus independent of one another.
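As a rough illustration of the reliability estimate reported above, the sketch below shows one way a single-measures, absolute-agreement, two-way random-effects ICC and its 95% confidence interval could be computed outside of SPSS, here with the open-source pingouin library for Python. The data frame, clip identifiers, rater labels, and rating values are hypothetical placeholders, not study data.

```python
# Minimal sketch (not the authors' SPSS procedure): single-measures,
# absolute-agreement, two-way random-effects ICC for multi-rater ratings.
import pandas as pd
import pingouin as pg

# Long-format data: one row per (video clip, rater) pair with a rubric rating.
# All values below are hypothetical.
ratings = pd.DataFrame({
    "clip": sorted([1, 2, 3, 4, 5, 6] * 3),   # hypothetical rated video clips
    "rater": ["A", "B", "C"] * 6,             # hypothetical raters
    "rating": [2, 3, 2, 1, 2, 1, 3, 3, 2, 2, 2, 3, 1, 1, 2, 3, 2, 3],
})

icc = pg.intraclass_corr(data=ratings, targets="clip",
                         raters="rater", ratings="rating")

# "ICC2" is the single-measures, absolute-agreement, two-way random-effects
# estimate; the output table also reports its 95% confidence interval.
print(icc.set_index("Type").loc["ICC2", ["ICC", "CI95%"]])
```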
Pretest scores for the nine rubric items reflected a κ of .33, fair agreement according to Landis and Koch (1977). After training, posttest scores on the nine items reflected a κ of .55, moderate agreement according to Landis and Koch. As an additional analysis, percent overall agreement and κfree were calculated. κfree is appropriate when raters do not know how many cases should be distributed into each category. In addition, κfree is resistant to influence by prevalence and bias (Randolph, 2005). The percent of overall agreement is the measure of agreement between raters and historically has also been used to calculate interrater reliability.
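A minimal sketch of how these three agreement indices can be computed, assuming a hypothetical subjects-by-raters matrix of ordinal ratings rather than the study data or the authors’ Excel workbook, is shown below using the statsmodels Python library. Fleiss’ κ derives chance agreement from the observed category margins, whereas Randolph’s κfree assumes a uniform chance distribution, consistent with its use when raters do not know how many cases fall into each category.

```python
# Minimal sketch: Fleiss' kappa, Randolph's free-marginal kappa (κfree), and
# percent overall agreement for multiple raters. The ratings matrix is
# hypothetical; rows are rated items/clips, columns are raters, and values
# are three ordinal response categories coded 0-2.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

ratings = np.array([
    [0, 0, 1, 0],
    [2, 2, 2, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [2, 2, 1, 2],
    [1, 0, 1, 1],
])

# Convert to a subjects x categories table of how many raters chose each category.
table, _ = aggregate_raters(ratings)

kappa_fleiss = fleiss_kappa(table, method="fleiss")    # chance from observed margins
kappa_free = fleiss_kappa(table, method="randolph")    # uniform chance (Randolph, 2005)

# Percent overall agreement: mean proportion of agreeing rater pairs per subject.
n_raters = table.sum(axis=1)
pairs_agree = (table * (table - 1)).sum(axis=1)
percent_agreement = (pairs_agree / (n_raters * (n_raters - 1))).mean()

print(kappa_fleiss, kappa_free, percent_agreement)
```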
