
Therefore, standards related to the alignment of the instruments with DSM symptoms (i.e., evidence based on test content) are circumspect. As Helzer et al. (2006) reported, the dimensional approach to diagnosis must align with the definition of the diagnosis in the DSM-5.

Connecting Validity Standards to CCSMs

Pertinent to the utilization of the emerging measures for the purposes of diagnosis and clinical decision making is the extent to which the measures align with diagnostic criteria and are useful. The American Educational Research Association (AERA), the American Psychological Association, and the National Council on Measurement in Education (NCME) jointly publish the Standards for Educational and Psychological Testing. AERA et al. (1999) outlined issues related to instrument development, fairness and bias, and the application of results to various settings (e.g., educational, vocational, psychological). With respect to evaluating research, issues of test construction, specifically evaluating validity and reliability, need to be addressed. According to AERA et al., “validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests” (1999, p. 9). Validity, therefore, is not simply about the alignment of an instrument with theory and research, but also about how the scores are used.

The most recent edition of the Standards was published in 1999, representing the fourth edition of the joint publication and the sixth publication by at least one of the representative bodies. As of August 2013, AERA et al. had approved a revision to the 1999 Standards; however, a publication date is pending the development of a new agreement regarding how the revised Standards will be managed and published (AERA et al., 2009). Thus, the 1999 Standards represent the most current edition for measurement guidelines.

AERA et al. (1999) identified five sources of evidence for evaluating the validity of a measure: (a) evidence based on test content, (b) evidence based on response processes, (c) evidence based on internal structure, (d) evidence based on relationships to other variables, and (e) evidence based on consequences of testing. Evidence based on test content relates specifically to the extent to which the items align with existing theory and the operational definition of the construct; such evidence often is established through documentation of a review of extant literature and expert review. Evidence based on response processes includes an analysis of how respondents answer or perform on given items; in counseling research, some documentation about how respondents interpret the items may be noted. Evidence based on internal structure refers to the psychometric properties of the instrument. For example, items on a scale should be correlated because they measure the same construct, but they should not be so highly correlated that the items fail to measure anything unique. Generally, factor analysis and reliability estimates are used to indicate adequate factor structure and accurate, consistent responses for scores. Evidence based on relationships to other variables usually is demonstrated through some type of correlational research in which the scores on an instrument are correlated with scores on another instrument; hence, how an instrument correlates with another instrument provides evidence that the same construct is being measured.
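As a concrete illustration of the last two sources of evidence, consider two standard psychometric expressions (these formulas are conventional statistics and are not drawn from AERA et al., 1999). Internal consistency reliability for a scale of k items commonly is estimated with coefficient alpha,

\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} s_i^2}{s_X^2}\right),

where s_i^2 is the variance of item i and s_X^2 is the variance of the total score. Evidence based on relationships to other variables commonly is quantified with the Pearson correlation between scores on the instrument (X) and scores on an established measure of the same construct (Y):

r_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_i (X_i - \bar{X})^2}\,\sqrt{\sum_i (Y_i - \bar{Y})^2}}.

By convention, alpha values of .70 or higher often are interpreted as acceptable, although appropriate cutoffs vary with the intended use of the scores.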
Evidence based on consequences of testing refers to the need to document the “intended and unintended consequences” of test scores (AERA et al., 1999, p. 16). The choice to use scores on an instrument should be aligned with theory and practice.

Evidence of Validity for the Emerging Measures

Addressing the psychometric properties of each of the measures is outside the scope of this article. The APA promoted various measures with common psychometric properties reported extensively in research, while other measures’ psychometric properties were not as evident (Aldea, Rahman, & Storch, 2009; Allgaier, Pietsch, Frühe, Sigl-Glöckner, & Schulte-Körne, 2012; Altman, Hedeker, Peterson, & Davis, 1997; Feldman, Joormann, & Johnson, 2008; Han et al., 2009; Livianos-Aldana & Rojo-Moreno, 2001; Storch et al., 2007; Storch et al., 2009; Stringaris et al., 2012; Titov et al., 2011). The measures with reported properties demonstrated fairly strong psychometrics. However, not all of the promoted measures have extensive reports (e.g., PROMIS
