INTRODUCTION: This study examined the reliability of scores from the Debriefing Assessment for Simulation in Healthcare (DASH), an instrument for evaluating the quality of health care simulation debriefings. A secondary objective was to evaluate whether the instrument's scores show evidence of validity.

METHODS: Two aspects of reliability were examined: interrater reliability and internal consistency. To assess interrater reliability, intraclass correlations were calculated from the ratings of 114 simulation instructors enrolled in webinar training courses in the use of the DASH. The instructors reviewed a series of 3 standardized debriefing sessions. To assess internal consistency, Cronbach alpha was calculated for this cohort. Finally, 1 measure of validity was examined by comparing the scores across the 3 debriefings, which differed in quality.

RESULTS: Intraclass correlation coefficients for the individual elements were predominantly greater than 0.6. The overall intraclass correlation coefficient for the combined elements was 0.74. Cronbach alpha was 0.89 across the webinar raters. The ratings for the 3 standardized debriefings differed significantly (P < 0.001).

CONCLUSIONS: The DASH scores showed evidence of good reliability and preliminary evidence of validity. Additional work is needed to assess the generalizability of the DASH, based on the psychometric properties of DASH data from other settings.
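
For readers unfamiliar with the internal consistency statistic named above, the following is a minimal sketch of the standard Cronbach alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It is not the study's analysis code: the function name `cronbach_alpha`, the layout of the input (one row per rater, one column per rated element), and the example scores are all assumptions made for illustration, and the numbers are not data from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach alpha for a table of scores.

    `scores` is a list of rows. Each row is assumed (for illustration)
    to hold one rater's scores on the k rated elements.
    """
    n_rows = len(scores)
    k = len(scores[0])
    if n_rows < 2 or k < 2:
        raise ValueError("need at least 2 rows and 2 items")
    if any(len(row) != k for row in scores):
        raise ValueError("every row must have the same number of items")

    # Sample variance of each item (column) across rows.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of the per-row total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings, invented for this example only.
example_scores = [
    [5, 6, 5, 6],
    [3, 3, 4, 3],
    [6, 7, 6, 7],
    [4, 4, 5, 4],
]
alpha = cronbach_alpha(example_scores)
```

Values closer to 1 indicate that the items vary together across rows. The sketch uses only the Python standard library and does not cover the intraclass correlation, which depends on a choice of model that the abstract does not specify.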