Topic

Researchers Don't Let Friends Use Unreliable Task Ratings

Program

Usability studies often include tasks for participants to complete, which are then rated for perceived difficulty. However, these ratings are often subjective and not standardized across studies, even those conducted within the same team.

To learn how we might improve the reliability of task ratings, we welcomed a special guest speaker, Ann M. Aly, a social scientist and UX researcher. Ann shared a five-level rubric her team created to assess task difficulty and the factors contributing to task (non)completion. Her team tested interrater reliability with Cohen's kappa (a statistic that measures agreement between raters beyond what chance would produce) to ensure rigor and consistency in the ratings.
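To make the kappa step concrete, here is a minimal sketch of computing Cohen's kappa for two raters scoring the same tasks. The helper function and the example ratings are illustrative assumptions, not Ann's actual data or tooling.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the agreement expected by chance, given
    each rater's marginal rating distribution.
    """
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)

    # Observed agreement: fraction of items rated identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: dot product of the two marginal distributions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((counts_a[c] / n) * (counts_b[c] / n)
              for c in counts_a.keys() | counts_b.keys())

    if p_e == 1:  # Degenerate case: both raters always give one rating.
        return 1.0
    return (p_o - p_e) / (1 - p_e)

# Hypothetical difficulty ratings (1 = easiest, 5 = hardest) from two
# researchers scoring the same eight tasks with a shared rubric.
ratings_1 = [1, 2, 2, 3, 5, 4, 2, 1]
ratings_2 = [1, 2, 3, 3, 5, 4, 2, 2]
print(f"kappa = {cohens_kappa(ratings_1, ratings_2):.2f}")
```

A kappa of 1.0 means perfect agreement and 0 means agreement no better than chance; common rules of thumb treat values above roughly 0.6 as substantial agreement.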

The presentation included best practices and practical tips, like:

  • How to make task assessments less subjective
  • How to keep ratings rigorous and eligible for quantitative analysis, and how to build research maturity among team members
  • The benefits of a standardized rubric across studies, and the importance of developing a shared language within a cross-functional team
  • Using detailed task ratings to justify which domains or features need the most resources before release, like a mini heuristic evaluation (sketched after this list)
  • A how-to guide for conducting a quantitative assessment and sample rubric
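For the prioritization tip above, here is a toy sketch of how standardized rubric ratings could feed a resourcing argument; the domain names and scores are invented for illustration only.

```python
from statistics import mean

# Hypothetical rubric ratings (1-5) per task, grouped by the product
# domain each task exercises.
task_ratings = {
    "account-setup": [4, 5, 4],
    "claims-search": [2, 2, 3],
    "notifications": [3, 4, 4],
}

# Rank domains by mean difficulty so the hardest areas surface first,
# a lightweight way to argue where pre-release resources should go.
for domain, scores in sorted(task_ratings.items(),
                             key=lambda kv: mean(kv[1]), reverse=True):
    print(f"{domain:15s} mean difficulty = {mean(scores):.1f}")
```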

Ann is a social scientist and mixed-methods UX researcher at Agile 6 Applications. She is currently teamed with Fearless Solutions to support Human-Centered Design at CMS' Office of Information Technology. Ann firmly believes in keeping the social in social science by centering the people her research serves and making her insights accessible to non-specialists. Before joining Agile 6, Ann conducted research and consulted on higher education, bilingual communities, skill acquisition, and culturally relevant healthcare applications. When not lost in a spreadsheet or transcript, Ann enjoys caffeinated beverages, remote beaches, and entry-level woodworking.

Resources



