This article has been peer reviewed. It is the author's final published version in Frontiers in Psychology, Volume 9 (August 2018), Article number 1343, published by Frontiers Media.

The published version is available from Frontiers Media.

Copyright © Hass et al.


A new system for subjective rating of responses to divergent thinking tasks was tested using raters recruited from Amazon Mechanical Turk. The rationale for the study was to determine whether such raters could provide reliable (i.e., generalizable) ratings from the perspective of generalizability theory. To promote reliability across the Alternative Uses and Consequences task prompts often used by researchers as measures of divergent thinking, two parallel scales were developed to facilitate the feasibility and validity of ratings performed by laypeople. Generalizability and dependability studies were conducted separately for two scoring systems: the average-rating system and the snapshot system. Results showed that it is difficult to achieve adequate reliability using the snapshot system, whereas good reliability can be achieved on both task families using the average-rating system with a specific number of items and raters. Additionally, the construct validity of the average-rating system is generally good, with weaker validity for certain Consequences items. Recommendations for researchers wishing to adopt the new scales are discussed, along with broader issues concerning the generalizability of subjective creativity ratings. © 2018 Hass, Rivera and Silvia.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.