The Influence of Likert Scale Format on Response Result, Validity, and Reliability of Scale -Using Scales Measuring Economic Shopping Orientation-

  • Kim, Sae-Hee (Div. of Fashion & Beauty, Busan Kyungsang College)
  • Received : 2010.03.23
  • Accepted : 2010.05.13
  • Published : 2010.06.06

Abstract

This study examines, from a methodological point of view, how Likert scale format (the number of response categories and the inclusion of a mid-point) influences response results, validity, and reliability, using instruments that measure a fashion marketing-related subject. In a self-administered questionnaire conducted from February 8 to February 12, 2010, 201 respondents rated their economic clothing shopping orientation on three scale formats that differed only in the number of response categories (ranging from 5 to 7). The analysis used descriptive statistics, Spearman's rank-order correlation, t-tests, exploratory factor analysis, confirmatory factor analysis, Pearson's correlation, and Cronbach's alpha. The results are as follows. First, all three scale formats were generally suitable for use in terms of validity and reliability. Second, response results varied with the number of categories and the inclusion of a mid-point, although the differences were statistically insignificant in all but a few cases. Third, construct validity was more secure in scales with fewer categories, whereas convergent and discriminant validity were generally good in all scale formats. Fourth, reliability coefficients were higher in scales with more categories. Fifth, the number of categories mattered more to instrument design than the inclusion of a mid-point. Implications for appropriate scale design are suggested.
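For readers unfamiliar with the reliability coefficient named above, the following is a minimal sketch of how Cronbach's alpha is commonly computed from item-level responses. It is an illustration only, not the study's analysis procedure: the function name `cronbach_alpha` and the small response matrix are hypothetical, and the data are invented for demonstration rather than taken from the 201 respondents.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: list of lists, one inner list of k item scores per respondent.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of each respondent's summed score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point responses from five respondents on three items.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Running the same function on responses collected under each scale format would give one coefficient per format, which is the kind of comparison the fourth finding summarizes.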
