Learning-associated Reward and Penalty in Feedback Learning: an fMRI activation study

A Study of Brain Activation Related to Reward and Penalty as Learning Feedback

  • Kim, Jinhee (Department of Psychology, Kangwon National University)
  • Kan, Eunjoo (Department of Psychology, Kangwon National University)
  • Received : 2017.02.25
  • Accepted : 2017.03.13
  • Published : 2017.03.31

Abstract

Rewards or penalties become informative only when they are contingent on an immediately preceding response. Our goal was to determine whether the brain responds differently to motivational events depending on whether they provide feedback whose contingencies are effective for learning. Event-related fMRI data were obtained from 22 volunteers performing a visuomotor categorical task. In learning-condition trials, participants learned by trial and error to make left or right responses to letter cues (16 consonants), and monetary rewards (+500) or penalties (-500) were given as feedback (learning feedback). In random-condition trials, cues (4 vowels) appeared to the right or left of the display center, and participants were instructed to respond with the corresponding hand; rewards or penalties (random feedback), however, were given randomly (50/50%) regardless of whether the response was correct. Feedback-associated BOLD responses were analyzed with an ANOVA [trial type (learning vs. random) x feedback type (reward vs. penalty)] using SPM8 (voxel-wise FWE p < .001). The right caudate nucleus and right cerebellum showed activation, whereas the left parahippocampus and other regions of the default mode network showed deactivation, and both effects were greater for learning trials than for random trials. Activations associated with reward feedback did not differ between the two trial types in any brain region. For penalty, both learning-penalty and random-penalty enhanced activity in the left insular cortex, but not the right. The left insula, however, as well as the left dorsolateral prefrontal cortex and the dorsomedial prefrontal cortex/dorsal anterior cingulate cortex, showed much greater responses to learning-penalty than to random-penalty. These findings suggest that learning-penalty, unlike rewards or random-penalty, plays a critical role in learning, probably because it not only evokes aversive emotional responses but also engages error-detection processing, either of which might lead to changes in planning or strategy.
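The difference between the two feedback types can be made concrete with a short Python sketch. It is only an illustration of the contingency rule stated in the abstract, not the authors' stimulus-presentation code; the function name `feedback`, the string labels, and the fixed seed are assumptions introduced here.

```python
import random

REWARD, PENALTY = 500, -500  # feedback magnitudes stated in the abstract


def feedback(trial_type, correct, rng):
    """Return the monetary feedback for one trial (hypothetical helper).

    trial_type: "learning" or "random"
    correct:    whether the response on this trial was correct
    rng:        random.Random instance, used only on random trials
    """
    if trial_type == "learning":
        # Contingent feedback: the outcome is determined by the response.
        return REWARD if correct else PENALTY
    if trial_type == "random":
        # Non-contingent feedback: 50/50 regardless of correctness.
        return REWARD if rng.random() < 0.5 else PENALTY
    raise ValueError(f"unknown trial type: {trial_type!r}")


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Learning trials: a correct response is rewarded, an error is penalized.
learning_outcomes = [feedback("learning", c, rng) for c in (True, False)]

# Random trials: even consistently correct responses are rewarded only
# about half of the time, so the feedback carries no information.
random_outcomes = [feedback("random", True, rng) for _ in range(1000)]
reward_share = sum(o == REWARD for o in random_outcomes) / len(random_outcomes)
```

In this sketch only the learning branch uses `correct`, which is what makes learning feedback informative about the preceding response while random feedback is not.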

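The aim of this study was to identify which brain regions are involved in the processing that monetary rewards and penalties engage only when they serve as learning feedback. To do so, we compared monetary gains and losses given as feedback in a learning situation (learning feedback) with pseudo-feedback presented by chance in a non-learning situation (random feedback). Healthy adults (n = 22) underwent fMRI scanning with an event-related design comprising two trial types: trials in which feedback depended on the accuracy of a categorical button response (left/right) to a cue stimulus (learning trials), and trials in which feedback was presented regardless of the location-judgment response to the cue (random trials). To examine whether brain responses to motivational events such as reward and penalty differed between the two trial types, a two-way repeated-measures ANOVA was conducted with trial type (learning vs. random) and feedback type (reward vs. penalty) as factors (voxel-wise FWE p < .001). Significant interaction effects were observed in regions including the left dorsolateral prefrontal cortex, the left anterior insula, and the dorsomedial prefrontal cortex, all of which showed greater activation for learning-penalty feedback than for learning-reward or random-penalty feedback. These results suggest that three kinds of processing may be carried out mainly by the cortical network described above: executive processing for changing or re-evaluating the current strategy in response to penalty feedback in a learning situation, error processing for inappropriate or incorrect actions, and negative emotional processing of the experience of failure. Penalty feedback in learning, unlike reward, may therefore involve these additional information-processing steps.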
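For a single region, the interaction term of a 2 x 2 repeated-measures design such as the one described above can be tested as a difference of differences. The Python sketch below is a minimal illustration under stated assumptions: it is not the SPM8 voxel-wise analysis reported in the abstract, the function name `interaction_test` is hypothetical, and the response values are synthetic numbers generated only to exercise the code.

```python
import math
import random
import statistics


def interaction_test(lr, lp, rr, rp):
    """2 x 2 within-subject interaction via a difference of differences.

    Each argument holds one value per participant:
    lr / lp = learning-reward / learning-penalty,
    rr / rp = random-reward / random-penalty.
    Returns (mean contrast, t statistic, F statistic, degrees of freedom).
    """
    # Per-participant contrast: (penalty - reward) on learning trials
    # minus (penalty - reward) on random trials.
    contrast = [(p1 - r1) - (p2 - r2)
                for r1, p1, r2, p2 in zip(lr, lp, rr, rp)]
    n = len(contrast)
    mean = statistics.fmean(contrast)
    se = statistics.stdev(contrast) / math.sqrt(n)
    t = mean / se
    # With one numerator degree of freedom, F(1, n - 1) equals t squared.
    return mean, t, t * t, n - 1


# Synthetic responses for 22 participants (assumed values, not study data):
# the penalty effect is made larger on learning trials than on random trials.
rng = random.Random(1)
n = 22
lr = [rng.gauss(0.2, 0.3) for _ in range(n)]
lp = [rng.gauss(1.0, 0.3) for _ in range(n)]
rr = [rng.gauss(0.2, 0.3) for _ in range(n)]
rp = [rng.gauss(0.4, 0.3) for _ in range(n)]

mean, t, f, df = interaction_test(lr, lp, rr, rp)
print(f"mean contrast = {mean:.3f}, t({df}) = {t:.2f}, F(1, {df}) = {f:.2f}")
```

A positive mean contrast here corresponds to the pattern reported above, a larger penalty-minus-reward difference on learning trials than on random trials. A whole-brain analysis would additionally require correction for multiple comparisons, which this sketch does not attempt.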
