Incomplete Cholesky Decomposition based Kernel Cross Modal Factor Analysis for Audiovisual Continuous Dimensional Emotion Recognition
Li Xia, Lu Guanming, Yan Jingjie, Li Haibo, Zhang Zhengyan, Xie Shipeng (College of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications); Sun Ning (Engineering Research Center of Wideband Wireless Communication Technology, Ministry of Education, Nanjing University of Posts and Telecommunications)
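The factorization named in the title makes kernel cross-modal factor analysis tractable by replacing the full kernel Gram matrix with a low-rank factor. As a rough illustration only (the function below, its name, and its tolerance parameter are our own sketch under standard assumptions about pivoted incomplete Cholesky, not the authors' implementation), the decomposition K ≈ GGᵀ can be computed as:

```python
import numpy as np

def incomplete_cholesky(K, tol=1e-6, max_rank=None):
    """Pivoted incomplete Cholesky of a PSD kernel matrix K (illustrative sketch).

    Returns G of shape (n, m) with K ~= G @ G.T, stopping once the trace of
    the residual drops below `tol` or the rank reaches `max_rank`.
    """
    n = K.shape[0]
    if max_rank is None:
        max_rank = n
    d = np.diag(K).astype(float)      # residual diagonal (copy)
    perm = np.arange(n)               # pivot order
    G = np.zeros((n, max_rank))
    m = 0
    while m < max_rank and d[perm[m:]].sum() > tol:
        # pivot: bring the largest residual diagonal to position m
        j = m + np.argmax(d[perm[m:]])
        perm[[m, j]] = perm[[j, m]]
        p = perm[m]
        G[p, m] = np.sqrt(d[p])
        rest = perm[m + 1:]
        # Schur-complement update of the remaining rows
        G[rest, m] = (K[rest, p] - G[rest, :m] @ G[p, :m]) / G[p, m]
        d[rest] -= G[rest, m] ** 2
        m += 1
    return G[:, :m]

# toy check on an RBF kernel matrix (synthetic data, not from the paper)
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
sq = ((X[:, None] - X[None]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)
G = incomplete_cholesky(K, tol=1e-8)
print(G.shape[1], float(np.abs(K - G @ G.T).max()))  # rank used, max error
```

The stopping rule bounds the trace of the PSD residual, which in turn bounds every entry of K − GGᵀ, so the approximation error is controlled directly by `tol`.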
[1] Z. Huang, T. Dang, N. Cummins, B. Stasak, P. Le, V. Sethu, and J. Epps, "An investigation of annotation delay compensation and output-associative fusion for multimodal continuous emotion prediction," in Proc. of 5th International Workshop on Audio/Visual Emotion Challenge, pp. 41-48, Oct. 2015.
[2] C. C. Chang and C. J. Lin, "LIBSVM: a library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 3, Apr. 2011.
[3] G. Trigeorgis, F. Ringeval, R. Brueckner, E. Marchi, M. A. Nicolaou, B. Schuller, and S. Zafeiriou, "Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5200-5203, Mar. 2016.
[4] Z. Zeng, M. Pantic, G. Roisman, and T. Huang, "A survey of affect recognition methods: audio, visual and spontaneous expressions," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 1, pp. 39-58, Jan. 2009.
[5] J. Yan, W. Zheng, M. Xin, and J. Yan, "Integrating facial expression and body gesture in videos for emotion recognition," IEICE Transactions on Information and Systems, vol. E97.D, no. 3, pp. 610-613, Mar. 2014.
[6] J. Yan, W. Zheng, Q. Xu, G. Lu, H. Li, and B. Wang, "Sparse kernel reduced-rank regression for bimodal emotion recognition from facial expression and speech," IEEE Transactions on Multimedia, vol. 18, no. 7, pp. 1319-1329, Jul. 2016.
[7] Y. Wang, L. Guan, and A. N. Venetsanopoulos, "Audiovisual emotion recognition via cross-modal association in kernel space," in Proc. of IEEE International Conference on Multimedia & Expo, pp. 1-6, Jul. 2011.
[8] Y. Wang, L. Guan, and A. N. Venetsanopoulos, "Kernel cross-modal factor analysis for information fusion with application to bimodal emotion recognition," IEEE Transactions on Multimedia, vol. 14, no. 3, pp. 597-607, Jun. 2012.
[9] D. Li, N. Dimitrova, M. Li, and I. K. Sethi, "Multimedia content processing through cross-modal association," in Proc. of 11th ACM International Conference on Multimedia, pp. 604-611, Nov. 2003.
[10] C. H. Wu, J. C. Lin, and W. L. Wei, "Survey on audiovisual emotion recognition: databases, features, and data fusion strategies," APSIPA Transactions on Signal and Information Processing, vol. 3, pp. 1-18, 2014.
[11] B. Schuller, M. Valstar, F. Eyben, R. Cowie, and M. Pantic, "AVEC 2012 - the continuous audio/visual emotion challenge," in Proc. of 14th ACM International Conference on Multimodal Interaction, pp. 449-456, Oct. 2012.
[12] C. Vinola and K. Vimaladevi, "A survey on human emotion recognition approaches, databases and applications," Electronic Letters on Computer Vision and Image Analysis, vol. 14, no. 2, pp. 24-44, 2015.
[13] L. Pang, S. Zhu, and C. W. Ngo, "Deep multimodal learning for affective analysis and retrieval," IEEE Transactions on Multimedia, vol. 17, no. 11, pp. 2008-2020, Nov. 2015.
[14] C. H. Wu, J. C. Lin, and W. L. Wei, "Two-level hierarchical alignment of semi-coupled HMM-based audiovisual emotion recognition with temporal course," IEEE Transactions on Multimedia, vol. 15, no. 8, pp. 1880-1895, Dec. 2013.
[15] M. Valstar, B. Schuller, K. Smith, F. Eyben, B. Jiang, S. Bilakhia, S. Schnieder, R. Cowie, and M. Pantic, "AVEC 2013 - the continuous audio/visual emotion and depression recognition challenge," in Proc. of 3rd ACM International Workshop on Audio/Visual Emotion Challenge, pp. 3-10, Oct. 2013.
[16] C. Soladie, H. Salam, N. Stoiber, and R. Seguier, "Continuous facial expression representation for multimodal emotion detection," International Journal of Advanced Computer Science, vol. 3, no. 5, pp. 202-216, May 2013.
[17] M. Valstar, B. Schuller, K. Smith, T. Almaev, F. Eyben, J. Krajewski, R. Cowie, and M. Pantic, "AVEC 2014 - 3D dimensional affect and depression recognition challenge," in Proc. of 4th International Workshop on Audio/Visual Emotion Challenge, pp. 3-10, Nov. 2014.
[18] F. Ringeval, B. Schuller, M. Valstar, S. Jaiswal, E. Marchi, D. Lalanne, R. Cowie, and M. Pantic, "AV+EC 2015 - the first affect recognition challenge bridging across audio, video, and physiological data," in Proc. of 5th International Workshop on Audio/Visual Emotion Challenge, pp. 3-8, Oct. 2015.
[19] M. Valstar, J. Gratch, B. Schuller, F. Ringeval, D. Lalanne, M. T. Torres, S. Scherer, G. Stratou, R. Cowie, and M. Pantic, "AVEC 2016 - depression, mood, and emotion recognition workshop and challenge," in Proc. of 6th International Workshop on Audio/Visual Emotion Challenge, pp. 3-10, Oct. 2016.
[20] F. Eyben, M. Wöllmer, M. F. Valstar, H. Gunes, B. Schuller, and M. Pantic, "String-based audiovisual fusion of behavioural events for the assessment of dimensional affect," in Proc. of IEEE International Conference on Automatic Face & Gesture Recognition, pp. 322-329, Mar. 2011.
[21] L. Chao, J. Tao, M. Yang, Y. Li, and Z. Wen, "Long short term memory recurrent neural network based multimodal dimensional emotion recognition," in Proc. of 5th International Workshop on Audio/Visual Emotion Challenge, pp. 65-72, Oct. 2015.
[22] S. Chen and Q. Jin, "Multi-modal dimensional emotion recognition using recurrent neural networks," in Proc. of 5th International Workshop on Audio/Visual Emotion Challenge, pp. 49-56, Oct. 2015.
[23] P. Cardinal, M. Dehak, A. Lameiras, J. Alam, and P. Boucher, "ETS system for AV+EC 2015 challenge," in Proc. of 5th International Workshop on Audio/Visual Emotion Challenge, pp. 17-23, Oct. 2015.
[24] A. Sayedelahl, R. Araujo, and M. S. Kamel, "Audio-visual feature-decision level fusion for spontaneous emotion estimation in speech conversation," in Proc. of IEEE International Conference on Multimedia and Expo Workshops, pp. 1-6, Oct. 2013.
[25] Y. F. A. Gaus, H. Meng, A. Jan, F. Zhang, and S. Turabzadeh, "Automatic affective dimension recognition from naturalistic facial expressions based on wavelet filtering and PLS regression," in Proc. of IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, pp. 1-6, Oct. 2015.
[26] M. Kachele, M. Schels, P. Thiam, and F. Schwenker, "Fusion mappings for multimodal affect recognition," in Proc. of IEEE Symposium Series on Computational Intelligence, pp. 307-313, Jan. 2015.
[27] L. Tian, J. D. Moore, and C. Lai, "Recognizing emotions in dialogues with acoustic and lexical features," in Proc. of IEEE International Conference on Affective Computing and Intelligent Interaction, pp. 737-742, Dec. 2015.
[28] J. Nicolle, V. Rapp, K. Bailly, L. Prevost, and M. Chetouani, "Robust continuous prediction of human emotions using multiscale dynamic cues," in Proc. of 14th ACM International Conference on Multimodal Interaction, pp. 501-508, Oct. 2012.
[29] C. Soladie, H. Salam, C. Pelachaud, N. Stoiber, and R. Seguier, "A multimodal fuzzy inference system using a continuous facial expression representation for emotion detection," in Proc. of 14th ACM International Conference on Multimodal Interaction, pp. 493-500, Oct. 2012.
[30] A. Metallinou, A. Katsamanis, Y. Wang, and S. Narayanan, "Tracking changes in continuous emotion state using body language and prosodic cues," in Proc. of IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2288-2291, Jul. 2011.
[31] D. R. Hardoon, S. Szedmak, and J. Shawe-Taylor, "Canonical correlation analysis: an overview with application to learning methods," Neural Computation, vol. 16, no. 12, pp. 2639-2664, Dec. 2004.
[32] Y. Song, L. P. Morency, and R. Davis, "Learning a sparse codebook of facial and body microexpressions for emotion recognition," in Proc. of 15th ACM International Conference on Multimodal Interaction, pp. 237-244, Dec. 2013.
[33] F. R. Bach and M. I. Jordan, "Kernel independent component analysis," Journal of Machine Learning Research, vol. 3, pp. 1-48, Jul. 2002.
[34] J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis, Cambridge University Press, New York, 2004.
[35] F. Ringeval, A. Sonderegger, J. Sauer, and D. Lalanne, "Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions," in Proc. of IEEE International Conference and Workshops on Automatic Face and Gesture Recognition, pp. 1-8, Jul. 2013.