http://dx.doi.org/10.3837/tiis.2020.01.005

Knowledge Transfer Using User-Generated Data within Real-Time Cloud Services  

Zhang, Jing (School of Computer Science and Engineering, Nanjing University of Science and Technology)
Pan, Jianhan (School of Computer Science and Technology, Jiangsu Normal University)
Cai, Zhicheng (School of Computer Science and Engineering, Nanjing University of Science and Technology)
Li, Min (School of Computer Science and Engineering, Nanjing University of Science and Technology)
Cui, Lin (Intelligent Information Processing Laboratory, Suzhou University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.14, no.1, 2020, pp. 77-92
Abstract
When automatic speech recognition (ASR) is provided as a cloud service, it is easy to collect voice and application-domain data from users. Harnessing these data facilitates the provision of more personalized services. In this paper, we demonstrate our transfer learning-based knowledge service, which is built with the user-generated data collected through our novel system that delivers a personalized ASR service. First, we discuss the motivation, challenges, and prospects of building such a knowledge-based, service-oriented system. Second, we present a Quadruple Transfer Learning (QTL) method that learns a classification model from a source domain and transfers it to a target domain. Third, we provide an architectural overview of our novel system, which collects voice data from mobile users, labels the data via crowdsourcing, uses the collected user-generated data to train different machine learning models, and delivers personalized real-time cloud services. Finally, we use the E-Book data collected from our system to train classification models and apply them in the smart TV domain. The experimental results show that our QTL method is effective in two classification tasks, which confirms that knowledge transfer provides a value-added service for upper-layer mobile applications in different domains.
Keywords
Cloud computing; distributed computing; personalized service; transfer learning; user behavior mining