http://dx.doi.org/10.3837/tiis.2022.06.002

Hot Keyword Extraction of Sci-tech Periodicals Based on the Improved BERT Model  

Liu, Bing (School of Economics and Management, Dalian University of Technology)
Lv, Zhijun (School of Economics and Management, Dalian University of Technology)
Zhu, Nan (School of Economics and Management, Dalian University of Technology)
Chang, Dongyu (School of Economics and Management, Dalian University of Technology)
Lu, Mengxin (School of Economics and Management, Dalian University of Technology)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.16, no.6, 2022, pp. 1800-1817
Abstract
With the development of the economy and the improvement of living standards, hot issues within a subject area have become a main research direction. Mining these hot issues currently faces problems such as large data volumes and complex algorithm structures. In response, this study proposes a method for extracting hot keywords from scientific journals based on an improved BERT model, which can also serve as a reference for researchers. The method improves the overall similarity measure of the ensemble and introduces compound keyword word density. It combines word segmentation, word sense set distance, and density clustering to construct an improved BERT (I-BERT) framework, and establishes a composite keyword heat analysis model based on that framework. The experimental data are the 14,420 articles published in 21 social science management periodicals collected by CNKI (China National Knowledge Infrastructure) in 2017-2019. The superiority of the proposed method is verified using word spacing, class spacing, and the extraction accuracy and recall of hot keywords. The experiments show that the proposed method extracts hot keywords with higher accuracy than other methods, which helps scientific journals capture hot topics in a discipline in a timely and accurate way and, ultimately, use information technology to track popular keywords.
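The abstract evaluates extraction by accuracy and recall of hot keywords. As a minimal sketch of how such scores are commonly computed, assuming "accuracy" is meant in the sense of precision over the extracted set, the snippet below compares an extracted keyword list against a reference list. The function name `precision_recall` and both keyword lists are hypothetical and are not taken from the paper.

```python
def precision_recall(extracted, reference):
    """Return (precision, recall) of extracted keywords against a reference set."""
    extracted_set = set(extracted)
    reference_set = set(reference)
    # Keywords that appear in both the extracted and the reference set.
    hits = len(extracted_set & reference_set)
    precision = hits / len(extracted_set) if extracted_set else 0.0
    recall = hits / len(reference_set) if reference_set else 0.0
    return precision, recall


# Hypothetical keyword lists, made up for illustration only.
extracted = ["big data", "supply chain", "innovation", "blockchain"]
reference = ["big data", "innovation", "digital economy"]

precision, recall = precision_recall(extracted, reference)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Using sets treats each keyword as either extracted or not and ignores ranking; the paper's own evaluation protocol may differ.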
Keywords
Bidirectional encoder; hot keyword; representations from transformers (BERT); sci-tech periodicals; similarity measurement