http://dx.doi.org/10.13088/jiis.2021.27.3.075

Analyzing the discriminative characteristic of cover letters using text mining focused on Air Force applicants  

Kwon, Hyeok (Department of Industrial Engineering, Yonsei University)
Kim, Wooju (Department of Industrial Engineering, Yonsei University)
Publication Information
Journal of Intelligence and Information Systems / v.27, no.3, 2021, pp. 75-94
Abstract
The low birth rate and the shortened military service period are raising concerns about the selection of excellent military officers. The Republic of Korea became a low-birth-rate society in 1984 and an aged society in 2018, and is expected to become a super-aged society in 2025. In addition, the military is shifting from a troop-oriented force to one centered on state-of-the-art weapons, and the military service period was reduced in 2018 to ease the burden of military service on young people and let them enter society earlier. Some observe that the application rate for military officers is falling because of the shrinking manpower pool and a preference for the shortened mandatory service over service as an officer. This calls for further consideration of policies for securing excellent military officers. Most related studies have used social science methodologies, but this study applies text mining, which is suited to the analysis of large document collections. This study extracts words with discriminative characteristics from the cover letters of Republic of Korea Air Force non-commissioned officer applicants and analyzes their pass and fail polarity. The analysis consists of three steps. First, applications are divided into general and technical fields, and the words in the cover letters are ordered by the difference in their frequency ratio between the two fields. The greater the difference in a word's proportion between the application fields, the 'more discriminative' the word is defined to be. On this basis, we extract the top 50 words representing discriminative characteristics of the general field and the top 50 words representing those of the technical field. Second, the appropriate number of topics for the overall set of cover letters is determined with LDA, using the perplexity score and the coherence score.
Based on the appropriate number of topics, we then use LDA to generate topics and their word probabilities, and estimate which topic each word of discriminative characteristic belongs to. Subsequently, the keyword indicators of the questions are used as candidate labels, and the indicator that best fits each topic's word distribution is set as the label for that topic. Third, using L-LDA with each cover letter labeled as pass or fail, we generate topics and probabilities for the pass and fail labels in each field. From the generated topics and probabilities, we extract only the words of discriminative characteristics that belong to labeled topics. Next, for each such word, we compute the difference between its probability under the pass label and its probability under the fail label. A positive value can be seen as pass polarity, and a negative value as fail polarity. This study is the first to reflect the characteristics of cover letters of Republic of Korea Air Force non-commissioned officer applicants rather than those in the private sector. Moreover, because the methodology applies text mining to multiple documents rather than survey or interview methods, it can reduce analysis time and increase reliability by covering the entire population. For this reason, the proposed methodology is also applicable to other forms of multiple documents in the field of military personnel. This study shows that L-LDA is more suitable than LDA for extracting discriminative characteristics from the cover letters of Republic of Korea Air Force non-commissioned officer applicants. Furthermore, this study proposes a methodology that combines LDA and L-LDA.
Therefore, by analyzing the results of Republic of Korea Air Force non-commissioned officer acquisition, we aim to provide information usable for acquisition and promotional policies and to propose a methodology usable for research on military manpower acquisition.
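The two arithmetic steps in the abstract, ranking words by the difference in frequency ratio between fields and scoring polarity as the pass probability minus the fail probability, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the frequency ratio is a word's count divided by the total token count of its field, that cover letters are already tokenized, and that per-label word probabilities come from an L-LDA model fitted elsewhere. The function names and the toy data are hypothetical.

```python
from collections import Counter


def frequency_ratios(docs):
    """Share of each word among all tokens in a group of tokenized documents."""
    counts = Counter(token for doc in docs for token in doc)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def discriminative_words(general_docs, technical_docs, top_n=50):
    """Rank words by (general ratio - technical ratio) and return the
    top_n most general-leaning and most technical-leaning words."""
    general = frequency_ratios(general_docs)
    technical = frequency_ratios(technical_docs)
    vocab = set(general) | set(technical)
    diff = {w: general.get(w, 0.0) - technical.get(w, 0.0) for w in vocab}
    ranked = sorted(diff.items(), key=lambda kv: kv[1], reverse=True)
    top_general = [w for w, d in ranked[:top_n] if d > 0]
    top_technical = [w for w, d in reversed(ranked[-top_n:]) if d < 0]
    return top_general, top_technical


def polarity(p_pass, p_fail):
    """Pass probability minus fail probability per word.
    Positive values lean toward pass, negative values toward fail."""
    words = set(p_pass) | set(p_fail)
    return {w: p_pass.get(w, 0.0) - p_fail.get(w, 0.0) for w in words}


# Invented toy data, for illustration only.
general_docs = [["service", "leadership", "team"], ["service", "team", "duty"]]
technical_docs = [["maintenance", "aircraft", "team"],
                  ["maintenance", "certificate", "duty"]]
top_general, top_technical = discriminative_words(
    general_docs, technical_docs, top_n=1)

# Hypothetical word probabilities under the pass and fail labels.
scores = polarity({"leadership": 0.05, "duty": 0.02},
                  {"leadership": 0.01, "duty": 0.04})
```

With the study's setting, `top_n=50` would give the 50 words per field described in the first step. The sign of each entry in `scores` corresponds to the pass or fail polarity described in the third step.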
Keywords
Air Force Non-Commissioned Officer; Cover Letter; Text Mining; LDA (Latent Dirichlet Allocation); L-LDA (Labeled Latent Dirichlet Allocation)