Analyzing the discriminative characteristic of cover letters using text mining focused on Air Force applicants (텍스트 마이닝을 이용한 공군 부사관 지원자 자기소개서의 차별적 특성 분석)
-
- Journal of Intelligence and Information Systems
- /
- v.27 no.3
- /
- pp.75-94
- /
- 2021
The low birth rate and shortened military service period are causing concerns about selecting excellent military officers. The Republic of Korea entered a low birth rate society in 1984 and an aged society in 2018 respectively, and is expected to be in a super-aged society in 2025. In addition, the troop-oriented military is changed as a state-of-the-art weapons-oriented military, and the reduction of the military service period was implemented in 2018 to ease the burden of military service for young people and play a role in the society early. Some observe that the application rate for military officers is falling due to a decrease of manpower resources and a preference for shortened mandatory military service over military officers. This requires further consideration of the policy of securing excellent military officers. Most of the related studies have used social scientists' methodologies, but this study applies the methodology of text mining suitable for large-scale documents analysis. This study extracts words of discriminative characteristics from the Republic of Korea Air Force Non-Commissioned Officer Applicant cover letters and analyzes the polarity of pass and fail. It consists of three steps in total. First, the application is divided into general and technical fields, and the words characterized in the cover letter are ordered according to the difference in the frequency ratio of each field. The greater the difference in the proportion of each application field, the field character is defined as 'more discriminative'. Based on this, we extract the top 50 words representing discriminative characteristics in general fields and the top 50 words representing discriminative characteristics in technology fields. Second, the number of appropriate topics in the overall cover letter is calculated through the LDA. It uses perplexity score and coherence score. Based on the appropriate number of topics, we then use LDA to generate topic and probability, and estimate which topic words of discriminative characteristic belong to. Subsequently, the keyword indicators of questions used to set the labeling candidate index, and the most appropriate index indicator is set as the label for the topic when considering the topic-specific word distribution. Third, using L-LDA, which sets the cover letter and label as pass and fail, we generate topics and probabilities for each field of pass and fail labels. Furthermore, we extract only words of discriminative characteristics that give labeled topics among generated topics and probabilities by pass and fail labels. Next, we extract the difference between the probability on the pass label and the probability on the fail label by word of the labeled discriminative characteristic. A positive figure can be seen as having the polarity of pass, and a negative figure can be seen as having the polarity of fail. This study is the first research to reflect the characteristics of cover letters of Republic of Korea Air Force non-commissioned officer applicants, not in the private sector. Moreover, these methodologies can apply text mining techniques for multiple documents, rather survey or interview methods, to reduce analysis time and increase reliability for the entire population. For this reason, the methodology proposed in the study is also applicable to other forms of multiple documents in the field of military personnel. This study shows that L-LDA is more suitable than LDA to extract discriminative characteristics of Republic of Korea Air Force Noncommissioned cover letters. Furthermore, this study proposes a methodology that uses a combination of LDA and L-LDA. Therefore, through the analysis of the results of the acquisition of non-commissioned Republic of Korea Air Force officers, we would like to provide information available for acquisition and promotional policies and propose a methodology available for research in the field of military manpower acquisition.
Recommender Systems have been huge influence users and business more and more. Recently the importance of E-commerce has been reached rapid growth greatly in world-wide COVID-19 pandemic. Recommender system is the center of E-commerce lively. Top ranked E-commerce managers mentioned that recommender systems have a major influence on customer's purchase such as about 50% of Netflix, Amazon sales from their recommender systems. Most algorithms have been focused on improving accuracy of recommender system regardless of novelty, diversity, serendipity etc. Recommender systems with only high accuracy cannot satisfy business long-term profit because of generating sales polarization. In addition, customers do not experience enjoyment of shopping from only focusing accuracy recommender system because customer's preference is changed constantly. Therefore, recommender systems with various values need to be developed for user's high satisfaction. Reranking is the most useful methodology to realize diversity of recommender system. In this paper, diversity of recommender system is represented through constructing high similarity with users who have different preference using each user's purchased item's category algorithm. It is distinguished from past research approach which is changing the algorithm of recommender system without user's diversity preference level. We tried to discover user's diversity preference level and observed the results how the effect was different according to user's diversity preference level. In addition, graph-based recommender system was used to show diversity through user's network, not collaborative filtering. In this paper, Amazon Grocery and Gourmet Food data was used because the low-involvement product, such as habitual product, foods, low-priced goods etc., had high probability to show customer's diversity. First, a bipartite graph with users and items simultaneously is constructed to make graph-based recommender system. However, each users and items unipartite graph also need to be established to show diversity of recommender system. The weight of each unipartite graph has played crucial role changing Jaccard Distance of item's category. We can observe two important results from the user's unipartite network. First, the user's diversity preference level is observed from the network and second, dissimilar users can be discovered in the user's network. Through the research process, diversity of recommender system is presented highly with small accuracy loss and optimalization for higher accuracy is possible controlling diversity ratio. This paper has three important theoretical points. First, this research expands recommender system research for user's satisfaction with various values. Second, the graph-based recommender system is developed newly. Third, the evaluation indicator of diversity is made for diversity. In addition, recommender systems are useful for corporate profit practically and this paper has contribution on business closely. Above all, business long-term profit can be improved using recommender system with diversity and the recommender system can provide right service according to user's diversity level. Lastly, the corporate selling low-involvement products have great effect based on the results.
Recently, interest in social robots that can socially interact with humans is increasing. Thanks to the development of ICT technology, social robots have become easier to provide personalized services and emotional connection to individuals, and the role of social robots is drawing attention as a means to solve modern social problems and the resulting decline in the quality of individual lives. Along with the interest in social robots, the spread of social robots is also increasing significantly. Many companies are introducing robot products to the market to target various target markets, but so far there is no clear trend leading the market. Accordingly, there are more and more attempts to differentiate robots through the design of social robots. In particular, anthropomorphism has been studied importantly in social robot design, and many approaches have been attempted to anthropomorphize social robots to produce positive effects. However, there is a lack of research that systematically describes the mechanism by which anthropomorphism for social robots is formed. Most of the existing studies have focused on verifying the positive effects of the anthropomorphism of social robots on consumers. In addition, the formation of anthropomorphism of social robots may vary depending on the individual's motivation or temperament, but there are not many studies examining this. A vague understanding of anthropomorphism makes it difficult to derive design optimal points for shaping the anthropomorphism of social robots. The purpose of this study is to verify the mechanism by which the anthropomorphism of social robots is formed. This study confirmed the effect of the human-likeness of social robots(Within-subjects) and the construal level of consumers(Between-subjects) on the formation of anthropomorphism through an experimental study of 3×2 mixed design. Research hypotheses on the mechanism by which anthropomorphism is formed were presented, and the hypotheses were verified by analyzing data from a sample of 206 people. The first hypothesis in this study is that the higher the human-likeness of the robot, the higher the level of anthropomorphism for the robot. Hypothesis 1 was supported by a one-way repeated measures ANOVA and a post hoc test. The second hypothesis in this study is that depending on the construal level of consumers, the effect of human-likeness on the level of anthropomorphism will be different. First, this study predicts that the difference in the level of anthropomorphism as human-likeness increases will be greater under high construal condition than under low construal condition.Second, If the robot has no human-likeness, there will be no difference in the level of anthropomorphism according to the construal level. Thirdly,If the robot has low human-likeness, the low construal level condition will make the robot more anthropomorphic than the high construal level condition. Finally, If the robot has high human-likeness, the high construal levelcondition will make the robot more anthropomorphic than the low construal level condition. We performed two-way repeated measures ANOVA to test these hypotheses, and confirmed that the interaction effect of human-likeness and construal level was significant. Further analysis to specifically confirm interaction effect has also provided results in support of our hypotheses. The analysis shows that the human-likeness of the robot increases the level of anthropomorphism of social robots, and the effect of human-likeness on anthropomorphism varies depending on the construal level of consumers. This study has implications in that it explains the mechanism by which anthropomorphism is formed by considering the human-likeness, which is the design attribute of social robots, and the construal level of consumers, which is the way of thinking of individuals. We expect to use the findings of this study as the basis for design optimization for the formation of anthropomorphism in social robots.
As the information technology industry develops, various kinds of data are being created, and it is now essential to process them and use them in the industry. Analyzing and utilizing various digital data collected online and offline is a necessary process to provide an appropriate experience for customers in the industry. In order to create new businesses, products, and services, it is essential to use customer data collected in various ways to deeply understand potential customers' needs and analyze behavior patterns to capture hidden signals of desire. However, it is true that research using data analysis and UX methodology, which should be conducted in parallel for effective service development, is being conducted separately and that there is a lack of examples of use in the industry. In thiswork, we construct a single process by applying data analysis methods and UX methodologies. This study is important in that it is highly likely to be used because it applies methodologies that are actively used in practice. We conducted a survey on the topic to identify and cluster the associations between factors to establish customer classification and target customers. The research methods are as follows. First, we first conduct a factor, regression analysis to determine the association between factors in the happiness data survey. Groups are grouped according to the survey results and identify the relationship between 34 questions of psychological stability, family life, relational satisfaction, health, economic satisfaction, work satisfaction, daily life satisfaction, and residential environment satisfaction. Second, we classify clusters based on factors affecting happiness and extract the optimal number of clusters. Based on the results, we cross-analyzed the characteristics of each cluster. Third, forservice definition, analysis was conducted by correlating with keywords related to happiness. We leverage keyword analysis of the thumb trend to derive ideas based on the interest and associations of the keyword. We also collected approximately 11,000 news articles based on the top three keywords that are highly related to happiness, then derived issues between keywords through text mining analysis in SAS, and utilized them in defining services after ideas were conceived. Fourth, based on the characteristics identified through data analysis, we selected segmentation and targetingappropriate for service discovery. To this end, the characteristics of the factors were grouped and selected into four groups, and the profile was drawn up and the main target customers were selected. Fifth, based on the characteristics of the main target customers, interviewers were selected and the In-depthinterviews were conducted to discover the causes of happiness, causes of unhappiness, and needs for services. Sixth, we derive customer behavior patterns based on segment results and detailed interviews, and specify the objectives associated with the characteristics. Seventh, a typical persona using qualitative surveys and a persona using data were produced to analyze each characteristic and pros and cons by comparing the two personas. Existing market segmentation classifies customers based on purchasing factors, and UX methodology measures users' behavior variables to establish criteria and redefine users' classification. Utilizing these segment classification methods, applying the process of producinguser classification and persona in UX methodology will be able to utilize them as more accurate customer classification schemes. The significance of this study is summarized in two ways: First, the idea of using data to create a variety of services was linked to the UX methodology used to plan IT services by applying it in the hot topic era. Second, we further enhance user classification by applying segment analysis methods that are not currently used well in UX methodologies. To provide a consistent experience in creating a single service, from large to small, it is necessary to define customers with common goals. To this end, it is necessary to derive persona and persuade various stakeholders. Under these circumstances, designing a consistent experience from beginning to end, through fast and concrete user descriptions, would be a very effective way to produce a successful service.
Recommender system has become one of the most important technologies in e-commerce in these days. The ultimate reason to shop online, for many consumers, is to reduce the efforts for information search and purchase. Recommender system is a key technology to serve these needs. Many of the past studies about recommender systems have been devoted to developing and improving recommendation algorithms and collaborative filtering (CF) is known to be the most successful one. Despite its success, however, CF has several shortcomings such as cold-start, sparsity, gray sheep problems. In order to be able to generate recommendations, ordinary CF algorithms require evaluations or preference information directly from users. For new users who do not have any evaluations or preference information, therefore, CF cannot come up with recommendations (Cold-star problem). As the numbers of products and customers increase, the scale of the data increases exponentially and most of the data cells are empty. This sparse dataset makes computation for recommendation extremely hard (Sparsity problem). Since CF is based on the assumption that there are groups of users sharing common preferences or tastes, CF becomes inaccurate if there are many users with rare and unique tastes (Gray sheep problem). This study proposes a new algorithm that utilizes Social Network Analysis (SNA) techniques to resolve the gray sheep problem. We utilize 'degree centrality' in SNA to identify users with unique preferences (gray sheep). Degree centrality in SNA refers to the number of direct links to and from a node. In a network of users who are connected through common preferences or tastes, those with unique tastes have fewer links to other users (nodes) and they are isolated from other users. Therefore, gray sheep can be identified by calculating degree centrality of each node. We divide the dataset into two, gray sheep and others, based on the degree centrality of the users. Then, different similarity measures and recommendation methods are applied to these two datasets. More detail algorithm is as follows: Step 1: Convert the initial data which is a two-mode network (user to item) into an one-mode network (user to user). Step 2: Calculate degree centrality of each node and separate those nodes having degree centrality values lower than the pre-set threshold. The threshold value is determined by simulations such that the accuracy of CF for the remaining dataset is maximized. Step 3: Ordinary CF algorithm is applied to the remaining dataset. Step 4: Since the separated dataset consist of users with unique tastes, an ordinary CF algorithm cannot generate recommendations for them. A 'popular item' method is used to generate recommendations for these users. The F measures of the two datasets are weighted by the numbers of nodes and summed to be used as the final performance metric. In order to test performance improvement by this new algorithm, an empirical study was conducted using a publically available dataset - the MovieLens data by GroupLens research team. We used 100,000 evaluations by 943 users on 1,682 movies. The proposed algorithm was compared with an ordinary CF algorithm utilizing 'Best-N-neighbors' and 'Cosine' similarity method. The empirical results show that F measure was improved about 11% on average when the proposed algorithm was used