• Title/Summary/Keyword: 문서표현 (document representation)

Search Result 1,135, Processing Time 0.029 seconds

Query-based Answer Extraction using Korean Dependency Parsing (의존 구문 분석을 이용한 질의 기반 정답 추출)

  • Lee, Dokyoung;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.161-177
    • /
    • 2019
  • This paper studies how to improve answer extraction in a Question-Answering (QA) system by using the results of sentence dependency parsing. A QA system consists of query analysis, which analyzes the user's query, and answer extraction, which extracts appropriate answers from the document, and various studies have been conducted on both. To improve the performance of answer extraction, the grammatical information of sentences must be reflected accurately. Because Korean has free word order and frequently omits sentence components, dependency parsing is a good way to analyze Korean syntax. In this study, we therefore improved the performance of answer extraction by adding features generated by dependency parsing to the inputs of the answer extraction model (Bidirectional LSTM-CRF). We compared the performance of the model when given only basic word features generated without dependency parsing against its performance when the Eojeol tag feature and the dependency graph embedding feature were added. Since dependency parsing takes the Eojeol, a sentence component delimited by spaces, as its basic unit, the tag information of each Eojeol is obtained as a by-product of parsing; the Eojeol tag feature is this tag information. Generating the dependency graph embedding consists of two steps: generating the dependency graph from the dependency parsing result and learning the embedding of the graph. 
From the dependency parsing result, a graph is generated by mapping each Eojeol to a node, each dependency between Eojeols to an edge, and each Eojeol tag to a node label. Depending on whether the direction of the dependency relation is considered, either a directed or an undirected graph is generated. To obtain the embedding of the graph, we used Graph2Vec, a method that learns the embedding of a graph from the subgraphs that constitute it. The maximum path length between nodes can be specified when finding the subgraphs of a graph. If the maximum path length is 1, the graph embedding reflects only direct dependencies between Eojeols; as the maximum path length grows, the embedding also includes indirect dependencies. In the experiment, the maximum path length was varied from 1 to 3, both with and without dependency direction, and the performance of answer extraction was measured. The results show that both the Eojeol tag feature and the dependency graph embedding feature improve the performance of answer extraction. In particular, the highest answer extraction performance was obtained when the direction of the dependency relation was considered and the maximum path length in Graph2Vec's subgraph extraction was set to 1. From these experiments, we concluded that it is better to take the direction of dependency into account and to consider only direct connections rather than indirect dependencies between words. The significance of this study is as follows. First, we improved the performance of answer extraction by adding features based on dependency parsing results, taking into account the characteristics of Korean: free word order and frequent omission of sentence components. 
Second, we generated features from the dependency parsing result with a learning-based graph embedding method, without manually defining patterns of dependency between Eojeols. Future research directions are as follows. In this study, the features generated from the dependency parsing result were applied only to the answer extraction model as a way to capture meaning. In the future, if their performance is confirmed by applying them to various natural language processing models such as sentiment analysis or named entity recognition, the validity of the features can be verified more accurately.
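The graph construction and subgraph extraction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each Eojeol is given as a hypothetical `(index, tag, head_index)` triple, that directed edges run from head to dependent, and that the "maximum path length" corresponds to the number of Weisfeiler-Lehman relabeling rounds that Graph2Vec uses to enumerate rooted subgraphs. The tag names are illustrative only, and the embedding-learning step itself is not shown.

```python
from collections import defaultdict


def build_dependency_graph(eojeols, directed=True):
    """Map Eojeols to nodes, dependencies to edges, and tags to node labels.

    eojeols: list of (index, tag, head_index); head_index 0 marks the root.
    Assumption: directed edges point from head to dependent.
    """
    labels = {idx: tag for idx, tag, _ in eojeols}
    neighbors = defaultdict(set)
    for idx, _, head in eojeols:
        if head == 0:
            continue  # the root has no incoming dependency edge
        neighbors[head].add(idx)
        if not directed:
            neighbors[idx].add(head)  # undirected: add the reverse edge too
    return labels, neighbors


def wl_subgraph_labels(labels, neighbors, max_depth):
    """Collect rooted-subgraph labels up to max_depth relabeling rounds.

    Depth 0 yields the raw node labels; each further round appends the
    sorted labels of a node's neighbors, so depth 1 covers only direct
    dependencies and larger depths add indirect ones.
    """
    current = dict(labels)
    features = list(current.values())
    for _ in range(max_depth):
        updated = {}
        for node in labels:
            around = sorted(current[n] for n in neighbors.get(node, ()))
            updated[node] = current[node] + "(" + ",".join(around) + ")"
        current = updated
        features.extend(current.values())
    return features


if __name__ == "__main__":
    # Hypothetical three-Eojeol sentence: subject and object both depend on the verb.
    sentence = [(1, "NP_SBJ", 3), (2, "NP_OBJ", 3), (3, "VP", 0)]
    for directed in (True, False):
        labels, neighbors = build_dependency_graph(sentence, directed=directed)
        print(directed, wl_subgraph_labels(labels, neighbors, max_depth=1))
```

In Graph2Vec, the collected subgraph labels play the role that words play in a document embedding model: each graph is a "document" and its subgraph labels are the "vocabulary" from which the graph vector is learned.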

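A Study on the Effects of Domestic Violence Experiences on the Masculinities of Male Adolescent Offenders (가정 폭력 경험이 남자 범죄 청소년의 남성성에 미치는 영향에 관한 연구)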

  • Kim, Kyung-Ho
    • Korean Academy of Social Welfare: Conference Proceedings (한국사회복지학회:학술대회논문집)
    • /
    • 2003.05a
    • /
    • pp.282-309
    • /
    • 2003
  • This exploratory qualitative study investigates the effects of experiencing domestic violence on male adolescent offenders' masculinities. Empirical and theoretical literature suggests that negative male role models in violent families result in male adolescents' experiencing conflict in constructing gender identities, especially masculinities. Moreover, criminologists argue that masculinities are often connected with crimes as a way to prove masculine competence. This study compares male adolescent offenders who have experienced domestic violence with those who have not and explores how domestic violence experiences influence the construction of gender identities among male adolescent offenders. The study used a secondary qualitative data analysis method. The data consisted of ethnographic in-depth interview transcripts, observational field notes, and formal facility records collected at a juvenile correctional facility in Minnesota. The data were analyzed with a "constant comparative method" that sought to understand differences and similarities in the expressed gender narratives and identity patterns between the two groups of offenders. This process also examined differences within each group. The qualitative data analysis revealed that domestic violence experiences in childhood may be related to the construction of gender identities during adolescence. The findings showed that male adolescent offenders who had experienced domestic violence tended to attach themselves to oppressed mothers more readily than those who had not. Next, their attachment to mothers was related to the construction of more relational gender identities, although most participants, regardless of domestic violence experiences, had much in common regarding gender expression. 
Finally, despite these relational gender identities, male adolescent offenders who had experienced domestic violence tended to depend upon violence and crimes to show masculine competence, as did male adolescent offenders who had not experienced domestic violence. The study findings suggest a need for research to understand the construction of gender identities in the context of particular experiences and the importance of building theories that advance a comprehensive understanding of the construction of masculinities and youth crime. This study also discusses the development of social work programs that protect young men from adherence to exaggerated masculinity, which is often associated with crimes.

Consumers Perceptions on Monosodium L-glutamate in Social Media (소셜미디어 분석을 통한 소비자들의 L-글루타민산나트륨에 대한 인식 조사)

  • Lee, Sooyeon;Lee, Wonsung;Moon, Il-Chul;Kwon, Hoonjeong
    • Journal of Food Hygiene and Safety
    • /
    • v.31 no.3
    • /
    • pp.153-166
    • /
    • 2016
  • The purpose of this study was to investigate consumers' perceptions of monosodium L-glutamate (MSG) in social media. Data were collected from Naver blogs and Naver web communities (Naver is a representative Korean portal website) and from media reports, including the comment sections of the Yonhap News website (Yonhap is Korea's largest news agency). In Naver blogs and web communities, MSG was mentioned primarily in reviews of restaurants that use MSG and in posts about 'no MSG added' products, its safety, and methods of reducing MSG in food. When current-affairs TV shows, newspapers, or TV news reported on the uses and side effects of MSG, search volume for MSG increased on both PC and mobile search engines, especially after reports by current-affairs TV shows. There were more periods of increased search volume on mobile than on PC. The comments below news articles on the Yonhap News site were mainly about the safety of MSG, criticism of low-quality foods, abuse of MSG, and distrust of the government. Labels on 'no MSG added' products in the market emphasized "MSG-free" even though the Joint FAO/WHO Expert Committee on Food Additives (JECFA) has assigned MSG an acceptable daily intake (ADI) of 'not specified'. When consumers search for MSG or purchase food in the market, they might perceive 'no MSG added' products as better. Competent authorities, offices of education, and local governments provide guidelines based on a no-added-MSG principle, and these policies might affect consumers' perceptions. TV programs or news programs could be a powerful and effective channel for communicating with consumers about MSG, through mobile rather than PC. Therefore, media including TV should report on monosodium L-glutamate responsibly, with information grounded in science, so that consumers can get reliable information.

A Comparative Analysis of South Korean and the U.S. Home Economics Curricula and Achievement Standards (한국과 미국의 가정과 교육과정과 성취기준 비교 분석 연구)

  • Kwon, Yoojin;Kim, Eun Jeung;Lee, Yoon-Jung
    • Journal of Korean Home Economics Education Association
    • /
    • v.25 no.4
    • /
    • pp.29-46
    • /
    • 2013
  • The concepts of core competencies and achievement standards have been newly introduced in national curriculum documents since the 2009 Revised National Curriculum. They were introduced to develop a curriculum that reflects the unique characteristics of each subject and to make student evaluation more effective. The purpose of this study was to suggest a direction for future development of the national curriculum and achievement standards by comparing the national curricula and standards of South Korea and the U.S. In particular, this study focused on two aspects: 1) the hierarchical relationships and structural system of achievement standards in the curricula of the two countries, and 2) detailed differences between the two countries' achievement standards in a specific content area, 'family'. The results are as follows. In the Korean national curriculum, core competencies were included in the objective statement and standards were provided as a lower-level system, while the U.S. national standards were composed of a hierarchical system of comprehensive standards (higher level), content standards (middle level), and competencies (lower level). This may be attributable to a difference in the definition of competencies. The differences found in the detailed contents of the curricula were related to the terminology used in the curriculum documents of the two countries. For example, work and family balance was frequently mentioned in the Korean document, while the U.S. national curriculum described the multiple roles of individuals rather than using the term explicitly. Also, terms such as happiness and welfare were frequently mentioned in the Korean curriculum, while 'well-being' was used more frequently in the U.S. curriculum. These differences in the usage of terms reflect differences in the cultural values and perspectives of the two countries.

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.141-154
    • /
    • 2019
  • Internet technology and social media are growing rapidly. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish low-quality from high-quality content through text data about products, and it has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning them to predefined categories such as positive and negative. In terms of accuracy, it has been studied in various directions, from simple rule-based approaches to dictionary-based approaches using predefined labels. Sentiment analysis is in fact one of the most active research areas in natural language processing and is widely studied in text mining. Real online reviews are openly available, so they are not only easy to collect but also affect business. In marketing, real-world information from customers is gathered from websites rather than surveys. Whether the posts on a website are positive or negative is reflected in sales as the customer response, and attempts are made to identify this information. However, many reviews on a website are not always good and are difficult to identify. Earlier studies in this research area used review data from the Amazon.com shopping mall, whereas recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, and so on. However, a lack of accuracy is recognized because sentiment calculations change according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity of sentiment into positive and negative categories and to increase the prediction accuracy of polarity analysis using the pretrained IMDB review data set. 
First, popular machine learning algorithms for text classification, namely NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and Gradient Boost, are adopted as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data. Representative algorithms are CNN (convolutional neural networks), RNN (recurrent neural networks), and LSTM (long short-term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but it does not consider the sequential attributes of data. RNN handles ordered data well because it takes the temporal information of the data into account, but it suffers from the long-term dependency problem. LSTM is used to solve this problem. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to the classical machine learning algorithms, CNN, LSTM, and the integrated model were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination, and we tried to understand how and how well the models work for sentiment analysis. This study proposes an integrated CNN and LSTM algorithm to extract positive and negative features from text. The reasons for combining the two algorithms are as follows. CNN can extract features for classification automatically by applying convolution layers and massively parallel processing, whereas LSTM is not capable of highly parallel processing. Like faucets, the LSTM's input, output, and forget gates can be opened and closed at the desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. 
Furthermore, when LSTM is used at CNN's pooling layer, the model has an end-to-end structure, so that spatial and temporal features can be modeled simultaneously. The combined CNN-LSTM model achieved 90.33% accuracy. It is slower than CNN but faster than LSTM, and it was more accurate than the other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can compensate for the weaknesses of each model, and the end-to-end structure of LSTM has the advantage of improving learning layer by layer. For these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
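The pipeline this abstract describes (convolution, pooling, then an LSTM with input, forget, and output gates) can be sketched as a toy forward pass. This is a minimal illustration under strong assumptions, not the authors' model. Each token is reduced to a single scalar, the hidden state is a scalar, there is no training, and every weight value, function name, and parameter layout below is hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(seq, kernel, bias=0.0):
    """Slide the kernel over the sequence and apply ReLU to each window."""
    k = len(kernel)
    return [
        max(0.0, sum(w * x for w, x in zip(kernel, seq[i:i + k])) + bias)
        for i in range(len(seq) - k + 1)
    ]


def max_pool(seq, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def lstm_step(x, h_prev, c_prev, params):
    """One scalar LSTM step; params maps gate name -> (w_x, w_h, bias)."""
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))    # how much new information to write
    f = sigmoid(pre("forget"))   # how much of the old cell state to keep
    o = sigmoid(pre("output"))   # how much of the cell state to expose
    g = math.tanh(pre("cell"))   # candidate cell content
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


def cnn_lstm_score(seq, kernel, params, w_out, b_out, pool_size=2):
    """Convolution -> pooling -> LSTM over pooled features -> sigmoid score."""
    features = max_pool(conv1d_relu(seq, kernel), pool_size)
    h, c = 0.0, 0.0
    for x in features:
        h, c = lstm_step(x, h, c, params)
    return sigmoid(w_out * h + b_out)  # read as P(positive) in this toy setup


if __name__ == "__main__":
    # Hypothetical weights and a hypothetical six-token "review".
    params = {
        "input": (0.5, 0.1, 0.0),
        "forget": (0.3, 0.2, 1.0),
        "output": (0.4, 0.1, 0.0),
        "cell": (0.6, 0.2, 0.0),
    }
    review = [0.2, 0.9, -0.4, 0.7, 0.1, 0.8]
    print(cnn_lstm_score(review, kernel=[0.5, 1.0, 0.5], params=params,
                         w_out=2.0, b_out=-0.5))
```

The ordering reflects the design rationale given in the abstract: the convolution and pooling stages shorten the sequence with operations that parallelize well, and the sequential LSTM then runs over fewer steps, which is consistent with the reported speed falling between a plain CNN and a plain LSTM.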