Search | Korea Science

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

Ko, Eunjung;Kim, Namgyu
- Journal of Intelligence and Information Systems
- /
- v.24 no.2
- /
- pp.125-148
- /
- 2018
Recently, as the demand for big data analysis increases, cases of analyzing unstructured data and using the results are also increasing. Among the various types of unstructured data, text is used as a means of communicating information in almost all fields. In addition, many analysts are interested in the amount of data is very large and relatively easy to collect compared to other unstructured and structured data. Among the various text analysis applications, document classification which classifies documents into predetermined categories, topic modeling which extracts major topics from a large number of documents, sentimental analysis or opinion mining that identifies emotions or opinions contained in texts, and Text Summarization which summarize the main contents from one document or several documents have been actively studied. Especially, the text summarization technique is actively applied in the business through the news summary service, the privacy policy summary service, ect. In addition, much research has been done in academia in accordance with the extraction approach which provides the main elements of the document selectively and the abstraction approach which extracts the elements of the document and composes new sentences by combining them. However, the technique of evaluating the quality of automatically summarized documents has not made much progress compared to the technique of automatic text summarization. Most of existing studies dealing with the quality evaluation of summarization were carried out manual summarization of document, using them as reference documents, and measuring the similarity between the automatic summary and reference document. Specifically, automatic summarization is performed through various techniques from full text, and comparison with reference document, which is an ideal summary document, is performed for measuring the quality of automatic summarization. Reference documents are provided in two major ways, the most common way is manual summarization, in which a person creates an ideal summary by hand. Since this method requires human intervention in the process of preparing the summary, it takes a lot of time and cost to write the summary, and there is a limitation that the evaluation result may be different depending on the subject of the summarizer. Therefore, in order to overcome these limitations, attempts have been made to measure the quality of summary documents without human intervention. On the other hand, as a representative attempt to overcome these limitations, a method has been recently devised to reduce the size of the full text and to measure the similarity of the reduced full text and the automatic summary. In this method, the more frequent term in the full text appears in the summary, the better the quality of the summary. However, since summarization essentially means minimizing a lot of content while minimizing content omissions, it is unreasonable to say that a "good summary" based on only frequency always means a "good summary" in its essential meaning. In order to overcome the limitations of this previous study of summarization evaluation, this study proposes an automatic quality evaluation for text summarization method based on the essential meaning of summarization. Specifically, the concept of succinctness is defined as an element indicating how few duplicated contents among the sentences of the summary, and completeness is defined as an element that indicating how few of the contents are not included in the summary. In this paper, we propose a method for automatic quality evaluation of text summarization based on the concepts of succinctness and completeness. In order to evaluate the practical applicability of the proposed methodology, 29,671 sentences were extracted from TripAdvisor 's hotel reviews, summarized the reviews by each hotel and presented the results of the experiments conducted on evaluation of the quality of summaries in accordance to the proposed methodology. It also provides a way to integrate the completeness and succinctness in the trade-off relationship into the F-Score, and propose a method to perform the optimal summarization by changing the threshold of the sentence similarity.
https://doi.org/10.13088/jiis.2018.24.2.125 인용 PDF KSCI

A Study on the Structure of an Animation and the Generation of Signification (애니메이션 <겨울왕국>의 구조와 의미생성 연구)

Sung, Re-A;Kim, Hye-Sung
- Cartoon and Animation Studies
- /
- s.37
- /
- pp.197-219
- /
- 2014
, one of the Disney's animations, hit the 10 million audience mark for the first time in the history of animations released in Korea. not only raised the fever with its theme song, 'Let it go', as well as Elsa, Anna, and Olaf's character products but caused sensations in many ways. If so, we need to think about what kind of meaning did create in Korea to be so sensational. This study examines the value that Frozen intended to deliver and the meaning it generated by using Greimas actant model and semiotic square. From the actant model analysis on Anna and Elsa from , it was identified that Anna desired to recover her relationship with Elsa and to take summer back in Arendelle. Her desires can be interpreted as her love toward Elsa and people in Arendelle. Meanwhile, Elsa always desired freedom although she confined herself because of her ability to freeze. In other words, Elsa desired to free herself from her freezing ability by finding out how to control her ability. Such desires of Anna and Elsa were achieved by their actions of true love, and the solution of all the conflicts in was an action of true love. From the semiotic square analysis on the meaning of , it was found out that created past-oriented value with which characters tried to change their abnormal lives of the present into their normal lives of the past. The characters tried to change their present lives where freezing winter comes in the middle of summer, communication between the sisters is cut off, and people try to take advantage of the abnormal state deliberately, into the past when the sisters had a good relationship and the natural season of summer in Arendelle. The past-oriented value that tried to tell us is similar to our reality. In our reality with a lot of unbelievable news and unstable circumstances, we desire to go back to the past when we were filled with affection and hope even though our lives were tough and difficult. This sentiment must have contributed to the huge success of in Korea.
https://doi.org/10.7230/KOSCAS.2014.37.197 인용 PDF KSCI

The Study on Change in Sex-Related Knowledge and Attitude through Sex Education : focusing on the 1st grade students in girls' junior high schools (성교육 실시에 따른 성지식, 성태도 변화 연구 -1학년 여중생을 대상으로-)

계수연;문인옥
- Korean Journal of Health Education and Promotion
- /
- v.16 no.2
- /
- pp.137-155
- /
- 1999
The purpose of this study was to assess the effect of sex education on knowledge and attitude related to sex. The subjects were taken from by 199 students in 3 classes from 1st grade in H girl's junior high school as the study group, and 2 classes from 1st grade in S girl's junior high school as control group. During the survey period(September 21, 1998 to September 30, 1998), 6 times in terms of one-hour class for sex education were taught to the study group. A pre-test was executed on September 19, 1998 and the post-test on September 30. The findings were as follows. 1. According to the research, 20.1％ of the subjects have experienced sex education from parents and 89.9％ from teacher. They have mostly obtained the sex-related information from teachers(59.8％), following movie, radio, TV, or video tape(40.7％), friends(35.2％), reading materials such as books, cartoons, news papers and magazines(31.7％), parents(15.6％), siblings(7.0％), PC(1.5％) and telephone service(1.5％). 2. 27.1％ of the subjects reported that they had sex-related worry concerning from friendship with the opposite sex, following physiological phenomenon(31.5％), sex violence(11.1％), physical characteristics(7.4％), VD and contraception(5.6％), sexual impulse(5.6％), pregnancy and delivery(5.5％), and sexual behaviour(3.7％). The research showed that the adolescents usually solved their problems through the consultation with theifriends(44.4％). However, 16.7％ of the subjects were turned out not to request any solution. The other minor routes to settle their problems were written materials such as books, magazines(13.0％), parents(13.0％), movie, radio, TV, or video tape(5.5％), acquainted female elders(3.7％) and teachers(3.7％). 3. The most interesting part regarding sex was the friendship with the opposite sex(61.8％), following adolescent's emotion(55.8％), physiological differences between two genders(52.8％), AIDS(48.7％), VD(46.7％), pregnancy(45.2％), contraception(45.2％), abortion(41.7％), intercourse(41.7％), masturbation(41.2％), sex violence(41.2％) and genital structure and secondary sexual characteristics(28.6％). 4. In regard to characteristics of the subjects influencing sex-related knowledge, the higher educational career of mother, living with at least either parent and the experience of sex education by teachers were statistically significant factors(p〈0.05). 5. In regard to characteristics of the subjects influencing attitudes toward sex, the experience of sex education by parents or teachers was a statistically significant factor(p〈0.05). 6. The analysis of knowledge score comparing results before and after sex education showed that control group's score decreased from 12.5 to 12.44 while the study group's score increased from 12.33 to 21.31, which was statistically significant(p〈0.001). 7. The analysis of the attitude scores before and after sex education showed that the control group's score slightly increased from 55.57 to 56.36, while the study group's score increased from 54.79 to 61.95, which was statistically significant(p〈0.001). 8. The level of sex-related concerns of the study group after sex education marked both the increase in some items and the decrease in others. 9. Most instructive session among the sex education was the third “to be a good friend to the opposite sex”(27.0％).
PDF

WHICH INFORMATION MOVES PRICES: EVIDENCE FROM DAYS WITH DIVIDEND AND EARNINGS ANNOUNCEMENTS AND INSIDER TRADING

Kim, Chan-Wung;Lee, Jae-Ha
- The Korean Journal of Financial Studies
- /
- v.3 no.1
- /
- pp.233-265
- /
- 1996
We examine the impact of public and private information on price movements using the thirty DJIA stocks and twenty-one NASDAQ stocks. We find that the standard deviation of daily returns on information days (dividend announcement, earnings announcement, insider purchase, or insider sale) is much higher than on no-information days. Both public information matters at the NYSE, probably due to masked identification of insiders. Earnings announcement has the greatest impact for both DJIA and NASDAQ stocks, and there is some evidence of positive impact of insider asle on return volatility of NASDAQ stocks. There has been considerable debate, e.g., French and Roll (1986), over whether market volatility is due to public information or private information-the latter gathered through costly search and only revealed through trading. Public information is composed of (1) marketwide public information such as regularly scheduled federal economic announcements (e.g., employment, GNP, leading indicators) and (2) company-specific public information such as dividend and earnings announcements. Policy makers and corporate insiders have a better access to marketwide private information (e.g., a new monetary policy decision made in the Federal Reserve Board meeting) and company-specific private information, respectively, compated to the general public. Ederington and Lee (1993) show that marketwide public information accounts for most of the observed volatility patterns in interest rate and foreign exchange futures markets. Company-specific public information is explored by Patell and Wolfson (1984) and Jennings and Starks (1985). They show that dividend and earnings announcements induce higher than normal volatility in equity prices. Kyle (1985), Admati and Pfleiderer (1988), Barclay, Litzenberger and Warner (1990), Foster and Viswanathan (1990), Back (1992), and Barclay and Warner (1993) show that the private information help by informed traders and revealed through trading influences market volatility. Cornell and Sirri (1992)' and Meulbroek (1992) investigate the actual insider trading activities in a tender offer case and the prosecuted illegal trading cased, respectively. This paper examines the aggregate and individual impact of marketwide information, company-specific public information, and company-specific private information on equity prices. Specifically, we use the thirty common stocks in the Dow Jones Industrial Average (DJIA) and twenty one National Association of Securities Dealers Automated Quotations (NASDAQ) common stocks to examine how their prices react to information. Marketwide information (public and private) is estimated by the movement in the Standard and Poors (S & P) 500 Index price for the DJIA stocks and the movement in the NASDAQ Composite Index price for the NASDAQ stocks. Divedend and earnings announcements are used as a subset of company-specific public information. The trading activity of corporate insiders (major corporate officers, members of the board of directors, and owners of at least 10 percent of any equity class) with an access to private information can be cannot legally trade on private information. Therefore, most insider transactions are not necessarily based on private information. Nevertheless, we hypothesize that market participants observe how insiders trade in order to infer any information that they cannot possess because insiders tend to buy (sell) when they have good (bad) information about their company. For example, Damodaran and Liu (1993) show that insiders of real estate investment trusts buy (sell) after they receive favorable (unfavorable) appraisal news before the information in these appraisals is released to the public. Price discovery in a competitive multiple-dealership market (NASDAQ) would be different from that in a monopolistic specialist system (NYSE). Consequently, we hypothesize that NASDAQ stocks are affected more by private information (or more precisely, insider trading) than the DJIA stocks. In the next section, we describe our choices of the fifty-one stocks and the public and private information set. We also discuss institutional differences between the NYSE and the NASDAQ market. In Section II, we examine the implications of public and private information for the volatility of daily returns of each stock. In Section III, we turn to the question of the relative importance of individual elements of our information set. Further analysis of the five DJIA stocks and the four NASDAQ stocks that are most sensitive to earnings announcements is given in Section IV, and our results are summarized in Section V.
PDF

Text Mining-Based Emerging Trend Analysis for the Aviation Industry (항공산업 미래유망분야 선정을 위한 텍스트 마이닝 기반의 트렌드 분석)

Kim, Hyun-Jung;Jo, Nam-Ok;Shin, Kyung-Shik
- Journal of Intelligence and Information Systems
- /
- v.21 no.1
- /
- pp.65-82
- /
- 2015
Recently, there has been a surge of interest in finding core issues and analyzing emerging trends for the future. This represents efforts to devise national strategies and policies based on the selection of promising areas that can create economic and social added value. The existing studies, including those dedicated to the discovery of future promising fields, have mostly been dependent on qualitative research methods such as literature review and expert judgement. Deriving results from large amounts of information under this approach is both costly and time consuming. Efforts have been made to make up for the weaknesses of the conventional qualitative analysis approach designed to select key promising areas through discovery of future core issues and emerging trend analysis in various areas of academic research. There needs to be a paradigm shift in toward implementing qualitative research methods along with quantitative research methods like text mining in a mutually complementary manner. The change is to ensure objective and practical emerging trend analysis results based on large amounts of data. However, even such studies have had shortcoming related to their dependence on simple keywords for analysis, which makes it difficult to derive meaning from data. Besides, no study has been carried out so far to develop core issues and analyze emerging trends in special domains like the aviation industry. The change used to implement recent studies is being witnessed in various areas such as the steel industry, the information and communications technology industry, the construction industry in architectural engineering and so on. This study focused on retrieving aviation-related core issues and emerging trends from overall research papers pertaining to aviation through text mining, which is one of the big data analysis techniques. In this manner, the promising future areas for the air transport industry are selected based on objective data from aviation-related research papers. In order to compensate for the difficulties in grasping the meaning of single words in emerging trend analysis at keyword levels, this study will adopt topic analysis, which is a technique used to find out general themes latent in text document sets. The analysis will lead to the extraction of topics, which represent keyword sets, thereby discovering core issues and conducting emerging trend analysis. Based on the issues, it identified aviation-related research trends and selected the promising areas for the future. Research on core issue retrieval and emerging trend analysis for the aviation industry based on big data analysis is still in its incipient stages. So, the analysis targets for this study are restricted to data from aviation-related research papers. However, it has significance in that it prepared a quantitative analysis model for continuously monitoring the derived core issues and presenting directions regarding the areas with good prospects for the future. In the future, the scope is slated to expand to cover relevant domestic or international news articles and bidding information as well, thus increasing the reliability of analysis results. On the basis of the topic analysis results, core issues for the aviation industry will be determined. Then, emerging trend analysis for the issues will be implemented by year in order to identify the changes they undergo in time series. Through these procedures, this study aims to prepare a system for developing key promising areas for the future aviation industry as well as for ensuring rapid response. Additionally, the promising areas selected based on the aforementioned results and the analysis of pertinent policy research reports will be compared with the areas in which the actual government investments are made. The results from this comparative analysis are expected to make useful reference materials for future policy development and budget establishment.
https://doi.org/10.13088/jiis.2015.21.1.65 인용 PDF KSCI

Analysis of media trends related to spent nuclear fuel treatment technology using text mining techniques (텍스트마이닝 기법을 활용한 사용후핵연료 건식처리기술 관련 언론 동향 분석)

Jeong, Ji-Song;Kim, Ho-Dong
- Journal of Intelligence and Information Systems
- /
- v.27 no.2
- /
- pp.33-54
- /
- 2021
With the fourth industrial revolution and the arrival of the New Normal era due to Corona, the importance of Non-contact technologies such as artificial intelligence and big data research has been increasing. Convergent research is being conducted in earnest to keep up with these research trends, but not many studies have been conducted in the area of nuclear research using artificial intelligence and big data-related technologies such as natural language processing and text mining analysis. This study was conducted to confirm the applicability of data science analysis techniques to the field of nuclear research. Furthermore, the study of identifying trends in nuclear spent fuel recognition is critical in terms of being able to determine directions to nuclear industry policies and respond in advance to changes in industrial policies. For those reasons, this study conducted a media trend analysis of pyroprocessing, a spent nuclear fuel treatment technology. We objectively analyze changes in media perception of spent nuclear fuel dry treatment techniques by applying text mining analysis techniques. Text data specializing in Naver's web news articles, including the keywords "Pyroprocessing" and "Sodium Cooled Reactor," were collected through Python code to identify changes in perception over time. The analysis period was set from 2007 to 2020, when the first article was published, and detailed and multi-layered analysis of text data was carried out through analysis methods such as word cloud writing based on frequency analysis, TF-IDF and degree centrality calculation. Analysis of the frequency of the keyword showed that there was a change in media perception of spent nuclear fuel dry treatment technology in the mid-2010s, which was influenced by the Gyeongju earthquake in 2016 and the implementation of the new government's energy conversion policy in 2017. Therefore, trend analysis was conducted based on the corresponding time period, and word frequency analysis, TF-IDF, degree centrality values, and semantic network graphs were derived. Studies show that before the 2010s, media perception of spent nuclear fuel dry treatment technology was diplomatic and positive. However, over time, the frequency of keywords such as "safety", "reexamination", "disposal", and "disassembly" has increased, indicating that the sustainability of spent nuclear fuel dry treatment technology is being seriously considered. It was confirmed that social awareness also changed as spent nuclear fuel dry treatment technology, which was recognized as a political and diplomatic technology, became ambiguous due to changes in domestic policy. This means that domestic policy changes such as nuclear power policy have a greater impact on media perceptions than issues of "spent nuclear fuel processing technology" itself. This seems to be because nuclear policy is a socially more discussed and public-friendly topic than spent nuclear fuel. Therefore, in order to improve social awareness of spent nuclear fuel processing technology, it would be necessary to provide sufficient information about this, and linking it to nuclear policy issues would also be a good idea. In addition, the study highlighted the importance of social science research in nuclear power. It is necessary to apply the social sciences sector widely to the nuclear engineering sector, and considering national policy changes, we could confirm that the nuclear industry would be sustainable. However, this study has limitations that it has applied big data analysis methods only to detailed research areas such as "Pyroprocessing," a spent nuclear fuel dry processing technology. Furthermore, there was no clear basis for the cause of the change in social perception, and only news articles were analyzed to determine social perception. Considering future comments, it is expected that more reliable results will be produced and efficiently used in the field of nuclear policy research if a media trend analysis study on nuclear power is conducted. Recently, the development of uncontact-related technologies such as artificial intelligence and big data research is accelerating in the wake of the recent arrival of the New Normal era caused by corona. Convergence research is being conducted in earnest in various research fields to follow these research trends, but not many studies have been conducted in the nuclear field with artificial intelligence and big data-related technologies such as natural language processing and text mining analysis. The academic significance of this study is that it was possible to confirm the applicability of data science analysis technology in the field of nuclear research. Furthermore, due to the impact of current government energy policies such as nuclear power plant reductions, re-evaluation of spent fuel treatment technology research is undertaken, and key keyword analysis in the field can contribute to future research orientation. It is important to consider the views of others outside, not just the safety technology and engineering integrity of nuclear power, and further reconsider whether it is appropriate to discuss nuclear engineering technology internally. In addition, if multidisciplinary research on nuclear power is carried out, reasonable alternatives can be prepared to maintain the nuclear industry.
https://doi.org/10.13088/jiis.2021.27.2.033 인용 PDF KSCI

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

Park, Ho-yeon;Kim, Kyoung-jae
- Journal of Intelligence and Information Systems
- /
- v.25 no.4
- /
- pp.141-154
- /
- 2019
Rapid growth of internet technology and social media is progressing. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor or high-quality content through text data of products, and it has proliferated during text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined data categories as positive and negative. This has been studied in various directions in terms of accuracy from simple rule-based to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active researches in natural language processing and is widely studied in text mining. When real online reviews aren't available for others, it's not only easy to openly collect information, but it also affects your business. In marketing, real-world information from customers is gathered on websites, not surveys. Depending on whether the website's posts are positive or negative, the customer response is reflected in the sales and tries to identify the information. However, many reviews on a website are not always good, and difficult to identify. The earlier studies in this research area used the reviews data of the Amazon.com shopping mal, but the research data used in the recent studies uses the data for stock market trends, blogs, news articles, weather forecasts, IMDB, and facebook etc. However, the lack of accuracy is recognized because sentiment calculations are changed according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity analysis of sentiment analysis into positive and negative categories and increase the prediction accuracy of the polarity analysis using the pretrained IMDB review data set. First, the text classification algorithm related to sentiment analysis adopts the popular machine learning algorithms such as NB (naive bayes), SVM (support vector machines), XGboost, RF (random forests), and Gradient Boost as comparative models. Second, deep learning has demonstrated discriminative features that can extract complex features of data. Representative algorithms are CNN (convolution neural networks), RNN (recurrent neural networks), LSTM (long-short term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but does not consider sequential data attributes. RNN can handle well in order because it takes into account the time information of the data, but there is a long-term dependency on memory. To solve the problem of long-term dependence, LSTM is used. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although there are many parameters for the algorithms, we examined the relationship between numerical value and precision to find the optimal combination. And, we tried to figure out how the models work well for sentiment analysis and how these models work. This study proposes integrated CNN and LSTM algorithms to extract the positive and negative features of text analysis. The reasons for mixing these two algorithms are as follows. CNN can extract features for the classification automatically by applying convolution layer and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be moved and controlled at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. Furthermore, when LSTM is used in CNN's pooling layer, it has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. In combination with CNN-LSTM, 90.33% accuracy was measured. This is slower than CNN, but faster than LSTM. The presented model was more accurate than other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can improve the weakness of each model, and there is an advantage of improving the learning by layer using the end-to-end structure of LSTM. Based on these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
https://doi.org/10.13088/jiis.2019.25.4.141 인용 PDF KSCI

Search Result 107, Processing Time 0.023 seconds

Automatic Quality Evaluation with Completeness and Succinctness for Text Summarization (완전성과 간결성을 고려한 텍스트 요약 품질의 자동 평가 기법)

A Study on the Structure of an Animation and the Generation of Signification (애니메이션 <겨울왕국>의 구조와 의미생성 연구)

The Study on Change in Sex-Related Knowledge and Attitude through Sex Education : focusing on the 1st grade students in girls' junior high schools (성교육 실시에 따른 성지식, 성태도 변화 연구 -1학년 여중생을 대상으로-)

WHICH INFORMATION MOVES PRICES: EVIDENCE FROM DAYS WITH DIVIDEND AND EARNINGS ANNOUNCEMENTS AND INSIDER TRADING

Text Mining-Based Emerging Trend Analysis for the Aviation Industry (항공산업 미래유망분야 선정을 위한 텍스트 마이닝 기반의 트렌드 분석)

Analysis of media trends related to spent nuclear fuel treatment technology using text mining techniques (텍스트마이닝 기법을 활용한 사용후핵연료 건식처리기술 관련 언론 동향 분석)

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)