• Title/Summary/Keyword: Text analysis

Statistical Analysis Between Size and Balance of Text Corpus by Evaluation of the effect of Interview Sentence in Language Modeling (언어모델 인터뷰 영향 평가를 통한 텍스트 균형 및 사이즈간의 통계 분석)

  • Jung Eui-Jung;Lee Youngjik
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.87-90 / 2002
  • This paper statistically analyzes the relationship between the size and balance of a text corpus by evaluating the effect of interview sentences on the language model of a Korean broadcast news transcription system. The ultimate purpose of the system is to recognize the speech of anchors and reporters in broadcast news shows, not interview speech. However, interview sentences make up approximately $15\%$ of the text corpus gathered for constructing the language model. Interview sentences differ in character from anchor and reporter sentences, and therefore disturb anchor- and reporter-oriented language modeling. We evaluate the effect of interview sentences on the language model and statistically analyze the relationship between corpus size and balance by repeating the same experimental procedure while varying the size of the corpus.

Discovering Meaningful Trends in the Inaugural Addresses of United States Presidents Via Text Mining (텍스트마이닝을 활용한 미국 대통령 취임 연설문의 트렌드 연구)

  • Cho, Su Gon;Cho, Jaehee;Kim, Seoung Bum
    • Journal of Korean Institute of Industrial Engineers / v.41 no.5 / pp.453-460 / 2015
  • Identification of meaningful patterns and trends in large volumes of text data is an important task in various research areas. In the present study, we propose a procedure to find meaningful tendencies based on a combination of text mining, cluster analysis, and low-dimensional embedding. To demonstrate applicability and effectiveness of the proposed procedure, we analyzed the inaugural addresses of the presidents of the United States from 1789 to 2009. The main results of this study show that trends in the national policy agenda can be discovered based on clustering and visualization algorithms.
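
The abstract does not spell out its pipeline, so the following is only a minimal sketch of one common starting point for clustering documents: TF-IDF weighting followed by cosine similarity. The toy documents and the function names `tfidf_vectors` and `cosine` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using TF-IDF weighting."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy "speeches", not text from the actual addresses.
docs = [
    "liberty union constitution people",
    "constitution union liberty government",
    "economy jobs trade growth",
]
vecs = tfidf_vectors(docs)
sim_01 = cosine(vecs[0], vecs[1])  # documents sharing vocabulary
sim_02 = cosine(vecs[0], vecs[2])  # documents with no shared terms
```

A pairwise similarity matrix built this way could then be passed to a clustering or embedding method of the reader's choice; which methods the authors used is not stated in the abstract.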

Text Categorization with Improved Deep Learning Methods

  • Wang, Xingfeng;Kim, Hee-Cheol
    • Journal of information and communication convergence engineering / v.16 no.2 / pp.106-113 / 2018
  • Although the deep learning methods of convolutional neural networks (CNNs) and long short-term memory (LSTM) are widely used for text categorization, they still have certain shortcomings. CNNs require that the text retain some order and that pooling lengths be identical, and they do not permit collateral analysis; LSTMs operate unidirectionally and have very complex inputs and outputs. To address these problems, we improved these traditional deep learning methods in the following ways: we created collateral CNNs that accept disorder and variable-length pooling, and we removed the input/output gates when creating bidirectional LSTMs. We evaluated the proposed methods on four benchmark datasets for topic and sentiment classification. The best results were obtained by combining LSTM regional embeddings with data convolution. Our method outperforms all previous methods (including deep learning methods) in topic and sentiment classification.

A study on the Rhetorical Strategies of Academic Text Construction for KAP learners (학문 목적 학습자를 위한 학술적 텍스트 구성의 수사적 전략 연구)

  • Hong, Yunhye
    • Journal of Korean language education / v.28 no.2 / pp.235-264 / 2017
  • The purpose of this study is to explore and categorize the rhetorical strategies of text construction in research articles and to provide data for academic writing education for foreign graduate students. This study analyzes 30 research articles by Korean writers from Korean language and Korean language education fields, and categorizes the rhetorical strategies according to the roles of the writer as a RA form composer, a manager of research content, and a communicator. On the basis of the strategies, this study analyzes 18 term papers of foreign graduate students and inspects their weaknesses in using the rhetorical strategies. Based on the results of analysis, this study suggests rhetorical strategy education for KAP learners that emphasizes validity and clarifies argument along with attracting readers.

Joint-transform Correlator Multiple-image Encryption System Based on Quick-response Code Key

  • Chen, Qi;Shen, Xueju;Cheng, Yue;Huang, Fuyu;Lin, Chao;Liu, HeXiong
    • Current Optics and Photonics / v.3 no.4 / pp.320-328 / 2019
  • A method for joint-transform correlator (JTC) multiple-image encryption based on a quick-response (QR) code key is proposed. The QR codes converted from different texts are used as key masks to encrypt and decrypt multiple images. Not only can Chinese text and English text be used as key text, but also symbols can be used. With this method, users have no need to transmit the whole key mask; they only need to transmit the text that is used to generate the key. The correlation coefficient is introduced to evaluate the decryption performance of our proposed cryptosystem, and we explore the sensitivity of the key mask and the capability for multiple-image encryption. Robustness analysis is also conducted in this paper. Computer simulations and experimental results verify the correctness of this method.
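
The abstract names the correlation coefficient as its decryption metric without giving a formula. The sketch below assumes the standard Pearson form, $CC = \mathrm{cov}(f, f') / (\sigma_f \sigma_{f'})$, computed over all pixels of the original image $f$ and the decrypted image $f'$. The function name and the tiny sample images are hypothetical.

```python
import math

def correlation_coefficient(original, decrypted):
    """Pearson correlation between two equally sized grayscale images.

    Each image is a list of rows of pixel intensities. This assumes the
    standard Pearson definition; the paper's exact formula is not quoted
    in the abstract.
    """
    a = [p for row in original for p in row]
    b = [p for row in decrypted for p in row]
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical 2x3 test images.
image = [[0, 50, 100], [150, 200, 250]]
noisy = [[5, 45, 110], [140, 205, 240]]             # slightly degraded copy
inverted = [[255 - p for p in row] for row in image]  # contrast-inverted copy

cc_same = correlation_coefficient(image, image)
cc_noisy = correlation_coefficient(image, noisy)
cc_inverted = correlation_coefficient(image, inverted)
```

Under this definition a value close to 1 indicates a faithful decryption, while values near 0 would indicate that the decrypted output is unrelated to the original, as would be expected with a wrong key.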

The Ebb and Flow of Regional Integration Vision in Asia-Pacific: From a Lens of Leaders' Declarations over 30 Years

  • Jeongmeen Suh
    • East Asian Economic Review / v.27 no.4 / pp.303-325 / 2023
  • This paper examines how APEC has transformed itself into an international forum for the vision of regional integration. It aims to quantify the documentation produced by the international organization and provide quantifiable evidence that aligns with prior knowledge rather than relying solely on intuition. For this purpose, I use various text mining techniques to extract multi-dimensional features from the text of APEC Leaders' Declarations from 1993 to 2023. In terms of interest in and expectations for APEC as a forum, members are found to have experienced two major peaks and troughs over the last three decades. The change points coincide with the Asian financial crisis of 1997 and the tensions between the United States and China since 2017. To explore further aspects of economic integration in the Asia-Pacific region, this study also considers how consistently APEC has been an international forum for addressing issues, which members are active, and how members have clustered based on their views of APEC.

Developing Sentimental Analysis System Based on Various Optimizer

  • Eom, Seong Hoon
    • International Journal of Internet, Broadcasting and Communication / v.13 no.1 / pp.100-106 / 2021
  • Over the past few decades, natural language processing research did not make much progress. However, the widespread use of deep learning has drawn attention to the application of neural networks in natural language processing. Sentiment analysis is one of the challenges of natural language processing. Emotions are what a person thinks and feels, so sentiment analysis should be able to analyze a person's attitudes, opinions, and inclinations as expressed in text. In sentiment analysis, the first priority is simply to classify two sentiments: positive and negative. In this paper, we propose a deep-learning-based sentiment analysis system and compare several optimizers: SGD, Adam, and RMSProp. In our experiments on the IMDB dataset, the RMSProp optimizer shows the best performance. Future work is to find better hyperparameters for the sentiment analysis system.
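
For readers unfamiliar with the three optimizers compared, here is a minimal sketch of their textbook update rules applied to a one-dimensional toy loss, $f(x) = (x - 3)^2$. The toy loss, the step counts, and the learning rates are assumptions chosen for illustration; they are not the paper's model or hyperparameters.

```python
import math

def grad(x):
    # Gradient of the toy loss f(x) = (x - 3)^2, whose minimum is at x = 3.
    return 2.0 * (x - 3.0)

def sgd(x, steps=500, lr=0.1):
    """Plain gradient descent: step against the gradient."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def rmsprop(x, steps=500, lr=0.05, decay=0.9, eps=1e-8):
    """RMSProp: scale each step by a running average of squared gradients."""
    avg_sq = 0.0
    for _ in range(steps):
        g = grad(x)
        avg_sq = decay * avg_sq + (1.0 - decay) * g * g
        x -= lr * g / (math.sqrt(avg_sq) + eps)
    return x

def adam(x, steps=500, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: momentum plus RMSProp-style scaling, with bias correction."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)         # bias-corrected moments
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

x_sgd = sgd(0.0)
x_rmsprop = rmsprop(0.0)
x_adam = adam(0.0)
```

All three runs should end near the minimum at $x = 3$ on this toy problem. Which optimizer performs best on a real sentiment model depends on the data and hyperparameters, so this sketch says nothing about the ranking reported above.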

A Pilot Study on Applying Text Mining Tools to Analyzing Steel Industry Trends : A Case Study of the Steel Industry for the Company "P" (철강산업 트렌드 분석을 위한 텍스트 마이닝 도입 연구 : P사(社) 사례를 중심으로)

  • Min, Ki Young;Kim, Hoon Tae;Ji, Yong Gu
    • The Journal of Society for e-Business Studies / v.19 no.3 / pp.51-64 / 2014
  • As uncertainty increases ever faster, the ability to predict the future becomes more and more important for business survival. Text mining tools are one of the main candidates for predicting the future besides traditional quantitative analyses, but such efforts are still in their infancy. This paper introduces one such effort using the case of company "P" in the steel industry. Even with only a four-month pilot study, we found strong possibilities, though not yet robustly tested, of predicting future industrial trends using text mining tools. For these case studies, we categorized steel industry trend keywords into ten categories and studied ten different subjects for each category. Once we found a meaningful change in a trend, we investigated in more detail what happened and how. To make the approach more robust, we first need to define the purpose of text mining analyses more clearly. We then need to categorize industry trend keywords more systematically using systems thinking models. With these improvements, we are quite sure that applying text mining tools to industry trend analysis will contribute to predicting future industry trends as well as to identifying trends that would otherwise go unseen.

Case Study Analysis of Digital Education Design to Basic Concept Design Trend by Target of Education Needs in UK and Sweden (디지털 교육매체의 기초 컨셉디자인 동향 파악을 위한 선진국 사례 분석 - 영국과 스웨덴의 사용자 니즈를 중심으로 -)

  • Kim, Jung-Hee
    • Cartoon and Animation Studies / s.34 / pp.345-366 / 2014
  • Since digital textbooks were introduced in 2007, many kinds of digital textbooks, for subjects such as English and science, have come into use in public education. Early digital textbooks had many problems, such as simply reusing scanned pages of paper textbooks, but they now use content and design made specifically for digital textbooks. However, this applies only to digital textbooks and not to other digital education media. There is a gap between countries with advanced education systems and Korea. This research is based on work at the LG Europe Design Center in London, UK, and targets the UK and Sweden, using heuristic analysis and a questionnaire survey to capture the target users' UX with digital education media. Advances in digital technology and interest in education have driven the worldwide development of digital education devices. The UK, an advanced nation in education, uses many digital education devices such as interactive boards and digital desks. The results of the analysis of digital education design trends by target users' education needs were applied to a rough design by the LG Europe Design Center. The target users yielded more sophisticated needs and UX results than in Korea, which can be used for our future digital education design plans and can help reduce the gap between advanced countries and Korea.

Entrepreneur Speech and User Comments: Focusing on YouTube Contents (기업가 연설문의 주제와 시청자 댓글 간의 관계 분석: 유튜브 콘텐츠를 중심으로)

  • Kim, Sungbum;Lee, Junghwan
    • The Journal of the Korea Contents Association / v.20 no.5 / pp.513-524 / 2020
  • YouTube's growth has recently started drawing attention. YouTube is not only a content-consumption channel but also a space where consumers express their intentions and share their opinions through comments. This study focuses on the text of global entrepreneurs' speeches and the comments posted in response to those speeches on YouTube. A content analysis was conducted for each speech and its comments using the text mining software Leximancer. We analyzed the theme of each entrepreneurial speech and derived topics related to the propensity and characteristics of individual entrepreneurs. In the comments, we found the themes of money, work, and need to be common regardless of the content of each speech. Taking into account the different lengths of the texts, we additionally performed a Prominence Index analysis. We derived time, future, better, best, change, life, business, and need as common keywords for speech contents and viewer comments. Users who watched an entrepreneur's speech on YouTube responded equally to the topics of life, time, future, customer needs, and positive change.