Investigations on Techniques and Applications of Text Analytics (텍스트 분석 기술 및 활용 동향)

  • Kim, Namgyu;Lee, Donghoon;Choi, Hochang;Wong, William Xiu Shun
    • The Journal of Korean Institute of Communications and Information Sciences
    • v.42 no.2
    • pp.471-492
    • 2017
  • Demand for and interest in big data analytics are increasing rapidly. The concept of big data covers not only existing structured data but also various kinds of unstructured data such as text, images, videos, and logs. Among these types of unstructured data, text has gained particular attention because it is the most representative means of describing and delivering information. Text analysis is generally performed in the following order: document collection, parsing and filtering, structuring, frequency analysis, and similarity analysis. The results of the analysis can be presented through word clouds, word networks, topic modeling, document classification, and semantic analysis. Notably, there is growing demand to identify trending topics in the rapidly increasing volume of text generated through social media. Research on and applications of topic modeling have therefore been actively pursued in various fields, since topic modeling can extract the core topics from a huge number of unstructured text documents and provide the group of documents belonging to each topic. In this paper, we review the major techniques and research trends of text analysis. We also introduce application cases in which topic modeling was used to solve problems in various fields.

A Study on the Necessity and Applicability of Interactive Electronic Technical Manual(IETM) for Construction Projects (건설분야 전자매뉴얼의 필요성 및 특성분석을 통한 실무적용성 연구)

  • Kang, Leen-Seok;Jung, Won-Myung;Kwak, Joong-Min
    • Korean Journal of Construction Engineering and Management
    • v.6 no.1 s.23
    • pp.99-108
    • 2005
  • An interactive electronic technical manual (IETM) for construction projects is an electronic tool in which the regulations and specifications related to construction methods or maintenance processes are described in electronic book form. It also serves as an integrated information system that includes virtual reality (VR), 3D animation, and image content representing real construction information, so that users can easily understand the construction situation and the maintenance process. In our construction industry, the basic information and technical manuals of construction facilities are still written as paper documents. As a result, information management in the maintenance phase of construction projects is inefficient, and maintenance costs are increasing. This study attempts to improve the understanding of construction IETM by analyzing its necessity and its unique functions in comparison with the IETMs of other industries. Finally, to verify the applicability of IETM, this study presents a scenario of a construction IETM for mitigating natural disasters affecting construction facilities.

Analyzing Comments of YouTube Video to Measure Use and Gratification Theory Using Videos of Trot Singer, Cho Myung-sub (YouTube 동영상 의견분석을 통한 사용과 충족 이론 측정 : 트로트 가수 조명섭 동영상을 중심으로)

  • Hong, Han-Kook;Leem, Byung-hak;Kim, Sam-Moon
    • The Journal of the Korea Contents Association
    • v.20 no.9
    • pp.29-42
    • 2020
  • The purpose of this study is to present a qualitative research method for extracting and analyzing the comments written by YouTube video users. To do this, we used YouTube users' feedback to measure the hedonic, social, and utilitarian gratification of use and gratification theory (UGT) using text analysis and topic modeling. The measurement showed that the primary reason users watch videos of the trot singer Cho Myung-sub on the KBS Korean broadcasting channel is hedonic gratification, which appeared with high frequency. In the word-document network analysis, degree centrality was high for words such as 'cheering', 'thank you', 'fighting', and 'best'. Betweenness centrality showed a pattern similar to degree centrality. Eigenvector centrality also shows that words such as 'love', 'heart', and 'thank you' are the most influential words in users' opinions. The results of the centrality analysis show that the majority of video users express 'love', 'heart', and 'thank you' for the video. This indicates that the words with high centrality are consistent with the high-frequency words of the hedonic and social gratification dimensions of UGT. The methodological implication of the study is that it sheds light on the motivations for watching YouTube videos through UGT by using text mining techniques that automate qualitative analysis, rather than following a survey-based structural equation model.
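
To make the degree centrality mentioned in this abstract concrete, the sketch below computes it for words in a bipartite word-document network. It is an illustrative assumption, not the authors' procedure: the whitespace tokenization, the normalization by the number of documents, and the sample comments are all invented for the example.

```python
from collections import defaultdict

def word_degree_centrality(comments):
    """Degree centrality of each word in a bipartite word-document network.

    A word is linked to every comment (document) in which it appears. Its
    degree is divided by the number of comments, so values fall in [0, 1].
    """
    docs_containing = defaultdict(set)
    for doc_id, comment in enumerate(comments):
        for word in set(comment.lower().split()):
            docs_containing[word].add(doc_id)
    n_docs = len(comments)
    return {word: len(docs) / n_docs for word, docs in docs_containing.items()}

# Made-up comments for illustration only; not data from the study.
comments = [
    "love this song",
    "thank you for this video",
    "love the voice thank you",
    "best singer",
]
centrality = word_degree_centrality(comments)
# Rank by centrality (descending), breaking ties alphabetically.
top_words = sorted(centrality, key=lambda w: (-centrality[w], w))[:3]
```

Betweenness and eigenvector centrality need a full graph representation and are usually computed with a graph library rather than by hand.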

Printed Numeric Character Recognition using Fractal Dimension and Modified Henon Attractor (프랙탈 차원과 수정된 에농 어트랙터를 이용한 인쇄체 숫자인식)

  • 손영우
    • Journal of Korea Multimedia Society
    • v.6 no.1
    • pp.89-96
    • 2003
  • This paper proposes a new method for extracting character features and recognizing numeric characters using the fractal dimension and a modified Henon attractor from chaos theory. First, mesh features, projection features, and cross-distance features are extracted from numeric character images, and these features are converted into time series data. Then, using the modified Henon system suggested in this paper, the final features of the numeric character image are obtained by calculating the natural measure and the information bits, which represent the fractal dimension. Finally, numeric character recognition is performed by statistically finding the information bits that show the minimum difference from the normalized pattern database. Experimental results show a 100% classification rate for the 10 digits, a 90% recognition rate in real-world conditions, and a recognition speed of 26 characters per second.

Minimum-cost Path Algorithm for Separating Touching English Characters (최단 경로 알고리즘을 이용한 접합 영문자 분할)

  • Lee, Duk-Ryong;Oh, Il-Seok
    • Journal of the Institute of Electronics and Information Engineers
    • 2012
  • This paper proposes an algorithm that finds a nonlinear cut path for a printed grayscale image of touching characters. Conventional algorithms were observed to fail in complicated touching situations. We analyzed those situations and, based on the results, identified the problems of the conventional algorithms. We modified the conventional algorithms in two respects. First, we propose a new penalty term that is likely to guide the cut path correctly in touching situations that are difficult to separate. Second, the proposed algorithm produces both a downward and an upward path and selects the better one. Experimental results on actual touching character images showed that the proposed algorithm outperformed conventional algorithms by 3~4% in separation success rate.
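
The general idea of a minimum-cost cut path through a grayscale image can be illustrated with a small dynamic-programming sketch. This is a generic baseline written under stated assumptions, not the algorithm proposed in the paper: the cost function (255 minus the pixel value), the three-neighbour step rule, and the toy image are assumptions, and the paper's penalty term and its downward/upward path selection are not reproduced.

```python
def min_cost_cut_path(image):
    """Return (total_cost, columns) of the cheapest top-to-bottom cut path.

    image: list of rows of grayscale values (0 = black ink, 255 = white paper).
    Cutting through a pixel costs 255 - value, so the path prefers to avoid ink.
    Each step moves one row down and at most one column sideways.
    """
    rows, cols = len(image), len(image[0])
    cost = [[255 - value for value in row] for row in image]
    best = [cost[0][:]]      # best[r][c]: cheapest cost of any path ending at (r, c)
    back = [[0] * cols]      # back[r][c]: column chosen in row r - 1
    for r in range(1, rows):
        best_row, back_row = [], []
        for c in range(cols):
            candidates = range(max(0, c - 1), min(cols, c + 2))
            prev = min(candidates, key=lambda p: best[r - 1][p])
            best_row.append(best[r - 1][prev] + cost[r][c])
            back_row.append(prev)
        best.append(best_row)
        back.append(back_row)
    # Trace back from the cheapest cell in the bottom row.
    c = min(range(cols), key=lambda p: best[-1][p])
    total = best[-1][c]
    path = [c]
    for r in range(rows - 1, 0, -1):
        c = back[r][c]
        path.append(c)
    path.reverse()
    return total, path

# Toy image: the only light pixels form a bent gap between two dark regions.
image = [
    [0, 255,   0, 0],
    [0,   0, 200, 0],
    [0,   0, 255, 0],
]
total, path = min_cost_cut_path(image)
```

For this toy image the path starts in column 1 and bends to column 2, crossing only the thin grey bridge, which is the kind of non-straight cut that a vertical projection cannot produce.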

Printed Hangul Recognition with Adaptive Hierarchical Structures Depending on 6-Types (6-유형 별로 적응적 계층 구조를 갖는 인쇄 한글 인식)

  • Ham, Dae-Sung;Lee, Duk-Ryong;Choi, Kyung-Ung;Oh, Il-Seok
    • The Journal of the Korea Contents Association
    • 2010
  • Because Hangul character recognition involves a large number of classes, a six-type preclassification stage is commonly used. After preclassification, the first consonant, vowel, and last consonant can be classified separately. Although each of the three components has only a few classes, classification errors often occur because of shape similarity, as between 'ㅔ' and 'ㅖ'. This paper therefore proposes a hierarchical recognition method that adopts multi-stage tree structures for each of the six types. In addition, to reduce interference among the three components, the method uses the recognition results of the first consonant and vowel as features of the vowel classifier. The recognition accuracy on the test set of the PHD08 database was 98.96%.
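
The three components and the six types mentioned in this abstract can be illustrated on character codes rather than images. The sketch below decomposes a precomposed Hangul syllable with standard Unicode arithmetic and assigns a type from the vowel orientation and the presence of a last consonant. The type numbering (1 to 3 without a last consonant, 4 to 6 with one) is an assumption following a common convention and is not confirmed by the abstract; the paper itself classifies character images, which this sketch does not attempt.

```python
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"                  # 19 first consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"              # 21 vowels
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 incl. none

VERTICAL = set("ㅏㅐㅑㅒㅓㅔㅕㅖㅣ")   # vowel drawn to the right of the consonant
HORIZONTAL = set("ㅗㅛㅜㅠㅡ")         # vowel drawn below the consonant

def decompose(syllable):
    """Split one precomposed Hangul syllable (U+AC00..U+D7A3) into its parts."""
    offset = ord(syllable) - 0xAC00
    if not 0 <= offset <= 0xD7A3 - 0xAC00:
        raise ValueError("not a precomposed Hangul syllable")
    first = CHOSEONG[offset // (21 * 28)]
    vowel = JUNGSEONG[(offset % (21 * 28)) // 28]
    last = JONGSEONG[offset % 28]
    return first, vowel, last

def six_type(syllable):
    """Assumed numbering: 1-3 without a last consonant, 4-6 with one."""
    _, vowel, last = decompose(syllable)
    if vowel in VERTICAL:
        shape = 1
    elif vowel in HORIZONTAL:
        shape = 2
    else:
        shape = 3            # compound vowel such as ㅘ or ㅝ
    return shape + (3 if last else 0)

types = {ch: six_type(ch) for ch in "가고과한글원"}
```

Under this numbering, 한 decomposes into ㅎ, ㅏ, ㄴ and falls into type 4, while 원 (ㅇ, ㅝ, ㄴ) falls into type 6.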

Implementation of Wireless Contents Access PMP using ARM 9 Embedded System (ARM 9 임베디드 시스템에 의한 무선 컨텐츠 액세스 PMP 구현)

  • Han, Kyong-Ho;Kim, Hee-Su
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers
    • 2007
  • In this paper, a diskless personal multimedia player (PMP) that can access and decode large remote multimedia data over a wireless network is implemented. To implement this, a WLAN-based NFS protocol is used to connect the PMP to the remote server, and text, image, and movie files are decoded and played using an ARM9-core PXA255 embedded processor and the Linux OS. The function and performance of the PMP are evaluated and verified using various types of content. Linux kernel elements are configured and built according to the hardware and software of the target board and then installed on it. The verification results show that the required functions and performance are achieved.

The Design of Efficient Learning Management System using Streaming Service (스트리밍 서비스를 이용한 효율적인 학습 관리 시스템 설계)

  • Kim, Bong-Hyun;Kim, Seong-Youn;Han, Jin-Young
    • Annual Conference of KIPS
    • 2002
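  • With the recent rapid development of computers and the Internet, educational culture has opened up a new world for users. As 21st-century education converges on the idea of a lifelong-learning society, people are building their competitiveness in a rapidly changing knowledge-based society through continuous self-development. Distance education is a highly effective way to achieve such lifelong education, and it forms an open educational culture in which anyone can receive high-quality education anytime and anywhere. This study aims to implement a convenient distance education system that guarantees individuality and requires no separate Internet connection procedure, with a focus on improving learning ability through thorough learner management. The system is designed to cover the viewing and control of video lectures using Microsoft's media streaming service, a discussion-style system built on My_SQL that provides sub-note documents for the video lectures and supports questions and answers, and convenient management of each user's lecture progress, attendance, and enrolled courses. In addition, the study focuses on managing user course enrollment within the system's processing flow, and the system is organized around learner management performed in real time.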

FPGA-based ML-DSA Post-Quantum Cryptography Hardware Accelerator Design using High Level Synthesis (HLS 를 이용한 FPGA 기반 ML-DSA 양자내성암호 하드웨어 가속기 설계)

  • Hanho Lee;Yunseong Jang
    • Transactions on Semiconductor Engineering
    • v.2 no.4
    • pp.21-28
    • 2024
  • This paper presents the design and implementation of ML-DSA, a next-generation post-quantum cryptography algorithm, as a hardware accelerator on an FPGA using High-Level Synthesis (HLS). We optimized the ML-DSA algorithm using various directives provided by Vitis HLS, configured the AXI interface, and designed a hardware accelerator that can be implemented on an FPGA. Then, we used the Vivado tool to design the IP block and implement it on the ZYNQ ZCU104 FPGA. Finally, video and document data were saved and processed with Python code in the PYNQ framework, and digital signature generation and verification for the video data were accelerated using the ML-DSA hardware accelerator implemented on the FPGA.

Design of Micropayment System Using anonymous OTP for M-commerce (M-Commerce에서 OTP를 이용하여 사용자 익명성을 보장하는 소액 결제 시스템)

  • Shin Dong-Gyu;Jung Ki-Won;Choun Jun-ho;Jun Moon-Seog
    • Proceedings of the Korean Information Science Society Conference
    • 2005
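  • With the rapid development of electronic commerce over the Internet, e-commerce now takes place even while users are on the move. M-Commerce is expected to grow rapidly not around conventional high-priced goods but around low-value items such as documents, music files, and videos. As a result, micropayment systems have been developed and are now widely deployed. However, user anonymity remains a weakness of existing micropayment systems, and to address it this paper proposes a new protocol for an OTP-based micropayment system. Focusing on the customer and the mobile carrier, who have already authenticated each other before a transaction takes place, the protocol is designed so that the customer can be authenticated without entering personal information directly with the Internet content provider.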

