• Title/Summary/Keyword: 전자 텍스트 (electronic text)


String extraction from text-background mixed documents using mathematical morphology (텍스트-배경무늬 혼합문서로부터 수리형태학을 이용한 문자열 추출)

  • 성연진;어진우
    • Journal of the Korean Institute of Telematics and Electronics S / v.34S no.10 / pp.104-111 / 1997
  • Recognizing documents in which text is mixed with a background pattern is known to be a difficult problem. This paper proposes a new string extraction algorithm, based on mathematical morphology, for documents consisting of text overlapped with a periodic background pattern. The algorithm consists of two stages: extraction of the pattern periodicity feature and background removal. The extracted periodicity feature determines the shape of the structuring elements used in the morphological pre- and post-processing that removes the background. Experiments with various test documents verify that the proposed algorithm is more effective than the existing one.

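The background-removal idea in the abstract above can be illustrated with a minimal sketch of binary morphological opening (erosion followed by dilation). This is not the authors' algorithm: the square structuring element, its size `k`, the toy 9x9 image, and the function names `erode`, `dilate`, and `opening` are all assumptions made for illustration. In the paper the element shape is derived from the measured pattern periodicity, whereas here it is simply chosen by hand to be larger than one background dot.

```python
def erode(img, k):
    """Binary erosion with a k x k square structuring element (origin at its top-left)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h - k + 1):
        for x in range(w - k + 1):
            # Keep the pixel only if the whole k x k window is foreground.
            if all(img[y + dy][x + dx] for dy in range(k) for dx in range(k)):
                out[y][x] = 1
    return out


def dilate(img, k):
    """Binary dilation with the same k x k square structuring element."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                # Stamp the k x k element at every foreground pixel.
                for dy in range(k):
                    for dx in range(k):
                        if y + dy < h and x + dx < w:
                            out[y + dy][x + dx] = 1
    return out


def opening(img, k):
    """Opening = erosion then dilation; removes shapes smaller than the element."""
    return dilate(erode(img, k), k)


# Toy document (assumed data): single-pixel dots every 3 pixels act as the
# periodic background, and a solid 3 x 3 block stands in for a text stroke.
size = 9
img = [[1 if (y % 3 == 0 and x % 3 == 0) else 0 for x in range(size)]
       for y in range(size)]
for y in range(4, 7):
    for x in range(4, 7):
        img[y][x] = 1

cleaned = opening(img, 2)  # k = 2 is larger than a dot, smaller than the stroke
for row in cleaned:
    print("".join("#" if v else "." for v in row))
```

A 2x2 element never fits inside an isolated dot, so erosion deletes the dots, while the wider stroke survives and is restored by the dilation.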

Feature based Text Watermarking in Digital Binary Image (이진 문서 영상에서의 특징 기반 텍스트 워터마킹)

  • 공영민;추현곤;최종욱;김희율
    • Proceedings of the IEEK Conference / 2002.06d / pp.359-362 / 2002
  • In this paper, we propose a new feature-based text watermarking method for binary text images. The structure of specific characters in the preprocessed text image is modified to embed the watermark. The watermark message is embedded and detected using three methods: hole line disconnect, which uses the connectivity of a character containing a hole; center line shift, which uses the hole area; and differential encoding, which uses the difference of flippable score points. Experimental results show that the proposed method is robust to rotation and scaling distortion.


A Web Information Mining Agent for Electrical Elements(WIMA-EE) (전자부품관련 웹 정보 마이닝 에이전트(WIMA-EE))

  • 오석일;변영태
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.66-68 / 2000
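  • Much of the information published on the web is provided as text, and various efforts have been made to find and extract desired information from web documents composed of words in many forms. In this paper, focusing on sites that provide information about electronic components, we present an information extraction method for semi-structured documents, ranging from plain text to documents that contain the special tags of web documents, such as information expressed in table form. We also build a system that applies this method in order to demonstrate the feasibility of such information extraction.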


SIP-based mobility management protocol to support computing environment mobility (컴퓨팅 환경의 이동성 지원을 위한 SIP 기반의 이동성 관리 프로토콜)

  • Hyungsik Moon;Choonhwa Lee
    • Annual Conference of KIPS / 2008.11a / pp.1001-1003 / 2008
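  • Ubiquitous environments require mobility of the computing environment, which can be defined as a set of individual services. To support this mobility, we used SIP, which is text-based, highly extensible, and well suited to mobility. We propose a protocol that manages the mobility of the computing environment using an extended SIP.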

Text Mining Techniques for Adaptable Learning (적응적인 학습을 위한 텍스트 마이닝 기술)

  • Kim, Cheon-Shik;Jung, Myung-Hee;Hong, You-Sik
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.3 / pp.31-39 / 2008
  • Many technologies have been developed to improve learning ability through e-learning systems. In most e-learning systems, learners study using lecture materials and practice problems. Learning ability and motivation, however, can be improved further through shared materials and discussion. In that case, learning materials are shared through learners' discussions on Internet boards and MSN. Because such data was not classified, learners could not easily search for related valuable information, so it did little to help learning. Most text mining technologies extract summary data from a collection of documents or group similar documents within a complex collection. In this paper, we implemented an e-learning system to improve learners' abilities and, in particular, applied text mining technology to classify learning materials for learners.

Effect of Text Transmission Performance on Delay Spread by Water Surface Fluctuation in Underwater Multipath Channel (수중 다중경로 채널에서 수면변동에 의한 지연확산이 텍스트 전송성능에 미치는 영향)

  • Park, Ji-Hyun;Kim, Jong-Wook;Yoon, Jong-Rak
    • Journal of the Institute of Electronics Engineers of Korea TC / v.48 no.1 / pp.1-8 / 2011
  • In this paper, we conduct a water tank experiment using Binary Frequency Shift Keying (BFSK) to evaluate how water surface fluctuation affects text transmission performance. Water surface fluctuation and delay spread, which affect the channel coherence bandwidth, are limiting factors in underwater acoustic communication. The amplitude fluctuation and delay spread were identified for a smooth surface and for a fluctuating surface. The effective delay spreads of the two cases are 5 ms and 4 ms, corresponding to coherence bandwidths of 200 Hz and 250 Hz, respectively. The bit error rate of BFSK-modulated text transmission is about $10^{-4}$ at rates below 200 bps with the smooth surface and below 250 bps with the fluctuating surface. This experiment therefore shows that water surface fluctuation is an important factor in determining the performance of underwater acoustic transmission.
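The delay spread and coherence bandwidth figures quoted in the abstract above are consistent with the common rule of thumb that coherence bandwidth is roughly the reciprocal of the delay spread. The abstract does not state which formula the authors used, so the reciprocal relation and the function name `coherence_bandwidth_hz` below are assumptions; the sketch only checks the arithmetic.

```python
def coherence_bandwidth_hz(delay_spread_ms):
    """Approximate coherence bandwidth (Hz) as 1 / delay spread.

    Assumes the rule of thumb B_c ~ 1 / tau; the delay spread is given in
    milliseconds, so 1 / (tau_ms / 1000) = 1000 / tau_ms.
    """
    if delay_spread_ms <= 0:
        raise ValueError("delay spread must be positive")
    return 1000.0 / delay_spread_ms


# Delay spreads quoted in the abstract: 5 ms and 4 ms.
smooth_hz = coherence_bandwidth_hz(5.0)
fluctuating_hz = coherence_bandwidth_hz(4.0)
print(smooth_hz, fluctuating_hz)  # 200.0 250.0
```

Under this assumption, the reported rate limits of 200 bps and 250 bps line up with the coherence bandwidths of the two cases.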

BERT-based Classification Model for Korean Documents (한국어 기술문서 분석을 위한 BERT 기반의 분류모델)

  • Hwang, Sangheum;Kim, Dohyun
    • The Journal of Society for e-Business Studies / v.25 no.1 / pp.203-214 / 2020
  • Technical documents such as patents and R&D project reports need to be classified in order to understand trends in technology convergence, interdisciplinary joint research, and technology development. Text mining techniques have mainly been used to classify these technical documents. However, classifying technical documents with text mining algorithms has the disadvantage that the features representing the documents must be extracted directly. In this study, we propose a BERT-based document classification model that automatically extracts document features from the text information of national R&D projects and classifies them. We then verify the applicability and performance of the proposed model for document classification.

An Analysis Scheme Design of Customer Spending Pattern using Text Mining (텍스트 마이닝을 이용한 소비자 소비패턴 분석 기법 설계)

  • Jeong, Eun-Hee;Lee, Byung-Kwan
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.2 / pp.181-188 / 2018
  • In this paper, we propose a scheme for analyzing customer spending patterns using text mining. The proposed scheme first analyzes the similarity of users' ratings using the Pearson correlation, second analyzes the similarity of users' reviews using TF-IDF cosine similarity, and third analyzes the consistency between ratings and reviews using SentiWordNet. It then selects the nearest neighbors using rating similarity and review similarity, and provides a recommendation list suited to the consumption pattern. The precision of the recommendation list is 0.79 for the Pearson correlation, 0.73 for TF-IDF, and 0.82 for the proposed consumption pattern analysis. That is, the proposed scheme can analyze consumption patterns more accurately because it uses both the quantitative ratings and the qualitative reviews of consumers.
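The first two steps named in the abstract above, rating similarity by Pearson correlation and review similarity by TF-IDF cosine similarity, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the sample ratings and reviews, and all function names are hypothetical, and the third step (rating/review consistency) is not covered.

```python
import math
from collections import Counter


def pearson(a, b):
    """Pearson correlation between two equal-length rating lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b) if sd_a and sd_b else 0.0


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) using a smoothed IDF; tokenizer is a plain split."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            t: (c / len(toks)) * (math.log((1 + n) / (1 + doc_freq[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical data: two users' ratings of the same four items.
rating_sim = pearson([5, 4, 1, 2], [4, 5, 2, 1])

# Hypothetical reviews from three users.
reviews = [
    "great battery great screen",
    "great screen poor battery",
    "slow delivery",
]
vecs = tfidf_vectors(reviews)
sim_01 = cosine(vecs[0], vecs[1])  # reviews sharing several terms
sim_02 = cosine(vecs[0], vecs[2])  # reviews sharing no terms
print(rating_sim, sim_01, sim_02)
```

A nearest-neighbor selection like the one the abstract describes could then rank other users by some combination of `rating_sim` and the review similarity; how the two scores are combined is not stated in the abstract.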

Optimizing Input Parameters of Paralichthys olivaceus Disease Classification based on SHAP Analysis (SHAP 분석 기반의 넙치 질병 분류 입력 파라미터 최적화)

  • Kyung-Won Cho;Ran Baik
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1331-1336 / 2023
  • In text-based fish disease classification using machine learning, the model has too many input parameters, yet they cannot be reduced arbitrarily because of performance concerns. To solve this problem, this paper proposes a method that uses SHAP analysis to optimize input parameters specifically for Paralichthys olivaceus disease classification. The proposed method preprocesses the disease information extracted from the halibut disease questionnaire by applying SHAP analysis and evaluates a machine learning model using AutoML. In this way, the performance of the AutoML input parameters is evaluated and the optimal combination of input parameters is derived. The proposed method is expected to maintain the existing performance while reducing the number of required input parameters, which will help make text-based Paralichthys olivaceus disease classification more efficient and practical.

A multi-channel CNN based online review helpfulness prediction model (Multi-channel CNN 기반 온라인 리뷰 유용성 예측 모델 개발에 관한 연구)

  • Li, Xinzhe;Yun, Hyorim;Li, Qinglong;Kim, Jaekyeong
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.171-189 / 2022
  • Online reviews play an essential role in consumers' purchasing decisions, so providing helpful and reliable reviews to consumers is essential. Previous studies on online review helpfulness prediction mainly predicted helpfulness based on the consistency between the text and the rating information of online reviews. However, these studies are limited in representation capacity and in how they model the interaction between review text and rating. To address these limitations, we propose a CNN-RHP model that effectively learns the interaction between review text and rating information. Multi-channel CNNs are applied to extract the semantic representation of the review text. We also convert the rating into independent high-dimensional embedding vectors with the same dimension as the text vector. The consistency between the review text and the rating information is learned through element-wise operations between the review text vector and the star rating vector. To evaluate the performance of the proposed CNN-RHP model, we used online reviews collected from Amazon.com. Experimental results show that the CNN-RHP model outperforms several benchmark models. The results of this study have practical implications for services related to review helpfulness on e-commerce platforms.