• Title/Summary/Keyword: Text-independent


Passage Retrieval and Calculation Method of Topic Field by Using Field-Associated Terms (분야연상어를 이용한 화제분야의 계산방법과 단락검색)

  • Lee Samuel-Sangkon
    • The KIPS Transactions:PartB / v.12B no.1 s.97 / pp.57-68 / 2005
  • Segmenting a text without relying on any text-embedded auxiliary information is an important task. This paper presents a technique for dividing a text into field-coherent passages. The method extracts field-associated terms from the text and measures how topics grow, shrink, and shift from sentence to sentence. We propose measures of topic continuity and topic transition and suggest how they can be used to find the boundaries between passages. After collecting 12,500 documents, we obtain 88% average precision and 78% recall on a Korean training set.
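The idea of tracking topics from sentence to sentence can be illustrated with a minimal Python sketch. The abstract does not define its continuity and transition measures, so everything here is an assumption made for illustration: the tiny `FIELD_TERMS` dictionary, the use of cosine similarity as a stand-in continuity score, and the `threshold` value are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical field-associated term dictionary (term -> field); not from the paper.
FIELD_TERMS = {
    "goal": "sports", "striker": "sports", "match": "sports",
    "election": "politics", "senate": "politics", "vote": "politics",
}

def field_profile(sentence):
    """Count how many field-associated terms of each field occur in a sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return Counter(FIELD_TERMS[w] for w in words if w in FIELD_TERMS)

def continuity(a, b):
    """Cosine similarity of two field profiles (an assumed stand-in measure)."""
    dot = sum(a[f] * b[f] for f in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sentence without field terms carries no topic evidence
    return dot / (norm_a * norm_b)

def passage_boundaries(sentences, threshold=0.5):
    """Return each index i where sentence i is assumed to start a new passage."""
    profiles = [field_profile(s) for s in sentences]
    return [
        i for i in range(1, len(profiles))
        if continuity(profiles[i - 1], profiles[i]) < threshold
    ]
```

In this sketch a low continuity score between adjacent sentences is read as a topic transition. A sentence with no field-associated terms also scores zero, so a real system would likely smooth over a window of sentences rather than compare single neighbors.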

Relevant Analysis on User Choice Tendency of Intelligent Tourism Platform under the Background of Text mining

  • Liu, Zi-Yang;Liao, Kai;Guo, Zi-Han
    • Journal of the Korea Society of Computer and Information / v.24 no.9 / pp.119-125 / 2019
  • The purpose of this study is to identify the factors related to tourism users' choice tendency toward Intelligent Tourism platforms through big data analysis. The findings are intended to help enterprises position themselves accurately and improve according to user feedback in the future tourism market, so as to win users' choice and achieve long-term market competitiveness. This study takes the Intelligent Tourism platform as the independent variable and user choice tendency as the dependent variable, and explores the factors relating the two. It makes use of text mining and R language text analysis, and uses the SPSS and AMOS statistical analysis tools to carry out empirical analysis. According to the analysis results, the conclusions are as follows: service quality has a significant positive correlation with user choice tendency; service quality has a significant positive correlation with tourism trust; tourism trust has a significant positive correlation with user choice tendency; service quality has a significant positive correlation with user experience; and user experience has a significant positive correlation with user choice tendency.

Text Region Extraction from Videos using the Harris Corner Detector (해리스 코너 검출기를 이용한 비디오 자막 영역 추출)

  • Kim, Won-Jun;Kim, Chang-Ick
    • Journal of KIISE:Software and Applications / v.34 no.7 / pp.646-654 / 2007
  • In recent years, the use of text inserted into TV content has grown to provide viewers with better visual understanding. In this paper, video text is defined as a superimposed text region located at the bottom of the video. Video text extraction is the first step for video information retrieval and video indexing. Most video text detection and extraction methods in previous work are based on text color, contrast between text and background, edges, character filters, and so on. However, video text extraction remains difficult because of low video resolution and complex backgrounds. To solve these problems, we propose a method to extract text from videos using the Harris corner detector. The proposed algorithm consists of four steps: corner map generation using the Harris corner detector, extraction of text candidates considering the density of corners, text region determination using labeling, and post-processing. The proposed algorithm is language independent and can be applied to text of various colors. Text region updates between frames are also exploited to reduce the processing time. Experiments are performed on diverse videos to confirm the efficiency of the proposed method.
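The first two steps named in the abstract (corner map generation and corner density) can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the 3x3 box window, the constant `k=0.04`, the relative threshold, and the function names are all assumptions, and the labeling and post-processing steps are omitted.

```python
def harris_response(img, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2 for each interior pixel.

    img is a grayscale image given as a list of rows of floats.
    """
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels keep a zero gradient.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            sxx = syy = sxy = 0.0
            # Sum gradient products over a 3x3 box window (assumed; a
            # Gaussian window is the more common choice).
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp

def corner_map(resp, rel_threshold=0.1):
    """Binary map of pixels whose response exceeds a fraction of the peak."""
    peak = max(max(row) for row in resp)
    if peak <= 0:
        return [[0] * len(row) for row in resp]
    return [[1 if r > rel_threshold * peak else 0 for r in row] for row in resp]

def corner_density(cmap, top, left, size):
    """Fraction of corner pixels inside a size x size block."""
    block = [cmap[y][left:left + size] for y in range(top, top + size)]
    return sum(map(sum, block)) / float(size * size)
```

Text strokes produce many closely spaced corners, so blocks with high `corner_density` are natural text candidates, while straight edges (negative response) and flat regions (zero response) are rejected.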

Robust Watermarking for Digital Images in Geometric Distortions Using FP-ICA of Secant Method (할선법의 FP-ICA를 이용한 기하학적 변형에 강건한 디지털영상 워터마킹)

  • Cho Yong-Hyun
    • The KIPS Transactions:PartB / v.11B no.7 s.96 / pp.813-820 / 2004
  • This paper proposes a digital image watermarking method that is robust to geometric distortions, using independent component analysis (ICA) with a fixed-point (FP) algorithm based on the secant method. The FP algorithm of the secant method is applied for better separation time and rate, and ICA is applied to remove the need for prior knowledge of the original image, key, and watermark, such as their locations and sizes. The proposed method embeds the watermark into the spatial domain of the original image. The proposed watermarking technique has been applied to Lena, a key, and two watermarks (text and Gaussian noise). The simulation results show that the proposed method extracts the original images faster and at a better rate than the FP algorithm of the Newton method, and that its watermarking is robust to geometric distortions such as resizing, rotation, and cropping. In particular, the watermark with Gaussian noise has better extraction performance than the watermark with text, since Gaussian noise has a lower correlation coefficient with the original and key images than text does. ICA-based watermarking does not require prior knowledge of the original images.

A Semantic Text Model with Wikipedia-based Concept Space (위키피디어 기반 개념 공간을 가지는 시멘틱 텍스트 모델)

  • Kim, Han-Joon;Chang, Jae-Young
    • The Journal of Society for e-Business Studies / v.19 no.3 / pp.107-123 / 2014
  • Current text mining techniques suffer from the problem that conventional text representation models cannot express the semantic or conceptual information of textual documents written in natural languages. The conventional text models, which include the vector space model, Boolean model, statistical model, and tensor space model, represent textual documents as bags of words. These models express documents only with the term literals used for indexing and the frequency-based weights of the corresponding terms; that is, they ignore semantic information, sequential order information, and structural information of terms. Most text mining techniques have been developed assuming that the given documents are represented with 'bag-of-words' based text models. However, in the current big data era, a new paradigm of text representation model is required that can analyse huge amounts of textual documents more precisely. Our text model regards the 'concept' as an independent space on a par with the 'term' and 'document' spaces used in the vector space model, and it expresses the relatedness among the three spaces. To develop the concept space, we use Wikipedia data, each of which defines a single concept. Consequently, a document collection is represented as a 3-order tensor with semantic information, and the proposed model is called the text cuboid model in our paper. Through experiments using the popular 20NewsGroup document corpus, we demonstrate the superiority of the proposed text model in terms of document clustering and concept clustering.

Text Segmentation from Images with Various Light Conditions Based on Gaussian Mixture Model

  • Tran, Khoa Anh;Lee, Gueesang
    • International Journal of Contents / v.9 no.1 / pp.1-5 / 2013
  • The standard Gaussian Mixture Model (GMM) is a well-known method for image segmentation. However, one of its problems is that it treats pixels as independent of each other, which can make the segmentation results sensitive to noise. This explains why some existing algorithms still cannot segment text from the background clearly. Therefore, we present a new method that incorporates the spatial relationship between a pixel and its neighbors inside a $3 \times 3$ window to segment the text. Our approach works well with images containing text of different sizes, shapes, or colors under lighting changes or complex backgrounds. Experimental results demonstrate the robustness, accuracy, and effectiveness of the proposed model in image segmentation compared to other methods.

Dysarthric speaker identification with different degrees of dysarthria severity using deep belief networks

  • Farhadipour, Aref;Veisi, Hadi;Asgari, Mohammad;Keyvanrad, Mohammad Ali
    • ETRI Journal / v.40 no.5 / pp.643-652 / 2018
  • Dysarthria is a degenerative disorder of the central nervous system that affects the control of articulation and pitch; therefore, it affects the uniqueness of the sound produced by the speaker. Hence, dysarthric speaker recognition is a challenging task. In this paper, a feature-extraction method based on deep belief networks is presented for the task of identifying a speaker suffering from dysarthria. The effectiveness of the proposed method is demonstrated and compared with well-known Mel-frequency cepstral coefficient features. For classification purposes, the use of a multi-layer perceptron neural network is proposed with two structures. Our evaluations using the universal access speech database produced promising results and outperformed other baseline methods. In addition, speaker identification under both text-dependent and text-independent conditions is explored. The highest accuracy achieved using the proposed system is 97.3%.

Estimating Media Environments of Fashion Contents through Semantic Network Analysis from Social Network Service of Global SPA Brands (패션콘텐츠 미디어 환경 예측을 위한 해외 SPA 브랜드의 SNS 언어 네트워크 분석)

  • Jun, Yuhsun
    • Journal of the Korean Society of Clothing and Textiles / v.43 no.3 / pp.427-439 / 2019
  • This study investigated the semantic network of the fashion images and SNS text used by global SPA brands over the last seven years, in terms of the quantity and quality of data generated by fast-changing fashion trends and the fashion content-based media environment. The research method examined frequency, density, and repeated keywords, visualized the results using the UCINET 6.347 program, and classified the text related to fashion images on the social networks used by global SPA brands. The conclusions of the study are as follows. A common aspect of global SPA brands, judging from the text extracted from SNS, is that exposure through product images is considered important for sales. The brands differ in the following respects. First, ZARA consistently exposes marketing that uses a variety of professions and nationalities on SNS. Second, UNIQLO exposes its collaboration promotions on SNS while steadily exposing basic items. Third, H&M showed some results that distinguish it from other brands: its connectivity with each cluster category was remarkably independent.

Passage Retrieval based on Tracing Topic Continuity and Transition by Using Field-Associated Term (분야연상어를 이용한 화제의 계속성과 전환성을 추적하는 단락분할 방법)

  • Lee, Sang-Kon
    • The KIPS Transactions:PartB / v.10B no.1 / pp.57-66 / 2003
  • We propose a technique to extract a relevant passage from a text collection based on field-associated terms, since they help concentrate text relevant to the user's query. Documents are supposed to be managed as a whole without any segmentation into small pieces, but the method presented is independent of any text-embedded auxiliary information and is based on topic continuity and transition. To retrieve sentences or passages relevant to users' needs, we present a passage retrieval technique that uses the occurrence frequency of field-associated terms to delimit text that is likely to be relevant to a particular topic, considering the continuity and transition of the topics flowing through the text. We evaluate the method on 50 Japanese documents and verify its usefulness, with 82% average precision and 63% recall.

Realization a Text Independent Speaker Identification System with Frame Level Likelihood Normalization (프레임레벨유사도정규화를 적용한 문맥독립화자식별시스템의 구현)

  • 김민정;석수영;김광수;정현열
    • Journal of the Institute of Convergence Signal Processing / v.3 no.1 / pp.8-14 / 2002
  • In this paper, we implement a real-time text-independent speaker recognition system using a Gaussian mixture model (GMM), and apply a frame-level likelihood normalization method that has proven effective in verification systems. The system has three parts: front-end, training, and recognition. In the front-end, cepstral mean normalization and silence removal are applied to account for variations in a speaker's speech. In training, a Gaussian mixture model is used to model each speaker's acoustic features, and maximum likelihood estimation is used to optimize the GMM parameters. In recognition, the likelihood score is calculated at the frame level from the speaker models and test data. Text-independent sentences are used as test sentences. The ETRI 445 and KLE 452 databases are used for training and testing, and cepstral coefficients and regression coefficients are used as feature parameters. The experimental results show that the frame-level likelihood method achieves a higher recognition rate than the conventional method, independently of the number of registered speakers.
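The recognition step can be illustrated with a minimal Python sketch of GMM scoring with frame-level normalization. The abstract does not give the exact normalization formula, so this sketch assumes one common form: each speaker's frame log-likelihood is reduced by the log of the average likelihood of all other registered speakers. The function names and the toy models in the usage note are hypothetical.

```python
import math

def gmm_frame_loglik(frame, gmm):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) components.
    """
    comps = []
    for weight, means, variances in gmm:
        ll = math.log(weight)
        for x, m, v in zip(frame, means, variances):
            ll += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        comps.append(ll)
    top = max(comps)  # log-sum-exp over components for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comps))

def log_mean_exp(values):
    """Log of the arithmetic mean of exp(values), computed stably."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))

def identify(frames, speaker_gmms):
    """Return (best speaker, scores) using frame-level normalized scores.

    Assumed normalization: per frame, subtract the log of the mean
    likelihood of the other speakers. Requires at least two speakers.
    """
    scores = {name: 0.0 for name in speaker_gmms}
    for frame in frames:
        lls = {n: gmm_frame_loglik(frame, g) for n, g in speaker_gmms.items()}
        for name in scores:
            others = [v for n, v in lls.items() if n != name]
            scores[name] += lls[name] - log_mean_exp(others)
    return max(scores, key=scores.get), scores
```

For example, with two toy one-dimensional models `{"A": [(1.0, [0.0], [1.0])], "B": [(1.0, [5.0], [1.0])]}`, frames near 0 are attributed to `"A"`. Because the normalizer differs per speaker, it can change the ranking relative to a plain sum of log-likelihoods, which is the point of normalizing at the frame level.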
