• Title/Summary/Keyword: 어휘정보 (lexical information)

Establishment of ITS Policy Issues Investigation Method in the Road Section applied Textmining (텍스트마이닝을 활용한 도로분야 ITS 정책이슈 탐색기법 정립)

  • Oh, Chang-Seok;Lee, Yong-taeck;Ko, Minsu
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.15 no.6 / pp.10-23 / 2016
  • Given the growing need for careful scrutiny based on big data, this study develops and applies a method for identifying audit issues related to ITS policies and programs. To this end, the auditing process of the Board of Audit and Inspection was combined with the theoretical framework of boundary analysis proposed by William Dunn, which serves as the analysis tool for audit issues. We then apply text mining to computerize the analysis tool, since text mining resembles boundary analysis in the way it approaches meta-problems. Specifically, we applied an antisymmetry-symmetry compound lexeme-based LDA model built on the Latent Dirichlet Allocation (LDA) methodology proposed by David Blei. A case analysis identified several prime issues: insufficient collection of traffic information by the Urban Traffic Information System operated by the National Police Agency, overlap problems between the Ministry of Land, Infrastructure and Transport and the Advanced Traffic Management System, and fabrication of mileage on digital tachographs.
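
To make the topic-modeling step concrete, here is a minimal sketch of standard LDA fitted by collapsed Gibbs sampling. It is not the antisymmetry-symmetry compound lexeme-based variant named in the abstract, whose details are not given here. The function name `lda_gibbs`, the hyperparameters, and the toy documents are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, num_topics, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Standard LDA via collapsed Gibbs sampling (illustrative, not the paper's model)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]            # topic counts per document
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics                          # tokens assigned to each topic
    assign = []                                             # topic of every token

    def update(d, w, z, delta):
        doc_topic[d][z] += delta
        topic_word[z][w] += delta
        topic_total[z] += delta

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(num_topics) for _ in doc])
        for w, z in zip(doc, assign[d]):
            update(d, w, z, +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)              # remove the current token
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * len(vocab))
                    for k in range(num_topics)
                ]
                assign[d][i] = rng.choices(range(num_topics), weights=weights)[0]
                update(d, w, assign[d][i], +1)              # add it back under the new topic
    return doc_topic, topic_word

# Hypothetical toy documents; real input would be tokenized audit-related text.
TOY_DOCS = [
    ["traffic", "information", "collection", "traffic"],
    ["tachograph", "mileage", "fabrication"],
    ["traffic", "information", "system"],
    ["mileage", "tachograph", "digital"],
]
doc_topic, topic_word = lda_gibbs(TOY_DOCS, num_topics=2)
```

Each row of `doc_topic` counts how many tokens of a document fall under each topic, and each entry of `topic_word` counts how often a word is assigned to that topic. The highest-count words per topic are the candidate issues an analyst would then inspect.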

Addressing Low-Resource Problems in Statistical Machine Translation of Manual Signals in Sign Language (말뭉치 자원 희소성에 따른 통계적 수지 신호 번역 문제의 해결)

  • Park, Hancheol;Kim, Jung-Ho;Park, Jong C.
    • Journal of KIISE / v.44 no.2 / pp.163-170 / 2017
  • Despite the rise of studies on spoken-to-sign language translation, the low-resource problems of sign language corpora have rarely been addressed. As a first step towards translating from spoken to sign language, we address the problems that arise from resource scarcity when translating spoken language into manual signals using statistical machine translation techniques. More specifically, we propose three preprocessing methods: 1) paraphrase generation, which increases the size of the corpora; 2) lemmatization, which increases the frequency of each word in the corpora and the translatability of new input words in spoken language; and 3) elimination of function words that are not glossed into manual signals, which matches the corresponding constituents of the bilingual sentence pairs. In our experiments, we used different types of English-American Sign Language parallel corpora. The results showed that each method, and the combination of the methods, improved the quality of manual signal translation regardless of the type of corpus.
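
As a rough illustration of the second and third preprocessing steps (lemmatization and function-word elimination), the sketch below uses a tiny hand-written lemma table and function-word list. Both tables and the name `preprocess` are hypothetical. A real system would rely on a proper lemmatizer and on the glossing conventions of the sign language corpus.

```python
# Hypothetical lookup tables, for illustration only.
LEMMAS = {"went": "go", "stores": "store", "is": "be", "are": "be"}
FUNCTION_WORDS = {"the", "a", "an", "to", "of", "be"}

def preprocess(sentence):
    """Lemmatize each token, then drop function words that have no manual-signal gloss."""
    tokens = sentence.lower().split()
    lemmas = [LEMMAS.get(tok, tok) for tok in tokens]
    return [tok for tok in lemmas if tok not in FUNCTION_WORDS]

result = preprocess("He went to the stores")
```

Here `result` is `["he", "go", "store"]`: inflected forms collapse onto one lemma, which raises word frequency in a small corpus, and the tokens without a gloss disappear from the source side of the sentence pair.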

A Document Sentiment Classification System Based on the Feature Weighting Method Improved by Measuring Sentence Sentiment Intensity (문장 감정 강도를 반영한 개선된 자질 가중치 기법 기반의 문서 감정 분류 시스템)

  • Hwang, Jae-Won;Ko, Young-Joong
    • Journal of KIISE:Software and Applications / v.36 no.6 / pp.491-497 / 2009
  • This paper proposes a new feature weighting method for document sentiment classification. The proposed method takes into account differences in sentiment intensity among the sentences of a document. Sentiment features consist of sentiment vocabulary words, and their sentiment intensity scores are estimated using chi-square statistics. The sentiment intensity of each sentence is then measured from the chi-square values of the sentiment features it contains. Finally, the calculated sentence intensity values are applied to the TF-IDF weighting of all features in the document. We evaluate the proposed method using a support vector machine. Our experimental results show that the proposed method performs about 2.0% better than a baseline that does not consider sentence sentiment intensity.
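
The chi-square scoring step can be sketched as follows. It uses the usual 2x2 contingency form of the chi-square statistic for a term and a class. Treating sentence intensity as the sum of the scores of the sentiment words in the sentence is an assumption, since the abstract does not give the exact combination rule. The function names are likewise illustrative.

```python
def chi_square(a, b, c, d):
    """Chi-square of a term for a class from a 2x2 contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class without the term
    d: documents outside the class without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def sentence_intensity(tokens, scores):
    """Assumed rule: sum the chi-square scores of the sentiment words in the sentence."""
    return sum(scores.get(tok, 0.0) for tok in tokens)

# Hypothetical counts: "great" occurs only in positive documents, "screen" evenly in both.
scores = {"great": chi_square(20, 0, 0, 20), "screen": chi_square(10, 10, 10, 10)}
intensity = sentence_intensity(["great", "screen", "overall"], scores)
```

A term that separates the classes perfectly gets the maximum score for the table size (40.0 here), and a term spread evenly across classes gets 0.0. Sentences dominated by discriminative sentiment words therefore receive a higher intensity.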

An Automatic Korean Word Spacing System for Devices with Low Computing Power (저사양 기기를 위한 한국어 자동 띄어쓰기 시스템)

  • Song, Yeong-Kil;Kim, Hark-Soo
    • The KIPS Transactions:PartB / v.16B no.4 / pp.333-340 / 2009
  • Most previous automatic word spacing systems are not suitable for mobile devices with relatively low computing power because they require many system resources. We propose an automatic word spacing system that needs only modest memory and simple numerical computations, making it suitable for such devices. The proposed system is a two-step model consisting of a statistical system and a rule-based system. To reduce memory usage, the statistical system first corrects word spacing errors using a modified hidden Markov model based on character unigrams. Then, to increase accuracy, the rule-based system re-corrects miscorrected word spaces using lexical rules based on character bigrams or longer sequences. In the experiments, the proposed system showed a relatively high accuracy of 94.14% despite a small memory footprint of about 1MB.
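
The statistical step can be pictured with a plain two-state hidden Markov model decoded by the Viterbi algorithm, where state 1 means "a space follows this character" and state 0 means "no space follows". This is a generic sketch, not the modified model of the paper. The probability tables below are made-up toy values, and `viterbi_spacing` is a hypothetical name.

```python
import math

# Toy parameters (assumptions): character-unigram emissions per state,
# state transitions, and start probabilities.
EMIT = {0: {"a": 0.9, "b": 0.1}, 1: {"a": 0.1, "b": 0.9}}
TRANS = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}
START = {0: 0.5, 1: 0.5}

def viterbi_spacing(text, emit=EMIT, trans=TRANS, start=START, floor=1e-6):
    """Insert spaces into unspaced text using a two-state HMM over characters."""
    if not text:
        return ""
    states = (0, 1)

    def logp(p):
        return math.log(max(p, floor))      # floor avoids log(0) for unseen characters

    # best[s] = (log-probability, state path) of the best path ending in state s
    best = {s: (logp(start[s]) + logp(emit[s].get(text[0], 0.0)), [s]) for s in states}
    for ch in text[1:]:
        new_best = {}
        for s in states:
            score, path = max((best[p][0] + logp(trans[p][s]), best[p][1]) for p in states)
            new_best[s] = (score + logp(emit[s].get(ch, 0.0)), path + [s])
        best = new_best
    tags = max(best.values())[1]

    out = []
    for i, (ch, tag) in enumerate(zip(text, tags)):
        out.append(ch)
        if tag == 1 and i < len(text) - 1:  # never emit a trailing space
            out.append(" ")
    return "".join(out)

spaced = viterbi_spacing("abab")
```

With these toy tables, "b" strongly prefers the "space follows" state, so `"abab"` becomes `"ab ab"`. Because emissions depend on single characters only, the model needs just one small table per state, which is in the spirit of the low-memory design the abstract describes.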

Effective Foreign Language Learning with Situated Cognition in the MOO based Environments (상황인지(Situated Cognition)원리를 적용한 효과적인 외국어 학습 방안 연구: MOO 학습환경을 중심으로)

  • Lee, Seung-Hee;Seo, Yun-Kyoung
    • Journal of The Korean Association of Information Education / v.6 no.1 / pp.64-74 / 2002
  • The purpose of this paper is to review the importance of situated cognition and the features of MOO (Multi-user Object Oriented) environments for effective foreign language learning. Learning a foreign language goes beyond simply recalling the vocabulary or expressions of the target language. Just as children naturally acquire their mother tongue through active social interaction with the people around them, foreign languages should be taught in circumstances and contexts that allow authentic use of the language. The MOO, a text-based virtual reality built on spatial metaphors, has been gaining considerable attention in education thanks to its strong support for social context and learner interaction. This paper examines the features of the MOO as a foreign language learning environment in terms of activity, context, and interaction.

An Experimental Study on Semantic Searches for Image Data Using Structured Social Metadata (구조화된 소셜 메타데이터를 활용한 이미지 자료의 시맨틱 검색에 관한 실험적 연구)

  • Kim, Hyun-Hee;Kim, Yong-Ho
    • Journal of the Korean Society for Library and Information Science / v.44 no.1 / pp.117-135 / 2010
  • We designed a structured folksonomy system in which queries can be expanded through tag control, where equivalent, synonymous, or related tags are bound together, in order to improve the retrieval efficiency (recall and precision) of image data. We then evaluated the proposed system by comparing it with a tag-based system without tag control in terms of recall, precision, and user satisfaction. We also investigated which query expansion method is the most efficient in terms of retrieval performance. The experimental results showed that the recall, precision, and user satisfaction rates of the proposed system are statistically higher than those of the tag-based system. In addition, there are significant differences among the precision rates of the query expansion methods, but no significant differences among their recall rates. The proposed system can serve as a guide on how to effectively index and retrieve the digital content of digital library systems in the Library 2.0 era.
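
A small sketch of tag-controlled query expansion and of the two retrieval measures follows. The tag groups, the image index, and the relevance judgments are invented for illustration, and the function names are not taken from the paper.

```python
# Hypothetical tag-control table: each set binds equivalent, synonymous, or related tags.
TAG_GROUPS = [{"car", "automobile", "auto"}, {"cat", "kitten"}]

# Hypothetical image index: image id -> tags assigned by users.
INDEX = {
    "img1": {"car", "red"},
    "img2": {"automobile"},
    "img3": {"cat"},
    "img4": {"auto", "race"},
}

def expand_query(tag):
    """Return the tag together with every tag bound to it by tag control."""
    for group in TAG_GROUPS:
        if tag in group:
            return set(group)
    return {tag}

def search(tag, index, expand=True):
    """Retrieve images that carry the query tag or, if expanded, any bound tag."""
    tags = expand_query(tag) if expand else {tag}
    return {img for img, img_tags in index.items() if tags & img_tags}

def precision_recall(retrieved, relevant):
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant = {"img1", "img2", "img4"}            # assumed relevance judgments for "car"
plain = precision_recall(search("car", INDEX, expand=False), relevant)
expanded = precision_recall(search("car", INDEX), relevant)
```

In this toy case the unexpanded query finds only `img1`, while the expanded query also reaches the images tagged "automobile" and "auto", so recall rises from one third to 1.0 with precision unchanged. Real expansions can also pull in loosely related tags, which is why the abstract reports precision differences between expansion methods.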

A Study on Fun Elements of Web 2.0 Blog Widget (Web 2.0 블로그 위젯의 재미 요소에 대한 연구)

  • Choi, Sung-Kyu;Kim, Kee-Sung;Jang, Seok-Hyun;Whang, Min-Cheol
    • 한국HCI학회:학술대회논문집 / 2009.02a / pp.785-790 / 2009
  • Widgets are an instrument for expressing a user's character and highlighting the value of a blog. The word is a compound of "Windows" and "Gadget"; widgets are small functional programs displayed on screen as graphical user interface (GUI) tools, offering the services a user wants to see. On operating systems, the Web, and mobile platforms, widgets deliver information, convenience, and efficiency. However, widgets have never fully satisfied users because they have focused on transmitting information and representing circumstances rather than on fun. This study aims to identify the fun elements that interest users and to categorize these elements by widget type. Because the fun elements of widgets have never been defined, we use fun elements from the design and product fields together with emotional words representative of affectivity. We administered an online questionnaire to blog users. The widgets were selected by popularity from among domestic and Japanese widgets. The questionnaire used a 5-point scale based on user preferences to identify the elements that are fun.

A Study on the Speech Recognition for DDD Area - Name Using Vector Quantization with Time Information (시간 정보와 VQ를 이용한 DDD 지역명 인식에 관한 연구)

  • LEE S. K.;LEE K. S.;ANN T. O.;CHO H. J.;BYON Y. C.;KIM S. H.
    • The Journal of the Acoustical Society of Korea / v.8 no.5 / pp.102-112 / 1989
  • In this paper, we study speaker-independent isolated word recognition of DDD area names using vector quantization, choosing a total of 146 DDD area names as the recognition vocabulary for application to a dialing system. We built the codebook using 12th-order LPC cepstrum coefficients, used the minsum and minimax methods to find the centroid, and applied a 3-splitting rule to codebook generation. Single-section and multi-section codebooks with time information were generated, and an overlapped-section codebook was also used. The experimental results showed that the minsum method outperformed the minimax method, and the system achieved an accuracy of about 90 percent in the speaker-independent case.
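
The two centroid rules can be contrasted with a short sketch. It assumes the common reading in which the centroid is chosen from among the cluster members: minsum picks the member with the smallest total distance to the others, and minimax picks the member with the smallest worst-case distance. The abstract does not spell out the definitions, so this reading, the Euclidean distance, and the toy one-dimensional data are all assumptions.

```python
import math

def minsum_centroid(cluster):
    """Member vector that minimizes the sum of distances to all members."""
    return min(cluster, key=lambda c: sum(math.dist(c, v) for v in cluster))

def minimax_centroid(cluster):
    """Member vector that minimizes the maximum distance to any member."""
    return min(cluster, key=lambda c: max(math.dist(c, v) for v in cluster))

def quantize(vector, codebook):
    """Index of the codeword nearest to the input vector."""
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))

# Toy 1-D cluster with one outlier; real vectors would be LPC cepstrum coefficients.
CLUSTER = [(0.0,), (1.0,), (2.0,), (3.0,), (10.0,)]
ms = minsum_centroid(CLUSTER)
mm = minimax_centroid(CLUSTER)
```

On this data minsum selects `(2.0,)` while minimax selects `(3.0,)`, because the outlier at 10.0 pulls the minimax choice towards it. That sensitivity to outliers is one plausible reason for a minsum codebook to perform better, although the abstract itself reports only the outcome.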

A Word Embedding used Word Sense and Feature Mirror Model (단어 의미와 자질 거울 모델을 이용한 단어 임베딩)

  • Lee, JuSang;Shin, JoonChoul;Ock, CheolYoung
    • KIISE Transactions on Computing Practices / v.23 no.4 / pp.226-231 / 2017
  • Word representation, an important area of natural language processing (NLP) based on machine learning, is a method of representing a word not as text but as a distinguishable symbol. Existing word embeddings rely on large corpora so that words occurring near each other in text are positioned close together. However, corpus-based word embedding needs several corpora because of word occurrence frequency and the growing number of words. In this paper, word embedding is performed using dictionary definitions and semantic relationship information (hypernyms and antonyms). Words are trained using the feature mirror model (FMM), a modified Skip-Gram (Word2Vec). Words with similar senses have similar vectors. Furthermore, it was possible to distinguish the vectors of antonyms.
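
One way to picture dictionary-based training data for a Skip-Gram style model is sketched below. The feature mirror model itself is not described in the abstract, so this only shows how (word, feature, label) pairs might be derived from definitions and semantic relations. Labeling antonym pairs as negative examples is an assumption, and the entries and the name `training_pairs` are hypothetical.

```python
# Hypothetical dictionary entries, for illustration only.
ENTRIES = {
    "dog": {"definition": "domesticated canine animal", "hypernyms": ["animal"]},
    "hot": {"definition": "high temperature", "antonyms": ["cold"]},
}

def training_pairs(entries):
    """Build (word, feature, label) triples: label 1 = related, 0 = should be pushed apart."""
    pairs = []
    for word, info in entries.items():
        for feature in info.get("definition", "").split():
            pairs.append((word, feature, 1))      # definition words act as context
        for hypernym in info.get("hypernyms", []):
            pairs.append((word, hypernym, 1))     # hypernyms are positive features
        for antonym in info.get("antonyms", []):
            pairs.append((word, antonym, 0))      # assumed: antonyms are negative examples
    return pairs

pairs = training_pairs(ENTRIES)
```

Unlike a corpus window, the context of each word here comes from its own dictionary entry, so rare words get as many training pairs as frequent ones.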

Evaluation of the Discordance between Sentence Polarities and Keyword Polarities by Using MUSE Sentiment-Annotated Corpora (MUSE 감성주석코퍼스를 활용한 문장 극성과 키워드 극성간의 불일치 현상에 대한 분석)

  • Cho, Donghee;Shin, Donghyok;Joo, Heejin;Chae, Byoungyeol;Cao, Wenkai;Nam, Jeesun
    • 한국어정보학회:학술대회논문집 / 2016.10a / pp.195-200 / 2016
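  • Using the MUSE sentiment corpus, this study analyzes how often the polarity of a sentence agrees or disagrees with the polarity of its keywords, and in particular argues for the need to study the types in which sentence polarity and keyword polarity do not match. From the MUSE sentiment-annotated corpora built by DICORA, we extracted 1,257 positive and 1,935 negative sentences from the IT review domain, and 2,418 positive and 432 negative sentences from the restaurant review domain. We built an LGG using UNITEX and applied it to these corpora. Keywords of the opposite polarity appeared in positive or negative sentences at a rate of about 4 to 16% across the two domains, while polarity expressed at the phrase or sentence level rather than through a single keyword appeared at a relatively high rate of about 25 to 40%. This shows that a reliable automatic opinion classification system can be realized only if, rather than relying on keyword polarity, meaningful research is carried out on the cases where sentence and keyword polarity disagree, such as types in which a polarity shifting device (PSD) that reverses the polarity of the whole sentence is realized, or types in which no polarity word occurs in the sentence but polarity is expressed at the phrase or sentence level.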
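
The first of the reported measurements, the share of sentences that contain a keyword of the opposite polarity, can be computed along the following lines. The lexicon, the sample sentences, and the name `opposite_keyword_rate` are invented for illustration. The study itself used an LGG built with UNITEX rather than a flat word list.

```python
# Hypothetical polarity lexicon and annotated sentences (tokens, sentence label).
LEXICON = {"great": "pos", "bad": "neg", "slow": "neg"}
SENTENCES = [
    (["great", "screen"], "pos"),
    (["not", "bad", "at", "all"], "pos"),     # negative keyword in a positive sentence
    (["slow", "battery"], "neg"),
]

def opposite_keyword_rate(sentences, lexicon):
    """Fraction of sentences with at least one keyword whose polarity differs from the label."""
    mismatched = 0
    for tokens, label in sentences:
        polarities = {lexicon[tok] for tok in tokens if tok in lexicon}
        if any(p != label for p in polarities):
            mismatched += 1
    return mismatched / len(sentences) if sentences else 0.0

rate = opposite_keyword_rate(SENTENCES, LEXICON)
```

In the toy data one sentence out of three is flagged, the one where "not" reverses the polarity of "bad". A purely keyword-based classifier would mislabel exactly this kind of sentence.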
