• Title/Summary/Keyword: Word order


Document Classification Model Using Web Documents for Balancing Training Corpus Size per Category

  • Park, So-Young;Chang, Juno;Kihl, Taesuk
    • Journal of information and communication convergence engineering / v.11 no.4 / pp.268-273 / 2013
  • In this paper, we propose a document classification model using Web documents as a part of the training corpus in order to resolve the imbalance of the training corpus size per category. For the purpose of retrieving the Web documents closely related to each category, the proposed document classification model calculates the matching score between word features and each category, and generates a Web search query by combining the higher-ranked word features and the category title. Then, the proposed document classification model sends each combined query to the open application programming interface of the Web search engine, and receives the snippet results retrieved from the Web search engine. Finally, the proposed document classification model adds these snippet results as Web documents to the training corpus. Experimental results show that the method that considers the balance of the training corpus size per category exhibits better performance in some categories with small training sets.
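The query-generation step described in this abstract can be illustrated with a minimal sketch. The abstract does not define the matching score, so the score below (the share of a word's occurrences that fall in a category) is an assumption, as are the function names `matching_scores` and `build_queries` and the `top_k` parameter. The call to a Web search API is omitted.

```python
from collections import Counter


def matching_scores(docs_by_category):
    """Score each word feature against each category.

    Assumption: score = share of the word's total occurrences that fall
    in the category. The paper's actual matching score is not given in
    the abstract.
    """
    per_cat = {
        cat: Counter(w for doc in docs for w in doc.lower().split())
        for cat, docs in docs_by_category.items()
    }
    total = Counter()
    for counts in per_cat.values():
        total.update(counts)
    return {
        cat: {w: c / total[w] for w, c in counts.items()}
        for cat, counts in per_cat.items()
    }


def build_queries(docs_by_category, top_k=2):
    """Combine the category title with its top_k highest-scoring words."""
    scores = matching_scores(docs_by_category)
    queries = {}
    for cat, word_scores in scores.items():
        # Sort by score (descending), then alphabetically so ties are stable.
        ranked = sorted(word_scores, key=lambda w: (-word_scores[w], w))
        queries[cat] = " ".join([cat] + ranked[:top_k])
    return queries


corpus = {
    "sports": ["goal match goal", "match team"],
    "finance": ["stock market", "stock team"],
}
queries = build_queries(corpus)
```

Each resulting query string would then be sent to a search engine, and the returned snippets added to the training data of the corresponding category.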

Speech Recognition Using HMM Based on Fuzzy (피지에 기초를 둔 HMM을 이용한 음성 인식)

  • 안태옥;김순협
    • Journal of the Korean Institute of Telematics and Electronics B / v.28B no.12 / pp.68-74 / 1991
  • This paper proposes a fuzzy-based HMM for speaker-independent speech recognition. In this method, multi-observation sequences are obtained by assigning probabilities, through a fuzzy rule, to VQ codebook entries in order of increasing distance. An HMM is then generated from these multi-observation sequences, and during recognition the word with the highest probability is selected as the recognized word. The recognition vocabulary consists of 146 DDD area names, and the feature parameter is the 10th-order LPC cepstrum coefficients. For comparison, we also run experiments with DP, MSVQ, and a general HMM under the same conditions and data. The experimental results show that the proposed fuzzy HMM is superior to the DP method, MSVQ, and the general HMM in recognition rate and computation time.


Comparison of Neural Network Techniques for Text Data Analysis

  • Kim, Munhee;Kang, Kee-Hoon
    • International Journal of Advanced Culture Technology / v.8 no.2 / pp.231-238 / 2020
  • Generally, sequential data refers to data having continuity. Text data, a representative type of unstructured data, is also sequential in that the meaning of the preceding word is needed to understand the meaning of the following word or context. Many techniques for analyzing sequential data such as text have been proposed. This paper introduces four neural network methods: 1d-CNN, LSTM, BiLSTM, and C-LSTM. These methods are then used to classify IMDb movie review data into two classes, and their performance is compared in terms of accuracy and analysis time.

A Study of Wangyun's Theory of Gujinzi - Focusing on Fenbiewen and Leizengzi (왕균(王筠)의 고금자(古今字) 이론 연구 - 분별문(分別文)과 누증자(累增字)를 위주로)

  • Oh, Jae-Joong
    • Cross-Cultural Studies / v.39 / pp.461-484 / 2015
  • Wangyun was a prominent scholar of the Qing dynasty. Shuowenshili is his masterpiece, a study of Shuowenjiezi. Shuowenshili discusses the difference between Fenbiewen and Leizengzi. It emphasizes that adding meanings to a word tends to help express it more exactly. Wangyun's discussion of Gujinzi refers to the phenomenon in which different characters were used in different periods to record the same word. Wangyun's use of Gujinzi serves to explain the relationship between characters. Fenbiewen and Leizengzi are specialized terms for the evolution of characters. Wangyun's theory of Fenbiewen, Leizengzi, and Gujinzi is an especially important linguistic finding. He made generous contributions to the linguistics of China, in particular to understanding the meaning and development of Chinese characters.

GCC based Compiler Construction for Compact DSP32

  • Cho, Myeong-Jin;Lee, Ho-Kyoon;Huong, Giang Nguyen Thi;Kim, Seon-Wook;Han, Young-Sun;Um, Jung-Young
    • Proceedings of the Korea Information Processing Society Conference / 2011.04a / pp.43-45 / 2011
  • A Very Long Instruction Word (VLIW) processor executes multiple instructions in parallel. To achieve higher performance, i.e., higher parallelism, a VLIW compiler groups as many instructions as possible into one word. In this paper, we show how to construct a VLIW C compiler based on GCC for CDSP32 (Compact Digital Signal Processor 32-bit), an embedded DSP that issues two instructions in one VLIW. We also evaluated the compiler on the EEMBC benchmark; the experimental results showed that VLIW instruction scheduling reduced the total number of dynamic instructions by 18% on average compared with compilation without it.
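To make the idea of grouping instructions into one word concrete, here is a toy sketch of two-issue packing. It is not GCC's scheduler and not the authors' implementation: it keeps instructions in program order and packs greedily, and the instruction representation `(name, dest, sources)` and the helper names are hypothetical. The dependence check is deliberately conservative.

```python
def independent(a, b):
    """Return True if instruction b may share a bundle with earlier instruction a.

    Each instruction is a hypothetical (name, dest, sources) tuple.
    """
    _, a_dest, a_srcs = a
    _, b_dest, b_srcs = b
    return (
        a_dest not in b_srcs      # no read-after-write
        and b_dest not in a_srcs  # no write-after-read (conservative)
        and a_dest != b_dest      # no write-after-write
    )


def pack_bundles(instrs, width=2):
    """Greedily pack in-order instructions into bundles of at most `width`."""
    bundles = []
    for ins in instrs:
        last = bundles[-1] if bundles else None
        if (
            last is not None
            and len(last) < width
            and all(independent(prev, ins) for prev in last)
        ):
            last.append(ins)
        else:
            bundles.append([ins])
    return bundles


program = [
    ("add", "r1", ("r2", "r3")),
    ("mul", "r4", ("r5", "r6")),
    ("sub", "r7", ("r1", "r4")),
    ("ld", "r8", ("r9",)),
]
bundles = pack_bundles(program)
bundle_names = [[ins[0] for ins in bundle] for bundle in bundles]
```

Here four instructions fit into two bundles. A real scheduler would also reorder instructions to expose more independent pairs, which this sketch does not attempt.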

Development of An Automatic Classification System for Game Reviews Based on Word Embedding and Vector Similarity (단어 임베딩 및 벡터 유사도 기반 게임 리뷰 자동 분류 시스템 개발)

  • Yang, Yu-Jeong;Lee, Bo-Hyun;Kim, Jin-Sil;Lee, Ki Yong
    • The Journal of Society for e-Business Studies / v.24 no.2 / pp.1-14 / 2019
  • Because of the characteristics of game software, it is important to quickly identify users' needs and reflect them in the game after its launch. However, most sites where users can download games and post reviews, such as the Google Play Store, provide only very limited and ambiguous classification categories for game reviews. Therefore, in this paper, we develop an automatic classification system that sorts game reviews into categories that are clearer and more useful for game providers. The system converts words in reviews into vectors using word2vec, a representative word embedding model, and classifies reviews into the most relevant categories by measuring the similarity between those vectors and each category. In particular, to choose the similarity measure, which directly affects the classification performance of the system, we compared three representative measures in a real environment: Euclidean similarity, cosine similarity, and extended Jaccard similarity. Furthermore, to allow a review to be classified into multiple categories, we use a threshold-based multi-category classification method. Through experiments on real reviews collected from the Google Play Store, we confirmed that the system achieved up to 95% accuracy.
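The threshold-based multi-category step can be sketched as follows, using cosine similarity only. The abstract does not say how word vectors are combined into a review vector, so averaging them is an assumption here. The toy two-dimensional embeddings, the category vectors, the threshold value, and the function names are all hypothetical stand-ins for a trained word2vec model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(words, embeddings):
    """Average the vectors of known words (assumed aggregation, see above)."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return None
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def classify(review, embeddings, category_vectors, threshold=0.7):
    """Return every category whose similarity to the review meets the threshold."""
    review_vec = mean_vector(review.lower().split(), embeddings)
    if review_vec is None:
        return []
    return sorted(
        cat
        for cat, cat_vec in category_vectors.items()
        if cosine(review_vec, cat_vec) >= threshold
    )


# Toy vectors standing in for word2vec output.
embeddings = {"lag": [1.0, 0.0], "crash": [0.9, 0.1], "price": [0.0, 1.0]}
category_vectors = {"bug": [1.0, 0.0], "payment": [0.0, 1.0]}

single = classify("lag crash", embeddings, category_vectors)
multi = classify("lag price", embeddings, category_vectors)
```

Because every category above the threshold is returned, a review that mentions both lag and pricing lands in both categories, while a review with no known words gets none.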

Origin and Transformation of the Word 'Library' in the Ancient World (고대 도서관 명칭의 기원과 변용)

  • Yoon, Hee-Yoon
    • Journal of Korean Library and Information Science Society / v.52 no.4 / pp.1-21 / 2021
  • This study traced the origin and transformation of the word library, linked with archives, in the ancient Near East, Greece, and Rome. First, the word library has two origins. One is derived from the Latin bibliothēkē, from the ancient Greek βιβλιοθήκη. The first trace is Pollux's Onomasticon in the second half of the 2nd century; if the word is considered as a set of literary texts, it is Lipsius's De Bibliothecis Syntagma in 1602. The other was established as library in the early 14th century, after Latin libraria (or librarium) was translated into Old French librairie (or librarie). The word library was coined by Chaucer in 1374. Second, the clay tablet repository that existed in the ancient Near East is close to an archive, but its official name is unknown. However, the Ashurbanipal clay tablet archive is far from the principle of respect for original order and provenance emphasized by archivists, so it is not a royal archive but a prototype of the royal library. The official name of the Library of Alexandria was 'Βιβλιοθήκη της Αλεξάνδρειας', which was later changed to 'ALEXANDRINA BYBLIOTHECE'. Third, in ancient Greece and Rome, archives and libraries were separated. Greek libraries were small libraries attached to gymnasiums and had few independent titles. The names of the Roman libraries, often attached to the public baths, mixed βιβλιοθήκη and Bibliotheca. Finally, the ancient library was succeeded by the cathedral bibliothek and was transformed into 'bayt al-hikmah' in the Islamic Empire. In Japan, China, and Korea, the Japanese-Chinese word for library was accepted at the end of the 19th century, but many issues require follow-up research.

A Theory of Intermediality and its Application in Peter Greenaway's Prospero's Books (상호매체성의 이론과 그 적용 - 피터 그리너웨이의 <프로스페로의 서재>를 중심으로)

  • PARK, Ki-Hyun
    • Cross-Cultural Studies / v.19 / pp.39-77 / 2010
  • The cinema of Peter Greenaway has consistently engaged questions of the relationship between the arts, particularly the relations of image and writing to cinema. When different types of images are correlated and merged with each other on the borders of painting, photography, film, video, and computer animation, the interrelationships of the distinct elements cause a shift in the notion of the whole image. This analysis proposes to articulate the complex relationship between the 'interartial' dimension and the 'intermedial' dimension in Peter Greenaway's film Prospero's Books (1991). While interartiality concerns the interaction between various arts, including the transition from one to another, intermediality articulates the same type of relationship between two or more media. The interactional relationship is the same on both sides; the relationship between art and media, by contrast, does not show the same symmetry. All art is based on one or more media (media are a condition of the existence of art), but no art can be reduced to the status of media. This suggests that although interartiality always involves intermediality, the converse does not hold. First, we analyse the film's self-conscious investigation into digital art and technology. Prospero's Books can be read as a daring visual essay that self-consciously investigates the technical and philosophical functions of letters, books, images, animated paintings, digital arts, and the other magical illusions that have been modern, or will be post-modern, media for representing the world. Greenaway uses both conventional film techniques and the resources of high-definition television to layer image upon image, superimposing a second or third frame within his frame. Greenaway uses the frame-within-frame as the cinematic equivalent of Shakespeare's play-within-play: it offers him the possibility to analyse the relationship between work of art, artist, and spectator.
Secondly, we analyse the relationship between the written word, the oral word, and the books. Like the written word, the oral word changes into a visual image: the linguistic richness and nuances of Shakespeare's characters turn into the powerful and authoritative, but monotone, voice of Gielgud-Prospero, who speaks the Shakespearean lines aloud, shaping the characters so powerfully through his words that they are conjured before us. In particular, each book is placed over the frame of the play's action, only partially covering the image, so that virtually every frame has at least two space-time orientations. Thirdly, we try to show how Peter Greenaway uses pictorial references to illustrate the context of the Renaissance, as well as pictorial techniques and language to question the nature of artistic representation. For example, the storm is visualised through reference to a work by Botticelli, and the storm of papers swirls around a library constructed to look like a facsimile copy of Michelangelo's Laurentiana Library in Florence. Greenaway's modern mannerism consists in imposing his own aesthetic vision and his questioning of art beyond the play's meta-theatricality: in other words, Shakespeare's text has been adapted without being betrayed.

An Adaptive Learning Rate with Limited Error Signals for Training of Multilayer Perceptrons

  • Oh, Sang-Hoon;Lee, Soo-Young
    • ETRI Journal / v.22 no.3 / pp.10-18 / 2000
  • Although an n-th order cross-entropy (nCE) error function resolves the incorrect saturation problem of the conventional error backpropagation (EBP) algorithm, the performance of multilayer perceptrons (MLPs) trained using the nCE function depends heavily on the order of nCE. In this paper, we propose an adaptive learning rate to markedly reduce the sensitivity of MLP performance to the order of nCE. Additionally, we propose limiting error signal values at output nodes for stable learning with the adaptive learning rate. Simulations of handwritten digit recognition and isolated-word recognition tasks verified that the proposed method successfully reduced the performance dependency of MLPs on the nCE order while maintaining the advantages of the nCE function.


The Linear Constituent Order of the Noun Phrase: An Optimality Theoretic Account

  • Chung, Chin-Wan
    • English Language & Literature Teaching / v.9 no.1 / pp.23-48 / 2003
  • This paper provides an analysis of the linear constituent order of the NP in three different types of languages, based on 33 languages: the NP with prenominal modifiers, the NP with postnominal modifiers, and the NP with both prenominal and postnominal modifiers (the mixed NP). Languages have NPs that feature different linear orders of the NP constituents. We attribute these different linear constituent orders within the NP to linguistic distance and to the limits imposed by constituency and adjacency. We use various kinds of alignment constraints that properly reflect the linguistic distance between the noun and each constituent. Language universals on word order provide us with some general orders of various NP constituents. If we adopt linguistic distance, the limits imposed by constituency and adjacency, and the alignment constraints, we can explain the complicated differences in NP constituent order among the languages of the world.
