• Title/Summary/Keyword: Similar Word

A Study of Development for Korean Phonotactic Probability Calculator (한국어 음소결합확률 계산기 개발연구)

  • Lee, Chan-Jong;Lee, Hyun-Bok;Choi, Hun-Young
    • The Journal of the Acoustical Society of Korea / v.28 no.3 / pp.239-244 / 2009
  • This paper develops the Korean Phonotactic Probability Calculator (KPPC), which estimates phonotactic probability in Korean. KPPC calculates positional segment frequency, position-specific biphone frequency, and position-specific triphone frequency. It also calculates neighborhood density, the number of words that sound similar to a target word. The Phonotactic Calculator developed at the University of Kansas takes a computer-readable phonemic transcription as input and calculates positional segment frequency and position-specific biphone frequency derived from 20,000 dictionary words. KPPC, by contrast, calculates positional segment frequency, position-specific biphone frequency, position-specific triphone frequency, and neighborhood density, and it accepts input in either the Korean alphabet or a computer-readable phonemic transcription. KPPC can thus identify items with high or low phonotactic probability and high or low neighborhood density.
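
Two of the quantities named in this abstract can be illustrated with a minimal sketch. It assumes that words are tuples of phoneme symbols, that positional segment frequency is a raw unweighted proportion, and that "sounds similar" means differing by one phoneme substitution, deletion, or addition (a common convention that the abstract does not itself confirm). The function names and the toy lexicon are hypothetical and are not taken from KPPC.

```python
from collections import Counter


def positional_segment_freq(lexicon):
    """Map (position, segment) to the share of words that have that
    segment at that position, among words long enough to have the position."""
    counts = Counter()
    totals = Counter()
    for word in lexicon:
        for i, seg in enumerate(word):
            counts[(i, seg)] += 1
            totals[i] += 1
    return {key: n / totals[key[0]] for key, n in counts.items()}


def is_neighbor(a, b):
    """True if b differs from a by exactly one segment substitution,
    deletion, or addition (assumed definition of a neighbor)."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    # Dropping one segment from the longer word must give the shorter one.
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def neighborhood_density(target, lexicon):
    """Count the lexicon entries that are neighbors of the target."""
    return sum(is_neighbor(target, w) for w in lexicon)


# Hypothetical toy lexicon in a made-up phonemic transcription.
lexicon = [("k", "a", "m"), ("k", "a", "n"), ("p", "a", "m"),
           ("k", "a"), ("s", "o", "n")]
density = neighborhood_density(("k", "a", "m"), lexicon)
freq_k_initial = positional_segment_freq(lexicon)[(0, "k")]
```

Biphone and triphone frequencies follow the same counting pattern with two- and three-segment windows in place of single segments.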

A Method for Measuring Inter-Utterance Similarity Considering Various Linguistic Features (다양한 언어적 자질을 고려한 발화간 유사도 측정 방법)

  • Lee, Yeon-Su;Shin, Joong-Hwi;Hong, Gum-Won;Song, Young-In;Lee, Do-Gil;Rim, Hae-Chang
    • The Journal of the Acoustical Society of Korea / v.28 no.1 / pp.61-69 / 2009
  • This paper presents an improved method for measuring inter-utterance similarity in an example-based dialogue system, which searches a dialogue database for the utterance most similar to a given user utterance in order to generate a response. Unlike general inter-sentence similarity measures, an inter-utterance similarity measure for an example-based dialogue system should consider not only word distribution but also various linguistic features that affect natural conversation, such as affirmation/negation, tense, modality, and sentence type. Previous approaches, however, do not sufficiently reflect these features. This paper proposes a new utterance similarity measure that analyzes and reflects various linguistic features to improve accuracy. In addition, by considering the substitutability of the features, the proposed method can make use of a limited number of examples. Experimental results show that the proposed method improves accuracy by 10 percentage points over the previous method.
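
The idea of combining word distribution with linguistic features can be sketched as a weighted sum. This is only an illustration under stated assumptions, not the measure proposed in the paper: the Jaccard word overlap, the equal weighting `alpha=0.5`, the exact-match feature agreement, and the feature keys `polarity`, `tense`, `modality`, and `sentence_type` are all hypothetical choices. Feature substitutability, which the paper also considers, is not modeled here.

```python
FEATURES = ("polarity", "tense", "modality", "sentence_type")  # assumed keys


def jaccard(words_a, words_b):
    """Word-overlap similarity between two token lists."""
    a, b = set(words_a), set(words_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


def utterance_similarity(u, v, alpha=0.5):
    """Blend lexical overlap with the share of linguistic features that agree.

    alpha is a hypothetical weight: 1.0 uses words only, 0.0 features only.
    """
    lexical = jaccard(u["words"], v["words"])
    agreement = sum(u[f] == v[f] for f in FEATURES) / len(FEATURES)
    return alpha * lexical + (1 - alpha) * agreement


# Two utterances with identical words but opposite polarity.
query = {"words": ["can", "you", "come"], "polarity": "affirmative",
         "tense": "present", "modality": "ability", "sentence_type": "question"}
example = dict(query, polarity="negative")
score = utterance_similarity(query, example)
```

With a purely lexical measure these two utterances would be indistinguishable; the feature term lowers the score because the polarity differs.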

Personalized Recommendation System for IPTV using Ontology and K-medoids (IPTV환경에서 온톨로지와 k-medoids기법을 이용한 개인화 시스템)

  • Yun, Byeong-Dae;Kim, Jong-Woo;Cho, Yong-Seok;Kang, Sang-Gil
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.147-161 / 2010
  • With the recent convergence of broadcasting and communication, communication services have been integrated into TV, and TV viewing has changed considerably. IPTV (Internet Protocol Television) delivers information services, movie content, and broadcasts over the Internet, combining live programs with VOD (video on demand). Because it runs over a communication network, it has become a new business area. It has also raised new technical issues: imaging technology for the service, networking technology that avoids video interruptions, security technology to protect copyright, and so on. Through the IPTV network, users can watch the programs they want when they want. However, finding programs on IPTV is difficult, whether through menus or through search. The menu approach takes a long time to reach the desired program. The search approach fails when the title, genre, or actors' names are not known, and entering text with a remote control is cumbersome. A bigger problem is that users are often unaware of the services available to them. Thus, to ease the selection of VOD services in IPTV, a personalized recommendation service is needed, one that improves user satisfaction and uses the viewer's time efficiently. This paper proposes a filtering and recommendation system that offers programs suited to each individual, saving time and addressing these shortcomings of IPTV. The proposed system collects TV program information, the user's preferred genres and detailed genres, channels, watched programs, and viewing times from individual IPTV viewing records. Similarities among programs are compared using an ontology for TV programs, because the similarity comparison makes it possible to measure the distance between programs. The TV program ontology used here is extracted from TV-Anytime metadata, which represents semantic properties. The ontology also expresses program contents and features numerically. Vocabulary similarity is determined through WordNet. All words in the program descriptions are expanded to their upper and lower classes to decide word similarity, and the average over the descriptive keywords is taken. Using the resulting distance, similar programs are grouped by the K-medoids partitioning method, which divides objects into groups with similar characteristics. The method sets K representative objects, and each object is provisionally assigned to the cluster of its nearest representative object. When the initial n objects are to be divided into K groups, representative objects are first selected provisionally, and the optimal representatives are then found through repeated trials. In this way, similar programs are clustered. When programs are selected through cluster analysis, weights are applied to the recommendation as follows. When each group recommends programs, programs close to the representative object are recommended to the user. The distance is calculated with the same formula used for the similarity distance, and it serves as the base value that determines the ranking of recommended programs. A weight is calculated from the number of watched programs: the more programs, the higher the weight. This is defined as the cluster weight. With it, the TV programs that represent each group are selected and the TV program ranks are determined. However, the group-representative TV programs contain errors, so weights for TV program viewing preference are added to determine the final ranks. Content is then recommended according to the user's preferences. An experiment with the proposed method was carried out in a controlled environment. The experiment shows that the proposed method outperforms existing methods.
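
The K-medoids step described in this abstract can be sketched as follows. This is a minimal sketch of the simple alternating variant (assign each object to its nearest representative, then re-pick each cluster's representative), not necessarily the procedure the authors used; the abstract does not say which variant was applied. The distance matrix would come from the ontology-based program distance in the paper, whereas here it is a hypothetical numeric toy, and the naive "first k objects" initialization is an assumption made to keep the example deterministic.

```python
def assign(dist, medoids):
    """Group every object index under its nearest medoid."""
    clusters = {m: [] for m in medoids}
    for i in range(len(dist)):
        nearest = min(medoids, key=lambda m: dist[i][m])
        clusters[nearest].append(i)
    return clusters


def k_medoids(dist, k, max_iter=100):
    """Alternating K-medoids on a precomputed symmetric distance matrix.

    Returns a dict mapping each medoid index to its member indices.
    """
    medoids = list(range(k))  # assumed initialization: first k objects
    for _ in range(max_iter):
        clusters = assign(dist, medoids)
        updated = []
        for medoid, members in clusters.items():
            candidates = members or [medoid]  # guard against an empty cluster
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            updated.append(min(candidates,
                               key=lambda c: sum(dist[c][j] for j in candidates)))
        if sorted(updated) == sorted(medoids):
            break  # representatives stopped changing
        medoids = updated
    return assign(dist, medoids)


# Toy "programs" placed on a line; distance is the absolute difference.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
clusters = k_medoids(dist, k=2)
```

Ranking the members of a cluster by their distance to its medoid then gives the base ordering that the abstract describes, before any cluster or preference weights are applied.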

Establishment of the Korean Standard Vocal Sound into Character Conversion Rule (한국어 음가를 한글 표기로 변환하는 표준규칙 제정)

  • 이계영;임재걸
    • Journal of the Institute of Electronics Engineers of Korea CI / v.41 no.2 / pp.51-64 / 2004
  • The purpose of this paper is to establish the Standard Korean Vocal Sound into Character Conversion Rule (Standard VSCC Rule) by applying in reverse the Korean Standard Pronunciation Rule, which regulates how written Hangeul sentences are read. The Standard VSCC Rule plays a crucial role in Korean speech recognition. The general method of speech recognition is to find, among the standard voice patterns, the one most similar to the input voice pattern. Each standard voice pattern is an average of several sample voice patterns. If the unit of the standard voice pattern is a word, the number of entries will exceed a few million (taking inflection and postpositional particles into account). That many entries would require a huge database and an impractically large number of comparisons to find the most similar pattern. Therefore, the unit of the standard voice pattern should be a syllable. In this case, we have to resolve the difference between Korean vocal sounds and written characters. Converting a sequence of Korean vocal sounds into a sequence of characters requires our Standard VSCC Rule. Using the Standard VSCC Rule, we have implemented a system that converts Korean vocal sounds into Hangeul characters. The Korean Standard Pronunciation Rule consists of 30 items. To show the soundness and completeness of our Standard VSCC Rule, we tested the conversion system with various data sets reflecting all 30 items. The test results are presented in this paper.
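
What "applying a pronunciation rule in reverse" means can be illustrated with a single rule, liaison, in which a syllable-final consonant followed by a vowel-initial syllable is pronounced as the onset of that next syllable. The sketch below is not the paper's Standard VSCC Rule; it reverses only this one rule and assumes the input consists of precomposed Hangul syllables (U+AC00 to U+D7A3). The function names are hypothetical. The syllable arithmetic (19 initials, 21 vowels, 28 finals) follows the Unicode Hangul syllable layout.

```python
CHO = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"  # 19 initial consonants
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals
IEUNG = CHO.index("ㅇ")  # silent initial used by vowel-initial syllables


def decompose(syllable):
    """Split a precomposed syllable into (initial, vowel, final) indices."""
    code = ord(syllable) - 0xAC00
    return code // 588, (code % 588) // 28, code % 28


def compose(cho, jung, jong):
    """Build a precomposed syllable from (initial, vowel, final) indices."""
    return chr(0xAC00 + (cho * 21 + jung) * 28 + jong)


def undo_liaison(pronounced):
    """Return candidate spellings: the input itself, plus one candidate for
    each place where an onset could be moved back to the previous syllable."""
    candidates = [pronounced]
    for i in range(len(pronounced) - 1):
        c1, v1, f1 = decompose(pronounced[i])
        c2, v2, f2 = decompose(pronounced[i + 1])
        onset = CHO[c2]
        # Only open syllables can take a final, and only consonants that
        # exist as finals can be moved (for example, not ㄸ, ㅃ, or ㅉ).
        if f1 == 0 and c2 != IEUNG and onset in JONG:
            moved = compose(c1, v1, JONG.index(onset)) + compose(IEUNG, v2, f2)
            candidates.append(pronounced[:i] + moved + pronounced[i + 2:])
    return candidates


# The pronounced form "구거" yields the spelling "국어" as one candidate.
spellings = undo_liaison("구거")
```

The reverse mapping is one-to-many, so a real converter needs a way to choose among the candidates, for example a dictionary lookup; that selection step is outside this sketch.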

A study on the upper garment of Korean women, Jugori (여자 저고리 소고)

  • 이경자
    • Journal of the Korean Home Economics Association / v.8 no.1 / pp.62-86 / 1970
  • The upper garment of Korean women, the JUGORI, is a mode inherited in various respects from the ancient clothing style and based on the particular features of Korean clothes. The ancient style of clothing originated from the KWAMDUI of the inhabitants of the northern territory of Korea and is quite different in lineage from Chinese clothes. However, this unique mode of clothing has been much influenced by Chinese culture and by the climate of the Korean peninsula. The changes in the pattern of the JUGORI can be summed up as a tendency to become smaller. This tendency is most evident in the shortening of its length, and other parts have decreased in size as well. In ancient times the JUGORI fell below the woman's waist, like a robe, and was worn with a band. Its length was gradually shortened, however, and the GORUM took the place of the band. The shortening tendency seems to have appeared from the earliest period, as there is some evidence of it from women in the Sylla Dynasty, and it accelerated during the Koryu Dynasty under the influence of Mongolian culture (the Won Dynasty of China). The oldest surviving sample of the JUGORI is among the remains of the Yi Dynasty, and this sample shows all the particulars of the modern pattern of the JUGORI. The shortening continued through the Yi Dynasty, and by the end of the dynasty the garment had become so short that women found it inconvenient to wear. From the early period of modernization brought by Western civilization, and after the 1920s and 1930s, the JUGORI tended to become larger. This is a sign of a revival of practicality and rationalization. The JUGORI has since tended to shorten again, and its size is similar to that of the early Yi Dynasty. Despite these similarities, the modern JUGORI places much emphasis on the beauty of curves and the expression of exterior beauty. The reason is that, with the westernization of clothing, the JUGORI became a special kind of garment worn as traditional Korean women's dress. The clearest example of this pattern is the "ARIRANG DRESS". There are also fashions that use buttons instead of the GORUM, and the half-sleeve JUGORI for summer use, which is regarded as part of the improved way of life in Korea.

Stratigraphical Study on Tertiary System of Pohang Area Compared with Petrogeologies of Japan (포항(浦項)의 제삼기층(第三紀層)과 일본유전지질(日本油田地質)의 층위대비연구(層位對比硏究))

  • Chang, Seyong
    • Economic and Environmental Geology / v.9 no.1 / pp.1-11 / 1976
  • The geological survey, drilling, and geophysical survey carried out on the Tertiary deposits in Pohang are believed to be valuable, but a study of many Tertiary sediments in Japan raises many questions about the analysis in the final report prepared by the national geological survey. The main reasons are as follows. 1. The seismic velocity given in the final report for the Tertiary deposits in Pohang is 1,500-2,000 m/sec, whereas oil-bearing sediments of the same age in Japan show 2,000-3,800 m/sec. This may mean that the seismic velocity for the Tertiary deposits in Pohang needs to be reconsidered and that deeper drilling is required. 2. Stratigraphically, geophysically, and paleontologically, the Tertiary deposits in the Pohang land area are similar to the Nishiyama-Hunakawa formations of the Akita oil field in Japan, which are the main oil-bearing formations in Japan. 3. The volcanic rocks, including andesitic and liparitic rocks, that are extensively distributed both on land and on the sea bottom were assumed by the geological survey to be the base of the Tertiary sediments. In Japan, however, many oil-bearing deposits are overlain by this kind of volcanic rock, so the same condition may be presumed for the Tertiary sediments in Pohang. 4. The Tertiary sediments of the land area in Pohang are believed to be an extension of the offshore basin, yet, puzzlingly, the final report submitted by the geological survey makes no mention of the ECAFE report, which described many problems, as follows: A. Although it was assumed that no great thickness exceeding 1,000 meters, or major structures, would be encountered in the Tertiary offshore sequence, it was hoped that shallow hydrocarbon deposits might be found, because these sediments are lithologically similar to those of the same age in the producing area of the northwest Honshu region of Japan, where hydrocarbons are extracted from depths of only 500 to 600 meters. B. Four possible hydrocarbon trap conditions are represented in the survey area: anticlinal folds, faults, pinch-outs along the igneous basement, and lateral facies changes. C. Most of the prime possible reservoir areas are beyond the 50-meter water depth mark, except for the structures in Yonil Bay. D. Despite the shallowness of the offshore basin, sufficient trap conditions exist in the area to warrant further exploration for hydrocarbons. 5. All of the problems mentioned above give strong reason to hesitate before drawing a final conclusion on the Tertiary problems in Pohang until drilling reaches a depth of 3,000 meters or more, whether what lies below 1,000 meters is Tertiary or Mesozoic.

An Analysis on the Vocabulary in the English-Translation Version of Donguibogam Using the Corpus-based Analysis (코퍼스 분석방법을 이용한 『동의보감(東醫寶鑑)』 영역본의 어휘 분석)

  • Jung, Ji-Hun;Kim, Dong-Ryul;Kim, Do-Hoon
    • The Journal of Korean Medical History / v.28 no.2 / pp.37-45 / 2015
  • Objectives: To analyze quantitatively the vocabulary of the English translation of Donguibogam. Methods: This study quantitatively analyzed the English-translated text of Donguibogam using corpus-based analysis and compared the results with those of a quantitative analysis of the original Donguibogam text. Results: The corpus analysis of the English translation found about 1,207,376 total words (tokens) and about 20,495 word types, for a TTR (type/token ratio) of 1.69. The cumulative coverage of the 1,000 most frequent words was 83.54%, and that of the 2,000 most frequent words was 90.82%. Among the most frequent words, function words such as 'the', 'and', 'of', and 'is' predominated; among content words, 'radix', 'qi', 'rhizoma', and 'water' appeared with high frequency. Comparison with the corpus analysis of the original version showed that the TTR was higher in the English translation than in the original. The composition of high-frequency function words and content words was similar in the two versions. The two versions were also similar in that the 'Remedies' and 'Acupuncture' parts showed a higher proportion of content words than of function words. Conclusions: The vocabulary of the English translation shows that Donguibogam is a book written in complete sentences and, at the same time, a Korean medical book. The English translation has some problems, such as inconsistent vocabulary arising from the use of several translators and the incomplete transfer of word meanings from the Chinese-character cultural sphere to the English-speaking one; these are matters to consider when translating old Korean medical books into English.
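
The corpus statistics reported in this abstract (token count, type count, TTR, and cumulative coverage of the most frequent words) can be computed as in the following minimal sketch. It assumes a pre-tokenized text and expresses the TTR as a percentage, which appears to be the convention behind the reported figure; the function name and the toy sentence are hypothetical, and the study's own tokenization rules are not known from the abstract.

```python
from collections import Counter


def corpus_stats(tokens, top_n):
    """Return (token count, type count, TTR in percent,
    percent of tokens covered by the top_n most frequent types)."""
    freq = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(freq)
    ttr = n_types / n_tokens * 100
    covered = sum(count for _, count in freq.most_common(top_n))
    return n_tokens, n_types, ttr, covered / n_tokens * 100


# Toy text, lower-cased and split on whitespace.
tokens = "the qi of the water and the qi".split()
n_tokens, n_types, ttr, coverage = corpus_stats(tokens, top_n=2)
```

A lower TTR means more repetition of the same word forms, which is why very long texts such as the one analyzed here tend to show small values.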

A Longitudinal Case Study of Late Babble and Early Speech in Southern Mandarin

  • Chen, Xiaoxiang
    • Cross-Cultural Studies / v.20 / pp.5-27 / 2010
  • This paper studies the relation between canonical/variegated babble (CB/VB) and early speech in an infant acquiring Mandarin Chinese from 9 to 17 months. The infant was audio- and video-taped in her home almost every week. The data analyzed here come from 1,621 utterances extracted from 23 sessions, each lasting from 30 minutes to one hour, from age 00:09;07 to 01:05;27. The data were digitized, and segments from the 23 sessions were transcribed in narrow IPA and coded for analysis. Babble was coded from age 00:09;07 to 01:00;00, and words were coded from 01:00;00 to 01:05;27; proto-words appeared at 11 months, and some babble was still present after 01:10;00. In total, 3,821 segments were counted in CB/VB utterances, plus the segments found in 899 word tokens. The transcription was completed and checked by the author and rechecked by two other researchers who majored in Chinese phonetics to ensure reliability; agreement reached 95.65%. Mandarin Chinese is phonetically very rich in consonants, especially affricates: it has aspirated and unaspirated stops in labial, alveolar, and velar places of articulation; affricates and fricatives in alveolar, retroflex, and palatal places; /f/; labial, alveolar, and velar nasals; a lateral; [h]; and labiovelar and palatal glides. In the child's pre-speech phonetic repertoire, 7 different consonants and 10 vowels were transcribed at 00:09;07. By 00:10;16, the number of phones had more than doubled (17 consonants, 25 vowels), but the rate of increase slowed after 11 months of age. The phones from babbling remained active throughout the child's early and subsequent speech. The rank order of occurrence of the major class types for both CB and early speech was: stops, approximants, nasals, affricates, fricatives, and lateral. As expected, unaspirated stops outnumbered aspirated stops, and front stops and nasals were more frequent than back sounds in both types of utterances. The fact that affricates outnumbered fricatives in the child's late babble indicates the pre-speech influence of the ambient language. The analysis also showed that: 1) the phonetic characteristics of CB/VB and early meaningful speech are extremely similar, and these similarities indicate that the two are deeply related; 2) the infant demonstrated similar preferences for certain types of sounds in the two stages; 3) the infant's babbling was patterned at the segmental level, and this regularity was similarly evident in early speech. Three types (coronal plus front vowel, labial plus central vowel, and dorsal plus back vowel) exhibited much overlap in the phonetic forms of CB/VB and early speech. The child's CB/VB at this stage thus already shared the basic architecture, composition, and representation of early speech. The evidence of similarity between CB/VB and early speech leaves no doubt that phones present in CB/VB are indeed precursors to early speech.

The Comparative Study of the Mantra of Korean Buddhism and the Jumun of Daesoonjinrihoe (한국 불교 진언과 대순진리회 주문의 비교 연구)

  • Park, In-gyu
    • Journal of the Daesoon Academy of Sciences / v.22 / pp.387-432 / 2014
  • This paper compares the mantra of Korean Buddhism with the jumun(呪文) of Daesoonjinrihoe in rites and cultivation. There has been some research on the mantra of Buddhism, but there are few studies of the jumun of Daesoonjinrihoe. The mantra of Buddhism and the jumun of Daesoonjinrihoe look similar when pronounced in Hangul characters, but their religious and historical contexts appear to differ. The mantra of Korean Buddhism is associated with the introduction and diffusion of esoteric Buddhism. In the early period of Buddhism some mantras were permitted by the Buddha, and in the period of the Early Buddhist schools mantras were recognized as an educational teaching. In the Mahayana school, the dharani, abstracted from the vast Mahayana scriptures, was developed. As Mahayana Buddhism developed, esoteric Buddhism was born in India. Esoteric Buddhism was introduced into China and brought to Korea in the Silla dynasty. In the Koryo dynasty various rituals of esoteric Buddhism flourished, and the Jineunjong(眞言宗) and Chongjijong(總持宗) schools were formed. In the Chosun dynasty Buddhism was suppressed by the government and the esoteric schools were discontinued, but mantra and dharani flourished in rituals and cultivation in the latter part of the Chosun dynasty. In the modern period several esoteric schools were formed and developed. Today mantras are recited by many people in Korea. The main mantras are 'Om mani padme hum', 'Dharani of Avalokitesvara(神妙章句大陀羅尼)', 'neungumju(楞嚴呪)', 'Gwangmyung mantra(光明眞言)', etc. The jumun of Daesoonjinrihoe originated with Kang Jeungsan(姜甑山), who is believed to be God by Daesoonjinrihoe believers. Jeungsan used several existing mantras in creating the new heaven and earth, made new jumuns himself, and taught them to his followers. Cho Jungsan(趙鼎山), who succeeded to the doctrines, received the jumuns from Jeungsan. He selected the jumuns to be recited and determined how they should be recited. Park Hankyung(朴漢慶), who founded Daesoonjinrihoe, inherited the rituals and doctrines. The daily ritual of Daesoonjinrihoe is chanting the jumun, and cultivation and gongbu(工夫) are practiced through the jumun. The important jumuns of Daesoonjinrihoe are Taeulju(太乙呪) and Kidoju(祈禱呪). In terms of ritual, the mantra of Buddhism and the jumun of Daesoonjinrihoe perform a similar function. The mantra of Buddhism is set in the context of Buddhist doctrine and Buddhist practice, whereas the jumun of Daesoonjinrihoe is related to Jeungsan's teaching and the doctrines of Daesoonjinrihoe. Both, however, are used to communicate or unite with ultimate reality. The mantra and the jumun are thus important vehicles through which homo religiosus meets and unites with the sacred, and they are regarded by the faithful as sacred words carrying many symbols and meanings.

Product Community Analysis Using Opinion Mining and Network Analysis: Movie Performance Prediction Case (오피니언 마이닝과 네트워크 분석을 활용한 상품 커뮤니티 분석: 영화 흥행성과 예측 사례)

  • Jin, Yu;Kim, Jungsoo;Kim, Jongwoo
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.49-65 / 2014
  • Word of Mouth (WOM) is a behavior used by consumers to transfer or communicate their product or service experience to other consumers. Due to the popularity of social media such as Facebook, Twitter, blogs, and online communities, electronic WOM (e-WOM) has become important to the success of products or services. As a result, most enterprises pay close attention to e-WOM for their products or services. This is especially important for movies, as these are experiential products. This paper aims to identify the network factors of an online movie community that impact box office revenue using social network analysis. In addition to traditional WOM factors (volume and valence of WOM), network centrality measures of the online community are included as influential factors in box office revenue. Based on previous research results, we develop five hypotheses on the relationships between potential influential factors (WOM volume, WOM valence, degree centrality, betweenness centrality, closeness centrality) and box office revenue. The first hypothesis is that the accumulated volume of WOM in online product communities is positively related to the total revenue of movies. The second hypothesis is that the accumulated valence of WOM in online product communities is positively related to the total revenue of movies. The third hypothesis is that the average of degree centralities of reviewers in online product communities is positively related to the total revenue of movies. The fourth hypothesis is that the average of betweenness centralities of reviewers in online product communities is positively related to the total revenue of movies. The fifth hypothesis is that the average of closeness centralities of reviewers in online product communities is positively related to the total revenue of movies. 
To verify our research model, we collect movie review data from the Internet Movie Database (IMDb), which is a representative online movie community, and movie revenue data from the Box-Office-Mojo website. The movies in this analysis include weekly top-10 movies from September 1, 2012, to September 1, 2013, with in total. We collect movie metadata such as screening periods and user ratings; and community data in IMDb including reviewer identification, review content, review times, responder identification, reply content, reply times, and reply relationships. For the same period, the revenue data from Box-Office-Mojo is collected on a weekly basis. Movie community networks are constructed based on reply relationships between reviewers. Using a social network analysis tool, NodeXL, we calculate the averages of three centralities including degree, betweenness, and closeness centrality for each movie. Correlation analysis of focal variables and the dependent variable (final revenue) shows that three centrality measures are highly correlated, prompting us to perform multiple regressions separately with each centrality measure. Consistent with previous research results, our regression analysis results show that the volume and valence of WOM are positively related to the final box office revenue of movies. Moreover, the averages of betweenness centralities from initial community networks impact the final movie revenues. However, both of the averages of degree centralities and closeness centralities do not influence final movie performance. Based on the regression results, three hypotheses, 1, 2, and 4, are accepted, and two hypotheses, 3 and 5, are rejected. This study tries to link the network structure of e-WOM on online product communities with the product's performance. Based on the analysis of a real online movie community, the results show that online community network structures can work as a predictor of movie performance. 
The results show that the betweenness centralities of the reviewer community are critical for the prediction of movie performance. However, degree centralities and closeness centralities do not influence movie performance. As future research topics, similar analyses are required for other product categories such as electronic goods and online content to generalize the study results.
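
The three centrality averages used in this study can be sketched in plain Python for a small undirected reply network. This is an illustration under stated assumptions, not a reproduction of the NodeXL computation: the normalizations chosen here (degree divided by n-1, closeness as (n-1) over the sum of shortest-path distances, unnormalized betweenness via Brandes' algorithm) may differ from the tool's, the graph is assumed to be connected, and the function names and the toy reply pairs are hypothetical.

```python
from collections import deque


def build_graph(reply_pairs):
    """Undirected adjacency sets from (reviewer, responder) pairs."""
    adj = {}
    for a, b in reply_pairs:
        if a != b:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    return adj


def degree_centrality(adj):
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search for shortest-path lengths
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(adj) - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Brandes' algorithm for unweighted graphs, unnormalized."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Toy community: reviewer "c" exchanged replies with "a", "b", and "d".
graph = build_graph([("c", "a"), ("c", "b"), ("c", "d")])
avg_degree = mean(degree_centrality(graph).values())
avg_closeness = mean(closeness_centrality(graph).values())
avg_betweenness = mean(betweenness_centrality(graph).values())
```

In this star-shaped toy network only "c" has non-zero betweenness, since every shortest path between two other reviewers passes through it; per-movie averages like the three computed here are the kind of value that would enter the regressions as predictors.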