• Title/Summary/Keyword: Frequency of Words


Word Recognition Using VQ and Fuzzy Theory (VQ와 Fuzzy 이론을 이용한 단어인식)

  • Kim, Ja-Ryong;Choi, Kap-Seok
    • The Journal of the Acoustical Society of Korea / v.10 no.4 / pp.38-47 / 1991
  • Frequency variation among speakers is one of the problems in speech recognition. This paper applies fuzzy theory to address the variation in frequency features. Reference patterns are expressed as fuzzified patterns built from the peak frequency and peak energy extracted from codebooks; the codebooks are generated from training words uttered by several speakers so that they capture the common features of the speech signals. Words are recognized by fuzzy inference, which uses the certainty factor between the reference patterns and the test fuzzified patterns; the test patterns are built from the peak frequency and peak energy extracted from the power spectrum of the input speech signal. To reduce memory and computation requirements when computing the certainty factor, we propose a new equation that calculates an improved certainty factor using only the difference between two fuzzy values. Experiments on Korean digits show that word recognition by fuzzy inference with the new equation can solve the variation problem of frequency features while reducing memory and computation requirements.


An acoustic and perceptual investigation of the vowel length contrast in Korean

  • Lee, Goun;Shin, Dong-Jin
    • Phonetics and Speech Sciences / v.8 no.1 / pp.37-44 / 2016
  • The goal of the current study is to investigate how sound change is reflected in production and in perception, and what effect lexical frequency has on the loss of sound contrasts. Specifically, the study examined whether vowel length contrasts are retained in Korean speakers' productions, and whether Korean listeners can distinguish vowel length minimal pairs in perception. Two production experiments and two perception experiments investigated this. For the production tests, twelve native Korean speakers in their 20s and 40s completed a read-aloud task as well as a map task. The results showed that, regardless of age group, all Korean speakers produced vowel length contrasts with small but significant differences in the read-aloud task. Interestingly, the difference between long and short vowels disappeared in the map task, indicating that speech mode affects the production of vowel length contrasts. For the perception tests, thirty-three Korean listeners completed a discrimination test and a forced-choice identification test. The results showed that Korean listeners still have the perceptual sensitivity to distinguish the lexical meanings of vowel length minimal pairs. We also found that identification accuracy was affected by word frequency: accuracy was higher for high- and mid-frequency words than for low-frequency words. Taken together, the study demonstrated that speech mode (read-aloud vs. spontaneous) affects the production of a sound undergoing language change, and that word frequency affects the sound change in speech perception.

A Study on the Endpoint Detection by FIR Filtering (FIR filtering에 의한 끝점추출에 관한 연구)

  • Lee, Chang-Young
    • Speech Sciences / v.5 no.1 / pp.81-88 / 1999
  • This paper provides a method for speech detection. After first-order FIR filtering of the speech signals, we applied the conventional method of endpoint detection, which uses energy as the criterion for separating signals from background noise. With FIR filtering, only the Fourier components with large values of [amplitude x frequency] become significant in the energy profile. By applying this procedure to the 445-word database constructed from ETRI, we confirmed that low-amplitude noise and/or low-frequency noise is separated clearly from the speech signals, thereby enhancing the feasibility of ideal endpoint detection.
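The procedure in this abstract can be illustrated with a minimal sketch. The abstract does not give the filter coefficients, frame length, or threshold, so the first difference y[n] = x[n] - x[n-1], the 200-sample frames, and the threshold of 1.0 below are all assumptions, and `make_demo_signal` is a hypothetical test signal rather than ETRI data.

```python
import math

def first_difference(signal):
    """First-order FIR filter y[n] = x[n] - x[n-1] (assumed coefficients); y[0] is set to 0."""
    return [0.0] + [signal[n] - signal[n - 1] for n in range(1, len(signal))]

def frame_energies(signal, frame_len):
    """Sum of squared samples over consecutive non-overlapping frames."""
    return [
        sum(s * s for s in signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]

def detect_endpoints(signal, frame_len, threshold):
    """Return (start, end) sample indices spanning the frames whose filtered
    energy exceeds the threshold, or None if no frame does."""
    energies = frame_energies(first_difference(signal), frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len

def make_demo_signal(rate=8000, length=6000):
    """Hypothetical signal: a strong 5 Hz hum throughout, plus a weaker
    1 kHz tone present only for samples 2000..3999."""
    out = []
    for n in range(length):
        hum = 1.0 * math.sin(2 * math.pi * 5 * n / rate)
        tone = 0.5 * math.sin(2 * math.pi * 1000 * n / rate) if 2000 <= n < 4000 else 0.0
        out.append(hum + tone)
    return out

if __name__ == "__main__":
    print(detect_endpoints(make_demo_signal(), frame_len=200, threshold=1.0))
```

In this toy signal the hum has twice the amplitude of the tone, so its unfiltered frame energy would exceed the same threshold. After the first difference, the hum's contribution is scaled down roughly in proportion to its frequency and only the tone's frames remain above the threshold.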


The Effect of Word Frequency on Resident Education Effectiveness in Rural Field Forums (농촌현장포럼에서 단어의 빈도가 주민교육 효과성에 미치는 영향)

  • Lee, Byungjoon;Yoon, Seongsoo
    • Journal of Korean Society of Rural Planning / v.28 no.3 / pp.1-11 / 2022
  • This study analyzed how strongly word frequency influences the change in residents' perceptions before and after resident competency-strengthening education in villages where a rural field forum was held. The analysis of changes in residents' perceptions of the village development project according to word frequency yielded the following results. Talking about surrounding factors had a greater influence than the individual factors of keywords. In addition, the frequency of word use had a positive effect on residents' perceptions. Items of which residents already had high awareness before the education showed a low impact.

Electrical and Memory Switching Characteristics of Amorphous $As_{10}Ge_{15}Te_{75}$ Thin Film (비정질 $As_{10}Ge_{15}Te_{75}$ 박막의 전기적 및 메모리 스위칭 특성)

  • 이병석;이현용;정흥배
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 1996.11a / pp.234-237 / 1996
  • Amorphous chalcogenide semiconductors are new materials in semiconductor physics. Their properties, especially their electronic and optical properties, are the main motivation for device applications. Amorphous As$_{10}$Ge$_{15}$Te$_{75}$ has a stable ac conductivity at high frequency and a dc memory switching property. At frequencies above 10 MHz, the ac conductivity of the As$_{10}$Ge$_{15}$Te$_{75}$ thin film is much higher than at lower frequencies and is independent of temperature and frequency. If a dc voltage is applied between the edges of the thin film, one can observe the dc memory switching phenomenon; in other words, the dc conductivity increases by quite a few orders of magnitude after the threshold voltage is applied. Given the stable ac conductivity at high frequency and the increase in conductivity after dc memory switching, the As$_{10}$Ge$_{15}$Te$_{75}$ thin film is considered a new material for microwave switch devices.


Frequency Inheritance in the Production of Korean Homophones

  • Han, Jeong-Im
    • Speech Sciences / v.14 no.1 / pp.7-19 / 2007
  • The present study investigates the so-called frequency inheritance effect in word production. According to some earlier studies (e.g. Jescheniak & Levelt, 1994), retrieval of a low-frequency homophone benefits from its high-frequency homophone twin; more specifically, word-retrieval RT is determined by the frequency of the phonological form of the word (the sum of the homophone frequencies) rather than by the frequency of the specific word. This result, however, has been challenged by later studies (e.g. Caramazza et al., 2001), and one possible resolution is that languages differ in the extent to which the inheritance effect occurs. Two experiments are reported that test whether the frequency inheritance effect depends on the target language, namely, whether a language with relatively many homophones, such as Korean, tends not to show frequency inheritance, in contrast to languages with fewer homophones, such as Dutch and German (Jescheniak & Levelt, 1994; Jescheniak et al., 2003). Experiment 1 was a picture naming task, and Experiment 2 used an English-to-Korean translation task. In both experiments, the homophones were actually slower than the low-frequency controls, so there was no evidence for the inheritance effect. These results imply that the question of whether specific-word or homophone frequency determines production can be properly assessed only by taking into account the language-specific nature of the lexicon, such as the percentage of homophones in the language.


A Suggestion for Spatiotemporal Analysis Model of Complaints on Officially Assessed Land Price by Big Data Mining (빅데이터 마이닝에 의한 공시지가 민원의 시공간적 분석모델 제시)

  • Cho, Tae In;Choi, Byoung Gil;Na, Young Woo;Moon, Young Seob;Kim, Se Hun
    • Journal of Cadastre & Land InformatiX / v.48 no.2 / pp.79-98 / 2018
  • The purpose of this study is to suggest a model for analysing the spatio-temporal characteristics of civil complaints about the officially assessed land price, based on big data mining. Specifically, the underlying reasons for the civil complaints were sought from a spatio-temporal perspective rather than from institutional factors, and a model was suggested for monitoring the trend in the occurrence of such complaints. The official documents of 6,481 civil complaints about the officially assessed land price in the district of Jung-gu, Incheon Metropolitan City, over the period from 2006 to 2015 were collected along with their temporal and spatial properties and used for the analysis. Frequencies of major key words were examined using a text mining method. Correlations among major key words were studied through social network analysis. By calculating term frequency (TF) and term frequency-inverse document frequency (TF-IDF), which correspond to the weights of key words, we identified the major key words behind the occurrence of civil complaints about the officially assessed land price. The spatio-temporal characteristics of the civil complaints were then examined through hot spot analysis based on the Getis-Ord $Gi^*$ statistic. It was found that the characteristics of civil complaints about the officially assessed land price were changing, forming a cluster that is linked spatio-temporally. Using text mining and social network analysis, we found that the reasons for civil complaints about the officially assessed land price could be identified quantitatively from natural language. TF and TF-IDF, the weights of key words, can be used as main explanatory variables for analysing the spatio-temporal characteristics of these civil complaints, since these statistics differ over time and across regions.
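As an illustration of the TF and TF-IDF weights mentioned in this abstract, here is a minimal sketch. The abstract does not state which TF-IDF variant the authors used; the version below (TF as relative frequency within a document, IDF as the natural log of N over document frequency) is one common choice and is an assumption, as are the function names and the toy tokens.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: TF = count / document length,
    IDF = ln(number of documents / number of documents containing the term).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

def top_keyword(weight_dict):
    """Term with the largest TF-IDF weight in one document."""
    return max(weight_dict, key=weight_dict.get)

if __name__ == "__main__":
    # Hypothetical tokenized complaint texts, not data from the study.
    docs = [["land", "price", "high"], ["land", "tax"], ["price", "appeal", "price"]]
    for weights in tf_idf(docs):
        print(top_keyword(weights), weights)
```

Under this variant a term that appears in every document receives a weight of zero, which is why TF-IDF favours key words that are frequent in one complaint but rare across the collection.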

Automatic Extraction of Opinion Words from Korean Product Reviews Using the k-Structure (k-Structure를 이용한 한국어 상품평 단어 자동 추출 방법)

  • Kang, Han-Hoon;Yoo, Seong-Joon;Han, Dong-Il
    • Journal of KIISE:Software and Applications / v.37 no.6 / pp.470-479 / 2010
  • Most of the methods suggested in existing English studies for extracting opinion words may be difficult to apply directly to the Korean language. The manual method suggested by studies in Korea has the problem that extracting opinion words takes a long time. In addition, English thesaurus-based extraction of Korean opinion words suffers from a deterioration of precision caused by the lack of a one-to-one match between Korean and English words. Studies based on Korean phrase analyzers may fail because they select opinion words with a low frequency. Therefore, this study suggests the k-Structure (k=5 or 8) method, which may improve precision while complementing existing studies in Korea, for automatically extracting opinion words from a simple sentence in a given Korean product review. A simple sentence is defined as one composed of at least 3 words, i.e., a sentence that includes an opinion word within a distance of ${\pm}2$ from the attribute name (e.g., the 'battery' of a camera) of an evaluated product (e.g., a 'camera'). In the performance experiment, opinion words for 8 previously given attribute names were automatically extracted with k-Structure from 1,868 product reviews collected from major domestic shopping malls, and their precision was estimated. The results showed that k=5 led to a recall of 79.0% and a precision of 87.0%, while k=8 led to a recall of 92.35% and a precision of 89.3%. A test was also conducted using PMI-IR (Pointwise Mutual Information - Information Retrieval), one of the methods suggested in English studies, which resulted in a recall of 55% and a precision of 57%.
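Two ideas from this abstract lend themselves to a small sketch: the ${\pm}2$ distance window around an attribute name, and the precision and recall used for evaluation. The code below is not the k-Structure method itself, which the abstract does not specify; it only collects the tokens within a fixed window of the attribute name and scores a predicted word set against a hand-labelled one. All names and toy data are hypothetical.

```python
def words_near_attribute(tokens, attribute, window=2):
    """Collect tokens within +/-window positions of each occurrence of the
    attribute name, excluding the attribute itself."""
    found = []
    for i, tok in enumerate(tokens):
        if tok != attribute:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        found.extend(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return found

def precision_recall(predicted, gold):
    """Precision = correct / predicted, recall = correct / gold (both sets)."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

if __name__ == "__main__":
    # Hypothetical English tokens standing in for a tokenized Korean review.
    tokens = ["the", "battery", "lasts", "really", "long", "today"]
    print(words_near_attribute(tokens, "battery"))
    print(precision_recall({"long", "good", "cheap"}, {"long", "good", "bad", "slow"}))
```

A real extractor would still need to decide which of the windowed tokens are opinion words, for example with part-of-speech information; the window only narrows the candidates.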

Children's Play Facilities according to the Classification of Amusement Features (놀이속성 분류에 따른 적정 어린이 놀이시설물 연구)

  • Jeong, Kil-Taek;Shin, Min-Ji;Shin, Ji-Hoon
    • Journal of the Korean Institute of Landscape Architecture / v.46 no.1 / pp.29-37 / 2018
  • This study derives play attribute words that describe the nature of play by analyzing the correlation between play facilities and play attribute words. Investigating the play attributes of play facilities and supplementing areas of weakness can provide a balanced play environment. Play attribute words were compiled via a literature review, and the importance of each play attribute word was surveyed among experts. The keywords explaining play that were derived from news articles and references are defined as play attribute words. These words were classified into six broad categories and twenty-six sub-categories. The importance of the major play attribute words was ranked as follows: Communication (0.268%) > Imagination (0.201%) > Amusement (0.190%) > Development (0.167%) > Learning (0.108%) > Intelligence (0.067%). Experts thus regarded communication and imagination as the most important elements. Each play attribute associated with an amusement facility was separately identified for the amusement facilities installed in 114 children's parks in Seoul. Among the play attribute words, 'development' was reflected with high frequency in the amusement facilities of Seoul's children's parks. Furthermore, the importance of major play attribute words such as 'Communication' and 'Imagination' was not fully reflected in cognitive play facilities. Therefore, it was judged that these attributes need to be actively introduced. By determining the weaknesses of amusement facilities in children's parks and analyzing the features and functions of play, this study proposes future improvements.

Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach (온라인 리뷰 분석을 통한 상품 평가 기준 추출: LDA 및 k-최근접 이웃 접근법을 활용하여)

  • Lee, Ji Hyeon;Jung, Sang Hyung;Kim, Jun Ho;Min, Eun Joo;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.97-117 / 2020
  • Product evaluation criteria are indicators that describe the attributes or values of products and enable users or manufacturers to measure and understand them. When companies analyze their products or compare them with competitors' products, appropriate criteria must be selected for objective evaluation. The criteria should reflect the product features that consumers considered when they purchased, used, and evaluated the products. However, current evaluation criteria do not reflect how consumers' opinions differ from product to product. Previous studies tried to use online reviews from e-commerce sites, which reflect consumer opinions, to extract the features and topics of products and use them as evaluation criteria. However, these approaches still produce criteria irrelevant to the products because the extracted or improper words are not refined. To overcome this limitation, this research suggests the LDA-k-NN model, which extracts possible criteria words from online reviews by using LDA and refines them with k-nearest neighbor. The proposed approach starts with a preparation phase, which consists of 6 steps. First, it collects review data from e-commerce websites. Most e-commerce websites classify their selling items into high-level, middle-level, and low-level categories. Review data for the preparation phase are gathered from each middle-level category and later merged to represent a single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from the reviews by obtaining part-of-speech information with a morpheme analysis module. After preprocessing, the words of each topic in the reviews are obtained with LDA, and only the nouns among the topic words are chosen as potential criteria words. Then, the words are tagged according to whether they can serve as criteria for each middle-level category. Next, every tagged word is vectorized by a pre-trained word embedding model. Finally, a k-nearest neighbor case-based approach is used to classify each word with tags.
After the preparation phase is set up, the criteria extraction phase is conducted on low-level categories. This phase starts with crawling the reviews in the corresponding low-level category. The same preprocessing as in the preparation phase is conducted using the morpheme analysis module and LDA. Possible criteria words are extracted by taking the nouns from the data and are vectorized by the pre-trained word embedding model. Finally, evaluation criteria are extracted by refining the possible criteria words using the k-nearest neighbor approach and the reference proportion of each word in the word set. To evaluate the performance of the proposed model, an experiment was conducted with reviews from '11st', one of the biggest e-commerce companies in Korea. The review data came from the 'Electronics/Digital' section, one of the high-level categories in 11st. For the performance evaluation, the suggested model was compared with three alternatives: the actual criteria of 11st, a model that extracts nouns with the morpheme analysis module and refines them according to word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The evaluation task was to predict the evaluation criteria of 10 low-level categories with the suggested model and the 3 alternatives above. The criteria words extracted by each model were combined into a single word set, which was used for survey questionnaires. In the survey, respondents chose every item they considered an appropriate criterion for each category. Each model earned a score when a chosen word had been extracted by that model. The suggested model had higher scores than the other models in 8 out of 10 low-level categories. By conducting paired t-tests on the scores of each model, we confirmed that the suggested model shows better performance in 26 tests out of 30. In addition, the suggested model was the best model in terms of accuracy.
This research proposes a method for extracting evaluation criteria that combines topic extraction using LDA with refinement by the k-nearest neighbor approach. The method overcomes the limits of previous dictionary-based models and frequency-based refinement models. This study can contribute to improving review analysis for deriving business insights in the e-commerce market.
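The k-nearest neighbor refinement step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the LDA and morpheme-analysis steps are skipped, the candidate words are assumed to already exist, the 2-dimensional vectors stand in for a pre-trained word embedding, cosine similarity and majority voting are assumed choices, and the "reference proportion" criterion mentioned in the abstract is not modelled. All names and data are hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_label(vector, labeled, k=3):
    """Majority label among the k labeled vectors most similar to `vector`.
    `labeled` is a list of (vector, label) pairs."""
    nearest = sorted(labeled, key=lambda item: cosine(vector, item[0]), reverse=True)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def refine_candidates(candidates, labeled, k=3, keep="criterion"):
    """Keep only the candidate words whose k-NN label equals `keep`.
    `candidates` maps each word to its embedding vector."""
    return [word for word, vec in candidates.items() if knn_label(vec, labeled, k) == keep]

if __name__ == "__main__":
    # Toy tagged words from a hypothetical preparation phase.
    labeled = [
        ((1.0, 0.0), "criterion"), ((0.9, 0.1), "criterion"),
        ((0.0, 1.0), "other"), ((0.1, 0.9), "other"), ((0.2, 0.8), "other"),
    ]
    candidates = {"battery": (0.8, 0.2), "delivery": (0.1, 0.8)}
    print(refine_candidates(candidates, labeled))
```

An odd k avoids ties when there are only two labels, which is why k=3 is used in this toy setting; the abstract does not say which k the authors chose.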