• Title/Abstract/Keywords: research vocabulary

Search results: 268 items (processing time 0.027 s)

Group-wise Keyword Extraction of the External Audit using Text Mining and Association Rules (텍스트마이닝과 연관규칙을 이용한 외부감사 실시내용의 그룹별 핵심어 추출)

  • Seong, Yoonseok;Lee, Donghee;Jung, Uk
    • Journal of Korean Society for Quality Management / Vol. 50, No. 1 / pp. 77-89 / 2022
  • Purpose: To improve a company's audit quality, an in-depth analysis is required to categorize audit reports, which are text documents containing the details of the external audit. This study introduces a systematic methodology for extracting group-specific keywords that capture the differences between groups such as 'audit plan' and 'interim audit', using audit reports collected as text documents. Methods: In the first step of the proposed methodology, the documents are preprocessed through text mining. In the second step, the documents are classified into groups using machine learning techniques, and the important vocabulary that has a dominant influence on classification performance is extracted. In the third step, association rules are found for each group's documents. In the last step, the final keywords representing the characteristics of each group are extracted by comparing the vocabulary important for classification with the vocabulary important in each group's association rules. Results: This study quantitatively calculates the importance of the vocabulary used in audit reports based on machine learning, rather than relying on qualitative research methods such as literature search, expert evaluation, and the Delphi technique. The case study showed that the extracted keywords describe the characteristics of each group well. Conclusion: This study is meaningful in that it lays the foundation for quantitative follow-up studies of key vocabulary in each stage of auditing.
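
The third and fourth steps described in this abstract can be illustrated with a small sketch. The abstract does not say which rule-mining algorithm, thresholds, or comparison method the authors used, so everything below is an assumption: it uses the standard definitions of support and confidence for word pairs, a hypothetical `association_rules` helper, made-up tokenized documents, and a simple set intersection as one possible way to "compare" the two vocabularies.

```python
from collections import Counter
from itertools import combinations


def association_rules(docs, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for word pairs.

    support(x, y)   = fraction of documents containing both x and y
    confidence(x→y) = support(x, y) / support(x)
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]
    item_counts = Counter(w for d in doc_sets for w in d)
    pair_counts = Counter(
        pair for d in doc_sets for pair in combinations(sorted(d), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


# Hypothetical tokenized documents for a single group (not from the paper).
docs = [
    ["audit", "plan", "risk"],
    ["audit", "plan", "schedule"],
    ["audit", "risk"],
    ["plan", "schedule"],
]
rules = association_rules(docs)

# Assumed comparison step: keep words that appear both in the group's rules
# and in a (hypothetical) list of words important to the classifier.
rule_vocab = {w for x, y, _, _ in rules for w in (x, y)}
classifier_vocab = {"audit", "risk", "interim"}  # hypothetical
keywords = rule_vocab & classifier_vocab
```

With larger corpora, a dedicated frequent-itemset algorithm such as Apriori would normally replace the brute-force pair counting shown here.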

Construction and Application of POI Database with Spatial Relations Using SNS (SNS를 이용한 POI 공간관계 데이터베이스 구축과 활용)

  • Kim, Min Gyu;Park, Soo Hong
    • Spatial Information Research / Vol. 22, No. 4 / pp. 21-38 / 2014
  • Because users who search maps tend to use a name they already know or one in common use rather than the formal name of a place, they often fail to find their destination. In addition, a typical web map service can return location information for unintended places, because a spatial search using vocabulary such as 'nearby' or 'in the vicinity' also returns locations more than 2 km from the current location. In this research, the spatial range that humans can perceive is calculated by extracting POI data from Twitter (SNS) data and constructing spatial relations with existing, already constructed POI. As a result, the various place names acquired could be used as alternative names for existing POI data, and the new POI data are expected to help select places for constructing POI data by identifying places that have many POI name variations. We also expect that spatial searches can be conducted more efficiently using the diverse spatial vocabulary available for spatial searching and the spatial range that humans can perceive.

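A Review of Research Trends and Research Methods in Asia Pacific Journal of Corpus Research Vol. 1, No. 1 (『Asia Pacific Journal of Corpus Research』 1권 1호의 연구 동향과 연구 방법에 관한 고찰)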

  • Jung, Chae Kwan
    • Asia Pacific Journal of Corpus Research / Vol. 1, No. 1 / pp. 127-132 / 2020
  • The purpose of this review is to provide local readers, specifically Korean student readers who are not very familiar with the English language, with a general overview of the research articles published in Asia Pacific Journal of Corpus Research vol. 1, no. 1. A brief summary of each research article, focusing on research methods, is presented, followed by an overall review and some insights on research issues.

Korean Broadcast News Transcription Using Morpheme-based Recognition Units

  • Kwon, Oh-Wook;Waibel, Alex
    • The Journal of the Acoustical Society of Korea / Vol. 21, No. 1E / pp. 3-11 / 2002
  • Broadcast news transcription is one of the hardest tasks in speech recognition because broadcast speech signals vary widely in speech quality, channel, and background conditions. We developed a Korean broadcast news speech recognizer. We used a morpheme-based dictionary and language model to reduce the out-of-vocabulary (OOV) rate. We concatenated original morpheme pairs of short length or high frequency in order to reduce insertion and deletion errors due to short morphemes. We used a lexicon with multiple pronunciations to reflect inter-morpheme pronunciation variations without severe modification of the search tree. By using the merged morphemes as recognition units, we achieved an OOV rate of 1.7% with a 64k vocabulary, comparable to European languages. We implemented a hidden Markov model-based recognizer with vocal tract length normalization and online speaker adaptation by maximum likelihood linear regression. Experimental results showed that the recognizer yielded a 21.8% morpheme error rate for anchor speech and 31.6% for mostly noisy reporter speech.
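
For readers unfamiliar with the OOV rate quoted in this abstract, the following minimal sketch shows the usual way it is computed: the fraction of test tokens that do not appear in a fixed-size recognition vocabulary. The helper names `build_vocabulary` and `oov_rate`, the romanized placeholder morphemes, and the tiny vocabulary size are all hypothetical; the abstract gives no details of the authors' own tooling.

```python
from collections import Counter


def build_vocabulary(train_tokens, size):
    """Keep the `size` most frequent training tokens as the vocabulary."""
    return {tok for tok, _ in Counter(train_tokens).most_common(size)}


def oov_rate(test_tokens, vocabulary):
    """Fraction of test tokens that are not covered by the vocabulary."""
    if not test_tokens:
        raise ValueError("test_tokens must not be empty")
    unknown = sum(1 for tok in test_tokens if tok not in vocabulary)
    return unknown / len(test_tokens)


# Hypothetical morpheme sequences (romanized placeholders, not real data).
train = ["news", "reul", "jeonha", "bnida", "news", "reul", "bo", "bnida"]
test = ["news", "reul", "deut", "bnida"]

vocab = build_vocabulary(train, size=3)  # a real system would use e.g. 64k
rate = oov_rate(test, vocab)             # 1 of 4 test tokens is unseen
```

The choice of recognition unit changes this number directly: smaller units such as morphemes cover more of the test text with the same vocabulary size, which is the trade-off the abstract describes.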

Typicality of Vocabulary for evaluation on Instrument-Noise generated at Loud Noise Workplace (고소음 작업장에서 발생하는 기기소음 평가를 위한 어휘의 유형화)

  • Ju, Duck-Hoon;Kook, Jung-Hun;Kim, Jae-Soo
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / Proceedings of the Korean Society for Noise and Vibration Engineering 2007 Autumn Conference / pp. 242-247 / 2007
  • Since the industrialization of the 1960s, accelerated mechanization has contributed greatly to industrial development, yet countermeasures against the noise damage generated in loud-noise workplaces have scarcely been made. In particular, the instrument noise produced in factories and workplaces is so jarring and so persistently repeated that most on-site workers are exposed to dangers such as severe discomfort and hearing impairment. From this point of view, this research attempts to extract a suitable rating vocabulary for evaluating the instrument noise produced in loud-noise workplaces. The extracted vocabulary is expected to serve as useful material for appraising instrument noise and for establishing regulatory standards with regard to psychoacoustic experiments and instrument noise.

Design of a Korean Speech Recognition Platform (한국어 음성인식 플랫폼의 설계)

  • Kwon Oh-Wook;Kim Hoi-Rin;Yoo Changdong;Kim Bong-Wan;Lee Yong-Ju
    • MALSORI / No. 51 / pp. 151-165 / 2004
  • For educational and research purposes, a Korean speech recognition platform is designed. It is based on an object-oriented architecture and can be easily modified so that researchers can readily evaluate the performance of a recognition algorithm of interest. This platform will save development time for many who are interested in speech recognition. The platform includes the following modules: noise reduction, end-point detection, mel-frequency cepstral coefficient (MFCC) and perceptual linear prediction (PLP)-based feature extraction, hidden Markov model (HMM)-based acoustic modeling, n-gram language modeling, n-best search, and Korean language processing. The decoder of the platform can handle both lexical search trees for large vocabulary speech recognition and finite-state networks for small-to-medium vocabulary speech recognition. It performs a word-dependent n-best search with a bigram language model in the first forward search stage, and then extracts a word lattice and rescores each lattice path with a trigram language model in the second stage.

HMnet Evaluation for Phonetic Environment Variations of Training Data in Speech Recognition

  • Kim, Hoi-Rin
    • The Journal of the Acoustical Society of Korea / Vol. 15, No. 4E / pp. 28-36 / 1996
  • In this paper, we propose a new evaluation methodology that shows more clearly the performance of the allophone modeling algorithm generally used in large vocabulary speech recognition. The proposed evaluation method shows the running characteristics and limitations of the modeling algorithm by testing how variation in the phonetic environments of the training data affects recognition performance and the desirable number of free parameters to be estimated. From experimental results obtained with the method, we conclude that, in vocabulary-independent recognition tasks, the phonetic diversity of the training data greatly affects the robustness of the model, and that it is necessary to develop a proper measure that can determine the number of states, trading off the robustness and the precision of the HMnet, better than the conventional modeling efficiency.

Effects on Literacy Development of Teacher-Child Discussion about the Contents of Storybooks (동화 내용에 대한 교사와 유아의 토의활동이 유아의 문해발달에 미치는 효과)

  • Min, Ok Jin;Lee, Youn Kyoung
    • Korean Journal of Child Studies / Vol. 22, No. 1 / pp. 163-175 / 2001
  • This study investigated the effect on children's literacy of teacher-child discussion about the contents of storybooks. The subjects were 5-year-old kindergarten children in Cheongju, 10 in the experimental group and 10 in the control group. The experimental design was a split-plot design. The research instruments used for pre- and post-tests were the Vocabulary Decoding Competence Test (Lee, 1998), Writing Competence Test (Lee & Lee, 1990), Story Comprehension Competence Test (Lee, 1998; Morrow, 1990), and the Emergent Reading Ability Judgements for Favorite Storybooks (Sulzby, 1985). Data were analyzed by ANOVA with repeated measures. Results showed that the teacher-child discussions about the contents of storybooks were effective for improving children's competence in decoding vocabulary, writing, and story comprehension but not for improving emergent reading ability.

Utterance Verification Using Search Confusion Rate and Its N-Best Approach

  • Kim, Kyu-Hong;Kim, Hoi-Rin;Hahn, Min-Soo
    • ETRI Journal / Vol. 27, No. 4 / pp. 461-464 / 2005
  • Recently, a variety of confidence measures for utterance verification have been studied to improve speech recognition performance by rejecting out-of-vocabulary inputs. Most conventional confidence measures for utterance verification are based primarily on hypothesis testing or an approximated posterior probability, and their performance depends on the robustness of an alternative hypothesis or the prior probability. We introduce a novel confidence measure called the search confusion rate (SCR), which does not require an alternative hypothesis or the approximation of a posterior probability. Our confusion-based approach shows better performance in additive noise-corrupted speech as well as in clean speech.

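A Study on the Accuracy of Lexical Definitions in a Chinese-Korean Dictionary Based on a Chinese Corpus: Focusing on the Vocabulary of the D3H1 Area (基于汉语语料库的中韩词典词汇释义的准确性研究 - 以D3H1区的词汇为中心)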

  • Gwak, Jun-Hwa
    • 중국학논총 / No. 65 / pp. 23-38 / 2020
  • The dictionary is the most important tool for every Chinese learner to confirm the meaning and usage of words. Therefore, the accuracy of headword interpretations in the dictionary is crucial. This study discusses the accuracy and adequacy of headword interpretations in the Chinese-Korean dictionary using a Chinese corpus and Baidu. The scope of this study is the 3000 words in the D3H1 region. According to the research results, the main problems with the vocabulary in this region fall into three categories: problems of lexical interpretation, problems of missing interpretation, and other problems. The D3H1 area contains a total of 719 low-frequency words, and the interpretations of 54 headwords are inaccurate or inappropriate. This study is a detailed investigation and analysis of the problems with these 54 words.