• Title/Summary/Keyword: Corpus-based Study

A Study of Computational Literature Analysis based Classification for a Pairwise Comparison by Contents Similarity in a section of Tokkijeon, 'Fish Tribe Conference' (컴퓨터 문헌 분석 기반의 토끼전 '어족회의' 대목 내용 유사도에 따른 이본 계통 분류 연구)

  • Kim, Dong-Keon;Jeong, Hwa-Young
    • The Journal of the Korea Contents Association / v.22 no.5 / pp.15-25 / 2022
  • This study aims to identify the families and lineages of the variant versions (Yibon) of the "Fish Tribe Conference" section of Tokkijeon by using computational literature analysis techniques. First, we encode the type of each paragraph in each variant to build a corpus, and on this basis use the Hamming distance to calculate the distance matrix between variants. We visualize the clustering pattern of the variants by applying multidimensional scaling and hierarchical clustering, and explore the characteristics of the families and lineages of the "Fish Tribe Conference" section in comparison with the existing cluster analysis of all paragraphs of Tokkijeon. As a result, unlike the cluster analysis of all paragraphs of Tokkijeon, which yields six categories, the "Fish Tribe Conference" section yields five categories, with some variants differing in placement. The contributions of this study are that the relative distances between variants were measured and a systematic classification was performed computationally, in an objective and empirical way, and that the lineage characteristics of the "Fish Tribe Conference" section were revealed in comparison with the analysis of the entire Tokkijeon.
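
The distance-and-clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the variant names, the paragraph-type codes, and the choice of average linkage are all assumptions made for illustration, and the multidimensional scaling visualization is omitted.

```python
from itertools import combinations


def hamming(a, b):
    """Fraction of positions at which two equal-length code sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(codes):
    """Pairwise Hamming distances, stored as a dict keyed by (name, name)."""
    names = sorted(codes)
    dist = {(n, n): 0.0 for n in names}
    for p, q in combinations(names, 2):
        dist[p, q] = dist[q, p] = hamming(codes[p], codes[q])
    return names, dist


def average_linkage(names, dist, k):
    """Agglomerative clustering: merge the two closest clusters until k remain."""
    clusters = [[n] for n in names]

    def gap(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: gap(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical data: each variant is a string of paragraph-type codes.
codes = {
    "variant_A": "AABCA",
    "variant_B": "AABCB",
    "variant_C": "BBCAA",
    "variant_D": "BBCAB",
}
names, dist = distance_matrix(codes)
clusters = average_linkage(names, dist, k=2)
print(clusters)
```

With these made-up codes, variants A and B differ in one of five paragraphs and end up in the same group, as do C and D.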

Relationship between the Conception Rate after Estrus Induction using $PGF_2{\alpha}$ and Other Parameters in Holstein Dairy Cows (젖소에서 $PGF_2{\alpha}$ 투여에 의한 발정 유도 후 수태율과 다른 인자와의 관계)

  • Park, Chul-Ho;Lim, Won-Ho;Suh, Guk-Hyun;Oh, Ki-Seok;Son, Chang-Ho
    • Journal of Embryo Transfer / v.25 no.3 / pp.133-139 / 2010
  • The purpose of this study was to determine the relationship between conception rate and other parameters (body condition score (BCS), progesterone concentration, and follicle size) measured before estrus induction with $PGF_2{\alpha}$. Regardless of artificial insemination (AI) time, the conception rate in cows with a BCS below 2.75, of 2.75 to 3.25, and above 3.25 at $PGF_2{\alpha}$ injection was 47.5%, 67.5%, and 48.5%, respectively. The conception rate regardless of BCS was 59.0% in cows inseminated based on detected estrus, and 46.2% in cows inseminated at 72 to 80 hours (timed artificial insemination, TAI) after $PGF_2{\alpha}$ injection. The conception rate regardless of AI time was 43.0% in cows with low progesterone concentrations (less than 1.0 ng/ml), and 67.5% in cows with high progesterone concentrations (more than 1.0 ng/ml) at $PGF_2{\alpha}$ injection. The conception rate regardless of progesterone concentration was 59.9% in cows inseminated based on detected estrus, and 48.1% in cows subjected to TAI after $PGF_2{\alpha}$ injection. The conception rate regardless of AI time was 36.0% in cows with small dominant follicles (less than 5 mm), 56.0% in cows with follicles of 5 mm to 10 mm, and 65.5% in cows with large dominant follicles (more than 10 mm) at $PGF_2{\alpha}$ injection. The conception rate regardless of follicle size was 57.3% in cows inseminated based on detected estrus, and 47.6% in cows subjected to TAI after $PGF_2{\alpha}$ injection. These results indicate that the conception rate will be greater if cows with a BCS of 2.75 to 3.25, an active corpus luteum, and/or a large dominant follicle (more than 10 mm) are used for estrus induction.

Semi-supervised domain adaptation using unlabeled data for end-to-end speech recognition (라벨이 없는 데이터를 사용한 종단간 음성인식기의 준교사 방식 도메인 적응)

  • Jeong, Hyeonjae;Goo, Jahyun;Kim, Hoirin
    • Phonetics and Speech Sciences / v.12 no.2 / pp.29-37 / 2020
  • Recently, neural network-based deep learning algorithms have dramatically improved performance compared to the classical Gaussian mixture model based hidden Markov model (GMM-HMM) automatic speech recognition (ASR) system. In addition, research on end-to-end (E2E) speech recognition systems that integrate the language modeling and decoding processes has been actively conducted to better exploit the advantages of deep learning techniques. In general, E2E ASR systems consist of multiple layers of an encoder-decoder structure with attention. Therefore, E2E ASR systems require a large amount of speech-text paired data in order to achieve good performance. Obtaining speech-text paired data requires a lot of human labor and time, and is a high barrier to building an E2E ASR system. Previous studies have therefore tried to improve the performance of E2E ASR systems using a relatively small amount of speech-text paired data, but most of them used either speech-only data or text-only data alone. In this study, we propose a semi-supervised training method that enables an E2E ASR system to perform well on corpora in different domains by using both speech-only and text-only data. The proposed method works effectively by adapting to different domains, showing good performance in the target domain without degrading much in the source domain.

A Study on Building Knowledge Base for Intelligent Battlefield Awareness Service

  • Jo, Se-Hyeon;Kim, Hack-Jun;Jin, So-Yeon;Lee, Woo-Sin
    • Journal of the Korea Society of Computer and Information / v.25 no.4 / pp.11-17 / 2020
  • In this paper, we propose a method for building a knowledge base using natural language processing for an intelligent battlefield awareness service. The current command and control system manages and uses collected battlefield information and tactical data only at a basic level, such as registration, storage, and sharing, while information fusion and situation analysis are performed by an analyst. This analysis is subject to the analyst's time constraints and cognitive limitations; generally only one interpretation is drawn, and biased thinking can be reflected. Therefore, it is essential for the command and control system to be aware of the battlefield situation and to establish an intelligent decision support system. To do this, it is necessary to build a knowledge base specialized for the command and control system and to develop intelligent battlefield awareness services based on it. In this paper, among the entity names suggested in the Exobrain corpus, which is private data, the top 250 types of meaningful names were applied, and a weapon system entity type was additionally identified to properly represent battlefield information. Based on this, we propose a way to build a battlefield awareness knowledge base through mention extraction, coreference resolution, and relation extraction.

A Study on the Computational Model of Word Sense Disambiguation, based on Corpora and Experiments on Native Speaker's Intuition (직관 실험 및 코퍼스를 바탕으로 한 의미 중의성 해소 계산 모형 연구)

  • Kim, Dong-Sung;Choe, Jae-Woong
    • Korean Journal of Cognitive Science / v.17 no.4 / pp.303-321 / 2006
  • According to Harris's (1966) distributional hypothesis, understanding the meaning of a word is thought to depend on its context. Under this hypothesis about human language ability, this paper proposes a computational model of the native speaker's language processing mechanism for word sense disambiguation, based on two sets of experiments. Among the three computational models discussed in this paper, namely the logic model, the probabilistic model, and the probabilistic inference model, the experiment shows that the logic model is applied first for semantic disambiguation of the key word. Next, if the logic model fails to apply, the probabilistic model becomes most relevant. The three models were also compared with the test results in terms of the Pearson correlation coefficient. It turns out that the logic model best explains human decision behaviour on ambiguous words, and the probabilistic inference model comes next. The experiment consists of two parts. One involves 30 sentences extracted from a 1-million graphic-word corpus, and the result shows that the agreement rate among native speakers is 98% in terms of word sense disambiguation. The other part of the experiment, which was designed to exclude the logic model effect, is composed of 50 cleft sentences.
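
The model comparison by Pearson correlation coefficient mentioned in this abstract could be carried out along the following lines. This is only a sketch: the human judgment scores and the per-model scores are hypothetical numbers, not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-sentence scores: share of speakers choosing sense 1,
# and the score each model assigns to sense 1 for the same sentences.
human = [0.9, 0.2, 0.7, 0.4]
model_scores = {
    "logic": [1.0, 0.0, 1.0, 0.0],
    "probabilistic": [0.6, 0.5, 0.4, 0.5],
}

# Rank the models by how closely they track the human judgments.
ranking = sorted(
    model_scores, key=lambda m: pearson(human, model_scores[m]), reverse=True
)
print(ranking)
```

A higher coefficient means the model's scores rise and fall with the speakers' choices more closely.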

No-Reference Visibility Prediction Model of Foggy Images Using Perceptual Fog-Aware Statistical Features (시지각적 통계 특성을 활용한 안개 영상의 가시성 예측 모델)

  • Choi, Lark Kwon;You, Jaehee;Bovik, Alan C.
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.4 / pp.131-143 / 2014
  • We propose a no-reference perceptual fog density and visibility prediction model for a single foggy scene based on natural scene statistics (NSS) and perceptual "fog aware" statistical features. Unlike previous studies, the proposed model predicts fog density without multiple foggy images, without salient objects in the scene such as lane markings or traffic signs, without supplementary geographical information from an onboard camera, and without training on human-rated judgments. The proposed fog density and visibility predictor makes use of only measurable deviations from statistical regularities observed in natural foggy and fog-free images. Perceptual "fog aware" statistical features are derived from a corpus of natural foggy and fog-free images by using a spatial NSS model and observed fog characteristics including low contrast, faint color, and shifted luminance. The proposed model not only predicts perceptual fog density for the entire image but also provides local fog density for each patch size. To evaluate the performance of the proposed model against human judgments of fog visibility, we conducted a human subjective study using 100 varied foggy images. Results show that the fog density predicted by the model correlates well with human judgments. The proposed model is a new approach to fog density assessment based on human visual perception. We hope that the proposed model will provide fertile ground for future research not only to enhance the visibility of foggy scenes but also to accurately evaluate the performance of defog algorithms.

A Korean Homonym Disambiguation System Using Refined Semantic Information and Thesaurus (정제된 의미정보와 시소러스를 이용한 동형이의어 분별 시스템)

  • Kim Jun-Su;Ock Cheol-Young
    • The KIPS Transactions:PartB / v.12B no.7 s.103 / pp.829-840 / 2005
  • Word sense disambiguation (WSD) is one of the most difficult problems in Korean information processing. We propose a WSD model that can filter semantic information using the specific characteristics of dictionary definitions, with added information useful for sense determination, such as statistical, distance, and case information. We also propose a model that can resolve the issues resulting from the scarcity of semantic information data, based on the word hierarchy system (thesaurus) developed as Ulsan University's UOU Word Intelligent Network, a dictionary-based lexical database. Among the WSD models elaborated in this study, the one using statistical, distance, and case information along with the thesaurus (hereinafter referred to as the 'SDJ-X model') performed the best. In an experiment conducted on the sense-tagged corpus of 1,500,000 eojeols provided by the Sejong project, the SDJ-X model improved on the maximum-frequency word sense determination baseline (MFC) by $18.87\%$ ($21.73\%$ for nouns) and on inter-eojeol distance weights by $10.49\%$ ($8.84\%$ for nouns, $11.51\%$ for verbs). Finally, the accuracy of the SDJ-X model was higher than that of the model using only statistical, distance, and case information without the thesaurus, by a margin of $6.12\%$ ($5.29\%$ for nouns, $6.64\%$ for verbs).
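
The maximum-frequency baseline that this abstract compares against can be sketched as follows. The romanized homonyms, the sense labels, and the tiny training and test sets are hypothetical, and this shows only the baseline idea (always choose the sense seen most often in training), not the SDJ-X model.

```python
from collections import Counter, defaultdict


def train_most_frequent_sense(tagged):
    """Map each word to the sense it carries most often in the tagged data."""
    counts = defaultdict(Counter)
    for word, sense in tagged:
        counts[word][sense] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def accuracy(model, tagged):
    """Share of test tokens whose predicted sense matches the gold sense."""
    hits = sum(model.get(word) == sense for word, sense in tagged)
    return hits / len(tagged)


# Hypothetical sense-tagged tokens: (homonym, gold sense).
train = [
    ("bae", "pear"), ("bae", "ship"), ("bae", "ship"),
    ("nun", "eye"), ("nun", "snow"), ("nun", "eye"),
]
test = [("bae", "ship"), ("bae", "belly"), ("nun", "eye"), ("nun", "snow")]

baseline_model = train_most_frequent_sense(train)
baseline_accuracy = accuracy(baseline_model, test)
print(baseline_model, baseline_accuracy)
```

Any context-aware model is then judged by how far its accuracy rises above this figure.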

2-DG Autoradiographic Imaging of Brain Activity Patterns by Electroacupuncture Stimulation in Awake Rats (전침자극(電針刺戟)에 의한 흰쥐 중추신경계(中樞神經系)내 대사활성(代謝活性) 변화(變化)의 영상화(映像化) 연구(硏究))

  • Sohn, Young-Joo;Won, Ran;Jung, Hyuk-Sang;Kim, Yong-Suk;Park, Young-Bae;Sohn, Nak-Won
    • Journal of Acupuncture Research / v.18 no.3 / pp.56-68 / 2001
  • Objective: Functional brain mapping of acupuncture stimulation using [14C]2-deoxyglucose ([14C]2-DG) autoradiography provides quantitative data and a visualized pathway in the central nervous system (CNS). We aimed to investigate the neural pathway and spatial distribution of metabolic activity elicited in the CNS by electroacupuncture stimulation using [14C]2-DG autoradiography. Methods: The study was divided into three groups by stimulation time: a 45-min stimulation group following Sokoloff's method, a 5-min stimulation group following Duncun's method, and a 15-min stimulation group. A venous catheter was inserted into the right jugular vein. The rats (Sprague-Dawley, 230-260 g) were kept loosely fastened on a holding platform without anesthesia. Electroacupuncture stimulation (5 ms, 2 Hz, 1~3 mA) was applied to the left Zusanli (ST36) acupoint, and [14C]2-DG ($25{\mu}Ci/rat$) was injected through the catheter. After sacrifice, the brain and spinal cord were sectioned for film imaging. The film images were digitized as isotope concentrations by comparing their optical densities with those of the standards, and were normalized by the optical density of the corpus callosum. Results: 1. The 15-min stimulation group was the most effective of the three experiments. 2. In the 15-min stimulation group, metabolic activity in the medial geniculate nucleus, interpeduncular nucleus intermedius, ventral periolivary nucleus, caudal periolivary nucleus, medial superior olive, lateral paragigantocellular nucleus, and hypothalamic arcuate nucleus increased by more than 25% (at least p<0.05) with electroacupuncture stimulation. 3. In particular, metabolism in the hypothalamic arcuate nucleus increased by 90% (p<0.05). 4. It was demonstrated visually that the arcuate nucleus of the hypothalamus might serve as an interconnection area between the ascending and descending pathways of acupuncture stimulation. Conclusions: Further study of the significant increase in metabolic activity that electroacupuncture stimulation elicited in various hypothalamic nuclei will provide an important experimental basis for research on the relationship between electroacupuncture stimulation and internal visceral functions.

Comparison of vowel lengths of articles and monosyllabic nouns in Korean EFL learners' noun phrase production in relation to their English proficiency (한국인 영어학습자의 명사구 발화에서 영어 능숙도에 따른 관사와 단음절 명사 모음 길이 비교)

  • Park, Woojim;Mo, Ranm;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.12 no.3 / pp.33-40 / 2020
  • The purpose of this research was to examine the relation between Korean learners' English proficiency and the ratio of the length of the stressed vowel in a monosyllabic noun to that of the unstressed vowel in the article of a noun phrase (e.g., "a cup", "the bus", etc.). Generally, the vowels in monosyllabic content words are phonetically more prominent than those in monosyllabic function words because the former carry phrasal stress, which makes the vowels in content words longer in duration, higher in pitch, and louder in amplitude. This study, based on speech samples from the Korean-Spoken English Corpus (K-SEC) and the Rated Korean-Spoken English Corpus (Rated K-SEC), examined 879 English noun phrases, each composed of an article and a monosyllabic noun, taken from sentences rated on 4 levels of proficiency. The lengths of the vowels in these 879 target NPs were measured, and the ratio of the vowel lengths in nouns to those in articles was calculated. It turned out that the higher the proficiency level, the greater the mean ratio of the vowels in nouns to the vowels in articles, confirming the research hypothesis. This research thus concluded that, for Korean learners of English, the higher the English proficiency level, the better they could produce stressed and unstressed vowels with more conspicuous length differences between them.
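
The ratio calculation described in this abstract amounts to dividing the noun vowel length by the article vowel length for each noun phrase and averaging per proficiency level. A small sketch, assuming hypothetical durations in milliseconds and only two of the four levels:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements: (proficiency level, article vowel ms, noun vowel ms).
samples = [
    (1, 80, 100),
    (1, 100, 110),
    (4, 50, 150),
    (4, 60, 150),
]


def mean_ratio_by_level(samples):
    """Mean noun-to-article vowel length ratio for each proficiency level."""
    ratios = defaultdict(list)
    for level, article_ms, noun_ms in samples:
        ratios[level].append(noun_ms / article_ms)
    return {level: mean(values) for level, values in sorted(ratios.items())}


result = mean_ratio_by_level(samples)
print(result)
```

In this invented data the higher level shows the larger mean ratio, which is the direction of the effect the abstract reports.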

A Study on the Integration of Information Extraction Technology for Detecting Scientific Core Entities based on Large Resources (대용량 자원 기반 과학기술 핵심개체 탐지를 위한 정보추출기술 통합에 관한 연구)

  • Choi, Yun-Soo;Cheong, Chang-Hoo;Choi, Sung-Pil;You, Beom-Jong;Kim, Jae-Hoon
    • Journal of Information Management / v.40 no.4 / pp.1-22 / 2009
  • Large-scale information extraction plays an important role in advanced information retrieval as well as in question answering and summarization. Information extraction can be defined as the process of converting unstructured documents into formalized, tabular information, and it consists of named-entity recognition, terminology extraction, coreference resolution, and relation extraction. Since these elementary technologies have so far been studied independently, integrating all the necessary processes of information extraction is not trivial because of the diversity of their input/output formats and operating environments. As a result, it is difficult to process scientific documents so as to extract both named entities and technical terms at once. In this study, we define scientific core entities as a set of 10 types of named entities and technical terminologies in a biomedical domain. In order to automatically extract these entities from scientific documents at once, we develop a framework for scientific core entity extraction that embraces all the pivotal language processors: a named-entity recognizer, a coreference resolver, and a terminology extractor. Each module of the integrated system has been evaluated with various corpora as well as KEEC 2009. The system will be used in various information service areas such as information retrieval, question answering (Q&A), document indexing, and dictionary construction.