• Title/Abstract/Keywords: Korean corpus

Search results: 1,186 items (processing time: 0.034 seconds)

초등학교 교과서의 어휘 통계 분석 연구 : 한국어 세종 코퍼스와의 비교를 중심으로 (The Study Of Lexical Statistics Analysis For Elementary School Textbook : Focusing On Comparing The SEJONG Corpus In Korean)

  • 유원희;임희석
    • 컴퓨터교육학회논문지 / Vol. 18, No. 1 / pp. 99-108 / 2015
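  • In this paper, we build an elementary school textbook corpus and carry out a statistical analysis of the vocabulary that appears in elementary school textbooks. We also perform a Spearman correlation analysis to examine how similar the textbook vocabulary is to the vocabulary used in everyday life. As results, we present how the textbook corpus was constructed, with actual examples, and use the correlation analysis to show numerically the correlation between the elementary school textbooks and a general corpus.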

한국어 대용량발화말뭉치의 단모음분석 (Monophthong Analysis on a Large-scale Speech Corpus of Read-Style Korean)

  • 윤태진;강윤정
    • 말소리와 음성과학 / Vol. 6, No. 3 / pp. 139-145 / 2014
  • The paper describes methods for conducting vowel analysis on a large-scale corpus with the aid of forced alignment and optimal formant ceiling methods. The 'Read Style Corpus of Standard Korean' is used to build the forced alignment system, and a subset of the corpus is used to process and extract features for vowel analysis based on the optimal formant ceiling. The results of the vowel analysis are reliable and comparable to those obtained with traditional analytical methods. The findings indicate that the methods can be extended to more fine-grained analysis without time-consuming manual labeling and without losing accuracy or reliability.

언어모델 인터뷰 영향 평가를 통한 텍스트 균형 및 사이즈간의 통계 분석 (Statistical Analysis Between Size and Balance of Text Corpus by Evaluation of the effect of Interview Sentence in Language Modeling)

  • 정의정;이영직
    • 한국음향학회:학술대회논문집 / Proceedings of the Acoustical Society of Korea 2002 Summer Conference, Vol. 21, No. 1 / pp. 87-90 / 2002
  • This paper statistically analyzes the relationship between the size and balance of a text corpus by evaluating the effect of interview sentences on the language model of a Korean broadcast news transcription system. The ultimate purpose of the system is to recognize the speech of anchors and reporters in broadcast news shows, not interview speech. However, approximately $15\%$ of the text corpus gathered for constructing the language model consists of interview sentences. Interview sentences differ in character from anchor and reporter sentences, and therefore disturb anchor- and reporter-oriented language modeling. We evaluate the effect of interview sentences on the language model and statistically analyze the relationship between corpus size and balance by repeating the same experimental procedure while varying the size of the corpus.


杜冲의 토끼 음경해면체 평활근 이완효과 (Relaxation Effects of Eucomiae Cortex in Isolated Rabbit Corpus Cavernosum Smooth Muscle)

  • 박선영
    • 동의생리병리학회지 / Vol. 29, No. 6 / pp. 485-491 / 2015
  • This study aimed to investigate the relaxation effect of Eucomiae Cortex (EC) extract on isolated rabbit corpus cavernosum smooth muscle and its mechanism. To evaluate the relaxation effect, EC extract was applied to corporal strips precontracted with phenylephrine (PE). To study the mechanism, corporal strips pretreated with Nω-nitro-L-arginine (L-NNA) were infused with EC extract and compared with untreated strips. In Ca2+-free Krebs solution, corporal strips contracted by PE were treated with 1 mM Ca2+, after which EC extract and 1 mM Ca2+ were infused in turn. Cell viability, nitric oxide (NO), and endothelial nitric oxide synthase (eNOS) in human umbilical vein endothelial cells (HUVEC) were measured by MTT assay, the Griess reagent system, and histochemical and immunohistochemical methods. EC extract showed a significant relaxation effect on the corporal strips, and this effect was inhibited by pretreatment with L-NNA. EC extract inhibited the increase in contraction caused by Ca2+ influx in Ca2+-free Krebs solution, and treatment with EC extract increased the eNOS-positive reaction in corpus cavernosum and NO production in HUVEC. These results suggest that the relaxation effect of EC extract on isolated corpus cavernosum smooth muscle involves increased eNOS and NO production and blocking of extracellular Ca2+ influx.

Using Corpora for Studying English Grammar

  • Kwon, Heok-Seung
    • 한국영어학회지:영어학 / Vol. 4, No. 1 / pp. 61-81 / 2004
  • This paper will look at some grammatical phenomena which will illustrate some of the questions that can be addressed with a corpus-based approach. We will use this approach to investigate the following subjects in English grammar: number ambiguity, subject-verb concord, concord with measure expressions, and (reflexive) pronoun choice in coordinated noun phrases. We will emphasize the distinctive features of the corpus-based approach, particularly its strengths in investigating language use, as opposed to traditional descriptions or prescriptions of structure in English grammar. This paper will show that a corpus-based approach has made it possible to conduct new kinds of investigations into grammar in use and to expand the scope of earlier investigations. Native speakers rarely have accurate information about frequency of use. A large representative corpus (i.e., The British National Corpus) is one of the most reliable sources of frequency information. It is important to base an analysis of language on real data rather than intuition. Any description of grammar is more complete and accurate if it is based on a body of real data.


코퍼스를 통한 고등학교 영어교과서의 어휘 분석 (Usage analysis of vocabulary in Korean high school English textbooks using multiple corpora)

  • 김영미;서진희
    • 영어어문교육 / Vol. 12, No. 4 / pp. 139-157 / 2006
  • As the Communicative Approach has become the norm in foreign language teaching, the objectives of teaching English in school have changed radically in Korea. The focus in high school English textbooks has shifted from mere mastery of structures to communicative proficiency. This paper studies five polysemous words that appear in twelve high school English textbooks used in Korea. The twelve textbooks are combined into a single corpus and analyzed to classify the usage of the selected words. The usage of each word is then compared with that in three other corpus-based sources: the BNC (British National Corpus) Sampler, ICE Singapore (International Corpus of English for Singapore), and the Collins COBUILD learner's dictionary, which is based on the corpus "The Bank of English". The comparisons demonstrate that Korean textbooks do not always supply the full range of meanings of polysemous words.


한국어 품사 부착 말뭉치의 오류 검출 및 수정 (Detecting and correcting errors in Korean POS-tagged corpora)

  • 최명길;서형원;권홍석;김재훈
    • Journal of Advanced Marine Engineering and Technology / Vol. 37, No. 2 / pp. 227-235 / 2013
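  • The quality of a POS-tagged corpus plays a very important role in developing a POS tagger. However, many POS-tagged corpora built in Korea, including the Sejong corpus, still contain various kinds of errors. These range from POS tagging errors to spelling errors and character insertions and deletions. In this paper, we develop a tool that detects POS tagging errors using error patterns and corrects them effectively. Correcting errors with the proposed method and tool was on average more than nine times faster, which confirms that the method is highly effective.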

배추흰나비 5령유충의 뇌신경내분비계의 구조 (Architecture of Cerebral Neuroendocrine System in the Larva of Cabbage Butterfly Pieris rapae)

  • 이봉희;윤혜련;심재원
    • 한국동물학회지 / Vol. 36, No. 2 / pp. 285-292 / 1993
  • This investigation was carried out to clarify the structural architecture of the cerebral neuroendocrine systems in the fifth instar larva of the cabbage butterfly Pieris rapae. To examine the cerebral neurosecretory cell systems, the brain and the retrocerebral neuroendocrine complex were histochemically stained with paraldehyde fuchsin. The brain of the fifth instar larva contains three kinds of neurosecretory cells: medial, lateral and tritocerebral neurosecretory cells. The axon bundles of the medial and lateral neurosecretory cells form the medial neurosecretory pathway (MNSP) and the lateral neurosecretory pathway (LNSP) within the brain, respectively. Notably, before exiting the brain, the axon bundles of the medial neurosecretory cells located in the left and right cerebral hemispheres decussate in the cerebral medial region and project to the contralateral retrocerebral neuroendocrine complexes. Outside the brain, the axon bundles of the medial and lateral neurosecretory cells form the nervi corporis cardiaca (NCC) I and II, respectively. The NCC I and II run together to the retrocerebral neuroendocrine complex, forming large nerve bundles on both the left and right sides. The axon bundles of the tritocerebral neurosecretory cells, which pass through the brain along the tritocerebral neurosecretory pathway (TNSP), form the NCC III outside the brain. Some of the NCC I and II terminate in the corpus cardiacum, while the others pass through the corpus cardiacum without terminating. The nerve bundle that passes through the corpus cardiacum forms the nervus corporis allatum (NCA) I, which runs between the corpus cardiacum and the corpus allatum. These fibers finally innervate the corpus allatum. The NCC III projects to the corpus cardiacum. However, most of the NCC III passes through the corpus cardiacum without branching and then runs on to another organ.


말뭉치와 개념정보를 이용한 명사 중의성 해소 방법 (Noun Sense Disambiguation Based-on Corpus and Conceptual Information)

  • 이휘봉;허남원;문경희;이종혁
    • 인지과학 / Vol. 10, No. 2 / pp. 1-10 / 1999
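  • This paper proposes a noun sense disambiguation method based on a corpus and conceptual information. Most previous studies use lexical co-occurrence information, but that approach requires a large amount of storage and has a low applicability rate. This paper proposes a method that resolves noun ambiguity using co-occurrence conceptual information extracted from an automatically sense-tagged Korean corpus. In evaluation experiments, the proposed method achieved an average accuracy of 82.4%, which is 1.6% higher than assigning the default sense. Considering that the test sentences differ from the training sentences, this result shows that the proposed method is useful for lexical disambiguation.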


소에서 비임신 및 임신 상태의 난소 형태와 혈중 progesterone 농도 변화에 의한 조기 임신진단 (A study on the early pregnancy diagnosis by changing of plasma progesterone concentration and morphology of ovary in pregnancy and non -pregnancy cows)

  • 김철호;박종식;신정섭;강정부
    • 한국동물위생학회지 / Vol. 31, No. 3 / pp. 397-414 / 2008
  • To evaluate the conception rate of Hanwoo in the northwestern region of Gyeongsang-nam-do, we investigated the conception rate and the reduction in the reproductive disorder rate after artificial insemination (AI) in 1,000 head of breeding cows. The study showed that 80.9% of cows were classified as fertile after the 1st and 2nd AI. For accurate pregnancy diagnosis, we performed ovariectomy and hysterotomy on 80 slaughtered cows, including 30 non-pregnant cows, compared them, and used enzyme-linked immunosorbent assay (ELISA) to estimate plasma progesterone concentration and serum luteal hormone. The mean diameter of the non-pregnant corpus luteum was $18.9 \pm 4.2 \times 15.6 \pm 3.6$ mm and that of the pregnant corpus luteum was $22.5 \pm 2.7 \times 18.7 \pm 2.9$ mm. This indicates that the corpus luteum is more developed in the ovaries of pregnant cows than in those of non-pregnant cows (P<0.05). By stage of pregnancy, the diameter of the pregnant corpus luteum was $21.3 \pm 2.4 \times 18.4 \pm 2.6$ mm in the early stage (1-3 months), $23.4 \pm 2.8 \times 19.1 \pm 2.7$ mm in the middle stage (4-6 months) and $22.8 \pm 3.0 \times 18.8 \pm 2.4$ mm in the last stage (7-9 months). This indicates that the corpus luteum is significantly more developed in the middle and last stages than in the early stage (P<0.05). The mean plasma progesterone concentration was $4.58 \pm 0.92$ ng/ml in cows with a non-pregnant corpus luteum and $8.26 \pm 0.98$ ng/ml in cows with a pregnant corpus luteum, a significant increase in the latter (P<0.02). However, it was as low as $0.58 \pm 0.39$ ng/ml in estrus (corpus albicans). The plasma progesterone concentration over the gestation period rose in proportion to the degree of development of the corpus luteum, and it was significantly higher (P<0.05) and maintained in the middle and last stages compared with the early stage. The concentration decreased sharply to $0.56 \pm 0.32$ ng/ml at parturition.
As a consequence, early pregnancy diagnosis is possible by confirming non-pregnancy when the mean plasma progesterone concentration is below 1 ng/ml 19 to 22 days after AI, and this can also be used to diagnose reproductive disorders.