• Title/Abstract/Keywords: corpora

Search results: 249 items; processing time: 0.026 s

Current States and Future Plans at SiTEC for Speech Corpora for Common Use

  • 김봉완;최대림;김영일;이광현;이용주
    • 대한음성학회지:말소리
    • /
    • No. 46
    • /
    • pp.175-185
    • /
    • 2003
  • To support the speech information technology industry, it is vital to create and distribute standardized speech corpora for use in developing products and technologies. In this article we introduce the speech corpora created by the Speech Information Technology & Industry Promotion Center (SiTEC) during its first and second fiscal years (2001/5/1-2003/4/30), along with plans for corpora that are currently being created or will be created in the near future. We introduce the corpus for car applications, intended to extend speech information technology to traditional industries; the corpora for foreign languages, intended to support exports; the corpus for basic research aimed at industrial application; the corpora for common use; and others.


An Investigation for Design and Implementation of an Integrated Data Management System of Various Speech Corpora

  • 황경훈;정창원;김영일;김봉완;이용주
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 October 2003 Conference Proceedings
    • /
    • pp.69-72
    • /
    • 2003
  • In this paper, we investigate various factors relevant to the design and implementation of an integrated management system for various speech corpora. The purpose of this paper is the integrated management of the various kinds of speech corpora necessary for speech research and of speech corpora constructed in different data formats. In addition, we consider ways to allow users to search effectively for speech corpora that meet the conditions they specify, and to add newly constructed corpora with ease. To achieve this goal, we design a global schema for the integrated management of new additional information without changing existing speech corpora, and construct a web-based integrated management system based on this schema that can be accessed without temporal or spatial restrictions. We then show the steps by which these can be implemented and, examining the system, describe related topics for future study.


Construction of Integration Management System of Various Speech Corpora

  • 유경택;정창원;김도관;이용주
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 11, No. 1
    • /
    • pp.259-271
    • /
    • 2006
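  • In this paper, we review various considerations for designing and implementing an integrated management system for diverse speech corpora. The purpose of this paper is the integrated management of the various kinds of speech databases needed for speech research and of speech corpora constructed in different data formats. In addition, we considered how to let users search effectively for speech data that meet the various conditions they request and add newly constructed speech corpora with ease. To this end, we designed a global schema for the integrated management of new information without modifying existing speech corpora, and on that basis built a web-based integrated management system that can be accessed without temporal or spatial restrictions. Finally, we describe the web-based interface that results from running the service and show the effectiveness of using indexed views to implement the integrated management system.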


Bilingual lexicon induction through a pivot language

  • Kim, Jae-Hoon;Seo, Hyeong-Won;Kwon, Hong-Seok
    • Journal of Advanced Marine Engineering and Technology
    • /
    • Vol. 37, No. 3
    • /
    • pp.300-306
    • /
    • 2013
  • This paper presents a new method for constructing bilingual lexicons through a pivot language. The proposed method is adapted from the context-based approach, called the standard approach, which is well known for building bilingual lexicons from comparable corpora. The main difference between the standard approach and the proposed method is how context vectors are represented: the former represents them in the target language, while the latter represents them in a pivot language. The proposed method is thereby much simpler than the standard approach. Furthermore, the proposed method is more accurate than the standard approach because it uses parallel corpora instead of comparable corpora. The experiments are conducted on one language pair, Korean and Spanish. Our experimental results show that the proposed method is quite attractive where a parallel corpus directly between the source and target languages is unavailable, but both source-pivot and pivot-target parallel corpora are available.

Immunohistochemical Study on Role of the Monocyte Chemoattractant Protein-1 and Macrophage Subpopulations in the Rat Corpora Luteum

  • 조근자;김원식;김수일
    • 한국발생생물학회지:발생과생식
    • /
    • Vol. 13, No. 1
    • /
    • pp.51-57
    • /
    • 2009
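  • Monocyte chemoattractant protein-1 (MCP-1), which is secreted by macrophages, vascular endothelial cells, and other cells, is known to regulate macrophage activity and, during luteolysis, to initiate and promote regression of the corpus luteum. However, the mechanism by which MCP-1 acts on the development and maintenance of the corpus luteum during pregnancy and after parturition is not yet clearly understood. To examine the role of macrophages in the process of follicular development, we performed TUNEL staining and immunohistochemistry for ED1, ED2, and MCP-1 on corpora lutea of rats at different stages of pregnancy and after parturition. In postpartum corpora lutea, the number of macrophages increased significantly, the ED1 and ED2 immunoreactivity of macrophages increased, and MCP-1 immunoreactivity also increased markedly. These results suggest that macrophages mainly carry out phagocytosis in the postpartum corpus luteum, whereas in the corpus luteum of pregnancy they are mainly involved in maintaining luteal structure and function.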


Using Small Corpora of Critiques to Set Pedagogical Goals in First Year ESP Business English

  • Wang, Yu-Chi;Davis, Richard Hill
    • 아시아태평양코퍼스연구
    • /
    • Vol. 2, No. 2
    • /
    • pp.17-29
    • /
    • 2021
  • The current study explores small corpora of critiques written by Chinese and non-Chinese university students and how the strategies used by these writers compare with those of high-rated L1 students. The data comprise three small corpora of student writing: 20 student critiques from 2017, 23 student critiques from 2018, and 23 critiques from the online MICUSP collection at the University of Michigan. The researchers employ Text Inspector and Lexical Complexity to identify university students' vocabulary knowledge and awareness of syntactic complexity. In addition, WMatrix4® is used to identify and support the comparison of lexical and semantic differences among the three corpora. The findings indicate that gaps between Chinese and non-Chinese writers in the same university classes exist in students' knowledge of grammatical features and interactional metadiscourse. Chinese writers are also more likely to produce shorter clauses and sentences, and the mean value of complex nominal and coordinate phrases is smaller for Chinese students than for non-Chinese and MICUSP writers. Finally, in terms of lexical bundles, Chinese student writers prefer clausal bundles to phrasal bundles, which, according to previous studies, are more often found in texts by skilled writers. The current study's findings suggest incorporating implicit and explicit instruction through the use of corpora in language classrooms to advance the skills and strategies of all writers, but particularly of Chinese writers of English.

From Tombstones to Corpora: TSML for Research on Language, Culture, Identity and Gender Differences

  • Streiter, Oliver;Voltmer, Leonhard;Goudin, Yoann
    • 한국언어정보학회:학술대회논문집
    • /
    • 한국언어정보학회 2007 Annual Conference
    • /
    • pp.450-458
    • /
    • 2007
  • Tombstone inscriptions represent a linguistic genre which yields insights into culture and language. Creating corpora from tombstones is thus a complementary approach for the study of languages and cultures. For the annotation of tombstone corpora, we propose TSML, the Tombstone-Markup-Language, developed during the massive annotation of Taiwanese tombstones and a number of tombstones from China, Indonesia and Europe. We discuss our conceptual framework for the annotation of tombstones, then derive and present preliminary research data to show the usefulness of the annotations. Finally, we encourage researchers to participate in the specification of TSML so as to soon obtain an annotation language for annotations across cultures and languages.


Development of Basic Techniques for Ultrasound-guided Follicular Aspiration I. Measurement of Size of Ovaries, Follicles and Corpora Lutea of Korean Native Cows by Ultrasonography

  • 최민철;강태영;조성근;최상용;손우진;이효종
    • 한국수정란이식학회지
    • /
    • Vol. 12, No. 2
    • /
    • pp.203-209
    • /
    • 1997
  • This study was carried out to compare the actual size (length and height) of ovaries, follicles and corpora lutea of Korean native cows with their size on sonograms. We used three different probes (a 3.5 MHz abdominal probe, a 6.5 MHz transvaginal probe and a 5.0 MHz transrectal probe) to measure ovaries, follicles and corpora lutea on sonograms, and a caliper to measure their actual size. Under water immersion, 157 ovaries were scanned with the three probes, measured in actual size, and the measurements were compared with each other. The average height and width of ovaries of Korean native cows were 17.40$\pm$3.99 and 34.23$\pm$6.02 mm, respectively. In the comparison of height and length of ovaries and preovulatory follicles, we found that the image obtained with the transvaginal probe was nearly the same as the actual size (p<0.01), whereas the image obtained with the abdominal probe appeared larger than the actual size. In measuring the diameter of preovulatory follicles, the transvaginal probe proved closer to the actual size than the other probes, and in measuring corpora lutea all probes were accurate. In the comparison of the number of follicles by size range, there was no statistical difference in the count of follicles over 10 mm in diameter between the transvaginal probe and the naked eye.


Automatic Correction of Errors in Annotated Corpus Using Kernel Ripple-Down Rules

  • 박태호;차정원
    • 정보과학회 논문지
    • /
    • Vol. 43, No. 6
    • /
    • pp.636-644
    • /
    • 2016
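  • In natural language processing, training corpora for machine learning are very important. Large, well-curated corpora directly affect natural language processing systems. In this paper, we propose a new method for automatically correcting errors in large corpora. From an erroneous corpus and a gold-standard corpus, we automatically generated correction rules that reflect the characteristics of human-tagged documents. The correction rules are expressed using Ripple-Down Rules (RDR). To demonstrate the value of the correction method, we ran experiments on a part-of-speech tagged corpus and a named-entity tagged corpus and obtained meaningful results in both. This method can be used to minimize errors when building large corpora.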

Immunolocalization of Allatotropin Neuropeptide in the Developing Brain of the Silk Moth Bombyx mori

  • Park, Cheolin;Lee, Bong-Hee
    • Animal cells and systems
    • /
    • Vol. 5, No. 3
    • /
    • pp.211-216
    • /
    • 2001
  • Polyclonal antiserum against Manduca sexta allatotropin was used to investigate the localization of allatotropin-immunoreactivity in the brain of the silk moth Bombyx mori. Manduca sexta allatotropin-immunoreactive (Mas-AT-IR) neurons were found in all larval brains investigated, but not in prepupal, pupal or adult brains. In the larval stages, Mas-AT-immunoreactivity first appeared in the brain of first instar larvae, which contains four pairs of bilateral Mas-AT-IR cell bodies. Labeled neurons increased to six pairs in the second instar larval brain, including two pairs of median neurosecretory cells in the pars intercerebralis. In the third and fourth instar larvae, five pairs of labeled cell bodies were distributed throughout each brain. In the fifth instar, there were about ten pairs of bilateral cell bodies in the day-1 brain, about seven pairs in the day-3 brain, and five pairs in the day-5 brain. Mas-AT-labeling was observed both in axons within nervi corpora cardiaci (NCC) I+II and in the corpora allata. This suggests that the Mas-AT produced by the brain neurons is transported via some axons of the NCC I+II and nervi corpora allati I to the corpora allata, which appear to be a main accumulation site for the Mas-AT neuropeptide produced in some brain neurons of B. mori.
