• Title/Summary/Keyword: corpora


Current States and Future Plans at SiTEC for Speech Corpora for Common Use (SiTEC의 공동 이용을 위한 음성 코퍼스 구축 현황 및 계획)

  • Kim Bong-Wan;Choi Dae-Lim;Kim Young-Il;Lee Kwang-Hyun;Lee Yong-Ju
    • MALSORI / no.46 / pp.175-185 / 2003
  • To support the speech information technology industry, it is vital to create and distribute standardized speech corpora for the development of products and technologies. In this article we introduce the speech corpora created by the Speech Information Technology & Industry Promotion Center (SiTEC) during its 1st and 2nd fiscal years (2001/5/1-2003/4/30), and the plans for corpora that are currently being created or will be created in the near future. We introduce the corpus for car applications, which extends speech information technology to traditional industry; the corpora for foreign languages, which support exports; the corpus for basic research aimed at industrial application; the corpora for common use; and others.


An Investigation for Design and Implementation of an Integrated Data Management System of Various Speech Corpora (다양한 음성코퍼스의 통합관리시스템의 설계 및 구현에 관한 검토)

  • Hwang Kyunghun;Jeong Changwon;Kim Youngil;Kim Bongwan;Lee Yongju
    • Proceedings of the KSPS conference / 2003.10a / pp.69-72 / 2003
  • In this paper, we investigate various factors relevant to the design and implementation of an integrated management system for various speech corpora. The purpose of this paper is to manage, within one integrated system, the various kinds of speech corpora needed for speech research, including corpora constructed in different data formats. In addition, we consider ways to let users search effectively for speech corpora that meet their conditions and to add newly constructed corpora with ease. To achieve this goal, we design a global schema that integrates new additional information without changing the existing speech corpora, and construct a web-based integrated management system based on that schema that can be accessed without any temporal or spatial restrictions. We show the steps by which these can be implemented and, after examining the system, describe related topics for future study.


Construction of Integration Management System of Various Speech Corpora (다양한 음성코퍼스의 통합 관리시스템 구축)

  • Rhyu, Kyeong-Taek;Jeong, Chang-Won;Kim, Do-Goan;Lee, Young-Ju
    • Journal of the Korea Society of Computer and Information / v.11 no.1 s.39 / pp.259-271 / 2006
  • In this paper, we propose the design and implementation of an integrated management system for various speech corpora. The purpose of this paper is to manage, within one integrated system, the various kinds of speech corpora needed for speech research, including corpora constructed in different data formats. In addition, we consider ways to let users search effectively for speech corpora that meet their conditions and to add newly constructed corpora with ease. To achieve this goal, we design a global schema that integrates new additional information without changing the existing speech corpora, and construct a web-based integrated management system based on that schema that can be accessed without any temporal or spatial restrictions. Finally, we describe the web-based interface that results from running the service and show the efficiency of using the index view in implementing the integrated management system.
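
The global-schema idea in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record fields, the two source metadata formats, and the function names are hypothetical and are not taken from the paper. The point is only that each corpus keeps its own format, while a small adapter maps its metadata onto one shared record that can be searched uniformly.

```python
from dataclasses import dataclass, field


@dataclass
class CorpusRecord:
    """Hypothetical global-schema record shared by every corpus."""
    corpus_id: str
    language: str
    sample_rate_hz: int
    speakers: int
    # Newly added information lives here, so the source corpus is never modified.
    extra: dict = field(default_factory=dict)


def from_format_a(meta: dict) -> CorpusRecord:
    # Hypothetical source format A: keys "id", "lang", "rate_khz", "n_spk".
    return CorpusRecord(meta["id"], meta["lang"],
                        int(meta["rate_khz"] * 1000), meta["n_spk"])


def from_format_b(meta: dict) -> CorpusRecord:
    # Hypothetical source format B: keys "name", "language", "sampling_hz", "speaker_count".
    return CorpusRecord(meta["name"], meta["language"],
                        meta["sampling_hz"], meta["speaker_count"])


def search(records, **conditions):
    """Return the records whose fields equal every given condition."""
    return [r for r in records
            if all(getattr(r, key) == value for key, value in conditions.items())]


records = [
    from_format_a({"id": "car-ko", "lang": "ko", "rate_khz": 16, "n_spk": 200}),
    from_format_b({"name": "dict-en", "language": "en",
                   "sampling_hz": 8000, "speaker_count": 50}),
]
korean = search(records, language="ko")
```

Adding a corpus in a new format then needs only one more adapter function; the existing records and the search code stay unchanged.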


Bilingual lexicon induction through a pivot language

  • Kim, Jae-Hoon;Seo, Hyeong-Won;Kwon, Hong-Seok
    • Journal of Advanced Marine Engineering and Technology / v.37 no.3 / pp.300-306 / 2013
  • This paper presents a new method for constructing bilingual lexicons through a pivot language. The proposed method is adapted from the context-based approach, called the standard approach, which is well known for building bilingual lexicons from comparable corpora. The main difference between the standard approach and the proposed method is how context vectors are represented: the former represents them in the target language, while the latter represents them in a pivot language. The proposed method is therefore much simpler than the standard approach. Furthermore, the proposed method is more accurate than the standard approach because it uses parallel corpora instead of comparable corpora. The experiments are conducted on one language pair, Korean and Spanish. Our experimental results show that the proposed method is quite attractive when a parallel corpus directly between the source and target languages is unavailable but both source-pivot and pivot-target parallel corpora are available.
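
As a rough illustration of representing context vectors in a pivot language, the sketch below builds a vector for each word by counting the pivot-language words in sentences aligned with it, then ranks target candidates by cosine similarity. This is a simplified sketch under stated assumptions, not the authors' implementation: sentence-level co-occurrence counts, cosine similarity, the function names, and the toy Korean-English and Spanish-English data are all illustrative choices.

```python
import math
from collections import Counter


def context_vector(word, aligned_pairs):
    """Count the pivot-language words in sentences aligned with sentences containing `word`."""
    vec = Counter()
    for sentence, pivot_sentence in aligned_pairs:
        if word in sentence:
            vec.update(pivot_sentence)
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def best_translation(source_word, source_pivot, target_pivot, candidates):
    """Pick the target candidate whose pivot-language vector is closest to the source word's."""
    src_vec = context_vector(source_word, source_pivot)
    return max(candidates,
               key=lambda t: cosine(src_vec, context_vector(t, target_pivot)))


# Toy parallel data (hypothetical): each pair is (tokens, aligned English pivot tokens).
ko_en = [(["집", "크다"], ["the", "house", "is", "big"]),
         (["물", "차다"], ["the", "water", "is", "cold"])]
es_en = [(["la", "casa", "es", "grande"], ["the", "house", "is", "big"]),
         (["el", "agua", "está", "fría"], ["the", "water", "is", "cold"])]

print(best_translation("집", ko_en, es_en, ["casa", "agua"]))  # prints "casa"
```

Because both vectors already live in the pivot vocabulary, no seed dictionary is needed to translate one vector into the other language's space, which is the simplification the abstract refers to.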

Immunohistochemical Study on Role of the Monocyte Chemoattractant Protein-1 and Macrophage Subpopulations in the Rat Corpora Luteum (흰쥐황체에서 MCP-1과 큰포식세포아형의 역할에 관한 면역조직화학적 연구)

  • Cho, Keun-Ja;Kim, Won-Sik;Kim, Soo-Il
    • Development and Reproduction / v.13 no.1 / pp.51-57 / 2009
  • Monocyte chemoattractant protein-1 (MCP-1) is released from macrophages and endothelial cells, regulates the luteotropic and luteolytic actions of macrophages, and induces luteolysis. However, the mechanisms by which MCP-1 acts on the development and maintenance of pregnant corpora lutea remain unknown. In this experiment, TUNEL staining and ED1, ED2, and MCP-1 immunohistochemistry were carried out on the corpora lutea of pregnant rats to reveal the role of macrophages in the developing corpora lutea. In the postpartum corpora lutea, the number of macrophages increased significantly, the intensity of ED1 and ED2 immunoreactivity in macrophages increased moderately, and MCP-1 immunoreactivity also increased. In conclusion, macrophages in the postpartum corpora lutea may mainly exert phagocytic action, whereas macrophages in the pregnant corpora lutea maintain the structure and function of lutein cells.


Using Small Corpora of Critiques to Set Pedagogical Goals in First Year ESP Business English

  • Wang, Yu-Chi;Davis, Richard Hill
    • Asia Pacific Journal of Corpus Research / v.2 no.2 / pp.17-29 / 2021
  • The current study explores small corpora of critiques written by Chinese and non-Chinese university students and how the strategies used by these writers compare with those of high-rated L1 students. The data comprise three small corpora of student writing: 20 student critiques from 2017, 23 student critiques from 2018, and 23 critiques from the online MICUSP collection at the University of Michigan. The researchers employ Text Inspector and Lexical Complexity to identify university students' vocabulary knowledge and awareness of syntactic complexity. In addition, WMatrix4® is used to identify and support the comparison of lexical and semantic differences among the three corpora. The findings indicate that gaps between Chinese and non-Chinese writers in the same university classes exist in students' knowledge of grammatical features and interactional metadiscourse. Critiques by Chinese writers are also more likely to contain shorter clauses and sentences, and the mean value of complex nominals and coordinate phrases is smaller for Chinese students than for non-Chinese and MICUSP writers. Finally, in terms of lexical bundles, Chinese student writers prefer clausal bundles to phrasal bundles, which, according to previous studies, are more often found in texts by skilled writers. The findings suggest incorporating implicit and explicit instruction through the use of corpora in language classrooms to advance the skills and strategies of all writers of English, particularly Chinese writers.

From Tombstones to Corpora: TSML for Research on Language, Culture, Identity and Gender Differences

  • Streiter, Oliver;Voltmer, Leonhard;Goudin, Yoann
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.450-458 / 2007
  • Tombstone inscriptions represent a linguistic genre that yields insights into culture and language. Creating corpora from tombstones is thus a complementary approach to the study of languages and cultures. For the annotation of tombstone corpora, we propose TSML, the Tombstone-Markup-Language, developed during the large-scale annotation of Taiwanese tombstones and a number of tombstones from China, Indonesia and Europe. We discuss our conceptual framework for the annotation of tombstones, and successively derive and present preliminary research data to show the usefulness of the annotations. Finally, we encourage researchers to participate in the specification of TSML so that an annotation language for annotations across cultures and languages can soon be obtained.


Development of Basic Techniques for Ultrasound-guided Follicular Aspiration I. Measurement of Size of Ovaries, Follicles and Corpora Lutea of Korean Native Cows by Ultrasonography (초음파유도 난포란 채취를 위한 기본 기술의 개발 I. 초음파상에 나타난 한우 난소, 난포 및 황체의 크기 측정)

  • 최민철;강태영;조성근;최상용;손우진;이효종
    • Journal of Embryo Transfer / v.12 no.2 / pp.203-209 / 1997
  • This study was carried out to compare the actual size (length and height) of the ovaries, follicles and corpora lutea of Korean native cows with their size on sonograms. We used three different probes (a 3.5 MHz abdominal probe, a 6.5 MHz transvaginal probe and a 5.0 MHz transrectal probe) and a caliper to measure the ovaries, follicles and corpora lutea on sonograms and in actual size. Under water immersion, 157 ovaries were scanned with the three probes and measured in actual size, and the measurements were compared with each other. The average height and width of the ovaries of Korean native cows were $17.40 \pm 3.99$ mm and $34.23 \pm 6.02$ mm, respectively. In the comparison of the height and length of ovaries and preovulatory follicles, the image obtained with the transvaginal probe was nearly the same as the actual size (p<0.01), but with the abdominal probe the image appeared larger than the actual size. In measuring the diameter of preovulatory follicles, the transvaginal probe proved more accurate than the other probes, and in corpus luteum measurement all probes were accurate. In the comparison of the number of follicles by size range, there was no statistical difference in the count of follicles over 10 mm in diameter between the transvaginal probe and the naked eye.


Automatic Correction of Errors in Annotated Corpus Using Kernel Ripple-Down Rules (커널 Ripple-Down Rule을 이용한 태깅 말뭉치 오류 자동 수정)

  • Park, Tae-Ho;Cha, Jeong-Won
    • Journal of KIISE / v.43 no.6 / pp.636-644 / 2016
  • Annotated corpora are important for understanding natural language with machine learning methods. In this paper, we propose a new method to automate error reduction in annotated corpora. We use Ripple-Down Rules (RDR) to reduce errors and a kernel to extend RDR for NLP. We applied our system to Korean Wikipedia and blog corpora to find the types of annotation errors. Experimental results from various views of the Korean Wikipedia and blog corpora are reported to evaluate the effectiveness and efficiency of the proposed approach. The proposed approach can be used to reduce errors in large corpora.
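
For readers unfamiliar with Ripple-Down Rules, the sketch below shows a plain single-classification RDR tree applied to tag correction. It is an illustrative assumption only: the kernel extension described in the abstract is not shown, and the rule conditions, the English-style tag names, and the function names are hypothetical rather than taken from the paper. Each rule has an exception branch that is tried when its condition holds and an alternative branch that is tried when it fails; the conclusion of the last satisfied rule wins.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    condition: Callable[[dict], bool]
    conclusion: str
    except_rule: Optional["Rule"] = None  # refinement tried when the condition holds
    else_rule: Optional["Rule"] = None    # alternative tried when the condition fails


def classify(rule: Optional[Rule], case: dict, default: str) -> str:
    """Walk the RDR tree and return the conclusion of the last rule that fired."""
    result = default
    while rule is not None:
        if rule.condition(case):
            result = rule.conclusion
            rule = rule.except_rule
        else:
            rule = rule.else_rule
    return result


# Hypothetical rules: after "TO" a token is retagged as a verb,
# except when it is capitalized, in which case it is a proper noun.
root = Rule(
    condition=lambda c: c["prev_tag"] == "TO",
    conclusion="VB",
    except_rule=Rule(condition=lambda c: c["word"][0].isupper(), conclusion="NNP"),
)


def correct_tag(case: dict) -> str:
    """Return the corrected tag, keeping the original tag when no rule fires."""
    return classify(root, case, default=case["tag"])


print(correct_tag({"word": "run", "tag": "NN", "prev_tag": "TO"}))  # prints "VB"
```

New corrections are added as exception or alternative nodes under the rule that produced a wrong tag, so existing rules are never edited.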

Immunolocalization of Allatotropin Neuropeptide in the Developing Brain of the Silk Moth Bombyx mori

  • Park, Cheolin;Lee, Bong-Hee
    • Animal cells and systems / v.5 no.3 / pp.211-216 / 2001
  • Polyclonal antiserum against Manduca sexta allatotropin was used to investigate the localization of allatotropin-immunoreactivity in the brain of the silk moth Bombyx mori. Manduca sexta allatotropin-immunoreactive (Mas-AT-IR) neurons were found in all larval brains investigated, but not in prepupal, pupal and adult brains. In the larval stages, Mas-AT-immunoreactivity first appeared in the brain of first instar larvae, which contains four pairs of bilateral Mas-AT-IR cell bodies. Labeled neurons increased to six pairs in the second instar larval brain, including two pairs of median neurosecretory cells in the pars intercerebralis. In the third and fourth instar larvae, five pairs of labeled cell bodies were distributed throughout each brain. In the fifth instar, there were about ten pairs of bilateral cell bodies in the day-1 brain, about seven pairs in the day-3 brains, and five pairs in the day-5 brains. Mas-AT-labeling was observed in both the axons within nervi corpora cardiaci (NCC) I+II and the corpora allata. This suggests that the Mas-AT produced by the brain neurons is transported via some axons of the NCC I+II and nervi corpora allati I to the corpora allata, which appear to be a main accumulation site for the Mas-AT neuropeptide produced in some brain neurons of B. mori.
