• Title/Summary/Keyword: English Word

A Feature-Based Word Spotting for Content-Based Retrieval of Machine-Printed English Document Images (내용기반의 인쇄체 영문 문서 영상 검색을 위한 특징 기반 단어 검색)

  • Jeong, Gyu-Sik;Gwon, Hui-Ung
    • Journal of KIISE:Software and Applications / v.26 no.10 / pp.1204-1218 / 1999
  • Most existing digital libraries for document image retrieval offer only limited retrieval because their indexes are built from document titles and/or abstracts. This paper proposes a word spotting system, based on word image shape features, for retrieval over full English document images. To improve both the efficiency and the precision of retrieval, the system 1) combines the holistic features used in existing word spotting systems, 2) matches images by comparing the order of features in a word in addition to the number of features and their positions, and 3) adopts a two-stage retrieval strategy that first obtains results by image feature matching and then applies OCR (Optical Character Recognition) to part of those results as a filter. The proposed system operates as follows: given a document image, its structure is analyzed and it is segmented into a set of word regions. Word shape features are then extracted and stored. Given a user's text query, the corresponding word image is generated and its features are extracted. These reference features are compared with the stored features to find similar words. The system was implemented on an IBM-PC in a web environment and tested with English document images. The experimental results show the effectiveness of the proposed methods.
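
The idea of matching words by the order of their shape features can be illustrated with a rough sketch. It is not the authors' feature set: it assumes a simplified, hypothetical coding in which each character is reduced to ascender (`A`), descender (`D`), or x-height (`x`), and it compares the resulting sequences with Python's `difflib.SequenceMatcher` instead of rendering a word image.

```python
from difflib import SequenceMatcher

# Hypothetical character classes standing in for image-derived shape features.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def shape_code(word):
    """Map a word to an ordered string of coarse shape features."""
    codes = []
    for ch in word:
        if ch.isupper() or ch in ASCENDERS:
            codes.append("A")   # rises above the x-height
        elif ch in DESCENDERS:
            codes.append("D")   # drops below the baseline
        else:
            codes.append("x")   # stays within the x-height band
    return "".join(codes)


def shape_similarity(query, candidate):
    """Order-sensitive similarity in [0, 1] between two shape codes."""
    return SequenceMatcher(None, shape_code(query), shape_code(candidate)).ratio()


if __name__ == "__main__":
    for candidate in ["dog", "big", "cat", "system"]:
        print(candidate, shape_code(candidate), shape_similarity("dog", candidate))
```

In this coding "dog" and "big" receive the same shape code, which suggests why a second filtering stage such as partial OCR can be useful after feature matching.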

A Study on the Rhythm of Korean EFL Learners' English Pronunciation (한국인 영어학습자의 영어리듬구현 연구)

  • Chung, Hyun-Song
    • Phonetics and Speech Sciences / v.1 no.2 / pp.141-149 / 2009
  • An emphasis on teaching suprasegmental features of English, specifically English rhythm, is essential in order to improve the 'intelligibility' of the pronunciation of Korean EFL learners among interlocutors who use English as a Lingua Franca (ELF). By redefining the ELF suggested by Jenkins (2000, 2002), this paper argues that the Lingua Franca Core (LFC) must include suprasegmental features such as 'stress-based rhythm' and word stress. However, because 'isochrony' is difficult to measure in a foot, the rhythm unit must be expanded to an intonational phrase that contains a prominence, and the rhythm of the unit can be measured by calculating the duration of each segment in context. The rhythmic pattern of Korean learners of English and that of native speakers or other non-native English speakers can then be calculated and compared by using correlation coefficients of the segmental durations. In terms of sociolinguistic factors, improving the 'comprehensibility' and 'accentedness' of Korean EFL learners' pronunciation is also important in international communication, which calls for more emphasis on suprasegmental features.
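
The comparison by correlation coefficient described in this abstract can be sketched as follows. This is a minimal illustration that assumes Pearson's r and that both speakers' recordings have been segmented into the same sequence of segments; the duration values are invented for the example and do not come from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Hypothetical segment durations (ms) for one intonational phrase,
# aligned segment by segment across the two speakers.
native_ms = [62, 118, 55, 140, 48, 171]
learner_ms = [80, 105, 84, 112, 79, 130]

if __name__ == "__main__":
    print(f"duration correlation: {pearson(native_ms, learner_ms):.3f}")
```

A value near 1 would indicate that the learner lengthens and shortens the same segments as the reference speaker, even if the absolute durations differ.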

Japanese Expressions that Include English Expressions

  • Murata, Masaki;Kanamaru, Toshiyuki;Nakamoto, Koichirou;Kotani, Katsunori;Isahara, Hitoshi
    • Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.330-339 / 2007
  • We extracted English expressions that appear in Japanese sentences in newspaper articles and on the Internet. The results obtained from the newspaper articles showed that the preposition "in" has been regularly used for more than ten years, and it is still regularly used now. The results obtained from the Internet articles showed there were many kinds of English expressions from various parts of speech. We extracted some interesting expressions that included English prepositions and verb phrases. These were interesting because they had different word orders to the normal order in Japanese expressions. Comparing the extracted English and katakana expressions, we found that the expressions that are commonly used in Japanese are often written in the katakana syllabary and that the expressions that are not so often used in Japanese, such as prepositions, are hardly ever written in the katakana syllabary.

Culture in language: comparing cultures through words in South Africa

  • Montevecchi, Michela
    • Cross-Cultural Studies / v.24 / pp.120-131 / 2011
  • South Africa is a multiracial country where different cultures and languages coexist. Culture can be conveyed through language. Language conditioning is also social conditioning, and through words we make sense of our own and others' experience. In this paper I investigate the meaning of two culturally significant words: (English) peace and (African) ubuntu. Data findings will show how L2 speakers of English, when asked to define peace, promptly operate a process of transfer of the meaning from their mother-tongue Xhosa equivalent - uxolo - to its English equivalent. Ubuntu, an African word which encompasses traditional African values, has no counterpart in English. I will also argue how, in the ongoing process of globalisation, English is playing a predominant role in promoting cultural homogenization.

A System of English Vowel Transcription Based on Acoustic Properties (영어 모음음소의 표기체계에 관한 연구)

  • 김대원
    • Proceedings of the KSLP Conference / 2003.11a / pp.170-173 / 2003
  • There are more than five systems for transcribing English vowels. Because of this diversity, teachers and students of English face considerable problems with the vowel symbols used in English-Korean dictionaries, English textbooks, and books on phonetics and phonology. This study was designed to suggest criteria for the phonemic transcription of English vowels on the basis of their phonetic properties, and a system of English vowel transcription based on those criteria, in order to minimize the problems caused by inter-system differences. A speaker (phonetician) of RP English uttered a series of isolated minimal pairs containing the vowels in question. The suggested vowel symbols are as follows: 1) Simple vowels: /i:/ in beat, /I/ bit, /ε/ bet, /æ/ bat, /a:/ father, /ɒ‖a/ bod, /ɔ:/ bawd, /u/ put, /u:/ boot, /ʌ/ but, /ə/ about, and /ɜ:‖ɜ:r/ bird. 2) Diphthongs: /aI/ in bite, /au/ bout, /ɔI/ boy, /ɜu‖ou/ boat, /eI/ bait, /eə‖eər/ air, /uə‖uər/ poor, /iə‖iər/ beer. Where two symbols are shown corresponding to the vowel in a single word, the first is appropriate for most speakers of British English and the second for most speakers of American English.

An English-to-Korean Transliteration Model based on Grapheme and Phoneme (자소 및 음소 정보를 이용한 영어-한국어 음차표기 모델)

  • Oh Jong-Hoon;Choi Key-Sun
    • Journal of KIISE:Software and Applications / v.32 no.4 / pp.312-326 / 2005
  • There has been increasing interest in English-to-Korean transliteration recently. Previous works use either a direct method, <English graphemes → Korean graphemes>, or a pivot method, <English graphemes → English phonemes → Korean graphemes>. Although most previous works focus on the direct method, transliteration is a phonetic process rather than an orthographic one. From this point of view, we present an English-Korean transliteration model that uses both grapheme and phoneme information. Unlike the previous works, our method uses phonetic information such as phonemes and their context. Moreover, we also use the graphemes corresponding to the phonemes. Our method shows about 60% word accuracy.

A Diachronic Lexical Analysis of the North Korean English Textbooks (북한 영어 교과서 어휘의 통시적 분석)

  • Kim, Jiyoung;Lee, Je-Young;Kim, Jeong-ryeol
    • The Journal of the Korea Contents Association / v.17 no.4 / pp.331-341 / 2017
  • This paper analyzes the English vocabulary of North Korean textbooks diachronically using a purpose-built English textbook corpus. The North Korean English textbooks, obtained from the Information Center on North Korea of the Ministry of Unification, are divided into 'before' and 'after' Kim Jong-Il era groups at 1996, the year in which the curriculum was revised. They are stored as text files so that the vocabulary can be analysed with WordSmith Tools 7.0. The vocabulary size of the revised textbooks increased after the curriculum reorganization, but the number of vocabulary types and vocabulary diversity decreased. After the curriculum revision, many words related to the establishment of the Kim Jong-Il system appeared as keywords. Some vocabulary also reflected the economic and social life of North Korea. In addition, a comparison of the 100 highest-frequency words and the keywords suggests that the vocabulary of North Korean English textbooks is gradually shifting from content related to written language toward communicative content.
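
The kind of counts reported in this abstract (vocabulary size, number of types, diversity) can be approximated with a short sketch. The study itself used WordSmith Tools 7.0; the code below is only an assumed stand-in that uses a simple regular-expression tokenizer and the type-token ratio as one common diversity measure, and the sample sentences are invented.

```python
import re
from collections import Counter


def lexical_stats(text):
    """Return token count, type count, and type-token ratio for a text."""
    # Lowercase words, allowing one internal apostrophe (e.g. "don't").
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    ttr = n_types / n_tokens if n_tokens else 0.0
    return {"tokens": n_tokens, "types": n_types, "ttr": ttr, "counts": counts}


if __name__ == "__main__":
    # Hypothetical stand-ins for the two textbook sub-corpora.
    before = "The boy reads a book. The girl writes a letter."
    after = "We study English. We study hard. We study every day."
    for label, text in [("before", before), ("after", after)]:
        stats = lexical_stats(text)
        print(label, stats["tokens"], stats["types"], round(stats["ttr"], 2),
              stats["counts"].most_common(3))
```

Because the raw type-token ratio falls as a text grows, comparisons between sub-corpora of different sizes usually need a length-normalized variant.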

A Study on the Inputting Method of English Pronunciation for a Computer by the Combining Diacritical Mark (조합분음기호에 의한 영어 발음기호의 컴퓨터 입력방법에 관한 연구)

  • Lee Hyun-Chang
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.4 s.310 / pp.31-38 / 2006
  • In this paper, a method of inputting English pronunciation symbols into a computer using combining diacritical marks is studied. The English pronunciation system and its notation methods are investigated, and the conditions for inputting English pronunciation easily are analysed. On that basis, an inputting method is presented that can input 3- and 4-level stress as well as 2-level stress. With this method, English pronunciation can be entered into spreadsheets, databases, and presentations as well as word processors, and the data are compatible across application programs. In the experiments, the data were compatible across all application programs, and inputting speed was much higher than with the individual vowel method, which is itself faster than the pre-existing functions of word processors.
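
How combining diacritical marks attach stress to a vowel can be shown with a small sketch. The mapping below is an assumption for illustration, not the scheme from the paper: it uses the Unicode combining acute accent (U+0301) for primary stress and the combining grave accent (U+0300) for secondary stress, and the function name `mark_stress` is hypothetical.

```python
import unicodedata

# Assumed mapping from stress level to a Unicode combining mark.
STRESS_MARKS = {
    1: "\u0301",  # COMBINING ACUTE ACCENT, used here for primary stress
    2: "\u0300",  # COMBINING GRAVE ACCENT, used here for secondary stress
}


def mark_stress(word, vowel_index, level):
    """Insert the combining mark for `level` after the vowel at `vowel_index`."""
    mark = STRESS_MARKS[level]
    return word[:vowel_index + 1] + mark + word[vowel_index + 1:]


marked = mark_stress("photograph", 2, 1)          # base letter + combining mark
composed = unicodedata.normalize("NFC", marked)   # precomposed form where one exists

if __name__ == "__main__":
    print(marked, len(marked))
    print(composed, len(composed))
```

Because the mark is an ordinary Unicode code point that follows its base letter, the same string can be stored in any Unicode-aware application, which is in line with the cross-application compatibility the abstract reports.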