• Title/Summary/Keyword: Letters and Words

Automatic Extraction of Alternative Words using Parallel Corpus (병렬말뭉치를 이용한 대체어 자동 추출 방법)

  • Baik, Jong-Bum;Lee, Soo-Won
    • Journal of KIISE:Computing Practices and Letters / v.16 no.12 / pp.1254-1258 / 2010
  • In information retrieval, different surface forms of the same object can degrade system performance. In this paper, we propose a method for extracting alternative words that uses translation words as the features of each word extracted from a parallel corpus of Korean/English patent title pairs. We also propose an association-word filtering method that removes association words from an alternative word list. Evaluation results show that the proposed method outperforms other alternative word extraction methods.
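
The idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy title pairs, the cosine similarity measure, the 0.9 threshold, and the co-occurrence-based association filter are all assumptions introduced here. Each Korean word is represented by the English words found in its aligned titles, and words with near-identical translation profiles are proposed as alternatives.

```python
from collections import Counter, defaultdict
from math import sqrt

# Invented toy corpus of (Korean title tokens, English title tokens) pairs.
TITLE_PAIRS = [
    (["휴대폰", "케이스"], ["mobile", "phone", "case"]),
    (["핸드폰", "케이스"], ["mobile", "phone", "case"]),
    (["휴대폰", "충전기"], ["mobile", "phone", "charger"]),
    (["핸드폰", "충전기"], ["mobile", "phone", "charger"]),
    (["자동차", "엔진"], ["car", "engine"]),
]

def build_features(pairs):
    """Collect translation-word counts and same-title co-occurrences per Korean word."""
    features = defaultdict(Counter)
    cooccur = defaultdict(set)
    for korean, english in pairs:
        for word in korean:
            features[word].update(english)
            cooccur[word].update(w for w in korean if w != word)
    return features, cooccur

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def alternative_words(word, features, cooccur, threshold=0.9):
    """Return words whose translation profile is close to that of `word`."""
    target = features.get(word)
    if target is None:
        return []
    associated = cooccur.get(word, set())
    result = []
    for other, vector in features.items():
        # Assumed heuristic filter: words that share a title with `word`
        # are treated as association words, not alternatives.
        if other == word or other in associated:
            continue
        if cosine(target, vector) >= threshold:
            result.append(other)
    return sorted(result)

features, cooccur = build_features(TITLE_PAIRS)
print(alternative_words("휴대폰", features, cooccur))
```

In this toy data, 케이스 shares most of its translation words with 휴대폰, so a similarity threshold alone would not separate the two; the co-occurrence filter is what excludes it here.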

Literacy-Related Communication and Information Types in Social Pretend Play (사회적 가상놀이에서 나타난 문해 관련 의사소통 및 정보 유형)

  • Cho, Eun Jin;Bae, Jae Jung
    • Korean Journal of Child Studies / v.20 no.4 / pp.247-263 / 1999
  • Literacy-related communication and information types occurring naturally in the dramatic play area were observed during free play over a 4-week period. Participants were 21 boys and 16 girls enrolled in a kindergarten class in Taegu. The types of literacy-related communication used frequently during social pretend play were Description, Suggestion, Question, and Answer. Negative types of literacy-related communication, such as Threat, Protest, and Warning, were rare. The frequently occurring types of literacy information concerned letters and words, and literacy functions. These findings are discussed with respect to their curricular implications for the classroom.

A Study on Processing of Speech Recognition Korean Words (한글 단어의 음성 인식 처리에 관한 연구)

  • Nam, Kihun
    • The Journal of the Convergence on Culture Technology / v.5 no.4 / pp.407-412 / 2019
  • In this paper, we propose a technique for processing speech recognition of Korean words. Speech recognition is a technology that converts acoustic signals from sensors such as microphones into words or sentences. Most foreign languages pose less difficulty for speech recognition. Korean, on the other hand, consists of vowels and final consonants, so it is inappropriate to use the letters obtained from the voice synthesis system as they are. Improving the conventional speech recognition structure can yield correct word recognition. To solve this problem, a new algorithm was added to the existing speech recognition structure to increase the recognition rate. The word is first preprocessed and the result is tokenized. The results of the Levenshtein distance algorithm and the hashing algorithm are then combined, and the normalized word is output through the consonant comparison algorithm. The final word is compared with a standardized table: it is output if it exists in the table and registered in the table if it does not. The experimental environment was developed as a smartphone application. The proposed structure improves the recognition rate by 2% for the standard language and 7% for dialect.
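
The table lookup and Levenshtein steps described above can be sketched roughly as below. This is an illustration under assumptions, not the paper's algorithm: the function names, the distance limit of 1, the sample words, and the use of Unicode NFD decomposition to compare words letter by letter (initial consonant, vowel, final consonant) are all introduced here, and the hashing and consonant comparison steps are not modeled.

```python
import unicodedata

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def to_jamo(word):
    """Split Hangul syllables into their component letters (jamo) via NFD."""
    return unicodedata.normalize("NFD", word)

def normalize_word(word, table, max_distance=1):
    """Return the table entry for `word`, or register `word` if none is close."""
    if word in table:
        return word
    jamo = to_jamo(word)
    # Pick the closest entry; ties are broken alphabetically for determinism.
    best = min(table, key=lambda e: (levenshtein(jamo, to_jamo(e)), e), default=None)
    if best is not None and levenshtein(jamo, to_jamo(best)) <= max_distance:
        return best
    table.add(word)  # unknown word: register it in the table
    return word

standard_table = {"학교", "학생"}
print(normalize_word("핵교", standard_table))  # one vowel letter away from 학교
print(normalize_word("바다", standard_table))  # not close to any entry, so registered
```

Comparing at the letter level rather than the syllable level lets a single changed vowel or final consonant count as one edit, which is one plausible way to tolerate small pronunciation variants.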

A Study on the Keyboard of Jawi Script (Arabic-Malay Script) (아랍식-말레이문자(Jawi Script) 키보드(Keyboard)에 관한 연구)

  • KANG, Kyoung Seok
    • SUVANNABHUMI / v.3 no.1 / pp.47-66 / 2011
  • Malay society is rooted in Islamic concepts. Islam influenced every corner of Malay society, which had once been on the edge of the Indus and Ganges civilizations. At one time Sanskrit, the script of the Hindu religion, was adopted in Malay society in an attempt to put the Malay language, Bahasa Melayu, into practical writing, but in vain. Sanskrit was too complicated for Malay society to imitate and use in everyday life, because it was a totally different type of script with many similar allographs for a single sound. In the end Malay society gave it up and used the Malay language without any script of its own. A few centuries later Islam entered Malay society, bringing Arabic letters with it. As Islam spread widely, it influenced not only Malay culture but also religious life. Eventually Arabic letters became the very means by which the Malay language was written. That is, letters formerly used for the Arabic language were adapted into a similar script for a new language, Malay. Arabic letters pose no problems for the Arabic language, but they pose some for Malay, since they were naturally designed for Arabic alone and not for any other language. As a result, a few problems arose in writing the Malay consonants p, ng, g, c, ny and v. These 6 sounds could not be written in Arabic letters and were unfamiliar to Arabic speakers. Malay society therefore had to create modified letter forms for these 6 sounds, which occur frequently in Malay. As a result, pa was derived from fa, nga from ain, ga from kaf, ca from jim, nya from tha or ba, and va from wau.
Where should these 6 newly modified letters be placed on the Arabic keyboard? This is the core question of this working paper. The 6 letters were placed in the positions of 6 Arabic signs that are scarcely written in Malay. Those signs are typed only with the shift key. The newly designed letters took the original positions of fatha, kasra, damma, sukun, tanween and so on. The main difference between the two sets of 6 is this: the 6 on the original Arabic keyboard are only signs for Arabic letters, whereas the 6 Malay ones are real letters. In other words, on the Malay keyboard the 6 newly modified Malay letters were substituted for 6 unused Arabic signs. This type of Malay Jawi script keyboard is still used in Malaysia, Brunei and some other Malay countries. However, this kind of keyboard also needs to move toward another keyboard system, one that matches the alphabetically ordered keyboard. That is, alif would be typed with the A key, and zai with the Z key. This keyboard system is called the 'Malay Jawi-English Rumi matching keyboard system', even though it would probably be inconvenient for Malay Jawi experts who are used to the Arabic 'alif-ba-ta' order.
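
The substitution described above, replacing rarely used shift-layer signs with the six Jawi letters, can be pictured as a simple remapping. The sketch below is only illustrative: the abstract does not say which sign each letter replaces or which physical keys are involved, so the key names, the sign-to-letter pairing, and the choice of two tanween forms are hypothetical. The Unicode code points are the standard ones for these characters.

```python
# The six Jawi letters named in the abstract, written as Unicode escapes.
JAWI_LETTERS = {
    "pa": "\u06A4",   # ARABIC LETTER VEH (fa with three dots)
    "nga": "\u06A0",  # ARABIC LETTER AIN WITH THREE DOTS ABOVE
    "ga": "\u06AC",   # ARABIC LETTER KAF WITH DOT ABOVE
    "ca": "\u0686",   # ARABIC LETTER TCHEH (jim with three dots)
    "nya": "\u06BD",  # ARABIC LETTER NOON WITH THREE DOTS ABOVE
    "va": "\u06CF",   # ARABIC LETTER WAW WITH DOT ABOVE
}

# Arabic vowel signs that Malay writing rarely needs.
FATHA, KASRA, DAMMA, SUKUN = "\u064E", "\u0650", "\u064F", "\u0652"
FATHATAN, KASRATAN, SHADDA = "\u064B", "\u064D", "\u0651"

# Hypothetical shift layer: key names are placeholders, not a verified layout.
ARABIC_SHIFT_LAYER = {
    "Q": FATHA, "W": FATHATAN, "E": DAMMA,
    "A": KASRA, "S": KASRATAN, "X": SUKUN,
    "~": SHADDA,
}

# Hypothetical pairing of each replaced sign with one Jawi letter.
REPLACED_SIGNS = [FATHA, KASRA, DAMMA, SUKUN, FATHATAN, KASRATAN]
SIGN_TO_JAWI = dict(zip(REPLACED_SIGNS, JAWI_LETTERS.values()))

def build_jawi_shift_layer(shift_layer, replacements):
    """Copy the shift layer, swapping each replaced sign for its Jawi letter."""
    return {key: replacements.get(char, char) for key, char in shift_layer.items()}

jawi_layer = build_jawi_shift_layer(ARABIC_SHIFT_LAYER, SIGN_TO_JAWI)
for key, char in jawi_layer.items():
    print(key, f"U+{ord(char):04X}")
```

Keys whose signs are not in the replacement table, such as the shadda placeholder here, keep their original character, which mirrors the idea that only the unused signs give up their positions.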

Watermarking System That Inserts Copyright Holder's Logo (저작권자의 로고를 워터 마킹하는 장치)

  • 남상엽;이천우;김형배;이상원;박인정
    • Proceedings of the IEEK Conference / 2003.07d / pp.1487-1490 / 2003
  • This paper presents a watermarking system that inserts the copyright holder's logo into a music file. In other words, a sound file can carry image information such as a logo or letters. The system converts the sound file into an image using a spectrogram, and the logo is inserted in the spectrogram domain using spread spectrum. The results show that the proposed technique verifies copyright better than the method using a PN sequence.
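
For readers unfamiliar with spread-spectrum embedding, the following is a generic sketch of the principle, not the system in the paper. It spreads each logo bit over several coefficients with a seeded pseudo-random +1/-1 sequence, so it may be closer to the PN-sequence baseline the abstract mentions than to the proposed scheme. The flat list of coefficients stands in for spectrogram magnitudes, detection assumes the original coefficients are available, and all names and parameter values are assumptions.

```python
import random

def chip_sequence(seed, length):
    """Seeded pseudo-random sequence of +1/-1 chips shared by embedder and detector."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]

def embed(coeffs, bits, seed, strength=0.5):
    """Spread each logo bit over a run of coefficients and add it to the host."""
    chips_per_bit = len(coeffs) // len(bits)
    chips = chip_sequence(seed, chips_per_bit * len(bits))
    marked = list(coeffs)
    for i, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(i * chips_per_bit, (i + 1) * chips_per_bit):
            marked[j] += strength * sign * chips[j]
    return marked

def extract(marked, original, n_bits, seed):
    """Recover bits by correlating the difference signal with the chip sequence."""
    chips_per_bit = len(marked) // n_bits
    chips = chip_sequence(seed, chips_per_bit * n_bits)
    bits = []
    for i in range(n_bits):
        span = range(i * chips_per_bit, (i + 1) * chips_per_bit)
        correlation = sum((marked[j] - original[j]) * chips[j] for j in span)
        bits.append(1 if correlation > 0 else 0)
    return bits

# A 3x3 binary "logo" flattened row by row, and stand-in host coefficients.
logo_bits = [1, 0, 1, 0, 1, 0, 1, 0, 1]
host = [0.01 * i for i in range(90)]
marked = embed(host, logo_bits, seed=7)
print(extract(marked, host, len(logo_bits), seed=7))
```

Only a holder of the seed can regenerate the chip sequence, which is what ties a recovered logo to the copyright holder in schemes of this kind.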

Using Roots and Patterns to Detect Arabic Verbs without Affixes Removal

  • Abdulmonem Ahmed;Aybaba Hancrliogullari;Ali Riza Tosun
    • International Journal of Computer Science & Network Security / v.23 no.4 / pp.1-6 / 2023
  • Morphological analysis, a branch of natural language processing, is now a rapidly growing field. Its fundamental tenet is that it can establish the roots or stems of words and enable comparison with the original term. Arabic is a highly inflected and derivational language with a strong structure. Because of the non-concatenative nature of Arabic morphology, each root or stem can have a large number of affixes attached to it, which increases the number of possible inflected words. Accurate verb recognition and extraction are necessary for nearly all problems in well-known research areas, including Web search, information retrieval, machine translation, question answering and so forth. In this work we designed and implemented an algorithm to detect and recognize Arabic verbs in Arabic text. The suggested technique was built with Python and the PyQt5 visual package, allowing for quick modification and easy addition of new patterns. We employed 17 alternative patterns to represent all verbs in terms of singular, plural, masculine, and feminine pronouns as well as past, present, and imperative verb tenses. All of the verbs that matched these patterns were used when a verb has a root, and the outcomes were reliable. The approach is able to recognize all verbs with the same structure without requiring any alterations to the code or design. The verbs that are not recognized by our method have no antecedents in the Arabic roots. According to our work, the strategy can rapidly and precisely identify verbs with roots, but it cannot be used to identify verbs that are not in the Arabic language. As a result, we advise employing a hybrid approach that combines several principles.
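
A rough sketch of root-and-pattern matching in the spirit of the abstract above is given below. It is not the authors' code: the three-entry root list and the six patterns are a hypothetical subset chosen for illustration (the paper reports 17 patterns), text is assumed to be unvocalized, and each pattern is modeled simply as a fixed prefix and suffix around a three-consonant root slot.

```python
# Tiny illustrative root list; a real system would load a full root lexicon.
ROOTS = {"كتب", "درس", "ذهب"}

# (prefix, suffix, description) around a three-consonant root slot.
# Hypothetical subset for unvocalized text, not the paper's 17 patterns.
PATTERNS = [
    ("", "", "past, masculine singular"),
    ("", "وا", "past, masculine plural"),
    ("ي", "", "present, masculine singular"),
    ("ت", "", "present, feminine singular"),
    ("ي", "ون", "present, masculine plural"),
    ("ا", "", "imperative, masculine singular"),
]

def detect_verb(word):
    """Return (root, description) for every pattern whose root slot holds a known root."""
    matches = []
    for prefix, suffix, description in PATTERNS:
        if word.startswith(prefix) and word.endswith(suffix):
            # Read the root slot out of the pattern instead of stemming the word.
            slot = word[len(prefix):len(word) - len(suffix)]
            if slot in ROOTS:
                matches.append((slot, description))
    return matches

for token in ["يكتبون", "كتبوا", "كتاب"]:
    print(token, detect_verb(token))
```

Adding a pattern means adding one tuple to the list, which reflects the abstract's point about extending the pattern set without changing the code. The sketch does not disambiguate unvocalized forms that can also be read as nouns.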

The Processing Unit in Korean Words (한글 낱말의 처리 단위)

  • 이준석;김경린
    • Korean Journal of Cognitive Science / v.1 no.2 / pp.221-239 / 1989
  • The purpose of this study was to explore the processing unit in Korean words. Three experiments were conducted to examine this question. A preliminary experiment and Experiment 1 were carried out to delineate the processing unit in single-syllable words, and Experiment 2 did so for words of two or more syllables. The major finding of the preliminary experiment was that the effect of consonant type was not significant but that of letter position was: reaction time increased with letter position. The difference in reaction time between the first and second positions was not significant, but the difference between the second and third was. In Experiment 1, the effect of the number of letters was significant: reaction time increased as the number of letters increased. The size of the position effect was comparable in the preliminary experiment and Experiment 1. In Experiment 2, regardless of the presence of final consonant(s), reaction time increased linearly as the number of syllables increased from two to four. The findings of the present study suggest that (1) the processing unit in single-syllable Korean words is a syllable without the final consonant(s), but (2) in words of two or more syllables, the unit is likely to be a syllable with the final consonant(s).

The Description Rule of Terms and Characters in Databases (데이터베이스의 사용문자(使用文字) 및 용어(用語) 표기법(表記法))

  • Kim, Tae-Jung;Lee, Chang-Han
    • Journal of Information Management / v.19 no.1 / pp.95-122 / 1988
  • Because there was no common rule for describing characters and terms in bibliographic databases, it was hard to share information with other organizations and to obtain relevant information through online retrieval. In this paper, the authors suggest a rule for transcribing symbols and letters that are found in articles but cannot be input through CRT terminals into symbols and letters that can be input and retrieved. In addition, the 'Hangul orthography' and the 'Description rule of the borrowed words', officially announced by the Ministry of Education, permit multiple ways of expressing some terms. In such cases, the expressions are regulated to improve retrieval efficiency and to prevent confusion in description.

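A Proposal for Distinguishing Korean Heteronyms in Writing: Comparing IPA, Roman-Alphabet, and Hangul Transcriptions Side by Side (우리말 동철이음어 구별표기안 - IPA, 로마자, 한글표기를 나란히 견주어 -)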

  • Yu Man-Geun
    • MALSORI / no.31_32 / pp.51-82 / 1996
  • The purpose of this paper is to gather pairs of heteronyms in Modern Korean and to propose that all of them should be differentiated in both the Hanngul orthography and Romanization as well as in the IPA transcription. More than a quarter of the whole Korean vocabulary consists of words with a long vowel, and the number of minimal pairs distinguished only by the chroneme reaches nearly ten thousand (i.e. twenty thousand words). It is suggested here that the letter s in Hanngul and the letter 'h' in the Roman alphabet be used to represent the long vowel. Another factor that produces many heteronyms in Korean is the lack of sufficient indication of non-automatic reinforcement in the initial consonant of a word (or a morpheme) when it follows another within a phrase (or a word). It is proposed here that the non-automatically reinforced word-initial consonant should be written with the letter h (like ㅺ, ㅼ, ㅽ, ㅾ) and an apostrophe (like 물'새 or 밭'이랑, 물'약) in Hanngul, and with the letter c and an apostrophe (like c'g-, c'd-, c'b-, c'j-) in the Roman alphabet. The morpheme-initial reinforced consonant within a word is written with the letters k, t, p and cz for ㅺ, ㅼ, ㅽ, and ㅾ respectively. The contrasted pronunciations of pairs of heteronyms beginning with the ㅁ/m sound are transcribed here for exemplification in the IPA, the Roman alphabet and Hanngul.

Heteronyms in modern Korean and their transcription in the IPA and the Roman alphabet (우리말 동철이음어(同綴異音語) IPA.로마자 표기 (사~섬))

  • Youe MahnGunn
    • MALSORI / no.37 / pp.49-71 / 1999
  • The purpose of this paper is to gather pairs of heteronyms in modern Korean and transcribe them in the IPA and the Roman alphabet in order to propose that all of them should be differentiated in Hanngul orthography. More than a quarter of the whole Korean vocabulary consists of words with a long vowel, and the number of minimal pairs distinguished only by the chroneme reaches nearly ten thousand (i.e. twenty thousand words). In Romanization, the letter h in syllable-final position is used here to represent the long vowel, except for the vowel '으' [?:], which is transcribed by doubling the letter u (i.e. uu). Another factor that produces many heteronyms in Korean is the lack of full indication of non-automatic reinforcement in the initial consonant of a word (or a morpheme) when it is preceded by another within a phrase (or a word). These reinforced word-initial consonants are written here with the letter c and an apostrophe (like c'g-, c'd-, c'b-, c's-, c'j-) in Romanization. The reinforced morpheme-initial consonant within a word is written with the letters k, t, p, ss and cz for the ㄲ, ㄸ, ㅃ, ㅆ and ㅉ sounds respectively. The contrasted pronunciations of pairs of heteronyms beginning with the ㅅ /sʰ/ and ㅆ /s/ sounds are transcribed here for exemplification.
