• Title/Summary/Keyword: language-specific features

Language-Independent Sentence Boundary Detection with Automatic Feature Selection

  • Lee, Do-Gil
    • Journal of the Korean Data and Information Science Society / v.19 no.4 / pp.1297-1304 / 2008
  • This paper proposes a machine learning approach to language-independent sentence boundary detection. The proposed method requires neither heuristic rules nor language-specific features such as part-of-speech information or lists of abbreviations and proper names. Using only language-independent features, we perform experiments on both an inflectional language and an agglutinative language with fairly different characteristics (in this paper, English and Korean, respectively), and obtain good performance in both. We also evaluate the method under a wide range of experimental conditions, particularly with respect to the selection of useful features.

An Extensible Programming Language for Plugin Features (플러그인 언어로 확장 가능한 프로그래밍 언어)

  • 최종명;유재우
    • Journal of KIISE:Software and Applications / v.31 no.5 / pp.632-642 / 2004
  • Modern software is characterized by modularity and extensibility, and there has been considerable research on extensible programming languages and compilers. In this paper, we introduce the Argos programming language, which provides extensibility through the concept of plugin languages. A plugin language is used to define a method of a class, and plugin language processors can be added and replaced dynamically. Plugin languages may be used to support multiparadigm programming or domain-specific languages.
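
The plugin-language idea described in this abstract can be illustrated with a minimal Python sketch. This is not Argos and does not reflect its syntax or implementation: `PluginRegistry`, `define_method`, and the two toy languages (`expr` and `template`) are hypothetical names invented here. The sketch only shows the general pattern of defining a class method in a named language whose processor can be registered or replaced at run time.

```python
from typing import Callable, Dict

# A processor turns method source text into a callable taking (self, x).
# All names here are hypothetical; none come from Argos itself.
Processor = Callable[[str], Callable[..., object]]


class PluginRegistry:
    """Maps a plugin-language name to the processor that handles it."""

    def __init__(self) -> None:
        self._processors: Dict[str, Processor] = {}

    def register(self, language: str, processor: Processor) -> None:
        # Registering an existing name replaces the old processor.
        self._processors[language] = processor

    def build(self, language: str, source: str) -> Callable[..., object]:
        return self._processors[language](source)


def expr_processor(source: str) -> Callable[..., object]:
    # Toy language 1: the body is one arithmetic expression over x.
    code = compile(source, "<plugin:expr>", "eval")
    return lambda self, x: eval(code, {"__builtins__": {}}, {"x": x})


def template_processor(source: str) -> Callable[..., object]:
    # Toy language 2: the body is a format string with an {x} placeholder.
    return lambda self, x: source.format(x=x)


def define_method(registry: PluginRegistry, cls: type, name: str,
                  language: str, source: str) -> None:
    # Attach a method whose body is written in the chosen plugin language.
    setattr(cls, name, registry.build(language, source))


class Square:
    pass


registry = PluginRegistry()
registry.register("expr", expr_processor)
registry.register("template", template_processor)

define_method(registry, Square, "area", "expr", "x * x")
define_method(registry, Square, "label", "template", "square of side {x}")

sq = Square()
print(sq.area(3))   # 9
print(sq.label(3))  # square of side 3

# Replace the "template" processor at run time, then redefine the method.
registry.register(
    "template", lambda src: (lambda self, x: src.format(x=x).upper())
)
define_method(registry, Square, "label", "template", "square of side {x}")
print(sq.label(3))  # SQUARE OF SIDE 3
```

Keeping the processors behind a name-keyed registry is one simple way to make them replaceable without touching the classes that use them. A real language implementation would presumably do this at compile or load time rather than through `setattr`.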

Effect of language on fundamental frequency: Comparison between Korean and English produced by L2 speakers and bilingual speakers

  • Lim, Soo Bin;Lee, Goun;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.8 no.4 / pp.15-22 / 2016
  • This study examines whether fundamental frequency (F0) varies by language or distinguishes L1 (first language) from L2 (second language) speech, and whether the type of reading material, which varies in how consonant voicing is controlled, affects the use of F0, especially mean F0. For this purpose, we compared productions in two languages by Korean L2 learners of English with those of Korean-English bilingual speakers. Twelve Korean L2 speakers of English and twelve Korean-English bilingual speakers participated in this study. The subjects read aloud 22 declarative sentences (balanced and unbalanced), once in English and once in Korean. Mean F0 was higher in Korean than in English for both speaker groups, and the size of the difference in mean F0 between the Korean and English sentences depended on the type of material that the participants read. With regard to F0 range, the L2 speakers had a larger F0 range in English than in Korean; however, the effect of language on F0 range was not statistically significant for the bilingual speakers. These results indicate that language-specific properties may affect the use of F0, in particular mean F0.

Crowdfunding Scams: The Profiles and Language of Deceivers

  • Lee, Seung-hun;Kim, Hyun-chul
    • Journal of the Korea Society of Computer and Information / v.23 no.3 / pp.55-62 / 2018
  • In this paper, we propose a model to detect crowdfunding scams, which have reportedly been occurring over the last several years, based on their project information and linguistic features. To this end, we first collect and analyze crowdfunding scam projects, and then reveal which specific project-related information and linguistic features are particularly useful in distinguishing scam projects from non-scams. Our proposed model, built with the selected features and the Random Forest machine learning algorithm, can detect scam campaigns with 84.46% accuracy.

Feature Analysis for Detecting Mobile Application Review Generated by AI-Based Language Model

  • Lee, Seung-Cheol;Jang, Yonghun;Park, Chang-Hyeon;Seo, Yeong-Seok
    • Journal of Information Processing Systems / v.18 no.5 / pp.650-664 / 2022
  • Mobile applications can be easily downloaded and installed via markets. However, malware and malicious applications containing unwanted advertisements exist in these application markets. Smartphone users therefore consult application reviews before installing, to avoid such malicious applications. An application review typically contains content that evaluates the application; however, false reviews written for a specific purpose can also be included. Such false reviews are known as fake reviews, and they can be generated using artificial intelligence (AI)-based text-generating models. Recently, AI-based text-generating models have developed rapidly and produce high-quality text. Herein, we analyze the features of fake reviews generated by Generative Pre-trained Transformer 2 (GPT-2), an AI-based text-generating model, and create a model to detect those fake reviews. First, we collect real human-written application reviews from Kaggle. Subsequently, we identify features of the fake reviews using natural language processing and statistical analysis. Next, we build fake review detection models using five types of machine-learning models trained on the identified features. The fake review detection models achieved average F1-scores of 0.738, 0.723, and 0.730 for the fake review, real review, and overall classifications, respectively.

Intensifiers in English, German, and Korean: Focusing on the Non-Head-Bound Use

  • 최규련
    • Proceedings of the Korean Society for Language and Information Conference / 2001.06a / pp.199-225 / 2001
  • The main goal of this paper is to investigate and compare English, German and Korean non-head-bound intensifiers such as English ‘x-self’, German ‘selbst’, and Korean ‘susulo, casin’. That is, this paper is mainly concerned with the semantic domain where the respective contributions of the expressions in question overlap. The phenomenon discussed under the label “intensifiers” is regarded as universal, which provides the ground for the comparative/contrastive or semi-cross-linguistic study in this paper. Not only the semantic concept of intensification expressed by these expressions but also their combination of grammatical features and syntactic behaviours seem to have highly invariant common denominators across a wide variety of languages, even when they come from apparently different language families. In comparing English, German and Korean intensifiers, this paper is interested in the more general features of the expressions in question rather than in language-specific idiosyncrasies. Intensifiers work similarly not only in English and German, but also in Korean. Each of the three languages under investigation provides some sort of safeguard against confusing instances and misleading judgements on the issues under discussion. Morphologically, however, the English expressions in question agree with their relevant NP in number, gender and person, whereas their German and Korean counterparts have no such specific morphological properties. Intensifiers in their non-head-bound use are subject-oriented, just as in their head-bound use. Non-head-bound intensifiers differ from head-bound intensifiers mostly in their syntactic behaviour or distributional properties, whereas they share the semantic domain of “intensification” with respect to the relevant subject NP. They introduce an ordering and distinguish center and periphery, and ‘self-involvement (directness of involvement)’ seems to be an additional possible characterisation of the relevant dimension that these intensifiers have in common.
An assertion of identity also can be reg

How to Teach English Intonation to Japanese Students

  • Masaki Tsuzuki
    • Proceedings of the KSPS conference / 1996.02a / pp.47-61 / 1996
  • The phonetic study of the English language in Japan is a matter of great importance, a problem of major concern and a vital subject. The special difficulties which Japanese college students have in learning English lie in the field of the prosodic features of English, such as syllable, rhythm, stress, intonation, prominence, etc. These difficulties have made Japanese students' pronunciation relatively monotonous or mora(ness). In my presentation, the specific phonetic features of the Japanese language will first be discussed and clarified. Then an effective method of teaching intonation to improve Japanese students' pronunciation will be suggested. Finally, an oral dialogue with intonation analysis and transcription in the classroom will be demonstrated to highlight the presentation.

Acoustic Measurement of English read speech by native and nonnative speakers

  • Choi, Han-Sook
    • Phonetics and Speech Sciences / v.3 no.3 / pp.77-88 / 2011
  • Foreign accent in second language production depends heavily on the transfer of features from the first language. This study examines acoustic variations in segments and suprasegments by native and nonnative speakers of English, searching for patterns of transfer and plausible indexes of foreign accent in English. The acoustic variations are analyzed in recorded read speech by 20 native English speakers and 50 Korean learners of English, in terms of vowel formants, vowel duration, and syllabic variation induced by stress. The results show that the acoustic measurements of vowel formants and of vowel and syllable durations differ between native and nonnative speakers. The difference is robust in the production of lax vowels, diphthongs, and stressed syllables, namely the English-specific features. L1 transfer to L2 specification is found at both the segmental and the suprasegmental levels. The transfer levels measured for groups and individuals further show a continuum of divergence from the native-like target. Overall, the eldest group, graduate students, shows more native-like patterns, suggesting a weaker foreign accent in English, whereas the high school students tend to show larger deviations from the native speakers' patterns. Individual results show interdependence between segmental transfer and prosodic transfer, and a correlation with self-reported proficiency levels. Additionally, experience factors such as length of English study and length of residence in English-speaking countries are discussed as factors that may explain the acoustic variation.

Development of a Korean chatbot system that enables emotional communication with users in real time (사용자와 실시간으로 감성적 소통이 가능한 한국어 챗봇 시스템 개발)

  • Baek, Sungdae;Lee, Minho
    • Journal of Sensor Science and Technology / v.30 no.6 / pp.429-435 / 2021
  • In this study, the creation of emotional dialogue was investigated as part of developing a robot's natural language understanding and emotional dialogue processing. Unlike English-based datasets, which are the mainstay of natural language processing, Korean-based datasets have several shortcomings. Therefore, in a situation where Korean language resources are insufficient, the Korean dataset should be handled in detail, and in particular, the unique characteristics of the language should be considered. Hence, the first step is to base this study on a specific Korean dataset consisting of conversations on emotional topics. Subsequently, a model was built that learns to extract continuous dialogue features from a pre-trained language model in order to generate sentences while maintaining the context of the dialogue. To validate the model, a chatbot system was implemented, and meaningful results were obtained by recruiting external subjects and conducting experiments. The proposed model was influenced by the dataset, in which the conversation topic was consultation, and facilitated free and emotional communication with users as if they were consulting with a chatbot. The results were analyzed to identify and explain the advantages and disadvantages of the current model. Finally, areas for future study that are necessary to reach the ultimate research goal are discussed.

Effects of age of L2 acquisition and L2 experience on the production of English vowels by Korean speakers

  • Eunhae Oh;Eunyoung Shin
    • Phonetics and Speech Sciences / v.15 no.3 / pp.9-16 / 2023
  • The current study investigated the influence of age of L2 acquisition (AOA) and length of residence (LOR) in the L2-speaking country on the production of voicing-conditioned vowel duration and spectral qualities in English by Korean learners. The primary aim was to explore how the acquisition of language-specific phonetic features is shaped by age of onset and L2 experience. Analyses of archived corpus data produced by 45 native speakers of Korean showed that, regardless of AOA or LOR, Korean learners used absolute vowel duration as a salient correlate of the voicing contrast in English. The accuracy of relative vowel duration was influenced more by onset age than by L2 experience, suggesting that being exposed to English at an early age may benefit the acquisition of the temporal dimension. On the other hand, the spectral characteristics of English vowels were more consistently influenced by L2 experience, indicating that immersive experience in the L2-speaking environment is likely to improve the accurate production of vowel quality. The distinct influences of onset age and L2 experience on specific phonetic cues in L2 vowel production provide insight into the intricate relationship between the two factors in the manifestation of L2 phonological knowledge.