• Title/Abstract/Keywords: Linguistic segmentation

Search results: 19 items (processing time: 0.022 s)

The Language Determinant Analysis of Investment Among APEC Member Economies

  • 션즈펑;김태인
    • 아태비즈니스연구
    • /
    • Vol. 7, No. 2
    • /
    • pp.61-76
    • /
    • 2016
  • This study examines how language acts as a determinant of investment decisions among Asian countries, where many different languages are in use. The analysis finds that linguistic segmentation in the investing country increases investment, whereas linguistic segmentation in the host country decreases it. It also finds that investment increases as the linguistic distance between the investing country and the host country grows. With regard to linguistic distance and the choice of investment timing, foreign direct investment is found to respond more elastically to increasing linguistic distance than other forms of investment. These results indicate that the more linguistically segmented the host country is, the higher the cost of investment becomes; the linguistic-distance result can be explained by diversification into the host country and by the establishment of a bridgehead in the host region.

A Study on the Linguistic Manifestation of 'Couple Look'

  • 한명숙
    • 복식문화연구
    • /
    • Vol. 13, No. 5
    • /
    • pp.756-762
    • /
    • 2005
  • The objective of this research is to examine the psychological desires of college students who express themselves by wearing so-called 'couple look' attire, a dressing habit that reflects responses to various psychological states and to society. In addition, the message that wearers try to convey to others by dressing this way, and whether that message is actually conveyed, are analyzed by applying linguistic classification theory to this specific term. After a preliminary survey based on thorough interviews with 70 male and female college students, the main survey collected data by questionnaire from 450 male and female college students. The results were compared, reviewed, and analyzed by applying Geoffrey Leech's theory of meaning segmentation in linguistics, with the aim of defining how the segmentation of meaning expressed through language can be applied to self-expression through clothing. The research results are as follows. 1. The psychological desires behind wearing couple look attire are to express that the wearers like and love each other, that they are dating, and to showcase their intimacy. 2. Items appropriate for expressing the couple look are T-shirts, jeans, pants, sweaters, and mufflers, and accessories such as tennis shoes, hats, shoes, bags, rings, watches, and earrings. 3. Among both those who have tried the couple look and those who have not, the people who said they were willing to dress in couple look were mostly those with experience of doing so.

Ambiguity Resolution in Chinese Word Segmentation

  • Maosong, Sun;T'sou, Benjamin-K.
    • 한국언어정보학회: Conference Proceedings
    • /
    • 한국언어정보학회, 1995: Language, Information and Computation = Proceedings of the 10th Pacific Asia Conference, Hong Kong
    • /
    • pp.121-126
    • /
    • 1995
  • This paper proposes a new method for Chinese word segmentation, Conditional F&BMM (Forward and Backward Maximal Matching), which incorporates both bigram statistics (i.e., mutual information and the difference of t-test between Chinese characters) and linguistic rules for ambiguity resolution. The key characteristics of this model are the use of (i) statistics that can be automatically derived from any raw corpus and (ii) a rule base for disambiguation, of consistent and controlled size, that can be built up in a systematic way.
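
As a rough illustration of the forward and backward maximal matching that the abstract names, the following Python sketch segments a string in both directions and flags the input as ambiguous when the two results differ. It is not the paper's Conditional F&BMM: the bigram statistics and rule base used for disambiguation are not reproduced, and the function names, the toy lexicon, and the `max_len` parameter are assumptions made for this example.

```python
def forward_mm(text, lexicon, max_len=4):
    """Greedy left-to-right matching: take the longest lexicon word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_mm(text, lexicon, max_len=4):
    """Greedy right-to-left matching: take the longest lexicon word ending at each position."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                j -= size
                break
    return tokens[::-1]


def is_ambiguous(text, lexicon, max_len=4):
    """Flag a string whose forward and backward segmentations disagree."""
    return forward_mm(text, lexicon, max_len) != backward_mm(text, lexicon, max_len)


# Hypothetical toy lexicon, chosen only to produce a disagreement.
LEXICON = {"研究", "研究生", "生命", "起源"}
print(forward_mm("研究生命起源", LEXICON))
print(backward_mm("研究生命起源", LEXICON))
```

Strings flagged by `is_ambiguous` are the cases a disambiguation step would then have to resolve, for example with corpus statistics or rules as the abstract describes.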

Ambiguity Types of the Homonymic & Heterographic Units for Improving Korean Voice Recognition System - a Preliminary Research

  • 윤애선;강미영
    • 음성과학
    • /
    • Vol. 15, No. 4
    • /
    • pp.67-81
    • /
    • 2008
  • The accuracy of P2G (Phoneme-to-Grapheme) conversion is one of the important factors determining the quality of unlimited voice recognition (VR) systems. Few studies, however, have sought to reduce the ambiguities of a phoneme string that can be segmented into a variety of linguistic units (i.e., morphemes, words, eo-jeols) and thus be transformed into more than one grapheme string. This paper is preliminary research toward building a large knowledge base of such homonymic & heterographic units (HHUs), which will provide unlimited Korean VR systems with more accurate P2G information. It analyzes two main factors that generate HHUs: (1) boundary determination of the prosodic unit and (2) its segmentation into linguistic units. The linguistic characteristics that determine the variable boundaries of a prosodic unit are investigated, and the ambiguity types of HHUs are classified according to their morphological and syntactic structures as well as the phonological rules governing them.

Identification of Chinese Personal Names in Unrestricted Texts

  • Cheung, Lawrence;Tsou, Benjamin K.;Sun, Mao-Song
    • 한국언어정보학회: Conference Proceedings
    • /
    • 한국언어정보학회, 2002: Language, Information, and Computation: Proceedings of The 16th Pacific Asia Conference
    • /
    • pp.28-35
    • /
    • 2002
  • Automatic identification of Chinese personal names in unrestricted texts is a key task in Chinese word segmentation and, if not properly addressed, can affect other NLP tasks such as word segmentation and information retrieval. This paper (1) demonstrates the problems of Chinese personal name identification in some If applications, (2) analyzes the structure of Chinese personal names, and (3) presents the relevant processing strategies. The geographical differences in Chinese personal names between Beijing and Hong Kong are highlighted at the end. The paper shows that variation in names across different Chinese communities is a critical factor in designing Chinese personal name identification algorithms.

A Comparative Study on the Effectiveness of Segmentation Strategies for Korean Word and Sentence Classification Tasks

  • 김진성;김경민;손준영;박정배;임희석
    • 한국융합학회논문지
    • /
    • Vol. 12, No. 12
    • /
    • pp.39-47
    • /
    • 2021
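  • Constructing high-quality input features through effective segmentation is an essential step in improving a language model's sentence comprehension. Improving the quality of input features is directly tied to performance on downstream tasks. This paper compares segmentation strategies that effectively reflect the linguistic characteristics of Korean, from the perspective of word and sentence classification. Segmentation types are divided into four according to linguistic unit (eojeol, morpheme, syllable, and jamo), and pre-training is carried out using the RoBERTa model architecture. The downstream tasks are divided by classification unit into a sentence classification group and a word classification group, and the experiments analyze trends within each group and differences between the groups. According to the experimental results, in sentence classification the model applying a unit-based linguistic segmentation strategy outperformed the other segmentation strategies by up to +0.62% on NSMC, +2.38% on KorNLI, and +2.41% on KorSTS, while in word classification the syllable-level segmentation strategy performed better by up to +0.7% on NER and +0.61% on SRL, demonstrating the effectiveness of each strategy within its classification group.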
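
To make three of the four segmentation units named in this abstract concrete, here is a minimal Python sketch that splits a Korean string into eojeol (whitespace-delimited units), syllables, and jamo. It is an illustration only, not the tokenization used in the paper: the function names are hypothetical, jamo are obtained through Unicode NFD decomposition (which yields conjoining jamo code points), and morpheme-level segmentation is omitted because it needs a morphological analyzer.

```python
import unicodedata


def eojeol_tokens(text):
    """Split on whitespace; each resulting unit is one eojeol."""
    return text.split()


def syllable_tokens(text):
    """One token per character, skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]


def jamo_tokens(text):
    """Decompose each Hangul syllable into conjoining jamo via NFD normalization."""
    jamo = []
    for syllable in syllable_tokens(text):
        jamo.extend(unicodedata.normalize("NFD", syllable))
    return jamo


sentence = "나는 학교에 간다"
print(eojeol_tokens(sentence))
print(syllable_tokens(sentence))
print(len(jamo_tokens(sentence)))
```

The finer the unit, the longer the token sequence for the same sentence, which is one of the trade-offs a comparison of segmentation strategies has to weigh.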

A Study on Character Segmentation and Determination of Linguistic Type for Recognition of On-line Cursive Characters

  • 박강령;전병환;김창수;김우성;김재희
    • 전자공학회논문지C
    • /
    • Vol. 34C, No. 7
    • /
    • pp.61-69
    • /
    • 1997
  • With vigorous research in character recognition, the need to recognize run-on multilingual handwritten characters is increasing in order to provide users with more comfortable PUI (pen user interface) environments. In general, many intermediate word candidates are generated in run-on multilingual recognition because there is no information about the ending position or the linguistic type of each character. To remove the unnecessary word candidates, we classify them into two groups and, using 5 attributes, select the best candidate among the word candidates in the group where the final character is completed. We propose a method for selecting this single best candidate, called WRM (weighted ranking method). The weights are adaptively trained by the LMS (least mean square) learning rule. Results show that decision making using weights performs much better than decision making without weights.
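
The idea of ranking candidates by a weighted sum of attributes, with weights trained by the LMS rule, can be sketched in Python as follows. This is a generic LMS example under stated assumptions, not the paper's WRM: the five attribute values, the 0/1 training targets, the learning rate, and all function names are invented for illustration, since the abstract does not specify them.

```python
def score(weights, attrs):
    """Weighted sum of a candidate's attribute values."""
    return sum(w * a for w, a in zip(weights, attrs))


def lms_update(weights, attrs, target, lr=0.1):
    """One LMS step: move each weight along the error times its input."""
    error = target - score(weights, attrs)
    return [w + lr * error * a for w, a in zip(weights, attrs)]


def train(samples, n_attrs, epochs=200, lr=0.1):
    """Repeatedly apply the LMS update over (attributes, target) pairs."""
    weights = [0.0] * n_attrs
    for _ in range(epochs):
        for attrs, target in samples:
            weights = lms_update(weights, attrs, target, lr)
    return weights


def best_candidate(weights, candidates):
    """Return the label of the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: score(weights, c[1]))[0]


# Hypothetical training data: 5 attributes per candidate, target 1.0 = correct word.
SAMPLES = [
    ([1.0, 0.2, 0.1, 0.0, 0.3], 1.0),
    ([0.1, 0.9, 0.8, 0.7, 0.2], 0.0),
    ([0.9, 0.1, 0.3, 0.2, 0.1], 1.0),
    ([0.2, 0.8, 0.6, 0.9, 0.4], 0.0),
]
weights = train(SAMPLES, n_attrs=5)
print(best_candidate(weights, [("A", SAMPLES[0][0]), ("B", SAMPLES[1][0])]))
```

With untrained (all-equal) weights every attribute counts the same; the LMS step lets attributes that actually separate correct from incorrect candidates dominate the ranking.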

Towards Effective Entity Extraction of Scientific Documents using Discriminative Linguistic Features

  • Hwang, Sangwon;Hong, Jang-Eui;Nam, Young-Kwang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 13, No. 3
    • /
    • pp.1639-1658
    • /
    • 2019
  • Named entity recognition (NER) is an important technique for improving the performance of data mining and big data analytics. In previous studies, NER systems have been employed to identify named-entities using statistical methods based on prior information or linguistic features; however, such methods are limited in that they are unable to recognize unregistered or unlearned objects. In this paper, a method is proposed to extract objects, such as technologies, theories, or person names, by analyzing the collocation relationship between certain words that simultaneously appear around specific words in the abstracts of academic journals. The method is executed as follows. First, the data is preprocessed using data cleaning and sentence detection to separate the text into single sentences. Then, part-of-speech (POS) tagging is applied to the individual sentences. After this, the appearance and collocation information of the other POS tags is analyzed, excluding the entity candidates, such as nouns. Finally, an entity recognition model is created based on analyzing and classifying the information in the sentences.
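
The step that analyzes which POS tags appear around entity candidates can be illustrated with a small Python sketch. It assumes the sentences are already split and POS-tagged (those steps are not shown), and it simply counts the non-candidate tags within a fixed window of each noun. The tag names, the window size, and the function names are assumptions for this example; the paper's actual features and model are not reproduced.

```python
from collections import Counter


def context_tags(tagged, index, window=2):
    """POS tags of the tokens within `window` positions of tagged[index]."""
    left = tagged[max(0, index - window):index]
    right = tagged[index + 1:index + 1 + window]
    return [tag for _, tag in left + right]


def collocation_profile(sentences, candidate_tag="NOUN", window=2):
    """Count the non-candidate POS tags that occur around each candidate token."""
    profile = Counter()
    for tagged in sentences:
        for i, (_, tag) in enumerate(tagged):
            if tag == candidate_tag:
                profile.update(
                    t for t in context_tags(tagged, i, window) if t != candidate_tag
                )
    return profile


# Hypothetical pre-tagged sentence: a list of (word, POS tag) pairs.
TAGGED = [[
    ("we", "PRON"), ("propose", "VERB"), ("the", "DET"),
    ("algorithm", "NOUN"), ("for", "ADP"), ("segmentation", "NOUN"),
]]
print(collocation_profile(TAGGED))
```

A profile of this kind could serve as input features for a classifier that decides whether a candidate is an entity, which is the role the abstract assigns to the final model-building step.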

Fuzzy Relaxation Based on the Theory of Possibility and FAM

  • Uam, Tae-Uk;Park, Yang-Woo;Ha, Yeong-Ho
    • Journal of Electrical Engineering and Information Science
    • /
    • Vol. 2, No. 5
    • /
    • pp.72-78
    • /
    • 1997
  • This paper presents a fuzzy relaxation algorithm based on possibility and FAM instead of the probability and compatibility coefficients used in most existing probabilistic relaxation algorithms. Because it eliminates the stages for estimating compatibility coefficients and normalizing the probability estimates, the proposed fuzzy relaxation algorithm increases parallelism and has a simple iteration scheme. The construction of the fuzzy relaxation scheme consists of the following three tasks: (1) definition of the input/output linguistic variables, their term sets, and possibility; (2) definition of FAM rule bases for relaxation using fuzzy compound relations; (3) construction of the iteration scheme for calculating the new possibility estimate. Applications to region segmentation and edge detection algorithms show that the proposed method can be used not only to reduce image ambiguity and segmentation errors but also to enhance raw edges iteratively.

Text-dependent Speaker Verification System Over Telephone Lines

  • 김유진;정재호
    • 대한전자공학회: Conference Proceedings
    • /
    • 대한전자공학회, Proceedings of the 1999 Fall Conference
    • /
    • pp.663-667
    • /
    • 1999
  • In this paper, we review the conventional speaker verification algorithm and present a text-dependent speaker verification system for use over telephone lines, together with its experimental results. To train the speaker model effectively with limited enrollment data, we apply a blind-segmentation algorithm, which segments speech into sub-word units without linguistic information, to the speaker verification system. A world model created from the PBW DB is used for score normalization. Experiments on the implemented system, using a database constructed to simulate a field test, show a 3.3% EER.
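
For readers unfamiliar with the EER (equal error rate) figure reported here, the following Python sketch shows one common way to estimate it from verification scores: sweep a decision threshold and find where the false acceptance rate and false rejection rate are closest. The scores are made-up values, the function name is hypothetical, and this is a generic estimate, not the evaluation procedure used in the paper.

```python
def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) where false acceptance and false rejection rates are closest.

    A trial is accepted when its score is >= threshold.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)     # true speakers rejected
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores: higher means "more likely the claimed speaker".
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(eer, threshold)
```

A lower EER means the system separates genuine speakers from impostors more cleanly at its balanced operating point.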
