• Title/Abstract/Keywords: language-specific features

76 search results (processing time: 0.023 seconds)

Utilizing Various Natural Language Processing Techniques for Biomedical Interaction Extraction

  • Park, Kyung-Mi;Cho, Han-Cheol;Rim, Hae-Chang
    • Journal of Information Processing Systems / Vol. 7, No. 3 / pp.459-472 / 2011
  • The vast body of biomedical literature is an important source for discovering biomedical interaction information. However, obtaining interaction information from it is complicated because most of it is not easily machine-readable. In this paper, we present a method for extracting biomedical interaction information, assuming that the biomedical Named Entities (NEs) are already identified. The proposed method labels all possible pairs of given biomedical NEs as INTERACTION or NO-INTERACTION by using a Maximum Entropy (ME) classifier. The features used for the classifier are obtained by applying various NLP techniques such as POS tagging, base phrase recognition, parsing, and predicate-argument recognition. In particular, specific verb predicates (activate, inhibit, diminish, etc.) and their biomedical NE arguments are very useful features for identifying interacting NE pairs. Based on this, we devised a two-step method: 1) an interaction verb extraction step to find biomedically salient verbs, and 2) an argument relation identification step to generate partial predicate-argument structures between extracted interaction verbs and their NE arguments. In the experiments, we analyzed how much each applied NLP technique improves the performance. Overall, the proposed method improves on the baseline method by more than 2%. The use of external contextual features, which are obtained from outside of NEs, is crucial for the performance improvement. We also compare the performance of the proposed method against co-occurrence-based and rule-based methods. The results demonstrate that the proposed method considerably improves the performance.
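The pair-labeling setup described in this abstract can be illustrated with a minimal Python sketch. It enumerates every pair of already-identified NEs in a sentence and builds a small feature dictionary for each pair. The entity names, the verb list, and the three features are hypothetical placeholders, not the paper's feature set; in a full pipeline such features would be passed to an ME classifier (equivalent to multinomial logistic regression), which is omitted here.

```python
from itertools import combinations

# Hypothetical seed list; the paper extracts interaction verbs automatically.
INTERACTION_VERBS = {"activates", "inhibits", "diminishes"}

def candidate_pairs(entities):
    """Return every unordered pair of named entities in a sentence."""
    return list(combinations(entities, 2))

def pair_features(tokens, pair):
    """Build an illustrative feature dict for one NE pair."""
    i, j = sorted((tokens.index(pair[0]), tokens.index(pair[1])))
    between = tokens[i + 1:j]  # tokens strictly between the two entities
    verbs = [t.lower() for t in between if t.lower() in INTERACTION_VERBS]
    return {
        "distance": j - i,
        "has_interaction_verb": bool(verbs),
        "first_verb": verbs[0] if verbs else None,
    }

# Placeholder sentence with made-up entity names.
tokens = "ProteinA activates ProteinB and inhibits ProteinC".split()
entities = ["ProteinA", "ProteinB", "ProteinC"]
features = {pair: pair_features(tokens, pair) for pair in candidate_pairs(entities)}
```

Each key of `features` is one candidate pair that a classifier would label INTERACTION or NO-INTERACTION.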

A Comparative Study on Korean Connective Morpheme '-myenseo' to the Chinese expression - based on Korean-Chinese parallel corpus

  • YI, CHAO
    • 비교문화연구 / Vol. 37 / pp.309-334 / 2014
  • This study uses a Korean-Chinese parallel corpus to contrast the Korean connective morpheme '-myenseo' with its corresponding Chinese expressions. Learners of Korean often struggle with Korean connective morphemes, especially when there is a lexical gap between Korean and their mother tongue. '-myenseo' is one of the most frequently used Korean connective morphemes, and it is usually matched with a Chinese coordinating conjunction. According to the corpus, however, the Chinese expressions corresponding to '-myenseo' go beyond coordinating conjunctions. The variety of Chinese expressions related to '-myenseo' found in the parallel corpus can therefore help Chinese learners of Korean learn '-myenseo' more easily. The study first discusses the semantic features and syntactic characteristics of '-myenseo'. Its significant semantic features are 'simultaneous' and 'conflict', and the study uses usage examples to analyse its specific uses. It then analyses the syntactic characteristics of '-myenseo' in terms of subject constraints, predicate constraints, temporal constraints, mood constraints, and negation constraints, and summarizes them in a table. The most important part of the study is Chapter 4, which contrasts the Korean connective morpheme '-myenseo' with Chinese expressions by analysing the Korean-Chinese parallel corpus. The frequencies of the Chinese expressions corresponding to '-myenseo' are summarized in a table. The table shows that the most common Chinese counterpart of '-myenseo' is the non-marker pattern. That is, Korean can connect clauses with a connective morpheme, which is an explicit linguistic marker, whereas Chinese often connects them through their intrinsic logical relationships. The chapter concludes that '-myenseo' can correspond to Chinese conjunctions, expressions, non-marker patterns, and liberal translation patterns, a wider range than the Chinese conjunctions identified previously. The last chapter concludes the study, summarizing it and suggesting its limitations and directions for future research.
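The kind of frequency table described in this abstract could be tallied as in the following minimal sketch. It assumes each aligned sentence pair containing '-myenseo' has already been annotated with the category of its Chinese counterpart. The label list is invented placeholder data, not the study's results, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

# Invented placeholder annotations, one per aligned sentence pair.
labels = [
    "non-marker", "conjunction", "non-marker", "expression",
    "liberal translation", "non-marker", "conjunction",
]

def frequency_table(labels):
    """Return (category, count, percentage) rows, most frequent first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, n, round(100 * n / total, 1)) for label, n in counts.most_common()]

table = frequency_table(labels)
for label, n, pct in table:
    print(f"{label:20s} {n:3d} {pct:5.1f}%")
```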

  • Research on the Bottom Boundary Line on the Southeast Area of the Chungcheongdo Dialect in Yeongdong

    • 성희제
      • 인문언어 / Vol. 8 / pp.265-289 / 2006
    • Yeongdong (永同), the southernmost part of Chungcheongbukdo province, has attracted attention in academic circles as one of the dialectal contact regions because of its geographical characteristics: it adjoins the Gyeongsang and Jeolla dialect areas. Unlike the local language in Mooju (Jeolla dialect), adjacent to the southwest, the local language in Yeongdong is quite different from that of Kimcheon (Gyeongsang dialect). More specifically, it is noteworthy that the boundary line of the Gyeongsang dialect is found within this region and does not coincide with the administrative division. In other words, the local language in Yeongdong is divided into the Chungcheong dialect and the Gyeongsang dialect, and each dialect region still has characteristics of the other region's dialect. For example, the phonological structure of the Yeongdong Chungcheong dialect has the very distinctive characteristics of a mixed dialect, seemingly influenced by the Gyeongsang dialect. The present study aims to define the bottom boundary line of the southeast area of the Chungcheong dialect by identifying the boundary between the Gyeongsang dialect and the Chungcheong dialect, and to clarify the specific sound system generated by the contact of these two dialects. For this, the author collected and analyzed data on the local language of Yeongdong and adjacent areas. It was found that Cheongwha-ri, Deokjin-ri, and Sanjeo-ri in Yeongsan-myeon, and Mugeunjeom, Sangga-ri, and Jungga-ri in Yeongdong-eup, among the areas that belong to the Chungcheong dialect within the local language of Yeongdong, show characteristics of the Gyeongsang dialect. Accordingly, the western edge of these villages forms the southeast boundary line of the Chungcheong dialect. The unique phonological characteristics of the Yeongdong Chungcheong dialect are also affected by the Gyeongsang dialect; among them, "rhythms, y deletion, nasal phoneme deletion, and w deletion" appeared. These are thought to be a mixed-dialect phenomenon that appears only in this region. The research results are expected to help in identifying various aspects of dialectal contact, in clarifying the phonological features of the local language in Yeongdong, and thereby in delineating the Chungcheong dialect precisely.

    Language and Symbolic Reference in Whitehead's Philosophy

    • 문창옥
      • 인문언어 / Vol. 6 / pp.147-166 / 2004
    • Whitehead's discussion of language is not to be found in any one book or article. It is interwoven with his discussion of many other questions. He was, however, greatly concerned with the problem of symbolism in general and the uses of language. He regards language, spoken or written, as an instrument devised by men to aid them in their adjustment to the environment in which they live. Language is used for many specific purposes in the process of this adjustment. Words are employed not only to refer to data and to express emotions. They may also be used to record experiences, and thoughts about these experiences. Words also function as instruments in the organization of experiences as they are considered in retrospect. Thus words free us from the bondage of the immediate. Whitehead's theory of meaning is implicit in his discussion of the functions of language. According to him, the human mind is functioning symbolically when some components of its experience elicit consciousness, beliefs, emotions, and usages respecting other components of its experience. The former set of components are the 'symbols', and the latter set constitute the 'meaning' of the symbols. Whitehead points out that one word may have several meanings, i.e. refer to several different data. Thus, in order to understand the meaning to which a word refers, it is sometimes very important to appreciate the system of thought within which a person is operating. Further, Whitehead's discussion of language includes a number of cogent warnings about the deficiencies of language, and hence the need for great care in the use of words. In fact, language developed gradually. For the most part we have created words designed to deal with practical problems. Attention focuses on the prominent features in a situation, in particular the changing aspects of things. With reference to such data our words are relatively adequate. However, this results in an unfortunate superficiality. The enduring, the subtle, the complex and the general aspects of the universe do not have adequate verbal representation. For this reason, Whitehead's position concerning the uses of language in speculative philosophy is stated with pungent directness. The uncritical trust in the adequacy of language is one of the main errors to which philosophy is liable. Since ordinary language does not do justice to the generalities, profundities and complexities of life, it is obvious that philosophy requires new words and phrases, or at least the revision of familiar words and phrases. Developing this theme, Whitehead contends that words and phrases must be stretched towards a generality foreign to their ordinary usage. In the same vein, Whitehead refers to the need to realize that language, which is the tool of philosophy, needs to be redesigned just as, in physical science, available physical apparatus needs to be redesigned. But even these words and phrases, stretched or redesigned, are never completely adequate in philosophical speculation. They are, in his opinion, merely a great improvement over ordinary language or the language of science, mathematics or symbolic logic.

    Methodology for Deriving Required Quality of Product Using Analysis of Customer Reviews

    • 유예린;변정은;배국진;서수민;김윤하;김남규
      • Journal of Information Technology Applications and Management / Vol. 30, No. 2 / pp.1-18 / 2023
    • Recently, as technology development has accelerated and product life cycles have shortened, it has become necessary to derive key product features from customers in the R&D planning and evaluation stage. More companies seek differentiated competitiveness by providing consumer-tailored products based on big data and artificial intelligence technology. To achieve this, the need to correctly grasp the required quality, that is, what consumers require of a product, is increasing. However, the existing methods are centered on suppliers or domain experts, so there is a gap from the actual perspective of consumers. In other words, product attributes have been defined by suppliers or field experts, which may not reflect consumers' actual perspective. Accordingly, the demand for deriving a product's main attributes from reviews that contain consumers' perspectives has recently increased. Therefore, we propose a methodology for deriving required quality based on review data analysis that reflects customer requirements. Specifically, a pre-trained language model with a good understanding of Korean reviews was established, consumer intent was identified, and key contents were extracted from the reviews through a combination of KeyBERT and topic modeling to derive the required quality for each product. RevBERT, a pre-trained language model specific to the Korean review domain, was established through further pre-training. By comparing it with the existing pre-trained language model KcBERT, we confirmed that RevBERT had a deeper understanding of customer reviews. In addition, all processes other than the selection of the required quality were linked into an automated pipeline, so that the required quality is derived from data automatically.

    A Study on a Hardware Flow-Chart and Hardware Description Language for FSM

    • 이병호;조중휘;정정화
      • 대한전자공학회논문지 / Vol. 26, No. 4 / pp.127-137 / 1989
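    • This paper proposes a register-transfer-level hardware flow chart and a hardware description language, SDL-II (symbolic description language), for logic design automation. The behavioral and structural characteristics of a generalized FSM (finite state machine) are expressed in the proposed hardware flow chart, and the syntax of SDL-II is defined so that it corresponds one-to-one to each symbol of the flow chart and describes the control part and the data transfer part together. In addition, various design requirements are expressed as hardware flow charts and described in SDL-II to demonstrate the validity of the approach.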
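The idea of describing an FSM's control part together with its data transfer part can be illustrated with a small Python sketch. This is not SDL-II syntax, which the abstract does not show; the states, inputs, and registers (`A`, `B`, `ACC`) are hypothetical. The control part is a transition table, and the data transfer part is a set of register transfers executed on entering a state.

```python
# Control part: (current state, input symbol) -> next state. Hypothetical example.
TRANSITIONS = {
    ("IDLE", "start"): "LOAD",
    ("LOAD", "tick"): "ADD",
    ("ADD", "tick"): "DONE",
}

# Data transfer part: register transfers performed on entering a state.
def load(regs):
    regs["ACC"] = regs["A"]                # ACC <- A

def add(regs):
    regs["ACC"] = regs["ACC"] + regs["B"]  # ACC <- ACC + B

TRANSFERS = {"LOAD": load, "ADD": add}

def run(inputs, regs, state="IDLE"):
    """Step the FSM over the inputs, applying register transfers on each transition."""
    for symbol in inputs:
        next_state = TRANSITIONS.get((state, symbol))
        if next_state is None:
            continue  # no transition defined: hold the current state
        state = next_state
        transfer = TRANSFERS.get(state)
        if transfer is not None:
            transfer(regs)
    return state, regs

final_state, final_regs = run(["start", "tick", "tick"], {"A": 3, "B": 4, "ACC": 0})
```

Keeping the two tables side by side mirrors the abstract's point that control and data transfer are described together rather than in separate notations.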

    Linguistic and Stylistic Markers of Influence in the Essayistic Text: A Linguophilosophic Aspect

    • Kolkutina, Viktoriia;Orekhova, Larysa;Gremaliuk, Tetiana;Borysenko, Natalia;Fedorova, Inna;Cheban, Oksana
      • International Journal of Computer Science & Network Security / Vol. 22, No. 5 / pp.163-167 / 2022
    • The article explores linguo-stylistic markers of influence in essayistic texts. The novelty of this investigation lies in its perspective. Essayism is looked at as a style of thinking and writing and studied as a holistic philosophical and cultural phenomenon, as a relevant form of comprehension of reality that features the author's non-lasting judgements and an enhanced authorial voice in the text. Based on the texts of V. Rosanov, G.K. Chesterton, and D. Dontsov, remarkable Russian, English, and Ukrainian essay-writers of the first part of the 20th century, the article tracks the typical ontological-and-existentialist correlation at the content, stylistic, and semantic levels. It is observed in terms of the ideas presented in the texts of these publicists and the lexico-stylistic markers of influence on the reader that enable these ideas to be implemented. The explored poetic syntax, key lexemes, dialogueness, intonational melodics, specific language, free associations, aphoristic nature, verbalization of emotions and feelings in the psycholinguistic form of their expression, stress, heroic elevation, metaphors, and evaluative linguistic units in the ontological-and-existentialist aspects contribute to the extremely delicate and demanding nature of the essayistic style. They create a "lacework" of unpredictable properties, intellectual illumination, unexpected similarity, metaphorical freshness, sudden discoveries, and unmotivated unities.

    Phrase-Chunk Level Hierarchical Attention Networks for Arabic Sentiment Analysis

    • Abdelmawgoud M. Meabed;Sherif Mahdy Abdou;Mervat Hassan Gheith
      • International Journal of Computer Science & Network Security / Vol. 23, No. 9 / pp.120-128 / 2023
    • In this work, we present ATSA, a hierarchical attention deep learning model for Arabic sentiment analysis. ATSA addresses several challenges and limitations that arise when applying classical models to opinion mining in Arabic. Arabic-specific challenges, including morphological complexity and language sparsity, were addressed by modeling semantic composition at the level of Arabic morphological analysis after performing tokenization. ATSA performs phrase-chunk sentiment embedding to provide a broader set of features that cover syntactic, semantic, and sentiment information. We used a phrase structure parser to generate syntactic parse trees that serve as a reference for ATSA. This allowed modeling semantic and sentiment composition following the natural order in which words and phrase-chunks are combined in a sentence. The proposed model was evaluated on three Arabic corpora that correspond to different genres (newswire, online comments, and tweets) and different writing styles (MSA and dialectal Arabic). Experiments showed that each of the proposed contributions in ATSA achieved a significant improvement. The combination of all contributions, which makes up the complete ATSA model, improved the classification accuracy by 3% and 2% on the Tweets and Hotel reviews datasets, respectively, compared to the existing models.
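The hierarchical composition described in this abstract (words combined into phrase-chunks, phrase-chunks into a sentence) can be sketched with plain attention pooling in Python. This is a simplified illustration, not the ATSA architecture: the vectors are made up, the `context` query vector is fixed here although it would be learned in a real model, and the same hypothetical `attention_pool` helper is reused at both levels.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, context):
    """Weight each vector by its dot product with `context`, then sum."""
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    pooled = [sum(w * v[k] for w, v in zip(weights, vectors))
              for k in range(len(context))]
    return weights, pooled

# Made-up 2-d word vectors grouped into two phrase-chunks.
chunks = [
    [[1.0, 0.0], [0.0, 1.0]],   # chunk 1: two words
    [[1.0, 1.0]],               # chunk 2: one word
]
context = [1.0, 0.0]            # assumed fixed query vector

# Level 1: words -> chunk vectors. Level 2: chunk vectors -> sentence vector.
chunk_vectors = [attention_pool(words, context)[1] for words in chunks]
chunk_weights, sentence_vector = attention_pool(chunk_vectors, context)
```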

    Using a H/W ADL-based Compiler for Fixed-point Audio Codec Optimization thru Application Specific Instructions

    • 안민욱;백윤흥;조정훈
      • 정보처리학회논문지A / Vol. 13A, No. 4 / pp.275-288 / 2006
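    • Fast design space exploration is very important in designing an embedded system that implements the behavior of an application. As time-to-market becomes the main concern of design, approaches based on the ASIP (application-specific instruction-set processor) are becoming an important alternative in design methodology. In this approach, the ISA (instruction set architecture) of the target processor is modified to best fit the application in terms of code size and execution speed. The purpose of this paper is to introduce our new retargetable compiler and to explain how it can be used in ASIP-based design space exploration for well-known digital signal processing applications. The newly developed retargetable compiler not only provides the functionality of earlier retargetable compilers but also visualizes the characteristics of the application program and reports its profiling results, which helps in deciding which instructions should be added to increase the application's performance. Using the ADL (architecture description language) of the retargetable compiler, we describe the initial RISC-style ISA of the target processor and then incrementally add application-specific instructions to the ISA so that the compiler can further optimize the assembly code for the application. In experiments with the AC3 audio codec, we were able to quickly find six new specialized instructions that yield a 32% performance increase and a 20% reduction in program size. We thus confirmed that a high-performance retargetable compiler is a key element in rapidly designing a new ASIP for a specific application.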

    Classification of Characters in Movie by Correlation Analysis of Genre and Linguistic Style

    • You, Eun-Soon;Song, Jae-Won;Park, Seung-Bo
      • 한국컴퓨터정보학회논문지 / Vol. 24, No. 1 / pp.49-55 / 2019
    • Character dialogue created by AI is unnatural compared with human-written dialogue, and it cannot properly reveal a character's personality despite the remarkable development of AI. The purpose of this paper is to classify characters by linguistic style and to investigate the relation between specific linguistic styles and personality. We analyzed the dialogues of 92 characters selected from a total of 60 movies in four genres (romantic comedy, action, comedy, and horror/thriller) using Linguistic Inquiry and Word Count (LIWC), a text analysis software. As a result, we confirmed that each genre has its own language style. In particular, we found that emotional tone and analytical thinking are two important features for classification. They proved to be very important features, as the precision and recall are over 78% for romantic comedy and action. However, the precision and recall were 66% and 50% for comedy and horror/thriller, so the impact of these features on classification was smaller than for the romantic comedy and action genres. The characters of romantic comedy, which deals with affection between men and women, show a much higher value of emotional tone than of analytical thinking. The characters of the action genre, who need rational judgment to perform their missions, show much greater analytical thinking than emotional tone. Additionally, for comedy and horror/thriller, our analysis indicates that these genres contain many kinds of characters and that characters often change their personalities during the story.
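The per-genre precision and recall figures quoted in this abstract follow the standard definitions, which the following minimal Python sketch computes. The label lists are invented placeholder data, not the paper's results, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(true_labels, predicted_labels, genre):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN) for one genre."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == genre and p == genre)
    fp = sum(1 for t, p in pairs if t != genre and p == genre)
    fn = sum(1 for t, p in pairs if t == genre and p != genre)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Invented placeholder labels for five characters.
true_genres = ["romcom", "romcom", "action", "action", "comedy"]
predicted = ["romcom", "action", "action", "action", "comedy"]

action_precision, action_recall = precision_recall(true_genres, predicted, "action")
romcom_precision, romcom_recall = precision_recall(true_genres, predicted, "romcom")
```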