• Title/Abstract/Keywords: Natural languages


KOREAN TOPIC MODELING USING MATRIX DECOMPOSITION

  • June-Ho Lee;Hyun-Min Kim
    • East Asian mathematical journal
    • /
    • Vol. 40, No. 3
    • /
    • pp.307-318
    • /
    • 2024
  • This paper explores the application of matrix factorization, specifically CUR decomposition, in the clustering of Korean language documents by topic. It addresses the unique challenges of Natural Language Processing (NLP) in dealing with the Korean language's distinctive features, such as agglutinative words and morphological ambiguity. The study compares the effectiveness of Latent Semantic Analysis (LSA) using CUR decomposition with the classical Singular Value Decomposition (SVD) method in the context of Korean text. Experiments are conducted using Korean Wikipedia documents and newspaper data, providing insight into the accuracy and efficiency of these techniques. The findings demonstrate the potential of CUR decomposition to improve the accuracy of document clustering in Korean, offering a valuable approach to text mining and information retrieval in agglutinative languages.

Theatre of Imagination: Study on New Languages in the Theatre Experiment of Ara Kim

  • 남상식
    • 한국연극학
    • /
    • No. 48
    • /
    • pp.261-288
    • /
    • 2012
  • This paper examines the new stage language in the directing work of Ara Kim. Since the 1980s she has persistently experimented on stage with her own style, opening a new chapter in modern Korean theatre and leading Korean experimental theatre. The background of this experimentation is her idea of theatre, and here we must look at the subject she set for her work in Chuksan: Ritual Past, Ritual Present. For her, theatre has the function of ritual and festival. Theatre presents the universal tragedy given to humans as a natural life force and has its own agenda of leading people toward healing. To this end, Ara Kim explores archetypal forms and languages that predate the fragmentation of the arts into genres. Her theatre shows the results of experiments in which such languages are recreated with a modern sensibility. Taking the outdoor performance of Human Lear in Chuksan as an example, we try to interpret the aesthetic principles that embody her ritual theatre. What we examine is the basis of the 'complex-genre-music-theatre', the method of 'composing' the stage elements and putting them together. In terms of the composition of stage elements, Ara Kim's directing has indisputable artistic value. Her theatre is, so to speak, a theatre of images, and it is a theatre of imagination that is completed by the audience's imagination. Human Lear, characterized by its image fragments, converts the original Lear into a simple tale. It serves as the background of a modern ritual that shows the most basic human instincts. In Human Lear we encounter a ritual tale with a series of images of the human instincts. The arrangement of images and the montage of scenes present the performance as a kind of artistic space. In Human Lear the space is a natural one, centered on the arena stage. The objects installed in the space turn it into a laboratory for 'seeing' what happens. The spectators see the performance and at the same time see themselves in this natural laboratory. They see and, equally, they are visible objects. They see the performance and us in the space in which the performance takes place. That is what Ara Kim really aims at with her modern ritual. That aim remains in effect to this day and is a major driver of her experiments to extend the boundaries of theatre. The ritualistic site-specific performance at Angkor Wat, Cambodia, A Song of Mandala, is the latest major product of these experiments. At the same time, she continues to experiment with pure stage elements. The 'Station' series she recently presented (Station of Water, The Station of Sand, The Station of Wind) consists of non-verbal performances built from all the stage elements: movement, sound, body, light, colour, objects, and so on.

Exploring Fashion Trends Using Network Analysis

  • 박지수;이유리
    • 한국의류학회지
    • /
    • Vol. 38, No. 5
    • /
    • pp.611-626
    • /
    • 2014
  • Reading and foreseeing fashion trends is crucial but difficult in the fashion industry because changes in trends have accelerated and diversified. We use network analysis to investigate fashion trends from 2004 to 2013 and to find the inter-relevance among them. We extracted words from fashion trend information for women's wear provided by Samsung Design Net, created a 2-mode network of seasons and trend terms, and visualized this network using the NodeXL program. Fashion trends repeated a distinct pattern during the period. In the first half (2004-2008), retro modern, feminine modern, and ecological modern were the dominant trends, in that order. The years 2009-2013 saw different trends in S/S seasons and in F/W seasons. The 11F/W, 12F/W and 13F/W seasons were characterized by an artistic creative style. From 2010, natural style dominated the S/S seasons. The 10S/S and 12S/S seasons were distinguished by a calm natural style that reflected a peaceful and simple life. In the 11S/S and 13S/S seasons, a soft natural style emerged as a sign of the increased importance of inner spirit and natural energy. Trends reappeared every two years in the S/S seasons, which enabled the prediction that 14S/S would see another version of natural style. The macroscopic trend of the last 10 years was represented by the keywords 'modern' and 'natural'. 'Modern' involved past styles such as the 60's, Baroque and the origin of human life. 'Natural' was connected with design elements such as material, silhouette and color. Managerial implications and future study directions are discussed based on the results.
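The 2-mode (season by trend-word) network described in this abstract can be illustrated with a minimal sketch. The season labels come from the abstract, but the keyword sets below are invented toy data, not the Samsung Design Net data, and projecting the network onto seasons by counting shared words is an assumed analysis step, not necessarily what the authors did in NodeXL.

```python
from itertools import combinations

# Toy 2-mode data: season -> trend words. The keyword sets are hypothetical
# placeholders for illustration; they are not taken from the paper's data.
season_words = {
    "10S/S": {"natural", "calm", "simple"},
    "11S/S": {"natural", "soft", "energy"},
    "12S/S": {"natural", "calm", "peaceful"},
    "13S/S": {"natural", "soft", "spirit"},
}


def two_mode_edges(mapping):
    """Return the bipartite edge list (season, word), sorted for stable output."""
    return sorted((season, word) for season, words in mapping.items() for word in words)


def project_seasons(mapping):
    """One-mode projection: link two seasons, weighted by their shared trend words."""
    weights = {}
    for a, b in combinations(sorted(mapping), 2):
        shared = mapping[a] & mapping[b]
        if shared:
            weights[(a, b)] = len(shared)
    return weights


edges = two_mode_edges(season_words)
season_links = project_seasons(season_words)
for pair, weight in sorted(season_links.items()):
    print(pair, weight)
```

In this toy data the seasons two years apart share the most words, but only because the sets were constructed that way to echo the two-year reappearance reported in the abstract.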

A new feature specification for vowel height

  • 박천배
    • 대한음성학회지:말소리
    • /
    • Nos. 27-28
    • /
    • pp.27-56
    • /
    • 1994
  • Processes involving changes in vowel height are natural enough to be found in many languages. To capture these processes properly, a better feature specification for vowel height is essential. Standard Phonology adopts a binary feature system in which vowel height is represented by two features, [±high] and [±low]. This has its own merits, but it is defective because it is misleading when we count the number of features used in a rule to compare the naturalness of rules. This feature system also cannot represent more than three degrees of height. We will therefore discard the binary features for vowel height. We then consider adopting the multivalued feature [n high] for the property of height. However, this feature cannot avoid the arbitrariness that results from the numeric values denoting vowel height. It is not easy to tell whether the number in question is the largest or not, and it is impossible to decide whether a larger number denotes a higher or a lower vowel. Furthermore, this feature specification requires an ad hoc condition such as n > 3 or n ≥ 2 whenever we want to refer to a natural class including more than one degree of height. The alternative might be Particle Phonology or Dependency Phonology. These might be suited to multivalued vowel height systems, as their supporters argue. However, the feature specification of Particle Phonology is rejected because it does not strictly observe the assumption that the number of occurrences of the particle a is decisive in representing height. One a in a representation can denote varying degrees of height such as [e], [I], [a], [a ] and [e ]. This also means that we cannot represent natural classes in terms of the number of occurrences of the particle a. Dependency Phonology also has problems in specifying a degree of vowel height by the dependency relations between elements. There is no unique element to represent vowel height, since every property has to be defined in terms of the dependency relations between two or more elements. As a result, it is difficult to formulate a rule for vowel height change, especially when the phenomenon involves a chain of vowel shifts. Therefore, we suggest a new feature specification for vowel height (see Chapter 3). This specification resorts to a single feature H and a few >'s, which refer exclusively to the degree of tongue height when a vowel is pronounced. It can cope with more than three degrees of height because it is fundamentally a multivalued scalar feature. It also obviates the ad hoc condition for a natural class from which the [n high] type of multivalued feature suffers. In addition, this feature specification conforms to our expectation that the notation should become simpler as the generality of the class increases: the fewer angled brackets are used, the more vowels are included. Incidentally, it should also be noted that, by adopting a single feature for vowel height, it is possible to formulate simpler versions of rules involving changes of vowel height, especially when they involve vowel shifts found in many languages.

YDK: A Thesaurus Developing System for Korean Language

  • 황도삼;최기선
    • 한국정보처리학회논문지
    • /
    • Vol. 7, No. 9
    • /
    • pp.2885-2893
    • /
    • 2000
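  • Dictionaries are an essential component of natural language processing systems for advanced language processing and improved performance; however good the language processing tools and algorithms, they cannot be put to practical use without a high-quality, systematic electronic dictionary grounded in computational linguistics. Existing published general-purpose dictionaries were not developed for natural language processing and understanding. Moreover, dictionaries developed for natural language processing tools and application systems are built on different schemes according to each system's purpose, which makes them inefficient to use. It is therefore necessary to develop an integrated information dictionary that combines morphological, syntactic, semantic, and other information, using a systematic and scientific methodology aimed at advanced language processing and understanding. This paper presents a methodology for building such an integrated information dictionary and the dictionary development system built on that methodology.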

Beyond Alan Colquhoun's Architectural Hermeneutics of Tradition: from 'conceptual displacement of the past' to 'the reactivation of the past'

  • 이동언
    • 건축역사연구
    • /
    • Vol. 7, No. 4
    • /
    • pp.49-60
    • /
    • 1998
  • The first aim of this paper is to investigate and analyze Alan Colquhoun's architectural hermeneutics of tradition, 'conceptual displacement of the past.' The second aim is to overcome its limits and to suggest a new architectural hermeneutics of tradition, 'the reactivation of the past.' Colquhoun reduces the architectural work to typology or arbitrary language because he believes that natural language cannot work effectively without arbitrary language. However, he overlooks that the two languages are inseparable. When they are separated, the key to natural language is understood to be an unverifiable similarity between a sense perception and its correspondence in the architectural object, while the key to arbitrary language becomes mere artificial agreement on the value and function of the linguistic sign. Therefore, natural language is appropriate only when it permits spontaneous combinations of sensory data within complex structures which emerge from, and support, complex human interaction and communication (the shining of the world and of the possibility of creative being in each individual thing). Only when architecture is translated into this kind of language can it reactivate the world's past and become poetic.

A Design and Implementation of 3D Facial Expressions Production System based on Muscle Model

  • 이혜정;정석태
    • 한국정보통신학회논문지
    • /
    • Vol. 16, No. 5
    • /
    • pp.932-938
    • /
    • 2012
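  • Facial expressions carry important meaning in interpersonal communication and, more than the various languages people use, are the only means of expressing the many inner emotions of human beings. This paper proposes a muscle-model-based 3D facial expression generation system for producing facial expressions easily and naturally. To generate expressions for a 3D face model, the system builds on Waters' muscle model and adds the muscles needed for natural facial expressions. Focusing on the key features of expression generation (eyebrows, eyes, nose, mouth, and cheeks), it uses facial muscles and muscle vectors to group the movements of anatomically connected facial muscles, thereby simplifying and reorganizing the action units (AUs), the basic units of facial expression change, so that natural facial expressions can be generated easily.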

PharmacoNER Tagger: a deep learning-based tool for automatically finding chemicals and drugs in Spanish medical texts

  • Armengol-Estape, Jordi;Soares, Felipe;Marimon, Montserrat;Krallinger, Martin
    • Genomics & Informatics
    • /
    • Vol. 17, No. 2
    • /
    • pp.15.1-15.7
    • /
    • 2019
  • Automatically detecting mentions of pharmaceutical drugs and chemical substances is key for the subsequent extraction of relations of chemicals with other biomedical entities such as genes, proteins, diseases, adverse reactions or symptoms. The identification of drug mentions is also a prior step for complex event types such as drug dosage recognition, duration of medical treatments or drug repurposing. Formally, this task is known as named entity recognition (NER), meaning automatically identifying mentions of predefined entities of interest in running text. In the domain of medical texts, for chemical entity recognition (CER), techniques based on hand-crafted rules and graph-based models can provide adequate performance. In recent years, the field of natural language processing has largely pivoted to deep learning, and state-of-the-art results for most tasks involving natural language are usually obtained with artificial neural networks. Competitive resources for drug name recognition in English medical texts are already available and heavily used, while for other languages such as Spanish these tools, although clearly needed, were missing. In this work, we adapt an existing neural NER system, NeuroNER, to the particular domain of Spanish clinical case texts, and extend the neural network to take into account additional features apart from the plain text. NeuroNER can be considered a competitive baseline system for Spanish drug and CER promoted by the Spanish national plan for the advancement of language technologies (Plan TL).
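To make the NER task in this abstract concrete, the sketch below turns per-token tags into entity mentions. It assumes the common BIO tagging convention and a hypothetical label name `CHEM`; neither the tag scheme nor the label is stated in the abstract, and the function is an illustration rather than part of NeuroNER or PharmacoNER Tagger.

```python
def decode_bio(tokens, tags):
    """Collect (entity_type, text) mentions from BIO tags.

    B-X starts an entity of type X, I-X continues it, O is outside any entity.
    """
    entities = []
    current_type, current_tokens = None, []

    def flush():
        # Close the entity that is currently open, if any.
        nonlocal current_type, current_tokens
        if current_tokens:
            entities.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and discard the stray token.
            flush()
    flush()
    return entities


# Hypothetical example sentence and tags (label "CHEM" is an assumption).
tokens = ["Se", "administró", "ácido", "acetilsalicílico", "y", "paracetamol", "."]
tags = ["O", "O", "B-CHEM", "I-CHEM", "O", "B-CHEM", "O"]
mentions = decode_bio(tokens, tags)
print(mentions)
```

A neural tagger would predict the `tags` sequence; the decoding step shown here is independent of how the tags are produced.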

Web-Based Question Bank System using Artificial Intelligence and Natural Language Processing

  • Ahd, Aljarf;Eman Noor, Al-Islam;Kawther, Al-shamrani;Nada, Al-Sufyini;Shatha Tariq, Bugis;Aisha, Sharif
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 22, No. 12
    • /
    • pp.132-138
    • /
    • 2022
  • Owing to the impact of the COVID-19 pandemic and the continuation of online study, there is an urgent need for an effective and efficient education platform to support the continuity of online study. Therefore, the question bank (QB) system is introduced. The QB system is designed as a website that provides a single platform for university faculty members to generate questions and store them in a bank of questions. It also allows them to add two types of questions, helps lecturers create exams, and presents students' results to them. For the implementation, two languages, PHP and Python, were combined to generate questions using Artificial Intelligence (AI). These questions are stored in a single database and can then be viewed and included in exams smoothly and without complexity. This paper aims to help faculty members save time and effort through the Question Bank System, which uses AI and Natural Language Processing (NLP) to extract and generate questions from a given text. Tools such as NLTK and TextBlob are used to implement this function.
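As a rough illustration of generating questions from a given text, the sketch below builds fill-in-the-blank (cloze) questions using only the Python standard library. The abstract does not describe the authors' algorithm, so the heuristic here (blank out the longest non-stopword in each sentence), the stopword list, and all function names are assumptions; it does not use NLTK or TextBlob.

```python
import re

# Small hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "in", "to", "and", "by", "that", "which"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def make_cloze(sentence):
    """Blank out the longest non-stopword; return (question, answer) or None."""
    words = re.findall(r"[A-Za-z]+", sentence)
    candidates = [w for w in words if w.lower() not in STOPWORDS]
    if not candidates:
        return None
    answer = max(candidates, key=len)  # the first longest word wins ties
    question = re.sub(r"\b%s\b" % re.escape(answer), "_____", sentence, count=1)
    return question, answer


def generate_questions(text):
    """Generate at most one fill-in-the-blank question per sentence."""
    results = []
    for sentence in split_sentences(text):
        qa = make_cloze(sentence)
        if qa is not None:
            results.append(qa)
    return results


sample = "Python is a programming language. Tokenization splits text into words."
questions = generate_questions(sample)
for question, answer in questions:
    print(question, "->", answer)
```

A real system would likely choose the blanked word with part-of-speech or named-entity information rather than word length; the length heuristic only keeps the example dependency-free.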

Phrase-Chunk Level Hierarchical Attention Networks for Arabic Sentiment Analysis

  • Abdelmawgoud M. Meabed;Sherif Mahdy Abdou;Mervat Hassan Gheith
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 23, No. 9
    • /
    • pp.120-128
    • /
    • 2023
  • In this work, we present ATSA, a hierarchical attention deep learning model for Arabic sentiment analysis. ATSA addresses several challenges and limitations that arise when applying classical models to opinion mining in Arabic. Arabic-specific challenges, including morphological complexity and language sparsity, were addressed by modeling semantic composition at the level of Arabic morphological analysis after tokenization. ATSA performs phrase-chunk sentiment embedding to provide a broader set of features that cover syntactic, semantic, and sentiment information. We used a phrase structure parser to generate syntactic parse trees that serve as a reference for ATSA. This allowed modeling semantic and sentiment composition following the natural order in which words and phrase-chunks are combined in a sentence. The proposed model was evaluated on three Arabic corpora that correspond to different genres (newswire, online comments, and tweets) and different writing styles (MSA and dialectal Arabic). Experiments showed that each of the proposed contributions in ATSA achieved significant improvement. The combination of all contributions, which makes up the complete ATSA model, improved classification accuracy by 3% and 2% on the Tweets and Hotel reviews datasets, respectively, compared to existing models.
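The idea of hierarchical attention over words and phrase-chunks can be sketched in a few lines. This is a simplified illustration, not the ATSA architecture: it assumes plain dot-product scoring against fixed context vectors, and the vectors, dimensions, and function names are all hypothetical. Trained models learn these parameters and usually apply a nonlinearity before scoring.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Weight each vector by softmax(dot(vector, context)) and sum them."""
    scores = [sum(v_i * c_i for v_i, c_i in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(context)
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def hierarchical_pool(chunks, word_context, chunk_context):
    """Pool words into chunk vectors, then pool chunk vectors into a sentence vector."""
    chunk_vectors = [attention_pool(words, word_context)[0] for words in chunks]
    return attention_pool(chunk_vectors, chunk_context)


# Hypothetical 2-dimensional word vectors grouped into two phrase-chunks.
chunks = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
sentence_vec, chunk_weights = hierarchical_pool(chunks, [1.0, 0.0], [0.0, 1.0])
print(sentence_vec, chunk_weights)
```

The returned chunk weights indicate how much each phrase-chunk contributes to the sentence representation, which is the quantity a sentiment classifier would then consume.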