• Title/Summary/Keyword: Natural languages


A Content Analysis of the Trends in Vision Research With Focus on Visual Search, Eye Movement, and Eye Track

  • Rhie, Ye Lim;Lim, Ji Hyoun;Yun, Myung Hwan
    • Journal of the Ergonomics Society of Korea / v.33 no.1 / pp.69-76 / 2014
  • Objective: This study presents a literature review that provides researchers with insight into specific fields of vision research and highlights the major issues in those research topics. A systematic review using content analysis is conducted on the literature retrieved with the keywords "visual search", "eye movement", and "eye track". Background: Literature reviews can be classified as "narrative" or "systematic" depending on how the content of the research is structured. A narrative review is the traditional approach, describing the current state of a field and discussing relevant topics. However, because the literature on any specific area covers a broad range, reviewers inevitably assign subjective weight to particular issues. In contrast, a systematic review applies an explicit, structured methodology to observe research trends quantitatively. Method: We collected the meta-data of journal papers using three search keywords: visual search, eye movement, and eye track. The collected information consists of an unstructured data set of natural-language titles and abstracts, with the papers' keywords as the only structured element. Based on the collected terms, seven categories were derived by inductive categorization, and the chronological trend of the research area was analyzed quantitatively. Results: The unstructured information contained more content in the "stimuli" and "condition" categories than the structured information. Studies on visual search cover a wide range of cognitive topics, whereas studies on eye movement and eye tracking are closely related to physiological aspects. In addition, experimental studies show an increasing trend relative to theoretical studies. Conclusion: Through systematic review, we could quantitatively identify the characteristics of the research keywords that represent specific research topics. We also found that the structured information was more suitable for identifying the aim of the research. Chronological analysis of the structured keyword data showed that "physical eye movement" and "cognitive process" were increasingly studied together. Application: Whereas conventional narrative literature reviews depend largely on the authors' intuition, the quantitative approach enables a more objective and macroscopic view. Moreover, the characteristics of each information type were specified by comparing unstructured and structured information. Systematic literature review can also be used to support the authors' intuition in narrative reviews.
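
The inductive-categorization step described above can be sketched as a term-frequency count over keyword meta-data. The category lexicon and paper records below are hypothetical stand-ins, not the study's actual coding scheme:

```python
from collections import Counter

# Hypothetical category lexicon: maps a category to terms that signal it.
CATEGORY_TERMS = {
    "stimuli": {"target", "distractor", "display"},
    "condition": {"task", "load", "instruction"},
    "physical eye movement": {"saccade", "fixation", "pursuit"},
    "cognitive process": {"attention", "memory", "search"},
}

def categorize(keywords):
    """Count how many of a paper's keywords fall into each category."""
    counts = Counter()
    for kw in keywords:
        for category, terms in CATEGORY_TERMS.items():
            if kw.lower() in terms:
                counts[category] += 1
    return counts

# Toy records standing in for the collected paper meta-data.
papers = [
    {"year": 2010, "keywords": ["saccade", "attention", "target"]},
    {"year": 2013, "keywords": ["fixation", "memory", "task"]},
]

# Chronological trend: aggregate category counts per publication year.
trend = {p["year"]: categorize(p["keywords"]) for p in papers}
print(trend)
```

Summing such per-year counters over a full corpus yields the kind of chronological category trend the abstract describes.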

DEVELOPMENT OF FOREIGN ASTRONOMY EDUCATION PROGRAMS : CAMBODIA (해외 천문학 교육 프로그램 개발: 캄보디아)

  • KIM, SANG CHUL;LYO, A-RAN;PARK, CHANGBOM;LEE, JEONG AE;LEE, KANG-HWAN;SHIN, YONG-CHEOL;SHIN, NAEUN;SHIN, ZIHEY;CHOI, YOONHO;KWON, SUN-GILL;KIM, TAEWOO;YOON, HOSEOP;PARK, SOONCHANG;SUNG, EON-CHANG;PAK, SOOJONG
    • Publications of The Korean Astronomical Society / v.34 no.2 / pp.17-28 / 2019
  • The Korean Astronomical Society (KAS) Education & Public Outreach Committee provided education services for children and school teachers in Cambodia over three years, from 2016 to 2018. In the first year, 2016, one KAS member visited Pusat to teach astronomy to about 50 children; in the following two years, 2017 and 2018, three and six KAS members, respectively, ran education workshops for about 20 local school teachers per year in Sisophon. It turned out to be desirable to include both the teaching of astronomical knowledge and hands-on experiments and observations in order to make the program more effective. The language barrier was the main obstacle to conveying concepts and knowledge, so having a good interpreter was very important. Some languages, such as the Khmer of Cambodia, lack astronomical terminology, so lecturers and participants had to work together to coin appropriate words. Hard copies of the education materials (presentation files, lecture/experiment summaries, terminology lists, etc.) were extremely helpful to the participants. Assembling astronomical telescopes and using them for night-sky observation was a lifetime experience for some of the participants, which may promote a zeal for knowledge and education. It is hoped that these education services for developing countries like Cambodia can continue regularly in the future and be extended to other countries such as Laos and Myanmar.

Korean and Multilingual Language Models Study for Cross-Lingual Post-Training (XPT) (Cross-Lingual Post-Training (XPT)을 위한 한국어 및 다국어 언어모델 연구)

  • Son, Suhyune;Park, Chanjun;Lee, Jungseob;Shim, Midan;Lee, Chanhee;Park, Kinam;Lim, Heuiseok
    • Journal of the Korea Convergence Society / v.13 no.3 / pp.77-89 / 2022
  • Many previous studies have shown that a language model pretrained on a large corpus improves performance on various natural language processing tasks. However, there is a limit to building a large corpus for training in a language environment where resources are scarce. Using the Cross-lingual Post-Training (XPT) method, we analyze its efficiency for Korean, a low-resource language. XPT selectively reuses the parameters of an English pretrained language model, a high-resource language, and uses an adaptation layer to learn the relationship between the two languages. We confirm that with only a small amount of target-language data, XPT outperforms a language model pretrained on the target language on relation extraction. In addition, we analyze the characteristics of the Korean monolingual and multilingual language models released by domestic and foreign researchers and companies.
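
The core reuse-and-adapt idea can be illustrated with a minimal numpy sketch: keep a frozen source-language embedding space and learn only a mapping that projects the new language into it. All vectors here are random toy data, and the single linear map is a drastic simplification of XPT's trainable adaptation layers inside a Transformer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: frozen "English" embeddings and new-language embeddings
# for a small seed vocabulary that is assumed to be aligned pairwise.
dim = 8
english = rng.normal(size=(20, dim))          # reused, kept frozen
true_map = rng.normal(size=(dim, dim))
korean = english @ true_map.T                 # toy target-language space

# Adaptation layer (here: one linear map W) learned so that
# korean_vec @ W ~ english_vec, i.e. target-language tokens are projected
# into the frozen pretrained space instead of retraining the whole model.
W, *_ = np.linalg.lstsq(korean, english, rcond=None)

aligned = korean @ W
err = np.abs(aligned - english).max()
print(f"max alignment error: {err:.2e}")
```

Only `W` is learned; the pretrained space stays fixed, which is what makes the approach attractive when target-language data is scarce.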

Korean Ironic Expression Detector (한국어 반어 표현 탐지기)

  • Seung Ju Bang;Yo-Han Park;Jee Eun Kim;Kong Joo Lee
    • The Transactions of the Korea Information Processing Society / v.13 no.3 / pp.148-155 / 2024
  • Despite the growing importance of irony and sarcasm detection in natural language processing, research on Korean remains relatively scarce compared to other languages. This study experiments with various models for irony detection in Korean text. Irony detection experiments were conducted using KoBERT, a BERT-based model, and ChatGPT. For KoBERT, two methods of additional training on sentiment data were applied (transfer learning and multi-task learning). For ChatGPT, the few-shot learning technique was applied by increasing the number of example sentences entered as prompts. The experiments showed that the transfer-learning and multi-task-learning models, trained with additional sentiment data, outperformed the baseline model trained without it. ChatGPT, on the other hand, performed significantly worse than KoBERT, and increasing the number of example sentences did not lead to a noticeable improvement. In conclusion, this study suggests that a KoBERT-based model is more suitable for irony detection than ChatGPT, and it highlights the potential contribution of additional training on sentiment data to improving irony detection performance.
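
The few-shot set-up for ChatGPT described above amounts to prepending k labeled examples to each query. The sketch below shows how such a prompt can be assembled; the English example sentences, labels, and instruction wording are illustrative, not the paper's actual prompts:

```python
# Hypothetical labeled pool from which the k in-prompt examples are drawn.
EXAMPLES = [
    ("Oh great, it's raining on my wedding day.", "ironic"),
    ("The train arrived exactly on time.", "literal"),
    ("Wow, I just love being stuck in traffic.", "ironic"),
]

def build_prompt(sentence, k):
    """Build a k-shot classification prompt from labeled examples."""
    lines = ["Decide whether each sentence is ironic or literal.", ""]
    for text, label in EXAMPLES[:k]:
        lines.append(f"Sentence: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Sentence: {sentence}")
    lines.append("Label:")
    return "\n".join(lines)

prompt = build_prompt("Sure, waking up at 5 a.m. is my favorite hobby.", k=2)
print(prompt)
```

Increasing `k` is exactly the knob the study varied; the finding was that larger `k` did not noticeably help ChatGPT on this task.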

Applying Meta-model Formalization of Part-Whole Relationship to UML: Experiment on Classification of Aggregation and Composition (UML의 부분-전체 관계에 대한 메타모델 형식화 이론의 적용: 집합연관 및 복합연관 판별 실험)

  • Kim, Taekyung
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.99-118 / 2015
  • Object-oriented programming languages have been widely adopted for developing modern information systems. Object-oriented (OO) concepts have reduced the effort of reusing pre-existing code and have proved useful in interpreting system requirements. In line with this, modern conceptual modeling approaches support features of object-oriented programming. The Unified Modeling Language (UML) has become a de-facto standard for information system designers, since it provides a set of visual diagrams, comprehensive frameworks, and flexible expressions. In the modeling process, UML users need to consider the relationships between classes. Based on an explicit and clear representation of classes, a UML conceptual model supplies the attributes and methods that guide software engineers. In particular, identifying an association between a part class and a whole class is included in the standard grammar of UML. The representation of part-whole relationships is natural in real-world domains, since many physical objects are perceived in part-whole terms; even abstract concepts such as roles are easily identified through part-whole perception. The representation of part-whole in UML is thus reasonable and useful. However, it must be admitted that the use of UML is limited by the lack of practical guidelines on how to identify a part-whole relationship and how to classify it as an aggregate or a composite association. Research on developing such procedural knowledge is meaningful and timely, because a misleading perception of a part-whole relationship is hard to filter out in initial conceptual modeling and thus degrades system usability. The current method for identifying and classifying part-whole relationships relies mainly on linguistic expression. This simple approach is rooted in the idea that a has-a phrase constructs a part-whole perception between objects: if the relationship is strong, the association is classified as a composite association; otherwise, it is an aggregate association. Admittedly, linguistic expressions contain clues to part-whole relationships, so the approach is reasonable and cost-effective in general. Nevertheless, it does not address concerns about accuracy and theoretical legitimacy, and research on guidelines for part-whole identification and classification has not yet accumulated sufficient results to resolve this issue. The purpose of this study is to provide step-by-step guidelines for identifying and classifying part-whole relationships in the context of UML use. Based on the theoretical work on Meta-model Formalization, self-check forms that help conceptual modelers work on part-whole classes were developed. To evaluate the suggested idea, an experimental approach was adopted. The findings show that UML users obtain better results with the guidelines based on Meta-model Formalization than with the natural-language classification scheme conventionally recommended by UML theorists. This study contributes to the stream of research on part-whole relationships by extending the applicability of Meta-model Formalization. Compared to traditional approaches, which aim to establish criteria for evaluating the result of conceptual modeling, this study expands the scope to the modeling process itself. Traditional theories on evaluating part-whole relationships in conceptual modeling aim to rule out incomplete or wrong representations. Such qualification is still important; however, the lack of a practical alternative may reduce the appropriateness of posterior inspection for modelers who want to reduce errors or misperceptions in part-whole identification and classification. The findings of this study can be developed further by introducing more comprehensive variables and real-world settings. In addition, it is highly recommended to replicate and extend the suggested idea of utilizing Meta-model Formalization by creating alternative forms of guidelines, including plugins for integrated development environments.
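
A checklist-driven classification of the kind the self-check forms support can be condensed into a small decision rule. The two questions below (exclusive ownership, bound lifetime) are a hypothetical distillation of common composition criteria, not the paper's actual forms:

```python
def classify_association(is_part_whole, exclusive_owner, lifetime_bound):
    """Checklist-style decision: a composite association requires an
    exclusive whole and a part whose lifetime is bound to that whole;
    any other part-whole link is an aggregation, and a link that is not
    part-whole at all is a plain association."""
    if not is_part_whole:
        return "association"
    if exclusive_owner and lifetime_bound:
        return "composition"
    return "aggregation"

# Engine-and-car: one exclusive owner, destroyed with the car.
print(classify_association(True, True, True))
# Student-and-club: members are shared and outlive the club.
print(classify_association(True, False, False))
```

The point of such explicit questions is precisely what the study argues: they replace the reader's linguistic intuition about has-a phrases with checkable criteria.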

A Study on the Landscape Interpretation of Songge Byeoleop(Korean Villa) Garden at Jogyedong, Mt. Bukhansan near Seoul for the Restoration (북한산 조계동 송계별업(松溪別業) 정원 복원을 위한 경관해석)

  • Rho, Jae-Hyun;Song, Suk-Ho;Jo, Jang-Bin;Sim, Woo-Kyung
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.36 no.4 / pp.1-17 / 2018
  • This study interprets the landscape of the Songge Byeoleop (Korean villa) garden at Jogyedong, Mt. Bukhansan near Seoul, built in the mid-17th century, for the purpose of restoration, through literature review and field surveys. The results were as follows. The Songge Byeoleop garden was a royal villa constructed in the 24th year of King Injo (1646) of the Joseon dynasty by Prince Inpyeong (麟坪大君), Lee Yo (李?, 1622~1658), the third son of King Injo and a brother of King Hyojong. It was a royal villa about 7km in a straight line from his main residence, Seokyang-lu under Mt. Taracsan in Gyendeokbang. The building system is considered to have been very splendid, with colored timberwork, befitting the special status of its owner, the great prince. The identity of Songge Byeoleop and the key landscape of the place centered on the Gucheon waterfall and the sound of its multi-layered cascade, which might be compared to the waterfall of Yeosan in China. After the destruction of the building, the place was used as a quarry for royal tombs, but a mark stone forbidding quarrying remains. The inner part of Songge Byeoleop, centered on the Jogyedongcheon stream in Jogye-dong, was composed beautifully of natural sceneries, such as the Gucheon waterfall, Handam, and Changbeok, and artificial structures, such as the Bihong bridge, Boheogak, Yeonghyudang, and Gyedang. In addition, the Chinese characters 'Songge Byeoleop' and 'Gucheoneunpog' carved into the rocks are literary inscriptions and place markings symbolizing the contrast of the different forests and territories; they gave names of scenery to the rocks and gave meaning to them. In particular, the Gucheon waterfall, which serves as a visual terminal point, is a multi-staged cascade, and its lower part shows the topographical characteristics of horse-bowl-shaped joints with potholes. On the other hand, the outer part is divided into spaces for the main entrance gate, a hanging bridge connecting the inside and the outside, and the Yeonghyudang area for daily living. In the Boheogak area, dual view-frame structures allow views on all four sides, including the breadth and perimeter of the villa. In addition, from the viewpoint at the Bihong bridge, the Gucheon waterfall divides the sacred from the profane, and crossing the Bihong bridge leads up to the sacred realm.

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • Recently, as word embedding has shown excellent performance in various deep-learning-based natural language processing tasks, research on the advancement and application of word, sentence, and document embedding has been actively conducted. Among these topics, cross-lingual transfer, which enables semantic exchange between different languages, is growing alongside the development of embedding models. Academic interest in vector alignment is growing with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized and general domains. In other words, it should become possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pretrained language model learned from a huge volume of general-purpose documents, or to provide a clue for mapping vocabulary between mutually different specialized fields. However, because the linear vector alignment that has mainly been studied assumes statistical linearity, it tends to oversimplify the vector space. It essentially assumes that the two vector spaces are geometrically similar, which inevitably causes distortion in the alignment process. To overcome this limitation, we propose a deep-learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology sequentially trains a skip-connected autoencoder and a regression model to align the specialized word embeddings with the general embedding space; through inference with the two trained models, the specialized vocabulary can then be aligned in the general space. To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of health care among national R&D tasks performed from 2011 to 2020. The results confirmed that the proposed methodology outperforms existing linear vector alignment in terms of cosine similarity.
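
The limitation of linear alignment, and the gain from a nonlinear map with a skip connection, can be shown on toy data. In this sketch random tanh features plus a raw-input skip stand in for the paper's skip-connected autoencoder and regression model; the embeddings are synthetic, and cosine similarity is the comparison metric as in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_cosine(a, b):
    """Mean row-wise cosine similarity between two matrices."""
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return (num / den).mean()

# Toy "specialized" vectors and a nonlinearly warped "general" space.
d = 6
X = rng.normal(size=(200, d))                       # specialized embeddings
Y = np.tanh(X @ rng.normal(size=(d, d))) + 0.1 * X  # general-space targets

# Baseline: purely linear alignment (least squares).
W_lin, *_ = np.linalg.lstsq(X, Y, rcond=None)
lin_sim = mean_cosine(X @ W_lin, Y)

# Nonlinear alignment: random tanh features concatenated with the raw
# input, so the regression sees both a nonlinear expansion and a skip path.
P = rng.normal(size=(d, 64))
H = np.concatenate([np.tanh(X @ P), X], axis=1)     # features + skip
W_nl, *_ = np.linalg.lstsq(H, Y, rcond=None)
nl_sim = mean_cosine(H @ W_nl, Y)

print(f"linear cosine:    {lin_sim:.3f}")
print(f"nonlinear cosine: {nl_sim:.3f}")
```

Because the nonlinear feature set strictly contains the linear one, its fit cannot be worse; on warped data like this it is clearly better, mirroring the paper's cosine-similarity finding.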

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.71-88 / 2017
  • Language models were originally developed for speech recognition and language processing. Given a set of example sentences, a language model predicts the next word or character from sequential input data. N-gram models have been widely used, but they cannot model the correlation between input units efficiently, since they are probabilistic models based on the frequency of each unit in the training set. Recently, as deep learning algorithms have developed, recurrent neural network (RNN) and long short-term memory (LSTM) models have been widely used as neural language models (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependencies between the objects entered sequentially into the model (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). To train a neural language model, texts need to be decomposed into words or morphemes. However, since a training set of sentences generally includes a huge number of words or morphemes, the dictionary becomes very large and model complexity increases. In addition, word-level or morpheme-level models can generate only vocabulary contained in the training set. Furthermore, with highly morphological languages such as Turkish, Hungarian, Russian, Finnish, or Korean, morpheme analyzers are more likely to cause errors in the decomposition process (Lankinen et al., 2016). This paper therefore proposes a phoneme-level language model for Korean based on LSTM models. A phoneme, such as a vowel or a consonant, is the smallest unit composing Korean text. We construct language models using three or four LSTM layers. Each model was trained using the stochastic gradient algorithm as well as more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. A simulation study was done on Old Testament texts using the deep learning package Keras based on Theano. After pre-processing, the dataset included 74 unique characters, including vowels, consonants, and punctuation marks. We then constructed input vectors of 20 consecutive characters with the following 21st character as the output. In total, 1,023,411 input-output vector pairs were included in the dataset, which we divided into training, validation, and test sets in a 70:15:15 proportion. All simulations were conducted on a system equipped with an Intel Xeon CPU (16 cores) and an NVIDIA GeForce GTX 1080 GPU. We compared the loss function evaluated on the validation set, the perplexity evaluated on the test set, and the training time of each model. All the optimization algorithms except the stochastic gradient algorithm showed similar validation loss and perplexity, clearly superior to those of the stochastic gradient algorithm; the stochastic gradient algorithm also took the longest to train for both the 3- and 4-layer LSTM models. On average, the 4-layer model took 69% longer to train than the 3-layer model, but its validation loss and perplexity were not significantly better, and even worsened under specific conditions. On the other hand, comparing the automatically generated sentences, the 4-layer model tended to generate sentences closer to natural language than the 3-layer model. Although there were slight differences in the completeness of the generated sentences between the models, sentence generation performance was quite satisfactory under all simulation conditions: the models generated only legitimate Korean letters, and the use of postpositions and the conjugation of verbs were almost perfect grammatically. These results are expected to be widely used in Korean language processing and speech recognition, which underpin artificial intelligence systems.
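
The 20-character sliding-window set-up described above can be sketched as follows. The corpus here is a toy English string, and the Korean phoneme (jamo) decomposition is omitted, so this illustrates only the windowing, not the full pre-processing:

```python
# Each training pair is a window of 20 consecutive characters plus the
# 21st character as the prediction target.
def make_pairs(text, window=20):
    """Slide a fixed-size window over text, yielding (input, next_char) pairs."""
    pairs = []
    for i in range(len(text) - window):
        pairs.append((text[i:i + window], text[i + window]))
    return pairs

corpus = "In the beginning God created the heavens and the earth."
pairs = make_pairs(corpus)
vocab = sorted(set(corpus))   # character inventory (cf. the 74 unique characters)

print(len(pairs))   # number of input-output pairs
print(pairs[0])     # first window and its target character
```

Applied to the full pre-processed corpus, this windowing is what yields the 1,023,411 input-output pairs reported in the abstract.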