• Title/Summary/Keyword: semantic classification

A Comparative Study on New Words of Korean and Chinese According to Changes in Popular Culture Contents (대중문화 콘텐츠 변화에 따른 한중 신조어 비교 연구)

  • Meng, Xiang-Shan;Lee, Kwang-Ho
    • Journal of Korea Entertainment Industry Association
    • /
    • v.14 no.6
    • /
    • pp.125-137
    • /
    • 2020
  • The purpose of this study is to analyze new words in Korean and Chinese based on changes in popular culture. As China and Korea have communicated increasingly closely in recent years, their languages have influenced each other. Many new Korean and Chinese words have been found to share the same linguistic characteristics. New words are considered new developments of a language. They are welcomed and widely used by young people in Korea and China. Therefore, in terms of the communicative function of languages, it is worthwhile to understand new words in Korean and Chinese from the perspective of academic research. This study takes Chinese words created in 2018 as the research object. First, a morphological and semantic comparison of Chinese words created in 2018 and those created in 2017 is carried out to extract the characteristic indicators of Chinese words created in 2018, with emphasis on compound words, abbreviations, substitutions, patterns and rhetorical expressions. Second, the morphological similarities and differences between these Chinese words and Korean words created in 2018 are analyzed. Finally, after sample classification and comparison, conclusions are drawn about the characteristics of new Chinese and Korean words and the interaction mechanism under mutual influence. According to the study, the majority of the new words are created on the basis of existing words. Thus, it is important to explore the morphology of new words as a standard language.

Poly Synonyms Study on Naturalness in Landscape Architecture (조경학 연구에서 자연성 개념의 다의적 체계 연구)

  • Lee, Seong-Jin;Kim, Do-Eun;Son, Yong-Hoon
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.51 no.1
    • /
    • pp.29-41
    • /
    • 2023
  • In landscape studies, the concept of naturalness spans categories ranging from physical space to cognitive systems, which makes the term difficult to define in a single statement. Therefore, this study summarized the concept and evaluation attributes of 'naturalness' used in the literature through a systematic review (SR), and identified the scope of the individual attributes that constitute the meaning of naturalness. In addition, the individual attributes classified in previous studies were arranged into a meaning chain, a cognitive-linguistic research method, and applied to domestic landscape papers dealing with naturalness to organize a polysemous meaning system. The meaning chain is a suitable method for grasping words whose meaning expands in a chain through family resemblance around a prototypical meaning; the dimensions were classified according to the naturalness evaluation items and the polysemous chain system of naturalness concepts discussed in domestic academia. The results of the study are as follows. First, the attributes of naturalness extracted from foreign landscape literature were classified into four areas (nature perceived as wilderness, nature as non-artificiality, nature as visual landscape, and nature as experience) and 13 detailed attributes. Second, these detailed attributes are generally consistent with domestic landscape studies, but their specific cases differed: a Korean context appeared in the perception of accumulated time, and the results suggested that naturalness attributes may conflict with one another.

A Study on Types and Characteristics of 'Cultural Landscapes' with Big Data Analysis: Focusing on the Case of Shinan-gun, Jeollanam-do (빅데이터 분석을 통한 '문화경관' 유형과 특성 연구: 전라남도 신안군 사례를 중심으로)

  • OH Jungshim
    • Korean Journal of Heritage: History & Science
    • /
    • v.56 no.1
    • /
    • pp.162-180
    • /
    • 2023
  • The World Heritage Committee decided to make "cultural landscapes" a world heritage category in the 16th Session of the UNESCO General Conference. The decision was made from a recognition of the importance of interactions between human beings and the natural environment or between cultural heritage and natural heritage. Many countries have created policies and institutions to protect their own cultural landscapes along with the changing times. Korea, however, has not obviously defined the concepts and categories of its cultural landscapes, but manages policies and institutions based on the concept of a scenic spot, which has some similar meanings. In addition, it even borrows the "list of landscape adjectives," one of the representative methods for managing landscapes, from foreign countries. With this background, this paper suggested how to define cultural landscapes according to the global development flow. It created a list of cultural landscape adjectives by gathering the adjectives that can properly express local cultural landscapes in Korea. In particular, it collected 4,556 articles from a local newspaper by focusing on the case of Shinan-gun, Jeollanam-do, and analyzed key words and adjectives included in them by using big data analysis. The results suggested by this paper, such as the "classification table of cultural landscape types," "list of cultural landscape adjectives" and "network map of nouns/adjectives" can be applied to research on other localities, and furthermore, used as basic data for finding and protecting the characteristics of local cultural landscapes in Korea.

Classification of Industrial Parks and Quarries Using U-Net from KOMPSAT-3/3A Imagery (KOMPSAT-3/3A 영상으로부터 U-Net을 이용한 산업단지와 채석장 분류)

  • Che-Won Park;Hyung-Sup Jung;Won-Jin Lee;Kwang-Jae Lee;Kwan-Young Oh;Jae-Young Chang;Moung-Jin Lee
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.6_3
    • /
    • pp.1679-1692
    • /
    • 2023
  • South Korea is a country that emits a large amount of pollutants as a result of population growth and industrial development and is also severely affected by transboundary air pollution due to its geographical location. As pollutants from both domestic and foreign sources contribute to air pollution in Korea, the location of air pollutant emission sources is crucial for understanding the movement and distribution of pollutants in the atmosphere and establishing national-level air pollution management and response strategies. Based on this background, this study aims to effectively acquire spatial information on domestic and international air pollutant emission sources, which is essential for analyzing air pollution status, by utilizing high-resolution optical satellite images and deep learning-based image segmentation models. In particular, industrial parks and quarries, which have been evaluated as contributing significantly to transboundary air pollution, were selected as the main research subjects, and images of these areas from the KOMPSAT-3 and KOMPSAT-3A satellites were collected, preprocessed, and converted into input and label data for model training. Training the U-Net model on this data yielded an overall accuracy of 0.8484 and a mean Intersection over Union (mIoU) of 0.6490, and the predicted maps extracted object boundaries more accurately than the label data created by coarse annotations.
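The two segmentation metrics reported in this abstract, overall accuracy and mIoU, can be illustrated with a minimal sketch. The class ids, the toy masks, and all function names below are assumptions made for illustration; none of them come from the paper, and the paper's own evaluation code is not shown in the source.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels for every (true class, predicted class) pair."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def overall_accuracy(cm):
    """Fraction of pixels whose predicted class equals the label."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom:  # skip classes absent from both label and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy flattened masks; hypothetical class ids:
# 0 = background, 1 = industrial park, 2 = quarry.
labels = [0, 0, 1, 1, 2, 2, 0, 1]
preds = [0, 1, 1, 1, 2, 0, 0, 1]

cm = confusion_matrix(labels, preds, 3)
print(overall_accuracy(cm))
print(mean_iou(cm))
```

Because mIoU penalizes both false positives and false negatives per class, it is typically lower than overall accuracy on the same prediction, which matches the relation between the two figures quoted in the abstract.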

TAGS: Text Augmentation with Generation and Selection (생성-선정을 통한 텍스트 증강 프레임워크)

  • Kim Kyung Min;Dong Hwan Kim;Seongung Jo;Heung-Seon Oh;Myeong-Ha Hwang
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.10
    • /
    • pp.455-460
    • /
    • 2023
  • Text augmentation is a methodology that creates new augmented texts by transforming or generating original texts for the purpose of improving the performance of NLP models. However, existing text augmentation techniques have limitations such as a lack of expressive diversity, semantic distortion, and a limited number of augmented texts. Recently, text augmentation using large language models and few-shot learning has been able to overcome these limitations, but it also carries a risk of noise caused by incorrect generation. In this paper, we propose a text augmentation method called TAGS that generates multiple candidate texts and selects the appropriate text as the augmented text. TAGS generates various expressions using few-shot learning while effectively selecting suitable data even with a small amount of original text by using contrastive learning and similarity comparison. We applied this method to task-oriented chatbot data and achieved more than a sixty-fold quantitative increase. We also analyzed the generated texts and confirmed that they were semantically and expressively diverse compared to the original texts. Moreover, we trained and evaluated a classification model using the augmented texts and showed that performance improved by more than 0.1915, confirming that the method helps to improve actual model performance.
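The selection half of a generate-then-select pipeline can be sketched as follows. This is not the TAGS implementation: the bag-of-words `embed` function stands in for the contrastively trained encoder the abstract mentions, and the `low`/`high` thresholds, the function names, and the example sentences are all hypothetical. The candidates would come from a few-shot language model in practice; here they are hard-coded.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding: bag-of-words counts (an assumption, not a learned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def select(original, candidates, low=0.3, high=0.95):
    """Keep candidates that stay on topic (>= low) but are not near-copies (<= high)."""
    ref = embed(original)
    return [c for c in candidates if low <= cosine(ref, embed(c)) <= high]


original = "book a table for two tonight"
candidates = [
    "book a table for two tonight",        # exact copy: adds no diversity
    "please book a table for two people",  # paraphrase: should be kept
    "what is the weather in Seoul",        # off topic: semantic distortion
]
selected = select(original, candidates)
print(selected)
```

The lower bound filters out noisy generations, and the upper bound discards candidates that add no expressive diversity; both bounds would need tuning against real data.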

Restoring Omitted Sentence Constituents in Encyclopedia Documents Using Structural SVM (Structural SVM을 이용한 백과사전 문서 내 생략 문장성분 복원)

  • Hwang, Min-Kook;Kim, Youngtae;Ra, Dongyul;Lim, Soojong;Kim, Hyunki
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.2
    • /
    • pp.131-150
    • /
    • 2015
  • Omission of noun phrases for obligatory cases is a common phenomenon in Korean and Japanese sentences, but it is not observed in English. When an argument of a predicate can be filled with a noun phrase co-referential with the title, the argument is more easily omitted in encyclopedia texts. The omitted noun phrase is called a zero anaphor or zero pronoun. Encyclopedias like Wikipedia are a major source for information extraction by intelligent application systems such as information retrieval and question answering systems. However, omission of noun phrases degrades the quality of information extraction. This paper deals with the problem of developing a system that can restore omitted noun phrases in encyclopedia documents. The problem that our system deals with is very similar to zero anaphora resolution, which is one of the important problems in natural language processing. A noun phrase existing in the text that can be used for restoration is called an antecedent. An antecedent must be co-referential with the zero anaphor. While the candidates for the antecedent are only noun phrases in the same text in the case of zero anaphora resolution, the title is also a candidate in our problem. In our system, the first stage is in charge of detecting the zero anaphor. In the second stage, antecedent search is carried out by considering the candidates. If antecedent search fails, an attempt is made, in the third stage, to use the title as the antecedent. The main characteristic of our system is its use of a structural SVM for finding the antecedent. The noun phrases in the text that appear before the position of the zero anaphor comprise the search space. The main technique used in the methods proposed in previous research is to perform binary classification for all the noun phrases in the search space. The noun phrase classified as an antecedent with the highest confidence is selected as the antecedent.
However, we propose in this paper that antecedent search be viewed as the problem of assigning antecedent indicator labels to a sequence of noun phrases. In other words, sequence labeling is employed in antecedent search in the text. We are the first to suggest this idea. To perform sequence labeling, we suggest using a structural SVM which receives a sequence of noun phrases as input and returns the sequence of labels as output. An output label takes one of two values: one indicating that the corresponding noun phrase is the antecedent and the other indicating that it is not. The structural SVM we used is based on the modified Pegasos algorithm, which exploits a subgradient descent methodology used for optimization problems. To train and test our system, we selected a set of Wikipedia texts and constructed an annotated corpus in which gold-standard answers, such as zero anaphors and their possible antecedents, are provided. Training examples are prepared using the annotated corpus and used to train the SVMs and test the system. For zero anaphor detection, sentences are parsed by a syntactic analyzer and omitted subject or object cases are identified. Thus, the performance of our system depends on that of the syntactic analyzer, which is a limitation of our system. When an antecedent is not found in the text, our system tries to use the title to restore the zero anaphor. This is based on binary classification using the regular SVM. The experiment showed that our system's performance is F1 = 68.58%. This means that a state-of-the-art system can be developed with our technique. It is expected that future work that enables the system to utilize semantic information can lead to a significant performance improvement.
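The subgradient descent idea behind Pegasos can be illustrated on the simpler binary case. The sketch below is the plain Pegasos update for a linear SVM without a bias term; it is not the modified structural variant the authors used, and the toy data, the hyperparameters, and the function names are assumptions chosen for illustration.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def pegasos(data, lam=0.1, steps=1000, seed=0):
    """Train a linear SVM (no bias term) by stochastic subgradient descent."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    for t in range(1, steps + 1):
        x, y = data[rng.randrange(len(data))]  # sample one training example
        eta = 1.0 / (lam * t)                  # decaying step size
        violated = y * dot(w, x) < 1.0         # is the hinge loss active?
        # Subgradient of the L2 regularizer: shrink the weights.
        w = [(1.0 - eta * lam) * wi for wi in w]
        if violated:
            # Subgradient of the hinge loss: move toward y * x.
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1


# Toy linearly separable data: (feature vector, label in {+1, -1}).
data = [
    ((2.0, 2.0), 1), ((3.0, 1.0), 1), ((2.0, 3.0), 1),
    ((-2.0, -2.0), -1), ((-3.0, -1.0), -1), ((-1.0, -3.0), -1),
]
w = pegasos(data)
print([predict(w, x) for x, _ in data])
```

In a structural SVM the margin check is replaced by a search for the most violating output sequence, but the shrink-then-correct update has the same shape.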

Multi-Vector Document Embedding Using Semantic Decomposition of Complex Documents (복합 문서의 의미적 분해를 통한 다중 벡터 문서 임베딩 방법론)

  • Park, Jongin;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.3
    • /
    • pp.19-41
    • /
    • 2019
  • With the rapidly increasing demand for text data analysis, research and investment in text mining are being actively conducted not only in academia but also in various industries. Text mining is generally conducted in two steps. In the first step, the text of the collected document is tokenized and structured to convert the original document into a computer-readable form. In the second step, tasks such as document classification, clustering, and topic modeling are conducted according to the purpose of analysis. Until recently, text mining-related studies have focused on the second step, that is, on applications such as document classification, clustering, and topic modeling. However, with the discovery that the text structuring process substantially influences the quality of the analysis results, various embedding methods have actively been studied to improve the quality of analysis results by preserving the meaning of words and documents in the process of representing text data as vectors. Unlike structured data, which can be directly applied to a variety of operations and traditional analysis techniques, unstructured text must first undergo a structuring task that transforms the original document into a form that the computer can understand. Mapping arbitrary objects into a space of a specific dimension while maintaining algebraic properties, in order to structure the text data, is called "embedding." Recently, attempts have been made to embed not only words but also sentences, paragraphs, and entire documents in various ways. In particular, as the demand for document embedding increases rapidly, many algorithms have been developed to support it. Among them, doc2Vec, which extends word2Vec and embeds each document into one vector, is the most widely used. However, the traditional document embedding method represented by doc2Vec generates a vector for each document using the whole corpus included in the document.
This imposes the limitation that the document vector is affected not only by core words but also by miscellaneous words. Additionally, the traditional document embedding schemes usually map each document into a single corresponding vector. Therefore, it is difficult to accurately represent a complex document with multiple subjects as a single vector using the traditional approach. In this paper, we propose a new multi-vector document embedding method to overcome these limitations of the traditional document embedding methods. This study targets documents that explicitly separate body content and keywords. In the case of a document without keywords, this method can be applied after extracting keywords through various analysis methods. However, since this is not the core subject of the proposed method, we introduce the process of applying the proposed method to documents that predefine keywords in the text. The proposed method consists of (1) Parsing, (2) Word Embedding, (3) Keyword Vector Extraction, (4) Keyword Clustering, and (5) Multiple-Vector Generation. The specific process is as follows. All text in a document is tokenized and each token is represented as an N-dimensional real-valued vector through word embedding. After that, to overcome the limitation that the traditional document embedding method is affected not only by core words but also by miscellaneous words, the vectors corresponding to the keywords of each document are extracted to form a set of keyword vectors for each document. Next, clustering is conducted on the set of keywords for each document to identify the multiple subjects included in the document. Finally, a multi-vector is generated from the vectors of the keywords constituting each cluster. Experiments on 3,147 academic papers revealed that the single vector-based traditional approach cannot properly map complex documents because of interference among subjects in each vector.
With the proposed multi-vector based method, we ascertained that complex documents can be vectorized more accurately by eliminating the interference among subjects.
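Steps (3) to (5) of the proposed pipeline can be sketched with a small k-means routine. Everything concrete here is an assumption for illustration: the 2-dimensional keyword vectors are made up (real ones would come from a word embedding model), the farthest-point initialization is one deterministic choice among many, and averaging each cluster is one plausible way to "generate a multi-vector from the vectors of the keywords constituting each cluster", which the abstract does not specify further.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))


def init_centroids(vectors, k):
    """Deterministic farthest-point initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    return centroids


def kmeans(vectors, k, iters=20):
    """Return k centroids; each centroid is one subject vector of the document."""
    centroids = init_centroids(vectors, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: sq_dist(v, centroids[i]))
            clusters[nearest].append(v)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids


# Hypothetical keyword vectors for one document covering two subjects.
keyword_vectors = {
    "svm": (1.0, 0.1),
    "kernel": (0.9, 0.2),
    "landscape": (0.1, 1.0),
    "heritage": (0.2, 0.9),
}
multi_vector = kmeans(list(keyword_vectors.values()), k=2)
print(multi_vector)
```

A single averaged vector for this document would land near (0.55, 0.55), far from either subject, which is the interference effect the abstract describes.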

Using the METHONTOLOGY Approach to a Graduation Screen Ontology Development: An Experiential Investigation of the METHONTOLOGY Framework

  • Park, Jin-Soo;Sung, Ki-Moon;Moon, Se-Won
    • Asia pacific journal of information systems
    • /
    • v.20 no.2
    • /
    • pp.125-155
    • /
    • 2010
  • Ontologies have been adopted in various business and scientific communities as a key component of the Semantic Web. Despite the increasing importance of ontologies, ontology developers still perceive construction tasks as a challenge. A clearly defined and well-structured methodology can reduce the time required to develop an ontology and increase the probability of success of a project. However, no reliable knowledge-engineering methodology for ontology development currently exists; every methodology has been tailored toward the development of a particular ontology. In this study, we developed a Graduation Screen Ontology (GSO). The graduation screen domain was chosen for several reasons. First, the graduation screen process is a complicated task requiring a complex reasoning process. Second, GSO may be reused for other universities because the graduation screen process is similar for most universities. Finally, GSO can be built within a given period because the size of the selected domain is reasonable. No standard ontology development methodology exists; thus, one of the existing ontology development methodologies had to be chosen. The most important considerations for selecting the ontology development methodology of GSO included whether it can be applied to a new domain; whether it covers a broader set of development tasks; and whether it gives sufficient explanation of each development task. We evaluated various ontology development methodologies based on the evaluation framework proposed by Gómez-Pérez et al. We concluded that METHONTOLOGY was the most applicable to the building of GSO for this study. METHONTOLOGY was derived from the experience of developing Chemical Ontology at the Polytechnic University of Madrid by Fernández-López et al. and is regarded as the most mature ontology development methodology.
METHONTOLOGY describes a very detailed approach for building an ontology under a centralized development environment at the conceptual level. This methodology consists of three broad processes, with each process containing specific sub-processes: management (scheduling, control, and quality assurance); development (specification, conceptualization, formalization, implementation, and maintenance); and support process (knowledge acquisition, evaluation, documentation, configuration management, and integration). An ontology development language and ontology development tool for GSO construction also had to be selected. We adopted OWL-DL as the ontology development language. OWL was selected because of its computational quality of consistency in checking and classification, which is crucial in developing coherent and useful ontological models for very complex domains. In addition, Protégé-OWL was chosen for an ontology development tool because it is supported by METHONTOLOGY and is widely used because of its platform-independent characteristics. Based on the GSO development experience of the researchers, some issues relating to the METHONTOLOGY, OWL-DL, and Protégé-OWL were identified. We focused on presenting drawbacks of METHONTOLOGY and discussing how each weakness could be addressed. First, METHONTOLOGY insists that domain experts who do not have ontology construction experience can easily build ontologies. However, it is still difficult for these domain experts to develop a sophisticated ontology, especially if they have insufficient background knowledge related to the ontology. Second, METHONTOLOGY does not include a development stage called the "feasibility study." This pre-development stage helps developers ensure not only that a planned ontology is necessary and sufficiently valuable to begin an ontology building project, but also to determine whether the project will be successful.
Third, METHONTOLOGY excludes an explanation on the use and integration of existing ontologies. If an additional stage for considering reuse is introduced, developers might share benefits of reuse. Fourth, METHONTOLOGY fails to address the importance of collaboration. This methodology needs to explain the allocation of specific tasks to different developer groups, and how to combine these tasks once specific given jobs are completed. Fifth, METHONTOLOGY fails to suggest the methods and techniques applied in the conceptualization stage sufficiently. Introducing methods of concept extraction from multiple informal sources or methods of identifying relations may enhance the quality of ontologies. Sixth, METHONTOLOGY does not provide an evaluation process to confirm whether WebODE perfectly transforms a conceptual ontology into a formal ontology. It also does not guarantee whether the outcomes of the conceptualization stage are completely reflected in the implementation stage. Seventh, METHONTOLOGY needs to add criteria for user evaluation of the actual use of the constructed ontology under user environments. Eighth, although METHONTOLOGY allows continual knowledge acquisition while working on the ontology development process, consistent updates can be difficult for developers. Ninth, METHONTOLOGY demands that developers complete various documents during the conceptualization stage; thus, it can be considered a heavy methodology. Adopting an agile methodology will result in reinforcing active communication among developers and reducing the burden of documentation completion. Finally, this study concludes with contributions and practical implications. No previous research has addressed issues related to METHONTOLOGY from empirical experiences; this study is an initial attempt. In addition, several lessons learned from the development experience are discussed. 
This study also affords some insights for ontology methodology researchers who want to design a more advanced ontology development methodology.

A Destructive Method in the Connection of the Algorithm and Design in the Digital media - Centered on the Rapid Prototyping Systems of Product Design - (디지털미디어 환경(環境)에서 디자인 특성(特性)에 관한 연구(硏究) - 실내제품(室內製品) 디자인을 중심으로 -)

  • Kim Seok-Hwa
    • Journal of Science of Art and Design
    • /
    • v.5
    • /
    • pp.87-129
    • /
    • 2003
  • The purpose of this thesis is to propose a new concept of design of the 21st century, on the basis of the study on the general signification of the structures and the signs of industrial product design, by examining the difference between modern and post-modern design, which is expected to lead the users to different design practice and interpretation of it. The starting point of this study is the different styles and patterns of 'Gestalt' in the post-modern design of the late 20th century from modern design - the factor of determination in industrial product design. That is to say, unlike functional and rational styles of modern product design, the late 20th century is based upon the pluralism characterized by complexity, synthetic and decorativeness. So far, most of the previous studies on design seem to have excluded visual aspects and usability, focused only on effective communication of design phenomena. These partial studies on design, blinded by phenomenal aspects, have resulted in failure to discover a principle of fundamental system. However, design varies according to the times; and the transformation of design is reflected in Design Pragnanz to constitute a new text of design. Therefore, it can be argued that Design Pragnanz serves as an essential factor under influence of the significance of text. In this thesis, therefore, I delve into analysis of the 20th century product design, in the light of Gestalt theory and Design Pragnanz, which have been functioning as the principle of the past design. For this study, I attempted to discover the fundamental elements in modern and post-modern designs, and to examine the formal structure of product design, the users' aesthetic preference and its semantics, from the integrative viewpoint. 
Also, with reference to the history and theory of design, my emphasis is more on fundamental visual phenomena than on structural analysis or the process of visualization in product design, in order to examine the formal properties of modern and post-modern designs. First, in Chapter 1, 'Issues and Background of the Study', I investigated the Gestalt theory and Design Pragnanz, on the premise of a formal distinction between modern and post-modern designs. These theories are founded upon the discussion on visual perception of Gestalt in Germany in the 1910s, in pursuit of the principle of perception centered around the visual perception of human beings. In Chapter 2, I dealt with the functionalism of modern design, as advance preparation for the further study of the product design of the late 20th century. First of all, in Chapter 2-1, I examined the tendency of modern design focused on functionalism, which can be exemplified by the famous statement 'Form follows function'. Excluding all unessential elements in design (decoration, for example), this tendency has attained the position of the international style based on the spirit of the Bauhaus - universality and regularity - in search of geometric order, standardization and rationalization. In Chapter 2-2, I investigated the anthropological viewpoint that modern design started representing culture in a symbolic way including overall aspects of the society - politics, economics and ethics - and its criticism of functionalist design, namely that aesthetic value is lost in exchange for excessive simplicity in style. Moreover, I examined the pluralist phenomena in post-modern design such as kitsch, eclecticism, reactionism, hi-tech and digital design, breaking away from the functionalist purism of modern design. In Chapter 3, I analyzed Gestalt Pragnanz in design in a practical way, against the background of design trends.
To begin with, I selected mass product design among the 20th century products as a target of analysis, highlighting representative styles in each category of the products. For this analysis, I adopted the theory of J. M. Lehnhardt, who graded in percentages the aesthetic and semantic levels of Pragnanz in design expression, and that of J. K. Grutter, who expressed it in the formula M = O : C. I also employed eight units of dichotomies, according to G. D. Birkhoff's aesthetic criteria, for the purpose of scientific classification of the degree of order and complexity in design; and I analyzed phenomenal aspects of design form represented in each unit. For Chapter 4, I conducted a questionnaire about semiological phenomena of Design Pragnanz with 28 units of antonymous adjectives, based upon the research in the previous chapter. Then, I analyzed the process of signification of Design Pragnanz, founded on this research. Furthermore, the interpretation of the analysis served as an explanation of preference, through systematic analysis of Gestalt and Design Pragnanz in product design of the late 20th century. In Chapter 5, I determined the position of Design Pragnanz by integrating the analyses of Gestalt and Pragnanz in modern and post-modern designs. In this process, I revealed the difference of each Design Pragnanz in formal respect, in order to suggest a vision of the future as a result, which will provide systemic and structural stimulation to current design.
