• Title/Summary/Keyword: Vector representation

Optical Multi-Normal Vector Based Iridescence BRDF Compression Method (광학적 다중 법선 벡터 기반 훈색(暈色)현상 BRDF 압축 기법)

  • Ryu, Sae-Woon;Lee, Sang-Hwa;Park, Jong-Il
    • Journal of KIISE: Computer Systems and Theory
    • /
    • v.37 no.3
    • /
    • pp.184-193
    • /
    • 2010
  • This paper proposes a compression and rendering method for biological iridescence BRDFs (Bidirectional Reflectance Distribution Functions). In graphics, iridescence is sometimes called structural color; its main characteristic is that color and brightness change as the viewpoint varies, and BRDF-based rendering is the standard way to reproduce this effect. BRDF methods enable realistic representation under varying view directions, but they demand considerable computing power because of the large amount of data involved. In this paper, we obtain a reflection map from the iridescence BRDF, analyze the colors of the reflection map, and propose a representation based on several colored concentric circles. Each concentric circle represents the beam width of the ray reflected by one normal vector. We synthesize the full set of concentric circles using several virtual optical normal vectors and obtain spectrum information along a line passing through their center. The proposed method enables an IBR (image-based rendering) technique that reproduces realistic illuminance and spectrum distribution with a single texture built from the reduced BRDF data.
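
A rough illustrative sketch of the idea of building a reflection map from several virtual normal vectors: each normal contributes a colored lobe around its mirror-reflection direction. The lobe widths, colors, and Gaussian falloff below are illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch: approximate a view-dependent reflection map as a sum of
# colored lobes, one per virtual optical normal vector.
import numpy as np

def reflect(light, normal):
    """Mirror-reflect a unit light direction about a unit normal: r = 2(n.l)n - l."""
    return 2.0 * np.dot(normal, light) * normal - light

def reflection_map(light, virtual_normals, colors, widths, size=256):
    """Accumulate RGB lobes over a grid of view directions on the upper hemisphere."""
    xs = np.linspace(-1.0, 1.0, size)
    x, y = np.meshgrid(xs, xs)
    z = np.sqrt(np.clip(1.0 - x**2 - y**2, 0.0, None))
    views = np.stack([x, y, z], axis=-1)                      # (size, size, 3)

    img = np.zeros((size, size, 3))
    for n, c, w in zip(virtual_normals, colors, widths):
        r = reflect(light, n / np.linalg.norm(n))
        ang = np.arccos(np.clip(views @ r, -1.0, 1.0))        # angle to the lobe axis
        img += np.exp(-(ang**2) / (2.0 * w**2))[..., None] * np.asarray(c)
    return np.clip(img, 0.0, 1.0)

# Slightly tilted virtual normals produce overlapping colored rings (iridescence-like).
light = np.array([0.0, 0.0, 1.0])
normals = [np.array([0.0, 0.0, 1.0]), np.array([0.05, 0.0, 1.0]), np.array([0.1, 0.0, 1.0])]
colors = [(0.9, 0.2, 0.4), (0.2, 0.8, 0.5), (0.3, 0.4, 0.9)]
widths = [0.10, 0.15, 0.20]
print(reflection_map(light, normals, colors, widths).shape)   # (256, 256, 3)
```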

Classifying Sub-Categories of Apartment Defect Repair Tasks: A Machine Learning Approach (아파트 하자 보수 시설공사 세부공종 머신러닝 분류 시스템에 관한 연구)

  • Kim, Eunhye;Ji, HongGeun;Kim, Jina;Park, Eunil;Ohm, Jay Y.
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.9
    • /
    • pp.359-366
    • /
    • 2021
  • A number of construction companies in Korea invest considerable human and financial resources in building systems for managing apartment defect data and categorizing repair tasks. This study therefore proposes machine learning models that automatically classify defect complaint texts into one of the sub-categories of 'finishing work' (i.e., one of the defect repair tasks). In the proposed models, we employed two word representation methods (Bag-of-Words and Term Frequency-Inverse Document Frequency (TF-IDF)) and two machine learning classifiers (Support Vector Machine and Random Forest). We conducted both binary and multi-class classification over nine sub-categories of finishing work: home appliance installation work, paperwork, painting work, plastering work, interior masonry work, plaster finishing work, indoor furniture installation work, kitchen facility installation work, and tiling work. The classifier combining the TF-IDF representation with Random Forest achieved more than 90% accuracy, precision, recall, and F1 score. These results point to the feasibility of automated defect classification systems based on the proposed machine learning models.
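
A minimal sketch (not the authors' code or data) of the TF-IDF + Random Forest pipeline described above; the complaint texts and labels below are placeholders.

```python
# Hedged sketch: TF-IDF word representation + Random Forest classifier for
# defect-complaint text classification. Texts/labels are illustrative placeholders.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

texts = [
    "욕실 타일 균열 보수 요청", "벽지 들뜸 재시공 요청", "주방 싱크대 문짝 교체",
    "거실 도배 오염", "현관 타일 파손", "싱크대 상판 틈새 보수",
]
labels = [
    "tiling work", "paperwork", "kitchen facility installation work",
    "paperwork", "tiling work", "kitchen facility installation work",
]

X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.3, random_state=42)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),      # TF-IDF representation
    RandomForestClassifier(n_estimators=300, random_state=42),
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test), zero_division=0))
```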

A multi-channel CNN based online review helpfulness prediction model (Multi-channel CNN 기반 온라인 리뷰 유용성 예측 모델 개발에 관한 연구)

  • Li, Xinzhe;Yun, Hyorim;Li, Qinglong;Kim, Jaekyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.171-189
    • /
    • 2022
  • Online reviews play an essential role in consumers' purchasing decisions, so providing helpful and reliable reviews is important. Previous studies on review helpfulness prediction mainly relied on the consistency between the text and rating information of online reviews, but they are limited in representation capacity and in modeling the interaction between review text and rating. We propose a CNN-RHP model that effectively learns this interaction to address these limitations. Multi-channel CNNs are applied to extract a semantic representation of the review text, and ratings are converted into independent high-dimensional embedding vectors with the same dimension as the text vector. The consistency between the review text and the rating is then learned through element-wise operations between the text vector and the star-rating vector. To evaluate the proposed CNN-RHP model, we used online reviews collected from Amazon.com. Experimental results show that the CNN-RHP model outperforms several benchmark models. These results can provide practical implications for review-helpfulness services on online e-commerce platforms.
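
A hedged sketch of the kind of architecture described above: a multi-channel CNN text encoder, a rating embedding of the same dimension, and an element-wise interaction between the two. Layer sizes, kernel sizes, and the output head are assumptions, not the paper's exact CNN-RHP configuration.

```python
# Illustrative Keras sketch: multi-channel CNN over review tokens + star-rating
# embedding, combined by element-wise multiplication for helpfulness prediction.
from tensorflow.keras import Model, layers

VOCAB, MAXLEN, DIM = 20000, 200, 128

text_in = layers.Input(shape=(MAXLEN,), name="review_tokens")
rating_in = layers.Input(shape=(1,), dtype="int32", name="star_rating")   # 1..5

emb = layers.Embedding(VOCAB, DIM)(text_in)
channels = []
for k in (3, 4, 5):                                    # multi-channel: several kernel sizes
    c = layers.Conv1D(DIM, k, activation="relu")(emb)
    channels.append(layers.GlobalMaxPooling1D()(c))
text_vec = layers.Dense(DIM, activation="relu")(layers.Concatenate()(channels))

rating_vec = layers.Flatten()(layers.Embedding(6, DIM)(rating_in))   # same dimension as text_vec
interaction = layers.Multiply()([text_vec, rating_vec])              # element-wise consistency signal

out = layers.Dense(1, activation="sigmoid")(layers.Dense(64, activation="relu")(interaction))
model = Model([text_in, rating_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```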

Character Extraction from Color Map Image Using Interactive Clustering (대화식 클러스터링 기법을 이용한 칼라 지도의 문자 영역 추출에 관한 연구)

  • Ahn, Chang;Park, Chan-Jung;Rhee, Sang-Burm
    • The Transactions of the Korea Information Processing Society
    • /
    • v.4 no.1
    • /
    • pp.270-279
    • /
    • 1997
  • The conversion of printed maps into computerized databases is an enormous task, so automating the conversion process is essential. Efficient computer representation of printed maps and line drawings depends on the codes assigned to characters and symbols and on a vector representation of the graphics. In many cases, maps are printed in a number of layers, where each layer has a distinct color and represents a subset of the map information. To extract the character layer from color map images properly, an interactive clustering and character extraction technique is proposed. Characters are usually separated from graphics by extracting and classifying connected components in the image, but this procedure fails when characters touch or overlap lines, which occurs often in land register maps. By vectorizing line segments, the touching characters and numbers are extracted. The algorithm proposed in this paper is intended to contribute toward solving the color image clustering and touching-character problems.
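
A rough sketch of the two stages discussed above: cluster pixel colors into layers, then run connected-component analysis on the layer that holds the characters. The input path, number of clusters, chosen layer index, and size thresholds are illustrative assumptions (the paper's clustering is interactive, not automatic).

```python
# Hedged sketch: color clustering of a scanned map + connected components
# on the character layer.
import cv2
import numpy as np
from sklearn.cluster import KMeans

img = cv2.imread("map.png")                       # hypothetical scanned color map
pixels = img.reshape(-1, 3).astype(np.float32)

k = 4                                             # assumed number of color layers
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(pixels)
layer_id = 2                                      # character layer index (chosen interactively)
mask = (labels.reshape(img.shape[:2]) == layer_id).astype(np.uint8) * 255

# Connected components: small compact blobs are candidate characters; long thin
# components are likely line segments that may touch characters.
n, cc_labels, stats, _ = cv2.connectedComponentsWithStats(mask, connectivity=8)
for i in range(1, n):
    x, y, w, h, area = stats[i]
    if 20 < area < 2000:                          # assumed size range for characters
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 0, 255), 1)
cv2.imwrite("characters.png", img)
```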

A Word Embedding used Word Sense and Feature Mirror Model (단어 의미와 자질 거울 모델을 이용한 단어 임베딩)

  • Lee, JuSang;Shin, JoonChoul;Ock, CheolYoung
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.4
    • /
    • pp.226-231
    • /
    • 2017
  • Word representation, an important topic in machine-learning-based natural language processing (NLP), expresses a word not as raw text but as a distinguishable symbol such as a vector. Existing word embeddings rely on large corpora so that words appearing in similar contexts are positioned near one another in the vector space. However, corpus-based word embedding requires ever larger corpora because word frequencies are sparse and the vocabulary keeps growing. In this paper, word embedding is performed using dictionary definitions and semantic relationship information (hypernyms and antonyms). Words are trained with the feature mirror model (FMM), a modification of Skip-Gram (Word2Vec). Words with similar senses obtain similar vectors, and the vectors of antonymous words can also be distinguished.
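
This is not the paper's feature mirror model, only a rough stand-in for the underlying idea: train a Skip-Gram embedding on "sentences" built from a headword, its definition words, and its hypernyms, so that words sharing definition features end up with similar vectors. The dictionary entries below are invented examples.

```python
# Hedged sketch: dictionary-definition-based embedding via ordinary Skip-Gram.
from gensim.models import Word2Vec

# Hypothetical dictionary entries: headword -> (definition tokens, hypernyms)
entries = {
    "sparrow": (["small", "brown", "bird"], ["bird"]),
    "eagle":   (["large", "bird", "of", "prey"], ["bird"]),
    "trout":   (["freshwater", "fish"], ["fish"]),
}
sentences = [[head] + defn + hyper for head, (defn, hyper) in entries.items()]

model = Word2Vec(sentences, vector_size=50, window=5, min_count=1, sg=1, epochs=200)
print(model.wv.similarity("sparrow", "eagle"))   # tends to be higher: shared 'bird' feature
print(model.wv.similarity("sparrow", "trout"))   # tends to be lower: fewer shared features
```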

An Implementation of an ENC Representation System which meets S-52 presentation specification and S-57 transfer standards (S-52 표현사양 및 S-57 교환표준을 만족하는 전자해도 표현 시스템 구현)

  • 이희용;서상현
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.4 no.2
    • /
    • pp.469-478
    • /
    • 2000
  • With the advent of the digital era, ECDIS has emerged as a new navigation aid that should bring significant benefits to safe navigation. More than a simple graphics display, ECDIS is a new type of navigation system capable of providing integrated geographical and textual information. As the official vector data for ECDIS, an ENC consists of spatial and feature data describing objects in the form of points, lines, and areas. The IHO has published international standards for ENC, namely S-52 (Specification for Chart Content and Display Aspects of ECDIS) and S-57 (IHO Transfer Standard for Digital Hydrographic Data). This paper deals with the implementation of an ENC representation system that meets the S-52 presentation specification and the S-57 transfer standard by analyzing S-57 data structures, converting them into appropriate internal data structures, and rendering them on screen according to the S-52 presentation specification.
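
A minimal sketch (not the system described above) of reading S-57 ENC data with GDAL/OGR's S-57 driver; the cell file name is hypothetical, and S-52 symbolization is not shown.

```python
# Hedged sketch: enumerate S-57 object-class layers and inspect feature geometries
# (point / line / area primitives) from an ENC cell using GDAL/OGR.
from osgeo import ogr

ds = ogr.Open("GB4X0000.000")                    # hypothetical ENC cell file
if ds is not None:
    # One OGR layer per S-57 object class.
    for i in range(ds.GetLayerCount()):
        layer = ds.GetLayer(i)
        print(layer.GetName(), layer.GetFeatureCount())

    # Inspect one object class (DEPARE = depth area), if present in the cell.
    depare = ds.GetLayerByName("DEPARE")
    if depare is not None:
        for feature in depare:
            geom = feature.GetGeometryRef()
            if geom is not None:
                print(geom.GetGeometryName())
```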

An Implementation of an ENC Representation System which meets S-52 presentation specification and S-57 transfer standards (S-52 표현사양 및 S-57 교환표준을 만족하는 전자해도 표현 시스템 구현)

  • 서상현;이희용
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 1999.11a
    • /
    • pp.146-150
    • /
    • 1999
  • With the advent of the digital era, ECDIS has emerged as a new navigation aid that should bring significant benefits to safe navigation. More than a simple graphics display, ECDIS is a new type of navigation system capable of providing integrated geographical and textual information. As the official vector data for ECDIS, an ENC consists of spatial and feature data describing objects in the form of points, lines, and areas. The IHO has published international standards for ENC, namely S-52 (Specification for Chart Content and Display Aspects of ECDIS) and S-57 (IHO Transfer Standard for Digital Hydrographic Data). This paper deals with the implementation of an ENC representation system that meets the S-52 presentation specification and the S-57 transfer standard by analyzing S-57 data structures, converting them into appropriate internal data structures, and rendering them on screen according to the S-52 presentation specification.

Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.59-83
    • /
    • 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various deep learning models have been actively applied to English texts. In deep-learning-based sentiment analysis of English, the sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. Here, word vectors generally refer to vector representations of words obtained by splitting a sentence on space characters. There are several ways to derive word vectors; one of them is Word2Vec, which was used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data, and these vectors have been widely used in sentiment analysis of reviews from fields such as restaurants, movies, laptops, and cameras. Unlike in English, the morpheme plays an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with well-developed postpositions and endings. A morpheme is the smallest meaningful unit of a language, and a word consists of one or more morphemes; for example, the word '예쁘고' consists of the morphemes '예쁘' (adjective stem) and '고' (connective ending). Reflecting the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study we use 'morpheme vectors' as the input to a deep learning model rather than the 'word vectors' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. At this point several questions arise. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors so as to improve the classification accuracy of a deep learning model? Is it proper to apply a typical word vector model, which relies primarily on the surface form of words, to Korean with its high ratio of homonyms? Will text preprocessing such as correcting spelling or spacing errors affect classification accuracy, especially when drawing morpheme vectors from Korean product reviews full of grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying deep learning models to Korean texts. As a starting point, we summarize them as three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors derived from grammatically correct texts of a domain other than the analysis target, or morpheme vectors derived from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with respect to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can a satisfactory level of classification accuracy be achieved when applying deep learning to Korean sentiment analysis? To address these research questions, we generate various types of morpheme vectors reflecting them and compare the resulting classification accuracy through a non-static CNN (Convolutional Neural Network) model that takes the morpheme vectors as input. As training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used.
To derive the morpheme vectors, we use data from both the same domain as the target and a different domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 Naver News articles, which arguably correspond to Google's News data. The six primary sets of morpheme vectors constructed in this study differ along three criteria. First, they come from two data sources: Naver News, with high grammatical correctness, and Naver Shopping's cosmetics product reviews, with low grammatical correctness. Second, they differ in the degree of preprocessing, namely sentence splitting only, or additional spelling and spacing corrections after sentence splitting. Third, they differ in the form of input fed into the word vector model: either the morphemes themselves or the morphemes with their POS tags attached. The morpheme vectors further vary in the range of POS tags considered, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300; a sketch of this derivation step appears below. The results suggest that using text from the same domain even with lower grammatical correctness, performing spelling and spacing corrections as well as sentence splitting, and incorporating morphemes of all POS tags including the incomprehensible category lead to better classification accuracy. POS tag attachment, devised for the high proportion of homonyms in Korean, and the minimum frequency threshold for including a morpheme do not appear to have any definite influence on classification accuracy.
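
A small sketch of the morpheme-vector derivation step described above (CBOW, context window 5, 300 dimensions). The Okt tokenizer from konlpy stands in for the paper's morphological analysis, and the review corpus is a placeholder.

```python
# Hedged sketch: derive 300-dimensional morpheme vectors with CBOW Word2Vec.
from gensim.models import Word2Vec
from konlpy.tag import Okt

reviews = ["배송이 빨라서 좋았어요", "색상이 예쁘고 발림성도 좋아요"]   # placeholder reviews
okt = Okt()
morph_sentences = [okt.morphs(r) for r in reviews]      # split each review into morphemes

model = Word2Vec(
    morph_sentences,
    vector_size=300,   # vector dimension used in the paper
    window=5,          # context window used in the paper
    sg=0,              # CBOW
    min_count=1,
)
# The resulting morpheme vectors can initialize the embedding layer of a CNN classifier.
print(model.wv.index_to_key[:10])
```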

Intrusion Detection Method Using Unsupervised Learning-Based Embedding and Autoencoder (비지도 학습 기반의 임베딩과 오토인코더를 사용한 침입 탐지 방법)

  • Junwoo Lee;Kangseok Kim
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.8
    • /
    • pp.355-364
    • /
    • 2023
  • As advanced cyber threats continue to increase, it is difficult to detect new types of cyber attacks with existing pattern- or signature-based intrusion detection methods, so research on anomaly detection using data-driven artificial intelligence is growing. Supervised anomaly detection methods are hard to use in real environments because they require sufficient labeled data for training, so unsupervised methods that learn from normal data and detect anomalies from patterns in the data itself have been actively studied. This study therefore aims to extract latent vectors that preserve useful sequence information from sequential log data and to develop an anomaly detection model using those latent vectors. Word2Vec was used to create a dense vector representation reflecting the characteristics of each sequence, and an unsupervised autoencoder was developed to extract latent vectors from the sequences expressed as dense vectors. Three autoencoder variants were built: a denoising autoencoder based on the recurrent GRU (Gated Recurrent Unit), suited to sequence data; a one-dimensional convolutional autoencoder, to address the limited short-term memory problem a GRU can have; and an autoencoder combining GRU and one-dimensional convolution. The experiments used the time-series-based NGIDS (Next Generation IDS Dataset) data. The autoencoder combining GRU and one-dimensional convolution was more efficient than the GRU-only or convolution-only models in terms of the training time needed to extract useful latent patterns from the training data, and it showed stable performance with smaller fluctuations in anomaly detection.
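
A condensed sketch of the pipeline described above, under assumed shapes and sizes: Word2Vec turns each event in a sequence into a dense vector, and a GRU-based denoising autoencoder learns to reconstruct the clean sequence; reconstruction error then serves as the anomaly score. The event sequences, noise level, and dimensions are placeholders, not the paper's NGIDS setup.

```python
# Hedged sketch: Word2Vec event embedding + GRU denoising autoencoder for
# sequence anomaly detection.
import numpy as np
from gensim.models import Word2Vec
from tensorflow.keras import Model, layers

# Placeholder event sequences (e.g., system-call names from normal logs).
sequences = [["open", "read", "write", "close"], ["open", "mmap", "read", "close"]]
w2v = Word2Vec(sequences, vector_size=32, window=3, min_count=1, sg=1)

SEQ_LEN, DIM, LATENT = 4, 32, 16
X = np.stack([[w2v.wv[e] for e in seq] for seq in sequences])     # (n, SEQ_LEN, DIM)
X_noisy = X + np.random.normal(0.0, 0.1, X.shape)                 # denoising: corrupt the input

inp = layers.Input(shape=(SEQ_LEN, DIM))
latent = layers.GRU(LATENT)(inp)                                  # latent vector of the sequence
dec = layers.GRU(DIM, return_sequences=True)(layers.RepeatVector(SEQ_LEN)(latent))
ae = Model(inp, dec)
ae.compile(optimizer="adam", loss="mse")
ae.fit(X_noisy, X, epochs=10, verbose=0)

# Anomaly score: reconstruction error per sequence (higher = more anomalous).
print(np.mean((ae.predict(X, verbose=0) - X) ** 2, axis=(1, 2)))
```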

Emotion Transition Model based Music Classification Scheme for Music Recommendation (음악 추천을 위한 감정 전이 모델 기반의 음악 분류 기법)

  • Han, Byeong-Jun;Hwang, Een-Jun
    • Journal of IKEEE
    • /
    • v.13 no.2
    • /
    • pp.159-166
    • /
    • 2009
  • Much research so far has been done on retrieving music information using static classification descriptors such as genre and mood. Since static descriptors are based on diverse content-based musical features, they are effective for retrieving music that is similar in terms of those features. However, the human emotion or mood transition triggered by music enables more effective and sophisticated queries in music retrieval, and few studies have evaluated this effect. With a formal representation of such mood transitions, personalized services such as music recommendation can be provided more effectively. In this paper, we first propose an Emotion State Transition Model (ESTM) for describing human mood transitions caused by music and then describe a music classification and recommendation scheme based on the ESTM. In the experiment, diverse content-based features were extracted from music clips, dimensionally reduced by NMF (Non-negative Matrix Factorization), and classified by an SVM (Support Vector Machine). In the performance analysis, we achieved an average accuracy of 67.54% and a maximum accuracy of 87.78%.
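
A small sketch of the feature pipeline mentioned above (content-based features, NMF dimensionality reduction, SVM classification); the random non-negative feature matrix and labels stand in for the paper's musical features and ESTM classes.

```python
# Hedged sketch: NMF dimensionality reduction followed by an SVM classifier.
import numpy as np
from sklearn.decomposition import NMF
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((60, 128))            # placeholder: non-negative features per music clip
y = rng.integers(0, 4, size=60)      # placeholder: emotion-transition class labels

model = make_pipeline(
    NMF(n_components=20, init="nndsvda", max_iter=500),  # non-negative dimensionality reduction
    SVC(kernel="rbf", C=10.0),
)
model.fit(X, y)
print(model.score(X, y))             # training accuracy of the sketch pipeline
```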
