• Title/Summary/Keyword: continuous attribute (연속형 속성)

Search Result 35, Processing Time 0.032 seconds

Discretization of continuous-valued attributes considering data distribution (데이터 분포를 고려한 연속 값 속성의 이산화)

  • Lee, Sang-Hoon;Park, Jung-Eun;Oh, Kyung-Whan
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2003.05a
    • /
    • pp.217-220
    • /
    • 2003
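  • This paper proposes a new method that converts continuous values into categorical form, without requiring any input parameters, by considering how the target attribute (class) values are distributed along each attribute. For each attribute, the distribution of the target attribute is mapped onto a one-dimensional space, and intervals are clustered according to criteria such as the density of each target-attribute value and its degree of overlap with the other values. The resulting clusters are each based on probabilistic values that can predict the target attribute, and their discretization boundaries minimize the loss of the information that each attribute provides. The improved performance of the proposed discretization method is confirmed using the C4.5 algorithm and data from the UCI Machine Learning Data Repository.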

Discretization of Continuous-Valued Attributes considering Data Distribution (데이터 분포를 고려한 연속 값 속성의 이산화)

  • Lee, Sang-Hoon;Park, Jung-Eun;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.13 no.4
    • /
    • pp.391-396
    • /
    • 2003
  • This paper proposes a new approach that converts continuous-valued attributes into categorical ones by considering the distribution of the target attribute (class). The approach makes it possible to obtain optimal interval boundaries from the distribution of the data itself, without requiring any parameters. For each attribute, the distribution of the target attribute is projected onto a one-dimensional space, and this space is clustered according to criteria such as the density of each target-attribute value and the amount of overlap among those densities. Clusters formed in this way are based on probabilities that can predict the target attribute of an instance, so their interval boundaries minimize the loss of information from the original data. The improved performance of the proposed discretization method is validated using the C4.5 algorithm and data sets from the UCI Machine Learning Data Repository.

Fuzzy discretization with spatial distribution of data and Its application to feature selection (데이터의 공간적 분포를 고려한 퍼지 이산화와 특징선택에의 응용)

  • Son, Chang-Sik;Shin, A-Mi;Lee, In-Hee;Park, Hee-Joon;Park, Hyoung-Seob;Kim, Yoon-Nyun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.20 no.2
    • /
    • pp.165-172
    • /
    • 2010
  • In clinical data mining, choosing the optimal subset of features is important, not only to reduce computational complexity but also to improve the usefulness of the model constructed from the given data. Moreover, the threshold values (i.e., cut-off points) of the selected features are used by experts as clinical decision criteria for the differential diagnosis of diseases. In this paper, we propose a fuzzy discretization approach, based on the spatial distribution of data with continuous attributes, which is evaluated by measuring the degree of separation of redundant attribute values in the overlapping region. The weighted average of the redundant attribute values is then used to determine the threshold value for each feature, and rough set theory is used to select a subset of relevant features from the full feature set. To verify the validity of the proposed method, we compared experimental results on a classification problem using 668 patients with a chief complaint of dyspnea, across three existing discretization methods (equal-width, equal-frequency, and entropy-based) and the proposed method. The experimental results confirm that discretization methods with fuzzy partitions give better results than those with hard partitions on two evaluation measures: average classification accuracy and G-mean.
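
    The abstract above names equal-width and equal-frequency discretization as baselines but does not define them. As a minimal sketch of those two standard hard-partition schemes (not the paper's fuzzy method), the following Python computes cut points for each; the function names and the sample data are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right

def equal_width_cuts(values, k):
    """Return k-1 cut points splitting [min, max] into k equally wide intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def equal_frequency_cuts(values, k):
    """Return k-1 cut points so each interval holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, k):
        idx = i * n // k
        # Place the cut midway between the two neighbouring sorted values.
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts

def assign_bins(values, cuts):
    """Map each value to the index of the interval it falls in."""
    return [bisect_right(cuts, v) for v in values]

# Hypothetical sample with one outlier, to show how the two schemes differ.
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
ew = equal_width_cuts(values, 3)
ef = equal_frequency_cuts(values, 3)
print(ew, assign_bins(values, ew))
print(ef, assign_bins(values, ef))
```

    With an outlier present, equal-width binning puts almost every value into the first interval, while equal-frequency binning keeps the intervals balanced. Heavily tied data can produce duplicate equal-frequency cut points, which this sketch does not handle.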

A Study on Conversational AI Agent based on Continual Learning

  • Chae-Lim, Park;So-Yeop, Yoo;Ok-Ran, Jeong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.1
    • /
    • pp.27-38
    • /
    • 2023
  • In this paper, we propose a conversational AI agent based on continual learning that can continuously learn and grow with new data over time. The agent consists of three main components: a task manager, a user attribute extraction model, and an auto-growing knowledge graph. When the task manager finds new data during a conversation with a user, it creates a new task using previously learned knowledge. The user attribute extraction model extracts the user's characteristics from the new task, and the auto-growing knowledge graph continuously learns new external knowledge. Unlike existing conversational AI agents, which learn from a limited dataset, the proposed method enables conversations based on continuous learning of user attributes and knowledge. A conversational AI agent with continual learning can respond in a personalized way as conversations with users accumulate, and it can keep responding to new knowledge. This paper validates the feasibility of the proposed method through experiments on how the performance of dialogue generation models changes over time.

A Personal Credit Estimate Algorithm Using Artificial Neural Network (인공신경망을 이용한 개인 신용평가 알고리즘)

  • Lim Sung-Bin;Choi Woo-Kyung;Kim Sung-Hyun;Kim Yong-Min;Jeon Hong-Tae
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2005.04a
    • /
    • pp.293-296
    • /
    • 2005
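  • In Korea, the rapid growth of household credit and a sharp rise in credit delinquency have recently caused the personal credit sector to undermine the soundness of financial institutions. To prevent such potential problems in advance, demand for personal credit evaluation is growing at financial institutions and elsewhere. Neural network models classify patterns through iterative learning from given data and, depending on the model and learning method, can handle input and target variables that are either continuous or discrete; they can therefore perform well at assigning credit grades from the diverse and complex data of individuals. This paper proposes an algorithm that uses a neural network model to evaluate individuals' credit objectively and uniformly and to assign credit grades.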

Expectation and Expectation Gap towards intelligent properties of AI-based Conversational Agent (인공지능 대화형 에이전트의 지능적 속성에 대한 기대와 기대 격차)

  • Park, Hyunah;Tae, Moonyoung;Huh, Youngjin;Lee, Joonhwan
    • Journal of the HCI Society of Korea
    • /
    • v.14 no.1
    • /
    • pp.15-22
    • /
    • 2019
  • The purpose of this study is to investigate users' expectations and expectation gaps regarding the attributes of a smart speaker as an intelligent agent, i.e., autonomy, sociality, responsiveness, activeness, time continuity, and goal orientation. To this end, semi-structured interviews were conducted with smart speaker users and analyzed based on grounded theory. The results show that people have a large expectation gap regarding the sociality and human-likeness of smart speakers, due to limitations in technology. The responsiveness of smart speakers was found to have a positive expectation gap. For the memory of time-sequential information, there was an ambivalent expectation gap depending on the degree of information sensitivity and the presentation method. We also found a low level of expectation for the autonomous aspects of smart speakers. In addition, proactive aspects were preferred only when appropriate for the context. This study presents implications for designing interactions with smart speakers and for managing expectations.

Multi-Interval Discretization of Continuous-Valued Attributes for Constructing Incremental Decision Tree (증분 의사결정 트리 구축을 위한 연속형 속성의 다구간 이산화)

  • Baek, Jun-Geol;Kim, Chang-Ouk;Kim, Sung-Shick
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.27 no.4
    • /
    • pp.394-405
    • /
    • 2001
  • Since most real-world application data involve continuous-valued attributes, properly handling discretization when constructing a decision tree is an important problem. A continuous-valued attribute is typically discretized during decision tree generation by recursively partitioning its range into two intervals. In this paper, by removing the restriction to binary discretization, we present a hybrid multi-interval discretization algorithm that divides the range of a continuous-valued attribute into multiple intervals. Experiments using a semiconductor etching machine verify that our discretization algorithm constructs a more efficient incremental decision tree than previously proposed discretization algorithms.

Speed Evaluation of R Commands (R명령어들의 속도 평가)

  • Lee, Jin-A;Heo, Mun-Yeol
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2003.10a
    • /
    • pp.301-305
    • /
    • 2003
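  • R has recently come into wide use in many fields, particularly in simulation and statistics-related research. In simulation, the execution speed of an R program is very important because of the many repetitions involved. R is also widely used in data mining. We ran an experiment in R that discretizes continuous variables with the Fayyad & Irani method as part of data preprocessing for data mining. The program uses a recursive function and, in the process, calls several functions for building frequency tables, computing information, splitting frequency tables, and applying the stopping rule. Discretizing the Iono data set from the UCI database (35 attributes, about 1,000 cases) with our R code takes a considerable amount of time, more than 7 seconds. By contrast, running the same Fayyad & Irani method in Weka, which is written in Java, discretized a data set of this size so quickly that the run time was almost negligible. This difference led us to look for ways to speed up R programs. In this presentation, we select several time-consuming pieces of R code, write more efficient alternatives, and compare their execution speeds. For some commands, we also compare against SAS.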
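
    The abstract lists the parts of the Fayyad & Irani procedure (class frequency counts, information computation, splitting, and a stopping rule applied recursively) but does not reproduce the authors' R code. The Python sketch below is an illustrative reconstruction of the published entropy-based method with its MDL stopping criterion, not the authors' program; all function names are hypothetical, and it is written for clarity rather than speed.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_cut(pairs):
    """Return (index, weighted entropy) of the best binary split, or None.

    `pairs` is a list of (value, label) tuples sorted by value.
    """
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        e = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or e < best[1]:
            best = (i, e)
    return best

def mdl_accepts(labels, left, right, split_entropy):
    """MDL stopping rule: accept the split only if the gain exceeds the cost."""
    n = len(labels)
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    ent = entropy(labels)
    gain = ent - split_entropy
    delta = log2(3 ** k - 2) - (k * ent - k1 * entropy(left) - k2 * entropy(right))
    return gain > (log2(n - 1) + delta) / n

def mdl_cut_points(values, labels):
    """Recursively discretize one continuous attribute; return sorted cut points."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    cuts = []

    def recurse(part):
        found = best_cut(part)
        if found is None:
            return
        i, e = found
        ys = [y for _, y in part]
        if not mdl_accepts(ys, ys[:i], ys[i:], e):
            return
        cuts.append((part[i - 1][0] + part[i][0]) / 2)
        recurse(part[:i])
        recurse(part[i:])

    recurse(pairs)
    return sorted(cuts)

# Hypothetical data: two well-separated classes along one attribute.
print(mdl_cut_points(list(range(1, 21)), ["a"] * 10 + ["b"] * 10))
```

    This version rebuilds the label lists at every candidate cut, so each level of recursion costs quadratic time. Replacing those rebuilds with running class counts is one plausible way to speed it up; the abstract does not say which changes the authors actually made to their R code.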

Discretization of Continuous Attributes based on Rough Set Theory and SOM (러브집합이론과 SOM을 이용한 연속형 속성의 이산화)

  • Seo Wan-Seok;Kim Jae-Yearn
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.28 no.1
    • /
    • pp.1-7
    • /
    • 2005
  • In recent years, data mining has been widely used in the information industry to turn huge amounts of data into useful information and knowledge. When analyzing a data set with continuous values in order to gain knowledge through data mining, we often apply a process called discretization, which divides an attribute's values into intervals. These intervals form new values for the attribute and make it possible to reduce the size of the data set. In addition, discretization based on rough set theory has the advantage of being easy to apply. In this paper, we suggest a discretization algorithm based on rough set theory and SOM (Self-Organizing Map) as a means of extracting valuable information from large data sets, which can be employed even when professional knowledge of the field is lacking.

A Study on Optical Seemless of Discrete LED panels with Focusing Effect of prism Structure (프리즘 구조의 집광효과를 이용한 이산형 LED 패널의 광학적 연속성 구현에 관한 연구)

  • Cho, Sung-Hwan;Kim, Eung-Bo;Choi, Won-Seok;Joung, Yeun-Ho
    • Journal of Satellite, Information and Communications
    • /
    • v.12 no.2
    • /
    • pp.11-14
    • /
    • 2017
  • In this paper, we introduce a light-focusing method that uses a prism structure to solve the optical discontinuity of conventional external signage LED panels. The prism structures were patterned on a transparent polycarbonate substrate using MEMS and femtosecond laser processes. We confirmed that the patterned prism structures on the substrate produced artificial LED light in the empty space between the panels through the light-guide effect of the structure. The lateral position of the artificial light was controlled by the thickness of the polycarbonate substrate. This cost-effective prism-patterned transparent film can be used on digital signage LED panels to achieve good optical communication.