• Title/Abstract/Keywords: Multi-Instance Data

31 search results (processing time: 0.022 s)

Efficiently Processing Skyline Query on Multi-Instance Data

  • Chiu, Shu-I;Hsu, Kuo-Wei
    • Journal of Information Processing Systems
    • /
    • Vol. 13, No. 5
    • /
    • pp.1277-1298
    • /
    • 2017
  • Related to the maximum vector problem, a skyline query discovers the dominating tuples in a set of tuples, where each tuple describes an object (such as a hotel) in several dimensions (such as the price and the distance to the beach). A tuple, an instance of an object, dominates another tuple if it is equally good or better in all dimensions and better in at least one dimension. Traditionally, skyline queries are defined upon single-instance data, that is, upon objects each of which is associated with one instance. However, in some cases, an object is associated not with a single instance but with multiple instances. For example, on a review website, many users assign scores to a product or a service, and each user's score is an instance of the object representing the product or the service. Such data is an example of multi-instance data. Unlike most (if not all) other work, which considers the traditional setting, we consider skyline queries defined upon multi-instance data. We define the dominance calculation and propose an algorithm to reduce its computational cost. We use synthetic and real data to evaluate the proposed methods, and the results demonstrate their utility.
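
The dominance rule stated in the abstract above can be sketched for the traditional single-instance case as follows. This is only an illustration: the function names, the sample hotel data, and the assumption that smaller values are better in every dimension are hypothetical, and the sketch does not reproduce the multi-instance dominance calculation or the cost-reduction algorithm proposed in the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption for this sketch: smaller is better in every dimension.
    a dominates b when it is no worse everywhere and strictly better somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels described as (price, distance to the beach in km).
hotels = [(100, 2.0), (80, 3.0), (120, 1.0), (110, 2.5), (80, 3.5)]
result = skyline(hotels)
print(result)
```

Here `(110, 2.5)` drops out because `(100, 2.0)` is both cheaper and closer, and `(80, 3.5)` drops out because `(80, 3.0)` costs the same but is closer.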

Learning Multiple Instance Support Vector Machine through Positive Data Distribution

  • 황중원;박성배;이상조
    • Journal of KIISE
    • /
    • Vol. 42, No. 2
    • /
    • pp.227-234
    • /
    • 2015
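  • This paper proposes a multiple instance support vector machine learning algorithm that takes the data distribution into account. Existing methods find the margin by considering only the "most positive" instance in each positive bag. In general, in data represented as multiple instances, the instances in positive bags that are actually positive lie close to one another in the feature space. Building on existing multiple instance support vector machine learning algorithms, the proposed method finds the intersection point of the positive instances and selects the "most positive" instance by computing the distance to that point. There are two ways to obtain the pivot point, which is the intersection point of the positive instances: using the centroid of the positive instances estimated during training, or finding, at the start of training, the centroid of the positive instances expected to be most positive. Experiments on 12 benchmark multiple instance data sets show that the proposed method outperforms existing learning algorithms.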
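
The distance-to-pivot selection step described in the abstract above might look like the following minimal sketch. It assumes Euclidean distance and a plain centroid as the pivot point; the function names and the toy bags are hypothetical, and the SVM training loop and the paper's exact update rule are not shown.

```python
import math
from typing import List, Sequence, Tuple

Instance = Tuple[float, ...]


def centroid(points: Sequence[Instance]) -> Instance:
    """Mean of a set of points, used here as the pivot point."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dims))


def select_most_positive(positive_bags: List[List[Instance]],
                         pivot: Instance) -> List[Instance]:
    """From each positive bag, pick the instance closest to the pivot."""
    return [min(bag, key=lambda x: math.dist(x, pivot)) for bag in positive_bags]


# Toy positive bags (hypothetical data): each bag mixes one plausible
# positive instance with one outlier.
bags = [
    [(0.0, 0.0), (5.0, 5.0)],
    [(4.0, 6.0), (9.0, 0.0)],
]
# Assumed current estimate of the positive instances, e.g. from a previous
# training iteration or from an initial guess at the start of training.
estimated_positives = [(5.0, 5.0), (4.0, 6.0)]

pivot = centroid(estimated_positives)
chosen = select_most_positive(bags, pivot)
print(pivot, chosen)
```

In an iterative scheme, the chosen instances would feed the next SVM fit and the pivot would then be recomputed from the newly estimated positives; that loop is left out of this sketch.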

DATA MINING APPROACH TO PARAMETRIC COST ESTIMATE IN EARLY DESIGN STAGE AND ANALYTICAL CHARACTERIZATION ON OLAP (ON-LINE ANALYTICAL PROCESSING)

  • JaeHo Cho;HyunKyun Jung;JaeYoul Chun
    • International Conference Proceedings
    • /
    • The 4th International Conference on Construction Engineering and Project Management Organized by the University of New South Wales
    • /
    • pp.176-181
    • /
    • 2011
  • The role of a cost modeler is to facilitate the design process through the systematic application of cost factors so as to maintain sensible and economic relationships between cost, quantity, utility, and appearance. These relationships help to achieve the client's requirements within an agreed budget. The purpose of this study is to develop a parametric cost estimating model for the early design stage using the multi-dimensional system of OLAP (On-line Analytical Processing), based on case quantity data related to architectural design features. Parametric cost estimating models have been adopted to support decision making in the early design stage. These models typically use a similar instance or a pattern from historical cases. To use this type of data model effectively, data classification and prediction methods must be set. One such method is to find the similar class using an attribute selection measure in the multi-dimensional data model. This research therefore analyzes the relevant attributes influenced by architectural design features, using the case-based quantity data that feed the parametric cost estimating model. The relevant attributes can be analyzed by Analytical Characterization, which helps determine which attributes to include in the OLAP multi-dimensional model.

Intelligent Intrusion Detection and Prevention System using Smart Multi-instance Multi-label Learning Protocol for Tactical Mobile Adhoc Networks

  • Roopa, M.;Raja, S. Selvakumar
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 12, No. 6
    • /
    • pp.2895-2921
    • /
    • 2018
  • Security has become one of the major concerns in mobile ad hoc networks (MANETs). Data and voice communication among roaming battlefield entities (such as platoons of soldiers, inter-battlefield tanks, and military aircraft) served by MANETs poses several challenges. Providing highly reliable communication among the nodes requires a complex security strategy that addresses threats such as unauthorized network access, man-in-the-middle attacks, and denial of service. An Intrusion Detection and Prevention System (IDPS) is a crucial ingredient in addressing these threats. In a MANET, the IDPS is managed by a Command, Control, Communication and Intelligence (C3I) system, which consists of networked computers in the tactical battle area that give commanders comprehensive situation awareness for timely and optimal decision-making. A key issue in such an IDPS mechanism is the lack of a Smart Learning Engine. We propose a novel behavior-based "Smart Multi-Instance Multi-Label Intrusion Detection and Prevention System (MIML-IDPS)" that follows a distributed and centralized architecture to support a robust C3I system. The protocol is deployed in a virtually clustered, non-uniform network topology with dynamic election of several virtual head nodes, each acting as a client intrusion detection agent connected to a centralized IDPS server located at the Command and Control Center. The distributed virtual client nodes serve as intelligent decision processing units, and the centralized IDPS server acts as the Smart MIML decision-making unit. Simulation and experimental analysis show that the proposed protocol exhibits computational intelligence with counterattacks, efficient memory utilization, classification accuracy, and decision convergence in securing a C3I system in a tactical battlefield environment.

A Geometric Active Contour Model Using Multi Resolution Level Set Methods

  • 김성곤;김두영
    • The Transactions of the Korea Information Processing Society
    • /
    • Vol. 6, No. 10
    • /
    • pp.2809-2815
    • /
    • 1999
  • Level set and active contour (snake) models are extensively used for image segmentation and shape extraction in computer vision. Snakes rely on energy minimization, whereas the level set method is based on curve evolution, to extract contours from image data. Each model has its own drawbacks. For instance, a snake performs poorly unless it is placed close to the desired shape boundary, and it has difficulty when the image contains multiple objects to be extracted. The level set method, in contrast, is free of the initial curve position problem and can handle the topology of multiple objects. Nevertheless, the level set method requires much more computation time than the snake model. In this paper, we combine the strengths of the two models and also apply a multi-resolution algorithm to speed up the process without degrading shape extraction performance.

A Study on the Business Model of a Fan Community Platform 'Weverse'

  • Song, Minzheong
    • International journal of advanced smart convergence
    • /
    • Vol. 10, No. 4
    • /
    • pp.172-182
    • /
    • 2021
  • We look at the development of the fan community platform 'Weverse' from a two-sided platform (TSP) to a multi-sided platform (MSP) and investigate its platform business model. From the theoretical perspective of the Rocket Model, the results reveal that Weverse first focuses on inviting as many artists as possible, starting with BTS, and then naturally attracts the new artists' fans. To make this TSP succeed, it forms an MSP, 'Weverse Shop', to meet the relevant needs of both sides in a timely and filtered way. In the third stage, connection, various partnerships are attempted as part of open platform strategies. For instance, by combining 'VLive' and Weverse, Naver's fan platform business is transferred to Weverse. For core transactions through direct and indirect monetization, several co-branding activities are tried. Lastly, regarding optimization, the newly born Weverse to be launched in the first half of 2022 is expected to create further synergies with Naver's R&D capabilities in data, AI, and other technologies such as the metaverse platform 'ZEPETO', which already sells clothing items of Weverse artists.

WAVE MODEL DEVELOPMENT IN MULTI-ION PLASMAS

  • 송성희;이동훈;표유선
    • Journal of Astronomy and Space Sciences
    • /
    • Vol. 16, No. 1
    • /
    • pp.41-52
    • /
    • 1999
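  • The space around the Earth is filled with plasma, and a variety of plasma waves exist that use it as their medium. Space plasma consists of several species of ions and electrons, and the ions in particular strongly affect wave propagation. Solving the wave dispersion equation in a multi-ion plasma is quite complicated. We therefore developed a computational model that makes it easy to obtain the dispersion relation of waves in a multi-ion plasma in space for an arbitrary magnetic field and density. With this model, we examined the properties of the waves possible at each latitude, using a magnetic field line passing through an arbitrary point in the IGRF (International Geomagnetic Reference Field) and an observed density function, and applied the results to the analysis of the satellite's initial observation data. For example, for a specific frequency range among the POLAR satellite observations of waves generated near the magnetic equator and propagating along field lines, the model could predict where the polarization state of the wave changed, the propagation path, and so on.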

The Expert Search System using keyword association based on Multi-Ontology

  • 정계동;황치곤;최영근
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • Vol. 16, No. 1
    • /
    • pp.183-190
    • /
    • 2012
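  • This study builds an expert search system that supports mutual collaboration, based on research papers and author profiles. The proposed methodology is as follows. First, we propose a weighting scheme for retrieving the keywords most closely associated with an input keyword; second, we propose a way to search for experts efficiently using this scheme. Keywords and author profiles are first extracted from papers so that experts can be searched through them. This can be used in many areas of social networks. Such information is distributed across multiple systems. As a technique for integrating the distributed data, we propose using a multi-ontology. The multi-ontology consists of a meta ontology, an instance ontology, a location ontology, and an association ontology, and the association ontology is built dynamically through keyword association analysis. Using this multi-ontology, we provide an expert network, which enables expert search by tracing keyword associations. By making the experts' research output available for review, the system makes it possible to identify their specific areas of expertise.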

Privacy Disclosure and Preservation in Learning with Multi-Relational Databases

  • Guo, Hongyu;Viktor, Herna L.;Paquet, Eric
    • Journal of Computing Science and Engineering
    • /
    • Vol. 5, No. 3
    • /
    • pp.183-196
    • /
    • 2011
  • There has recently been a surge of interest in relational database mining that aims to discover useful patterns across multiple interlinked database relations. It is crucial for a learning algorithm to explore the multiple inter-connected relations so that important attributes are not excluded when mining such relational repositories. However, from a data privacy perspective, it becomes difficult to identify all possible relationships between attributes from the different relations, considering a complex database schema. That is, seemingly harmless attributes may be linked to confidential information, leading to data leaks when building a model. Thus, we are at risk of disclosing unwanted knowledge when publishing the results of a data mining exercise. For instance, consider a financial database classification task to determine whether a loan is considered high risk. Suppose that we are aware that the database contains another confidential attribute, such as income level, that should not be divulged. One may thus choose to eliminate, or distort, the income level from the database to prevent potential privacy leakage. However, even after distortion, a learning model against the modified database may accurately determine the income level values. It follows that the database is still unsafe and may be compromised. This paper demonstrates this potential for privacy leakage in multi-relational classification and illustrates how such potential leaks may be detected. We propose a method to generate a ranked list of subschemas that maintains the predictive performance on the class attribute, while limiting the disclosure risk, and predictive accuracy, of confidential attributes. We illustrate and demonstrate the effectiveness of our method against a financial database and an insurance database.

Variational Auto-Encoder Based Semi-supervised Learning Scheme for Learner Classification in Intelligent Tutoring System

  • 정승원;손민재;황인준
    • Journal of Korea Multimedia Society
    • /
    • Vol. 22, No. 11
    • /
    • pp.1251-1258
    • /
    • 2019
  • An intelligent tutoring system enables users to learn effectively by utilizing various artificial intelligence techniques. For instance, it can recommend a proper curriculum or learning method to individual users based on their learning history. To do this effectively, users' characteristics need to be analyzed and classified on various aspects such as interest, learning ability, and personality. Although data labeled with these characteristics are required for more accurate classification, it is not easy to acquire enough labeled data because of the labeling cost. Unlabeled data, on the other hand, need no labeling process, so a large amount can be collected and utilized. In this paper, we propose a semi-supervised learning method based on a feedback variational auto-encoder (FVAE), which uses both labeled and unlabeled data. FVAE is a variation of the variational auto-encoder (VAE) in which a multi-layer perceptron is added to give feedback. Using unlabeled data, we train the FVAE and take its encoder. We then extract features from the labeled data with the encoder and train classifiers on the extracted features. In the experiments, FVAE-based semi-supervised learning was superior to the VAE-based method in terms of accuracy and F1 score.