• Title/Summary/Keyword: User Clustering

A Clustering Algorithm for Sequence Data Using Rough Set Theory (러프 셋 이론을 이용한 시퀀스 데이터의 클러스터링 알고리즘)

  • Oh, Seung-Joon;Park, Chan-Woong
    • Journal of the Korea Society of Computer and Information / v.13 no.2 / pp.113-119 / 2008
  • The World Wide Web is a dynamic collection of pages that includes a huge number of hyperlinks and huge volumes of usage information. The resulting growth in online information, combined with the largely unstructured nature of web data, necessitates the development of powerful web data mining tools. Recently, a number of approaches have been developed for dealing with specific aspects of web usage mining for the purpose of automatically discovering user profiles. We analyze sequence data, such as web logs, protein sequences, and retail transactions. We propose a clustering algorithm for sequence data using rough set theory. We present a simple example and experimental results using a splice dataset and synthetic datasets.

Clustering Representative Annotations for Image Browsing (이미지 브라우징 처리를 위한 전형적인 의미 주석 결합 방법)

  • Zhou, Tie-Hua;Wang, Ling;Lee, Yang-Koo;Ryu, Keun-Ho
    • Proceedings of the Korean Information Science Society Conference / 2010.06c / pp.62-65 / 2010
  • Image annotations allow users to access a large image database with textual queries. However, because the text surrounding Web images is generally noisy, an efficient image annotation and retrieval system is highly desirable, and it requires effective image search techniques. Data mining techniques can be adopted to de-noise the search results and identify salient terms or phrases in them. Clustering algorithms make it possible to represent the visual features of images with a finite set of symbols. Annotation-based image search engines can obtain thousands of images for a given query, but their results also contain visual noise. In this paper, we present a new algorithm, Double-Circles, that allows a user to remove noisy results and characterize more precise representative annotations. We demonstrate our approach on images collected from Flickr image search. Experiments conducted on real Web images show the effectiveness and efficiency of the proposed model.

Cultural Region-based Clustering of SNS Big Data and Users Preferences Analysis (문화권 클러스터링 기반 SNS 빅데이터 및 사용자 선호도 분석)

  • Rho, Seungmin
    • Journal of Advanced Navigation Technology / v.22 no.6 / pp.670-674 / 2018
  • Social network service (SNS) data, including comments and text, images, videos, blogs, and user experiences, contain a wealth of information that can be used to build recommendation systems for various clients and to provide insightful data and results to business analysts. Multimedia data, especially visual data such as images and videos, are the richest source of SNS data; they can reflect the values and interests of particular regions and cultures, and they form a gigantic portion of the overall data. Mining such huge amounts of data to extract actionable intelligence requires efficient and smart data analysis methods. The purpose of this paper is to focus on this particular modality and devise ways to model, index, and retrieve data as and when desired.

Hardware Accelerated Design on Bag of Words Classification Algorithm

  • Lee, Chang-yong;Lee, Ji-yong;Lee, Yong-hwan
    • Journal of Platform Technology / v.6 no.4 / pp.26-33 / 2018
  • In this paper, we propose an image retrieval algorithm for real-time processing and design it as hardware. The proposed method is based on classification with the Bag of Words (BoWs) algorithm and performs image search using bit streams. K-fold cross validation is used to verify the algorithm. The data is divided into seven classes of seven images each, so 49 images are tested in total. The test consists of two kinds of measurement: accuracy and speed. The accuracy of the image classification was 86.2% for the BoWs algorithm and 83.7% for the proposed hardware-accelerated software implementation algorithm, so the BoWs algorithm was 2.5 percentage points higher. The image retrieval processing time is 7.89 s for BoWs and 1.55 s for our algorithm, which makes our algorithm 5.09 times faster than the BoWs algorithm. The algorithm is largely divided into software and hardware parts. The software part is written in C. The Scale Invariant Feature Transform algorithm is used to extract feature points that are invariant to size and rotation from the image. Bit streams are generated from the extracted feature points. In the hardware architecture, the proposed image retrieval algorithm is written in Verilog HDL and designed and verified with an FPGA and Design Compiler. The generated bit streams are stored, the clustering step is performed, and a search image database or an input image database is generated and matched. Using the proposed algorithm, we can improve user convenience and satisfaction in terms of speed when searching with a database matching method that represents each object.

Anomaly Detection Method Based on The False-Positive Control (과탐지를 제어하는 이상행위 탐지 방법)

  • 조혁현;정희택;김민수;노봉남
    • Journal of the Korea Institute of Information Security & Cryptology / v.13 no.4 / pp.151-159 / 2003
  • As the Internet becomes widespread, intrusion detection systems are needed to protect computer systems comprehensively from intrusions. We propose an intrusion detection method that identifies and controls the contradictions in self-explanation that arise during the profiling process of the anomaly detection methodology. Because the association method can create many patterns during profiling, we present an effective way to apply them by clustering the rules. Finally, we propose a similarity function that uses the clustered pattern database to decide whether a user pattern is an anomalous action.

Implementation and Performance Evaluation of Reporting Interval-adaptive Sensor Control Scheme for Energy Efficient Data Gathering (에너지 효율적 센서 데이터 수집을 위한 리포팅 허용 지연시간 적응형 센서 제어 기법 구현 및 성능평가)

  • Shon, Tae-Shik;Choi, Hyo-Hyun
    • The KIPS Transactions:PartC / v.17C no.6 / pp.459-464 / 2010
  • Due to the application-specific nature of wireless sensor networks, the sensitivity to a requirement such as data reporting latency varies with the type of application. This calls for application-specific algorithm and protocol design paradigms that help maximize energy conservation and thus the network lifetime. In this paper, we implement and evaluate a novel delay-adaptive sensor scheduling scheme for energy-saving data gathering in wireless sensor networks, based on two-phase clustering (TPC). The TPC is implemented on sensor mote hardware. With TPC, sensors selectively use direct links for control and for forwarding time-critical sensed data, and relay links for other data forwarding, based on the given user delay constraints. The implementation study shows that TPC helps the sensors save a significant amount of energy while collecting sensed data in a real environment.

Relevance Feedback Method of an Extended Boolean Model using Hierarchical Clustering Techniques (계층적 클러스터링 기법을 이용한 확장 불리언 모델의 적합성 피드백 방법)

  • 최종필;김민구
    • Journal of KIISE:Software and Applications / v.31 no.10 / pp.1374-1385 / 2004
  • The relevance feedback process uses information obtained from a user about an initially retrieved set of documents to improve subsequent search formulations and retrieval performance. In the extended Boolean model, relevance feedback implies not only that new query terms must be identified, but also that the terms must be connected properly with the Boolean AND/OR operators. Salton et al. proposed a relevance feedback method for the extended Boolean model, called the DNF (disjunctive normal form) method. However, this method has a critical problem in generating reformulated queries. In this study, we investigate the problem of the DNF method and propose a relevance feedback method that uses hierarchical clustering techniques to solve it. We show the results of experiments performed on two data sets: the DOE collection in TREC 1 and the Web TREC 10 collection.

Item Filtering System Using Associative Relation Clustering Split Method (연관관계 군집 분할 방법을 이용한 아이템 필터링 시스템)

  • Cho, Dong-Ju;Park, Yang-Jae;Jung, Kyung-Yong
    • The Journal of the Korea Contents Association / v.7 no.6 / pp.1-8 / 2007
  • In electronic commerce, it is important to recommend suitable items to users from large item sets while saving them time and effort. If the recommendation system can recommend suitable items, user satisfaction improves. In this paper, we propose an associative relation clustering split method for collaborative filtering in order to improve accuracy and scalability. We compute the lift between associated items from the ratings data and then split the node group that consists of those items to improve the efficiency of the associative relation cluster. This method differentiates the association among the items of the groups. Once the association of the groups is satisfied, the remaining items are combined. To estimate the performance, the suggested method is compared with K-means and EM on the MovieLens data set.
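
The lift measure mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition lift(a, b) = P(a and b) / (P(a) · P(b)) and assumes, purely for illustration, that an item "occurs" for a user when that user has rated it. The function name `lift_table` and the input format are hypothetical and are not taken from the paper, which may derive lift from the ratings differently.

```python
from itertools import combinations


def lift_table(ratings):
    """Compute pairwise lift from a dict mapping user -> set of rated items.

    Hypothetical input format; the paper's own preprocessing is not specified here.
    """
    n_users = len(ratings)
    item_count = {}   # number of users who rated each item
    pair_count = {}   # number of users who rated both items of a pair
    for items in ratings.values():
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    lifts = {}
    for (a, b), n_ab in pair_count.items():
        p_ab = n_ab / n_users
        p_a = item_count[a] / n_users
        p_b = item_count[b] / n_users
        # lift > 1: the items co-occur more often than independence would predict
        lifts[(a, b)] = p_ab / (p_a * p_b)
    return lifts


# Toy data (invented for illustration only)
ratings = {
    "u1": {"A", "B"},
    "u2": {"A", "B"},
    "u3": {"A", "C"},
    "u4": {"C"},
}
lifts = lift_table(ratings)
```

Pairs that never co-occur are simply absent from the result. A lift above 1 suggests a positive association that a splitting step could use to keep items in the same group.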

Phased Visualization of Facial Expressions Space using FCM Clustering (FCM 클러스터링을 이용한 표정공간의 단계적 가시화)

  • Kim, Sung-Ho
    • The Journal of the Korea Contents Association / v.8 no.2 / pp.18-26 / 2008
  • This paper presents a phased visualization method for a facial expression space that enables the user to control the facial expression of 3D avatars by selecting a sequence of facial frames from that space. Our system, based on this method, creates a 2D facial expression space from approximately 2400 facial expression frames, which comprise a neutral expression and 11 motions. The facial expression control of 3D avatars is carried out in real time as users navigate through the facial expression space. Because the facial expression space supports phased expression control, from radical expressions to detailed expressions, the system needs a phased visualization method. To visualize the facial expression space in phases, this paper uses fuzzy clustering. In the beginning, the system creates 11 clusters from the space of 2400 facial expressions. Every time the phase level increases, the system doubles the number of clusters. Because the position of a cluster center generally does not coincide with any expression in the expression space, we take the expression nearest to each cluster center as that cluster's center. We let users use the system to control the phased facial expression of a 3D avatar, and we evaluate the system based on the results.
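
Two steps described in this abstract lend themselves to a small sketch: the cluster count doubling with each phase level (starting from 11), and replacing a computed cluster center with the nearest actual expression. The sketch below does not implement fuzzy c-means itself. The function names, the 2D tuple representation, and the use of Euclidean distance are assumptions made for illustration, not details confirmed by the paper.

```python
import math


def clusters_at_phase(level, initial=11):
    """Number of clusters at a given phase level: 11, 22, 44, ..."""
    return initial * 2 ** level


def snap_to_nearest(center, points):
    """Index of the data point closest to a cluster center.

    Euclidean distance is an assumption; the paper may use another metric.
    """
    return min(range(len(points)), key=lambda i: math.dist(center, points[i]))


# Toy 2D expression space (invented coordinates)
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 3.0)]
center = (0.8, 0.3)   # a center as a clustering step might produce it
idx = snap_to_nearest(center, points)
representative = points[idx]
```

Snapping each center to a real frame means every cluster shown to the user corresponds to an expression that actually exists in the captured data.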

Automated Method of Landmark Extraction for Protein 2DE Images based on Multi-dimensional Clustering (다차원 클러스터링 기반의 단백질 2DE 이미지에서의 자동화된 기준점 추출 방법)

  • Shim, Jung-Eun;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.12D no.5 s.101 / pp.719-728 / 2005
  • Two-dimensional electrophoresis (2DE) is a separation technique used to identify the proteins contained in a sample. However, the resulting image is very sensitive to the experimental conditions as well as to the quality of scanning. To adjust for the possible variation of spots in a particular image, a user must manually annotate landmark spots on each gel image so that the spots of different images can be analyzed together. This operation is error-prone and tedious. This thesis develops an automated method of extracting the landmark spots of an image based on a landmark profile. The landmark profile is created by clustering the previously identified landmarks of sample images of the same type. The profile contains the various properties of the clusters identified for each landmark. When the landmarks of a new image need to be found, all the candidate spots of each landmark are first identified by examining the properties of its clusters. Subsequently, all the landmark spots of the new image are collectively found by the well-known optimization algorithm $A^*$. The performance of this method is illustrated by various experiments on real 2DE images of mouse brain tissue.