• Title/Summary/Keyword: data similarity

A weighted similarity coefficient method for manufacturing cell formation (제조셀 형성을 위한 가중치 유사성계수 방법)

  • Oh, Soo-Cheol;Cho, Kyu-Kab
    • Journal of Korean Institute of Industrial Engineers / v.22 no.1 / pp.141-154 / 1996
  • This paper presents a similarity-coefficient-based approach to the machine-part grouping problem in cellular manufacturing. The method uses relevant production data, such as part type, production volume, and routing sequence, to form machine cells and part families. A new similarity coefficient using weighted factors is introduced, and an algorithm for forming machine cells and part families is developed. A comparative study of two similarity coefficient methods, Gupta and Seifoddini's method and the proposed method, is conducted.

Collective Prediction exploiting Spatio Temporal correlation (CoPeST) for energy efficient wireless sensor networks

  • ARUNRAJA, Muruganantham;MALATHI, Veluchamy
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.7 / pp.2488-2511 / 2015
  • Data redundancy has a high impact on the performance and reliability of Wireless Sensor Networks (WSNs). Spatial and temporal similarity is an inherent property of sensory data. By reducing this spatio-temporal redundancy, a substantial amount of nodal energy and bandwidth can be conserved. Most data gathering approaches use either temporal or spatial correlation to minimize data redundancy. In Collective Prediction exploiting Spatio Temporal correlation (CoPeST), we exploit both the spatial and the temporal correlation between sensory data. Spatial redundancy is reduced by similarity-based sub-clustering, in which closely correlated sensor nodes are represented by a single representative node. Temporal redundancy is reduced by a model-based prediction approach, in which only a subset of the sensor data is transmitted and the rest is predicted. The proposed work substantially reduces energy-expensive communication while keeping the data within a user-defined error threshold. Being a distributed approach, it is highly scalable. The work achieves up to 65% data reduction in a periodic data gathering system with an error tolerance of 0.6℃ on the collected data.
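
The idea of transmitting only a subset of readings and predicting the rest can be illustrated with a minimal sketch. It assumes a simple last-value predictor shared by the node and the sink, which is not the prediction model used in CoPeST; the function name `transmit_with_prediction` and the sample readings are hypothetical.

```python
def transmit_with_prediction(readings, threshold):
    """Return (sent, reconstructed).

    sent: indices of readings the node actually transmits.
    reconstructed: values the sink ends up with for every time step.
    Assumption: both sides predict "same as the last transmitted value".
    """
    sent = []
    reconstructed = []
    last = None  # last value known to both node and sink
    for i, value in enumerate(readings):
        # Transmit only when the shared prediction would miss by more
        # than the user-defined error threshold.
        if last is None or abs(value - last) > threshold:
            sent.append(i)
            last = value
        reconstructed.append(last)
    return sent, reconstructed


readings = [20.0, 20.2, 20.5, 21.0, 21.1, 21.3, 22.0]  # illustrative temperatures
sent, reconstructed = transmit_with_prediction(readings, threshold=0.6)
print(sent)
```

With this predictor every reconstructed value stays within the threshold of the true reading by construction, at the cost of transmitting whenever the signal drifts.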

Similarity Measurement Method of Trajectory using Indexing Information of Moving Object in Video (비디오 내 이동 객체의 색인 정보를 이용한 궤적 유사도 측정 기법)

  • Kim, Jeong In;Choi, Chang;Kim, Pan Koo
    • Smart Media Journal / v.1 no.3 / pp.43-47 / 2012
  • The recent proliferation of multimedia data calls for effective and efficient retrieval of such data. Research in this area focuses not only on text-matching retrieval methods but also on methods that use features of the multimedia data itself. This paper therefore proposes a method for measuring trajectory similarity using indexing information of moving objects in video. The method consists of two steps. First, CCTV video data is indexed to extract the trajectories of moving objects. Second, we compare the DTW (Dynamic Time Warping) and TSR (Tangent Space Representation) algorithms.

Purchase Transaction Similarity Measure Considering Product Taxonomy (상품 분류 체계를 고려한 구매이력 유사도 측정 기법)

  • Yang, Yu-Jeong;Lee, Ki Yong
    • KIPS Transactions on Software and Data Engineering / v.8 no.9 / pp.363-372 / 2019
  • A sequence is data in which an order exists among its items, and purchase transaction data, which lists the products purchased by one customer, is a representative kind of sequence data. In general, all goods belong to a product taxonomy, such as category / sub-category / sub-sub-category, and goods with similar characteristics are classified into the same category. In this paper, therefore, we compare two purchase transaction sequences by considering not only the purchase order of products but also their categories, giving a higher similarity score to products that differ but belong to the same category. In particular, because the choice of similarity measure directly affects how well purchase transaction sequences are compared, we compare the performance of three representative measures: the Levenshtein distance, the dynamic time warping distance, and the Needleman-Wunsch similarity. We extend these existing methods to take the product taxonomy into account. Conventional similarity measures compare the goods in two sequences by simply assigning a value of 0 or 1 according to whether the products match. The proposed method instead assigns a value between 0 and 1 using the product taxonomy tree, so that two different products can still have a graded degree of relevance. Experiments confirm that the proposed method measures similarity more accurately than the previous methods. Furthermore, the dynamic time warping distance was the most suitable measure because it considers the degree of association between products in the sequence and performs well for two sequences of different lengths.
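
A taxonomy-aware comparison of this kind can be sketched as follows. This is a hedged illustration rather than the authors' formulation: products are assumed to be non-empty category paths (tuples), and the cost function `taxonomy_cost`, based on the length of the shared path prefix, is a hypothetical choice for the graded value between 0 and 1.

```python
def taxonomy_cost(a, b):
    """Graded mismatch cost between two products given as category paths.

    0.0 for identical paths, 1.0 when not even the top category is shared.
    Assumption: paths are non-empty tuples such as ("food", "dairy", "milk").
    """
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return 1.0 - shared / max(len(a), len(b))


def binary_cost(a, b):
    """Conventional cost: 0 if the products match, 1 otherwise."""
    return 0.0 if a == b else 1.0


def dtw_distance(s, t, cost):
    """Classic dynamic time warping distance with a pluggable element cost."""
    inf = float("inf")
    n, m = len(s), len(t)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(s[i - 1], t[j - 1])
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


milk = ("food", "dairy", "milk")
cheese = ("food", "dairy", "cheese")
headphones = ("electronics", "audio", "headphones")

seq_a = [milk, headphones]
seq_b = [cheese, headphones]

d_binary = dtw_distance(seq_a, seq_b, binary_cost)
d_taxonomy = dtw_distance(seq_a, seq_b, taxonomy_cost)
print(d_binary, d_taxonomy)
```

Under the binary cost the two sequences differ by a full unit, whereas the taxonomy cost treats milk and cheese as partially matching because they share two of three category levels.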

A Study on the Maximizing Coverage for Recommender System

  • Lee, Hee-Choon;Lee, Seok-Jun;Park, Ji-Won;Kim, Chul-Seoung
    • Korean Data and Information Science Society: Conference Proceedings (한국데이터정보과학회:학술대회논문집) / 2006.11a / pp.119-128 / 2006
  • Pearson's correlation coefficient, a similarity weight used in recommender systems, has the weakness that it cannot produce a prediction for every target value. Vector similarity, another similarity weight, yields higher prediction coverage than Pearson's correlation coefficient but has the weakness of a high MAE. The purpose of this study is to suggest a way to raise prediction coverage. The MAE and the prediction coverage of the suggested method were compared with those obtained using Pearson's correlation coefficient and using vector similarity. The MAE of the suggested method was higher than that obtained with Pearson's correlation coefficient but lower than that obtained with vector similarity. In terms of prediction coverage, the suggested method achieved higher coverage than both Pearson's correlation coefficient and vector similarity.

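The coverage gap between the two similarity weights can be illustrated with a small sketch. The abstract does not say why Pearson's coefficient fails to cover some predictions; one well-known cause, assumed here, is that the coefficient is undefined when two users share fewer than two rated items or when one user's co-rated ratings are constant. The user dictionaries and function names are hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation over co-rated items; None when it is undefined."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return None
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    var_u = sum((u[i] - mean_u) ** 2 for i in common)
    var_v = sum((v[i] - mean_v) ** 2 for i in common)
    den = math.sqrt(var_u * var_v)
    return num / den if den > 0 else None  # zero variance: undefined


def vector_similarity(u, v):
    """Cosine of the two rating vectors, treating unrated items as zero."""
    common = set(u) & set(v)
    num = sum(u[i] * v[i] for i in common)
    den = math.sqrt(sum(r * r for r in u.values())) * math.sqrt(
        sum(r * r for r in v.values())
    )
    return num / den if den > 0 else None


alice = {"a": 4, "b": 4, "c": 4}  # constant ratings
bob = {"a": 5, "b": 3, "c": 1}
carol = {"a": 1, "b": 2, "c": 3}
dave = {"a": 2, "b": 4, "c": 6}

p_undefined = pearson(alice, bob)          # no weight, so no prediction
v_defined = vector_similarity(alice, bob)  # still usable
p_defined = pearson(carol, dave)
print(p_undefined, v_defined, p_defined)
```

A neighbor pair with no usable weight cannot contribute to a prediction, which is one way coverage drops for a correlation-based weight while a cosine-based weight remains defined.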

Vegetation Classification from Time Series NOAA/AVHRR Data

  • Yasuoka, Yoshifumi;Nakagawa, Ai;Kokubu, Keiko;Pahari, Krishna;Sugita, Mikio;Tamura, Masayuki
    • Proceedings of the KSRS Conference / 1999.11a / pp.429-432 / 1999
  • Vegetation cover classification is examined based on time series NOAA/AVHRR data. Time series analysis methods, including the Fourier transform, the Auto-Regressive (AR) model, and temporal signature similarity matching, are developed to extract phenological features of vegetation from time series NDVI data from NOAA/AVHRR and to classify vegetation types. In the Fourier transform method, three typical spectral components expressing the phenological features of vegetation are selected for classification; in the AR model method, the AR coefficients are selected. In the temporal signature similarity matching method, a new index evaluating the similarity of the temporal pattern of the NDVI is introduced for classification.

Establishing Method of RAM Objective Considering Combat Readiness and Field Data of Similarity Equipment (전투준비태세 및 유사장비 운용자료를 활용한 RAM 목표 값 설정방법에 관한 연구)

  • Kim, Kyung-Yong;Bae, Suk-Joo
    • Journal of Korean Society of Industrial and Systems Engineering / v.32 no.3 / pp.127-134 / 2009
  • RAM (Reliability, Availability, Maintainability) is an important performance factor for maintaining combat readiness and optimizing the operation and maintenance costs of weapon systems. This paper discusses a method for establishing RAM objectives for combat readiness using field failure data from similar equipment. Operational availability is estimated from a binomial distribution function of the user's operational conditions, such as combat readiness preservation probability, operational rate, operational availability, and total number of equipment units. Reliability and maintainability are then estimated from the field failure data of similar equipment so as to achieve that operational availability. The effectiveness of the established RAM objectives is verified through analysis of combat readiness preservation probability and mission reliability. A case study of a weapon system illustrates the process of the proposed method.

AI Performance Based On Learning-Data Labeling Accuracy (인공지능 학습데이터 라벨링 정확도에 따른 인공지능 성능)

  • Ji-Hoon Lee;Jieun Shin
    • Journal of Industrial Convergence / v.22 no.1 / pp.177-183 / 2024
  • The study investigates the impact of data quality on the performance of artificial intelligence (AI). To this end, the impact of labeling error levels on the performance of artificial intelligence was compared and analyzed through simulation, taking into account the similarity of data features and the imbalance of class composition. As a result, data with high similarity between characteristic variables were found to be more sensitive to labeling accuracy than data with low similarity between characteristic variables. It was observed that artificial intelligence accuracy tended to decrease rapidly as class imbalance increased. This will serve as the fundamental data for evaluating the quality criteria and conducting related research on artificial intelligence learning data.

Style-Specific Language Model Adaptation using TF*IDF Similarity for Korean Conversational Speech Recognition

  • Park, Young-Hee;Chung, Min-Hwa
    • The Journal of the Acoustical Society of Korea / v.23 no.2E / pp.51-55 / 2004
  • In this paper, we propose a style-specific language model adaptation scheme using n-gram based tf*idf similarity for Korean spontaneous speech recognition. Korean spontaneous speech shows distinctive style-specific characteristics such as filled pauses, word omission, and contraction, which are related to function words and depend on the preceding or following words. To reflect these style-specific characteristics and to overcome insufficient data for training the language model, we estimate an in-domain dependent n-gram model by relevance weighting of out-of-domain text data according to their n-gram based tf*idf similarity, where the in-domain language model includes a disfluency model. Recognition results show that n-gram based tf*idf similarity weighting effectively reflects style differences.
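
Relevance weighting by n-gram based tf*idf similarity can be sketched roughly as below. The abstract does not give the exact weighting formula, so the smoothed idf, the cosine comparison, whitespace tokenization, and the toy English sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n=1):
    """One sparse tf*idf vector (dict) per document over its n-grams."""
    counts = [Counter(ngrams(d.split(), n)) for d in docs]
    df = Counter(g for c in counts for g in c)  # document frequency
    total = len(docs)
    # Assumed smoothed idf: log((1 + N) / (1 + df)) + 1
    return [
        {g: tf * (math.log((1 + total) / (1 + df[g])) + 1.0) for g, tf in c.items()}
        for c in counts
    ]


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    num = sum(w * b[g] for g, w in a.items() if g in b)
    den = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return num / den if den else 0.0


in_domain = "uh i mean you know it was fine"  # conversational style
out_of_domain = [
    "you know i mean it was okay",
    "the committee approved the annual budget",
]

vectors = tfidf_vectors([in_domain] + out_of_domain, n=1)
weights = [cosine(vectors[0], v) for v in vectors[1:]]
print(weights)
```

Out-of-domain texts with higher weights would contribute more to the adapted n-gram counts; setting `n=2` compares bigrams instead of single words.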

Exploratory Methodology for Acquiring Architectural Plans Based on Spatial Graph Similarity

  • Ham, Sungil;Chang, Seongju;Suh, Dongjun;Narangerel, Amartuvshin
    • Architectural research / v.17 no.2 / pp.57-64 / 2015
  • In architectural planning, previous cases with similar spatial programs provide important data for architectural design. The case-based reasoning (CBR) paradigm in architectural design is closely related to the behavior of a planner who draws on similar past architectural designs and spatial programs. In CBR, a spatial graph can be constructed from the most fundamental data, which provides a way to search spatial programs using visual graphs. This study developed a CBR system that analyzes similarity through graph comparison and searches for buildings. The integrated system can compare the spatial similarity of different buildings and analyze their types, in addition to analyzing spaces within a single structure.