• Title/Summary/Keyword: data matching

Semantic-based Mashup Platform for Contents Convergence

  • Yongju Lee;Hongzhou Duan;Yuxiang Sun
    • International journal of advanced smart convergence
    • /
    • v.12 no.2
    • /
    • pp.34-46
    • /
    • 2023
  • The growing number of large-scale knowledge graphs raises the question of how knowledge graph data can be organized, discovered, and integrated efficiently. We present a novel semantic-based mashup platform for contents convergence, which consists of acquisition, RDF storage, ontology learning, and mashup subsystems. This platform serves as a basis for developing other, more sophisticated applications required in the area of knowledge big data. Moreover, this paper proposes an entity matching method using graph convolutional network techniques as preliminary work toward automatic classification and discovery on knowledge big data. Using the real-world DBP15K and SRPRS datasets, the performance of our method is compared with that of several existing entity matching methods. The experimental results show that the proposed method outperforms existing methods, increasing accuracy while reducing training time.

Automatic Detection of the Updating Object by Areal Feature Matching Based on Shape Similarity (형상유사도 기반의 면 객체 매칭을 통한 갱신 객체 탐지)

  • Kim, Ji-Young;Yu, Ki-Yun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.30 no.1
    • /
    • pp.59-65
    • /
    • 2012
  • In this paper, we propose a method for automatically detecting updating objects from spatial data sets of different scales and update cycles by using areal feature matching based on shape similarity. To this end, we defined an updating object by analysing the matching relationships between two different spatial data sets. We first eliminated the systematic errors caused by the difference in scale by using an affine transformation. Second, if an object overlapped several areal features of the other data set, we merged those areal features into a single areal feature. Finally, we detected the updating objects by applying areal feature matching based on shape similarity to the modified spatial data sets. After applying the proposed method to a digital topographic map and a base map of the Korean Address Information System in South Korea, we confirmed that the F-measure was as high as 0.958 in a statistical evaluation and that significant updating objects were detected in a visual evaluation.

An Index-Based Search Method for Performance Improvement of Set-Based Similar Sequence Matching (집합 유사 시퀀스 매칭의 성능 향상을 위한 인덱스 기반 검색 방법)

  • Lee, Juwon;Lim, Hyo-Sang
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.11
    • /
    • pp.507-520
    • /
    • 2017
  • The set-based similar sequence matching method measures similarity not for an individual data item but for a set that groups multiple data items. In this method, the similarity of two sets is represented by the size of their intersection. However, the method has two critical performance issues: 1) calculating the intersection size is a time-consuming process, and 2) the number of set pairs whose intersection size must be calculated is quite large. In this paper, we propose an index-based search method that improves the performance of set-based similar sequence matching by addressing these issues. Our method consists of two parts. First, we convert the set similarity problem into an intersection size comparison problem and provide an index structure that accelerates the intersection size calculation. Second, we propose an efficient set-based similar sequence matching method that exploits the proposed index structure. Through experiments, we show that the proposed method reduces the execution time by a factor of 30 to 50 compared with existing methods. We also show that the proposed method is scalable, since the performance gap widens as the number of data sequences increases.
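
The idea of speeding up intersection size calculation with an index can be illustrated with a generic inverted index. This is only a minimal sketch under stated assumptions: the abstract does not describe the paper's actual index structure, and the names `build_inverted_index`, `intersection_sizes`, and `similar_sets` are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(sets):
    """Map each element to the ids of the sets that contain it."""
    index = defaultdict(list)
    for set_id, items in enumerate(sets):
        for item in set(items):
            index[item].append(set_id)
    return index


def intersection_sizes(query, index):
    """Count the size of (query AND S) for every indexed set S sharing an element."""
    counts = defaultdict(int)
    for item in set(query):
        # Only sets that actually contain `item` are touched, so disjoint
        # sets are never compared against the query at all.
        for set_id in index.get(item, ()):
            counts[set_id] += 1
    return dict(counts)


def similar_sets(query, index, min_overlap):
    """Return ids of sets whose intersection with the query has at least min_overlap items."""
    sizes = intersection_sizes(query, index)
    return sorted(set_id for set_id, n in sizes.items() if n >= min_overlap)


# Illustrative data only (not taken from the paper).
data_sets = [{1, 2, 3, 4}, {3, 4, 5}, {7, 8}, {2, 3, 9}]
index = build_inverted_index(data_sets)
sizes = intersection_sizes({2, 3, 4}, index)
matches = similar_sets({2, 3, 4}, index, min_overlap=2)
```

With an index of this kind, the cost of a query depends on how many sets share elements with it rather than on the total number of set pairs, which is the general effect the abstract attributes to its index-based approach.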

FUZZY matching using propensity score: IBM SPSS 22 Ver. (성향 점수를 이용한 퍼지 매칭 방법: IBM SPSS 22 Ver.)

  • Kim, So Youn;Baek, Jong Il
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.1
    • /
    • pp.91-100
    • /
    • 2016
  • Fuzzy matching is proposed as a way to make the propensities of two groups similar by using their propensity scores. We present a way to select the control variables used to build the propensity scores, together with a procedure for obtaining the scores through logistic regression analysis. With these scores, fuzzy matching yields an experimental group and a control group with similar propensities. The study showed that the two groups were the same but had different distribution charts and standardization, which led to different edge tolerances, and that the number of selected cases decreased as the edge tolerance became smaller. We therefore conclude that groups can be merged using fuzzy matching without prior control and used when large-scale data (big data) are analyzed, and we were also able to examine the pros and cons of fuzzy matching.
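
The effect of the tolerance on the number of matched cases can be sketched with a simple greedy matcher. This is an illustration under stated assumptions, not the SPSS FUZZY procedure itself: the propensity scores are made-up values assumed to come from an earlier regression step, and `fuzzy_match` is a hypothetical helper.

```python
def fuzzy_match(treated, controls, tolerance):
    """Greedy 1:1 matching on propensity scores.

    Each treated case is paired with the closest unused control whose
    score differs by at most `tolerance`; cases without such a control
    stay unmatched.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items()):
        best = None
        for c_id, c_score in available.items():
            diff = abs(t_score - c_score)
            if diff <= tolerance and (best is None or diff < best[1]):
                best = (c_id, diff)
        if best is not None:
            pairs.append((t_id, best[0]))
            del available[best[0]]  # a control is used at most once
    return pairs


# Hypothetical propensity scores, for illustration only.
treated = {"t1": 0.30, "t2": 0.52, "t3": 0.81}
controls = {"c1": 0.28, "c2": 0.55, "c3": 0.60, "c4": 0.95}

wide = fuzzy_match(treated, controls, tolerance=0.15)
narrow = fuzzy_match(treated, controls, tolerance=0.025)
```

In this toy data the wider tolerance pairs all three treated cases, while the narrower one pairs only a single case, which mirrors the abstract's observation that fewer cases are selected as the tolerance shrinks.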

Development and Performance Analysis of a New Navigation Algorithm by Combining Gravity Gradient and Terrain Data as well as EKF and Profile Matching

  • Lee, Jisun;Kwon, Jay Hyoun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.37 no.5
    • /
    • pp.367-377
    • /
    • 2019
  • As an alternative navigation system for non-GNSS (Global Navigation Satellite System) environments, a new type of DBRN (DataBase Referenced Navigation) is suggested that uses both gravity gradient and terrain data and combines a filter-based algorithm with profile matching. To improve the stability of the performance compared with the previous study, both centralized and decentralized EKFs (Extended Kalman Filters) were constructed based on gravity gradient and terrain data, and one of the filters was selected at each point as appropriate. The final position of a moving vehicle was then determined by combining the position from the filter with the one from profile matching. In the simulation test, the overall error improved to 19.957 m when the centralized and decentralized EKFs were combined, compared with 20.779 m for the centralized EKF alone. In particular, the divergence of the centralized EKF in two trajectories located in the plain area disappeared. In addition, the average horizontal error decreased to 16.704 m when the final position was re-determined using both the filter-based and profile matching solutions. Not all trajectories showed improved performance, but the differences in their horizontal errors were not large. Among the nine trajectories, eight showed errors smaller than 20 m and only one had an error of 21.654 m. Thus, it can be concluded that the endemic problem of performance inconsistency in DBRN based on a single geophysical DB or algorithm was resolved, because the combination of geophysical data and algorithms determined the position with a consistent level of error.

Mapping the Terms of Medicinal Material and Formula Classification to International Standard Terminology

  • Kim, Jin-Hyun;Kim, Chul;Yea, Sang-Jun;Jang, Hyun-Chul;Kim, Sang-Kyun;Kim, Young-Eun;Kim, Chang-Seok;Song, Mi-Young
    • International Journal of Contents
    • /
    • v.7 no.4
    • /
    • pp.108-115
    • /
    • 2011
  • The current study aims to analyze the acceptance of International Standard Terminology (IST) related to herbs and formulas used in Korea. It also intends to examine limitations of each term source by linking texts for herbal medicine research and formula research used in schools of oriental medicine with medicinal substance-formula classification names within the IST framework. This study examined 64 medicinal classification names of IST, including synonyms, 41 formula classification names, 65 classification names of "Herbal Medicine Study," 89 medicinal classification names of "Shin's Clinical Herbal Medicine Study," and lastly 83 formula classification names of "Formula Study." Data on their chief virtue, efficacy and characteristics as medicinal substances were extracted from their definitions, and such data were used to perform Chinese character-English mapping using the IST. The outcomes of the mapping were then analyzed in terms of both lexical matching and semantic matching. In terms of classification names for medicinal substances, "Herbal Medicine Study" had 60.0% lexical matching, whereas "Shin's Clinical Herbal Medicine Study" had 48.3% lexical matching. When semantic matching was also applied, "Herbal Medicine Study" showed a value of 87.7% and "Shin's Clinical Herbal Medicine Study" 74.2%. In terms of formula classification names, lexical matching was 28.9% of 83 subjects, and when semantic matching was also considered, the value was 30.1%. When the conceptual elements of this study were applied, some IST terms that are classified with other codes were found to be conceptually consistent, and some terms were not accepted due to different depths in the classification systems of each source.

A Fast Elimination Algorithm of Impossible Candidate Vectors Using Matching Scan Method in Motion Estimation of Full Search (전영역 탐색 방식의 움직임 예측에서 매칭 스캔 방법을 이용한 불가능한 후보 벡터의 고속 제거 알고리즘)

  • Kim Jone-Nam
    • Journal of Korea Multimedia Society
    • /
    • v.8 no.8
    • /
    • pp.1080-1087
    • /
    • 2005
  • The heavy computation required by full search (FS) motion estimation has been a major obstacle to real-time video coding, and the recent MPEG-4 AVC (advanced video coding) standard requires much more computation for motion estimation than conventional MPEG-2. To reduce the amount of computation in the FS algorithm, we propose a new, fast matching algorithm that, like the conventional FS, causes no degradation in the predicted images. The computational reduction comes from fast elimination of impossible candidate motion vectors. We obtain faster elimination of inappropriate motion vectors by using efficient matching units derived from localization of complex areas in the image data and a matching scan based on dithering order. Our algorithm reduces the computation of the block matching error by about 30% compared with the conventional partial distortion elimination (PDE) algorithm, and it will be useful in real-time video coding applications using MPEG-4 AVC or MPEG-2.
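
The conventional PDE baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration that assumes a row-by-row scan and the sum of absolute differences (SAD) as the matching error; it does not implement the paper's complexity-based matching units or dithering-order scan, and the function names are hypothetical.

```python
def sad_with_pde(block, candidate, best_so_far):
    """Accumulate the SAD row by row and stop once it reaches best_so_far.

    Returns (sad, rows_evaluated); sad is None if the candidate was
    eliminated early because it cannot beat the current best.
    """
    total = 0
    for rows_done, (row_b, row_c) in enumerate(zip(block, candidate), start=1):
        total += sum(abs(a - b) for a, b in zip(row_b, row_c))
        if total >= best_so_far:
            return None, rows_done
    return total, len(block)


def full_search(block, frame, top, left, radius):
    """Full search over all displacements within `radius`, using PDE."""
    n = len(block)
    best_sad, best_mv, rows_used = float("inf"), None, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > len(frame) or x + n > len(frame[0]):
                continue  # candidate falls outside the reference frame
            candidate = [row[x:x + n] for row in frame[y:y + n]]
            sad, rows = sad_with_pde(block, candidate, best_sad)
            rows_used += rows
            if sad is not None:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad, rows_used


# Synthetic 8x8 reference frame and a 4x4 block copied from it (illustrative data).
frame = [[(3 * y * y + 5 * x + x * y) % 17 for x in range(8)] for y in range(8)]
block = [row[2:6] for row in frame[4:8]]
mv, sad, rows_used = full_search(block, frame, top=3, left=3, radius=2)
```

Because a candidate is rejected only when its partial error already reaches the best error found so far, the result is identical to an exhaustive search; only the number of rows evaluated (`rows_used`) goes down. Reordering the scan so that high-error regions are visited first, as the abstract describes, is aimed at making that rejection happen earlier.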

Area based image matching with MOC-NA imagery (MOC-NA 영상의 영역기준 영상정합)

  • Youn, Jun-Hee;Park, Choung-Hwan
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.28 no.4
    • /
    • pp.463-469
    • /
    • 2010
  • Since MOLA (Mars Orbiter Laser Altimeter) data, which provide altimetry for Mars, do not cover the whole surface of Mars, image matching with MOC imagery is needed to generate a DEM. However, automatic image matching is difficult because of insufficient features and low contrast. In this paper, we present an area-based semi-automatic image matching algorithm for MOC-NA (Mars Orbiter Camera - Narrow Angle) imagery. To accomplish this, seed points describing conjugate points are added manually to the stereo imagery, and interest points are produced automatically by using these seed points. With the produced interest points used as initial conjugate points, area-based image matching is performed. For points that fail to match, the locations of the initial conjugate points are recalculated using six matched points, and the image matching process is repeated. A quality assessment that reverses the roles of the target and search images shows that 97.5% of the points lay within an absolute difference of one pixel.
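
Area-based matching around an initial conjugate point can be sketched with normalized cross-correlation (NCC). The abstract does not name its similarity measure, so NCC, the window sizes, and the helper names below are assumptions made for illustration.

```python
from math import sqrt


def ncc(a, b):
    """Normalized cross-correlation of two equal-length pixel lists (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def window(image, row, col, half):
    """Flatten the (2*half+1) x (2*half+1) window centred on (row, col)."""
    return [image[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]


def match_point(target, search, point, seed, half=1, radius=2):
    """Search around `seed` in `search` for the window best correlated with `point` in `target`."""
    template = window(target, point[0], point[1], half)
    best_score, best_pos = -1.0, None
    for r in range(seed[0] - radius, seed[0] + radius + 1):
        for c in range(seed[1] - radius, seed[1] + radius + 1):
            if (r - half < 0 or c - half < 0
                    or r + half >= len(search) or c + half >= len(search[0])):
                continue  # window would leave the search image
            score = ncc(template, window(search, r, c, half))
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score


# Synthetic stereo pair: `search` is `target` shifted by (1, 2) with a
# brightness gain and offset (illustrative data, not MOC-NA imagery).
target = [[(3 * r * r + 7 * c + 5 * r * c) % 23 for c in range(8)] for r in range(8)]
search = [[2 * target[r - 1][c - 2] + 10 if r >= 1 and c >= 2 else 0
           for c in range(8)] for r in range(8)]
position, score = match_point(target, search, point=(3, 3), seed=(3, 4))
```

NCC is insensitive to linear brightness changes between the two images, which is one reason it is a common choice when contrast is low. The reversed-role quality check described in the abstract would correspond to calling `match_point` again with `target` and `search` swapped and comparing the result with the original point.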

Dynamic Pricing Based on Reinforcement Learning Reflecting the Relationship between Driver and Passenger Using Matching Matrix (Matching Matrix를 사용하여 운전자와 승객의 관계를 반영한 강화학습 기반 유동적인 가격 책정 체계)

  • Park, Jun Hyung;Lee, Chan Jae;Yoon, Young
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.19 no.6
    • /
    • pp.118-133
    • /
    • 2020
  • Research interest in the Mobility-as-a-Service (MaaS) concept for enhancing users' mobility experience is increasing. In particular, dynamic pricing techniques based on reinforcement learning have emerged, since adjusting prices according to demand is expected to help mobility services, such as taxi and car-sharing services, gain more profit. This paper provides a simulation framework that considers more practical factors, such as demand density per location, preferred prices, the distance between users and drivers, and the distance to the destination, all of which critically affect the probability of matching between users and mobility service providers (e.g., drivers). These new practical features are reflected in a data structure referred to as the Matching Matrix. Using an efficient algorithm for computing the probability of matching between users and drivers, and given a set of high-demand locations precisely identified with HDBSCAN, this study developed a better reward function that can steer the reinforcement learning process towards more realistic dynamic pricing policies.

Approximate Top-k Labeled Subgraph Matching Scheme Based on Word Embedding (워드 임베딩 기반 근사 Top-k 레이블 서브그래프 매칭 기법)

  • Choi, Do-Jin;Oh, Young-Ho;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association
    • /
    • v.22 no.8
    • /
    • pp.33-43
    • /
    • 2022
  • Labeled graphs are used to represent entities, their relationships, and their structures in real data such as knowledge graphs and protein interactions. With the rapid development of IT and the explosive increase in data, there is a need for subgraph matching technology that provides the information a user is interested in. In this paper, we propose an approximate Top-k labeled subgraph matching scheme that considers both the semantic similarity of labels and the difference in graph structure. The proposed scheme uses a learning model based on FastText to capture the semantic similarity of labels. In addition, a label similarity graph (LSG) is used for approximate subgraph matching by calculating similarity values between labels in advance. The LSG removes the limitation of existing schemes that subgraph expansion is possible only if the labels match exactly. The scheme supports structural similarity to a query graph by performing searches up to 2 hops. Based on the similarity values, it returns the top k subgraph matching results. We conduct various performance evaluations to show the superiority of the proposed scheme.
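
Precomputing similarities between labels, as the LSG is described as doing, can be sketched with cosine similarity over label vectors. The three-dimensional vectors below are made-up stand-ins for FastText embeddings, and the threshold and function names are hypothetical; the abstract does not specify how the LSG is actually built.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_label_similarity_graph(vectors, threshold):
    """Connect every pair of labels whose similarity is at least `threshold`."""
    graph = {label: {} for label in vectors}
    labels = sorted(vectors)
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                graph[a][b] = sim
                graph[b][a] = sim
    return graph


def top_k_labels(graph, label, k):
    """Return the k labels most similar to `label`, best first."""
    neighbours = sorted(graph.get(label, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in neighbours[:k]]


# Toy vectors standing in for learned label embeddings (illustrative only).
vectors = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "truck": [0.7, 0.3, 0.1],
    "protein": [0.0, 0.2, 0.9],
}
lsg = build_label_similarity_graph(vectors, threshold=0.8)
closest = top_k_labels(lsg, "car", k=2)
```

Because the similarities are computed once in advance, a query label such as "car" can be expanded to near-synonyms by a dictionary lookup at search time instead of requiring an exact label match.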