• Title/Summary/Keyword: Duplicate Detection Algorithm


Improved Facial Component Detection Using Variable Parameter and Verification (가변 변수와 검증을 이용한 개선된 얼굴 요소 검출)

  • Oh, Jeong-su
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.3 / pp.378-383 / 2020
  • The Viola & Jones object detection algorithm works well for facial component (FC) detection, but depending on its parameter settings it still suffers from duplicate detection, false detection, and non-detection. This paper proposes an improved FC detection algorithm that adds two elements to the Viola & Jones algorithm: a variable parameter to reduce non-detection, and a verification step to reduce duplicate and false detection. The proposed algorithm reduces non-detection by changing the parameter value of the Viola & Jones algorithm until potentially valid FCs are detected, and it eliminates duplicate and false detections through a verification that evaluates the size, position, and uniqueness of the detected FCs. Simulation results show that the proposed algorithm first captures the valid FCs among the detected objects and then keeps only the valid FCs by removing the invalid ones.
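The two steps in this abstract, changing a detector parameter until candidates appear and then verifying them, can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the `detect` callable, the parameter schedule, the size limits, and the overlap threshold are all hypothetical, and the position check mentioned in the abstract is left out for brevity.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def detect_with_variable_parameter(
    detect: Callable[[int], List[Box]], schedule: List[int]
) -> List[Box]:
    """Try each parameter value in turn until the detector returns candidates."""
    for value in schedule:
        boxes = detect(value)
        if boxes:
            return boxes
    return []


def verify(boxes: List[Box], min_size: int, max_size: int,
           max_overlap: float = 0.5) -> List[Box]:
    """Keep boxes of plausible size and drop those that overlap a kept box."""
    kept: List[Box] = []
    # Larger boxes are considered first (an assumption of this sketch).
    for box in sorted(boxes, key=lambda b: b[2] * b[3], reverse=True):
        w, h = box[2], box[3]
        if not (min_size <= w <= max_size and min_size <= h <= max_size):
            continue  # size check: treated as a false detection
        if any(iou(box, k) > max_overlap for k in kept):
            continue  # uniqueness check: treated as a duplicate detection
        kept.append(box)
    return kept
```

In practice `detect` could wrap a cascade detector with one of its tuning parameters exposed, but which parameter the paper varies is not stated in the abstract.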

A Study on Duplicate Detection Algorithm in Union Catalog (종합목록의 중복레코드 검증을 위한 알고리즘 연구)

  • Cho, Sun-Yeong
    • Journal of the Korean Society for Library and Information Science / v.37 no.4 / pp.69-88 / 2003
  • This study develops a new duplicate detection algorithm to improve database quality. The algorithm varies its analysis by language and bibliographic type, and it checks elements of the bibliographic data rather than only MARC fields. It computes a degree of similarity using weight values, so that records are not eliminated because of simple input errors. The study was performed on 7,649 records newly uploaded during the preceding year, matched against a sample master database of 210,000 records. The findings show that the new algorithm improved the duplicate recall rate by 36.2%.
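A weighted degree of similarity of the kind this abstract describes might look like the sketch below. It is an assumption-laden illustration: the field names, the weights, and the 0.9 threshold are hypothetical, and `difflib.SequenceMatcher` stands in for whatever string comparison the study actually used.

```python
from difflib import SequenceMatcher
from typing import Dict

# Hypothetical field weights; the study's actual elements and weights
# are not given in the abstract.
WEIGHTS: Dict[str, float] = {"title": 0.5, "author": 0.3, "year": 0.2}


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] that tolerates small typing errors."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def record_similarity(r1: Dict[str, str], r2: Dict[str, str]) -> float:
    """Weighted average of the per-field similarities."""
    total = sum(WEIGHTS.values())
    score = sum(
        weight * field_similarity(r1.get(field, ""), r2.get(field, ""))
        for field, weight in WEIGHTS.items()
    )
    return score / total


def is_duplicate(r1: Dict[str, str], r2: Dict[str, str],
                 threshold: float = 0.9) -> bool:
    """Flag two records as duplicates when their weighted similarity is high."""
    return record_similarity(r1, r2) >= threshold
```

Because the score is graded rather than an exact match, a record whose title contains a single mistyped character still scores close to 1.0 against its clean counterpart.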

Improved Face Detection Algorithm Using Face Verification (얼굴 검증을 이용한 개선된 얼굴 검출)

  • Oh, Jeong-su
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.10 / pp.1334-1339 / 2018
  • The Viola & Jones face detection algorithm is a typical face detection algorithm with excellent detection performance. However, in images containing many faces, face diversity causes it to miss some faces and to produce wrong detections such as false faces and duplicate detections. This paper proposes an improved face detection algorithm that uses a face verification algorithm to eliminate the false detections produced by the Viola & Jones algorithm. The proposed face verification algorithm checks whether a detected face is valid by evaluating its size, its skin color in a designated area, the edges generated by its eyes and mouth, and whether it is a duplicate detection. In a face verification experiment on 658 face images detected by the Viola & Jones algorithm, the proposed algorithm verified all the face images that came from real people.

Tree-Pattern-Based Clone Detection with High Precision and Recall

  • Lee, Hyo-Sub;Choi, Myung-Ryul;Doh, Kyung-Goo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.5 / pp.1932-1950 / 2018
  • The paper proposes a code-clone detection method that gives the highest possible precision and recall, without giving much attention to efficiency and scalability. The goal is to automatically create a reliable reference corpus that can be used as a basis for evaluating the precision and recall of clone detection tools. The algorithm takes an abstract-syntax-tree representation of source code and thoroughly examines every possible pair of all duplicate tree patterns in the tree, while avoiding unnecessary and duplicated comparisons wherever possible. The largest possible duplicate patterns are then collected in the set of pattern clusters that are used to identify code clones. The method is implemented and evaluated for a standard set of open-source Java applications. The experimental result shows very high precision and recall. False-negative clones missed by our method are all non-contiguous clones. Finally, the concept of neighbor patterns, which can be used to improve recall by detecting non-contiguous clones and intertwined clones, is proposed.
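To give a feel for tree-based clone detection, the sketch below finds exactly repeated subtrees in Python source. It is deliberately a much simpler technique than the tree-pattern method in this abstract: it targets Python rather than Java, matches only identical subtrees, and does not collect maximal patterns or clusters. The function name and the `min_nodes` cutoff are hypothetical.

```python
import ast
from collections import defaultdict
from typing import Dict, List


def duplicate_subtrees(source: str, min_nodes: int = 5) -> Dict[str, List[int]]:
    """Map each repeated subtree (by structural dump) to the lines where it starts."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if not hasattr(node, "lineno"):
            continue  # skip nodes without a source position (e.g. operators)
        size = sum(1 for _ in ast.walk(node))
        if size < min_nodes:
            continue  # ignore trivial subtrees
        # ast.dump omits line and column attributes by default, so equal
        # code at different positions produces the same key.
        groups[ast.dump(node)].append(node.lineno)
    return {key: lines for key, lines in groups.items() if len(lines) > 1}
```

Recomputing the size of every subtree makes this quadratic in the worst case, which is acceptable for a small illustration but not for a real corpus.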

IPv6 Autoconfiguration for Hierarchical MANETs with Efficient Leader Election Algorithm

  • Bouk, Safdar Hussain;Sasase, Iwao
    • Journal of Communications and Networks / v.11 no.3 / pp.248-260 / 2009
  • To connect a mobile ad hoc network (MANET) to an IP network and carry out communication, each ad hoc network node needs to be configured with a unique IP address. In wired networks, a dynamic host configuration protocol (DHCP) server autoconfigures nodes. However, this cannot be applied to an ad hoc network without changes to the autoconfiguration mechanism, because of the network's intrinsic properties (i.e., its multi-hop, dynamic, and distributed nature). In this paper, we propose a scalable autoconfiguration scheme for MANETs with a hierarchical topology consisting of leader and member nodes, which takes global Internet connectivity into account with minimum overhead. In the proposed scheme, a joining node selects one of the pre-configured nodes for its duplicate address detection (DAD) operation. We reduce overhead and make the scheme scalable by eliminating the broadcast of DAD messages in the network. We also propose a group leader election algorithm, which takes into account the resources, density, and position information of a node when selecting a new leader. Our simulation results show that the proposed scheme reduces overhead and is scalable. They also show that the scheme provides an efficient method to heal the network after partitioning and merging by enhancing the role of bordering nodes in the group.

A Method of Hierarchical Address Autoconfiguration base on Hop-count in 6LoWPAN (6LoWPAN에서 홉-수 기반 계층적 자동주소할당 방법)

  • Kim, Dong-Kyu;Kim, Jung-Gyu
    • Journal of Korea Society of Industrial Information Systems / v.15 no.3 / pp.11-21 / 2010
  • As the number of sensor nodes in sensor networks increases, sensor nodes need to be assigned addresses automatically. Existing methods have shortcomings: they waste addresses severely, they require the coordinator to hold all address information, or they generate a lot of traffic when addressing each sensor node. This paper proposes a hop-count-based hierarchical address allocation algorithm that lets 6LoWPAN sensor nodes be addressed automatically and efficiently. The proposed method divides the network into areas by hop count, so that unique, non-overlapping addresses can be assigned and the area over which DAD (Duplicate Address Detection) is performed is reduced. This reduces the traffic caused by DAD. In addition, the destination address in the IP header is compressed to a minimum of 32 bits for packet transmission, which reduces the number of packet transmissions by 11.1% compared with the non-compression method.

Developing Image Processing Program for Automated Counting of Airborne Fibers (이미지 처리를 통한 공기 중 섬유의 자동계수 알고리즘 프로그램 개발)

  • Choi, Sungwon;Lee, Heekong;Lee, Jong Il;Kim, Hyunwook
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.24 no.4 / pp.484-491 / 2014
  • Objectives: An image processing program for asbestos fibers that analyzes gradient components and partial linearity was developed in order to segment fibers accurately. The objectives were to increase counting accuracy by formulating the size and shape of fibers and to guarantee robust fiber detection against noisy backgrounds. Methods: We used samples mixed with sand and sepiolite, which has a structure similar to that of asbestos. Sample concentrations of 0.01%, 0.05%, 0.1%, 0.5%, 1%, 2%, and 3% (w/w) were prepared. The sand was sieved to less than 180 µm and then homogenized. Airborne samples were collected on MCE filters using a personal pump at a flow rate of 2 L/min for 30 minutes. We used the NIOSH 7400 method to pre-treat the filters and count the fibers on them. The results of the NIOSH 7400 method were compared with those of the image processing program. Results: Compared with the target images acquired by PCM, the developed algorithm achieved an average detection rate of 88.67%. The main causes of non-detection were fibers with a low degree of contrast and overlapping faint, thin fibers. Some duplicate counts also occurred for fibers that appeared broken in the middle because of overlapping particles. Conclusions: An image detection algorithm that can increase the accuracy of fiber counting was developed by considering the direction of edges to extract images of fibers. It showed results comparable to PCM analysis and could be used to count fibers through real-time tracking by modeling branch points as a graph. The algorithm can be used to measure asbestos concentrations in real time if a suitable optical design is developed.

Efficient Similarity Joins by Adaptive Prefix Filtering (맞춤 접두 필터링을 이용한 효율적인 유사도 조인)

  • Park, Jong Soo
    • KIPS Transactions on Software and Data Engineering / v.2 no.4 / pp.267-272 / 2013
  • The similarity join, which finds all pairs of records in a dataset whose similarity is above a given threshold, is an important operation with many applications such as data cleaning and duplicate detection, and performing it efficiently is challenging. We propose a new algorithm that uses the prefix filtering principle as a strong constraint on the generation of candidate pairs for fast similarity joins. A candidate pair is generated only when the current prefix token of a probing record matches a prefix token of an indexing record within the prefix tokens constrained by the principle. This generation method does not need to compute an upper bound on the overlap between two records, which reduces execution time. Experimental results show that our algorithm significantly outperforms previous prefix filtering-based algorithms on real datasets.
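The prefix filtering principle that this abstract builds on can be sketched as a small Jaccard similarity join. This shows only the standard baseline principle, not the adaptive variant the paper proposes. The function names are hypothetical, records are assumed to be non-empty token sets, and the threshold is assumed to lie in (0, 1].

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two non-empty token sets."""
    return len(a & b) / len(a | b)


def similarity_join(records: List[Set[str]],
                    threshold: float) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), i < j, with Jaccard similarity >= threshold."""
    # Global token order: rarest tokens first, ties broken alphabetically.
    freq = Counter(tok for rec in records for tok in rec)
    ordered = [sorted(rec, key=lambda t: (freq[t], t)) for rec in records]

    index: Dict[str, List[int]] = defaultdict(list)  # prefix token -> record ids
    result: List[Tuple[int, int]] = []
    for j, tokens in enumerate(ordered):
        if not tokens:
            continue  # empty records are skipped in this sketch
        # Prefix filtering: two records can reach the threshold only if
        # their prefixes of this length share at least one token.
        # The small epsilon guards against float rounding shortening the prefix.
        n = len(tokens)
        prefix = tokens[: n - math.ceil(threshold * n - 1e-9) + 1]
        candidates = {i for tok in prefix for i in index[tok]}
        for i in sorted(candidates):
            if jaccard(records[i], records[j]) >= threshold:
                result.append((i, j))  # candidate passed verification
        for tok in prefix:
            index[tok].append(j)
    return result
```

Only records that share a prefix token are ever compared, so most dissimilar pairs are never verified. The higher the threshold, the shorter the prefixes and the fewer candidates are generated.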

Content based Video Copy Detection Using Spatio-Temporal Ordinal Measure (시공간 순차 정보를 이용한 내용기반 복사 동영상 검출)

  • Jeong, Jae-Hyup;Kim, Tae-Wang;Yang, Hun-Jun;Jin, Ju-Kyong;Jeong, Dong-Seok
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.2 / pp.113-121 / 2012
  • In this paper, we propose a fast and efficient algorithm for detecting near-duplicates, based on content-based retrieval, in a large-scale video database. To handle large amounts of video easily, we split each video into small segments using scene change detection. Video services and copyright-related business models need technology that detects near-duplicates, that is, longer matched videos, rather than technology that searches for videos containing a short part or a single frame of the original. To detect near-duplicate videos, we propose a motion distribution descriptor and a frame descriptor for each video segment. The motion distribution descriptor is constructed from the motion vectors of macroblocks obtained during the video decoding process. When matching descriptors, we use the motion distribution descriptor as a filter to improve matching speed. However, the motion distribution has low discriminability. To improve discrimination, identification is performed with the frame descriptor extracted from representative frames selected within a scene segment. The proposed algorithm shows a high success rate and a low false alarm rate. In addition, matching with these descriptors is very fast, which confirms that the algorithm can be useful in practical applications.

Enhancements to the fast recovery Algorithm of TCP NewReno using rapid loss detection (빠른 손실 감지를 통한 TCP NewReno의 Fast Recovery 개선 알고리듬)

  • 김동민;김범준;김석규;이재용
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.7B / pp.650-659 / 2004
  • The domestic wireless network environment is changing rapidly as it adapts to users' service requirements and market growth. As a result, reliable data transmission using TCP is also expected to increase. Because TCP assumes that it is used over a wired network, it suffers significant performance degradation over wireless networks, where packet losses are not always the result of network congestion. The retransmission timeout (RTO) in particular degrades TCP performance greatly. In this paper, we propose DAC$^{+}$ and EFR to prevent performance degradation by quickly detecting and recovering from losses without an RTO during fast recovery. Compared with TCP NewReno, the proposed scheme shows steady-state improvements in the form of a higher fast recovery probability and a reduced response time.