• Title/Summary/Keyword: Scale-Invariant Feature Transform


Slab Region Localization for Text Extraction using SIFT Features (문자열 검출을 위한 슬라브 영역 추정)

  • Choi, Jong-Hyun;Choi, Sung-Hoo;Yun, Jong-Pil;Koo, Keun-Hwi;Kim, Sang-Woo
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.5 / pp.1025-1034 / 2009
  • In steel-making production lines, each steel slab is given a unique identification number. This number, the slab management number (SMN), gives information about the use of the slab. SMN identification has been done by humans for several years, but this is expensive, inaccurate, and a heavy burden on the workers. An automatic recognition system is therefore desirable to improve efficiency. Generally, a recognition system consists of text localization, text extraction, character segmentation, and character recognition. For exact SMN identification, every stage of the recognition system must succeed. Text localization is a particularly important and difficult stage: because of the many text-like patterns in the complex background and the high fuzziness between the slab and the background, directly extracting the text region is difficult. If the slab region containing the SMN can be detected precisely, the text localization algorithm can be simpler and the processing time of the overall recognition system will be reduced. This paper describes slab region localization using SIFT (Scale Invariant Feature Transform) features in the image. First, the SIFT algorithm is applied to the captured background and slab images, and the features of the two images are matched by the Nearest Neighbor (NN) algorithm. However, the correct matching rate can be low when the two images are matched. Thus, to remove incorrect matches between the features of the two images, the geometric locations of the matched feature points are used. Finally, a search rectangle method is applied to the correctly matched features, and the top and side boundaries of the slab region are determined. Through these processes, the search region for extracting the SMN from the slab image can be reduced. In most cases, the search region for text extraction is fixed heuristically [1][2]. However, the proposed algorithm is more analytic than other algorithms, because the search region is not fixed and the slab region is searched for in the whole image. Experimental results show that the proposed algorithm performs well.
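The matching stage described in this abstract (nearest-neighbor matching followed by a geometric check) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the geometric test, so the median-displacement filter, the function names, and the `tol` parameter are all assumptions, and descriptors are plain tuples rather than real 128-dimensional SIFT vectors.

```python
import math

def nearest_neighbor_matches(desc_a, desc_b):
    """For each descriptor in desc_a, pair it with the closest descriptor in desc_b."""
    matches = []
    for i, da in enumerate(desc_a):
        # Euclidean distance in descriptor space; the smallest distance wins.
        j = min(range(len(desc_b)), key=lambda k: math.dist(da, desc_b[k]))
        matches.append((i, j))
    return matches

def filter_by_displacement(matches, pts_a, pts_b, tol=5.0):
    """Keep matches whose keypoint displacement is close to the median displacement.

    Assumed stand-in for the paper's geometric-location check.
    """
    if not matches:
        return []
    dx = sorted(pts_b[j][0] - pts_a[i][0] for i, j in matches)
    dy = sorted(pts_b[j][1] - pts_a[i][1] for i, j in matches)
    med_dx, med_dy = dx[len(dx) // 2], dy[len(dy) // 2]
    return [
        (i, j)
        for i, j in matches
        if abs(pts_b[j][0] - pts_a[i][0] - med_dx) <= tol
        and abs(pts_b[j][1] - pts_a[i][1] - med_dy) <= tol
    ]
```

A match whose displacement disagrees with most of the others is treated as incorrect and dropped; the surviving points would then feed whatever boundary search follows.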

Mobile Camera-Based Positioning Method by Applying Landmark Corner Extraction (랜드마크 코너 추출을 적용한 모바일 카메라 기반 위치결정 기법)

  • Yoo Jin Lee;Wansang Yoon;Sooahm Rhee
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1309-1320 / 2023
  • Mobile devices have developed and become widespread to the point that users can check their location and use the Internet almost anywhere. Indoors, however, the Internet can be used smoothly, but the global positioning system (GPS) is difficult to use. There is an increasing need to provide real-time location information in shaded areas where GPS signals are not received, such as department stores, museums, conference halls, schools, and tunnels, which are indoor public places. Accordingly, research on indoor positioning technology that builds a landmark database with light detection and ranging (LiDAR) equipment has recently been increasing. Focusing on the accessibility of building a landmark database, this study attempted to develop a technique for estimating the user's location from a single image of a landmark taken with a mobile device and landmark database information constructed in advance. First, a landmark database was constructed. To estimate the user's location from the mobile image alone, it is essential to detect the landmark in the mobile image and to acquire the ground coordinates of points with fixed characteristics on the detected landmark. In the second step, bag of words (BoW) image search was applied to retrieve from the landmark database the four landmarks most similar to the one photographed in the mobile image. In the third step, one of the four candidate landmarks was selected using scale invariant feature transform (SIFT) feature point extraction and homography estimation with random sample consensus (RANSAC); at this stage, candidates were filtered once more by applying a threshold to the number of matching points. In the fourth step, the landmark image was projected onto the mobile image through the homography matrix between the corresponding landmark and the mobile image to detect the landmark's area and corners. Finally, the user's location was estimated through the location estimation technique. In the performance analysis, the landmark search performance was measured to be about 86%. Comparing the location estimation result with the user's actual ground coordinates confirmed a horizontal location accuracy of about 0.56 m, and confirmed that the user's location can be estimated from a mobile image by constructing a landmark database without separate expensive equipment.
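The fourth step, projecting the landmark image onto the mobile image through a homography, comes down to applying a 3×3 matrix to the landmark's corner coordinates in homogeneous form. The sketch below assumes the homography `H` has already been estimated (for example by RANSAC, which is not shown); the function names and the row-major list-of-lists layout are illustrative choices, not taken from the paper.

```python
def project_point(H, x, y):
    """Apply a 3x3 homography H (row-major nested lists) to the point (x, y)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)

def project_corners(H, width, height):
    """Project the four corners of a width x height landmark image into the target image."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    return [project_point(H, x, y) for x, y in corners]
```

The division by `w` is what distinguishes a homography from an affine map: it lets the projected quadrilateral show perspective foreshortening.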

Matching Points Filtering Applied Panorama Image Processing Using SURF and RANSAC Algorithm (SURF와 RANSAC 알고리즘을 이용한 대응점 필터링 적용 파노라마 이미지 처리)

  • Kim, Jeongho;Kim, Daewon
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.4 / pp.144-159 / 2014
  • Techniques for making a single panoramic image from multiple pictures are widely studied in areas such as computer vision and computer graphics. Panoramic images can be applied to fields such as virtual reality and robot vision, which require wide-angle shots, as a useful way to overcome the limitations of an image taken from a single camera, such as angle of view, resolution, and the information it contains. A panoramic image is also meaningful in that it usually provides a better feeling of immersion than a plain image. Although there are many ways to build a panoramic image, most of them extract feature points and match points between the images to make a single panoramic image. In addition, those methods use the RANSAC (RANdom SAmple Consensus) algorithm with the matching points and a homography matrix to transform the image. The SURF (Speeded Up Robust Features) algorithm, which is used in this paper to extract feature points, uses an image's black-and-white intensity information and local spatial information. SURF is widely used because it is robust to changes in image size and viewpoint and is faster than the SIFT (Scale Invariant Feature Transform) algorithm. However, SURF has the shortcoming of producing errors when extracting feature points, which slow down the RANSAC algorithm. As a result, CPU usage may increase. Errors in detecting matching points can be a critical cause of reduced accuracy and clarity in the panoramic image. In this paper, to minimize errors in extracting matching points, we used the RGB pixel values of the $3{\times}3$ region around each matching point's coordinates in an intermediate filtering step that removes wrong matching points. We also present analysis and evaluation results on the improved processing speed for producing a panoramic image, the CPU usage rate, the reduction rate of extracted matching points, and accuracy.
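One plausible reading of the $3{\times}3$ RGB filtering step is sketched below: compare the mean color of the 3×3 neighborhood around each matched point in the two images and drop pairs that differ too much. The abstract does not give the comparison rule, so the mean-color test, the `max_diff` threshold, and the image representation (a list of rows of `(r, g, b)` tuples) are assumptions made for illustration.

```python
def patch_mean_rgb(image, x, y):
    """Mean RGB over the 3x3 neighborhood of (x, y), clipped at the image border."""
    h, w = len(image), len(image[0])
    sums = [0, 0, 0]
    count = 0
    for yy in range(max(0, y - 1), min(h, y + 2)):
        for xx in range(max(0, x - 1), min(w, x + 2)):
            for c in range(3):
                sums[c] += image[yy][xx][c]
            count += 1
    return tuple(s / count for s in sums)

def filter_matches_by_color(img_a, img_b, matches, max_diff=30.0):
    """Keep matches whose 3x3 mean colors differ by at most max_diff per channel."""
    kept = []
    for (xa, ya), (xb, yb) in matches:
        mean_a = patch_mean_rgb(img_a, xa, ya)
        mean_b = patch_mean_rgb(img_b, xb, yb)
        if max(abs(p - q) for p, q in zip(mean_a, mean_b)) <= max_diff:
            kept.append(((xa, ya), (xb, yb)))
    return kept
```

Filtering before RANSAC shrinks the set of candidate pairs, which is the mechanism by which such a step could reduce RANSAC's running time.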

A Study on Training Dataset Configuration for Deep Learning Based Image Matching of Multi-sensor VHR Satellite Images (다중센서 고해상도 위성영상의 딥러닝 기반 영상매칭을 위한 학습자료 구성에 관한 연구)

  • Kang, Wonbin;Jung, Minyoung;Kim, Yongil
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1505-1514 / 2022
  • Image matching is a crucial preprocessing step for the effective use of multi-temporal and multi-sensor very high resolution (VHR) satellite images. Deep learning (DL), which is attracting widespread interest, has proven to be an efficient approach for measuring the similarity between image pairs quickly and accurately by extracting complex and detailed features from satellite images. However, image matching of VHR satellite images remains challenging because the results of DL models depend on the quantity and quality of the training dataset, and because creating a training dataset from VHR satellite images is difficult. Therefore, this study examines the feasibility of a DL-based method for matching pair extraction, which is the most time-consuming process in image registration. This paper also aims to analyze the factors that affect accuracy depending on the configuration of the training dataset when the training dataset is developed from an existing, biased multi-sensor VHR image database for DL-based image matching. For this purpose, the training dataset was composed of correct and incorrect matching pairs by assigning true and false labels to image pairs extracted using a grid-based Scale Invariant Feature Transform (SIFT) algorithm from a total of 12 multi-temporal and multi-sensor VHR images. The Siamese convolutional neural network (SCNN), proposed for matching pair extraction, is trained on the constructed training dataset and measures similarity by passing two images in parallel through two identical convolutional neural network structures. The results of this study confirm that data acquired from a VHR satellite image database can be used as a DL training dataset and indicate the potential to improve the efficiency of the matching process through appropriate configuration of multi-sensor images. Given their stable performance, DL-based image matching techniques using multi-sensor VHR satellite images are expected to replace existing manual feature extraction methods and to develop further into an integrated DL-based image registration framework.

Evaluation on Tie Point Extraction Methods of WorldView-2 Stereo Images to Analyze Height Information of Buildings (건물의 높이 정보 분석을 위한 WorldView-2 스테레오 영상의 정합점 추출방법 평가)

  • Yeji, Kim;Yongil, Kim
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.33 no.5 / pp.407-414 / 2015
  • Interest points are generally located at pixels where height changes occur. Interest points can therefore be significant pixels for DSM generation, and they play an important role in generating accurate and reliable matching results. Manual operation is widely used to extract interest points and to match stereo satellite images with them to generate height information, but it is costly and time-consuming. Thus, this study suggests a tie point extraction method using the Harris-affine technique and SIFT (Scale Invariant Feature Transform) descriptors to analyze the height information of buildings. Interest points on buildings were extracted by the Harris-affine technique, and tie points were collected efficiently using SIFT descriptors, which are invariant to scale. A search window for each interest point was used, and the direction of tie point pairs was considered to make tie point extraction more efficient. The tie point pairs estimated by the proposed method were used to analyze the height information of buildings. The result had RMSE values of less than 2 m compared with the height information estimated by the manual method.
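The RMSE figure reported here compares heights from the proposed method against heights from the manual method. As a reminder of how that metric is computed, here is a small sketch; the function name and argument names are illustrative, and the abstract gives no per-building values, so none are reproduced.

```python
import math

def rmse(estimated, reference):
    """Root-mean-square error between two equal-length sequences of heights (same unit)."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the errors are squared before averaging, a single badly estimated building raises the RMSE more than several small deviations would.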

Framework Implementation of Image-Based Indoor Localization System Using Parallel Distributed Computing (병렬 분산 처리를 이용한 영상 기반 실내 위치인식 시스템의 프레임워크 구현)

  • Kwon, Beom;Jeon, Donghyun;Kim, Jongyoo;Kim, Junghwan;Kim, Doyoung;Song, Hyewon;Lee, Sanghoon
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.11 / pp.1490-1501 / 2016
  • In this paper, we propose an image-based indoor localization system using parallel distributed computing. To reduce the computation time for indoor localization, a scale invariant feature transform (SIFT) algorithm is run in parallel using Apache Spark. Toward this goal, we propose a novel image processing interface for Apache Spark. The experimental results show that the proposed system is about 3.6 times faster than the conventional system.

Mosaic image generation of AISA Eagle hyperspectral sensor using SIFT method (SIFT 기법을 이용한 AISA Eagle 초분광센서의 모자이크영상 생성)

  • Han, You Kyung;Kim, Yong Il;Han, Dong Yeob;Choi, Jae Wan
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.31 no.2 / pp.165-172 / 2013
  • In this paper, a high-quality mosaic image is generated from high-resolution hyperspectral strip images using the scale-invariant feature transform (SIFT) algorithm, which is one of the representative image matching methods. The experiments are applied to AISA Eagle images geo-referenced using GPS/INS information acquired during the flight. The matching points between three strips of hyperspectral images are extracted using the SIFT method, and the transformation models between the images are constructed from the points. The mosaic image is then generated using the transformation models constructed from the corresponding images. The optimal band for matching point extraction is determined by selecting representative bands of the hyperspectral data and analyzing the matching results for each band. The mosaic image generated by the proposed method is visually compared with the mosaic image generated from the initial geo-referenced AISA hyperspectral images. From the comparison, we could estimate the geometric accuracy of the generated mosaic image and analyze the efficiency of our methodology.

Fast Object Classification Using Texture and Color Information for Video Surveillance Applications (비디오 감시 응용을 위한 텍스쳐와 컬러 정보를 이용한 고속 물체 인식)

  • Islam, Mohammad Khairul;Jahan, Farah;Min, Jae-Hong;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology / v.15 no.1 / pp.140-146 / 2011
  • In this paper, we propose a fast object classification method based on texture and color information for video surveillance. We take advantage of local patches by extracting SURF features and color histograms from images. SURF gives intensity content information, and color information strengthens distinctiveness by providing links to patch content. We thus gain both the fast computation of SURF and the color cues of objects. We use a Bag of Words model to generate global descriptors of a region of interest (ROI) or an image from the local features, and a Naïve Bayes model to classify the global descriptors. We also investigate a discriminative descriptor, the Scale Invariant Feature Transform (SIFT). Our experimental results for 4 object classes show a classification rate of 95.75%.
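The Bag of Words step mentioned in this abstract turns a variable number of local descriptors into one fixed-length global descriptor. A minimal sketch of that step alone is shown below, assuming a codebook of visual words has already been learned (for example by clustering, not shown); the function name, the Euclidean assignment rule, and the normalization are assumptions, and the Naïve Bayes classifier is not covered.

```python
import math

def bow_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest visual word; return a normalized histogram."""
    hist = [0] * len(codebook)
    for d in descriptors:
        # Index of the closest codeword by Euclidean distance.
        k = min(range(len(codebook)), key=lambda i: math.dist(d, codebook[i]))
        hist[k] += 1
    total = sum(hist)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in hist]
```

Normalizing by the descriptor count keeps histograms comparable between regions that yield different numbers of local features.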

Post Sender Recognition using SIFT (SIFT를 이용한 우편영상의 송신자 인식)

  • Kim, Young-Won;Jang, Seung-Ick;Lee, Sung-Jun
    • The Journal of the Korea Contents Association / v.10 no.11 / pp.48-57 / 2010
  • Previous postal recognition studies focused on recognizing the receiver's address, and relatively few studies have addressed recognizing the sender's address information. Post sender recognition is necessary for services and applications that use sender information, such as returning mail. This paper experiments with and suggests a method for recognizing the post sender using SIFT. Although SIFT shows a high recognition rate, it has problems with time and misrecognition. One is that the time to match keypoints increases in proportion to the number of registered models. The other is misrecognition caused by many similar keypoints appearing across different models, owing to the nature of post sender images. To solve these problems, this paper suggests SIFT with an added distance function and compares time and performance experimentally. In addition, it suggests how to register and classify models automatically, without a manual model registration process.

Design and Implementation of Video Search System robust to Brightness and Rotation Changes Based on Ferns Algorithm (Ferns 알고리즘 기반 밝기 및 회전 변화에 강인한 영상검색 시스템 설계 및 구현)

  • Yoon, Seok-Hwan;Shim, Jae-Sung;Park, Seok-Cheon
    • Journal of Korea Multimedia Society / v.19 no.9 / pp.1679-1689 / 2016
  • Recently, with the rapid development of multimedia technologies, image data has become extensive and large-scale, and the increasing time needed to retrieve a desired image is becoming a critical problem. Image retrieval systems that let users search for the desired image information quickly and accurately have been researched for a long time. However, representative content-based image retrieval methods such as the color histogram, Color Coherence Vectors (CCV), and Scale Invariant Feature Transform (SIFT) are sensitive to changes in brightness and rotation, which can cause misrecognition and degrade discriminative power. In this paper, the proposed video retrieval system was evaluated under no change in brightness, increased brightness, and decreased brightness, each with rotations of 0°, 90°, 180°, and 270°. The results show that the performance difference due to changes in brightness averaged about 2% for the proposed system, about 12.5% for the color histogram, about 12.25% for CCV, and about 8.5% for SIFT. The proposed system thus showed the smallest variation, about 2% on average, and was confirmed to be more robust to changes in brightness and rotation than the existing systems.