• Title/Summary/Keyword: visual search performance

Dual-Encoded Features from Both Spatial and Curvelet Domains for Image Smoke Recognition

  • Yuan, Feiniu;Tang, Tiantian;Xia, Xue;Shi, Jinting;Li, Shuying
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.2078-2093 / 2019
  • Visual smoke recognition is a challenging task due to large variations in the shape, texture and color of smoke. To improve performance, we propose a novel smoke recognition method that combines dual-encoded features extracted from both the spatial and Curvelet domains. A Curvelet transform is used to filter an image, generating fifty sub-images of Curvelet coefficients. We then extract Local Binary Pattern (LBP) maps from these coefficient maps and aggregate the histograms of these LBP maps to produce a histogram map. Afterwards, we encode the histogram map again to generate Dual-encoded Local Binary Patterns (Dual-LBP). Histograms of Dual-LBPs from the Curvelet domain and Completed Local Binary Patterns (CLBP) from the spatial domain are concatenated to form the feature for smoke recognition. Finally, we adopt the Gaussian Kernel Optimization (GKO) algorithm to search for the optimal kernel parameters of a Support Vector Machine (SVM) to further improve classification accuracy. Experimental results demonstrate that our method can extract effective and reasonable features of smoke images and achieve good classification accuracy.
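
As a rough illustration of the LBP step mentioned in this abstract, the sketch below computes basic 3x3 LBP codes and their 256-bin histogram for a small grayscale image. It assumes raw pixel values as input, not Curvelet coefficient maps, and it does not implement the paper's Dual-LBP or CLBP encodings. The function names `lbp_codes` and `lbp_histogram` are hypothetical.

```python
from typing import List

Image = List[List[int]]

def lbp_codes(img: Image) -> Image:
    """Basic 3x3 LBP code for every interior pixel (illustrative only)."""
    # Eight neighbours, visited clockwise starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = len(img), len(img[0])
    codes = []
    for r in range(1, h - 1):
        row = []
        for c in range(1, w - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes

def lbp_histogram(img: Image) -> List[int]:
    """256-bin histogram of the LBP codes of one image."""
    hist = [0] * 256
    for row in lbp_codes(img):
        for code in row:
            hist[code] += 1
    return hist
```

Under the same assumptions, histograms computed this way over several maps could be concatenated into one feature vector before being passed to a classifier.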

ACT-R Predictive Model of Korean Text Entry on Touchscreen

  • Lim, Soo-Yong;Jo, Seong-Sik;Myung, Ro-Hae;Kim, Sang-Hyeob;Jang, Eun-Hye;Park, Byoung-Jun
    • Journal of the Ergonomics Society of Korea / v.31 no.2 / pp.291-298 / 2012
  • Objective: The aim of this study is to predict Korean text entry on touchscreens using the ACT-R cognitive architecture. Background: Touchscreens are increasingly used in devices such as satellite navigation devices, PDAs, and mobile phones, and the market is expanding. Accordingly, there is increasing interest in developing and evaluating interfaces that enhance user experience and satisfaction in the touchscreen environment. Method: In this study, Korean text entry performance in the touchscreen environment was analyzed using ACT-R. An ACT-R model was established that considers the characteristics of the Korean language, which is composed of vowels and consonants. Further, this study analyzed whether Korean text entry can be predicted with the ACT-R cognitive model. Results: No significant difference in performance time was found between the model prediction and the empirical data. Conclusion: The proposed model can accurately predict physical movement time as well as cognitive processing time. Application: This study is useful for conducting model-based evaluation of touchscreen text entry interfaces, and it enables quantitative and effective evaluation of diverse types of Korean text input interfaces through cognitive models.

Enhancement on 3 DoF Image Stitching Using Inertia Sensor Data (관성 센서 데이터를 활용한 3 DoF 이미지 스티칭 향상)

  • Kim, Minwoo;Kim, Sang-Kyun
    • Journal of Broadcast Engineering / v.22 no.1 / pp.51-61 / 2017
  • This paper proposes a method to generate panoramic images by combining conventional feature extraction algorithms (e.g., SIFT, SURF, MPEG-7 CDVS) with data sensed by an inertia sensor to enhance the stitching results. Image stitching becomes more challenging when the images are taken by two different mobile phones with no posture calibration. Using inertia sensor data obtained by the mobile phone, images with different yaw, pitch, and roll angles are preprocessed and adjusted before the stitching process is performed. The stitching performance (e.g., feature extraction time, number of inlier points, stitching accuracy) of the conventional feature extraction algorithms is reported, along with the stitching performance with and without the inertia sensor data.

A Study on the Outlet Blockage Determination Technology of Conveyor System using Deep Learning

  • Jeong, Eui-Han;Suh, Young-Joo;Kim, Dong-Ju
    • Journal of the Korea Society of Computer and Information / v.25 no.5 / pp.11-18 / 2020
  • This study proposes a technique for determining outlet blockage in a conveyor system using deep learning. The proposed method aims to apply the best model to the actual process: we train various CNN models to determine outlet blockage using images collected by CCTV in an industrial scene. We used well-known CNN models such as VGGNet, ResNet, DenseNet and NASNet, and used 18,000 images collected by CCTV for model training and performance evaluation. In experiments with the various models, VGGNet showed the best performance with 99.03% accuracy and 29.05 ms processing time, and we confirmed that VGGNet is suitable for determining outlet blockage.

Convolution Neural Network Based Auto Classification Model Using Endoscopic Images of Gastric Cancer and Gastric Ulcer (내시경의 위암과 위궤양 영상을 이용한 합성곱 신경망 기반의 자동 분류 모델)

  • Park, Ye Rang;Kim, Young Jae;Chung, Jun-Won;Kim, Kwang Gi
    • Journal of Biomedical Engineering Research / v.41 no.2 / pp.101-106 / 2020
  • Although benign gastric ulcers do not develop into gastric cancer, they resemble early gastric cancer and are difficult to distinguish from it. As a result, early gastric cancer may be mistaken for a gastric ulcer during diagnosis. Since gastric cancer does not have any special symptoms until discovered, it is important to detect gastric ulcers by early gastroscopy to prevent gastric cancer. Therefore, we developed a Convolution Neural Network (CNN) model that can be helpful for endoscopy. This study used 3,015 gastroscopy images of patients undergoing endoscopy at Gachon University Gil Hospital. Using ResNet-50, three models were developed to classify normal versus gastric ulcer, normal versus gastric cancer, and gastric ulcer versus gastric cancer. We applied data augmentation to increase the amount of training data and examined the effect on accuracy by varying the augmentation multiple. The highest accuracy of each model was as follows: the normal versus gastric ulcer model reached 95.11% when the data were increased 15 times, the normal versus gastric cancer model reached 98.28%, also with a 15-fold increase, and the gastric ulcer versus gastric cancer model reached 87.89% with a 5-fold increase. We will collect additional data on specific shapes of gastric ulcer and cancer and will apply various image processing techniques for visual enhancement. The models that classify normal versus lesion, which showed relatively high accuracy, will be retrained through optimal parameter search.

An Efficient Approximation method of Adaptive Support-Weight Matching in Stereo Images (스테레오 영상에서의 적응적 영역 가중치 매칭의 효율적 근사화 방법)

  • Kim, Ho-Young;Lee, Seong-Won
    • Journal of Broadcast Engineering / v.16 no.6 / pp.902-915 / 2011
  • Recently, in the area-based stereo matching field, the Adaptive Support-Weight (ASW) method, which weights the matching cost adaptively according to luminance intensity and geometric difference, has shown promising matching performance. However, ASW requires more computation than other matching algorithms, which makes real-time implementation impractical. By approximating ASW with the bilateral filter equation and then applying the Integral Histogram technique, the computation time of ASW can be held constant regardless of the support window size. However, the Integral Histogram technique loses matching accuracy in the process of approximating the original ASW equation. In this paper, we propose a novel algorithm that maintains the matching accuracy of the ASW algorithm while reducing the computational cost. The proposed algorithm uses a Sub-Block method that groups the pixels within the support area. We also propose a method that adjusts the disparity search range depending on edge information. The proposed technique reduces the calculation time efficiently while improving the matching accuracy.
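
The constant-time property mentioned in this abstract can be illustrated with a summed-area table (integral image), the single-channel structure that an integral histogram generalizes by keeping one such table per histogram bin. The sketch below is not the paper's ASW approximation or its Sub-Block method. It only shows why a window sum costs four lookups regardless of window size. The names `integral_image` and `window_sum` are hypothetical.

```python
from typing import List

def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Summed-area table with an extra zero row and column on the top/left."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            # Everything above this row plus the running sum of this row.
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat

def window_sum(sat: List[List[int]], top: int, left: int,
               bottom: int, right: int) -> int:
    """Sum over rows top..bottom-1 and columns left..right-1, in four lookups."""
    return (sat[bottom][right] - sat[top][right]
            - sat[bottom][left] + sat[top][left])
```

Building the table is a single pass over the image. After that, each support-window query is independent of the window size, which is the property the abstract relies on.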

Multi-View Wyner-Ziv Video Coding Based on Spatio-temporal Adaptive Estimation (시공간 적응적인 예측에 기초한 다시점 위너-지브 비디오 부호화 기법)

  • Lee, Beom-yong;Kim, Jin-soo
    • The Journal of the Korea Contents Association / v.16 no.6 / pp.9-18 / 2016
  • This paper proposes a multi-view Wyner-Ziv video coding scheme based on spatio-temporal adaptive estimation. The proposed algorithm is designed to search for a better estimated block with joint bi-directional motion estimation by introducing weights between the temporal and spatial directions, by effectively classifying region-of-interest blocks based on edge detection and synthesis, and by selecting the reference estimation block through effective motion vector analysis. The proposed algorithm simultaneously exploits the information of a single frame viewpoint and adjacent frame viewpoints, and then adaptively generates side information in a variety of closure and reflection regions to achieve better performance. Simulations with several multi-view video sequences show that the proposed algorithm improves visual quality and reduces bit rate compared to conventional methods.

A Categorization Scheme of Tag-based Folksonomy Images for Efficient Image Retrieval (효과적인 이미지 검색을 위한 태그 기반의 폭소노미 이미지 카테고리화 기법)

  • Ha, Eunji;Kim, Yongsung;Hwang, Eenjun
    • KIISE Transactions on Computing Practices / v.22 no.6 / pp.290-295 / 2016
  • Recently, folksonomy-based image-sharing sites, where users cooperatively create and use tags to annotate images, have been gaining popularity. Typically, these sites retrieve images for a user request using simple text-based matching and display the retrieved images as a photo stream. However, the tags are personal and subjective and the images are not categorized, which results in poor retrieval accuracy and low user satisfaction. In this paper, we propose a categorization scheme for folksonomy images that can improve retrieval accuracy in tag-based image retrieval systems. In this scheme, images are classified by semantic similarity using the text information and image information generated on the folksonomy. To evaluate the performance of the proposed scheme, we collect folksonomy images and categorize them using text features and image features. We then compare its retrieval accuracy with that of existing systems.

A Study on Frame Interpolation and Nonlinear Moving Vector Estimation Using GRNN (GRNN 알고리즘을 이용한 비선형적 움직임 벡터 추정 및 프레임 보간연구)

  • Lee, Seung-Joo;Bang, Min-Suk;Yun, Kee-Bang;Kim, Ki-Doo
    • Journal of IKEEE / v.17 no.4 / pp.459-468 / 2013
  • To handle nonlinear characteristics of frames, we propose a frame interpolation method using GRNN to enhance visual picture quality. Using full search with block sizes from 128x128 down to 1x1 to reduce blocky artifacts and image overlay, we select the frame containing the block with minimum error and re-estimate the nonlinear moving vector using GRNN. We compare our scheme with forward (backward) motion compensation and bidirectional motion compensation when the object movement is large, when the object image includes zoom-in and zoom-out, or when the camera focus has changed. Experimental results show that the proposed method provides better subjective image quality than conventional MCFI methods.
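
The full-search step named in this abstract can be sketched as exhaustive block matching. The example below assumes a sum of absolute differences (SAD) error measure, a single fixed block size, and a square search window. None of these choices is stated in the abstract, and the GRNN re-estimation step is not shown. The names `sad`, `get_block`, and `full_search` are hypothetical.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]

def sad(block_a: Frame, block_b: Frame) -> int:
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def get_block(frame: Frame, top: int, left: int, size: int) -> Frame:
    """Square block of side `size` whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def full_search(cur: Frame, ref: Frame, top: int, left: int,
                size: int, radius: int) -> Optional[Tuple[int, int, int]]:
    """Best (dy, dx, cost) in `ref` for the block of `cur` at (top, left)."""
    h, w = len(ref), len(ref[0])
    target = get_block(cur, top, left, size)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            t, l = top + dy, left + dx
            # Skip candidates that fall partly outside the reference frame.
            if t < 0 or l < 0 or t + size > h or l + size > w:
                continue
            cost = sad(target, get_block(ref, t, l, size))
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best
```

Calling `full_search` once per block size would give one candidate vector per size, from which the minimum-error block could be chosen, in the spirit of the selection step the abstract describes.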

Adaptive Frame Rate Up-Conversion Algorithm using the Neighbouring Pixel Information and Bilateral Motion Estimation (이웃하는 블록 정보와 양방향 움직임 예측을 이용한 적응적 프레임 보간 기법)

  • Oh, Hyeong-Chul;Lee, Joo-Hyun;Min, Chang-Ki;Jeong, Je-Chang
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.9C / pp.761-770 / 2010
  • In this paper, we propose a new Frame Rate Up-Conversion (FRUC) scheme that increases the frame rate and enhances the decoded video quality at the decoder. The proposed algorithm uses preliminary frames in the forward and backward directions obtained by bilateral prediction. While processing the preliminary frames, an additional interpolation is performed for the occlusion area: if the value calculated between a block and the reference frame is larger than the predetermined threshold, the block is selected as part of the occlusion area. To interpolate the occlusion area, we perform a re-search to obtain the optimal block, considering the number of available neighboring blocks. The experimental results show that the proposed algorithm achieves better PSNR and visual quality than the conventional methods.
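
PSNR, the objective metric cited in this abstract, can be computed as in the sketch below. It uses the standard definition, 10·log10(peak² / MSE), and assumes 8-bit frames (peak value 255) stored as nested lists. The abstract does not say how the authors computed it, and the function name `psnr` is hypothetical.

```python
import math
from typing import List

def psnr(reference: List[List[int]], test: List[List[int]],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for a, b in zip(ref_row, test_row):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)
```

In an FRUC evaluation of this kind, `reference` would typically be an original frame that was withheld and `test` the interpolated frame that replaces it.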