• Title/Summary/Keyword: Module Extraction

A robust Correlation Filter based tracker with rich representation and a relocation component

  • Jin, Menglei;Liu, Weibin;Xing, Weiwei
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.10 / pp.5161-5178 / 2019
  • Correlation Filter was recently demonstrated to have good characteristics in the field of video object tracking. The advantages of Correlation Filter based trackers are reflected in the high accuracy and robustness they provide while maintaining a high speed. However, some improvements are still necessary. First, most trackers cannot handle multi-scale problems. To solve this problem, our algorithm combines position estimation with scale estimation. Unlike traditional scale estimation methods, the proposed method tracks the scale of the object more quickly and effectively. Additionally, in the feature extraction module, the feature representation of traditional algorithms is relatively simple, so tracking performance is easily degraded in complex scenarios. In this paper, we design a novel and powerful feature that can significantly improve the tracking performance. Finally, traditional trackers often suffer from model drift, which is caused by occlusion and other complex scenarios. We introduce a relocation component to detect the object at other locations, such as the secondary peak of the response map, which partly alleviates the model drift problem.
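The abstract does not say how the relocation component finds the secondary peak, so the following is only a minimal sketch of one plausible approach: take the global maximum of the response map as the main peak, mask a small neighborhood around it, and treat the highest remaining value as the secondary peak. The function name `find_peaks` and the `exclude_radius` parameter are hypothetical and do not come from the paper.

```python
def find_peaks(response, exclude_radius=1):
    """Return (main, secondary) peaks of a 2D response map.

    Each peak is ((row, col), value). `exclude_radius` is an assumed
    parameter: cells within this Chebyshev distance of the main peak are
    ignored when searching for the secondary peak.
    """
    cells = [((r, c), v)
             for r, row in enumerate(response)
             for c, v in enumerate(row)]
    main_pos, main_val = max(cells, key=lambda item: item[1])

    # Keep only cells outside the neighborhood of the main peak.
    outside = [
        (pos, v) for pos, v in cells
        if max(abs(pos[0] - main_pos[0]), abs(pos[1] - main_pos[1])) > exclude_radius
    ]
    secondary = max(outside, key=lambda item: item[1]) if outside else None
    return (main_pos, main_val), secondary


# Toy response map with a strong peak at (1, 1) and a weaker one at (2, 3).
response_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.1],
    [0.1, 0.2, 0.1, 0.6],
    [0.0, 0.1, 0.3, 0.2],
]
main_peak, secondary_peak = find_peaks(response_map)
```

A tracker could then re-evaluate the target at `secondary_peak` when the main peak looks unreliable, for example after occlusion; how that decision is made is not described in the abstract.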

Interface Technique for Optimization of Free-form Structural System (구조 최적화를 위한 비정형 구조시스템의 인터페이스 기법)

  • Na, Yoo-Mi;Lee, Jae-Hong;Kang, Joo-Won
    • Journal of Korean Association for Spatial Structures / v.12 no.1 / pp.43-50 / 2012
  • Recently, owing to advances in computer technology, monumental architecture has been designed and built with very sophisticated features. Interest in free-form structural systems has increased steadily, not only nationwide but also worldwide. However, there have been many difficulties in realizing free-form structural systems owing to a lack of techniques and research. To solve this problem, this study develops an interface between a 3D modeling program and an optimization program. The 3D modeling program allows automatic mesh generation and immediate information extraction. The shape optimization is then performed. Finally, an example model is designed and optimized in order to verify the developed interface module.

Gradation Image Processing for Text Recognition in Road Signs Using Image Division and Merging

  • Chong, Kyusoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.13 no.2 / pp.27-33 / 2014
  • This paper proposes a gradation image processing method for the development of a Road Sign Recognition Platform (RReP), which aims to facilitate the rapid and accurate management and surveying of approximately 160,000 road signs installed along the highways, national roadways, and local roads in the cities, districts (gun), and provinces (do) of Korea. RReP is based on GPS (Global Positioning System), IMU (Inertial Measurement Unit), INS (Inertial Navigation System), DMI (Distance Measurement Instrument), and lasers. It uses an imagery information collection/classification module to enable the automatic recognition of signs, the collection of shape, pole location, and sign-type data, and the creation of road sign registers, by extracting basic data related to sign shape and content and by automating database design. Image division and merging, which were applied in this study, produce superior results compared with the local binarization method in terms of speed. The results show that larger text areas were found in the images and that the accuracy of text recognition improved when the images had been gradated. Multi-threshold values of natural scene images are used to improve the extraction rate of texts and figures based on pattern recognition.

A Comparative Study on OCR using Super-Resolution for Small Fonts

  • Cho, Wooyeong;Kwon, Juwon;Kwon, Soonchu;Yoo, Jisang
    • International journal of advanced smart convergence / v.8 no.3 / pp.95-101 / 2019
  • Recently, there have been many issues related to text recognition using Tesseract. One of these issues is that the text recognition accuracy is significantly lower for smaller fonts. Tesseract extracts text by creating directed outlines in the image. It then searches the Tesseract database and uses template matching against characters with similar feature points to select the character with the lowest error. Poor text extraction therefore lowers the recognition accuracy. In this paper, we compared text recognition accuracy after applying various super-resolution methods to smaller text images and examined how the recognition accuracy varies with image size. In order to recognize small Korean text images, we used super-resolution algorithms based on deep learning models such as SRCNN, ESRCNN, DSRCNN, and DCSCN. The dataset for training and testing consisted of Korean-based scanned images. The images, with a 12pt font size, were resized by factors of 0.5 to 0.8. The experiment was performed on the x0.5 resized images, and the results showed that DCSCN super-resolution was the most effective method, reducing the precision error rate by 7.8% and the recall error rate by 8.4%. The experimental results demonstrate that the accuracy of text recognition for smaller Korean fonts can be improved by adding super-resolution methods to the OCR preprocessing module.

Activity Object Detection Based on Improved Faster R-CNN

  • Zhang, Ning;Feng, Yiran;Lee, Eung-Joo
    • Journal of Korea Multimedia Society / v.24 no.3 / pp.416-422 / 2021
  • Because of the large differences in human activity within classes, the large similarity between classes, and the problems of viewing angle and occlusion, it is difficult to extract features manually, and the detection rate of human behavior is low. To better solve these problems, an improved Faster R-CNN-based detection algorithm is proposed in this paper. It achieves multi-object recognition and localization through a second-order detection network, and replaces the original feature extraction module with Dense-Net, which can fuse multi-level feature information, increase network depth, and avoid vanishing gradients. Meanwhile, the proposal merging strategy is improved with Soft-NMS, in which an attenuation function replaces the conventional NMS algorithm, thereby avoiding missed detections of adjacent or overlapping objects and enhancing detection accuracy for multiple objects. In the experiments, the improved Faster R-CNN method achieves a target detection result of 84.7%, an improvement over other methods, which shows that the target recognition method has significant advantages and potential.
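The Soft-NMS idea mentioned above can be sketched as follows. The abstract does not give the attenuation function the authors designed, so this sketch assumes the commonly used Gaussian decay `exp(-IoU^2 / sigma)`; the names `iou` and `soft_nms` and the `sigma` and `score_threshold` values are illustrative assumptions.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of discarding boxes."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the highest-scoring box still under consideration.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        best_box, best_score = remaining.pop(best_idx)
        kept.append((best_box, best_score))

        # Attenuate the others according to their overlap with the selected box.
        decayed = []
        for box, score in remaining:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_threshold:
                decayed.append((box, score))
        remaining = decayed
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

In this toy input, the second box overlaps the first heavily, so its score is reduced rather than removed outright, which is how Soft-NMS avoids dropping adjacent or overlapping objects.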

Bird's Eye View Semantic Segmentation based on Improved Transformer for Automatic Annotation

  • Tianjiao Liang;Weiguo Pan;Hong Bao;Xinyue Fan;Han Li
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.8 / pp.1996-2015 / 2023
  • High-definition (HD) maps can provide precise road information that enables an autonomous driving system to effectively navigate a vehicle. Recent research has focused on leveraging semantic segmentation to achieve automatic annotation of HD maps. However, the existing methods suffer from low recognition accuracy in autonomous driving scenarios, leading to inefficient annotation processes. In this paper, we propose a novel semantic segmentation method for automatic HD map annotation. Our approach introduces a new encoder, known as the convolutional transformer hybrid encoder, to enhance the model's feature extraction capabilities. Additionally, we propose a multi-level fusion module that enables the model to aggregate different levels of detail and semantic information. Furthermore, we present a novel decoupled boundary joint decoder to improve the model's ability to handle the boundary between categories. To evaluate our method, we conducted experiments using the Bird's Eye View point cloud images dataset and the Cityscapes dataset. Comparative analysis against state-of-the-art methods demonstrates that our model achieves the highest performance. Specifically, our model achieves an mIoU of 56.26%, surpassing SegFormer by 1.47% in mIoU. This innovation promises to significantly enhance the efficiency of automatic HD map annotation.
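For readers unfamiliar with the reported metric, mean intersection over union (mIoU) averages the per-class IoU between predicted and ground-truth labels. The sketch below uses the standard definition on flat label lists; the function name `mean_iou` is illustrative, and skipping classes absent from both prediction and ground truth is an assumed convention that the paper may not share.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    ious = []
    for cls in range(num_classes):
        # Pixels labeled `cls` in both, and pixels labeled `cls` in either.
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union:  # skip classes absent from both (assumed convention)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5.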

A Framework for Facial Expression Recognition Combining Contextual Information and Attention Mechanism

  • Jianzeng Chen;Ningning Chen
    • Journal of Information Processing Systems / v.20 no.4 / pp.535-549 / 2024
  • Facial expressions (FEs) serve as fundamental components for human emotion assessment and human-computer interaction. Traditional convolutional neural networks tend to overlook valuable information during the FE feature extraction, resulting in suboptimal recognition rates. To address this problem, we propose a deep learning framework that incorporates hierarchical feature fusion, contextual data, and an attention mechanism for precise FE recognition. In our approach, we leveraged an enhanced VGGNet16 as the backbone network and introduced an improved group convolutional channel attention (GCCA) module in each block to emphasize the crucial expression features. A partial decoder was added at the end of the backbone network to facilitate the fusion of multilevel features for a comprehensive feature map. A reverse attention mechanism guides the model to refine details layer-by-layer while introducing contextual information and extracting richer expression features. To enhance feature distinguishability, we employed islanding loss in combination with softmax loss, creating a joint loss function. Using two open datasets, our experimental results demonstrated the effectiveness of our framework. Our framework achieved an average accuracy rate of 74.08% on the FER2013 dataset and 98.66% on the CK+ dataset, outperforming advanced methods in both recognition accuracy and stability.

ACMs-based Human Shape Extraction and Tracking System for Human Identification (개인 인증을 위한 활성 윤곽선 모델 기반의 사람 외형 추출 및 추적 시스템)

  • Park, Se-Hyun;Kwon, Kyung-Su;Kim, Eun-Yi;Kim, Hang-Joon
    • Journal of Korea Society of Industrial Information Systems / v.12 no.5 / pp.39-46 / 2007
  • Research on human identification in ubiquitous environments has recently attracted a lot of attention. As one such line of research, gait recognition is an efficient method of human identification that uses the physical features of a walking person at a distance. In this paper, we present a human shape extraction and tracking system for gait recognition using geodesic active contour models (GACMs) combined with the mean shift algorithm. Active contour models (ACMs) are very effective for dealing with non-rigid objects because of their elastic property. However, they have the limitation that their performance depends mainly on the initial curve. To overcome this problem, we combine the mean shift algorithm with the traditional GACMs. The main idea is very simple. Before evolution with the level set method, the initial curve in each frame is re-localized near the human region and resized to include the target region. This mechanism reduces the number of iterations and handles large object motion. The proposed system is composed of human region detection and human shape tracking modules. In the human region detection module, the silhouette of a walking person is extracted by background subtraction and morphological operations. Then the human shape is correctly obtained by the GACMs with the mean shift algorithm. Experimental results show that the proposed method efficiently extracts and tracks accurate shapes for gait recognition.

Product Evaluation Criteria Extraction through Online Review Analysis: Using LDA and k-Nearest Neighbor Approach (온라인 리뷰 분석을 통한 상품 평가 기준 추출: LDA 및 k-최근접 이웃 접근법을 활용하여)

  • Lee, Ji Hyeon;Jung, Sang Hyung;Kim, Jun Ho;Min, Eun Joo;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.97-117 / 2020
  • Product evaluation criteria are indicators describing the attributes or values of products, which enable users or manufacturers to measure and understand the products. When companies analyze their products or compare them with competitors' products, appropriate criteria must be selected for objective evaluation. The criteria should reflect the product features that consumers considered when they purchased, used, and evaluated the products. However, current evaluation criteria do not reflect how consumer opinions differ from product to product. Previous studies tried to use online reviews from e-commerce sites, which reflect consumer opinions, to extract the features and topics of products and use them as evaluation criteria. However, these studies are limited in that they produce criteria irrelevant to the products because extracted or improper words are not refined. To overcome this limitation, this research suggests an LDA-k-NN model, which extracts possible criteria words from online reviews using LDA and refines them with k-nearest neighbor. The proposed approach starts with a preparation phase consisting of six steps. First, it collects review data from e-commerce websites. Most e-commerce websites classify their items into high-level, middle-level, and low-level categories. Review data for the preparation phase are gathered from each middle-level category and later merged to represent a single high-level category. Next, nouns, adjectives, adverbs, and verbs are extracted from the reviews by obtaining part-of-speech information with a morpheme analysis module. After preprocessing, LDA produces the words for each topic in the reviews, and only the nouns among the topic words are chosen as potential criteria words. Then, the words are tagged according to whether they can serve as criteria for each middle-level category. Next, every tagged word is vectorized by a pre-trained word embedding model. Finally, a k-nearest neighbor case-based approach is used to classify each word with tags.
After the preparation phase is set up, the criteria extraction phase is conducted on low-level categories. This phase starts with crawling reviews in the corresponding low-level category. The same preprocessing as in the preparation phase is conducted using the morpheme analysis module and LDA. Possible criteria words are extracted by taking nouns from the data and are vectorized by the pre-trained word embedding model. Finally, evaluation criteria are extracted by refining the possible criteria words using the k-nearest neighbor approach and the reference proportion of each word in the word set. To evaluate the performance of the proposed model, an experiment was conducted with reviews from '11st', one of the biggest e-commerce companies in Korea. The review data came from the 'Electronics/Digital' section, one of the high-level categories in 11st. For the performance evaluation, the suggested model was compared with three alternatives: the actual criteria of 11st, a model that extracts nouns with the morpheme analysis module and refines them by word frequency, and a model that extracts nouns from LDA topics and refines them by word frequency. The evaluation task was to predict the evaluation criteria of 10 low-level categories with the suggested model and the three alternatives above. The criteria words extracted by each model were combined into a single word set, which was used for survey questionnaires. In the survey, respondents chose every item they considered an appropriate criterion for each category. Each model scored when the chosen words had been extracted by that model. The suggested model had higher scores than the other models in 8 out of 10 low-level categories. Paired t-tests on the scores of each model confirmed that the suggested model performs better in 26 tests out of 30. In addition, the suggested model was the best model in terms of accuracy.
This research proposes an evaluation criteria extraction method that combines topic extraction using LDA with refinement by the k-nearest neighbor approach. This method overcomes the limits of previous dictionary-based models and frequency-based refinement models. This study can contribute to improving review analysis for deriving business insights in the e-commerce market.
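The k-nearest neighbor refinement step described in this abstract can be illustrated with a small sketch. It assumes that candidate words already have embedding vectors and that a set of tagged reference words exists from the preparation phase; the two-dimensional toy vectors, the tag names, the cosine similarity measure, and the function names are all hypothetical. The LDA step and the "reference proportion" filter mentioned in the abstract are not reproduced here.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_label(vector, tagged, k=3):
    """Majority tag among the k tagged vectors most similar to `vector`."""
    nearest = sorted(tagged, key=lambda item: cosine(vector, item[0]), reverse=True)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def refine(candidates, tagged, k=3):
    """Keep only candidate words whose k-NN tag is 'criterion'."""
    return [word for word, vec in candidates.items()
            if knn_label(vec, tagged, k) == "criterion"]


# Toy tagged words from a hypothetical preparation phase: (vector, tag).
tagged_words = [
    ((1.0, 0.1), "criterion"), ((0.9, 0.2), "criterion"), ((0.8, 0.0), "criterion"),
    ((0.1, 1.0), "other"), ((0.0, 0.9), "other"), ((0.2, 0.8), "other"),
]
# Toy candidate nouns with hypothetical embedding vectors.
candidate_words = {"battery": (0.95, 0.1), "delivery": (0.1, 0.95)}
criteria = refine(candidate_words, tagged_words)
```

With these toy vectors, "battery" sits near the words tagged as criteria and is kept, while "delivery" is filtered out.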

Development of Mean Stand Height Module Using Image-Based Point Cloud and FUSION S/W (영상 기반 3차원 점군과 FUSION S/W 기반의 임분고 분석 모듈 개발)

  • KIM, Kyoung-Min
    • Journal of the Korean Association of Geographic Information Studies / v.19 no.4 / pp.169-185 / 2016
  • Recently, mean stand height has been added as a new attribute to forest type maps, but it is often too costly and time consuming to manually measure 9,100,000 points from countrywide stereo aerial photos. In addition, tree heights are frequently measured around tombs and forest edges, which are poor representations of the interior tree stand. This work proposes an estimation of mean stand height using an image-based point cloud, which was extracted from stereo aerial photos with FUSION S/W. A digital terrain model (DTM) was created by filtering the DSM point cloud, and the DTM was then subtracted from the DSM, resulting in an nDSM, which represents object heights (buildings, trees, etc.). The RMSE was calculated to compare the observed tree heights with those extracted from the nDSM. The resulting RMSE of average total plot height was 0.96 m. Individual tree heights across the whole study site were extracted using the USDA Forest Service's FUSION S/W. Finally, mean stand height was produced by averaging individual tree heights in a stand polygon of the forest type map. In order to automate the mean stand height extraction using photogrammetric methods, a module was developed as an ArcGIS add-in toolbox.
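The arithmetic behind this workflow can be sketched in a few lines. The example assumes the DSM and DTM are already available as aligned grids of elevations in meters; the function names and toy values are illustrative, and the sketch does not call FUSION or ArcGIS.

```python
import math


def ndsm(dsm, dtm):
    """Normalized DSM: per-cell object height, computed as DSM minus DTM."""
    return [[s - t for s, t in zip(s_row, t_row)] for s_row, t_row in zip(dsm, dtm)]


def rmse(observed, estimated):
    """Root mean square error between observed and estimated heights."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def mean_stand_height(tree_heights):
    """Average of the individual tree heights inside one stand polygon."""
    return sum(tree_heights) / len(tree_heights)


# Toy 2x2 grids of surface and terrain elevations (meters).
dsm_grid = [[110.0, 112.0], [108.0, 105.0]]
dtm_grid = [[100.0, 100.0], [100.0, 105.0]]
heights = ndsm(dsm_grid, dtm_grid)

# Toy field-measured versus nDSM-derived tree heights (meters).
error = rmse([10.0, 12.0, 8.0], [11.0, 11.0, 8.0])
stand_height = mean_stand_height([10.0, 12.0, 8.0])
```

In the toy grids, the cell where DSM equals DTM has a height of zero, meaning bare ground with no object above it.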