• Title/Abstract/Keywords: Vision recognition


Image Recognition Using Colored-hear Transformation Based On Human Synesthesia

  • 신성윤;문형윤;표성배
    • 한국컴퓨터정보학회논문지 / Vol. 13, No. 2 / pp.135-141 / 2008
  • In this paper, we propose colored-hearing recognition, which draws on the human synesthetic trait in which visual perception is accompanied by auditory sensation. The point is that visual analysis through a camera can contribute to structured object recognition of the kind humans perform, so we have studied ways of allowing visually impaired people to perceive something close to actual objects. First, object boundaries are extracted from image data representing a particular scene. Next, four features are extracted from the image: the position of each object, the average color sensibility, the distance information of each object, and the extent of each object region. These features are then mapped to auditory elements, which are delivered to the visually impaired as a form of visual recognition. The proposed colored-hearing transformation system provides faster and more detailed perceptual information while simultaneously providing sensory information, so applying this concept to image recognition for the visually impaired can yield better results.
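The feature-to-auditory mapping described in the abstract can be sketched as follows. The specific assignments (average color to pitch, distance to volume, region area to duration, horizontal position to stereo pan) and all numeric ranges are hypothetical illustrations, not the paper's actual mapping.

```python
def features_to_sound(position, avg_color, distance, area):
    """Map the four visual features to auditory parameters.

    Hypothetical mapping for illustration only:
    hue/brightness -> pitch, distance -> volume,
    region area -> duration, horizontal position -> stereo pan.
    """
    pitch = 220.0 + (avg_color / 255.0) * 660.0       # Hz, 220-880
    volume = max(0.0, 1.0 - distance / 10.0)          # nearer -> louder
    duration = 0.2 + min(area / 10000.0, 1.0) * 0.8   # seconds
    pan = (position - 0.5) * 2.0                      # -1 (left) .. +1 (right)
    return pitch, volume, duration, pan

# A bright, centered, nearby object with negligible area:
pitch, volume, duration, pan = features_to_sound(
    position=0.5, avg_color=255, distance=0.0, area=0)
```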


Vision-Based Activity Recognition Monitoring Based on Human-Object Interaction at Construction Sites

  • Chae, Yeon;Lee, Hoonyong;Ahn, Changbum R.;Jung, Minhyuk;Park, Moonseo
    • 국제학술발표논문집 / The 9th International Conference on Construction Engineering and Project Management / pp.877-885 / 2022
  • Vision-based activity recognition has been widely attempted at construction sites to estimate productivity and enhance workers' health and safety. Previous studies have focused on extracting an individual worker's postural information from sequential image frames for activity recognition. However, workers of various trades perform different tasks with similar postural patterns, which degrades the performance of activity recognition based on postural information alone. To this end, this research exploited the concept of human-object interaction, the interaction between a worker and their surrounding objects, based on the fact that trade workers interact with specific objects (e.g., working tools or construction materials) relevant to their trades. This research developed an approach to understand the context of sequential image frames from four features: a posture feature, an object feature, a spatial feature, and a temporal feature. The posture and object features were used to analyze the interaction between the worker and the target object, and the other two features were used to detect movements over the entire region of the image frames in the temporal and spatial domains. The developed approach used convolutional neural networks (CNNs) as feature extractors and activity classifiers, and long short-term memory (LSTM) was also used as an activity classifier. The developed approach provided an average accuracy of 85.96% for classifying 12 target construction tasks performed by workers of two trades, which was higher than two benchmark models. This experimental result indicates that integrating the concept of human-object interaction offers great benefits in activity recognition when workers of various trades coexist in a scene.


Human Action Recognition Using Pyramid Histograms of Oriented Gradients and Collaborative Multi-task Learning

  • Gao, Zan;Zhang, Hua;Liu, An-An;Xue, Yan-Bing;Xu, Guang-Ping
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 8, No. 2 / pp.483-503 / 2014
  • In this paper, human action recognition using pyramid histograms of oriented gradients and collaborative multi-task learning is proposed. First, we accumulate global activities and construct a motion history image (MHI) for the RGB and depth channels respectively to encode the dynamics of an action in different modalities, and then different action descriptors are extracted from the depth and RGB MHIs to represent the global textural and structural characteristics of these actions. Specifically, average value in hierarchical blocks, GIST, and pyramid histograms of oriented gradients descriptors are employed to represent human motion. To demonstrate the superiority of the proposed method, we evaluate them with KNN, SVM with linear and RBF kernels, SRC, and CRC models on the DHA dataset, a well-known dataset for human action recognition. Large-scale experimental results show that our descriptors are robust, stable, and efficient, and outperform the state-of-the-art methods. In addition, we investigate the performance of our descriptors further by combining them on the DHA dataset, and observe that the performance of the combined descriptors is much better than that of any single descriptor alone. With multimodal features, we also propose a collaborative multi-task learning method for model learning and inference based on transfer learning theory. The main contributions lie in four aspects: 1) the proposed encoding scheme can filter out the stationary parts of the human body and reduce noise interference; 2) different kinds of features and models are assessed, and the neighbor gradient information and pyramid layers are very helpful for representing these actions; 3) the proposed model can fuse features from different modalities regardless of the sensor types, the ranges of the values, and the dimensions of the features; 4) the latent common knowledge among different modalities can be discovered by transfer learning to boost the performance.
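The motion history image construction mentioned above can be sketched with the standard MHI update rule: pixels with motion are set to a maximum duration, and pixels without motion decay toward zero. The duration `tau` and difference threshold `xi` below are assumed values for illustration, not the paper's settings.

```python
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=10, xi=15):
    """Update a motion history image (MHI).

    Pixels where the absolute frame difference exceeds the threshold
    `xi` are set to the maximum duration `tau`; elsewhere the stored
    history decays by 1 toward zero.
    """
    motion = np.abs(frame.astype(int) - prev_frame.astype(int)) > xi
    decayed = np.maximum(mhi - 1, 0)
    return np.where(motion, tau, decayed)

# Toy 4x4 grayscale frames with one bright pixel moving to the right.
f0 = np.zeros((4, 4), dtype=np.uint8)
f1 = f0.copy(); f1[1, 1] = 200
f2 = f0.copy(); f2[1, 2] = 200

mhi = np.zeros((4, 4), dtype=int)
mhi = update_mhi(mhi, f0, f1)   # motion appears at (1, 1)
mhi = update_mhi(mhi, f1, f2)   # motion at (1, 1) and (1, 2)
mhi = update_mhi(mhi, f2, f2)   # no motion: history decays by 1
```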

Design of Optimized pRBFNNs-based Night Vision Face Recognition System Using PCA Algorithm

  • 오성권;장병희
    • 전자공학회논문지 / Vol. 50, No. 1 / pp.225-231 / 2013
  • This study designs an optimized pRBFNNs-based night-vision face recognition system using the PCA algorithm. Under ambient conditions without illumination, acquiring images with a CCD camera is difficult because of the low illuminance. In this paper, the quality of images distorted by low illuminance is improved using a night-vision camera and histogram equalization. The Ada-Boost algorithm is then used to detect face images, separating face from non-face image regions. Principal Component Analysis (PCA), a data dimensionality-reduction technique, is used to convert the extracted high-dimensional feature data into low-dimensional feature data. A pRBFNNs (Polynomial-based Radial Basis Function Neural Networks) pattern classifier is introduced as the recognition module. The proposed polynomial-based RBFNNs consist of three functional modules: the premise, consequence, and inference parts. The premise part partitions the input space using FCM (Fuzzy C-means) clustering, and the consequence part represents each partitioned local region with a polynomial function. The model parameters are optimized using the Differential Evolution (DE) algorithm.
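The PCA dimensionality-reduction step described above can be sketched in a few lines of NumPy: center the data, eigendecompose the covariance matrix, and project onto the leading components. The data shapes and component count are illustrative assumptions.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    mean = X.mean(axis=0)
    Xc = X - mean                          # center the data
    cov = np.cov(Xc, rowvar=False)         # covariance of the features
    vals, vecs = np.linalg.eigh(cov)       # eigh: ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]     # keep the k largest
    W = vecs[:, order]                     # projection matrix (d x k)
    return Xc @ W, W, mean

# e.g. 100 extracted face feature vectors of dimension 50, reduced to 10.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
Z, W, mu = pca_project(X, 10)
```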

Shape Based Framework for Recognition and Tracking of Texture-free Objects for Submerged Robots in Structured Underwater Environment

  • 한경민;최현택
    • 전자공학회논문지SC / Vol. 48, No. 6 / pp.91-98 / 2011
  • This paper proposes a camera-image-based method for recognizing and tracking artificial landmarks for underwater robots. The proposed method consists of two stages, recognition and tracking. In the recognition stage, features of an object's shape are analyzed and the object is classified as an appropriate target through a nonlinear optimization algorithm. In the subsequent tracking stage, a color histogram is extracted from the classified target and the target is continuously tracked using mean-shift tracking. Histogram matching is performed by computing the Bhattacharyya distance. The proposed approach is expected to contribute to image processing for underwater robots as follows: 1) the method copes robustly with changes in an object's pose or scale caused by camera motion; 2) because it relies on a camera sensor, it is more cost-competitive than devices such as ultrasonic sensors; 3) this paper also shows experimentally that commonly used feature-point-based methods can be inferior to shape-based methods under changes in turbidity; and 4) the performance of the proposed method is numerically verified against existing methods.
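The Bhattacharyya-distance histogram matching used in the tracking stage can be sketched as follows, using the common OpenCV-style definition d = sqrt(1 - BC), where BC is the Bhattacharyya coefficient of the two normalized histograms; the bin counts below are illustrative.

```python
import numpy as np

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two histograms (normalized here)."""
    p = p / p.sum()
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))           # Bhattacharyya coefficient
    return np.sqrt(max(0.0, 1.0 - bc))    # 0 = identical, 1 = disjoint

h1 = np.array([10.0, 20.0, 30.0, 40.0])  # target model histogram
h2 = np.array([10.0, 20.0, 30.0, 40.0])  # identical candidate
h3 = np.array([40.0, 30.0, 20.0, 10.0])  # dissimilar candidate

d_same = bhattacharyya_distance(h1, h2)  # identical histograms -> 0
d_diff = bhattacharyya_distance(h1, h3)  # dissimilar -> larger distance
```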

저전력 온디바이스 비전 SW 프레임워크 기술 동향 (Trends in Low-Power On-Device Vision SW Framework Technology)

  • 이문수;배수영;김정시;석종수
    • 전자통신동향분석 / Vol. 36, No. 2 / pp.56-64 / 2021
  • Many computer vision algorithms are computationally expensive and require substantial computing resources. Recently, owing to machine learning technology and high-performance embedded systems, vision processing applications such as object detection, face recognition, and visual inspection have come into wide use. However, on-device systems must handle demanding vision workloads with low power consumption in heterogeneous environments using only their own resources. Consequently, global manufacturers are trying to lock developers into their ecosystems by providing integrated low-power chips and dedicated vision libraries. The Khronos Group, an international standards organization, has released the OpenVX standard for high-performance, low-power vision processing on heterogeneous on-device systems. This paper describes vision libraries for embedded systems and presents the OpenVX standard along with related trends for on-device vision systems.

Gesture Recognition by Analyzing a Trajectory on Spatio-Temporal Space

  • 민병우;윤호섭;소정;에지마 도시야끼
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 26, No. 1 / pp.157-157 / 1999
  • Research on gesture recognition has become a very interesting topic in the computer vision area. Gesture recognition from visual images has a number of potential applications such as HCI (Human Computer Interaction), VR (Virtual Reality), and machine vision. To overcome the technical barriers in visual processing, conventional approaches have employed cumbersome devices such as datagloves or color-marked gloves. In this research, we capture gesture images without using external devices and generate a gesture trajectory composed of point-tokens. The trajectory is spotted using phase-based velocity constraints and recognized using a discrete left-right HMM. Input vectors to the HMM are obtained by applying the LBG clustering algorithm on a polar-coordinate space to which the point-tokens on the Cartesian space are converted. A gesture vocabulary is composed of twenty-two dynamic hand gestures for editing drawing elements. In our experiment, one hundred data per gesture were collected from twenty persons; fifty data were used for training and another fifty for the recognition experiment. The recognition result shows about a 95% recognition rate and also the possibility that these results can be applied to several potential systems operated by gestures. The developed system runs in real time for editing basic graphic primitives in a hardware environment of a Pentium Pro (200 MHz), a Matrox Meteor graphics board, and a CCD camera, with a Windows 95 and Visual C++ software environment.
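The Cartesian-to-polar conversion applied to the point-tokens before LBG clustering can be sketched as follows; the choice of reference origin (here the gesture start point, passed explicitly) is an assumption for illustration.

```python
import math

def to_polar(points, origin=(0.0, 0.0)):
    """Convert Cartesian point-tokens to (radius, angle) pairs
    relative to a reference origin, e.g. the gesture's start point."""
    ox, oy = origin
    polar = []
    for x, y in points:
        dx, dy = x - ox, y - oy
        r = math.hypot(dx, dy)        # radial distance
        theta = math.atan2(dy, dx)    # quadrant-aware angle in (-pi, pi]
        polar.append((r, theta))
    return polar

# A short counter-clockwise trajectory around the origin.
trajectory = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
polar = to_polar(trajectory)
```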

Robot vision system for face recognition using fuzzy inference from color-image

  • 이주신
    • 한국정보전자통신기술학회논문지 / Vol. 7, No. 2 / pp.106-110 / 2014
  • This paper proposes a face recognition method that can be applied effectively to a robot's vision system. The proposed algorithm performs recognition using color extraction from the face image and feature points. Color extraction uses the differences among skin color, pupil color, and lip color, and the feature information uses, as feature parameters, the distances between feature points extracted from the eyes, nose, and mouth, along with their distance ratios, angles, and differences in area. After generating membership functions with the feature parameters as fuzzified data, faces were recognized by evaluating similarity. Experiments with frontal color input images showed a recognition rate of 96%.
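The fuzzified similarity evaluation can be illustrated with a triangular membership function, a common choice in fuzzy inference; the feature interpretation (an eye-to-nose distance ratio) and all parameter values below are hypothetical, not taken from the paper.

```python
def triangular_membership(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

# Grade a measured feature (hypothetically, an eye-to-nose distance
# ratio) against a stored person's fuzzy set centered at 0.5.
mu_exact = triangular_membership(0.5, 0.3, 0.5, 0.7)  # at the peak
mu_near = triangular_membership(0.6, 0.3, 0.5, 0.7)   # partway down
mu_far = triangular_membership(0.9, 0.3, 0.5, 0.7)    # outside support
```

A full classifier would aggregate such membership grades over all feature parameters (e.g., by a minimum or weighted average) and pick the identity with the highest overall similarity.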

Recognition of Individual Holstein Cattle by Imaging Body Patterns

  • Kim, Hyeon T.;Choi, Hong L.;Lee, Dae W.;Yoon, Yong C.
    • Asian-Australasian Journal of Animal Sciences / Vol. 18, No. 8 / pp.1194-1198 / 2005
  • A computer vision system was designed and validated to recognize individual Holstein cattle by processing images of their body patterns. The system involves image capture, image pre-processing, algorithm processing, and an artificial neural network recognition algorithm. Optimum management of individuals is one of the most important factors in keeping cattle healthy and productive. In this study, an image-processing system was used to recognize individual Holstein cattle by identifying body-pattern images captured by a charge-coupled device (CCD). A recognition system was developed and applied to acquire images of 49 cattle. The pixel values of the body images were transformed into input data comprising binary signals for the neural network. Images of the 49 cattle were analyzed to train the input layer elements, and ten cattle were used to verify the output layer elements of the neural network using an individual recognition program. The system proved to be reliable for the individual recognition of cattle in natural light.
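The transformation of pixel values into binary input signals can be sketched with simple thresholding; the threshold value and image size are assumptions for illustration, not the paper's settings.

```python
import numpy as np

def to_binary_input(image, threshold=128):
    """Flatten a grayscale body-pattern image into a 0/1 input vector
    for a neural network: 1 where the pixel is at or above threshold."""
    return (image.flatten() >= threshold).astype(int)

# Toy 2x2 body-pattern patch: dark, white, light, dark pixels.
pattern = np.array([[0, 255],
                    [200, 50]], dtype=np.uint8)
vec = to_binary_input(pattern)
```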

Chinese-clinical-record Named Entity Recognition using IDCNN-BiLSTM-Highway Network

  • Tinglong Tang;Yunqiao Guo;Qixin Li;Mate Zhou;Wei Huang;Yirong Wu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 7 / pp.1759-1772 / 2023
  • Chinese named entity recognition (NER) is a challenging task that seeks to find, recognize, and classify various types of information elements in unstructured text. Because Chinese text has no natural word boundaries like the spaces in English text, Chinese named entity identification is much more difficult. At present, most deep-learning-based NER models are developed using a bidirectional long short-term memory network (BiLSTM), yet their performance still has room for improvement. To further improve performance on Chinese NER tasks, we propose a new NER model, IDCNN-BiLSTM-Highway, which is a combination of the BiLSTM, the iterated dilated convolutional neural network (IDCNN), and the highway network. In our model, IDCNN is used to achieve multiscale context aggregation over a long sequence of words. The highway network is used to effectively connect different layers of the network, allowing information to pass through the layers smoothly without attenuation. Finally, the globally optimal tag sequence is obtained by introducing a conditional random field (CRF). The experimental results show that, compared with other popular deep-learning-based NER models, our model shows superior performance on two Chinese NER data sets, Resume and Yidu-S4k, with F1-scores of 94.98 and 77.59, respectively.
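The highway layer that lets information pass through network layers without attenuation can be sketched in NumPy: a sigmoid transform gate T(x) mixes a nonlinear transform H(x) with the unchanged input, y = T(x)·H(x) + (1 - T(x))·x. The weight shapes and the strongly negative gate-bias initialization below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, W_h, b_h, W_t, b_t):
    """Highway layer: y = T(x) * H(x) + (1 - T(x)) * x."""
    h = np.tanh(x @ W_h + b_h)        # candidate transform H(x)
    t = sigmoid(x @ W_t + b_t)        # transform gate T(x) in (0, 1)
    return t * h + (1.0 - t) * x      # carry the rest of x unchanged

rng = np.random.default_rng(1)
d = 8
x = rng.normal(size=(1, d))
W_h = rng.normal(size=(d, d))
W_t = rng.normal(size=(d, d))
b_h = np.zeros(d)
b_t = np.full(d, -20.0)   # strongly negative gate bias -> gate near 0

# With the gate closed, the layer behaves like an identity connection,
# which is how highway networks avoid attenuating information.
y = highway_layer(x, W_h, b_h, W_t, b_t)
```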