Search | Korea Science

Deep Learning-based Gaze Direction Vector Estimation Network Integrated with Eye Landmark Localization (딥러닝 기반의 눈 랜드마크 위치 검출이 통합된 시선 방향 벡터 추정 네트워크)

Joo, Hee Young;Ko, Min Soo;Song, Hyok
- Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.180-182 / 2021
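This paper proposes a gaze estimation network in which eye landmark localization and gaze direction vector estimation are integrated into a single deep learning network. The proposed network uses the Stacked Hourglass Network[1] as its backbone and consists of three main parts: a landmark detector, a feature map extractor, and a gaze direction estimator. The landmark detector estimates the coordinates of 50 eye landmark points, and the feature map extractor generates a feature map of the eye image for gaze direction estimation. The gaze direction estimator then combines these outputs to estimate the final gaze direction vector. The proposed network was trained on synthetic eye images and landmark coordinate data generated with the UnityEyes[2] dataset, and its performance was evaluated on the MPIIGaze[3] dataset, which consists of real human eye images. In experiments, the gaze estimation error was 0.0396 MSE (mean squared error), and the network ran at 42 FPS (frames per second).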
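The error metric in this abstract can be illustrated with a minimal sketch. The abstract does not say how the gaze vectors are represented or how the MSE is averaged, so the function name `gaze_mse` and the choice to average over every vector component are assumptions made only for illustration.

```python
def gaze_mse(predicted, target):
    """Mean squared error over all components of paired gaze vectors.

    Assumption: each gaze direction is a tuple of floats and the error is
    averaged over every component of every vector.
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("need equally sized, non-empty lists of vectors")
    total, count = 0.0, 0
    for p, t in zip(predicted, target):
        for pc, tc in zip(p, t):
            total += (pc - tc) ** 2
            count += 1
    return total / count


# Hypothetical predictions and ground truth, not data from the paper.
print(gaze_mse([(0.1, 0.0, 0.9)], [(0.0, 0.0, 1.0)]))
```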

A Study on Skew Measurement Technique for the Crane Spreader using a Camera (카메라를 이용한 크레인 스프레더 스큐모션 계측기술에 관한 연구)

Kawai, H.;Kim, Y.B.;Choi, Y.W.
- Journal of Power System Engineering / v.14 no.4 / pp.76-81 / 2010
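This paper examines a technique for measuring the skew motion of a crane spreader using a camera. The measurement system consists of a camera mounted on the trolley and two landmarks mounted on the spreader. The authors previously proposed a landmark-based technique for detecting the sway and vertical position of the crane spreader and verified its usefulness experimentally. The skew-motion measurement technique builds on that method: the two landmarks are detected and the skew motion is measured by template matching. Skew motion must be measured by detecting the rotation angle of the spreader, so measurement accuracy and reliability depend on whether template matching succeeds. That is, if matching fails because a landmark has rotated, the correct rotation angle may not be detected. This paper therefore introduces a method that rotates the template according to the landmark rotation in order to improve the reliability of template matching and the measurement accuracy. Experiments confirmed that template matching never failed with the proposed method, and the measurement range is $\pm 12^{\circ}$, which is sufficient for monitoring and controlling the skew motion of a crane spreader.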
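Once the two landmarks have been located in the image, the rotation angle follows from simple geometry. The sketch below covers only that final step; it does not reproduce the paper's rotated-template matching. The function names, the pixel coordinates, and the sign convention are assumptions.

```python
import math


def skew_angle_deg(left, right, ref_left, ref_right):
    """Rotation in degrees of the line through two landmark centers,
    measured relative to a reference (zero-skew) pose.

    Assumption: points are (x, y) pixel coordinates; the sign of the
    result depends on the image axis convention.
    """
    def heading(a, b):
        return math.atan2(b[1] - a[1], b[0] - a[0])

    delta = heading(left, right) - heading(ref_left, ref_right)
    # Wrap into [-180, 180) degrees so small rotations stay small.
    delta = (delta + math.pi) % (2 * math.pi) - math.pi
    return math.degrees(delta)


def within_range(angle_deg, limit_deg=12.0):
    """Check an angle against the +/-12 degree range quoted in the abstract."""
    return abs(angle_deg) <= limit_deg


# Hypothetical landmark centers: reference pose, then a rotated pose.
print(skew_angle_deg((0, 0), (100, 10), (0, 0), (100, 0)))
```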

Research Trends in Deep Learning-Based Face Detection, Landmark Detection, and Face Recognition (딥러닝 기반 얼굴 검출, 랜드마크 검출 및 얼굴 인식 기술 연구 동향)

Hwang, Won-Jun
- Broadcasting and Media Magazine / v.22 no.4 / pp.41-49 / 2017
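This paper surveys research trends in face recognition based on deep learning methods such as the Convolutional Neural Network (CNN), which has recently attracted attention. Face recognition is an algorithm that automatically identifies who a person is from an input image, and it is broadly divided into face detection, facial landmark detection, and facial feature extraction. This paper reviews, one by one, the deep learning algorithms specialized for face detection, landmark detection, and facial feature extraction, and examines how they have evolved. In particular, deep learning-based face recognition algorithms initially progressed along lines similar to deep learning-based object recognition, but have recently been developing into deep learning architectures specialized for face recognition. The paper also considers which direction is more helpful for face recognition and which problems are actually being solved.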

Improved Anatomical Landmark Detection Using Attention Modules and Geometric Data Augmentation in X-ray Images (어텐션 모듈과 기하학적 데이터 증강을 통한 X-ray 영상 내 해부학적 랜드마크 검출 성능 향상)

Lee, Hyo-Jeong;Ma, Se-Rie;Choi, Jang-Hwan
- Journal of the Korea Computer Graphics Society / v.28 no.3 / pp.55-65 / 2022
Recently, deep learning-based automated systems for identifying and detecting landmarks have been proposed. Training such a model without overfitting requires a large amount of image and label data. Conventionally, an experienced reader manually identifies and labels landmarks in a patient's image. However, manual labeling is expensive and poorly reproducible, which has raised the need for an automated labeling method. In addition, because an X-ray image superimposes the various tissues along the path of the photons, landmarks are harder to identify than in a natural image or a 3D imaging modality. In this study, we propose a geometric data augmentation technique that enables the generation of a large amount of labeled data for X-ray images. We also implement and apply various attention techniques to identify the optimal attention mechanism for improving detection of 16 major landmarks in the skull. Finally, among the major cranial landmarks, we identify markers that can be detected stably; these markers are expected to have high potential for clinical application.
https://doi.org/10.15701/kcgs.2022.28.3.55

Deep Learning-based Gaze Direction Vector Estimation Network Integrated with Eye Landmark Localization (딥 러닝 기반의 눈 랜드마크 위치 검출이 통합된 시선 방향 벡터 추정 네트워크)

Joo, Heeyoung;Ko, Min-Soo;Song, Hyok
- Journal of Broadcast Engineering / v.26 no.6 / pp.748-757 / 2021
In this paper, we propose a gaze estimation network in which eye landmark localization and gaze direction vector estimation are integrated into a single deep learning network. The proposed network uses the Stacked Hourglass Network as its backbone and consists of three main parts: a landmark detector, a feature map extractor, and a gaze direction estimator. The landmark detector estimates the coordinates of 50 eye landmarks, and the feature map extractor generates a feature map of the eye image for gaze direction estimation. The gaze direction estimator then combines these outputs to estimate the final gaze direction vector. The proposed network was trained on synthetic eye images and landmark coordinate data generated with the UnityEyes dataset, and the MPIIGaze dataset, which consists of real human eye images, was used for performance evaluation. In experiments, the gaze estimation error was 3.9, and the estimation speed of the network was 42 FPS (frames per second).
https://doi.org/10.5909/JBE.2021.26.6.748

Mobile Camera-Based Positioning Method by Applying Landmark Corner Extraction (랜드마크 코너 추출을 적용한 모바일 카메라 기반 위치결정 기법)

Yoo Jin Lee;Wansang Yoon;Sooahm Rhee
- Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1309-1320 / 2023
With the development and spread of mobile devices, users can check their location and use the Internet almost anywhere. Indoors, however, the Internet works well but the global positioning system (GPS) is difficult to use. There is a growing need to provide real-time location information in areas without GPS reception, such as department stores, museums, conference halls, schools, and tunnels, which are indoor public places. Accordingly, research on indoor positioning that uses light detection and ranging (LiDAR) equipment to build a landmark database has been increasing. Focusing on making a landmark database easier to build, this study developed a technique for estimating the user's location from a single image of a landmark taken with a mobile device together with landmark database information constructed in advance. First, a landmark database was constructed. To estimate the user's location from the mobile image alone, the landmark must be detected in the image and the ground coordinates of fixed feature points on the detected landmark must be obtained. In the second step, bag of words (BoW) image search was applied to retrieve the four database landmarks most similar to the one photographed in the mobile image. In the third step, one of the four candidate landmarks was selected using scale invariant feature transform (SIFT) feature point extraction and homography random sample consensus (RANSAC), with an additional filtering pass that applies a threshold to the number of matching points. In the fourth step, the landmark image was projected onto the mobile image through the homography matrix between the selected landmark and the mobile image in order to detect the landmark's area and corners. Finally, the user's location was estimated with a location estimation technique. In the performance analysis, the landmark search performance was about 86%. Comparing the estimated location with the user's actual ground coordinates confirmed a horizontal location accuracy of about 0.56 m, showing that the user's location can be estimated from a mobile image by building a landmark database without separate expensive equipment.
https://doi.org/10.7780/kjrs.2023.39.6.1.11
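The third and fourth steps of this abstract (filtering by match count, then projecting the landmark image through a homography to find its corners) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the homography is assumed to be already estimated (for example by SIFT matching with RANSAC), and the function names and the `min_inliers` value are hypothetical.

```python
def project_point(h, point):
    """Apply a 3x3 homography (row-major nested lists) to one (x, y) point."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def project_corners(h, width, height):
    """Project the four corners of a width x height landmark image."""
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    return [project_point(h, c) for c in corners]


def passes_match_threshold(num_inliers, min_inliers=10):
    """Keep a candidate only if enough matches survive RANSAC.

    min_inliers is a hypothetical value; the abstract does not state one.
    """
    return num_inliers >= min_inliers


# Hypothetical homography: scale by 2, then shift by (10, 20).
H = [[2, 0, 10], [0, 2, 20], [0, 0, 1]]
print(project_corners(H, 100, 50))
```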

Vehicle License Plate Detection in Road Images (도로주행 영상에서의 차량 번호판 검출)

Lim, Kwangyong;Byun, Hyeran;Choi, Yeongwoo
- Journal of KIISE / v.43 no.2 / pp.186-195 / 2016
This paper proposes a method for detecting vehicle license plates in real road environments using 8-bit MCT features and a landmark-based Adaboost method. The proposed method identifies potential license plate regions and generates a saliency map that gives the probability of the license plate's location based on the Adaboost classification score. Candidate regions whose scores exceed a given threshold are chosen from the saliency map. Each candidate region is adjusted using the local image variance and verified with an SVM and histograms of the 8-bit MCT features. The proposed method achieves a detection accuracy of 85% on various road images from Korea and Europe.
https://doi.org/10.5626/JOK.2016.43.2.186
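The candidate selection step in this abstract amounts to thresholding the saliency map. A minimal sketch, assuming the map is a 2-D grid of classification scores; the function name and the example values are hypothetical, and the later variance-based adjustment and SVM verification are not shown.

```python
def candidate_cells(saliency, threshold):
    """Return (row, col, score) for every cell scoring above the threshold,
    ordered from highest to lowest score."""
    hits = [
        (r, c, score)
        for r, row in enumerate(saliency)
        for c, score in enumerate(row)
        if score > threshold
    ]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical 2x2 saliency map.
print(candidate_cells([[0.1, 0.9], [0.6, 0.2]], threshold=0.5))
```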

Design of CNN-based Gastrointestinal Landmark Classifier for Tracking the Gastrointestinal Location (캡슐내시경의 위치추적을 위한 CNN 기반 위장관 랜드마크 분류기 설계)

Jang, Hyeon-Woong;Lim, Chang-Nam;Park, Ye-Seul;Lee, Kwang-Jae;Lee, Jung-Won
- Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.1019-1022 / 2019
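As deep learning techniques have proven their performance in image processing, there are active attempts in many fields to use them for image classification, analysis, and detection. In particular, expectations are growing for medical image analysis software that can assist diagnosis, and this study focuses on capsule endoscopy images. Capsule endoscopy mainly targets the small intestine and records for about 8 to 10 hours, from the esophagus to the large intestine. As a result, unlike other medical images such as CT, MR, and X-ray, a single dataset contains 100,000 to 150,000 images. Capsule endoscopy images are generally read by first dividing them into gastrointestinal landmarks (esophagus, stomach, small intestine, large intestine) using the gastrointestinal junctions (Z-line, pyloric valve, ileocecal valve) as boundaries, and then searching for lesions within each landmark. However, because the volume of image data is so large, physicians and medical experts spend a great deal of time and effort reading the images. The aim of this paper is to find the gastrointestinal landmarks, a step that is performed for every patient and takes up much of the reading time. To this end, we designed a CNN model that can identify gastrointestinal landmarks; for more effective training, we removed noisy images that hinder learning as a preprocessing step and analyzed the features of each gastrointestinal landmark. The trained model was evaluated and validated on data from a total of 8 patients: a model trained on randomly sampled patient data achieved an average accuracy of 95%, while patient-wise cross-validation yielded an average accuracy of 67%.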
https://doi.org/10.3745/PKIPS.y2019m10a.1019
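The two evaluation protocols behind the 95% and 67% figures differ in whether frames from one patient can appear in both the training and the test data. The sketch below contrasts them. It assumes that "patient-wise cross-validation" means holding out one patient at a time, which the abstract does not spell out; the function names and the `(patient_id, frame_id)` data layout are hypothetical.

```python
import random


def random_frame_split(frames, test_ratio=0.2, seed=0):
    """Shuffle all frames together; one patient may land in both train and test."""
    shuffled = frames[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


def leave_one_patient_out(frames):
    """Yield (held_out_patient, train, test) so that no patient is shared
    between the training and test frames of a fold."""
    patients = sorted({patient for patient, _ in frames})
    for held_out in patients:
        train = [f for f in frames if f[0] != held_out]
        test = [f for f in frames if f[0] == held_out]
        yield held_out, train, test


# Hypothetical data: 8 patients with 5 frames each, as (patient_id, frame_id).
frames = [(p, i) for p in range(8) for i in range(5)]
for patient, train, test in leave_one_patient_out(frames):
    print(patient, len(train), len(test))
```

A random frame-level split lets the model see near-identical neighboring frames of the test patient during training, which is one plausible reason for the gap between the two accuracies.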

Automated Geometric Correction of Geostationary Weather Satellite Images (정지궤도 기상위성의 자동기하보정)

Kim, Hyun-Suk;Hur, Dong-Seok;Rhee, Soo-Ahm;Kim, Tae-Jung
- Proceedings of the KSRS Conference / 2007.03a / pp.70-75 / 2007
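Korea's first Communications, Oceanography and Meteorology Satellite (COMS) is scheduled for launch in December 2008. We carried out the following study on geometric correction of COMS image data. A weather satellite sits in geostationary orbit and acquires global images. Coastlines in these images are obscured by clouds and other factors, so they cannot provide clear information. To obtain clear coastline information unobstructed by clouds, cloud detection is performed. Because a weather satellite delivers weather information in real time, using the entire image for matching takes considerable time. To extract candidate points for matching from the full image, we built 211 landmark chips from the GSHHS (Global Self-consistent Hierarchical High-resolution Shoreline) coastline database. The landmark chips were built to reflect the position of GOES-9, the satellite used in the experiments, at longitude 155°E. Cloud detection is performed around the positions of the landmark chips in the full image. Of the 211 candidate points, matching is performed on those that remain after cloud-covered ones are removed. Matches between the landmark chips and the satellite image include both true matches and mismatches, so a robust estimation technique, RANSAC (Random Sample Consensus), is used to detect mismatches automatically. The final geometric correction is performed with the matching results from which mismatches have been automatically identified and removed. The sensor model for geometric correction was developed considering the sensor characteristics of the GOES-9 satellite. A precise sensor model was established from the control points obtained from matching and RANSAC, and geometric correction was carried out. The whole process was developed with its speed optimized to meet the real-time processing requirements of COMS.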
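The mismatch rejection step can be illustrated with a small RANSAC sketch. The paper fits a satellite sensor model; here a plain 2-D shift stands in for it as a simplifying assumption, and the function name, tolerance, and sample matches are hypothetical.

```python
import random


def ransac_translation(matches, tol=1.5, iterations=100, seed=0):
    """Estimate a 2-D shift from point matches and return (shift, inlier_indices).

    Each match is ((x, y), (u, v)): a landmark chip position and its matched
    image position. Assumption: a pure translation model stands in for the
    sensor model used in the paper.
    """
    if not matches:
        raise ValueError("no matches supplied")
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation.
        (sx, sy), (tx, ty) = rng.choice(matches)
        dx, dy = tx - sx, ty - sy
        inliers = [
            i for i, ((x, y), (u, v)) in enumerate(matches)
            if abs(u - x - dx) <= tol and abs(v - y - dy) <= tol
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging the inlier offsets.
    n = len(best_inliers)
    dx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    dy = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (dx, dy), best_inliers


# Hypothetical matches: eight consistent with a (5, -3) shift, two mismatches.
good = [((i * 10, i * 7), (i * 10 + 5, i * 7 - 3)) for i in range(8)]
bad = [((0, 0), (40, 40)), ((50, 50), (20, 60))]
shift, inliers = ransac_translation(good + bad)
print(shift, inliers)
```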

A Facial Morphing Method Using Delaunay Triangle of Facial Landmarks (얼굴 랜드마크의 들로네 삼각망을 이용한 얼굴 모핑 기법)

Park, Kyung Nam
- Journal of Digital Contents Society / v.19 no.1 / pp.213-220 / 2018
Face morphing is one of the most powerful image processing techniques and is often used in image processing and computer graphics; it changes an image progressively and naturally from the original image to the target image. In this paper, we propose a method that generates Delaunay triangles from the facial landmark vertices produced by the Dlib face landmark detector and implements morphing through warping and cross-dissolving of the Delaunay triangles between the original image and the target image. The main characteristic of our method is that the vertex points for major facial features such as the eyes, eyebrows, nose, and mouth are generated automatically rather than manually, and are then used to generate the Delaunay triangles automatically. Simulations show that adding vertices manually yields more natural morphing results.
https://doi.org/10.9728/dcs.2018.19.1.213
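Two ingredients of the morphing described here are easy to show in isolation: interpolating matching landmark vertices to get the intermediate mesh, and cross-dissolving pixel values. The sketch below covers only these two steps; landmark detection, Delaunay triangulation, and the per-triangle warp are omitted, and the function names and sample values are hypothetical.

```python
def blend_points(src_pts, dst_pts, t):
    """Interpolate matching landmark vertices; t=0 gives the source shape
    and t=1 gives the target shape."""
    return [
        ((1 - t) * sx + t * dx, (1 - t) * sy + t * dy)
        for (sx, sy), (dx, dy) in zip(src_pts, dst_pts)
    ]


def cross_dissolve(src_pixel, dst_pixel, t):
    """Blend two already-warped pixel values (e.g. RGB tuples) with weight t."""
    return tuple((1 - t) * a + t * b for a, b in zip(src_pixel, dst_pixel))


# Hypothetical vertices and pixel colors.
print(blend_points([(0, 0), (10, 0)], [(10, 10), (20, 10)], 0.5))
print(cross_dissolve((0, 100, 200), (100, 100, 0), 0.25))
```

Sweeping `t` from 0 to 1 over successive frames produces the gradual transition from the original image to the target image.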
