• Title/Abstract/Keywords: Image Degradation Model

93 search results (processing time: 0.027 s)

Determining Whether to Enter a Hazardous Area Using Pedestrian Trajectory Prediction Techniques and Improving the Training of Small Models with Knowledge Distillation

  • 최인규;이영한;송혁
    • Journal of the Korea Institute of Information and Communication Engineering / Vol. 25, No. 9 / pp. 1244-1253 / 2021
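  • This paper proposes a method that uses pedestrian trajectory prediction to predict in advance whether pedestrians will enter a hazardous area after the current time, together with an efficient method for simplifying the trajectory prediction network. It also proposes a method of applying knowledge distillation (KD) to a small network for real-time operation in embedded environments. Entry is determined from the correlation between the predicted future trajectory and the hazardous area, and efficient KD is applied when training the small network to minimize performance degradation. Experiments show that the model with the proposed simplification technique achieves a 37.49% speed-up over the original model at the cost of only a slight drop in accuracy. In addition, a small network with 91.43% accuracy reached an improved accuracy of 94.76% when trained with KD.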
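The abstract above does not say which KD formulation the authors use. As a rough illustration only, the sketch below assumes the common soft-target formulation, in which a hard-label cross-entropy term is blended with a temperature-scaled KL divergence between teacher and student outputs. The function names, the temperature, and the weight `alpha` are hypothetical choices, not values from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with a soft-target KL term.

    Assumed formulation (not taken from the paper):
        loss = alpha * CE(student, label)
             + (1 - alpha) * T^2 * KL(teacher_T || student_T)
    """
    # Cross-entropy of the student against the ground-truth class.
    ce = -math.log(softmax(student_logits)[label])
    # KL divergence between the temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * ce + (1 - alpha) * temperature ** 2 * kl
```

When student and teacher produce identical logits, the KL term vanishes and only the weighted cross-entropy remains, which is a quick sanity check for an implementation of this kind.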

Robust Head Pose Estimation for Masked Face Image via Data Augmentation

  • 한경탁;홍성은
    • Journal of Broadcast Engineering / Vol. 27, No. 6 / pp. 944-947 / 2022
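  • With mask wearing surging because of the recent coronavirus outbreak, technologies that can cope with masked faces are becoming more important. Despite the many applications of head pose estimation, such as driver attention monitoring, face frontalization, and gaze detection, few studies have addressed the performance degradation caused by masks. Based on an analysis of how head pose estimation performance degrades depending on whether a mask is worn, this paper proposes a data augmentation technique that analyzes the size and pose of unmasked face images and synthesizes masks onto them. Training with the proposed face-specific augmentation yields robust performance on BIWI, a benchmark dataset for head pose estimation, regardless of whether a mask is worn, and because it is not tied to a specific model, it can be applied to a variety of head pose estimation models.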

A Study on Three-dimensional Optimized Face Recognition Model: Comparative Studies and Analysis of Model Architectures

  • 박찬준;오성권;김진율
    • The Transactions of the Korean Institute of Electrical Engineers / Vol. 64, No. 6 / pp. 900-911 / 2015
  • In this paper, a 3D face recognition model is designed using a polynomial-based RBFNN (Radial Basis Function Neural Network) and a PNN (Polynomial Neural Network), and its recognition rate is evaluated. In existing 2D face recognition models, the recognition rate may degrade under external conditions such as changes in image brightness that affect facial features. 3D face recognition is therefore performed with a 3D scanner to overcome this disadvantage of 2D face recognition. In the preprocessing stage, the 3D face images acquired at various poses are converted to frontal images by pose compensation. The depth data of the face shape is extracted using the multiple point signature method, and depth information for the whole face area is obtained using the tip of the nose as a reference point. Parameters are optimized with both ABC (Artificial Bee Colony) and PSO (Particle Swarm Optimization) for effective training and recognition. The experimental data consist of face images of students and researchers in the IC&CI Lab of Suwon University. Using the 3D face images acquired in the IC&CI Lab, the performance of 3D face recognition is evaluated and compared for the two types of models as well as for the point signature method based on two kinds of depth data.

Performance Evaluation of Efficient Vision Transformers on Embedded Edge Platforms

  • 이민하;이성재;김태현
    • IEMEK Journal of Embedded Systems and Applications / Vol. 18, No. 3 / pp. 89-100 / 2023
  • Recently, on-device artificial intelligence (AI) solutions using mobile devices and embedded edge devices have emerged in various fields, such as computer vision, to address network traffic burdens, low-energy operations, and security problems. Although vision transformer deep learning models have outperformed conventional convolutional neural network (CNN) models in computer vision, they require more computations and parameters than CNN models. Thus, they are not directly applicable to embedded edge devices with limited hardware resources. Many researchers have proposed various model compression methods or lightweight architectures for vision transformers; however, there are only a few studies evaluating the effects of model compression techniques of vision transformers on performance. Regarding this problem, this paper presents a performance evaluation of vision transformers on embedded platforms. We investigated the behaviors of three vision transformers: DeiT, LeViT, and MobileViT. Each model performance was evaluated by accuracy and inference time on edge devices using the ImageNet dataset. We assessed the effects of the quantization method applied to the models on latency enhancement and accuracy degradation by profiling the proportion of response time occupied by major operations. In addition, we evaluated the performance of each model on GPU and EdgeTPU-based edge devices. In our experimental results, LeViT showed the best performance in CPU-based edge devices, and DeiT-small showed the highest performance improvement in GPU-based edge devices. In addition, only MobileViT models showed performance improvement on EdgeTPU. Summarizing the analysis results through profiling, the degree of performance improvement of each vision transformer model was highly dependent on the proportion of parts that could be optimized in the target edge device. 
In summary, to apply vision transformers to on-device AI solutions, both proper operation composition and optimizations specific to the target edge devices must be considered.
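The abstract above mentions a quantization method without describing it. Purely as an illustration of the latency and accuracy trade-off being measured, the sketch below assumes a simple per-tensor affine int8 scheme. The function names and the scheme itself are assumptions and may differ from what the evaluated toolchains actually do.

```python
def quantize_int8(values):
    """Map floats to int8 with one scale and zero point per tensor.

    Assumed scheme (hypothetical, for illustration):
        q = clamp(round(v / scale) + zero_point, -128, 127)
    """
    lo = min(min(values), 0.0)  # keep 0.0 exactly representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    quantized = [max(-128, min(127, round(v / scale) + zero_point))
                 for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from the int8 codes."""
    return [(q - zero_point) * scale for q in quantized]
```

The gap between the original values and the dequantized ones is the rounding error that shows up as accuracy degradation, while the narrower integer representation is what allows faster inference on hardware with int8 support.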

Enhancing a Neural-Network-based ISP Model through Positional Encoding

  • 김대연;김우혁;조성현
    • Journal of the Korea Computer Graphics Society / Vol. 30, No. 3 / pp. 81-86 / 2024
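  • An image signal processor (ISP) converts RAW images captured by a camera sensor into sRGB images that look pleasing to the human eye. RAW images carry more information useful for image processing than sRGB images, but because of their relatively large size, usually only the sRGB images are stored and used. Moreover, because the ISP pipelines of real cameras are not disclosed, simulating the inverse process is very difficult. Research on camera ISP modeling for converting between sRGB and RAW images is therefore active, and the ParamISP[1] model was recently proposed, which refines earlier simple ISP network architectures and directly incorporates camera parameters (exposure time, sensitivity, aperture size, focal length) so as to resemble the behavior of a real camera ISP. However, existing work, including ParamISP[1], does not consider lens-induced effects such as lens shading, optical aberration, and lens distortion when modeling the camera ISP, which limits reconstruction performance. This study introduces positional encoding so that the ISP network can better handle lens-induced degradation. The proposed positional encoding technique suits camera ISP networks that split the image and train on patches, and because it reflects the spatial context of the image better than existing models, it enables more precise image reconstruction.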
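The abstract above does not specify the form of the positional encoding. To show why a patch-based network needs one, the sketch below assumes a sinusoidal encoding of each patch's normalized center coordinates, so that a patch cropped near the image corner (where lens shading and distortion are strongest) receives a different code than one from the center. The function name and the frequency count are hypothetical.

```python
import math


def patch_positional_encoding(x, y, patch, width, height, num_freqs=4):
    """Encode where a square patch sits inside the full image.

    (x, y) is the patch's top-left pixel; the center is normalized to
    [-1, 1] on each axis and expanded with sine/cosine features.
    This is an assumed encoding, not necessarily the paper's.
    """
    cx = (x + patch / 2) / width * 2 - 1
    cy = (y + patch / 2) / height * 2 - 1
    encoding = [cx, cy]
    for k in range(num_freqs):
        freq = (2 ** k) * math.pi
        encoding += [math.sin(freq * cx), math.cos(freq * cx),
                     math.sin(freq * cy), math.cos(freq * cy)]
    return encoding
```

A vector like this could be concatenated to the patch features or broadcast as extra input channels; which option a given model uses is a design decision outside the scope of this sketch.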

Copy-move Forgery Detection Robust to Various Transformation and Degradation Attacks

  • Deng, Jiehang;Yang, Jixiang;Weng, Shaowei;Gu, Guosheng;Li, Zheng
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 9 / pp. 4467-4486 / 2018
  • To address the low robustness of Copy-Move Forgery Detection (CMFD) under various transformation and degradation attacks, this paper proposes a novel CMFD method. Its main advantages are: (1) the Discrete Analytical Fourier-Mellin Transform (DAFMT) and Locality Sensitive Hashing (LSH) are combined to extract block features and detect potential copy-move pairs; (2) the Euclidean distance is combined with the pixel variance to filter out false potential copy-move pairs in the post-verification step. In addition to extracting effective features of an image block, the DAFMT is invariant to rotation and scale. Unlike the traditional lexicographic sorting method, LSH is robust to degradation by Gaussian noise and JPEG compression. Because most false copy-move pairs lie close to each other in the spatial domain or in homogeneous regions, the Euclidean distance and pixel variance are employed in the post-verification step. Evaluated quantitatively with the precision-recall-$F_1$ model on the Image Manipulation Dataset (IMD) and the Copy-Move Hard Dataset (CMHD), the proposed method outperforms the methods of Emam et al. and Li et al. in recall and $F_1$.
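The precision, recall, and $F_1$ measures named in the abstract above follow the standard definitions. A minimal sketch, assuming detections have already been counted as true positives, false positives, and false negatives (how the paper counts them, per pixel or per image, is not stated here):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard detection metrics from raw counts.

    precision = tp / (tp + fp), recall = tp / (tp + fn),
    F1 = harmonic mean of precision and recall.
    Empty denominators are mapped to 0.0 by convention in this sketch.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

A method can improve recall and $F_1$ without improving precision, which is consistent with the abstract claiming gains in those two measures specifically.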

Prediction and Evaluation of Progressive Failure Behavior of CFRP using Crack Band Model Based Damage Variable

  • 윤동현;김상덕;김재훈;도영대
    • Composites Research / Vol. 32, No. 5 / pp. 258-264 / 2019
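  • In this paper, a progressive failure analysis method is developed using damage variables that combine the Hashin failure criteria with the crack band model. The failure criteria determine whether failure has initiated. Once failure has initiated, the damage variable for each failure mode (fiber tension/compression, matrix tension/compression) is computed according to a linear degradation law and used to compute the damaged stiffness matrix. The damaged stiffness matrix is applied to the damaged material, and the progressive failure analysis is repeated with it until the damage variable reaches 1, which signifies complete failure of the material. The procedure was implemented through a user-defined subroutine in the commercial analysis program ABAQUS. To verify the proposed progressive failure analysis tool, its results were compared with test results for open-hole composite laminates, and the strain behavior obtained by digital image correlation during the tests was compared with that obtained from the analysis. The analysis results showed valid agreement with the test results.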
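The abstract above gives no equations. As a one-dimensional illustration of a damage variable with linear degradation and crack band regularization, the sketch below assumes the widely used linear-softening form, in which the failure strain is set by the fracture energy and the element characteristic length. The function and parameter names are hypothetical, and the paper's per-mode expressions may differ.

```python
def damage_variable(strain, strength, modulus, fracture_energy, char_length):
    """Damage d in [0, 1] for a single failure mode under linear softening.

    Assumed relations (illustrative, consistent units required):
        eps0 = strength / modulus                         (initiation)
        epsf = 2 * fracture_energy / (strength * char_length)  (final)
        d    = epsf * (strain - eps0) / (strain * (epsf - eps0))
    """
    eps0 = strength / modulus
    epsf = 2.0 * fracture_energy / (strength * char_length)
    if epsf <= eps0:
        # Crack band requirement: the element is too large to dissipate
        # the fracture energy without snap-back.
        raise ValueError("characteristic length too large for this material")
    if strain <= eps0:
        return 0.0  # failure has not initiated
    if strain >= epsf:
        return 1.0  # complete failure
    return epsf * (strain - eps0) / (strain * (epsf - eps0))
```

With this form, the degraded stress `(1 - d) * modulus * strain` falls linearly from the strength at `eps0` to zero at `epsf`, and scaling `epsf` by the element size keeps the dissipated energy independent of the mesh.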

Estimation of Above-Ground Biomass of a Tropical Forest in Northern Borneo Using High-resolution Satellite Image

  • Phua, Mui-How;Ling, Zia-Yiing;Wong, Wilson;Korom, Alexius;Ahmad, Berhaman;Besar, Normah A.;Tsuyuki, Satoshi;Ioki, Keiko;Hoshimoto, Keigo;Hirata, Yasumasa;Saito, Hideki;Takao, Gen
    • Journal of Forest and Environmental Science / Vol. 30, No. 2 / pp. 233-242 / 2014
  • Estimating above-ground biomass is important in establishing an applicable methodology for the Measurement, Reporting and Verification (MRV) system for Reducing Emissions from Deforestation and Forest Degradation-Plus (REDD+). We developed a model that estimates diameter at breast height (DBH) from an IKONOS-2 image, which in turn led to above-ground biomass (AGB) estimation. The IKONOS image was preprocessed with dark object subtraction and topographic effect correction prior to watershed segmentation for tree crown delineation. Compared to the field observation, the overall segmentation accuracy was 64%. Crown detection percent had a strong negative correlation with tree density. In addition, satellite-based crown area had the highest correlation with the field-measured DBH. We then developed a DBH allometric model that explained 74% of the data variance. On average, the estimated DBH was very similar to the measured DBH, and the same held for AGB. Overall, this method can potentially be applied to estimate AGB over a relatively large and remote tropical forest in Northern Borneo.

Image Quality Assessment by Combining Masking Texture and Perceptual Color Difference Model

  • Tang, Zhisen;Zheng, Yuanlin;Wang, Wei;Liao, Kaiyang
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 7 / pp. 2938-2956 / 2020
  • Objective image quality assessment (IQA) models are built on effective features that imitate the characteristics of the human visual system (HVS). The HVS is extremely sensitive to color degradation and complex texture changes. In this paper, we first show that many existing full-reference image quality assessment (FR-IQA) methods can hardly measure image quality under contrast and masking texture changes. To solve this problem, we propose a novel FR-IQA method that accounts for the texture masking effect, called the Texture and Color Quality Index (TCQI). The proposed method considers both the texture masking effect and the perceptual threshold of color vision, and adopts three kinds of features to reflect masking texture, color difference, and structural information. Furthermore, random forest (RF) is used to address the drawbacks of existing pooling technologies. Compared with other traditional learning-based tools (support vector regression and neural networks), RF achieves better prediction performance. Experiments on five large-scale databases demonstrate that our approach is highly consistent with subjective perception, outperforms twelve state-of-the-art IQA models in prediction accuracy, and keeps a moderate computational complexity. Cross-database validation also shows that our approach maintains high robustness.

Information Processing in Primate Retinal Ganglion

  • Je, Sung-Kwan;Cho, Jae-Hyun;Kim, Gwang-Baek
    • Journal of information and communication convergence engineering / Vol. 2, No. 2 / pp. 132-137 / 2004
  • Most current computer vision theories are based on hypotheses that are difficult to apply to the real world, and they simply imitate a coarse form of the human visual system. As a result, they have not shown satisfactory results. In the human visual system, there is a mechanism that processes information in response to memory degradation over time and limited storage space. Starting from research on the human visual system, this study analyzes a mechanism that processes input information as it is transferred from the retina to ganglion cells. A model for the characteristics of ganglion cells in the retina is proposed after considering the structure of the retina and the efficiency of storage space. The MNIST database of handwritten digits is used as data for this research, with ART2 and SOM as recognizers. The results show that the proposed recognition model differs little from the general recognition model in recognition rate, but that storage efficiency can be improved by constructing a mechanism that processes input information.