• Title/Summary/Keyword: metric learning

A multi-label Classification of Attributes on Face Images

  • Le, Giang H.;Lee, Yeejin
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.105-108 / 2021
  • Generative adversarial networks (GANs) have achieved strong results in image synthesis, especially in face generation. Unlike in other deep learning tasks, the input of a GAN is usually a random vector sampled from a probability distribution, which leads to unstable training and unpredictable output. One way to address these problems is to condition both the generator and the discriminator on labels. CelebA and FFHQ are the two best-known datasets for face image generation. While CelebA contains attribute annotations for more than 200,000 images, FFHQ has none. Thus, in this work, we introduce a method that learns the attributes from CelebA and then predicts both soft and hard labels for FFHQ. Our model achieves an area under the receiver operating characteristic curve (AUC) of 0.7611.
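The abstract above mentions soft labels, hard labels, and the AUC metric. The following is a minimal sketch of how these relate for a single attribute, not the authors' implementation: the 0.5 threshold, the function names, and the toy scores are all assumptions made for illustration.

```python
def hard_labels(soft, threshold=0.5):
    """Turn soft labels (probabilities) into hard 0/1 labels.

    The 0.5 threshold is an assumption, not a value from the paper.
    """
    return [1 if p >= threshold else 0 for p in soft]


def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs that the scores rank correctly.
    Ties count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data for one attribute (hypothetical values).
true_labels = [1, 0, 1, 0]
soft = [0.9, 0.8, 0.4, 0.1]
hard = hard_labels(soft)
auc = roc_auc(true_labels, soft)
```

For a multi-label problem, one common choice is to compute the AUC per attribute and average the results; the abstract does not say which averaging was used.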

A Survey on Methodology of Meta-Learning (메타 러닝과 방법론 연구 동향)

  • Hoon Ji;Yeon-Joon Lee
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.665-666 / 2023
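  • Deep learning is a technique that recognizes features and patterns in data that are difficult for humans to detect, learns them, and uses them to classify and predict data. However, training a deep learning model well generally requires large amounts of high-quality data and extensive computing resources to process them. Deep learning is therefore difficult to apply in fields where only a small amount of data exists or where computing resources are limited. This paper introduces meta-learning, which can optimize a model for a given task even with a small amount of data, describes the Metric-Based, Model-Based, and Optimization-Based models that follow from the different directions of meta-learning techniques, and suggests directions for future research.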

Performance comparison evaluation of speech enhancement using various loss functions (다양한 손실 함수를 이용한 음성 향상 성능 비교 평가)

  • Hwang, Seo-Rim;Byun, Joon;Park, Young-Cheol
    • The Journal of the Acoustical Society of Korea / v.40 no.2 / pp.176-182 / 2021
  • This paper evaluates and compares the performance of Deep Neural Network (DNN)-based speech enhancement models under various loss functions. We used a complex network that can consider the phase information of speech as the baseline model. As loss functions, we consider two basic loss functions, the Mean Squared Error (MSE) and the Scale-Invariant Source-to-Noise Ratio (SI-SNR), and two perceptual-based loss functions, the Perceptual Metric for Speech Quality Evaluation (PMSQE) and the Log Mel Spectra (LMS). Performance was compared through objective evaluation and listening tests on outputs obtained with various combinations of the loss functions. The test results show that combining a perceptual-based loss function with MSE or SI-SNR improves overall performance, and that the perceptual-based loss functions performed better in the listening test even when they exhibited lower objective scores.
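To make the SI-SNR loss named in the abstract above concrete, here is a minimal sketch that follows the commonly used definition (zero-mean signals, projection of the estimate onto the target). Whether the paper uses exactly this variant is an assumption, and the function names, the `eps` value, and the synthetic signals are illustrative only.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB for two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    # Remove the mean so the measure ignores DC offsets.
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target to get the scaled reference.
    dot = sum(a * b for a, b in zip(e, t))
    t_energy = sum(b * b for b in t) + eps
    s_target = [(dot / t_energy) * b for b in t]
    # Whatever is left after the projection is treated as noise.
    noise = [a - s for a, s in zip(e, s_target)]
    num = sum(s * s for s in s_target)
    den = sum(n * n for n in noise) + eps
    return 10.0 * math.log10(num / den + eps)


def si_snr_loss(estimate, target):
    """Negative SI-SNR, so that minimizing the loss maximizes SI-SNR."""
    return -si_snr(estimate, target)


# Synthetic example: a sine "speech" signal with mild and heavy interference.
target = [math.sin(0.1 * n) for n in range(200)]
mild = [t + 0.1 * math.cos(0.37 * n) for n, t in enumerate(target)]
heavy = [t + 0.5 * math.cos(0.37 * n) for n, t in enumerate(target)]
```

Because the estimate is projected onto the target, multiplying the estimate by a constant leaves the score essentially unchanged. This is the property that distinguishes SI-SNR from a plain MSE loss.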

Fast and All-Purpose Area-Based Imagery Registration Using ConvNets (ConvNet을 활용한 영역기반 신속/범용 영상정합 기술)

  • Baek, Seung-Cheol
    • Journal of KIISE / v.43 no.9 / pp.1034-1042 / 2016
  • Together with machine-learning frameworks, area-based imagery registration techniques can be easily applied to diverse types of image pairs without predefined features and feature descriptors. However, feature detectors are often used to quickly identify candidate image patch pairs, which limits the applicability of these registration techniques. In this paper, we propose a ConvNet (Convolutional Network), "Dart", that provides not only the matching metric between patches but also information about their distance, which helps reduce the search space of corresponding patch pairs. In addition, we propose a ConvNet, "Fad", that identifies the patches that are difficult for Dart, in order to improve the accuracy of registration. These two networks were implemented using deep learning with the help of a number of training instances generated from a few registered image pairs, and were successfully applied to a simple image registration problem, suggesting that this line of research is promising.

Development of Reinforcement Learning-based Obstacle Avoidance toward Autonomous Mobile Robots for an Industrial Environment (산업용 자율 주행 로봇에서의 격자 지도를 사용한 강화학습 기반 회피 경로 생성기 개발)

  • Yang, Jeong-Yean
    • The Journal of the Korea Contents Association / v.19 no.3 / pp.72-79 / 2019
  • Autonomous locomotion has two essential functionalities: mapping, which builds and updates maps from uncertain position information and measured sensor inputs, and localization, which estimates position from an inaccurate map and sensor information. In addition, obstacle detection, avoidance, and path design are required for autonomous locomotion and rely on probabilistic methods based on uncertain locations. Sensory inputs measured by a metric-based scanner make it difficult to distinguish moving obstacles such as humans from static objects such as walls in a given environment. This paper proposes a low-resolution grid map combined with reinforcement learning to generate obstacle-avoiding paths, and compares it with a conventional recognition method for detecting static and moving objects. Finally, the proposed method is verified with experimental results.

Health Monitoring of Livestock using Neck Sensor based on Machine Learning (목걸이형 센서를 이용한 머신러닝 기반 가축상태 모니터링)

  • Lee, Woongsup;Park, Seongmin;Ban, Tae-Won;Kim, Seong Hwan;Ryu, Jongyeol;Sung, Kil-Young
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.11 / pp.1421-1427 / 2018
  • With the rapid development of Internet-of-Things technology, many types of smart sensors are now devised and widely deployed. These smart sensors are now used in animal husbandry, which was traditionally managed by farmers' experience: wearable sensors for livestock and smart farms equipped with multiple sensors are used to increase the efficiency of livestock management. Herein, we consider a scheme in which body temperature and activity level are measured by a smart sensor attached to the neck of dairy cattle, and the health condition is monitored based on the collected data. In particular, we find that the estrus of dairy cattle, one of the most important metrics in milk production, can be predicted with high precision using various machine learning techniques. With the proposed prediction scheme, estrus can be detected immediately, which can improve the efficiency of cattle management.

Deep Learning Approach for Automatic Discontinuity Mapping on 3D Model of Tunnel Face (터널 막장 3차원 지형모델 상에서의 불연속면 자동 매핑을 위한 딥러닝 기법 적용 방안)

  • Chuyen Pham;Hyu-Soung Shin
    • Tunnel and Underground Space / v.33 no.6 / pp.508-518 / 2023
  • This paper presents a new approach for the automatic mapping of discontinuities in a tunnel face based on its 3D digital model reconstructed by LiDAR scanning or photogrammetry. The main idea is to identify discontinuity areas in the 3D digital model of a tunnel face by segmenting its 2D projected images with a deep-learning semantic segmentation model called U-Net. The proposed deep learning model integrates various features, including the projected RGB image, the depth map image, and local surface property-based images (i.e., normal vector and curvature images), to effectively segment areas of discontinuity in the images. Subsequently, the segmentation results are projected back onto the 3D model using depth maps and projection matrices to obtain an accurate representation of the location and extent of discontinuities in 3D space. The performance of the segmentation model is evaluated by comparing the segmented results with their corresponding ground truths, which demonstrates high segmentation accuracy, with an intersection-over-union metric of approximately 0.8. Although the training data are still limited, this method shows promising potential to address the limitations of conventional approaches, which rely only on normal vectors and unsupervised machine learning algorithms to group points in the 3D model into distinct sets of discontinuities.
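The intersection-over-union metric cited in the abstract above can be sketched for binary masks as follows. This is a generic illustration rather than the authors' evaluation code. The nested-list mask format, the function name, and the toy masks are assumptions.

```python
def intersection_over_union(pred_mask, true_mask):
    """IoU of two binary masks given as equally shaped nested lists of 0/1."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; treating that as IoU = 1 is a convention.
    return intersection / union if union else 1.0


# Toy 2x3 masks (hypothetical): 1 marks a pixel labeled as discontinuity.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
iou = intersection_over_union(predicted, ground_truth)
```

Here two pixels are marked in both masks and four pixels are marked in at least one, so the IoU of the toy example is 2/4.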

A Study on Adaptive Knowledge Automatic Acquisition Model from Case-Based Reasoning System (사례 기반 추론 시스템에서 적응 지식 자동 획득 모델에 관한 연구)

  • 이상범;김영천;이재훈;이성주
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2002.05a / pp.81-86 / 2002
  • In current CBR (Case-Based Reasoning) systems, case adaptation is usually performed by a rule-based method that uses rules hand-coded by the system developer. As a result, the CBR system designer faces a knowledge acquisition bottleneck similar to that found in traditional expert system design. In this thesis, I present a model for learning case adaptation knowledge from the case base. The feature differences of each pair of cases are noted and become the antecedent part of an adaptation rule, and the differences between the solutions of the compared cases become the consequent part of the rule. However, the number of rules that can possibly be discovered using a learning algorithm is enormous. The first method for finding cases to compare uses a syntactic measure of the distance between cases. The threshold for identifying candidates for comparison is fixed to the maximum number of differences between the target and the retrieved case over all retrievals. The second method uses a similarity metric, since the threshold method may not be an accurate measure. I also suggest a method for eliminating duplicate rules. In the elimination process, a confidence value is assigned to each rule based on its frequency. The learned adaptation rules are applied to a given target problem. The basic process involves a search for all rules that handle at least one difference, followed by a combination process in which complete solutions are built.
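The rule-learning idea described in the abstract above (feature differences as antecedent, solution difference as consequent, duplicates merged and scored by frequency) can be sketched as follows. This is an illustrative reading, not the thesis's algorithm. The case representation, the `max_differences` threshold, the numeric solutions, and the confidence formula (relative frequency among all generated rules) are all assumptions.

```python
from collections import Counter
from itertools import combinations


def learn_adaptation_rules(cases, max_differences=1):
    """Learn adaptation rules from pairs of cases.

    cases: list of (features: dict, solution: number).
    Returns {(antecedent, consequent): confidence}.
    """
    counts = Counter()
    for (feat_a, sol_a), (feat_b, sol_b) in combinations(cases, 2):
        # Antecedent: which features differ, and how.
        diffs = tuple(sorted(
            (name, feat_a[name], feat_b[name])
            for name in feat_a
            if feat_a[name] != feat_b[name]
        ))
        # Syntactic distance = number of differing features;
        # skip identical pairs and pairs that are too far apart.
        if not diffs or len(diffs) > max_differences:
            continue
        # Consequent: how the solution changes. Counting in a Counter
        # merges duplicate rules instead of storing them repeatedly.
        counts[(diffs, sol_b - sol_a)] += 1
    total = sum(counts.values())
    return {rule: n / total for rule, n in counts.items()}


# Hypothetical case base: house features and a price as the solution.
cases = [
    ({"rooms": 2, "garage": False}, 100),
    ({"rooms": 3, "garage": False}, 120),
    ({"rooms": 2, "garage": True}, 110),
    ({"rooms": 3, "garage": True}, 130),
]
rules = learn_adaptation_rules(cases)
```

In this toy case base, the two surviving rules are "one more room adds 20" and "adding a garage adds 10", each observed twice. Pairs that differ in both features exceed the threshold and are not compared.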

A Case Study on the Accessibility of Online Learning Content in Korea (국내 원격 교육 콘텐츠의 접근성 분석 사례)

  • 신승식
    • Proceedings of the Korea Contents Association Conference / 2003.05a / pp.92-101 / 2003
  • The accessibility of ten web-based learning contents in Korea was evaluated with the following procedure: (1) A primitive metric of the compliance of those contents with the WCAG (Web Content Accessibility Guidelines) 1.0 was obtained using Bobby, a widely used accessibility checker. (2) An SGML validation test was carried out. (3) The contents were rendered with various browsers, including a text-mode browser. (4) They were manually checked as to whether they satisfy the accessibility criteria proposed by W3C. Most of the tested contents scored low marks in all the test categories, partly because they were apparently developed with little attention paid to web standard conformance, browser compatibility, and device independence. They also put heavy emphasis on audio-visual effects, catering only to the best-equipped users and offering no alternate access route for those in restricted environments. As more information and learning materials are delivered through the Internet, such poorly accessible content could deepen the information divide. Accessibility needs to be regarded as an important factor in evaluating the quality of learning content.

A Study on Classification of Waveforms Using Manifold Embedding Based on Commute Time (컴뮤트 타임 기반의 다양체 임베딩을 이용한 파형 신호 인식에 관한 연구)

  • Hahn, Hee-Il
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.2 / pp.148-155 / 2014
  • In this paper, a commute time embedding is implemented by organizing patches according to the graph-based metric, and its properties are investigated by varying the number of nodes in the graph. It is shown that manifold embedding methods recover the intrinsic geometric structure when waveforms such as speech or musical instrument sounds are embedded in a low-dimensional Euclidean space. Basically, manifold embedding algorithms only project the training samples on the graph into an embedding subspace and cannot generalize the learned results to test samples. They are very effective for data clustering but are not appropriate for classification or recognition. In this paper, a commute time guided transform is adopted to enhance the generalization ability, and its performance is analyzed by applying it to the classification of six kinds of musical instrument sounds.
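As background for the abstract above, a basic commute time embedding can be sketched from the eigenvectors of the graph Laplacian. This shows only the standard embedding, not the paper's commute time guided transform. The sketch assumes NumPy is available and that the graph is connected with symmetric weights. The function name and the toy path graph are illustrative.

```python
import numpy as np


def commute_time_embedding(weights):
    """Embed graph nodes so that the squared Euclidean distance between
    two embedded nodes equals their commute time.

    weights: symmetric adjacency matrix of a connected graph (assumption).
    Returns an array whose row i holds the coordinates of node i.
    """
    W = np.asarray(weights, dtype=float)
    degrees = W.sum(axis=1)
    laplacian = np.diag(degrees) - W
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    # Drop the zero eigenvalue that belongs to the constant eigenvector.
    keep = eigvals > 1e-10
    volume = degrees.sum()
    # Scale each retained eigenvector by sqrt(volume / eigenvalue).
    return np.sqrt(volume) * eigvecs[:, keep] / np.sqrt(eigvals[keep])


# Toy example: a path graph 0 - 1 - 2 with unit edge weights.
path_graph = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]
coords = commute_time_embedding(path_graph)
ct_ends = float(np.sum((coords[0] - coords[2]) ** 2))
ct_adjacent = float(np.sum((coords[0] - coords[1]) ** 2))
```

For this path graph the graph volume is 4 and the effective resistances are 2 (end to end) and 1 (adjacent nodes), so the commute times should be 8 and 4. The embedding is defined only for the nodes used to build the graph, which is the generalization limitation the abstract points out.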