• Title/Summary/Keyword: Triplet Loss

Triplet Class-Wise Difficulty-Based Loss for Long Tail Classification

  • Yaw Darkwah Jnr.; Dae-Ki Kang
    • International Journal of Internet, Broadcasting and Communication / v.15 no.3 / pp.66-72 / 2023
  • Little attention appears to have been paid to the relevance of learning a good representation function for solving long-tail tasks. We therefore propose a new loss function that ensures a good representation is learnt while learning to classify. We call this loss function the Triplet Class-Wise Difficulty-Based (TriCDB-CE) Loss; it combines the Triplet Loss with the Class-wise Difficulty-Based Cross-Entropy (CDB-CE) Loss. We demonstrate its effectiveness empirically through experiments on three benchmark datasets and find improvements in accuracy over several baseline methods. For instance, on CIFAR-10-LT we record a 7 percentage point (pp) increase relative to the CDB-CE Loss. There remains room for improvement on Places-LT.
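The following is a minimal sketch, not the authors' code, of how a class-wise difficulty-weighted cross-entropy can be combined with a triplet margin loss. The weighting rule (1 - per-class accuracy)^tau, the balancing factor alpha, and all function names are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def cdb_weights(class_accuracy: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    # Class-wise difficulty taken as 1 - per-class accuracy; harder classes get
    # larger weights (this weighting is assumed, following a common CDB formulation).
    difficulty = 1.0 - class_accuracy
    weights = difficulty ** tau
    return weights * (weights.numel() / weights.sum())   # keep the mean weight near 1

def tricdb_ce_loss(logits, labels, anchor, positive, negative,
                   class_accuracy, alpha: float = 1.0, margin: float = 0.2):
    # Classification term weighted by class difficulty, plus a metric-learning term.
    ce = F.cross_entropy(logits, labels, weight=cdb_weights(class_accuracy))
    triplet = F.triplet_margin_loss(anchor, positive, negative, margin=margin)
    return ce + alpha * triplet   # alpha balances representation and classification
```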

A deep learning model based on triplet losses for a similar child drawing selection algorithm (Triplet Loss 기반 딥러닝 모델을 통한 유사 아동 그림 선별 알고리즘)

  • Moon, Jiyu; Kim, Min-Jong; Lee, Seong-Oak; Yu, Yonggyun
    • Journal of Korea Society of Industrial Information Systems / v.27 no.1 / pp.1-9 / 2022
  • The goal of this paper is to create a deep learning model based on triplet loss for a similar child drawing selection algorithm. To assess the similarity of children's drawings, the distance between feature vectors belonging to the same class should be small, and the distance between feature vectors belonging to different classes should be large. Therefore, this study develops a similar child drawing selection algorithm by building a deep learning model that combines triplet loss with a residual network (ResNet), which has the advantage of measuring image similarity regardless of the number of classes. Using this model, the similarity between a target child drawing and other drawings can be measured, and drawings with high similarity can be selected.
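A minimal sketch of the general idea, assuming a torchvision ResNet-18 backbone as the embedding network and scoring similarity by embedding distance; the embedding size, margin, and the weights=None constructor (torchvision >= 0.13) are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

class DrawingEmbedder(nn.Module):
    """ResNet backbone whose classifier head is replaced by an embedding layer."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.backbone = resnet18(weights=None)  # torchvision >= 0.13 constructor
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, dim)

    def forward(self, x):
        return F.normalize(self.backbone(x), dim=1)  # unit-length embeddings

model = DrawingEmbedder()
criterion = nn.TripletMarginLoss(margin=0.3)
anchor, positive, negative = (torch.randn(4, 3, 224, 224) for _ in range(3))
loss = criterion(model(anchor), model(positive), model(negative))
# At inference time, similarity between two drawings is scored by the distance
# between their embeddings, independent of any fixed set of classes.
distance = F.pairwise_distance(model(anchor), model(positive))
```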

Triplet loss based domain adversarial training for robust wake-up word detection in noisy environments (잡음 환경에 강인한 기동어 검출을 위한 삼중항 손실 기반 도메인 적대적 훈련)

  • Lim, Hyungjun; Jung, Myunghun; Kim, Hoirin
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.468-475 / 2020
  • A good acoustic word embedding that can express the characteristics of a word well plays an important role in wake-up word detection (WWD). However, the representation ability of an acoustic word embedding may be weakened by the various types of environmental noise that occur where WWD operates, causing performance degradation. In this paper, we propose triplet loss based Domain Adversarial Training (tDAT), which mitigates the environmental factors that can affect acoustic word embeddings. Through experiments in noisy environments, we verify that the proposed method effectively improves on the conventional DAT approach, and we confirm its scalability by combining it with another method proposed for robust WWD.
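A minimal sketch under assumptions, not the paper's exact recipe: a triplet loss trains the acoustic word embedding for word discrimination, while a gradient-reversal branch discourages the embedding from encoding the noise/domain condition, as in conventional domain adversarial training. The domain_classifier callable and the scalar lam are placeholders.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # reversed gradient flows back to the encoder

def grad_reverse(x, lam: float = 0.1):
    return GradReverse.apply(x, lam)

def tdat_loss(emb_anchor, emb_pos, emb_neg, domain_classifier, domain_labels,
              margin: float = 0.5, lam: float = 0.1):
    # Word discrimination: triplet loss over acoustic word embeddings.
    word_loss = F.triplet_margin_loss(emb_anchor, emb_pos, emb_neg, margin=margin)
    # Domain confusion: classify the noise condition from a gradient-reversed embedding.
    domain_logits = domain_classifier(grad_reverse(emb_anchor, lam))
    domain_loss = F.cross_entropy(domain_logits, domain_labels)
    return word_loss + domain_loss
```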

A Study on Adversarial Attack Using Triplet loss (Triplet Loss를 이용한 Adversarial Attack 연구)

  • Oh, Taek-Wan; Moon, Bong-Kyo
    • Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.404-407 / 2019
  • Deep learning has recently been applied in many domains. In particular, architectures such as CNNs are used for image classification tasks such as face recognition. Research has examined whether such deep learning techniques can be trusted as a mature technology. One related line of work is the PGD (Projected Gradient Descent) attack: when noise produced by this attack is added to an original image, the modified image is classified into an entirely different class. In this study, we propose and implement an adversarial attack model that applies triplet loss to the existing FGSM (Fast Gradient Sign Method) attack. The proposed attack model is validated on a simple scenario, and the results are analyzed.
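A minimal sketch of one plausible reading of such an attack (the paper's exact triplet construction is not reproduced here): an FGSM-style signed-gradient step whose loss is a triplet over embeddings, pushing the input toward a chosen target-class example and away from an example of its own class. embed_net, target_example, and same_class_example are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def triplet_fgsm(embed_net, x, target_example, same_class_example,
                 eps: float = 8 / 255, margin: float = 0.5):
    x_adv = x.clone().detach().requires_grad_(True)
    # Anchor: the input being perturbed; positive: an example of the class we want
    # the input to be mistaken for; negative: an example of the input's true class.
    loss = F.triplet_margin_loss(embed_net(x_adv),
                                 embed_net(target_example),
                                 embed_net(same_class_example),
                                 margin=margin)
    loss.backward()
    # One signed-gradient step that lowers the triplet loss, i.e. moves the
    # embedding toward the target class; images are assumed to lie in [0, 1].
    return (x_adv - eps * x_adv.grad.sign()).clamp(0, 1).detach()
```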

Data Augmentation Strategy based on Token Cut-off for Using Triplet Loss in Unsupervised Contrastive Learning (비지도 대조 학습에서 삼중항 손실 함수 도입을 위한 토큰 컷오프 기반 데이터 증강 기법)

  • Myeongsoo Han; Yoo Hyun Jeong; Dong-Kyu Chae
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.618-620 / 2023
  • Contrastive learning for capturing semantic similarity has recently become an active research area in natural language processing. The key to contrastive learning is constructing good pairs that should be pulled semantically closer and pairs that should be pushed apart, but existing loss functions are limited in how richly they reflect the relative similarity between sentences. To address this, prior work introduced the triplet loss, and in this paper we propose an effective token cutoff data augmentation technique for constructing such triplets in contrastive learning. Experiments with widely used language models such as BERT and RoBERTa demonstrate the superior performance of the proposed method.
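A minimal sketch under assumptions, not the authors' setup: token cutoff builds an augmented view by zeroing the attention mask of a random subset of tokens, and a triplet loss is computed with the original sentence as anchor, its cutoff view as positive, and another in-batch sentence as negative. The encode function and the cutoff ratios are placeholders.

```python
import torch
import torch.nn.functional as F

def token_cutoff(input_ids, attention_mask, ratio: float = 0.15):
    """Zero the attention mask of a random subset of non-padding tokens."""
    mask = attention_mask.clone()
    drop = (torch.rand(mask.shape, device=mask.device) < ratio) & (mask == 1)
    mask[drop] = 0  # cut tokens are simply ignored by the encoder
    return input_ids, mask

def triplet_contrastive_loss(encode, input_ids, attention_mask, margin: float = 0.3):
    anchor = encode(input_ids, attention_mask)                   # original sentences
    positive = encode(*token_cutoff(input_ids, attention_mask))  # token-cutoff view
    negative = anchor.roll(shifts=1, dims=0)                     # another in-batch sentence
    return F.triplet_margin_loss(anchor, positive, negative, margin=margin)
```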

Deep Neural Networks Learning based on Multiple Loss Functions for Both Person and Vehicles Re-Identification (사람과 자동차 재인식이 가능한 다중 손실함수 기반 심층 신경망 학습)

  • Kim, Kyeong Tae; Choi, Jae Young
    • Journal of Korea Multimedia Society / v.23 no.8 / pp.891-902 / 2020
  • Re-Identification (Re-ID) is one of the most popular research topics in computer vision due to its variety of applications. To achieve high re-identification performance, recent methods have developed deep learning based networks specialized for either persons or vehicles. However, most current methods are difficult to use in real-world applications that require re-identification of both persons and vehicles at the same time. To overcome this limitation, this paper proposes a deep neural network learning method that combines triplet and softmax losses to improve performance and re-identify people and vehicles simultaneously. Combining the softmax loss with the triplet loss makes it possible to learn the detailed differences between identities (IDs). In addition, weights are introduced to avoid bias toward either loss when combining them. We used the Market-1501 and DukeMTMC-reID datasets, which are frequently used to evaluate person re-identification. The vehicle re-identification experiments were evaluated on the VeRi-776 and VehicleID datasets. Since the proposed method is not designed around a network specialized for a specific object, it can re-identify both persons and vehicles simultaneously. To demonstrate this, an experiment was performed using a combined person and vehicle re-identification dataset.
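A minimal sketch of a weighted softmax-plus-triplet objective of this kind; the batch-hard mining shown here is a common Re-ID choice assumed for illustration, not necessarily the authors' sampling strategy, and the weights w_softmax and w_triplet are placeholders.

```python
import torch
import torch.nn.functional as F

def batch_hard_triplet(embeddings, labels, margin: float = 0.3):
    # Batch-hard mining: hardest positive and hardest negative per anchor.
    dist = torch.cdist(embeddings, embeddings)
    same_id = labels.unsqueeze(0).eq(labels.unsqueeze(1)).float()
    hardest_pos = (dist * same_id).max(dim=1).values           # farthest same-ID sample
    hardest_neg = (dist + same_id * 1e9).min(dim=1).values     # closest different-ID sample
    return F.relu(hardest_pos - hardest_neg + margin).mean()

def reid_loss(embeddings, id_logits, labels,
              w_softmax: float = 1.0, w_triplet: float = 1.0):
    # Weighted sum so that neither the ID-classification term nor the metric term dominates.
    return (w_softmax * F.cross_entropy(id_logits, labels)
            + w_triplet * batch_hard_triplet(embeddings, labels))
```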

Comparison of Transabdominal and Transvaginal Selective Fetal Reduction in Multifetal Pregnancy (다태임신에서의 선택적 유산술시 복식 천자와 질식 천자의 비교 연구)

  • Kim, S.H.; Moon, S.Y.; Lee, J.Y.
    • Clinical and Experimental Reproductive Medicine / v.23 no.1 / pp.11-24 / 1996
  • The number of multifetal pregnancies has increased dramatically as a result of the widespread clinical use of ovulation induction and assisted reproductive technology (ART) in infertile patients. In multifetal pregnancies, the adverse outcome is directly proportional to the number of fetuses within the uterus, primarily because of an increased predisposition to premature delivery. It is extremely difficult to counsel patients about the expected outcome of pregnancies involving three or more fetuses. To increase the chances of delivering infants mature enough to survive without being irreversibly damaged by the sequelae of marked prematurity, selective fetal reduction (SFR) to a smaller number of fetuses should be considered in multifetal pregnancies. From January 1991 to December 1992, transabdominal SFR in multifetal pregnancies was performed in 22 patients, including 13 triplet, 7 quadruplet, 1 quintuplet and 1 heptuplet pregnancies. Transabdominal SFR using intracardiac KCl injection and aspiration of amniotic fluid was carried out at 8-13 weeks of gestation. After the procedure, 20 patients remained as twin pregnancies and 2 patients as triplet pregnancies. There have been 11 sets of twin deliveries including 2 stillbirths, 2 sets of triplet deliveries including 1 stillbirth, and 1 singleton delivery. Six cases were delivered after 37 weeks of gestation, 4 cases at 33-37 weeks, and 1 case at 30 weeks. Unfortunately, 3 stillbirths occurred at 20-24 weeks of gestation, and 4 cases were aborted. As 7 pregnancy losses including 1 case of septic abortion occurred, the delayed fetal loss rate was 38.9% (7/18) for transabdominal SFR. All babies born after 30 weeks of gestation were healthy, and no fetal anomaly directly related to the procedure was encountered. From July 1993 to February 1995, transvaginal SFR was performed in 20 patients, including 15 triplet, 4 quadruplet and 1 quintuplet pregnancies. Transvaginal SFR using the same method as transabdominal SFR was carried out at 8-11 weeks of gestation. After the procedure, 19 patients remained as twin pregnancies and 1 patient as a singleton pregnancy. There have been 13 sets of twin deliveries including 2 stillbirths, and 1 singleton delivery. Six cases were delivered after 37 weeks of gestation, 5 cases at 36-37 weeks, and 1 case at 30 weeks. Unfortunately, 2 stillbirths occurred at 20 and 21 weeks of gestation, respectively, and 2 cases were aborted. As 4 pregnancy losses including 1 case of septic abortion occurred, the delayed fetal loss rate was 25.0% (4/16) for transvaginal SFR. No fetal anomaly directly related to the procedure was encountered. These results suggest that transvaginal SFR can be performed more easily and earlier, with a lower fetal loss rate, than transabdominal SFR. In conclusion, SFR is a rather safe and ethically justified procedure that may improve the outcome of multifetal pregnancies.

Attention Deep Neural Networks Learning based on Multiple Loss functions for Video Face Recognition (비디오 얼굴인식을 위한 다중 손실 함수 기반 어텐션 심층신경망 학습 제안)

  • Kim, Kyeong Tae; You, Wonsang; Choi, Jae Young
    • Journal of Korea Multimedia Society / v.24 no.10 / pp.1380-1390 / 2021
  • Video face recognition (FR) is one of the most popular research topics in computer vision due to its variety of applications. In particular, research using attention mechanisms is being actively conducted. In video face recognition, attention indicates where to focus, either on the whole input or on a specific region, or which frames to focus on when many frames are available. In this paper, we propose a novel attention based deep learning method. The main novelties of our method are (1) the combination of two loss functions, namely a weighted softmax loss and a triplet loss, and (2) end-to-end learning that includes both the feature embedding network and the attention weight computation. The feature embedding network benefits the attention weight computation through the combined loss function and end-to-end learning. To demonstrate the effectiveness of our proposed method, extensive comparative experiments were carried out on the IJB-A dataset with its standard evaluation protocols. Our proposed method achieved better or comparable recognition rates compared to other state-of-the-art video FR methods.
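A minimal sketch under assumptions, not the authors' network: frame-level face embeddings are pooled with learned attention weights, and the pooled representation is trained with a weighted combination of a softmax term and a triplet term, so the attention and the embedding can be learned end to end. The pooling module and the scalar weights are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPool(nn.Module):
    """Pools frame-level embeddings into one video embedding with learned weights."""
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one attention score per frame

    def forward(self, frame_embs):      # frame_embs: (batch, frames, dim)
        weights = torch.softmax(self.score(frame_embs), dim=1)
        return (weights * frame_embs).sum(dim=1)  # attention-weighted video embedding

def video_fr_loss(logits, labels, anchor, positive, negative,
                  w_softmax: float = 1.0, w_triplet: float = 1.0, margin: float = 0.3):
    # Combined objective over the pooled video embeddings; both the pooling and
    # this loss are differentiable, enabling end-to-end training.
    return (w_softmax * F.cross_entropy(logits, labels)
            + w_triplet * F.triplet_margin_loss(anchor, positive, negative, margin=margin))
```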

Adaptive Attention Annotation Model: Optimizing the Prediction Path through Dependency Fusion

  • Wang, Fangxin; Liu, Jie; Zhang, Shuwu; Zhang, Guixuan; Zheng, Yang; Li, Xiaoqian; Liang, Wei; Li, Yuejun
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.9 / pp.4665-4683 / 2019
  • Previous methods build image annotation models by leveraging three basic dependencies: relations between image and label (image/label), between images (image/image), and between labels (label/label). Although plenty of research shows that multiple dependencies can work jointly to improve annotation performance, different dependencies do not actually "work jointly" in those frameworks, whose performance largely depends on the result predicted by the image/label component. To address this problem, we propose the Adaptive Attention Annotation Model (AAAM) to associate these dependencies with the prediction path, which is composed of a series of labels (tags) in the order they are detected. In particular, we optimize the prediction path by detecting the relevant labels from the easy-to-detect to the hard-to-detect, found using Binary Cross-Entropy (BCE) and Triplet Margin (TM) losses, respectively. Besides, in order to capture the information of each label, instead of explicitly extracting regional features, we propose a self-attention mechanism to implicitly enhance the relevant regions and suppress the irrelevant ones. To validate the effectiveness of the model, we conduct experiments on three well-known public datasets, COCO 2014, IAPR TC-12 and NUS-WIDE, and achieve better performance than the state-of-the-art methods.
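A minimal illustrative sketch, not the AAAM implementation: candidate labels are ordered into a prediction path from easy-to-detect to hard-to-detect by sigmoid confidence, with BCE supervising the easy labels; the threshold and ordering rule are assumptions, and the triplet margin loss applied to the hard labels is only indicated in a comment.

```python
import torch
import torch.nn.functional as F

def prediction_path(label_logits, easy_threshold: float = 0.7):
    # Order labels from easy-to-detect to hard-to-detect by sigmoid confidence.
    probs = torch.sigmoid(label_logits)            # shape: (num_labels,)
    order = torch.argsort(probs, descending=True)
    easy = order[probs[order] >= easy_threshold]
    hard = order[probs[order] < easy_threshold]
    return easy, hard

def easy_label_loss(label_logits, targets):
    # BCE supervises the easy labels; the paper additionally applies a triplet
    # margin loss (e.g. torch.nn.TripletMarginLoss) to the hard-to-detect ones.
    return F.binary_cross_entropy_with_logits(label_logits, targets.float())
```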

A study on combination of loss functions for effective mask-based speech enhancement in noisy environments (잡음 환경에 효과적인 마스크 기반 음성 향상을 위한 손실함수 조합에 관한 연구)

  • Jung, Jaehee; Kim, Wooil
    • The Journal of the Acoustical Society of Korea / v.40 no.3 / pp.234-240 / 2021
  • In this paper, mask-based speech enhancement is improved for effective speech recognition in noisy environments. In mask-based speech enhancement, the enhanced spectrum is obtained by multiplying the noisy speech spectrum by a mask. The VoiceFilter (VF) model is used for mask estimation, and the Spectrogram Inpainting (SI) technique is used to remove residual noise from the enhanced spectrum. We propose a combined loss to further improve speech enhancement: to effectively remove the residual noise in the speech, the positive part of the triplet loss is used together with the component loss. For the experiments, the TIMIT database is reconstructed using NOISEX-92 noise and background music samples under various Signal-to-Noise Ratio (SNR) conditions. Source-to-Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI) are used as performance metrics. When the VF model was trained with the mean squared error and the SI model was trained with the combined loss, SDR, PESQ, and STOI improved by 0.5, 0.06, and 0.002, respectively, compared to the system trained only with the mean squared error.
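A minimal sketch of one reading of such a combined loss (the exact component loss and the precise triplet term kept by the paper are not reproduced here): a mean-squared term between the enhanced and clean magnitude spectra plus an anchor-positive distance that pulls the enhanced spectrum toward the clean reference. The weight alpha and the (batch, freq, time) tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def combined_enhancement_loss(enhanced_mag, clean_mag, alpha: float = 0.5):
    # Mean-squared term between enhanced and clean magnitude spectra...
    mse = F.mse_loss(enhanced_mag, clean_mag)
    # ...plus the anchor-positive distance of a triplet whose anchor is the
    # enhanced spectrum and whose positive is the clean reference.
    d_pos = F.pairwise_distance(enhanced_mag.flatten(1), clean_mag.flatten(1)).mean()
    return mse + alpha * d_pos
```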