• Title/Summary/Keyword: multi discriminator

Search Result 16, Processing Time 0.019 seconds

A study on speech disentanglement framework based on adversarial learning for speaker recognition (화자 인식을 위한 적대학습 기반 음성 분리 프레임워크에 대한 연구)

  • Kwon, Yoohwan;Chung, Soo-Whan;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.5
    • /
    • pp.447-453
    • /
    • 2020
  • In this paper, we propose a system to extract effective speaker representations from a speech signal using a deep learning method. Based on the fact that speech signal contains identity unrelated information such as text content, emotion, background noise, and so on, we perform a training such that the extracted features only represent speaker-related information but do not represent speaker-unrelated information. Specifically, we propose an auto-encoder based disentanglement method that outputs both speaker-related and speaker-unrelated embeddings using effective loss functions. To further improve the reconstruction performance in the decoding process, we also introduce a discriminator popularly used in Generative Adversarial Network (GAN) structure. Since improving the decoding capability is helpful for preserving speaker information and disentanglement, it results in the improvement of speaker verification performance. Experimental results demonstrate the effectiveness of our proposed method by improving Equal Error Rate (EER) on benchmark dataset, Voxceleb1.

A multi-label Classification of Attributes on Face Images

  • Le, Giang H.;Lee, Yeejin
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2021.06a
    • /
    • pp.105-108
    • /
    • 2021
  • Generative adversarial networks (GANs) have reached a great result at creating the synthesis image, especially in the face generation task. Unlike other deep learning tasks, the input of GANs is usually the random vector sampled by a probability distribution, which leads to unstable training and unpredictable output. One way to solve those problems is to employ the label condition in both the generator and discriminator. CelebA and FFHQ are the two most famous datasets for face image generation. While CelebA contains attribute annotations for more than 200,000 images, FFHQ does not have attribute annotations. Thus, in this work, we introduce a method to learn the attributes from CelebA then predict both soft and hard labels for FFHQ. The evaluated result from our model achieves 0.7611 points of the metric is the area under the receiver operating characteristic curve.

  • PDF

A method based on Multi-Convolution layers Joint and Generative Adversarial Networks for Vehicle Detection

  • Han, Guang;Su, Jinpeng;Zhang, Chengwei
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.4
    • /
    • pp.1795-1811
    • /
    • 2019
  • In order to achieve rapid and accurate detection of vehicle objects in complex traffic conditions, we propose a novel vehicle detection method. Firstly, more contextual and small-object vehicle information can be obtained by our Joint Feature Network (JFN). Secondly, our Evolved Region Proposal Network (EPRN) generates initial anchor boxes by adding an improved version of the region proposal network in this network, and at the same time filters out a large number of false vehicle boxes by soft-Non Maximum Suppression (NMS). Then, our Mask Network (MaskN) generates an example that includes the vehicle occlusion, the generator and discriminator can learn from each other in order to further improve the vehicle object detection capability. Finally, these candidate vehicle detection boxes are optimized to obtain the final vehicle detection boxes by the Fine-Tuning Network(FTN). Through the evaluation experiment on the DETRAC benchmark dataset, we find that in terms of mAP, our method exceeds Faster-RCNN by 11.15%, YOLO by 11.88%, and EB by 1.64%. Besides, our algorithm also has achieved top2 comaring with MS-CNN, YOLO-v3, RefineNet, RetinaNet, Faster-rcnn, DSSD and YOLO-v2 of vehicle category in KITTI dataset.

Segmentation of Mammography Breast Images using Automatic Segmen Adversarial Network with Unet Neural Networks

  • Suriya Priyadharsini.M;J.G.R Sathiaseelan
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.12
    • /
    • pp.151-160
    • /
    • 2023
  • Breast cancer is the most dangerous and deadly form of cancer. Initial detection of breast cancer can significantly improve treatment effectiveness. The second most common cancer among Indian women in rural areas. Early detection of symptoms and signs is the most important technique to effectively treat breast cancer, as it enhances the odds of receiving an earlier, more specialist care. As a result, it has the possible to significantly improve survival odds by delaying or entirely eliminating cancer. Mammography is a high-resolution radiography technique that is an important factor in avoiding and diagnosing cancer at an early stage. Automatic segmentation of the breast part using Mammography pictures can help reduce the area available for cancer search while also saving time and effort compared to manual segmentation. Autoencoder-like convolutional and deconvolutional neural networks (CN-DCNN) were utilised in previous studies to automatically segment the breast area in Mammography pictures. We present Automatic SegmenAN, a unique end-to-end adversarial neural network for the job of medical image segmentation, in this paper. Because image segmentation necessitates extensive, pixel-level labelling, a standard GAN's discriminator's single scalar real/fake output may be inefficient in providing steady and appropriate gradient feedback to the networks. Instead of utilising a fully convolutional neural network as the segmentor, we suggested a new adversarial critic network with a multi-scale L1 loss function to force the critic and segmentor to learn both global and local attributes that collect long- and short-range spatial relations among pixels. We demonstrate that an Automatic SegmenAN perspective is more up to date and reliable for segmentation tasks than the state-of-the-art U-net segmentation technique.

A Study on Unsupervised Learning Method of RAM-based Neural Net (RAM 기반 신경망의 비지도 학습에 관한 연구)

  • Park, Sang-Moo;Kim, Seong-Jin;Lee, Dong-Hyung;Lee, Soo-Dong;Ock, Cheol-Young
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.1
    • /
    • pp.31-38
    • /
    • 2011
  • A RAM-based Neural Net is a weightless neural network based on binary neural network. 3-D neural network using this paper is binary neural network with multiful information bits and store counts of training. Recognition method by MRD technique is based on the supervised learning. Therefore neural network by itself can not distinguish between the categories and well-separated categories of training data can achieve only through the performance. In this paper, unsupervised learning algorithm is proposed which is trained existing 3-D neural network without distinction of data, to distinguish between categories depending on the only input training patterns. The training data for proposed unsupervised learning provided by the NIST handwritten digits of MNIST which is consist of 0 to 9 multi-pattern, a randomly materials are used as training patterns. Through experiments, neural network is to determine the number of discriminator which each have an idea of the handwritten digits that can be interpreted.

Improving target recognition of active sonar multi-layer processor through deep learning of a small amounts of imbalanced data (소수 불균형 데이터의 심층학습을 통한 능동소나 다층처리기의 표적 인식성 개선)

  • Young-Woo Ryu;Jeong-Goo Kim
    • The Journal of the Acoustical Society of Korea
    • /
    • v.43 no.2
    • /
    • pp.225-233
    • /
    • 2024
  • Active sonar transmits sound waves to detect covertly maneuvering underwater objects and detects the signals reflected back from the target. However, in addition to the target's echo, the active sonar's received signal is mixed with seafloor, sea surface reverberation, biological noise, and other noise, making target recognition difficult. Conventional techniques for detecting signals above a threshold not only cause false detections or miss targets depending on the set threshold, but also have the problem of having to set an appropriate threshold for various underwater environments. To overcome this, research has been conducted on automatic calculation of threshold values through techniques such as Constant False Alarm Rate (CFAR) and application of advanced tracking filters and association techniques, but there are limitations in environments where a significant number of detections occur. As deep learning technology has recently developed, efforts have been made to apply it in the field of underwater target detection, but it is very difficult to acquire active sonar data for discriminator learning, so not only is the data rare, but there are only a very small number of targets and a relatively large number of non-targets. There are difficulties due to the imbalance of data. In this paper, the image of the energy distribution of the detection signal is used, and a classifier is learned in a way that takes into account the imbalance of the data to distinguish between targets and non-targets and added to the existing technique. Through the proposed technique, target misclassification was minimized and non-targets were eliminated, making target recognition easier for active sonar operators. And the effectiveness of the proposed technique was verified through sea experiment data obtained in the East Sea.