• Title/Summary/Keyword: similarity learning


A Korean Multi-speaker Text-to-Speech System Using d-vector (d-vector를 이용한 한국어 다화자 TTS 시스템)

  • Kim, Kwang Hyeon;Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.8 no.3 / pp.469-475 / 2022
  • Training a deep learning-based single-speaker TTS model requires a speech database of tens of hours and a long training time, which makes this approach inefficient in time and cost for building multi-speaker or personalized TTS models. The voice cloning method uses a speaker encoder model to build a TTS model for a new speaker. The trained speaker encoder creates a speaker embedding vector representing the new speaker's timbre from a small amount of that speaker's speech, which was not used for training. In this paper, we propose a multi-speaker TTS system that applies voice cloning. The proposed TTS system consists of a speaker encoder, a synthesizer, and a vocoder. The speaker encoder applies the d-vector technique used in the speaker recognition field. The timbre of the new speaker is expressed by feeding the d-vector derived from the trained speaker encoder to the synthesizer as an additional input. The results of the MOS and timbre similarity listening tests show that the performance of the proposed TTS system is excellent.
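
The abstract above does not say how the d-vector is computed or compared. The following is a minimal sketch of one common formulation from speaker recognition, assuming the d-vector is the L2-normalized average of per-frame encoder outputs and that two speakers are compared by cosine similarity. The function names `l2_normalize`, `d_vector`, and `cosine_similarity` are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def d_vector(frame_embeddings):
    """Average L2-normalized frame-level embeddings into one utterance-level vector.

    Assumption: each element of frame_embeddings is the speaker encoder's
    output for one frame or window of speech.
    """
    if not frame_embeddings:
        raise ValueError("need at least one frame embedding")
    dim = len(frame_embeddings[0])
    normalized = [l2_normalize(frame) for frame in frame_embeddings]
    mean = [sum(frame[i] for frame in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


# Toy example with 2-dimensional "embeddings" for two short utterances.
enrolled = d_vector([[1.0, 0.0], [0.0, 1.0]])
candidate = d_vector([[2.0, 2.0]])
similarity = cosine_similarity(enrolled, candidate)
```

In a voice cloning pipeline of the kind described, a vector like `enrolled` would be passed to the synthesizer as a conditioning input. The cosine score here is only an objective proxy and does not replace the listening tests reported in the abstract.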

Comparative Analysis on Smart Features of IoT Home Living Products among Korea, China and Japan (한·중·일 IoT홈 가전생활재의 지능형 기능성 비교연구)

  • Zhang, Chun Chun;Lee, Yeun Sook;Hwang, Ji Hye;Park, Jae Hyun
    • Design Convergence Study / v.15 no.2 / pp.237-250 / 2016
  • With the rapid development of network technology and digital information technology, people are stepping into an intelligent, internet-connected society, and the quality of living and working environments keeps improving. In the internet era, the timeliness and convenience of smart home systems have improved greatly, and many smart products have gradually entered people's daily lives. Household appliances are among the most popular of these. This paper compares the smart features of household living products from the most representative brands in China, Japan, and South Korea. The smart features include self-learning, self-adapting, self-coordinating, self-diagnosing, self-inferring, self-organizing, and self-adjusting. As a result, most smart features of these products showed great similarity, while some features were dominant in particular countries: remote control in Korea, energy saving in Japan, and one-button operation in China.

Multi-task Deep Neural Network Model for T1CE Image Synthesis and Tumor Region Segmentation in Glioblastoma Patients (교모세포종 환자의 T1CE 영상 생성 및 암 영역분할을 위한 멀티 태스크 심층신경망 모델)

  • Kim, Eunjin;Park, Hyunjin
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.474-476 / 2021
  • Glioblastoma is the most common brain malignancy arising from glial cells. Early diagnosis and treatment planning are important, and the cancer is diagnosed mainly through T1CE imaging, which requires injection of a contrast agent. However, the risk associated with injecting gadolinium-based contrast agents has been increasing recently. Region segmentation, which marks cancer regions in medical images, plays a key role in CAD systems, and deep neural network models for synthesizing new images are also being studied. In this study, we propose a model that simultaneously learns to generate T1CE images and to segment cancer regions. The performance of the proposed model is evaluated using similarity measurements including mean square error and peak signal-to-noise ratio, and shows average result values of 21 and 39 dB.
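
The two similarity measurements named in the abstract above have standard definitions, sketched below for images flattened to lists of pixel values. This is an illustrative sketch, not the authors' evaluation code. The function names and the default peak value of 255 (8-bit images) are assumptions, since the abstract does not state the intensity range used.

```python
import math


def mean_square_error(reference, estimate):
    """Average squared pixel difference between two equally sized images."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("images must be non-empty and the same size")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def peak_signal_to_noise_ratio(reference, estimate, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE); infinite for identical images."""
    err = mean_square_error(reference, estimate)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Toy example: a 2x2 "real" image and a synthesized image, both flattened.
real_image = [100.0, 110.0, 120.0, 130.0]
synthesized = [101.0, 109.0, 122.0, 130.0]
mse_value = mean_square_error(real_image, synthesized)
psnr_value = peak_signal_to_noise_ratio(real_image, synthesized)
```

A lower MSE and a higher PSNR both indicate that the synthesized image is closer to the reference.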


Visual Model of Pattern Design Based on Deep Convolutional Neural Network

  • Jingjing Ye;Jun Wang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.2 / pp.311-326 / 2024
  • The rapid development of neural network technology has enabled big-data-driven neural network models to handle the texture effects of complex objects. Because of limitations in complex scenes, it is necessary to establish custom template matching and apply it to research in many fields of computer vision. The dependence on high-quality, small labeled sample databases is not very strong, and machine learning systems that rely on deep feature connections perform relatively poorly at texture effect inference. The style transfer algorithm based on neural networks collects and preserves pattern data, and extracts and modernizes pattern features. Through the algorithm model, it is easier to present the texture and color of patterns and display them digitally. In this paper, based on texture effect reasoning with custom template matching, the 3D visualization of the target is transformed into a 3D model. The similarity between the scene to be inferred and the user-defined template is calculated from the template's multi-dimensional external feature labels. A convolutional neural network is adopted to optimize the external area of the object, improving the sampling quality and computational performance of the sample pyramid structure. The results indicate that the proposed algorithm can accurately capture the salient target, ablate more noise, and improve the visualization results. The proposed deep convolutional neural network optimization algorithm is fast, accurate, and robust. It can adapt to computation in more task scenes, display the redundant vision-related information of image conversion, and further improve the computational efficiency and accuracy of convolutional networks, which is of high significance for research on image information conversion.

Research of PPI prediction model based on POST-TAVR ECG (POST-TAVR ECG 기반의 PPI 예측 모델 연구)

  • InSeo Song;SeMo Yang;KangYoon Lee
    • Journal of Internet Computing and Services / v.25 no.2 / pp.29-38 / 2024
  • After Transcatheter Aortic Valve Replacement (TAVR), comprehensive management of complications, including the need for Permanent Pacemaker Implantation (PPI), is crucial, increasing the demand for accurate prediction models. Departing from traditional image-based methods, this study developed an optimal PPI prediction model based on ECG data using the XGBoost algorithm. Focusing on ECG signals like DeltaPR and DeltaQRS as key indicators, the model effectively identifies the correlation between conduction disorders and PPI needs, achieving superior performance with an AUC of 0.91. Validated using data from two hospitals, it demonstrated a high similarity rate of 95.28% in predicting PPI from ECG characteristics. This confirms the model's effective applicability across diverse hospital data, establishing a significant advancement in the development of reliable and practical PPI prediction models with reduced dependence on human intervention and costly medical imaging.

Improving Accuracy of Chapter-level Lecture Video Recommendation System using Keyword Cluster-based Graph Neural Networks

  • Purevsuren Chimeddorj;Doohyun Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.7 / pp.89-98 / 2024
  • In this paper, we propose a system for recommending lecture videos at the chapter level, addressing the balance between accuracy and processing speed in chapter-level video recommendations. Specifically, it has been observed that enhancing recommendation accuracy reduces processing speed, while increasing processing speed decreases accuracy. To mitigate this trade-off, a hybrid approach is proposed, utilizing techniques such as TF-IDF, k-means++ clustering, and Graph Neural Networks (GNN). The approach involves pre-constructing clusters based on chapter similarity to reduce computational load during recommendations, thereby improving processing speed, and applying GNN to the graph of clusters as nodes to enhance recommendation accuracy. Experimental results indicate that the use of GNN resulted in an approximate 19.7% increase in recommendation accuracy, as measured by the Mean Reciprocal Rank (MRR) metric, and an approximate 27.7% increase in precision defined by similarities. These findings are expected to contribute to the development of a learning system that recommends more suitable video chapters in response to learners' queries.
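
The Mean Reciprocal Rank (MRR) metric cited in the abstract above can be sketched as follows, using its standard definition: for each query, take the reciprocal of the rank of the first relevant item, then average over queries. The function name and the toy chapter identifiers are hypothetical, and the sketch assumes a query with no relevant item in its list contributes 0.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item per query (0 if none is found)."""
    if len(ranked_lists) != len(relevant_sets) or not ranked_lists:
        raise ValueError("need one non-empty, equal-length pair of inputs")
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for position, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / position
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


# Toy example: three learner queries, each with a ranked list of chapter IDs.
recommendations = [
    ["ch3", "ch7", "ch1"],  # first relevant chapter at rank 2
    ["ch5", "ch2", "ch9"],  # first relevant chapter at rank 1
    ["ch4", "ch8"],         # no relevant chapter recommended
]
relevant = [{"ch7"}, {"ch5", "ch9"}, {"ch6"}]
mrr = mean_reciprocal_rank(recommendations, relevant)
```

Here the per-query scores are 1/2, 1, and 0, so the MRR is 0.5.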

Removal of Seabed Multiples in Seismic Reflection Data using Machine Learning (머신러닝을 이용한 탄성파 반사법 자료의 해저면 겹반사 제거)

  • Nam, Ho-Soo;Lim, Bo-Sung;Kweon, Il-Ryong;Kim, Ji-Soo
    • Geophysics and Geophysical Exploration / v.23 no.3 / pp.168-177 / 2020
  • Seabed multiple reflections (seabed multiples) are the main cause of misinterpretations of primary reflections in both shot gathers and stack sections. Accordingly, seabed multiples need to be suppressed throughout data processing. Conventional model-driven methods, such as prediction-error deconvolution and Radon filtering, and data-driven methods, such as the surface-related multiple elimination technique, have been used to attenuate multiple reflections. However, the vast majority of processing workflows require time-consuming testing and selection of processing parameters, in addition to computational power and skilled data-processing techniques. To attenuate seabed multiples in seismic reflection data, input gathers with seabed multiples and label gathers without seabed multiples were generated via numerical modeling using the Marmousi2 velocity structure. The training data consisted of normal-moveout-corrected common midpoint gathers fed into a U-Net neural network. The well-trained model was found to effectively attenuate the seabed multiples according to the image similarity between the prediction result and the target data, and demonstrated good applicability to field data.

Raindrop Removal and Background Information Recovery in Coastal Wave Video Imagery using Generative Adversarial Networks (적대적생성신경망을 이용한 연안 파랑 비디오 영상에서의 빗방울 제거 및 배경 정보 복원)

  • Huh, Dong;Kim, Jaeil;Kim, Jinah
    • Journal of the Korea Computer Graphics Society / v.25 no.5 / pp.1-9 / 2019
  • In this paper, we propose a video enhancement method that uses generative adversarial networks to remove raindrops and restore the background information in the removed regions of coastal wave video imagery distorted by raindrops during rainfall. Two experimental models are implemented: the Pix2Pix network, widely used for image-to-image translation, and Attentive GAN, which currently performs well for raindrop removal on single images. The models are trained with a public dataset of paired natural images with and without raindrops, and the trained models are evaluated on raindrop removal and background information recovery for coastal wave video imagery distorted by rainwater. To improve performance, we acquired a paired video dataset with and without raindrops at a real coast and applied transfer learning to the pre-trained models with this new dataset. The fine-tuned models perform better than the pre-trained models. Performance is evaluated using the peak signal-to-noise ratio and the structural similarity index, and the Pix2Pix network fine-tuned by transfer learning shows the best performance in reconstructing coastal wave video imagery distorted by raindrops.

Improving Hypertext Classification Systems through WordNet-based Feature Abstraction (워드넷 기반 특징 추상화를 통한 웹문서 자동분류시스템의 성능향상)

  • Roh, Jun-Ho;Kim, Han-Joon;Chang, Jae-Young
    • The Journal of Society for e-Business Studies / v.18 no.2 / pp.95-110 / 2013
  • This paper presents a novel feature engineering technique that can improve conventional machine learning-based text classification systems. The proposed method extends the initial set of features by using hyperlink relationships in order to effectively categorize hypertext web documents. Web documents are connected to each other through hyperlinks, and in many cases hyperlinks exist among highly related documents. Such hyperlink relationships can be used to enhance the quality of the features that make up classification models. The basic idea of the proposed method is to generate a kind of abstracted concept feature that consists of a few raw feature words; for this, the method computes the semantic similarity between a target document and its neighbor documents by utilizing hierarchical relationships in the WordNet ontology. In developing classification models, the abstracted concept features are treated the same as other raw features, and they can play a major role in developing more accurate classification models. Through extensive experiments with the Web-KB test collection, we show that the proposed methods outperform the conventional ones.

A Study on Person Re-Identification System using Enhanced RNN (확장된 RNN을 활용한 사람재인식 시스템에 관한 연구)

  • Choi, Seok-Gyu;Xu, Wenjie
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.17 no.2 / pp.15-23 / 2017
  • Person re-identification is one of the most challenging problems in computer vision because of significant changes in human pose and background clutter with occlusions. Images from non-overlapping cameras make it harder to distinguish one person from another. To achieve better matching performance, most methods handle feature selection and distance metrics separately to obtain discriminative representations and a proper distance for describing the similarity between persons, which tends to ignore some significant features. This situation has encouraged us to consider a novel method for the problem. In this paper, we propose an enhanced recurrent neural network with a three-tier hierarchical network for person re-identification. Specifically, the proposed recurrent neural network (RNN) model contains an iterative expectation-maximization (EM) algorithm and a three-tier hierarchical network to jointly learn both the discriminative features and the distance metric. The iterative EM algorithm makes full use of the feature extraction ability of the convolutional neural network (CNN) placed in series before the RNN. Through unsupervised learning, the EM framework can change the labels of the patches and train on larger datasets. Through the three-tier hierarchical network, the convolutional neural network, recurrent network, and pooling layer jointly act as a feature extractor to better train the network. The experimental results show that, compared with other approaches in this field, the method achieves competitive accuracy. The influence of the different components of the method will be analyzed and evaluated in future research.