• Title/Summary/Keyword: voice image

Speech Recognition Using Linear Discriminant Analysis and Common Vector Extraction (선형 판별분석과 공통벡터 추출방법을 이용한 음성인식)

  • 남명우;노승용
    • The Journal of the Acoustical Society of Korea / v.20 no.4 / pp.35-41 / 2001
  • This paper describes the use of Linear Discriminant Analysis (LDA) and common vector extraction for speech recognition. A voice signal carries the psychological and physiological properties of the speaker as well as dialect differences, acoustical environment effects, and phase differences. For these reasons, the same word spoken by different speakers can sound very different. This property of the speech signal makes it very difficult to extract the properties common to a speech class (word or phoneme). Linear algebra methods such as the KLT (Karhunen-Loeve Transformation) are generally used to extract common properties from speech signals, but this paper uses the common vector extraction method suggested by M. Bilginer et al. Their method extracts an optimized common vector from the speech signals used for training, and it achieves 100% recognition accuracy on the training data used for common vector extraction. Despite these characteristics, the method has some drawbacks: a large number of speech signals cannot be used for training, and the discriminant information among common vectors is not defined. This paper proposes an improved method that reduces the error rate by maximizing the discriminant information among common vectors. A novel method for normalizing the size of the common vector is also added. The results show improved performance of the algorithm, with recognition accuracy 2% higher than that of the conventional method.

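The common vector idea in the abstract above can be illustrated with a minimal sketch. It assumes the usual formulation of the common vector approach: the differences between a class's training vectors span a "difference subspace", and removing that subspace's component from any training vector leaves the same common vector. The function names, the 16-dimensional random features, and the labels `word_a` and `word_b` are hypothetical, and the paper's LDA-based extension and size normalization are not reproduced here.

```python
import numpy as np

def common_vector(train):
    """Return (common vector, orthonormal basis of the difference subspace).

    train: (m, n) array holding m feature vectors of one class, with m <= n
    and linearly independent difference vectors (an assumption of this sketch).
    """
    ref = train[0]
    diffs = (train[1:] - ref).T            # (n, m-1): columns span the difference subspace
    q, _ = np.linalg.qr(diffs)             # orthonormal basis of that subspace
    common = ref - q @ (q.T @ ref)         # remove the within-class component
    return common, q

def classify(x, models):
    """Pick the class whose common vector is closest after removing that
    class's difference-subspace component from x."""
    def distance(model):
        common, q = model
        residual = x - q @ (q.T @ x)
        return np.linalg.norm(residual - common)
    return min(models, key=lambda label: distance(models[label]))

# Hypothetical training data: two classes, 5 vectors each, 16 features.
rng = np.random.default_rng(0)
train_sets = {
    "word_a": rng.normal(size=(5, 16)),
    "word_b": rng.normal(size=(5, 16)),
}
models = {label: common_vector(vecs) for label, vecs in train_sets.items()}
predicted = classify(train_sets["word_a"][3], models)
print(predicted)
```

Every training vector of a class maps exactly onto that class's common vector, which is consistent with the 100% accuracy on training data mentioned in the abstract. It also shows why the number of training vectors is limited: with m vectors in n dimensions, the difference subspace has up to m-1 dimensions, and it must not fill the whole feature space.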

The Evolution of Cyber Singer Viewed from the Coevolution of Man and Machine (인간과 기계의 공진화적 관점에서 바라본 사이버가수의 진화과정)

  • Kim, Dae-Woo
    • Cartoon and Animation Studies / s.39 / pp.261-295 / 2015
  • Cyber singers appeared in the late 1990s and soon disappeared. Although a few attempts followed in the 2000s, none achieved significant success. The cyber singer was born of technical development in the IT industry and the emergence of the idol training system in the music industry, and it developed from 'Adam' through to the Vocaloid 'Seeyou'. Unlike typical digital characters in cartoons or games, a cyber singer can become an object of idolization through the medium of music, and it can form its own fandom. These attempts and repeated failures could therefore be dismissed as a passing fashion, but the continuing creation of content and the ongoing attempts to take advantage of new media such as Vocaloid show that expectations for a truly cyber-born singer remain. Early cyber singers were made only to resemble humans in appearance, but 'Sciart' and 'Seeyou' have evolved to resemble human capabilities as well. This paper argues that the stylized cyber singers that disappeared in the past were not simply failures: in the course of technology developing toward an artificial life of its own, they gradually changed public perception of machines with a human-like image. The direction of this evolution is for the machine to acquire human functions and to exchange enjoyment and feelings with humans, becoming an artificial life form that has evolved in both appearance and function. To support this argument, I refer to Bruce Mazlish's study of the coevolution of man and machine. From the perspective of Mazlish's research, I analyze the evolution of cyber singers since the late 1990s, covering the planning of the eight singers who have appeared, the design of the cyber characters, and the voice (vocals), which is important to their evaluation as singers. The machine has been evolving in coevolution with humans.
Cyber singers are regarded with ambivalence as objects of development, but efforts to create new artificial beings continue, driven by human desire and the fear of death. New cyber beings are therefore likely to take the same style as 'Seeyou'. Cartoon forms and mechanical-sounding voices may not be signifiers of what humans really desire, but they can be created at the point where the contemporary public's desire intersects with technical development.

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Maintenance and failure prevention through anomaly detection in ICT infrastructure are becoming important. System monitoring data is multidimensional time series data, and it is difficult to account for the characteristics of multidimensional data and of time series data at the same time. With multidimensional data, the correlation between variables must be considered. Existing probability-based, linear, and distance-based methods degrade because of the curse of dimensionality. In addition, time series data is usually preprocessed with the sliding window technique and time series decomposition for autocorrelation analysis. These techniques increase the dimension of the data, so they need to be supplemented. Anomaly detection is a long-established research field; statistical methods and regression analysis were used in its early days, and machine learning and artificial neural networks are now being actively applied. Statistical methods are difficult to apply when the data is non-homogeneous, and they do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and then detect anomalies by comparing the predicted value with the actual value. Their performance drops when the model is not robust or when the data contains noise or outliers, so they are restricted to training data free of noise and outliers. An autoencoder built on artificial neural networks is trained to produce output as similar as possible to its input. It has many advantages over existing probability and linear models, cluster analysis, and supervised learning: it can be applied to data that does not satisfy a probability distribution or linear assumption.
In addition, it can be trained in an unsupervised manner, without labeled data. However, autoencoder-based anomaly detection is limited in identifying local outliers in multidimensional data, and the dimension of the data increases greatly because of the characteristics of time series data. In this study, we propose a CMAE (Conditional Multimodal Autoencoder) that improves anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to overcome the limitation in identifying local outliers in multidimensional data. Multimodal models are commonly used to learn from different types of input, such as voice and image. The different modals share the bottleneck of the autoencoder, through which the correlation between them is learned. In addition, a CAE (Conditional Autoencoder) was used to learn the characteristics of time series data effectively without increasing the dimension of the data. Conditional inputs are usually categorical variables, but in this study time was used as the condition in order to learn periodicity. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The reconstruction performance of the autoencoders for 41 variables was examined in the proposed model and the comparison models. Reconstruction performance differs by variable; reconstruction works well for the Memory, Disk, and Network modals in all three autoencoder models, as the loss values are small. The Process modal showed no significant difference among the three models, and the CPU modal showed excellent performance in CMAE. ROC curves were prepared to evaluate anomaly detection performance in the proposed model and the comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, and UAE.
In particular, recall was 0.9828 for CMAE, which confirms that it detects nearly all of the anomalies. The accuracy of the model also improved, to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. In practical terms, the proposed model has advantages beyond improved performance. Techniques such as time series decomposition and sliding windows add procedures that must be managed, and the dimensional increase they cause can slow down inference. The proposed model avoids these steps, which makes it easy to apply in practice in terms of inference speed and model management.
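The structure described in this abstract (one encoder per modal, a shared bottleneck, and time supplied as a condition instead of a sliding window) can be sketched as a forward pass in NumPy. This is only an illustration under stated assumptions: the per-modal split of the 41 variables, the layer sizes, the sin/cos encoding of hour-of-day, and the 12-step window used for comparison are all hypothetical, the weights are random rather than trained, and the abstract does not specify the authors' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def time_condition(hour):
    """Encode hour-of-day on the unit circle so that 23:00 and 00:00 are close."""
    angle = 2.0 * np.pi * np.asarray(hour, dtype=float) / 24.0
    return np.stack([np.sin(angle), np.cos(angle)], axis=-1)

def dense(n_in, n_out):
    """Random (untrained) weights and zero bias for one linear layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

class CMAESketch:
    def __init__(self, modal_dims, hidden=8, bottleneck=4, cond_dim=2):
        self.modal_dims = modal_dims
        # One encoder per modal; each also receives the time condition.
        self.enc = {m: dense(d + cond_dim, hidden) for m, d in modal_dims.items()}
        # All modals meet in one shared bottleneck.
        self.shared = dense(hidden * len(modal_dims), bottleneck)
        # One decoder per modal, again conditioned on time.
        self.dec = {m: dense(bottleneck + cond_dim, d) for m, d in modal_dims.items()}

    def forward(self, inputs, cond):
        hidden = []
        for name in self.modal_dims:
            w, b = self.enc[name]
            hidden.append(np.tanh(np.concatenate([inputs[name], cond], axis=1) @ w + b))
        w, b = self.shared
        code = np.tanh(np.concatenate(hidden, axis=1) @ w + b)
        code_c = np.concatenate([code, cond], axis=1)
        return {m: code_c @ self.dec[m][0] + self.dec[m][1] for m in self.modal_dims}

def anomaly_scores(inputs, recon):
    """Per-sample mean squared reconstruction error over all modals."""
    x = np.concatenate([inputs[m] for m in inputs], axis=1)
    r = np.concatenate([recon[m] for m in inputs], axis=1)
    return np.mean((x - r) ** 2, axis=1)

# Hypothetical split of the 41 monitored variables across the five modals.
modal_dims = {"cpu": 9, "memory": 8, "disk": 8, "network": 8, "process": 8}
hours = [0, 4, 8, 12, 16, 20]
cond = time_condition(hours)
inputs = {m: rng.normal(size=(len(hours), d)) for m, d in modal_dims.items()}

model = CMAESketch(modal_dims)
recon = model.forward(inputs, cond)
scores = anomaly_scores(inputs, recon)

# Input width: conditioning on time vs. a hypothetical 12-step sliding window.
n_vars = sum(modal_dims.values())
conditioned_width = n_vars + cond.shape[1]
windowed_width = n_vars * 12
print(conditioned_width, windowed_width)
```

In a trained model, samples whose score exceeds a chosen threshold would be flagged as anomalies. The width comparison at the end illustrates the abstract's practical argument: the time condition adds two inputs, whereas a sliding window multiplies the input width by the window length.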