• Title/Summary/Keyword: Transformation-based Learning (변환기반 학습)


Automatic Generation of Pronunciation Variants for Korean Continuous Speech Recognition (한국어 연속음성 인식을 위한 발음열 자동 생성)

  • 이경님;전재훈;정민화
    • The Journal of the Acoustical Society of Korea / v.20 no.2 / pp.35-43 / 2001
  • Many speech recognition systems use a pronunciation lexicon with possibly multiple phonetic transcriptions for each word. The pronunciation lexicon is often created manually, a process that requires a great deal of time and effort and makes it difficult to maintain the consistency of the lexicon. To handle these problems, we present a model based on morphophonological analysis for automatically generating Korean pronunciation variants. By analyzing phonological variations frequently found in spoken Korean, we derived about 700 phonemic contexts that trigger the multilevel application of the corresponding phonological processes, which consist of phonemic and allophonic rules. In generating pronunciation variants, morphological analysis is performed first to handle variations of phonological words. According to the morphological category, a set of tables reflecting phonemic context is looked up to generate pronunciation variants. Our experiments show that the proposed model produces mostly correct pronunciation variants of phonological words. We then evaluated the usefulness of the pronunciation lexicon and the training phonetic transcriptions generated by the proposed system.

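The rule-table idea in this entry lends itself to a short illustration. Below is a minimal sketch, not the authors' system: it applies a few context-dependent rewrite rules to a romanized phoneme string to enumerate pronunciation variants. The rule fragment and the romanization are assumptions; the paper itself derives roughly 700 phonemic contexts from spoken Korean.

```python
# Minimal sketch (not the authors' system): applying context-dependent
# phonological rewrite rules to a phoneme string to generate variants.
# The rules below are an illustrative fragment only.
import re

# (pattern, replacement) pairs; e.g. obstruent + nasal -> nasal assimilation
RULES = [
    (r"k(?= *[nm])", "ng"),   # e.g. /kukmul/ -> [kungmul]
    (r"p(?= *[nm])", "m"),    # e.g. /papmul/ -> [pammul]
    (r"t(?= *[nm])", "n"),
]

def generate_variants(phonemes: str) -> set[str]:
    """Return the base form plus every form reachable by applying the rules."""
    variants = {phonemes}
    changed = True
    while changed:
        changed = False
        for form in list(variants):
            for pattern, repl in RULES:
                new_form = re.sub(pattern, repl, form)
                if new_form not in variants:
                    variants.add(new_form)
                    changed = True
    return variants

print(generate_variants("kukmul"))   # {'kukmul', 'kungmul'}
```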

A Sliding Window-based Multivariate Stream Data Classification (슬라이딩 윈도우 기반 다변량 스트림 데이타 분류 기법)

  • Seo, Sung-Bo;Kang, Jae-Woo;Nam, Kwang-Woo;Ryu, Keun-Ho
    • Journal of KIISE:Databases / v.33 no.2 / pp.163-174 / 2006
  • In a distributed wireless sensor network, it is difficult to transmit and analyze the entire data stream because of limited network bandwidth, power, and processing capacity. It is therefore preferable to classify the continuous stream data and then process it selectively. We propose a classification framework for continuous multivariate stream data. The proposed approach works in two steps. In the preprocessing step, it takes a sliding window of multivariate stream data as input and discretizes the data in the window into a string of symbols that characterize the signal changes. In the classification step, it uses a standard text classification algorithm to classify the discretized data in the window. We evaluated both supervised and unsupervised classification algorithms: for the supervised case we tested a Bayesian classifier and SVM, and for the unsupervised case we tested Jaccard, TF-IDF, Jaro, and Jaro-Winkler. In our experiments, SVM and TF-IDF outperformed the other classification methods. In particular, we observed that classification accuracy improves when the correlation of attributes is considered along with the n-gram tokens of symbols.
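
The two-step procedure (window discretization followed by text classification) can be sketched in a few lines. The following is a simplified illustration under assumptions, not the paper's code: the symbol alphabet, the thresholds, and the toy data are invented, and TF-IDF with a linear SVM stands in for the classifiers evaluated in the paper.

```python
# Sketch: discretize each sliding window of a multivariate stream into a
# symbol string, then classify the strings with a standard text pipeline.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def window_to_symbols(window: np.ndarray) -> str:
    """Encode per-attribute signal changes as symbols: u(p), d(own), s(table)."""
    diffs = np.diff(window, axis=0)              # (time-1, attributes)
    symbols = np.where(diffs > 0.1, "u", np.where(diffs < -0.1, "d", "s"))
    # one token per (time step, attribute) so attribute correlation is preserved
    return " ".join("a%d%s" % (j, sym)
                    for row in symbols for j, sym in enumerate(row))

rng = np.random.default_rng(0)
windows = [rng.normal(size=(20, 3)) for _ in range(40)]   # toy stream windows
labels = [i % 2 for i in range(40)]                        # toy class labels

docs = [window_to_symbols(w) for w in windows]
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 3)), LinearSVC())
clf.fit(docs, labels)
print(clf.predict(docs[:5]))
```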

5D Light Field Synthesis from a Monocular Video (단안 비디오로부터의 5차원 라이트필드 비디오 합성)

  • Bae, Kyuho;Ivan, Andre;Park, In Kyu
    • Journal of Broadcast Engineering / v.24 no.5 / pp.755-764 / 2019
  • With currently available commercial light field cameras, it is difficult to acquire 5D light field video, since they can capture only still images or the devices are prohibitively expensive. To solve these problems, we propose a deep learning based method for synthesizing light field video from monocular video. To address the difficulty of obtaining light field video training data, we use UnrealCV to acquire synthetic light field data by realistic rendering of 3D graphics scenes and use it for training. The proposed deep learning framework synthesizes light field video with 9×9 sub-aperture images (SAIs) from the input monocular video. The proposed network consists of a network that predicts the appearance flow from the input image converted to a luminance image, and a network that predicts the optical flow between adjacent light field video frames obtained from the appearance flow.
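
A central operation in this kind of view synthesis is warping the input frame with a predicted appearance flow. The sketch below is illustrative only and assumes PyTorch; the flow here is a placeholder tensor rather than the output of the paper's networks.

```python
# Sketch: warp an input frame into one sub-aperture image (SAI) with a
# predicted per-pixel appearance flow, using bilinear sampling.
import torch
import torch.nn.functional as F

def warp_with_flow(image: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """image: (N, C, H, W); flow: (N, 2, H, W) in pixels; returns the warped image."""
    n, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().unsqueeze(0)      # (1, 2, H, W)
    coords = base + flow                                          # target pixel coordinates
    # normalize to [-1, 1] as required by grid_sample
    coords_x = 2.0 * coords[:, 0] / (w - 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((coords_x, coords_y), dim=-1)              # (N, H, W, 2)
    return F.grid_sample(image, grid, align_corners=True)

frame = torch.rand(1, 1, 64, 64)        # luminance frame
flow = torch.zeros(1, 2, 64, 64)        # a real model would predict this per SAI
sai = warp_with_flow(frame, flow)       # one of the 9x9 sub-aperture images
print(sai.shape)
```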

Efficient Osteoporosis Prediction Using A Pair of Ensemble Models

  • Choi, Se-Heon;Hwang, Dong-Hwan;Kim, Do-Hyeon;Bak, So-Hyeon;Kim, Yoon
    • Journal of the Korea Society of Computer and Information / v.26 no.12 / pp.45-52 / 2021
  • In this paper, we propose a prediction model for osteopenia and osteoporosis based on a convolutional neural network (CNN) using computed tomography (CT) images. With a single CT image, a CNN is limited in its ability to exploit the local features that are important for diagnosis, so we propose a compound model consisting of two networks with identical structure. As input, two different texture images converted from a single normalized CT image are used. The two networks are trained to learn different information by means of a dissimilarity loss function. As a result, our model learns various features of a single CT image, including important local features, and we ensemble the two networks to improve the accuracy of predicting osteopenia and osteoporosis. In the experimental results, our method shows an accuracy of 77.11%, and the features learned by the model are visualized and confirmed using Grad-CAM.
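
The pairing of two identical branches trained with a dissimilarity term and then ensembled can be illustrated compactly. The PyTorch sketch below is a simplified stand-in: the branch architecture, the cosine-based dissimilarity term, and the averaging ensemble are assumptions, not the paper's exact design.

```python
# Sketch: two identical CNN branches fed with two texture images derived
# from one CT slice, trained with cross-entropy plus a dissimilarity term,
# then ensembled by averaging logits.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Branch(nn.Module):
    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        f = self.features(x)
        return f, self.head(f)

branch_a, branch_b = Branch(), Branch()
tex_a, tex_b = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)  # two texture inputs
target = torch.randint(0, 3, (4,))                                 # toy labels

feat_a, logit_a = branch_a(tex_a)
feat_b, logit_b = branch_b(tex_b)

ce = F.cross_entropy(logit_a, target) + F.cross_entropy(logit_b, target)
dissimilarity = F.cosine_similarity(feat_a, feat_b, dim=1).mean()  # penalize similar features
loss = ce + 0.1 * dissimilarity

ensemble_logits = (logit_a + logit_b) / 2     # predict by ensembling the pair
print(loss.item(), ensemble_logits.argmax(dim=1))
```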

Related Documents Classification System by Similarity between Documents (문서 유사도를 통한 관련 문서 분류 시스템 연구)

  • Jeong, Jisoo;Jee, Minkyu;Go, Myunghyun;Kim, Hakdong;Lim, Heonyeong;Lee, Yurim;Kim, Wonil
    • Journal of Broadcast Engineering / v.24 no.1 / pp.77-86 / 2019
  • This paper proposes a system that uses machine learning to analyze and classify previously collected documents based on the similarity between them. Data are collected using keywords associated with a specific domain, and non-conceptual elements such as special characters are removed. Each collected document is then processed with a Korean morpheme analyzer, which tags its words as nouns, verbs, and so on. The documents are embedded using a Doc2Vec model that converts documents into vectors. The similarity between documents is measured through the embedded model, and a document classifier is trained using a machine learning algorithm. Among the classification models compared, the support vector machine achieved the highest performance, with an F1-score of 0.83.
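
The embedding-and-classification pipeline maps directly onto standard libraries. The sketch below is a minimal illustration under assumptions: toy English tokens stand in for the morpheme-analyzed Korean nouns and verbs, and gensim's Doc2Vec plus scikit-learn's SVC stand in for the models used in the paper.

```python
# Sketch: embed documents with Doc2Vec, measure their similarity,
# and train an SVM document classifier on the embeddings.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.svm import SVC

corpus = [(["economy", "market", "price"], "economy"),
          (["election", "party", "vote"], "politics"),
          (["stock", "market", "trade"], "economy"),
          (["policy", "party", "law"], "politics")]

tagged = [TaggedDocument(words=toks, tags=[str(i)]) for i, (toks, _) in enumerate(corpus)]
model = Doc2Vec(tagged, vector_size=16, min_count=1, epochs=50)

vectors = [model.infer_vector(toks) for toks, _ in corpus]
labels = [label for _, label in corpus]

print(cosine_similarity([vectors[0]], [vectors[2]]))   # similarity between two documents

clf = SVC(kernel="linear")
clf.fit(vectors, labels)
print(clf.predict([model.infer_vector(["market", "trade"])]))
```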

A Study of Automatic Recognition on Target and Flame Based Gradient Vector Field Using Infrared Image (적외선 영상을 이용한 Gradient Vector Field 기반의 표적 및 화염 자동인식 연구)

  • Kim, Chun-Ho;Lee, Ju-Young
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.49 no.1 / pp.63-73 / 2021
  • This paper presents an algorithm for automatic target recognition that is robust to the influence of flame, so that an EOTS (Electro-Optical Targeting System) mounted on a UAV (Unmanned Aerial Vehicle) can track an aerial or marine target even when the target and a flame appear at the same time. The proposed method converts infrared images of targets and flames into a gradient vector field, applies a polynomial curve fitting technique to the gradient magnitudes to extract polynomial coefficients, and trains a shallow neural network model on these coefficients to automatically recognize targets and flames. The performance of the proposed technique was verified using a database of various infrared images of targets and flames. The algorithm can be applied to collision avoidance, forest fire detection, and automatic detection and recognition of airborne and maritime targets during autonomous flight of unmanned aircraft.
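
The feature pipeline (gradient vector field, polynomial fitting of the gradient magnitudes, shallow neural network) can be outlined briefly. The sketch below uses assumed details: a 1-D mean profile of the gradient magnitude, a degree-4 polynomial, and scikit-learn's MLPClassifier, none of which are taken from the paper.

```python
# Sketch: convert an infrared image patch to a gradient field, fit a polynomial
# to its gradient-magnitude profile, and feed the coefficients to a shallow
# neural network classifier (target vs. flame).
import numpy as np
from sklearn.neural_network import MLPClassifier

def gradient_poly_features(patch: np.ndarray, degree: int = 4) -> np.ndarray:
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    profile = magnitude.mean(axis=0)                 # 1-D profile of gradient magnitude
    x = np.linspace(0.0, 1.0, profile.size)
    return np.polyfit(x, profile, degree)            # polynomial coefficients as features

rng = np.random.default_rng(1)
patches = [rng.random((32, 32)) for _ in range(20)]  # toy IR patches
labels = [i % 2 for i in range(20)]                  # 0 = target, 1 = flame (toy)

X = np.array([gradient_poly_features(p) for p in patches])
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X, labels)
print(clf.predict(X[:4]))
```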

An Anomalous Sequence Detection Method Based on An Extended LSTM Autoencoder (확장된 LSTM 오토인코더 기반 이상 시퀀스 탐지 기법)

  • Lee, Jooyeon;Lee, Ki Yong
    • The Journal of Society for e-Business Studies / v.26 no.1 / pp.127-140 / 2021
  • Recently, sequence data containing time information, such as sensor measurements and purchase histories, have been generated in various applications. Many methods have been proposed for finding sequences that differ significantly from the other sequences in a given set, but most of them consider only the order of elements in the sequences. Therefore, in this paper, we propose a new anomalous sequence detection method that considers both the order of elements and the time interval between elements. The proposed method uses an extended LSTM autoencoder model with an additional layer that converts a sequence into a form that helps the model effectively learn both the order of elements and the time interval between elements. The proposed method learns the features of the given sequences with the extended LSTM autoencoder model and then flags sequences that the model does not reconstruct well as anomalous. Using experiments on synthetic data containing both normal and anomalous sequences, we show that the proposed method achieves an accuracy close to 100%, outperforming a method that uses only a traditional LSTM autoencoder.
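
The extended autoencoder idea, an input layer that fuses each element with its preceding time interval, can be sketched as follows. The layer sizes and the way the interval is injected are assumptions for illustration, not the authors' exact model.

```python
# Sketch: an LSTM autoencoder whose input layer combines each element with the
# time interval to its predecessor; sequences with high reconstruction error
# are flagged as anomalous.
import torch
import torch.nn as nn

class ExtendedLSTMAutoencoder(nn.Module):
    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        # extra layer converting (element, time interval) into a joint representation
        self.embed = nn.Linear(n_features + 1, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x, intervals):
        z = torch.relu(self.embed(torch.cat([x, intervals.unsqueeze(-1)], dim=-1)))
        _, (h, c) = self.encoder(z)
        dec_in = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)   # repeat summary per step
        dec_out, _ = self.decoder(dec_in, (h, c))
        return self.out(dec_out)

model = ExtendedLSTMAutoencoder(n_features=2)
x = torch.rand(8, 10, 2)            # 8 sequences, 10 elements, 2 features
dt = torch.rand(8, 10)              # time interval to the previous element
recon = model(x, dt)
score = ((recon - x) ** 2).mean(dim=(1, 2))   # higher error -> more anomalous
print(score)
```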

Pre-processing Method of Raw Data Based on Ontology for Machine Learning (머신러닝을 위한 온톨로지 기반의 Raw Data 전처리 기법)

  • Hwang, Chi-Gon;Yoon, Chang-Pyo
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.5 / pp.600-608 / 2020
  • Machine learning constructs an objective function from training data and predicts results for newly generated data by evaluating the objective function on test data. In machine learning, input data undergo normalization as a preprocessing step. Numerical data are standardized using the mean and standard deviation of the input data, while nominal (non-numerical) data are converted into one-hot codes. However, this preprocessing alone cannot solve the problem. For this reason, we propose a method in this paper that uses an ontology to normalize the input data. The test data consist of received signal strength indicator (RSSI) values of Wi-Fi devices collected from mobile devices. Because these data include noise and heterogeneity problems, they are resolved through the ontology.
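
The preprocessing described above can be outlined with standard tools, with the ontology abstracted to a lookup table. In the sketch below, the access-point labels, the mapping, and the RSSI values are invented for illustration.

```python
# Sketch: standardize numeric RSSI values, one-hot encode nominal fields,
# and first unify heterogeneous access-point labels via an ontology-style map.
import numpy as np
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# heterogeneous AP labels mapped to one canonical concept (stand-in for the ontology)
AP_ONTOLOGY = {"ap_1f_lobby": "lobby", "AP-LOBBY-01": "lobby", "ap_2f_lab": "lab"}

records = [("ap_1f_lobby", -51.0), ("AP-LOBBY-01", -48.0), ("ap_2f_lab", -73.0)]
aps = [[AP_ONTOLOGY.get(name, name)] for name, _ in records]   # nominal data
rssi = np.array([[value] for _, value in records])             # numeric data

rssi_norm = StandardScaler().fit_transform(rssi)               # mean/std normalization
ap_onehot = OneHotEncoder().fit_transform(aps).toarray()       # one-hot codes

X = np.hstack([rssi_norm, ap_onehot])   # combined feature matrix for learning
print(X)
```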

A New Image Processing Scheme For Face Swapping Using CycleGAN (순환 적대적 생성 신경망을 이용한 안면 교체를 위한 새로운 이미지 처리 기법)

  • Ban, Tae-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.9 / pp.1305-1311 / 2022
  • With the recent rapid development of mobile terminals and personal computers and the advent of neural network technology, real-time face swapping using images has become possible. In particular, the cycle-consistent generative adversarial network (CycleGAN) has made it possible to swap faces using uncorrelated image data. In this paper, we propose an input data processing scheme that can improve the quality of face swapping with less training data and time. The proposed scheme improves image quality while preserving facial structure and expression information by combining facial landmarks extracted through a pre-trained neural network with the major information that affects the structure and expression of the face. Using the blind/referenceless image spatial quality evaluator (BRISQUE) score, one of the AI-based no-reference quality metrics, we quantitatively analyze the performance of the proposed scheme and compare it to conventional schemes. According to the numerical results, the proposed scheme obtained BRISQUE scores improved by about 4.6% to 14.6% compared to the conventional schemes.
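
One plausible reading of the proposed input processing, combining extracted landmarks with the image before it enters the generator, is sketched below. The landmark-to-heatmap rendering, the tiny stand-in generator, and all tensor shapes are assumptions, not the paper's networks.

```python
# Sketch: render facial landmarks as an extra heatmap channel and concatenate
# it with the RGB frame before the generator, so facial structure is preserved.
import torch
import torch.nn as nn

def landmarks_to_heatmap(landmarks: torch.Tensor, size: int = 128) -> torch.Tensor:
    """landmarks: (N, K, 2) normalized to [0, 1]; returns (N, 1, size, size)."""
    n, k, _ = landmarks.shape
    heatmap = torch.zeros(n, 1, size, size)
    idx = (landmarks.clamp(0, 1) * (size - 1)).long()
    for i in range(n):
        heatmap[i, 0, idx[i, :, 1], idx[i, :, 0]] = 1.0   # mark landmark pixels
    return heatmap

generator = nn.Sequential(               # stand-in for a CycleGAN generator
    nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

frame = torch.rand(1, 3, 128, 128)                        # input face frame
landmarks = torch.rand(1, 68, 2)                          # from a pre-trained detector
inp = torch.cat([frame, landmarks_to_heatmap(landmarks)], dim=1)  # 3 + 1 channels
swapped = generator(inp)
print(swapped.shape)
```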

Study on the Improvement of Lung CT Image Quality using 2D Deep Learning Network according to Various Noise Types (폐 CT 영상에서 다양한 노이즈 타입에 따른 딥러닝 네트워크를 이용한 영상의 질 향상에 관한 연구)

  • Min-Gwan Lee;Chanrok Park
    • Journal of the Korean Society of Radiology / v.18 no.2 / pp.93-99 / 2024
  • In digital medical imaging, especially computed tomography (CT), the noise distribution introduced when X-ray photons are converted into digital imaging signals must be taken into account. Recently, denoising techniques based on deep learning architectures have been increasingly used in the medical imaging field. Here, we evaluated the noise reduction effect of a U-net deep learning model on lung CT images for various noise types. The input data for deep learning were generated by applying Gaussian noise, Poisson noise, salt-and-pepper noise, and speckle noise to the ground truth (GT) images. In particular, two types of Gaussian noise input data were applied, with standard deviation values of 30 and 50. The hyperparameters were the Adam optimizer, 100 epochs, and a learning rate of 0.0001. For quantitative analysis, the mean square error (MSE), the peak signal-to-noise ratio (PSNR), and the coefficient of variation (COV) were calculated. According to the results, the U-net model was effective for noise reduction under all of the conditions set in this study, and it showed the best performance for Gaussian noise.
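
The data-preparation and evaluation side of such an experiment can be sketched without the U-net itself. The following illustration synthesizes the four noise types from a toy ground-truth image and computes MSE and PSNR, two of the metrics used above; the noise parameters and image are assumptions.

```python
# Sketch: add the four noise types to a ground-truth image and compute MSE/PSNR.
import numpy as np

def add_noise(gt: np.ndarray, kind: str, sigma: float = 30.0) -> np.ndarray:
    rng = np.random.default_rng(0)
    if kind == "gaussian":
        noisy = gt + rng.normal(0, sigma, gt.shape)
    elif kind == "poisson":
        noisy = rng.poisson(np.clip(gt, 0, None)).astype(float)
    elif kind == "salt_pepper":
        noisy = gt.copy()
        mask = rng.random(gt.shape)
        noisy[mask < 0.02] = 0.0
        noisy[mask > 0.98] = 255.0
    elif kind == "speckle":
        noisy = gt * (1.0 + rng.normal(0, sigma / 255.0, gt.shape))
    else:
        raise ValueError(kind)
    return np.clip(noisy, 0, 255)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def psnr(a, b, peak=255.0):
    return float(10 * np.log10(peak ** 2 / mse(a, b)))

gt = np.full((64, 64), 120.0)                      # toy ground-truth slice
for kind in ["gaussian", "poisson", "salt_pepper", "speckle"]:
    noisy = add_noise(gt, kind)
    print(kind, round(mse(gt, noisy), 1), round(psnr(gt, noisy), 2))
```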