• Title/Summary/Keyword: 인코더 (encoder)

Latent Shifting and Compensation for Learned Video Compression (신경망 기반 비디오 압축을 위한 레이턴트 정보의 방향 이동 및 보상)

  • Kim, Yeongwoong;Kim, Donghyun;Jeong, Se Yoon;Choi, Jin Soo;Kim, Hui Yong
    • Journal of Broadcast Engineering / v.27 no.1 / pp.31-43 / 2022
  • Traditional video compression has so far been built on hybrid methods that combine motion prediction, residual coding, and quantization. With the rapid progress of artificial neural networks in recent years, research on neural network-based image and video compression is also advancing quickly and is becoming competitive with traditional video compression codecs. This paper presents a new method for improving the performance of such neural network-based video compression models. We adopt the rate-distortion optimization approach of existing learned video compression models, which uses an auto-encoder and an entropy model. When the compressed latent representation is transmitted from the encoder side to the decoder side, the method shifts some components of the latent information that are difficult for the entropy model to estimate, and then compensates for the distortion caused by the lost information. Applied to the existing neural network-based video compression framework MFVC (Motion Free Video Compression), the method achieves a BDBR (Bjøntegaard Delta-Rate) of -27% relative to H.264, nearly twice the bit saving of the original MFVC (-14%). The proposed method is widely applicable beyond MFVC to neural network-based image and video compression models that use latent information and an entropy model.
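
The rate-distortion trade-off mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's loss function, so the form below (rate plus lambda times distortion, with MSE as the distortion) is an assumption, and the names `rate_bits`, `mse`, and `rd_cost` are hypothetical.

```python
import math


def rate_bits(symbols, probs):
    """Ideal code length in bits: sum of -log2 p(s) under the entropy model."""
    return sum(-math.log2(probs[s]) for s in symbols)


def mse(original, reconstructed):
    """Mean squared error, used here as the distortion term (an assumption)."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def rd_cost(symbols, probs, original, reconstructed, lam):
    """Rate-distortion cost R + lam * D; lam sets the trade-off."""
    return rate_bits(symbols, probs) + lam * mse(original, reconstructed)


# Toy entropy model over quantized latent symbols (illustrative values only).
probs = {0: 0.5, 1: 0.25, -1: 0.25}
latent = [0, 0, 1, -1]
original = [1.0, 2.0, 3.0, 4.0]
reconstructed = [1.0, 2.0, 3.0, 3.0]

cost = rd_cost(latent, probs, original, reconstructed, lam=4.0)
print(cost)
```

Symbols to which the entropy model assigns low probability cost more bits, which is why components that are hard to estimate are expensive to transmit.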

Toward understanding learning patterns in an open online learning platform using process mining (프로세스 마이닝을 활용한 온라인 교육 오픈 플랫폼 내 학습 패턴 분석 방법 개발)

  • Taeyoung Kim;Hyomin Kim;Minsu Cho
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.285-301 / 2023
  • With the growing demand for and importance of non-face-to-face education, open online learning platforms are attracting interest both domestically and internationally. These platforms differ from the online courses offered by universities and other educational institutions. In particular, students on these platforms have more learner autonomy, so tools that assist learning need to be developed. Researchers have long attempted to use process mining to understand realistic study behaviors and derive learning patterns. However, its application to open online learning platforms remains limited. Moreover, existing research has focused primarily on the process model perspective, including process model discovery, and lacks methods for the process pattern and instance perspectives. In this study, we propose a method for identifying learning patterns within an open online learning platform using process mining techniques. To this end, we suggest three viewpoints for understanding the learning patterns, namely model-level, variant-level, and instance-level, and employ various techniques such as process discovery, conformance checking, autoencoder-based clustering, and predictive approaches. To validate the method, we collected a learning log of machine learning-related courses on a domestic open education platform. The results revealed a spaghetti-like process model that can be differentiated into one standard learning pattern and three abnormal patterns. Furthermore, the derived pattern classification model achieved a high accuracy of 0.86 when predicting the pattern of an instance from the initial 30% of its entire flow. This study contributes to the systematic analysis of learners' patterns using process mining.
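
To make the variant-level and instance-level viewpoints concrete, the sketch below groups toy event traces into variants (distinct activity sequences) and takes the initial 30% of a trace as the input a classifier would see. The activity names, the helper names `variants` and `prefix`, and the rounding rule (floor, with at least one event kept) are assumptions; the abstract does not specify them.

```python
from collections import Counter


def variants(traces):
    """Count distinct activity sequences (variants) across cases."""
    return Counter(tuple(trace) for trace in traces)


def prefix(trace, percent=30):
    """Return the initial `percent` of a trace, keeping at least one event.

    Integer arithmetic with floor rounding is an assumption of this sketch.
    """
    k = max(1, len(trace) * percent // 100)
    return trace[:k]


# Hypothetical learning log: one list of activities per learner session.
log = [
    ["enroll", "video", "quiz"],
    ["enroll", "video", "quiz"],
    ["enroll", "quiz"],
]

counts = variants(log)
print(counts.most_common(1))

long_trace = ["enroll", "video", "video", "quiz", "video",
              "quiz", "video", "quiz", "exam", "certificate"]
print(prefix(long_trace))
```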

Drape Simulation Estimation for Non-Linear Stiffness Model (비선형 강성 모델을 위한 드레이프 시뮬레이션 결과 추정)

  • Eungjune Shim;Eunjung Ju;Myung Geol Choi
    • Journal of the Korea Computer Graphics Society / v.29 no.3 / pp.117-125 / 2023
  • In the development of clothing design through virtual simulation, it is essential to minimize the differences between the virtual and the real world as much as possible. The most critical task in making virtual garments resemble real ones is to find simulation parameters that closely emulate the physical properties of the actual fabric in use. The simulation parameter optimization process requires manual tuning by experts, demanding high expertise and a significant amount of time. In particular, considerable time is spent repeatedly running simulations to check the results of applying the tuned parameters. To tackle this issue, artificial neural network models have recently been proposed that quickly estimate the results of drape test simulations, which are predominantly used for parameter tuning. These earlier studies used relatively simple linear stiffness models and, instead of estimating the entire drape mesh, estimated only a portion of the mesh and interpolated the rest. However, there is still little research on non-linear stiffness models, which are commonly used in actual garment design. In this paper, we propose a learning model for estimating the results of drape simulations for non-linear stiffness models. Our learning model estimates the full high-resolution drape mesh. To validate the performance of the proposed method, experiments were conducted using three different drape test methods, demonstrating high estimation accuracy.

A study on the aspect-based sentiment analysis of multilingual customer reviews (다국어 사용자 후기에 대한 속성기반 감성분석 연구)

  • Sungyoung Ji;Siyoon Lee;Daewoo Choi;Kee-Hoon Kang
    • The Korean Journal of Applied Statistics / v.36 no.6 / pp.515-528 / 2023
  • With the growth of the e-commerce market, consumers increasingly rely on user reviews to make purchasing decisions. Consequently, researchers are actively studying how to analyze these reviews effectively. Among the various methods of sentiment analysis, aspect-based sentiment analysis, which examines user reviews from multiple angles rather than relying solely on simple positive or negative sentiment, is gaining widespread attention. One of its methodologies uses transformer-based models, the latest natural language processing technology. In this paper, we conduct aspect-based sentiment analysis on multilingual user reviews, applying these models to two real datasets. Specifically, we use restaurant data from the SemEval 2016 public dataset and multilingual user review data from the cosmetic domain. We compare the performance of transformer-based models for aspect-based sentiment analysis and apply various methodologies to improve their performance. Models trained on multilingual data are expected to be highly useful because they can analyze multiple languages with one model, without building a separate model for each language.

Comparative Analysis of Self-supervised Deephashing Models for Efficient Image Retrieval System (효율적인 이미지 검색 시스템을 위한 자기 감독 딥해싱 모델의 비교 분석)

  • Kim Soo In;Jeon Young Jin;Lee Sang Bum;Kim Won Gyum
    • KIPS Transactions on Software and Data Engineering / v.12 no.12 / pp.519-524 / 2023
  • In hashing-based image retrieval, the hash code of a manipulated image differs from that of the original image, making it difficult to search for the same image. This paper proposes and evaluates a self-supervised deephashing model that generates perceptual hash codes from image feature information such as texture, shape, and color. The comparison models are autoencoder-based variational inference models whose encoders are built from fully connected layers, convolutional neural networks, and transformer modules, respectively. The proposed model is a variational inference model that includes a SimAM module, which extracts geometric patterns and positional relationships within images. The SimAM module can learn latent vectors that highlight objects or local regions through an energy function based on the activation values of a neuron and its surrounding neurons. The proposed method is a representation learning model that generates low-dimensional latent vectors from high-dimensional input images, and the latent vectors are binarized into distinguishable hash codes. In experiments on public datasets such as CIFAR-10, ImageNet, and NUS-WIDE, the proposed model outperforms the comparison models and is found to perform on par with supervised learning-based deephashing models. The proposed model can be used in applications that require low-dimensional representations of images, such as image search or copyright image determination.
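
The binarization step described in this abstract can be sketched as follows. The abstract does not state how latent vectors are thresholded or how codes are compared, so thresholding at zero and ranking by Hamming distance are assumptions, and `binarize` and `hamming` are hypothetical helper names.

```python
def binarize(latent, threshold=0.0):
    """Turn a real-valued latent vector into a binary hash code.

    Thresholding at 0.0 is an assumption of this sketch.
    """
    return [1 if value > threshold else 0 for value in latent]


def hamming(code_a, code_b):
    """Number of bit positions in which two equal-length hash codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("hash codes must have the same length")
    return sum(a != b for a, b in zip(code_a, code_b))


# Illustrative latent vectors for an original image and a manipulated copy.
original_code = binarize([0.7, -0.2, 0.1, -0.9])
manipulated_code = binarize([0.6, -0.3, -0.05, -0.8])

print(original_code, manipulated_code, hamming(original_code, manipulated_code))
```

A small Hamming distance between the two codes is what lets a retrieval system treat a manipulated image as a match for its original.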

Comparative analysis of wavelet transform and machine learning approaches for noise reduction in water level data (웨이블릿 변환과 기계 학습 접근법을 이용한 수위 데이터의 노이즈 제거 비교 분석)

  • Hwang, Yukwan;Lim, Kyoung Jae;Kim, Jonggun;Shin, Minhwan;Park, Youn Shik;Shin, Yongchul;Ji, Bongjun
    • Journal of Korea Water Resources Association / v.57 no.3 / pp.209-223 / 2024
  • In the context of the fourth industrial revolution, data-driven decision-making has become increasingly pivotal. However, the integrity of data analysis is compromised if data quality is not adequately ensured, potentially leading to biased interpretations. This is particularly critical for water level data, which is essential for water resource management and often suffers from quality issues such as missing values, spikes, and noise. This study addresses the challenge of noise-induced data quality deterioration, which complicates trend analysis and may produce anomalous outliers. To mitigate this issue, we propose a noise removal strategy employing the Wavelet Transform, a technique renowned for its efficacy in signal processing and noise elimination. The advantage of the Wavelet Transform lies in its operational efficiency: it reduces both time and cost because it obviates the need to acquire the true values of the collected data. This study conducted a comparative performance evaluation between our Wavelet Transform-based approach and the Denoising Autoencoder, a prominent machine learning method for noise reduction. The findings demonstrate that the Coiflets wavelet function outperforms the Denoising Autoencoder across various metrics, including Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Mean Squared Error (MSE). The superiority of the Coiflets function suggests that selecting an appropriate wavelet function tailored to the specific application environment can effectively address data quality issues caused by noise. This study underscores the potential of the Wavelet Transform as a robust tool for enhancing the quality of water level data, thereby contributing to the reliability of water resource management decisions.
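
The three evaluation metrics named in this abstract follow standard definitions, sketched below with illustrative numbers that are not taken from the paper. MAPE is expressed in percent here and assumes that no reference value is zero.

```python
def mae(truth, pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def mape(truth, pred):
    """Mean Absolute Percentage Error in percent; assumes no zero in `truth`."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(truth, pred)) / len(truth)


def mse(truth, pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)


# Illustrative reference and denoised water levels (made-up values).
reference = [2.0, 4.0]
denoised = [1.0, 5.0]

print(mae(reference, denoised), mape(reference, denoised), mse(reference, denoised))
```

Lower values on all three metrics indicate a denoised series closer to the reference.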

The Applicability of Conditional Generative Model Generating Groundwater Level Fluctuation Corresponding to Precipitation Pattern (조건부 생성모델을 이용한 강수 패턴에 따른 지하수위 생성 및 이의 활용에 관한 연구)

  • Jeong, Jiho;Jeong, Jina;Lee, Byung Sun;Song, Sung-Ho
    • Economic and Environmental Geology / v.54 no.1 / pp.77-89 / 2021
  • This study proposes a method for improving the performance of the hydraulic property estimation model developed by Jeong et al. (2020). In their study, low-dimensional features of annual groundwater level (GWL) fluctuation patterns, extracted with a denoising autoencoder (DAE), were used to develop a regression model for predicting the hydraulic properties of an aquifer. However, the low-dimensional features of the DAE depend strongly on the precipitation pattern even when the GWL is monitored at the same location, which causes uncertainty in the hydraulic property estimates of the regression model. To solve this problem, a process for generating the GWL fluctuation pattern conditioned on the precipitation is proposed based on a conditional variational autoencoder (CVAE). The CVAE learns the statistical relationship between GWL fluctuation and precipitation pattern. Actual GWL and precipitation data monitored at a total of 71 monitoring stations over 10 years in South Korea were used to validate the effect of the CVAE. As a result, the trained CVAE model reasonably generated GWL fluctuation patterns conditioned on various precipitation patterns for all monitoring locations. Based on the trained CVAE model, low-dimensional features of the GWL fluctuation pattern free from the interference of differing precipitation patterns were extracted for all monitoring stations and compared with the features extracted by the DAE. The comparison confirms that the statistical consistency of the features extracted using the CVAE is improved over the DAE. We therefore conclude that the proposed method may be useful for extracting more accurate features of the GWL fluctuation pattern that are affected solely by the hydraulic characteristics of the aquifer, which should in turn improve the performance of the previously developed regression model.

Nonlinear Vector Alignment Methodology for Mapping Domain-Specific Terminology into General Space (전문어의 범용 공간 매핑을 위한 비선형 벡터 정렬 방법론)

  • Kim, Junwoo;Yoon, Byungho;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.127-146 / 2022
  • As word embedding has recently shown excellent performance in various deep learning-based natural language processing tasks, research on the advancement and application of word, sentence, and document embedding is being actively conducted. Within this research, cross-language transfer, which enables semantic exchange between different languages, is growing alongside the development of embedding models. Academic interest in vector alignment is also growing, with the expectation that it can be applied to various embedding-based analyses. In particular, vector alignment is expected to be applied to mapping between specialized and generalized domains. In other words, it should become possible to map the vocabulary of specialized fields such as R&D, medicine, and law into the space of a pre-trained language model learned from a huge volume of general-purpose documents, or to provide clues for mapping vocabulary between different specialized fields. However, linear vector alignment, which has been the main focus of academic study, assumes statistical linearity and therefore tends to simplify the vector space. It essentially assumes that different vector spaces are geometrically similar, which causes inevitable distortion in the alignment process. To overcome this limitation, we propose a deep learning-based vector alignment methodology that effectively learns the nonlinearity of the data. The proposed methodology sequentially trains a skip-connected autoencoder and a regression model to align the specialized word embedding expressed in each space to the general embedding space. Finally, through inference with the two trained models, the specialized vocabulary can be aligned in the general space. To verify the performance of the proposed methodology, an experiment was performed on a total of 77,578 documents in the field of 'health care' among national R&D tasks performed from 2011 to 2020. The results confirmed that the proposed methodology outperforms the existing linear vector alignment in terms of cosine similarity.
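
Cosine similarity, the evaluation measure named in this abstract, compares the direction of two embedding vectors. The sketch below uses made-up two-dimensional vectors; `cosine_similarity` is a hypothetical helper name, and how the paper aggregates scores over the vocabulary is not stated in the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Illustrative vectors: an aligned specialized term and its general-space target.
aligned = [3.0, 4.0]
target = [3.0, 4.0]
unrelated = [-4.0, 3.0]

print(cosine_similarity(aligned, target))
print(cosine_similarity(aligned, unrelated))
```

A value near 1 means the aligned vector points in almost the same direction as its target in the general space, while a value near 0 means the two are close to orthogonal.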

Real data-based active sonar signal synthesis method (실데이터 기반 능동 소나 신호 합성 방법론)

  • Yunsu Kim;Juho Kim;Jongwon Seok;Jungpyo Hong
    • The Journal of the Acoustical Society of Korea / v.43 no.1 / pp.9-18 / 2024
  • Active sonar systems are becoming more important as underwater targets grow quieter and ambient noise rises with increasing maritime traffic. However, the low signal-to-noise ratio of the echo signal, caused by multipath propagation, various clutter, ambient noise, and reverberation, makes it difficult to identify underwater targets using active sonar. Attempts have been made to apply data-based methods such as machine learning or deep learning to improve the performance of underwater target recognition systems, but it is difficult to collect enough training data because of the nature of sonar datasets. Methods based on mathematical modeling have mainly been used to compensate for insufficient active sonar data, but they are limited in how accurately they can simulate complex underwater phenomena. Therefore, in this paper, we propose a sonar signal synthesis method based on a deep neural network. To apply a neural network model to sonar signal synthesis, the proposed method adapts the attention-based encoder and decoder, the main modules of the Tacotron model widely used in speech synthesis, to sonar signals. Training the proposed model on a dataset collected by placing a simulated target in an actual marine environment makes it possible to synthesize signals that more closely resemble actual signals. To verify the performance of the proposed method, a Perceptual Evaluation of Audio Quality (PEAQ) test was conducted, and the score difference from the actual signal was within -2.3 across a total of four different environments. These results show that the active sonar signal generated by the proposed method approximates the actual signal.

Prediction of multipurpose dam inflow utilizing catchment attributes with LSTM and transformer models (유역정보 기반 Transformer및 LSTM을 활용한 다목적댐 일 단위 유입량 예측)

  • Kim, Hyung Ju;Song, Young Hoon;Chung, Eun Sung
    • Journal of Korea Water Resources Association / v.57 no.7 / pp.437-449 / 2024
  • Rainfall-runoff prediction studies using deep learning while considering catchment attributes have been gaining attention. In this study, we selected two models: the Transformer model, which is suitable for large-scale data training through the self-attention mechanism, and the LSTM-based multi-state-vector sequence-to-sequence (LSTM-MSV-S2S) model with an encoder-decoder structure. These models were constructed to incorporate catchment attributes and predict the inflow of 10 multi-purpose dam watersheds in South Korea. The experimental design consisted of three training methods: Single-basin Training (ST), Pretraining (PT), and Pretraining-Finetuning (PT-FT). The input data for the models included 10 selected watershed attributes along with meteorological data. The inflow prediction performance was compared based on the training methods. The results showed that the Transformer model outperformed the LSTM-MSV-S2S model when using the PT and PT-FT methods, with the PT-FT method yielding the highest performance. The LSTM-MSV-S2S model showed better performance than the Transformer when using the ST method; however, it showed lower performance when using the PT and PT-FT methods. Additionally, the embedding layer activation vectors and raw catchment attributes were used to cluster watersheds and analyze whether the models learned the similarities between them. The Transformer model demonstrated improved performance among watersheds with similar activation vectors, proving that utilizing information from other pre-trained watersheds enhances the prediction performance. This study compared the suitable models and training methods for each multi-purpose dam and highlighted the necessity of constructing deep learning models using PT and PT-FT methods for domestic watersheds. Furthermore, the results confirmed that the Transformer model outperforms the LSTM-MSV-S2S model when applying PT and PT-FT methods.