• Title/Summary/Keyword: Encoder Model


WiFi CSI Data Preprocessing and Augmentation Techniques in Indoor People Counting using Deep Learning (딥러닝을 활용한 실내 사람 수 추정을 위한 WiFi CSI 데이터 전처리와 증강 기법)

  • Kim, Yeon-Ju;Kim, Seungku
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.12 / pp.1890-1897 / 2021
  • People counting is an important technology for providing application services such as smart homes, smart buildings, and smart cars. Due to social distancing during COVID-19, people counting technology has attracted public attention. People counting systems can be implemented in various ways, such as with cameras, sensors, or wireless signals, according to service requirements. A people counting system using a WiFi AP exploits WiFi CSI data, which reflect multipath information. This technology is an effective low-cost solution for indoor environments. Conventional WiFi CSI-based people counting technologies suffer from low accuracy, which prevents high-quality service. This paper proposes a deep learning people counting system based on WiFi CSI data. Data preprocessing using an autoencoder, data augmentation that transforms WiFi CSI data, and a proposed deep learning model improve the accuracy of people counting. In experiments, the proposed approach achieves 89.29% accuracy with six subjects.
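For illustration, a minimal sketch of the autoencoder-based preprocessing idea (the paper's exact architecture, layer sizes, and augmentation pipeline are not specified here, so everything below is an assumption):

```python
# Minimal sketch of autoencoder-based CSI denoising (assumed architecture).
import torch
import torch.nn as nn

N_SUBCARRIERS = 64  # hypothetical number of CSI subcarriers

class CSIAutoencoder(nn.Module):
    def __init__(self, n_in=N_SUBCARRIERS, n_latent=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_in, 32), nn.ReLU(),
            nn.Linear(32, n_latent), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 32), nn.ReLU(),
            nn.Linear(32, n_in),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Train to reconstruct CSI amplitude vectors; the reconstruction (or the
# latent code) then serves as the denoised input to the counting model.
model = CSIAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
csi_batch = torch.randn(8, N_SUBCARRIERS)  # stand-in for real CSI amplitudes
loss = nn.functional.mse_loss(model(csi_batch), csi_batch)
loss.backward()
opt.step()
```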

Personalized Chit-chat Based on Language Models (언어 모델 기반 페르소나 대화 모델)

  • Jang, Yoonna;Oh, Dongsuk;Lim, Jungwoo;Lim, Heuiseok
    • Annual Conference on Human and Language Technology / 2020.10a / pp.491-494 / 2020
  • As language model technology has recently advanced, many studies in natural language processing have achieved strong results. In open-domain dialogue systems, which chat with humans without a fixed topic, it has likewise become possible to generate more natural utterances than before. Advances in language models have also contributed to response selection, helping models choose answers appropriate to the context. However, dialogue models have been shown to produce inconsistent responses, or only generic rather than specific ones. To address this, persona dialogue data and tasks, in which conversation is grounded in a speaker's personalized information, are being studied. In the persona dialogue task, each speaker is given a persona, and the model must select or generate responses consistent with that persona. We therefore discuss a persona dialogue system that selects more appropriate responses by leveraging language models pre-trained on large corpora. The experiments use GPT-2 and DialoGPT, which model language auto-regressively; BERT, which uses an auto-encoding objective; and BART, which combines the two structures. This paper compares these language models on the persona dialogue task, and the results show that BERT achieves the best Hits@1 score.
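For illustration, a minimal sketch of BERT-based response selection, the setting in which Hits@1 is measured (the checkpoint, input format, and scoring head below are assumptions, not the authors' setup):

```python
# Sketch of BERT-based response selection (illustrative only).
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tok = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()  # NOTE: the head must be fine-tuned on persona dialogue data
              # before these scores are meaningful.

persona = "I love hiking. I have two dogs."
context = "What do you do on weekends?"
candidates = ["I usually hike with my dogs.", "I hate going outside."]

# Score each (persona+context, candidate) pair; Hits@1 checks whether the
# gold response receives the highest score among the candidates.
scores = []
for cand in candidates:
    enc = tok(persona + " " + context, cand, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits
    scores.append(logits[0, 1].item())  # logit for the "relevant" class
best = max(range(len(candidates)), key=lambda i: scores[i])
print(candidates[best])
```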


Side-Channel Archive Framework Using Deep Learning-Based Leakage Compression (딥러닝을 이용한 부채널 데이터 압축 프레임 워크)

  • Sangyun Jung;Sunghyun Jin;Heeseok Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.34 no.3 / pp.379-392 / 2024
  • With the rapid increase in data, saving storage space and improving the efficiency of data transmission have become critical issues, making research on efficient data compression technologies increasingly important. Lossless algorithms can restore original data exactly but have limited compression ratios, whereas lossy algorithms provide higher compression rates at the expense of some data loss. Data compression using deep learning, especially autoencoder models, has been an active research area. This study proposes a new compressor for side-channel analysis data based on autoencoders. The compressor achieves higher compression rates than Deflate while preserving the characteristics of side-channel data. The encoder, built from locally connected layers, effectively preserves the temporal characteristics of side-channel traces, and the decoder, a multi-layer perceptron, maintains fast decompression times. Through correlation power analysis, the proposed compressor is shown to compress data without losing the characteristics of side-channel data.
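For illustration, a sketch of the key encoder idea: a locally connected layer applies a separate weight matrix to each trace segment, unlike a convolution, which shares weights across positions (all sizes below are illustrative, not the paper's):

```python
# Locally connected encoder + MLP decoder sketch (assumed dimensions).
import torch
import torch.nn as nn

class LocallyConnected1d(nn.Module):
    def __init__(self, trace_len, patch, out_per_patch):
        super().__init__()
        assert trace_len % patch == 0
        self.n = trace_len // patch
        # One independent weight matrix per patch (no weight sharing).
        self.weight = nn.Parameter(torch.randn(self.n, patch, out_per_patch) * 0.01)
        self.bias = nn.Parameter(torch.zeros(self.n, out_per_patch))

    def forward(self, x):                      # x: (batch, trace_len)
        x = x.view(x.size(0), self.n, -1)      # split into per-patch segments
        # Per-patch linear map: (b, n, p) x (n, p, o) -> (b, n, o)
        y = torch.einsum("bnp,npo->bno", x, self.weight) + self.bias
        return y.flatten(1)

encoder = LocallyConnected1d(trace_len=1000, patch=50, out_per_patch=4)
decoder = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 1000))
trace = torch.randn(8, 1000)                   # stand-in side-channel traces
recon = decoder(encoder(trace))                # compress, then decompress
```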

A method for metadata extraction from a collection of records using Named Entity Recognition in Natural Language Processing (자연어 처리의 개체명 인식을 통한 기록집합체의 메타데이터 추출 방안)

  • Chiho Song
    • Journal of Korean Society of Archives and Records Management / v.24 no.2 / pp.65-88 / 2024
  • This pilot study explores a method of extracting metadata values and descriptions from records using named entity recognition (NER), a technique in natural language processing (NLP), a subfield of artificial intelligence. The study focuses on handwritten records from the Guro Industrial Complex, produced during the 1960s and 1970s, comprising approximately 1,200 pages and 80,000 words. After preprocessing the records, which included digitization, the study employed a publicly available language API based on Google's Bidirectional Encoder Representations from Transformers (BERT) language model to recognize entity names within the text. As a result, 173 person names and 314 organization and institution names were extracted from the Guro Industrial Complex's past records. These extracted entities are expected to serve as direct search terms for accessing the contents of the records. Furthermore, the study identifies challenges that arose when applying the theoretical methodology of NLP to real-world records consisting of semi-structured text, and presents potential solutions and implications to consider when addressing them.
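For illustration, a minimal sketch of BERT-based NER using the Hugging Face `pipeline` API (the checkpoint and example sentence below are stand-ins, not what the study used; the actual Korean records would require a Korean-capable model):

```python
# Sketch of BERT-based NER for metadata extraction (illustrative stand-ins).
from transformers import pipeline

# Multilingual BERT NER checkpoint used here only as an example.
ner = pipeline("ner", model="Davlan/bert-base-multilingual-cased-ner-hrl",
               aggregation_strategy="simple")

text = "In 1972, the complex signed an agreement with the trade corporation."
for ent in ner(text):
    # entity_group is e.g. PER (person) or ORG (organization); such spans
    # become candidate metadata values and search terms for the records.
    print(ent["entity_group"], ent["word"], round(ent["score"], 3))
```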

Updated Primer on Generative Artificial Intelligence and Large Language Models in Medical Imaging for Medical Professionals

  • Kiduk Kim;Kyungjin Cho;Ryoungwoo Jang;Sunggu Kyung;Soyoung Lee;Sungwon Ham;Edward Choi;Gil-Sun Hong;Namkug Kim
    • Korean Journal of Radiology / v.25 no.3 / pp.224-242 / 2024
  • The emergence of Chat Generative Pre-trained Transformer (ChatGPT), a chatbot developed by OpenAI, has garnered interest in the application of generative artificial intelligence (AI) models in the medical field. This review summarizes different generative AI models and their potential applications in medicine and explores the evolving landscape of generative adversarial networks and diffusion models since the introduction of generative AI. These models have made valuable contributions to the field of radiology. The review also explores the significance of synthetic data in addressing privacy concerns and augmenting data diversity and quality within the medical domain, and emphasizes the role of inversion in the investigation of generative models, outlining an approach to replicate this process. We provide an overview of large language models, such as GPT and bidirectional encoder representations from transformers (BERT), focusing on prominent representatives, and discuss recent initiatives involving language-vision models in radiology, including the large language and vision assistant for biomedicine (LLaVA-Med), to illustrate their practical application. This comprehensive review offers insights into the wide-ranging applications of generative AI models in clinical research and emphasizes their transformative potential.

Density map estimation based on deep-learning for pest control drone optimization (드론 방제의 최적화를 위한 딥러닝 기반의 밀도맵 추정)

  • Baek-gyeom Seong;Xiongzhe Han;Seung-hwa Yu;Chun-gu Lee;Yeongho Kang;Hyun Ho Woo;Hunsuk Lee;Dae-Hyun Lee
    • Journal of Drive and Control / v.21 no.2 / pp.53-64 / 2024
  • Global population growth has increased the demand for food production. Simultaneously, aging rural communities have led to a decrease in the workforce, increasing the demand for automation in agriculture. Drones are particularly useful for unmanned pest control in the field. However, the current method of uniform spraying leads to environmental damage due to pesticide overuse and wind drift. To address this issue, spraying performance must be enhanced through precise performance evaluation. Therefore, as a foundational study aimed at optimizing drone-based pest control, this research evaluated water-sensitive paper (WSP) via density map estimation using a convolutional neural network (CNN) with an encoder-decoder structure. To achieve more accurate estimation, the study implemented multi-task learning, incorporating an additional classifier for image segmentation alongside the density map estimation head. The proposed model achieved an R-squared (R²) of 0.976 for coverage area on the evaluation dataset, demonstrating satisfactory performance in evaluating WSP at various density levels. Further research is needed to improve the accuracy of spray result estimation and to develop real-time assessment technology in the field.
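For illustration, a minimal sketch of the multi-task idea: one shared encoder-decoder trunk with a density-map head and a segmentation head (layer sizes below are assumptions, not the paper's architecture):

```python
# Multi-task encoder-decoder sketch: density regression + WSP segmentation.
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 16, 2, stride=2), nn.ReLU(),
        )
        self.density_head = nn.Conv2d(16, 1, 1)   # per-pixel droplet density
        self.seg_head = nn.Conv2d(16, 1, 1)       # per-pixel WSP mask logit

    def forward(self, x):
        f = self.decoder(self.encoder(x))
        return self.density_head(f), self.seg_head(f)

net = MultiTaskNet()
img = torch.randn(2, 3, 128, 128)                 # stand-in WSP images
density, seg = net(img)
# Joint loss: MSE on the density map plus BCE on the segmentation mask.
loss = nn.functional.mse_loss(density, torch.rand_like(density)) \
     + nn.functional.binary_cross_entropy_with_logits(seg, torch.rand_like(seg).round())
```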

Fast Coding Unit Decision Algorithm Based on Region of Interest by Motion Vector in HEVC (움직임 벡터에 의한 관심영역 기반의 HEVC 고속 부호화 유닛 결정 방법)

  • Hwang, In Seo;Sunwoo, Myung Hoon
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.11 / pp.41-47 / 2016
  • High Efficiency Video Coding (HEVC) employs a coding tree unit (CTU) to improve coding efficiency. A CTU consists of coding units (CUs), prediction units (PUs), and transform units (TUs). All possible block partitions must be evaluated at each depth level to obtain the best combination of CUs, PUs, and TUs. To reduce the complexity of the block partitioning process, this paper proposes a PU mode skip algorithm with region-of-interest (RoI) selection using motion vectors. In addition, it presents a CU depth level skip algorithm that uses co-located block information from previously encoded frames. First, the RoI selection algorithm distinguishes dynamic CTUs from static CTUs, and asymmetric motion partitioning (AMP) blocks are skipped in the static CTUs. Second, the depth level skip algorithm predicts the most probable target depth level from the average depth in a CTU. Experimental results show that the proposed fast CU decision algorithm reduces the total encoding time by up to 44.8% compared to the HEVC test model (HM) 14.0 reference software encoder, with only a 2.5% Bjontegaard delta bit rate (BDBR) loss.
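For illustration, a sketch of the two decision rules in Python (the real algorithm lives inside the HM encoder in C++; the thresholds and names below are illustrative, not from the paper):

```python
# Sketch of RoI-based PU mode skipping and depth-level prediction.

def is_static_ctu(motion_vectors, thresh=1.0):
    """Classify a CTU as static when the mean motion-vector magnitude is small."""
    mags = [(mv[0] ** 2 + mv[1] ** 2) ** 0.5 for mv in motion_vectors]
    return sum(mags) / len(mags) < thresh

def candidate_pu_modes(ctu_mvs):
    """Return the PU modes to test: AMP modes are skipped in static CTUs."""
    modes = ["2Nx2N", "2NxN", "Nx2N"]
    if not is_static_ctu(ctu_mvs):           # dynamic CTU: keep AMP modes
        modes += ["2NxnU", "2NxnD", "nLx2N", "nRx2N"]
    return modes

def target_depth(colocated_depths):
    """Predict the most probable depth from the co-located CTU's average depth;
    depth levels far from this prediction can be skipped."""
    return round(sum(colocated_depths) / len(colocated_depths))
```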

Deep Learning Algorithm and Prediction Model Associated with Data Transmission of User-Participating Wearable Devices (사용자 참여형 웨어러블 디바이스 데이터 전송 연계 및 딥러닝 대사증후군 예측 모델)

  • Lee, Hyunsik;Lee, Woongjae;Jeong, Taikyeong
    • Journal of Korea Society of Industrial Information Systems / v.25 no.6 / pp.33-45 / 2020
  • This paper examines how cutting-edge technologies can predict individual diseases in real medical environments, at a time when diverse wearable devices are rapidly proliferating in the healthcare domain. By collecting, processing, and transmitting merged clinical, genetic, and lifelog data through a user-participating wearable device, it presents a process for connecting a learning model and a feedback model in a deep neural network environment. In a real-world setting that has undergone medical IT clinical trial procedures, the effect on the disease of a specific gene associated with metabolic syndrome is measured, and clinical information and lifelog data are merged to process heterogeneous data. That is, the paper demonstrates the objective suitability and reliability of a deep neural network for heterogeneous data, and on this basis evaluates performance with respect to noise in a real deep learning environment. For the autoencoder, we show that the accuracy and predicted values, measured every 1,000 epochs, change linearly several times as the value of the variable increases.
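For illustration, a minimal sketch of a deep neural network over merged heterogeneous features (the feature counts and layer widths below are assumptions, not the paper's configuration):

```python
# DNN over merged clinical + genetic + lifelog features (assumed dimensions).
import torch
import torch.nn as nn

clinical = torch.randn(16, 10)   # e.g., labs and vitals
genetic = torch.randn(16, 5)     # e.g., gene/SNP indicators
lifelog = torch.randn(16, 20)    # e.g., wearable activity features
x = torch.cat([clinical, genetic, lifelog], dim=1)  # merged input vector

model = nn.Sequential(
    nn.Linear(35, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),            # metabolic-syndrome risk logit
)
risk = torch.sigmoid(model(x))   # per-subject predicted probability
```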

Real-time Segmentation of Black Ice Region in Infrared Road Images

  • Li, Yu-Jie;Kang, Sun-Kyoung;Jung, Sung-Tae
    • Journal of the Korea Society of Computer and Information / v.27 no.2 / pp.33-42 / 2022
  • In this paper, we propose a deep learning model based on multi-scale dilated convolution feature fusion for segmenting black ice regions in road images, so that black ice warnings can be sent to drivers in real time. In the proposed network, convolutions with different dilation ratios are connected in parallel in the encoder blocks, different dilation ratios are used for feature maps of different resolutions, and multi-layer feature information is fused together. Multi-scale dilated convolution feature fusion improves performance by diversifying and expanding the network's receptive field, preserving detailed spatial information, and enhancing the effectiveness of dilated convolutions. The performance of the proposed model improved gradually as the number of dilated convolution branches increased. The mIoU of the proposed method is 96.46%, higher than that of existing networks such as U-Net, FCN, PSPNet, ENet, and LinkNet, and the model has 1,858K parameters, six times fewer than the existing LinkNet model. In experiments on a Jetson Nano, the proposed method ran at 3.63 FPS, enabling real-time segmentation of black ice regions.
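For illustration, a minimal sketch of a multi-scale dilated-convolution fusion block: parallel branches with different dilation rates are concatenated and fused by a 1x1 convolution (channel counts and rates below are illustrative, not the paper's exact values):

```python
# Multi-scale dilated convolution fusion block (assumed sizes).
import torch
import torch.nn as nn

class DilatedFusionBlock(nn.Module):
    def __init__(self, c_in, c_out, rates=(1, 2, 4)):
        super().__init__()
        # padding = dilation keeps the spatial size constant for 3x3 kernels.
        self.branches = nn.ModuleList([
            nn.Conv2d(c_in, c_out, 3, padding=r, dilation=r) for r in rates
        ])
        self.fuse = nn.Conv2d(c_out * len(rates), c_out, 1)  # 1x1 fusion conv
        self.act = nn.ReLU()

    def forward(self, x):
        # Each branch sees a different receptive field; fusing them expands
        # context while preserving detailed spatial information.
        feats = [self.act(b(x)) for b in self.branches]
        return self.act(self.fuse(torch.cat(feats, dim=1)))

block = DilatedFusionBlock(3, 16)
out = block(torch.randn(1, 3, 240, 320))   # same spatial size, 16 channels
```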

Prediction of Music Generation on Time Series Using Bi-LSTM Model (Bi-LSTM 모델을 이용한 음악 생성 시계열 예측)

  • Kwangjin, Kim;Chilwoo, Lee
    • Smart Media Journal / v.11 no.10 / pp.65-75 / 2022
  • Deep learning is used as a creative tool that can overcome the limitations of existing analysis models and generate various types of output, such as text, images, and music. In this paper, we propose a method for preprocessing MIDI data, using Niko's MIDI Pack sound-source files as a dataset, and for generating music with a Bi-LSTM. Based on the generated root note, multiple hidden layers create new notes suited to the composition, and an attention mechanism applied to the decoder output weights the factors that influence the data coming from the encoder. Settings such as the loss function and optimization method are applied as parameters for improving the LSTM model. The proposed model is a multi-channel Bi-LSTM with attention that uses note pitches separated into treble and bass clefs, note lengths, rests, rest lengths, and chords to improve the efficiency and prediction of the MIDI deep learning process. The trained model generates sound that follows the development of musical scales, distinct from noise, and we aim to contribute to generating harmonically stable music.
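For illustration, a minimal single-channel sketch of a Bi-LSTM with attention predicting the next note (the vocabulary size and dimensions are assumptions; the paper's model is multi-channel and also encodes durations, rests, and chords):

```python
# Bi-LSTM with additive attention for next-note prediction (assumed sizes).
import torch
import torch.nn as nn

class BiLSTMAttention(nn.Module):
    def __init__(self, vocab=128, emb=64, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hid, 1)     # scores each time step
        self.out = nn.Linear(2 * hid, vocab)  # next-note distribution

    def forward(self, notes):                 # notes: (batch, seq) note indices
        h, _ = self.lstm(self.embed(notes))   # (batch, seq, 2*hid)
        w = torch.softmax(self.attn(h), dim=1)
        ctx = (w * h).sum(dim=1)              # attention-weighted context
        return self.out(ctx)                  # logits for the next note

model = BiLSTMAttention()
seq = torch.randint(0, 128, (4, 32))          # stand-in MIDI note sequences
next_note_logits = model(seq)                 # (4, 128)
```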