• Title/Summary/Keyword: 시퀀스 데이터

Search Result 410, Processing Time 0.029 seconds

Knowledge Distillation based-on Internal/External Correlation Learning

  • Hun-Beom Bak;Seung-Hwan Bae
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.4
    • /
    • pp.31-39
    • /
    • 2023
  • In this paper, we propose an Internal/External Knowledge Distillation (IEKD), which utilizes both external correlations between feature maps of heterogeneous models and internal correlations between feature maps of the same model for transferring knowledge from a teacher model to a student model. To achieve this, we transform feature maps into a sequence format and extract new feature maps suitable for knowledge distillation by considering internal and external correlations through a transformer. We can learn both internal and external correlations by distilling the extracted feature maps and improve the accuracy of the student model by utilizing the extracted feature maps with feature matching. To demonstrate the effectiveness of our proposed knowledge distillation method, we achieved 76.23% Top-1 image classification accuracy on the CIFAR-100 dataset with the "ResNet-32×4/VGG-8" teacher and student combination and outperformed the state-of-the-art KD methods.

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.163-177
    • /
    • 2019
  • As smartphones are getting widely used, human activity recognition (HAR) tasks for recognizing personal activities of smartphone users with multimodal data have been actively studied recently. The research area is expanding from the recognition of the simple body movement of an individual user to the recognition of low-level behavior and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have gotten less attention so far. And previous research for recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. Whereas physical sensors including accelerometer, magnetic field and gyroscope sensors are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a method for detecting accompanying status based on deep learning model by only using multimodal physical sensor data, such as an accelerometer, magnetic field and gyroscope, was proposed. The accompanying status was defined as a redefinition of a part of the user interaction behavior, including whether the user is accompanying with an acquaintance at a close distance and the user is actively communicating with the acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation was proposed. First, a data preprocessing method which consists of time synchronization of multimodal data from different physical sensors, data normalization and sequence data generation was introduced. We applied the nearest interpolation to synchronize the time of collected data from different sensors. Normalization was performed for each x, y, z axis value of the sensor data, and the sequence data was generated according to the sliding window method. Then, the sequence data became the input for CNN, where feature maps representing local dependencies of the original sequence are extracted. The CNN consisted of 3 convolutional layers and did not have a pooling layer to maintain the temporal information of the sequence data. Next, LSTM recurrent networks received the feature maps, learned long-term dependencies from them and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were used for classification by softmax classifier. The loss function of the model was cross entropy function and the weights of the model were randomly initialized on a normal distribution with an average of 0 and a standard deviation of 0.1. The model was trained using adaptive moment estimation (ADAM) optimization algorithm and the mini batch size was set to 128. We applied dropout to input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decreased exponentially by 0.99 at the end of each epoch training. An Android smartphone application was developed and released to collect data. We collected smartphone data for a total of 18 subjects. Using the data, the model classified accompanying and conversation by 98.74% and 98.83% accuracy each. Both the F1 score and accuracy of the model were higher than the F1 score and accuracy of the majority vote classifier, support vector machine, and deep recurrent neural network. In the future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize the time stamp differences. In addition, we will further study transfer learning method that enables transfer of trained models tailored to the training data to the evaluation data that follows a different distribution. It is expected that a model capable of exhibiting robust recognition performance against changes in data that is not considered in the model learning stage will be obtained.

A Benchmark of Micro Parallel Computing Technology for Real-time Control in Smart Farm (MPICH vs OpenMP) (제목을스마트 시설환경 실시간 제어를 위한 마이크로 병렬 컴퓨팅 기술 분석)

  • Min, Jae-Ki;Lee, DongHoon
    • Proceedings of the Korean Society for Agricultural Machinery Conference
    • /
    • 2017.04a
    • /
    • pp.161-161
    • /
    • 2017
  • 스마트 시설환경의 제어 요소는 난방기, 창 개폐, 수분/양액 밸브 개폐, 환풍기, 제습기 등 직접적으로 시설환경의 조절에 관여하는 인자와 정보 교환을 위한 통신, 사용자 인터페이스 등 간접적으로 제어에 관련된 요소들이 복합적으로 존재한다. PID 제어와 같이 하는 수학적 논리를 바탕으로 한 제어와 전문 관리자의 지식을 기반으로 한 비선형 학습 모델에 의한 제어 등이 공존할 수 있다. 이러한 다양한 요소들을 복합적으로 연동시키기 위해선 기존의 시퀀스 기반 제어 방식에는 한계가 있을 수 있다. 관행의 방식과 같이 시계열 상에서 획득한 충분한 데이터를 이용하여 제어의 양과 시점을 결정하는 방식은 예외 상황에 충분히 대처하기 어려운 단점이 있을 수 있다. 이러한 예외 상황은 자연적인 조건의 변화에 따라 불가피하게 발생하는 경우와 시스템의 오류에 기인하는 경우로 나뉠 수 있다. 본 연구에서는 실시간으로 변하는 시설환경 내의 다양한 환경요소를 실시간으로 분석하고 상응하는 제어를 수행하여 수학적이며 예측 가능한 논리에 의해 준비된 제어시스템을 보완할 방법을 연구하였다. 과거의 고성능 컴퓨팅(HPC; High Performance Computing)은 다수의 컴퓨터를 고속 네트워크로 연동하여 집적적으로 연산능력을 향상시킨 기술로 비용과 규모의 측면에서 많은 투자를 필요로 하는 첨단 고급 기술이었다. 핸드폰과 모바일 장비의 발달로 인해 소형 마이크로프로세서가 발달하여 근래 2 Ghz의 클럭 속도에 이르는 어플리케이션 프로세서(AP: Application Processor)가 등장하기도 하였다. 상대적으로 낮은 성능에도 불구하고 저전력 소모와 플랫폼의 소형화를 장점으로 한 AP를 시설환경의 실시간 제어에 응용하기 위한 방안을 연구하였다. CPU의 클럭, 메모리의 양, 코어의 수량을 다음과 같이 달리한 3가지 시스템을 비교하여 AP를 이용한 마이크로 클러스터링 기술의 성능을 비교하였다.1) 1.5 Ghz, 8 Processors, 32 Cores, 1GByte/Processor, 32Bit Linux(ARMv71). 2) 2.0 Ghz, 4 Processors, 32 Cores, 2GByte/Processor, 32Bit Linux(ARMv71). 3) 1.5 Ghz, 8 Processors, 32 Cores, 2GByte/Processor, 64Bit Linux(Arch64). 병렬 컴퓨팅을 위한 개발 라이브러리로 MPICH(www.mpich.org)와 Open-MP(www.openmp.org)를 이용하였다. 2,500,000,000에 이르는 정수 중 소수를 구하는 연산에 소요된 시간은 1)17초, 2)13초, 3)3초 이었으며, $12800{\times}12800$ 크기의 행렬에 대한 2차원 FFT 연산 소요시간은 각각 1)10초, 2)8초, 3)2초 이었다. 3번 경우는 클럭속도가 3Gh에 이르는 상용 데스크탑의 연산 속도보다 빠르다고 평가할 수 있다. 라이브러리의 따른 결과는 근사적으로 동일하였다. 선행 연구에서 획득한 3차원 계측 데이터를 1초 단위로 3차원 선형 보간법을 수행한 경우 코어의 수를 4개 이하로 한 경우 근소한 차이로 동일한 결과를 보였으나, 코어의 수를 8개 이상으로 한 경우 앞선 결과와 유사한 경향을 보였다. 현장 보급 가능성, 구축비용 및 전력 소모 등을 종합적으로 고려한 AP 활용 마이크로 클러스터링 기술을 지속적으로 연구할 것이다.

  • PDF

An Improvement of Still Image Quality Based on Error Resilient Entropy Coding for Random Error over Wireless Communications (무선 통신상 임의 에러에 대한 에러내성 엔트로피 부호화에 기반한 정지영상의 화질 개선)

  • Kim Jeong-Sig;Lee Keun-Young
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.3 s.309
    • /
    • pp.9-16
    • /
    • 2006
  • Many image and video compression algorithms work by splitting the image into blocks and producing variable-length code bits for each block data. If variable-length code data are transmitted consecutively over error-prone channel without any error protection technique, the receiving decoder cannot decode the stream properly. So the standard image and video compression algorithms insert some redundant information into the stream to provide some protection against channel errors. One of redundancies is resynchronization marker, which enables the decoder to restart the decoding process from a known state in the event of transmission errors, but its usage should be restricted not to consume bandwidth too much. The Error Resilient Entropy Code(EREC) is well blown method which can regain synchronization without any redundant information. It can work with the overall prefix codes, which many image compression methods use. This paper proposes EREREC method to improve FEREC(Fast Error-Resilient Entropy Coding). It first calculates initial searching position according to bit lengths of consecutive blocks. Second, initial offset is decided using statistical distribution of long and short blocks, and initial offset can be adjusted to insure all offset sequence values can be used. The proposed EREREC algorithm can speed up the construction of FEREC slots, and can improve the compressed image quality in the event of transmission errors. The simulation result shows that the quality of transmitted image is enhanced about $0.3{\sim}3.5dB$ compared with the existing FEREC when random channel error happens.

Efficient Pipeline Architecture of CABAC in H.264/AVC (H.264/AVC의 효율적인 파이프라인 구조를 적용한 CABAC 하드웨어 설계)

  • Choi, Jin-Ha;Oh, Myung-Seok;Kim, Jae-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.7
    • /
    • pp.61-68
    • /
    • 2008
  • In this paper, we propose an efficient hardware architecture and algorithm to increase an encoding process rate and implement a hardware for CABAC (Context Adaptive Binary Arithmetic Coding) which is used with one of the entropy coding ways for the latest video compression technique, H.264/AVC (Advanced Video Coding). CABAC typically provides a better high compression performance maximum 15% compared with CAVLC. However, the complexity of operation of CABAC is significantly higher than the CAVLC. Because of complicated data dependency during the encoding process, the complexity of operation is higher. Therefore, various architectures were proposed to reduce an amount of operation. However, they have still latency on account of complicated data dependency. The proposed architecture has two techniques to implement efficient pipeline architecture. The one is quick calculation of 7, 8th bits used to calculate a probability is the first step in Binary arithmetic coding. The other is one step reduced pipeline arcbitecture when the type of the encoded symbols is MPS. By adopting these two techniques, the required processing time was reduced about 27-29% compared with previous architectures. It is designed in a hardware description language and total logic gate count is 19K using 0.18um standard cell library.

Design of a Deep Neural Network Model for Image Caption Generation (이미지 캡션 생성을 위한 심층 신경망 모델의 설계)

  • Kim, Dongha;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.4
    • /
    • pp.203-210
    • /
    • 2017
  • In this paper, we propose an effective neural network model for image caption generation and model transfer. This model is a kind of multi-modal recurrent neural network models. It consists of five distinct layers: a convolution neural network layer for extracting visual information from images, an embedding layer for converting each word into a low dimensional feature, a recurrent neural network layer for learning caption sentence structure, and a multi-modal layer for combining visual and language information. In this model, the recurrent neural network layer is constructed by LSTM units, which are well known to be effective for learning and transferring sequence patterns. Moreover, this model has a unique structure in which the output of the convolution neural network layer is linked not only to the input of the initial state of the recurrent neural network layer but also to the input of the multimodal layer, in order to make use of visual information extracted from the image at each recurrent step for generating the corresponding textual caption. Through various comparative experiments using open data sets such as Flickr8k, Flickr30k, and MSCOCO, we demonstrated the proposed multimodal recurrent neural network model has high performance in terms of caption accuracy and model transfer effect.

Method of Detecting and Isolating an Attacker Node that Falsified AODV Routing Information in Ad-hoc Sensor Network (애드혹 센서 네트워크에서 AODV 라우팅 정보변조 공격노드 탐지 및 추출기법)

  • Lee, Jae-Hyun;Kim, Jin-Hee;Kwon, Kyung-Hee
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.12
    • /
    • pp.2293-2300
    • /
    • 2008
  • In ad-hoc sensor network, AODV routing information is disclosed to other nodes because AODV protocol doesn't have any security mechanisms. The problem of AODV is that an attacker can falsify the routing information in RREQ packet. If an attacker broadcasts the falsified packet, other nodes will update routing table based on the falsified one so that the path passing through the attacker itself can be considered as a shortest path. In this paper, we design the routing-information-spoofing attack such as falsifying source sequence number and hop count fields in RREQ packet. And we suggest an efficient scheme for detecting the attackers and isolating those nodes from the network without extra security modules. The proposed scheme doesn't employ cryptographic algorithm and authentication to reduce network overhead. We used NS-2 simulation to evaluate the network performance. And we analyzed the simulation results on three cases such as an existing normal AODV, AODV under the attack and proposed AODV. Simulation results using NS2 show that the AODV using proposed scheme can protect the routing-information-spoofing attack and the total n umber of received packets for destination node is almost same as the existing norm at AODV.

A Study on Motion Estimation Encoder Supporting Variable Block Size for H.264/AVC (H.264/AVC용 가변 블록 크기를 지원하는 움직임 추정 부호기의 연구)

  • Kim, Won-Sam;Sohn, Seung-Il
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.10
    • /
    • pp.1845-1852
    • /
    • 2008
  • The key elements of inter prediction are motion estimation(ME) and motion compensation(MC). Motion estimation is to find the optimum motion vectors, not only by using a distance criteria like the SAD, but also by taking into account the resulting number of 비트s in the 비트 stream. Motion compensation is compensate for movement of blocks of current frame. Inter-prediction Encoding is always the main bottleneck in high-quality streaming applications. Therefore, in real-time streaming applications, dedicated hardware for executing Inter-prediction is required. In this paper, we studied a motion estimator(ME) for H.264/AVC. The designed motion estimator is based on 2-D systolic array and it connects processing elements for fast SAD(Sum of Absolute Difference) calculation in parallel. By providing different path for the upper and lower lesion of each reference data and adjusting the input sequence, consecutive calculation for motion estimation is executed without pipeline stall. With data reuse technique, it reduces memory access, and there is no extra delay for finding optimal partitions and motion vectors. The motion estimator supports variable-block size and takes 328 cycles for macro-block calculation. The proposed architecture is local memory-free different from paper [6] using local memory. This motion estimation encoder can be applicable to real-time video processing.

Design and Performance Evaluation of Digital Twin Prototype Based on Biomass Plant (바이오매스 플랜트기반 디지털트윈 프로토타입 설계 및 성능 평가)

  • Chae-Young Lim;Chae-Eun Yeo;Seong-Yool Ahn;Myung-Ok Lee;Ho-Jin Sung
    • The Journal of the Convergence on Culture Technology
    • /
    • v.9 no.5
    • /
    • pp.935-940
    • /
    • 2023
  • Digital-twin technology is emerging as an innovative solution for all industries, including manufacturing and production lines. Therefore, this paper optimizes all the energy used in a biomass plant based on unused resources. We will then implement a digital-twin prototype for biomass plants and evaluate its performance in order to improve the efficiency of plant operations. The proposed digital-twin prototype applies a standard communication platform between the framework and the gateway and is implemented to enable real-time collaboration. and, define the message sequence between the client server and the gateway. Therefore, an interface is implemented to enable communication with the host server. In order to verify the performance of the proposed prototype, we set up a virtual environment to collect data from the server and perform a data collection evaluation. As a result, it was confirmed that the proposed framework can contribute to energy optimization and improvement of operational efficiency when applied to biomass plants.

${T_2}weighted$- Half courier Echo Planar Imaging

  • 김치영;김휴정;안창범
    • Investigative Magnetic Resonance Imaging
    • /
    • v.5 no.1
    • /
    • pp.57-65
    • /
    • 2001
  • Purpose : $T_2$-weighted half courier Echo Planar Imaging (T2HEPI) method is proposed to reduce measurement time of existing EPI by a factor of 2. In addition, high $T_2$ contrast is obtained for clinical applications. High resolution single-shot EPI images with $T_2$ contrast are obtained with $128{\times}128$ matrix size by the proposed method. Materials and methods : In order to reduce measurement time in EPI, half courier space is measured, and rest of half courier data is obtained by conjugate symmetric filling. Thus high resolution single shot EPI image with $128{\times}128$ matrix size is obtained with 64 echoes. By the arrangement of phase encoding gradients, high $T_2$ weighted images are obtained. The acquired data in k-space are shifted if there exists residual gradient field due to eddy current along phase encoding gradient, which results in a serious problem in the reconstructed image. The residual field is estimated by the correlation coefficient between the echo signal for dc and the corresponding reference data acquired during the pre-scan. Once the residual gradient field is properly estimated, it can be removed by the adjustment of initial phase encoding gradient field between $70^{\circ}$ and $180^{\circ}$ rf pulses. Results : The suggested T2EPl is implemented in a 1.0 Tela whole body MRI system. Experiments are done with the effective echo times of 72ms and 96ms with single shot acquisitions. High resolution($128{\times}128$) volunteer head images with high $T_2$ contrast are obtained in a single scan by the proposed method. Conclusion : Using the half courier technique, higher resolution EPI images are obtained with matrix size of $128{\times}128$ in a single scan. Furthermore $T_2$ contrast is controlled by the effective echo time. Since the suggested method can be implemented by software alone (pulse sequence and corresponding tuning and reconstruction algorithms) without addition of special hardware, it can be widely used in existing MRI systems.

  • PDF