Memory Architecture

Efficient GPU Framework for Adaptive and Continuous Signed Distance Field Construction, and Its Applications

  • Kim, Jong-Hyun
    • Journal of the Korea Society of Computer and Information / v.27 no.3 / pp.63-69 / 2022
  • In this paper, we propose a new GPU-based framework for quickly computing adaptive and continuous signed distance fields (SDFs), and we examine rendering and collision-processing applications built on them. A quadtree constructed from the triangle mesh is transferred to GPU memory, and each thread uses it in parallel to compute the Euclidean distance to the nearest triangle, yielding the shortest distance without discontinuities across the adaptive grid space. Experiments show that cut-away views of the adaptive distance field, distance queries at specific locations, real-time ray tracing, and collision handling can all be performed quickly and efficiently. With the proposed method, an adaptive signed distance field can be computed in about one second even for high-polygon meshes, so the approach is practical not only for rigid bodies but also for deformable bodies. Various experimental results demonstrate the stability of the algorithm and its ability to sample and represent distance values accurately across a range of models.
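
The kernel of such a framework is a point-to-triangle distance evaluated per grid sample. The sketch below is a minimal CPU version in Python/NumPy, not the paper's GPU quadtree implementation: it shows the standard closest-point-on-triangle test and a brute-force unsigned distance query; the sign computation and the adaptive quadtree traversal are what the paper's method supplies on top.

```python
import numpy as np

def closest_point_on_triangle(p, a, b, c):
    """Closest point to p on triangle (a, b, c); standard barycentric
    region test (Ericson, Real-Time Collision Detection)."""
    ab, ac, ap = b - a, c - a, p - a
    d1, d2 = ab @ ap, ac @ ap
    if d1 <= 0 and d2 <= 0:
        return a                                    # vertex region A
    bp = p - b
    d3, d4 = ab @ bp, ac @ bp
    if d3 >= 0 and d4 <= d3:
        return b                                    # vertex region B
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        return a + (d1 / (d1 - d3)) * ab            # edge AB
    cp = p - c
    d5, d6 = ab @ cp, ac @ cp
    if d6 >= 0 and d5 <= d6:
        return c                                    # vertex region C
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        return a + (d2 / (d2 - d6)) * ac            # edge AC
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        return b + ((d4 - d3) / ((d4 - d3) + (d5 - d6))) * (c - b)  # edge BC
    denom = va + vb + vc                            # interior of triangle
    return a + ab * (vb / denom) + ac * (vc / denom)

def unsigned_distance(p, triangles):
    """Brute-force nearest-triangle distance for one sample point; the
    paper replaces this linear scan with a GPU quadtree traversal."""
    return min(np.linalg.norm(p - closest_point_on_triangle(p, a, b, c))
               for a, b, c in triangles)
```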

Application of Artificial Neural Network to Flamelet Library for Gaseous Hydrogen/Liquid Oxygen Combustion at Supercritical Pressure (초임계 압력조건에서 기체수소-액체산소 연소해석의 층류화염편 라이브러리에 대한 인공신경망 학습 적용)

  • Jeon, Tae Jun;Park, Tae Seon
    • Journal of the Korean Society of Propulsion Engineers / v.25 no.6 / pp.1-11 / 2021
  • To develop an efficient procedure for the flamelet library, a machine learning process based on an artificial neural network (ANN) is applied to a gaseous hydrogen/liquid oxygen combustor under a supercritical pressure condition. For the hidden layers, 25 combinations of the rectified linear unit (ReLU) and hyperbolic tangent activations are evaluated to find an optimal architecture in terms of computational efficiency and training performance. Among the activation functions, the hyperbolic tangent proves best suited to achieving high learning performance for accurate property prediction. A transformation of the learning data is proposed to improve training performance. When the optimal number of nodes is arranged over 4 hidden layers, the network is found to be the most efficient in terms of training performance and computational cost. Compared to the interpolation procedure, the ANN procedure reduces computational time and system memory by 37% and 99.98%, respectively.
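
A flamelet-library surrogate of this kind is, structurally, a small regression MLP replacing table interpolation. The sketch below is a minimal PyTorch version under stated assumptions: the input/output dimensions and layer width are illustrative, not the paper's; only the 4-hidden-layer depth and tanh activations follow the abstract.

```python
import torch
import torch.nn as nn

# Minimal sketch, assuming the library maps a few control variables
# (e.g., mixture fraction, scalar dissipation rate) to thermochemical
# properties; n_in, n_out, and width are illustrative choices.
class FlameletSurrogate(nn.Module):
    def __init__(self, n_in=3, n_out=8, width=64):
        super().__init__()
        layers, d = [], n_in
        for _ in range(4):          # 4 hidden layers, as the paper found optimal
            layers += [nn.Linear(d, width), nn.Tanh()]  # tanh gave best accuracy
            d = width
        layers.append(nn.Linear(d, n_out))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = FlameletSurrogate()
props = model(torch.rand(16, 3))    # one batched query replaces 16 table lookups
```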

Microcode based Controller for Compact CNN Accelerators Aimed at Mobile Devices (모바일 디바이스를 위한 소형 CNN 가속기의 마이크로코드 기반 컨트롤러)

  • Na, Yong-Seok;Son, Hyun-Wook;Kim, Hyung-Won
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.3 / pp.355-366 / 2022
  • This paper proposes a microcode-based neural network accelerator controller for artificial intelligence accelerators that can be reconfigured through a programmable architecture while retaining the advantages of low power and very small chip size. So that the target accelerator can support various neural network models, a model is converted into microcode by a microcode compiler and loaded onto the accelerator, where it controls the accelerator's operators, such as the datapath and memory access. While the proposed controller and accelerator can run various CNN models, in this paper we test them using the YOLOv2-Tiny CNN model. With a 200 MHz system clock, the controller and accelerator achieve an inference time of 137.9 ms/image on the VOC 2012 object detection dataset and 99.5 ms/image on a mask-detection dataset. When an accelerator equipped with the proposed controller is implemented as a silicon chip, the gate count is 618,388, corresponding to a 65.5% reduction in chip area compared with an accelerator employing a CPU-based controller (RISC-V).
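
The controlling principle can be pictured as a fetch-decode-dispatch loop over packed instruction words emitted by the microcode compiler. The Python model below is hypothetical: the field layout, opcodes, and unit names are invented for illustration and are not the paper's encoding; it only shows the idea of microcode words driving datapath and memory units.

```python
# Hypothetical 32-bit microcode word: [opcode:4 | src:10 | dst:10 | len:8].
OPCODES = {0x1: "LOAD", 0x2: "STORE", 0x3: "CONV", 0x4: "POOL"}

def encode(op, src, dst, length):
    return (op << 28) | (src << 18) | (dst << 8) | length

def decode(word):
    return ((word >> 28) & 0xF, (word >> 18) & 0x3FF,
            (word >> 8) & 0x3FF, word & 0xFF)

def run(program, units):
    """Dispatch loop: each decoded word triggers one accelerator unit."""
    for word in program:
        op, src, dst, length = decode(word)
        units[OPCODES[op]](src, dst, length)

# Usage: a tiny "compiled" model -- load a tile, convolve it, store results.
program = [encode(0x1, 0, 100, 64), encode(0x3, 100, 200, 64),
           encode(0x2, 200, 0, 64)]
units = {name: (lambda s, d, n, name=name: print(name, s, d, n))
         for name in OPCODES.values()}
run(program, units)
```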

Twin models for high-resolution visual inspections

  • Seyedomid Sajedi;Kareem A. Eltouny;Xiao Liang
    • Smart Structures and Systems / v.31 no.4 / pp.351-363 / 2023
  • Visual structural inspections are an inseparable part of post-earthquake damage assessments. With unmanned aerial vehicles (UAVs) establishing a new frontier in visual inspections, there are major computational challenges in processing the massive amounts of collected high-resolution visual data. We propose twin deep learning models that can efficiently produce accurate high-resolution structural-component and damage segmentation masks. The traditional approaches to coping with high memory and computational demands are either to uniformly downsample the raw images, at the price of losing fine local details, or to crop smaller parts of the images, leading to a loss of global contextual information. Our twin models, comprising the Trainable Resizing for high-resolution Segmentation Network (TRS-Net) and DmgFormer, therefore approach the global and local semantics from different perspectives. TRS-Net is a compound high-resolution segmentation architecture equipped with learnable downsampler and upsampler modules that minimize information loss for optimal performance and efficiency. DmgFormer combines a transformer backbone with a convolutional decoder head with skip connections, applied on a grid of crops, aiming for high-precision learning without downsizing. An augmented inference technique further boosts performance and reduces the possible loss of context due to grid cropping. Comprehensive experiments have been performed on the 3D physics-based graphics model (PBGM) synthetic environments in the QuakeCity dataset. The proposed framework is evaluated using several metrics on three segmentation tasks: component type, component damage state, and global damage (crack, rebar, spalling). The models were developed as part of the 2nd International Competition for Structural Health Monitoring.
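
To make the grid-of-crops idea concrete, the sketch below tiles a high-resolution image, segments each crop, and stitches the logits back together. It is a simplification under stated assumptions: the stand-in model, crop size, and non-overlapping tiling are illustrative, and DmgFormer's augmented inference (which recovers context lost at crop borders) is omitted.

```python
import torch

def grid_segment(model, image, crop=512):
    """Tile an image (C, H, W) into crops, segment each, and stitch the
    logits back; assumes the model preserves spatial dimensions."""
    C, H, W = image.shape
    out = None
    for y in range(0, H, crop):
        for x in range(0, W, crop):
            tile = image[:, y:y + crop, x:x + crop].unsqueeze(0)
            logits = model(tile).squeeze(0)         # (num_classes, h, w)
            if out is None:
                out = image.new_zeros(logits.shape[0], H, W)
            out[:, y:y + crop, x:x + crop] = logits
    return out

# Stand-in "model" (4 classes); replace with a TRS-Net/DmgFormer-style net.
model = lambda t: torch.zeros(t.shape[0], 4, t.shape[2], t.shape[3])
masks = grid_segment(model, torch.rand(3, 1024, 1536)).argmax(0)
```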

Analysis of the Impact of Host Resource Exhaustion Attacks in a Container Environment (컨테이너 환경에서의 호스트 자원 고갈 공격 영향 분석)

  • Jun-hee Lee;Jae-hyun Nam;Jin-woo Kim
    • Journal of the Korea Institute of Information Security & Cryptology / v.33 no.1 / pp.87-97 / 2023
  • Containers are an emerging virtualization technology that can build an isolated environment more lightweight and faster than existing virtual machines. For that reason, many organizations have recently adopted them for their services. Yet the container architecture has also exposed many security problems, since all containers share the same OS kernel. In this work, we focus on the fact that an attacker can abuse host resources to make them unavailable to benign containers, also known as a host resource exhaustion attack. We then analyze the impact of host resource exhaustion attacks through real attack scenarios that exhaust critical host resources, such as CPU, memory, disk space, process IDs, and sockets, in Docker, the most popular container platform. We propose five attack scenarios performed in several different host environments and container images. The results show that three of them put other containers into denial of service.
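
The standard countermeasure to such attacks is per-container cgroup limits. The sketch below uses the Docker Python SDK (docker-py) to start a container with bounded memory, process count, and CPU; the image name and limit values are illustrative, not the paper's test setup.

```python
import docker  # pip install docker

client = docker.from_env()

# Minimal mitigation sketch: cap the host resources one container can
# consume so an exhaustion attack cannot starve its neighbors.
container = client.containers.run(
    "alpine:latest", "sleep 300",
    detach=True,
    mem_limit="256m",        # cap memory (cgroup memory controller)
    pids_limit=100,          # cap processes/threads (blocks fork bombs)
    nano_cpus=500_000_000,   # 0.5 CPU (cgroup cpu controller)
)
print(container.id)
```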

Network Anomaly Traffic Detection Using WGAN-CNN-BiLSTM in Big Data Cloud-Edge Collaborative Computing Environment

  • Yue Wang
    • Journal of Information Processing Systems / v.20 no.3 / pp.375-390 / 2024
  • Edge computing architecture has effectively alleviated the computing pressure on cloud platforms, reduced network bandwidth consumption, and improved the quality of the user experience; however, it has also introduced new security issues. Existing anomaly detection methods in big-data scenarios with cloud-edge computing collaboration face several challenges, such as sample imbalance, difficulty in dealing with complex network traffic attacks, and difficulty in efficiently training large-scale or overly complex deep-learning network models. A lightweight deep-learning model is proposed to address these challenges. First, normalization on the user side is used to preprocess the traffic data. On the edge side, a trained Wasserstein generative adversarial network (WGAN) supplements the data samples, which effectively alleviates the imbalance among minority sample classes while occupying only a small amount of edge-computing resources. Finally, a pretrained lightweight deep-learning network model is deployed on the edge side, and the preprocessed and expanded local data are used to fine-tune it. This ensures that the data of each edge node better match local characteristics, effectively improving the system's detection ability. In the designed lightweight model, two convolution-pooling blocks of a convolutional neural network (CNN) extract spatial features, a bidirectional long short-term memory network (BiLSTM) collects time-sequence features, and an attention mechanism reweights the traffic features, improving the model's ability to identify abnormal traffic. The proposed model was demonstrated experimentally on the NSL-KDD, UNSW-NB15, and CIC-IDS2018 datasets, reaching accuracies of 0.974, 0.925, and 0.953, respectively, superior to the comparison models. The proposed lightweight deep-learning network model has good application prospects for anomalous traffic detection in cloud-edge collaborative computing architectures.
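
The detector's layer stack (two conv-pool blocks, a BiLSTM, and attention-weighted pooling) translates directly into a small PyTorch module. The sketch below follows that stack under stated assumptions: channel widths, hidden sizes, and the feature count (41, matching NSL-KDD's feature vector) are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

# Minimal sketch of the CNN + BiLSTM + attention detector.
class CnnBiLstmAttn(nn.Module):
    def __init__(self, n_features=41, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(                   # two conv-pool blocks
            nn.Conv1d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.lstm = nn.LSTM(64, 64, batch_first=True, bidirectional=True)
        self.score = nn.Linear(128, 1)              # attention scores
        self.fc = nn.Linear(128, n_classes)

    def forward(self, x):                 # x: (batch, n_features)
        h = self.cnn(x.unsqueeze(1))      # (batch, 64, n_features // 4)
        h, _ = self.lstm(h.transpose(1, 2))         # (batch, T, 128)
        w = torch.softmax(self.score(h), dim=1)     # weight each time step
        return self.fc((w * h).sum(dim=1))          # attention-weighted pooling

logits = CnnBiLstmAttn()(torch.rand(8, 41))
```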

Speech Emotion Recognition in People at High Risk of Dementia

  • Dongseon Kim;Bongwon Yi;Yugwon Won
    • Dementia and Neurocognitive Disorders / v.23 no.3 / pp.146-160 / 2024
  • Background and Purpose: The emotions of people at various stages of dementia need to be effectively utilized for prevention, early intervention, and care planning. With technology now available for understanding and addressing people's emotional needs, this study aims to develop speech emotion recognition (SER) technology that classifies emotions for people at high risk of dementia. Methods: Speech samples from people at high risk of dementia were categorized into distinct emotions via human auditory assessment, and the outcomes were annotated to supervise the deep-learning method. The architecture incorporated a convolutional neural network, long short-term memory, attention layers, and Wav2Vec2, a novel feature extractor, to develop automated speech emotion recognition. Results: Twenty-seven kinds of emotions were found in the speech of the participants. These were grouped into 6 detailed emotions (happiness, interest, sadness, frustration, anger, and neutrality) and further into 3 basic emotions (positive, negative, and neutral). To improve algorithmic performance, multiple learning approaches were applied using different data sources (voice and text) and varying numbers of emotion classes. Ultimately, a 2-stage algorithm, with initial text-based classification followed by voice-based analysis, achieved the highest accuracy, reaching 70%. Conclusions: The diverse emotions identified in this study were attributed to the characteristics of the participants and the method of data collection. That the participants were speaking to companion robots also explains the relatively low performance of the SER algorithm. Accordingly, this study suggests the systematic and comprehensive construction of a dataset from people with dementia.
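
One plausible reading of the 2-stage design is a text-first pipeline with an acoustic fallback; the sketch below shows that control flow. The confidence-threshold hand-off and both classifier stand-ins are assumptions for illustration, not the published rule (the paper's models combine Wav2Vec2, CNN, LSTM, and attention layers).

```python
# Mapping from the 6 detailed emotions to the 3 basic emotions.
BASIC = {"happiness": "positive", "interest": "positive",
         "sadness": "negative", "frustration": "negative",
         "anger": "negative", "neutrality": "neutral"}

def classify_two_stage(text_model, voice_model, transcript, waveform,
                       threshold=0.6):
    """Stage 1: text-based prediction. Stage 2: fall back to the voice
    model when the text model is not confident. The threshold hand-off
    is an illustrative assumption, not the paper's published logic."""
    label, conf = text_model(transcript)      # -> (emotion, probability)
    if conf < threshold:
        label, conf = voice_model(waveform)
    return label, BASIC[label]
```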

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • Deep learning models are a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. These have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models, and in recent years these supervised models have gained more popularity than unsupervised models such as deep belief networks, because they have produced striking applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation, an abbreviation of "backward propagation of errors", is a common method of training artificial neural networks used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network; the gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. This architecture makes convolutional networks fast to train, which in turn lets us train deep, multi-layer networks that are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks rest on three basic ideas: local receptive fields, shared weights, and pooling. Local receptive fields mean that each neuron in the first (or any) hidden layer is connected to only a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each of the local receptive fields, so all the neurons in a hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers, usually placed immediately after convolutional layers; pooling layers simplify the information in the output of the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep networks took weeks a few years ago, but thanks to progress in GPUs and algorithmic improvements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state that allows the network to exhibit dynamic temporal behavior; unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, even harder than deep feedforward networks. The reason is the unstable gradient problem, i.e., vanishing and exploding gradients: the gradient can get smaller and smaller as it is propagated back through the layers, which makes learning in early layers extremely slow. The problem is actually worse in RNNs, since gradients are propagated backward not just through layers but through time; if the network runs for a long time, the gradient can become extremely unstable and hard to learn from. It has become possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs; LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
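
The three convolutional ideas named in the abstract (local receptive fields, shared weights, pooling) and the backpropagation-plus-gradient-descent training loop map one-to-one onto a few lines of PyTorch. The sketch below is a minimal illustration sized for 28x28 grayscale input; the layer widths are arbitrary choices.

```python
import torch
import torch.nn as nn

# Each Conv2d implements local receptive fields (5x5 windows) with shared
# weights (one kernel scanned over the image); each MaxPool2d simplifies
# the preceding layer's output.
net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(2),                      # 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(2),                      # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),
)

# Backpropagation computes the gradient of the error with respect to
# every weight; gradient descent then uses it to update the weights.
x, target = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
loss = nn.CrossEntropyLoss()(net(x), target)
loss.backward()                           # backward propagation of errors
with torch.no_grad():
    for p in net.parameters():
        p -= 0.01 * p.grad                # one gradient-descent step
```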

A Study on the Imitation and Transformation of Gugok-Wonlim Culture through Management of the Myungam Jeong Sik's Muyi-Gugok in Sancheong (명암(明庵) 정식(鄭拭)의 산청 무이구곡(武夷九曲) 원림경영을 통해 본 구곡문화의 모방과 변용)

  • Rho, Jae-Hyun
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.33 no.3 / pp.84-94 / 2015
  • This study examines how admiration for Chutzu (朱子), who preached his teachings in Wuyi-Gugok (武夷九曲) on Mt. Wuyi after renouncing the world, and its Joseon-style transformation unfolded and developed through Gugok management practices such as location, naming, poetic diction, Jungsa (精舍) architecture, and rock engraving in the Muyi-Gugok of Mt. Gugok, Sancheong. The results are as follows. The Muyi-Gugok of Myungam (明庵) Jeong Sik (鄭拭, 1683~1746), which consists of Suhongkyo (垂虹橋, gok 1) - Oknyeobong (玉女峰) - Nhongwhaldam (弄月潭) - Nacwhadam (落花潭) - Daeeunbyeong (大隱屛) - Gwangpyungryea (光風瀨) - Jaewhaldae (霽月臺) - Gorooam (鼓樓巖) - Wharyongpok (臥龍瀑, gok 9), is a representative case in which Chutzu's Wuyi-Gugok was copied exactly and realized in the Joseon Dynasty. Within the larger frame of Gugok-Wonlim culture, Myungam's management of Muyi-Gugok expresses, in a rigid way, a will to succeed the Dotong (道統) through admiration for Chutzu. Mt. Gugok is also called Mt. Muyi, the Gugok is named Muyi-Gugok, and his residence stood between gok 4 and gok 5. In addition, the Jeongsa built for Gugok management is likewise named 'Muyi Jeongsa (武夷精舍)', and the gok names and the contents of the poetry are also similar; all of this is clear evidence that Myungam tried to copy Chutzu's Wuyi-Gugok onto Mt. Gugok. Gugoks established before Myungam were also located on Mt. Gugok, and among them four verified gok names correspond to those of Chutzu's Wuyi-Gugok, indicating that conforming behavior, as one way of admiring Chutzu, had already reached Mt. Gugok before Myungam; this provided an opportunity to widen the tradition and horizon of Muyi-Gugok on Mt. Gugok. Considering that Myungam's gok 6, Gwangpyungryea, and gok 7, Jewoldae, take their names from 'Gwangpungjewol (光風霽月)', based on Chutzu's poem, and are closely related to the Joseon classical-scholar spirit, they represent a Joseon-style transformation of Chutzu's Wuyi-Gugok. Meanwhile, gok 5, 'Daeeunbyeong', was transformed into 'Nangaam (爛柯巖)' as gok 5 of the "Deoksan-Gugok (德山九曲)" of Jooko (竹塢) Ha Beom-Woon (河範運, 1792~1858), and engravings of those characters have been handed down. In the "Poem of Deoksan-Gugok", transformed from Myungam's Muyi-Gugok, respect and admiration for Chutzu are weakened, while Ha Beom-Woon admires Nammyeong (南冥) Cho Shik (曺植, 1501~1572), the symbolic figure of his own school; from this, a movement to promote partisan unity can be identified. After Myungam died, Muyi-Gugok on Mt. Gugok was transformed from a space for succeeding Chutzu's Dotong into one for commemorating the memory of ancient sages; it is a typical case that widened the spectrum of Joseon's Gugok-Wonlim culture through the imitation and transformation of Muyi-Gugok.

A Dynamic Prefetch Filtering Schemes to Enhance Usefulness Of Cache Memory (캐시 메모리의 유용성을 높이는 동적 선인출 필터링 기법)

  • Chon Young-Suk;Lee Byung-Kwon;Lee Chun-Hee;Kim Suk-Il;Jeon Joong-Nam
    • The KIPS Transactions:PartA / v.13A no.2 s.99 / pp.123-136 / 2006
  • Prefetching is an effective way to reduce the latency caused by memory accesses. However, overly aggressive prefetching not only leads to cache pollution, canceling out the benefits of prefetching, but also increases bus traffic, degrading overall performance. In this thesis, a prefetch filtering scheme is proposed that dynamically decides whether to issue a prefetch by consulting a filtering table, thereby reducing the cache pollution caused by unnecessary prefetches. First, the prefetch hashing table 1-bit state filtering scheme (PHT1bSC) is analyzed to expose the lock problem of the conventional scheme; like the conventional scheme, it uses N:1 mapping, but each entry holds a two-state 1-bit value. A complete block address table filtering scheme (CBAT) is introduced as a reference for the comparative study. A prefetch block address lookup table scheme (PBALT), the main idea of this paper, is then proposed and exhibits the most exact filtering performance. This scheme uses a table of the same length as PHT1bSC, with entries holding the same fields as CBAT, and recently referenced data block addresses are mapped 1:1 to entries of the filter table. Simulations over commonly used prefetch schemes, general benchmarks, and multimedia programs with varying cache parameters show that, compared with no filtering, the PBALT scheme improves performance by up to 22%, and its enhanced filtering accuracy decreases the cache miss ratio by 7.9% compared with the conventional PHT2bSC. The MADT of the proposed PBALT scheme is also decreased by 6.1% compared with conventional schemes, reducing the total execution time.
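
The core mechanism, a table consulted before each prefetch to suppress unnecessary ones, can be modeled in a few lines. The sketch below is a simplified Python model under stated assumptions: the table geometry, the replacement rule, and the suppress-on-hit policy are illustrative, not the exact PBALT field layout; only the full-address 1:1 mapping (versus hashed N:1 mapping) follows the thesis.

```python
# Simplified model of a prefetch filter table: before issuing a prefetch,
# look the block address up in a direct-mapped table of recently seen
# blocks and suppress the prefetch on a hit.
class PrefetchFilter:
    def __init__(self, entries=1024, block_bits=6):
        self.entries = entries
        self.block_bits = block_bits          # 64-byte blocks
        self.table = [None] * entries         # one full block address per slot

    def _slot(self, addr):
        block = addr >> self.block_bits
        return block % self.entries, block

    def record_access(self, addr):
        """Demand access: remember this block in the filter table."""
        slot, block = self._slot(addr)
        self.table[slot] = block

    def should_prefetch(self, addr):
        """Issue the prefetch only if the block is not already filtered;
        the full-address compare avoids the aliasing (and the resulting
        lock problem) of hashed N:1 schemes."""
        slot, block = self._slot(addr)
        return self.table[slot] != block
```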