• Title/Abstract/Keywords: Memory machine


A Study on 16 bit EISC Microprocessor (16 비트 EISC 마이크로 프로세서에 관한 연구)

  • 조경연
    • Journal of Korea Multimedia Society / v.3 no.2 / pp.192-200 / 2000
  • 8-bit and 16-bit microprocessors are widely used in small-sized control machines. Embedded microprocessors, which are integrated on a single chip with memory and I/O circuits, must have simple hardware and high code density. This paper proposes SE1608, a 16-bit high-code-density EISC (Extendable Instruction Set Computer) microprocessor. SE1608 has 8 general-purpose registers and a 16-bit fixed-length instruction set with short offsets and small immediate operands. An extend register and an extend flag allow the offset and immediate operand of an instruction to be extended. SE1608 is implemented on a 12,000-gate FPGA, and all of its functions have been tested and verified at 8 MHz. A cross assembler, a cross C/C++ compiler, and an instruction simulator for SE1608 have also been designed and verified. The paper shows that the code density of SE1608 is 140% and 115% higher than that of the 16-bit microprocessors H-8300 and MN10200, respectively, which is much higher than that of traditional microprocessors. Consequently, SE1608 is well suited to embedded use, since it requires less program memory than the others and only simple hardware.
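
The extend register and extend flag mentioned in the abstract can be pictured with a small model. This is only a sketch under stated assumptions: the field widths (`EXT_BITS`, `IMM_BITS`), the class `ExtendState`, and its method names are hypothetical and are not taken from the SE1608 instruction set, which the abstract does not specify.

```python
# Hypothetical field widths, chosen for illustration only.
EXT_BITS = 12   # payload bits carried by an assumed "extend" instruction
IMM_BITS = 4    # short immediate field of an assumed ordinary instruction
WORD_MASK = 0xFFFF  # 16-bit data path


class ExtendState:
    """Toy model of an extend register (ER) plus an extend flag (EF)."""

    def __init__(self):
        self.er = 0
        self.ef = False

    def exec_extend(self, payload):
        """Assumed extend instruction: load or shift the payload into ER and set EF."""
        payload &= (1 << EXT_BITS) - 1
        if self.ef:
            # A second extend in a row shifts the earlier bits up.
            self.er = ((self.er << EXT_BITS) | payload) & WORD_MASK
        else:
            self.er = payload
        self.ef = True

    def resolve_immediate(self, short_imm):
        """Operand seen by the next ordinary instruction; clears EF afterwards."""
        short_imm &= (1 << IMM_BITS) - 1
        if self.ef:
            value = ((self.er << IMM_BITS) | short_imm) & WORD_MASK
        else:
            value = short_imm
        self.ef = False
        return value
```

In this model a short immediate costs one 16-bit instruction, and only the less common wide immediates pay for an extra extend instruction, which is the intuition behind the code-density claim.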


Performance Improvement of Parallel Processing System through Runtime Adaptation (실행시간 적응에 의한 병렬처리시스템의 성능개선)

  • Park, Dae-Yeon;Han, Jae-Seon
    • Journal of KIISE:Computer Systems and Theory / v.26 no.7 / pp.752-765 / 1999
  • On parallel machines, where performance parameters change dynamically in complex and unpredictable ways, it is difficult for compilers to predict the optimal values of those parameters at compile time. Furthermore, these optimal values may change as the program executes. This paper addresses the problem by proposing adaptive execution, which makes program or control execution adapt in response to changes in machine conditions. The paper presents a theoretical methodology and an implementation approach for adaptive program execution, in which programs adapt themselves at run time through the collaboration of the hardware and the compiler. For adaptive control execution, the adaptive scheme is applied to the granularity of data sharing (adaptive granularity): the size of shared data is varied according to how the data is shared among processors in a distributed shared memory system. Adaptive granularity is a communication scheme that effectively and transparently integrates bulk transfer into the shared memory paradigm, with a granularity that varies depending on the sharing behavior. Simulation results show that adaptive granularity improves performance by up to 43% over the hardware implementation of distributed shared memory systems.

Analysis of Tensor Processing Unit and Simulation Using Python (텐서 처리부의 분석 및 파이썬을 이용한 모의실행)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.3 / pp.165-171 / 2019
  • Studies of computer architecture have shown that major improvements in price-energy-performance stem from domain-specific hardware development. This paper analyzes the tensor processing unit (TPU), an ASIC that accelerates inference for artificial neural networks (NNs). The core of the TPU is a MAC matrix multiplier capable of high-speed operation, together with software-managed on-chip memory. The TPU's execution model meets the response-time requirements of neural networks better than the existing CPU and GPU execution models, and it does so with small area and low power consumption despite its many MACs and large memory. On benchmarks using the TensorFlow framework, the TPU achieves higher performance and better power efficiency than a CPU or GPU. In this paper, we analyze the TPU, simulate OpenTPU, which is modeled in Python, and synthesize the matrix multiplication unit, the key hardware component.

Automatic False-Alarm Labeling for Sensor Data

  • Adi, Taufik Nur;Bae, Hyerim;Wahid, Nur Ahmad
    • Journal of the Korea Society of Computer and Information / v.24 no.2 / pp.139-147 / 2019
  • A false alarm, which is an incorrect report of an emergency, can trigger an unnecessary action. The predictive maintenance framework developed in our previous work has a feature whereby a machine alarm is triggered based on sensor data evaluation. The sensor data evaluator performs three essential evaluation steps. First, it evaluates each sensor data value against its threshold (lower and upper bound) and labels the value as "alarm" when the threshold is exceeded. Second, it calculates the duration of the alarm occurrence. Finally, a domain expert assesses the results of the previous two steps and determines whether the alarm is true or false. The current evaluation method has two drawbacks: it suffers from a high false-alarm ratio, and, given the vast amount of sensor data the domain expert must assess, the evaluation process is prolonged and inefficient. In this paper, we propose a method for automatic false-alarm labeling that mimics how the domain expert determines false alarms. The domain expert evaluates two critical factors: the duration of the alarm occurrence and the presence of anomalies before or while the alarm occurs. In our proposed method, Hierarchical Temporal Memory (HTM) is used to detect anomalies. It is an unsupervised approach that suits the main characteristic of our data, namely the lack of examples of normal sensor data. The results show that the technique is effective for automatic labeling of false alarms in sensor data.
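
The three evaluation steps in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `min_duration` and `lookback` parameters, and the rule for combining duration with anomalies are all assumptions. The anomaly flags are taken as an input here; in the paper they would come from an HTM detector.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AlarmEpisode:
    start: int      # index of the first out-of-range sample
    duration: int   # number of consecutive out-of-range samples


def label_alarms(values: Sequence[float], lower: float, upper: float) -> List[bool]:
    """Step 1: flag every sample that falls outside [lower, upper]."""
    return [v < lower or v > upper for v in values]


def alarm_episodes(flags: Sequence[bool]) -> List[AlarmEpisode]:
    """Step 2: group consecutive alarm samples and measure each run's duration."""
    episodes = []
    start = None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            episodes.append(AlarmEpisode(start, i - start))
            start = None
    if start is not None:
        episodes.append(AlarmEpisode(start, len(flags) - start))
    return episodes


def is_false_alarm(episode: AlarmEpisode, anomaly_flags: Sequence[bool],
                   min_duration: int, lookback: int) -> bool:
    """Step 3 (assumed rule): an alarm is labeled false when it is short AND
    no anomaly was flagged shortly before or during it."""
    window_start = max(0, episode.start - lookback)
    window_end = episode.start + episode.duration
    anomaly_seen = any(anomaly_flags[window_start:window_end])
    return episode.duration < min_duration and not anomaly_seen
```

The decision rule in `is_false_alarm` stands in for the expert's judgment. The abstract names the two factors but not how they are weighed, so a real system would need to calibrate that rule against expert labels.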

Prediction of Sea Water Condition Changes using LSTM Algorithm for the Fish Farm (LSTM 알고리즘을 이용한 양식장 해수 상태 변화 예측)

  • Rijayanti, Rita;Hwang, Mintae
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.3 / pp.374-380 / 2022
  • This paper presents the results of a study that predicts changes in seawater conditions at sea farms using the machine learning-based long short-term memory (LSTM) algorithm. Hardware was built with dissolved oxygen, salinity, nitrogen ion concentration, and water temperature sensors to collect seawater condition information from sea farms, and the data was transferred to a cloud-based Firebase database over LoRa communication. Using the developed hardware, seawater condition information was collected around fish farms in Tongyeong and Geoje, and an LSTM model trained on these real datasets produced predictions with 87% accuracy. Flask and REST APIs were used to provide users with predictions for each of the four parameters, including dissolved oxygen. These predictions are expected to help fishermen reduce the significant damage caused by mass fish mortality by reporting changes in sea conditions in advance.

Forecasting volatility index by temporal convolutional neural network (Causal temporal convolutional neural network를 이용한 변동성 지수 예측)

  • Ji Won Shin;Dong Wan Shin
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.129-139 / 2023
  • Forecasting volatility is essential to avoiding the risk caused by the uncertainties of a financial asset. Complicated features of financial volatility, such as ambiguity between non-stationarity and stationarity, asymmetry, long memory, and sudden, fairly large outlier-like values, pose great challenges to volatility forecasting. To address such complicated features implicitly, we consider machine learning models such as LSTM (1997) and GRU (2014), which are known to be suitable for time series forecasting. However, these models suffer from vanishing gradients, an enormous amount of computation, and large memory requirements. To solve these problems, a causal temporal convolutional network (TCN) model, an advanced form of 1D CNN, is also applied. The overall forecasting power of the TCN model is found to be higher than that of the RNN models in forecasting VIX, VXD, and VXN, the daily volatility indices of the S&P 500, DJIA, and Nasdaq, respectively.
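
The building block of a causal TCN is a dilated causal convolution, in which each output depends only on the current and earlier inputs. The sketch below is a generic pure-Python illustration of that operation, not the model used in the paper; the function names are illustrative, and the weights are plain numbers rather than trained parameters.

```python
def causal_conv1d(x, weights, dilation=1):
    """y[t] = sum_k weights[k] * x[t - k * dilation]; samples before t = 0 count as zero."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation   # look k * dilation steps into the past
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of past-and-present samples that influence one output of a layer stack."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

Because no output ever reads a future sample, a forecast at time t cannot leak information from t + 1. Stacking layers with dilations such as 1, 2, 4 lets the receptive field cover a long history with few layers, which is how a TCN can reach far back without recurrence.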

Development of Artificial Intelligence-Based Remote-Sense Reflectance Prediction Model Using Long-Term GOCI Data (장기 GOCI 자료를 활용한 인공지능 기반 원격 반사도 예측 모델 개발)

  • Donguk Lee;Joo Hyung Ryu;Hyeong-Tae Jou;Geunho Kwak
    • Korean Journal of Remote Sensing / v.39 no.6_2 / pp.1577-1589 / 2023
  • The need to predict changes for ocean monitoring has recently become widely recognized. In this study, we performed time series prediction of remote-sensing reflectance (Rrs), which can indicate changes in the ocean, using Geostationary Ocean Color Imager (GOCI) data. Using GOCI-I data, we trained the multi-scale Convolutional Long Short-Term Memory (ConvLSTM) model proposed in this study. Validation was conducted using GOCI-II data acquired in different periods from GOCI-I. We compared the model's performance with that of existing ConvLSTM models. The results showed that the proposed model, which considers both spatial and temporal features, outperformed the other models in predicting temporal trends of Rrs. We also examined the temporal trends of Rrs learned by the model through long-term prediction results. Consequently, we anticipate that the model can be applied to periodic change detection.

Near-Five-Vector SVPWM Algorithm for Five-Phase Six-Leg Inverters under Unbalanced Load Conditions

  • Zheng, Ping;Wang, Pengfei;Sui, Yi;Tong, Chengde;Wu, Fan;Li, Tiecai
    • Journal of Power Electronics / v.14 no.1 / pp.61-73 / 2014
  • Multiphase machines are characterized by high power density, enhanced fault-tolerant capacity, and low torque pulsation. For a multiphase machine supplied by a voltage source inverter, the probability of load imbalance becomes greater and unwanted low-order stator voltage harmonics occur. This paper deals with the PWM control of multiphase inverters under unbalanced load conditions and proposes a novel near-five-vector SVPWM algorithm based on the five-phase six-leg inverter. The proposed algorithm can output symmetrical phase voltages under unbalanced load conditions, which is not possible for conventional SVPWM algorithms based on five-phase five-leg inverters. The cause of extra harmonics in the phase voltages is analyzed, and an xy coordinate system orthogonal to the $\alpha\beta z$ coordinate system is introduced to eliminate low-order harmonics in the output phase voltages. Moreover, the digital implementation of the near-five-vector SVPWM algorithm is discussed, and the optimal approach with reduced complexity and low execution time is elaborated. The proposed algorithm is compared with other existing PWM algorithms, and its pros and cons are summarized. Simulation and experimental results are also given. They show that the proposed algorithm works well under unbalanced load conditions. However, its maximum modulation index is reduced by 5.15% in the linear modulation region, and its algorithmic complexity and memory requirement increase. The basic principle in this paper can be easily extended to other inverters with different phase numbers.

Implementation of handwritten digit recognition CNN structure using GPGPU and Combined Layer (GPGPU와 Combined Layer를 이용한 필기체 숫자인식 CNN구조 구현)

  • Lee, Sangil;Nam, Kihun;Jung, Jun Mo
    • The Journal of the Convergence on Culture Technology / v.3 no.4 / pp.165-169 / 2017
  • CNN (Convolutional Neural Network) is one of the machine learning algorithms that show superior performance in image recognition and classification. A CNN is structurally simple, but it requires a large amount of computation and takes a long time to run. In this paper, we therefore parallelize the convolution layer, the pooling layer, and the fully connected layer, which account for much of the processing time in a CNN, using the SIMT (Single Instruction Multiple Thread) structure of GPGPU (General-Purpose computing on Graphics Processing Units). We also aim to improve performance by reducing the number of memory accesses: the output of the convolution layer is used directly by the pooling layer instead of being stored first. We use the MNIST dataset to verify the approach and confirm that the proposed CNN structure is 12.38% better than the existing structure.
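
The idea of feeding convolution output straight into pooling can be illustrated with a CPU-side reference sketch. It is not the paper's GPGPU kernel: the function names are illustrative, and the 2x2 max pooling with stride 2 and the "valid" convolution (no padding, stride 1) are assumptions made to keep the example small.

```python
def conv_at(image, kernel, i, j):
    """One output of a 'valid' convolution (cross-correlation, as is usual in CNNs)."""
    return sum(kernel[a][b] * image[i + a][j + b]
               for a in range(len(kernel))
               for b in range(len(kernel[0])))


def conv_then_pool(image, kernel):
    """Baseline: store the whole feature map, then run 2x2 max pooling over it."""
    oh = len(image) - len(kernel) + 1
    ow = len(image[0]) - len(kernel[0]) + 1
    fmap = [[conv_at(image, kernel, i, j) for j in range(ow)] for i in range(oh)]
    return [[max(fmap[2 * p][2 * q], fmap[2 * p][2 * q + 1],
                 fmap[2 * p + 1][2 * q], fmap[2 * p + 1][2 * q + 1])
             for q in range(ow // 2)]
            for p in range(oh // 2)]


def fused_conv_pool(image, kernel):
    """Combined layer: each pooled value is the max of four convolution results
    computed on the fly, so the intermediate feature map is never stored."""
    oh = len(image) - len(kernel) + 1
    ow = len(image[0]) - len(kernel[0]) + 1
    return [[max(conv_at(image, kernel, 2 * p + di, 2 * q + dj)
                 for di in (0, 1) for dj in (0, 1))
             for q in range(ow // 2)]
            for p in range(oh // 2)]
```

Both functions return the same pooled map. The fused version trades a stored intermediate for values held in local variables, which on a GPU would correspond to keeping the four convolution results in registers instead of writing them to global memory.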

Expanding Code Caches for Embedded Java Systems using Client Ahead-Of-Time Compilation (내장형 자바 시스템을 위한 클라이언트 선행 컴파일 기법을 이용한 코드 캐시 확장)

  • Hong, Sung-Hyun;Kim, Jin-Chul;Shin, Jin-Woo;Kwon, Jin-Woo;Lee, Joo-Hwan;Moon, Soo-Mook
    • Journal of KIISE:Computing Practices and Letters / v.16 no.8 / pp.868-872 / 2010
  • Many embedded Java systems are equipped with limited memory, which can constrain the size of the code cache available for Java just-in-time compilation and thus affect Java performance. This paper proposes expanding the limited code cache when it is full: the machine code of some methods in the code cache is saved to the file system on permanent storage and reloaded into the code cache when those methods are invoked again. This amounts to applying client ahead-of-time compilation at run time in order to enlarge the code cache. Our experimental results indicate that the proposed execution method can improve performance by as much as 1.6 times compared to the conventional method when the code cache size is reduced by half.
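
The save-and-reload scheme in the abstract above can be sketched as a small cache model. Several things here are assumptions rather than details from the paper: the class `SpillingCodeCache`, the least-recently-used choice of which methods to save, the one-file-per-method layout, and method names that are safe to use as file names. "Machine code" is represented by plain bytes.

```python
import os
from collections import OrderedDict


class SpillingCodeCache:
    """Toy code cache that spills evicted machine code to disk instead of discarding it."""

    def __init__(self, capacity_bytes, spill_dir):
        self.capacity = capacity_bytes
        self.spill_dir = spill_dir
        self.cache = OrderedDict()   # method name -> code bytes, oldest first
        self.used = 0

    def _spill_path(self, method):
        return os.path.join(self.spill_dir, method + ".bin")

    def _install(self, method, code):
        # Make room by saving the least recently used methods to the file system.
        while self.cache and self.used + len(code) > self.capacity:
            victim, victim_code = self.cache.popitem(last=False)
            with open(self._spill_path(victim), "wb") as f:
                f.write(victim_code)
            self.used -= len(victim_code)
        self.cache[method] = code
        self.used += len(code)

    def lookup(self, method, compile_fn):
        """Return (code, source), where source is 'hit', 'reloaded', or 'compiled'."""
        if method in self.cache:
            self.cache.move_to_end(method)
            return self.cache[method], "hit"
        path = self._spill_path(method)
        if os.path.exists(path):
            # Re-invoked method: reload saved code rather than recompiling it.
            with open(path, "rb") as f:
                code = f.read()
            self._install(method, code)
            return code, "reloaded"
        code = compile_fn(method)
        self._install(method, code)
        return code, "compiled"
```

The benefit depends on reloading from storage being cheaper than recompiling, which is the trade-off the paper evaluates. This model only shows the control flow and does not measure that cost.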