• Title/Summary/Keyword: Parallel array structure

Design on Neural Operation Unit with Modular Structure (모듈형 구조를 갖는 범용 뉴럴 연산회로 설계)

  • Kim Jong-Won;Cho Hyun-Chan;Seo Jae-Yong;Cho Tae-Hoon;Lee Sung-Jun
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2006.05a / pp.125-129 / 2006
  • With the advent of the NNC (Neural Network Chip), it has become possible to process signals in parallel and to discern the importance of a signal by learning from experience with external signals. Accordingly, a general-purpose operation unit designed in VHDL (VHSIC Hardware Description Language) on an existing FPGA (Field Programmable Gate Array) can be replaced by an EN (Expert Network) and a learning algorithm. In addition, a neural network operation unit can perform various operations through NN (Neural Network) learning. This paper presents a general-purpose operation unit that uses a hierarchical structure of ENs. Each EN in the presented structure learns the logic gates that constitute an operation unit, and the ENs are arranged across several layers. The overall structure is hierarchical and modular, and it is more general than an FPGA operation unit.

A Study for the fabrication of Au dot-arrays using porous alumina film (다공성 알루미나 박막을 이용한 Au dot-arrays의 제작에 관한 연구)

  • Jung, Kyung-Han;Park, Sang-Hyun;Shin, Hoon-Kyu;Kwon, Young-Soo
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2003.07b / pp.922-925 / 2003
  • Interest in self-organized materials that have a uniform and regular structure at the nanoscale has grown because of their usefulness in various fields of nanotechnology. An attractive candidate among these materials is anodic aluminum oxide film, which is formed by anodization of aluminum in an appropriate acid solution. The anodic aluminum oxide film has a highly ordered porous structure with very uniform and nearly parallel pores that can be organized in an almost precise close-packed hexagonal structure. In this study, we attempt to make Au dot arrays, fabricated using an anodic aluminum oxide film as an evaporation mask. The Au dot arrays have uniformly sized dots with uniform spacing between neighbors, and the average diameter of the Au dots is about 60 nm, corresponding to that of the pores in the mask.

CNN Accelerator Architecture using 3D-stacked RRAM Array (3차원 적층 구조 저항변화 메모리 어레이를 활용한 CNN 가속기 아키텍처)

  • Won Joo Lee;Yoon Kim;Minsuk Koo
    • Journal of IKEEE / v.28 no.2 / pp.234-238 / 2024
  • This paper presents a study on the integration of 3D-stacked dual-tip RRAM with a CNN accelerator architecture, leveraging its low drive current characteristics and scalability in a 3D stacked configuration. The dual-tip structure is utilized in a parallel connection format in a synaptic array to implement multi-level capabilities. It is configured within a Network-on-chip style accelerator along with various hardware blocks such as DAC, ADC, buffers, registers, and shift & add circuits, and simulations were performed for the CNN accelerator. The quantization of synaptic weights and activation functions was assumed to be 16-bit. Simulation results of CNN operations through a parallel pipeline for this accelerator architecture achieved an operational efficiency of approximately 370 GOPs/W, with accuracy degradation due to quantization kept within 3%.
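
As a rough illustration of the 16-bit quantization assumed in the abstract above, the following Python sketch applies uniform symmetric quantization to a short list of weights. The quantization scheme, the function name `quantize_symmetric`, and the sample values are assumptions made for illustration; the abstract does not state which quantizer was used.

```python
def quantize_symmetric(values, bits=16):
    """Uniform symmetric quantization (an assumed scheme, not taken from the paper)."""
    qmax = 2 ** (bits - 1) - 1                      # 32767 for 16 bits
    peak = max(abs(v) for v in values)
    scale = peak / qmax if peak > 0 else 1.0        # real value of one integer step
    # Round to the nearest integer code and clamp to the signed range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    restored = [c * scale for c in codes]           # de-quantized values
    return codes, restored, scale


# Hypothetical weights, used only to exercise the function.
weights = [0.75, -0.5, 0.001, 0.0]
codes, restored, scale = quantize_symmetric(weights)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

With rounding to the nearest code, the reconstruction error of each value is bounded by half a quantization step, which is the kind of bound that keeps accuracy loss small at 16 bits.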

Splitting of Surface Plasmon Resonance Peaks Under TE- and TM-polarized Illumination

  • Yoon, Su-Jin;Hwang, Jeongwoo;Lee, Myeong-Ju;Kang, Sang-Woo;Kim, Jong-Su;Ku, Zahyun;Urbas, Augustine;Lee, Sang Jun
    • Proceedings of the Korean Vacuum Society Conference / 2014.02a / pp.296-296 / 2014
  • We investigate experimentally and theoretically the splitting of surface plasmon (SP) resonance peaks under TE- and TM-polarized illumination. The SP structure for infrared wavelengths is fabricated as a 2-dimensional square periodic array of circular holes penetrating through an Au (gold) film. In brief, the processing steps to fabricate the SP structure are as follows. (i) Standard optical lithography was performed to produce a periodic array of photoresist (PR) circular cylinders. (ii) After the PR patterning, e-beam evaporation was used to deposit a 50-nm thick layer of Au. (iii) Lift-off processing with acetone was used to remove the PR layer, leading to the final structure (pitch, $p=2.2{\mu}m$; aperture size, $d=1.1{\mu}m$) as shown in Fig. 1(a). The transmission is measured using a Nicolet Fourier-transform infrared (FTIR) spectrometer at incident angles from $0^{\circ}$ to $36^{\circ}$ in steps of $4^{\circ}$, in both TE and TM polarization. The measured first- and second-order SP resonances at the interface between Au and GaAs split into two branches under TM-polarized light, as shown in Fig. 1(b). However, as the incidence angle under TE polarization is increased, the $1^{st}$ order SP resonance peak blue-shifts slightly while the splitting of the $2^{nd}$ order SP resonance peak tends to be larger (not shown here). To understand our experimental results qualitatively, the SP resonance peak wavelengths can be calculated from the momentum matching condition (black circle depicted in Fig. 2(b)), $k_{sp}=k_{\parallel}{\pm}iG_x{\pm}jG_y$, where $k_{sp}$ is the SP wavevector, $k_{\parallel}$ is the in-plane component of the incident light wavevector, i and j are the SP coupling orders, and G is the grating momentum wavevector. Moreover, for better understanding we performed 3D full-field electromagnetic simulations of the SP structure using a finite integration technique (CST Microwave Studio). Fig. 1(b) shows excellent agreement between the experimental, calculated and CST-simulated splitting of the SP resonance peaks at various incidence angles under TM-polarized illumination (TE results are not shown here). The simulated z-component electric field (Ez) distribution at incident angles of $4^{\circ}$ and $16^{\circ}$ under TM polarization and at the corresponding SP resonance wavelength is shown in Fig. 1(c). The analysis and comparison of the theoretical results with experiment indicate good agreement in the splitting behavior of the surface plasmon resonance modes at oblique incidence in both TE and TM polarization.
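
The momentum matching condition quoted in this abstract can be turned into a small numerical sketch. The Python code below solves $|k_{sp}|^2=(k_{\parallel}+iG)^2+(jG)^2$ for the wavelength, taking $k_{sp}=2\pi n_{eff}/\lambda$, $k_{\parallel}=(2\pi/\lambda)\sin\theta$ and $G=2\pi/p$. The pitch is the value given in the abstract; the constant effective index `N_EFF`, the function name, and the assumption that the in-plane wavevector lies along x are illustrative simplifications (dispersion and the metal permittivity are ignored), not the authors' calculation.

```python
import math


def sp_resonance_wavelength(pitch_um, theta_deg, i, j, n_eff):
    """Wavelength (um) satisfying (n_eff/lam)^2 = (sin(theta)/lam + i/p)^2 + (j/p)^2."""
    s = math.sin(math.radians(theta_deg))
    order_sq = i * i + j * j
    if order_sq == 0:
        raise ValueError("at least one coupling order must be non-zero")
    # Positive root of the quadratic in 1/lam obtained from the matching condition.
    root = math.sqrt(s * s * i * i + (n_eff ** 2 - s * s) * order_sq)
    return pitch_um * (n_eff ** 2 - s * s) / (s * i + root)


PITCH_UM = 2.2   # pitch p from the abstract
N_EFF = 3.3      # ASSUMED constant effective index, roughly GaAs in the infrared

# Same angular range as the measurement: 0 to 36 degrees in steps of 4 degrees.
for theta in range(0, 40, 4):
    branch_a = sp_resonance_wavelength(PITCH_UM, theta, +1, 0, N_EFF)
    branch_b = sp_resonance_wavelength(PITCH_UM, theta, -1, 0, N_EFF)
    print(f"{theta:2d} deg: (+1,0) {branch_a:.3f} um, (-1,0) {branch_b:.3f} um")
```

Under these assumptions the (+1,0) and (-1,0) orders coincide at normal incidence and separate by $2p\sin\theta$ as the angle grows, which gives a qualitative picture of a resonance splitting into two branches at oblique incidence.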

Design of High-Speed Parallel Multiplier on Finite Fields $GF(3^m)$ (유한체 $GF(3^m)$상의 고속 병렬 곱셈기의 설계)

  • Seong, Hyeon-Kyeong
    • Journal of the Korea Society of Computer and Information / v.20 no.2 / pp.1-10 / 2015
  • In this paper, we propose a new multiplication algorithm over the finite field $GF(3^m)$ for primitive polynomials whose coefficients are all 1, covering both odd and even m, and design a multiplier with a parallel input-output modular structure using the presented multiplication algorithm. The proposed multiplier is built from $(m+1)^2$ identical basic cells. Since the basic cells have no latch circuit, the multiplication circuit is very simple and has a short delay time of $T_A+T_X$ per cell. Owing to the regularity and modularity of its cell array, the proposed multiplier is easy to extend to large m and is suitable for VLSI implementation.
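
To make the arithmetic concrete, here is a minimal Python sketch of multiplication modulo an all-one polynomial $x^m+x^{m-1}+\dots+x+1$ with coefficients in GF(3). It relies on the identity $(x-1)(x^m+\dots+1)=x^{m+1}-1$, so the product can be formed as a length-$(m+1)$ cyclic convolution followed by one correction step. This is a generic software illustration, not the cell-level algorithm of the paper; the function name and the sample operands are hypothetical, and the code does not check that the modulus is irreducible.

```python
def aop_multiply(a, b, m):
    """Multiply a*b in GF(3)[x] / (x^m + x^(m-1) + ... + x + 1).

    a and b are lists of m coefficients in {0, 1, 2}, lowest degree first.
    """
    n = m + 1
    a_ext = list(a) + [0]            # pad to m+1 coefficients
    b_ext = list(b) + [0]
    c = [0] * n
    # Cyclic convolution, valid because x^(m+1) = 1 modulo the all-one polynomial.
    # The double loop forms (m+1)^2 coefficient products.
    for i in range(n):
        for j in range(n):
            k = (i + j) % n
            c[k] = (c[k] + a_ext[i] * b_ext[j]) % 3
    # Remove the x^m term using x^m = -(x^(m-1) + ... + x + 1).
    top = c[m]
    return [(ck - top) % 3 for ck in c[:m]]


# Hypothetical operands for m = 4, chosen only as a small example.
product = aop_multiply([1, 2, 0, 1], [2, 0, 1, 1], 4)
print(product)
```

The double loop evaluates $(m+1)^2$ coefficient products, the same count as the number of basic cells quoted in the abstract, although the mapping of products to hardware cells is not something this sketch attempts to reproduce.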

Parallelization of CUSUM Test in a CUDA Environment (CUDA 환경에서 CUSUM 검증의 병렬화)

  • Son, Changhwan;Park, Wooyeol;Kim, HyeongGyun;Han, KyungSook;Pyo, Changwoo
    • KIISE Transactions on Computing Practices / v.21 no.7 / pp.476-481 / 2015
  • We have parallelized the cumulative sum (CUSUM) test of NIST's statistical random number test suite in a CUDA environment. Storing random walks in an array instead of in scalar variables eliminates data dependence. The change in data structure makes it possible to apply parallel scans, scatters, and reductions at each stage of the test. In addition, serial data exchanges between CPU and GPU are removed by migrating CPU's tasks to GPU. Finally we have optimized global memory accesses. The overall speedup is 23 times over the sequential version. Our results contribute to improving security of random numbers for cryptographic keys as well as reducing the time for evaluation of randomness.
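
The data-structure change described in this abstract can be sketched sequentially in Python: the random walk is stored as an array produced by a prefix sum (the operation a parallel scan would compute), and the test statistic is obtained by a max-reduction over that array. This is a CPU stand-in for illustration only, not the authors' CUDA code; the function name is hypothetical and the p-value computation of the NIST test is omitted.

```python
from itertools import accumulate


def cusum_statistic(bits, forward=True):
    """Largest excursion |S_k| of the +/-1 random walk built from a bit sequence."""
    steps = [2 * b - 1 for b in bits]        # map 0 -> -1 and 1 -> +1
    if not forward:
        steps.reverse()                      # backward mode walks from the end
    walk = list(accumulate(steps))           # inclusive prefix sum (scan step)
    return max(abs(s) for s in walk)         # max-reduction step


bits = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]
z_forward = cusum_statistic(bits)
z_backward = cusum_statistic(bits, forward=False)
print(z_forward, z_backward)
```

Because every partial sum is held in the array rather than in a single running scalar, the scan and the reduction have no loop-carried dependence beyond what a parallel prefix-sum primitive already handles.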

Design and evaluation of binocular type six-component load cell by using experimental technique (실험계획법을 이용한 쌍안경식 6축 로드셀의 설계 및 상호간섭 오차 평가)

  • Kang, Dae-Im;Kim, Gab-Sun;Jeong, Su-Yeon;Joo, Jin-Won
    • Transactions of the Korean Society of Mechanical Engineers A / v.21 no.11 / pp.1921-1930 / 1997
  • This paper presents an effective technique for designing a six-axis load cell using experimental design with an orthogonal array. A binocular structure is used as the basic sensing element of the load cell instead of the parallel plate structure. The finite element method is adopted to obtain the strain distributions of the sensing element, and analysis of variance is applied to the results to determine which factors most influence the output strain. Calibration test results show that the developed six-axis load cell, with maximum capacities of 196 N in force and 19.6 N·m in moment, is useful, with a coupling error of less than 2.5%.

Spanwise Growth of Vortex Structure in Wall Turbulence

  • Adrian, Ronald J.;Balachandar, S.;Liu, Z.C.
    • Journal of Mechanical Science and Technology / v.15 no.12 / pp.1741-1749 / 2001
  • Recent studies of the structure of wall turbulence have led to the development of a conceptual model that validates and integrates many elements of previous models into a relatively simple picture based on self-assembling packets of hairpin vortex eddies. By continually spawning new hairpins, the packets grow longer in the streamwise direction, and by mutual induction between adjacent hairpins, the hairpins are strained so that they grow taller and wider as they age. The result is a characteristic growth angle in the streamwise-wall normal plane. The spanwise growth of individual packets implies that packets must either merge or pass through each other when they come into contact. Direct numerical simulations of the growth and interaction of spanwise adjacent hairpins show that they merge by the vortex connection mechanism originally proposed by Wark and Nagib (199). In this mechanism the quasi-streamwise legs of two hairpins annihilate each other, by virtue of having opposite vorticity, leaving a new hairpin of approximately double the width of the individual hairpins. PIV measurements in planes parallel to the wall support this picture. DNS of multiple hairpins shows how the spanwise scale doubles when the hairpins form an array.

Comparison of Parallel and Fan-Beam Monochromatic X-Ray CT Using Synchrotron Radiation

  • Toyofuku, Fukai;Tokumori, Kenji;Kanda, Shigenobu;Ohki, Masafumi;Higashida, Yoshiharu;Hyodo, Kazuyuki;Ando, Masami;Uyama, Chikao
    • Proceedings of the Korean Society of Medical Physics Conference / 2002.09a / pp.407-410 / 2002
  • Monochromatic x-ray CT has several advantages over conventional CT, which utilizes bremsstrahlung white x-rays from an x-ray tube. There are several methods to produce such monochromatic x-rays. The most popular one is crystal diffraction monochromatization, which has been commonly used because the energy spread is very narrow and the energy can be changed continuously. The alternative method is the use of fluorescent x-rays, which has several advantages such as large beam size and fast energy change. We have developed a parallel-beam and a fan-beam monochromatic x-ray CT system, and compared characteristics such as the accuracy of CT numbers between the two systems. The fan-beam monochromatic x-rays were generated by irradiating target materials with incident white x-rays from the bending magnet beam line NE5 in the 6.5 GeV Accumulation Ring at Tukuba. The parallel-beam monochromatic x-rays were generated by using a silicon double crystal monochromator at the bending magnet beam line BL-20BM in Spring-8. A cadmium telluride (CdTe) 256-channel array detector with a 512 mm sensitive width, capable of operating at room temperature, was used in the photon counting mode. A cylindrical phantom containing eight concentrations of gadolinium was used for the fan-beam monochromatic x-ray CT system, while a phantom containing acetone, ethanol, acrylic and water was used for the parallel-beam monochromatic x-ray CT system. The linear attenuation coefficients obtained from the CT numbers of the monochromatic x-ray CT images were compared with theoretical values. They agreed within 3%. It was found that quantitative measurement is possible with the fan-beam monochromatic x-ray CT system as well as with the parallel-beam monochromatic x-ray CT system.

Design of an Efficient VLSI Architecture of SADCT Based on Systolic Array (시스톨릭 어레이에 기반한 SADCT의 효율적 VLSI 구조설계)

  • Gang, Tae-Jun;Jeong, Ui-Yun;Gwon, Sun-Gyu;Ha, Yeong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.38 no.3 / pp.282-291 / 2001
  • In this paper, an efficient VLSI architecture for the Shape Adaptive Discrete Cosine Transform (SADCT) based on a systolic array is proposed. Since the transform size in SADCT varies according to the shape of the object in each block, both the utilization of the processing elements (PEs) and the throughput rate drop in a time-recursive SADCT structure. To overcome these disadvantages, an architecture based on a systolic array structure that does not need memory is proposed. In the proposed architecture, the throughput rate is improved by consecutive processing of the one-dimensional SADCT without memory, and the PEs in the first column are connected to those in the last column to improve PE utilization. Input data are fed into each column of PEs in parallel according to the maximum number of data in each rearranged block. The proposed architecture is described in VHDL, and its function is evaluated with Mentor™. Even though the hardware complexity is somewhat increased, the throughput rate is improved about twofold.
