Integrated Search | Korea Science

Performance Evaluation and Prediction on a Clustered SMP System for Aerospace CFD Applications with Hybrid Paradigm

Matsuo Yuichi;Sueyasu Naoki;Inari Tomohide
- Proceedings of the Korean Society of Computational Fluids Engineering Conference / Korean Society of Computational Fluids Engineering, PARALLEL CFD 2006 / pp.275-278 / 2006
Japan Aerospace Exploration Agency has introduced a new terascale clustered SMP system as the main compute engine of Numerical Simulator III for aerospace science and engineering research. The system uses Fujitsu PRIMEPOWER HPC2500; it has a computing capability of 9.3 Tflop/s peak performance and 3.6 TB of user memory, with about 1,800 scalar processors for computation. In this paper, we first present performance evaluation results for aerospace CFD applications that use the hybrid programming paradigm at JAXA. Next, we propose a performance prediction formula for hybrid codes based on a simple extension of Amdahl's law, and discuss the predicted and measured performance of some typical hybrid CFD codes.
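The abstract does not state the prediction formula itself. As a rough illustration of what a simple extension of Amdahl's law to hybrid (process plus thread) codes might look like, the sketch below assumes a two-level model: a fraction of the serial runtime parallelizes across processes, and a fraction of that part also parallelizes across threads. The function name, the parameter names, and the two-level form are assumptions made for illustration, not the authors' formula.

```python
def hybrid_speedup(processes, threads, frac_process, frac_thread):
    """Predicted speedup under an ASSUMED two-level Amdahl model.

    frac_process: fraction of serial runtime that parallelizes across processes.
    frac_thread:  fraction of the process-parallel part that also
                  parallelizes across threads inside each process.
    """
    if processes < 1 or threads < 1:
        raise ValueError("processes and threads must be >= 1")
    if not (0.0 <= frac_process <= 1.0 and 0.0 <= frac_thread <= 1.0):
        raise ValueError("fractions must lie in [0, 1]")

    # Normalized time of the process-parallel part after threading.
    thread_time = (1.0 - frac_thread) + frac_thread / threads
    # Serial remainder plus the process-parallel part split over processes.
    normalized_time = (1.0 - frac_process) + frac_process * thread_time / processes
    return 1.0 / normalized_time


if __name__ == "__main__":
    # Hypothetical fractions, chosen only to exercise the function.
    for p, t in [(1, 1), (16, 1), (16, 4), (64, 8)]:
        print(p, t, round(hybrid_speedup(p, t, 0.99, 0.95), 2))
```

With `threads=1` the model reduces to the ordinary Amdahl's law in the number of processes, which is a useful sanity check on any such extension.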

Memory Organization for a Fuzzy Controller.

Jee, K.D.S.;Poluzzi, R.;Russo, B.
- Proceedings of the Korean Institute of Intelligent Systems Conference / Korea Fuzzy Logic and Intelligent Systems Society, 1993, Fifth International Fuzzy Systems Association World Congress 93 / pp.1041-1043 / 1993
Fuzzy-logic-based control theory has gained much interest in the industrial world, thanks to its ability to formalize and solve in a very natural way many problems that are very difficult to quantify at an analytical level. This paper shows a solution for treating membership functions inside hardware circuits. The proposed hardware structure optimizes the memory size by using a particular form of the vectorial representation. The process of memorizing fuzzy sets, i.e. their membership functions, has always been one of the more problematic issues for hardware implementation, due to the quite large memory space that is needed. To simplify such an implementation, it is common [1,2,8,9,10,11] to limit the membership functions either to those having a triangular or trapezoidal shape, or to a predefined shape. These kinds of functions are able to cover a large spectrum of applications with a limited usage of memory, since they can be memorized by specifying very few parameters (height, base, critical points, etc.). This, however, results in a loss of computational power due to computation on the intermediate points. A solution to this problem is obtained by discretizing the universe of discourse U, i.e. by fixing a finite number of points and memorizing the values of the membership functions on such points [3,10,14,15]. Such a solution provides a satisfactory computational speed and a very high precision of definition, and gives users the opportunity to choose membership functions of any shape. However, significant memory waste can also result. It is indeed possible that for each of the given fuzzy sets many elements of the universe of discourse have a membership value equal to zero. It has also been noticed that in almost all cases the common points among fuzzy sets, i.e. points with non-null membership values, are very few.
More specifically, in many applications, for each element u of U, there exist at most three fuzzy sets for which the membership value is not null [3,5,6,7,12,13]. Our proposal is based on such hypotheses. Moreover, we use a technique that, even though it does not restrict the shapes of membership functions, strongly reduces the computational time for the membership values and optimizes the function memorization. Figure 1 represents a term set whose characteristics are common for fuzzy controllers and to which we will refer in the following. The above term set has a universe of discourse with 128 elements (so as to have a good resolution), 8 fuzzy sets that describe the term set, and 32 levels of discretization for the membership values. Clearly, the numbers of bits necessary for the given specifications are 5 for 32 truth levels, 3 for 8 membership functions and 7 for 128 levels of resolution. The memory depth is given by the dimension of the universe of discourse (128 in our case) and it will be represented by the memory rows. The length of a word of memory is defined by: Length = nfm × (dm(m) + dm(fm)), where nfm is the maximum number of non-null values in every element of the universe of discourse, dm(m) is the dimension of the values of the membership function m, and dm(fm) is the dimension of the word to represent the index of the highest membership function. In our case then Length = 3 × (5 + 3) = 24. The memory dimension is therefore 128*24 bits. If we had chosen to memorize all values of the membership functions, we would have needed to memorize on each memory row the membership value of each element. The fuzzy sets word dimension is 8*5 bits. Therefore, the dimension of the memory would have been 128*40 bits. Consistent with our hypothesis, in fig. 1 each element of the universe of discourse has a non-null membership value on at most three fuzzy sets.
The elements 32, 64 and 96 of the universe of discourse will be memorized as follows: The computation of the rule weights is done by comparing those bits that represent the index of the membership function with the word of the program memory. The output bus of the Program Memory (μCOD) is given as input to a comparator (combinatorial net). If the index is equal to the bus value, then one of the non-null weights derives from the rule and it is produced as output; otherwise the output is zero (fig. 2). It is clear that the memory dimension of the antecedent is in this way reduced, since only non-null values are memorized. Moreover, the time performance of the system is equivalent to the performance of a system using vectorial memorization of all weights. The dimensioning of the word is influenced by some parameters of the input variable. The most important parameter is the maximum number of membership functions (nfm) having a non-null value in each element of the universe of discourse. From our study in the field of fuzzy systems, we see that typically nfm ≤ 3 and there are at most 16 membership functions. At any rate, such a value can be increased up to the physical dimensional limit of the antecedent memory. A less important role in the optimization process of the word dimension is played by the number of membership functions defined for each linguistic term. The table below shows the required word dimension as a function of such parameters and compares our proposed method with the method of vectorial memorization [10]. Summing up, the characteristics of our method are: Users are not restricted to membership functions with specific shapes. The number of the fuzzy sets and the resolution of the vertical axis have a very small influence in increasing memory space. Weight computations are done by a combinatorial network and therefore the time performance of the system is equivalent to that of the vectorial method.
The number of non-null membership values on any element of the universe of discourse is limited. Such a constraint is usually not very restrictive, since many controllers obtain a good precision with only three non-null weights. The method briefly described here has been adopted by our group in the design of an optimized version of the coprocessor described in [10].
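The memory sizing in this abstract can be checked with a short calculation. The sketch below reproduces the 128*24 versus 128*40 bit figures from the stated parameters (128 elements, 8 fuzzy sets, 32 truth levels, at most 3 non-null values per element) and models the comparator lookup in software. The function names and the representation of a memory word as a list of (set index, value) pairs are assumptions for illustration; the original is a hardware design.

```python
def bits_for(levels):
    """Number of bits needed to encode `levels` distinct values."""
    return max(1, (levels - 1).bit_length())


def compact_word_length(max_nonzero, truth_levels, num_sets):
    """Word length when only non-null (index, value) pairs are stored."""
    return max_nonzero * (bits_for(truth_levels) + bits_for(num_sets))


def full_word_length(truth_levels, num_sets):
    """Word length when every fuzzy set's value is stored per element."""
    return num_sets * bits_for(truth_levels)


def rule_weight(word, set_index):
    """Software model of the comparator: emit the stored value if an
    index in the word matches the requested fuzzy set, else zero."""
    for index, value in word:
        if index == set_index:
            return value
    return 0


if __name__ == "__main__":
    universe, sets, levels, nfm = 128, 8, 32, 3
    compact = compact_word_length(nfm, levels, sets)   # 3 * (5 + 3)
    full = full_word_length(levels, sets)              # 8 * 5
    print("compact memory:", universe, "x", compact, "bits")
    print("full memory:   ", universe, "x", full, "bits")

    # Hypothetical memory row: sets 2 and 3 are non-null at this element.
    row = [(2, 10), (3, 21)]
    print(rule_weight(row, 3), rule_weight(row, 5))
```

The saving grows with the number of fuzzy sets: the compact word adds only a few index bits per extra set, while the full vector adds a whole value field.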

무선 센서 네트워크망에서의 효율적인 키 관리 프로토콜 분석 (Analyses of Key Management Protocol for Wireless Sensor Networks in Wireless Sensor Networks)

김정태
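- Proceedings of the Korea Institute of Information and Communication Engineering Conference / Korea Institute of Maritime Information and Communication Sciences, 2005 Fall Conference / pp.799-802 / 2005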
In this paper, we analyze key management protocols for wireless sensor networks. Wireless sensor networks have a wide spectrum of civil and military applications that call for security, such as target surveillance in hostile environments. Typical sensors possess limited computation, energy, and memory resources; therefore, highly resource-consuming security mechanisms cannot be used. In this paper, we propose a cryptographic key management protocol based on identity-based symmetric keying.

A Study on Effect of Code Distribution and Data Replication for Multicore Computing Architectures

Cho, Doosan
- International Journal of Advanced Culture Technology / Vol. 9, No. 4 / pp.282-287 / 2021
A multicore system must be able to take full advantage of a program's instruction and data parallelism. This study introduces data replication as a supporting technique to maximize the program's instruction and data parallelism. Instruction-level parallelism can be limited by data dependency. In this case, if the data is replicated to each processor core, instruction-level parallelism can be exploited to the maximum. The technique proposed in this study yields the largest performance improvement when applied to scientific applications such as matrix multiplication.
https://doi.org/10.17703/IJACT.2021.9.4.282

안드로이드 디바이스에서의 3 차원 모델 렌더링 속도 향상 (Speeding up the 3D Model Rendering on Android Device)

응총지에;강대기
- Proceedings of the Korea Information Processing Society Conference / Korea Information Processing Society, 2011 Spring Conference / pp.72-74 / 2011
Rendering a complex 3D model on a smart mobile device with limited processing power and memory is challenging. Without optimization, a complex 3D model cannot be rendered smoothly. Special techniques must be applied to speed up the processing. In this paper, we discuss some approaches to alleviate the problem.
https://doi.org/10.3745/PKIPS.y2011m04a.72

Cold 블록 영역과 hot 블록 영역의 주기적 교환을 통한 wear-leveling 향상 기법 (A wear-leveling improving method by periodic exchanging of cold block areas and hot block areas)

장시웅
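- Proceedings of the Korea Institute of Information and Communication Engineering Conference / Korea Institute of Maritime Information and Communication Sciences, 2008 Spring Conference A / pp.175-178 / 2008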
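In flash memory, read operations are fast and unconstrained, but data cannot be overwritten in place: when data changes, it must be written to a new area and the previously existing data must be invalidated. The invalidated data must then be erased through garbage collection. For data with locality of access, using the cost-benefit method to select the blocks to be cleaned during garbage collection gives good performance but degrades wear-leveling. To improve wear-leveling, this study partitions the flash memory into multiple groups of hot data and cold data, places data accordingly, and periodically exchanges the hot data areas with the cold data areas, thereby improving both wear-leveling and performance.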
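The two ideas named in this abstract, cost-benefit victim selection and periodic exchange of hot and cold areas, can be sketched as follows. The abstract gives neither formula nor exchange rule, so both are assumptions here: the score uses one common cost-benefit formulation, age × (1 − u) / (2u) with u the fraction of valid pages, and the exchange simply swaps the roles of the most-worn hot area and the least-worn cold area. All class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    utilization: float    # fraction of pages still valid, 0.0..1.0
    last_modified: float  # time of the most recent write to the block


@dataclass
class Area:
    role: str             # "hot" or "cold"
    erase_count: int      # cumulative erases in this area


def cost_benefit(block, now):
    """ASSUMED score: age * (1 - u) / (2u); higher means clean first."""
    u = block.utilization
    if u == 0.0:
        return float("inf")  # nothing valid to copy, free to erase
    return (now - block.last_modified) * (1.0 - u) / (2.0 * u)


def pick_victim(blocks, now):
    """Index of the block with the highest cost-benefit score."""
    return max(range(len(blocks)), key=lambda i: cost_benefit(blocks[i], now))


def exchange_roles(areas):
    """ASSUMED exchange rule: swap roles of the most-worn hot area and
    the least-worn cold area. Returns True if a swap happened."""
    hot = max((a for a in areas if a.role == "hot"),
              key=lambda a: a.erase_count, default=None)
    cold = min((a for a in areas if a.role == "cold"),
               key=lambda a: a.erase_count, default=None)
    if hot is None or cold is None or hot.erase_count <= cold.erase_count:
        return False
    hot.role, cold.role = "cold", "hot"
    return True


if __name__ == "__main__":
    blocks = [Block(0.9, 0.0), Block(0.2, 50.0), Block(0.5, 90.0)]
    print("victim:", pick_victim(blocks, now=100.0))

    areas = [Area("hot", 120), Area("hot", 80), Area("cold", 5), Area("cold", 30)]
    exchange_roles(areas)  # would be called periodically, e.g. every N erases
    print([(a.role, a.erase_count) for a in areas])
```

Cost-benefit selection alone tends to leave cold blocks untouched, so their erase counts lag behind; swapping roles periodically directs new hot writes onto the lightly worn areas.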

NAND 플래시 메모리 파일 시스템에 빠른 연산을 위한 설계 (Design of Fast Operation Method In NAND Flash Memory File System)

진종원;이태훈;정기동
- Journal of KIISE: Computing Practices and Letters / Vol. 14, No. 1 / pp.91-95 / 2008
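Flash memory has many advantages, such as non-volatility, low power consumption, fast I/O, and shock resistance, and is increasingly used as a storage medium in mobile devices. However, it cannot be overwritten in place, its erase unit is large, and each block can be erased only a limited number of times. Log-structured flash file systems such as YAFFS were developed to overcome these constraints. However, when space is requested for a write operation or when a block is selected for erasure, they search block information sequentially to perform allocation and erase operations. With this sequential block access, access time can increase as flash memory usage grows. In addition, the timing of block erase operations must be chosen so as to minimize the time spent searching for unnecessary erase-target blocks and to maintain sufficient free space in flash memory. To solve these problems, this paper proposes techniques for fast operation in a log-structured NAND flash memory file system. The proposed techniques were implemented on YAFFS and compared and analyzed through experiments. They showed faster operation than the existing implementation.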

Experimental investigation of Scalability of DDR DRAM packages

Crisp, R.
- Journal of the Microelectronics and Packaging Society / Vol. 17, No. 4 / pp.73-76 / 2010
A two-facet approach was used to investigate the parametric performance of functional high-speed DDR3 (Double Data Rate) DRAM (Dynamic Random Access Memory) die placed in different types of BGA (Ball Grid Array) packages: wire-bonded BGA (FBGA, Fine Ball Grid Array), flip-chip (FCBGA) and lead-bonded microBGA®. In the first section, packaged live DDR3 die were tested on automatic test equipment using high-resolution shmoo plots. It was found that the best timing and voltage margin was obtained using the lead-bonded microBGA, followed by the wire-bonded FBGA, with the FCBGA exhibiting the worst performance of the three types tested. In particular, the flip-chip packaged devices exhibited reduced operating voltage margin. In the second part of this work, a test system was designed and constructed to mimic the electrical environment of the data bus in a PC's CPU-Memory subsystem that used a single DIMM (Dual In Line Memory Module) socket in point-to-point and point-to-two-point configurations. The emulation system was used to examine signal integrity for system-level operation at speeds in excess of 6 Gb/pin/sec in order to assess the frequency extensibility of the signal-carrying path of the microBGA considered for future high-speed DRAM packaging. The analyzed signal path was driven from either end of the data bus by a GaAs laser driver capable of operation beyond 10 GHz. Eye diagrams were measured using a high-speed sampling oscilloscope with a pulse generator providing a pseudo-random bit sequence stimulus for the laser drivers. The memory controller was emulated using a circuit implemented on a BGA interposer employing the laser driver, while the active DRAM was modeled using the same type of laser driver mounted to the DIMM module. A custom silicon loading die was designed and fabricated and placed into the microBGA packages that were attached to an instrumented DIMM module.
It was found that 6.6 Gb/sec/pin operation appears feasible in both point-to-point and point-to-two-point configurations when the input capacitance is limited to 2 pF.

Boosting up the photoconductivity and relaxation time using a double layered indium-zinc-oxide/indium-gallium-zinc-oxide active layer for optical memory devices

Lee, Minkyung;Jaisutti, Rawat;Kim, Yong-Hoon
- Proceedings of the Korean Vacuum Society Conference / Korean Vacuum Society, 2016 50th Winter Annual Conference Abstracts / pp.278-278 / 2016
Solution-processed metal-oxide semiconductors have been considered as next-generation semiconducting materials for transparent and flexible electronics due to their high electrical performance. Moreover, since the oxide semiconductors show high sensitivity to light illumination and possess persistent photoconductivity (PPC), these properties can be utilized in realizing optical memory devices, which can transport information much faster than electrons. In previous works, metal-oxide semiconductors were utilized as a memory device by using light (i.e. illumination does the "writing", no-gate bias recovery the "reading" operations) [1]. The key issue in realizing optical memory devices is to have high photoconductivity and a long lifetime of free electrons in the oxide semiconductors. However, mono-layered indium-zinc-oxide (IZO) and mono-layered indium-gallium-zinc-oxide (IGZO) have limited photoconductivity and relaxation time, of 570 nA and 122 sec, and 190 nA and 53 sec, respectively. Here, we boosted the photoconductivity and relaxation time using a double-layered IZO/IGZO active layer structure. Solution-processed IZO (top) and IGZO (bottom) layers were prepared on a Si/SiO2 wafer using the conventional thermal annealing method. To investigate the photoconductivity and relaxation time, we applied light of 9 mW/cm² intensity for 30 sec and evaluated the decay behavior. It was found that the double-layered IZO/IGZO showed a high photoconductivity of 28 μA and a relaxation time of 1048 sec.

임베디드 시스템을 위한 신뢰성 있는 NAND 플래시 파일 시스템의 설계 (RFFS : Design of a Reliable NAND Flash File System for Embedded system)

이태훈;박송화;김태훈;이상기;이주경;정기동
- The KIPS Transactions: Part A / Vol. 12A, No. 7 / pp.571-582 / 2005
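NAND flash memory has advantages such as low power consumption, non-volatility, and improved read speed. However, it cannot be updated in place (in-place-update), the number of erasures is limited, and operations are performed in units of pages. YAFFS was developed as a dedicated file system for NAND flash memory, but it has several problems. This paper proposes a file system that uses a technique for fast recovery, an efficient data update technique, and a plane erase policy for even memory usage. When a power failure occurs, the file system supports fast recovery using log information. To use flash memory efficiently, it minimizes the amount of data written and proposes a new metadata structure for this purpose. In addition, the plane erase policy minimizes computation, taking into account even use of the flash and the limited resources of embedded systems. The performance of the proposed techniques is demonstrated through experiments and the results are analyzed.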
https://doi.org/10.3745/KIPSTA.2005.12A.7.571
