• Title/Summary/Keyword: memory efficiency

Search Result 709, Processing Time 0.022 seconds

Preliminary Study on the Enhancement of Reconstruction Speed for Emission Computed Tomography Using Parallel Processing (병렬 연산을 이용한 방출 단층 영상의 재구성 속도향상 기초연구)

  • Park, Min-Jae;Lee, Jae-Sung;Kim, Soo-Mee;Kang, Ji-Yeon;Lee, Dong-Soo;Park, Kwang-Suk
    • Nuclear Medicine and Molecular Imaging
    • /
    • v.43 no.5
    • /
    • pp.443-450
    • /
    • 2009
  • Purpose: Conventional image reconstruction uses simplified physical models of projection. However, real physics, for example 3D reconstruction, takes too long time to process all the data in clinic and is unable in a common reconstruction machine because of the large memory for complex physical models. We suggest the realistic distributed memory model of fast-reconstruction using parallel processing on personal computers to enable large-scale technologies. Materials and Methods: The preliminary tests for the possibility on virtual manchines and various performance test on commercial super computer, Tachyon were performed. Expectation maximization algorithm with common 2D projection and realistic 3D line of response were tested. Since the process time was getting slower (max 6 times) after a certain iteration, optimization for compiler was performed to maximize the efficiency of parallelization. Results: Parallel processing of a program on multiple computers was available on Linux with MPICH and NFS. We verified that differences between parallel processed image and single processed image at the same iterations were under the significant digits of floating point number, about 6 bit. Double processors showed good efficiency (1.96 times) of parallel computing. Delay phenomenon was solved by vectorization method using SSE. Conclusion: Through the study, realistic parallel computing system in clinic was established to be able to reconstruct by plenty of memory using the realistic physical models which was impossible to simplify.

A Word Dictionary Structure for the Postprocessing of Hangul Recognition (한글인식 후처리용 단어사전의 기억구조)

  • ;Yoshinao Aoki
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.19 no.9
    • /
    • pp.1702-1709
    • /
    • 1994
  • In the postprocessing of Hangul recognition system, the storage structure of contextual information is an important matter for the recognition rate and speed of the entire system. Trie in general is used to represent the context as word dictionary, but the memory space efficiency of the structure is low. Therefore we propose a new structure for word dictionary that has better space efficiency and the equivalent merits of trie. Because Hangul is a compound language, the language can be represented by phonemes or by characters. In the representation by phonemes(P-mode) the retrieval is fast, but the space efficiency is low. In the representation by characters(C-mode) the space efficiency is high, but the retrieval is slow. In this paper the two representation methods are combined to form a hybrid representation(H-mode). At first an optimal level for the combination is selected by two characteristic curves of node utilization and dispersion. Then the input words are represented with trie structure by P-mode from the first to the optimal level, and the rest are represented with sequentially linked list structure by C-mode. The experimental results for the six kinds of word set show that the proposed structure is more efficient. This result is based on the fact that the retrieval for H-mode is as fast as P-mode and the space efficiency is as good as C-mode.

  • PDF

A Review of the Neurocognitive Mechanisms for Mathematical Thinking Ability (수학적 사고력에 관한 인지신경학적 연구 개관)

  • Kim, Yon Mi
    • Korean Journal of Cognitive Science
    • /
    • v.27 no.2
    • /
    • pp.159-219
    • /
    • 2016
  • Mathematical ability is important for academic achievement and technological renovations in the STEM disciplines. This study concentrated on the relationship between neural basis of mathematical cognition and its mechanisms. These cognitive functions include domain specific abilities such as numerical skills and visuospatial abilities, as well as domain general abilities which include language, long term memory, and working memory capacity. Individuals can perform higher cognitive functions such as abstract thinking and reasoning based on these basic cognitive functions. The next topic covered in this study is about individual differences in mathematical abilities. Neural efficiency theory was incorporated in this study to view mathematical talent. According to the theory, a person with mathematical talent uses his or her brain more efficiently than the effortful endeavour of the average human being. Mathematically gifted students show different brain activities when compared to average students. Interhemispheric and intrahemispheric connectivities are enhanced in those students, particularly in the right brain along fronto-parietal longitudinal fasciculus. The third topic deals with growth and development in mathematical capacity. As individuals mature, practice mathematical skills, and gain knowledge, such changes are reflected in cortical activation, which include changes in the activation level, redistribution, and reorganization in the supporting cortex. Among these, reorganization can be related to neural plasticity. Neural plasticity was observed in professional mathematicians and children with mathematical learning disabilities. Last topic is about mathematical creativity viewed from Neural Darwinism. When the brain is faced with a novel problem, it needs to collect all of the necessary concepts(knowledge) from long term memory, make multitudes of connections, and test which ones have the highest probability in helping solve the unusual problem. Having followed the above brain modifying steps, once the brain finally finds the correct response to the novel problem, the final response comes as a form of inspiration. For a novice, the first step of acquisition of knowledge structure is the most important. However, as expertise increases, the latter two stages of making connections and selection become more important.

Multimedia Extension Instructions and Optimal Many-core Processor Architecture Exploration for Portable Ultrasonic Image Processing (휴대용 초음파 영상처리를 위한 멀티미디어 확장 명령어 및 최적의 매니코어 프로세서 구조 탐색)

  • Kang, Sung-Mo;Kim, Jong-Myon
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.8
    • /
    • pp.1-10
    • /
    • 2012
  • This paper proposes design space exploration methodology of many-core processors including multimedia specific instructions to support high-performance and low power ultrasound imaging for portable devices. To explore the impact of multimedia instructions, we compare programs using multimedia instructions and baseline programs with a same many-core processor in terms of execution time, energy efficiency, and area efficiency. Experimental results using a $256{\times}256$ ultrasound image indicate that programs using multimedia instructions achieve 3.16 times of execution time, 8.13 times of energy efficiency, and 3.16 times of area efficiency over the baseline programs, respectively. Likewise, programs using multimedia instructions outperform the baseline programs using a $240{\times}320$ image (2.16 times of execution time, 4.04 times of energy efficiency, 2.16 times of area efficiency) as well as using a $240{\times}400$ image (2.25 times of execution time, 4.34 times of energy efficiency, 2.25 times of area efficiency). In addition, we explore optimal PE architecture of many-core processors including multimedia instructions by varying the number of PEs and memory size.

A Spatial Index for PDA using Minimum Bounding Rectangle Compression and Hashing Techniques (최소경계사각형 압축 및 해슁 기법을 이용한 PDA용 공간색인)

  • 김진덕
    • Spatial Information Research
    • /
    • v.10 no.1
    • /
    • pp.61-76
    • /
    • 2002
  • Mobile map services using PDA are prevailing because of the rapid developments of techniques of the internet and handhold devices recently. While the volume of spatial data is tremendous and the spatial operations are time-intensive, the PDA has small size memory and a low performance processor. Therefore, the spatial index for PDA should be small size and efficiently filter out the candidate objects of spatial operation as well. This paper proposes a spatial index far PDA called MHF(Multilevel Hashing File). The MHF has simple structure for storage efficiency and uses a hashing technique, which is direct search method, for search efficiency. This paper also designs a compression technique for MBR. which occupies almost 80% of index data in the two dimensional case. We call it HMBR. Although the HMBR technique reduces the MB\ulcorner size to almost a third, it shows good filtering efficiency because of no information loss by quantization in case of small objects that occupy a major portion. Our experimental tests show that the proposed MHF index using HMBR technique is appropriate for PDA in terms of the size of index, the Number of MBR comparisons, the filtering efficiency and the execution time of spatial operations.

  • PDF

The Hardware Design of Effective Deblocking Filter for HEVC Encoder (HEVC 부호기를 위한 효율적인 디블록킹 하드웨어 설계)

  • Park, Jae-Ha;Park, Seung-yong;Ryoo, Kwang-ki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2014.10a
    • /
    • pp.755-758
    • /
    • 2014
  • In this paper, we propose effective Deblocking Filter hardware architecture for High Efficiency Video Coding encoder. we propose Deblocking Filter hardware architecture with less processing time, filter ordering for low area design, effective memory architecture and four-pipeline for a high performance HEVC(High Efficiency Video Coding) encoder. Proposed filter ordering can be used to reduce delay according to preprocessing. It can be used for realtime single-port SRAM read and write. it can be used in parallel processing by using two filters. Using 10 memory is effective for solving the hazard caused by a single-port SRAM. Also the proposed filter can be used in low-voltage design by using clock gating architecture in 4-pipeline. The proposed Deblocking Filter encoder architecture is designed by Verilog HDL, and implemented by 100k logic gates in TSMC $0.18{\mu}m$ process. At 150MHz, the proposed Deblocking Filter encoder can support 4K Ultra HD video encoding at 30fps, and can be operated at a maximum speed of 200MHz.

  • PDF

An Efficient Spatial Index Structure for Main Memory (메인 메모리를 위한 효율적인 공간 인덱스 구조)

  • Lee, Ki-Young;Lim, Myung-Jae;Kang, Jeong-Jin;Kim, Joung-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.9 no.2
    • /
    • pp.13-20
    • /
    • 2009
  • Recently there is growing interest in LBS requiring real-time services and the spatial main memory DBMS for efficient Telematics services. In order to optimize existing disk-based spatial indexes of the spatial main memory DBMS in the main memory, spatial index structures have been proposed, which minimize failures in cache access by reducing the entry size. However, because the reduction of entry size requires compression based on the MBR of the parent node or the removal of redundant MBR, the cost of MBR reconstruction increases in index update and the efficiency of search is lowered in index search. Thus, to reduce the cost of MBR reconstruction, this paper proposed the RSMB (relative-sized MBR)compression technique, which applies the base point of compression differently in case of broad distribution and narrow distribution. In case of broad distribution, compression is made based on the left-bottom point of the extended MBR of the parent node, and in case of narrow distribution, the whole MBR is divided into cells of the same size and compression is made based on the left-bottom point of each cell. In addition, MBR was compressed using a relative coordinate and size to reduce the cost of search in index search. Lastly, we evaluated the performance of the proposed RSMBR compression technique using real data, and proved its superiority.

  • PDF

Comparison of Message Passing Interface and Hybrid Programming Models to Solve Pressure Equation in Distributed Memory System (분산 메모리 시스템에서 압력방정식의 해법을 위한 MPI와 Hybrid 병렬 기법의 비교)

  • Jeon, Byoung Jin;Choi, Hyoung Gwon
    • Transactions of the Korean Society of Mechanical Engineers B
    • /
    • v.39 no.2
    • /
    • pp.191-197
    • /
    • 2015
  • The message passing interface (MPI) and hybrid programming models for the parallel computation of a pressure equation were compared in a distributed memory system. Both models were based on domain decomposition, and two numbers of the sub-domain were selected by considering the efficiency of the hybrid model. The parallel performances for various problem sizes were measured using up to 96 threads. It was found that in addition to the cache-memory size, the overhead of the MPI communication/OpenMP directives affected the parallel performance. For small problems, the parallel performance was low because the percentage of the overhead of the MPI communication/OpenMP directives increased as the number of threads increased, and MPI was better than the hybrid model because it had a smaller communication overhead. For large problems, the parallel performance was high because, in addition to the cache effect, the percentage of the communication overhead was relatively low compared to that for small problems, and the hybrid model was better than MPI because the communication overhead of MPI was more dominant than that of the OpenMP directives in the hybrid model.

An Energy-Delay Efficient System with Adaptive Victim Caches (선택적 희생 캐쉬를 이용한 저전력 고성능 시스템 설계 방안)

  • Kim Cheol Hong;Shim Sunghoon;Jhon Chu Shik;Jhang Seong Tae
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.32 no.11_12
    • /
    • pp.663-674
    • /
    • 2005
  • We propose a system aimed at achieving high energy-delay efficiency by using adaptive victim caches. Particularly, we investigate methods to improve the hit rates in the first level of memory hierarchy, which reduces the number of accesses to mort power consuming memory structures such as L2 cache. Victim cache is a memory element for reducing conflict misses in a direct-mapped L1 cache. We present two techniques to fill the victim cache with the blocks that have higher probability to be re-reqeusted by processor. Hit-based victim cache ks tilled with the blocks which were referenced frequently by processor. Replacement-based victim cache is filled with the blocks which were evicted from the sets where block replacements had happened frequently According to our simulations, replacement-based victim cache scheme outperforms the conventional victim cache scheme about $2\%$ on average and refutes the power consumption by up to $8\%$.

Memory Efficient Query Processing over Dynamic XML Fragment Stream (동적 XML 조각 스트림에 대한 메모리 효율적 질의 처리)

  • Lee, Sang-Wook;Kim, Jin;Kang, Hyun-Chul
    • The KIPS Transactions:PartD
    • /
    • v.15D no.1
    • /
    • pp.1-14
    • /
    • 2008
  • This paper is on query processing in the mobile devices where memory capacity is limited. In case that a query against a large volume of XML data is processed in such a mobile device, techniques of fragmenting the XML data into chunks and of streaming and processing them are required. Such techniques make it possible to process queries without materializing the XML data in its entirety. The previous schemes such as XFrag[4], XFPro[5], XFLab[6] are not scalable with respect to the increase of the size of the XML data because they lack proper memory management capability. After some information on XML fragments necessary for query processing is stored, it is not deleted even after it becomes of no use. As such, when the XML fragments are dynamically generated and infinitely streamed, there could be no guarantee of normal completion of query processing. In this paper, we address scalability of query processing over dynamic XML fragment stream, proposing techniques of deleting information on XML fragments accumulated during query processing in order to extend the previous schemes. The performance experiments through implementation showed that our extended schemes considerably outperformed the previous ones in memory efficiency and scalability with respect to the size of the XML data.