• Title/Summary/Keyword: Parallel computation

Physical Property Factors Controlling the Electrical Resistivity of Subsurface (지반의 전기비저항을 좌우하는 물성요인)

  • Park Sam-Gyu
    • Geophysics and Geophysical Exploration / v.7 no.2 / pp.130-135 / 2004
  • This paper describes the physical property factors that control the electrical resistivity of the subsurface. The resistivities of various soil and rock samples saturated with sodium chloride solutions of nine different concentrations were measured and compared with resistivities calculated using conventional empirical formulas. The results show that the resistivity of the soil and rock samples increases with increasing pore-fluid resistivity regardless of the media type. However, between 20 and 200 ohm-m, which is the normal resistivity range of groundwater, the pore-fluid resistivity has little or no effect on the resistivities of the samples. Below 10 ohm-m, the sample resistivities are controlled mainly by the pore fluid, whereas in the normal resistivity range of groundwater they are controlled more by the intrinsic matrix resistivity than by the pore-fluid resistivity. In addition, the measured resistivities of rock and soil samples with clay contents above $20\%$ agreed well with the resistivities calculated using the parallel resistance model, whereas the resistivities of glass beads correlated with those obtained using Archie's formula. When the pore-fluid resistivity is high, the sample resistivities could not be computed using Archie's formula. Through this study, we confirmed that, within the normal resistivity range of groundwater in the subsurface, only the parallel resistance model, which considers the intrinsic matrix resistivity, is applicable to the test results.
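The contrast between the two models named in the abstract above can be illustrated with a minimal sketch. It assumes the textbook form of Archie's formula for a fully saturated sample, rho = a * phi^(-m) * rho_w, and a parallel resistance model in which the pore-fluid path and the matrix conduct in parallel, 1/rho = 1/rho_archie + 1/rho_matrix. The function names and the parameter values (`a`, `m`, porosity, matrix resistivity) are illustrative assumptions and are not taken from the paper.

```python
def archie_resistivity(rho_w, porosity, a=1.0, m=2.0):
    """Archie's formula for a fully saturated sample (assumed textbook form).

    rho_w    : pore-fluid resistivity in ohm-m
    porosity : fractional porosity (0 < porosity <= 1)
    a, m     : empirical constants (illustrative defaults)
    """
    return a * porosity ** (-m) * rho_w


def parallel_resistivity(rho_w, porosity, rho_matrix, a=1.0, m=2.0):
    """Parallel resistance model: fluid path and matrix conduct in parallel."""
    rho_fluid_path = archie_resistivity(rho_w, porosity, a, m)
    return 1.0 / (1.0 / rho_fluid_path + 1.0 / rho_matrix)


if __name__ == "__main__":
    # Hypothetical sample: 30% porosity, matrix resistivity of 500 ohm-m.
    for rho_w in (1.0, 10.0, 100.0, 1000.0):
        print(rho_w,
              archie_resistivity(rho_w, 0.3),
              parallel_resistivity(rho_w, 0.3, 500.0))
```

Under these assumptions the parallel model tracks Archie's formula when the pore fluid is conductive and levels off near the matrix resistivity when the pore fluid is resistive, which is the qualitative behavior the abstract reports.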

Fast Multi-View Synthesis Using Duplex Foward Mapping and Parallel Processing (순차적 이중 전방 사상의 병렬 처리를 통한 다중 시점 고속 영상 합성)

  • Choi, Ji-Youn;Ryu, Sae-Woon;Shin, Hong-Chang;Park, Jong-Il
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.11B / pp.1303-1310 / 2009
  • A glasses-free 3D display requires multiple images of a scene taken from different viewpoints. The simplest way to obtain multi-view images is to use as many cameras as the number of views required. In that case, synchronizing the cameras and computing and transmitting large amounts of data become critical problems. Thus, generating such a large number of viewpoint images efficiently is emerging as a key technique in 3D video technology. Image-based view synthesis is an algorithm for generating various virtual viewpoint images from a limited number of views and depth maps. In this paper, because a virtual view image can be expressed as a transformation of a real view image under some depth condition, we propose an algorithm that synthesizes multiple views from two reference view images and their depth maps by stepwise duplex forward mapping. In addition, because the geometric relationship between the real view and the virtual view is repetitive, we implement our algorithm in the OpenGL Shading Language, which programs the graphics processing unit (GPU) and allows parallel processing, to reduce computation time. We demonstrate the effectiveness of our algorithm for fast view synthesis through a variety of experiments with real data.

A Study on GPU-based Iterative ML-EM Reconstruction Algorithm for Emission Computed Tomographic Imaging Systems (방출단층촬영 시스템을 위한 GPU 기반 반복적 기댓값 최대화 재구성 알고리즘 연구)

  • Ha, Woo-Seok;Kim, Soo-Mee;Park, Min-Jae;Lee, Dong-Soo;Lee, Jae-Sung
    • Nuclear Medicine and Molecular Imaging / v.43 no.5 / pp.459-467 / 2009
  • Purpose: Maximum likelihood-expectation maximization (ML-EM) is a statistical reconstruction algorithm derived from a probabilistic model of the emission and detection processes. Although ML-EM has many advantages in accuracy and utility, its use is limited by the computational burden of iterative processing on a CPU (central processing unit). In this study, we developed a parallel computing technique on a GPU (graphics processing unit) for the ML-EM algorithm. Materials and Methods: Using a GeForce 9800 GTX+ graphics card and NVIDIA's CUDA (compute unified device architecture) technology, the projection and backprojection steps of the ML-EM algorithm were parallelized. The computation times for projection, for the errors between measured and estimated data, and for backprojection within an iteration were measured. The total time included the latency of data transmission between RAM and GPU memory. Results: The total computation times of the CPU- and GPU-based ML-EM with 32 iterations were 3.83 and 0.26 sec, respectively. In this case, the computing speed was improved about 15 times on the GPU. When the number of iterations was increased to 1024, the CPU- and GPU-based computations took 18 min and 8 sec in total, respectively. The improvement was about 135 times and was caused by delays in the CPU-based computation after a certain number of iterations. In contrast, the GPU-based computation showed very little variation in time per iteration owing to the use of shared memory. Conclusion: GPU-based parallel computation for ML-EM significantly improved the computing speed and stability. The developed GPU-based ML-EM algorithm could be easily modified for other imaging geometries.
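The three per-iteration steps timed in the abstract above (projection, comparison of measured and estimated data, backprojection) correspond to the standard ML-EM update. The following is a minimal CPU-only sketch of that update in plain Python, assuming a small dense system matrix `A` (rows are detector bins, columns are image voxels). The function name `ml_em` and the toy data are illustrative assumptions; this is not the paper's CUDA implementation.

```python
def ml_em(A, y, n_iter=32, eps=1e-12):
    """Standard ML-EM update for emission tomography (illustrative sketch).

    A : list of rows, A[i][j] = contribution of voxel j to detector bin i
    y : measured counts per detector bin
    """
    n_bins, n_vox = len(A), len(A[0])
    # Sensitivity image: backprojection of an all-ones sinogram.
    sens = [sum(A[i][j] for i in range(n_bins)) for j in range(n_vox)]
    x = [1.0] * n_vox  # uniform, strictly positive initial estimate
    for _ in range(n_iter):
        # 1) Forward projection of the current estimate.
        proj = [sum(A[i][j] * x[j] for j in range(n_vox)) for i in range(n_bins)]
        # 2) Ratio between measured and estimated data.
        ratio = [y[i] / max(proj[i], eps) for i in range(n_bins)]
        # 3) Backprojection of the ratio, then multiplicative update.
        back = [sum(A[i][j] * ratio[i] for i in range(n_bins)) for j in range(n_vox)]
        x = [x[j] * back[j] / max(sens[j], eps) for j in range(n_vox)]
    return x


if __name__ == "__main__":
    A = [[1.0, 1.0], [1.0, 0.0]]  # hypothetical 2-bin, 2-voxel system
    y = [3.0, 1.0]
    print(ml_em(A, y, n_iter=200))
```

Steps 1 and 3 are sums that are independent across detector bins and across voxels, respectively, which is why projection and backprojection are the natural targets for GPU parallelization.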

Development of Regularized Expectation Maximization Algorithms for Fan-Beam SPECT Data (부채살 SPECT 데이터를 위한 정칙화된 기댓값 최대화 재구성기법 개발)

  • Kim, Soo-Mee;Lee, Jae-Sung;Lee, Soo-Jin;Kim, Kyeong-Min;Lee, Dong-Soo
    • The Korean Journal of Nuclear Medicine / v.39 no.6 / pp.464-472 / 2005
  • Purpose: SPECT using a fan-beam collimator improves spatial resolution and sensitivity. For reconstruction from fan-beam projections, it is necessary to implement direct fan-beam reconstruction methods that do not transform the data into the parallel geometry. In this study, various fan-beam reconstruction algorithms were implemented and their performances were compared. Materials and Methods: The projector for fan-beam SPECT was implemented using a ray-tracing method. The direct reconstruction algorithms implemented for fan-beam projection data were FBP (filtered backprojection), EM (expectation maximization), OS-EM (ordered subsets EM) and MAP-EM OSL (maximum a posteriori EM using the one-step late method) with membrane and thin-plate models as priors. For comparison, the fan-beam projection data were also rebinned into parallel data using various interpolation methods, namely nearest neighbor, bilinear and bicubic interpolation, and reconstructed using the conventional EM algorithm for parallel data. Noiseless and noisy projection data from the digital Hoffman brain and Shepp/Logan phantoms were reconstructed using the above algorithms. The reconstructed images were compared in terms of a percent error metric. Results: For the fan-beam data with Poisson noise, the MAP-EM OSL algorithm with the thin-plate prior showed the best result in both percent error and stability. Bilinear interpolation was the most effective method for rebinning from the fan-beam to the parallel geometry when both accuracy and computational load were considered. Direct fan-beam EM reconstructions were more accurate than the standard EM reconstructions obtained from rebinned parallel data. Conclusion: Direct fan-beam reconstruction algorithms were implemented, and they provided significantly improved reconstructions.

Implementation of High-Throughput SHA-1 Hash Algorithm using Multiple Unfolding Technique (다중 언폴딩 기법을 이용한 SHA-1 해쉬 알고리즘 고속 구현)

  • Lee, Eun-Hee;Lee, Je-Hoon;Jang, Young-Jo;Cho, Kyoung-Rok
    • Journal of the Institute of Electronics Engineers of Korea SD / v.47 no.4 / pp.41-49 / 2010
  • This paper proposes a new high-speed SHA-1 architecture using multiple unfolding and pre-computation techniques. We unfold the iterative hash operations into 2 continuous hash stages and reschedule the computation timing. Part of the critical path is then computed in the previous hash operation round, and the rest is performed in the present round. These techniques reduce the critical path from 3 additions to 2 additions. This yields a maximum clock frequency of 118 MHz and a throughput of 5.9 Gbps. The proposed architecture shows 26% higher throughput with a 32% smaller hardware size compared to its counterparts. This paper also introduces an analytical model of a multiple SHA-1 architecture at the system level that maps large input data onto SHA-1 blocks in parallel. The model gives the number of SHA-1 blocks required for processing large multimedia data, which helps in deciding the hardware configuration. The high-speed SHA-1 is useful for generating a condensed message and may strengthen the security of mobile communication and internet services.

40Gb/s Foward Error Correction Architecture for Optical Communication System (광통신 시스템을 위한 40Gb/s Forward Error Correction 구조 설계)

  • Lee, Seung-Beom;Lee, Han-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.2 / pp.101-111 / 2008
  • This paper introduces a high-speed Reed-Solomon (RS) decoder with reduced hardware complexity and presents an RS-decoder-based FEC architecture for 40Gb/s optical communication systems. We introduce a new pipelined degree-computationless modified Euclidean (pDCME) algorithm architecture, which has high throughput and low hardware complexity. The proposed 16-channel RS FEC architecture consists of two 8-channel RS FEC architectures, each with 8 syndrome computation blocks and a single shared KES block. It reduces the hardware complexity by about 30% compared to the conventional 16-channel 3-parallel FEC architecture, which has 4 syndrome computation blocks and a single shared KES block. The proposed RS FEC architecture has been designed and implemented in $0.18-{\mu}m$ CMOS technology with a supply voltage of 1.8 V. The results show that the total gate count is 250K and that the data processing rate is 5.1Gb/s at a clock frequency of 400MHz. The proposed area-efficient architecture can be readily applied to next-generation FEC devices for high-speed optical communications as well as wireless communications.

Effective Graph-Based Heuristics for Contingent Planning (조건부 계획수립을 위한 효과적인 그래프 기반의 휴리스틱)

  • Kim, Hyun-Sik;Kim, In-Cheol;Park, Young-Tack
    • The KIPS Transactions:PartB / v.18B no.1 / pp.29-38 / 2011
  • To derive domain-independent heuristics from the specification of a planning problem, it is necessary to relax the given problem and then solve the relaxed one. In this paper, we present a new planning graph, the Merged Planning Graph (MPG), and GD heuristics for solving contingent planning problems with both uncertainty about the initial state and non-deterministic action effects. The merged planning graph extends the relaxed planning graph, a common means of obtaining effective heuristics for classical planning problems, so that it can be applied to contingent planning problems. To obtain heuristics for contingent planning problems with sensing actions and non-deterministic actions, the new graph uses effect-merge relaxations of these actions in addition to the traditional delete relaxations. Proceeding in parallel with the forward expansion of the merged planning graph, the computation of the GD heuristic analyzes the interdependencies among goals or subgoals and excludes unnecessary redundant cost when estimating the minimal reachability cost of achieving the overall set of goals. Therefore, GD heuristics have the advantage that they usually require less computation time than the overlap heuristics while being more informative than the max and the additive heuristics. We also present an experimental analysis that shows the accuracy and the search efficiency of the GD heuristics.

Spatial Computation on Spark Using GPGPU (GPGPU를 활용한 스파크 기반 공간 연산)

  • Son, Chanseung;Kim, Daehee;Park, Neungsoo
    • KIPS Transactions on Computer and Communication Systems / v.5 no.8 / pp.181-188 / 2016
  • Recently, as the amount of spatial information has increased, interest in spatial information processing has grown. Spatial database systems extended from traditional relational database systems have difficulty handling large data sets because of limited scalability. SpatialHadoop, which is extended from the Hadoop system, has low performance because its spatial computations write many intermediate results to disk. In this paper, Spatial Computation Spark (SC-Spark), an in-memory distributed processing framework, is proposed. SC-Spark is extended from Spark in order to perform spatial operations efficiently on large-scale data. In addition, a GPGPU-based SC-Spark is developed to further improve the performance of SC-Spark. SC-Spark exploits Spark's ability to hold intermediate results in memory, and the GPGPU-based SC-Spark can perform spatial operations in parallel using the many processing elements of a GPU. To verify the proposed work, experiments on a single AMD system were performed using SC-Spark and the GPGPU-based SC-Spark for Point-in-Polygon and spatial join operations. The experimental results showed that SC-Spark and the GPGPU-based SC-Spark were up to 8 times faster than SpatialHadoop.
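The Point-in-Polygon operation evaluated in the abstract above is commonly implemented with the ray-casting (even-odd) rule. The sketch below shows that rule in plain Python; the abstract does not state which algorithm SC-Spark uses, so this is an assumed illustration, and the function name `point_in_polygon` is hypothetical.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd ray casting: count edge crossings of a ray going in +x.

    polygon : list of (x, y) vertices in order; the last vertex is
              implicitly connected back to the first.
    Points exactly on an edge are not treated specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line y = py?
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge crosses that line.
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


if __name__ == "__main__":
    square = [(0, 0), (4, 0), (4, 4), (0, 4)]
    print(point_in_polygon(2, 2, square), point_in_polygon(5, 2, square))
```

Each point is tested independently of every other point, so a batch of points can be distributed across Spark partitions or GPU threads without any coordination between tests.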

Acceleration of Viewport Extraction for Multi-Object Tracking Results in 360-degree Video (360도 영상에서 다중 객체 추적 결과에 대한 뷰포트 추출 가속화)

  • Heesu Park;Seok Ho Baek;Seokwon Lee;Myeong-jin Lee
    • Journal of Advanced Navigation Technology / v.27 no.3 / pp.306-313 / 2023
  • Realistic and graphics-based virtual reality content is based on 360-degree videos, and viewport extraction driven by the viewer's intention or by an automatic recommendation function is essential. This paper designs a viewport extraction system based on multiple object tracking in 360-degree videos and proposes a parallel computing structure for extracting multiple viewports. The viewport extraction process in 360-degree videos is parallelized with pixel-wise threads, through 3D spherical surface coordinate transformation from ERP coordinates and 2D coordinate transformation of 3D spherical surface coordinates within the viewport. The computation time of the proposed structure was evaluated for up to 30 viewport extraction processes on aerial 360-degree video sequences, and an acceleration of up to 5240 times was confirmed compared to the CPU-based computation time, which is proportional to the number of viewports. When high-speed I/O or memory buffers that reduce the ERP frame I/O time are used, viewport extraction can be further accelerated by 7.82 times. The proposed parallelized viewport extraction structure can be applied to simultaneous multi-access services for 360-degree videos or virtual reality content and to video summarization services for individual users.
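The per-pixel coordinate transformations mentioned in the abstract above can be illustrated with a minimal sketch that maps one viewport pixel to a sampling position in the equirectangular (ERP) frame by way of a 3D viewing direction. It assumes a pinhole viewport model, a camera frame with x to the right, y down and z forward, and an ERP layout with longitude 0 at the horizontal center. The function name, parameters and conventions are illustrative assumptions, not the paper's formulation.

```python
import math


def viewport_pixel_to_erp(u, v, vp_w, vp_h, fov_deg,
                          yaw_deg, pitch_deg, erp_w, erp_h):
    """Map viewport pixel (u, v) to fractional ERP coordinates (ex, ey)."""
    # Focal length in pixels from the horizontal field of view.
    f = (vp_w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    # Ray through the pixel in the camera frame (x right, y down, z forward).
    x = u - (vp_w - 1) / 2.0
    y = v - (vp_h - 1) / 2.0
    z = f
    p = math.radians(pitch_deg)
    q = math.radians(yaw_deg)
    # Rotate about the x axis (pitch, positive looks up) ...
    y1 = y * math.cos(p) - z * math.sin(p)
    z1 = y * math.sin(p) + z * math.cos(p)
    # ... then about the y axis (yaw, positive looks right).
    x2 = x * math.cos(q) + z1 * math.sin(q)
    z2 = -x * math.sin(q) + z1 * math.cos(q)
    # Direction on the sphere as longitude and latitude.
    lon = math.atan2(x2, z2)
    lat = math.atan2(-y1, math.hypot(x2, z2))
    # Spherical angles to ERP pixel coordinates.
    ex = (lon / (2.0 * math.pi) + 0.5) * erp_w
    ey = (0.5 - lat / math.pi) * erp_h
    return ex, ey


if __name__ == "__main__":
    # Center pixel of a hypothetical 101x101 viewport looking 90 degrees right.
    print(viewport_pixel_to_erp(50, 50, 101, 101, 90.0, 90.0, 0.0, 4096, 2048))
```

Because the mapping for one pixel does not depend on any other pixel, it can be assigned to one GPU thread per output pixel, which matches the pixel-wise threading described in the abstract.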

The 64-Bit Scrambler Design of the OFDM Modulation for Vehicles Communications Technology (차량 통신 기술을 위한 OFDM 모듈레이션의 64-비트 스크램블러 설계)

  • Lee, Dae-Sik
    • Journal of Internet Computing and Services / v.14 no.1 / pp.15-22 / 2013
  • WAVE (Wireless Access for Vehicular Environment) is a new vehicular communication technology, defined by the IEEE 802.11p standard, that is used for ITS (Intelligent Transportation Systems) services. It also increases the efficiency and safety of road traffic. However, the scrambler bit computation algorithm of OFDM modulation in WAVE systems is inefficient because it cannot be processed in parallel in either hardware or software. This paper proposes an algorithm that builds a 64-bit matrix table for the scrambler bit computation, together with an algorithm that combines the 64-bit matrix table and the input data in parallel. The proposed algorithm is executed using the 64-bit matrix table. As a result, the processing speed for 1 and 1000 runs is improved by about 40.08% ~ 40.27%, and the processing rate per second is more than 468.35 higher, compared to the bit-operation scrambler. Compared to the 32-bit-operation scrambler, the processing speed for 1 and 1000 runs is improved by about 7.53% ~ 7.84%, and the processing rate per second is more than 91.44 higher. Therefore, when a 64-bit CPU is used for the 64-bit scrambling algorithm, performance is improved by more than 40% compared to the 32-bit scrambler.
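The difference between bit-serial and word-wise scrambling described in the abstract above can be sketched as follows. The bit-serial part uses the IEEE 802.11 OFDM scrambler polynomial S(x) = x^7 + x^4 + 1. The word-wise part is a simple lookup-table illustration that precomputes 64 keystream bits and the resulting register state for every seed; it is an assumption made for illustration and does not reproduce the paper's matrix table construction. All function names are hypothetical.

```python
def lfsr_step(state):
    """One step of the x^7 + x^4 + 1 scrambler register.

    state : 7-bit int, bit 6 holds x7 (oldest) and bit 0 holds x1 (newest).
    Returns (keystream_bit, next_state).
    """
    out = ((state >> 6) ^ (state >> 3)) & 1  # x7 XOR x4
    return out, ((state << 1) | out) & 0x7F


def scramble_bits(bits, seed=0x7F):
    """Bit-serial scrambling: one register step and one XOR per data bit."""
    state, result = seed, []
    for b in bits:
        k, state = lfsr_step(state)
        result.append(b ^ k)
    return result


def build_table():
    """For each nonzero seed: (64-bit keystream word, state after 64 steps)."""
    table = {}
    for seed in range(1, 128):
        state, word = seed, 0
        for _ in range(64):
            k, state = lfsr_step(state)
            word = (word << 1) | k  # first keystream bit ends up in the MSB
        table[seed] = (word, state)
    return table


def scramble_words(words, seed=0x7F, table=None):
    """Word-wise scrambling: one table lookup and one XOR per 64 data bits."""
    if table is None:
        table = build_table()
    state, result = seed, []
    for w in words:
        key, state = table[state]
        result.append(w ^ key)
    return result


if __name__ == "__main__":
    print(scramble_bits([0] * 8))        # first 8 keystream bits for the all-ones seed
    print(hex(scramble_words([0])[0]))   # the same keystream packed into one 64-bit word
```

Here data bits are packed into words with the first bit in the most significant position, so `scramble_words` produces the same output as `scramble_bits` applied to the unpacked bits. Scrambling is its own inverse: applying either function twice with the same seed returns the original data.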