Integrated Search | Korea Science

CPU-GPU² Trigeneous Computing for Iterative Reconstruction in Computed Tomography

Oh, Chanyoung;Yi, Youngmin
- IEIE Transactions on Smart Processing and Computing, Vol. 5, No. 4, pp. 294-301, 2016
In this paper, we present methods to efficiently parallelize iterative 3D image reconstruction by exploiting trigeneous devices (three different types of device) at the same time: a CPU, an integrated GPU, and a discrete GPU. We first present a technique that exploits single instruction multiple data (SIMD) architectures in GPUs. Then, we propose a performance estimation model, based on which we can easily find the optimal data partitioning on trigeneous devices. We found that the performance significantly varies by up to 6.23 times, depending on how SIMD units in GPUs are accessed. Then, by using trigeneous devices and the proposed estimation models, we achieve optimal partitioning and throughput, which corresponds to a 9.4% further improvement, compared to discrete GPU-only execution.
https://doi.org/10.5573/IEIESPC.2016.5.4.294
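To make the idea of model-based data partitioning across three devices concrete, the following is a minimal sketch and not the authors' estimation model. It assumes each device processes items at a constant, already measured throughput (a hypothetical linear model), and splits the work so that all devices finish at about the same time. The device names and throughput numbers are illustrative only.

```python
def partition_work(total_items, throughputs):
    """Split total_items across devices in proportion to their throughput.

    throughputs maps a device name to items processed per second
    (hypothetical measured values, assumed constant).
    """
    total_rate = sum(throughputs.values())
    shares = {
        name: int(total_items * rate / total_rate)
        for name, rate in throughputs.items()
    }
    # Rounding down can leave a few items unassigned; give them to the fastest device.
    leftover = total_items - sum(shares.values())
    fastest = max(throughputs, key=throughputs.get)
    shares[fastest] += leftover
    return shares


def estimated_time(shares, throughputs):
    """Devices run concurrently, so the slowest one determines the total time."""
    return max(shares[name] / throughputs[name] for name in shares)


# Illustrative throughputs only (items per second).
rates = {"cpu": 1.0, "integrated_gpu": 2.0, "discrete_gpu": 7.0}
shares = partition_work(1000, rates)
```

Under this simple model, the balanced split is faster than running everything on the fastest device alone, which matches the direction of the improvement reported in the abstract without reproducing its numbers.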

Performance of Distributed Database System built on Multicore Systems

Kim, Kangseok
- 인터넷정보학회논문지 (Journal of the Korean Society for Internet Information), Vol. 18, No. 6, pp. 47-53, 2017
Recently, huge datasets have been generated rapidly in a variety of fields, creating an urgent need for technologies that process them efficiently and effectively. Partitioning a huge dataset effectively and reducing the processing overhead of the partitioned data have therefore become critical factors for scalability and performance in distributed database systems. In our work, we use multicore servers to provide scalable service in our distributed system. Partitioning a database over multicore servers emerged from the need for a new architectural design of distributed database systems, driven by scalability and performance concerns in today's data deluge. The system allows uniform access through a web service interface to concurrently distributed databases over multicore servers, using the SQMD (Single Query Multiple Database) mechanism based on the publish/subscribe paradigm. We present performance results for the distributed database system built on multicore servers, a workload that is time intensive with traditional architectures. We also discuss future work.
https://doi.org/10.7472/jksii.2017.18.6.47

A Low-Complexity 128-Point Mixed-Radix FFT Processor for MB-OFDM UWB Systems

Cho, Sang-In;Kang, Kyu-Min
- ETRI Journal, Vol. 32, No. 1, pp. 1-10, 2010
In this paper, we present a fast Fourier transform (FFT) processor with four parallel data paths for multiband orthogonal frequency-division multiplexing ultra-wideband systems. The proposed 128-point FFT processor employs both a modified radix-$2^4$ algorithm and a radix-$2^3$ algorithm to significantly reduce the numbers of complex constant multipliers and complex Booth multipliers. It also employs substructure-sharing multiplication units instead of constant multipliers to efficiently conduct multiplication operations with only addition and shift operations. The proposed FFT processor is implemented and tested using 0.18 $\mu$m CMOS technology with a supply voltage of 1.8 V. The hardware-efficient 128-point FFT processor with four data streams can support a data processing rate of up to 1 Gsample/s while consuming 112 mW. The implementation results show that the proposed 128-point mixed-radix FFT architecture significantly reduces the hardware cost and power consumption in comparison to existing 128-point FFT architectures.
https://doi.org/10.4218/etrij.10.0109.0232

Adaptive and optimized agent placement scheme for parallel agent-based simulation

Jin, Ki-Sung;Lee, Sang-Min;Kim, Young-Chul
- ETRI Journal, Vol. 44, No. 2, pp. 313-326, 2022
This study presents a novel scheme for distributed and parallel simulations with optimized agent placement for simulation instances. Traditional parallel simulation is limited in that it does not provide sufficient performance even when using multiple resources. The main reason for this discrepancy is that supporting parallelism inevitably incurs costs on top of the base simulation cost. We present a comprehensive study of parallel simulation architectures, execution flows, and characteristics. Then, we identify critical challenges for optimizing large simulations for parallel instances. Based on our cost-benefit analysis, we propose a novel approach to overcome the performance constraints of agent-based parallel simulations. We also propose a solution for eliminating the synchronization cost among local instances. Our method ensures balanced performance through optimal deployment of agents to local instances and an adaptive agent placement scheme according to the simulation load. Additionally, our empirical evaluation reveals that the proposed model achieves better performance than conventional methods under several conditions.
https://doi.org/10.4218/etrij.2020-0399

Forecasting realized volatility using data normalization and recurrent neural network

Yoonjoo Lee;Dong Wan Shin;Ji Eun Choi
- Communications for Statistical Applications and Methods, Vol. 31, No. 1, pp. 105-127, 2024
We propose recurrent neural network (RNN) methods for forecasting realized volatility (RV). The data are RVs of ten major stock price indices, four from the US and six from the EU. Forecasts are made for the relative ratio of adjacent RVs instead of the RV itself in order to avoid the out-of-scale issue. Forecasts of the RV ratio distribution are constructed first, and RV forecasts are then computed from them; these are shown to be better than forecasts constructed directly from RV. The apparent asymmetry of the RV ratio is addressed by the Piecewise Min-max (PM) normalization. The serial dependence of the ratio data leads us to consider two architectures, long short-term memory (LSTM) and gated recurrent unit (GRU). The hyperparameters of LSTM and GRU are tuned by nested cross-validation. The RNN forecast with the PM normalization and ratio transformation is shown to outperform forecasts by other RNN models and by benchmark models: the AR model, the support vector machine (SVM), the deep neural network (DNN), and the convolutional neural network (CNN).
https://doi.org/10.29220/CSAM.2024.31.1.105
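The ratio transformation and normalization step described in this abstract can be illustrated with a short sketch. The abstract does not define the Piecewise Min-max normalization, so the version below is only one plausible reading and is an assumption: ratios at or below a split point of 1.0 are scaled into [0, 0.5] and ratios above it into (0.5, 1]. The function names and the split value are hypothetical, and no RNN is included.

```python
def to_ratios(rv):
    """Ratio of adjacent realized volatilities: r_t = RV_t / RV_{t-1}."""
    return [rv[t] / rv[t - 1] for t in range(1, len(rv))]


def piecewise_minmax(ratios, split=1.0):
    """Assumed piecewise min-max scaling (not taken from the paper).

    Values in [min, split] map to [0, 0.5]; values in (split, max] map to (0.5, 1].
    """
    lo, hi = min(ratios), max(ratios)
    scaled = []
    for r in ratios:
        if r <= split:
            # If no value lies strictly below the split, place the point at 0.5.
            scaled.append(0.5 * (r - lo) / (split - lo) if split > lo else 0.5)
        else:
            # r > split implies hi > split, so the denominator is positive.
            scaled.append(0.5 + 0.5 * (r - split) / (hi - split))
    return scaled


def rv_from_ratio(last_rv, ratio_forecast):
    """Turn a forecast of the next ratio back into a forecast of RV."""
    return last_rv * ratio_forecast


rv_series = [1.0, 2.0, 1.0, 4.0]   # toy values, not real market data
ratios = to_ratios(rv_series)
normalized = piecewise_minmax(ratios)
```

Scaling the two sides of 1.0 separately is one way to handle a right-skewed ratio distribution, since large upward jumps no longer compress the range available to ratios below 1.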

Toward Generic, Immersive, and Collaborative Solutions to the Data Interoperability Problem which Target End-Users

Sanchez-Ruiz, Arturo;Umapathy, Karthikeyan;Hayes, Pat
- Journal of Computing Science and Engineering, Vol. 3, No. 2, pp. 127-141, 2009
In this paper, we describe our vision of a "Just-in-time" initiative to solve the Data Interoperability Problem (a.k.a. INTEROP). We provide an architectural overview of our initiative, which draws upon existing technologies to develop an immersive and collaborative approach. The approach aims at empowering data stakeholders (e.g., data producers and data consumers) with integrated tools to interact and collaborate with each other while directly manipulating visual representations of their data in an immersive environment (e.g., implemented via Second Life). The semantics of these visual representations and the operations associated with the data are supported by ontologies defined using the Common Logic Framework (CL). Data operations gestured by the stakeholders, through their avatars, are translated to a variety of generated resources such as multi-language source code, visualizations, web pages, and web services. The generality of the approach is supported by a plug-in architecture which allows expert users to customize tasks such as data admission, data manipulation in the immersive world, and automatic generation of resources. This approach is designed with a mindset aimed at enabling stakeholders from diverse domains to exchange data and generate new knowledge.
https://doi.org/10.5626/JCSE.2009.3.2.127

A Model of implementation Data Architecture for Enterprise Architecture

김석수;이화식
- 한국컴퓨터정보학회논문지 (Journal of the Korea Society of Computer and Information), Vol. 16, No. 9, pp. 175-183, 2011
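Data is a core element of IT. Other architectures can be built by referencing and adopting advanced technologies and techniques, but a data architecture is unique to each organization, so we must build it ourselves. Data is an area that is not sensitive to technological change and evolution, so a data architecture that is built well from the start can, like the steel frame of a building, point the way toward a sound information system. A well-built data architecture makes the enterprise architecture easier to build and enables effective management and operation afterward. This paper presents a model for building a data architecture for enterprise architecture.
https://doi.org/10.9708/jksci.2011.16.9.175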

A Low Power and Low Noise Data Bus Inversion for High Speed Graphics SDRAM

곽승욱;곽계달
- 대한전자공학회논문지SD (Journal of the Institute of Electronics Engineers of Korea SD), Vol. 46, No. 7, pp. 1-6, 2009
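This paper describes a new high-speed architecture that uses DBI (Data Bus Inversion) in DRAM. DBI is one of the methods for reducing well-known problems such as SSO and LSI. Several architectures are proposed in this paper to reduce the number of transitions (toggles) on the data bus, including an Analog Majority Voter (AMV), a GIO control circuit driven by the DBI flag, and a new SSO algorithm. Because the DBI flag determines whether the GIO data are inverted, power consumption can be reduced and the data eye diagram is widened by more than 40 ps. With the proposed DBI scheme, nearly stable SI characteristics are obtained in high-speed operation. The design was fabricated using 90 nm CMOS technology.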
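The basic idea of data bus inversion, reducing toggles by sending the inverted word together with a flag, can be sketched in software. This is a minimal illustration of the general toggle-minimizing variant and not the circuit proposed in the paper: the majority decision is a digital bit count instead of an analog majority voter, the bus is assumed to start at all zeros, and the 8-bit width is an arbitrary choice.

```python
def popcount(value):
    """Number of 1 bits in a non-negative integer."""
    return bin(value).count("1")


def dbi_encode(words, width=8):
    """Return (bus_value, flag) pairs that limit toggles to at most width // 2."""
    mask = (1 << width) - 1
    prev = 0  # assumed initial bus state
    encoded = []
    for word in words:
        word &= mask
        toggles = popcount(word ^ prev)
        if toggles > width // 2:
            # More than half the lines would switch: send the inverted word instead.
            bus, flag = (~word) & mask, 1
        else:
            bus, flag = word, 0
        encoded.append((bus, flag))
        prev = bus
    return encoded


def dbi_decode(encoded, width=8):
    """Undo the inversion wherever the flag is set."""
    mask = (1 << width) - 1
    return [(~bus) & mask if flag else bus for bus, flag in encoded]


def count_toggles(bus_values, prev=0):
    """Total number of data-line transitions for a sequence of bus values."""
    total = 0
    for value in bus_values:
        total += popcount(value ^ prev)
        prev = value
    return total


words = [0x00, 0xFF, 0x00, 0xFF]
encoded = dbi_encode(words)
```

Note that `count_toggles` covers only the data lines; the flag line itself also switches, which a full comparison would have to include.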

Techniques for Yield Prediction from Corn Aerial Images - A Neural Network Approach -

Zhang, Q.;Panigrahi, S.;Panda, S.S.;Borhan, Md.S.
- Agricultural and Biosystems Engineering, Vol. 3, No. 1, pp. 18-28, 2002
Neural-network-based models were developed and evaluated for predicting corn yield from aerial images based on 1998 and 1994 image data. The models used images in multispectral bands: R, G, B, and IR (red, green, blue, and infrared). The inputs to the neural network consisted of the mean and standard deviation of the multispectral bands of the aerial images. The performance of several neural network architectures using back-propagation with momentum was compared. The maximum yield prediction accuracy obtained was 97.81%. The prediction accuracy of the BPNN model could be enhanced by supplying more observations to the model, by applying other data transformation techniques, or by performing optical calibration of the aerial images.

Efficient VLSI architecture for one-dimensional discrete wavelet transform using a scalable data reorder unit

Park, Taegeun
- 대한전자공학회:학술대회논문집 (Institute of Electronics Engineers of Korea: Conference Proceedings), 2002 ITC-CSCC -1, pp. 353-356, 2002
In this paper, we design an efficient, scalable one-dimensional discrete wavelet transform (1D DWT) filter using a data reorder unit (DRU). The required hardware is optimized by sharing multipliers and adders, because decimation reduces the input rate by a factor of two at each level. The proposed architecture achieves 100% hardware utilization by balancing hardware with the input rate. Furthermore, sharing the coefficients of the highpass and lowpass filters using the mirror filter property reduces the number of multipliers and adders by half. We designed a scalable DRU that efficiently reorders and feeds inputs to the highpass and lowpass filters. The proposed DRU-based architecture is so regular and scalable that it can be easily extended to an arbitrary 1D DWT structure with M taps and J levels. Compared to other architectures, the proposed DWT filter performs efficiently with relatively less hardware.
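Two properties mentioned in this abstract, the halving of the input rate at each level and the derivation of the highpass filter from the lowpass one through the mirror filter property, can be shown in a small software sketch. This is not the VLSI architecture or its data reorder unit: it is a plain multi-level 1D DWT with no boundary extension, and the Haar filter is chosen here only as an illustrative 2-tap example.

```python
import math


def mirror_highpass(lowpass):
    """Quadrature mirror relation: g[k] = (-1)^k * h[M - 1 - k]."""
    taps = len(lowpass)
    return [((-1) ** k) * lowpass[taps - 1 - k] for k in range(taps)]


def dwt_level(signal, lowpass):
    """One analysis level: filter, then keep every second output (decimation)."""
    highpass = mirror_highpass(lowpass)
    taps = len(lowpass)
    approx, detail = [], []
    for start in range(0, len(signal) - taps + 1, 2):  # step of 2 halves the rate
        window = signal[start:start + taps]
        approx.append(sum(h * x for h, x in zip(lowpass, window)))
        detail.append(sum(g * x for g, x in zip(highpass, window)))
    return approx, detail


def dwt(signal, lowpass, levels):
    """J-level transform: each level processes the previous approximation."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = dwt_level(approx, lowpass)
        details.append(detail)
    return approx, details


haar = [1 / math.sqrt(2), 1 / math.sqrt(2)]
approx, details = dwt([4, 4, 2, 2, 6, 6, 8, 8], haar, levels=2)
```

Because level j handles only half as many samples as level j-1, the per-level workload shrinks geometrically, which is the property the abstract uses to justify sharing multipliers and adders across levels.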
