• Title/Abstract/Keywords: Memory reduction

469 search results (processing time: 0.027 s)

Parallel Implementations of Digital Focus Indices Based on Minimax Search Using Multi-Core Processors

  • HyungTae, Kim;Duk-Yeon, Lee;Dongwoon, Choi;Jaehyeon, Kang;Dong-Wook, Lee
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 17, No. 2
    • /
    • pp.542-558
    • /
    • 2023
  • A digital focus index (DFI) is a value used to determine image focus in scientific apparatus and smart devices. Automatic focus (AF) is an iterative and time-consuming procedure; however, its processing time can be reduced using a graphics processing unit (GPU) and a multi-core processor (MCP). In this study, parallel architectures of a minimax search algorithm (MSA) are applied to two DFIs: the range algorithm (RA) and image contrast (CT). Both DFIs are based on a histogram; however, parallel computation of a histogram is conventionally inefficient because of bank conflicts in shared memory. The parallel architectures of RA and CT are therefore constructed using parallel reduction for the MSA, which rates image pixel pairs against each other in parallel and halves the number of candidates at every step. The array size eventually decreases to one, and the minimax is determined at the final reduction. Kernels for the architectures are constructed using open-source software to make them relatively platform independent. The kernels are tested on a hexa-core PC and an embedded device using Lenna images of various sizes based on the resolutions of industrial cameras. The performance of the kernels for the DFIs was investigated in terms of processing speed and computational acceleration; the maximum acceleration was 32.6× in the best case, and the MCP exhibited higher performance.
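The pairwise halving described in the abstract above can be illustrated with a minimal sequential sketch. This is not the authors' kernel: the function names `minimax_reduce`, `range_index`, and `contrast_index` are hypothetical, the inner loop stands in for what would be one thread per pair in a parallel kernel, and the contrast formula (max − min)/(max + min) is an assumed Michelson-style definition that the abstract does not spell out.

```python
def minimax_reduce(pixels):
    """Return (min, max) of a non-empty sequence by pairwise halving.

    Each pass compares element i with element i + half; the comparisons
    within a pass are independent, so a parallel kernel could assign one
    thread to each i. Here they run sequentially for illustration.
    """
    if not pixels:
        raise ValueError("pixels must be non-empty")
    lo = list(pixels)  # working array for the running minima
    hi = list(pixels)  # working array for the running maxima
    n = len(lo)
    while n > 1:
        half = (n + 1) // 2  # round up so an unpaired middle element survives
        for i in range(n // 2):
            j = i + half
            if lo[j] < lo[i]:
                lo[i] = lo[j]
            if hi[j] > hi[i]:
                hi[i] = hi[j]
        n = half  # the active array size is halved at every step
    return lo[0], hi[0]


def range_index(pixels):
    """Range-style focus index: max - min (assumed definition)."""
    lo, hi = minimax_reduce(pixels)
    return hi - lo


def contrast_index(pixels):
    """Contrast-style focus index: (max - min) / (max + min) (assumed definition)."""
    lo, hi = minimax_reduce(pixels)
    total = hi + lo
    return (hi - lo) / total if total else 0.0
```

For example, `minimax_reduce([7, 3, 9, 1, 5])` needs three passes (array sizes 5, 3, 2) before the single remaining slot holds the minimum and maximum.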

Reproductive Toxicity of DA-125, a New Anthracycline Anticancer Agent: Peri- and Postnatal Study in Rats

  • 정문구;이순복;한상섭;노정구
    • Biomolecules & Therapeutics
    • /
    • Vol. 3, No. 1
    • /
    • pp.38-46
    • /
    • 1995
  • DA-125, a new anthracycline antitumor antibiotic, was administered intravenously at dose levels of 0, 0.04, 0.2 and 1.0 mg/kg/day to Sprague-Dawley rats during late pregnancy and lactation, from day 17 of gestation to day 21 of lactation. The effects of the test agent on the general toxicity of dams and on the growth, behaviour and mating performance of F1 offspring were examined. At 1 mg/kg, one of the twenty-two dams showed difficult delivery, characterized by a stillbirth. Reduced body weight, reduced food intake, and decreased spleen weight were also observed in dams. In addition, a lower rate of successful performance in the memory test (28.6%) and necrosis of the tail end (9.5%) were seen in F1 offspring. At 0.04 and 0.2 mg/kg, no toxic effect on dams or F1 offspring was observed. There were no malformed F1 or F2 fetuses in any group. The results indicate that the no-effect dose levels (NOELs) of DA-125 are 0.2 mg/kg/day for dams and F1 offspring, and over 1 mg/kg/day for F2 fetuses.

Turbine Alignment (II): Computer Program Development

  • 황철호;김정태;전오성;이현;이병준
    • Journal of KSNVE
    • /
    • Vol. 4, No. 1
    • /
    • pp.33-42
    • /
    • 1994
  • When vibration is caused by misalignment, the vibration level cannot be reduced unless the shaft is correctly aligned. In a turbine system, the alignment procedure requires considerable expense and time. To reduce this effort, a turbine alignment algorithm is developed for use in a computer program. The program consists of five parts: input, calculation, display of the results, file management, and printer output. In the input part, users must provide the turbine number, the reference value of the alignment, and the number of feet of the generator. In the calculation part, the moving distance of the bearing and the required amount of shims are calculated. In the display and output parts, the calculated results are displayed and printed. Then, through file management, the results and the procedures conducted are saved to a floppy diskette or the hard disk. The developed program runs on an IBM PC compatible with more than 640 KB of main memory under MS-DOS v3.3 or higher. It is developed for novice users with no experience or specialty in this field. The program is useful not only in power plant applications but also for recording alignment procedures.

An Efficient Hybrid Diagnosis Algorithm for Sequential Circuits

  • 김지혜;이주환;강성호
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • Vol. 41, No. 5
    • /
    • pp.51-60
    • /
    • 2004
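  • As advances in semiconductor technology increase circuit density and complexity, faults occur more frequently during chip manufacturing. Finding and analyzing the causes of faults is very important for improving chip yield and reducing production cost. However, locating the fault takes a large share of the time spent analyzing its cause. Gate-level fault location diagnosis is meaningful because it narrows the fault range at the physical level and thereby reduces the time needed to locate the fault. This paper proposes a fault diagnosis algorithm that combines a new fault dictionary method with additional fault simulation, performing diagnosis effectively by minimizing memory consumption while shortening simulation time.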

Low-Power Systolic Array Viterbi Decoder Implementation with a Clock-Gating Method

  • 류제혁;조준동
    • The KIPS Transactions: Part A
    • /
    • Vol. 12A, No. 1
    • /
    • pp.1-6
    • /
    • 2005
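  • This paper introduces a new algorithm for a low-power survivor memory implementation of the trace-back systolic array Viterbi algorithm. The key idea is to reuse already generated trace-back routes in order to reduce the number of trace-back operations. In addition, a gated clock is applied to the regions of the trace-back unit where unnecessary switching activity occurs, which reduces power consumption. Power consumption was measured with Design Power, the Synopsys power estimation tool; compared with the trace-back unit presented in [1], the proposed design showed an average power reduction of 40% and an area increase of 23%.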

Data Input/Output Time Reduction Scheme with the Simultaneous Transmission Method for Multi-participants Video Conference System

  • 김현기
    • Journal of Korea Multimedia Society
    • /
    • Vol. 3, No. 3
    • /
    • pp.234-240
    • /
    • 2000
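  • This paper proposes a method by which a multimedia data stream can be transmitted over the existing system bus from the network interface to main memory and to the multimedia processing unit simultaneously, as the same data. The proposed method improves the data flow inside the system bus and can shorten the input/output time of multimedia data. The method is also applied to a multi-participant video conference system, and the number of system bus uses, the bus cycles, and the data transfer time as a function of the number of participants are compared with those of the conventional method. The comparison indicates that, regardless of the number of participants, the proposed method is expected to reduce the number of system bus uses by 50% and the transfer time by 75% relative to the conventional method.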

Study on Noise Reduction of Plasma Display Panel

  • 박대경;권해섭;장동섭
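    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • Proceedings of the Korean Society for Noise and Vibration Engineering 2002 Autumn Conference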
    • /
    • pp.693-698
    • /
    • 2002
  • To evaluate the noise of plasma display panels (PDPs), the vibration and sound characteristics of a fanless PDP are measured and investigated. A PDP is a type of two-electrode vacuum tube that operates on the same principle as a household fluorescent light. An inert gas such as argon or neon is injected between two glass plates on which transparent electrodes have been formed, and the glass is illuminated by generating a discharge. This discharge requires both high voltage and high current, which cause acoustic noise. We investigated the noise characteristics associated with both the electromagnetic elements, from the SMPS to the panel through the X, Y and logic boards, and the mechanical elements, from the panel to the case through the transfer path related to vibration and heat. To reduce the noise of a PDP, a discharge pulse memory design that provides both higher brightness and lower power consumption is important, and the mechanical characteristics connected with the dissipation of both heat and vibration generated by the panel discharge must be investigated.

Parallel processing in structural reliability

  • Pellissetti, M.F.
    • Structural Engineering and Mechanics
    • /
    • Vol. 32, No. 1
    • /
    • pp.95-126
    • /
    • 2009
  • The present contribution addresses the parallelization of advanced simulation methods for structural reliability analysis, which have recently been developed for large-scale structures with a high number of uncertain parameters. In particular, the Line Sampling method and the Subset Simulation method are considered. The proposed parallel algorithms exploit the parallelism that arises from the possibility of performing independent FE analyses simultaneously. For the Line Sampling method, a parallelization scheme is proposed both for the actual sampling process and for the statistical gradient estimation method used to identify the so-called important direction of the Line Sampling scheme. Two parallelization strategies are investigated for the Subset Simulation method: the first consists of the embarrassingly parallel advancement of distinct Markov chains; in this case the speedup is bounded by the number of chains advanced simultaneously. The second parallel Subset Simulation algorithm utilizes the concept of speculative computing. Speedup measurements for the FE model of a multistory building (24,000 DOFs) show the reduction of the wall-clock time to a very viable amount (less than 10 minutes for Line Sampling and approximately 1 hour for Subset Simulation). The measurements, conducted on clusters of multi-core nodes, also indicate a strong sensitivity of the parallel performance to the load level of the nodes, in terms of the number of simultaneously used cores. This performance degradation is related to memory bottlenecks during the modal analysis required in each FE analysis.
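The bound on speedup for the first Subset Simulation strategy can be made concrete with a small idealized sketch. It assumes that every chain step costs the same and that there is no communication overhead or memory contention, which the abstract itself notes is not true in practice; the function name `chain_parallel_speedup` and its parameters are hypothetical.

```python
import math


def chain_parallel_speedup(n_chains, n_workers):
    """Ideal speedup when independent Markov chains are advanced in parallel.

    Assumption: each chain advance takes one unit of time, so the parallel
    run needs ceil(n_chains / n_workers) rounds instead of n_chains.
    """
    if n_chains < 1 or n_workers < 1:
        raise ValueError("n_chains and n_workers must be positive")
    rounds = math.ceil(n_chains / n_workers)
    return n_chains / rounds
```

Under these assumptions, `chain_parallel_speedup(10, 16)` returns 10.0: adding workers beyond the number of chains gives no further gain, which is the bound stated in the abstract.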

The Advanced Rasterizer and Cache Memory Architecture for Latency Reduction of 3D GPU

  • 박진홍;김일산;박우찬;한탁돈
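    • Proceedings of the Korean Information Science Society Conference
    • /
    • Proceedings of the Korean Information Science Society 2005 Korea Computer Congress, Vol. 32, No. 1 (A)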
    • /
    • pp.727-729
    • /
    • 2005
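  • A current obstacle to improving the performance of 3D graphics accelerators is the access latency of the frame buffer, which stores the information actually drawn on the screen. In a conventional rasterizer that includes a pixel cache, a cache read miss incurs a penalty and a resulting frame buffer delay. To address this problem, this paper splits the conventional rasterizer into a rasterizer and a compositor and places between them a pixel cache memory system, consisting of a paired depth cache and color cache, that does not read from the frame buffer on a cache read miss. This improved 3D graphics accelerator architecture is proposed and evaluated experimentally. Compared with the conventional architecture, the proposed architecture reduced the cache miss rate by about 23% and the average memory access cycles by 10% to 13%, which removes a substantial share of the frame buffer access latency. The bandwidth between the compositor and memory increases by about 10%, but this does not affect pipeline operation.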

Nonlinear dynamic analysis of RC frames using cyclic moment-curvature relation

  • Kwak, Hyo-Gyoung;Kim, Sun-Pil;Kim, Ji-Eun
    • Structural Engineering and Mechanics
    • /
    • Vol. 17, No. 3-4
    • /
    • pp.357-378
    • /
    • 2004
  • Nonlinear dynamic analysis of a reinforced concrete (RC) frame under earthquake loading is performed in this paper on the basis of a hysteretic moment-curvature relation. Previous analytical moment-curvature relations take into account only the flexural deformation under the perfect-bond assumption. By introducing an equivalent flexural stiffness, the proposed relation also considers the rigid-body motion due to anchorage slip at the fixed end, which accounts for more than 50% of the total deformation. The advantages of the proposed relation, compared with both the layered section approach and the multi-component model, are the ease of its application to a complex structure composed of many elements and the reduction in calculation time and memory space. The structural response can be described more exactly through the use of curved unloading and reloading branches inferred from the stress-strain relation of steel and through consideration of the pinching effect caused by axial force. Finally, the applicability of the proposed model to the nonlinear dynamic analysis of RC structures is established through correlation studies between analytical and experimental results.