Search | Korea Science

Performance Comparison of Synchronization Methods for CC-NUMA Systems (CC-NUMA 시스템에서의 동기화 기법에 대한 성능 비교)

Moon, Eui-Sun;Jhang, Seong-Tae;Jhon, Chu-Shik
- Journal of KIISE: Computer Systems and Theory / v.27 no.4 / pp.394-400 / 2000
The main goal of synchronization is to guarantee exclusive access to shared data and critical sections so that parallel programs work correctly and reliably. Because exclusive access restricts the parallelism of parallel programs, efficient synchronization is essential to achieving high performance in shared-memory parallel programs. Many techniques that exploit features of systems and applications have been devised for efficient synchronization. This paper presents simulation results showing that existing synchronization methods are inefficient on CC-NUMA (Cache Coherent Non-Uniform Memory Access) systems, and then compares their performance with that of Freeze&Melt synchronization, which can remove the inefficiency. The simulation results show that Test-and-Test&Set synchronization suffers from inefficiency caused by broadcast operations, and that the predefined order in which Queue-On-Lock-Bit (QOLB) synchronization executes critical sections also causes inefficiency. Freeze&Melt synchronization, which removes these inefficiencies, gains performance by decreasing both the waiting time to enter a critical section and the execution time of a critical section, and by reducing the traffic between clusters.
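As a rough illustration of the Test-and-Test&Set method named in this abstract, the sketch below shows the general textbook form of such a lock. It is not taken from the paper: `AtomicFlag`, `TTASLock`, and `run_counter` are hypothetical names, and the hardware test-and-set instruction is emulated with a `threading.Lock`. On real hardware, every waiter spins on a cached copy of the flag, so a release invalidates all of those copies at once, which is presumably the broadcast cost the abstract refers to.

```python
import threading
import time


class AtomicFlag:
    """Stand-in for a hardware test-and-set bit (assumption, not from the paper)."""

    def __init__(self):
        self._value = False
        self._guard = threading.Lock()  # emulates the atomicity of the instruction

    def read(self):
        # Plain read: on real hardware this would be served from the local cache.
        return self._value

    def test_and_set(self):
        # Atomically set the flag and return its previous value.
        with self._guard:
            old = self._value
            self._value = True
            return old

    def clear(self):
        with self._guard:
            self._value = False


class TTASLock:
    def __init__(self):
        self._flag = AtomicFlag()

    def acquire(self):
        while True:
            # Test phase: spin on reads until the lock looks free.
            while self._flag.read():
                time.sleep(0)  # yield so the current holder can make progress
            # Test&Set phase: try to take the lock; retry if another thread won.
            if not self._flag.test_and_set():
                return

    def release(self):
        self._flag.clear()


def run_counter(num_threads=4, increments=200):
    """Increment a shared counter under the lock and return the final value."""
    lock = TTASLock()
    total = [0]

    def worker():
        for _ in range(increments):
            lock.acquire()
            total[0] += 1  # critical section
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

Calling `run_counter(4, 200)` should return 800 if mutual exclusion holds. The sketch says nothing about QOLB or Freeze&Melt, whose details are not given in the abstract.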

Design and Implementation of a Massively Parallel Multithreaded Architecture: DAVRID

Sangho Ha;Kim, Junghwan;Park, Eunha;Yoonhee Hah;Sangyong Han;Daejoon Hwang;Kim, Heunghwan;Seungho Cho
- Journal of Electrical Engineering and Information Science / v.1 no.2 / pp.15-26 / 1996
MPAs (Massively Parallel Architectures) should address two fundamental issues for scalability: synchronization and communication latency. Dataflow architectures face problems of excessive synchronization overhead and inefficient execution of sequential programs, although they offer the ability to exploit the massive parallelism inherent in programs. In contrast, MPAs based on the von Neumann computational model may suffer from inefficient synchronization mechanisms and communication latency. DAVRID (DAtaflow/Von Neumann RISC hybrID) is a massively parallel multithreaded architecture that takes advantage of both the von Neumann and dataflow models. It has good single-thread performance and tolerates synchronization and communication latency. In this paper, we describe the DAVRID architecture in detail and evaluate its performance through simulation runs over several benchmarks.

A New Synchronization Scheme for Parallel Processing of Loop with Constant and Variable Dependence Distance (불변 및 가변 종속거리를 갖는 루프의 병렬처리를 위한 새로운 동기화 기법)

이광형;황종선;박두순
- Journal of the Korean Institute of Telematics and Electronics B / v.32B no.5 / pp.693-701 / 1995
In most application programs, loops account for most of the computation and are the most important source of parallelism. When loops are executed on multiprocessors, cross-iteration data dependences must be enforced by synchronization between processors. Existing synchronization schemes have mainly been studied for loops with constant dependence distance. When these schemes are applied to loops with variable dependence distance, they incur considerable overhead from unnecessary synchronization variables and the execution of useless synchronization instructions. Although various synchronization schemes for variable dependence distance exist, they have high run-time overhead for computing synchronization information. In this paper, we present a new synchronization scheme, Synch-Free/Synch-Hold, for managing synchronization efficiently in loops with constant and variable dependence distance.
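To make the idea of enforcing a cross-iteration dependence concrete, here is a minimal sketch of a loop with constant dependence distance run across several threads. It uses a naive post/wait scheme with one event per iteration and is not the Synch-Free/Synch-Hold scheme, whose details the abstract does not give. The function name `doacross`, the loop body `a[i] = a[i - distance] + 1`, and the cyclic assignment of iterations to workers are all assumptions made for illustration.

```python
import threading


def doacross(n, distance, num_workers=3):
    """Run a[i] = a[i - distance] + 1 for i in [distance, n) on several threads."""
    if distance < 1:
        raise ValueError("distance must be at least 1")

    a = [0] * n
    # One synchronization variable per iteration (the naive approach).
    done = [threading.Event() for _ in range(n)]
    for i in range(min(distance, n)):
        done[i].set()  # the first `distance` elements are inputs, already available

    def worker(worker_id):
        # Cyclic assignment: worker k runs iterations distance + k, distance + k + p, ...
        for i in range(distance + worker_id, n, num_workers):
            done[i - distance].wait()   # wait until the source iteration has posted
            a[i] = a[i - distance] + 1  # loop body carrying the dependence
            done[i].set()               # post: iteration i is complete

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a
```

With this loop body, element `i` ends up equal to `i // distance`, which gives a simple check that the dependence order was respected. Allocating one event per iteration is the kind of synchronization-variable overhead that more refined schemes aim to reduce.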

An Improving Method of Restructuring Parallel Programs for Data Race Detection

Ha, Keum-Sook;Lee, Sung woo;Yoo, Kee-Young
- Proceedings of the IEEK Conference / 2000.07b / pp.715-718 / 2000
Although shared-memory parallel programs are designed to be deterministic both in their final results and in their intermediate states, races that occur when different processes access a common memory location in an order not guaranteed by synchronization can result in unintended non-deterministic executions of the program. Detecting races, particularly first data races, is therefore important for debugging explicit shared-memory parallel programs. It is possible that all data races reported by other on-the-fly algorithms would disappear once the first races were removed. To detect races in parallel programs with nested loops and inter-thread coordination, the order of synchronization operations in an execution instance must be guaranteed. In this paper, we propose an improved restructuring method that guarantees the ordering within an execution instance and preserves the semantics of the original program. This method requires O(np) time and (s + up) space, where n is the total number of operations, s is the number of synchronization operations, and p is the degree of parallelism in the execution. This method also makes on-the-fly race detection for parallel programs with nested loops and inter-thread coordination easier in terms of space and time complexity.

A New Synchronization Scheme for Parallel Processing on Perfectly Nested Do Loops (완전 중첩 루프에서 병렬처리를 위한 새로운 동기화 기법)

이광형;황종선;박두순;김병수
- Journal of the Korean Institute of Telematics and Electronics B / v.31B no.10 / pp.1-10 / 1994
In most application programs, loops contain most of the computation and are the most important source of parallelism. When loops are executed on multiprocessors, cross-iteration data dependences must be enforced by synchronization between processors. In this paper, we propose a new synchronization scheme (Free/Hold) that reduces the overhead incurred by synchronization variables in data-oriented schemes and the delay incurred by synchronization instructions in statement-oriented schemes. The Free/Hold mechanism enforces the correct execution order by inserting synchronization instructions between instances that have a data dependence relationship, using the RD (Real dependence Distance). We also present an algorithm for removing unnecessary dependences in one-to-many dependences.

An Efficient Tool for Verifying Races in OpenMP Directive Programs without Interthread Synchronization (스레드 동기화가 없는 OpenMP 디렉티브 프로그램을 위한 효율적인 경합검증 도구)

Ha, Ok-Kyoon;Kang, Moon-Hye;Kim, Young-Joo;Jun, Yong-Ki
- Journal of KIISE: Computing Practices and Letters / v.14 no.3 / pp.301-305 / 2008
Races must be detected when debugging OpenMP programs with directives, because they may cause unintended nondeterministic results. Intel Thread Checker, an existing tool that can detect races, cannot verify the existence of races, is often time-consuming, and tends to require a large amount of space. To solve these problems, we developed a tool that verifies the existence of races using user requirements and an analyzed model of the program. However, that tool does not achieve optimal performance for programs that have no synchronization for interthread coordination. This paper presents an optimal tool that applies the optimum labeling and protocol for program models without interthread coordination. For synthetic programs without interthread synchronization, the tool verifies races over 250 times faster than the previous tool on average, even as the maximum parallelism increases, in every case in which the total number of accesses is identical.

Analysis of Barrier Waiting Times in Data Parallel Programs (데이터 병렬 프로그램에서 배리어 대기시간의 분석)

Jung, In-Bum
- Journal of Industrial Technology / v.21 no.A / pp.73-80 / 2001
Barriers are widely used for synchronization in parallel programs. Since processes that arrive earlier than others must wait at the barrier, total processor utilization decreases. In this paper, to find the sources of barrier waiting time, parallel programs are executed with various grain sizes through execution-driven simulations. In these simulation studies, we found that even if approximately equal amounts of work are distributed to each processor, the processes may not all arrive at a barrier at the same time. The reason is that different numbers of cache misses and instructions within the partitioned grains lead to differences in the arrival times of processors at the barrier.
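The relationship between arrival times and barrier waiting time described in this abstract can be written down as a small calculation. The sketch below is only an illustration under simplifying assumptions: every process starts at time 0, is busy until it reaches the barrier, and is released the instant the last process arrives. The function names `barrier_wait_times` and `utilization` and the sample arrival times are hypothetical and do not come from the paper.

```python
def barrier_wait_times(arrival_times):
    """Return how long each process idles at the barrier.

    The barrier opens when the slowest process arrives, so each process
    waits for (latest arrival - its own arrival).
    """
    release = max(arrival_times)
    return [release - t for t in arrival_times]


def utilization(arrival_times):
    """Fraction of total processor time spent on useful work before the barrier."""
    release = max(arrival_times)
    if release == 0:
        return 1.0  # nothing to do, so nothing is wasted
    busy = sum(arrival_times)              # each process is busy until it arrives
    total = release * len(arrival_times)   # all processes are held until release
    return busy / total


# Hypothetical arrival times for four processes given roughly equal work.
waits = barrier_wait_times([10, 12, 9, 12])
```

Here `waits` is `[2, 0, 3, 0]`: even small differences in arrival time, which the abstract attributes to differing cache misses and instruction counts, leave the early processes idle until the last one arrives.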

An Adaptive Multimedia Synchronization Scheme for Media Stream Delivery in Multimedia Communication (멀티미디어 통신에서 미디어스트림 전송을 위한 적응형 멀티미디어 동기화 기법)

Lee, Gi-Sung
- The KIPS Transactions: Part C / v.9C no.6 / pp.953-960 / 2002
Real-time application programs have constraints that must be met between media data. Client-led synchronization absorbs variable transmission delay and synchronizes media through feedback control and playout control. Whether the buffer level is normal is an important factor for the playback rate and QoS. In this paper, the multimedia server's transmission is controlled by applying feedback from a filtering function so that the buffer stays in its normal state. The synchronization method also adapts the playout time so that presentation remains smooth, without cut-off, even when media frames are skipped. When the audio frame, which is the master media, is above the upper threshold buffer level, the playout time is decreased gradually; when it is below the lower threshold buffer level, the playout time is increased slowly.
https://doi.org/10.3745/KIPSTC.2002.9C.6.953
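The threshold rule at the end of this abstract can be sketched as a single adjustment step. This is a guess at the general shape of such a rule, not the paper's algorithm: the function name `adjust_playout_interval`, the fixed step size, the lower bound `min_ms`, and the numeric values in the example are all assumptions.

```python
def adjust_playout_interval(interval_ms, buffer_level, low, high,
                            step_ms=1.0, min_ms=1.0):
    """Return the next playout interval for the master (audio) stream.

    buffer_level, low and high share one unit (for example, buffered frames).
    """
    if buffer_level > high:
        # Buffer is filling up: play slightly faster to drain it gradually.
        return max(min_ms, interval_ms - step_ms)
    if buffer_level < low:
        # Buffer is running low: play slightly slower to let it refill.
        return interval_ms + step_ms
    # Buffer level is in the normal range: leave the interval unchanged.
    return interval_ms


# Hypothetical values: 40 ms nominal interval, thresholds at 20 and 80 frames.
faster = adjust_playout_interval(40.0, buffer_level=90, low=20, high=80)
slower = adjust_playout_interval(40.0, buffer_level=10, low=20, high=80)
```

With these values `faster` is 39.0 and `slower` is 41.0. Changing the interval in small steps, rather than jumping straight to a target, is one way to read "gradually" and "slowly" in the abstract.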

A Study on an Adaptive Multimedia Synchronization Scheme for Media Stream Transmission (미디어 스트림 전송을 위한 적응형 멀티미디어 동기화 기법에 관한 연구)

지정규
- Journal of the Korea Computer Industry Society / v.3 no.9 / pp.1251-1260 / 2002
Real-time application programs have synchronization constraints that must be met between media data. The proposed synchronization method is a feedback method that includes a virtual client-side buffer, which is used in the buffer-level method. Client-led synchronization absorbs variable transmission delay and synchronizes media through feedback control. Whether the buffer level is normal is an important factor for the playback rate and QoS. To address this, the start of transmission at the multimedia server is controlled by applying filtering, control, and network evaluation functions. The synchronization method provides smooth presentation without cut-off while media is playing out. When the audio frame, which is the master media, is at the high threshold buffer level, the playout time is decreased gradually; otherwise it is increased slowly.

Adaptive Multimedia Synchronization Using Waiting Time (대기시간을 이용한 적응형 멀티미디어 동기화 기법)

Lee, Gi-Seong;Lee, Geun-Wang;Lee, Jong-Chan;O, Hae-Seok
- The Transactions of the Korea Information Processing Society / v.7 no.2S / pp.649-655 / 2000
Real-time application programs have constraints that must be met between media data. These constraints represent the delay time and quality of service between the media data to be presented. To describe the delay time and quality of service efficiently, a new synchronization mechanism is needed. This paper proposes a dynamic synchronization method that minimizes the effects of variable transmission delay. That is, the method meets the synchronization requirements between media data by dynamically handling the adaptive waiting time that results from variations in delay time. In addition, the mechanism adjusts the synchronization interval using the maximum delay jitter time. By applying delay jitter to synchronization interval adjustment, the method decreases the data loss that results from variations in delay time and from the loss time of media data. The mechanism also adaptively manages the waiting time of the smoothing buffer, which minimizes the gap caused by variations in delay time. The proposed method is suitable for systems that require a guarantee of high quality of service, and it improves quality of service through a lower loss rate and a higher playout rate.
