• Title/Summary/Keyword: CPU bandwidth

A Prioritized Task Scheduling Method in Multimedia Systems for MPEG-2 Decoding (MPEG-2 디코딩을 위한 멀티미디어 시스템에서 우선순위에 의한 태스크 스케쥴링 기법)

  • Kim Jinhwan
    • The KIPS Transactions:PartB / v.12B no.2 s.98 / pp.173-180 / 2005
  • In this paper, we propose an efficient real-time scheduling method for multimedia tasks that decode the frames of MPEG-2 video streams. In our task model, each frame is decoded by a separate multimedia task. The decoding task for each frame is assigned a priority according to the precedence and importance of that frame in the video stream. We use a priority-based scheduling policy to allocate CPU bandwidth effectively to the multimedia tasks for MPEG-2 decoding, and we show how to dynamically control the fraction of CPU bandwidth allocated to each task according to its priority. The primary purpose of our scheduling method is to enhance the real-time performance of the multimedia system by minimizing the number of decoding tasks that miss their deadlines while reducing the decoding times of these tasks. The performance of the scheduling method is compared with that of similar mechanisms through simulation experiments.
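
The abstract does not give the priority rule or the allocation formula, so the following is only a minimal sketch of the general idea, under stated assumptions: priorities follow frame type (I above P above B, since P and B frames depend on earlier frames), and each task's share of CPU bandwidth is proportional to a priority weight. The names `FRAME_WEIGHT`, `DecodeTask`, and `cpu_shares`, and the weights themselves, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

# Hypothetical priority weights (assumption, not from the paper):
# I frames are needed to decode P and B frames, so they weigh most.
FRAME_WEIGHT = {"I": 3, "P": 2, "B": 1}


@dataclass
class DecodeTask:
    frame_id: int
    frame_type: str  # "I", "P", or "B"


def cpu_shares(tasks):
    """Return {frame_id: fraction of CPU bandwidth}, proportional to weight."""
    total = sum(FRAME_WEIGHT[t.frame_type] for t in tasks)
    return {t.frame_id: FRAME_WEIGHT[t.frame_type] / total for t in tasks}


tasks = [DecodeTask(0, "I"), DecodeTask(1, "P"), DecodeTask(2, "B")]
shares = cpu_shares(tasks)
for frame_id, share in shares.items():
    print(frame_id, round(share, 3))
```

Recomputing the shares whenever a task arrives or finishes would be one simple way to make the allocation dynamic; the paper's actual control mechanism may differ.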

Comparison of Parallel Computation Performances for 3D Wave Propagation Modeling using a Xeon Phi x200 Processor (제온 파이 x200 프로세서를 이용한 3차원 음향 파동 전파 모델링 병렬 연산 성능 비교)

  • Lee, Jongwoo;Ha, Wansoo
    • Geophysics and Geophysical Exploration / v.21 no.4 / pp.213-219 / 2018
  • In this study, we performed 3D wave propagation modeling on a Xeon Phi x200 processor and compared its parallel computation performance with that of a Xeon CPU. Unlike the 1st generation Xeon Phi coprocessor, codenamed Knights Corner, the 2nd generation Xeon Phi x200 processor requires no additional communication between the internal memory and the main memory, since it can run an operating system directly. The Xeon Phi x200 processor can run large-scale computations independently, using its large main memory and high-bandwidth memory. To compare parallel computation, we performed the modeling using the MPI (Message Passing Interface) and OpenMP (Open Multi-Processing) libraries. Numerical examples using the SEG/EAGE salt model demonstrated that modeling on the Xeon Phi, with its large number of computational cores and high-bandwidth memory, was 2.69 to 3.24 times faster than on the 12-core CPU.

An Optimal and Dynamic Monitoring Interval for Grid Resource Information Services (그리드 자원정보 서비스를 위한 최적화된 동적 모니터링 인터벌에 관한 연구)

  • Kim Hye-Ju;Huh Eui-Nam;Lee Woong-Jae;Park Hyoung-Woo
    • Journal of Internet Computing and Services / v.4 no.6 / pp.13-24 / 2003
  • Grid technology requires the use of geographically distributed resources from multiple domains. Resource monitoring services or tools consisting of sensors or agents run on many systems to collect static resource information (such as architecture vendor, OS name and version, MIPS rate, memory size, CPU capacity, disk size, and NIC information) and dynamic resource information (CPU usage, network usage (bandwidth, latency), memory usage, etc.). Monitoring itself may therefore cause system overhead. This paper proposes an optimal monitoring interval to reduce the cost of monitoring services and a dynamic monitoring interval to measure monitoring events accurately. By employing these two features, we find that unnecessary system overhead is significantly reduced while the accuracy of events is preserved.

Implementation of a TCP/IP Offload Engine Using High Performance Lightweight TCP/IP (고성능 경량 TCP/IP를 이용한 소프트웨어 기반 TCP/IP 오프로드 엔진 구현)

  • Jun, Yong-Tae;Chung, Sang-Hwa;Yoon, In-Su
    • Journal of KIISE:Computing Practices and Letters / v.14 no.4 / pp.369-377 / 2008
  • Ethernet technology is rapidly advancing from a bandwidth of 1 Gbps toward 10 Gbps. In such high-speed networks, the conventional approach, in which the host CPU processes TCP/IP inside the operating system, incurs substantial overhead. As a result, user applications cannot get enough computing power from the host CPU. The TCP/IP Offload Engine (TOE) technology emerged to solve this problem. A TOE is a specialized NIC that processes TCP/IP instead of the host CPU. In this paper, we implemented a high-performance, lightweight TCP/IP (HL-TCP) for the TOE and applied it to an embedded system. The HL-TCP supports the fundamental TCP/IP functions: flow control, congestion control, retransmission, delayed ACK, and processing of out-of-order packets. It was implemented to utilize the Ethernet MAC's hardware features, such as TCP segmentation offload (TSO), checksum offload (CSO), and interrupt coalescing. We also eliminated the copy overhead from the host memory to the NIC memory when sending data, and we implemented an efficient DMA mechanism for TCP retransmission. The TOE using the HL-TCP achieves a CPU utilization of less than 6% and a bandwidth of 453 Mbps.

A Partitioning Method of Balancing CPU Utilization of Servers in DVE (분산 가상 환경에서 균등 부하 분산을 위한 CPU 사용률 기반 파티션 분할)

  • Won, Dong-Kee;An, Dong-Un;Chung, Seung-Jong
    • Proceedings of the IEEK Conference / 2008.06a / pp.777-778 / 2008
  • The partitioning problem is one of the central issues in designing a good distributed virtual environment (DVE). A good partitioning method assigns avatars to suitable servers while balancing the growing requirements for bandwidth and computational resources in the DVE. In this paper, a new method, LCAA, is proposed. LCAA is a new partitioning method that specifically balances the CPU utilization of the servers in a DVE.

Design and Implementation of A Dual CPU Based Embedded Web Camera Streaming Server (Dual CPU 기반 임베디드 웹 카메라 스트리밍 서버의 설계 및 구현)

  • 홍진기;문종려;백승걸;정선태
    • Proceedings of the IEEK Conference / 2003.11a / pp.417-420 / 2003
  • Most embedded web camera server products currently on the market adopt JPEG to compress the video data continuously acquired from the cameras. However, JPEG does not compress a continuous video stream efficiently and is not appropriate for the Internet, where transmission bandwidth is not guaranteed. In our previous work, we presented the design and implementation of an embedded web camera streaming server using an MPEG-4 codec. That server did not perform well, however, because a single CPU had to handle both compression and network transmission. In this paper, we present our efforts to improve on the previous result by using dual CPUs, where a DSP is employed for data compression and a StrongARM is used for network processing. Better performance has been observed, but more time is still needed to optimize the performance.

Design of VCR Functions With MPEG Characteristics for VOD based on Multicast (멀티캐스트 기반의 VOD 시스템에서 MPEG의 특성을 고려한 VCR 기능의 설계)

  • Lee, Joa-Hyoung;Jung, In-Bum
    • The KIPS Transactions:PartC / v.16C no.4 / pp.487-494 / 2009
  • VOD (Video On Demand), which provides a streaming service in real time according to the user's request, consists of a video streaming server and client systems. The traditional server-client model, in which a server communicates with many clients through 1:1 connections, is very hard to apply to a VOD system because it requires very high network bandwidth, and much research has been done to address this problem. Batching is a multicast-based VOD technique that requires very little network bandwidth. However, a batching-based VOD system has the limitation that it is very hard to provide VCR (Video Cassette Recorder) functions. In this paper, we propose a technique that reduces the network bandwidth required to provide VCR functions by using the characteristics of MPEG, an international video compression standard. In the proposed technique, a new video stream for the VCR functions is constructed from I pictures, which can be decoded independently. The new video stream for the VCR functions is transmitted together with the video stream for normal play in a batching manner. The performance evaluation shows that the proposed technique not only reduces the required network bandwidth and memory usage but also decreases CPU usage.
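
As a rough illustration of building a VCR stream from I pictures, the sketch below filters a frame sequence down to its I pictures. The group-of-pictures pattern `IBBPBBPBBPBB` and the function name `i_picture_stream` are assumptions chosen for the example; the paper's actual stream construction is not described in the abstract.

```python
def i_picture_stream(frames):
    """Keep only I pictures, which decode without reference to other frames."""
    return [(idx, ftype) for idx, ftype in frames if ftype == "I"]


# Hypothetical GOP pattern (assumption): one I picture every 12 frames.
gop = "IBBPBBPBBPBB"
frames = list(enumerate(gop * 2))  # (frame index, frame type) pairs

vcr = i_picture_stream(frames)
print(vcr)
print(f"{len(vcr)} of {len(frames)} frames kept for the VCR stream")
```

Because every retained frame is independently decodable, a client can jump between them for fast-forward or rewind without receiving the intervening P and B pictures.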

A New Network Bandwidth Reduction Method of Distributed Rendering System for Scalable Display (확장형 디스플레이를 위한 분산 렌더링 시스템의 네트워크 대역폭 감소 기법)

  • Park, Woo-Chan;Lee, Won-Jong;Kim, Hyung-Rae;Kim, Jung-Woo;Han, Tack-Don;Yang, Sung-Bong
    • Journal of KIISE:Computer Systems and Theory / v.29 no.10 / pp.582-588 / 2002
  • Scalable displays generate large, high-resolution images and provide an immersive environment. Recently, scalable displays have been built on networked clusters of PCs, each of which has a fast graphics accelerator, memory, CPU, and storage. However, distributed rendering on clusters is network-bound because of limited network bandwidth. In this paper, we present a new algorithm for reducing the network bandwidth and implement it in a conventional distributed rendering system. The algorithm, called geometry tracking, avoids redundant geometry transmission by indexing geometry data. The experimental results show that our algorithm reduces the network bandwidth by up to 42%.
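
The abstract says only that geometry tracking avoids redundant transmission by indexing geometry data. One way this could work, sketched below purely as an assumption, is for the sender to remember which geometry IDs each rendering node already holds and to send a short reference instead of the full data on repeats. The class `GeometrySender` and its message format are hypothetical.

```python
class GeometrySender:
    """Tracks which geometry each node already holds (illustrative only)."""

    def __init__(self):
        self.sent = {}  # node name -> set of geometry ids already transmitted

    def message_for(self, node, geom_id, vertices):
        seen = self.sent.setdefault(node, set())
        if geom_id in seen:
            # Node already has this geometry: send only its index.
            return ("ref", geom_id)
        seen.add(geom_id)
        # First transmission to this node: send the full vertex data.
        return ("data", geom_id, vertices)


sender = GeometrySender()
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
first = sender.message_for("pc1", 7, triangle)
second = sender.message_for("pc1", 7, triangle)
other = sender.message_for("pc2", 7, triangle)
print(first[0], second, other[0])
```

The saving comes from the second and later frames, where a small reference replaces the vertex payload; tracking is per node because each PC in the cluster caches geometry separately.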

GPU Based Incremental Connected Component Processing in Dynamic Graphs (동적 그래프에서 GPU 기반의 점진적 연결 요소 처리)

  • Kim, Nam-Young;Choi, Do-Jin;Bok, Kyoung-Soo;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.22 no.6 / pp.56-68 / 2022
  • Recently, as the demand for real-time processing has increased, dynamic graphs that change over time have been actively studied. Connected component processing is one of the algorithms used to analyze dynamic graphs. GPUs are suitable for large-scale graph computation because of their high memory bandwidth and computational performance. However, when the connected components of a dynamic graph are computed on the GPU, frequent data exchange occurs between the CPU and the GPU during actual graph processing because of the limited memory of the GPU. The proposed scheme utilizes the Weighted-Quick-Union algorithm to process large-scale graphs on the GPU. It supports fast connected component computation by incorporating the size into the connected component label. It computes the connected components by determining which parts must be recalculated and minimizing the data to be transmitted to the GPU. In addition, we propose a processing structure in which the GPU and the CPU execute asynchronously to reduce the data transfer time between them. We demonstrate the effectiveness of the proposed scheme through a performance evaluation using real datasets.
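
For readers unfamiliar with weighted quick-union, the following is a minimal CPU-only sketch of the textbook algorithm, processing edge insertions one at a time. It is not the paper's GPU scheme; the class name and the small example graph are chosen only for illustration.

```python
class WeightedQuickUnion:
    """Textbook weighted quick-union: attach the smaller tree under the larger."""

    def __init__(self, n):
        self.parent = list(range(n))  # each vertex starts as its own root
        self.size = [1] * n           # component size, valid at root entries

    def find(self, x):
        # Follow parent links up to the root, which labels the component.
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra  # already in the same component
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra  # make ra the root of the larger tree
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra


uf = WeightedQuickUnion(5)
for u, v in [(0, 1), (1, 2), (3, 4)]:
    uf.union(u, v)
print(uf.find(0) == uf.find(2), uf.find(0) == uf.find(3))

# Incremental update: a new edge merges the two components.
root = uf.union(2, 3)
print(uf.size[root])
```

Linking by size keeps the trees shallow, so each `find` takes a number of steps logarithmic in the vertex count, which is what makes per-edge incremental updates cheap.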

Dynamic Control of Random Constant Spreading Worm Using the Power-Law Network Characteristic (멱함수 네트워크 특성을 이용한 랜덤확산형 웜의 동적 제어)

  • Park Doo-Soon;No Byung-Gyu
    • Journal of Korea Multimedia Society / v.9 no.3 / pp.333-341 / 2006
  • Recently, random constant spreading worms have been on the rise. Such a worm degrades the availability of the overall network by exhausting resources such as CPU resources and network bandwidth, and it damages uninfected systems as well as infected ones. This paper analyzes the power-law network, whose preferential characteristics can be used to restrain the worm from spreading. Moreover, this paper suggests a model that dynamically controls the spread of the worm using information about the depth distribution of the delivery nodes, which is commonly seen in such networks. A simulation also verifies that the load on each node is minimized at the optimal depth for effectively restraining the spread of the worm.
