Search | Korea Science

KAWS: Coordinate Kernel-Aware Warp Scheduling and Warp Sharing Mechanism for Advanced GPUs

Vo, Viet Tan; Kim, Cheol Hong
- Journal of Information Processing Systems / v.17 no.6 / pp.1157-1169 / 2021
Modern graphics processing unit (GPU) architectures offer significant hardware resource enhancements for parallel computing. However, without software optimization, GPUs persistently exhibit hardware resource underutilization. In this paper, we show the need to switch between warp scheduling schemes during different kernel execution periods to improve resource utilization. Existing warp schedulers are unaware of kernel progress and therefore cannot provide an effective scheduling policy. In addition, we identify the potential for improving resource utilization in multiple-warp-scheduler GPUs by sharing stalling warps with selected warp schedulers. To address the efficiency issue of present GPUs, we coordinate a kernel-aware warp scheduler and a warp sharing mechanism (KAWS). The proposed warp scheduler tracks the execution progress of the running kernel and adapts to a more effective scheduling policy when the kernel progress reaches a point of resource underutilization. Meanwhile, the warp sharing mechanism distributes stalling warps to other warp schedulers whose execution pipeline unit is ready. Our design achieves performance that is on average 7.97% higher than that of the traditional warp scheduler, with marginal additional hardware overhead.
https://doi.org/10.3745/JIPS.01.0084

An Heuristic for Joint Assignments of Power and Subcarriers in Cognitive Radio Networks (인지라디오 네트워크에서 전력과 부반송파 할당을 위한 휴리스틱)

Paik, Chun-Hyun
- Korean Management Science Review / v.29 no.2 / pp.65-77 / 2012
With the explosively increasing demand for wireless telecommunication services, the shortage of radio spectrum has worsened. The traditional approach of fixed spectrum allocation leads to spectrum underutilization. Recently, CR (Cognitive Radio) technologies have been proposed to enhance spectrum utilization by dynamically allocating radio resources to CR networks (CRNs). In this study, we consider a radio resource (power, subcarrier) allocation problem for an OFDMA-based CRN in which a base station supports a variety of CUs (CRN Users) while avoiding radio interference to the PRN (Primary Radio Network). The problem is mathematically formulated as a general 0-1 IP problem. The optimal solution method for the IP problem requires an unrealistic execution time due to its complexity. Therefore, we propose a heuristic that gives an approximate solution within a reasonable execution time.
https://doi.org/10.7737/KMSR.2012.29.2.065

Honey Bee Based Load Balancing in Cloud Computing

Hashem, Walaa; Nashaat, Heba; Rizk, Rawya
- KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.12 / pp.5694-5711 / 2017
Cloud computing technology is growing very quickly, so the process of resource allocation must be managed. In this paper, a load balancing algorithm based on honey bee behavior (LBA_HB) is proposed. Its main goal is to distribute the workload across multiple network links in a way that avoids both underutilization and overutilization of resources. This is achieved by allocating each incoming task to a virtual machine (VM) that meets two conditions: the number of tasks currently being processed by this VM is less than the number being processed by the other VMs, and the deviation of this VM's processing time from the average processing time of all VMs is less than a threshold value. The proposed algorithm is compared with different scheduling algorithms: honey bee, ant colony, modified throttled, and round robin. The experimental results show the efficiency of the proposed algorithm in terms of execution time, response time, makespan, standard deviation of load, and degree of imbalance.
https://doi.org/10.3837/tiis.2017.12.001
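
The two VM-selection conditions stated in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' LBA_HB implementation: the `VM` fields, the `select_vm` name, the use of absolute deviation, and the reading of "less than other VMs" as "has the minimum task count" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VM:
    name: str
    active_tasks: int        # tasks currently being processed (hypothetical field)
    processing_time: float   # current processing time of this VM (hypothetical field)


def select_vm(vms: List[VM], threshold: float) -> Optional[VM]:
    """Return a VM that satisfies both conditions, or None if no VM qualifies."""
    if not vms:
        return None
    mean_time = sum(vm.processing_time for vm in vms) / len(vms)
    min_tasks = min(vm.active_tasks for vm in vms)
    for vm in vms:
        # Condition 1 (assumed reading): this VM has the fewest active tasks.
        has_fewest_tasks = vm.active_tasks == min_tasks
        # Condition 2: deviation from the mean processing time is below the threshold.
        within_threshold = abs(vm.processing_time - mean_time) < threshold
        if has_fewest_tasks and within_threshold:
            return vm
    return None


vms = [VM("a", 3, 10.0), VM("b", 1, 12.0), VM("c", 2, 30.0)]
chosen = select_vm(vms, threshold=6.0)
```

In this example the mean processing time is about 17.3, so VM `b` (fewest tasks, deviation about 5.3) is chosen with a threshold of 6.0, while a threshold of 5.0 leaves no qualifying VM.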

Hybrid S-ALOHA/TDMA Protocol for LTE/LTE-A Networks with Coexistence of H2H and M2M Traffic

Sui, Nannan; Wang, Cong; Xie, Wei; Xu, Youyun
- KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.2 / pp.687-708 / 2017
Machine-to-machine (M2M) communication is characterized by a tremendous number of devices, small data transmissions, and a large uplink-to-downlink traffic ratio. The massive access requests generated by M2M devices would cause the current medium access control (MAC) protocol in LTE/LTE-A networks to suffer from physical random access channel (PRACH) overload, high signaling overhead, and resource underutilization. As such, fairness should be carefully considered when M2M traffic coexists with human-to-human (H2H) traffic. To tackle these problems, we propose an adaptive hybrid protocol combining Slotted ALOHA (S-ALOHA) and time division multiple access (TDMA). In particular, the proposed hybrid protocol divides the reserved uplink resource blocks (RBs) in a transmission cycle into an S-ALOHA part for M2M traffic with small-size packets and a TDMA part for H2H traffic with large-size packets. Adaptive resource allocation and access class barring (ACB) are exploited and optimized to maximize the channel utility under a fairness constraint. Moreover, an upper performance bound for the proposed hybrid protocol is provided by performing a system equilibrium analysis. Simulation results demonstrate that, compared with the pure S-ALOHA and pure TDMA protocols under a target fairness constraint of 0.9, our proposed hybrid protocol can improve the capacity by at least 9.44% when $\lambda_1:\lambda_2=1:1$ and by at least 20.53% when $\lambda_1:\lambda_2=10:1$, where $\lambda_1$ and $\lambda_2$ are the traffic arrival rates of M2M and H2H traffic, respectively.
https://doi.org/10.3837/tiis.2017.02.005

A new warp scheduling technique for improving the performance of GPUs by utilizing MSHR information (GPU 성능 향상을 위한 MSHR 정보 기반 워프 스케줄링 기법)

Kim, Gwang Bok; Kim, Jong Myon; Kim, Cheol Hong
- The Journal of Korean Institute of Next Generation Computing / v.13 no.3 / pp.72-83 / 2017
GPUs can provide high throughput with latency hiding by executing many warps in parallel. MSHRs (Miss Status Holding Registers) for the L1 data cache track cache miss requests until the required data is serviced from lower-level memory. In recent GPUs, excessive requests for cache resources cause underutilization of GPU resources due to cache resource reservation failures. In this paper, we propose a new warp scheduling technique to reduce stall cycles under MSHR resource shortage. The cache miss rate of each warp is predicted based on the observation that each warp shows a similar cache miss rate over long periods. Warps showing low miss rates, or computation-intensive warps, are given high priority for issue when the MSHR is full. Our proposal improves GPU performance by utilizing cache resources more efficiently based on cache miss rate prediction and monitoring of the MSHR entries. According to our experimental results, reservation fail cycles are reduced by 25.7% and IPC is increased by 6.2% with the proposed scheduling technique compared to the loose round-robin scheduler.
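
The prioritization rule described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' hardware design: the `Warp` class, the per-warp access and miss counters used as the miss-rate predictor, and the `issue_order` function are hypothetical, and the default order stands in for whatever baseline policy the scheduler uses when the MSHR is not full.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Warp:
    warp_id: int
    accesses: int = 0   # observed L1 data cache accesses (hypothetical counter)
    misses: int = 0     # observed L1 data cache misses (hypothetical counter)

    def predicted_miss_rate(self) -> float:
        # Assumption: past behaviour predicts future behaviour. A warp with no
        # memory accesses is treated as computation-intensive (rate 0.0).
        return self.misses / self.accesses if self.accesses else 0.0


def issue_order(warps: List[Warp], mshr_used: int, mshr_entries: int) -> List[Warp]:
    """Return the order in which warps are considered for issue."""
    if mshr_used < mshr_entries:
        # MSHR has free entries: keep the baseline order unchanged.
        return list(warps)
    # MSHR is full: prefer warps that are least likely to need a new entry.
    # sorted() is stable, so ties keep their baseline order.
    return sorted(warps, key=lambda w: w.predicted_miss_rate())


warps = [Warp(0, accesses=100, misses=80), Warp(1, accesses=100, misses=5), Warp(2)]
order_when_full = [w.warp_id for w in issue_order(warps, mshr_used=32, mshr_entries=32)]
order_when_free = [w.warp_id for w in issue_order(warps, mshr_used=10, mshr_entries=32)]
```

Sorting by predicted miss rate only while the MSHR is full keeps the baseline policy untouched in the common case, which matches the abstract's focus on reducing stall cycles under MSHR shortage.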

Inter-cell DCA Algorithm for Downlink Wireless Communication Systems (하향링크 무선 통신 시스템에서의 Inter-cell DCA 알고리즘)

Kim, Hyo-Su; Kim, Dong-Hoi; Park, Seung-Young
- The Journal of Korean Institute of Communications and Information Sciences / v.33 no.7A / pp.693-701 / 2008
In an OFDMA (Orthogonal Frequency Division Multiple Access) system with a frequency reuse factor of 1, the use of the same channels in neighboring cells creates inter-cell co-channel interference, which causes a resource underutilization problem; channel allocation schemes that minimize inter-cell interference have therefore been studied. This paper proposes a new CNIR (Carrier to Noise and Interference Ratio)-based distributed Inter-cell DCA (Dynamic Channel Allocation) algorithm for the OFDMA environment with a frequency reuse factor of 1. When a channel allocation is requested, if there is no free channel in the home cell, or the available free channels in the home cell do not satisfy a required threshold value, the proposed Inter-cell DCA algorithm finds the CNIR values of the available free channels in the neighboring cells and then allocates the free channel with the maximum CNIR value. The simulation results show that the proposed scheme simultaneously decreases both the new call blocking rate and the forced termination rate due to new call generation, because it increases the channel allocation probability.
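
The allocation rule in this abstract can be sketched as below. This is a minimal illustration, not the paper's algorithm as implemented: the function name, the dictionary layout, picking the best home channel first, and treating "satisfies the threshold" as CNIR greater than or equal to the threshold are all assumptions.

```python
from typing import Dict, Optional, Tuple


def allocate_channel(
    home_free: Dict[int, float],
    neighbor_free: Dict[str, Dict[int, float]],
    threshold: float,
) -> Optional[Tuple[str, int]]:
    """Return (cell, channel) for the allocated channel, or None if none is free.

    home_free maps channel id -> CNIR for free channels in the home cell.
    neighbor_free maps neighbor cell name -> {channel id -> CNIR}.
    """
    # Step 1: use a home-cell channel if one is free and meets the threshold.
    if home_free:
        channel, cnir = max(home_free.items(), key=lambda kv: kv[1])
        if cnir >= threshold:
            return ("home", channel)
    # Step 2: otherwise pick the free neighbor channel with the maximum CNIR.
    candidates = [
        (cnir, cell, channel)
        for cell, channels in neighbor_free.items()
        for channel, cnir in channels.items()
    ]
    if not candidates:
        return None
    _, cell, channel = max(candidates)
    return (cell, channel)


from_home = allocate_channel({1: 12.0}, {"B": {2: 20.0}}, threshold=10.0)
from_neighbor = allocate_channel({1: 5.0}, {"B": {2: 8.0}, "C": {3: 15.0}}, threshold=10.0)
```

Here `from_home` stays in the home cell because channel 1 meets the threshold, while `from_neighbor` falls back to cell `C`, whose channel 3 has the highest CNIR among the neighbors.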

Hair microscopy: an easy adjunct to diagnosis of systemic diseases in children

Dharmagat Bhattarai; Aaqib Zafar Banday; Rohit Sadanand; Kanika Arora; Gurjit Kaur; Satish Sharma; Amit Rawat
- Applied Microscopy / v.51 / pp.18.1-18.12 / 2021
Hair, having distinct stages of growth, is a dynamic component of the integumentary system. Nonetheless, derangement in its structure and growth pattern often provides vital clues for the diagnosis of systemic diseases. Assessment of the hair structure by various microscopy techniques is, hence, a valuable tool for the diagnosis of several systemic and cutaneous disorders. Systemic illnesses like Comel-Netherton syndrome, Griscelli syndrome, Chediak Higashi syndrome, and Menkes disease display pathognomonic findings on hair microscopy which, consequently, provide crucial evidence for disease diagnosis. With minimal training, light microscopy of the hair can easily be performed even by clinicians and other health care providers which can, thus, serve as a useful tool for disease diagnosis at the patient's bedside. This is especially true for resource-constrained settings where access and availability of advanced investigations (like molecular diagnostics) is a major constraint. Despite its immense clinical utility and non-invasive nature, hair microscopy seems to be an underutilized diagnostic modality. Lack of awareness regarding the important findings on hair microscopy may be one of the crucial reasons for its underutilization. Herein, we, therefore, present a comprehensive overview of the available methods for hair microscopy and the pertinent findings that can be observed in various diseases.
https://doi.org/10.1186/s42649-021-00067-6

Trends in Regional Disparities in Cardiovascular Surgery and Mortality in Korea: A National Cross-sectional Study

Dal-Lae Jin; Kyoung-Hoon Kim; Euy Suk Chung; Seok-Jun Yoon
- Journal of Preventive Medicine and Public Health / v.57 no.3 / pp.260-268 / 2024
Objectives: Regional disparities in cardiovascular care in Korea have led to uneven patient outcomes. Despite the growing need for and access to procedures, few studies have linked regional service availability to mortality rates. This study analyzed regional variation in the utilization of major cardiovascular procedures and their associations with short-term mortality to provide better evidence regarding the relationship between healthcare resource distribution and patient survival. Methods: A cross-sectional study was conducted using nationwide claims data for patients who underwent coronary artery bypass grafting (CABG), percutaneous coronary intervention (PCI), stent insertion, or aortic aneurysm resection in 2022. Regional variation was assessed by the relevance index (RI). The associations between the regional RI and 30-day mortality were analyzed. Results: The RI was lowest for aortic aneurysm resection (mean, 26.2; standard deviation, 26.1), indicating the most uneven regional distribution among the surgical procedures. Patients undergoing this procedure in regions with higher RIs showed significantly lower 30-day mortality (adjusted odds ratio [aOR], 0.73; 95% confidence interval, 0.55 to 0.96; p=0.026) versus those with lower RIs. This suggests that cardiovascular surgery regional availability, as measured by RI, has an impact on mortality rates for certain complex surgical procedures. The RI was not associated with significant mortality differences for more widely available procedures like CABG (aOR, 0.96), PCI (aOR, 1.00), or stent insertion (aOR, 0.91). Conclusions: Significant regional variation and underutilization of cardiovascular surgery were found, with reduced access linked to worse mortality for complex procedures. Disparities should be addressed through collaboration among hospitals and policy efforts to improve outcomes.
https://doi.org/10.3961/jpmph.24.057

Analysis on the Active/Inactive Status of Computational Resources for Improving the Performance of the GPU (GPU 성능 저하 해결을 위한 내부 자원 활용/비활용 상태 분석)

Choi, Hongjun; Son, Dongoh; Kim, Jongmyon; Kim, Cheolhong
- The Journal of the Korea Contents Association / v.15 no.7 / pp.1-11 / 2015
In recent high-performance computing systems, GPGPU has been widely used to process general-purpose applications as well as graphics applications, since GPUs can provide optimized computational resources for massively parallel processing. Unfortunately, GPGPU does not fully exploit the computational resources of the GPU when executing general-purpose applications, because the applications cannot be optimized for the GPU architecture. Therefore, we provide a GPU research guideline to improve the performance of computing systems using GPGPU. To accomplish this, we analyze the factors that negatively affect GPU performance. In this paper, in order to clearly classify the causes of these negative factors, the GPU core status is divided into five categories: fully active status, partial active status, idle status, memory stall status, and GPU core stall status. Every status except the fully active status causes performance degradation. We evaluate the ratio of each GPU core status for benchmarks with different characteristics to find the specific reasons that degrade GPU performance. According to our simulation results, the partial active status, idle status, memory stall status, and GPU core stall status are induced by the computational resource underutilization problem, low parallelism, high memory requests, and structural hazards, respectively.
https://doi.org/10.5392/JKCA.2015.15.07.001

Analysis of Impact of Correlation Between Hardware Configuration and Branch Handling Methods Executing General Purpose Applications (범용 응용프로그램 실행 시 하드웨어 구성과 분기 처리 기법에 따른 GPU 성능 분석)

Choi, Hong Jun; Kim, Cheol Hong
- The Journal of the Korea Contents Association / v.13 no.3 / pp.9-21 / 2013
Owing to the increased computing power and flexibility of GPUs, recent GPUs execute general-purpose parallel applications as well as graphics applications. Programmers can use GPGPU through the APIs provided by GPU vendors. Unfortunately, the computational resources of the GPU are not fully utilized when executing general-purpose applications because of frequent branch instructions. To handle the branch problem, several warp formations have been proposed. Intuitively, we expect that warp formations providing higher computational resource utilization show higher performance. Contrary to our expectations, according to simulation results, the performance of the warp formation providing better utilization is lower than that of the warp formation providing worse utilization. This is because a warp formation providing high utilization causes a serious memory bottleneck due to increased memory requests. Therefore, a warp formation providing high computational utilization cannot guarantee high performance without proper hardware resources. For this reason, we analyze the correlation between hardware configuration and warp formation. Our simulation results provide a guideline for solving the underutilization problem caused by branch instructions when designing recent GPUs.
https://doi.org/10.5392/JKCA.2013.13.03.009

