• Title/Summary/Keyword: 헤드기반 분류 (head-based classification)

29 search results

Traffic-based Caching Algorithm and Performance Evaluation for QoS-adaptive Streaming Proxy Server in Wireless Networks (무선 환경에서 QoS 적응적인 스트리밍 프락시 서버를 위한 트래픽 기반 캐싱 알고리즘 및 성능 분석)

  • Kim, HwaSung;Kim, YongSul;Hong, JungPyo
    • Journal of Broadcast Engineering
    • /
    • v.10 no.3
    • /
    • pp.313-320
    • /
    • 2005
  • The increasing popularity of multimedia streaming services introduces new challenges in content distribution. In particular, it is important to provide QoS guarantees, which are increasingly expected for multimedia applications. Multimedia streams typically experience a high start-up delay due to the large protocol overhead and the delay and loss characteristics of wireless networks. Service providers can improve the performance of multimedia streaming by caching the initial segment (prefix) of popular streams at proxies near the requesting clients. The proxy can then begin transmission to the client while requesting the remainder of the stream from the server. In this paper, we propose a traffic-based caching algorithm (TSLRU) to improve the performance of a caching proxy. TSLRU classifies traffic into three types and improves caching performance by considering several factors, such as traffic type, recency, frequency, and object size, when making replacement decisions. In simulations, TSLRU outperforms existing schemes in terms of byte hit rate, hit rate, startup latency, and throughput.
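The abstract names the replacement factors (traffic type, recency, frequency, object size) but not the scoring formula. A minimal sketch of such a policy, with an entirely hypothetical weighted score and made-up traffic-type weights, could look like this:

```python
import time

class TSLRUCache:
    """Sketch of a TSLRU-style replacement policy. The score formula and
    the traffic-type weights are illustrative assumptions, not the paper's."""

    def __init__(self, capacity_bytes, type_weight=None):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = {}  # key -> {size, freq, last_access, traffic_type}
        # Higher weight = more valuable to keep (e.g. streaming prefixes).
        self.type_weight = type_weight or {"streaming": 3.0, "web": 2.0, "other": 1.0}

    def _score(self, e):
        recency = 1.0 / (1.0 + time.time() - e["last_access"])
        # Small, frequent, recent objects of valuable traffic types score highest.
        return self.type_weight[e["traffic_type"]] * e["freq"] * recency / e["size"]

    def put(self, key, size, traffic_type):
        # Evict lowest-scoring entries until the new object fits.
        while self.used + size > self.capacity and self.entries:
            victim = min(self.entries, key=lambda k: self._score(self.entries[k]))
            self.used -= self.entries.pop(victim)["size"]
        self.entries[key] = {"size": size, "freq": 1,
                             "last_access": time.time(), "traffic_type": traffic_type}
        self.used += size

    def get(self, key):
        e = self.entries.get(key)
        if e:
            e["freq"] += 1
            e["last_access"] = time.time()
        return e
```

Unlike plain LRU, eviction here is a ranking over all retained factors, so a frequently hit streaming prefix survives even when a larger, colder object was touched more recently.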

Thread Block Scheduling for GPGPU based on Fine-Grained Resource Utilization (상세 자원 이용률에 기반한 병렬 가속기용 스레드 블록 스케줄링)

  • Bahn, Hyokyung;Cho, Kyungwoon
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.22 no.5
    • /
    • pp.49-54
    • /
    • 2022
  • With the recent widespread adoption of general-purpose GPUs (GPGPUs) in cloud systems, maximizing resource utilization through multitasking on a GPGPU has become an important issue. In this article, we show that resource allocation based on a coarse classification of workloads into compute-bound and memory-bound is not sufficient with respect to resource utilization, and present a new thread block scheduling policy for GPGPUs that makes use of fine-grained resource utilization profiles of each workload. Unlike previous approaches, the proposed policy reduces scheduling overhead by separating profiling from scheduling, and maximizes resource utilization by co-locating workloads with different bottleneck resources. Through simulations under various virtual machine scenarios, we show that the proposed policy improves GPGPU throughput by 130.6% on average and up to 161.4%.
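The co-location idea — pair workloads whose bottleneck resources differ so neither starves the other — can be sketched as a greedy matching over per-resource utilization vectors. The resource names, the [0, 1] utilization model, and the greedy pairing rule below are assumptions for illustration; the paper's profiling metrics are not given in the abstract.

```python
def colocate(workloads):
    """Greedily pair workloads with complementary resource profiles.

    workloads: dict name -> per-resource utilization in [0, 1],
               e.g. {"compute": 0.8, "memory_bw": 0.2}.
    Returns a list of tuples; a 1-tuple means the workload runs alone."""
    remaining = dict(workloads)
    pairs = []
    while len(remaining) > 1:
        name, util = remaining.popitem()
        # Pick the partner that maximizes combined utilization without
        # oversubscribing any single resource.
        best, best_gain = None, -1.0
        for other, outil in remaining.items():
            if all(util[r] + outil[r] <= 1.0 for r in util):
                gain = sum(util[r] + outil[r] for r in util)
                if gain > best_gain:
                    best, best_gain = other, gain
        if best is None:
            pairs.append((name,))          # no feasible partner: run alone
        else:
            remaining.pop(best)
            pairs.append((name, best))
    pairs.extend((n,) for n in remaining)  # leftover unpaired workload
    return pairs
```

A compute-bound and a memory-bound workload end up sharing the accelerator, while a workload that saturates both resources is left to run alone.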

Vehicle Headlight Alignment Calibration and Classification Using OpenMP (OpenMP를 이용한 차량 헤드라이트 얼라인먼트 보정 및 분류 방법)

  • Moon, Chang-Bae;Kim, Kun-Hong;Kim, Byeong-Man;Oh, Dukhwan
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.22 no.2
    • /
    • pp.61-70
    • /
    • 2017
  • In this paper, the classification speed of vehicle headlight modules is improved through CPU-based parallel processing using OpenMP. A classification method for headlight modules that corrects their alignment and then extracts their features is also proposed. To analyze the performance of the proposed method, its discrimination accuracy and processing speed were compared with a gray-image-based method and a line-detection-based method. In terms of discrimination accuracy, both the proposed method and the line-detection method performed well, but the proposed method was faster than the line-detection method. The gray-based method had the best processing speed, but the proposed method surpassed it in discrimination accuracy.

A Study on Efficient Natural Language Processing Method based on Transformer (트랜스포머 기반 효율적인 자연어 처리 방안 연구)

  • Seung-Cheol Lim;Sung-Gu Youn
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.23 no.4
    • /
    • pp.115-119
    • /
    • 2023
  • The natural language processing models used in current artificial intelligence systems are huge, causing various difficulties in processing and analyzing data in real time. To address these difficulties, we propose a method that improves processing efficiency by using less memory, and we evaluate the performance of the proposed model. The technique applied in this paper is to shrink the BERT [1] model by reducing its number of attention heads and its embedding size, split a large corpus into segments, and compute the result by averaging the output values of the forward pass over each segment. In this process, a random offset is applied to the sentences at every epoch to provide diversity in the input data. The model is then fine-tuned for classification. We found that the split-processing model was about 12% less accurate than the unsplit model, but the number of parameters in the model was reduced by 56%.
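The core mechanism — chunk a long input with a random per-epoch offset, run the reduced model on each chunk, and average the outputs — can be sketched independently of any deep learning framework. The chunk length, offset range, and the toy `model` interface (a function from a token chunk to class logits) below are assumptions, not the paper's configuration.

```python
import random

def chunked_forward(tokens, model, chunk_len=128, offset_range=16):
    """Run `model` over fixed-length chunks of a long token sequence and
    average the per-chunk outputs (hypothetical parameters).

    tokens: a long sequence of token ids.
    model:  callable mapping a token chunk to a list of class logits."""
    offset = random.randrange(offset_range)      # per-epoch input diversity
    chunks = [tokens[i:i + chunk_len]
              for i in range(offset, len(tokens), chunk_len)]
    outputs = [model(c) for c in chunks if c]
    n_classes = len(outputs[0])
    # Average the per-chunk logits before the classification decision.
    return [sum(o[k] for o in outputs) / len(outputs) for k in range(n_classes)]
```

Each forward pass only ever sees `chunk_len` tokens, which is where the memory saving comes from; the averaging step is what trades away some accuracy, consistent with the reported 12% drop.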

A High Performance Flash Memory Solid State Disk (고성능 플래시 메모리 솔리드 스테이트 디스크)

  • Yoon, Jin-Hyuk;Nam, Eyee-Hyun;Seong, Yoon-Jae;Kim, Hong-Seok;Min, Sang-Lyul;Cho, Yoo-Kun
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.4
    • /
    • pp.378-388
    • /
    • 2008
  • Flash memory has been attracting attention as the next mass storage medium for mobile computing systems such as notebook computers and UMPCs (Ultra Mobile PCs) due to its low power consumption, high shock and vibration resistance, and small size. A storage system with flash memory excels at random reads, sequential reads, and sequential writes. However, it falls short on random writes because flash memory physically cannot overwrite data unless it is first erased. To overcome this shortcoming, we propose an SSD (Solid State Disk) architecture with two novel features. First, we use non-volatile FRAM (Ferroelectric RAM) in conjunction with NAND flash memory, combining FRAM's fast access speed and ability to overwrite in place with NAND flash memory's low price. Second, the architecture categorizes host write requests into small random writes and large sequential writes, and processes them with two different buffer-management schemes, each optimized for its type of write request. This scheme has been implemented in an SSD prototype and evaluated with a standard PC benchmark environment. The results reveal that our architecture outperforms conventional HDDs and other commercial SSDs by more than three times in throughput for random-access workloads.
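The second feature, classifying host writes and routing each class to its own buffer path, can be sketched as a small dispatcher. The size threshold, the sequentiality test, and the two buffer structures below are illustrative assumptions, not the prototype's actual design:

```python
SEQ_THRESHOLD = 64 * 1024   # hypothetical size cutoff for "large" writes

class WriteDispatcher:
    """Route small random writes to an overwritable buffer (playing the
    role of FRAM) and large/sequential writes to an append-only log
    (flash-friendly). Both structures are toy stand-ins."""

    def __init__(self):
        self.fram_buffer = {}   # lba -> data, overwritten in place
        self.flash_log = []     # append-only (lba, data) extents
        self.last_end = None    # end LBA of the previous write

    def write(self, lba, data):
        # A write that starts where the previous one ended is sequential.
        sequential = self.last_end is not None and lba == self.last_end
        self.last_end = lba + len(data)
        if len(data) >= SEQ_THRESHOLD or sequential:
            self.flash_log.append((lba, data))   # large/sequential path
        else:
            self.fram_buffer[lba] = data         # small random path
```

The point of the split is that the random path exploits FRAM's in-place overwrite (no erase-before-write penalty), while the sequential path keeps NAND writing in the access pattern it is already good at.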

Wall Cuckoo: A Method for Reducing Memory Access Using Hash Function Categorization (월 쿠쿠: 해시 함수 분류를 이용한 메모리 접근 감소 방법)

  • Moon, Seong-kwang;Min, Dae-hong;Jang, Rhong-ho;Jung, Chang-hun;NYang, Dae-hun;Lee, Kyung-hee
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.8 no.6
    • /
    • pp.127-138
    • /
    • 2019
  • Data response speed is a critical issue for cloud services because it is directly related to the user experience. As such, in-memory databases are widely adopted in many cloud-based applications to achieve fast data response. However, current in-memory database implementations are mostly based on linked-list hash tables, which cannot guarantee a constant data response time. Cuckoo hashing was introduced as an alternative, but it has the disadvantage that only half of the allocated memory can be used for storing data. Bucketized cuckoo hashing (BCH) subsequently improved cuckoo hashing in terms of memory efficiency, but it still suffers from high insert overhead. In this paper, we propose a data management scheme called Wall Cuckoo, which aims to improve not only the insert performance but also the lookup performance of BCH. The key idea of Wall Cuckoo is to separate the data within a bucket according to which hash function was used. This narrows the search range within the bucket, so the number of slot accesses required for a lookup is reduced. At the same time, insert performance also improves, because an insert begins with a lookup. According to our analysis, the expected number of slot accesses required by Wall Cuckoo is less than that of BCH. Our experiments show that Wall Cuckoo outperforms BCH and Sorting Cuckoo in terms of the number of slot accesses in lookup and insert operations, across load factors from 10% to 95%.
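The wall idea can be sketched directly: inside each bucket, entries placed via the first hash function sit left of a per-bucket wall index and entries placed via the second sit right of it, so a lookup scans only one side per bucket. The hash functions, bucket sizes, and the omission of cuckoo kick-out displacement below are simplifying assumptions:

```python
class WallCuckoo:
    """Toy sketch of wall-partitioned bucketized cuckoo hashing.
    Entries [0:wall) of a bucket came from h1; entries [wall:] came
    from h2. No displacement (kick-out) path is implemented."""

    SLOTS = 4  # slots per bucket, as in typical bucketized cuckoo hashing

    def __init__(self, n_buckets=64):
        self.n = n_buckets
        self.buckets = [[] for _ in range(n_buckets)]  # lists of (key, value)
        self.walls = [0] * n_buckets

    def _h1(self, key):
        return hash(("h1", key)) % self.n

    def _h2(self, key):
        return hash(("h2", key)) % self.n

    def insert(self, key, value):
        b1, b2 = self._h1(key), self._h2(key)
        if len(self.buckets[b1]) < self.SLOTS:
            # h1 entries live left of the wall; shift the wall right.
            self.buckets[b1].insert(self.walls[b1], (key, value))
            self.walls[b1] += 1
            return True
        if len(self.buckets[b2]) < self.SLOTS:
            self.buckets[b2].append((key, value))   # h2 side, right of the wall
            return True
        return False  # a full implementation would displace an entry here

    def lookup(self, key):
        b1 = self._h1(key)
        for k, v in self.buckets[b1][:self.walls[b1]]:   # scan h1 side only
            if k == key:
                return v
        b2 = self._h2(key)
        for k, v in self.buckets[b2][self.walls[b2]:]:   # scan h2 side only
            if k == key:
                return v
        return None
```

Plain BCH must probe every slot of both candidate buckets in the worst case; with the wall, each bucket contributes only its relevant side, which is where the reduction in expected slot accesses comes from.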

Virtual Source and Flooding-Based QoS Unicast and Multicast Routing in the Next Generation Optical Internet based on IP/DWDM Technology (IP/DWDM 기반 차세대 광 인터넷 망에서 가상 소스와 플러딩에 기초한 QoS 제공 유니캐스트 및 멀티캐스트 라우팅 방법 연구)

  • Kim, Sung-Un;Park, Seon-Yeong
    • Journal of Korea Multimedia Society
    • /
    • v.14 no.1
    • /
    • pp.33-43
    • /
    • 2011
  • Routing technologies that consider QoS-based hypermedia services are seen as a crucial network property in next-generation optical Internet (NGOI) networks based on IP/dense wavelength division multiplexing (DWDM). The huge potential capacity of a single fiber, which is in the Tb/s range, can be exploited by DWDM technology, which simultaneously transfers multiple data streams (classified and aggregated IP traffic) on multiple wavelengths (classified by QoS). DWDM-based optical networks are therefore a favorable approach for next-generation optical backbone networks. Finding a qualified path that meets multiple constraints is a multi-constraint optimization problem, which has been proven to be NP-complete and cannot be solved by a simple algorithm. The majority of previous work on DWDM networks has relied on heuristic QoS routing algorithms (extensions of the current Internet routing paradigm), which are very complex and incur operational and implementation overheads. This drawback becomes more pronounced when the network is unstable or large. In this paper, we propose flooding-based unicast and multicast QoS routing methodologies (YS-QUR and YS-QMR) that incur much lower message overhead yet yield a good connection establishment success rate. The simulation results demonstrate that the YS-QUR and YS-QMR algorithms are superior to previous routing algorithms.
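The general shape of flooding-based QoS route discovery — forward a probe to every neighbor, pruning branches that already violate a constraint, until a feasible path reaches the destination — can be sketched with a single additive delay constraint. This is a drastic simplification of the multi-constraint problem and of the YS-QUR/YS-QMR algorithms themselves, whose details the abstract does not give:

```python
from collections import deque

def flood_discover(graph, src, dst, max_delay):
    """Toy flooding probe with delay-constraint pruning.

    graph: dict node -> list of (neighbor, link_delay).
    Returns (path, total_delay) for the first feasible path found,
    or (None, None) if no path satisfies the constraint."""
    frontier = deque([(src, 0, [src])])
    while frontier:
        node, delay, path = frontier.popleft()
        if node == dst:
            return path, delay
        for nxt, d in graph.get(node, []):
            # Prune loops and branches that already exceed the budget,
            # which is what keeps flooding message overhead bounded.
            if nxt not in path and delay + d <= max_delay:
                frontier.append((nxt, delay + d, path + [nxt]))
    return None, None
```

Because infeasible branches die early, the probe count stays far below exhaustive flooding, which is the intuition behind the lower message overhead claimed in the abstract.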

A Study on Improving the Billing System of the Wireless Internet Service (무선인터넷 서비스의 과금체계 개선에 관한 연구)

  • Min Gyeongju;Hong Jaehwan;Nam Sangsig;Kim Jeongho
    • The KIPS Transactions:PartC
    • /
    • v.12C no.4 s.100
    • /
    • pp.597-602
    • /
    • 2005
  • In this study, we performed verification tests on the billing systems of the three major mobile communication service providers, based on wireless Internet service packets, and comparatively analyzed the measured file sizes against each provider's billing data. The tests showed differences in the billing data caused by transmission overhead, which depends on the network quality in each operator's wireless environment. Consequently, a packet analysis system was proposed as a means of applying consistent packet billing across all compared service providers. If the packet analysis system is added to supplement the current billing system, various user requirements can be met. Per-packet billing that is consistent across mobile operators and differentiated billing based on content value become available, since packet data can be extracted through per-service protocol analysis and classified by content type through traffic data analysis. Furthermore, the needs of customers who request more detailed usage information can be satisfied, and more flexible and diverse billing policies can be supported, such as excluding designated non-chargeable packets from charging. All of these services are expected to contribute to the popularization of wireless Internet services, since user complaints about service charges could be reduced.
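The billing flow described — classify each packet by content type, skip non-chargeable packets, and price each type differently — can be sketched as a small aggregator. The port-to-content-type map, the non-chargeable set, and the rates below are purely illustrative assumptions, not any operator's actual tariff:

```python
from collections import defaultdict

def bill_by_content(packets, rates, free_ports=frozenset({53})):
    """Aggregate chargeable bytes per content type and price each type.

    packets: iterable of (dst_port, size_bytes).
    rates:   dict content_type -> price per byte; must contain "other"."""
    port_type = {80: "web", 1935: "video", 5060: "voice"}  # assumed mapping
    usage = defaultdict(int)
    for port, size in packets:
        if port in free_ports:       # non-chargeable packets (e.g. DNS)
            continue
        usage[port_type.get(port, "other")] += size
    # Differentiated billing: each content type has its own rate.
    return {ctype: n * rates.get(ctype, rates["other"])
            for ctype, n in usage.items()}
```

The per-type breakdown is also exactly what a detailed-usage statement for the customer would be built from, which is the secondary benefit the abstract mentions.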

Hybrid All-Reduce Strategy with Layer Overlapping for Reducing Communication Overhead in Distributed Deep Learning (분산 딥러닝에서 통신 오버헤드를 줄이기 위해 레이어를 오버래핑하는 하이브리드 올-리듀스 기법)

  • Kim, Daehyun;Yeo, Sangho;Oh, Sangyoon
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.10 no.7
    • /
    • pp.191-198
    • /
    • 2021
  • Since training datasets have become large and models have become deeper to achieve high accuracy in deep learning, deep neural network training requires a great deal of computation and takes too long on a single node. Distributed deep learning has therefore been proposed to reduce training time by distributing computation across multiple nodes. In this study, we propose a hybrid all-reduce strategy that considers the characteristics of each layer, together with a communication-computation overlapping technique, for the synchronization step of distributed deep learning. Since a convolution layer has fewer parameters than a fully-connected layer and is located in the earlier part of the network, only a short overlapping window is available, so butterfly all-reduce is used to synchronize convolution layers. Fully-connected layers, on the other hand, are synchronized using ring all-reduce. Empirical results on PyTorch show that the proposed scheme reduces training time by up to 33% compared to baseline PyTorch.
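The abstract pairs butterfly all-reduce (convolution layers) with ring all-reduce (fully-connected layers). As a minimal sketch of the ring half only — gradients as plain Python lists, "nodes" as list indices, sends simulated in-memory rather than over a network — the two classic phases look like this:

```python
def ring_allreduce(node_grads):
    """Simulate ring all-reduce: scatter-reduce then all-gather.

    node_grads: list of n equal-length gradient lists, one per node.
    Returns n identical lists, each holding the element-wise sum."""
    n = len(node_grads)
    m = len(node_grads[0])
    chunk = (m + n - 1) // n
    grads = [list(g) for g in node_grads]

    def span(c):
        return range(c * chunk, min((c + 1) * chunk, m))

    # Scatter-reduce: after n-1 steps, node (c+n-1) % n holds the
    # full sum of chunk c.
    for step in range(n - 1):
        for i in range(n):
            c = (i - step) % n          # chunk node i forwards this step
            j = (i + 1) % n
            for k in span(c):
                grads[j][k] += grads[i][k]

    # All-gather: completed chunks circulate once more, overwriting
    # the stale partial values on every node.
    for step in range(n - 1):
        for i in range(n):
            c = (i + 1 - step) % n      # completed chunk node i forwards
            j = (i + 1) % n
            for k in span(c):
                grads[j][k] = grads[i][k]
    return grads
```

Each node sends only one chunk per step, so per-node bandwidth is independent of n; that constant-bandwidth property is why ring all-reduce suits the parameter-heavy fully-connected layers, while the lower-latency butterfly pattern fits the small, early convolution layers.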