Search | Korea Science

Analysis of the Influence Factors of Data Loading Performance Using Apache Sqoop (아파치 스쿱을 사용한 하둡의 데이터 적재 성능 영향 요인 분석)

Chen, Liu;Ko, Junghyun;Yeo, Jeongmo
- KIPS Transactions on Software and Data Engineering
- v.4 no.2
- pp.77-82
- 2015
Big Data technology has attracted much attention for its fast data processing. Research on applying Big Data technology to process large-scale structured data in relational databases (RDBs) much faster is also ongoing. Although many studies measure analysis performance, studies of structured data loading performance, the step that precedes analysis, are very rare. In this study, we therefore test the performance of loading structured data from an RDB into the distributed processing platform Hadoop using Apache Sqoop. To analyze the factors that influence data loading, we run the tests repeatedly with different loading options and compare the results with data loading performance between RDB-based servers. Although the data loading performance of Apache Sqoop was low in the test environment, much better performance can be expected in a large-scale Hadoop cluster environment because more hardware resources are available. This study is expected to serve as a basis for research on improving data loading performance and on analyzing the performance of every step of structured data processing on the Hadoop platform.
https://doi.org/10.3745/KTSDE.2015.4.2.77

Data Mining Approach for Supporting Hoarding in Mobile Computing Environments

Jeon, Seong-Hae;Ryu, Je-Bok;Lee, Seung-Ju
- Proceedings of the Korean Statistical Society Conference
- 2003.05a
- pp.13-17
- 2003
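In this paper, we propose a data mining strategy based on collaborative recommendation as an effective cache hoarding technique for solving the problems of mobile computing environments caused by low bandwidth, high latency, and frequent network disconnection. Many earlier studies have shown that cache hoarding is an efficient way to address these problems for mobile clients. However, earlier studies that used only the history of a mobile computer's requests could not satisfy all of the information requests that mobile clients need. The limited storage space of mobile computers made this even more difficult. In this study, we propose a cache hoarding technique that applies data mining to the history information of mobile clients, so that items satisfying the clients' requests can be served effectively even with a small cache capacity. We generated simulated data with the CSIM Simulator and ran experiments to evaluate the performance of the proposed model. An objective evaluation using the cache hit ratio confirmed that the proposed model performs well as a cache hoarding technique for mobile clients.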

An Overloaded Vehicle Identifying System based on Object Detection Model (객체 인식 모델을 활용한 적재 불량 화물차 탐지 시스템)

Jung, Woojin;Park, Jinuk;Park, Yongju
- Journal of the Korea Institute of Information and Communication Engineering
- v.26 no.12
- pp.1794-1799
- 2022
Recently, the increasing number of overloaded vehicles on the road poses a risk to traffic safety, such as falling objects, road damage, and chain collisions due to abnormal weight distribution, and can cause great damage once an accident occurs. We therefore propose building an object detection-based AI model to identify the overloaded vehicles that cause such social problems. In addition, we present a simple yet effective method for constructing an object detection model for large-scale vehicle images. In particular, we use the large-scale vehicle image sets provided by the open AI-Hub, which include overloaded vehicles. We inspected the specific features of vehicle sizes and image source types, and pre-processed the images to train a deep learning-based object detection model. We also propose an integrated system for tracking the detected vehicles. Finally, we demonstrate that detection performance for overloaded vehicles improved by about 23% compared with using the raw data.
https://doi.org/10.6109/jkiice.2022.26.12.1794

An Overloaded Vehicle Identifying System based on Object Detection Model (객체 인식 모델을 활용한 적재불량 화물차 탐지 시스템 개발)

Jung, Woojin;Park, Yongju;Park, Jinuk;Kim, Chang-il
- Proceedings of the Korean Institute of Information and Communication Sciences Conference
- 2022.10a
- pp.562-565
- 2022
Recently, the increasing number of overloaded vehicles on the road poses a risk to traffic safety, such as falling objects, road damage, and chain collisions due to abnormal weight distribution, and can cause great damage once an accident occurs. However, the current weight measurement system for vehicles on roads cannot recognize this irregular weight distribution. To address this limitation, we propose building an object detection-based AI model to identify the overloaded vehicles that cause such social problems. In addition, we present a simple yet effective method for constructing an object detection model for large-scale vehicle images. In particular, we use the large-scale vehicle image sets provided by the open AI-Hub, which include overloaded vehicles seen from CCTV, black box, and hand-held camera points of view. We inspected the specific features of vehicle sizes and image source types, and pre-processed the images to train a deep learning-based object detection model. Finally, we demonstrate that detection performance for overloaded vehicles improved by about 23% compared with using the raw data. From this result, we believe that public big data can be used more efficiently and applied to the development of an object detection-based overloaded vehicle detection model.

Two Level Bin-Packing Algorithm for Data Allocation on Multiple Broadcast Channels (다중 방송 채널에 데이터 할당을 위한 두 단계 저장소-적재 알고리즘)

Kwon, Hyeok-Min
- Journal of Korea Multimedia Society
- v.14 no.9
- pp.1165-1174
- 2011
In data broadcasting systems, servers continuously disseminate data items through broadcast channels, and a mobile client only needs to wait for the data of interest to appear on a broadcast channel. However, because broadcast channels are shared by a large set of data items, the expected delay of receiving a desired data item may increase. This paper explores how to design a proper data allocation on multiple broadcast channels that minimizes the average expected delay of all data items, and proposes a new data allocation scheme named two-level bin-packing (TLBP). The paper first introduces the theoretical lower bound of the average expected delay and determines the bin capacity based on this value. TLBP partitions all data items into a number of groups using a bin-packing algorithm and allocates each group of data items to an individual channel. By applying the bin-packing algorithm in two steps, TLBP can reflect the variation in access probabilities among data items allocated to the same channel in the broadcast schedule, and thus enhance performance. A simulation compares the performance of TLBP with three existing approaches. The results show that TLBP outperforms the others in terms of average expected delay at a reasonable execution overhead.
https://doi.org/10.9717/kmms.2011.14.9.1165
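
The grouping idea in this abstract can be illustrated with a generic bin-packing sketch. This is not the TLBP algorithm itself: the abstract does not give its weights, its bin capacity formula, or its second packing level. The sketch assumes a plain first-fit decreasing heuristic, uses access probabilities as item weights, takes a hypothetical `capacity` parameter, and assumes unit-length items broadcast in a flat cyclic schedule, so that a client waits on average half a cycle for an item.

```python
def first_fit_decreasing(weights, capacity):
    """Group item indices into bins (channels) with first-fit decreasing.

    weights: per-item weight (here, an access probability; an assumption).
    capacity: maximum total weight per bin (hypothetical parameter).
    """
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    bins, loads = [], []
    for i in order:
        w = weights[i]
        if w > capacity:
            raise ValueError("item weight exceeds bin capacity")
        for b in range(len(bins)):
            # Small tolerance guards against floating-point round-off.
            if loads[b] + w <= capacity + 1e-12:
                bins[b].append(i)
                loads[b] += w
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([i])
            loads.append(w)
    return bins


def average_expected_delay(bins, probs):
    """Average expected delay under the flat cyclic schedule assumption.

    On a channel cycling through n unit-length items, the expected wait
    for any one of them is n / 2 slots.
    """
    return sum(sum(probs[i] for i in ch) * len(ch) / 2 for ch in bins)


probs = [0.4, 0.3, 0.1, 0.1, 0.05, 0.05]
channels = first_fit_decreasing(probs, capacity=0.5)
delay = average_expected_delay(channels, probs)
```

In this example the popular items share a short channel and the rarely requested items share a longer one, which is what keeps the probability-weighted delay low.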

GML Data Integration Method for Load Processing of Spatial Data Warehouse (공간 데이터 웨어하우스에서 GML 데이터의 효율적인 적재를 위한 데이터 통합 기법)

Jeon Byung-Yun;Lee Dong-Wook;You Byeong-Seob;Bae Hae-Young
- Proceedings of the Korea Information Processing Society Conference
- 2006.05a
- pp.27-30
- 2006
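In the GIS field, GML (Geography Markup Language) was proposed by the OGC (Open Geospatial Consortium) as a standard for data exchange, and its use is becoming common in web applications and spatial data exchange. Spatial data warehouses, systems that collect spatial data effectively to support decision making, are also required to extract GML data and use it as source data. However, GML has a semi-structured data format. Because it must be extracted differently from conventional structured data, spatial data extraction suited to the characteristics of GML is needed. This paper proposes a technique that, when a spatial data warehouse extracts GML-based spatial data sources, integrates duplicated spatial objects into a single representation and loads them efficiently. The technique extracts GML data using GQuery and then compares the GML schema with the schema information managed in the metadata to provide integrated spatial data to the spatial data warehouse. In the performance evaluation, a comparison with an existing GML data extraction technique showed that the proposed technique improved performance by about 9.95% on average.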

Caching Scheme Considering Access Patterns in Graph Environments (그래프 환경에서 접근 패턴을 고려한 캐싱 기법)

Yoo, Seunghun;Kim, Minsoo;Bok, Kyoungsoo;Yoo, Jaesoo
- Proceedings of the Korea Contents Association Conference
- 2017.05a
- pp.19-20
- 2017
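Recent advances in social media and sensor equipment have rapidly increased the volume of graph data. Processing graph data incurs I/O costs, and as the data grows, the resulting bottleneck limits the performance of data processing and management. To solve this problem, caching techniques that manage data in memory have been studied. This paper proposes a caching scheme that considers the access patterns of subgraph data. In a graph environment, it finds patterns from the graph query history and loads into memory the data selected through a query management table and an FP (frequent pattern)-Tree. In addition, when a cache miss occurs, the neighboring vertices are loaded into memory as well. We also propose a replacement strategy that evicts cached data when memory is full.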
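
The miss-handling behavior described in this abstract, loading a vertex's neighbors along with it on a cache miss, can be sketched as follows. The abstract does not specify the replacement strategy or the FP-Tree selection step, so this sketch assumes plain LRU eviction and omits pattern mining entirely. The class name `NeighborPrefetchCache` and the adjacency-dict graph representation are hypothetical.

```python
from collections import OrderedDict


class NeighborPrefetchCache:
    """Vertex cache that prefetches neighbors on a miss (LRU eviction assumed)."""

    def __init__(self, graph, capacity):
        self.graph = graph          # adjacency dict: vertex -> list of neighbors
        self.capacity = capacity    # maximum number of cached vertices
        self.cache = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def _load(self, vertex):
        # Insert or refresh the vertex, then evict the oldest entries if over capacity.
        self.cache[vertex] = True
        self.cache.move_to_end(vertex)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def access(self, vertex):
        """Return True on a cache hit, False on a miss."""
        if vertex in self.cache:
            self.hits += 1
            self.cache.move_to_end(vertex)
            return True
        self.misses += 1
        # On a miss, bring in the neighbors first so the requested
        # vertex ends up as the most recently used entry.
        for neighbor in self.graph.get(vertex, ()):
            self._load(neighbor)
        self._load(vertex)
        return False


graph = {1: [2, 3], 2: [1], 3: [1], 4: []}
cache = NeighborPrefetchCache(graph, capacity=3)
results = [cache.access(v) for v in (1, 2, 4, 3)]
```

Here the access to vertex 2 hits only because the earlier miss on vertex 1 prefetched it.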

Analysis of GPGPU Performance by dedicating L2 Cache for Texture Data (텍스쳐 데이터를 위한 2차 캐쉬 구조를 가지는 그래픽 처리 장치의 성능 분석)

Kim, Gwang Bok;Kim, Cheol Hong
- Proceedings of the Korean Society of Computer Information Conference
- 2017.01a
- pp.143-144
- 2017
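Recent graphics processing units use multiple memory levels to reduce accesses to DRAM. Unlike the L1 memories, which are accessed separately according to the type of the requested data, the L2 cache of a GPGPU is a cache that all data types can access before going to the long-latency DRAM. In this paper, we restrict the L2 cache, which normally allows access and loading for the various data types specified by applications, to texture data only, and analyze how performance changes. For this experiment, we modify the architecture so that data types other than texture data bypass the L2 cache and access DRAM directly. The analysis shows that allowing only texture data degrades performance on most benchmarks, with an average decrease of 5.58% compared with the baseline architecture. In contrast, needle, an application with a low L2 cache hit rate in our experimental environment, is analyzed to gain overall performance by bypassing unnecessary L2 accesses.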

A Cache Hoarding Method Using Collaborative Filtering in Mobile Computing Environments (모바일 컴퓨팅 환경에서 협업추천 모형을 이용한 캐시 적재 기법)

Jun, Sung-Hae;Jung, Sung-Won;Oh, Kyung-Whan
- Journal of the Korean Institute of Intelligent Systems
- v.14 no.6
- pp.687-692
- 2004
In this paper, we propose an efficient cache hoarding method for mobile computing environments using collaborative filtering. The method addresses a difficult problem of mobile computing: gaps in information service caused by low bandwidth, long delay, and frequent network disconnection. Many previous studies have examined cache hoarding as a way to solve these problems for mobile clients. However, research relying on the history information of a mobile client did not support all of the clients' information requests. In our research, we propose a collaborative filtering model that uses both the history information and the location data of mobile clients. The proposed model supports efficient service of the items a client requires. To evaluate the performance of the proposed model, we run experiments on simulated data using SAS Enterprise Miner. An objective evaluation using the cache hit ratio shows that our model gives good results.
https://doi.org/10.5391/JKIIS.2004.14.6.687
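
As a rough illustration of hoarding by collaborative filtering and of the cache hit ratio used for evaluation, the following sketch scores items by the similarity of the clients who requested them. It is not the authors' model: it assumes user-based filtering with cosine similarity over binary request sets, ignores location data, and the function names `hoard` and `hit_ratio` are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sets of requested items."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def hoard(target, histories, cache_size):
    """Pick the cache_size items with the highest similarity-weighted score.

    histories: dict mapping client id -> set of items that client requested.
    The target's own history counts with similarity 1.0.
    """
    mine = histories[target]
    scores = {}
    for items in histories.values():
        sim = cosine(mine, items)
        for item in items:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken by item name for determinism.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return set(ranked[:cache_size])


def hit_ratio(cache, requests):
    """Fraction of requests that the hoarded cache can serve."""
    if not requests:
        return 0.0
    return sum(1 for r in requests if r in cache) / len(requests)


histories = {"a": {"x", "y"}, "b": {"x", "y", "z"}, "c": {"w"}}
cached = hoard("a", histories, cache_size=3)
ratio = hit_ratio(cached, ["x", "z", "w", "y"])
```

Item `z` is hoarded for client `a` even though `a` never requested it, because the similar client `b` did. That is the gap a history-only scheme would miss.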

An Efficient Method of the Index Reorganization using Partial Index Transfer in Spatial Data Warehouses (공간 데이터 웨어하우스에서 부분 색인 전송을 이용한 효율적인 색인 재구성 기법)

Jeong, Young-Cheol;You, Byeong-Seob;Park, Soon-Young;Lee, Jae-Dong;Bae, Hae-Young
- Proceedings of the Korea Information Processing Society Conference
- 2005.05a
- pp.39-42
- 2005
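The builder of a spatial data warehouse loads changes to the source data into the spatial data warehouse in batches. The spatial data warehouse server then builds an index over the loaded data to answer user queries quickly. Existing index construction techniques include bulk insertion and index transfer. Bulk insertion requires clustering cost to build the index and also yields poorer search performance. Index transfer, in turn, does not support periodic changes to the source data. To solve these problems, this paper proposes an efficient index reorganization technique for spatial data warehouses that uses partial index transfer. In the proposed technique, the builder forms partial indexes from clusters that have been clustered to fit the index structure and transfers them page by page. To solve the physical mapping problem of the transferred partial indexes, the spatial data warehouse server reserves physically contiguous space and writes the partial indexes into the reserved space. The written partial indexes are then inserted into the existing index on the spatial data warehouse server. Because partial indexes are inserted directly into the existing index, the search, split, and readjustment costs of index reorganization are minimized.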

Search Result 57, Processing Time 0.031 seconds
