• Title/Summary/Keyword: Large Scale Data


A Study of File Replacement Policy in Data Grid Environments (데이터 그리드 환경에서 파일 교체 정책 연구)

  • Park, Hong-Jin
    • The KIPS Transactions: Part A / v.13A no.6 s.103 / pp.511-516 / 2006
  • Data grid computing provides geographically distributed storage resources for solving computational problems with large-scale data. Unlike cache replacement in virtual memory or web caching, finding an optimal file replacement policy for data grids is an important problem because the files are very large. Traditional file replacement policies such as LRU (Least Recently Used), LCB-K (Least Cost Beneficial based on K), EBR (Economic-based cache replacement), and LVCT (Least Value-based on Caching Time) either have to predict future requests or need additional resources to perform file replacement. To solve these problems, this paper proposes SBR-k (Sized-based replacement-k), which replaces files based on file size. Simulation results show that the proposed policy performs better than the traditional policies.
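
The abstract does not spell out the SBR-k rule, so the following is only a minimal sketch of size-based replacement in general. It assumes that the largest cached files are evicted first until the incoming file fits; that eviction direction, the class name `SizeBasedCache`, and the omission of the parameter k are assumptions made for illustration, not the paper's method.

```python
from dataclasses import dataclass, field


@dataclass
class SizeBasedCache:
    """Toy storage element that evicts by file size (assumed: largest first)."""
    capacity: int                               # total capacity, arbitrary units
    files: dict = field(default_factory=dict)   # file name -> file size

    def used(self) -> int:
        """Space currently occupied by cached files."""
        return sum(self.files.values())

    def add(self, name: str, size: int) -> list:
        """Cache a file, evicting others if needed. Returns evicted names."""
        if size > self.capacity:
            raise ValueError("file is larger than the whole cache")
        evicted = []
        if name in self.files:
            return evicted  # already cached, nothing to do
        # Assumption: drop the largest cached file until the new one fits.
        while self.used() + size > self.capacity:
            victim = max(self.files, key=self.files.get)
            del self.files[victim]
            evicted.append(victim)
        self.files[name] = size
        return evicted


cache = SizeBasedCache(capacity=100)
cache.add("a.dat", 50)
cache.add("b.dat", 30)
evicted = cache.add("c.dat", 40)  # does not fit, so an eviction is triggered
```

Unlike LRU-style policies, this rule needs no request history or prediction, only the sizes of the files already stored, which matches the motivation given in the abstract.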

MRSPAKE : A Web-Scale Spatial Knowledge Extractor Using Hadoop MapReduce (MRSPAKE : Hadoop MapReduce를 이용한 웹 규모의 공간 지식 추출기)

  • Lee, Seok-Jun;Kim, In-Cheol
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.569-584 / 2016
  • In this paper, we present a spatial knowledge extractor implemented in the Hadoop MapReduce parallel, distributed computing environment. From a large spatial dataset, this knowledge extractor automatically derives a qualitative spatial knowledge base, which consists of both topological and directional relations between pairs of spatial objects. By using an R-tree index and range queries over a distributed spatial data file on HDFS, the MapReduce-enabled spatial knowledge extractor, MRSPAKE, can produce a web-scale spatial knowledge base in a highly efficient way. In experiments with the well-known open spatial dataset OpenStreetMap (OSM), MRSPAKE showed high performance and scalability.
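
To make "topological and directional relations between pairs of spatial objects" concrete, here is a small single-machine sketch. It assumes objects are reduced to axis-aligned bounding boxes and uses a simplified relation vocabulary; the `Box` type, both function names, and the relation labels are hypothetical and do not reproduce MRSPAKE's actual rules, its R-tree index, or its MapReduce jobs.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box (hypothetical stand-in for a spatial object)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def topological_relation(a: Box, b: Box) -> str:
    """Coarse topological relation of a with respect to b."""
    if a == b:
        return "equal"
    if a.xmax < b.xmin or b.xmax < a.xmin or a.ymax < b.ymin or b.ymax < a.ymin:
        return "disjoint"
    if a.xmin <= b.xmin and a.ymin <= b.ymin and a.xmax >= b.xmax and a.ymax >= b.ymax:
        return "contains"
    if b.xmin <= a.xmin and b.ymin <= a.ymin and b.xmax >= a.xmax and b.ymax >= a.ymax:
        return "inside"
    return "overlaps"  # simplification: also covers boxes that only touch


def directional_relation(a: Box, b: Box) -> str:
    """Direction of b as seen from a, using box centres (y grows northwards)."""
    dx = (b.xmin + b.xmax) / 2 - (a.xmin + a.xmax) / 2
    dy = (b.ymin + b.ymax) / 2 - (a.ymin + a.ymax) / 2
    ns = "north" if dy > 0 else "south" if dy < 0 else ""
    ew = "east" if dx > 0 else "west" if dx < 0 else ""
    return (ns + ew) or "same"


park = Box(0, 0, 2, 2)
school = Box(3, 3, 4, 4)
pond = Box(0.5, 0.5, 1, 1)
facts = [
    ("park", topological_relation(park, school), "school"),
    ("park", topological_relation(park, pond), "pond"),
]
```

In a distributed setting, a range query against a spatial index would first narrow down which pairs are worth comparing, so that functions like these run only on nearby candidates rather than on every pair.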

An Analysis of Technical Efficiency in the Korean RCC/RSC (RCC/RSC별 운영 효율성 분석)

  • Keum Jong-Soo;Jang Woon-Jae
    • Journal of Navigation and Port Research / v.29 no.3 s.99 / pp.215-220 / 2005
  • This paper aims to measure and evaluate the technical efficiency, pure technical efficiency, and scale efficiency of Korean RCCs (Rescue Co-ordination Centers) and RSCs (Rescue Sub-Centers) using DEA (Data Envelopment Analysis) with two inputs and four outputs. Several conclusions emerge. First, the average overall technical efficiency is about 91.03%, and the average pure technical efficiency of 96.80% is much larger than the average scale efficiency of 93.83%. This means that inefficiency has much more to do with the inefficient utilization of resources than with the scale of production. Second, Tongyeong shows DRS (decreasing returns to scale), while Incheon, Taean, Gunsan, Yeosu, Ulsan, and Donghae show IRS (increasing returns to scale). Finally, inefficient RCCs/RSCs should benchmark against their reference sets.
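
In DEA, scale efficiency is conventionally obtained per unit as the ratio of overall technical efficiency (constant returns to scale) to pure technical efficiency (variable returns to scale). The sketch below illustrates only that decomposition; the function name and the scores for the two units are hypothetical and are not taken from the paper.

```python
def scale_efficiency(te_crs: float, pte_vrs: float) -> float:
    """Scale efficiency = overall technical efficiency / pure technical efficiency."""
    if not 0 < pte_vrs <= 1 or not 0 < te_crs <= pte_vrs:
        raise ValueError("expected 0 < TE <= PTE <= 1")
    return te_crs / pte_vrs


# Hypothetical (TE, PTE) scores for two decision-making units.
units = {"unit_a": (0.90, 0.95), "unit_b": (1.00, 1.00)}
scale = {name: scale_efficiency(te, pte) for name, (te, pte) in units.items()}
```

Because the ratio is taken unit by unit, the averages of the three measures across all units need not satisfy it exactly.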

J-Tree: An Efficient Index using User Searching Patterns for Large Scale Data (J-tree : 사용자의 검색패턴을 이용한 대용량 데이타를 위한 효율적인 색인)

  • Jang, Su-Min;Seo, Kwang-Seok;Yoo, Jae-Soo
    • Journal of KIISE: Databases / v.36 no.1 / pp.44-49 / 2009
  • In recent years, with the development of portable terminals, various search services over large data sets have become available on such devices. To search large data sets, most information retrieval applications use indexes such as B-trees or R-trees. However, users access only a small portion of the data set, and the access frequencies of data items are not uniform. Existing indexes such as B-trees or R-trees do not take these skewed access patterns into account. A cache can keep frequently accessed data in memory for fast access, but the memory available to a cache is limited. In this paper, we propose a new disk-based index, called J-tree, which considers users' search patterns. The proposed index is a balanced tree that guarantees uniform search time over all data, and it also provides faster search for frequently accessed data. Our experiments show the effectiveness of the proposed index under various settings.

A Study on the Implementation of a Data Acquisition System with a Large Number of Multiple Signal (다채널 다중신호 데이터 획득 시스템의 구현에 관한 연구)

  • Son, Do-Sun;Lee, Sang-Hoon
    • Journal of the Korean Institute of Intelligent Systems / v.20 no.3 / pp.326-331 / 2010
  • This paper presents the design and implementation of a data acquisition system with a large number of channels for a manufacturing machine. The system, which handles 800 channels of analog signals, was designed with the Quartus II tool and a Cyclone II FPGA. The proposed system is suited to large-scale data handling for determining whether the machine is operating correctly. It is composed of a control unit, a voltage divider, and a USB interface. To reduce the data throughput, we used an algorithm that extracts identical data from the acquired data. Test results with the system attached to a manufacturing machine show proper data acquisition across all 800 channels in a short time.

A Case Study: Unsupervised Approach for Tourist Profile Analysis by K-means Clustering in Turkey

  • Yildirim, Mustafa Eren;Kaya, Murat;Ince, Ibrahim Furkan
    • Journal of Internet Computing and Services / v.23 no.1 / pp.11-17 / 2022
  • Data mining is the task of extracting useful information from large volumes of data. It can also be described as using computer algorithms to search large data warehouses for correlations that provide clues about the future. It has been used in the tourism field for marketing, analysis, and business improvement. This study aims to analyze the tourist profile in Turkey through data mining methods. Turkey was selected because it welcomes millions of tourists every year and can serve as a model for other tourist destinations. An anonymized, large-scale data set was used in compliance with the law on the protection of personal data. The data set was taken from a leading tourism company that is still active in Turkey. By applying the k-means clustering algorithm to this data, key profile parameters were obtained and people were clustered into groups according to their characteristics. The results show that the most important and characteristic parameters of a tourist's profile are the age of the tourist, the frequency of their vacations, and the period between the reservation and the vacation itself. Planning future investments, events, and campaign packages around these findings can make tourism companies more competitive and improve quality of service. For both businesses and tourists, it is advantageous to prepare individual events and offers for the three major groups of tourists.
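
As a rough illustration of the clustering step, the sketch below runs plain k-means (Lloyd's algorithm) on the three parameters named in the abstract: age, vacation frequency, and days between booking and vacation. The six records are invented for the example, the features are standardized so that no single unit dominates the distance, and seeding from the first k rows is a simplification; none of this reflects the study's actual data or settings.

```python
import math


def standardize(rows):
    """Rescale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical records: (age, vacations per year, days between booking and trip).
tourists = [
    (22, 1, 5), (45, 3, 60), (68, 6, 180),
    (25, 1, 7), (48, 3, 65), (70, 6, 170),
]
labels, centroids = kmeans(standardize(tourists), k=3)
```

In practice the number of clusters and the initialization would be chosen more carefully, for example by comparing several values of k and several random restarts.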

A Study for Vibration Characteristics of RC Slab with Hybrid Beams in Large Span Educational Facilities (대공간 교육시설에 사용되는 합성보 및 콘크리트 슬래브의 진동평가에 대한 연구)

  • Lee, Kyoung-Hun;Jeong, Eun-Ho
    • The Journal of Sustainable Design and Educational Environment Research / v.9 no.3 / pp.34-40 / 2010
  • In this study, the vibration characteristics of a reinforced concrete slab in large-span educational facilities were evaluated. A 21.75 m × 14.4 m full-scale reinforced concrete slab specimen was constructed with pre-flex hybrid beams. Vibrations were generated by three methods: the free fall of a 6 kg sandbag, a 70 kg person walking, and impact from an impulse hammer. Vibrations were generated more than three times at a single location. Vibration data were collected with an SA390 signal analyzer at five different locations.

Evaluation of turbulent SGS model for large eddy simulation of turbulent flow inside a sudden expansion cylindrical chamber (급 확대부를 갖는 실린더 챔버 내부 유동에 관한 LES 난류모델의 평가)

  • Choi, Chang-Yong (최창용);Ko, Sang-Cheol (고상철)
    • Journal of Advanced Marine Engineering and Technology / v.28 no.3 / pp.423-433 / 2004
  • A large eddy simulation (LES) is performed for turbulent flow in a combustion device. The combustion device is simplified as a cylindrical chamber with a sudden expansion. A flame holder is attached inside the cylindrical chamber to promote turbulent mixing and to stabilize the flame. Turbulent sub-grid scale (SGS) models are applied and validated, with emphasis on evaluating turbulence models for the LES of complex geometries. The simulation code uses a general coordinate system based on the physical contravariant velocity components. The Reynolds number, based on the bulk velocity and the diameter of the inlet pipe, is 5000. The predicted turbulence statistics are evaluated by comparison with LDV measurement data. The Smagorinsky model coefficients are estimated, and the utility of dynamic SGS models is confirmed for the LES of complex geometries.
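
For reference, the standard Smagorinsky model computes the sub-grid eddy viscosity as ν_t = (C_s Δ)² |S|, where |S| = sqrt(2 S_ij S_ij) is the magnitude of the resolved strain-rate tensor and Δ is the filter width. The sketch below evaluates this at a single point; the function name, the default C_s = 0.1, and the sample velocity gradient are illustrative assumptions, not values reported in the paper.

```python
import math


def smagorinsky_viscosity(grad_u, delta, c_s=0.1):
    """Eddy viscosity nu_t = (C_s * delta)^2 * |S| for a 3x3 velocity gradient.

    grad_u[i][j] holds du_i/dx_j; c_s=0.1 is an illustrative default.
    """
    # Resolved strain-rate tensor S_ij = (du_i/dx_j + du_j/dx_i) / 2.
    s = [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)] for i in range(3)]
    # Strain-rate magnitude |S| = sqrt(2 * S_ij * S_ij).
    s_mag = math.sqrt(2.0 * sum(s[i][j] ** 2 for i in range(3) for j in range(3)))
    return (c_s * delta) ** 2 * s_mag


# Hypothetical simple shear: du/dy = 100 1/s, filter width 0.01 m.
shear = [[0.0, 100.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
nu_t = smagorinsky_viscosity(shear, delta=0.01)
```

A dynamic SGS model differs in that the coefficient is computed from the resolved field during the simulation instead of being fixed in advance, which is why the abstract treats the two approaches separately.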

Large Eddy Simulation of Turbulent Premixed Flame in a Swirled Combustor Using Multi-environment Probability Density Function approach (MEPDF를 이용한 와류 연소실 내부 예혼합 화염의 대 와동 모사)

  • Kim, Namsu;Kim, Yongmo
    • Journal of the Korean Society of Combustion / v.22 no.3 / pp.29-34 / 2017
  • The multi-environment probability density function model has been applied to simulate a turbulent premixed flame in a swirl combustor. To realistically account for the unsteady flow motion inside the combustor, the formulations are derived for large eddy simulation. The flamelet-generated manifolds approach is used to simplify the multi-dimensional composition space with reasonable accuracy. Sub-grid scale mixing is modeled with the interaction by exchange with the mean (IEM) mixing model. To validate the present approach, the simulation results are compared with experimental data in terms of mean velocity, temperature, and species mass fractions.

The Evaluation of Accuracy for Airborne Laser Surveying via LiDAR System Calibration (시스템 초기화(Calibration)에 따른 항공레이저측량의 정확도 평가)

  • Lee, Dae-Hee (이대희);Wie, Gwang-Jae (위광재);Kim, Seung-Yong (김승용);Kim, Gap-Jin (김갑진);Lee, Jae-Won (이재원)
    • Proceedings of the Korean Society of Surveying, Geodesy, Photogrammetry, and Cartography Conference / 2004.04a / pp.15-26 / 2004
  • Calibrating the systematic errors of a LiDAR system is crucial for the accuracy of airborne laser scanning. The main error source is the misalignment between the INS (Inertial Navigation System) and the laser scanner. For planimetric calibration of LiDAR, a building is a good feature because it has large changes in height and a continuous flat area on top. The planimetric error (pitch, roll) is corrected by adjusting the height calculated from comparing ground control points (GCPs) on the building with the laser scanning data. The scale correction of the laser range is determined by comparing LiDAR data with GCPs placed at the edge of the scan angle, where the height error is largest. The area used for scale calibration has to be large and flat, with almost uniform elevation. At an average flying height of 1000 m, the accuracy of the LiDAR laser scanning data is within 110 cm in height and ±50 cm in planimetry, so the data can be used to generate 3D terrain surfaces, especially digital surface models (DSMs), which are difficult to measure by aerial photogrammetry in forests, coastal areas, and urban areas with tall buildings.
