• Title/Summary/Keyword: Large Scale Data

Power Investigation of the Entropy-Based Test of Fit for Inverse Gaussian Distribution by the Information Discrimination Index

  • Choi, Byungjin
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.6
    • /
    • pp.837-847
    • /
    • 2012
  • The inverse Gaussian distribution is widely used in applications to analyze and model right-skewed data. To assess the appropriateness of the distribution prior to data analysis, Mudholkar and Tian (2002) proposed an entropy-based test of fit. The test is based on the entropy power fraction (EPF) index suggested by Gokhale (1983). Simulation results report that the power of the entropy-based test is superior to that of other goodness-of-fit tests; however, this observation rests on small-scale simulations against the standard exponential, Weibull W(1; 2) and lognormal LN(0.5; 1) distributions. A large-scale simulation against various alternative distributions should be performed to evaluate the power of the entropy-based test; however, a theoretical method is more effective for investigating the power. In this paper, using the information discrimination (ID) index defined by Ehsan et al. (1995) as a mathematical tool, we scrutinize the power of the entropy-based test. The selected alternative distributions are the gamma, Weibull and lognormal distributions, which are widely used in data analysis as alternatives to the inverse Gaussian distribution. The study results are provided and an illustrative example is analyzed.
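
Entropy-based tests of fit generally need an estimate of the sample entropy. The abstract above does not say which estimator is used, so the following is only a minimal sketch of one common choice, Vasicek's spacing estimator; the function name `vasicek_entropy` and the window parameter `m` are assumptions made for illustration, not taken from the paper.

```python
import math


def vasicek_entropy(sample, m):
    """Vasicek spacing estimate of differential entropy (illustrative sketch).

    H = (1/n) * sum_i log( n / (2m) * (x_(i+m) - x_(i-m)) ),
    where order statistics with index below 1 or above n are clamped
    to the sample minimum and maximum respectively.
    """
    x = sorted(sample)
    n = len(x)
    if not 1 <= m < n / 2:
        raise ValueError("window m must satisfy 1 <= m < n/2")
    total = 0.0
    for i in range(n):
        upper = x[min(i + m, n - 1)]  # clamp at the largest observation
        lower = x[max(i - m, 0)]      # clamp at the smallest observation
        spacing = upper - lower
        if spacing <= 0:
            raise ValueError("tied observations give a zero spacing")
        total += math.log(n / (2 * m) * spacing)
    return total / n
```

A test statistic would then compare this estimate with the entropy implied by the fitted null distribution; that step depends on the specific test and is not sketched here.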

Application of a Non-stationary Frequency Analysis Method for Estimating Probable Precipitation in Korea (전국 확률강수량 산정을 위한 비정상성 빈도해석 기법의 적용)

  • Kim, Gwang-Seob;Lee, Gi-Chun
    • Journal of The Korean Society of Agricultural Engineers
    • /
    • v.54 no.5
    • /
    • pp.141-153
    • /
    • 2012
  • In this study, we estimated probable precipitation amounts for the target years (2020, 2030, 2040) at 55 weather stations in Korea using 24-hour annual maximum precipitation data from 1973 through 2009, which should be useful for the management of agricultural reservoirs. Both trend tests and non-stationarity tests were performed, and non-stationary frequency analysis was conducted for all 55 sites. The Gumbel distribution was chosen and the probability weighted moment method was used to estimate model parameters. The behavior of the mean of the extreme precipitation data, the scale parameter, and the location parameter was analyzed. The probable precipitation amount at each target year was estimated by a non-stationary frequency analysis using linear regression of the mean of the extreme precipitation data, the scale parameter, and the location parameter. Overall results demonstrated that the probable precipitation amounts using the non-stationary frequency analysis were overestimated. Probable precipitation amounts increased substantially in the middle part of Korea and decreased at several sites in the southern part. The non-stationary frequency analysis using a linear model should be applicable to relatively short projection periods.
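
The fitting steps named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the Gumbel parameters come from the standard probability-weighted-moment relations, and the trend step is plain least squares applied to a parameter series that the caller is assumed to have estimated already (for example, over successive sub-periods).

```python
import math

EULER_GAMMA = 0.5772156649015329


def gumbel_pwm(sample):
    """Estimate Gumbel (location, scale) by probability weighted moments."""
    x = sorted(sample)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    b0 = sum(x) / n
    # b1 weights the j-th smallest value (j = 1..n) by (j - 1) / (n - 1)
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    l2 = 2 * b1 - b0                      # second L-moment
    scale = l2 / math.log(2)
    location = b0 - EULER_GAMMA * scale
    return location, scale


def gumbel_quantile(location, scale, return_period):
    """Precipitation depth exceeded on average once per return_period years."""
    p = 1 - 1 / return_period             # non-exceedance probability
    return location - scale * math.log(-math.log(p))


def linear_extrapolate(years, values, target_year):
    """Least-squares line through (years, values), evaluated at target_year."""
    n = len(years)
    mean_y = sum(years) / n
    mean_v = sum(values) / n
    sxx = sum((y - mean_y) ** 2 for y in years)
    sxy = sum((y - mean_y) * (v - mean_v) for y, v in zip(years, values))
    return mean_v + (sxy / sxx) * (target_year - mean_y)
```

In a non-stationary setting, `linear_extrapolate` would be applied separately to the location and scale series, and the extrapolated values passed to `gumbel_quantile` for the target year. Because the line is extended beyond the data, the result is most defensible for short projection periods, which matches the caveat in the abstract.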

Comparison of DBMS Performance for processing Small Scale Database (소용량 데이터베이스 처리를 위한 DBMS의 성능 비교)

  • Jang, Si-Woong
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2008.10a
    • /
    • pp.139-142
    • /
    • 2008
  • While many comparisons of DBMS performance on large-scale databases are available as benchmark results, there are few comparisons of DBMS performance on small-scale databases. In this study, we therefore compared and analyzed the performance of commercial and public DBMSs on a small-scale database. The analysis shows that Oracle performs poorly on update and insert operations because of the rollback overhead it incurs for data safety, whereas MySQL and MS-SQL perform well without this additional overhead.

Comparison of DBMS Performance for processing Small Scale Database (소용량 데이터베이스 처리를 위한 DBMS의 성능 비교)

  • Jang, Si-Woong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.11
    • /
    • pp.1999-2004
    • /
    • 2008
  • While many comparisons of DBMS performance on large-scale databases are available as benchmark results, there are few comparisons of DBMS performance on small-scale databases. In this study, we therefore compared and analyzed the performance of commercial and public DBMSs on a small-scale database. The analysis shows that Oracle performs poorly on update and insert operations because of the rollback overhead it incurs for data safety, whereas MySQL and MS-SQL perform well without this additional overhead.

Extraction of similar XML data based on XML structure and processing unit

  • Park, Jong-Hyun
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.4
    • /
    • pp.59-65
    • /
    • 2017
  • XML has established itself as the format for data exchange on the internet, and the volume of XML instances is large. Extracting similar information from XML instances is therefore a research topic, but existing work is insufficient. In this paper, we extract similar information from various kinds of XML instances according to the same goal. We use only the structure information of an XML instance for information extraction because some XML instances are described without a schema. To extract similar information efficiently, we propose a minimum unit of processing and two approaches for finding the unit. One is a structure-based method, which uses only the structure information of the XML instance, and the other is a measure-based method, which finds a unit by a numerical formula. Our two approaches can be applied to any application that needs to extract similar information from XML data. The approaches can also be used for HTML instances.

The Development of Seismic Monitoring for a Base-Isolated Building System (지진격리 구조물의 지진모니터링 시스템 개발)

  • 김성훈;조대승;박해동;김두훈
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference
    • /
    • 2001.11a
    • /
    • pp.247-251
    • /
    • 2001
  • Base isolation systems such as lead-rubber bearings, elastomer bearings and sliding bearings have been installed in various structures to prevent earthquake damage. The performance of base isolation systems has been well established by model-scale experiments and numerical analysis. However, seismic response data measured at real, large base-isolated structures are still insufficient. This paper presents a seismic monitoring system that acquires real-time acceleration signals on up to 32 channels, displays the time history and spectrum of the signals, stores the acquired data on a PC hard disk, and replays the saved data. Moreover, the system can operate without any limit on the monitoring period through automatic management of the stored data files. The developed system has been installed at a real base-isolated building that uses lead-rubber bearings, and we expect that its seismic response data, together with the ground motion signal, can be acquired reliably when an earthquake occurs.

Regional Scale Satellite Data Sets for Agricultural, Hydrological and Environmental Applications in Zambia

  • Ngoma, Solomon
    • Proceedings of The Korean Society of Agricultural and Forest Meteorology Conference
    • /
    • 2001.06a
    • /
    • pp.43-48
    • /
    • 2001
  • Many applications in agricultural, hydrological and environmental resource management require data over very large areas and with a high imaging frequency, for example monitoring crop growth, water stress, seasonal wetland flooding and natural vegetation development. This precludes the use of fine resolution data (Landsat, Spot) on the grounds of cost, accessibility and low imaging frequency. Meteorological satellites have the potential to fill this need, given their very wide spatial coverage and high repeat imaging. The Remote Sensing Unit (RSU) at the Zambia Meteorological Department routinely receives, processes and archives imagery from both Meteosat and NOAA AVHRR satellites. Here I wish to present some examples of applications of these data sets that arise from the RSU work: relationships between rainfall and vegetation development as assessed by satellite-derived information, and seasonal patterns of flooding in the Barotse floodplain and the Kafue flats. I also wish to outline ways in which more widespread use of these data by Zambian institutions can be achieved.

PLM System Development for Data Management of KSLV-II Program (한국형발사체개발사업 정보 관리를 위한 PLM 시스템 구축)

  • Kwon, Byung-Chan;Park, Chang-Su;Kim, Keun-Taek
    • Journal of Aerospace System Engineering
    • /
    • v.8 no.2
    • /
    • pp.49-54
    • /
    • 2014
  • The main purpose of the Korea Space Launch Vehicle II (KSLV-II) Program is to develop a domestic launch vehicle that can deliver a 1.5-ton class application satellite into a Low Earth Orbit (600~800 km). Data management is an essential factor in systems engineering for the successful development of large-scale complex systems, and it systematically manages the information and technical data for the total life-cycle of a system. In this paper, the data management policies and processes of the KSLV-II program are presented, together with the product life-cycle management system built for the program.

Regional Scale Rice Yield Estimation by Using a Time-series of RADARSAT ScanSAR Images

  • Li, Yan;Liao, Qifang;Liao, Shengdong;Chi, Guobin;Peng, Shaolin
    • Proceedings of the KSRS Conference
    • /
    • 2003.11a
    • /
    • pp.917-919
    • /
    • 2003
  • This paper demonstrates that RADARSAT ScanSAR data can be an important source of radar remote sensing data for monitoring crop systems and estimating rice yield over large areas in tropical and subtropical regions. Experiments were carried out to show the effectiveness of RADARSAT ScanSAR data for rice yield estimation across the whole province of Guangdong, South China. A methodology was developed to deal with a series of issues in extracting rice information from the ScanSAR data, such as topographic influences, levels of agro-management, irregular distribution of paddy fields and different rice cropping systems. A model for rice yield estimation is provided, based on the relationship between the backscatter coefficient of multi-temporal SAR data and the biomass of rice.

An Efficient Data Augmentation for 3D Medical Image Segmentation (3차원 의료 영상의 영역 분할을 위한 효율적인 데이터 보강 방법)

  • Park, Sangkun
    • Journal of Institute of Convergence Technology
    • /
    • v.11 no.1
    • /
    • pp.1-5
    • /
    • 2021
  • Deep learning based methods achieve state-of-the-art accuracy; however, they typically rely on supervised training with large labeled datasets. In many medical applications, labeling medical images is known to require significant expertise and much time, and typical hand-tuned approaches to data augmentation fail to capture the complex variations in such images. This paper proposes a 3D image augmentation method to overcome these difficulties. It enriches the diversity of training data samples, which is essential in medical image segmentation tasks, and thus reduces the overfitting caused by the typically small scale of medical image datasets. Our numerical experiments demonstrate that the proposed approach provides significant improvements over state-of-the-art methods for 3D medical image segmentation.