• Title/Summary/Keyword: Multiple Data Sets

GIS-based Spatial Integration and Statistical Analysis using Multiple Geoscience Data Sets : A Case Study for Mineral Potential Mapping (다중 지구과학자료를 이용한 GIS 기반 공간통합과 통계량 분석 : 광물 부존 예상도 작성을 위한 사례 연구)

  • 이기원;박노욱;권병두;지광훈
    • Korean Journal of Remote Sensing
    • /
    • v.15 no.2
    • /
    • pp.91-105
    • /
    • 1999
  • Spatial data integration using multiple geo-based data sets is regarded as one of the primary GIS application issues, and several integration schemes have been developed from the perspective of mathematical geology or geomathematics. However, statistical/quantitative assessment of the relationships between the integrated layer and the input layers has not yet been fully considered. To address this gap, this study first performed spatial data integration of multiple geoscientific data sets using known integration algorithms. For spatial integration with raster-based GIS functionality, geological, geochemical, and geophysical data sets, DEM-derived data sets, and remotely sensed imagery from the Ogdong area were used for geological thematic mapping related to mineral potential mapping. In addition, statistical/quantitative information was extracted on the relationships among the data sets used and between each data set and the integrated layer, within the scope of multiple data fusion and a schematic statistical assessment methodology. Certainty factor (CF) estimation and principal component analysis (PCA) were applied as the spatial integration schemes. The study did not aim at a direct comparison of the two methodologies; instead, it focused on contingency-table-based statistical methods for the assessment between the integrated layer and the input layers. In particular, the jackknife technique was applied to reduce bias in PCA-based spatial integration. Through statistical analysis of the integration information in this case study, new information on the relationships between the integrated layer and the input layers was extracted, and the influence of the input data sets on the integrated layer was assessed. This kind of approach provides decision-making information from a GIS viewpoint and also serves as exploratory data analysis linking GIS and geoscientific applications, especially when handling spatial integration or data fusion with complex variable data sets.

Automatic Registration Method for Multiple 3D Range Data Sets (다중 3차원 거리정보 데이타의 자동 정합 방법)

  • 김상훈;조청운;홍현기
    • Journal of KIISE:Software and Applications
    • /
    • v.30 no.12
    • /
    • pp.1239-1246
    • /
    • 2003
  • Registration is the process of aligning range data sets from different views in a common coordinate system. To obtain a complete 3D model, the data sets must be refined after coarse registration. One of the most popular refinement techniques is the iterative closest point (ICP) algorithm, which starts with pre-estimated overlapping regions. This paper presents an improved ICP algorithm that can automatically register multiple 3D data sets from unknown viewpoints. The sensor projection, which represents the mapping of the 3D data into its associated range image, is used to determine the overlapping region of two range data sets. By combining the ICP algorithm with the sensor projection constraint, multiple 3D data sets can be registered automatically, without error-prone pre-procedures, mechanical positioning devices, or manual assistance. Experimental results on a couple of 3D data sets showed that the proposed method performs better than previous methods.
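As a rough illustration of the refinement step this abstract builds on, the following is a minimal sketch of plain point-to-point ICP. It is not the authors' improved method: it omits the sensor projection constraint, uses brute-force nearest-neighbour matching, and assumes the two point sets overlap fully and are already coarsely aligned. The function names `best_rigid_transform` and `icp` are hypothetical.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t approximates dst[i] (SVD/Kabsch)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(src, dst, iterations=20):
    """Plain point-to-point ICP; returns the accumulated rotation and translation."""
    R_total, t_total = np.eye(3), np.zeros(3)
    current = src.copy()
    for _ in range(iterations):
        # Assumption: brute-force closest-point search, fine for small point sets.
        d2 = ((current[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        matches = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(current, matches)
        current = current @ R.T + t
        # Compose the new increment with the transform found so far.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

Calling `icp(src, dst)` with two `(N, 3)` arrays returns a rotation matrix and translation vector mapping `src` onto `dst`. Like any basic ICP, this sketch only converges when the initial misalignment is small relative to the point spacing.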

Application of SOLAS to the Multiple Imputation for Missing Data

  • Moon, Sung-Ho;Kim, Hyun-Jeong;Shin, Jae-Kyoung
    • Journal of the Korean Data and Information Science Society
    • /
    • v.14 no.3
    • /
    • pp.579-590
    • /
    • 2003
  • When we analyze incomplete data, i.e., data with missing values, the missing values must be treated. A common approach is to delete the cases with missing values, but various other methods have been developed. Among them are the EM algorithm and regression-based methods, which estimate the missing values and impute them with the estimates. In this paper, we introduce SOLAS, multiple imputation software that generates multiple data sets and performs imputation with them.

Forecasting of Seasonal Inflow to Reservoir Using Multiple Linear Regression (다중선형회귀분석에 의한 계절별 저수지 유입량 예측)

  • Kang, Jaewon
    • Journal of Environmental Science International
    • /
    • v.22 no.8
    • /
    • pp.953-963
    • /
    • 2013
  • Reliable long-term streamflow forecasting is invaluable for water resource planning and management, which allocates water supply according to the demand of water users. Seasonal inflow to Andong dam is forecast and assessed using statistical methods based on hydrometeorological data. The predictors used to forecast seasonal inflow to Andong dam are selected from the Southern Oscillation Index, sea surface temperature, and 500 hPa geopotential height data for the Northern Hemisphere. Predictors are selected in two steps: primary predictor sets are obtained, and the final predictors are then determined from those sets. The primary predictor sets for each season are identified using cross correlation and mutual information. The final predictors are identified using partial cross correlation and partial mutual information. Three predictors are selected for each season. The values are determined using a bootstrapping technique with a specific significance level for predictor selection. Seasonal inflow is forecast by multiple linear regression analysis using the selected predictors for each season, and the forecasts are assessed using cross validation. The multiple linear regression analysis is performed using SAS, and its results are assessed by mean squared error and mean absolute error. A contingency table is also constructed and assessed by the Heidke skill score. The assessment reveals that the forecasts by multiple linear regression analysis are better than the reference forecasts.
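For readers unfamiliar with the contingency-table assessment mentioned in this abstract, a minimal sketch of the Heidke skill score follows. It uses the standard multi-category definition (proportion correct compared with the proportion expected by chance); the function name and the row/column convention are assumptions, not details taken from the paper.

```python
import numpy as np

def heidke_skill_score(table):
    """Heidke skill score for a K x K contingency table.

    Assumed layout: rows are forecast categories, columns are observed categories.
    """
    table = np.asarray(table, dtype=float)
    n = table.sum()
    proportion_correct = np.trace(table) / n
    # Agreement expected by chance, from the row and column marginal totals.
    expected_correct = (table.sum(axis=1) * table.sum(axis=0)).sum() / n**2
    return (proportion_correct - expected_correct) / (1.0 - expected_correct)
```

A score of 1 indicates perfect forecasts, 0 indicates no improvement over chance, and negative values indicate forecasts worse than chance.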

Productivity Evaluation and Comparison of Korean Provincial Hospitals (한국 지방공사 의료원의 생산성 평가와 비교)

  • Ahn, Tae-Sik;Park, Jung-Sik
    • Korea Journal of Hospital Management
    • /
    • v.2 no.1
    • /
    • pp.22-47
    • /
    • 1997
  • This paper evaluated the relative efficiency of 33 provincial medical centers using Data Envelopment Analysis (DEA) and compared the DEA efficiency results with those of the current method used by the management evaluation team. DEA was selected as an alternative efficiency evaluation method because it can handle multiple inputs and multiple outputs simultaneously and identify the sources of inefficiency. To analyze the sensitivity of productivity values to the variable sets, four different sets of input and output variables were identified. Results showed that most of the medical centers operate far from the efficiency frontier, supporting previous results. Some centers showed 100% efficiency regardless of the selected variable sets. When the DEA results were compared with the current management evaluation results, inconsistencies between the two methods were found for some decision-making units (DMUs), showing the existence of methodology bias. DEA results and ratio analysis results mostly agree for the 1992 data.

A Multi-Model Based Noisy Speech Recognition Using the Model Compensation Method (다 모델 방식과 모델보상을 통한 잡음환경 음성인식)

  • Chung, Young-Joo;Kwak, Seung-Woo
    • MALSORI
    • /
    • no.62
    • /
    • pp.97-112
    • /
    • 2007
  • Speech recognizers generally operate in noisy acoustic environments, and much research has been done to cope with acoustic variations. Among the proposed methods, the multiple-HMM model approach seems to be quite effective compared with conventional methods. In this paper, we consider a multiple-model approach combined with the model compensation method and investigate the necessary number of HMM model sets through noisy speech recognition experiments. By using data-driven Jacobian adaptation for model compensation, the multiple-model approach with only a few model sets for each noise type achieved results comparable with those of the re-training method.

ERS-1 AND CCRS C-SAR Data Integration For Look Direction Bias Correction Using Wavelet Transform

  • Won, J.S.;Moon, Woo-Il M.;Singhroy, Vern;Lowman, Paul-D.Jr.
    • Korean Journal of Remote Sensing
    • /
    • v.10 no.2
    • /
    • pp.49-62
    • /
    • 1994
  • Look direction bias in a single-look SAR image can often lead to misinterpretation in geological applications of radar data. This paper investigates digital processing techniques for SAR image data integration and compensation of the SAR look direction bias. The two important approaches for reducing look direction bias and integrating multiple SAR data sets are (1) principal component analysis (PCA) and (2) wavelet transform (WT) integration techniques. These two methods were investigated and tested with the ERS-1 (VV-polarization) and CCRS's airborne (HH-polarization) C-SAR image data sets recorded over the Sudbury test site, Canada. The PCA technique has been very effective for integrating more than two layers of digital image data. When only two sets of SAR data are available, the PCA technique requires at least one more set of auxiliary data for proper rendition of fine surface features. The WT approach to SAR data integration exploits the property that decomposes images into an approximated image (low frequencies), which characterizes the spatially large and relatively distinct structures, and a detailed image (high frequencies), in which the information on fine structures is preserved. The test results with the ERS-1 and CCRS's C-SAR data indicate that the new WT approach is more efficient and robust than the PCA approach in enhancing the fine details of the multiple SAR images.

A Technique to Improve the Fit of Linear Regression Models for Successive Sets of Data

  • Park, Sung H.
    • Journal of the Korean Statistical Society
    • /
    • v.5 no.1
    • /
    • pp.19-28
    • /
    • 1976
  • In empirical studies that fit a multiple linear regression model to successive cross-sectional data observed on the same set of independent variables over several time periods, one often faces the problem of a poor $R^2$, the coefficient of multiple determination, which provides a standard measure of how well a specified regression line fits the sample data.

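For reference, the coefficient of multiple determination mentioned here is conventionally defined as follows. The notation is an assumption, not taken from the paper: $y_i$ are the observed responses, $\hat{y}_i$ the fitted values, and $\bar{y}$ the sample mean of the $n$ observations.

```latex
R^2 \;=\; 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

For a least-squares fit with an intercept, $R^2$ lies between 0 and 1, with values near 1 meaning the model explains most of the variation in the sample.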
Deformable image registration in radiation therapy

  • Oh, Seungjong;Kim, Siyong
    • Radiation Oncology Journal
    • /
    • v.35 no.2
    • /
    • pp.101-111
    • /
    • 2017
  • The number of imaging data sets acquired during radiation treatment has increased significantly since a diverse range of advanced techniques was introduced into the field of radiation oncology. As a consequence, many studies have proposed meaningful applications of these imaging data sets. These applications commonly require a method to align the data sets to a reference. Deformable image registration (DIR) is a process that satisfies this requirement by locally registering image data sets to a reference image set. DIR identifies the spatial correspondence that minimizes the differences between two or among multiple sets of images. This article describes clinical applications, validation, and algorithms of DIR techniques. Applications of DIR in radiation treatment include dose accumulation, mathematical modeling, automatic segmentation, and functional imaging. The validation methods discussed are based on anatomical landmarks, physical phantoms, digital phantoms, and application-specific purposes. DIR algorithms are also briefly reviewed with respect to two algorithmic components: similarity index and deformation models.

Cross platform classification of microarrays by rank comparison

  • Lee, Sunho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.2
    • /
    • pp.475-486
    • /
    • 2015
  • Mining the microarray data accumulated in public data repositories can save experimental cost and time and provide valuable biomedical information. Big data analysis that pools multiple data sets increases statistical power, improves the reliability of the results, and reduces the bias specific to an individual study. However, integrating several data sets from different studies requires dealing with many problems. In this study, I limited the focus to cross-platform classification, in which the platform of a testing sample differs from the platform of the training set, and suggested a simple classification method based on rank. This method is compared with diagonal linear discriminant analysis, the k-nearest neighbor method, and the support vector machine using real cross-platform example data sets for two cancers.