• Title/Summary/Keyword: Sampling set selection

The Effect of Sports Club Membership Lifestyle on Choice Behavior

  • Sunmun Park;Shuo LI
    • International Journal of Advanced Culture Technology / v.11 no.2 / pp.267-275 / 2023
  • The purpose of this study is to investigate the influence of sports center members' lifestyles on participation promotion and choice behavior. More specifically, we establish and test a hypothetical model, based on preceding studies, of the factors that facilitate and sustain participation according to the lifestyle of sports center members. The study population was male and female adults over 20 who were using sports centers in Gwangju Metropolitan City and Jeollanam-do in 2021. The sample was drawn by cluster random sampling, and 300 responses were used in the actual analysis after excluding 60 questionnaires with duplicate, insincere, or unreliable answers. The survey instrument was adapted for this study from questionnaires whose reliability and validity had been verified in previous studies, and all questionnaire items used a 5-point scale. The data were analyzed by frequency analysis, exploratory factor analysis, reliability analysis, and multiple regression analysis using SPSS for Windows, version 21.0. The conclusions are as follows. First, the lifestyle of sports center members was found to have a partial influence on participation promotion factors. Second, the lifestyle of sports center members was found to have a partial influence on choice behavior. Third, the participation promotion factors of sports center members were found to partially affect choice behavior.

Effects of Single Nucleotide Polymorphism Marker Density on Haplotype Block Partition

  • Kim, Sun Ah;Yoo, Yun Joo
    • Genomics & Informatics / v.14 no.4 / pp.196-204 / 2016
  • Many researchers have found that one of the most important characteristics of the structure of linkage disequilibrium is that the human genome can be divided into non-overlapping block partitions in which only a small number of haplotypes are observed. The location and distribution of haplotype blocks can be seen as a population property influenced by population genetic events such as selection, mutation, recombination and population structure. In this study, we investigate the effects of the density of markers relative to the full set of all polymorphisms in the region on the results of haplotype partitioning for five popular haplotype block partition methods: three methods in Haploview (confidence interval, four gamete test, and solid spine), MIG++ implemented in PLINK 1.9 and S-MIG++. We used several experimental datasets obtained by sampling subsets of single nucleotide polymorphism (SNP) markers of chromosome 22 region in the 1000 Genomes Project data and also the HapMap phase 3 data to compare the results of haplotype block partitions by five methods. With decreasing sampling ratio down to 20% of the original SNP markers, the total number of haplotype blocks decreases and the length of haplotype blocks increases for all algorithms. When we examined the marker-independence of the haplotype block locations constructed from the datasets of different density, the results using below 50% of the entire SNP markers were very different from the results using the entire SNP markers. We conclude that the haplotype block construction results should be used and interpreted carefully depending on the selection of markers and the purpose of the study.

Development and Testing of a Machine Learning Model Using 18F-Fluorodeoxyglucose PET/CT-Derived Metabolic Parameters to Classify Human Papillomavirus Status in Oropharyngeal Squamous Carcinoma

  • Changsoo Woo;Kwan Hyeong Jo;Beomseok Sohn;Kisung Park;Hojin Cho;Won Jun Kang;Jinna Kim;Seung-Koo Lee
    • Korean Journal of Radiology / v.24 no.1 / pp.51-61 / 2023
  • Objective: To develop and test a machine learning model for classifying human papillomavirus (HPV) status in patients with oropharyngeal squamous cell carcinoma (OPSCC) using 18F-fluorodeoxyglucose (18F-FDG) PET-derived parameters and an appropriate combination of machine learning methods. Materials and Methods: This retrospective study enrolled 126 patients (118 male; mean age, 60 years) with newly diagnosed, pathologically confirmed OPSCC who underwent 18F-FDG PET-computed tomography (CT) between January 2012 and February 2020. Patients were randomly assigned to training and internal validation sets in a 7:3 ratio. An external test set of 19 patients (16 male; mean age, 65.3 years) was recruited sequentially from two other tertiary hospitals. Model 1 used only PET parameters, Model 2 used only clinical features, and Model 3 used both PET and clinical parameters. Multiple feature transforms, feature selection methods, oversampling methods, and training models were investigated. The external test set was used to test the three models that performed best in the internal validation set. The values for area under the receiver operating characteristic curve (AUC) were compared between models. Results: In the external test set, ExtraTrees-based Model 3, which uses two PET-derived parameters and three clinical features with a combination of MinMaxScaler, mutual information selection, and the adaptive synthetic sampling approach, showed the best performance (AUC = 0.78; 95% confidence interval, 0.46-1). Model 3 outperformed Model 1 using PET parameters alone (AUC = 0.48, p = 0.047) and Model 2 using clinical parameters alone (AUC = 0.52, p = 0.142) in predicting HPV status. Conclusion: Using oversampling and mutual information selection, an ExtraTrees-based HPV status classifier was developed by combining metabolic parameters derived from 18F-FDG PET/CT with clinical parameters in OPSCC; it exhibited higher performance than the models using either PET or clinical parameters alone.

Construction and Application of Network Design System for Optimal Water Quality Monitoring in Reservoir (저수지 최적수질측정망 구축시스템 개발 및 적용)

  • Lee, Yo-Sang;Kwon, Se-Hyug;Lee, Sang-Uk;Ban, Yang-Jin
    • Journal of Korea Water Resources Association / v.44 no.4 / pp.295-304 / 2011
  • For effective water quality management, it is necessary to secure reliable water quality information. Many factors must be considered in designing a comprehensive, practical monitoring network, including representative sampling locations, suitable sampling frequencies, selection of water quality variables, and budgetary and logistical constraints; of these, sampling location is considered the most important. Until now, monitoring networks for water quality management have been designed according to qualitative judgment, which raises a problem of representativeness. In this paper, we propose a network design system for optimal water quality monitoring based on statistical techniques. The system is built on SAS version 9.2 and is configured with a simple input system and user-friendly outputs. For ease of use, it accepts data in Excel format, with the data for each sampling location stored on a separate sheet. The system produces time plots, a dendrogram, and scatter plots. Time plots of water quality variables are used to identify the variables that discriminate significantly among sampling locations. Similarities between sampling locations are calculated using Euclidean distances of the principal component variables, coordinates are calculated by multidimensional scaling, and a dendrogram from cluster analysis is displayed so that users can choose an appropriate number of clusters. Scatter plots of the principal component variables show the clustering of sampling locations and the representative location.

The NHPP Bayesian Software Reliability Model Using Latent Variables (잠재변수를 이용한 NHPP 베이지안 소프트웨어 신뢰성 모형에 관한 연구)

  • Kim, Hee-Cheul;Shin, Hyun-Cheul
    • Convergence Security Journal / v.6 no.3 / pp.117-126 / 2006
  • Bayesian inference and model selection methods for software reliability growth models are studied. Software reliability growth models are used in the testing stages of software development to model the error content and the time intervals between software failures. In this paper, multiple integration is avoided by using Gibbs sampling, a Markov chain Monte Carlo method, to compute the posterior distribution. Bayesian inference for general order statistics models in software reliability with diffuse prior information, together with a model selection method, is studied. For model determination and selection, goodness of fit (the error sum of squares) and trend tests are explored. The methodology developed in this paper is illustrated with a random software reliability data set generated from a Weibull distribution (shape 2, scale 5) using the Minitab (version 14) statistical package.

Generation and Selection of Nominal Virtual Examples for Improving the Classifier Performance (분류기 성능 향상을 위한 범주 속성 가상예제의 생성과 선별)

  • Lee, Yu-Jung;Kang, Byoung-Ho;Kang, Jae-Ho;Ryu, Kwang-Ryel
    • Journal of KIISE: Software and Applications / v.33 no.12 / pp.1052-1061 / 2006
  • This paper presents a method of using virtual examples to improve the classification accuracy for data with nominal attributes. Most previous research on virtual examples focused on data with numeric attributes and used domain-specific knowledge to generate useful virtual examples for a particular target learning algorithm. Instead of using domain-specific knowledge, our method samples virtual examples from a naive Bayesian network constructed from the given training set. A sampled example is considered useful if it increases the network's conditional likelihood when added to the training set. A set of useful virtual examples can be collected by repeating this process of sampling followed by evaluation. Experiments have shown that the virtual examples collected this way can help various learning algorithms derive classifiers of improved accuracy.
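The sampling step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Laplace smoothing parameter `alpha`, and the toy training set are all assumptions made for illustration, and the evaluation step (keeping only examples that raise the conditional likelihood) is left out.

```python
import random
from collections import Counter


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Estimate P(class) and P(attribute value | class) from nominal data.

    alpha is a Laplace smoothing constant (an assumption of this sketch).
    """
    n_attrs = len(rows[0])
    classes = sorted(set(labels))
    # Observed value domain of each nominal attribute.
    domains = [sorted({row[j] for row in rows}) for j in range(n_attrs)]
    class_counts = Counter(labels)
    prior = {c: class_counts[c] / len(labels) for c in classes}
    cond = {}
    for c in classes:
        members = [row for row, y in zip(rows, labels) if y == c]
        for j in range(n_attrs):
            counts = Counter(row[j] for row in members)
            denom = len(members) + alpha * len(domains[j])
            cond[(c, j)] = {v: (counts[v] + alpha) / denom for v in domains[j]}
    return prior, cond, n_attrs


def sample_virtual_example(prior, cond, n_attrs, rng):
    """Draw a class from the prior, then each attribute given that class."""
    classes = list(prior)
    c = rng.choices(classes, weights=[prior[k] for k in classes])[0]
    attrs = []
    for j in range(n_attrs):
        values = list(cond[(c, j)])
        weights = [cond[(c, j)][v] for v in values]
        attrs.append(rng.choices(values, weights=weights)[0])
    return tuple(attrs), c


# Hypothetical toy training set with two nominal attributes.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]

prior, cond, n_attrs = fit_naive_bayes(rows, labels)
rng = random.Random(0)
virtual = [sample_virtual_example(prior, cond, n_attrs, rng) for _ in range(5)]
```

In the method the abstract describes, each sampled example would then be evaluated and kept only if it proved useful; that selection step is not shown here.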

An Ileal Amino Acid Digestibility Assay for the Growing Meat Chicken-Effect of Feeding Method and Digesta Collection Procedures

  • Yap, K.H.;Kadim, I.T.;King, R.D.;Moughan, P.J.
    • Asian-Australasian Journal of Animal Sciences / v.10 no.6 / pp.671-678 / 1997
  • The objective was to evaluate the method of feeding (free access or intubation), method of slaughter (carbon dioxide gas or barbiturate) and digesta flushing medium (distilled water or physiological saline) in the development of an ileal amino acid digestibility assay for 4-week-old broiler chickens. Three diets were used (commercial (C), semi-synthetic meat-and-bone meal (MBM) or wheat (W)). For the coarser C and W diets, but not for the MBM diet, feeding method had a significant effect on concentrations of chromium (Cr), nitrogen (N), acid detergent fibre (ADF) and neutral detergent fibre (NDF) in the crop contents at a set time after a meal. There appeared to be a selection of food particles under free-access feeding. For birds receiving the wheat diet there was an effect (p < 0.05) of sampling time after feeding on the concentrations of Cr, N, ADF and NDF/Cr in the crop contents. Flushing ileal digesta with distilled water or saline led to similar apparent ileal N digestibility coefficients. Birds given the MBM diet and killed by inhalation of $CO_2$ had significantly (p < 0.05) lower apparent ileal N digestibility coefficients (73 versus 80%) than those killed by barbiturate overdose.

Response on New Credit Program In Indonesia: An Asymmetric Information Perspective

  • PURWONO, Rudi;NUGROHO, Ris Yuwono Yudo;MUBIN, M. Khoerul
    • The Journal of Asian Finance, Economics and Business / v.6 no.2 / pp.33-44 / 2019
  • The Indonesian government launched a new people's business credit program (KUR) as part of a package of economic policy and deregulation. The interest rate is set lower than the average of current loan interest rates, especially when compared with rural bank interest rates. To capture the social and spatial aspects, quota sampling is applied to ten areas that are divided according to social culture. The method used in this research is the logit model, which is designed to analyse the determinants of asymmetric information, particularly for rural banks and small and micro enterprises. The study was conducted in East Java, the province with the largest number of rural banks in Indonesia. Based on the estimation of the asymmetric information model for respondents from rural banks and small businesses, the results show that adverse selection can be avoided by strengthening the information about prospective borrowers. Regarding moral hazard, rural banks and small business owners argued that imposing collateral on the debtor plays an important role in avoiding it. Rural bank respondents stated that the KUR program, with its low interest rates, has affected their business development. The results imply the need to broaden the collaboration schemes between this people's business credit program and rural banks.

Quantitative Comparison of Activity Calculation Methods for the Selection of Most Reliable Radionuclide Inventory Estimation

  • Hwang, Ki-Ha;Lee, Sang-Chul;Lee, Kun-Jai;Jeong, Chan-Woo;Ahn, Sang-Myeon;Kim, Tae-Wook;Kim, Kyoung-Doek;Herr, Y.H.
    • Proceedings of the Korean Radioactive Waste Society Conference / 2003.11a / pp.322-327 / 2003
  • Accurate knowledge of the radionuclide inventory of radioactive waste is important for reliable management. However, estimating radionuclide concentrations in drummed radioactive waste is difficult and unreliable because of the difficulty of direct detection, high cost, and radiation exposure of sampling personnel. To overcome these difficulties, scaling factors (SFs) have been used to assess the activities of radionuclides that cannot be directly analyzed. A radionuclide assay system has been operated at the KORI site since 1996, and the consolidated scaling factor method has played a dominant role in determining radionuclide concentrations. However, some problems remain, such as uncertainty in the estimated scaling factor values, inaccuracy of the analyzed sample values, and disparity between the actual and ideal correlation pairs, among others. Therefore, the accuracy of the scaling factor values needs to be improved. This paper focuses on improving the accuracy and representativeness of calculated scaling factor values using statistical techniques. To select a reliable activity determination method, the accuracy of the estimated SF values is compared for each activity determination method. Based on this comparison, it is recommended that the SF determination method be changed from the arithmetic mean to the geometric mean for more reliable estimation of radionuclide activity. The arithmetic mean and geometric mean methods are compared using the data set from the KORI system.
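The difference between the two averaging methods named in this abstract can be shown with a small sketch. It assumes, for illustration only, that a scaling factor is computed from per-sample ratios of a hard-to-measure nuclide's activity to a key nuclide's activity; the ratio values below are hypothetical and are not taken from the KORI data set.

```python
import math


def arithmetic_mean_sf(ratios):
    """Scaling factor as the arithmetic mean of per-sample activity ratios."""
    return sum(ratios) / len(ratios)


def geometric_mean_sf(ratios):
    """Scaling factor as the geometric mean (exponential of the mean log ratio)."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical activity ratios; the last value is a deliberately large outlier.
ratios = [0.01, 0.02, 0.04, 0.5]

am = arithmetic_mean_sf(ratios)
gm = geometric_mean_sf(ratios)
print(f"arithmetic mean SF: {am:.4f}")
print(f"geometric mean SF:  {gm:.4f}")
```

With ratios spread over more than an order of magnitude, the arithmetic mean is pulled toward the largest value, while the geometric mean stays closer to the bulk of the samples. This may be one reason the geometric mean is preferred when ratios are skewed, although the abstract itself does not spell out its reasoning.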

An Evaluation of ETM+ Data Capability to Provide 'Forest-Shrub land-Range' Map (A Case Study of Neka-Zalemroud Region-Mazandaran-Iran)

  • Latifi Hooman;Olade Djafar;Saroee Saeed;jalilvand Hamid
    • Proceedings of the KSRS Conference / 2005.10a / pp.403-406 / 2005
  • To evaluate the capability of ETM+ remotely sensed data to provide a 'Forest-Shrub land-Rangeland' cover type map in areas near the timberline of the northern forests of Iran, the data were analyzed for an area of nearly 790 ha located in the Neka-Zalemroud region. First, an ortho-rectification process was used to correct the geometric errors of the image, yielding RMS errors of 0.68 and 0.69 pixels along the X and Y axes, respectively. The original and panchromatic bands were fused using the PANSHARP statistical module. The ground truth map was made using 1 ha field plots in a systematic-random sampling grid, and the vegetative form (trees, shrubs, or rangeland) was recorded as the criterion for naming the plots. A set of channels, including the original bands, NDVI and IR/R indices, and the first components of PCI from the visible and infrared bands, was used for the classification procedure. Pair-wise divergence, computed through the CHNSEL command, was used to evaluate the separability of classes and to select the optimal channels. Classification was performed with the ML classifier on both the original and fused data sets. The best results, an overall accuracy of 67% and a Kappa coefficient of 0.43, were obtained on the original data set. From these results, it is concluded that ETM+ data have an intermediate capability to capture the spectral variations of the three form-based classes over the study area.
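The two accuracy measures reported in this abstract, overall accuracy and the Kappa coefficient, are both derived from a confusion matrix. The sketch below shows the standard calculation; the three-class matrix is hypothetical and is not the study's data.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what row/column totals give by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical confusion matrix: rows are reference classes,
# columns are classified classes (forest, shrub land, rangeland).
cm = [
    [30, 5, 5],
    [6, 20, 4],
    [4, 6, 20],
]

acc = overall_accuracy(cm)
kappa = cohen_kappa(cm)
print(f"overall accuracy: {acc:.2f}, kappa: {kappa:.2f}")
```

Kappa is lower than overall accuracy because it discounts the agreement expected by chance, which is why a map with 67% overall accuracy can have a Kappa of only 0.43.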
