• Title/Summary/Keyword: survey data

A case study on verification of internet survey (인터넷 설문조사의 검증에 관한 사례연구)

  • Ryu, Gui-Yeol;Moon, Young-Soo
    • Journal of the Korean Data and Information Science Society / v.25 no.1 / pp.11-18 / 2014
  • The objective of this study is to verify the accuracy of an internet survey by comparing survey responses with database records. The internet survey was conducted in August 2012. Respondents were subscribers of KISTI NDSL. The variables were age and organization (demographic variables) and number of uses and period of use (attitude variables). The mismatch rates for age, organization, number of uses, and period of use were 7.5%, 5%, 92%, and 55%, respectively. After detailed verification, the mismatch rate for age was estimated at 3% from a pessimistic point of view and 1% from an optimistic point of view. The corresponding mismatch rates for organization were 4.5% (pessimistic) and 2% (optimistic). The mismatch rates for frequency of use and period of use are very high because of measurement error, recall problems, attitudes toward the internet, and similar factors. The study implies that internet survey data can be reliable. Further research is needed on the verification of internet surveys.
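
The mismatch rate reported above can be read as the share of respondents whose survey answer differs from the database record. The following is a minimal sketch under that assumption; the function name and the sample records are hypothetical and are not taken from the study.

```python
def mismatch_rate(db_values, survey_values):
    """Share of respondents whose survey answer differs from the database record."""
    if len(db_values) != len(survey_values):
        raise ValueError("both sources must cover the same respondents")
    if not db_values:
        raise ValueError("at least one respondent is required")
    mismatches = sum(1 for d, s in zip(db_values, survey_values) if d != s)
    return mismatches / len(db_values)


# Hypothetical age-group records for five respondents (not data from the study).
db_age = ["20s", "30s", "30s", "40s", "50s"]
survey_age = ["20s", "30s", "40s", "40s", "50s"]

rate = mismatch_rate(db_age, survey_age)
print(f"{rate:.1%}")
```

An exact comparison like this is the strictest ("pessimistic") reading. A more lenient rule, for example accepting an adjacent age group, would give the lower, "optimistic" kind of figure.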

An Alternative Design of the Internet Survey for Data Quality (데이터 품질을 위한 인터넷 설문조사의 대안적 설계)

  • Kim, Byoung-Gil;Lee, Ki-Dong
    • Journal of Digital Convergence / v.8 no.3 / pp.129-141 / 2010
  • Although the internet survey, an alternative to the offline survey, has various merits, it still suffers from problems such as sampling bias and low reliability caused by insincere responses. In particular, exogenous variables such as the respondents' environment should be controlled to make internet surveys trustworthy. This study designs and implements a system that helps researchers control the network and sampling environment and the behavior of respondents. Through various question forms and a structured questionnaire design, this study aims to improve survey satisfaction and the reliability of results in internet survey systems.

Case of Geophysical Survey Guideline for Site Investigation of Spent Nuclear Fuel disposal: Focusing on airborne electromagnetic and seismic reflection survey (사용후핵연료 처분시설 부지조사를 위한 물리탐사 수행지침서 작성 사례 : 항공전자탐사와 탄성파 반사법탐사 중심으로)

  • NamYoung Kong;Hagsoo Kim;Yoonsup Moon;Manho Han
    • Geophysics and Geophysical Exploration / v.27 no.1 / pp.69-83 / 2024
  • Given their importance and specificity, site investigations for the deep geological disposal of spent nuclear fuel require stringent quality control, unlike general geotechnical investigations for tunnels and bridges. In this study, we present a case of selecting geophysical survey methods for each site investigation stage and preparing geophysical survey guidelines. The proposed guidelines cover procedures, considerations, and quality control for exploration planning, data acquisition, data processing, and interpretation. They comprehensively summarize the airborne electromagnetic survey and the seismic reflection survey.

Environmental Survey Data Modeling Using K-means Clustering Techniques

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society / v.16 no.3 / pp.557-566 / 2005
  • Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. Among several clustering techniques, we use k-means clustering, which is classified as a partitional clustering method. We analyze the 2002 Gyeongnam social indicator survey data using k-means clustering techniques for environmental information. The resulting clusters can be used for environmental preservation and environmental improvement.
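
As an illustration of the partitional method named above, here is a minimal sketch of k-means using Lloyd's algorithm in plain Python. The abstract does not describe the authors' implementation, distance measure, or variables, so the Euclidean distance, the random initialization, and the two-dimensional `responses` data are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # centroids stopped moving, so the partition is stable
        centroids = new_centroids
    return labels, centroids


# Hypothetical two-item survey scores forming two obvious groups.
responses = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centroids = kmeans(responses, k=2)
print(labels)
```

Because every point belongs to exactly one cluster, the result is a partition of the data, which is what "partitional" refers to.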

K-means Clustering for Environmental Indicator Survey Data

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2005.04a / pp.185-192 / 2005
  • There are many data mining techniques, such as association rules, decision trees, neural network analysis, clustering, genetic algorithms, Bayesian networks, and memory-based reasoning. We analyze the 2003 Gyeongnam social indicator survey data using the k-means clustering technique for environmental information. Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. Among several clustering techniques, we use k-means clustering, which is classified as a partitional clustering method. The k-means clustering outputs can be applied to environmental preservation and environmental improvement.

Variance estimation for distribution rate in stratified cluster sampling with missing values

  • Heo, Sunyeong
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.443-449 / 2017
  • Population proportions, such as the distribution rate of LED TVs or the prevalence of a disease, are often estimated from survey sample data. A population proportion is generally regarded as a special form of a population mean. In complex sampling, such as stratified multistage sampling with unequal selection probabilities, the denominator of the mean may be a random variable, in which case the mean is estimated as a ratio estimator. In this research, we examined the estimation of a distribution rate under stratified multistage sampling and computed numerical results using stratified random sample data with about 25% missing observations. In these data, the survey weights were determined deterministically. The weights are therefore not random variables, and the population distribution rate and its variance can be estimated in the same way as a population mean. When the weights are not random variables, estimating the variance of the proportion estimator by the ratio method may inflate the variance. Therefore, when estimating the variance of a population proportion, the structure of the data and the survey design should be examined before an estimation method is chosen.
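
To make the "estimate it like a population mean" case concrete, the sketch below applies the textbook estimator for stratified simple random sampling with known stratum sizes, so the stratum weights are fixed rather than random. It is not the paper's procedure: dropping missing responses (marked `None`) and treating them as missing completely at random within each stratum is an assumption made here, and the data are hypothetical.

```python
def stratified_proportion(strata):
    """Estimate a population proportion and its variance.

    strata: list of (N_h, responses), where N_h is the known stratum size and
    responses holds 1, 0, or None (missing) for each sampled unit.
    """
    total = sum(size for size, _ in strata)
    estimate = 0.0
    variance = 0.0
    for size, responses in strata:
        observed = [y for y in responses if y is not None]  # drop missing values
        n = len(observed)
        if n < 2:
            raise ValueError("each stratum needs at least two observed responses")
        p = sum(observed) / n
        weight = size / total  # fixed, known stratum weight W_h = N_h / N
        fpc = 1 - n / size     # finite population correction
        estimate += weight * p
        variance += weight ** 2 * fpc * p * (1 - p) / (n - 1)
    return estimate, variance


# Hypothetical ownership indicators (1 = owns an LED TV) in two strata.
strata = [
    (1000, [1, 1, 0, None, 1, 0, 1, None]),
    (3000, [0, 0, 1, 0, None, 0, 0, 0]),
]
p_hat, var_hat = stratified_proportion(strata)
print(round(p_hat, 4), round(var_hat, 6))
```

Here the denominator of each stratum mean is a fixed count, so no ratio-type linearization is involved. That is the situation in which the abstract reports that the ratio method may inflate the variance.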

Statistical Methods for Multivariate Missing Data in Health Survey Research (보건조사연구에서 다변량결측치가 내포된 자료를 효율적으로 분석하기 위한 통계학적 방법)

  • Kim, Dong-Kee;Park, Eun-Cheol;Sohn, Myong-Sei;Kim, Han-Joong;Park, Hyung-Uk;Ahn, Chae-Hyung;Lim, Jong-Gun;Song, Ki-Jun
    • Journal of Preventive Medicine and Public Health / v.31 no.4 s.63 / pp.875-884 / 1998
  • Missing observations are common in medical research and health survey research, and several statistical methods have been proposed to handle them. The EM (Expectation-Maximization) algorithm is one efficient way of handling the missing data problem based on sufficient statistics. In this paper, we developed statistical models and methods for survey data with multivariate missing observations. In particular, we adopted the EM algorithm to handle the multivariate missing observations. We assume that the multivariate observations follow a multivariate normal distribution, where the mean vector and the covariance matrix are of primary interest. We applied the proposed method to data from a health survey, namely a physician survey on the Resource-Based Relative Value Scale (RBRVS). In addition to the EM algorithm, we applied complete-case analysis, which uses only completely observed cases, and available-case analysis, which uses all available information. Residual and normal probability plots were evaluated to assess the assumption of normality. We found that the residual sum of squares from the EM algorithm was smaller than those from the complete-case and available-case analyses.
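
The abstract does not state the update formulas. Under its multivariate normal assumption, the standard textbook EM iteration for the mean vector and covariance matrix can be written as follows. The notation is chosen here for illustration: for observation $i$, the subscripts $o$ and $m$ denote its observed and missing components, and $t$ indexes iterations.

```latex
% E-step: conditional mean and covariance of the missing part given the observed part
\hat{x}_{i,m} = \mu^{(t)}_{m}
  + \Sigma^{(t)}_{mo} \bigl(\Sigma^{(t)}_{oo}\bigr)^{-1} \bigl(x_{i,o} - \mu^{(t)}_{o}\bigr),
\qquad
C_{i,mm} = \Sigma^{(t)}_{mm}
  - \Sigma^{(t)}_{mo} \bigl(\Sigma^{(t)}_{oo}\bigr)^{-1} \Sigma^{(t)}_{om}.

% \hat{x}_i is x_i with its missing entries replaced by \hat{x}_{i,m};
% C_i is zero except for the missing-by-missing block C_{i,mm}.

% M-step: update the parameters from the completed sufficient statistics
\mu^{(t+1)} = \frac{1}{n} \sum_{i=1}^{n} \hat{x}_i,
\qquad
\Sigma^{(t+1)} = \frac{1}{n} \sum_{i=1}^{n}
  \Bigl[ \bigl(\hat{x}_i - \mu^{(t+1)}\bigr) \bigl(\hat{x}_i - \mu^{(t+1)}\bigr)^{\top} + C_i \Bigr].
```

The correction term $C_i$ is what separates EM from plain conditional-mean imputation: without it, the covariance estimate would be biased downward.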

Development of LX GNSS On-line Data Processing System Based on the GIPSY-OASIS (GIPSY-OASIS 기반 LX GNSS 온라인 자료처리 시스템 개발)

  • Kim, Hyun-Ho;Ha, Ji-Hyun;Tcha, Dek-Kie
    • Journal of Advanced Navigation Technology / v.18 no.6 / pp.555-561 / 2014
  • Data processing services over the internet help users obtain GNSS data processing results more precisely and easily. Such online data processing systems are therefore operated and developed by various research groups and national institutions. However, these services are difficult to use for domestic cadastral surveys. In this study, we developed an online data processing system for domestic cadastral surveys. It calculates coordinates using the NGII CORS (SUWN) as the fiducial station and applies the PPP technique with GIPSY-OASIS. When the user selects the observation data for which coordinates are to be calculated, the data are uploaded to the GIPSY-OASIS server via FTP. After the upload is complete, the server automatically calculates the coordinates and sends a report of the results by e-mail. Processing takes about 2 minutes for 3 sessions. To verify the results, we compared coordinates computed from data at SOUL and JUNJ with the coordinates published by NGII. The differences were 1.4 cm east-west, -1.0 cm north-south, and 0.5 cm vertical.

Mitigating TCP Incast Issue in Cloud Data Centres using Software-Defined Networking (SDN): A Survey

  • Shah, Zawar
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.11 / pp.5179-5202 / 2018
  • Transmission Control Protocol (TCP) is the most widely used protocol in cloud data centers today. However, cloud data centers using TCP experience many issues because TCP was designed on the assumption that it would primarily be used in Wide Area Networks (WANs). One of the major issues with TCP in cloud data centers is the Incast issue. It arises from the many-to-one communication pattern that is common in modern cloud data centers, in which multiple senders simultaneously send data to a single receiver. This causes packet loss at the switch buffer, which results in TCP throughput collapse and, in turn, high Flow Completion Time (FCT). Recently, many researchers have used Software-Defined Networking (SDN) to mitigate the Incast issue. This paper presents a detailed survey of SDN-based solutions to the Incast issue. The solutions are classified into four categories: TCP Receive Window based solutions, Tuning TCP Parameters based solutions, Quick Recovery based solutions, and Application Layer based solutions. All the solutions are critically evaluated in terms of their principles, advantages, and shortcomings. The survey also compares the SDN-based solutions with respect to performance metrics such as the maximum number of concurrent senders supported and the calculation delay at the controller. These metrics are important for deploying any SDN-based solution in modern cloud data centers. In addition, the survey discusses future research directions for designing and developing better SDN-based solutions to the Incast issue.

Proposal of Analysis Method for Biota Survey Data Using Co-occurrence Frequency

  • Yong-Ki Kim;Jeong-Boon Lee;Sung Je Lee;Jong-Hyun Kang
    • Proceedings of the National Institute of Ecology of the Republic of Korea / v.5 no.3 / pp.76-85 / 2024
  • The purpose of this study is to propose a new method of analysis focusing on interconnections between species rather than traditional biodiversity analysis, which represents ecosystems in terms of species and individual counts such as species diversity and species richness. This new approach aims to enhance our understanding of ecosystem networks. Utilizing data from the 4th National Natural Environment Survey (2014-2018), the following eight taxonomic groups were targeted for our study: herbaceous plants, woody plants, butterflies, Passeriformes birds, mammals, reptiles & amphibians, freshwater fishes, and benthonic macroinvertebrates. A co-occurrence frequency analysis was conducted using nationwide data collected over five years. As a result, in all eight taxonomic groups, the degree value represented by a linear regression trend line showed a slope of 0.8 and the weighted degree value showed an exponential nonlinear curve trend line with a coefficient of determination (R²) exceeding 0.95. The average value of the clustering coefficient was also around 0.8, reminiscent of well-known social phenomena. Creating a combination set from the species list grouped by temporal information such as survey date and spatial information such as coordinates or grids is an easy approach to discern species distributed regionally and locally. Particularly, grouping by species or taxonomic groups to produce data such as co-occurrence frequency between survey points could allow us to discover spatial similarities based on species present. This analysis could overcome limitations of species data. Since there are no restrictions on time or space, data collected over a short period in a small area and long-term national-scale data can be analyzed through appropriate grouping. The co-occurrence frequency analysis enables us to measure how many species are associated with a single species and the frequency of associations among each species, which will greatly help us understand ecosystems that seem too complex to comprehend. Such connectivity data and graphs generated by the co-occurrence frequency analysis of species are expected to provide a wealth of information and insights not only to researchers, but also to those who observe, manage, and live within ecosystems.
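
The "combination set from the species list grouped by temporal and spatial information" described above can be sketched as follows. This is an illustrative reading, not the authors' code: the grouping key (grid, survey month), the species records, and the definitions of degree (number of distinct co-occurring species) and weighted degree (sum of co-occurrence counts) are assumptions based on common network-analysis usage.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(surveys):
    """Count how often each pair of species is recorded in the same survey group."""
    pair_counts = Counter()
    for species_list in surveys.values():
        # Deduplicate and sort so that each pair has one canonical key.
        for pair in combinations(sorted(set(species_list)), 2):
            pair_counts[pair] += 1
    return pair_counts


def degrees(pair_counts):
    """Degree = number of distinct partners; weighted degree = summed frequencies."""
    degree, weighted = Counter(), Counter()
    for (a, b), count in pair_counts.items():
        for species in (a, b):
            degree[species] += 1
            weighted[species] += count
    return degree, weighted


# Hypothetical records keyed by (grid, survey month); not data from the survey.
surveys = {
    ("grid_A", "2016-05"): ["sparrow", "magpie", "tit"],
    ("grid_B", "2016-05"): ["sparrow", "magpie"],
    ("grid_C", "2017-06"): ["magpie", "tit"],
}
pairs = cooccurrence(surveys)
degree, weighted = degrees(pairs)
print(pairs[("magpie", "sparrow")], degree["magpie"], weighted["magpie"])
```

Changing the grouping key, for example to a coarser grid or a whole survey year, changes which records count as co-occurring. This is how the same procedure can be applied to both small local datasets and national-scale data.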