• Title/Summary/Keyword: exploratory data analysis


Firework Plot as a Graphical Exploratory Data Analysis Tool to Evaluate the Impact of Outliers in a Mixture Experiment (혼합물 실험에서 특이값의 영향을 평가하기 위한 그래픽 탐색적 자료분석 도구로서의 불꽃그림)

  • Jang, Dae-Heung;Ahn, SoJin;Kim, Youngil
    • The Korean Journal of Applied Statistics / v.27 no.4 / pp.629-643 / 2014
  • When conducting data analysis with regression techniques, it is common to check the validity of an assumed model with heavy use of diagnostic tools; however, outliers and influential data points often distort the regression output in an undesired manner. Jang and Anderson-Cook (2013) proposed a graphical method for exploratory analysis, called a firework plot, that visualizes the trace of the impact of possible outlying and/or influential data points on individual regression coefficients and on the overall residual sum of squares (SSE). They developed a 3-D plot as well as pairwise plots for the measures of interest. In this paper, the approach is extended further to demonstrate its strength; in addition, a more meaningful interpretation is made possible by adding a measure not mentioned in their paper. The approach is applied to a mixture experiment because a detailed analysis of the sensitivity of statistical measures is required in a small experiment.
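
The idea of tracing a point's impact on the coefficients and SSE can be illustrated with a minimal sketch. It assumes the trace is obtained by lowering one observation's weight from 1 to 0 in a weighted least-squares fit of a simple line; the function names and the data are hypothetical and are not taken from the paper.

```python
def weighted_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b, sse)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw  # weighted mean of x
    my = sum(w * y for w, y in zip(ws, ys)) / sw  # weighted mean of y
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum(w * (y - a - b * x) ** 2 for w, x, y in zip(ws, xs, ys))
    return a, b, sse


def weight_trace(xs, ys, index, steps=10):
    """Trace (weight, intercept, slope, sse) as one point's weight goes 1 -> 0."""
    trace = []
    for k in range(steps + 1):
        w = 1.0 - k / steps
        ws = [1.0] * len(xs)
        ws[index] = w  # only the point under study is down-weighted
        trace.append((w, *weighted_fit(xs, ys, ws)))
    return trace


# Hypothetical data: four points on y = 1 + 2x plus one outlying point.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 20.0]
trace = weight_trace(xs, ys, index=4)
```

Plotting the slope or SSE column of `trace` against the weight gives one trace per observation; a point whose trace moves far from its starting value is a candidate outlier or influential point.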

A Big Data-Driven Business Data Analysis System: Applications of Artificial Intelligence Techniques in Problem Solving

  • Donggeun Kim;Sangjin Kim;Juyong Ko;Jai Woo Lee
    • The Journal of Bigdata / v.8 no.1 / pp.35-47 / 2023
  • Developing effective and efficient big data analytics methods for business problem-solving is crucial for improving the performance of data analytics and reducing costs and risks in the analysis of customer data. In this study, a big data-driven data analysis system using artificial intelligence techniques is designed to increase the accuracy of big data analytics in step with the rapid growth of the field of data science. We present a key direction for big data analysis systems through missing value imputation, outlier detection, feature extraction, the use of explainable artificial intelligence techniques, and exploratory data analysis. Our objective is not only to develop big data analysis techniques for business data with complex structures but also to bridge the gap between the theoretical ideas of artificial intelligence methods and the analysis of real-world business data.

Empirical modelling approaches to modelling failures

  • Baik, Jaiwook;Jo, Jinnam
    • International Journal of Reliability and Applications / v.14 no.2 / pp.107-114 / 2013
  • Modelling of failures is an important element of reliability modelling. An empirical modelling approach suitable for complex items is explored in this paper. The first step of the approach is to plot the hazard function, the density function, the Weibull probability plot, and the cumulative intensity function to see which model fits the given data best. The next step is to select an appropriate model for the data, fit the parametric model accordingly, and estimate its parameters.

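
As an illustration of the plotting step, the sketch below computes Weibull probability plot coordinates and reads rough parameter estimates off a least-squares line. It assumes complete (uncensored) failure times and Bernard's median-rank approximation for the plotting positions; neither choice is stated in the abstract, and the function names are hypothetical.

```python
from math import exp, log


def weibull_plot_points(failure_times):
    """Return (ln t, ln(-ln(1 - F))) pairs for a Weibull probability plot."""
    times = sorted(failure_times)
    n = len(times)
    points = []
    for i, t in enumerate(times, start=1):
        f = (i - 0.3) / (n + 0.4)  # Bernard's median-rank approximation
        points.append((log(t), log(-log(1.0 - f))))
    return points


def fit_line(points):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def weibull_estimates(failure_times):
    """Shape and scale read off the fitted line on the probability plot."""
    slope, intercept = fit_line(weibull_plot_points(failure_times))
    # ln(-ln(1 - F)) = shape * ln t - shape * ln(scale)
    return slope, exp(-intercept / slope)
```

If the points fall close to a straight line, a Weibull model is plausible; the slope then approximates the shape parameter and the intercept gives the scale. A formal fit such as maximum likelihood would normally follow.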

A Study on the Application of Innovative Teaching Method in Tourism in the Generation AI Era (생성형 AI 시대의 관광 분야 혁신교수법 적용에 관한 연구)

  • Choi Younghwan
    • Journal of Korea Society of Digital Industry and Information Management / v.20 no.1 / pp.87-98 / 2024
  • This study empirically examined the application of innovative teaching methods in the tourism field at a time when teaching methods suited to the AI era are required. It was intended to provide exploratory baseline data on the application of a wide range of innovative teaching methods by verifying educational effectiveness before and after their application. To this end, effectiveness before and after the education was empirically verified with 60 students majoring in tourism at Y University in the metropolitan area. Reliability analysis, a paired-sample t-test, and map analysis using graphs (to increase visibility) were performed on the collected data. The results showed that all competencies targeted by the innovative teaching method showed a statistically significant effect after its application. In addition, by increasing the effect of interaction between learners and instructors acting as facilitators, exploratory results were derived for potential benefits and areas that could be improved.

Exploratory research based on big data for Improving the revisit rate of foreign tourists and invigorating consumption (외국인 관광객 재방문율 향상과 소비 활성화를 위한 빅데이터 기반의 탐색적 연구)

  • An, Sung-Hyun;Park, Seong-Taek
    • Journal of Industrial Convergence / v.18 no.6 / pp.19-25 / 2020
  • Big data analytics is indispensable today in various industries and the public sector. In this study, we therefore use big data analysis, in particular the LDA method, to search for ways to improve domestic tourism services. We take an exploratory approach aimed at improving tourist satisfaction, and thereby revisit rates and service quality, focusing on Seoul, which receives the largest number of foreign tourists. We collected and analyzed statistical data from Seoul City and the Korea Tourism Organization, together with Internet information such as SNS posts, via R, using text mining methods including LDA. The analysis showed that one of the purposes for which foreigners visit South Korea is gastronomic tourism. We will try to derive measures to improve the quality of services centered on gastronomic tourism.

Feature Vector Extraction for Solar Energy Prediction through Data Visualization and Exploratory Data Analysis (데이터 시각화 및 탐색적 데이터 분석을 통한 태양광 에너지 예측용 특징벡터 추출)

  • Jung, Wonseok;Ham, Kyung-Sun;Park, Moon-Ghu;Jeong, Young-Hwa;Seo, Jeongwook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.10a / pp.514-517 / 2017
  • In solar photovoltaic systems, power generation is greatly affected by weather conditions, so predicting solar energy is essential for stable load operation. Data on weather conditions are therefore needed as inputs to machine learning algorithms for solar energy prediction. In this paper, we use 15 kinds of weather data as input to the algorithm, such as the precipitation accumulated over 3 hours at the surface, the average upward and downward longwave radiation, the average upward and downward shortwave radiation, the temperature over the past 3 hours at 2 m above the ground, and the temperature at the ground surface. We analyzed the statistical characteristics and correlations of the weather data and extracted the downward and upward shortwave radiation averages as the major elements of a feature vector, each having a high correlation of 70% or more with solar energy.

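
The correlation-based selection described in this abstract might look like the following minimal sketch. It assumes that "correlation of 70% or more" means an absolute Pearson correlation of at least 0.7 with the target; the feature names and numbers are invented for illustration and are not the paper's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_features(features, target, threshold=0.7):
    """Keep features whose |correlation| with the target reaches the threshold."""
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical measurements (not from the paper).
solar_energy = [10.0, 20.0, 30.0, 40.0, 50.0]
weather = {
    "downward_shortwave": [1.0, 2.0, 3.0, 4.0, 5.0],
    "upward_shortwave": [2.0, 4.0, 5.0, 8.0, 10.0],
    "precipitation": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = select_features(weather, solar_energy)
```

With these made-up values, the two shortwave features pass the threshold and precipitation does not. On real data the threshold is a judgment call, and correlated features may carry redundant information.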

Utilization of Log Data Reflecting User Information-Seeking Behavior in the Digital Library

  • Lee, Seonhee;Lee, Jee Yeon
    • Journal of Information Science Theory and Practice / v.10 no.1 / pp.73-88 / 2022
  • This exploratory study aims to understand the potential of log data analysis and to expand its use in user research methods. Transaction log data are records of the electronic interactions between users and web services; in digital libraries, where users interact with the service system while searching for information, they reflect information-seeking behavior. Three days of log data from South Korea's National Digital Science Library (NDSL), comprising 150,000 records, were analyzed in two ways: a log pattern analysis and a log context analysis using statistics. First, the pattern-based analysis examined the general paths of usage by logged and unlogged users, and the correlation between paths was analyzed through a χ² analysis. The subsequent log context analysis assessed the data of 30 identified users using basic statistics and visualized individual users' information-seeking behavior while accessing NDSL. The visualization showed 30 diverse paths for the 30 cases. Log analysis provided insight into general and individual user information-seeking behavior, and its results can enhance the understanding of user actions. It can therefore serve as basic data for improving the design of digital library services and systems to meet users' needs.

The Development and Validation of a Scale to Measure the Mathematical Interaction of Young Children's Parents (유아기 부모의 수학적 상호작용 척도 개발 및 타당화 연구)

  • Kim, Jihyun
    • Korean Journal of Child Studies / v.36 no.5 / pp.95-113 / 2015
  • This study aimed to develop and validate a scale that could be used to evaluate the mathematical interactions of parents with their young children. The subjects comprised 408 mothers of 4-6-year-old children. Means, standard deviations, $\chi^2$, Cramer's V, exploratory factor analysis, confirmatory factor analysis, Pearson correlations, and Cronbach's ${\alpha}$ were calculated. First, 49 items were developed through a review of relevant research, parent interviews, and confirmation of item adequacy and content validity. These items were then edited down to a final list of 24 items representing 4 factors identified by exploratory factor analysis. Second, this 24-item, 4-factor scale was shown to have adequate construct validity, norm validity, and reliability by confirmatory factor analysis, Pearson correlation analysis, and Cronbach's ${\alpha}$. In conclusion, the final mathematical interaction scale for young children's parents was composed of 24 items with 4 factors: "interaction regarding numbers and operations, measurements, and patterns", "interaction regarding data collection and result presentation", "interaction with picture books", and "interaction regarding shapes and figures".
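
For readers unfamiliar with the reliability measure named here, Cronbach's ${\alpha}$ for $k$ items is $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i s_i^2}{s_T^2}\right)$, where $s_i^2$ is the variance of item $i$ and $s_T^2$ is the variance of the total score. The sketch below computes it with sample variances; the response matrix is hypothetical and unrelated to the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; `scores` is a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 perfectly consistent items.
responses = [
    [1, 1, 1],
    [2, 2, 2],
    [3, 3, 3],
    [4, 4, 4],
]
alpha = cronbach_alpha(responses)
```

Perfectly consistent items, as in this toy matrix, give the upper limit of 1; real questionnaires give lower values, and what counts as adequate depends on the field's conventions.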

A Study on Job Satisfaction by Medical Information System Accomplishment

  • Kim, Chung-Gun;Sohn, Chang-yong;Chung, Yun-kyung
    • Journal of Korean Clinical Health Science / v.6 no.2 / pp.1126-1135 / 2018
  • Purpose. The purpose of this study is to investigate the success model related to hospital information system accomplishment. It is important to examine the success model of the hospital information system and to analyze the factors affecting job satisfaction and accomplishment. Methods. A total of 150 copies of the survey were distributed and 135 were collected, a collection rate of 90%. To ensure the reliability of the questionnaire items, Cronbach's alpha was used to test reliability, and exploratory factor analysis was conducted to determine the convergence of the various items. The results of the exploratory factor analysis, with factor loadings calculated using orthogonal (varimax) rotation, were used to analyze the correlations between variables shown to be unidimensional before regression analysis. Results. First, the system quality of the hospital information system has a statistically significant effect on user satisfaction. Second, the information quality of the hospital information system has a statistically significant effect on user satisfaction, indicating that information quality improves user satisfaction. Third, the service quality of the hospital information system has a statistically significant effect on user satisfaction. Finally, the higher the satisfaction of the users of the hospital information system, the higher the accomplishment of the organization. Conclusions. This study is based on the D & M information system success model. In addition, significant relationships were found among the hospital information system, user satisfaction, and the associated organizational accomplishment.

Correspondence analysis for studying association between geography and cancer

  • Song, Joon-Jin;Yu, Pingjian;Ren, Yuan;Chung, Ming-Hua
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.919-924 / 2009
  • Geographical location carries information such as demography, local economy, environment, and lifestyles, which could be sources of cancer occurrence. Analyzing geographical location associated with cancer occurrence can be instructive to physicians, patients, and health administrators regarding resource allocation, expenditures, prophylaxis, and treatments. In this paper, we explore the correspondence between geographical locations and cancer mortality rates using correspondence analysis and illustrate the approach with the mortality rates of the top 10 cancers in the 75 counties of Arkansas from 2001 to 2005. Geographical variations in cancer mortality rates are evaluated across Arkansas counties. Based on the contingency table, a correspondence analysis model is developed, and simple indices indicating the degree to which the regions and the cancers affect each other are calculated. Quantitative results are visualized and mapped in two-dimensional graphs.
