• Title/Summary/Keyword: Education Data Mining

Development of the Goods Recommendation System using Association Rules and Collaborating Filtering (연관규칙과 협업적 필터링을 이용한 상품 추천 시스템 개발)

  • Kim, Ji-Hye;Park, Doo-Soon
    • The Journal of Korean Association of Computer Education / v.9 no.1 / pp.71-80 / 2006
  • As e-commerce develops rapidly, how to discover customers' behavior patterns and realize commerce intelligence using Web mining technology has become a research focus. One of the most successful and widely used technologies for building personalization and goods recommendation systems is collaborative filtering. However, collaborative filtering has a serious data sparsity problem, and traditional association rules do not consider a user's interests or preferences when providing personalized service. In this paper, we propose a goods recommendation system that integrates a collaborative filtering algorithm based on item-to-item correlation with an improved Apriori algorithm. The system takes the user's interests or preferences into account to provide that user with specific personalized service.
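The item-to-item correlation idea mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' algorithm in detail, so everything below is an assumption made for illustration: the toy `ratings` data, the choice of cosine similarity, and the helper names `item_vectors`, `cosine`, and `recommend` are all hypothetical, and the Apriori part of the system is not covered.

```python
from math import sqrt

# Hypothetical toy ratings (user -> {item: rating}); not data from the paper.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5},
    "u3": {"B": 2, "C": 5, "D": 4},
    "u4": {"A": 1, "C": 4, "D": 5},
}

def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    items = {}
    for user, rated in ratings.items():
        for item, score in rated.items():
            items.setdefault(item, {})[user] = score
    return items

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (assumed metric)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=2):
    """Rank unseen items by a similarity-weighted average of the user's own ratings."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for cand in items:
        if cand in seen:
            continue
        num = den = 0.0
        for item, score in seen.items():
            sim = cosine(items[cand], items[item])
            num += sim * score
            den += abs(sim)
        if den > 0:
            scores[cand] = num / den
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("u2", ratings))
```

Because similarities are computed between items rather than between users, a user with few ratings can still receive recommendations as long as the items they rated overlap with others, which is one common way to soften the sparsity problem the abstract mentions.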

Comparative study of K-scale and the internet addiction diagnosis method using tolerance degree for internet use (K-척도와 인터넷 사용 내성정도를 이용한 인터넷 중독 진단 방법의 비교 연구)

  • Kim, Hee-Jae;Kim, Jong-Wan
    • The Journal of Korean Association of Computer Education / v.15 no.2 / pp.47-55 / 2012
  • Through experiments using data mining techniques, we found that the most important factor for judging adults' internet addiction in the K-scale, which was developed by the Korea National Information Society Agency (NIA) and is composed of 4 categories including 20 items, is the tolerance and preoccupation factor. In this research, we propose a new internet addiction diagnostic method based on the degree of tolerance, considering users' non-duty internet activities. Feedback on the K-scale and on the proposed diagnostic method was collected from questionnaire participants, and we confirmed that the proposed user-centered diagnostic method is effective at finding addicts who go undetected by the K-scale because of individuals' intentions.

Analysis of Trends in Education Policy of STEAM Using Text Mining: Comparative Analysis of Ministry of Education's Documents, Articles, and Abstract of Researches from 2009 to 2020 (텍스트 마이닝을 활용한 융합인재교육정책 동향 분석 -2009년~2020년 교육부보도, 언론보도, 학술지 초록 비교분석-)

  • You, Jungmin;Kim, Sung-Won
    • Journal of The Korean Association For Science Education / v.41 no.6 / pp.455-470 / 2021
  • This study examines changes in the keywords and topics of STEAM education from 2009 to 2020 to derive future development directions and educational implications. From the collected data, 42 Ministry of Education documents, 1,534 news articles, and 880 research abstracts were selected as research subjects. Keyword analysis, keyword network analysis, and topic modeling were performed for each stage of STEAM education policy using Python. The analysis showed that, across the stages of STEAM education policy, the frequency and network of keywords related to STEAM education differed by medium. Because the keywords and topics emphasized differed by medium, it was confirmed that interest in STEAM education policy also differed. Most of the topics in the Ministry of Education's documents were found to correspond to topics derived from the articles. The implications for the development direction of STEAM education are as follows. First, STEAM education needs to consider ways to connect multiple topics, including the humanities. Second, since the media differ in their interest in STEAM education policy, a cooperative development direction should be sought based on an understanding of these differences. Third, the Ministry of Education needs to support core competency reinforcement and convergence literacy for nurturing future talent, which is the goal of STEAM education, and the media need to work to increase the public's understanding of STEAM education. Lastly, it is necessary to continuously analyze the themes that emerge in the evaluation process and to revise STEAM education policy accordingly.
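As a rough illustration of the keyword analysis and keyword network steps named in this abstract, the sketch below counts keyword frequencies and builds co-occurrence edges using only the Python standard library. The abstract does not say which libraries or preprocessing the authors used, so the mini-corpus `docs`, the `STOPWORDS` set, and the function names are hypothetical stand-ins; topic modeling is not shown.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mini-corpus; the study's documents are not reproduced here.
docs = [
    "steam education policy supports convergence literacy",
    "teachers need support for steam education programs",
    "media coverage of steam policy and core competency",
]
STOPWORDS = {"of", "for", "and", "the", "need"}  # assumed, illustrative only

def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def keyword_counts(docs):
    """Keyword frequency over the whole corpus."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts

def cooccurrence_edges(docs):
    """Weighted edges: how many documents contain both keywords of a pair."""
    edges = Counter()
    for doc in docs:
        for a, b in combinations(sorted(set(tokenize(doc))), 2):
            edges[(a, b)] += 1
    return edges

print(keyword_counts(docs).most_common(3))
print(cooccurrence_edges(docs).most_common(3))
```

Running the same two functions separately on each source (ministry documents, articles, abstracts) and each policy stage would give the kind of per-medium comparison the abstract describes.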

Clickstream Big Data Mining for Demographics based Digital Marketing (인구통계특성 기반 디지털 마케팅을 위한 클릭스트림 빅데이터 마이닝)

  • Park, Jiae;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.143-163 / 2016
  • The demographics of Internet users are the most basic and important sources for target marketing or personalized advertisements on digital marketing channels, which include email, mobile, and social media. However, it has gradually become difficult to collect the demographics of Internet users because their activities are anonymous in many cases. Although a marketing department can obtain demographics through online or offline surveys, these approaches are expensive, slow, and likely to include false statements. Clickstream data is the record an Internet user leaves behind while visiting websites. As the user clicks anywhere on a webpage, the activity is logged in semi-structured website log files. Such data shows which pages users visited, how long they stayed, how often and when they usually visited, which sites they prefer, what keywords they used to find a site, whether they purchased anything, and so forth. For this reason, some researchers have tried to infer the demographics of Internet users from their clickstream data. They derived various independent variables likely to be correlated with demographics, including search keywords, frequency and intensity by time, day, and month, the variety of websites visited, and text information from the web pages visited. The demographic attributes to be predicted also vary from paper to paper and cover gender, age, job, location, income, education, marital status, and presence of children. A variety of data mining methods, such as LSA, SVM, decision tree, neural network, logistic regression, and k-nearest neighbors, have been used to build prediction models. However, this research has not yet identified which data mining method is appropriate for predicting each demographic variable. Moreover, the independent variables studied so far need to be reviewed, combined as needed, and evaluated in order to build the best prediction model.
The objective of this study is to choose the clickstream attributes most likely to be correlated with demographics from the results of previous research, and then to identify which data mining method is best suited to predicting each demographic attribute. Among the demographic attributes, this paper focuses on predicting gender, age, marital status, residence, and job. From the results of previous research, 64 clickstream attributes are applied to predict these demographic attributes. The overall process of predictive model building is composed of 4 steps. In the first step, we create user profiles that include 64 clickstream attributes and 5 demographic attributes. The second step performs dimension reduction on the clickstream variables to address the curse of dimensionality and the overfitting problem, using three approaches based on decision tree, PCA, and cluster analysis. In the third step, we build alternative predictive models for each demographic variable using SVM, neural network, and logistic regression. The last step evaluates the alternative models in terms of accuracy and selects the best model. For the experiments, we used clickstream data covering 5 demographics and 16,962,705 online activities for 5,000 Internet users. IBM SPSS Modeler 17.0 was used for the prediction process, and 5-fold cross validation was conducted to enhance the reliability of the experiments. The experimental results verify that there is a specific data mining method well suited to each demographic variable. For example, age prediction performs best when using decision tree based dimension reduction and a neural network, whereas the prediction of gender and marital status is most accurate when applying SVM without dimension reduction.
We conclude that the online behaviors of Internet users, captured through clickstream data analysis, can be used effectively to predict their demographics and can thereby be applied to digital marketing.
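The last two steps of the process described above (building alternative models, then selecting the most accurate one under 5-fold cross validation) can be sketched in plain Python. This is only an illustration of the evaluation scheme under stated assumptions: the study used IBM SPSS Modeler with SVM, neural network, and logistic regression, whereas the two candidate models here (`fit_majority`, `fit_threshold`) are deliberately trivial stand-ins, and the data `X`, `y` are hypothetical.

```python
import random

def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy; fit(X_train, y_train) returns a predict function."""
    accs = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(X[i]) == y[i] for i in test_idx)
        accs.append(hits / len(test_idx))
    return sum(accs) / len(accs)

def fit_majority(X_train, y_train):
    """Stand-in baseline: always predict the most frequent training label."""
    label = max(sorted(set(y_train)), key=y_train.count)
    return lambda x: label

def fit_threshold(X_train, y_train):
    """Stand-in model: split on the midpoint between the two class means of feature 0."""
    mean0 = sum(x[0] for x, t in zip(X_train, y_train) if t == 0) / y_train.count(0)
    mean1 = sum(x[0] for x, t in zip(X_train, y_train) if t == 1) / y_train.count(1)
    cut = (mean0 + mean1) / 2
    if mean1 > mean0:
        return lambda x: int(x[0] > cut)
    return lambda x: int(x[0] < cut)

# Hypothetical, well-separated toy data: one feature, binary label.
X = [[v] for v in range(10)] + [[v] for v in range(100, 110)]
y = [0] * 10 + [1] * 10

candidates = {"majority": fit_majority, "threshold": fit_threshold}
scores = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Repeating this selection once per target attribute (gender, age, and so on) mirrors how the study arrives at a different best method for each demographic variable.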

An Exploratory Study of e-Learning Satisfaction: A Mixed Methods of Text Mining and Interview Approaches (이러닝 만족도 증진을 위한 탐색적 연구: 텍스트 마이닝과 인터뷰 혼합방법론)

  • Sun-Gyu Lee;Soobin Choi;Hee-Woong Kim
    • Information Systems Review / v.21 no.1 / pp.39-59 / 2019
  • E-learning has improved educational effectiveness by making it possible to learn anytime and anywhere, moving away from traditional rote instruction. As the use of e-learning systems increases with the growing popularity of e-learning, it has become important to measure e-learning satisfaction. In this study, we used a mixed research method, which performs qualitative and quantitative research together, to identify the satisfaction factors of e-learning. For the quantitative research, we collected reviews from Udemy.com by text mining, classified lectures into high-rated and low-rated groups, and applied topic modeling to derive factors from the reviews. For the qualitative research, we conducted in-depth one-on-one interviews with e-learning learners. By combining these results, we derived factors of e-learning satisfaction and dissatisfaction, and based on these factors we suggested ways to improve e-learning satisfaction. Whereas past research was mainly survey-based, this study collects actual data by text mining. The academic significance of this study is that the results of the topic modeling are combined with factors based on the information system success model.

An Analysis of Causes of Marine Incidents at sea Using Big Data Technique (빅데이터 기법을 활용한 항해 중 준해양사고 발생원인 분석에 관한 연구)

  • Kang, Suk-Young;Kim, Ki-Sun;Kim, Hong-Beom;Rho, Beom-Seok
    • Journal of the Korean Society of Marine Environment & Safety / v.24 no.4 / pp.408-414 / 2018
  • Various studies have been conducted to reduce marine accidents. However, research on marine incidents is only marginal. There are many reports of marine incidents, but the main content of existing studies has been qualitative, which makes quantitative analysis difficult. Nevertheless, quantitative analysis of marine incidents is necessary to reduce them. The purpose of this paper is to analyze marine incident data quantitatively by applying big data techniques, in order to predict marine incident trends and reduce marine accidents. To accomplish this, about 10,000 marine incident reports were prepared in a unified format through pre-processing. Using this preprocessed data, we first derived major keywords for marine incidents at sea using text mining techniques. Second, time series and cluster analysis were applied to the major keywords, and trends for possible marine incidents were predicted. The results confirmed that it is possible to use quantified data and statistical analysis to address this topic. We also confirmed that big data techniques can provide information on preventive measures by capturing objective tendencies in marine incidents that may occur in the future.

Analysis of Characteristics of Clusters of Middle School Students Using K-Means Cluster Analysis (K-평균 군집분석을 활용한 중학생의 군집화 및 특성 분석)

  • Jaebong, Lee
    • Journal of The Korean Association For Science Education / v.42 no.6 / pp.611-619 / 2022
  • The purpose of this study is to explore the possibility of applying big data analysis to provide appropriate feedback to students using evaluation data in science education, at a time when interest in educational data mining has been increasing. We use the evaluation data of 2,576 students who answered 24 questions of the national assessment of educational achievement, and we apply K-means cluster analysis, an unsupervised machine learning method, for clustering. As a result of clustering, students were divided into six clusters. The middle-ranking students are divided into more varied clusters than the upper or lower ranks. According to the results of the cluster analysis, the most important factor influencing clustering is academic achievement, and each cluster shows different characteristics in terms of content domains, subject competencies, and affective characteristics. Among the affective domains, learning motivation is important in the lower-ranking achievement cluster, and in terms of subject competencies, scientific inquiry and problem-solving competency as well as scientific communication competency have a major influence. In the content domain, achievement in motion and energy and in matter are important factors for distinguishing the characteristics of the clusters. As a result, we can provide students with customized feedback for learning based on the characteristics of each cluster. We discuss implications of these results for science education, such as the possibility of using the results of this study, balanced learning across content domains, enhancement of subject competency, and improvement of scientific attitude.
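For readers unfamiliar with K-means, the following is a minimal sketch of the standard Lloyd iteration (assign each point to the nearest centroid, then move each centroid to the mean of its members). It is not the authors' implementation: the nine two-dimensional `points` are hypothetical scores, three clusters are used instead of the study's six, and the initial centroids are passed in explicitly to keep the run deterministic, whereas practical use would normally rely on random restarts or a library routine.

```python
def kmeans(points, init_centroids, iters=100):
    """Lloyd's algorithm with caller-supplied initial centroids; returns (centroids, labels)."""
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical per-student scores: [content-domain score, competency score].
points = [
    [1, 2], [2, 1], [1, 1],     # lower-achieving group
    [5, 5], [6, 5], [5, 6],     # middle group
    [9, 9], [10, 9], [9, 10],   # higher-achieving group
]
centroids, labels = kmeans(points, init_centroids=[points[0], points[3], points[6]])
print(labels)
print(centroids)
```

The resulting centroids are what make per-cluster feedback possible: each one summarizes the typical score profile of the students assigned to it.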

A Study on the Potential and Limitation of Pre-producing Dramas through Social Analysis -focusing on a jtbc drama - (소셜 분석을 통한 사전제작 드라마의 가능성과 한계에 관한 연구 -jtbc <맨투맨>을 중심으로-)

  • Kim, Kyung-Ae;Ku, Jin-Hee
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.2 / pp.164-172 / 2018
  • This paper examines the relevance of pre-production and storytelling through big data analysis and, focusing on JTBC's Man to Man series, looks at how a drama's storytelling should be structured. To read viewers' thoughts on pre-produced dramas, we conducted text mining on 67 blogs written about pre-produced dramas from 2016.12.15 to 2017.12.15. We also conducted sentiment analysis on the Man to Man series, which is not only a pre-produced drama but also has storytelling issues. Blog text extraction and text mining were performed using OutWit Hub and R, and the tools provided by social metrics were used for sentiment analysis of the larger data. Sentiment analysis revealed that viewers of the Man to Man series did not agree with the romance between Kim Sul-woo and Cha Do-ha, due to the lack of reality in the female characters. Therefore, it was concluded that it is crucial to increase the reality of the characters in order to increase the audience's empathy. Studies of this kind will continue to be necessary, because they form the basis for digitally driven storytelling studies and provide valuable material for prediction and guidance in the cultural content industry.

A Study on Questionnaire Improvement using Text Mining (텍스트 마이닝 기법을 활용한 설문 문항 개선에 관한 연구)

  • Paek, Yun-Ji;Jung, Chang-Hyun
    • Journal of the Korean Society of Marine Environment & Safety / v.26 no.2 / pp.121-128 / 2020
  • The Marine Safety Culture Index (MSCI) was developed in 2018 to objectively assess public safety culture levels and to serve as data for spreading knowledge of marine safety culture. The method for calculating the safety culture index should include issues that may affect the safety culture and should consist of appropriate attributes for estimating the current status. In addition, continuous verification and supplementation are required to address social and economic changes. In this study, to determine whether the questionnaire designed by marine experts reflects the people's interests and needs, we analyzed 915 marine safety proposals. Text mining was employed to analyze the unstructured data of the marine safety proposals, and network analysis and topic modeling were subsequently performed. Analysis of the marine safety proposals centered on attributes such as education, public relations, safety rules, awareness, skilled workers, and systems. Eighteen questions were modified and supplemented to reflect the marine safety proposals, and the reliability of the revised questions was analyzed. Compared to the previous year, the questionnaire's internal consistency improved and reached a high value of 0.895. By reflecting the requirements of both marine experts and the people, the improved questionnaire and the marine safety culture index derived from it are expected to contribute to the establishment of policies for spreading knowledge of marine safety culture.
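The abstract reports an internal consistency of 0.895 but does not name the statistic. Cronbach's alpha is a common choice for questionnaires and is assumed here purely for illustration; it is defined as $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $k$ is the number of items, $\sigma_i^2$ the variance of item $i$, and $\sigma_T^2$ the variance of respondents' total scores. The `responses` matrix below is hypothetical and unrelated to the study's data.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    item_vars = [variance(col) for col in zip(*responses)]   # per-item sample variance
    total_var = variance([sum(row) for row in responses])    # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical answers from four respondents to three Likert-type items.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 4, 5],
    [2, 2, 3],
]
print(round(cronbach_alpha(responses), 3))
```

Values closer to 1 indicate that the items move together across respondents, which is the sense in which a revised questionnaire can be said to have improved consistency.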

Tectonic Structure Modeling around the Ulleung Basin and Dokdo Using Potential Data (포텐셜 자료를 이용한 울릉분지와 독도 주변 지체구조 연구)

  • Park, Gye-Soon;Park, Jun-Suk;Kwon, Byung-Doo;Kim, Chang-Hwan;Park, Chan-Hong
    • Journal of the Korean earth science society / v.30 no.2 / pp.165-175 / 2009
  • The East Sea, including the area of this study, is identified as a typical back-arc sea located behind the Circum-Pacific volcanic and earthquake belt. Previous studies reported that the East Sea began to open under tensile force and thereby formed its current shape. In this study, we investigate the regional tectonic structure of the East Sea using ship-borne gravity, magnetic, and satellite gravity data. The result of three-dimensional depth inversion shows that the Moho depth of the study area is approximately 13-25 km and inversely proportional to the thickness of the crust. In addition, toward the center of the Ulleung Basin (UB), the crust of the UB becomes thinner due to the extension caused by the tensile force that opened the East Sea.