• Title/Summary/Keyword: Selective Data Learning

Data Mining Using Reversible Jump MCMC and Bayesian Network Learning (Reversible Jump MCMC와 베이지안망 학습에 의한 데이터마이닝)

  • 하선영;장병탁
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.90-92 / 2000
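  • Data mining must not only classify data by its attributes and make predictions, but also explain the associations among the classified attributes. In general, the Bayesian network classifier is a method that explains the relationships among variables well while also achieving high predictive power. However, its performance degrades on large-scale data such as that found in data mining. This paper therefore proposes the Selective BN Augmented Naive-Bayes Classifier, which uses the Reversible Jump Markov Chain Monte Carlo method, recently applied successfully to input variable selection for RBF neural networks, to select only the optimal input variables and learn a Bayesian network from them, and presents the results of applying it to a real data mining problem.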

A study on categories of questions when holding counselling on learning math in regards to grounded theoretical approaches (근거이론적 접근에 따른 수학학습 상담 발문 유형에 대한 연구)

  • Ko, Ho Kyoung;Kim, Dong Won;Lee, Hwan Chul;Choi, Tae Young
    • Journal of the Korean School Mathematics Society / v.17 no.1 / pp.73-92 / 2014
  • This study was performed as part of an effort to find ways to improve affective characteristics such as feelings, value, interest, self-efficacy, and other aspects of learning math among elementary and middle school students. For this study, it was essential to understand which questions need to be asked during a consultation at a math clinic for students who are having a hard time learning math. As a method, the content of counseling sessions held over 2 years at a math clinic was collected, and the questions asked and answered were analyzed in order to identify the types of questions needed to effectively examine students who face difficulty in learning math. The analysis used the grounded theory approach of Strauss & Corbin (1998) and went through the process of open coding, axial coding, and selective coding. For the paradigm in the categorical analysis stage, 'attitude towards learning math' was set as the causal condition, 'feelings towards learning math' as the contextual condition, 'confidence in one's ability to learn math' as the phenomenon, 'individual tendencies when learning math' as the intervening condition, 'self-management of learning math' as the action/interaction strategy, and 'method of learning' as the consequence. Through this, the questions that appeared during counseling were linked into categories and subcategories. Through this process, 81 concepts were derived, which were grouped into 31 categories. We believe that this data can serve as a grounded-theory basis for standardizing consultation in clinics.

Variations of Shared Learning in Trading Zone: Focus on the Case of Teachers in the 'Learning Community of Woodworking' (교역지대 내에서 공유된 배움의 다양한 변주: 목공 학습 공동체 교사들의 사례를 중심으로)

  • Jung, Young-Hee;Shin, Sein;Lee, Jun-Ki
    • Journal of Science Education / v.43 no.2 / pp.239-257 / 2019
  • This study attempts to understand the context of shared learning in the trading zone formed by teachers from different backgrounds and the process by which this shared learning varies in the educational context, focusing on the case of the 'Woodwork Science Education Study Group.' To do this, data was collected through in-depth interviews with eight teachers who participated in the 'Woodworking Science Education Research Group,' and their responses were analyzed based on grounded theory. As a result, the causal condition of the teachers' research group was 'various contexts of entering the trading zone' and the central phenomenon was 'encounter with learning in the trading zone.' The contextual condition affecting this phenomenon was 'woodwork as a boundary object and individual transfiguration experience,' and the action/interaction strategy was 'various efforts and influences in the field.' The intervening condition was 'practical effort and experience in the educational field.' The final result in this model is 'the new practice of learning shared in the trading zone.' In selective coding, it was found that the practice of the teachers' research group appears as four types: 'extracurricular creative experience type,' 'career education type,' 'curricula education type,' and 'school management type.' The results of this study suggest that the shared learning and autonomous practice among teachers in the teachers' research group as a trading zone not only meet their learning needs but also lead to various teaching practices in the individual teachers' educational contexts and improve the diversity and quality of education.

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
  • Dimensionality reduction is one of the methods for handling big data in text mining. For dimensionality reduction, we should consider the density of the data, which has a significant influence on the performance of sentence classification. Higher-dimensional data requires many computations, which can eventually lead to high computational cost and overfitting in the model. Thus, a dimension reduction process is necessary to improve the performance of the model. Diverse methods have been proposed, ranging from merely lessening noise in the data, such as misspellings or informal text, to including semantic and syntactic information. In addition, the representation and selection of text features affect the performance of the classifier in sentence classification, which is one of the fields of Natural Language Processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data in the observation space. Existing methods utilize various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, which learn low-dimensional vector space representations of words that can capture semantic and syntactic information from data, are also utilized. To improve performance, recent studies have suggested methods in which the word dictionary is modified according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm identifies words that are not important, we assume that words similar to those words also have no impact on sentence classification. This study proposes two ways to achieve more accurate classification, which conduct selective word elimination under specific rules and construct word embeddings based on Word2Vec.
To select words of low importance from the text, we use the information gain algorithm to measure importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and form the word embedding. Second, we additionally select words that are similar to the words with low information gain values and build the word embedding. Finally, the filtered text and word embeddings are fed to two deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews of Kindle on Amazon.com, IMDB, and Yelp as datasets, and classifies each dataset using the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes exceeded 70% were classified as helpful reviews. Also, Yelp shows only the number of helpful votes. We extracted 100,000 reviews with more than five helpful votes by random sampling from 750,000 reviews. Minimal preprocessing, such as removing numbers and special characters from the text, was applied to each dataset. To evaluate the proposed methods, we compared their performance with that of Word2Vec and GloVe word embeddings that used all the words. We showed that one of the proposed methods performs better than the embeddings with all the words. Removing unimportant words improves performance; however, removing too many words lowers it. Future research should consider diverse preprocessing methods and an in-depth analysis of word co-occurrence to measure similarity values among words. Also, we applied the proposed method only with Word2Vec. Other embedding methods such as GloVe, fastText, and ELMo can be combined with the proposed methods, making it possible to identify effective combinations of word embedding methods and elimination methods.
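
The two elimination steps described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy documents, the two-dimensional word vectors, the thresholds, and the function names (`information_gain`, `words_to_remove`, `filter_docs`) are all assumptions made for the example, and a real pipeline would take its vectors from a trained Word2Vec model.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, word):
    """Information gain of the binary feature 'word occurs in the document'."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    conditional = sum(
        len(part) / len(labels) * entropy(part)
        for part in (present, absent)
        if part
    )
    return entropy(labels) - conditional


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def words_to_remove(docs, labels, vectors, ig_threshold, sim_threshold):
    """Step 1: words with low information gain.
    Step 2: remaining words whose vectors are close to a step-1 word."""
    vocab = set().union(*docs)
    low_ig = {w for w in vocab if information_gain(docs, labels, w) < ig_threshold}
    similar = {
        w
        for w in vocab - low_ig
        for r in low_ig
        if w in vectors and r in vectors
        and cosine(vectors[w], vectors[r]) >= sim_threshold
    }
    return low_ig, similar


def filter_docs(docs, removed):
    """Drop the removed words from every document."""
    return [d - removed for d in docs]


# Toy data (assumed): each document is a set of tokens, labels are 1/0.
docs = [
    {"great", "the", "an"},
    {"great", "the", "an"},
    {"bad", "the", "an"},
    {"bad", "the"},
]
labels = [1, 1, 0, 0]
# Hypothetical 2-d embeddings standing in for Word2Vec vectors.
vectors = {
    "the": [1.0, 0.0],
    "an": [0.9, 0.1],
    "great": [0.0, 1.0],
    "bad": [0.0, -1.0],
}

low_ig, similar = words_to_remove(docs, labels, vectors, 0.1, 0.9)
filtered = filter_docs(docs, low_ig | similar)
```

Both thresholds control how aggressively words are dropped, which matters given the abstract's observation that removing too many words lowers performance.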

Groping of a New Evaluation Method using the Knowledge State Analysis in the Selective Examination of Scientifical Gifted (과학영재 선발시험에서 지식상태 분석법을 통한 새로운 평가 방법 모색)

  • Park, Sang-Tae;Byun, Du-Won;Yuk, Keun-Cheol;Jung, Jum-Soon
    • Journal of Gifted/Talented Education / v.15 no.1 / pp.37-48 / 2005
  • Compared with other subjects, physics content is strongly connected in terms of knowledge order as grades go up. That is, what students have already learned, are learning, and will learn is closely related from grade to grade. We expect students to be proactive and creative in studying physics, which is a goal of the 21st century, and we analyze their knowledge structure based on the knowledge order through assessment. In particular, by using a computer system, we provide students with substantial feedback on the assessment, and objective validity is increased along with fast and exact processing, in a bid to help students' mathematical understanding grow. This paper seeks to analyze assessment data from the selective examination of the scientifically gifted by applying knowledge spaces and to apply the results to the development of an evaluation method.

An Exploration of the Process of Enhancing Science Self-Efficacy of High School Students in the STEM Track (자연계열 고등학생의 과학 자기효능감 향상 과정 탐색)

  • Shin, Seung-Hee;Mun, Kongju;Kim, Sung-Won
    • Journal of The Korean Association For Science Education / v.39 no.3 / pp.321-335 / 2019
  • This study aims to explore the influencing factors and the process of enhancing science self-efficacy (SSE) and to lay a foundation for understanding students' science self-efficacy. Ten categories related to science self-efficacy were derived by coding the interview data based on grounded theory and paradigm analysis in order to develop a process model of science self-efficacy improvement. Through the process analysis, four cyclical phases were found in the process of enhancing SSE: the 'entering into learning science' phase, the 'enhancing SSE' phase, the 'adjustment' phase, and the 'result' phase. More specifically, the 'entering into learning science' phase is where students choose the science track and are stimulated to construct SSE. The 'enhancing SSE' phase is where students taking the science track actively learn science and perform science activities. In the 'adjustment' phase, students come to perform successfully in learning science and carrying out science activities by using diverse strategies. Finally, the 'result' phase shows how students differ depending on their SSE levels. The phases were non-linear and repeated periodically depending on the situation. The core category in the selective coding was 'enhancing science self-efficacy.' Students' SSE forms through learning science and performing science activities. These findings may help to better understand the behavior of students who are taking the science track and to facilitate effective science learning by increasing their SSE levels.

Examination on unified Silla's cultural exchange and brick pagoda formation course (통일신라의 문화교류 및 전탑형성과정에 대한 고찰)

  • Kim, Sang-Gu;Lee, Jeong-Soo
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.8 / pp.5369-5377 / 2014
  • Korean pagodas were constructed as wooden pagodas, brick pagodas, stone pagodas, etc. However, the traditional pagodas that remain today are those made of nonflammable materials, such as brick and stone. Compared to the stone pagoda, there is data regarding brick pagodas, but there is little literature on how these pagodas were constructed. This appears to be because relatively few Korean brick pagodas remain, they are restricted to certain localities, the limits of the material were not overcome, the historical and regional problems of the pagodas have not been analyzed, and pagoda construction is centered on pagoda construction. Therefore, this study examined the local cultural characteristics of brick pagoda construction. The results are as follows. First, cultural exchange between Korea and China took place through the Silk Road, and there was also a marine route for cultural exchange. Such exchange was shared with the East Asia area as well, which can be seen by comparing remains in related areas. Exchange with China can be described as selective exchange by local powers as well as blind learning. Second, brick pagodas were constructed in Korea because good-quality soil was easily available. Uisang's Hwaeomjong was negotiated with the main power not agreeing with Buddhism, which was popularized and the local power. Third, brick pagoda construction was influenced by negotiations between Balhae and Silla, in which the ethnic influence was felt locally and can be described as a culturally selective result transferred from China. As a result, brick pagodas can be oriented by forming a unitary state rather than a small country within China's range of influence, as well as by cultural transfer through the Silk Road.

An Exploratory Study with Grounded Theory on Secondary Mathematics Teachers' Difficulties of Technology in Geometry Class (기하 수업에서 중등 수학교사가 경험한 공학도구 사용의 어려움에 대한 근거이론적 탐색)

  • Jeon, Soo Kyung;Cho, Cheong-Soo
    • Journal of Educational Research in Mathematics / v.24 no.3 / pp.387-407 / 2014
  • This study investigated secondary math teachers' difficulties with technology in geometry class using the grounded theory of Strauss and Corbin. A total of 178 secondary math teachers, who attended a professional development program on technology-based geometry teaching at eight locations in January 2014, participated in this study with informed consent. Data was collected with an open-ended questionnaire survey. In line with grounded theory, open, axial, and selective coding were applied to the data analysis. According to the results of this study, teachers were found to experience resistance in using technology due to new learning and changes, with knowledge and awareness of technology interacting effectively to lessen such resistance. In using technology, teachers were found to go through the 'access-resistance-unaccepted use-acceptance' stages. Teachers having difficulties in using technology fell into the following four types: 'inaccessible, denial of acceptance, discontinuation of use, and acceptance.' These findings suggest novel perspectives on teachers having difficulties in using technology, providing implications for teachers' professional development.

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.1-25 / 2020
  • In this paper, we suggest an application system architecture that provides an accurate, fast, and efficient automatic gasometer reading function. The system captures a gasometer image using a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the character information of the device ID and gas usage amount by selective optical character recognition based on deep learning technology. In general, an image contains many types of characters, and optical character recognition technology extracts all the character information in an image. However, some applications need to ignore character types that are not of interest and focus only on specific types of characters. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to send bills to users. Character strings that are not of interest, such as device type, manufacturer, manufacturing date, and specification, are not valuable information to the application. Thus, the application has to analyze only the region of interest and specific types of characters to extract the valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest for selective character information extraction. We built 3 neural networks for the application system. 
The first is a convolutional neural network that detects the regions of interest containing the gas usage amount and device ID character strings, the second is another convolutional neural network that transforms the spatial information of the region of interest into spatial sequential feature vectors, and the third is a bi-directional long short-term memory network that converts the spatial sequential information into character strings using time-series analysis mapping from feature vectors to character strings. In this research, the character strings of interest are the device ID and gas usage amount. The device ID consists of 12 Arabic numerals and the gas usage amount consists of 4 to 5 Arabic numerals. All system components are implemented in the Amazon Web Services cloud with an Intel Xeon E5-2686 v4 CPU and an NVIDIA Tesla V100 GPU. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. The mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes the reading request from the mobile device to an input queue with a FIFO (First In First Out) structure. The slave process consists of 3 types of deep neural networks that conduct the character recognition process and runs on the NVIDIA GPU module. The slave process constantly polls the input queue for recognition requests. If there are requests from the master process in the input queue, the slave process converts the image in the input queue into the device ID character string, the gas usage amount character string, and the position information of the strings, returns the information to the output queue, and switches to idle mode to poll the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the 3 types of deep neural networks. 
Of these, 22,985 images were used for training and validation, and 4,135 images were used for testing. We randomly split the 22,985 images in an 8:2 ratio for training and validation, respectively, for each training epoch. The 4,135 test images were categorized into 5 types (normal, noise, reflex, scale, and slant). Normal means clean image data, noise means images with a noise signal, reflex means images with light reflection in the gasometer region, scale means images with a small object size due to long-distance capturing, and slant means images that are not horizontally level. The final character string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
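
The queue-based request flow and the 8:2 split described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the system's code: threads stand in for the GPU worker processes, `recognize` is a hypothetical placeholder for the three neural networks, the workers block on the queue instead of polling it, and the names `serve` and `split_train_val` are invented for the example.

```python
import queue
import random
import threading


def split_train_val(items, train_parts=8, total_parts=10, seed=0):
    """Randomly split items into training and validation sets (8:2 by default)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // total_parts
    return shuffled[:cut], shuffled[cut:]


def recognize(image):
    """Hypothetical stand-in for detection + CRNN recognition.
    It simply echoes the fields stored in the toy 'image' record."""
    return {"device_id": image["device_id"], "usage": image["usage"]}


def worker(input_q, output_q):
    """Worker loop: take requests from the FIFO input queue until a stop marker."""
    while True:
        request = input_q.get()
        if request is None:  # stop marker pushed by the dispatcher
            break
        request_id, image = request
        output_q.put((request_id, recognize(image)))


def serve(images, num_workers=2):
    """Dispatcher: push requests, wait for the workers, collect the results."""
    input_q, output_q = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(input_q, output_q))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for request_id, image in enumerate(images):
        input_q.put((request_id, image))
    for _ in threads:
        input_q.put(None)  # one stop marker per worker
    for t in threads:
        t.join()
    results = {}
    while not output_q.empty():
        request_id, result = output_q.get()
        results[request_id] = result
    return results


# Toy requests (assumed values): a 12-digit device ID and a 4-digit usage reading.
images = [
    {"device_id": "123456789012", "usage": "0421"},
    {"device_id": "987654321098", "usage": "1375"},
]
results = serve(images)

# The split sizes for the 22,985 training/validation images reported above.
train, val = split_train_val(range(22985))
```

Tagging each request with an ID lets the dispatcher match results to requests even when several workers finish out of order.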

A Study on Regional-customized education program selection model using big data analysis (빅데이터 분석을 활용한 지역 맞춤형 교육프로그램 선정 모형 개발)

  • Hyeon-Seong Kim;Jin-Sook Kim
    • The Journal of the Convergence on Culture Technology / v.9 no.2 / pp.381-388 / 2023
  • This study aims to develop a regional-customized education program selection model using big data analysis. Based on a literature review, the concepts and characteristics of big data and lifelong education are analyzed. In addition, this study presents how to collect data for lifelong education and how to use big data in a way suited to the characteristics of lifelong education. Based on these results, a regional-customized lifelong education program selection model is developed in six steps. The customized education program model proposed in this study is highly flexible in practical use: it can be applied to real-time data provision methods such as the nationally approved Lifelong Learning Personal Status Survey without waiting a year for analysis, allowing for selective analysis and future predictions. It is clear that there is a significant need for and value in big data in the education field. Furthermore, all programs used in the sample model are provided free of charge, and, owing to the nature of programming, the community actively exchanges knowledge, which makes the model easy to modify and improve toward a more complete education program model in the future.