• Title/Summary/Keyword: machine learning framework

Search Result 250, Processing Time 0.024 seconds

Students' Performance Prediction in Higher Education Using Multi-Agent Framework Based Distributed Data Mining Approach: A Review

  • M. Nazir; A. Noraziah; M. Rahmah
    • International Journal of Computer Science & Network Security / v.23 no.10 / pp.135-146 / 2023
  • An effective educational program warrants an innovative structure that improves the efficacy of higher education, accelerating the achievement of desired results and reducing the risk of failure. The Educational Decision Support System (EDSS) has become a hot topic in educational systems, enabling student results to be monitored and evaluated throughout their development. Weak information systems struggle to take full advantage of an EDSS owing to poor accuracy, incorrect analysis of student characteristics, and inadequate databases. Data Mining Techniques (DMTs) provide helpful tools for discovering models or patterns in data and are extremely useful in decision making. Several researchers have studied distributed data mining combined with multi-agent technology, and the rapid growth of network technology and IT use has made distributed databases widespread. This article surveys the available data mining technology and the distributed data mining system framework. A distributed data mining approach is used here so that a classifier capable of predicting student success in the economics domain can be constructed. The article also discusses an Intelligent Knowledge Base Distributed Data Mining framework that assesses student performance through mid-term and final-term exams using multi-agent-system-based educational mining techniques. Using single and ensemble-based classifiers, the study investigates the factors that influence student performance in higher education and constructs a classification model that can predict academic achievement. The importance of multi-agent systems and comparative machine learning approaches in EDSS development is also discussed.
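The single-versus-ensemble comparison the abstract describes can be sketched minimally as a majority vote over simple base classifiers. The feature names (midterm, attendance, assignments) and thresholds below are invented for illustration, not taken from the reviewed studies:

```python
# Hedged sketch: a single base classifier vs. a majority-vote ensemble
# for pass/fail prediction. Features and thresholds are hypothetical.

def stump_midterm(student):
    """Single classifier: predict pass if the midterm score is >= 50."""
    return student["midterm"] >= 50

def stump_attendance(student):
    """Predict pass if attendance rate is >= 75%."""
    return student["attendance"] >= 0.75

def stump_assignments(student):
    """Predict pass if at least 6 assignments were submitted."""
    return student["assignments"] >= 6

def ensemble_predict(student):
    """Majority vote over the three base classifiers."""
    votes = [stump_midterm(student),
             stump_attendance(student),
             stump_assignments(student)]
    return sum(votes) >= 2

# A student with a weak midterm can still be predicted to pass
# when the other indicators outvote it.
student = {"midterm": 42, "attendance": 0.9, "assignments": 7}
print(ensemble_predict(student))
```

The point of the sketch is only the shape of the comparison: an ensemble can correct an individual classifier's mistake, which is why the study contrasts single and ensemble-based models.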

Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

  • Kim, JaeHun; Lee, Myungjin
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.43-61 / 2019
  • Development of artificial intelligence technologies has accelerated with the Fourth Industrial Revolution, and AI research has been actively conducted in fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s this research has focused on cognitive problems related to human intelligence, such as learning and problem solving, and recent interest in the technology and research on various algorithms have brought greater technological advances than ever. The knowledge-based system is a sub-domain of artificial intelligence that aims to enable AI agents to make decisions using machine-readable, processable knowledge constructed from complex and informal human knowledge and rules in various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and recently it has been combined with statistical artificial intelligence such as machine learning. Today the purpose of a knowledge base is to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data. Such knowledge bases support intelligent processing in many AI applications, for example the question-answering systems of smart speakers. However, building a useful knowledge base is time-consuming and still requires considerable expert effort. Much recent research in knowledge-based AI uses DBpedia, one of the largest knowledge bases, which aims to extract structured content from Wikipedia. DBpedia contains various information extracted from Wikipedia, such as titles, categories, and links, but its most useful knowledge comes from Wikipedia infoboxes, user-created summaries of some unifying aspect of an article.
This knowledge is created by mapping rules between infobox structures and the DBpedia ontology schema defined in the DBpedia Extraction Framework. Because the knowledge is generated from semi-structured infobox data created by users, DBpedia can expect high reliability in terms of accuracy. However, since only about 50% of all wiki pages in Korean Wikipedia contain an infobox, DBpedia is limited in terms of knowledge scalability. This paper proposes a method to extract knowledge from text documents according to the ontology schema using machine learning. To demonstrate the appropriateness of the method, we describe a knowledge extraction model that follows the DBpedia ontology schema by learning from Wikipedia infoboxes. The model consists of three steps: classifying documents into ontology classes, classifying the sentences appropriate for triple extraction, and selecting values and transforming them into RDF triples. Wikipedia infoboxes are defined by infobox templates that provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these templates. Based on these mapping relations, we classify an input document into infobox categories, i.e., ontology classes. After the document's class is determined, we classify the appropriate sentences according to the attributes belonging to that class. Finally, we extract knowledge from the sentences classified as appropriate and convert it into triples. To train the models, we generated a training data set from a Wikipedia dump by adding BIO tags to sentences, covering about 200 classes and about 2,500 relations. We also ran comparative experiments with CRF and Bi-LSTM-CRF models for the knowledge extraction step.
Through this process it is possible to utilize structured knowledge extracted from text documents according to the ontology schema, and the methodology can significantly reduce the expert effort needed to construct instances conforming to the schema.
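The BIO-tagging step used to generate the training data can be sketched as follows: tokens of a sentence receive B/I tags if they fall inside a known attribute value and O otherwise. The sentence and attribute value below are invented examples, not drawn from the paper's data set:

```python
# Sketch of BIO-tag generation for sequence-labeling training data:
# mark the first token of a matched value "B", continuation tokens "I",
# and everything else "O".

def bio_tags(tokens, value_tokens):
    """Tag the first occurrence of value_tokens inside tokens."""
    tags = ["O"] * len(tokens)
    n = len(value_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == value_tokens:
            tags[i] = "B"
            for j in range(i + 1, i + n):
                tags[j] = "I"
            break  # tag only the first match in this sketch
    return tags

tokens = "Barack Obama was born in Honolulu .".split()
print(bio_tags(tokens, ["Honolulu"]))
```

Sentences tagged this way are what a CRF or Bi-LSTM-CRF sequence labeler is then trained on, as in the comparative experiments the abstract mentions.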

Research study on cognitive IoT platform for fog computing in industrial Internet of Things (산업용 사물인터넷에서 포그 컴퓨팅을 위한 인지 IoT 플랫폼 조사연구)

  • Sunghyuck Hong
    • Journal of Internet of Things and Convergence / v.10 no.1 / pp.69-75 / 2024
  • This paper proposes an innovative cognitive IoT framework designed specifically for fog computing (FC) in the industrial Internet of Things (IIoT). The discussion centers on the intricate design and functional architecture of the cognitive IoT platform. A crucial feature of the platform is its integration of machine learning (ML) and artificial intelligence (AI), which enhances its operational flexibility and compatibility with a wide range of industrial applications. An exemplary application is the Predictive Maintenance-as-a-Service (PdM-as-a-Service) model, which focuses on real-time monitoring of machine conditions and transcends traditional maintenance approaches by leveraging real-time data analytics for maintenance and management operations. Empirical results substantiate the platform's effectiveness in a fog computing environment, illustrating its transformative potential for industrial IoT applications. The paper also delineates the inherent challenges and prospective research directions for cognitive IoT and fog computing within the IIoT.
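The predictive-maintenance idea of monitoring machine conditions in real time can be illustrated, in a deliberately simplified form, as flagging a machine when a rolling mean of a sensor reading drifts past a threshold. The window size, threshold, and vibration values below are illustrative assumptions, not from the paper:

```python
# Hedged PdM sketch: raise an alert when the rolling mean of a sensor
# stream exceeds a threshold. Window and threshold are hypothetical.

def rolling_mean_alert(readings, window=3, threshold=80.0):
    """Return one boolean per complete window: True means 'alert'."""
    alerts = []
    for i in range(window - 1, len(readings)):
        mean = sum(readings[i - window + 1:i + 1]) / window
        alerts.append(mean > threshold)
    return alerts

vibration = [70, 72, 75, 85, 90, 95]  # synthetic sensor readings
print(rolling_mean_alert(vibration))
```

A real PdM-as-a-Service pipeline would replace the fixed threshold with a learned model, but the monitor-then-flag loop is the part that runs at the fog layer close to the machines.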

MOnCa2: High-Level Context Reasoning Framework based on User Travel Behavior Recognition and Route Prediction for Intelligent Smartphone Applications (MOnCa2: 지능형 스마트폰 어플리케이션을 위한 사용자 이동 행위 인지와 경로 예측 기반의 고수준 콘텍스트 추론 프레임워크)

  • Kim, Je-Min; Park, Young-Tack
    • Journal of KIISE / v.42 no.3 / pp.295-306 / 2015
  • MOnCa2 is a framework for building intelligent smartphone applications based on smartphone sensors and ontology reasoning. In previous studies, MOnCa determined and inferred user situations from sensor values represented as ontology instances. This approach can recognize the user's spatial information or surrounding objects, but it cannot determine the user's physical context (travel behavior, travel destination). In this paper, MOnCa2 builds recognition models for travel behavior and routes from smartphone sensors in order to analyze the user's physical context, infers basic context about the user's travel behavior and routes by applying these models, and generates high-level context by applying ontology reasoning to the basic context, enabling intelligent applications. The paper focuses on approaches that recognize the user's travel behavior using smartphone accelerometers, predict personal routes and destinations using GPS signals, and infer high-level context through ontology realization.
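The first step of accelerometer-based travel-behavior recognition is usually to turn raw samples into per-window features that a classifier can consume. A minimal sketch, with synthetic data and a window size chosen purely for illustration:

```python
import math

# Sketch of windowed feature extraction from accelerometer samples:
# compute the magnitude of each (x, y, z) sample, then the mean and
# standard deviation per non-overlapping window.

def magnitude(sample):
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)

def window_features(samples, size=4):
    """Return (mean, std) of the magnitude for each complete window."""
    feats = []
    for i in range(0, len(samples) - size + 1, size):
        mags = [magnitude(s) for s in samples[i:i + size]]
        mean = sum(mags) / size
        var = sum((m - mean) ** 2 for m in mags) / size
        feats.append((mean, math.sqrt(var)))
    return feats

# A phone at rest shows a steady ~1 g magnitude and near-zero variance;
# walking or riding produces larger, more variable magnitudes.
print(window_features([(0, 0, 1)] * 4))
```

Feature vectors like these, one per window, are what a travel-behavior model (walking, bus, car, and so on) would be trained on.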

Design of Query Processing System to Retrieve Information from Social Network using NLP

  • Virmani, Charu; Juneja, Dimple; Pillai, Anuradha
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.3 / pp.1168-1188 / 2018
  • Social network aggregators are used to maintain and manage multiple accounts across online social networks. Displaying the activity feed for each social network on a common dashboard has long been the status quo of social aggregators; retrieving the desired data from the various networks, however, remains a major concern. A user inputs a query seeking a specific outcome from the social networks, but since the intention of the query is known only to the user, the output may not match the user's expectation unless the system considers user-centric factors. The quality of the solution depends on these user-centric factors, on the user's inclination, and on the nature of the network as well. Thus there is a need for a system that understands the user's intent and serves structured objects; choosing the best execution plan and optimal ranking functions is also a high-priority concern. Motivated by these requirements, this work proposes the design of a query processing system that retrieves information from social networks by extracting the user's intent. Machine learning techniques such as Latent Dirichlet Allocation (LDA) and a ranking algorithm are incorporated to improve the query results, and data mining techniques are used to fetch the information. The proposed framework contributes a user-centric, natural-language query retrieval model, and it is efficient when compared on temporal metrics. The proposed Query Processing System to Retrieve Information from Social Networks (QPSSN) increases the discoverability of the user and helps businesses collaboratively execute promotions and discover new networks and people. It is an innovative approach to investigating new aspects of social networks, and the proposed model achieves good precision and recall.
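The rank-and-return shape of such a system can be sketched with a deliberately simple stand-in scoring function. The real system uses LDA topics and a learned ranking function; the term-overlap score and the example posts below are placeholders for illustration only:

```python
# Simplified stand-in for the query-ranking step: score each post by
# how many of its terms appear in the query, sort descending, and
# drop posts with no overlap. Not the paper's LDA-based scoring.

def rank_posts(query, posts):
    q_terms = set(query.lower().split())
    scored = []
    for post in posts:
        score = sum(1 for t in post.lower().split() if t in q_terms)
        scored.append((score, post))
    scored.sort(key=lambda p: -p[0])  # stable sort keeps input order on ties
    return [post for score, post in scored if score > 0]

posts = ["great deals on phones", "my holiday photos", "new phone deals today"]
print(rank_posts("phone deals", posts))
```

Swapping the overlap score for a topic-model similarity is what turns this skeleton into the user-centric retrieval the abstract describes.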

Parallel k-Modes Algorithm for Spark Framework (스파크 프레임워크를 위한 병렬적 k-Modes 알고리즘)

  • Chung, Jaehwa
    • KIPS Transactions on Software and Data Engineering / v.6 no.10 / pp.487-492 / 2017
  • Clustering is a technique used to measure similarities between data in big data analysis and data mining. Among the various clustering methods, the k-Modes algorithm is the representative choice for categorical data. To speed up iteration-centric tasks such as k-Modes, the distributed, concurrent framework Spark has recently received great attention because it overcomes the limitations of Hadoop. Spark provides an environment that can process large amounts of data in main memory using abstract objects called RDDs. Spark ships with MLlib, a dedicated machine learning library, but MLlib includes only k-means, which handles continuous data, so categorical data cannot be processed. In this paper, we design RDDs for the k-Modes algorithm for categorical data clustering in the Spark environment and implement an algorithm that operates effectively. Experiments show that the proposed algorithm scales linearly in the Spark environment.
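The core k-Modes step that the paper distributes over Spark RDDs can be shown on a single machine: assign each categorical record to the nearest mode by Hamming (mismatch) distance, then recompute each mode attribute-wise. The records and modes below are toy data:

```python
# Single-machine sketch of one k-Modes iteration for categorical data.
from collections import Counter

def hamming(a, b):
    """Number of attribute positions where two records disagree."""
    return sum(x != y for x, y in zip(a, b))

def assign(records, modes):
    """Index of the nearest mode for each record."""
    return [min(range(len(modes)), key=lambda k: hamming(r, modes[k]))
            for r in records]

def update_modes(records, labels, k):
    """Recompute each cluster's mode: the most frequent value per attribute."""
    modes = []
    for c in range(k):
        members = [r for r, l in zip(records, labels) if l == c]
        mode = tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
        modes.append(mode)
    return modes

records = [("red", "s"), ("red", "m"), ("blue", "l"), ("blue", "l")]
modes = [("red", "s"), ("blue", "l")]
labels = assign(records, modes)
print(labels)
print(update_modes(records, labels, 2))
```

In the Spark version, `assign` becomes a map over a records RDD and `update_modes` a reduce-by-cluster; this sketch omits the empty-cluster edge case a full implementation must handle.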

CANVAS: A Cloud-based Research Data Analytics Environment and System

  • Kim, Seongchan; Song, Sa-kwang
    • Journal of the Korea Society of Computer and Information / v.26 no.10 / pp.117-124 / 2021
  • In this paper, we propose CANVAS (Creative ANalytics enVironment And System), the analytics system of the National Research Data Platform (DataON). CANVAS is a personalized analytics cloud service for researchers who need computing resources and tools for research data analysis. It is designed for scalability on a micro-services architecture and was built on top of open-source software such as the eGovernment Standard Framework (Spring Framework), Kubernetes, and JupyterLab. The system provides personalized analytics environments to multiple users, enabling high-speed, large-capacity analysis on high-performance cloud infrastructure (CPU/GPU). More specifically, data can be modeled and processed in JupyterLab or in a GUI workflow environment. Since CANVAS shares data with DataON, research data registered or downloaded by users can be processed directly in CANVAS. As a result, CANVAS improves the convenience of data analysis for DataON users and contributes to the sharing and utilization of research data.

Introduction and Utilization of Time Series Data Integration Framework with Different Characteristics (서로 다른 특성의 시계열 데이터 통합 프레임워크 제안 및 활용)

  • Hwang, Jisoo; Moon, Jaewon
    • Journal of Broadcast Engineering / v.27 no.6 / pp.872-884 / 2022
  • With the development of the IoT industry, different types of time series data are being generated across industries, and research is evolving toward reproducing and utilizing such data through re-integration. Moreover, owing to data processing speed and the constraints of real-world systems, there is a growing tendency to compress time series data before integrating and using it. However, guidelines for integrating time series data are not clear, and because characteristics such as recording interval and time span differ between series, batch integration is difficult to use directly. In this paper, two integration methods are proposed, based on how integration criteria are set and on the problems that arise when integrating time series data. On this basis, an integration framework for heterogeneous time series data was constructed that considers the characteristics of the series, and it was confirmed that compressed heterogeneous time series data can be integrated and used for various machine learning tasks.
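One integration criterion for series with different recording intervals is to resample them onto a shared time grid, carrying forward the most recent observation at each grid point. The timestamps (in seconds), values, and grid step below are assumptions for illustration:

```python
# Sketch of aligning two time series with different sampling intervals
# onto one grid via last-observation-carried-forward resampling.

def resample_last(series, grid):
    """series: sorted list of (timestamp, value); grid: list of timestamps.

    For each grid point, emit the latest observation at or before it
    (None if nothing has been observed yet).
    """
    out, i, last = [], 0, None
    for t in grid:
        while i < len(series) and series[i][0] <= t:
            last = series[i][1]
            i += 1
        out.append((t, last))
    return out

fast = [(0, 1.0), (5, 1.2), (10, 1.1), (15, 1.3)]  # regular 5 s interval
slow = [(0, 20.0), (12, 21.0)]                     # irregular interval
grid = [0, 10, 20]
print(resample_last(fast, grid))
print(resample_last(slow, grid))
```

Once both series sit on the same grid, they can be joined row-wise into one table, which is the form most machine learning pipelines expect.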

Quantile Co-integration Application for Maritime Business Fluctuation (분위수 공적분 모형과 해운 경기변동 분석)

  • Kim, Hyun-Sok
    • Journal of Korea Port Economic Association / v.38 no.2 / pp.153-164 / 2022
  • In this study, we estimate a quantile-regression framework for the shipping industry using the Capesize used-ship market, a representative segment of raw material transportation, from January 2000 to December 2021. The research makes two main contributions. First, we analyze the relationship between the Capesize used-ship market and the freight market, for which mixed empirical results have been reported. Second, we present an empirical model that incorporates the structural transformation proposed by Hyunsok Kim and Myung-hee Chang (2020a) into quantile regression. In the structural change investigation, the empirical results confirm that the quantile model can overcome the problems caused by non-stationarity in time series analysis. The long-run relationship of the co-integration framework is then divided into long- and short-run effects of the exogenous variables, and this is extended to a prediction model subdivided by quantile. The results provide a basis for extending shipping-theory analysis toward artificial intelligence and machine learning approaches.
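The building block of any quantile-regression estimate is the check (pinball) loss, whose asymmetry is what makes a chosen quantile, rather than the mean, the target of estimation. A minimal illustration with invented numbers:

```python
# The check (pinball) loss minimized by quantile regression.
# tau is the target quantile; minimizing the mean of this loss over a
# sample yields the empirical tau-quantile.

def pinball_loss(y, y_hat, tau):
    diff = y - y_hat
    # Under-prediction (diff > 0) is weighted tau,
    # over-prediction (diff < 0) is weighted (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1) * diff

# At tau = 0.9, under-predicting by 2 costs far more than
# over-predicting by 2, pulling the fit toward the upper tail.
print(pinball_loss(10.0, 8.0, 0.9))
print(pinball_loss(10.0, 12.0, 0.9))
```

Estimating the co-integrating relationship at several values of tau, as the study does, amounts to minimizing this loss separately per quantile, which is how the long-run relationship gets subdivided by quantile.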

Research on BGP dataset analysis and CyCOP visualization methods (BGP 데이터셋 분석 및 CyCOP 가시화 방안 연구)

  • Jae-yeong Jeong; Kook-jin Kim; Han-sol Park; Ji-soo Jang; Dong-il Shin; Dong-kyoo Shin
    • Journal of Internet Computing and Services / v.25 no.1 / pp.177-188 / 2024
  • As technology evolves, Internet usage continues to grow, resulting in a geometric increase in network traffic and communication volumes. The network path selection process, one of the core elements of the Internet, is consequently becoming more complex and advanced; it is important to manage and analyze it effectively, and a representation and visualization method that can be understood intuitively is needed. To this end, this study designs a framework that analyzes network data using BGP, a network path selection protocol, and applies it to the cyber common operating picture (CyCOP) for situational awareness. We then analyze the visualization elements required to present the information and conduct an experiment implementing a simple visualization. The visualization screens, implemented with the data collected and preprocessed in the experiment, help commanders and security personnel understand the network situation effectively and exercise command and control.
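Before visualization, a BGP dataset is typically reduced to simple summary statistics. A hedged sketch of one such preprocessing step, computing AS-path lengths and the most common first-hop AS; the path strings use private-use AS numbers invented for the example, not real routing data:

```python
# Sketch of summarizing BGP AS-path strings prior to visualization.
from collections import Counter

def parse_as_path(path):
    """Split a whitespace-separated AS-path string into integer ASNs."""
    return [int(asn) for asn in path.split()]

def path_stats(paths):
    """Return (longest path length, most common first-hop AS)."""
    parsed = [parse_as_path(p) for p in paths]
    lengths = [len(p) for p in parsed]
    first_hops = Counter(p[0] for p in parsed if p)
    return max(lengths), first_hops.most_common(1)[0][0]

paths = ["64512 64513 64514", "64512 64515", "64516 64513"]
print(path_stats(paths))
```

Aggregates like these (path-length distributions, dominant neighbors) are the kind of derived values a CyCOP screen would plot for situational awareness.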