• Title/Summary/Keyword: Heterogeneity Learning

The Effect of Worker Heterogeneity in Learning and Forgetting on System Productivity (학습과 망각에 대한 작업자들의 이질성 정도가 시스템 생산성에 미치는 영향)

  • Kim, Sungsu
    • Journal of the Korean Operations Research and Management Science Society / v.40 no.4 / pp.145-156 / 2015
  • Incorporating individual learning and forgetting behaviors into worker-task assignment models produces a mixed integer nonlinear program (MINLP), which is NP-hard and difficult to solve because of the nonlinearity of its objective function. Previous studies of workforce scheduling that account for learning and forgetting commonly assume that workers are homogeneous. This paper extends previous research by considering heterogeneous individual learning and forgetting, and investigates how worker heterogeneity in initial expertise, steady-state productivity, learning, and forgetting affects system performance, in order to support managers' decisions on worker-task assignments without solving complex MINLP models. To understand the performance implications of workforce heterogeneity, the paper examines analytically how heterogeneity in each of the four parameters of the exponential learning and forgetting (L/F) model affects system performance in three cases: consecutive assignments with no break, n breaks of length s each, and a total of b break periods occurring over T periods. The study presents the direction of change in worker performance under different assignment schedules as the variance in initial expertise, steady-state productivity, learning, or forgetting increases. It thus indicates whether a more heterogeneous workforce, in terms of each of the four L/F parameters, is desirable under different schedules from the perspective of system productivity.
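
The four L/F parameters named in the abstract above can be illustrated with a minimal sketch. The update rules below are an assumption chosen for illustration (productivity moves exponentially toward the steady state while working and decays exponentially toward the initial level during breaks). The abstract does not give the paper's exact functional form, and `simulate_output` and its parameter names are hypothetical.

```python
import math


def simulate_output(schedule, initial, steady_state, learn_rate, forget_rate):
    """Total output of one worker over a schedule of work/break periods.

    schedule: sequence of booleans, True = working period, False = break.
    Assumed dynamics (illustrative only, not taken from the paper):
      - working: productivity closes part of the gap to `steady_state`
      - break:   productivity decays part of the way back to `initial`
    """
    level = initial
    total = 0.0
    for working in schedule:
        if working:
            total += level  # output produced in this period
            level = steady_state - (steady_state - level) * math.exp(-learn_rate)
        else:
            level = initial + (level - initial) * math.exp(-forget_rate)
    return total


# Same number of working periods (six), with and without breaks.
consecutive = [True] * 6
interrupted = [True, True, False, False, True, True, False, False, True, True]

params = dict(initial=1.0, steady_state=10.0, learn_rate=0.3, forget_rate=0.2)
output_consecutive = simulate_output(consecutive, **params)
output_interrupted = simulate_output(interrupted, **params)
```

Under these assumed dynamics, six consecutive working periods yield more output than the same six periods interrupted by breaks, because each break erodes part of the accumulated learning. Varying one parameter across a group of simulated workers while holding the others fixed is one way to explore the kind of heterogeneity question the abstract describes.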

FedGCD: Federated Learning Algorithm with GNN based Community Detection for Heterogeneous Data

  • Wooseok Shin;Jitae Shin
    • Journal of Internet Computing and Services / v.24 no.6 / pp.1-11 / 2023
  • Federated learning (FL) is a groundbreaking machine learning paradigm that allows multiple participants to collaboratively train models in a cloud environment while maintaining the privacy of their raw data. This approach is invaluable in applications involving sensitive or geographically distributed data. However, one of the challenges in FL is dealing with heterogeneous data that is not independent and identically distributed (non-IID) across participants, which can result in suboptimal model performance compared to traditional machine learning methods. To tackle this, we introduce FedGCD, a novel FL algorithm that employs Graph Neural Network (GNN)-based community detection to enhance model convergence in federated settings. In our experiments, FedGCD consistently outperformed existing FL algorithms in various scenarios: for instance, in a non-IID environment, it achieved an accuracy of 0.9113, a precision of 0.8798, and an F1-Score of 0.8972. In a semi-IID setting, it demonstrated the highest accuracy at 0.9315 and an F1-Score of 0.9312. We also introduce a new metric, nonIIDness, to quantitatively measure the degree of data heterogeneity. Our results indicate that FedGCD not only addresses the challenges of data heterogeneity and non-IIDness but also sets new benchmarks for FL algorithms. The community detection approach adopted in FedGCD has broader implications, suggesting that it could be adapted for other distributed machine learning scenarios, thereby improving model performance and convergence across a range of applications.

Hybrid Learning-Based Cell Morphology Profiling Framework for Classifying Cancer Heterogeneity (암의 이질성 분류를 위한 하이브리드 학습 기반 세포 형태 프로파일링 기법)

  • Min, Chanhong;Jeong, Hyuntae;Yang, Sejung;Shin, Jennifer Hyunjong
    • Journal of Biomedical Engineering Research / v.42 no.5 / pp.232-240 / 2021
  • Heterogeneity in cancer is the major obstacle to precision medicine and has become a critical issue in the field of cancer diagnosis. Many attempts have been made to disentangle this complexity by molecular classification. However, the multi-dimensional information in the dynamic responses of cancer poses fundamental limitations on conventional approaches based on biomolecular markers. Cell morphology, which reflects the physiological state of the cell, can be used to conveniently track the temporal behavior of cancer cells. Here, we first present a hybrid learning-based platform that extracts cell morphology in a time-dependent manner using a deep convolutional neural network to incorporate multivariate data. Feature selection is conducted on more than 200 morphological features, filtering out less significant variables to enhance interpretation. Our platform then performs unsupervised clustering to unveil dynamic behavior patterns hidden in the high-dimensional dataset. As a result, we visualize the morphology state-space through a two-dimensional embedding, together with representative morphology clusters and trajectories. This cell morphology profiling strategy based on hybrid learning simplifies the heterogeneous population of cancer.

Machine Learning Aided Tracking Analysis of Haze Pollution and Regional Heterogeneity

  • Gu, Fangfang;Jiang, Keshen;Cao, Fangdong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.6 / pp.2031-2048 / 2021
  • Air pollution not only reduces the overall competitiveness of tourist destinations but also changes tourists' travel decisions, thereby affecting tourism flows. This study presents a machine learning method to analyze the spatial effect of haze pollution on tourism flows in China from 2001 to 2018, and reveals regional heterogeneity among eastern, central, and western China. Our investigation yields three observations. First, the Environmental Kuznets Curve of the impact of haze pollution on tourism flows is not significant. In the eastern and western regions, the interaction between haze pollution and both domestic and inbound tourism flows follows an inverted U-shaped curve. Second, there is a significantly positive spillover effect of tourism flows in all of the eastern, central, and western regions. The spillover intensity of domestic tourism flows is higher than that of inbound tourism flows, and both are greatest in the eastern region. Third, haze pollution in China mainly reduces inbound tourism flows, and imposes significantly negative direct effects on domestic tourism flows only in the central region. In the central and eastern regions, it exerts significantly negative direct and spillover effects on inbound tourism.

Collaborative Modeling of Medical Image Segmentation Based on Blockchain Network

  • Yang Luo;Jing Peng;Hong Su;Tao Wu;Xi Wu
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.3 / pp.958-979 / 2023
  • Due to laws, regulations, privacy concerns, and other factors, between 70 and 90 percent of providers do not share medical data, forming "data islands". It is therefore essential to collaborate across multiple institutions without sharing patient data. Most existing methods adopt distributed learning and a centralized federated architecture to solve this problem, but resource heterogeneity and data heterogeneity arise in practical applications. This paper proposes a collaborative deep learning modelling method based on a blockchain network. To protect privacy, the training process transmits encrypted parameters instead of the original remote source data. Hyperledger Fabric blockchain is adopted so that the parties do not depend on a third-party authoritative verifier, which to a certain extent avoids the distrust and single point of failure caused by a centralized system. The aggregation step uses the FedProx algorithm to address device heterogeneity and data heterogeneity. The experiments show that the maximum improvement in segmentation accuracy in the proposed collaborative training mode is 11.179% compared to local training. In the sequential training mode, the average accuracy improvement is greater than 7%. In the parallel training mode, the average accuracy improvement is greater than 8%. The experimental results show that the proposed model can solve the current problem of centralized modelling of multicenter data. In particular, it provides ideas for solving privacy protection and breaking "data silos", and protects all data.
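
The abstract above names FedProx as the aggregation algorithm. FedProx adds a proximal term, (mu / 2) * ||w - w_global||^2, to each client's local objective so that heterogeneous clients do not drift too far from the shared model. The following is a minimal sketch of that idea on a least-squares loss. It is not the paper's implementation: the loss, the hyperparameters, and the function names `fedprox_local_update` and `aggregate` are assumptions for illustration, and encryption and the blockchain layer are omitted.

```python
import numpy as np


def fedprox_local_update(w_global, X, y, mu=0.1, lr=0.01, steps=100):
    """Run gradient descent on one client's loss plus the FedProx proximal term.

    Local objective (assumed for illustration):
        0.5 * mean((X @ w - y) ** 2) + (mu / 2) * ||w - w_global||^2
    """
    w = w_global.copy()
    n = len(y)
    for _ in range(steps):
        grad_loss = X.T @ (X @ w - y) / n   # gradient of the least-squares loss
        grad_prox = mu * (w - w_global)     # gradient of the proximal term
        w -= lr * (grad_loss + grad_prox)
    return w


def aggregate(client_weights, client_sizes):
    """Average client models, weighted by local dataset size."""
    sizes = np.asarray(client_sizes, dtype=float)
    return np.average(np.stack(client_weights), axis=0, weights=sizes)


# One synthetic client whose data pulls the model away from the global weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5])
w_global = np.zeros(3)

w_free = fedprox_local_update(w_global, X, y, mu=0.0)   # plain local training
w_prox = fedprox_local_update(w_global, X, y, mu=10.0)  # strong proximal pull
```

With `mu=0.0` the update reduces to plain local gradient descent. A larger `mu` keeps the client's weights closer to `w_global`, which is the mechanism FedProx relies on when client data or compute budgets differ.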

The Effects of Small Group's Cooperative Learning According to Personality Types on Young Children's Science Activities (성격유형별 소집단 협동학습이 유아의 과학활동에 미치는 효과)

  • Kang, Sang;Shin, Ji-Hye
    • Korean Journal of Childcare and Education / v.9 no.1 / pp.201-220 / 2013
  • This study focuses on science activities that require a collaborative inquiry process and evaluates the effects of small-group cooperative science learning, organized by personality type, on young children's science activities. The subjects were 30 five-year-old kindergarteners. They were divided equally into three groups of 10 members each: an extroversion (E) group, an introversion (I) group, and a heterogeneous mixed EI group, based on the EI indicator, using the K-ABC cognitive ability test and the MMTIC personality type test. For data analysis, scientific attitude was analyzed with ANCOVA, and scientific knowledge development was analyzed with frequency analysis. The results showed, first, a difference in scientific knowledge development between the homogeneous groups and the heterogeneous group in small-group cooperative learning. A Scheffé post-hoc test showed a significant difference between the E and I homogeneous groups, but no difference between the I homogeneous group and the heterogeneous group, or between the E homogeneous group and the heterogeneous group. The I homogeneous group was the most effective group composition for improving scientific attitude.

Ontology Mapping and Rule-Based Inference for Learning Resource Integration

  • Jetinai, Kotchakorn;Arch-int, Ngamnij;Arch-int, Somjit
    • Journal of information and communication convergence engineering / v.14 no.2 / pp.97-105 / 2016
  • With the increasing demand for interoperability among existing learning resource systems in order to enable the sharing of learning resources, such resources need to be annotated with ontologies that use different metadata standards. These different ontologies must be reconciled through ontology mediation, so as to cope with information heterogeneity problems, such as semantic and structural conflicts. In this paper, we propose an ontology-mapping technique using Semantic Web Rule Language (SWRL) to generate semantic mapping rules that integrate learning resources from different systems and that cope with semantic and structural conflicts. Reasoning rules are defined to support a semantic search for heterogeneous learning resources, which are deduced by rule-based inference. Experimental results demonstrate that the proposed approach enables the integration of learning resources originating from multiple sources and helps users to search across heterogeneous learning resource systems.

Enhancing LoRA Fine-tuning Performance Using Curriculum Learning

  • Daegeon Kim;Namgyu Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.3 / pp.43-54 / 2024
  • Research on Language Models has grown rapidly in recent years, and Large Language Models have achieved innovative results in various tasks. However, their practical application is limited by the resources and costs required to use them. Consequently, methods for using models effectively within given resources have recently attracted attention. Curriculum Learning, a methodology that categorizes training data by difficulty and learns from it sequentially, has been attracting attention, but its methods of measuring difficulty are complex or not universal. In this study, we therefore propose a data heterogeneity-based Curriculum Learning methodology that measures the difficulty of data using reliable prior information and can be applied easily across various tasks. To evaluate the performance of the proposed methodology, experiments were conducted using 5,000 specialized documents in the field of information and communication technology and 4,917 documents in the field of healthcare. The results confirm that the proposed methodology outperforms traditional fine-tuning in terms of classification accuracy in both LoRA fine-tuning and full fine-tuning.
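
The general Curriculum Learning idea described in the abstract above, ordering training data from easy to hard and learning in stages, can be sketched as follows. The difficulty scores here are placeholders: the abstract does not specify how the heterogeneity-based difficulty is computed, so `curriculum_stages` and its inputs are hypothetical.

```python
import math


def curriculum_stages(samples, difficulty, n_stages=3):
    """Split samples into at most `n_stages` groups, ordered from easiest to hardest.

    samples:    list of training items (any type)
    difficulty: one numeric score per sample; lower means easier
                (how the score is obtained is outside this sketch)
    """
    if len(samples) != len(difficulty):
        raise ValueError("samples and difficulty must have the same length")
    if not samples:
        return []
    # Indices of the samples sorted by ascending difficulty.
    order = sorted(range(len(samples)), key=lambda i: difficulty[i])
    size = math.ceil(len(order) / n_stages)
    return [
        [samples[i] for i in order[start:start + size]]
        for start in range(0, len(order), size)
    ]


docs = ["a", "b", "c", "d", "e"]
scores = [0.9, 0.1, 0.5, 0.3, 0.7]  # placeholder difficulty values
stages = curriculum_stages(docs, scores, n_stages=3)
```

A training loop would then fine-tune on `stages[0]` first and move on to later stages in order. Whether earlier stages are revisited along with later ones is a design choice the abstract does not settle.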

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.1-19 / 2018
  • Since forecasting stock movements is important both academically and practically, studies on stock price prediction have been actively conducted. Stock price forecasting research is classified by whether it uses structured or unstructured data, and is further divided into technical analysis, fundamental analysis, and media effect analysis. In the big data era, research on stock price prediction that incorporates big data is actively underway, and it relies mainly on machine learning techniques applied to large amounts of data. In particular, methods that incorporate media effects have recently attracted attention, among which studies that analyze online news to forecast stock prices are becoming mainstream. Previous studies that predict stock prices from online news mostly perform sentiment analysis of news, building a separate corpus for each company and constructing a dictionary that predicts stock prices by recording responses according to past stock prices. Existing studies have therefore examined the impact of online news on individual companies. For example, stock movements of Samsung Electronics are predicted using only online news about Samsung Electronics. In addition, methods that consider influences among highly related companies have also been studied recently. For example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These previous studies examine the effects of news about a homogeneous industrial sector on an individual company, with homogeneous industries classified according to the Global Industry Classification Standard (GICS). In other words, existing studies assume that industries grouped under the GICS are homogeneous. However, existing studies are limited in that they do not take into account influential companies with high relevance or reflect the heterogeneity that exists within the same GICS sector. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome these limitations, our study proposes a methodology that reflects the heterogeneous effects of the industrial sector on stock prices by applying k-means clustering. Multiple Kernel Learning is mainly used to integrate data with various characteristics: it has several kernels, each of which receives different data and makes predictions from it. We used Multiple Kernel Learning to incorporate the effects of the target firm and its relevant firms simultaneously. Each kernel was assigned to predict stock prices from financial news variables for the target firm and for the industrial groups obtained by k-means cluster analysis. To show that the proposed methodology is appropriate, experiments were conducted on three years of online news and stock prices. The results of this study are as follows. (1) We confirmed that information about the industrial sectors related to the target company also contains meaningful information for predicting the target company's stock movements, and that the machine learning algorithm has better predictive power when news about the relevant companies is considered together with news about the target company. (2) It is important to predict stock movements with a number of clusters that varies according to the level of homogeneity in the industrial sector. In other words, when stock prices are homogeneous within an industrial sector, it is important to use the relational effect at the level of the industry group without clustering, or with a small number of clusters. When stock prices are heterogeneous within an industry group, it is important to cluster the firms into groups. This study contributes by demonstrating that firms classified under the GICS are heterogeneous and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply by the GICS. It also contributes by demonstrating the efficiency of a prediction model that reflects heterogeneity.

A Study on Blockchain-Based Asynchronous Federated Learning Framework

  • Qian, Zhuohao;Latt, Cho Nwe Zin;Kang, Sung-Won;Rhee, Kyung-Hyune
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.272-275 / 2022
  • Federated learning can be used in conjunction with blockchain technology to provide good privacy protection and a reward distribution mechanism for intelligent IoT in edge computing scenarios. However, synchronous federated learning ignores the waiting delay caused by the heterogeneity of edge devices (different computing power, communication bandwidth, and dataset size). Moreover, the potential of smart contracts for flexible design has not been fully explored. This paper investigates a fusion application based on FLchain, which combines asynchronous federated learning and blockchain, discusses communication optimization, and explores feasible smart contract designs to address these problems.