• Title/Summary/Keyword: multi-task learning

Deep Learning-Based Dynamic Scheduling with Multi-Agents Supporting Scalability in Edge Computing Environments (멀티 에이전트 에지 컴퓨팅 환경에서 확장성을 지원하는 딥러닝 기반 동적 스케줄링)

  • JongBeom Lim
    • KIPS Transactions on Software and Data Engineering / v.12 no.9 / pp.399-406 / 2023
  • Cloud computing has evolved to support edge computing architectures that combine a fog management layer with edge servers. The main reason it has received so much attention is its low communication latency for real-time IoT applications. At the same time, various cloud task scheduling techniques based on artificial intelligence have been proposed. These techniques perform better than existing methods, but they have relatively high scheduling time. In this paper, we propose a deep learning-based dynamic scheduling method with multi-agents that supports scalability in edge computing environments. The proposed method has lower scheduling time than previous artificial intelligence-based scheduling techniques. To show its effectiveness, we compare the performance of the previous and proposed methods in a scalable experimental environment. The results show that our method supports real-time IoT applications with low scheduling time and completes more cloud tasks in a scalable experimental environment.

Low-Resource Morphological Analysis for Kazakh using Multi-Task Learning (Low-Resource 환경에서 Multi-Task 학습을 이용한 카자흐어 형태소 분석)

  • Kaibalina, Nazira;Park, Seong-Bae
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.437-440 / 2021
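  • Over the past ten years, machine learning has driven much progress in natural language processing. Problems such as machine translation and question answering achieve high accuracy in languages for which abundant data is available. Low-resource languages, however, cannot reach the same level of performance. Kazakh is a low-resource setting because no large dataset has been built for its morphological analysis. Kazakh is an agglutinative language in which a single root can generate hundreds of word forms. Morphological analysis of Kazakh sentences is therefore a basic step in understanding their meaning. Because existing Kazakh datasets lack detailed morphological analysis, models cannot be trained sufficiently on them, so this paper proposes a new dataset. This paper also proposes a neural network-based Kazakh morphological analyzer that can achieve high accuracy in a low-resource setting.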

Saliency-Assisted Collaborative Learning Network for Road Scene Semantic Segmentation

  • Haifeng Sima;Yushuang Xu;Minmin Du;Meng Gao;Jing Wang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.3 / pp.861-880 / 2023
  • Semantic segmentation of road scenes is a key technology for autonomous driving, and improvements in convolutional neural network architectures have improved segmentation performance. Existing convolutional neural networks, however, learn overly simplified knowledge and have high model complexity. To address these issues, we propose a road scene semantic segmentation algorithm based on multi-task collaborative learning. First, an atrous spatial pyramid pooling module built on depthwise separable convolutions is proposed to reduce model complexity. Second, a collaborative learning framework involving saliency detection is proposed, and the joint loss function is defined using homoscedastic uncertainty to suit the new learning model. Experiments are conducted on road scene and natural scene datasets. The proposed method achieves 70.94% and 64.90% mIoU on the Cityscapes and PASCAL VOC 2012 datasets, respectively. Qualitatively, compared with high-performing methods, the proposed method has significant advantages in segmenting fine targets and boundaries.
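
The abstract does not give the form of the joint loss. As a minimal sketch, assuming the widely used homoscedastic-uncertainty weighting in which each task loss L_i is scaled by exp(-s_i) and regularized by s_i, where s_i = log(sigma_i^2) is a learnable log-variance, the combination of a segmentation loss and a saliency loss might look like the following. The function name and the loss values are hypothetical and are not taken from the paper.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses using log-variances s_i = log(sigma_i^2).

    total = sum_i( exp(-s_i) * L_i + s_i )

    A large s_i down-weights task i, while the additive s_i term keeps
    the model from inflating every variance to ignore all tasks.
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("need exactly one log-variance per task loss")
    return sum(
        math.exp(-s) * loss + s
        for loss, s in zip(task_losses, log_variances)
    )


# Hypothetical loss values for the segmentation and saliency tasks.
segmentation_loss, saliency_loss = 0.8, 0.3

# With both log-variances at zero the tasks are weighted equally.
total = uncertainty_weighted_loss([segmentation_loss, saliency_loss], [0.0, 0.0])
print(total)
```

In a real training loop the log-variances would be trainable parameters updated together with the network weights; here they are plain floats so that the arithmetic is easy to follow.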

A Multi-category Task for Bitrate Interval Prediction with the Target Perceptual Quality

  • Yang, Zhenwei;Shen, Liquan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.12 / pp.4476-4491 / 2021
  • Video service providers often face user network problems when transmitting video streams. They strive to provide users with superior video quality in a limited-bitrate environment. It is therefore necessary to accurately determine the target bitrate range of a video under different quality requirements. Recently, several schemes have been proposed to meet this requirement. However, they do not take visual influence into account. In this paper, we propose a new multi-category model that uses machine learning to accurately predict the target bitrate range for a target visual quality. First, a dataset is constructed for training the multi-category models. The quality score ladders and the corresponding bitrate-interval categories are defined in the dataset. Second, several types of spatial-temporal features related to the VMAF evaluation metric and to visual factors are extracted and processed statistically for classification. Finally, bitrate prediction models trained on the dataset with a RandomForest classifier can be used to accurately predict the target bitrate of input videos for a target video quality. The classification accuracy of the model reaches 0.705, and video encoded at the bitrate predicted by the model achieves the target perceptual quality.
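
To illustrate how a continuous bitrate can be turned into the bitrate-interval categories that a classifier predicts, here is a minimal sketch. The interval boundaries and function names are hypothetical, since the abstract does not list the paper's actual ladder, and the classifier itself is not shown.

```python
from bisect import bisect_right

# Hypothetical interval boundaries in kbps (not taken from the paper).
BITRATE_BOUNDARIES_KBPS = [500, 1000, 2000, 4000]


def bitrate_to_category(bitrate_kbps):
    """Return the index of the half-open interval containing bitrate_kbps.

    Category 0 is [0, 500), category 1 is [500, 1000), and so on;
    the last category, 4, is [4000, inf).
    """
    if bitrate_kbps < 0:
        raise ValueError("bitrate must be non-negative")
    return bisect_right(BITRATE_BOUNDARIES_KBPS, bitrate_kbps)


def category_to_interval(category):
    """Map a predicted category back to its (low, high) bitrate range."""
    edges = [0] + BITRATE_BOUNDARIES_KBPS + [float("inf")]
    return edges[category], edges[category + 1]


category = bitrate_to_category(1500)
print(category, category_to_interval(category))
```

Predicting an interval rather than a single value turns the problem into multi-class classification, which is the setting the abstract describes.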

A Research Review on Effective Use of IS drawn on Multi-level Dynamic Capability (정보시스템 분석수준 별 역동적 역량에 기반한 효율적 사용에 관한 연구 리뷰)

  • Kang, Hyunjeong
    • The Journal of Information Systems / v.29 no.2 / pp.27-50 / 2020
  • Purpose: Research on the effective use of IS needs to embrace alignment with the organizational learning process, which expands the limited focus on the dynamic capability of IS use. In addition, it should be conducted as a multi-level analysis covering system, user, task, and organization. The current study suggests including multi-level analysis of the effective use of IS from the perspective of exploration and exploitation. Design/methodology/approach: This review selected representative studies in the IS discipline that have investigated the effective use of IS, dynamic capability, operational capability, exploration, exploitation, or organizational learning. From a search of academic archives with those keywords, the seventeen most-cited papers were chosen and checked for whether their focal constructs directly theorize or validate the suggested keywords. In addition, the level of analysis was checked for whether it includes one or more of the system, individual, task, or organization levels. Based on the initial analysis of dynamic capability, a further review of research on explorational and exploitational capabilities was carried out. Findings: The review of previous literature on the effective use of IS shows that it has largely been studied at the individual level, and few studies have included the organization level. Similarly, explorational and exploitational use of IS has rarely been investigated directly. In-depth study of the effective use of IS has been called for over the past decade. However, the review shows that the topic still lacks profound theories and empirical validation compared with the adoption stage of IS. Based on the review, future research on the transition between explorational and exploitational use of IS is suggested.

Weakly-supervised Semantic Segmentation using Exclusive Multi-Classifier Deep Learning Model (독점 멀티 분류기의 심층 학습 모델을 사용한 약지도 시맨틱 분할)

  • Choi, Hyeon-Joon;Kang, Dong-Joong
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.6 / pp.227-233 / 2019
  • With the recent development of deep learning techniques, neural networks are achieving success in the computer vision field. Convolutional neural networks have shown outstanding performance not only on simple image classification tasks but also on difficult tasks such as object segmentation and detection. However, many such deep learning models are based on supervised learning, which requires annotations more detailed than image-level labels. In particular, image semantic segmentation models require pixel-level annotations for training, which are very costly. To solve these problems, this paper proposes a weakly-supervised semantic segmentation method that requires only image-level labels to train the network. Existing weakly-supervised learning methods are limited in that they detect only specific areas of an object. In this paper, by contrast, we use a multi-classifier deep learning architecture so that our model recognizes more of the different parts of objects. The proposed method is evaluated on the VOC 2012 validation dataset.

The Improved Joint Bayesian Method for Person Re-identification Across Different Camera

  • Hou, Ligang;Guo, Yingqiang;Cao, Jiangtao
    • Journal of Information Processing Systems / v.15 no.4 / pp.785-796 / 2019
  • Owing to variations in viewpoint, illumination, gait, and background conditions, person re-identification across cameras has been a challenging task in video surveillance. To address this problem, a novel method called Joint Bayesian across different cameras for person re-identification (JBR) is proposed. Motivated by the superior measurement ability of Joint Bayesian, a set of Joint Bayesian matrices is obtained by learning from different camera pairs. Together with the global Joint Bayesian matrix, the proposed method combines the characteristics of multi-camera shooting and person re-identification. By learning the transition between two cameras, the method improves the precision of the similarity computed between two individuals. The proposed method is evaluated on two comparatively large-scale re-ID datasets, Market-1501 and DukeMTMC-reID. Rank-1 accuracy increases by about 3% and 4%, and mean average precision (mAP) improves by about 1% and 4%, respectively.
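
For reference, the similarity measure in the standard Joint Bayesian formulation can be written as follows. This is the general form from the face verification literature, stated here as an assumption; the abstract does not specify how JBR adapts it to each camera pair. A feature is modeled as x = mu + epsilon, with identity component mu ~ N(0, S_mu) and within-identity variation epsilon ~ N(0, S_epsilon). H_I denotes the hypothesis that both features belong to the same person, and H_E the hypothesis that they belong to different people.

```latex
% Log-likelihood ratio, up to an additive constant and a positive scale factor
r(x_1, x_2) = \log \frac{P(x_1, x_2 \mid H_I)}{P(x_1, x_2 \mid H_E)}
            \;\simeq\; x_1^{\top} A\, x_1 + x_2^{\top} A\, x_2 - 2\, x_1^{\top} G\, x_2

% A and G follow from the two covariance matrices
A = (S_{\mu} + S_{\epsilon})^{-1} - (F + G),
\qquad
\begin{pmatrix} F + G & G \\ G & F + G \end{pmatrix}
= \begin{pmatrix} S_{\mu} + S_{\epsilon} & S_{\mu} \\ S_{\mu} & S_{\mu} + S_{\epsilon} \end{pmatrix}^{-1}
```

Under this reading, learning one pair (A, G) per camera pair would correspond to the "set of Joint Bayesian matrices" mentioned in the abstract.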

Improving Transformer with Dynamic Convolution and Shortcut for Video-Text Retrieval

  • Liu, Zhi;Cai, Jincen;Zhang, Mengmeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.7 / pp.2407-2424 / 2022
  • Recently, the Transformer has made great progress in video retrieval tasks owing to its high representation capability. In the structure of a Transformer, the cascaded self-attention modules are capable of capturing long-distance feature dependencies. However, local feature details are likely to deteriorate. In addition, increasing the depth of the structure is likely to produce learning bias in the learned features. In this paper, an improved Transformer structure named TransDCS (Transformer with Dynamic Convolution and Shortcut) is proposed. A Multi-head Conv-Self-Attention module is introduced to model local dependencies and improve the efficiency of local feature extraction. Meanwhile, an augmented shortcuts module based on a dual identity matrix is applied to enhance the conduction of input features and mitigate the learning bias. The proposed model is tested on the MSRVTT, LSMDC and Activity-Net benchmarks, and it surpasses all previous solutions for the video-text retrieval task. For example, on the LSMDC benchmark, a gain of about 2.3% MdR and 6.1% MnR is obtained over recently proposed multimodal-based methods.
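
The abstract reports results as MdR and MnR. Assuming these denote median rank and mean rank, as is common in the retrieval literature, the sketch below shows how they can be computed from a query-by-candidate similarity matrix. The function name, the toy matrix, and the convention that the correct match for query i is candidate i are illustrative assumptions.

```python
from statistics import mean, median


def retrieval_ranks(similarity):
    """Return the 1-based rank of the correct candidate for each query.

    similarity[i][j] is the score of query i against candidate j, and
    the correct match for query i is assumed to be candidate i.
    """
    ranks = []
    for i, row in enumerate(similarity):
        target_score = row[i]
        # Rank is 1 plus the number of candidates scored strictly higher.
        ranks.append(1 + sum(1 for score in row if score > target_score))
    return ranks


# Toy similarity matrix: 3 text queries against 3 videos.
toy_similarity = [
    [0.9, 0.1, 0.3],
    [0.8, 0.5, 0.2],
    [0.4, 0.6, 0.1],
]

ranks = retrieval_ranks(toy_similarity)
median_rank = median(ranks)  # MdR under the assumption above
mean_rank = mean(ranks)      # MnR under the assumption above
print(ranks, median_rank, mean_rank)
```

For both metrics a lower value is better, since it means the correct item appears nearer the top of the ranked list.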

Multi-Scale, Multi-Object and Real-Time Face Detection and Head Pose Estimation Using Deep Neural Networks (다중크기와 다중객체의 실시간 얼굴 검출과 머리 자세 추정을 위한 심층 신경망)

  • Ahn, Byungtae;Choi, Dong-Geol;Kweon, In So
    • The Journal of Korea Robotics Society / v.12 no.3 / pp.313-321 / 2017
  • Face-related applications such as face recognition, facial expression recognition, driver state monitoring, and gaze estimation are among the most frequently performed tasks in human-robot interaction (HRI), intelligent vehicles, and security systems. In these applications, accurate head pose estimation is an important issue. However, conventional methods have lacked accuracy, robustness, or processing speed in practical use. In this paper, we propose a novel method for estimating head pose with a monocular camera. The proposed algorithm is based on a deep neural network for multi-task learning that uses a small grayscale image. This network jointly detects multi-view faces and estimates head pose under difficult conditions such as illumination change and large pose change. The proposed framework quantitatively and qualitatively outperforms the state-of-the-art method, with an average head pose error of less than $4.5^{\circ}$ in real time.

Concurrent Detection for Vehicles and Lanes Using Light-Weight Model of Multi-Task CNN (멀티 테스크 CNN의 경량화 모델을 이용한 차량 및 차선의 동시 검출)

  • Shin, Hyeon-Sik;Kim, Hyung-Won;Hong, Sang-Wook
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.3 / pp.367-373 / 2022
  • As deep learning-based autonomous driving technology develops, artificial intelligence models for various purposes have been studied. Building on these studies, autonomous driving systems have been developed by running several models simultaneously, which can increase hardware resource consumption. To solve this problem, we propose a multi-task model that uses a shared backbone, which avoids adding a separate backbone for each AI model. As a result, the proposed lightweight model reduces the number of model parameters by more than 50% compared with the existing model and improves speed. In addition, each lane can be classified through lane detection using an instance segmentation method. However, further research is needed on the decrease in accuracy compared with the existing model.
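
The parameter saving from a shared backbone can be illustrated with simple arithmetic. The sketch below compares running one full model per task with sharing a single backbone across task heads. All parameter counts and function names are hypothetical and are not meant to reproduce the reduction reported in the abstract.

```python
def separate_models_params(backbone_params, head_params):
    """Total parameters when every task has its own backbone and head."""
    return sum(backbone_params + head for head in head_params)


def shared_backbone_params(backbone_params, head_params):
    """Total parameters when all task heads share one backbone."""
    return backbone_params + sum(head_params)


# Hypothetical sizes: one backbone, a vehicle-detection head, a lane head.
backbone = 20_000_000
heads = [2_000_000, 1_000_000]

separate = separate_models_params(backbone, heads)
shared = shared_backbone_params(backbone, heads)
reduction = 1 - shared / separate
print(separate, shared, round(reduction, 3))
```

The saving grows with the share of parameters held by the backbone, which is why sharing it is attractive when the task heads are comparatively small.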