• Title/Abstract/Keyword: task features


Experiment on Intermediate Feature Coding for Object Detection and Segmentation

  • Jeong, Min Hyuk;Jin, Hoe-Yong;Kim, Sang-Kyun;Lee, Heekyung;Choo, Hyon-Gon;Lim, Hanshin;Seo, Jeongil
    • Journal of Broadcasting Engineering / Vol. 25, No. 7 / pp.1081-1094 / 2020
  • With the recent development of deep learning, most computer vision tasks are solved with deep learning-based network technologies such as CNNs and RNNs. Tasks such as object detection and object segmentation use intermediate features extracted from a shared backbone, such as ResNet or FPN, for both training and inference. In this paper, an experiment was conducted to determine the compression efficiency, and the effect of encoding on task inference performance, when the features extracted at an intermediate stage of a CNN are encoded. A feature map that combines the features of 256 channels into one image, as well as the original image, was encoded with HEVC to compare and analyze inference performance for object detection and segmentation. Since the combined feature map encodes five levels of feature maps (P2 to P6), its size and resolution are larger than those of the original image. However, when the degree of compression is weak, using the feature maps yields inference results similar to or better than those obtained from the original image.
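The channel-packing step described in the abstract can be illustrated with a short sketch: per-channel feature maps are tiled into one large 2D image that a video codec such as HEVC can then encode. The near-square grid layout, channel count, and map sizes below are illustrative assumptions, not the paper's exact configuration (which packs the P2 to P6 FPN levels).

```python
# Pack a list of equally sized 2D feature maps into a single 2D grid image.
# Layout and sizes are illustrative, not the paper's actual packing scheme.
import math

def tile_channels(channels):
    """Tile equally sized 2D feature maps into one 2D grid image."""
    n = len(channels)
    h, w = len(channels[0]), len(channels[0][0])
    cols = math.ceil(math.sqrt(n))       # near-square grid layout
    rows = math.ceil(n / cols)
    image = [[0.0] * (cols * w) for _ in range(rows * h)]
    for idx, ch in enumerate(channels):
        r, c = divmod(idx, cols)         # grid cell for this channel
        for y in range(h):
            for x in range(w):
                image[r * h + y][c * w + x] = ch[y][x]
    return image

# Four 2x2 channels pack into a 4x4 image (a 2x2 grid of tiles).
chans = [[[float(k)] * 2, [float(k)] * 2] for k in range(4)]
img = tile_channels(chans)
```

The packed image can then be fed to any still-image or video encoder; the inverse mapping (grid cell back to channel index) recovers the individual feature maps after decoding.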

Blind Quality Metric via Measurement of Contrast, Texture, and Colour in Night-Time Scenario

  • Xiao, Shuyan;Tao, Weige;Wang, Yu;Jiang, Ye;Qian, Minqian
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 15, No. 11 / pp.4043-4064 / 2021
  • Night-time image quality evaluation is an urgent requirement in visual inspection. Night-time lighting results in low brightness, low contrast, loss of detailed information, and colour dissonance, which makes delicately evaluating image quality at night a daunting task. This article presents a new blind quality assessment metric for realistic night-time scenarios through a comprehensive consideration of contrast, texture, and colour. Specifically, a color-gray-difference (CGD) histogram of image blocks, which represents contrast features, is computed first. Next, texture features, measured by the mean subtracted contrast normalized (MSCN)-weighted local binary pattern (LBP) histogram, are calculated. Then statistical features in the Lαβ colour space are extracted. Finally, a quality prediction model is built with support vector regression (SVR) on the extracted contrast, texture, and colour features. Experiments on the NNID, CCRIQ, LIVE-CH, and CID2013 databases indicate that the proposed metric is superior to the compared BIQA metrics.
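The MSCN coefficients that weight the LBP histogram can be sketched as below: each pixel is normalized by its local mean and local standard deviation. For brevity this toy version uses a plain 3x3 box window rather than the Gaussian-weighted window typical in the literature, so treat it as an approximation of the idea, not the metric's exact computation.

```python
# Toy MSCN: (pixel - local mean) / (local std + eps) over a 3x3 box window.
# The published formulation normally uses a Gaussian-weighted window.
def mscn(image, eps=1.0):
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the 3x3 window at the image borders.
            win = [image[j][i]
                   for j in range(max(0, y - 1), min(h, y + 2))
                   for i in range(max(0, x - 1), min(w, x + 2))]
            mu = sum(win) / len(win)
            var = sum((v - mu) ** 2 for v in win) / len(win)
            out[y][x] = (image[y][x] - mu) / (var ** 0.5 + eps)
    return out

flat = [[5.0] * 4 for _ in range(4)]
coeffs = mscn(flat)   # a perfectly flat patch normalizes to all zeros
```

In the metric, the magnitudes of these coefficients serve as per-pixel weights when accumulating the LBP histogram, emphasizing structurally salient regions.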

Automated detection of panic disorder based on multimodal physiological signals using machine learning

  • Eun Hye Jang;Kwan Woo Choi;Ah Young Kim;Han Young Yu;Hong Jin Jeon;Sangwon Byun
    • ETRI Journal / Vol. 45, No. 1 / pp.105-118 / 2023
  • We tested the feasibility of automated discrimination of patients with panic disorder (PD) from healthy controls (HCs) based on multimodal physiological responses using machine learning. Electrocardiogram (ECG), electrodermal activity (EDA), respiration (RESP), and peripheral temperature (PT) of the participants were measured during three experimental phases: rest, stress, and recovery. Eleven physiological features were extracted from each phase and used as input data. Logistic regression (LoR), k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP) algorithms were implemented with nested cross-validation. Linear regression analysis showed that ECG and PT features obtained in the stress and recovery phases were significant predictors of PD. We achieved the highest accuracy (75.61%) with MLP using all 33 features. With the exception of MLP, applying the significant predictors led to a higher accuracy than using 24 ECG features. These results suggest that combining multimodal physiological signals measured during various states of autonomic arousal has the potential to differentiate patients with PD from HCs.
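The nested cross-validation scheme the study uses to tune and evaluate its classifiers can be sketched as follows: an inner loop selects hyperparameters on the training folds, and an outer loop estimates generalization on held-out folds. The one-dimensional threshold "classifier", the candidate thresholds, and the toy dataset are stand-ins, not the paper's models, physiological features, or data.

```python
# Nested CV sketch: inner folds pick a hyperparameter (here, a decision
# threshold), outer folds score the chosen setting on unseen data.
def k_folds(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds over n samples."""
    size = n // k
    for f in range(k):
        test = list(range(f * size, (f + 1) * size))
        train = [i for i in range(n) if i not in test]
        yield train, test

def accuracy(xs, ys, thr):
    return sum((x > thr) == y for x, y in zip(xs, ys)) / len(xs)

def nested_cv(xs, ys, thresholds, outer_k=4, inner_k=3):
    outer_scores = []
    for tr, te in k_folds(len(xs), outer_k):
        def inner_score(thr):
            # Score a candidate threshold across the inner test folds.
            total = 0.0
            for itr, ite in k_folds(len(tr), inner_k):
                total += accuracy([xs[tr[i]] for i in ite],
                                  [ys[tr[i]] for i in ite], thr)
            return total
        best = max(thresholds, key=inner_score)
        # Evaluate the inner-loop winner on the held-out outer fold.
        outer_scores.append(accuracy([xs[i] for i in te],
                                     [ys[i] for i in te], best))
    return sum(outer_scores) / len(outer_scores)

xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9] * 3   # 24 separable samples
ys = [x > 0.5 for x in xs]
score = nested_cv(xs, ys, thresholds=[0.0, 0.5, 1.0])
```

Keeping hyperparameter selection inside the outer training folds is what makes the outer score an unbiased estimate, which matters with small clinical samples like the one in this study.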

3D Cross-Modal Retrieval Using Noisy Center Loss and SimSiam for Small Batch Training

  • Yeon-Seung Choo;Boeun Kim;Hyun-Sik Kim;Yong-Suk Park
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 3 / pp.670-684 / 2024
  • 3D Cross-Modal Retrieval (3DCMR) is a task that retrieves 3D objects regardless of modality, such as images, meshes, and point clouds. One of the most prominent methods for 3DCMR is the Cross-Modal Center Loss Function (CLF), which applies the conventional center loss strategy to 3D cross-modal search and retrieval. Since CLF is based on center loss, its center features are susceptible to subtle changes in hyperparameters and external interference; for instance, performance degradation is observed when the batch size is too small. Furthermore, the Mean Squared Error (MSE) used in CLF cannot adapt to changes in batch size and, because it relies on simple Euclidean distance between multi-modal features, is vulnerable to the data variations that occur during actual inference. To address the problems that arise from small batch training, we propose a Noisy Center Loss (NCL) method to estimate the optimal center features. In addition, we apply the simple Siamese representation learning method (SimSiam) during optimal center feature estimation to compare projected features, making the proposed method robust to changes in batch size and variations in data. As a result, the proposed approach demonstrates improved performance on the ModelNet40 dataset compared to conventional methods.
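The center-loss idea that CLF and NCL build on can be sketched as follows: each class keeps a running center, features are pulled toward their class center, and the loss is the mean squared distance between features and centers. The update rule, learning rate, and the optional Gaussian perturbation (standing in for NCL's noisy center estimation) are illustrative assumptions, not the paper's exact formulation.

```python
# Center-loss sketch: per-class centers, MSE pull, optional Gaussian noise
# on the center update (a stand-in for NCL's perturbed center estimates).
import random

def center_loss(features, labels, centers):
    """Mean squared Euclidean distance between features and class centers."""
    total = 0.0
    for f, y in zip(features, labels):
        total += sum((fi - ci) ** 2 for fi, ci in zip(f, centers[y]))
    return total / len(features)

def update_centers(features, labels, centers, lr=0.5, noise=0.0):
    # Move each class center toward its samples; noise > 0 perturbs it.
    for f, y in zip(features, labels):
        centers[y] = [ci + lr * (fi - ci) + random.gauss(0.0, noise)
                      for fi, ci in zip(f, centers[y])]

feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = [0, 0, 1]
centers = {0: [0.0, 0.0], 1: [0.0, 0.0]}
before = center_loss(feats, labels, centers)
for _ in range(20):
    update_centers(feats, labels, centers, noise=0.0)  # noise-free for clarity
after = center_loss(feats, labels, centers)            # loss shrinks
```

In the multi-modal setting, features from every modality of the same object share one center, which is what aligns image, mesh, and point-cloud embeddings in a common space.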

A Study on Feature Analysis of Archival Metadata Standards in the Records Lifecycle

  • 백재은
    • Journal of the Korean Society for Library and Information Science / Vol. 48, No. 3 / pp.71-111 / 2014
  • Metadata schemas are well recognized as one of the important technological components for archiving and preserving digital resources. However, no single standard is enough to cover the whole lifecycle of archiving and preservation. This means that we need to appropriately select metadata standards and combine them to develop metadata schemas that cover the whole lifecycle of resources (or records). Creating a unified framework for understanding the features of metadata standards is necessary in order to improve metadata interoperability across the whole resource lifecycle. In this study, the author approached this issue from a task-centric view of metadata, proposing a Task model as a framework and analyzing the features of archival metadata standards. The proposed model provides a new scheme for creating metadata element mappings and making metadata interoperable. From this study, the author found that no single metadata standard can cover the whole lifecycle, and that an in-depth analysis of mappings between metadata standards in accordance with the lifecycle stages is required. The author also found that most metadata standards are primarily resource-centric, and that the different tasks in the resource lifecycle are not reflected in the design of metadata standard data models.

Sketch-based 3D modeling by aligning outlines of an image

  • Li, Chunxiao;Lee, Hyowon;Zhang, Dongliang;Jiang, Hao
    • Journal of Computational Design and Engineering / Vol. 3, No. 3 / pp.286-294 / 2016
  • In this paper we present an efficient technique for sketch-based 3D modeling using automatically extracted image features. Creating a 3D model often starts from a drawing of irregular shapes composed of curved lines, but it is difficult to hand-draw such lines without introducing awkward bumps and edges. We propose automatically aligning a user's hand-drawn sketch lines to the contour lines of an image: the user can sketch freely while the system snaps the sketch lines to a background image contour, removing the strenuous effort of trying to draw a perfect line during the modeling task. This interactive technique combines the efficiency and perception of the human user with the accuracy of computational power, applied to 3D modeling, a domain where the precision demanded by on-screen drawing has hitherto made the task one for highly skilled and careful users. We provide several examples demonstrating the accuracy and efficiency of the method, with which complex shapes were achieved easily and quickly in the interactive outline drawing task.
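The core "snap" operation can be sketched as a nearest-point projection of each stroke point onto the extracted image contour. Contour extraction itself is out of scope here; the straight horizontal contour and the wobbly stroke below are hypothetical stand-ins for an image edge and a hand-drawn line.

```python
# Snap sketch: move every hand-drawn stroke point to its nearest point on
# the image contour, smoothing out hand-drawing wobble.
def snap_to_contour(stroke, contour):
    """Replace each stroke point with its nearest contour point."""
    def nearest(p):
        # Squared distance suffices for choosing the minimum.
        return min(contour, key=lambda q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
    return [nearest(p) for p in stroke]

contour = [(x, 0) for x in range(10)]       # a straight horizontal edge
wobbly = [(1, 0.4), (4, -0.7), (8, 0.2)]    # bumpy hand-drawn stroke
snapped = snap_to_contour(wobbly, contour)  # → [(1, 0), (4, 0), (8, 0)]
```

A practical system would restrict the search to contour points near the cursor and blend the snap with the user's stroke, so the interaction still feels like free-hand drawing.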

The Comparative Study on Third Party Mobile Payment Between UTAUT2 and TTF

  • Wu, Run-Ze;Lee, Jong-Ho
    • Journal of Distribution Science / Vol. 15, No. 11 / pp.5-19 / 2017
  • Purpose - Based on its findings, this study proposes corresponding market promotion schemes for Alipay, WeChat wallet, and other payment service providers and mobile internet companies, to help them understand the factors that promote or hinder users' acceptance of mobile payment. Research design, data, and methodology - All collected data were analyzed with the social science statistical packages IBM SPSS Statistics 23.0 and IBM SPSS AMOS 23.0. Results - The technical features of third party mobile payment and the task characteristics of users positively influence the matching degree between task and technology, and this matching degree in turn positively influences performance expectancy, effort expectancy, and usage intention. Social influence, facilitating conditions, price value, and enjoyment motivation have a significant positive influence on users' intention to adopt mobile payment. Users' perceived security of mobile fingerprint payment positively influences their usage intention. Conclusions - The main contribution of this research is its analysis of the key factors influencing third party mobile payment usage through an integrated model of UTAUT2 and TTF.

The Effect of Aquatic Task Training on Gait and Balance Ability in Stroke Patients

  • Lee, Ji-Yeun;Park, Jung-Seo;Kim, Kyoung
    • The Journal of Korean Physical Therapy / Vol. 23, No. 3 / pp.29-35 / 2011
  • Purpose: The purpose of this study was to measure stroke patients' balance ability and degree of clinical function, and to examine the effect of an aquatic exercise method using tasks related to these features. Methods: Twenty stroke patients were randomly assigned to an aquatic task exercise group or a land task exercise group. Both groups used the same exercise method for 60 minutes per session, three times a week for 12 weeks, at the same time of day and with the same amount of exercise. Results: Before and after the exercise, static balance was measured using balance measuring instruments; locomotive faculties, muscular strength, and dynamic balance were assessed through the Berg balance and 10 m gait tests. Finally, gait abilities were measured, and the data obtained were analyzed. Conclusion: Both groups showed significant improvement, but the aquatic exercise group showed slightly better results in the static balance, Berg balance, and upright walking tests. These findings suggest that stroke patients' balance and gait ability can be improved through the application of aquatic exercise programs.

Multi-Task FaceBoxes: A Lightweight Face Detector Based on Channel Attention and Context Information

  • Qi, Shuaihui;Yang, Jungang;Song, Xiaofeng;Jiang, Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 10 / pp.4080-4097 / 2020
  • In recent years, the convolutional neural network (CNN) has become the primary method for face detection. But its shortcomings are obvious: expensive computation, heavy models, and so on. This makes CNNs difficult to use on mobile devices, which have limited computing and storage capabilities. The design of lightweight CNNs for face detection is therefore becoming more and more important with the popularity of smartphones and the mobile Internet. Based on the CPU real-time face detector FaceBoxes, we propose a multi-task lightweight face detector with low computing cost and higher detection precision. First, to improve detection capability, squeeze-and-excitation modules are used to extract attention between channels. Then, texture and semantic information are extracted by shallow and deep networks, respectively, to obtain rich features. Finally, a landmark detection module is used to improve detection performance for small faces and to provide landmark data for face alignment. Experiments on the AFW, FDDB, PASCAL, and WIDER FACE datasets show that our algorithm achieves significant improvement in mean average precision. In particular, on the WIDER FACE hard validation set, our algorithm outperforms the mean average precision of FaceBoxes by 7.2%. For VGA-resolution images, our algorithm runs at up to 23 FPS on a CPU device.
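The squeeze-and-excitation channel attention the detector uses can be sketched as follows: global-average-pool each channel ("squeeze"), pass the pooled vector through a small gate ("excitation"), then rescale each channel by its gate. The sigmoid-only gate below stands in for SE's usual two-layer bottleneck MLP, so this is a simplified illustration, not the paper's exact module.

```python
# Simplified SE channel attention: squeeze (global average pool per channel),
# excite (per-channel sigmoid gate), rescale (multiply channel by gate).
import math

def se_attention(channels):
    # Squeeze: one scalar per channel via global average pooling.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in channels]
    # Excitation: per-channel gate in (0, 1); a real SE block uses a
    # learned two-layer bottleneck MLP before this sigmoid.
    gates = [1.0 / (1.0 + math.exp(-s)) for s in squeezed]
    # Rescale: multiply each channel by its gate.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(channels, gates)]

chans = [[[2.0, 2.0], [2.0, 2.0]],      # strong channel -> gate near 1
         [[-2.0, -2.0], [-2.0, -2.0]]]  # weak channel -> gate near 0
out = se_attention(chans)
```

The appeal for lightweight detectors is that the gate adds only a handful of parameters per channel while letting the network emphasize informative feature channels.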

Parallel Multi-task Cascade Convolution Neural Network Optimization Algorithm for Real-time Dynamic Face Recognition

  • Jiang, Bin;Ren, Qiang;Dai, Fei;Zhou, Tian;Gui, Guan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 14, No. 10 / pp.4117-4135 / 2020
  • Due to the diversity of viewing angle, illumination, and scene, real-time dynamic face detection and recognition is highly challenging in unrestricted environments. In this study, we exploited the intrinsic correlation between detection and calibration, using a multi-task cascaded convolutional neural network (MTCNN) to improve the efficiency of face recognition. The output of each core network is mapped in parallel to a compact Euclidean space, where distance represents the similarity of facial features, so that the target face can be identified as quickly as possible, without waiting for all network iterations to complete. Even after the angle of the target face or the illumination changes, well-correlated recognition results can still be obtained. In a practical application scenario, we use a multi-camera real-time monitoring system to perform face matching and recognition on successive frames acquired from different angles. The effectiveness of the method was verified by several real-time monitoring experiments, with good results.
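The matching step described above, where Euclidean distance between embeddings encodes facial similarity, amounts to a nearest-neighbor lookup against a gallery of known identities, with a distance threshold to reject unknown faces. The embedding vectors, names, and threshold below are toy values, not outputs of the paper's network.

```python
# Nearest-neighbor identification in an embedding space: the closest gallery
# embedding wins if it is within a rejection threshold.
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def identify(query, gallery, threshold=0.8):
    """Return the closest gallery identity, or None if nothing is close."""
    name, dist = min(((n, euclidean(query, e)) for n, e in gallery.items()),
                     key=lambda t: t[1])
    return name if dist < threshold else None

gallery = {"alice": [0.9, 0.1, 0.0], "bob": [0.0, 0.2, 0.9]}
result = identify([0.85, 0.15, 0.05], gallery)   # → "alice"
```

In the multi-camera setting, running this lookup per frame lets an identity be confirmed as soon as one view produces a close match, without waiting for all cascades to finish.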