• Title/Summary/Keyword: Task complexity

Search Result 327, Processing Time 0.025 seconds

Context-Dependent Video Data Augmentation for Human Instance Segmentation (인물 개체 분할을 위한 맥락-의존적 비디오 데이터 보강)

  • HyunJin Chun;JongHun Lee;InCheol Kim
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.5
    • /
    • pp.217-228
    • /
    • 2023
  • Video instance segmentation is an intelligent visual task with high complexity because it not only requires object instance segmentation for each image frame constituting a video, but also requires accurate tracking of instances throughout the frame sequence of the video. In special, human instance segmentation in drama videos has an unique characteristic that requires accurate tracking of several main characters interacting in various places and times. Also, it is also characterized by a kind of the class imbalance problem because there is a significant difference between the frequency of main characters and that of supporting or auxiliary characters in drama videos. In this paper, we introduce a new human instance datatset called MHIS, which is built upon drama videos, Miseang, and then propose a novel video data augmentation method, CDVA, in order to overcome the data imbalance problem between character classes. Different from the previous video data augmentation methods, the proposed CDVA generates more realistic augmented videos by deciding the optimal location within the background clip for a target human instance to be inserted with taking rich spatio-temporal context embedded in videos into account. Therefore, the proposed augmentation method, CDVA, can improve the performance of a deep neural network model for video instance segmentation. Conducting both quantitative and qualitative experiments using the MHIS dataset, we prove the usefulness and effectiveness of the proposed video data augmentation method.

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.1
    • /
    • pp.23-45
    • /
    • 2020
  • Big data is creating in a wide variety of fields such as medical care, manufacturing, logistics, sales site, SNS, and the dataset characteristics are also diverse. In order to secure the competitiveness of companies, it is necessary to improve decision-making capacity using a classification algorithm. However, most of them do not have sufficient knowledge on what kind of classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm is appropriate depending on the characteristics of the dataset was has been a task that required expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features reflecting the characteristics of multi-class. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were identified into two factors, (the data structure and the data complexity,) and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a market concentration measurement index, in the meta-features to replace IR(Imbalanced Ratio). Also, we developed a new index called Reverse ReLU Silhouette Score into the meta-feature set. Among the UCI Machine Learning Repository data, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality(red), Contraceptive Method Choice) were selected. The class of each dataset was classified by using the classification algorithms (KNN, Logistic Regression, Nave Bayes, Random Forest, and SVM) selected in the study. For each dataset, we applied 10-fold cross validation method. 10% to 100% oversampling method is applied for each fold and meta-features of the dataset is measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, Hub Score. F1-score was selected as the dependent variable. As a result, the results of this study showed that the six meta-features including Reverse ReLU Silhouette Score and HHI proposed in this study have a significant effect on the classification performance. (1) The meta-features HHI proposed in this study was significant in the classification performance. (2) The number of variables has a significant effect on the classification performance, unlike the number of classes, but it has a positive effect. (3) The number of classes has a negative effect on the performance of classification. (4) Entropy has a significant effect on the performance of classification. (5) The Reverse ReLU Silhouette Score also significantly affects the classification performance at a significant level of 0.01. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. In addition, the results of the analysis by the classification algorithms were also consistent. In the regression analysis by classification algorithm, Naïve Bayes algorithm does not have a significant effect on the number of variables unlike other classification algorithms. This study has two theoretical contributions: (1) two new meta-features (HHI, Reverse ReLU Silhouette score) was proved to be significant. (2) The effects of data characteristics on the performance of classification were investigated using meta-features. The practical contribution points (1) can be utilized in the development of classification algorithm recommendation system according to the characteristics of datasets. (2) Many data scientists are often testing by adjusting the parameters of the algorithm to find the optimal algorithm for the situation because the characteristics of the data are different. In this process, excessive waste of resources occurs due to hardware, cost, time, and manpower. This study is expected to be useful for machine learning, data mining researchers, practitioners, and machine learning-based system developers. The composition of this study consists of introduction, related research, research model, experiment, conclusion and discussion.

Accelerometer-based Gesture Recognition for Robot Interface (로봇 인터페이스 활용을 위한 가속도 센서 기반 제스처 인식)

  • Jang, Min-Su;Cho, Yong-Suk;Kim, Jae-Hong;Sohn, Joo-Chan
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.1
    • /
    • pp.53-69
    • /
    • 2011
  • Vision and voice-based technologies are commonly utilized for human-robot interaction. But it is widely recognized that the performance of vision and voice-based interaction systems is deteriorated by a large margin in the real-world situations due to environmental and user variances. Human users need to be very cooperative to get reasonable performance, which significantly limits the usability of the vision and voice-based human-robot interaction technologies. As a result, touch screens are still the major medium of human-robot interaction for the real-world applications. To empower the usability of robots for various services, alternative interaction technologies should be developed to complement the problems of vision and voice-based technologies. In this paper, we propose the use of accelerometer-based gesture interface as one of the alternative technologies, because accelerometers are effective in detecting the movements of human body, while their performance is not limited by environmental contexts such as lighting conditions or camera's field-of-view. Moreover, accelerometers are widely available nowadays in many mobile devices. We tackle the problem of classifying acceleration signal patterns of 26 English alphabets, which is one of the essential repertoires for the realization of education services based on robots. Recognizing 26 English handwriting patterns based on accelerometers is a very difficult task to take over because of its large scale of pattern classes and the complexity of each pattern. The most difficult problem that has been undertaken which is similar to our problem was recognizing acceleration signal patterns of 10 handwritten digits. Most previous studies dealt with pattern sets of 8~10 simple and easily distinguishable gestures that are useful for controlling home appliances, computer applications, robots etc. Good features are essential for the success of pattern recognition. To promote the discriminative power upon complex English alphabet patterns, we extracted 'motion trajectories' out of input acceleration signal and used them as the main feature. Investigative experiments showed that classifiers based on trajectory performed 3%~5% better than those with raw features e.g. acceleration signal itself or statistical figures. To minimize the distortion of trajectories, we applied a simple but effective set of smoothing filters and band-pass filters. It is well known that acceleration patterns for the same gesture is very different among different performers. To tackle the problem, online incremental learning is applied for our system to make it adaptive to the users' distinctive motion properties. Our system is based on instance-based learning (IBL) where each training sample is memorized as a reference pattern. Brute-force incremental learning in IBL continuously accumulates reference patterns, which is a problem because it not only slows down the classification but also downgrades the recall performance. Regarding the latter phenomenon, we observed a tendency that as the number of reference patterns grows, some reference patterns contribute more to the false positive classification. Thus, we devised an algorithm for optimizing the reference pattern set based on the positive and negative contribution of each reference pattern. The algorithm is performed periodically to remove reference patterns that have a very low positive contribution or a high negative contribution. Experiments were performed on 6500 gesture patterns collected from 50 adults of 30~50 years old. Each alphabet was performed 5 times per participant using $Nintendo{(R)}$ $Wii^{TM}$ remote. Acceleration signal was sampled in 100hz on 3 axes. Mean recall rate for all the alphabets was 95.48%. Some alphabets recorded very low recall rate and exhibited very high pairwise confusion rate. Major confusion pairs are D(88%) and P(74%), I(81%) and U(75%), N(88%) and W(100%). Though W was recalled perfectly, it contributed much to the false positive classification of N. By comparison with major previous results from VTT (96% for 8 control gestures), CMU (97% for 10 control gestures) and Samsung Electronics(97% for 10 digits and a control gesture), we could find that the performance of our system is superior regarding the number of pattern classes and the complexity of patterns. Using our gesture interaction system, we conducted 2 case studies of robot-based edutainment services. The services were implemented on various robot platforms and mobile devices including $iPhone^{TM}$. The participating children exhibited improved concentration and active reaction on the service with our gesture interface. To prove the effectiveness of our gesture interface, a test was taken by the children after experiencing an English teaching service. The test result showed that those who played with the gesture interface-based robot content marked 10% better score than those with conventional teaching. We conclude that the accelerometer-based gesture interface is a promising technology for flourishing real-world robot-based services and content by complementing the limits of today's conventional interfaces e.g. touch screen, vision and voice.

A Study on the cost allocation method of the operating room in the hospital (수술실의 원가배부기준 설정연구)

  • Kim, Hwi-Jung;Jung, Key-Sun;Choi, Sung-Woo
    • Korea Journal of Hospital Management
    • /
    • v.8 no.1
    • /
    • pp.135-164
    • /
    • 2003
  • The operating room is the major facility that costs the highest investment per unit area in a hospital. It requires commitment of hospital resources such as manpower, equipments and material. The quantity of these resources committed actually differs from one type of operation to another. Because of this, it is not an easy task to allocate the operating cost to individual clinical departments that share the operating room. A practical way to do so may be to collect and add the operating costs incurred by each clinical department and charge the net cost to the account of the corresponding clinical department. It has been customary to allocate the cost of the operating room to the account of each individual department on the basis of the ratio of the number of operations of the department or the total revenue by each operating room. In an attempt to set up more rational cost allocation method than the customary method, this study proposes a new cost allocation method that calls for itemizing the operation cost into its constituent expenses in detail and adding them up for the operating cost incurred by each individual department. For comparison of the new method with the conventional method, the operating room in the main building of hospital A near Seoul is chosen as a study object. It is selected because it is the biggest operating room in hospital A and most of operations in this hospital are conducted in this room. For this study the one-month operation record performed in January 2001 in this operating room is analyzed to allocate the per-month operation cost to six clinical departments that used this operating room; the departments of general surgery, orthopedic surgery, neuro-surgery, dental surgery, urology, and obstetrics & gynecology. In the new method(or method 1), each operation cost is categorized into three major expenses; personnel expense, material expense, and overhead expense and is allocated into the account of the clinical department that used the operating room. The method 1 shows that, among the total one-month operating cost of 814,054 thousand wons in this hospital, 163,714 thousand won is allocated to GS, 335,084 thousand won to as, 202,772 thousand won to NS, 42,265 thousand won to uno, 33,423 thousand won to OB/GY, and 36.796 thousand won to DS. The allocation of the operating cost to six departments by the new method is quite different from that by the conventional method. According to one conventional allocation method based on the ratio of the number of operations of a department to the total number of operations in the operating room(method 2 hereafter), 329,692 thousand won are allocated to GS, 262,125 thousand won to as, 87,104 thousand won to NS, 59,426 thousand won to URO, 51.285 thousand won to OB/GY, and 24,422 thousand won to DS. According to the other conventional allocation method based on the ratio of the revenue of a department(method 3 hereafter), 148,158 thousand won are allocated to GS, 272,708 thousand won to as, 268.638 thousand won to NS, 45,587 thousand won to uno, 51.285 thousand won to OB/GY, and 27.678 thousand won to DS. As can be noted from these results, the cost allocation to six departments by method 1 is strikingly different from those by method 2 and method 3. The operating cost allocated to GS by method 2 is about twice by method 1. Method 3 makes allocations of the operating cost to individual departments very similarly as method 1. However, there are still discrepancies between the two methods. In particular the cost allocations to OB/GY by the two methods have roughly 53.4% discrepancy. The conventional methods 2 and 3 fail to take into account properly the fact that the average time spent for the operation is different and dependent on the clinical department, whether or not to use expensive clinical material dictate the operating cost, and there is difference between the official operating cost and the actual operating cost. This is why the conventional methods turn out to be inappropriate as the operating cost allocation methods. In conclusion, the new method here may be laborious and cause a complexity in bookkeeping because it requires detailed bookkeeping of the operation cost by its constituent expenses and also by individual clinical department, treating each department as an independent accounting unit. But the method is worth adopting because it will allow the concerned hospital to estimate the operating cost as accurately as practicable. The cost data used in this study such as personnel expense, material cost, overhead cost may not be correct ones. Therefore, the operating cost estimated in the main text may not be the same as the actual cost. Also, the study is focused on the case of only hospital A, which is hardly claimed to represent the hospitals across the nation. In spite of these deficiencies, this study is noteworthy from the standpoint that it proposes a practical allocation method of the operating cost to each individual clinical department.

  • PDF

Intelligent Optimal Route Planning Based on Context Awareness (상황인식 기반 지능형 최적 경로계획)

  • Lee, Hyun-Jung;Chang, Yong-Sik
    • Asia pacific journal of information systems
    • /
    • v.19 no.2
    • /
    • pp.117-137
    • /
    • 2009
  • Recently, intelligent traffic information systems have enabled people to forecast traffic conditions before hitting the road. These convenient systems operate on the basis of data reflecting current road and traffic conditions as well as distance-based data between locations. Thanks to the rapid development of ubiquitous computing, tremendous context data have become readily available making vehicle route planning easier than ever. Previous research in relation to optimization of vehicle route planning merely focused on finding the optimal distance between locations. Contexts reflecting the road and traffic conditions were then not seriously treated as a way to resolve the optimal routing problems based on distance-based route planning, because this kind of information does not have much significant impact on traffic routing until a a complex traffic situation arises. Further, it was also not easy to take into full account the traffic contexts for resolving optimal routing problems because predicting the dynamic traffic situations was regarded a daunting task. However, with rapid increase in traffic complexity the importance of developing contexts reflecting data related to moving costs has emerged. Hence, this research proposes a framework designed to resolve an optimal route planning problem by taking full account of additional moving cost such as road traffic cost and weather cost, among others. Recent technological development particularly in the ubiquitous computing environment has facilitated the collection of such data. This framework is based on the contexts of time, traffic, and environment, which addresses the following issues. First, we clarify and classify the diverse contexts that affect a vehicle's velocity and estimates the optimization of moving cost based on dynamic programming that accounts for the context cost according to the variance of contexts. Second, the velocity reduction rate is applied to find the optimal route (shortest path) using the context data on the current traffic condition. The velocity reduction rate infers to the degree of possible velocity including moving vehicles' considerable road and traffic contexts, indicating the statistical or experimental data. Knowledge generated in this papercan be referenced by several organizations which deal with road and traffic data. Third, in experimentation, we evaluate the effectiveness of the proposed context-based optimal route (shortest path) between locations by comparing it to the previously used distance-based shortest path. A vehicles' optimal route might change due to its diverse velocity caused by unexpected but potential dynamic situations depending on the road condition. This study includes such context variables as 'road congestion', 'work', 'accident', and 'weather' which can alter the traffic condition. The contexts can affect moving vehicle's velocity on the road. Since these context variables except for 'weather' are related to road conditions, relevant data were provided by the Korea Expressway Corporation. The 'weather'-related data were attained from the Korea Meteorological Administration. The aware contexts are classified contexts causing reduction of vehicles' velocity which determines the velocity reduction rate. To find the optimal route (shortest path), we introduced the velocity reduction rate in the context for calculating a vehicle's velocity reflecting composite contexts when one event synchronizes with another. We then proposed a context-based optimal route (shortest path) algorithm based on the dynamic programming. The algorithm is composed of three steps. In the first initialization step, departure and destination locations are given, and the path step is initialized as 0. In the second step, moving costs including composite contexts into account between locations on path are estimated using the velocity reduction rate by context as increasing path steps. In the third step, the optimal route (shortest path) is retrieved through back-tracking. In the provided research model, we designed a framework to account for context awareness, moving cost estimation (taking both composite and single contexts into account), and optimal route (shortest path) algorithm (based on dynamic programming). Through illustrative experimentation using the Wilcoxon signed rank test, we proved that context-based route planning is much more effective than distance-based route planning., In addition, we found that the optimal solution (shortest paths) through the distance-based route planning might not be optimized in real situation because road condition is very dynamic and unpredictable while affecting most vehicles' moving costs. For further study, while more information is needed for a more accurate estimation of moving vehicles' costs, this study still stands viable in the applications to reduce moving costs by effective route planning. For instance, it could be applied to deliverers' decision making to enhance their decision satisfaction when they meet unpredictable dynamic situations in moving vehicles on the road. Overall, we conclude that taking into account the contexts as a part of costs is a meaningful and sensible approach to in resolving the optimal route problem.

A Study on Forecasting Accuracy Improvement of Case Based Reasoning Approach Using Fuzzy Relation (퍼지 관계를 활용한 사례기반추론 예측 정확성 향상에 관한 연구)

  • Lee, In-Ho;Shin, Kyung-Shik
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.67-84
    • /
    • 2010
  • In terms of business, forecasting is a work of what is expected to happen in the future to make managerial decisions and plans. Therefore, the accurate forecasting is very important for major managerial decision making and is the basis for making various strategies of business. But it is very difficult to make an unbiased and consistent estimate because of uncertainty and complexity in the future business environment. That is why we should use scientific forecasting model to support business decision making, and make an effort to minimize the model's forecasting error which is difference between observation and estimator. Nevertheless, minimizing the error is not an easy task. Case-based reasoning is a problem solving method that utilizes the past similar case to solve the current problem. To build the successful case-based reasoning models, retrieving the case not only the most similar case but also the most relevant case is very important. To retrieve the similar and relevant case from past cases, the measurement of similarities between cases is an important key factor. Especially, if the cases contain symbolic data, it is more difficult to measure the distances. The purpose of this study is to improve the forecasting accuracy of case-based reasoning approach using fuzzy relation and composition. Especially, two methods are adopted to measure the similarity between cases containing symbolic data. One is to deduct the similarity matrix following binary logic(the judgment of sameness between two symbolic data), the other is to deduct the similarity matrix following fuzzy relation and composition. This study is conducted in the following order; data gathering and preprocessing, model building and analysis, validation analysis, conclusion. First, in the progress of data gathering and preprocessing we collect data set including categorical dependent variables. Also, the data set gathered is cross-section data and independent variables of the data set include several qualitative variables expressed symbolic data. The research data consists of many financial ratios and the corresponding bond ratings of Korean companies. The ratings we employ in this study cover all bonds rated by one of the bond rating agencies in Korea. Our total sample includes 1,816 companies whose commercial papers have been rated in the period 1997~2000. Credit grades are defined as outputs and classified into 5 rating categories(A1, A2, A3, B, C) according to credit levels. Second, in the progress of model building and analysis we deduct the similarity matrix following binary logic and fuzzy composition to measure the similarity between cases containing symbolic data. In this process, the used types of fuzzy composition are max-min, max-product, max-average. And then, the analysis is carried out by case-based reasoning approach with the deducted similarity matrix. Third, in the progress of validation analysis we verify the validation of model through McNemar test based on hit ratio. Finally, we draw a conclusion from the study. As a result, the similarity measuring method using fuzzy relation and composition shows good forecasting performance compared to the similarity measuring method using binary logic for similarity measurement between two symbolic data. But the results of the analysis are not statistically significant in forecasting performance among the types of fuzzy composition. The contributions of this study are as follows. We propose another methodology that fuzzy relation and fuzzy composition could be applied for the similarity measurement between two symbolic data. That is the most important factor to build case-based reasoning model.

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.205-225
    • /
    • 2018
  • Convolutional Neural Network (ConvNet) is one class of the powerful Deep Neural Network that can analyze and learn hierarchies of visual features. Originally, first neural network (Neocognitron) was introduced in the 80s. At that time, the neural network was not broadly used in both industry and academic field by cause of large-scale dataset shortage and low computational power. However, after a few decades later in 2012, Krizhevsky made a breakthrough on ILSVRC-12 visual recognition competition using Convolutional Neural Network. That breakthrough revived people interest in the neural network. The success of Convolutional Neural Network is achieved with two main factors. First of them is the emergence of advanced hardware (GPUs) for sufficient parallel computation. Second is the availability of large-scale datasets such as ImageNet (ILSVRC) dataset for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, it is difficult and requires lots of effort to gather large-scale dataset to train a ConvNet. Moreover, even if we have a large-scale dataset, training ConvNet from scratch is required expensive resource and time-consuming. These two obstacles can be solved by using transfer learning. Transfer learning is a method for transferring the knowledge from a source domain to new domain. There are two major Transfer learning cases. First one is ConvNet as fixed feature extractor, and the second one is Fine-tune the ConvNet on a new dataset. In the first case, using pre-trained ConvNet (such as on ImageNet) to compute feed-forward activations of the image into the ConvNet and extract activation features from specific layers. In the second case, replacing and retraining the ConvNet classifier on the new dataset, then fine-tune the weights of the pre-trained network with the backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying features with high dimensional complexity that is directly extracted from multiple ConvNet layers is still a challenging problem. We observe that features extracted from multiple ConvNet layers address the different characteristics of the image which means better representation could be obtained by finding the optimal combination of multiple ConvNet layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single ConvNet layer representation. Overall, our primary pipeline has three steps. Firstly, images from target task are given as input to ConvNet, then that image will be feed-forwarded into pre-trained AlexNet, and the activation features from three fully connected convolutional layers are extracted. Secondly, activation features of three ConvNet layers are concatenated to obtain multiple ConvNet layers representation because it will gain more information about an image. When three fully connected layer features concatenated, the occurring image representation would have 9192 (4096+4096+1000) dimension features. However, features extracted from multiple ConvNet layers are redundant and noisy since they are extracted from the same ConvNet. Thus, a third step, we will use Principal Component Analysis (PCA) to select salient features before the training phase. When salient features are obtained, the classifier can classify image more accurately, and the performance of transfer learning can be improved. To evaluate proposed method, experiments are conducted in three standard datasets (Caltech-256, VOC07, and SUN397) to compare multiple ConvNet layer representations against single ConvNet layer representation by using PCA for feature selection and dimension reduction. Our experiments demonstrated the importance of feature selection for multiple ConvNet layer representation. Moreover, our proposed approach achieved 75.6% accuracy compared to 73.9% accuracy achieved by FC7 layer on the Caltech-256 dataset, 73.1% accuracy compared to 69.2% accuracy achieved by FC8 layer on the VOC07 dataset, 52.2% accuracy compared to 48.7% accuracy achieved by FC7 layer on the SUN397 dataset. We also showed that our proposed approach achieved superior performance, 2.8%, 2.1% and 3.1% accuracy improvement on Caltech-256, VOC07, and SUN397 dataset respectively compare to existing work.