• Title/Abstract/Keywords: Supervised learning

747 search results (processing time: 0.028 s)

Stochastic Non-linear Hashing for Near-Duplicate Video Retrieval using Deep Feature applicable to Large-scale Datasets

  • Byun, Sung-Woo;Lee, Seok-Pil
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 8 / pp.4300-4314 / 2019
  • With the development of video-related applications, the amount of media content has increased dramatically. There is a substantial number of near-duplicate videos (NDVs) among Internet videos, so near-duplicate video retrieval (NDVR) is important for eliminating near-duplicates from web video searches. This paper proposes a novel NDVR system that supports large-scale retrieval and delivers efficient and accurate retrieval performance. To this end, we extract keyframes from each video at regular intervals and then extract both commonly used features (LBP and HSV) and new image features from each keyframe. A recent study introduced a new image feature that can provide more robust information than existing features even when images undergo geometric changes and complex editing. We convert the vector set consisting of the extracted features to binary code through a set of hash functions, so that similarity comparison is more efficient because similar videos are more likely to map into the same buckets. Lastly, we calculate similarity to search for NDVs; we examine the effectiveness of the NDVR system and compare it against previous NDVR systems using the public video collection CC_WEB_VIDEO. The proposed NDVR system's performance is very promising compared to previous NDVR systems.
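The hashing step in this abstract (feature vectors mapped to binary codes so that similar items tend to share a bucket) can be illustrated with a minimal sketch. It uses random-hyperplane hashing as a stand-in, not the paper's stochastic non-linear hashing, and the function names and toy feature vectors are hypothetical.

```python
import random

def make_hyperplanes(n_bits, dim, seed=0):
    # One random Gaussian hyperplane per output bit (assumed scheme, not the paper's).
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_vector(vec, hyperplanes):
    # Each bit records which side of a hyperplane the vector falls on.
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )

def hamming(a, b):
    # Number of differing bits between two binary codes.
    return sum(x != y for x, y in zip(a, b))

def build_buckets(features, hyperplanes):
    # Group item ids by binary code; items in one bucket are retrieval candidates.
    buckets = {}
    for item_id, vec in features.items():
        buckets.setdefault(hash_vector(vec, hyperplanes), []).append(item_id)
    return buckets

planes = make_hyperplanes(n_bits=8, dim=4)
features = {
    "video_a": [0.2, 0.5, 0.1, 0.9],
    "video_a_brighter": [0.4, 1.0, 0.2, 1.8],  # same direction, scaled by 2
    "video_b": [0.9, 0.1, 0.7, 0.0],
}
buckets = build_buckets(features, planes)
```

Because only the sign of each projection is kept, a vector and a positively scaled copy of it always receive the same code, which is why the two `video_a` entries share a bucket.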

Improving methods for normalizing biomedical text entities with concepts from an ontology with (almost) no training data at BLAH5 the CONTES

  • Ferre, Arnaud;Ba, Mouhamadou;Bossy, Robert
    • Genomics & Informatics / Vol. 17, No. 2 / pp.20.1-20.5 / 2019
  • Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance. However, these require large training datasets, which are not always available, especially for tasks in specialized domains. CONTES (CONcept-TErm System) is a supervised method that addresses entity normalization with ontology concepts using small training datasets. CONTES has some limitations: it does not scale well with very large ontologies, it tends to overgeneralize predictions, and it lacks valid representations for out-of-vocabulary words. Here, we propose to assess different methods to reduce the dimensionality of the ontology representation. We also propose to calibrate parameters in order to make the predictions more accurate, and to address the problem of out-of-vocabulary words with a specific method.

Fault Diagnosis Method based on Feature Residual Values for Industrial Rotor Machines

  • Kim, Donghwan;Kim, Younhwan;Jung, Joon-Ha;Sohn, Seokman
    • KEPCO Journal on Electric Power and Energy / Vol. 4, No. 2 / pp.89-99 / 2018
  • Downtime and malfunction of industrial rotor machines represent a crucial cost burden and productivity loss. Fault diagnosis of this equipment has recently been carried out with fault classification methods to detect faults and their causes. However, these methods are of limited use in detecting rotor faults because they are hypersensitive to unexpected conditions and to differences between individual pieces of equipment. These limitations tend to affect the accuracy of fault classification, since fault-related features calculated from vibration signals are moved to other regions or changed. To improve the limited diagnosis accuracy of existing methods, we propose a new approach for fault diagnosis of rotor machines based on a model generated by supervised learning. Our work uses feature residual values from vibration signals as fault indices. Our diagnostic model is robust and flexible: once learned from historical data a single time, it can be applied to different target systems without optimization of algorithms. The performance of the proposed method was evaluated by comparing its results with conventional methods for fault diagnosis of rotor machines. The experimental results show that the proposed method can be used to achieve better fault diagnosis, even when applied to systems with different normal-state signals, scales, and structures, without tuning or the use of a complementary algorithm. The effectiveness of the method was assessed by simulation using various rotor machine models.

Implementation of ML Algorithm for Mung Bean Classification using Smart Phone

  • Almutairi, Mubarak;Mutiullah, Mutiullah;Munir, Kashif;Hashmi, Shadab Alam
    • International Journal of Computer Science & Network Security / Vol. 21, No. 11 / pp.89-96 / 2021
  • This work extends my earlier work, which presented a robust and economically efficient method for the discrimination of four Mung-Bean varieties [1] based on quantitative parameters. Due to the advancement of technology, users try to solve their daily life problems using smartphones, but smartphones are still limited in computing power and memory. Hence, there is a need to find the best classifier for classifying Mung-Beans using the features suggested in the previous work with minimum memory requirements and computational power. To achieve this study's goal, we run experiments on various supervised classifiers with simple architectures and calculations that give robust performance on the 10 most relevant suggested features selected by Fisher Co-efficient, Probability of Error, Mutual Information, and wavelet features. After the analysis, we replace the Artificial Neural Network and Deep learning with a classifier that gives approximately the same classification results as those classifiers but is efficient in terms of resources and time complexity. This classifier is easily implemented in the smartphone environment.

Analyzing behavior of circular concrete-filled steel tube column using improved fuzzy models

  • Zheng, Yuxin;Jin, Hongwei;Jiang, Congying;Moradi, Zohre;Khadimallah, Mohamed Amine;Safa, Maryam
    • Steel and Composite Structures / Vol. 43, No. 5 / pp.625-637 / 2022
  • Axial compression capacity (Pu) is a significant yet complex parameter of concrete-filled steel tube (CFST) columns. This study offers a novel ensemble tool, an adaptive neuro-fuzzy inference system (ANFIS) supervised by equilibrium optimization (EO), for accurately predicting this parameter. Moreover, grey wolf optimization (GWO) and Harris hawk optimizer (HHO) are considered as comparative supervisors. The data are taken from earlier literature and were produced by finite element analysis. ANFIS is trained with several population sizes of the EO, GWO, and HHO to detect the best configurations. At a glance, the results showed the competency of such ensembles for learning and reproducing the Pu behavior. In detail, mean absolute errors and correlation values of 4.1809% and 0.99564, 10.5947% and 0.98006, and 4.8947% and 0.99462, obtained for the EO-ANFIS, GWO-ANFIS, and HHO-ANFIS, respectively, indicated that the proposed EO-ANFIS can analyze and predict the behavior of CFST columns with the highest accuracy. Considering both time and accuracy, the EO provides the most efficient optimization of ANFIS and can be a good substitute for experimental approaches.

Tri-training algorithm based on cross entropy and K-nearest neighbors for network intrusion detection

  • Zhao, Jia;Li, Song;Wu, Runxiu;Zhang, Yiying;Zhang, Bo;Han, Longzhe
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 12 / pp.3889-3903 / 2022
  • To address the problem of low detection accuracy due to training noise caused by mislabeling when Tri-training is applied to network intrusion detection (NID), we propose a Tri-training algorithm based on cross entropy and K-nearest neighbors (TCK). The proposed algorithm uses cross-entropy in place of the classification error rate to better identify the difference between the practical and predicted distributions of the model and to reduce the prediction bias of mislabeled data to unlabeled data; K-nearest neighbors are used to remove mislabeled data and reduce their number. To verify the effectiveness of the proposed algorithm, experiments were conducted on 12 UCI datasets and NSL-KDD network intrusion datasets, and four indexes (accuracy, recall, F-measure, and precision) were used for comparison. The experimental results revealed that TCK outperforms the conventional Tri-training algorithms and the Tri-training algorithms that use only the cross-entropy or the K-nearest neighbor strategy.
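The two ingredients named in this abstract can be sketched in a few lines, assuming a simple form for each: cross-entropy as the mean negative log-probability of the true class, and a K-nearest-neighbor filter that drops a pseudo-labeled sample when its label disagrees with the majority of its nearest labeled neighbors. The abstract does not give the paper's exact rules, so the functions and toy data below are hypothetical.

```python
import math
from collections import Counter

def cross_entropy(labels, probs, eps=1e-12):
    # Mean negative log-probability assigned to the true class.
    return -sum(math.log(max(p[y], eps)) for y, p in zip(labels, probs)) / len(labels)

def error_rate(labels, probs):
    # Fraction of samples whose most probable class differs from the true label.
    wrong = sum(
        1 for y, p in zip(labels, probs)
        if max(range(len(p)), key=p.__getitem__) != y
    )
    return wrong / len(labels)

def knn_majority(point, labeled, k):
    # Majority label among the k labeled samples closest to `point`.
    nearest = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def filter_pseudo_labels(pseudo, labeled, k=3):
    # Keep a pseudo-labeled sample only if its neighbors agree with its label.
    return [(x, y) for x, y in pseudo if knn_majority(x, labeled, k) == y]

labels = [0, 1]
confident = [[0.9, 0.1], [0.2, 0.8]]
hesitant = [[0.6, 0.4], [0.45, 0.55]]

labeled = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0),
           ((5, 5), 1), ((5, 6), 1), ((6, 5), 1)]
pseudo = [((0.5, 0.5), 0), ((5.5, 5.5), 0)]  # second one is likely mislabeled
kept = filter_pseudo_labels(pseudo, labeled, k=3)
```

Both toy classifiers have the same error rate, yet their cross-entropy differs. That finer distinction is the reason the abstract gives for replacing the error rate.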

A Classification Model for Predicting the Injured Body Part in Construction Accidents in Korea

  • Lim, Jiseon;Cho, Sungjin;Kang, Sanghyeok
    • International Conference Proceedings / The 9th International Conference on Construction Engineering and Project Management / pp.230-237 / 2022
  • It is difficult to predict industrial accidents in the construction industry because many accident factors, such as human-related factors and environment-related factors, affect the accidents. Many studies have analyzed the severity of injuries and types of accidents; however, there have been few studies on the prediction of injured body parts. This study aims to develop a classification model to predict the injured body part based on accident-related factors. Construction accident cases from June 2018 to July 2021 provided by the Korea Construction Safety Management Integrated Information were collected through web crawling and then preprocessed. A naïve Bayes classifier, one of the supervised learning algorithms, was employed to construct a classification model of the injured body part, which has four categories: 1) torso, 2) upper extremity, 3) head, and 4) lower extremity. The predictor variables are accident type, type of work, facility type, injury source, and activity type. As a result, the average accuracy for each injured body part was 50.4%. The accuracy for the upper extremity and lower extremity was relatively higher than for the torso and head. Unlike in other classification tasks, such as spam mail filtering, a naïve Bayes classifier does not provide good classification performance for construction accidents. The reasons are discussed in the study. Based on the results of this study, more detailed guidelines for construction safety management can be provided, which help establish safety measures at the construction site.
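A naïve Bayes classifier over categorical predictors, as described in this abstract, can be sketched as follows. This is a minimal illustration with Laplace smoothing (an assumption; the abstract does not state the smoothing used), and the six accident records are invented for the example, not taken from the study's data.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels, alpha=1.0):
    # Count class frequencies and per-class value frequencies for each feature.
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    vocab = [set() for _ in rows[0]]
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return {"class_counts": class_counts, "value_counts": value_counts,
            "vocab": vocab, "alpha": alpha, "n": len(labels)}

def predict(model, row):
    # Pick the class with the highest log prior plus summed log likelihoods.
    best_label, best_score = None, -math.inf
    for label, count in model["class_counts"].items():
        score = math.log(count / model["n"])
        for i, value in enumerate(row):
            num = model["value_counts"][(i, label)][value] + model["alpha"]
            den = count + model["alpha"] * len(model["vocab"][i])
            score += math.log(num / den)  # Laplace-smoothed P(value | class)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical records: (accident type, type of work) -> injured body part.
rows = [("fall", "scaffolding"), ("fall", "scaffolding"), ("struck", "formwork"),
        ("struck", "formwork"), ("caught", "rebar"), ("fall", "formwork")]
labels = ["lower extremity", "lower extremity", "head",
          "head", "upper extremity", "lower extremity"]
model = train_naive_bayes(rows, labels)
prediction = predict(model, ("fall", "scaffolding"))
```

The classifier treats the predictors as independent given the class.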

Prediction Model for Gastric Cancer via Class Balancing Techniques

  • Danish, Jamil;Sellappan, Palaniappan;Sanjoy Kumar, Debnath;Muhammad, Naseem;Susama, Bagchi;Asiah, Lokman
    • International Journal of Computer Science & Network Security / Vol. 23, No. 1 / pp.53-63 / 2023
  • Many researchers are trying hard to minimize the incidence of cancers, mainly Gastric Cancer (GC). For GC, the five-year survival rate is generally 5-25%, but for Early Gastric Cancer (EGC), it is almost 90%. Predicting the onset of stomach cancer based on risk factors will allow for an early diagnosis and more effective treatment. Although there are several models for predicting stomach cancer, most of these models are based on unbalanced datasets, which favour the majority class. However, it is imperative to correctly identify cancer patients, who are in the minority class. This research aims to apply three class-balancing approaches to the NHS dataset before developing supervised learning strategies: Oversampling (Synthetic Minority Oversampling Technique or SMOTE), Undersampling (SpreadSubsample), and Hybrid System (SMOTE + SpreadSubsample). This study uses Naive Bayes, Bayesian Network, Random Forest, and Decision Tree (C4.5) methods. We measured these classifiers' efficacy using their Receiver Operating Characteristics (ROC) curves, sensitivity, and specificity. The validation data was used to test several ways of balancing the classifiers. The final prediction model was built on the one that did the best overall.
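The balancing and evaluation ideas in this abstract can be sketched under simplifying assumptions. `smote_like` interpolates between a minority sample and one of its nearest minority neighbors, which is the core idea of SMOTE; `undersample` is plain random undersampling, used here as a stand-in for SpreadSubsample rather than a reimplementation of it. All names and data are hypothetical.

```python
import math
import random

def smote_like(minority, n_new, k=2, seed=0):
    # Create synthetic minority points on segments between nearby minority samples.
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic

def undersample(majority, n_keep, seed=0):
    # Randomly keep n_keep majority samples without replacement.
    return random.Random(seed).sample(majority, n_keep)

def sensitivity_specificity(y_true, y_pred, positive=1):
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority = [(float(i), float(i % 7)) for i in range(100)]

synthetic = smote_like(minority, n_new=5)
reduced = undersample(majority, n_keep=10)
sens, spec = sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

A hybrid scheme in the spirit of the abstract would apply both steps: add synthetic minority samples and drop part of the majority class before training.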

RNN-based integrated system for real-time sensor fault detection and fault-informed accident diagnosis in nuclear power plant accidents

  • Jeonghun Choi;Seung Jun Lee
    • Nuclear Engineering and Technology / Vol. 55, No. 3 / pp.814-826 / 2023
  • Sensor faults in nuclear power plant instrumentation can propagate wrong signals that lead plant operators to misdiagnose an accident. To detect sensor faults and make accurate accident diagnoses, prior studies have developed a supervised learning-based sensor fault detection model and an accident diagnosis model with faulty sensor isolation. Even though the developed neural network models demonstrated satisfactory performance, their diagnosis performance should be reevaluated considering real-time connection. When operating in real-time, the diagnosis model is expected to indiscriminately accept fault data before receiving delayed fault information transferred from the previous fault detection model. The uncertainty of neural networks can also have a significant impact following the sensor fault features. In the present work, a pilot study was conducted to connect the two models and observe actual outcomes from a real-time application with an integrated system. While the initial results showed an overall successful diagnosis, some issues were observed. To recover the degraded diagnosis performance, additional logic was applied to minimize diagnosis failures that were not observed in the previous validations of the separate models. The results of a case study were then analyzed in terms of the real-time diagnosis outputs that plant operators would actually face in an emergency situation.

Real-time prediction of dynamic irregularity and acceleration of HSR bridges using modified LSGAN and in-service train

  • Huile Li;Tianyu Wang;Huan Yan
    • Smart Structures and Systems / Vol. 31, No. 5 / pp.501-516 / 2023
  • Dynamic irregularity and acceleration of bridges subjected to high-speed trains provide crucial information for comprehensive evaluation of the health state of under-track structures. This paper proposes a novel approach for real-time estimation of vertical track dynamic irregularity and bridge acceleration using a deep generative adversarial network (GAN) and vibration data from an in-service train. The vehicle-body and bogie acceleration responses are correlated with the two target variables by modeling train-bridge interaction (TBI) through a least squares generative adversarial network (LSGAN). To realize the supervised learning required in the present task, the conventional LSGAN is modified by implementing a new loss function and a linear activation function. The proposed approach can offer pointwise and accurate estimates of track dynamic irregularity and bridge acceleration, allowing frequent inspection of high-speed railway (HSR) bridges in an economical way. Thanks to its applicability in scenarios with high noise levels and critical resonance conditions, the proposed approach has promising prospects in engineering applications.
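For orientation, the conventional LSGAN objective that this abstract says is modified can be written as below. This is the standard least-squares formulation, not the paper's new loss function, which the abstract does not specify. Here $D$ is the discriminator, $G$ the generator, $a$ and $b$ the target labels for generated and real data, and $c$ the value the generator wants $D$ to output for generated data.

```latex
\begin{aligned}
\min_{D} V(D) &= \tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\text{data}}(x)}\!\left[(D(x) - b)^{2}\right]
               + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_{z}(z)}\!\left[(D(G(z)) - a)^{2}\right], \\
\min_{G} V(G) &= \tfrac{1}{2}\,\mathbb{E}_{z \sim p_{z}(z)}\!\left[(D(G(z)) - c)^{2}\right].
\end{aligned}
```

A supervised variant would presumably add a term tying $G$'s output to measured targets, but the exact form used in the paper is not given in the abstract.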