• Title/Summary/Keyword: problem features

DeepPTP: A Deep Pedestrian Trajectory Prediction Model for Traffic Intersection

  • Lv, Zhiqiang;Li, Jianbo;Dong, Chuanhao;Wang, Yue;Li, Haoran;Xu, Zhihao
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.7
    • /
    • pp.2321-2338
    • /
    • 2021
  • Compared with vehicle trajectories, pedestrian trajectories have more degrees of freedom and greater complexity, which makes trajectory prediction more challenging. This paper designs a scheme to divide the trajectories of pedestrians at a traffic intersection, which converts the trajectory regression problem into a trajectory classification problem. It then builds a deep model for short-term pedestrian trajectory prediction at intersections. The model calculates the spatial correlation and temporal dependence of the trajectory. More importantly, it captures the interactive features among pedestrians through the attention mechanism. To improve training speed, the model is composed purely of convolutional networks, a design that overcomes the single-step calculation mode of the traditional recurrent neural network. The experiments use the Vulnerable Road Users trajectory dataset for modeling and evaluation. Compared with existing pedestrian trajectory prediction models, the proposed model has advantages in evaluation indicators, training speed, and number of model parameters.

Comparing the Performance of 17 Machine Learning Models in Predicting Human Population Growth of Countries

  • Otoom, Mohammad Mahmood
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.1
    • /
    • pp.220-225
    • /
    • 2021
  • Human population growth rate is an important parameter for real-world planning. Common approaches rely on fixed parameters such as human population, mortality rate, and fertility rate, which are collected historically to determine a region's population growth rate. The literature does not provide a solution for areas with no historical data. In such areas, machine learning can solve the problem, but the multitude of machine learning algorithms makes it difficult to determine the best approach. Further, missing features are a common real-world problem. Thus, it is essential to compare machine learning techniques and select those that perform best and are most robust in the presence of missing features. This study compares the performance of 17 machine learning techniques (base learners and ensemble learners) in predicting the human population growth rate of a country. Among the 17 techniques, random forest outperformed all the others in both predictive performance and robustness to missing features. Thus, the study demonstrates and compares machine learning techniques for predicting the human population growth rate in settings where historical data and feature information are not available, and it identifies the best-performing algorithm for population growth rate prediction.

Predicting Reports of Theft in Businesses via Machine Learning

  • JungIn, Seo;JeongHyeon, Chang
    • International Journal of Advanced Culture Technology
    • /
    • v.10 no.4
    • /
    • pp.499-510
    • /
    • 2022
  • This study examines the reporting factors of crime against business in Korea and proposes a corresponding predictive model using machine learning. While many previous studies focused on the individual factors of theft victims, there is a lack of evidence on the reporting factors of crime against a business that serves the public good as opposed to those that protect private property. Therefore, we proposed a crime prevention model for the willingness factor of theft reporting in businesses. This study used data collected through the 2015 Commercial Crime Damage Survey conducted by the Korea Institute for Criminal Policy. It analyzed data from 834 businesses that had experienced theft during a 2016 crime investigation. The data showed a problem with unbalanced classes. To solve this problem, we jointly applied the Synthetic Minority Over Sampling Technique and the Tomek link techniques to the training data. Two prediction models were implemented. One was a statistical model using logistic regression and elastic net. The other involved a support vector machine model, tree-based machine learning models (e.g., random forest, extreme gradient boosting), and a stacking model. As a result, the features of theft price, invasion, and remedy, which are known to have significant effects on reporting theft offences, can be predicted as determinants of such offences in companies. Finally, we verified and compared the proposed predictive models using several popular metrics. Based on our evaluation of the importance of the features used in each model, we suggest a more accurate criterion for predicting var.
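
The abstract above says that Tomek links were applied together with SMOTE to rebalance the training data, but it does not describe the implementation. As a rough illustration only, a Tomek link is a pair of samples from different classes that are each other's nearest neighbour. The sketch below finds such pairs with plain Python; the toy `points` and `labels`, and the helper names, are hypothetical and not taken from the study.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(i, points):
    """Index of the sample closest to points[i], excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: dist2(points[i], points[j]))

def tomek_links(points, labels):
    """Return index pairs (i, j) that are mutual nearest neighbours
    and carry different class labels."""
    links = []
    for i in range(len(points)):
        j = nearest(i, points)
        if i < j and nearest(j, points) == i and labels[i] != labels[j]:
            links.append((i, j))
    return links

# Hypothetical toy data: class 0 is the majority, class 1 the minority.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0)]
labels = [0, 0, 0, 1, 1]
links = tomek_links(points, labels)
print(links)
```

A cleaning step would then drop one or both members of each linked pair; which member is dropped is a design choice that the abstract does not specify.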

Experimental Analysis of Bankruptcy Prediction with SHAP framework on Polish Companies

  • Tuguldur Enkhtuya;Dae-Ki Kang
    • International journal of advanced smart convergence
    • /
    • v.12 no.1
    • /
    • pp.53-58
    • /
    • 2023
  • With the rapid development of artificial intelligence, users are demanding explanations of algorithm results and want to know which parameters influence them. In this paper, we propose an interpretable model for bankruptcy prediction using the SHAP framework. SHAP (SHapley Additive exPlanations) is a framework that gives a visualized result that can be used to explain and interpret machine learning models. As a result, we can describe which features are important for the output of our deep learning model. The SHAP force plot shows the top features that contribute most to the overall model score. Even though a Fully Connected Neural Network (FCNN) is a "black box" model, Shapley values help to alleviate the "black box" problem. FCNNs perform well on a complex dataset with more than 60 financial ratios. Combined with the SHAP framework, we create an effective model with an understandable interpretation. Because bankruptcy is a rare event, we address the imbalanced dataset problem with the help of SMOTE. SMOTE is an oversampling technique that generates synthetic samples for the minority class. It uses the K-nearest neighbors algorithm and produces new examples along the line segments connecting neighboring samples. We expect our model's results to assist financial analysts who are interested in forecasting the bankruptcy of companies in detail.
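
The SMOTE step described in the abstract above (pick a minority sample, pick one of its K nearest minority neighbours, and create a new point on the line between them) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code; the function name `smote`, the toy data, and the choice `k=2` are assumptions made for the example.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-class samples described by two features.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two existing minority samples, the new points stay inside the region already occupied by the minority class.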

Forecasting Short-Term KOSPI using Wavelet Transforms and Fuzzy Neural Network (웨이블릿 변환과 퍼지 신경망을 이용한 단기 KOSPI 예측)

  • Shin, Dong-Kun;Chung, Kyung-Yong
    • The Journal of the Korea Contents Association
    • /
    • v.11 no.6
    • /
    • pp.1-7
    • /
    • 2011
  • Accurate KOSPI forecasting is considered one of the most difficult problems because the short-term KOSPI is correlated with various factors, including politics and economics. In this paper, we present a methodology for forecasting the short-term, five-day trend of stock prices using a feature selection method based on a neural network with weighted fuzzy membership functions (NEWFM). The distributed non-overlap area measurement method minimizes the number of input features by removing the worst input features one by one. In the first step, a technical indicator is selected for preprocessing the KOSPI data. In the second step, thirty-nine input features are produced by wavelet transforms. Using the non-overlap area distribution measurement method, twelve of the thirty-nine input features are selected as the minimal feature set. The proposed method achieves sensitivity, specificity, and accuracy rates of 72.79%, 74.76%, and 73.84%, respectively.
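
For readers unfamiliar with the three rates reported in the abstract above, the sketch below shows how sensitivity, specificity, and accuracy are computed from the counts of a binary (for example up/down trend) classifier. The counts are made-up numbers for illustration and are not the study's data.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # share of true positives found
    specificity = tn / (tn + fp)                # share of true negatives found
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases correct
    return sensitivity, specificity, accuracy

# Hypothetical counts: 50 actual positives and 50 actual negatives.
sensitivity, specificity, accuracy = rates(tp=40, fn=10, tn=30, fp=20)
print(sensitivity, specificity, accuracy)
```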

Performance Evaluation of Price-based Input Features in Stock Price Prediction using Tensorflow (텐서플로우를 이용한 주가 예측에서 가격-기반 입력 피쳐의 예측 성능 평가)

  • Song, Yoojeong;Lee, Jae Won;Lee, Jongwoo
    • KIISE Transactions on Computing Practices
    • /
    • v.23 no.11
    • /
    • pp.625-631
    • /
    • 2017
  • Stock price prediction remains an unsolved problem. Although there have been various attempts and studies to predict stock prices scientifically, it is impossible to predict the future precisely. Nevertheless, stock price prediction has been a subject of interest in a variety of related fields such as economics, mathematics, physics, and computer science. In this paper, we study fluctuation patterns of stock prices and predict future trends using deep learning. To this end, this study presents three deep learning models built with Tensorflow, an open source framework, in which each model accepts different input features. We expand a previous study that used simple price data. We measured the performance of the three predictive models while increasing the number of price-based input features. Through this experiment, we measured how the performance of the predictive model changes depending on the price-based input features. Finally, we compared and analyzed the experimental results to evaluate the impact of price-based input features on stock price prediction.

Accurate Intrusion Detection using n-Gram Augmented Naive Bayes (N-Gram 증강 나이브 베이스를 이용한 정확한 침입 탐지)

  • Kang, Dae-Ki
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2008.10a
    • /
    • pp.285-288
    • /
    • 2008
  • The n-gram approach has been widely applied in many intrusion detection applications. However, it has shown a few problems, including double counting of features. To address those problems, we applied n-gram augmented Naive Bayes directly to classify intrusive sequences and compared its performance with that of Naive Bayes and Support Vector Machines (SVM) with n-gram features through experiments on host-based intrusion detection benchmark data sets. Experimental results on the University of New Mexico (UNM) benchmark data sets show that the n-gram augmented method, which solves the problem of independence violation that occurs when n-gram features are directly applied to Naive Bayes (i.e., Naive Bayes with n-gram features), yields intrusion detectors with higher accuracy than those from Naive Bayes with n-gram features and accuracy comparable to those from SVM with n-gram features.
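
To make the "double counting" issue mentioned in the abstract above concrete, the sketch below extracts sliding-window n-grams from a sequence. The trace and its call names are hypothetical, chosen only for illustration; they are not drawn from the UNM data sets.

```python
from collections import Counter

def ngrams(sequence, n):
    """Return all overlapping n-grams of `sequence` as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]

# Hypothetical trace of call names from one process.
trace = ["open", "read", "mmap", "read", "mmap", "close"]
bigrams = ngrams(trace, 2)
counts = Counter(bigrams)
print(counts.most_common(1))
```

Adjacent n-grams share n - 1 elements, so each element of the trace contributes to up to n features. Treating those features as independent, as plain Naive Bayes does, counts the same evidence more than once.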

Background Cytologic Features of Metastatic Carcinomas in the Liver in Fine Needle Aspiration Cytology - Analysis of 20 Cases - (간의 전이성 상피암 20예의 세침 천자 흡인시 배경 병변의 세포학적 소견)

  • Myong, Na-Hye;Koh, Jae-Soo;Ha, Chang-Won;Cho, Kyung-Ja;Jang, Ja-June
    • The Korean Journal of Cytopathology
    • /
    • v.2 no.2
    • /
    • pp.90-97
    • /
    • 1991
  • The liver is generally known as the organ most commonly involved by metastatic tumors. With the growing use of fine needle aspiration in the diagnosis of hepatic tumors, the differential diagnosis between hepatocellular carcinoma and metastatic carcinoma has frequently been a main issue in poorly differentiated cases, especially for pathologists in Korea, an endemic area of hepatocellular carcinoma. Until now the problem has usually been solved by comparing the cytologic characteristics of the tumor cells rather than by background cytologic features, which have rarely been studied. We examined the background cytologic features helpful for the differential diagnosis through the analysis of 20 cases with confirmed primary cancer that were diagnosed as metastatic carcinomas in the liver by fine needle aspiration cytology. The twenty cases included 9 adenocarcinomas, 7 squamous cell carcinomas, 1 small cell carcinoma, 1 carcinoid, 1 adenoid cystic carcinoma, and 1 renal cell carcinoma. Analysis of background cytologic features revealed that 77% of adenocarcinoma cases showed benign mesenchymal components and hepatocytes, and that squamous cell carcinoma cases disclosed benign mesenchymal tissue (71%) and necrosis (57%). The remaining cases showed variable combinations of benign mesenchymal component, necrosis, hepatocytes, and bile duct epithelial cells. No case revealed atypical hepatocytic naked nuclei, a useful cytologic finding of hepatocellular carcinoma. In summary, the background cytologic features more commonly observed in metastatic carcinomas than in hepatocellular carcinoma were benign mesenchymal components, hepatocytes, necrosis, and bile duct epithelium. Endothelial cells and hepatocytic naked nuclei, two relatively specific findings of hepatocellular carcinoma, were not observed except in the renal cell carcinoma. These background cytologic features are thought to be helpful for the differential diagnosis between hepatocellular carcinoma and various metastatic carcinomas in poorly differentiated cases.

De-cloaking Malicious Activities in Smartphones Using HTTP Flow Mining

  • Su, Xin;Liu, Xuchong;Lin, Jiuchuang;He, Shiming;Fu, Zhangjie;Li, Wenjia
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.6
    • /
    • pp.3230-3253
    • /
    • 2017
  • Android malware steals users' private information, and embedded unsafe advertisement (ad) libraries execute unsafe code that causes damage to users. The majority of such traffic is HTTP and is mixed with other normal traffic, which makes the detection of malware and unsafe ad libraries a challenging problem. To address this problem, this work describes a novel HTTP traffic flow mining approach to detect and categorize Android malware and unsafe ad libraries. This work designed AndroCollector, which can automatically execute an Android application (app) and collect its network traffic traces. From these traces, this work extracts HTTP traffic features along three important dimensions (quantitative, timing, and semantic) and uses these features to characterize malware and unsafe ad libraries. Based on these HTTP traffic features, this work describes a supervised classification scheme for detecting malware and unsafe ad libraries. In addition, to help network operators, this work describes a fine-grained categorization method that generates fingerprints from HTTP request methods for each malware family and unsafe ad library. This work evaluated the scheme using HTTP traffic traces collected from 10778 Android apps. The experimental results show that the scheme can detect malware with 97% accuracy and unsafe ad libraries with 95% accuracy when tested on popular third-party Android markets.

A Log-Energy Feature Normalization Method Using ARMA Filter (ARMA 필터를 이용한 로그 에너지 특징의 정규화 방법)

  • Shen, Guang-Hu;Jung, Ho-Youl;Chung, Hyun-Yeol
    • Journal of Korea Multimedia Society
    • /
    • v.11 no.10
    • /
    • pp.1325-1337
    • /
    • 2008
  • The mismatch between training and recognition environments is the major cause of degradation in speech recognition performance. To resolve this mismatch, various noise processing methods have been studied. Among them, ERN (log-Energy dynamic Range Normalization) and SEN (Silence Energy Normalization), which normalize log energy features, show better performance than others. However, these methods can hardly normalize the relatively higher values of log energy features, and the environmental mismatch caused by this problem grows larger, especially in low-SNR environments. To solve these problems, we propose applying an ARMA filter as post-processing to smooth log energy features by calculating the moving average in an auto-regressive scheme. In recognition experiments conducted on the Aurora 2.0 DB, the proposed method shows improved recognition results compared with conventional methods.
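
The abstract above does not give the filter equation. As a rough sketch, the code below assumes a commonly used ARMA smoothing form of order M, in which each output frame averages the M previous outputs (the auto-regressive part) with the current and M following inputs (the moving-average part): y[t] = (y[t-M] + ... + y[t-1] + x[t] + ... + x[t+M]) / (2M + 1). The function name, the order, the handling of edge frames, and the sample values are all assumptions, not details from the paper.

```python
def arma_smooth(x, order=2):
    """Smooth a feature track with an assumed ARMA form:
    y[t] = (sum of `order` past outputs + current and `order` future inputs)
           / (2 * order + 1)."""
    y = list(x)  # edge frames are copied through unchanged (an assumption)
    for t in range(order, len(x) - order):
        past_outputs = sum(y[t - order:t])            # auto-regressive part
        current_and_future = sum(x[t:t + order + 1])  # moving-average part
        y[t] = (past_outputs + current_and_future) / (2 * order + 1)
    return y

# Hypothetical log-energy track with one outlying high-energy frame.
log_energy = [10.0, 10.0, 10.0, 18.0, 10.0, 10.0, 10.0, 10.0]
smoothed = arma_smooth(log_energy, order=2)
print(smoothed)
```

In this toy track the isolated peak is pulled toward its neighbours, which is the kind of smoothing of high log-energy values that the abstract describes.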
