• Title/Summary/Keyword: decision support systems


Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan;Khan, Javed Ali;Alshehri, Dr. Mohammad Dahman;Ali, Muhammad Asghar;Hizbullah;Ali, Haider;Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.3
    • /
    • pp.830-860
    • /
    • 2022
  • [Introduction] Many companies are moving their businesses online as customers increasingly prefer to shop and purchase products over the Internet. [Problem] Users share a vast amount of information about products, making it difficult and challenging for end-users to reach purchasing decisions. [Motivation] A mechanism is therefore needed to automatically analyze end-users' opinions, thoughts, and feelings about products on social media platforms, which can help customers make or revise their decisions about buying specific products. [Proposed Solution] For this purpose, we propose an automated SentiDeceptive approach that classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-user rating information on social media platforms to support user decision-making. [Methodology] We first collected 11,781 end-user comments from the Amazon store and the Flipkart web application, covering distinct products such as watches, mobile phones, shoes, clothes, and perfumes. Next, we developed a coding guideline used as the basis for the comment annotation process. We then applied a content analysis approach and the existing VADER library to annotate the end-user comments in the dataset with the identified codes, resulting in a labelled dataset used as input to the machine learning classifiers. Finally, we applied a sentiment analysis approach to identify end-users' opinions and counter deceptive rating information on social media platforms by first preprocessing the input data to remove irrelevant content (stop words, special characters, etc.), employing two standard resampling approaches (oversampling and undersampling) to balance the dataset, extracting different features (TF-IDF and BOW) from the textual data, and then training and testing the machine learning algorithms with standard cross-validation approaches (KFold and ShuffleSplit). [Results/Outcomes] To support our research study, we also developed an automated tool that analyzes each customer's feedback and displays the collective sentiment of customers about a specific product in a graph, which helps customers make decisions. In a nutshell, our proposed sentiment approach produces good results in identifying customer sentiment from online user feedback, obtaining an average 94.01% precision, 93.69% recall, and 93.81% F-measure for classifying positive sentiments.
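
A minimal sketch of the pipeline described above, with toy data and model choices that are ours rather than the paper's: balance classes by random oversampling, extract TF-IDF features with stop-word removal, and cross-validate a sentiment classifier.

```python
# Hedged illustration of the abstract's pipeline; the reviews, labels,
# and classifier below are invented for demonstration only.
import random
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline

reviews = [
    "great watch, totally worth it", "poor quality shoes, broke fast",
    "average phone, nothing special", "love this perfume", "terrible fit",
    "decent value for money",
]
labels = ["positive", "negative", "neutral", "positive", "negative", "positive"]

def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples until every class is equally frequent."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_t, out_l = list(texts), list(labels)
    for cls, n in counts.items():
        pool = [t for t, l in zip(texts, labels) if l == cls]
        out_t += [rng.choice(pool) for _ in range(target - n)]
        out_l += [cls] * (target - n)
    return out_t, out_l

bal_texts, bal_labels = random_oversample(reviews, labels)
clf = make_pipeline(
    TfidfVectorizer(stop_words="english"),  # stop-word removal during vectorization
    LogisticRegression(max_iter=1000),
)
scores = cross_val_score(clf, bal_texts, bal_labels,
                         cv=KFold(n_splits=3, shuffle=True, random_state=0))
print("mean CV accuracy: %.2f" % scores.mean())
```

On a real dataset the undersampling variant and a BOW (`CountVectorizer`) feature set would be evaluated the same way, swapping the corresponding pipeline step.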

Experimental Comparison of Network Intrusion Detection Models Solving Imbalanced Data Problem (데이터의 불균형성을 제거한 네트워크 침입 탐지 모델 비교 분석)

  • Lee, Jong-Hwa;Bang, Jiwon;Kim, Jong-Wouk;Choi, Mi-Jung
    • KNOM Review
    • /
    • v.23 no.2
    • /
    • pp.18-28
    • /
    • 2020
  • With the development of virtual communities, the benefits that IT provides to people in fields such as healthcare, industry, communication, and culture are increasing, and quality of life is improving accordingly. At the same time, various malicious attacks target this developed network environment. Firewalls and intrusion detection systems exist to detect such attacks in advance, but they are limited in detecting malicious attacks that evolve day by day. To address this problem, intrusion detection research using machine learning is being actively conducted, but false positives and false negatives occur due to imbalance in the training dataset. In this paper, a random oversampling method is used to solve the imbalance problem of the UNSW-NB15 dataset used for network intrusion detection. Through experiments, we compare and analyze the accuracy, precision, recall, F1-score, training and prediction time, and hardware resource consumption of the models. Building on this study of the random oversampling method, we plan to develop a more efficient network intrusion detection model using other methods and high-performance models that can solve the imbalanced data problem.
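
An illustrative sketch of the imbalance handling described above, not the paper's exact setup: synthetic data stands in for UNSW-NB15, the minority "attack" class is randomly oversampled, and precision/recall/F1 are measured on a held-out set.

```python
# Hedged illustration: random oversampling of an imbalanced binary
# "attack vs. normal" dataset, then evaluation of a classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data: ~95% "normal" (0), ~5% "attack" (1)
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Random oversampling: resample minority-class rows with replacement
rng = np.random.default_rng(0)
minority = np.flatnonzero(y_tr == 1)
extra = rng.choice(minority, size=(y_tr == 0).sum() - minority.size, replace=True)
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])

clf = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)
p, r, f1, _ = precision_recall_fscore_support(y_te, clf.predict(X_te),
                                              average="binary")
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Only the training split is resampled; oversampling the test set would leak duplicated samples into the evaluation.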

Predicting Future ESG Performance using Past Corporate Financial Information: Application of Deep Neural Networks (심층신경망을 활용한 데이터 기반 ESG 성과 예측에 관한 연구: 기업 재무 정보를 중심으로)

  • Min-Seung Kim;Seung-Hwan Moon;Sungwon Choi
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.2
    • /
    • pp.85-100
    • /
    • 2023
  • Corporate ESG performance (environmental, social, and corporate governance), which reflects a company's strategic sustainability, has emerged as one of the main factors in today's investment decisions. The traditional ESG rating process is largely qualitative and subjective, based on institution-specific criteria, entailing limitations in reliability, predictability, and timeliness when making investment decisions. This study attempted to predict corporate ESG ratings through automated machine learning based on quantitative, disclosed corporate financial information. Using 12 types (21,360 cases) of market-disclosed financial information and 1,780 ESG measures available through the Korea Institute of Corporate Governance and Sustainability for 2019 to 2021, we proposed a deep neural network prediction model. Our model achieved about 86% classification accuracy in predicting ESG ratings, outperforming the comparative models. This study contributes to the literature in that the model achieved relatively accurate ESG rating predictions through an automated process using quantitative, publicly available corporate financial information. In practical terms, general investors can benefit from the prediction accuracy and time efficiency of our proposed model at nominal cost. In addition, this study can be extended by accumulating more Korean and international data and by developing a more robust and complex model in the future.
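
A minimal stand-in for the idea above, not the study's actual architecture or data: a small feed-forward neural network predicting a discrete ESG-style rating band from numeric financial features. The synthetic features and rating rule are invented for illustration.

```python
# Hedged sketch: feed-forward network classifying firms into rating bands
# from numeric "financial indicator" features (all data synthetic).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 12))  # 12 financial indicators per firm (invented)
# Invented rule producing three rating bands (0, 1, 2)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int) + (X[:, 2] > 1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = make_pipeline(
    StandardScaler(),  # scale features before the network
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0),
)
model.fit(X_tr, y_tr)
acc = model.score(X_te, y_te)
print("test accuracy: %.2f" % acc)
```

A production model would use the disclosed financial statement items directly and tune depth and regularization against a validation split.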

The Effects of Metaphors in the Interface of Smartphone Applications on Users' Intention to Use (사용자환경의 메타포가 스마트폰 애플리케이션 사용의도에 미치는 영향)

  • Jung, Wonjin;Hong, Suk-Ki
    • Asia pacific journal of information systems
    • /
    • v.24 no.3
    • /
    • pp.255-279
    • /
    • 2014
  • It is no exaggeration to say that smartphones have become an essential part of our lives due to their versatility. Nevertheless, they still have fewer capabilities overall than their desktop counterparts. Specifically, their small screens and low resolutions make it difficult for their applications to offer a usable interface. To accommodate these limitations, the interfaces of smartphone applications should be designed carefully and properly. Good interface design is critical to any application. However, a comprehensive review of the information systems (IS) literature found little research on the user interface design of smartphone applications. More specifically, there has been little empirical evidence and understanding of how metaphors, an imaginative way of describing objects and concepts, in the user interface of smartphone applications affect users' intention to use those applications. The research goals of this study are therefore to examine 1) the effects of metaphors in the user interface of smartphone applications on the interaction between users and applications, and 2) the effects of mediating variables, including that interaction and users' beliefs and attitudes, on users' intention to use the applications. A survey was conducted to collect data from university students and practitioners. A 24-item questionnaire was developed on a 5-point Likert-type scale. The measurement items were mostly adapted from previous studies in the IS literature and modified to fit the context of this study. First, a principal component factor analysis was performed to explore the inter-relationships among the variables. Most items loaded strongly on six components, all with eigenvalues exceeding 1, together explaining 70.7 per cent of the variance. The reliability of the items was also checked.
Most Cronbach's alpha values were above 0.8, so the scales were considered reliable. In sum, the results support the decision to retain the six factors for further investigation. Next, the structural model was analyzed with AMOS structural equation modeling. The values of GFI, AGFI, NFI, TLI, CFI, and RMSEA indicated that the research model has a reasonably good fit overall. Next, the convergent and discriminant validity of all constructs was examined. The standardized regression weights and critical ratios (CR) indicated sufficient convergent validity for all constructs. In addition, the square root of the average variance extracted (AVE) of each construct was compared with its correlations with all other constructs; the results supported discriminant validity for all constructs. Finally, path coefficients between the variables were examined. Metaphor was found to have an impact on interaction (β = .457, p = .000). There were also significant effects of interaction on perceived usefulness (β = .273, p = .000) and ease of use (β = .405, p = .000). User attitude was significantly influenced by these two beliefs: perceived usefulness (β = .386, p = .000) and ease of use (β = .347, p = .000). Further, users' intention to use smartphone applications was significantly influenced by user attitude (β = .567, p = .000). Based upon these analyses, all hypotheses were supported. This study found that the metaphors used in the interface of smartphone applications affect not only the interaction between users and applications but also, through the mediating variables perceived usefulness and ease of use, users' intention to use the applications.
These findings imply that if the metaphors used in an application's user interface are easy enough for smartphone users to understand, the application can be perceived as useful and easy to use, which in turn leads users to form an intention to use it. In conclusion, this study contributes both to partially validating and extending the Technology Acceptance Model (TAM) and to developing the construct of metaphor in smartphone settings. However, since a single empirical study cannot fully validate the findings, some limitations should be considered.

Evolution of Aviation Safety Regulations to cope with the concept of data-driven rulemaking - Safety Management System & Fatigue Risk Management System

  • Lee, Gun-Young
    • The Korean Journal of Air & Space Law and Policy
    • /
    • v.33 no.2
    • /
    • pp.345-366
    • /
    • 2018
  • Article 37 of the International Convention on Civil Aviation requires that rules be adopted in compliance with the international standards and recommended practices (SARPs) established by ICAO. As SARPs are revised annually, each ICAO Member State needs to reflect the new content in its national aviation acts in a timely manner. In recent years, data-driven international standards have been developed because aviation safety data and information-based legislation play important roles in preventing accidents rooted in human factors. The Safety Management System and crew Fatigue Risk Management System were reviewed as examples of the results of data-driven rulemaking. The Safety Management System was adopted in 2013 with the introduction of Annex 19, and Chapter 5 of the relevant manual describes safety data collection and analysis systems. Through analysis of safety data and information, decision makers can make informed, data-driven decisions. The Republic of Korea introduced the Safety Management System in accordance with Article 58 of the Aviation Safety Act for all airlines, maintenance companies, and airport corporations. To support the SMS, both mandatory and voluntary safety reporting systems need to be in place. Up until now, however, the standard of administrative penal dispensation for violations of the safety management system has been very weak. Various regulations have been developed and implemented in the United States and Europe for the proper legislation of the safety management system. In the wake of the Colgan Air crash, the US aviation safety committee recommended that the US Federal Aviation Administration establish a system that can identify and manage pilot fatigue hazards. In 2010, a notice of proposed rulemaking was issued by the Federal Aviation Administration, and in 2011 the final rule was passed.
The legislation helps differentiate risk by flight according to factors such as the pilot's duty start time, the availability of auxiliary crew, and the class of rest facility. Large amounts of data and information were analyzed during the rulemaking process and reflected in the resulting regulations. A cost-benefit analysis based on data from the previous 10-year period was conducted before the final legislation was reached, and it concluded that the cost benefits are positive. The Republic of Korea also currently has a clause in its aviation safety legislation related to crew fatigue risk, under which an airline can choose to conform either to the traditional flight time limitation standard or to a fatigue risk management system. In the United States, specifically for the purpose of data-driven rulemaking, the Aviation Rulemaking Committee (ARC) was formed and operates in this capacity. Considering the advantageous results of the ARC in the US and the D4S in Europe, such a system should definitely be introduced in Korea as well. A cost-benefit analysis is necessary and can serve to strengthen the resulting legislation. To improve the effectiveness of data-based legislation, it is necessary to reinforce expert resources and, through them, prepare a more detailed checklist of relevant variables.

A Study on the Diffusion Factor of e-finance (e-Finance의 확산요인에 관한 연구)

  • Kim, Min-Ho;Song, Chae-Hun;Song, Sun-Yok;Cha, Sun-Kwon
    • International Commerce and Information Review
    • /
    • v.4 no.2
    • /
    • pp.253-277
    • /
    • 2002
  • Nowadays, advanced information and communication technology has been driving a dramatic paradigm shift in transactions from a physical basis to an electronic one. Financial services support most financial exchanges between business parties, so the expansion of the electronic transaction paradigm affects every financial institution that provides them. Financial institutions have therefore adopted e-Finance systems and provide Internet financial services to remain competitive. The purpose of this study is to contribute, from the user's perspective, to the qualitative enhancement of customer service, rapid diffusion, and accurate strategy establishment for the e-Finance industry. Through a literature review and factor and reliability analyses, this study selects six diffusion factors: efficiency, and reliability and safety, as characteristics of perceived e-Finance itself; confidence, technical factors, and customer service quality, as perceptions of the e-Finance system; and inclination to innovation as a personal characteristic. Hypothesis testing using logistic regression analysis showed that the technical factors and customer service quality of the e-Finance system, together with personal inclination to innovation, had statistically significant positive effects on the diffusion decision at the 0.05 and 0.01 significance levels. However, the efficiency and the reliability and safety of perceived e-Finance did not affect the diffusion decision, and confidence in the e-Finance system showed no statistical significance.
This study can serve as basic material for future empirical studies of user-side diffusion factors and can be applied to corporate and government policy making and to assessing the customer service quality of financial institutions. However, this study has limitations: it did not address satisfaction factors and their effects, dealt only with domestic customers, and did not use multiple regression analysis.
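
A hedged sketch of the study's analytical method only; the variable names, effect sizes, and data below are invented for illustration, not taken from the paper: logistic regression of a binary adoption decision on perceived-factor scores.

```python
# Illustration of logistic regression on diffusion factors (synthetic data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 300
tech = rng.normal(size=n)     # perceived technical quality (invented scale)
service = rng.normal(size=n)  # perceived customer-service quality (invented)
innov = rng.normal(size=n)    # personal inclination to innovation (invented)

# Simulate adoption with assumed positive effects for all three factors
logit = 0.9 * tech + 0.7 * service + 0.6 * innov
adopt = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([tech, service, innov])
model = LogisticRegression().fit(X, adopt)
print("coefficients:", np.round(model.coef_[0], 2))
```

In an actual diffusion study the factor scores would come from survey items, and significance would be read from coefficient p-values (e.g., via `statsmodels.Logit`) rather than point estimates alone.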


A Study on Forecasting Accuracy Improvement of Case Based Reasoning Approach Using Fuzzy Relation (퍼지 관계를 활용한 사례기반추론 예측 정확성 향상에 관한 연구)

  • Lee, In-Ho;Shin, Kyung-Shik
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.67-84
    • /
    • 2010
  • In business, forecasting is the task of estimating what will happen in the future in order to make managerial decisions and plans. Accurate forecasting is therefore very important for major managerial decision making and is the basis for various business strategies. But it is very difficult to make an unbiased and consistent estimate because of the uncertainty and complexity of the future business environment. That is why we should use scientific forecasting models to support business decision making, and make an effort to minimize a model's forecasting error, the difference between observation and estimate. Nevertheless, minimizing the error is not an easy task. Case-based reasoning is a problem-solving method that utilizes similar past cases to solve the current problem. To build successful case-based reasoning models, retrieving not only the most similar case but also the most relevant case is very important. To retrieve similar and relevant cases from past cases, the measurement of similarity between cases is a key factor. Especially when the cases contain symbolic data, it is more difficult to measure distances. The purpose of this study is to improve the forecasting accuracy of the case-based reasoning approach using fuzzy relation and composition. In particular, two methods are adopted to measure the similarity between cases containing symbolic data: one derives the similarity matrix following binary logic (a judgment of sameness between two symbolic values), and the other derives the similarity matrix following fuzzy relation and composition. This study is conducted in the following order: data gathering and preprocessing, model building and analysis, validation analysis, and conclusion. First, in the data gathering and preprocessing stage, we collect a data set including categorical dependent variables.
The data set is cross-sectional, and its independent variables include several qualitative variables expressed as symbolic data. The research data consist of financial ratios and the corresponding bond ratings of Korean companies. The ratings employed in this study cover all bonds rated by one of the bond rating agencies in Korea. Our total sample includes 1,816 companies whose commercial papers were rated in the period 1997~2000. Credit grades are defined as outputs and classified into 5 rating categories (A1, A2, A3, B, C) according to credit level. Second, in the model building and analysis stage, we derive the similarity matrix following binary logic and fuzzy composition to measure the similarity between cases containing symbolic data. The types of fuzzy composition used are max-min, max-product, and max-average. The analysis is then carried out by the case-based reasoning approach with the derived similarity matrix. Third, in the validation stage, we verify the model through a McNemar test based on hit ratio. Finally, we draw conclusions. As a result, the similarity measuring method using fuzzy relation and composition shows good forecasting performance compared to the method using binary logic for similarity measurement between two symbolic values. However, the differences in forecasting performance among the types of fuzzy composition are not statistically significant. The contribution of this study is to propose another methodology in which fuzzy relation and fuzzy composition can be applied to the similarity measurement between two symbolic values, which is the most important factor in building a case-based reasoning model.
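
A toy sketch of the core idea above, with invented similarity values: instead of crisp 0/1 matching between symbolic attribute values, fuzzy relations assign graded similarities, and relations are combined by max-min composition.

```python
# Hedged illustration of fuzzy max-min composition (values invented).
import numpy as np

# Fuzzy relation R1: graded similarity between three symbolic categories
R1 = np.array([[1.0, 0.6, 0.2],
               [0.6, 1.0, 0.4],
               [0.2, 0.4, 1.0]])
# Fuzzy relation R2: graded similarity between three rating bands
R2 = np.array([[1.0, 0.7, 0.1],
               [0.7, 1.0, 0.5],
               [0.1, 0.5, 1.0]])

def max_min_composition(R, S):
    """(R o S)[i, j] = max_k min(R[i, k], S[k, j])."""
    return np.max(np.minimum(R[:, :, None], S[None, :, :]), axis=1)

C = max_min_composition(R1, R2)
print(np.round(C, 2))
```

The max-product and max-average variants mentioned in the abstract replace `min` in the composition with multiplication or averaging; under binary logic, R1 and R2 would degenerate to identity matrices and the graded similarity information would be lost.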

Development of a Feasibility Evaluation Model for Apartment Remodeling with the Number of Households Increasing at the Preliminary Stage (노후공동주택 세대수증가형 리모델링 사업의 기획단계 사업성평가 모델 개발)

  • Koh, Won-kyung;Yoon, Jong-sik;Yu, Il-han;Shin, Dong-woo;Jung, Dae-woon
    • Korean Journal of Construction Engineering and Management
    • /
    • v.20 no.4
    • /
    • pp.22-33
    • /
    • 2019
  • The government has steadily revised and developed laws and systems to activate the remodeling of apartments in response to the problems of aged apartment buildings. Despite such efforts, however, remodeling has yet to take off. Among the many reasons, this study noted that there were no tools for reasonable profitability judgments and decision making at the preliminary stage of a remodeling project. Thus, a feasibility evaluation model was developed. Generally, profitability judgments are made after the conceptual design, but decisions to pursue remodeling projects are made at the preliminary stage, so a feasibility evaluation model is required there. Accordingly, in this study, a feasibility evaluation model was developed for determining profitability at the preliminary stage. Construction costs, business expenses, financial expenses, and expected sales revenue were calculated using the initially available information and remodeling variables derived from existing cases. Through this process, we developed an algorithm that can give an overview of the return on investment. In addition, the developed preliminary-stage feasibility evaluation model was applied to three cases to verify its applicability. In all three cases, the difference between the model's forecast and the actual values was less than 5%, which is considered highly applicable. If the case base is expanded in the future, it will be a useful tool for practical work. The feasibility evaluation model developed in this study will support decision making by union members, and if the model is applied in different regions, it is expected to help local governments understand the scale of possible remodeling projects.
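
A hypothetical sketch of the kind of ROI overview the abstract describes; the cost categories follow the abstract, but the function, figures, and units are invented for illustration and are not the paper's model.

```python
# Hedged sketch: preliminary-stage ROI from rough cost and revenue inputs.
def remodeling_roi(construction_cost, business_expense, financial_expense,
                   sales_revenue):
    """ROI = (revenue - total cost) / total cost."""
    total_cost = construction_cost + business_expense + financial_expense
    return (sales_revenue - total_cost) / total_cost

roi = remodeling_roi(
    construction_cost=8_000,   # illustrative figures only (e.g., million KRW)
    business_expense=1_200,
    financial_expense=600,
    sales_revenue=11_000,
)
print(f"projected ROI: {roi:.1%}")
```

The actual model additionally derives these inputs from remodeling variables estimated out of prior cases, which is where its preliminary-stage value lies.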

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining investors' returns. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. The statistical techniques traditionally used in bond rating include multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis. However, one major drawback is that they rest on strict assumptions: linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited their application to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). SVM in particular is recognized as a promising classification and regression method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, so overfitting is unlikely to occur. SVM also does not require many training samples, since it builds prediction models using only the representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance.
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not perform as well in multi-class problems as SVM does in binary-class classification. Second, approximation algorithms (e.g., decomposition methods or the sequential minimal optimization algorithm) can be used to reduce the computation time of multi-class problems, but they can deteriorate classification performance. Third, a key difficulty in multi-class prediction problems is the data imbalance that occurs when the number of instances in one class greatly outnumbers that in another. Such data sets often lead to a default classifier with a skewed boundary, and thus reduced classification accuracy. SVM ensemble learning is one machine learning approach to cope with these drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than correctly predicted examples, so boosting attempts to produce new classifiers that better predict the examples on which the current ensemble performs poorly. In this way, it can reinforce the training of misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem.
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, it can perform the learning process considering geometric mean-based accuracy and errors across classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation is performed three times with different random seeds to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and each set is in turn used as the test set while the classifier trains on the other nine; that is, cross-validated folds are tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In terms of arithmetic mean-based prediction accuracy, MGM-Boost (52.95%) shows higher accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds differs significantly. The results indicate that the performance of MGM-Boost differs significantly from the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
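
A sketch of the geometric-mean accuracy notion that MGM-Boost builds on (our own illustration, not the paper's algorithm): per-class recalls are averaged geometrically, so a single badly-predicted minority class drags the score down far harder than an arithmetic mean would.

```python
# Hedged illustration: arithmetic vs. geometric mean of per-class recalls
# on an imbalanced 3-class problem (labels invented).
import numpy as np

def per_class_recall(y_true, y_pred, classes):
    return np.array([np.mean(y_pred[y_true == c] == c) for c in classes])

y_true = np.array([0] * 50 + [1] * 40 + [2] * 10)  # imbalanced classes
y_pred = y_true.copy()
y_pred[90:] = 0                                    # minority class fully missed

recalls = per_class_recall(y_true, y_pred, classes=[0, 1, 2])
print("arithmetic mean: %.2f" % recalls.mean())          # prints 0.67
print("geometric mean : %.2f" % recalls.prod() ** (1/3))  # prints 0.00
```

This asymmetry is why a boosting scheme that optimizes the geometric mean keeps pressure on minority-class errors that an accuracy-driven AdaBoost could otherwise ignore.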

Prospective for Successful IT in Agriculture (일본 농업분야 정보기술활용 성공사례와 전망)

  • Seishi Ninomiya;Byong-Lyol Lee
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.6 no.2
    • /
    • pp.107-117
    • /
    • 2004
  • IT doubtlessly contributes much to agriculture and rural development. Its roles can be summarized as follows: 1. To activate rural areas and provide a more comfortable and safe rural life with services equivalent to those in urban areas, facilitating distance education, tele-medicine, remote public services, remote entertainment, etc. 2. To initiate new agricultural and rural businesses such as e-commerce, real estate business for satellite offices, rural tourism, and virtual corporations of small-scale farms. 3. To support policy-making and evaluation on optimal farm production, disaster management, effective agro-environmental resource management, etc., providing tools such as GIS. 4. To improve farm management and farming technologies through efficient farm management, risk management, effective information and knowledge transfer, etc., realizing competitive and sustainable farming with safe products. 5. To provide systems and tools to secure food traceability and reliability, which has been an emerging issue concerning farm products since serious contaminations such as BSE and avian flu were detected. 6. To play an important and key role in the industrialization of farming or farm business enterprises, combining the above roles.