• Title/Summary/Keyword: decision support system


A Study on the Effect of the Document Summarization Technique on the Fake News Detection Model (문서 요약 기법이 가짜 뉴스 탐지 모형에 미치는 영향에 관한 연구)

  • Shim, Jae-Seung;Won, Ha-Ram;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.201-220 / 2019
  • Fake news has emerged as a significant issue over the last few years, igniting discussions and research on how to solve this problem. In particular, studies on automated fact-checking and fake news detection using artificial intelligence and text analysis techniques have drawn attention. Fake news detection research entails a form of document classification; thus, document classification techniques have been widely used in this type of research. However, document summarization techniques have been inconspicuous in this field. At the same time, automatic news summarization services have become popular, and a recent study found that using news summarized through abstractive summarization strengthened the predictive performance of fake news detection models. Therefore, the need to study the integration of document summarization technology in the domestic news data environment has become evident. In order to examine the effect of extractive summarization on the fake news detection model, we first summarized news articles through extractive summarization. Second, we created a detection model based on the summarized news. Finally, we compared our model with a full-text-based detection model. The study found that BPN(Back Propagation Neural Network) and SVM(Support Vector Machine) did not exhibit a large difference in performance; however, for DT(Decision Tree), the full-text-based model demonstrated somewhat better performance. In the case of LR(Logistic Regression), our model exhibited superior performance. Nonetheless, the results did not show a statistically significant difference between our model and the full-text-based model. Therefore, when summarization is applied, at least the core information of the fake news is preserved, and the LR-based results suggest the possibility of performance improvement. This study features an experimental application of extractive summarization to fake news detection research employing various machine-learning algorithms. The study's main limitations are the relatively small amount of data and the lack of comparison between various summarization technologies. Therefore, an in-depth analysis that applies various analytical techniques to a larger data volume would be helpful in the future.
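
A minimal sketch of the comparison described above: the same labeled corpus is vectorized with TF-IDF and evaluated with the four classifier families named in the abstract, once on full texts and once on extractive summaries. The TF-IDF features, the scikit-learn estimators (an MLP standing in for the BPN), and the placeholder variables full_articles, summaries, and labels are illustrative assumptions, not the authors' exact pipeline, which would also include Korean-language preprocessing.

```python
# Sketch: evaluating the same classifiers on full articles vs. extractive summaries.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def evaluate(texts, labels):
    """Cross-validated accuracy of each classifier on one text variant."""
    X = TfidfVectorizer(max_features=5000).fit_transform(texts)
    models = {
        "LR": LogisticRegression(max_iter=1000),
        "SVM": SVC(kernel="linear"),
        "DT": DecisionTreeClassifier(),
        "BPN": MLPClassifier(hidden_layer_sizes=(100,), max_iter=500),  # stand-in for BPN
    }
    return {name: cross_val_score(m, X, labels, cv=5).mean() for name, m in models.items()}

# full_articles, summaries, labels are placeholders for the news corpus.
# scores_full = evaluate(full_articles, labels)     # full-text-based model
# scores_summary = evaluate(summaries, labels)      # summarized-news-based model
```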

A Recidivism Prediction Model Based on XGBoost Considering Asymmetric Error Costs (비대칭 오류 비용을 고려한 XGBoost 기반 재범 예측 모델)

  • Won, Ha-Ram;Shim, Jae-Seung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.127-137 / 2019
  • Recidivism prediction has been a subject of constant research by experts since the early 1970s, but it has become more important as crimes committed by recidivists steadily increase. In particular, in the 1990s, after the US and Canada adopted the 'Recidivism Risk Assessment Report' as a decisive criterion during trials and parole screening, research on recidivism prediction became more active, and in the same period empirical studies on 'recidivism factors' also began in Korea. Although most recidivism prediction studies have so far focused on the factors of recidivism or the accuracy of recidivism prediction, it is important to minimize the misclassification cost of the prediction, because recidivism prediction has an asymmetric error cost structure. In general, the cost of misclassifying people who will not reoffend as likely to reoffend is lower than the cost of misclassifying people who will reoffend as unlikely to reoffend, because the former only adds monitoring costs, while the latter incurs substantial social and economic costs. Therefore, in this paper, we propose an XGBoost(eXtreme Gradient Boosting; XGB) based recidivism prediction model that considers asymmetric error costs. In the first step of the model, XGB, recognized as a high-performance ensemble method in the field of data mining, was applied, and its results were compared with various prediction models such as LOGIT(logistic regression analysis), DT(decision trees), ANN(artificial neural networks), and SVM(support vector machines). In the next step, the classification threshold is optimized to minimize the total misclassification cost, which is the weighted average of FNE(False Negative Error) and FPE(False Positive Error). To verify the usefulness of the model, it was applied to a real recidivism prediction dataset. As a result, it was confirmed that the XGB model not only showed better prediction accuracy than the other prediction models but also reduced the misclassification cost most effectively.
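
The cost-sensitive step can be illustrated with a short sketch: fit an XGBoost classifier, then sweep candidate cutoffs over the predicted probabilities and keep the one that minimizes the weighted misclassification cost. The cost weights, the threshold grid, and the placeholder data variables are assumptions for illustration, not the paper's actual settings.

```python
# Sketch: cost-sensitive threshold selection on top of an XGBoost classifier.
import numpy as np
from xgboost import XGBClassifier

def best_threshold(y_true, proba, cost_fn=5.0, cost_fp=1.0):
    """Cutoff that minimizes the weighted misclassification cost (illustrative weights)."""
    thresholds = np.linspace(0.05, 0.95, 91)
    costs = []
    for t in thresholds:
        pred = (proba >= t).astype(int)
        fn = np.sum((y_true == 1) & (pred == 0))   # missed recidivists (costly)
        fp = np.sum((y_true == 0) & (pred == 1))   # extra monitoring (cheaper)
        costs.append(cost_fn * fn + cost_fp * fp)
    return thresholds[int(np.argmin(costs))]

# X_train, y_train, X_valid, y_valid are placeholders for the recidivism data.
# model = XGBClassifier(n_estimators=300, max_depth=4).fit(X_train, y_train)
# t_star = best_threshold(y_valid, model.predict_proba(X_valid)[:, 1])
```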

A Study on the Correspondence and the Autonomy between the Act on the Guarantee of Rights of and Support for Persons with Developmental Disabilities and the Similar Ordinances of the Local Governments (발달장애인 권리보장 및 지원에 관한 법률과 지방자치단체 유사조례 간의 연계성과 자치성에 관한 연구)

  • Jeon, Jihye;Lee, Sehee
    • 한국사회정책 / v.25 no.2 / pp.367-402 / 2018
  • This study analyzed the relationship between the Act on the Guarantee of Rights of and Support for Persons with Developmental Disabilities(Act for PWDD) and the similar ordinances that local governments have enacted based on this law, focusing on correspondence(the rate of reflection) and autonomy(differentiation). As of October 2017, 63 local government ordinances and the Act for PWDD were analyzed. The results are as follows. First, the rate at which the Act for PWDD is reflected in the ordinances differed by clause. For clauses emphasizing welfare support, agreement between the local ordinances and the Act was high. While the Act for PWDD emphasizes the rights of persons with developmental disabilities, there was little about those rights in the local governments' ordinances. This is evidence that current ordinances are based on a protective point of view toward people with developmental disabilities. In the future, policy measures will be needed to ensure that respect for decision-making by persons with developmental disabilities and guarantees of their rights are included in the ordinances. Second, some provisions have a reflection rate of 0%; these matters may be guaranteed by other laws in the area, so this does not necessarily mean the absence of a related system in the region, but it does indicate the possibility of an institutional blind spot. In the future, consideration should be given to how other legal systems in the area complement the Act, so that persons with developmental disabilities are not left in institutional blind spots. Third, the autonomy(differentiation) of local ordinances was examined from both a content and an administrative perspective to help practical implementation. The degree of differentiation between the ordinances varies: some emphasize the responsibilities of the head of the organization, emphasize fact-finding surveys, establish a welfare committee, or add local needs. Local governments considering the enactment of ordinances in the future should refer to these cases and establish enactable local ordinances that take advantage of the characteristics of local autonomy.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and the sentence classification task on the training set was performed after reading each sentence in person. Then, based on the training set, a Support Vector Machine model was utilized to predict the tone of sentences in the test set. Finally, the machine learning model calculated the percentages of positive, neutral, and negative sentences in each prospectus. To predict the price movement of an IPO stock, four different machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use quantitative variables using technical analysis and prospectus tone variables together show higher accuracy than models that use only quantitative variables. More specifically, the prediction accuracy was improved by 1.45% points in the Random Forest model, 4.34% points in the Artificial Neural Network model, and 5.07% points in the Support Vector Machine model. After testing the performance of these machine learning techniques, the Artificial Neural Network model using both quantitative variables and prospectus tone variables was the model with the highest prediction accuracy rate, which was 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify the statistically significant difference between the models. 
The model using only quantitative variables and the model using both the quantitative variables and the prospectus tone variables were compared, and it was confirmed that the predictive performance improved significantly at a 1% significance level.
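
A minimal sketch of the sentence-tone step described above: a TF-IDF plus SVM pipeline is trained on the hand-labeled 10% sample and then used to compute the share of positive, neutral, and negative sentences per prospectus. The variable names, the use of LinearSVC, and the absence of Korean-specific tokenization are simplifying assumptions, not the authors' exact implementation.

```python
# Sketch: sentence-tone classification with TF-IDF features and an SVM,
# then per-prospectus tone shares.
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def train_tone_classifier(labeled_sentences, labels):
    """labels take values 'positive', 'neutral', 'negative' (the hand-labeled sample)."""
    return make_pipeline(TfidfVectorizer(), LinearSVC()).fit(labeled_sentences, labels)

def tone_shares(tone_clf, sentences):
    """Share of positive/neutral/negative sentences in one prospectus's Risk Factors section."""
    counts = Counter(tone_clf.predict(sentences))
    n = max(len(sentences), 1)
    return {k: counts.get(k, 0) / n for k in ("positive", "neutral", "negative")}

# labeled_sentences, labels, prospectus_sentences are placeholders for the study's data.
# tone_clf = train_tone_classifier(labeled_sentences, labels)
# tone_features = [tone_shares(tone_clf, s) for s in prospectus_sentences]
```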

Application of diversity of recommender system according to user preference change (사용자 선호도 변화에 따른 추천시스템의 다양성 적용)

  • Na, Hyeyeon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.26 no.4 / pp.67-86 / 2020
  • Recommender systems have an increasingly large influence on users and businesses. Recently, the importance of e-commerce has grown rapidly during the worldwide COVID-19 pandemic, and recommender systems sit at the center of e-commerce. Leading e-commerce managers have noted that recommender systems have a major influence on customers' purchases; for example, about 50% of Netflix and Amazon sales are reported to come from their recommender systems. Most algorithms have focused on improving the accuracy of the recommender system regardless of novelty, diversity, serendipity, etc. However, recommender systems with only high accuracy cannot secure long-term business profit because they generate sales polarization. In addition, customers do not experience the enjoyment of shopping from an accuracy-only recommender system because their preferences change constantly. Therefore, recommender systems reflecting various values need to be developed for high user satisfaction. Re-ranking is the most useful methodology for realizing diversity in a recommender system. In this paper, diversity is introduced by constructing high similarity with users who have different preferences, using an algorithm based on the categories of each user's purchased items. This is distinguished from past research approaches that change the recommender system algorithm without considering each user's level of diversity preference. We tried to discover each user's diversity preference level and observed how the effect differed according to that level. In addition, a graph-based recommender system was used to express diversity through the user network, rather than collaborative filtering. Amazon Grocery and Gourmet Food data was used because low-involvement products, such as habitual products, foods, and low-priced goods, have a high probability of revealing customers' diversity. First, a bipartite graph containing both users and items is constructed to build the graph-based recommender system; however, unipartite graphs of users and of items also need to be established to express the diversity of the recommender system. The edge weights of each unipartite graph play a crucial role and are derived from the Jaccard distance between items' categories, as sketched below. Two important results can be observed from the user unipartite network: first, each user's diversity preference level can be read from the network, and second, dissimilar users can be discovered in it. Through this process, the diversity of the recommender system is raised substantially with only a small loss of accuracy, and optimization toward higher accuracy is possible by controlling the diversity ratio. This paper makes three important theoretical contributions. First, it expands recommender system research toward user satisfaction with various values. Second, a new graph-based recommender system is developed. Third, an evaluation indicator for diversity is proposed. In addition, recommender systems are practically useful for corporate profit, and this paper contributes closely to business. Above all, long-term business profit can be improved by using a recommender system with diversity, and the recommender system can provide the right service according to each user's diversity level. Lastly, corporations selling low-involvement products can benefit greatly based on these results.
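
A rough sketch of the graph construction: a user-item bipartite graph is projected onto users, and the projected edges are weighted by the Jaccard distance between the category sets of the items each user purchased. The networkx-based code and its data structures are illustrative assumptions; the paper's full diversity re-ranking procedure is not reproduced here.

```python
# Sketch: bipartite user-item graph projected onto a weighted user-user network.
import networkx as nx
from networkx.algorithms import bipartite

def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over two sets of item categories."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def build_user_graph(purchases, item_categories):
    """purchases: user -> set of items; item_categories: item -> set of categories."""
    B = nx.Graph()
    B.add_nodes_from(purchases, bipartite=0)
    for user, items in purchases.items():
        B.add_edges_from((user, item) for item in items)
    G = bipartite.projected_graph(B, list(purchases))   # user-user network
    for u, v in G.edges:
        cats_u = set().union(*(item_categories[i] for i in purchases[u]))
        cats_v = set().union(*(item_categories[i] for i in purchases[v]))
        G[u][v]["weight"] = jaccard_distance(cats_u, cats_v)  # higher = more dissimilar
    return G
```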

Development of Sentiment Analysis Model for the hot topic detection of online stock forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

  • Hong, Taeho;Lee, Taewon;Li, Jingjing
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.187-204 / 2016
  • Document classification based on emotional polarity has become a welcome emerging task owing to the great explosion of data on the Web. In the big data age, there are too many information sources to refer to when making decisions. For example, when considering travel to a city, a person may search reviews from a search engine such as Google or from social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide whether or not to make the trip. Sentiment analysis of customer reviews has become an important research topic as data mining technology is widely adopted for text mining of the Web. Sentiment analysis classifies documents through machine learning techniques, such as decision trees, neural networks, and support vector machines (SVMs), and is used to determine the attitude, position, and sensibility of people who write articles about various topics published on the Web. Regardless of the polarity of customer reviews, emotional reviews are very helpful materials for analyzing customers' opinions. Sentiment analysis helps with understanding what customers really want, instantly, through automated text mining: it extracts subjective information from text on the Web to determine the attitude or position of the person who wrote the article and expressed an opinion about a particular topic. In this study, we developed a model that selects hot topics from user posts on China's online stock forums by using the k-means algorithm and the self-organizing map (SOM). In addition, we developed a detection model to predict hot topics by using machine learning techniques such as logit, decision trees, and SVM. We employed sentiment analysis to develop our model for the selection and detection of hot topics from China's online stock forums. The sentiment analysis calculates a sentiment value for a document by comparing and classifying terms according to a polarity sentiment dictionary (positive or negative). The online stock forum is an attractive data source because of its information about stock investment: users post numerous texts about stock movements by analyzing the market according to government policy announcements, market reports, reports from economic research institutes, and even rumors. We divided the forum's topics into 21 categories for the sentiment analysis, and 144 topics were initially selected across these categories. The posts were crawled to build a positive and negative text database, and we ultimately obtained 21,141 posts on 88 topics after preprocessing the text from March 2013 to February 2015. An interest index was defined to select the hot topics, and the k-means algorithm and SOM produced equivalent results on this data. We developed decision tree models to detect hot topics with three algorithms: CHAID, CART, and C4.5; the results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from the negative data, training the SVM models with the radial basis function (RBF) kernel via a grid search. The detection of hot topics using sentiment analysis provides investors with the latest trends and hot topics in the stock forum, so that they no longer need to search the vast amounts of information on the Web. Our proposed model is also helpful for rapidly determining customers' signals or attitudes toward government policy and firms' products and services.
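
The SVM detection step can be sketched as a standard grid search over the RBF kernel's C and gamma values; the parameter grid and the placeholder feature matrix X and labels y are assumptions for illustration rather than the study's actual configuration.

```python
# Sketch: grid search over an RBF-kernel SVM for hot-topic detection.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
# search.fit(X, y)                     # X: topic-level features, y: hot-topic labels
# best_model = search.best_estimator_  # model refit with the best C / gamma pair
```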

Construction of Artificial Intelligence Training Platform for Multi-Center Clinical Research (다기관 임상연구를 위한 인공지능 학습 플랫폼 구축)

  • Lee, Chung-Sub;Kim, Ji-Eon;No, Si-Hyeong;Kim, Tae-Hoon;Yoon, Kwon-Ha;Jeong, Chang-Won
    • KIPS Transactions on Computer and Communication Systems / v.9 no.10 / pp.239-246 / 2020
  • In the medical field, where artificial intelligence technology is being introduced, research on clinical decision support systems(CDSS) for diagnosis and prediction is being actively conducted. In particular, the medical imaging-based disease diagnosis area has applied AI technologies in various products. However, medical imaging data are inconsistent, and in practice it takes considerable time to prepare them for use in research. This paper describes a one-stop AI learning platform that converts medical images to the standard R_CDM(Radiology Common Data Model) and supports AI algorithm development research based on the resulting datasets. To this end, the focus is on linking with the existing CDM(common data model) and modeling the system, including the schema of the medical imaging standard model and report information for multi-center research, based on DICOM(Digital Imaging and Communications in Medicine) tag information. We also show execution results for datasets generated through the AI learning platform. The proposed platform is expected to be used for various image-based artificial intelligence studies.
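
As a rough illustration of the kind of DICOM-tag extraction such a platform relies on, the sketch below reads header metadata into a flat record using pydicom; the selected fields are assumptions for illustration and do not represent the platform's actual R_CDM schema or ETL code.

```python
# Sketch: reading DICOM header tags into a flat record with pydicom.
from pydicom import dcmread

def dicom_to_record(path):
    """Extract a few standard tags; pixel data is skipped for speed."""
    ds = dcmread(path, stop_before_pixels=True)
    return {
        "patient_id": ds.get("PatientID"),
        "study_date": ds.get("StudyDate"),
        "modality": ds.get("Modality"),
        "body_part": ds.get("BodyPartExamined"),
        "series_uid": ds.get("SeriesInstanceUID"),
    }

# records = [dicom_to_record(p) for p in dicom_paths]   # dicom_paths: study files
```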

Opportunities for Agricultural Water Management Interventions in the Krishna Western Delta - A case from Andhra Pradesh, India

  • Kumar, K. Nirmal Ravi
    • Agribusiness and Information Management / v.9 no.1 / pp.7-17 / 2017
  • Agricultural water management has gained enormous attention in the developing world as a way to alleviate poverty, reduce hunger, and conserve ecosystems in the small-scale production systems of resource-poor farmers. The story of food security in the 21st century in India is likely to be closely linked to the story of water security. Today, the water resource is under severe threat. Past experience in India in general, and in Andhra Pradesh in particular, indicates that inappropriate management of irrigation has led to severe problems such as excessive water depletion, reduction in water quality, waterlogging, salinization, marked reduction in the annual discharge of some rivers, lowering of groundwater tables due to pumping at unsustainable rates, and intrusion of salt water in some coastal areas. Considering the importance of irrigation water resource efficiency, the Krishna Western Delta (KWD) of Andhra Pradesh was purposively selected for this in-depth study, as the farming community in this area is severely affected by soil salinity and waterlogging problems, and hence the adoption of different water-saving crop production technologies deserves special mention. It is quite disappointing that canals, tube wells, filter points, and other wells could not contribute much to the irrigated area in KWD. Due to the low contribution from these sources, the net irrigated area also declined at a rate of -6.15 per cent. Regarding paddy production, both SRI and semi-dry cultivation technologies involve lower irrigation costs (Rs. 2475.21/ha and Rs. 3248.15/ha respectively) than transplanted technology (Rs. 4321.58/ha). The share of irrigation cost in Total Operational Cost (TOC) was highest for transplanted paddy (11.06%), followed by semi-dry technology (10.85%) and SRI technology (6.21%). The increased yield under SRI and the reduced cost of cultivation under semi-dry production were mainly responsible for the lower cost of production of paddy under SRI (Rs. 495.22/qtl) and semi-dry (Rs. 532.81/qtl) technologies compared to transplanted technology (Rs. 574.93/qtl). This clearly indicates that, with less water use, paddy returns can be boosted by adopting SRI and semi-dry production technologies. Both system-level and field-level interventions should be addressed to solve the problems of water management. The enabling environment, institutional roles and functions, and management instruments present a favourable picture for executing water management interventions in the State of Andhra Pradesh in general and in KWD in particular. This enables the farming community to harvest a good crop per unit of water used in the production programme. To achieve better results, the Farmers' Organizations, Water Users Associations, Department of Irrigation, etc., will have to aim at improving productivity per unit of water used, and this must be supported through system-wide enhancement of water delivery systems and decision support tools that assist farmers in optimizing the allocation of limited water among crops, selecting crops based on farming situations, and adopting appropriate alternative crops in drought years.

Necessity of Standardization and Standardized Method for Substances Accounting of Environmental Liability Insurance (환경책임보험 배출 물질 정산의 표준화 필요성 및 산출방법 표준화)

  • Park, Myeongnam;Kim, Chang-wan;Shin, Dongil
    • Journal of the Korean Institute of Gas / v.22 no.5 / pp.1-17 / 2018
  • Related incidents and accidents have occurred frequently since the 2000s, such as the Taean peninsula crude oil spill and the Gumi hydrofluoric acid leakage accident. In the wake of such environmental pollution accidents, a consensus was formed to enact legislation on liability for the compensation of environmental pollution damage and relief from it; the law was enacted in 2014 and has been in force since January 2016. Therefore, the environmental liability insurance system introduced in the domestic insurance industry needs to be managed through a standardized accounting formula for this new insurance model for managing environmental risk. This study was motivated by the emergence of an insurance model for hazardous risk types, which is one of the services of the knowledge base. Verification of emissions for the six coverage media related to environmental pollution, such as chemicals, waste, marine, and soil, is expressed with semantic interoperability through the proposed ontology. The insurance model was designed and presented by deriving the relationship between the accounted amounts and the amounts recorded in areas of existing expertise. To exclude possible adverse consequences, abstract concepts are conceptualized from the customer's perspective, and a plan for the future development of an ontology-based decision support system is proposed to reduce the cost and resources consumed every year. It is expected that standardizing the verification standard for substance amounts will minimize errors and reduce the time and resources required for verification.

NIR-TECHNOLOGY FOR RATIONALE SOIL ANALYSIS WITH IMPLICATIONS FOR PRECISION AGRICULTURE

  • Stenberg, Bo
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference / 2001.06a / pp.1061-1061 / 2001
  • The scope of precision agriculture is to reach the cultivation goals that have been set by adjusting inputs as precisely as possible to what is required by the soil and crop potentials, at a high spatial resolution. Consequently, precision agriculture is also often called site-specific agriculture. Regulation of field inputs "on the run" has been made possible by GPS (Global Positioning System) technology, which gives the farmer his exact real-time position in the field. The general goal of precision agriculture is to apply inputs where they best fill their purpose. Thus, resources can be saved, and nutrient losses as well as the impact on the environment can be minimized without lowering total yields or putting product quality at risk. As already indicated, the technology exists to regulate inputs based on decisions made beforehand. However, the real challenge is to provide a reliable basis for decision-making. To support high spatial resolution, extensive sampling and analysis are required for many soil and plant characteristics. The potential of NIR technology to provide rapid, low-cost analyses with a minimum of sample preparation for a multitude of characteristics therefore constitutes a far too irresistible opportunity to leave unscrutinized. In our work we have concentrated on soil analysis. The instrument we have used is a Bran Lubbe InfraAlyzer 500 (1300-2500 nm). Clay and organic matter contents are soil constituents with major implications for most properties and processes in the soil system. For these constituents we had a 3000-sample material provided. High-performance models for the agricultural areas of Sweden have been constructed for clay content, but a rather large reference material is required, probably due to the large variability of Swedish soils. By subdividing Sweden into six areas, the overall performance was improved. Unfortunately, organic matter was not as easy to get at: reliable models for larger areas could not be constructed. However, by keeping the mineral fraction of the soil at minimal variation, good performance could be achieved locally. The influence of a highly variable mineral fraction is probably one of the reasons for the contradictory results found in the literature regarding organic matter content. Tentative studies have also been performed to elucidate the potential performance in contexts with direct operational implications: lime requirement and prediction of plant uptake of soil nitrogen. In both cases there is no definitive reference method, but numerous indirect, or indicator, methods have been suggested. In our study, field experiments were used as references and NIR was compared with the methods normally used in Sweden. The NIR models performed as well as, or slightly better than, the standard methods in both situations. However, whether this is good enough is open for evaluation.
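
As a rough illustration of how such a calibration might be set up, the sketch below fits a cross-validated regression from NIR spectra to laboratory clay contents. PLS regression, the 10-fold cross-validation, and the placeholder arrays are assumptions for illustration; the abstract does not state which chemometric method was used.

```python
# Sketch: calibrating a model that predicts clay content from NIR spectra and
# reporting a cross-validated RMSE.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

def calibrate(spectra, clay, n_components=10):
    """spectra: (n_samples, n_wavelengths) over 1300-2500 nm; clay: reference values."""
    pls = PLSRegression(n_components=n_components)
    pred = cross_val_predict(pls, spectra, clay, cv=10).ravel()
    rmse = float(np.sqrt(np.mean((pred - clay) ** 2)))
    return pls.fit(spectra, clay), rmse

# spectra and clay are placeholders for a reference sample set such as the
# 3000-sample material mentioned above.
# model, cv_rmse = calibrate(spectra, clay)
```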
