• Title/Summary/Keyword: Interpretability

Mining Spatio-Temporal Patterns in Trajectory Data

  • Kang, Ju-Young;Yong, Hwan-Seung
    • Journal of Information Processing Systems / v.6 no.4 / pp.521-536 / 2010
  • Spatio-temporal patterns extracted from the historical trajectories of moving objects reveal important knowledge about movement behavior, which supports high-quality location-based services (LBS). Existing approaches transform trajectories into sequences of location symbols and derive frequent subsequences by applying conventional sequential pattern mining algorithms. However, spatio-temporal correlations may be lost when spatial and temporal properties are approximated inappropriately, and an inefficient description of temporal information reduces both mining efficiency and the interpretability of the patterns. In this paper, we address the problem of mining spatio-temporal patterns from trajectory data. We provide a formal statement of an efficient representation of spatio-temporal movements and propose a new approach to discovering spatio-temporal patterns in trajectory data. The proposed method first finds meaningful spatio-temporal regions and then extracts frequent spatio-temporal patterns from the sequences of these regions using a prefix-projection approach. Our experiments show that the proposed method improves mining performance and derives more intuitive patterns.
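The prefix-projection idea mentioned in this abstract can be illustrated with a minimal PrefixSpan-style sketch. It is not the authors' algorithm: it assumes each trajectory has already been reduced to a time-ordered sequence of discrete region labels, it ignores the temporal annotations the paper treats explicitly, and the names `prefix_span` and `trajectories` are hypothetical.

```python
from collections import defaultdict

def prefix_span(sequences, min_support):
    """Return {pattern tuple: support} for frequent subsequences.

    Simplified PrefixSpan: one symbol per element, no time constraints.
    """
    results = {}

    def mine(prefix, projected):
        # Count each symbol at most once per projected sequence.
        counts = defaultdict(int)
        for seq in projected:
            for symbol in set(seq):
                counts[symbol] += 1
        for symbol, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (symbol,)
            results[pattern] = support
            # Projection: keep only the suffix after the first occurrence.
            next_projected = [
                seq[seq.index(symbol) + 1:] for seq in projected if symbol in seq
            ]
            mine(pattern, next_projected)

    mine((), [list(s) for s in sequences])
    return results

# Hypothetical trajectories, each a time-ordered list of region labels.
trajectories = [
    ["A", "B", "C"],
    ["A", "C"],
    ["A", "B", "C", "D"],
    ["B", "C"],
]
patterns = prefix_span(trajectories, min_support=3)
for pattern, support in patterns.items():
    print(" -> ".join(pattern), support)
```

Because each recursive call only scans the suffixes that follow the current prefix, the search space shrinks as patterns grow, which is the usual motivation for prefix projection over candidate generation.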

Rule Selection Method in Decision Tree Models (의사결정나무 모델에서의 중요 룰 선택기법)

  • Son, Jieun;Kim, Seoung Bum
    • Journal of Korean Institute of Industrial Engineers / v.40 no.4 / pp.375-381 / 2014
  • Data mining is the process of discovering useful patterns or information from large amounts of data. The decision tree is a data mining algorithm that can be used for both classification and prediction, and it has been widely used in various applications because of its flexibility and interpretability. Decision trees for classification generally generate a number of rules, each belonging to one of the predefined categories, and several rules may belong to the same category. In this case, it is necessary to determine the significance of each rule so that rules can be presented to users in priority order. The purpose of this paper is to propose a rule selection method for classification tree models that accounts for the number of observations, accuracy, and effectiveness of each rule. Our experiments demonstrate that the proposed method performs better than existing rule selection methods.
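To make the idea of prioritizing rules within one category concrete, here is a small sketch that ranks leaf rules by a combined score. The abstract does not give the paper's formula, so the score used here (the harmonic mean of rule accuracy and coverage) is an illustrative assumption, the "effectiveness" criterion is left out, and `Rule`, `rule_score`, and `rank_rules` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    label: str
    n_obs: int      # training observations that reach this leaf (assumed >= 1)
    n_correct: int  # of those, how many carry the leaf's label

def rule_score(rule, total_obs):
    """Illustrative score: harmonic mean of accuracy and coverage."""
    accuracy = rule.n_correct / rule.n_obs
    coverage = rule.n_obs / total_obs
    return 2 * accuracy * coverage / (accuracy + coverage)

def rank_rules(rules, total_obs):
    """Sort rules from most to least significant under the score above."""
    return sorted(rules, key=lambda r: rule_score(r, total_obs), reverse=True)

# Hypothetical rules that all predict the same category.
rules = [
    Rule("r1", "churn", n_obs=50, n_correct=45),
    Rule("r2", "churn", n_obs=10, n_correct=10),
    Rule("r3", "churn", n_obs=120, n_correct=72),
]
ranked = rank_rules(rules, total_obs=200)
for rule in ranked:
    print(rule.name, round(rule_score(rule, 200), 3))
```

A perfectly accurate rule that covers few observations (r2) ranks below a less accurate but much broader rule (r3), which shows why the number of observations and accuracy have to be weighed together.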

Image Fusion for Improving Classification

  • Lee, Dong-Cheon;Kim, Jeong-Woo;Kwon, Jay-Hyoun;Kim, Chung;Park, Ki-Surk
    • Proceedings of the KSRS Conference / 2003.11a / pp.1464-1466 / 2003
  • Classification of satellite images provides information about land cover and/or land use. The quality of the classification result depends mainly on the spatial and spectral resolutions of the images. In this study, image fusion, in the form of resolution merging and band integration of multi-source satellite images (Landsat ETM+ and Ikonos), was carried out to improve classification. Resolution merging and band integration made it possible to generate high-resolution imagery with more spectral bands. Precise image co-registration is required to remove geometric distortion between images from different sources. A combination of unsupervised and supervised classification was applied to the fused imagery to improve classification. Combining a DEM with the classification result made a 3D display of the results possible, which improved interpretability.

A Study on Explainable Artificial Intelligence-based Sentimental Analysis System Model

  • Song, Mi-Hwa
    • International Journal of Internet, Broadcasting and Communication / v.14 no.1 / pp.142-151 / 2022
  • This paper presents a model that combines machine learning-based sentiment analysis with explainable artificial intelligence (XAI) models in order to secure the reliability of its analysis and predictions. The applicability of the proposed model was tested and described using the IMDB dataset. An advantage of this approach is that it can explain, from various perspectives, how the data affect the model's prediction results. In applications of sentiment analysis such as recommendation systems, emotion analysis through facial expression recognition, and opinion analysis, presenting more specific, evidence-based analysis results makes it possible to gain the trust of the system's users.

A novel heuristic for handover priority in mobile heterogeneous networks based on a multimodule Takagi-Sugeno-Kang fuzzy system

  • Zhang, Fuqi;Xiao, Pingping;Liu, Yujia
    • ETRI Journal / v.44 no.4 / pp.560-572 / 2022
  • H2RDC (heuristic handover based on RCC-DTSK-C), a heuristic algorithm based on a highly interpretable deep Takagi-Sugeno-Kang fuzzy classifier, is proposed to suppress frequent handover and handover ping-pong in mobile heterogeneous networks in the multibase-station scenario. The classifier stacks subsystems to form a deep classifier, which generates a base station (BS) priority sequence during the handover process; an adaptive handover hysteresis is then calculated. Simulation results show that H2RDC allows user equipment to switch to the best antenna at the optimal time. In high-BS density load and mobility scenarios, the proposed algorithm's handover success rate is similar to that of classic algorithms such as best connection (BC), self tuning handover algorithm (STHA), and heuristic for handover based on AHP-TOPSIS-FUZZY (H2ATF). Moreover, the handover rate is 83% lower under H2RDC than under BC, and the handover ping-pong rate is 76% lower.

Sequence Anomaly Detection based on Diffusion Model (확산 모델 기반 시퀀스 이상 탐지)

  • Zhiyuan Zhang;Inwhee, Joe
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.2-4 / 2023
  • Sequence data plays an important role in the field of intelligence, especially in industrial control, traffic control, and related areas. Finding abnormal parts in sequence data has long been an application field of AI technology. In this paper, we propose an anomaly detection method for sequence data using a diffusion model. The diffusion model has two major advantages: interpretability derived from rigorous mathematical derivation and unrestricted selection of backbone models. The method uses the diffusion model to predict and reconstruct the sequence data, and then detects the abnormal part by comparing the reconstruction with the real data. This paper verifies the feasibility of the diffusion model in the field of anomaly detection. We combine an MLP with the diffusion model to generate data and compare the generated data with the real data to detect anomalous points.
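The detection step described in this abstract (reconstruct the sequence, then compare it with the real data) can be sketched independently of the generative model. In the example below a neighborhood median stands in for the diffusion-model reconstruction, which is a deliberate simplification and not the authors' method; `reconstruct`, `detect_anomalies`, the window size, and the threshold are all hypothetical choices.

```python
from statistics import median

def reconstruct(series, window=2):
    """Stand-in reconstructor: median of each point's neighbors.

    A trained diffusion model would take this role in the paper's setting.
    Requires at least two points so every neighborhood is non-empty.
    """
    out = []
    for i in range(len(series)):
        lo = max(0, i - window)
        hi = min(len(series), i + window + 1)
        neighbors = series[lo:i] + series[i + 1:hi]  # exclude the point itself
        out.append(median(neighbors))
    return out

def detect_anomalies(series, threshold):
    """Return indices whose reconstruction error exceeds the threshold."""
    recon = reconstruct(series)
    return [
        i for i, (actual, expected) in enumerate(zip(series, recon))
        if abs(actual - expected) > threshold
    ]

# Hypothetical sensor readings with one injected spike at index 4.
signal = [1.0, 1.1, 0.9, 1.0, 5.0, 1.0, 1.1, 0.9, 1.0]
anomalies = detect_anomalies(signal, threshold=1.0)
print(anomalies)
```

Only the reconstructor changes when a learned model is swapped in; the comparison against the real data and the thresholding stay the same.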

Combining AutoML and XAI: Automating machine learning models and improving interpretability (AutoML 과 XAI 의 결합 : 기계학습 모델의 자동화와 해석력 향상을 위하여)

  • Min Hyeok Son;Nam Hun Kim;Hyeon Ji Lee;Do Yeon Kim
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.924-925 / 2023
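  • This study focuses on the growing complexity of machine learning models and on the problem of interpreting models that are perceived as "black boxes". To address this, we used AutoML techniques to search efficiently for the best model and introduced XAI techniques to secure transparency in the model's prediction process. We confirmed that the approach incorporating XAI techniques provides better interpretability than traditional methods and allows users to clearly understand the basis of a machine learning model's predictions and their validity.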

Application of Artificial Intelligence in Gastric Cancer (위암에서 인공지능의 응용)

  • Jung In Lee
    • Journal of Digestive Cancer Research / v.11 no.3 / pp.130-140 / 2023
  • Gastric cancer (GC) is one of the most common malignant tumors worldwide, with a 5-year survival rate of < 40%. The diagnosis and treatment decisions of GC rely on human experts' judgments of medical images; therefore, the accuracy can be hindered by image condition, objective criterion, limited experience, and interobserver discrepancy. In recent years, several applications of artificial intelligence (AI) have emerged in the GC field, driven by improvements in computational power and deep learning algorithms. AI can support various clinical practices in endoscopic examination, pathologic confirmation, radiologic staging, and prognosis prediction. This review systematically summarizes the current status of AI applications after a comprehensive literature search. Although current approaches are challenged by data scarcity and poor interpretability, future work in this field is likely to overcome these challenges and enhance accuracy and applicability in clinical practice.

Hourly Prediction of Particulate Matter (PM2.5) Concentration Using Time Series Data and Random Forest (시계열 데이터와 랜덤 포레스트를 활용한 시간당 초미세먼지 농도 예측)

  • Lee, Deukwoo;Lee, Soowon
    • KIPS Transactions on Software and Data Engineering / v.9 no.4 / pp.129-136 / 2020
  • PM2.5, a very fine airborne particulate matter even smaller than PM10, has become a prominent environmental problem. Since PM2.5 can cause eye diseases or respiratory problems and can even infiltrate deep blood vessels in the brain, it is important to predict PM2.5 concentrations. However, prediction is difficult because there is not yet a clear explanation of how PM2.5 is created and how it moves. Thus, prediction methods are needed that not only predict PM2.5 accurately but also yield interpretable results. To predict hourly PM2.5 in Seoul, we propose a method that applies a random forest with an adjusted bootstrap number to time series ground data preprocessed from different sources. With this method, the prediction model can be trained uniformly on hourly information, and the result is interpretable. To evaluate prediction performance, we conducted comparative experiments, in which the proposed method outperformed the other models on all labels. The proposed method also revealed the importance of the variables related to the creation of PM2.5 and the effect of China.

A Research on Yield Prediction of Mixed Pastures in Korea via Model Construction in Stages (혼파초지에서 모형의 단계적 적용을 통한 수량예측 연구)

  • Oh, Seung Min;Kim, Moon Ju;Peng, Jinglun;Lee, Bae Hun;Kim, Ji Yung;Kim, Byong Wan;Jo, Mu Hwan;Sung, Kyung Il
    • Journal of The Korean Society of Grassland and Forage Science / v.37 no.1 / pp.80-91 / 2017
  • The objective of this study was to select a yield prediction model for mixed pastures with high interpretability (explanatory power), as indicated by a high R-squared value, by adding the factors of fertilization, seeding rate, and years after pasture establishment in steps, with climate as the basic factor. The yield prediction model was constructed in the sequence of data collection (forage and climatic data), preparation, analysis, and model construction. Through this process, six models were constructed by considering, in steps, climatic variables, fertilization management, seeding rates, and periods after pasture establishment; the optimum model was then selected by considering how well the models agreed with forage production theories. As a result, Model VI (R-squared = 53.8%), which includes climatic variables, fertilization amount, seeding rates, and periods after pasture establishment, was considered the optimum yield prediction model for mixed pastures in South Korea. The interpretability of the independent variables in the model decreased in the order of climatic variables (24.5%), fertilization amount (17.8%), seeding rates (10.7%), and periods after pasture establishment (0.8%). However, the reasons for the positive correlation between dry matter yield and days of summer depression (DSD) need to be investigated by considering cultivated locations and by using other cumulative temperature-related variables instead of DSD. In addition, further research on the optimum levels of fertilization amount and seeding rate is required, using quadratic terms, because the values of these two variables were concentrated around certain values.
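The four percentages reported for the independent variables add up to the R-squared of Model VI, which suggests that each figure is that factor's share of the explained variance. This reading is inferred from the numbers and is not stated in the abstract. A short check of the arithmetic, with hypothetical variable names:

```python
# Percentages as reported in the abstract for Model VI.
contributions = {
    "climatic variables": 24.5,
    "fertilization amount": 17.8,
    "seeding rates": 10.7,
    "periods after pasture establishment": 0.8,
}

# Rounded to one decimal to avoid floating-point noise in the sum.
total = round(sum(contributions.values()), 1)
print(f"sum of contributions: {total}%")

# Share of the explained variance attributed to each factor (derived here).
for name, value in contributions.items():
    print(f"{name}: {value / total:.1%} of explained variance")
```

Under this reading, climatic variables account for a little under half of what the model explains, while the period after pasture establishment contributes almost nothing.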