• Title/Summary/Keyword: Feature Importance Analysis

A Semantic Analysis of One Prodiscourse Maker in Korean:kulay (담화대용표지{그래}의 의미 연구)

  • 신현숙
    • Korean Journal of Cognitive Science
    • /
    • v.2 no.1
    • /
    • pp.143-165
    • /
    • 1990
  • I will discuss some aspects of the meaning of the prodiscourse marker 'kulay' in Korean. This marker has been studied by few scholars, since Korean linguists have shown little interest in this category of linguistic form. They also did not realize the importance of discourse and discourse markers. As a result, we have only shallow information about prodiscourse phenomena and prodiscourse markers. Morphologically, 'kulay (그래)' could be analyzed into 'ku (그)' and 'lay (래)', and 'lay (래)' could be divided again into 'l (ㄹ)' and 'ay (ㅐ)'. But I will discuss 'kulay' as one linguistic unit without division. It will be claimed in this paper that the [prodiscourse] feature and the [discourse continuity] feature together can satisfactorily account for the core meaning of 'kulay'. It will also be mentioned that the marker has many kinds of specific meaning depending on the particular discourse. I would also like to examine the semantic feature ([prodiscourse + discourse continuity]) in many kinds of Korean discourse. And I will show that some factors related to the marker's specific meaning are the meaning of the preceding and following discourse and the participants' psychological attitude. The conclusion is that the meaning of 'kulay' can help us understand certain phenomena about prodiscourse and prodiscourse markers in the Korean language. The various meanings of 'kulay' can also give more information to applied Korean linguistics.

Predicting Determinants of Seoul-Bike Data Using Optimized Gradient-Boost (최적화된 Gradient-Boost를 사용한 서울 자전거 데이터의 결정 요인 예측)

  • Kim, Chayoung;Kim, Yoon
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.6
    • /
    • pp.861-866
    • /
    • 2022
  • Seoul introduced the shared bicycle system "Seoul Public Bike" in 2015 to help reduce traffic volume and air pollution. Several studies are being conducted to solve problems arising from the supply and demand of this system. Most of this research concerns strategic "Bicycle Rearrangement" to address the imbalance between supply and demand. Moreover, most of these studies predict demand by grouping features such as weather or season. Previous studies predicted demand by time-series analysis; recently, studies that predict demand using deep learning or machine learning have been emerging. In this paper, we show that demand prediction can be improved slightly by discovering new features or by ordering the importance of various features based on well-known feature patterns. With this selection of new features and ordering of feature importance, a better coefficient of determination can be obtained even when well-known deep learning, machine learning, or time-series analysis methods are used as they are. Therefore, this approach could provide better demand prediction.

A sensitivity analysis of machine learning models on fire-induced spalling of concrete: Revealing the impact of data manipulation on accuracy and explainability

  • Mohammad K. al-Bashiti;M.Z. Naser
    • Computers and Concrete
    • /
    • v.33 no.4
    • /
    • pp.409-423
    • /
    • 2024
  • Using an extensive database, a sensitivity analysis across fifteen machine learning (ML) classifiers was conducted to evaluate the impact of various data manipulation techniques, evaluation metrics, and explainability tools. The results of this sensitivity analysis reveal that the examined models can achieve an accuracy ranging from 72-93% in predicting the fire-induced spalling of concrete and denote the light gradient boosting machine, extreme gradient boosting, and random forest algorithms as the best-performing models. Among such models, the six key factors influencing spalling were maximum exposure temperature, heating rate, compressive strength of concrete, moisture content, silica fume content, and the quantity of polypropylene fiber. Our analysis also documents some conflicting results observed with the deep learning model. As such, this study highlights the necessity of selecting suitable models and carefully evaluating the presence of possible outcome biases.

Method of Extracting the Topic Sentence Considering Sentence Importance based on ELMo Embedding (ELMo 임베딩 기반 문장 중요도를 고려한 중심 문장 추출 방법)

  • Kim, Eun Hee;Lim, Myung Jin;Shin, Ju Hyun
    • Smart Media Journal
    • /
    • v.10 no.1
    • /
    • pp.39-46
    • /
    • 2021
  • This study presents a method of extracting a summary from a news article in consideration of the importance of each sentence constituting the article. We propose a method of calculating sentence importance by extracting the topic sentence probability, the similarity with the article title and other sentences, and the sentence position as characteristics that affect sentence importance. We hypothesize that a topic sentence has characteristics distinct from a general sentence, and a deep learning-based classification model is trained to obtain a topic sentence probability value for the input sentence. Also, using the pre-trained ELMo language model, the similarity between sentences is calculated based on sentence vectors that reflect context information and is extracted as a sentence characteristic. The topic sentence classification performance of the LSTM and BERT models reached 93% accuracy, 96.22% recall, and 89.5% precision. When the importance of each sentence was calculated by combining the extracted sentence characteristics, the performance of extracting the topic sentence improved by about 10% compared to the existing TextRank algorithm.

Assessment of Landslide Susceptibility in Jecheon Using Deep Learning Based on Exploratory Data Analysis (데이터 탐색을 활용한 딥러닝 기반 제천 지역 산사태 취약성 분석)

  • Sang-A Ahn;Jung-Hyun Lee;Hyuck-Jin Park
    • The Journal of Engineering Geology
    • /
    • v.33 no.4
    • /
    • pp.673-687
    • /
    • 2023
  • Exploratory data analysis is the process of observing and understanding data collected from various sources to identify their distributions and correlations through their structures and characterization. This process can be used to identify correlations among conditioning factors and select the most effective factors for analysis. This can help the assessment of landslide susceptibility, because landslides are usually triggered by multiple factors, and the impacts of these factors vary by region. This study compared two stages of exploratory data analysis to examine the impact of the data exploration procedure on the landslide prediction model's performance with respect to factor selection. Deep-learning-based landslide susceptibility analysis used either a combination of selected factors or all 23 factors. During the data exploration phase, we used a Pearson correlation coefficient heat map and a histogram of random forest feature importance. We then assessed the accuracy of our deep-learning-based analysis of landslide susceptibility using a confusion matrix. Finally, a landslide susceptibility map was generated using the landslide susceptibility index derived from the proposed analysis. The analysis revealed that using all 23 factors resulted in low accuracy (55.90%), but using the 13 factors selected in one step of exploration improved the accuracy to 81.25%. This was further improved to 92.80% using only the nine conditioning factors selected during both steps of the data exploration. Therefore, exploratory data analysis selected the conditioning factors most suitable for landslide susceptibility analysis and thereby improved the performance of the analysis.
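
The correlation-based step of the factor selection described in this abstract can be illustrated with a minimal sketch. It assumes a greedy filter that drops any factor whose absolute Pearson correlation with an already kept factor reaches a threshold; the threshold of 0.9, the factor names, and the toy values are all hypothetical and are not taken from the study, which does not state its selection rule.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def drop_correlated(factors, threshold=0.9):
    """Greedily keep factors whose |r| with every kept factor is below threshold.

    The threshold is an assumed value for illustration only.
    """
    kept = []
    for name in factors:
        if all(abs(pearson(factors[name], factors[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy values, not data from the study.
factors = {
    "slope": [10.0, 20.0, 30.0, 40.0, 50.0],
    "slope_scaled": [1.0, 2.0, 3.0, 4.0, 5.0],  # perfectly correlated with "slope"
    "soil_depth": [2.0, 1.0, 4.0, 3.0, 1.5],
}
selected = drop_correlated(factors)
print(selected)
```

Here the redundant factor is removed while the weakly correlated one is kept. A second filter based on model-derived importance, as the abstract mentions, would be applied separately.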

EEG Feature Engineering for Machine Learning-Based CPAP Titration Optimization in Obstructive Sleep Apnea

  • Juhyeong Kang;Yeojin Kim;Jiseon Yang;Seungwon Chung;Sungeun Hwang;Uran Oh;Hyang Woon Lee
    • International journal of advanced smart convergence
    • /
    • v.12 no.3
    • /
    • pp.89-103
    • /
    • 2023
  • Obstructive sleep apnea (OSA) is one of the most prevalent sleep disorders and can lead to serious consequences, including hypertension and/or cardiovascular diseases, if not treated promptly. Continuous positive airway pressure (CPAP) is widely recognized as the most effective treatment for OSA, but it requires proper titration of airway pressure to achieve the best results. However, the process of CPAP titration can be time-consuming and cumbersome, so predicting personalized CPAP pressure before treatment is increasingly important. The primary objective of this study was to optimize the CPAP titration process for OSA patients through EEG feature engineering with machine learning techniques. We aimed to identify and utilize the most critical EEG features to forecast key OSA predictive indicators, ultimately facilitating more precise and personalized CPAP treatment strategies. We analyzed the PSG datasets of 126 OSA patients before and after CPAP treatment. We extracted 29 EEG features and applied the SHapley Additive exPlanations (SHAP) method to identify those with high importance for the OSA prediction indices, AHI and SpO2. From the extracted features, we confirmed six EEG features that had high importance in predicting AHI and SpO2 using XGBoost, Support Vector Machine regression, and Random Forest regression. By utilizing the predictive capabilities of EEG-derived features for AHI and SpO2, we can better understand and evaluate the condition of patients undergoing CPAP treatment. The ability to predict these key indicators accurately provides more immediate insight into the patient's sleep quality and potential disturbances. This not only ensures the efficiency of the diagnostic process but also provides a more tailored and effective treatment approach. Consequently, the integration of EEG analysis into the sleep study protocol has the potential to revolutionize sleep diagnostics, offering a time-saving and ultimately more effective evaluation for patients with sleep-related disorders.
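
The idea behind the Shapley-value attribution that SHAP builds on can be shown with a small sketch. It computes exact Shapley values by enumerating every feature ordering, which is only feasible for a handful of features; the SHAP library itself uses approximations and is not used here. The linear model, its weights, the all-zero baseline, and the three unnamed inputs are hypothetical stand-ins, not the EEG features or models from the study.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values for one input x relative to a baseline input.

    A feature that is "absent" from a coalition takes its baseline value.
    Each feature's marginal contribution is averaged over all orderings.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]          # add feature i to the coalition
            new = model(current)
            phi[i] += new - prev       # marginal contribution of feature i
            prev = new
    return [p / len(orders) for p in phi]

def toy_model(z):
    """Hypothetical linear predictor; the third feature has no effect."""
    return 2.0 * z[0] + 1.0 * z[1] + 0.0 * z[2]

x = [1.0, 3.0, 5.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

For a linear model each value equals the weight times the feature's offset from the baseline, and the values sum to the difference between the prediction for `x` and the prediction for the baseline.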

Importance of Meta-Analysis and Practical Obstacles in Oncological and Epidemiological Studies: Statistics Very Close but Also Far!

  • Tanriverdi, Ozgur;Yeniceri, Nese
    • Asian Pacific Journal of Cancer Prevention
    • /
    • v.16 no.3
    • /
    • pp.1303-1306
    • /
    • 2015
  • Studies of epidemiological and prognostic factors are very important for oncology practice. There is a rapidly increasing amount of research and resultant knowledge in the scientific literature. This means that health professionals face major challenges in accessing relevant information, and they increasingly require the best available evidence to make their clinical decisions. Meta-analyses of prognostic and other epidemiological factors are very practical statistical approaches to define clinically important parameters. However, they also feature many obstacles in terms of data collection, standardization of results from multiple centers, bias, and commentary for interpretation. In this paper, the obstacles of meta-analysis are briefly reviewed, and potential problems with this statistical method are discussed.

Issues Related to the Modeling of Solid Oxide Fuel Cell Stacks

  • Yang Shi;Ramakrishna P.A.;Sohn Chang-Hyun
    • Journal of Mechanical Science and Technology
    • /
    • v.20 no.3
    • /
    • pp.391-398
    • /
    • 2006
  • This work involves a method for modeling the flow distribution in the stack of a solid oxide fuel cell. Towards this end, three-dimensional modeling of the flow through a Solid Oxide Fuel Cell (SOFC) stack was carried out using CFD analysis. This paper examines the efficacy of using cold flow analysis to describe the flow through an SOFC stack. It brings out the relative importance of the temperature effect and the mass transfer effect on the SOFC manifold design. Another feature of this study is the use of statistical tools to ascertain the extent of uniform flow through a stack. The results showed that cold flow analysis of the flow through an SOFC might not lead to correct manifold designs. The results of the numerical calculations also indicated that mass transfer across the membrane was essential to correctly describe the cathode flow, while temperature effects alone were sufficient to describe the anode flow in an SOFC.

Human Emotion Recognition based on Variance of Facial Features (얼굴 특징 변화에 따른 휴먼 감성 인식)

  • Lee, Yong-Hwan;Kim, Youngseop
    • Journal of the Semiconductor & Display Technology
    • /
    • v.16 no.4
    • /
    • pp.79-85
    • /
    • 2017
  • Understanding human emotion is highly important in interaction between humans and machine communication systems. The most expressive and valuable way to extract and recognize human emotion is facial expression analysis. This paper presents and implements an automatic scheme for extracting and recognizing facial expression and emotion from a still image. The method has three main steps to recognize the facial emotion: (1) detection of facial areas with a skin-color method and feature maps, (2) creation of Bezier curves on the eye map and mouth map, and (3) classification of the emotion using the Hausdorff distance. To estimate the performance of the implemented system, we evaluate the success ratio on an emotional face image database that is commonly used in the field of facial analysis. The experimental results show an average success rate of 76.1% in classifying the facial expression and emotion.
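
The third step of the method in this abstract, classification by Hausdorff distance, can be sketched as a nearest-template lookup. This is a minimal illustration that assumes curves are already available as sampled 2D points; the Bezier fitting and face detection steps are not reproduced, and the template shapes, labels, and coordinates are hypothetical.

```python
from math import dist

def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def classify(observed, templates):
    """Return the label of the template closest to the observed points."""
    return min(templates, key=lambda label: hausdorff(observed, templates[label]))

# Hypothetical mouth-curve samples (x, y); not taken from the paper.
templates = {
    "smile": [(0.0, 0.0), (1.0, -1.0), (2.0, 0.0)],
    "neutral": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
}
observed = [(0.0, 0.0), (1.0, -0.8), (2.0, 0.0)]
label = classify(observed, templates)
print(label)
```

The observed curve lies 0.2 from the "smile" template and 0.8 from the "neutral" one, so the nearest-template rule selects "smile".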

Analysis of Occupational Injury and Feature Importance of Fall Accidents on the Construction Sites using Adaboost (에이다 부스트를 활용한 건설현장 추락재해의 강도 예측과 영향요인 분석)

  • Choi, Jaehyun;Ryu, HanGuk
    • Journal of the Architectural Institute of Korea Structure & Construction
    • /
    • v.35 no.11
    • /
    • pp.155-162
    • /
    • 2019
  • The construction industry accounts for the largest share of safety accidents in Korea, at 28.55% of accidents across all industries. In particular, falls are the most common accident type, making up 60.16% of accidents in the construction field. Therefore, we analyzed the major factors affecting fall accidents and derived feature importances by considering various variables. We used data collected from the Korea Occupational Safety & Health Agency (KOSHA) for training and prediction in the proposed model. We attempted to predict the degree of occupational fall accidents using the machine learning model AdaBoost, short for Adaptive Boosting. AdaBoost is a machine learning meta-algorithm that can be used in conjunction with many other types of learning algorithms to improve performance. Decision trees were combined with AdaBoost in this model to predict and classify the degree of occupational fall accidents. Hyperopt was also used to optimize hyperparameters, combined with stratified k-fold cross-validation. We extracted and analyzed feature importances and the factors affecting fall accidents using the permutation technique. In this study, we verified the predicted degree of fall accidents in terms of predictive accuracy. The machine learning model was also confirmed to be applicable to safety accident analysis at construction sites. In the future, if safety accident data are accumulated automatically and in real time in a network system using IoT (Internet of Things) technology at the construction site, it will be possible to analyze the factors and types of accidents according to site conditions from the real-time data.
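
The permutation technique mentioned in this abstract can be sketched in a few lines: shuffle one feature column at a time and measure how much the model's accuracy drops. This is an illustration under stated assumptions, not the authors' pipeline. A fixed hypothetical threshold rule stands in for the trained AdaBoost model, and the two features and their values are invented for the example.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical stand-in classifier: severe (1) if feature 0 exceeds 0.5."""
    return int(row[0] > 0.5)

# Hypothetical rows: [normalized fall height, unrelated feature].
rows = [[0.1, 0.9], [0.2, 0.1], [0.3, 0.5], [0.4, 0.7],
        [0.6, 0.2], [0.7, 0.8], [0.8, 0.3], [0.9, 0.6]]
labels = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

Because the stand-in model ignores the second feature, shuffling it never changes a prediction and its importance is exactly zero, while shuffling the first feature lowers accuracy.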