• Title/Summary/Keyword: variable feature


Clustering Analysis by Customer Feature based on SOM for Predicting Purchase Pattern in Recommendation System (추천시스템에서 구매 패턴 예측을 위한 SOM기반 고객 특성에 의한 군집 분석)

  • Cho, Young Sung;Moon, Song Chul;Ryu, Keun Ho
    • Journal of the Korea Society of Computer and Information / v.19 no.2 / pp.193-200 / 2014
  • With the advent of the ubiquitous computing environment, ubiquitous computing is becoming part of everyday life, and a tremendous amount of information is accumulating rapidly. Under these trends, finding exact information in large data sets and presenting it to users is becoming a very important technology. Collaborative filtering, a method based on other users' preferences, cannot reflect the exact attributes of a user and still suffers from sparsity and scalability problems, although it has been used in practice. In this paper, we propose a clustering method based on user features using SOM (self-organizing map) for predicting purchase patterns in u-Commerce. Clusters must be formed by similarity of user features so that they reflect the attributes of the customer information and so that items with the same propensity can be found quickly within a cluster. The proposed method performs clustering by applying feature-vector variables derived from user information and RFM factors based on purchase history data. To verify the improved performance of the proposed system, we conduct experiments with a data set collected from an Internet cosmetics shopping mall.
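
The abstract above mentions RFM factors built from purchase history but does not spell them out. The following is a minimal sketch, assuming the common reading of RFM as recency (days since the last purchase), frequency (number of purchases), and monetary value (total spend). The function name `rfm_features`, the record layout, and the sample data are hypothetical and are not taken from the paper; the SOM clustering step itself is not shown.

```python
from collections import defaultdict
from datetime import date


def rfm_features(purchases, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    `purchases` is an iterable of (customer_id, purchase_date, amount)
    tuples; this layout is an assumption made for illustration.
    """
    last_seen = {}
    count = defaultdict(int)
    total = defaultdict(float)
    for customer, day, amount in purchases:
        # Track the most recent purchase date per customer.
        if customer not in last_seen or day > last_seen[customer]:
            last_seen[customer] = day
        count[customer] += 1
        total[customer] += amount
    return {
        c: ((today - last_seen[c]).days, count[c], total[c])
        for c in last_seen
    }


# Hypothetical purchase history, for illustration only.
purchases = [
    ("u1", date(2014, 1, 5), 30.0),
    ("u1", date(2014, 2, 1), 20.0),
    ("u2", date(2014, 1, 20), 55.0),
]
features = rfm_features(purchases, today=date(2014, 2, 10))
```

Each customer's tuple could then be concatenated with other user attributes to form the feature vector that a clustering method consumes.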

Organization of Profitable Cattle Husbandry Through Exploiting Favourable Environment Factors (환경요인을 적절하게 이용한 경제성 있는 축산조직 -헝가리의 사례연구-)

  • Alpha, Gyorgy;Kim, Jong-Moo
    • Korean Journal of Organic Agriculture / v.7 no.2 / pp.89-97 / 1999
  • Through the principles of commodity production, a spatial division of labour can be observed in agriculture in general as well as in cattle production. Better adjustment of the production structure to environmental factors brings higher yields and more effective production. To maximize profit, entrepreneurs opt for output that closely matches their local conditions. In contrast to the relatively high "mobility" of crop production, animal husbandry, and within it cattle production, is known to be strictly tied to forage production. On the basis of our economic research and a multivariable analysis (factor analysis), it can be concluded that two variable groups (factors) are highly dominant in organizing profitable cattle production. The first is the crop site factor (indicated by the gold crown value); the second is the forage production feature (forage and grassland area and their yields). In recent years the weight of environmental factors has been devalued. As a result of central economic administration, differentiating effects were suppressed and equalizing concepts gained ground. The outcome can be observed even today. For example, in regions predominantly suited to grass and forage cropping, milk and slaughter cattle production decreased. The same is true for corn and pig production regions. Failure to exploit local environmental features can be observed mainly in grassland management. Branches that are potential users of grasslands hardly take them into consideration. The main method of rational grassland use is pasturing. The presence of pastures and their use through cattle production is highly important not only for profitable production but also for maintaining ecological stability.


A Comparative Study on the Symbolism of the Combination of Animals One Another in East Asian Comedic Stories and Proverbs (동아시아 소화(笑話)·속담(俗談)속의 동물조합 상징성 비교)

  • Keum, Young-Jin
    • Cross-Cultural Studies / v.42 / pp.205-240 / 2016
  • The combination of animals has developed in each cultural sphere as a method of metaphor and as a symbol of the cultural code. However, its symbolism is not fixed; it is variable and relative. This work examines its features by comparing comedic stories and proverbs across the East Asian cultural spheres. Several features were identified. First, the combinations of animals in similar comedic stories and proverbs from Korea, Japan and China show a difference in point of view. Korean examples focus on the difference between the two animals, whereas Chinese and Japanese examples focus on differences in value and level. Second, anthropomorphization is relatively more developed in China and Japan than in Korea. The combinations of animals in Chinese comedic stories and proverbs, particularly where anthropomorphized, focus most on the age and sex of the animal. Unlike in Chinese proverbs, the animal's age or sex remains mostly undetermined in Korean animal proverbs. The two animals in Japanese comedic stories and proverbs, on the other hand, usually take the form of a male and a female. Third, the Chinese and Japanese combinations of animals focus on the animal's body and its characteristic actions, combining the characteristics of the two animals' bodies and actions. This feature apparently led to combinations of animal body parts, for example the Dragon. Understanding the combinations of two animals is a good portal into the features of the East Asian cultural sphere.

Code Coverage Measurement in Configurable Software Product Line Testing (구성가능한 소프트웨어 제품라인 시험에서 코드 커버리지 측정)

  • Han, Soobin;Lee, Jihyun;Go, Seoyeon
    • KIPS Transactions on Software and Data Engineering / v.11 no.7 / pp.273-282 / 2022
  • Testing approaches for configurable software product lines differ significantly from testing a single software product, as they require consideration of the common parts used by all member products of a product line and the variable parts shared by some products or used by a single product. Test coverage is a measure of the adequacy of the testing performed. Test coverage measurements are important for evaluating the adequacy of testing at the software product line level, as hundreds of member products can be produced from a configurable software product line. This paper proposes a method for measuring code coverage at the product line level in configurable software product lines. The proposed method tests the member products of a product line after arranging them in a hierarchy based on the inclusion relationship of their selected features, and quantifies SPL (Software Product Line) test coverage by synthesizing the test coverage of each product. By applying the proposed method to 11 configurable software product line cases, we confirmed that it could quantitatively visualize how thoroughly the SPL testing was performed, which helps verify the adequacy of the SPL testing. In addition, we could check whether newly performed testing for a member product covers the newly added code parts of a feature.
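
The abstract above says that product-line coverage is quantified by synthesizing the coverage of each member product, without giving the formula. The following is a minimal sketch under one plausible assumption: a line counts as covered at the product line level if the tests of any member product executed it, and the denominator is the union of all lines across the products. The function name `spl_coverage`, the line identifiers, and the two sample products are hypothetical; the paper's actual synthesis rule and its feature-inclusion hierarchy may differ.

```python
def spl_coverage(products):
    """Union-based line coverage across member products.

    `products` maps a product name to a pair (all_lines, covered_lines),
    where both elements are sets of line identifiers. Taking the union
    across products is an assumption, not the paper's stated formula.
    """
    all_lines = set()
    covered = set()
    for lines, hit in products.values():
        all_lines |= lines
        # Ignore any reported hits that are not part of the product.
        covered |= hit & lines
    return len(covered) / len(all_lines) if all_lines else 0.0


# Hypothetical product line: "premium" includes every feature of "basic".
products = {
    "basic": (
        {"core:1", "core:2", "core:3"},
        {"core:1", "core:2"},
    ),
    "premium": (
        {"core:1", "core:2", "core:3", "pay:1", "pay:2"},
        {"core:3", "pay:1"},
    ),
}
ratio = spl_coverage(products)
```

In this toy case neither product alone reaches the combined figure, which illustrates why a product-line-level measure can differ from per-product coverage.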

Contrast Media Side Effects Prediction Study using Artificial Intelligence Technique (인공지능 기법을 이용한 조영제 부작용 예측 연구)

  • Sang-Hyun Kim
    • Journal of the Korean Society of Radiology / v.17 no.3 / pp.423-431 / 2023
  • The purpose of this study is to analyze, using artificial intelligence techniques, the factors that affect the classification of the severity of contrast media side effects based on patients' body information, in order to provide basic data for reducing the degree of contrast medium side effects. The data comprised 606 examinees with no contrast medium side effects recorded in their past history survey, drawn from 1,235 cases of contrast medium side effects among 58,000 CT scans performed at a general hospital in Seoul. Of the 606 records, 70% were used as a training set and the remaining 30% as a test set for validation. Age, BMI (Body Mass Index), GFR (Glomerular Filtration Rate), BUN (Blood Urea Nitrogen), GGT (Gamma Glutamyl Transferase), AST (Aspartate Amino Transferase), and ALT (Alanine Amino Transferase) were used as independent variables, and the severity of contrast media side effects was used as the target variable. AUC (Area Under Curve), CA (Classification Accuracy), F1, Precision, and Recall were obtained for the AdaBoost, Tree, Neural Network, SVM, and Random Forest algorithms. AdaBoost and Random Forest showed the highest evaluation indices among the classification algorithms. The largest factors in the predictions of all models were GFR, BMI, and GGT. It was found that the difference in the amount of contrast media injected according to renal filtration function and obesity, and the presence or absence of metabolic syndrome, affected the severity of contrast medium side effects.
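
The evaluation indices named in the abstract above (CA, Precision, Recall, F1) follow standard definitions. The following is a minimal sketch for the binary case, which is a simplification: the study's severity target may have more than two classes, and AUC is omitted because it needs predicted scores rather than labels. The function name `binary_metrics` and the label vectors are hypothetical and are not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy (CA), precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "CA": (tp + tn) / len(pairs),
        "Precision": precision,
        "Recall": recall,
        "F1": f1,
    }


# Hypothetical labels: 1 = severe side effect, 0 = mild.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

Here there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so all four indices come out at 0.75.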

Corporate Bankruptcy Prediction Model using Explainable AI-based Feature Selection (설명가능 AI 기반의 변수선정을 이용한 기업부실예측모형)

  • Gundoo Moon;Kyoung-jae Kim
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.241-265 / 2023
  • A corporate insolvency prediction model serves as a vital tool for objectively monitoring the financial condition of companies. It enables timely warnings, facilitates responsive actions, and supports the formulation of effective management strategies to mitigate bankruptcy risks and enhance performance. Investors and financial institutions utilize default prediction models to minimize financial losses. As the interest in utilizing artificial intelligence (AI) technology for corporate insolvency prediction grows, extensive research has been conducted in this domain. However, there is an increasing demand for explainable AI models in corporate insolvency prediction, emphasizing interpretability and reliability. The SHAP (SHapley Additive exPlanations) technique has gained significant popularity and has demonstrated strong performance in various applications. Nonetheless, it has limitations such as computational cost, processing time, and scalability concerns based on the number of variables. This study introduces a novel approach to variable selection that reduces the number of variables by averaging SHAP values from bootstrapped data subsets instead of using the entire dataset. This technique aims to improve computational efficiency while maintaining excellent predictive performance. To obtain classification results, we aim to train random forest, XGBoost, and C5.0 models using carefully selected variables with high interpretability. The classification accuracy of the ensemble model, generated through soft voting as the goal of high-performance model design, is compared with the individual models. The study leverages data from 1,698 Korean light industrial companies and employs bootstrapping to create distinct data groups. Logistic Regression is employed to calculate SHAP values for each data group, and their averages are computed to derive the final SHAP values. The proposed model enhances interpretability and aims to achieve superior predictive performance.
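
The abstract above describes computing SHAP values with logistic regression on each bootstrapped data group and averaging them. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the closed-form attribution for a linear model with independent features, w_j * (x_j - mean(x_j)) in log-odds space, instead of calling a SHAP library, and it fits the logistic regression with plain gradient descent. All function names, the group count, the group size, and the synthetic data are hypothetical.

```python
import math
import random


def fit_logistic(X, y, steps=300, lr=0.5):
    """Fit logistic regression by batch gradient descent; return (w, b)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b


def mean_abs_linear_shap(X, w):
    """Mean |w_j * (x_j - mean_j)| per feature (assumed linear-model form)."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [
        sum(abs(w[j] * (row[j] - means[j])) for row in X) / n
        for j in range(d)
    ]


def bootstrap_importance(X, y, groups=5, size=100, seed=0):
    """Average per-feature importance over bootstrapped data groups."""
    rng = random.Random(seed)
    totals = [0.0] * len(X[0])
    for _ in range(groups):
        # Sample rows with replacement to form one data group.
        idx = [rng.randrange(len(X)) for _ in range(size)]
        Xb = [X[i] for i in idx]
        yb = [y[i] for i in idx]
        w, _ = fit_logistic(Xb, yb)
        for j, value in enumerate(mean_abs_linear_shap(Xb, w)):
            totals[j] += value
    return [t / groups for t in totals]


# Synthetic data: feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [1 if row[0] > 0 else 0 for row in X]
importance = bootstrap_importance(X, y)
```

Variables could then be ranked by the averaged importance and the lowest-ranked ones dropped before training the final classifiers; the cut-off used in the study is not stated in the abstract.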

Studying the Comparative Analysis of Highway Traffic Accident Severity Using the Random Forest Method. (Random Forest를 활용한 고속도로 교통사고 심각도 비교분석에 관한 연구)

  • Sun-min Lee;Byoung-Jo Yoon;WutYeeLwin
    • Journal of the Society of Disaster Information / v.20 no.1 / pp.156-168 / 2024
  • Purpose: The trend of highway traffic accidents shows a repeating pattern of increase and decrease, with the fatality rate being highest on highways among all road types. Therefore, there is a need to establish improvement measures that reflect the situation within the country. Method: We conducted accident severity analysis using Random Forest on data from accidents occurring on 10 specific routes with high accident rates among national highways from 2019 to 2021. Factors influencing accident severity were identified. Result: The analysis, conducted using the SHAP package to determine the top 10 variable importance, revealed that among highway traffic accidents, the variables with a significant impact on accident severity are the age of the perpetrator being between 20 and less than 39 years, the time period being daytime (06:00-18:00), occurrence on weekends (Sat-Sun), seasons being summer and winter, violation of traffic regulations (failure to comply with safe driving), road type being a tunnel, geometric structure having a high number of lanes and a high speed limit. We identified a total of 10 independent variables that showed a positive correlation with highway traffic accident severity. Conclusion: As accidents on highways occur due to the complex interaction of various factors, predicting accidents poses significant challenges. However, utilizing the results obtained from this study, there is a need for in-depth analysis of the factors influencing the severity of highway traffic accidents. Efforts should be made to establish efficient and rational response measures based on the findings of this research.

Consumer expectation and consumer satisfaction before and after health care service (의료이용 전.후 기대와 만족수준 비교)

  • Park, Jang-Soon;Yu, Seung-Hum;Sohn, Tae-Yong;Park, Eun-Cheol
    • Korea Journal of Hospital Management / v.8 no.1 / pp.112-134 / 2003
  • The purpose of this study is to analyze consumers' expectations before health care service and their satisfaction after it. The participants of the study are inpatients in a general hospital located in Seoul. The data were collected through a self-administered questionnaire survey run in parallel with face-to-face interviews. To measure the degree of consumer expectation, 349 samples were collected from the first questionnaire survey on the date of admission to the hospital. The second questionnaire survey was carried out on the date of discharge from the hospital with the participants who had responded to the first survey, and 154 samples were collected from it. The results of the analysis are as follows. First, the survey shows that one of the highest consumer expectations concerned the generosity, kindliness and sincerity of the hospital staff, especially of doctors. Second, according to the analysis of the factors affecting consumer expectations, with regard to the path of admission as a patient feature, patients admitted from the outpatient department expected good medical care much more than other patients. With regard to doctors' features, patients generally have high expectations of good medical care from doctors who have a good career and much experience. Third, according to the second questionnaire survey, what patients are most satisfied with is the generosity and sincerity of the hospital staff, especially of doctors and their gem attitudes. The survey results show that the degree of consumer satisfaction is very variable, depending on surrounding environments and facilities. The only cases in which satisfaction did not meet expectation concerned the technology and skill of medical care and up-to-date medical skills and equipment. 
Fourth, in comparing the degree of expectation with the degree of satisfaction, the correlation analysis was significant specifically for overall cleanliness (relating to facilities and surrounding environments), for the items on medical examination and test plan procedures (relating to skill of medical care, professional specialties and convenience of procedure), and for the items on satisfactory explanations and doctors' concern for patients (relating to staff generosity and sincerity). Fifth, the analysis of the factors affecting patient satisfaction shows that, with regard to sociodemographic features, patients are not satisfied when the time and process of medical treatment become longer. Among patient features, the motivation to visit the hospital and the insurance type were associated with satisfaction, as were the medical department and the degree of expectation among disease features. Sixth, according to the analysis, patients would return to a hospital when they are satisfied with the medical care, and their wish to return also depends on the doctor's capability. For example, patients would come again when doctors are older and have a good career and much experience. As seen from the above, consumers are usually more satisfied with the medical treatment than they expected beforehand. They intend to use the hospital again when they are satisfied with the medical care it provides. Patients and consumers have high expectations of both the attitude and the capability of medical doctors, and they are generally satisfied with both. Therefore, to increase consumer satisfaction and the intention to return, hospital staff should commit themselves to delivering high-quality service continuously and make an effort to offer the finest quality of service.


Prediction of Key Variables Affecting NBA Playoffs Advancement: Focusing on 3 Points and Turnover Features (미국 프로농구(NBA)의 플레이오프 진출에 영향을 미치는 주요 변수 예측: 3점과 턴오버 속성을 중심으로)

  • An, Sehwan;Kim, Youngmin
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.263-286 / 2022
  • This study acquires NBA statistical information for a total of 32 years, from 1990 to 2022, using web crawling, observes variables of interest through exploratory data analysis, and generates related derived variables. Unused variables were removed from the input data through a cleaning process, and correlation analysis, t-tests, and ANOVA were performed on the remaining variables. For the variables of interest, the difference in means between the teams that advanced to the playoffs and those that did not was tested; to complement this, the mean difference among three groups (higher/middle/lower) based on ranking was then checked as well. Of the input data, only the current season's data was used as a test set, and 5-fold cross-validation was performed by dividing the rest into training and validation sets for model training. Overfitting was checked by comparing the cross-validation results with the final results on the test set and confirming that there was no difference in the performance metrics. Because the quality of the raw data is high and the statistical assumptions are satisfied, most of the models showed good results despite the small data set. This study not only predicts NBA game results and classifies whether a team advances to the playoffs using machine learning, but also examines whether the variables of interest are among the major variables with high importance by analyzing the importance of the input attributes. Visualizing SHAP values made it possible to overcome the limited interpretability of feature importance alone, and to compensate for the lack of consistency in the importance calculation when variables are added or removed. A number of variables related to three-pointers and turnovers, the subjects of interest in this study, were found to be among the major variables affecting advancement to the NBA playoffs. 
Although this study is similar in that it includes topics such as match results, playoffs, and championship predictions, which have been dealt with in the existing sports data analysis field, and comparatively analyzed several machine learning models for analysis, there is a difference in that the interest features are set in advance and statistically verified, so that it is compared with the machine learning analysis result. Also, it was differentiated from existing studies by presenting explanatory visualization results using SHAP, one of the XAI models.

The Effect of Physical Pedestrian Environment on Walking Satisfaction - Focusing on the Case of Jinhae City - (물리적 보행환경이 보행만족도에 미치는 영향 - 진해시를 사례지역으로 -)

  • Byeon, Ji-Hye;Park, Kyung-Hun;Choi, Sang-Rok
    • Journal of the Korean Institute of Landscape Architecture / v.37 no.6 / pp.57-65 / 2010
  • People's physical activity has decreased because of the sedentary lifestyle that has accompanied economic development throughout the world. This is thought to increase the risk of chronic diseases, including obesity and diabetes. People are interested in walking, which is an easy activity to engage in as an antidote to chronic diseases. The aim of this study is to increase the diminishing physical activity of modern society by encouraging walking as part of everyday life through building a walking-based, activity-friendly city where people can live merrily, safely and pleasantly. For this purpose, this study conducted a satisfaction survey of residents of Jinhae on the physical pedestrian environments that affect people's walking participation and intentions, and also provided a valid model for evaluating the effects of physical environmental factors on walking satisfaction using factor analysis and multiple linear regression analysis. The results are summarized as follows. Eighteen variables of the physical pedestrian environment were selected based on a review of prior literature. The satisfaction surveys showed that satisfaction with crossing aids in segments was highest, while satisfaction with the building feature was lowest. Factor analysis was run as a two-step process. The first analysis examined the adequacy of factor analysis for the 18 selected variables. As a result, two variables were removed, and the second analysis extracted four factors from the remaining 16 variables. The factors were named function of path, effect of traffic, amenity and safety, based on what the variables in each factor have in common. The factor scores of the four extracted factors were set as the independent variables, while overall walking satisfaction was set as the dependent variable. 
Then, the multiple linear regression analysis was conducted and showed that all four factors had a positive influence on the overall satisfaction of walking, especially the 'function of path' and 'amenity' factors, followed by 'effect of traffic' and 'safety'. The results of this research will be used as foundational data for creating a walking-based activity-friendly city.