• Title/Summary/Keyword: Feature Importance Analysis


Design and Implementation of Prototype System for Management of Social Contribution (사회공헌관리 시스템의 프로토타입 설계 및 구현)

  • Sok, Yun-Young;Choi, Sun-O
    • The Journal of the Convergence on Culture Technology
    • /
    • v.5 no.1
    • /
    • pp.319-325
    • /
    • 2019
  • The importance of social responsibility is growing, and with rising social awareness, many universities and institutions carry out community service activities. Although volunteer portal sites offer excellent performance and good accessibility, the overall service record of a specific organization cannot be managed through them because of personal information restrictions. Most community service managers in an organization therefore manage their members' community service activities manually, without the support of a dedicated management program. In this study, the features of existing volunteer coordination portal sites were analyzed, and, based on a questionnaire about social contribution management systems, a prototype was designed and implemented as a model of a social contribution management system so that universities and organizations can systematically support and manage the social service activities of their members.

Application of Statistical and Machine Learning Techniques for Habitat Potential Mapping of Siberian Roe Deer in South Korea

  • Lee, Saro;Rezaie, Fatemeh
    • Proceedings of the National Institute of Ecology of the Republic of Korea
    • /
    • v.2 no.1
    • /
    • pp.1-14
    • /
    • 2021
  • This study was carried out to prepare Siberian roe deer habitat potential maps for South Korea based on three geographic information system-based models: frequency ratio (FR), a bivariate statistical approach, together with convolutional neural network (CNN) and long short-term memory (LSTM) machine learning algorithms. From field observations, 741 locations were recorded as preferred roe deer habitat. The dataset was divided in a 70:30 proportion for model construction and validation. Through the FR model, a total of 10 influential factors were selected for the modelling process, namely altitude, valley depth, slope height, topographic position index (TPI), topographic wetness index (TWI), normalized difference water index, drainage density, road density, radar intensity, and morphological feature. Variable importance analysis determined that TPI, TWI, altitude, and valley depth have the highest impact on prediction. Furthermore, the area under the receiver operating characteristic (ROC) curve was applied to assess the prediction accuracy of the three models. The results showed that all models performed similarly, but the LSTM model had relatively higher prediction ability than the FR and CNN models, with accuracies of 76% and 73% during training and validation, respectively. The map obtained from the LSTM model was categorized into five potentiality classes: very low, low, moderate, high, and very high, with proportions of 19.70%, 19.81%, 19.31%, 19.86%, and 21.31%, respectively. The resultant potential maps may be valuable for monitoring and preserving Siberian roe deer habitats.
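
The frequency ratio itself is straightforward to compute. The sketch below illustrates it for a single factor, assuming a small table of factor classes with their area share and observed habitat occurrences; all column names and numbers are illustrative, not values from the paper:

```python
import pandas as pd

# Hypothetical class statistics for one factor (e.g., altitude bands):
# 'area_cells' = raster cells in the class, 'occurrences' = habitat
# locations falling in the class. Numbers are made up for illustration.
df = pd.DataFrame({
    "class":       ["0-200 m", "200-400 m", "400-600 m", ">600 m"],
    "area_cells":  [52000, 31000, 12000, 5000],
    "occurrences": [300, 250, 120, 50],
})

# FR = (share of occurrences in the class) / (share of area in the class);
# FR > 1 means the class hosts more habitat points than its area predicts.
df["FR"] = (df["occurrences"] / df["occurrences"].sum()) \
         / (df["area_cells"] / df["area_cells"].sum())

print(df[["class", "FR"]])
```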

Quality Prediction Model for Manufacturing Process of Free-Machining 303-series Stainless Steel Small Rolling Wire Rods (쾌삭 303계 스테인리스강 소형 압연 선재 제조 공정의 생산품질 예측 모형)

  • Seo, Seokjun;Kim, Heungseob
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.44 no.4
    • /
    • pp.12-22
    • /
    • 2021
  • This article proposes a machine learning model, i.e., a classifier, for predicting the production quality of free-machining 303-series stainless steel (STS303) small rolling wire rods according to the operating conditions of the manufacturing process. To develop the classifier, manufacturing data for 37 operating variables were collected from the manufacturing execution system (MES) of Company S, and 12 types of derived variables were generated based on a literature review and interviews with field experts. The research comprised data preprocessing, exploratory data analysis, feature selection, machine learning modeling, and evaluation of alternative models. In the preprocessing stage, missing values and outliers were removed, and the synthetic minority oversampling technique (SMOTE) was applied to resolve data imbalance. Features were selected by the variable importance of LASSO (least absolute shrinkage and selection operator) regression, extreme gradient boosting (XGBoost), and random forest models. Finally, logistic regression, support vector machine (SVM), random forest, and XGBoost classifiers were developed to predict adequate or defective products under new operating conditions. The optimal hyperparameters for each model were investigated by grid search and random search based on k-fold cross-validation. As a result of the experiments, XGBoost showed relatively high predictive performance compared to the other models, with an accuracy of 0.9929, specificity of 0.9372, F1-score of 0.9963, and logarithmic loss of 0.0209. The classifier developed in this study is expected to improve productivity by enabling effective management of the manufacturing process for STS303 small rolling wire rods.
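
The pipeline described (SMOTE for class imbalance, feature selection, then tuned classifiers compared by cross-validation) maps naturally onto common Python libraries. A minimal sketch of the SMOTE-plus-XGBoost branch follows; the file name, column names, and parameter grid are assumptions, not the authors' actual setup:

```python
import pandas as pd
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # applies SMOTE inside each CV fold
from sklearn.model_selection import GridSearchCV, train_test_split
from xgboost import XGBClassifier

# Hypothetical data: operating variables as features, 'defect' as the label.
df = pd.read_csv("wire_rod_process.csv")  # assumed file name
X, y = df.drop(columns="defect"), df["defect"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

pipe = Pipeline([
    ("smote", SMOTE(random_state=42)),            # oversample minority class
    ("clf", XGBClassifier(eval_metric="logloss")),
])

# Small illustrative grid; the paper searches a larger hyperparameter space.
grid = GridSearchCV(
    pipe,
    param_grid={"clf__max_depth": [3, 6], "clf__n_estimators": [200, 400]},
    scoring="f1",
    cv=5,
)
grid.fit(X_tr, y_tr)
print(grid.best_params_, grid.score(X_te, y_te))
```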

Analysis of Hypertension Risk Factors by Life Cycle Based on Machine Learning (머신러닝 기반 생애주기별 고혈압 위험 요인 분석)

  • Kang, SeongAn;Kim, SoHui;Ryu, Min Ho
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.27 no.5
    • /
    • pp.73-82
    • /
    • 2022
  • Chronic diseases such as hypertension require a differentiated approach according to age and life cycle, and hypertension is known to arise from a combination of various factors. This study uses machine learning prediction techniques to analyze the factors affecting hypertension in each life cycle stage. To this end, a total of 35 variables were selected through preprocessing and variable selection applied to the Korea National Health and Nutrition Examination Survey data of the Korea Centers for Disease Control and Prevention. Among the tree-based machine learning models, XGBoost was found to have high predictive performance in both middle age and old age. Looking at the risk factors for hypertension by life cycle, individual characteristics, genetic factors, and nutritional intake factors were identified as risk factors in middle age, while nutritional intake, dietary, and lifestyle factors were derived as risk factors in old age. The results of this study are expected to serve as useful basic data for hypertension management across the life cycle.
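
A common way to realize this kind of per-group analysis is to fit one tree-based model per life cycle stage and compare feature importances. Here is a minimal sketch, assuming a dataframe with hypothetical 'age_group' and 'hypertension' columns (not the study's actual variable names):

```python
import pandas as pd
from xgboost import XGBClassifier

df = pd.read_csv("knhanes_subset.csv")  # assumed extract of the survey data

# Fit a separate model per life cycle stage and rank feature importances.
for stage in ["middle_age", "old_age"]:  # hypothetical group labels
    sub = df[df["age_group"] == stage]
    X = sub.drop(columns=["age_group", "hypertension"])
    y = sub["hypertension"]
    model = XGBClassifier(eval_metric="logloss").fit(X, y)
    imp = pd.Series(model.feature_importances_, index=X.columns)
    print(stage, imp.sort_values(ascending=False).head(5), sep="\n")
```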

Performance Comparison of CNN-Based Image Classification Models for Drone Identification System (드론 식별 시스템을 위한 합성곱 신경망 기반 이미지 분류 모델 성능 비교)

  • YeongWan Kim;DaeKyun Cho;GunWoo Park
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.4
    • /
    • pp.639-644
    • /
    • 2024
  • Recent developments in the use of drones on battlefields, extending beyond reconnaissance to firepower support, have greatly increased the importance of technologies for early automatic drone identification. In this study, to identify an effective image classification model that can distinguish drones from other aerial targets of similar size and appearance, such as birds and balloons, we utilized a dataset of 3,600 images collected from the internet. We adopted a transfer learning approach that combines the feature extraction capabilities of three pre-trained convolutional neural network models (VGG16, ResNet50, InceptionV3) with an additional classifier. Specifically, we conducted a comparative analysis of the performance of these three pre-trained models to determine the most effective one. The results showed that the InceptionV3 model achieved the highest accuracy at 99.66%. This research represents a new endeavor in utilizing existing convolutional neural network models and transfer learning for drone identification, which is expected to make a significant contribution to the advancement of drone identification technologies.
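
Such a transfer learning setup, a frozen pre-trained backbone plus a small trainable classifier head, can be sketched in a few lines of Keras. The input shape, head architecture, and binary drone/non-drone output below are assumptions rather than details from the paper:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

# Frozen ImageNet backbone used purely as a feature extractor.
base = InceptionV3(weights="imagenet", include_top=False,
                   input_shape=(299, 299, 3))
base.trainable = False

# Small trainable head; binary drone vs. non-drone output (assumed).
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```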

Contents Analysis on the Image of Nurses in the Television Drama (텔레비전 드라마의 간호사 이미지에 대한 분석)

  • Moon, Young-Im;Im, Mi-Lim;Yun, Kyung-Yi
    • The Korean Nurse
    • /
    • v.37 no.2
    • /
    • pp.44-52
    • /
    • 1998
  • The purpose of this study is to examine people's views on nursing, to help correct the image of nurses, and to provide a basis for nursing education by examining the image of nursing in television drama, which plays an important role in mass media. Twenty-two nurse characters were analyzed, selected from six television dramas featuring nurses that aired during prime time from June 1 to August 31, 1997. Content analysis was used: four coding items were prepared by the coders, modified and supplemented from the research instrument of Im Mi-lim (1996), and two coders coded each character while viewing the recorded episodes. Intercoder reliability, measured by Holsti's method, averaged 90%. Frequencies and percentages were computed using the SPSS program. The results were as follows. 1. 86.2% of the nurses in the dramas were depicted as minor characters. 2. The work attitude of nurses shown in the dramas was portrayed as mechanical (84.7%), passive (45.5%), dependent (54.4%), and unkind (68.2%). 3. Nurses' activities were classified as professional or simple tasks. Professional tasks such as IV administration, blood pressure checks, rounding, nursing documentation, patient education, assisting in operations, and assisting patients with meals were depicted, but simple tasks such as answering the telephone and carrying trays or pushing stretcher cars, dressing carts, and wheelchairs appeared more often than professional tasks. 4. The appearance of nurses was shown as thin physique (68.2%), average stature (68.2%), untidy dress (45.4%), ordinary appearance (81.9%), and undignified behavior (63.6%). Nurses were depicted mainly in outward, technical roles such as assisting doctors, with a focus on incidents, while the educational activities and expanded roles of nurses were not depicted in television drama. Consequently, the public may regard the nurse as a sexual object with a good appearance rather than as a professional performing professional nursing work. Based on the above conclusions, the following are suggested. 1. Continued research is required on the image of nurses shown in various mass media. 2. Further research is required on strategies for using mass media to improve the image of nurses. 3. Research that strengthens objectivity by comparing and analyzing data across dramas is required. 4. Through in-depth study, the professional association should provide TV drama scenario writers with standards that present the concrete, professional work of nurses. 5. Monitoring of the mass media must be carried out on a national scale, not only by some nurses, and further study on this basis is needed.
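
For reference, Holsti's method cited above is a standard agreement coefficient: with M the number of coding decisions on which both coders agreed, and N1 and N2 the number of decisions made by each coder, it is

```latex
\[
  \text{Reliability} = \frac{2M}{N_1 + N_2}
\]
```

so the reported 90% average means the two coders agreed on 90% of their coding decisions.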


Improvement of crossflow model of MULTID component in MARS-KS with inter-channel mixing model for enhancing analysis performance in rod bundle

  • Yunseok Lee;Taewan Kim
    • Nuclear Engineering and Technology
    • /
    • v.55 no.12
    • /
    • pp.4357-4366
    • /
    • 2023
  • MARS-KS, the domestic regulatory confirmatory code of the Republic of Korea, was developed by integrating RELAP5/MOD2 and COBRA-TF. The integration of COBRA-TF extended the capability of MARS-KS, previously limited to one-dimensional analysis, to multi-dimensional analysis. The use of COBRA-TF was mainly focused on subchannel analyses simulating multi-dimensional behavior within the reactor core, but this feature has remained a legacy without ongoing maintenance. Meanwhile, MARS-KS also includes its own multi-dimensional component, MULTID, which can likewise simulate three-dimensional convection and diffusion and which models turbulent diffusion using a simple mixing-length model. Implementing turbulent mixing is important for analyzing the reactor core, where the cross-sectional structure of the rod bundle disturbs the flow and strengthens the resulting mixing. The presence of this turbulent behavior also permits secondary transport with net mass exchange between subchannels. However, a series of assessments performed in previous studies revealed that the turbulence model of MULTID could not reproduce this effective mixing in subchannel-scale problems. This is an obvious consequence, since the physical models of MULTID neglect the effect of mass transport; the component therefore cannot model the void drift effect and the resulting phasic distribution within a bundle. Thus, in this study, the turbulent mixing model of MULTID has been improved by means of the inter-channel mixing model widely used in subchannel analysis, in order to extend the application of MULTID to small-scale problems. A series of assessments was performed against rod bundle experiments, namely GE 3X3 and PSBT, to evaluate the performance of the introduced mixing model. The assessment results revealed that applying the inter-channel mixing model enhanced the predictions of MULTID in subchannel-scale problems, and that without the model the code could not predict an appropriate phasic distribution in the rod bundle. Considering that proper prediction of the phasic distribution is important for pin-based and/or assembly-based representations of the reactor core, the results of this study clearly indicate that the inter-channel mixing model is required for appropriate analysis of rod bundles.
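
For context, the inter-channel turbulent mixing model referenced here is commonly written, in its single-phase subchannel-analysis form, as below. This is the standard closure from the subchannel literature; the exact formulation implemented in this study is not given in the abstract and may differ:

```latex
% Turbulent mixing rate per unit axial length between subchannels i and j:
% beta = empirical mixing coefficient, s_ij = gap width between channels,
% G_i, G_j = axial mass fluxes of the adjacent subchannels.
\[
  w'_{ij} = \beta \, s_{ij} \, \bar{G}, \qquad
  \bar{G} = \tfrac{1}{2}\,(G_i + G_j)
\]
```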

Performance analysis of Frequent Itemset Mining Technique based on Transaction Weight Constraints (트랜잭션 가중치 기반의 빈발 아이템셋 마이닝 기법의 성능분석)

  • Yun, Unil;Pyun, Gwangbum
    • Journal of Internet Computing and Services
    • /
    • v.16 no.1
    • /
    • pp.67-74
    • /
    • 2015
  • In recent years, frequent itemset mining that considers the importance of each item has been intensively studied as an important issue in the data mining field. According to the strategy used to exploit item importance, such approaches are classified as weighted frequent itemset mining, frequent itemset mining using transactional weights, and utility itemset mining. In this paper, we perform an empirical analysis of frequent itemset mining algorithms based on transactional weights. These algorithms compute transactional weights from the weight of each item in large databases and discover weighted frequent itemsets on the basis of item frequency and the weight of each transaction. The weight of a transaction is higher if it contains many items with high weights, so the database analysis reveals the importance of each transaction. We analyze the advantages and disadvantages of, and compare the performance of, the best-known algorithms in this field. As a representative of frequent itemset mining using transactional weights, WIS introduced the concept and strategies of transactional weights; various other state-of-the-art algorithms, WIT-FWIs, WIT-FWIs-MODIFY, and WIT-FWIs-DIFF, also extract itemsets with weight information. To mine weighted frequent itemsets efficiently, these three algorithms use a special lattice-like data structure called the WIT-tree. Because each node of the WIT-tree stores item information such as item and transaction IDs, the algorithms need no additional database scan once the WIT-tree has been constructed. In particular, whereas traditional algorithms perform many database scans to mine weighted itemsets, the WIT-tree-based algorithms avoid this overhead by reading the database only once. The algorithms generate each new itemset of length N+1 from two different itemsets of length N. To discover new weighted itemsets, WIT-FWIs combines itemsets using the information of the transactions that contain them; WIT-FWIs-MODIFY reduces the operations needed to calculate the frequency of each new itemset; and WIT-FWIs-DIFF uses a technique based on the difference of two itemsets. To compare and analyze the performance of the algorithms in various environments, we use real datasets of two types (dense and sparse) and measure runtime and maximum memory usage; a scalability test is also conducted to evaluate the stability of each algorithm as the database size changes. As a result, WIT-FWIs and WIT-FWIs-MODIFY show the best performance on the dense dataset, while on the sparse dataset WIT-FWIs-DIFF mines more efficiently than the other algorithms. Compared to the WIT-tree-based algorithms, WIS, which is based on the Apriori technique, has the worst efficiency because it requires many more computations than the others on average.
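
The core quantity, a transaction's weight derived from the weights of its items, is simple to compute. The sketch below uses the common convention of averaging item weights together with a normalized weighted support; exact definitions vary between the algorithms surveyed, so treat this as an assumed, illustrative formulation:

```python
# Toy database: each transaction is a set of items with given weights.
item_weight = {"a": 0.9, "b": 0.6, "c": 0.3, "d": 0.8}
transactions = [
    {"a", "b"},        # T1
    {"a", "c", "d"},   # T2
    {"b", "c"},        # T3
]

def tw(t):
    """Transaction weight = average weight of its items (common convention)."""
    return sum(item_weight[i] for i in t) / len(t)

def weighted_support(itemset, db):
    """Sum of weights of transactions containing the itemset,
    normalized by the total weight of all transactions."""
    total = sum(tw(t) for t in db)
    return sum(tw(t) for t in db if itemset <= t) / total

print(weighted_support({"a"}, transactions))       # contained in T1 and T2
print(weighted_support({"b", "c"}, transactions))  # contained in T3 only
```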

Transfer Learning using Multiple ConvNet Layers Activation Features with Principal Component Analysis for Image Classification (전이학습 기반 다중 컨볼류션 신경망 레이어의 활성화 특징과 주성분 분석을 이용한 이미지 분류 방법)

  • Byambajav, Batkhuu;Alikhanov, Jumabek;Fang, Yang;Ko, Seunghyun;Jo, Geun Sik
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.205-225
    • /
    • 2018
  • The convolutional neural network (ConvNet) is a class of powerful deep neural network that can analyze and learn hierarchies of visual features. An early precursor, the Neocognitron, was introduced in the 1980s, but at that time neural networks were not broadly used in industry or academia because of the shortage of large-scale datasets and the low computational power available. A few decades later, in 2012, Krizhevsky achieved a breakthrough in the ILSVRC-12 visual recognition competition using a convolutional neural network, and that breakthrough revived interest in neural networks. The success of convolutional neural networks rests on two main factors: the emergence of advanced hardware (GPUs) for sufficient parallel computation, and the availability of large-scale datasets such as ImageNet (ILSVRC) for training. Unfortunately, many new domains are bottlenecked by these factors. For most domains, it is difficult and laborious to gather a large-scale dataset to train a ConvNet; moreover, even with such a dataset, training a ConvNet from scratch requires expensive resources and is time-consuming. These two obstacles can be overcome by transfer learning, a method for transferring knowledge from a source domain to a new domain. There are two major transfer learning settings: using the ConvNet as a fixed feature extractor, and fine-tuning the ConvNet on a new dataset. In the first case, a pre-trained ConvNet (e.g., trained on ImageNet) computes feed-forward activations of the image, and activation features are extracted from specific layers. In the second case, the ConvNet classifier is replaced and retrained on the new dataset, and the weights of the pre-trained network are fine-tuned by backpropagation. In this paper, we focus on using multiple ConvNet layers as a fixed feature extractor only. However, applying the high-dimensional features extracted directly from multiple ConvNet layers is still a challenging problem. We observe that features extracted from multiple ConvNet layers capture different characteristics of the image, which means a better representation can be obtained by finding the optimal combination of multiple layers. Based on that observation, we propose to employ multiple ConvNet layer representations for transfer learning instead of a single-layer representation. Our pipeline has three steps. First, images from the target task are fed forward through a pre-trained AlexNet, and activation features are extracted from its three fully connected layers. Second, the activation features of the three layers are concatenated to obtain a multiple-layer representation, since it carries more information about the image; when the three fully connected layer features are concatenated, the resulting representation has 9,192 (4,096 + 4,096 + 1,000) dimensions. However, features extracted from multiple layers of the same ConvNet are redundant and noisy. Thus, in the third step, we use principal component analysis (PCA) to select salient features before the training phase. With salient features, the classifier can classify images more accurately, and the performance of transfer learning is improved.
To evaluate the proposed method, experiments were conducted on three standard datasets (Caltech-256, VOC07, and SUN397), comparing multiple ConvNet layer representations against single-layer representations, with PCA used for feature selection and dimensionality reduction. Our experiments demonstrated the importance of feature selection for the multiple-layer representation. Moreover, the proposed approach achieved 75.6% accuracy, compared to 73.9% achieved by the FC7 layer, on the Caltech-256 dataset; 73.1% accuracy, compared to 69.2% achieved by the FC8 layer, on the VOC07 dataset; and 52.2% accuracy, compared to 48.7% achieved by the FC7 layer, on the SUN397 dataset. We also showed that the proposed approach achieved superior performance, with accuracy improvements of 2.8%, 2.1%, and 3.1% on Caltech-256, VOC07, and SUN397, respectively, compared to existing work.
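
The concatenate-then-PCA step is compact to express. A minimal sketch with scikit-learn follows, assuming the three AlexNet layer activations have already been extracted into arrays; the random data, label count, and choice of 256 components are placeholders, not values from the paper:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

# Stand-ins for pre-extracted AlexNet activations of N images:
# FC6 and FC7 have 4096 dimensions each, FC8 has 1000, per the paper.
N = 500
rng = np.random.default_rng(0)
fc6 = rng.normal(size=(N, 4096))
fc7 = rng.normal(size=(N, 4096))
fc8 = rng.normal(size=(N, 1000))
y = rng.integers(0, 10, size=N)  # dummy labels for illustration

# Concatenate the three layer representations -> 9,192 dimensions.
X = np.concatenate([fc6, fc7, fc8], axis=1)

# PCA selects salient directions and removes redundancy before training.
X_pca = PCA(n_components=256).fit_transform(X)  # 256 is an assumed choice

clf = LinearSVC().fit(X_pca, y)
print(X.shape, "->", X_pca.shape)
```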

Shape Deformation Monitoring for VLBI Antenna Using Close-Range Photogrammetry and Total Least Squares (근접사진측량과 Total Least Squares를 활용한 VLBI 안테나 형상 변형 모니터링 방안 연구)

  • Kim, Hyuk Gil;Yun, Hong Sik
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.34 no.1
    • /
    • pp.99-107
    • /
    • 2016
  • To maintain the precise positioning accuracy of a VLBI system, shape deformation of the antenna structure should be monitored. In particular, deformation of the main reflector is expected to reduce the antenna gain for electromagnetic waves received from quasars, so the importance of monitoring the main reflector's shape deformation has increased significantly; the main reflector has the highest deformation potential in the VLBI structure. This led us to investigate a monitoring approach for the main reflector based on an efficient close-range photogrammetry algorithm, which is expected to serve as a continuous, automated monitoring system for structural deformation in the near future. Ten fitting lines were estimated with total least squares (TLS) from feature points distributed in all directions on the main reflector. Because the estimated fitting lines do not intersect exactly, their common intersection point was calculated using a nearest-point algorithm for non-intersecting lines. As an intuitive basis for time series analysis, the results provide a numerical measure of the variation of the intersection point along three axes, which we expect to open the way to predicting both the deformation rate and the deformation direction.
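
The two geometric steps described, a total-least-squares line fit through 3D feature points and the point nearest to a set of generally non-intersecting lines, have standard linear-algebra solutions. The following is a minimal sketch of both under synthetic data, not the authors' implementation:

```python
import numpy as np

def tls_line(points):
    """TLS 3D line fit: returns (centroid, unit direction), where the
    direction is the principal axis of the centered points (via SVD)."""
    c = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - c)
    return c, vt[0]

def nearest_point_to_lines(lines):
    """Least-squares point minimizing the summed squared distance to each
    line (p, d); (I - d d^T) projects onto the plane normal to the line."""
    A, b = np.zeros((3, 3)), np.zeros(3)
    for p, d in lines:
        P = np.eye(3) - np.outer(d, d)  # removes the along-line component
        A += P
        b += P @ p
    return np.linalg.solve(A, b)

# Toy usage: two noisy point sets along lines that nearly meet at the origin.
rng = np.random.default_rng(1)
t = np.linspace(0.1, 1.0, 50)[:, None]
pts1 = t * np.array([1.0, 0.0, 0.0]) + 0.01 * rng.normal(size=(50, 3))
pts2 = t * np.array([0.0, 1.0, 0.0]) + 0.01 * rng.normal(size=(50, 3))
print(nearest_point_to_lines([tls_line(pts1), tls_line(pts2)]))  # ~ origin
```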