• Title/Summary/Keyword: Machine classification

Search Result 2,038, Processing Time 0.031 seconds

A Study on Efficient AI Model Drift Detection Methods for MLOps (MLOps를 위한 효율적인 AI 모델 드리프트 탐지방안 연구)

  • Ye-eun Lee;Tae-jin Lee
    • Journal of Internet Computing and Services
    • /
    • v.24 no.5
    • /
    • pp.17-27
    • /
    • 2023
  • Today, as AI (Artificial Intelligence) technology develops and its practicality increases, it is widely used in various application fields in real life. At this time, the AI model is basically learned based on various statistical properties of the learning data and then distributed to the system, but unexpected changes in the data in a rapidly changing data situation cause a decrease in the model's performance. In particular, as it becomes important to find drift signals of deployed models in order to respond to new and unknown attacks that are constantly created in the security field, the need for lifecycle management of the entire model is gradually emerging. In general, it can be detected through performance changes in the model's accuracy and error rate (loss), but there are limitations in the usage environment in that an actual label for the model prediction result is required, and the detection of the point where the actual drift occurs is uncertain. there is. This is because the model's error rate is greatly influenced by various external environmental factors, model selection and parameter settings, and new input data, so it is necessary to precisely determine when actual drift in the data occurs based only on the corresponding value. There are limits to this. Therefore, this paper proposes a method to detect when actual drift occurs through an Anomaly analysis technique based on XAI (eXplainable Artificial Intelligence). As a result of testing a classification model that detects DGA (Domain Generation Algorithm), anomaly scores were extracted through the SHAP(Shapley Additive exPlanations) Value of the data after distribution, and as a result, it was confirmed that efficient drift point detection was possible.

Verification Test of High-Stability SMEs Using Technology Appraisal Items (기술력 평가항목을 이용한 고안정성 중소기업 판별력 검증)

  • Jun-won Lee
    • Information Systems Review
    • /
    • v.20 no.4
    • /
    • pp.79-96
    • /
    • 2018
  • This study started by focusing on the internalization of the technology appraisal model into the credit rating model to increase the discriminative power of the credit rating model not only for SMEs but also for all companies, reflecting the items related to the financial stability of the enterprises among the technology appraisal items. Therefore, it is aimed to verify whether the technology appraisal model can be applied to identify high-stability SMEs in advance. We classified companies into industries (manufacturing vs. non-manufacturing) and the age of company (initial vs. non-initial), and defined as a high-stability company that has achieved an average debt ratio less than 1/2 of the group for three years. The C5.0 was applied to verify the discriminant power of the model. As a result of the analysis, there is a difference in importance according to the type of industry and the age of company at the sub-item level, but in the mid-item level the R&D capability was a key variable for discriminating high-stability SMEs. In the early stage of establishment, the funding capacity (diversification of funding methods, capital structure and capital cost which taking into account profitability) is an important variable in financial stability. However, we concluded that technology development infrastructure, which enables continuous performance as the age of company increase, becomes an important variable affecting financial stability. The classification accuracy of the model according to the age of company and industry is 71~91%, and it is confirmed that it is possible to identify high-stability SMEs by using technology appraisal items.

A Study on Analyzing Sentiments on Movie Reviews by Multi-Level Sentiment Classifier (영화 리뷰 감성분석을 위한 텍스트 마이닝 기반 감성 분류기 구축)

  • Kim, Yuyoung;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.3
    • /
    • pp.71-89
    • /
    • 2016
  • Sentiment analysis is used for identifying emotions or sentiments embedded in the user generated data such as customer reviews from blogs, social network services, and so on. Various research fields such as computer science and business management can take advantage of this feature to analyze customer-generated opinions. In previous studies, the star rating of a review is regarded as the same as sentiment embedded in the text. However, it does not always correspond to the sentiment polarity. Due to this supposition, previous studies have some limitations in their accuracy. To solve this issue, the present study uses a supervised sentiment classification model to measure a more accurate sentiment polarity. This study aims to propose an advanced sentiment classifier and to discover the correlation between movie reviews and box-office success. The advanced sentiment classifier is based on two supervised machine learning techniques, the Support Vector Machines (SVM) and Feedforward Neural Network (FNN). The sentiment scores of the movie reviews are measured by the sentiment classifier and are analyzed by statistical correlations between movie reviews and box-office success. Movie reviews are collected along with a star-rate. The dataset used in this study consists of 1,258,538 reviews from 175 films gathered from Naver Movie website (movie.naver.com). The results show that the proposed sentiment classifier outperforms Naive Bayes (NB) classifier as its accuracy is about 6% higher than NB. Furthermore, the results indicate that there are positive correlations between the star-rate and the number of audiences, which can be regarded as the box-office success of a movie. The study also shows that there is the mild, positive correlation between the sentiment scores estimated by the classifier and the number of audiences. To verify the applicability of the sentiment scores, an independent sample t-test was conducted. For this, the movies were divided into two groups using the average of sentiment scores. The two groups are significantly different in terms of the star-rated scores.

A Clinical Study of Tinnitus (耳鳴에 관한 임상적 연구)

  • Choi, In-Hwa
    • The Journal of Korean Medicine Ophthalmology and Otolaryngology and Dermatology
    • /
    • v.14 no.2
    • /
    • pp.134-145
    • /
    • 2001
  • Introduction: Noises in the ear, whether real or imagined, are called tinnitus. Subjective causes of tinnitus(which is heard only by the patient) are extremely common and the majority of them are treated conservatively. For certain individuals their tinnitus is a major handicap; for others a trivial concern. The most common from of subjective tinnitus is a rushing, hissing or buzzing noise; it is frequently associated with sensorineural heanng loss. The patient may be unaware of the hearing loss, especially if it is a high frequency deficit of moderate severity. The character of the tinnitus may give a clue to the etiology. But the patient often has difficulty in explaining his/her tinnitus in absolute terms, as they have no other tinnitus with which to compare it but their own Tinnitus, like pain, is a subjective state and trying to objectively assess the severity is problematic. Audiological techniques to match subjective loudness to machine-produced noise may offer some help, in that sound intensity matches can bear little correspondence to subjective complaint. In spite of many studies, most patients presently seen complaining of tinnitus are told by their doctors that there is no treatment and that they will have to learn to live with this symptom. Objectives: To perform a clinical analysis of tinnitus and estimate the efficacy of Oriental Medical treatment according to the Byeonjeung(辨證). Subject: We studied 34 patients with complaints of tinnitus who had visited Pundang Cha Oriental Medicine Hospital Department of Otorhinolaryngology from March 1998 to February 2000. All of them had been treated 2 or 3 times a week with acupuncture treatment and had taken herbs according to the Byeonjeung(辨證) method. It was therefore possible for me to know whether their symptoms improved or not. Parameters Observed and Method: We treated them with acupuncture & herb-medication. Sometimes we gave them moxibustion or negative therapy with bloodletting at the acupuncture points(耳門, 聽宮, 聽會). Parameters Observed 1) Distribution of age & sex 2) Chief complaints 3) The sites of tinnitus 4) The quality of tinnitu 5) The duration of disease 6) The problem induced tinnitus 7) Factors increasing disease severity 8) The classification of the Byeonjeung(辨證) 9) The efficacy of treatments Results: 1. Age and sex distribution: The most common occurrence was found in males in their twenties: 6 males($17.7\%$), and in females in their thirties and over sixty: 8 females($23.5\%$). Total patient numbers for men and women were 20 men($58.8\%$), 14 women ($41.2\%$). 2. The most frequent major complaints were hearing disturbances related to tinnitus; and dizziness with tinnitus; each comprising 10 cases($29.4\%$). There were also 7 patients($20.6\%$) with only tinnitus. 3. Tinnitus sites: 13($38.2\%$) said that they felt tinnitus in both ears, equally. In the right ear, 9($26.5\%$), in the left, 6($17.7\%$). 4. The most frequent descriptive symptoms of tinnitus were: humming, hissing, buzzing etc. 5. The duration of disease. 14cases($41.2\%$) had a duration of less than 1 year. 6. 15cases($44.1\%$) complained that it was hard to watch TV or make a phone call because of tinnitus. 10 cases($29.4\%$) complained about depression. 7. Factors increasing severity of tinnitus: ⅰ) fatigue: 18cases($52.9\%$) ⅱ) stress/ tension: 10 cases($29.4\%$) ⅲ) alcohol and tobacco: 5cases($l4.7\%$) 8. Classification through Byeonjeung : ⅰ) 19 cases($55.9\%$) were classified as showing Deficiency syndrome. ⅱ) 15 cases($44.l\%$) were classified as showing Excess syndrome. The deficiency of Qi was 7($20.6\%$), deficiency of Xue, 8($23.5\%$) and insufficiency of the Kidney Yin & Yang, 4($11.8\%$). The flare of Liver fire was 8($23.5\%$) and phlegm-fire, 7($20.6\%$), 9. The efficacy of treatments showed: an improvement in 17cases($50.0\%$); no real improvement or changes in 13 cases($38.2\%$); and some worsening in 4 cases($11.8\%$). In the group with deficiency in Qi, 4($57.1\%$) improved, 1($14.3\%$) showed no change and 2($28.6\%$) were aggravated. In the cases of deficiency in Xue, 6($75.0\%$) improved, 2($25.0\%$) showed no change. In the cases of insufficiency of Kidney Yin & Yang, 3($75.0\%$) showed no change and 1($25.0\%$) were aggravated. In the group of flare of Liver fire, 4($50.0\%$) improved, 3($37.5\%$) no change and 1($12.5\%$) were aggravated. In the cases of phlegm-fire, 3($42.9\%$) improved, 4($57.1\%$) showed no change. Conclusion: We would recommend that any further studies of tinnitus utilize trial treatments of longer than 2 months duration, as any positive effects observed in our study showed that improvement occurred fairly slowly. And we suggest that this study could be utilized as a reference for clinical Oriental Medical treatment of tinnitus. If we try to apply music or sound therapy treatment properly combined with ours, we expect it to provide psycological stability in addition to inducing masking effects, even though it may not directly decrease or completely remove tinnitus.

  • PDF

Assessment of climate change impact on aquatic ecology health indices in Han river basin using SWAT and random forest (SWAT 및 random forest를 이용한 기후변화에 따른 한강유역의 수생태계 건강성 지수 영향 평가)

  • Woo, So Young;Jung, Chung Gil;Kim, Jin Uk;Kim, Seong Joon
    • Journal of Korea Water Resources Association
    • /
    • v.51 no.10
    • /
    • pp.863-874
    • /
    • 2018
  • The purpose of this study is to evaluate the future climate change impact on stream aquatic ecology health of Han River watershed ($34,148km^2$) using SWAT (Soil and Water Assessment Tool) and random forest. The 8 years (2008~2015) spring (April to June) Aquatic ecology Health Indices (AHI) such as Trophic Diatom Index (TDI), Benthic Macroinvertebrate Index (BMI) and Fish Assessment Index (FAI) scored (0~100) and graded (A~E) by NIER (National Institute of Environmental Research) were used. The 8 years NIER indices with the water quality (T-N, $NH_4$, $NO_3$, T-P, $PO_4$) showed that the deviation of AHI score is large when the concentration of water quality is low, and AHI score had negative correlation when the concentration is high. By using random forest, one of the Machine Learning techniques for classification analysis, the classification results for the 3 indices grade showed that all of precision, recall, and f1-score were above 0.81. The future SWAT hydrology and water quality results under HadGEM3-RA RCP 4.5 and 8.5 scenarios of Korea Meteorological Administration (KMA) showed that the future nitrogen-related water quality in watershed average increased up to 43.2% by the baseflow increase effect and the phosphorus-related water quality decreased up to 18.9% by the surface runoff decrease effect. The future FAI and BMI showed a little better Index grade while the future TDI showed a little worse index grade. We can infer that the future TDI is more sensitive to nitrogen-related water quality and the future FAI and BMI are responded to phosphorus-related water quality.

A Case Study: Improvement of Wind Risk Prediction by Reclassifying the Detection Results (풍해 예측 결과 재분류를 통한 위험 감지확률의 개선 연구)

  • Kim, Soo-ock;Hwang, Kyu-Hong
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.23 no.3
    • /
    • pp.149-155
    • /
    • 2021
  • Early warning systems for weather risk management in the agricultural sector have been developed to predict potential wind damage to crops. These systems take into account the daily maximum wind speed to determine the critical wind speed that causes fruit drops and provide the weather risk information to farmers. In an effort to increase the accuracy of wind risk predictions, an artificial neural network for binary classification was implemented. In the present study, the daily wind speed and other weather data, which were measured at weather stations at sites of interest in Jeollabuk-do and Jeollanam-do as well as Gyeongsangbuk- do and part of Gyeongsangnam- do provinces in 2019, were used for training the neural network. These weather stations include 210 synoptic and automated weather stations operated by the Korean Meteorological Administration (KMA). The wind speed data collected at the same locations between January 1 and December 12, 2020 were used to validate the neural network model. The data collected from December 13, 2020 to February 18, 2021 were used to evaluate the wind risk prediction performance before and after the use of the artificial neural network. The critical wind speed of damage risk was determined to be 11 m/s, which is the wind speed reported to cause fruit drops and damages. Furthermore, the maximum wind speeds were expressed using Weibull distribution probability density function for warning of wind damage. It was found that the accuracy of wind damage risk prediction was improved from 65.36% to 93.62% after re-classification using the artificial neural network. Nevertheless, the error rate also increased from 13.46% to 37.64%, as well. It is likely that the machine learning approach used in the present study would benefit case studies where no prediction by risk warning systems becomes a relatively serious issue.

Development on Early Warning System about Technology Leakage of Small and Medium Enterprises (중소기업 기술 유출에 대한 조기경보시스템 개발에 대한 연구)

  • Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.1
    • /
    • pp.143-159
    • /
    • 2017
  • Due to the rapid development of IT in recent years, not only personal information but also the key technologies and information leakage that companies have are becoming important issues. For the enterprise, the core technology that the company possesses is a very important part for the survival of the enterprise and for the continuous competitive advantage. Recently, there have been many cases of technical infringement. Technology leaks not only cause tremendous financial losses such as falling stock prices for companies, but they also have a negative impact on corporate reputation and delays in corporate development. In the case of SMEs, where core technology is an important part of the enterprise, compared to large corporations, the preparation for technological leakage can be seen as an indispensable factor in the existence of the enterprise. As the necessity and importance of Information Security Management (ISM) is emerging, it is necessary to check and prepare for the threat of technology infringement early in the enterprise. Nevertheless, previous studies have shown that the majority of policy alternatives are represented by about 90%. As a research method, literature analysis accounted for 76% and empirical and statistical analysis accounted for a relatively low rate of 16%. For this reason, it is necessary to study the management model and prediction model to prevent leakage of technology to meet the characteristics of SMEs. In this study, before analyzing the empirical analysis, we divided the technical characteristics from the technology value perspective and the organizational factor from the technology control point based on many previous researches related to the factors affecting the technology leakage. A total of 12 related variables were selected for the two factors, and the analysis was performed with these variables. In this study, we use three - year data of "Small and Medium Enterprise Technical Statistics Survey" conducted by the Small and Medium Business Administration. Analysis data includes 30 industries based on KSIC-based 2-digit classification, and the number of companies affected by technology leakage is 415 over 3 years. Through this data, we conducted a randomized sampling in the same industry based on the KSIC in the same year, and compared with the companies (n = 415) and the unaffected firms (n = 415) 1:1 Corresponding samples were prepared and analyzed. In this research, we will conduct an empirical analysis to search for factors influencing technology leakage, and propose an early warning system through data mining. Specifically, in this study, based on the questionnaire survey of SMEs conducted by the Small and Medium Business Administration (SME), we classified the factors that affect the technology leakage of SMEs into two factors(Technology Characteristics, Organization Characteristics). And we propose a model that informs the possibility of technical infringement by using Support Vector Machine(SVM) which is one of the various techniques of data mining based on the proven factors through statistical analysis. Unlike previous studies, this study focused on the cases of various industries in many years, and it can be pointed out that the artificial intelligence model was developed through this study. In addition, since the factors are derived empirically according to the actual leakage of SME technology leakage, it will be possible to suggest to policy makers which companies should be managed from the viewpoint of technology protection. Finally, it is expected that the early warning model on the possibility of technology leakage proposed in this study will provide an opportunity to prevent technology Leakage from the viewpoint of enterprise and government in advance.

Increasing Accuracy of Stock Price Pattern Prediction through Data Augmentation for Deep Learning (데이터 증강을 통한 딥러닝 기반 주가 패턴 예측 정확도 향상 방안)

  • Kim, Youngjun;Kim, Yeojeong;Lee, Insun;Lee, Hong Joo
    • The Journal of Bigdata
    • /
    • v.4 no.2
    • /
    • pp.1-12
    • /
    • 2019
  • As Artificial Intelligence (AI) technology develops, it is applied to various fields such as image, voice, and text. AI has shown fine results in certain areas. Researchers have tried to predict the stock market by utilizing artificial intelligence as well. Predicting the stock market is known as one of the difficult problems since the stock market is affected by various factors such as economy and politics. In the field of AI, there are attempts to predict the ups and downs of stock price by studying stock price patterns using various machine learning techniques. This study suggest a way of predicting stock price patterns based on the Convolutional Neural Network(CNN) among machine learning techniques. CNN uses neural networks to classify images by extracting features from images through convolutional layers. Therefore, this study tries to classify candlestick images made by stock data in order to predict patterns. This study has two objectives. The first one referred as Case 1 is to predict the patterns with the images made by the same-day stock price data. The second one referred as Case 2 is to predict the next day stock price patterns with the images produced by the daily stock price data. In Case 1, data augmentation methods - random modification and Gaussian noise - are applied to generate more training data, and the generated images are put into the model to fit. Given that deep learning requires a large amount of data, this study suggests a method of data augmentation for candlestick images. Also, this study compares the accuracies of the images with Gaussian noise and different classification problems. All data in this study is collected through OpenAPI provided by DaiShin Securities. Case 1 has five different labels depending on patterns. The patterns are up with up closing, up with down closing, down with up closing, down with down closing, and staying. The images in Case 1 are created by removing the last candle(-1candle), the last two candles(-2candles), and the last three candles(-3candles) from 60 minutes, 30 minutes, 10 minutes, and 5 minutes candle charts. 60 minutes candle chart means one candle in the image has 60 minutes of information containing an open price, high price, low price, close price. Case 2 has two labels that are up and down. This study for Case 2 has generated for 60 minutes, 30 minutes, 10 minutes, and 5minutes candle charts without removing any candle. Considering the stock data, moving the candles in the images is suggested, instead of existing data augmentation techniques. How much the candles are moved is defined as the modified value. The average difference of closing prices between candles was 0.0029. Therefore, in this study, 0.003, 0.002, 0.001, 0.00025 are used for the modified value. The number of images was doubled after data augmentation. When it comes to Gaussian Noise, the mean value was 0, and the value of variance was 0.01. For both Case 1 and Case 2, the model is based on VGG-Net16 that has 16 layers. As a result, 10 minutes -1candle showed the best accuracy among 60 minutes, 30 minutes, 10 minutes, 5minutes candle charts. Thus, 10 minutes images were utilized for the rest of the experiment in Case 1. The three candles removed from the images were selected for data augmentation and application of Gaussian noise. 10 minutes -3candle resulted in 79.72% accuracy. The accuracy of the images with 0.00025 modified value and 100% changed candles was 79.92%. Applying Gaussian noise helped the accuracy to be 80.98%. According to the outcomes of Case 2, 60minutes candle charts could predict patterns of tomorrow by 82.60%. To sum up, this study is expected to contribute to further studies on the prediction of stock price patterns using images. This research provides a possible method for data augmentation of stock data.

  • PDF

Development of Deep Learning Structure to Improve Quality of Polygonal Containers (다각형 용기의 품질 향상을 위한 딥러닝 구조 개발)

  • Yoon, Suk-Moon;Lee, Seung-Ho
    • Journal of IKEEE
    • /
    • v.25 no.3
    • /
    • pp.493-500
    • /
    • 2021
  • In this paper, we propose the development of deep learning structure to improve quality of polygonal containers. The deep learning structure consists of a convolution layer, a bottleneck layer, a fully connect layer, and a softmax layer. The convolution layer is a layer that obtains a feature image by performing a convolution 3x3 operation on the input image or the feature image of the previous layer with several feature filters. The bottleneck layer selects only the optimal features among the features on the feature image extracted through the convolution layer, reduces the channel to a convolution 1x1 ReLU, and performs a convolution 3x3 ReLU. The global average pooling operation performed after going through the bottleneck layer reduces the size of the feature image by selecting only the optimal features among the features of the feature image extracted through the convolution layer. The fully connect layer outputs the output data through 6 fully connect layers. The softmax layer multiplies and multiplies the value between the value of the input layer node and the target node to be calculated, and converts it into a value between 0 and 1 through an activation function. After the learning is completed, the recognition process classifies non-circular glass bottles by performing image acquisition using a camera, measuring position detection, and non-circular glass bottle classification using deep learning as in the learning process. In order to evaluate the performance of the deep learning structure to improve quality of polygonal containers, as a result of an experiment at an authorized testing institute, it was calculated to be at the same level as the world's highest level with 99% good/defective discrimination accuracy. Inspection time averaged 1.7 seconds, which was calculated within the operating time standards of production processes using non-circular machine vision systems. Therefore, the effectiveness of the performance of the deep learning structure to improve quality of polygonal containers proposed in this paper was proven.

Development of 1ST-Model for 1 hour-heavy rain damage scale prediction based on AI models (1시간 호우피해 규모 예측을 위한 AI 기반의 1ST-모형 개발)

  • Lee, Joonhak;Lee, Haneul;Kang, Narae;Hwang, Seokhwan;Kim, Hung Soo;Kim, Soojun
    • Journal of Korea Water Resources Association
    • /
    • v.56 no.5
    • /
    • pp.311-323
    • /
    • 2023
  • In order to reduce disaster damage by localized heavy rains, floods, and urban inundation, it is important to know in advance whether natural disasters occur. Currently, heavy rain watch and heavy rain warning by the criteria of the Korea Meteorological Administration are being issued in Korea. However, since this one criterion is applied to the whole country, we can not clearly recognize heavy rain damage for a specific region in advance. Therefore, in this paper, we tried to reset the current criteria for a special weather report which considers the regional characteristics and to predict the damage caused by rainfall after 1 hour. The study area was selected as Gyeonggi-province, where has more frequent heavy rain damage than other regions. Then, the rainfall inducing disaster or hazard-triggering rainfall was set by utilizing hourly rainfall and heavy rain damage data, considering the local characteristics. The heavy rain damage prediction model was developed by a decision tree model and a random forest model, which are machine learning technique and by rainfall inducing disaster and rainfall data. In addition, long short-term memory and deep neural network models were used for predicting rainfall after 1 hour. The predicted rainfall by a developed prediction model was applied to the trained classification model and we predicted whether the rain damage after 1 hour will be occurred or not and we called this as 1ST-Model. The 1ST-Model can be used for preventing and preparing heavy rain disaster and it is judged to be of great contribution in reducing damage caused by heavy rain.