• Title/Summary/Keyword: Machine Learning

Crab Landing QAR (Quick Access Recorder) Flight Data Statistical Analysis Model (크랩랜딩(Crab Landing) QAR(Quick Access Recorder) 비행 데이터 통계분석 모델)

  • Jeon Je-Hyung;Kim Hyeon-deok
    • Journal of Advanced Navigation Technology
    • /
    • v.28 no.2
    • /
    • pp.185-192
    • /
    • 2024
  • The aviation industry has improved safety through technological innovation and has strengthened flight safety through regulation and supervision by aviation authorities. As the industry's safety approach has evolved into a systematic approach to the aircraft system, airlines have established safety management systems. Technical defects or abnormal data in an aircraft can be warning signs of a potential accident, and the risk of an accident can be reduced by identifying and responding to these signs early. Management of abnormal warning signs is therefore essential to promoting data-based decision-making and enhancing the operational efficiency and safety level of airlines. In this study, we present a model that statistically analyzes quick access recorder (QAR) flight data at the preliminary analysis stage in order to examine the patterns and causes of crab landing events, which can lead to runway departures during landing, and to identify precursors of such events. By identifying these signs and causes, we aim to contribute to more efficient safety management.

Analyzing fashion item purchase patterns and channel transition patterns using association rules and brand loyalty in big data (빅데이터의 연관규칙과 브랜드 충성도를 활용한 패션품목 구매패턴과 구매채널 전환패턴 분석)

  • Ki Yong Kwon
    • The Research Journal of the Costume Culture
    • /
    • v.32 no.2
    • /
    • pp.199-214
    • /
    • 2024
  • Until now, research on consumers' purchasing behavior has primarily focused on psychological aspects or depended on consumer surveys. However, there may be a gap between consumers' self-reported perceptions and their observable actions. In response, this study aimed to investigate consumer purchasing behavior utilizing a big data approach. To this end, this study investigated the purchasing patterns of fashion items, both online and in retail stores, from a data-driven perspective. We also investigated whether individual consumers switched between online websites and retail establishments for making purchases. Data on 516,474 purchases were obtained from fashion companies. We used association rule analysis and K-means clustering to identify purchase patterns that were influenced by customer loyalty. Furthermore, sequential pattern analysis was applied to investigate the usage patterns of online and offline channels by consumers. The results showed that high-loyalty consumers mainly purchased infrequently bought items in the brand line, as well as high-priced items, and that these purchase patterns were similar both online and in stores. In contrast, the low-loyalty group showed different purchasing behaviors for online versus in-store purchases. In physical environments, the low-loyalty consumers tended to purchase less popular or more expensive items from the brand line, whereas in online environments, their purchases centered around items with relatively high sales volumes. Finally, we found that both high and low loyalty groups exclusively used a single preferred channel, either online or in-store. The findings help companies better understand consumer purchase patterns and build future marketing strategies around items with high brand centrality.
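
The association rule analysis mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the baskets, the item names, the thresholds, and the function name `association_rules` are all hypothetical, and only pairwise rules are mined for brevity. The sketch computes support, confidence, and lift for each rule.

```python
from collections import Counter
from itertools import combinations


def association_rules(transactions, min_support=0.3, min_confidence=0.6):
    """Mine pairwise rules A -> B and return (A, B, support, confidence, lift)."""
    n = len(transactions)
    baskets = [set(t) for t in transactions]
    # How many baskets contain each item, and each unordered item pair.
    item_counts = Counter(item for b in baskets for item in b)
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(b), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for ante, cons in ((a, b), (b, a)):
            confidence = count / item_counts[ante]
            lift = confidence / (item_counts[cons] / n)
            if confidence >= min_confidence:
                rules.append((ante, cons, support, confidence, lift))
    # Strongest lift first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[4], r[0], r[1]))


# Hypothetical purchase baskets (not data from the study).
transactions = [
    ["coat", "scarf"],
    ["coat", "scarf", "gloves"],
    ["t-shirt", "jeans"],
    ["coat", "gloves"],
    ["t-shirt", "jeans", "scarf"],
]
rules = association_rules(transactions)
for ante, cons, support, confidence, lift in rules:
    print(f"{ante} -> {cons}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 indicates that the two items are bought together more often than their individual frequencies would suggest.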

A Study on the i-YOLOX Architecture for Multiple Object Detection and Classification of Household Waste (생활 폐기물 다중 객체 검출과 분류를 위한 i-YOLOX 구조에 관한 연구)

  • Weiguang Wang;Kyung Kwon Jung;Taewon Lee
    • Convergence Security Journal
    • /
    • v.23 no.5
    • /
    • pp.135-142
    • /
    • 2023
  • In addressing the prominent issues of climate change, resource scarcity, and environmental pollution associated with household waste, extensive research has been conducted on intelligent waste classification methods. These efforts range from traditional classification algorithms to machine learning and neural networks. However, challenges persist in effectively classifying waste in diverse environments and conditions due to insufficient datasets, increased complexity in neural network architectures, and performance limitations for real-world applications. Therefore, this paper proposes i-YOLOX as a solution for rapid classification and improved accuracy. The proposed model is evaluated based on network parameters, detection speed, and accuracy. To achieve this, a dataset comprising 10,000 samples of household waste, spanning 17 waste categories, is created. The i-YOLOX architecture is constructed by introducing the Involution channel convolution operator and the Convolutional Block Attention Module (CBAM) into the YOLOX structure. A comparative analysis is conducted with the performance of the existing YOLO architecture. Experimental results demonstrate that i-YOLOX enhances the detection speed and accuracy of waste objects in complex scenes compared to conventional neural networks. This confirms the effectiveness of the proposed i-YOLOX architecture in the detection and classification of multiple household waste objects.

An analysis of the waning effect of COVID-19 vaccinations

  • Bogyeom Lee;Hanbyul Song;Catherine Apio;Kyulhee Han;Jiwon Park;Zhe Liu;Hu Xuwen;Taesung Park
    • Genomics & Informatics
    • /
    • v.21 no.4
    • /
    • pp.50.1-50.9
    • /
    • 2023
  • Vaccine development is one of the key efforts to control the spread of coronavirus disease 2019 (COVID-19). However, it has become apparent that the immunity acquired through vaccination is not permanent, a phenomenon known as the waning effect. Therefore, monitoring the proportion of the population with immunity is essential to improve the forecasting of future waves of the pandemic. Despite this, the impact of the waning effect on forecasting accuracy has not been extensively studied. We propose a method for estimating the effective immunity (EI) rate, which represents the waning effect by integrating the second and booster doses of COVID-19 vaccines. The EI rate, with different periods to the onset of the waning effect, was incorporated into three statistical models and two machine learning models. The Stringency Index, omicron variant BA.5 rate (BA.5 rate), booster shot rate (BSR), and EI rate were used as covariates, and the best covariate combination was selected using prediction error. Among the prediction results, the Generalized Additive Model showed the greatest improvement with the EI rate (an 86% decrease in test error). Furthermore, we confirmed that South Korea's decision to recommend booster shots after 90 days is reasonable, since setting the onset of the waning effect at 90 days after the last vaccine dose improves the prediction of confirmed cases and deaths. Substituting the EI rate for the BSR in statistical models not only results in better predictions but also makes it possible to forecast a potential wave and help the local community react proactively to a rapid increase in confirmed cases.

Development of Sentiment Analysis Model for the hot topic detection of online stock forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

  • Hong, Taeho;Lee, Taewon;Li, Jingjing
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.187-204
    • /
    • 2016
  • Document classification based on emotional polarity has become an important emerging task owing to the explosion of data on the Web. In the big data age, there are too many information sources to refer to when making decisions. For example, when considering travel to a city, a person may search for reviews through a search engine such as Google or on social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide whether or not to make a trip. Sentiment analysis of customer reviews has become an important research topic as data mining technology is widely adopted for text mining of the Web. Sentiment analysis has been used to classify documents through machine learning techniques such as decision trees, neural networks, and support vector machines (SVMs). It is used to determine the attitude, position, and sensibility of people who write articles about various topics that are published on the Web. Regardless of their polarity, emotional customer reviews are very helpful materials for analyzing customers' opinions. Sentiment analysis helps in quickly understanding what customers really want by means of automated text mining techniques. It applies text mining techniques to text on the Web in order to extract the subjective information it contains, and it is used to determine the attitude or position of the person who wrote an article and expressed an opinion about a particular topic. In this study, we developed a model that selects hot topics from user posts on China's online stock forum by using the k-means algorithm and the self-organizing map (SOM). In addition, we developed a detection model that predicts hot topics by using machine learning techniques such as logit, decision trees, and SVM.
We employed sentiment analysis to develop our model for the selection and detection of hot topics from China's online stock forum. Sentiment analysis calculates a sentiment value for a document by comparing and classifying its content against a polarity (positive or negative) sentiment dictionary. The online stock forum was an attractive site because of its information about stock investment. Users post numerous texts about stock movements, analyzing the market according to government policy announcements, market reports, reports from research institutes on the economy, and even rumors. We divided the online forum's topics into 21 categories to apply sentiment analysis. One hundred forty-four topics were selected among the 21 categories at online stock forums. The posts were crawled to build a positive and negative text database. We ultimately obtained 21,141 posts on 88 topics by preprocessing the text from March 2013 to February 2015. An interest index was defined to select the hot topics, and the k-means algorithm and SOM presented equivalent results with this data. We developed decision tree models to detect hot topics with three algorithms: CHAID, CART, and C4.5. The results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from negative data. The SVM models were trained with the radial basis function (RBF) kernel, with parameters chosen by a grid search. The detection of hot topics by using sentiment analysis provides investors with the latest trends and hot topics in the stock forum so that they no longer need to search the vast amounts of information on the Web. Our proposed model is also helpful for rapidly determining customers' signals or attitudes towards government policy and firms' products and services.
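
The dictionary-based scoring described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: the word lists, the English example post, the function name `sentiment_score`, and the normalization `(pos - neg) / (pos + neg)` are hypothetical choices; the study worked with posts from a Chinese stock forum and does not state its exact formula here.

```python
def sentiment_score(tokens, positive, negative):
    """Polarity score in [-1, 1]; returns 0.0 when no dictionary word occurs."""
    pos = sum(1 for t in tokens if t in positive)
    neg = sum(1 for t in tokens if t in negative)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


# Hypothetical polarity dictionary (assumption, for illustration only).
POSITIVE = {"rally", "gain", "bullish", "surge"}
NEGATIVE = {"loss", "crash", "bearish", "drop"}

# Hypothetical forum post, already tokenized by whitespace.
post = "bullish rally expected after the drop".split()
score = sentiment_score(post, POSITIVE, NEGATIVE)
print(round(score, 3))
```

Per-post scores of this kind could then be aggregated by topic and fed to a classifier, which is the role the abstract assigns to the decision tree and SVM models.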

Validation of nutrient intake of smartphone application through comparison of photographs before and after meals (식사 전후의 사진 비교를 통한 스마트폰 앱의 영양소섭취량 타당도 평가)

  • Lee, Hyejin;Kim, Eunbin;Kim, Su Hyeon;Lim, Haeun;Park, Yeong Mi;Kang, Joon Ho;Kim, Heewon;Kim, Jinho;Park, Woong-Yang;Park, Seongjin;Kim, Jinki;Yang, Yoon Jung
    • Journal of Nutrition and Health
    • /
    • v.53 no.3
    • /
    • pp.319-328
    • /
    • 2020
  • Purpose: This study was conducted to evaluate the validity of the Gene-Health application in terms of estimating energy and macronutrients. Methods: The subjects were 98 healthy adults participating in a weight-control intervention study. They recorded their diets in the Gene-Health application, took photographs before and after every meal on the same day, and uploaded them to the Gene-Health application. The amounts of foods and drinks consumed were estimated from the photographs by trained experts, and the nutrient intakes were calculated using the CAN-Pro 5.0 program; this method was named 'Photo Estimation'. The energy and macronutrients estimated from the Gene-Health application were compared with those from the Photo Estimation. The mean differences in energy and macronutrient intakes between the two methods were compared using a paired t-test. Results: The mean energy intakes of Gene-Health and Photo Estimation were 1,937.0 kcal and 1,928.3 kcal, respectively. There were no significant differences in intakes of energy, carbohydrate, fat, and energy from fat (%) between the two methods. The protein intake and energy from protein (%) of the Gene-Health were higher than those from the Photo Estimation. The energy from carbohydrate (%) for the Photo Estimation was higher than that of the Gene-Health. The Pearson correlation coefficients, weighted Kappa coefficients, and adjacent agreements for energy and macronutrient intakes between the two methods ranged from 0.382 to 0.607, 0.588 to 0.649, and 79.6% to 86.7%, respectively. Conclusion: The Gene-Health application shows acceptable validity as a dietary intake assessment tool for energy and macronutrients. Further studies with female subjects and various age groups will be needed.
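
The two comparison statistics named in the abstract above, the paired t-test and the Pearson correlation coefficient, can be computed as in the following minimal sketch. The intake values are invented for illustration and are not data from the study; the function names `paired_t` and `pearson_r` are likewise hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # t = mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical daily energy intakes in kcal for five subjects.
app = [1900.0, 2100.0, 1750.0, 2000.0, 1850.0]    # application record
photo = [1880.0, 2050.0, 1800.0, 1990.0, 1900.0]  # photo-based estimate

t_stat, df = paired_t(app, photo)
r = pearson_r(app, photo)
print(f"t = {t_stat:.3f} (df = {df}), r = {r:.3f}")
```

The t statistic would be compared against the t distribution with `df` degrees of freedom to obtain a p-value; that lookup is omitted to keep the sketch within the standard library.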

Thermal Characteristics of Daegu using Land Cover Data and Satellite-derived Surface Temperature Downscaled Based on Machine Learning (기계학습 기반 상세화를 통한 위성 지표면온도와 환경부 토지피복도를 이용한 열환경 분석: 대구광역시를 중심으로)

  • Yoo, Cheolhee;Im, Jungho;Park, Seonyoung;Cho, Dongjin
    • Korean Journal of Remote Sensing
    • /
    • v.33 no.6_2
    • /
    • pp.1101-1118
    • /
    • 2017
  • Temperatures in urban areas are steadily rising due to rapid urbanization and ongoing climate change. Since the spatial distribution of heat in a city varies by region, it is crucial to investigate detailed thermal characteristics of urban areas. Recently, many studies have been conducted to identify thermal characteristics of urban areas using satellite data. However, satellite data are not sufficient for precise analysis due to the trade-off of temporal and spatial resolutions. In this study, in order to examine the thermal characteristics of Daegu Metropolitan City during the summers between 2012 and 2016, Moderate Resolution Imaging Spectroradiometer (MODIS) daytime and nighttime land surface temperature (LST) data at 1 km spatial resolution were downscaled to a spatial resolution of 250 m using a machine learning method called random forest. Compared to the original 1 km LST, the downscaled 250 m LST showed a higher correlation between the proportion of impervious areas and mean land surface temperatures in Daegu by the administrative neighborhood unit. Hot spot analysis was then conducted using downscaled daytime and nighttime 250 m LST. The clustered hot spot areas for daytime and nighttime were compared and examined based on the land cover data provided by the Ministry of Environment. The high-value hot spots were relatively more clustered in industrial and commercial areas during the daytime and in residential areas at night. The thermal characterization of urban areas using the method proposed in this study is expected to contribute to the establishment of city and national security policies.

Building battery deterioration prediction model using real field data (머신러닝 기법을 이용한 납축전지 열화 예측 모델 개발)

  • Choi, Keunho;Kim, Gunwoo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.243-264
    • /
    • 2018
  • Although the worldwide battery market has recently spurred the development of lithium secondary batteries, lead-acid batteries (rechargeable batteries), which perform well and can be reused, are used in a wide range of industries. However, lead-acid batteries have a serious problem: once even one of the several cells packed in a battery begins to degrade, deterioration of the whole battery progresses quickly. To overcome this problem, previous studies have attempted to identify the mechanism of battery deterioration in many ways. However, most of them analyzed data obtained in a laboratory rather than data obtained in the real world. Using real data can increase the feasibility and applicability of research findings. Therefore, this study aims to develop a model that predicts battery deterioration using data obtained in the real world. To this end, we collected data on changes in battery state by attaching sensors that monitor battery condition in real time to dozens of golf carts operated on an actual golf course. As a result, a total of 16,883 samples were obtained. We then developed a model that predicts a precursor phenomenon of battery deterioration by analyzing the sensor data using machine learning techniques.
As initial independent variables, we used 1) inbound time of a cart, 2) outbound time of a cart, 3) duration (from outbound time to charge time), 4) charge amount, 5) used amount, 6) charge efficiency, 7) lowest temperature of battery cells 1 to 6, 8) lowest voltage of battery cells 1 to 6, 9) highest voltage of battery cells 1 to 6, 10) voltage of battery cells 1 to 6 at the beginning of operation, 11) voltage of battery cells 1 to 6 at the end of charge, 12) used amount of battery cells 1 to 6 during operation, 13) used amount of battery during operation (Max-Min), 14) duration of battery use, and 15) highest current during operation. Since the per-cell variables (lowest temperature, lowest voltage, highest voltage, voltage at the beginning of operation, voltage at the end of charge, and used amount during operation, each for battery cells 1 to 6) take similar values across cells, we conducted principal component analysis with varimax orthogonal rotation in order to mitigate the multicollinearity problem. Based on the results, we created new variables by averaging the values of the independent variables that clustered together and used them as the final independent variables in place of the original variables, thereby reducing the dimension. We used decision tree, logistic regression, and Bayesian network algorithms to build prediction models. We also built prediction models using bagging of each of them, boosting of each of them, and RandomForest. Experimental results show that the prediction model using bagging of decision trees yields the best accuracy, 89.3923%. This study has a limitation in that additional variables that affect battery deterioration, such as weather (temperature, humidity) and driving habits, were not considered; we would like to consider them in future research.
However, the battery deterioration prediction model proposed in this study is expected to enable effective and efficient management of batteries used in the real field and, accordingly, to dramatically reduce the cost caused by undetected battery deterioration.
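
Bagging, the ensemble technique that gave the best accuracy in the abstract above, can be sketched in a few lines. This is only an illustration: one-split decision stumps stand in for the full decision trees used in the study, and the two features (lowest cell voltage, charge efficiency), the six samples, and all function names are hypothetical. Each stump is trained on a bootstrap resample, and the ensemble predicts by majority vote.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Return (feature, threshold, left_label, right_label) with fewest errors."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            # Each side predicts its majority label; an empty right side
            # falls back to the left label.
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, row):
    f, thr, l_lab, r_lab = stump
    return l_lab if row[f] <= thr else r_lab


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagging(models, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical samples: [lowest cell voltage (V), charge efficiency];
# label 1 marks a deterioration precursor, 0 a healthy battery.
X = [[1.90, 0.70], [1.80, 0.65], [1.95, 0.72],
     [2.10, 0.90], [2.15, 0.92], [2.05, 0.88]]
y = [1, 1, 1, 0, 0, 0]

models = fit_bagging(X, y)
print(predict_bagging(models, [1.70, 0.60]), predict_bagging(models, [2.30, 0.95]))
```

Averaging many models trained on resampled data reduces the variance of an unstable learner such as a decision tree, which is the usual motivation for bagging.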

A Research in Applying Big Data and Artificial Intelligence on Defense Metadata using Multi Repository Meta-Data Management (MRMM) (국방 빅데이터/인공지능 활성화를 위한 다중메타데이터 저장소 관리시스템(MRMM) 기술 연구)

  • Shin, Philip Wootaek;Lee, Jinhee;Kim, Jeongwoo;Shin, Dongsun;Lee, Youngsang;Hwang, Seung Ho
    • Journal of Internet Computing and Services
    • /
    • v.21 no.1
    • /
    • pp.169-178
    • /
    • 2020
  • Reductions in troops and human resources and the push to improve combat power have led the Korean Department of Defense to actively adopt Fourth Industrial Revolution technologies (artificial intelligence, big data). Defense information systems have been developed in various ways according to the tasks and the uniqueness of each military branch. To take full advantage of Fourth Industrial Revolution technologies, the closed defense data management system needs to be improved. However, establishing and using data standards across all information systems for defense big data and artificial intelligence is limited by security issues, the business characteristics of each military branch, and the difficulty of standardizing large-scale systems. Data sharing is currently limited to direct linkage under interoperability agreements between systems, based on the interworking requirements of each system. To implement smart defense using Fourth Industrial Revolution technologies, a system that can share defense data and make good use of it is urgently needed. To support this technically, it is critical to develop Multi Repository Meta-Data Management (MRMM), which supports systematic standard management of defense data by managing the enterprise standard and the standard mapping for each system, and which promotes data interoperability through linkage between standards in compliance with the Defense Interoperability Management Development Guidelines. We introduce MRMM and implement it using vocabulary similarity based on machine learning and statistical approaches. Based on MRMM, we expect to simplify the standardization and integration of all military databases using artificial intelligence and big data. This will lead to a large reduction in the defense budget while increasing combat power for implementing smart defense.
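
The abstract above does not say how vocabulary similarity is computed. As one simple statistical possibility, the sketch below maps a field name from one system to the closest term in a standard vocabulary using the Jaccard similarity of character bigrams. The measure, the field names, and the function names are all assumptions made for illustration, not details of MRMM.

```python
def bigrams(term):
    """Set of character bigrams of a term, ignoring case and underscores."""
    s = term.lower().replace("_", "")
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a, b):
    """Jaccard similarity of the bigram sets of two terms, in [0, 1]."""
    set_a, set_b = bigrams(a), bigrams(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def best_match(term, standard_terms):
    """Return the standard term most similar to the given term."""
    return max(standard_terms, key=lambda s: jaccard(term, s))


# Hypothetical standard vocabulary and a hypothetical legacy field name.
standard = ["unit_code", "rank_code", "unit_name"]
match = best_match("unit_cd", standard)
print(match, jaccard("unit_cd", match))
```

In practice a threshold on the similarity would decide whether a mapping is accepted automatically or sent for manual review.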

IPC Multi-label Classification based on Functional Characteristics of Fields in Patent Documents (특허문서 필드의 기능적 특성을 활용한 IPC 다중 레이블 분류)

  • Lim, Sora;Kwon, YongJin
    • Journal of Internet Computing and Services
    • /
    • v.18 no.1
    • /
    • pp.77-88
    • /
    • 2017
  • With the advent of a knowledge-based society in which information and knowledge create value, patents, the representative form of intellectual property, have become important, and the number of patents keeps growing. To use this vast amount of patent information effectively, patents need to be classified appropriately according to the technological topic of the invention. The IPC (International Patent Classification) is widely used for this purpose. Automatic IPC classification has been studied using data mining and machine learning algorithms to improve on the current practice of categorizing patent documents by hand. However, most previous studies have focused on applying existing machine learning methods to patent documents rather than considering the characteristics of the data or the structure of patent documents. In this paper, we therefore propose using two structural fields, technical field and background, which are considered to affect patent classification; the two fields are selected based on the characteristics of patent documents and the role of the structural fields. We also construct a multi-label classification model to reflect the fact that a patent document can have multiple IPCs. Furthermore, we propose a method for classifying patent documents at the IPC subclass level, which comprises 630 categories, in order to investigate whether the IPC multi-label classification model can be applied in practice. The effect of the structural fields of patent documents is examined using 564,793 registered patents in Korea, and a precision of 87.2% is obtained when using the title, abstract, claims, technical field, and background. From this result, we verify that the technical field and background play an important role in improving the precision of IPC multi-label classification at the IPC subclass level.
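
To make the precision figure in the abstract above concrete, the following sketch computes precision for multi-label predictions. The abstract does not state which averaging the authors used, so micro-averaging is an assumption here, and the label sets and the function name `micro_precision` are hypothetical examples rather than data from the study.

```python
def micro_precision(true_labels, predicted_labels):
    """Share of all predicted labels that are correct, pooled over documents."""
    correct = sum(len(t & p) for t, p in zip(true_labels, predicted_labels))
    predicted = sum(len(p) for p in predicted_labels)
    return correct / predicted if predicted else 0.0


# Hypothetical IPC subclass assignments for three patent documents.
true_ipc = [{"G06F", "G06N"}, {"H04L"}, {"A61K", "C07D"}]
pred_ipc = [{"G06F", "G06Q"}, {"H04L"}, {"A61K"}]

precision = micro_precision(true_ipc, pred_ipc)
print(precision)
```

Here three of the four predicted subclasses are correct, so the precision is 0.75; the missed labels (G06N and C07D) would lower recall but do not affect precision.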