• Title/Summary/Keyword: 데이터 분석성능

Search Result 5,934, Processing Time 0.031 seconds

Estimation of Chlorophyll-a Concentration in Nakdong River Using Machine Learning-Based Satellite Data and Water Quality, Hydrological, and Meteorological Factors (머신러닝 기반 위성영상과 수질·수문·기상 인자를 활용한 낙동강의 Chlorophyll-a 농도 추정)

  • Soryeon Park;Sanghun Son;Jaegu Bae;Doi Lee;Dongju Seo;Jinsoo Kim
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_1
    • /
    • pp.655-667
    • /
    • 2023
  • Algal bloom outbreaks are frequently reported around the world, and serious water pollution problems arise every year in Korea. It is necessary to protect the aquatic ecosystem through continuous management and rapid response. Many studies using satellite images are being conducted to estimate the concentration of chlorophyll-a (Chl-a), an indicator of algal bloom occurrence. However, machine learning models have recently been used because it is difficult to accurately calculate Chl-a due to the spectral characteristics and atmospheric correction errors that change depending on the water system. It is necessary to consider the factors affecting algal bloom as well as the satellite spectral index. Therefore, this study constructed a dataset by considering water quality, hydrological and meteorological factors, and sentinel-2 images in combination. Representative ensemble models random forest and extreme gradient boosting (XGBoost) were used to predict the concentration of Chl-a in eight weirs located on the Nakdong river over the past five years. R-squared score (R2), root mean square errors (RMSE), and mean absolute errors (MAE) were used as model evaluation indicators, and it was confirmed that R2 of XGBoost was 0.80, RMSE was 6.612, and MAE was 4.457. Shapley additive expansion analysis showed that water quality factors, suspended solids, biochemical oxygen demand, dissolved oxygen, and the band ratio using red edge bands were of high importance in both models. Various input data were confirmed to help improve model performance, and it seems that it can be applied to domestic and international algal bloom detection.

A Comparative Study On Accident Prediction Model Using Nonlinear Regression And Artificial Neural Network, Structural Equation for Rural 4-Legged Intersection (비선형 회귀분석, 인공신경망, 구조방정식을 이용한 지방부 4지 신호교차로 교통사고 예측모형 성능 비교 연구)

  • Oh, Ju Taek;Yun, Ilsoo;Hwang, Jeong Won;Han, Eum
    • Journal of Korean Society of Transportation
    • /
    • v.32 no.3
    • /
    • pp.266-279
    • /
    • 2014
  • For the evaluation of roadway safety, diverse methods, including before-after studies, simple comparison using historic traffic accident data, methods based on experts' opinion or literature, have been applied. Especially, many research efforts have developed traffic accident prediction models in order to identify critical elements causing accidents and evaluate the level of safety. A traffic accident prediction model must secure predictability and transferability. By acquiring the predictability, the model can increase the accuracy in predicting the frequency of accidents qualitatively and quantitatively. By guaranteeing the transferability, the model can be used for other locations with acceptable accuracy. To this end, traffic accident prediction models using non-linear regression, artificial neural network, and structural equation were developed in this study. The predictability and transferability of three models were compared using a model development data set collected from 90 signalized intersections and a model validation data set from other 33 signalized intersections based on mean absolute deviation and mean squared prediction error. As a result of the comparison using the model development data set, the artificial neural network showed the highest predictability. However, the non-linear regression model was found out to be most appropriate in the comparison using the model validation data set. Conclusively, the artificial neural network has a strong ability in representing the relationship between the frequency of traffic accidents and traffic and road design elements. However, the predictability of the artificial neural network significantly decreased when the artificial neural network was applied to a new data which was not used in the model developing.

The Development of Quality Assurance Program for CyberKnife (사이버나이프의 품질관리 절차서 개발)

  • Jang, Ji-Sun;Kang, Young-Nam;Shin, Dong-Oh;Kim, Moon-Chan;Yoon, Sei-Chul;Choi, Ihl-Bohng;Kim, Mi-Sook;Cho, Chul-Koo;Yoo, Seong-Yul;Kwon, Soo-Il;Lee, Dong-Han
    • Radiation Oncology Journal
    • /
    • v.24 no.3
    • /
    • pp.185-191
    • /
    • 2006
  • [ $\underline{Purpose}$ ]: Standardization quality assurance (QA) program of CyberKnife for suitable circumstances in Korea has not been established. In this research, we investigated the development of QA program for CyberKnife and evaluation of the feasibility under applications. $\underline{Materials\;and\;Methods}$: Considering the feature of constitution for systems and the therapeutic methodology of CyberKnife, the list of quality control (QC) was established and divided dependent on the each period of operations. And then all these developed QC lists were categorized into three groups such as basic QC, delivery specific QC, and patient specific QC based on the each purpose of QA. In order to verify the validity of the established QA program, this QC lists was applied to two CyberKnife centers. The acceptable tolerance was based on the undertaking inspection list from the CyberKnife manufacturer and the QC results during last three years of two CyberKnife centers in Korea. The acquired measurement results were evaluated for the analysis of the current QA status and the verification of the propriety for the developed QA program. $\underline{Results}$: The current QA status of two CyberKnife centers was evaluated from the accuracy of all measurements in relation with application of the established QA program. Each measurement result was verified having a good agreement within the acceptable tolerance limit of the developed QA program. $\underline{Conclusion}$: It is considered that the developed QA program in this research could be established the standardization of QC methods for CyberKnife and confirmed the accuracy and stability for the image-guided stereotactic radiotherapy.

Characteristics of Indoor Particulate Matter Concentrations by Size at an Apartment House During Dusty-Day (황사 발생시 아파트 실내에서 미세먼지 크기별 농도 특성)

  • Joo, Sang-Woo;Ji, Jun-Ho
    • Particle and aerosol research
    • /
    • v.15 no.1
    • /
    • pp.37-44
    • /
    • 2019
  • It is recommended for the public to stay at home and to close the doors and windows when a high-particulate-matter environment such as a yellow sand event occurs outside. However, there are lack of empirical studies describing how much outdoor PM infiltrates into a closed house and how much indoor PM an inhabitant is exposed to during the period. In this study, the $PM_{10}$ and $PM_{2.5}$ were measured at the kitchen in an apartment house by an optical particle counter for 3 days including a yellow sand event. The outdoor PMs and the outdoor wind speeds were referred from surrounding weather stations. We analyzed the penetration of $PM_{10-2.5}$ and $PM_{2.5}$ at the test house against the outdoor wind speed supposed corresponding to the change of air exchange rate. In addition, the effect of an indoor activity on change in the indoor PM was investigated. In result, the indoor $PM_{10-2.5}$ was very low even a yellow sand event occurred outside; rather, a contribution of indoor activities to increase in $PM_{10-2.5}$ was higher. In contrast, the indoor $PM_{2.5}$ fluctuated following the outdoor $PM_{2.5}$ trend at high wind speeds or remained almost constant at low wind speed.

Parallel Processing of Satellite Images using CUDA Library: Focused on NDVI Calculation (CUDA 라이브러리를 이용한 위성영상 병렬처리 : NDVI 연산을 중심으로)

  • LEE, Kang-Hun;JO, Myung-Hee;LEE, Won-Hee
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.19 no.3
    • /
    • pp.29-42
    • /
    • 2016
  • Remote sensing allows acquisition of information across a large area without contacting objects, and has thus been rapidly developed by application to different areas. Thus, with the development of remote sensing, satellites are able to rapidly advance in terms of their image resolution. As a result, satellites that use remote sensing have been applied to conduct research across many areas of the world. However, while research on remote sensing is being implemented across various areas, research on data processing is presently insufficient; that is, as satellite resources are further developed, data processing continues to lag behind. Accordingly, this paper discusses plans to maximize the performance of satellite image processing by utilizing the CUDA(Compute Unified Device Architecture) Library of NVIDIA, a parallel processing technique. The discussion in this paper proceeds as follows. First, standard KOMPSAT(Korea Multi-Purpose Satellite) images of various sizes are subdivided into five types. NDVI(Normalized Difference Vegetation Index) is implemented to the subdivided images. Next, ArcMap and the two techniques, each based on CPU or GPU, are used to implement NDVI. The histograms of each image are then compared after each implementation to analyze the different processing speeds when using CPU and GPU. The results indicate that both the CPU version and GPU version images are equal with the ArcMap images, and after the histogram comparison, the NDVI code was correctly implemented. In terms of the processing speed, GPU showed 5 times faster results than CPU. Accordingly, this research shows that a parallel processing technique using CUDA Library can enhance the data processing speed of satellites images, and that this data processing benefits from multiple advanced remote sensing techniques as compared to a simple pixel computation like NDVI.

A Dynamic Prefetch Filtering Schemes to Enhance Usefulness Of Cache Memory (캐시 메모리의 유용성을 높이는 동적 선인출 필터링 기법)

  • Chon Young-Suk;Lee Byung-Kwon;Lee Chun-Hee;Kim Suk-Il;Jeon Joong-Nam
    • The KIPS Transactions:PartA
    • /
    • v.13A no.2 s.99
    • /
    • pp.123-136
    • /
    • 2006
  • The prefetching technique is an effective way to reduce the latency caused memory access. However, excessively aggressive prefetch not only leads to cache pollution so as to cancel out the benefits of prefetch but also increase bus traffic leading to overall performance degradation. In this thesis, a prefetch filtering scheme is proposed which dynamically decides whether to commence prefetching by referring a filtering table to reduce the cache pollution due to unnecessary prefetches In this thesis, First, prefetch hashing table 1bitSC filtering scheme(PHT1bSC) has been shown to analyze the lock problem of the conventional scheme, this scheme such as conventional scheme used to be N:1 mapping, but it has the two state to 1bit value of each entries. A complete block address table filtering scheme(CBAT) has been introduced to be used as a reference for the comparative study. A prefetch block address lookup table scheme(PBALT) has been proposed as the main idea of this paper which exhibits the most exact filtering performance. This scheme has a length of the table the same as the PHT1bSC scheme, the contents of each entry have the fields the same as CBAT scheme recently, never referenced data block address has been 1:1 mapping a entry of the filter table. On commonly used prefetch schemes and general benchmarks and multimedia programs simulates change cache parameters. The PBALT scheme compared with no filtering has shown enhanced the greatest 22%, the cache miss ratio has been decreased by 7.9% by virtue of enhanced filtering accuracy compared with conventional PHT2bSC. The MADT of the proposed PBALT scheme has been decreased by 6.1% compared with conventional schemes to reduce the total execution time.

Automatic Speech Style Recognition Through Sentence Sequencing for Speaker Recognition in Bilateral Dialogue Situations (양자 간 대화 상황에서의 화자인식을 위한 문장 시퀀싱 방법을 통한 자동 말투 인식)

  • Kang, Garam;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.2
    • /
    • pp.17-32
    • /
    • 2021
  • Speaker recognition is generally divided into speaker identification and speaker verification. Speaker recognition plays an important function in the automatic voice system, and the importance of speaker recognition technology is becoming more prominent as the recent development of portable devices, voice technology, and audio content fields continue to expand. Previous speaker recognition studies have been conducted with the goal of automatically determining who the speaker is based on voice files and improving accuracy. Speech is an important sociolinguistic subject, and it contains very useful information that reveals the speaker's attitude, conversation intention, and personality, and this can be an important clue to speaker recognition. The final ending used in the speaker's speech determines the type of sentence or has functions and information such as the speaker's intention, psychological attitude, or relationship to the listener. The use of the terminating ending has various probabilities depending on the characteristics of the speaker, so the type and distribution of the terminating ending of a specific unidentified speaker will be helpful in recognizing the speaker. However, there have been few studies that considered speech in the existing text-based speaker recognition, and if speech information is added to the speech signal-based speaker recognition technique, the accuracy of speaker recognition can be further improved. Hence, the purpose of this paper is to propose a novel method using speech style expressed as a sentence-final ending to improve the accuracy of Korean speaker recognition. To this end, a method called sentence sequencing that generates vector values by using the type and frequency of the sentence-final ending appearing in the utterance of a specific person is proposed. To evaluate the performance of the proposed method, learning and performance evaluation were conducted with a actual drama script. The method proposed in this study can be used as a means to improve the performance of Korean speech recognition service.

Extension Method of Association Rules Using Social Network Analysis (사회연결망 분석을 활용한 연관규칙 확장기법)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.4
    • /
    • pp.111-126
    • /
    • 2017
  • Recommender systems based on association rule mining significantly contribute to seller's sales by reducing consumers' time to search for products that they want. Recommendations based on the frequency of transactions such as orders can effectively screen out the products that are statistically marketable among multiple products. A product with a high possibility of sales, however, can be omitted from the recommendation if it records insufficient number of transactions at the beginning of the sale. Products missing from the associated recommendations may lose the chance of exposure to consumers, which leads to a decline in the number of transactions. In turn, diminished transactions may create a vicious circle of lost opportunity to be recommended. Thus, initial sales are likely to remain stagnant for a certain period of time. Products that are susceptible to fashion or seasonality, such as clothing, may be greatly affected. This study was aimed at expanding association rules to include into the list of recommendations those products whose initial trading frequency of transactions is low despite the possibility of high sales. The particular purpose is to predict the strength of the direct connection of two unconnected items through the properties of the paths located between them. An association between two items revealed in transactions can be interpreted as the interaction between them, which can be expressed as a link in a social network whose nodes are items. The first step calculates the centralities of the nodes in the middle of the paths that indirectly connect the two nodes without direct connection. The next step identifies the number of the paths and the shortest among them. These extracts are used as independent variables in the regression analysis to predict future connection strength between the nodes. The strength of the connection between the two nodes of the model, which is defined by the number of nodes between the two nodes, is measured after a certain period of time. The regression analysis results confirm that the number of paths between the two products, the distance of the shortest path, and the number of neighboring items connected to the products are significantly related to their potential strength. This study used actual order transaction data collected for three months from February to April in 2016 from an online commerce company. To reduce the complexity of analytics as the scale of the network grows, the analysis was performed only on miscellaneous goods. Two consecutively purchased items were chosen from each customer's transactions to obtain a pair of antecedent and consequent, which secures a link needed for constituting a social network. The direction of the link was determined in the order in which the goods were purchased. Except for the last ten days of the data collection period, the social network of associated items was built for the extraction of independent variables. The model predicts the number of links to be connected in the next ten days from the explanatory variables. Of the 5,711 previously unconnected links, 611 were newly connected for the last ten days. Through experiments, the proposed model demonstrated excellent predictions. Of the 571 links that the proposed model predicts, 269 were confirmed to have been connected. This is 4.4 times more than the average of 61, which can be found without any prediction model. This study is expected to be useful regarding industries whose new products launch quickly with short life cycles, since their exposure time is critical. Also, it can be used to detect diseases that are rarely found in the early stages of medical treatment because of the low incidence of outbreaks. Since the complexity of the social networking analysis is sensitive to the number of nodes and links that make up the network, this study was conducted in a particular category of miscellaneous goods. Future research should consider that this condition may limit the opportunity to detect unexpected associations between products belonging to different categories of classification.

How to improve the accuracy of recommendation systems: Combining ratings and review texts sentiment scores (평점과 리뷰 텍스트 감성분석을 결합한 추천시스템 향상 방안 연구)

  • Hyun, Jiyeon;Ryu, Sangyi;Lee, Sang-Yong Tom
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.1
    • /
    • pp.219-239
    • /
    • 2019
  • As the importance of providing customized services to individuals becomes important, researches on personalized recommendation systems are constantly being carried out. Collaborative filtering is one of the most popular systems in academia and industry. However, there exists limitation in a sense that recommendations were mostly based on quantitative information such as users' ratings, which made the accuracy be lowered. To solve these problems, many studies have been actively attempted to improve the performance of the recommendation system by using other information besides the quantitative information. Good examples are the usages of the sentiment analysis on customer review text data. Nevertheless, the existing research has not directly combined the results of the sentiment analysis and quantitative rating scores in the recommendation system. Therefore, this study aims to reflect the sentiments shown in the reviews into the rating scores. In other words, we propose a new algorithm that can directly convert the user 's own review into the empirically quantitative information and reflect it directly to the recommendation system. To do this, we needed to quantify users' reviews, which were originally qualitative information. In this study, sentiment score was calculated through sentiment analysis technique of text mining. The data was targeted for movie review. Based on the data, a domain specific sentiment dictionary is constructed for the movie reviews. Regression analysis was used as a method to construct sentiment dictionary. Each positive / negative dictionary was constructed using Lasso regression, Ridge regression, and ElasticNet methods. Based on this constructed sentiment dictionary, the accuracy was verified through confusion matrix. The accuracy of the Lasso based dictionary was 70%, the accuracy of the Ridge based dictionary was 79%, and that of the ElasticNet (${\alpha}=0.3$) was 83%. Therefore, in this study, the sentiment score of the review is calculated based on the dictionary of the ElasticNet method. It was combined with a rating to create a new rating. In this paper, we show that the collaborative filtering that reflects sentiment scores of user review is superior to the traditional method that only considers the existing rating. In order to show that the proposed algorithm is based on memory-based user collaboration filtering, item-based collaborative filtering and model based matrix factorization SVD, and SVD ++. Based on the above algorithm, the mean absolute error (MAE) and the root mean square error (RMSE) are calculated to evaluate the recommendation system with a score that combines sentiment scores with a system that only considers scores. When the evaluation index was MAE, it was improved by 0.059 for UBCF, 0.0862 for IBCF, 0.1012 for SVD and 0.188 for SVD ++. When the evaluation index is RMSE, UBCF is 0.0431, IBCF is 0.0882, SVD is 0.1103, and SVD ++ is 0.1756. As a result, it can be seen that the prediction performance of the evaluation point reflecting the sentiment score proposed in this paper is superior to that of the conventional evaluation method. In other words, in this paper, it is confirmed that the collaborative filtering that reflects the sentiment score of the user review shows superior accuracy as compared with the conventional type of collaborative filtering that only considers the quantitative score. We then attempted paired t-test validation to ensure that the proposed model was a better approach and concluded that the proposed model is better. In this study, to overcome limitations of previous researches that judge user's sentiment only by quantitative rating score, the review was numerically calculated and a user's opinion was more refined and considered into the recommendation system to improve the accuracy. The findings of this study have managerial implications to recommendation system developers who need to consider both quantitative information and qualitative information it is expect. The way of constructing the combined system in this paper might be directly used by the developers.

End to End Model and Delay Performance for V2X in 5G (5G에서 V2X를 위한 End to End 모델 및 지연 성능 평가)

  • Bae, Kyoung Yul;Lee, Hong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.107-118
    • /
    • 2016
  • The advent of 5G mobile communications, which is expected in 2020, will provide many services such as Internet of Things (IoT) and vehicle-to-infra/vehicle/nomadic (V2X) communication. There are many requirements to realizing these services: reduced latency, high data rate and reliability, and real-time service. In particular, a high level of reliability and delay sensitivity with an increased data rate are very important for M2M, IoT, and Factory 4.0. Around the world, 5G standardization organizations have considered these services and grouped them to finally derive the technical requirements and service scenarios. The first scenario is broadcast services that use a high data rate for multiple cases of sporting events or emergencies. The second scenario is as support for e-Health, car reliability, etc.; the third scenario is related to VR games with delay sensitivity and real-time techniques. Recently, these groups have been forming agreements on the requirements for such scenarios and the target level. Various techniques are being studied to satisfy such requirements and are being discussed in the context of software-defined networking (SDN) as the next-generation network architecture. SDN is being used to standardize ONF and basically refers to a structure that separates signals for the control plane from the packets for the data plane. One of the best examples for low latency and high reliability is an intelligent traffic system (ITS) using V2X. Because a car passes a small cell of the 5G network very rapidly, the messages to be delivered in the event of an emergency have to be transported in a very short time. This is a typical example requiring high delay sensitivity. 5G has to support a high reliability and delay sensitivity requirements for V2X in the field of traffic control. For these reasons, V2X is a major application of critical delay. V2X (vehicle-to-infra/vehicle/nomadic) represents all types of communication methods applicable to road and vehicles. It refers to a connected or networked vehicle. V2X can be divided into three kinds of communications. First is the communication between a vehicle and infrastructure (vehicle-to-infrastructure; V2I). Second is the communication between a vehicle and another vehicle (vehicle-to-vehicle; V2V). Third is the communication between a vehicle and mobile equipment (vehicle-to-nomadic devices; V2N). This will be added in the future in various fields. Because the SDN structure is under consideration as the next-generation network architecture, the SDN architecture is significant. However, the centralized architecture of SDN can be considered as an unfavorable structure for delay-sensitive services because a centralized architecture is needed to communicate with many nodes and provide processing power. Therefore, in the case of emergency V2X communications, delay-related control functions require a tree supporting structure. For such a scenario, the architecture of the network processing the vehicle information is a major variable affecting delay. Because it is difficult to meet the desired level of delay sensitivity with a typical fully centralized SDN structure, research on the optimal size of an SDN for processing information is needed. This study examined the SDN architecture considering the V2X emergency delay requirements of a 5G network in the worst-case scenario and performed a system-level simulation on the speed of the car, radius, and cell tier to derive a range of cells for information transfer in SDN network. In the simulation, because 5G provides a sufficiently high data rate, the information for neighboring vehicle support to the car was assumed to be without errors. Furthermore, the 5G small cell was assumed to have a cell radius of 50-100 m, and the maximum speed of the vehicle was considered to be 30-200 km/h in order to examine the network architecture to minimize the delay.