• Title/Summary/Keyword: Input Variable Importance

Search Result 47, Processing Time 0.022 seconds

Estimation of Fractional Urban Tree Canopy Cover through Machine Learning Using Optical Satellite Images (기계학습을 이용한 광학 위성 영상 기반의 도시 내 수목 피복률 추정)

  • Sejeong Bae ;Bokyung Son ;Taejun Sung ;Yeonsu Lee ;Jungho Im ;Yoojin Kang
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_3
    • /
    • pp.1009-1029
    • /
    • 2023
  • Urban trees play a vital role in urban ecosystems,significantly reducing impervious surfaces and impacting carbon cycling within the city. Although previous research has demonstrated the efficacy of employing artificial intelligence in conjunction with airborne light detection and ranging (LiDAR) data to generate urban tree information, the availability and cost constraints associated with LiDAR data pose limitations. Consequently, this study employed freely accessible, high-resolution multispectral satellite imagery (i.e., Sentinel-2 data) to estimate fractional tree canopy cover (FTC) within the urban confines of Suwon, South Korea, employing machine learning techniques. This study leveraged a median composite image derived from a time series of Sentinel-2 images. In order to account for the diverse land cover found in urban areas, the model incorporated three types of input variables: average (mean) and standard deviation (std) values within a 30-meter grid from 10 m resolution of optical indices from Sentinel-2, and fractional coverage for distinct land cover classes within 30 m grids from the existing level 3 land cover map. Four schemes with different combinations of input variables were compared. Notably, when all three factors (i.e., mean, std, and fractional cover) were used to consider the variation of landcover in urban areas(Scheme 4, S4), the machine learning model exhibited improved performance compared to using only the mean of optical indices (Scheme 1). Of the various models proposed, the random forest (RF) model with S4 demonstrated the most remarkable performance, achieving R2 of 0.8196, and mean absolute error (MAE) of 0.0749, and a root mean squared error (RMSE) of 0.1022. The std variable exhibited the highest impact on model outputs within the heterogeneous land covers based on the variable importance analysis. This trained RF model with S4 was then applied to the entire Suwon region, consistently delivering robust results with an R2 of 0.8702, MAE of 0.0873, and RMSE of 0.1335. The FTC estimation method developed in this study is expected to offer advantages for application in various regions, providing fundamental data for a better understanding of carbon dynamics in urban ecosystems in the future.

Export Prediction Using Separated Learning Method and Recommendation of Potential Export Countries (분리학습 모델을 이용한 수출액 예측 및 수출 유망국가 추천)

  • Jang, Yeongjin;Won, Jongkwan;Lee, Chaerok
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.69-88
    • /
    • 2022
  • One of the characteristics of South Korea's economic structure is that it is highly dependent on exports. Thus, many businesses are closely related to the global economy and diplomatic situation. In addition, small and medium-sized enterprises(SMEs) specialized in exporting are struggling due to the spread of COVID-19. Therefore, this study aimed to develop a model to forecast exports for next year to support SMEs' export strategy and decision making. Also, this study proposed a strategy to recommend promising export countries of each item based on the forecasting model. We analyzed important variables used in previous studies such as country-specific, item-specific, and macro-economic variables and collected those variables to train our prediction model. Next, through the exploratory data analysis(EDA) it was found that exports, which is a target variable, have a highly skewed distribution. To deal with this issue and improve predictive performance, we suggest a separated learning method. In a separated learning method, the whole dataset is divided into homogeneous subgroups and a prediction algorithm is applied to each group. Thus, characteristics of each group can be more precisely trained using different input variables and algorithms. In this study, we divided the dataset into five subgroups based on the exports to decrease skewness of the target variable. After the separation, we found that each group has different characteristics in countries and goods. For example, In Group 1, most of the exporting countries are developing countries and the majority of exporting goods are low value products such as glass and prints. On the other hand, major exporting countries of South Korea such as China, USA, and Vietnam are included in Group 4 and Group 5 and most exporting goods in these groups are high value products. Then we used LightGBM(LGBM) and Exponential Moving Average(EMA) for prediction. Considering the characteristics of each group, models were built using LGBM for Group 1 to 4 and EMA for Group 5. To evaluate the performance of the model, we compare different model structures and algorithms. As a result, it was found that the separated learning model had best performance compared to other models. After the model was built, we also provided variable importance of each group using SHAP-value to add explainability of our model. Based on the prediction model, we proposed a second-stage recommendation strategy for potential export countries. In the first phase, BCG matrix was used to find Star and Question Mark markets that are expected to grow rapidly. In the second phase, we calculated scores for each country and recommendations were made according to ranking. Using this recommendation framework, potential export countries were selected and information about those countries for each item was presented. There are several implications of this study. First of all, most of the preceding studies have conducted research on the specific situation or country. However, this study use various variables and develops a machine learning model for a wide range of countries and items. Second, as to our knowledge, it is the first attempt to adopt a separated learning method for exports prediction. By separating the dataset into 5 homogeneous subgroups, we could enhance the predictive performance of the model. Also, more detailed explanation of models by group is provided using SHAP values. Lastly, this study has several practical implications. There are some platforms which serve trade information including KOTRA, but most of them are based on past data. Therefore, it is not easy for companies to predict future trends. By utilizing the model and recommendation strategy in this research, trade related services in each platform can be improved so that companies including SMEs can fully utilize the service when making strategies and decisions for exports.

A Spatial Analysis about Arrival Delay and Dispatch Distribution of the 119 Rescue-Aid Service utilizing GIS - Gyeongsangbuk-Do Case Study - (GIS를 활용한 119 구조구급서비스의 도착지체 및 출동배치에 대한 공간분석 - 경상북도 사례 연구 -)

  • Oh, Chang-Seok;Lee, Seungwon;Lee, Inmook;Kho, Seung-Young
    • KSCE Journal of Civil and Environmental Engineering Research
    • /
    • v.32 no.1D
    • /
    • pp.13-22
    • /
    • 2012
  • The 119 emergency rescue-aid service operated by Korean government is a very valuable in a society and its importance is growing in Korea as an aging society. Especially, the emergency vehicle's arrival time to accidents place is an important variable which affects initial emergency measure for patients and it depends on the road network attributes, such as emergency service station's location, accessibility to accidents place and so on. This study aims to analysis the emergency vehicles' arrival delay and the dispatch station in the viewpoint of efficiency utilizing the real rescue-aid activity data. We analyzed the dispatch distribution of the emergency rescue-aid service at first. And we analyzed high accident rate locations not involved in the fixed radius of rescue-aid service stations and display GIS map showing regions have been delayed. The input data of the road network speed is based on the KTDB (Korea Transportation Database) and historical rescue-aid data is from Gyeongsangbuk-do's fire service headquarters.

The Development of 63nm Diode Laser System for Photodynamic Therapy of Cancer (광역학적 암치료를 위한 635nm 다이오드 레이저 시스템 개발)

  • 임현수
    • Journal of Biomedical Engineering Research
    • /
    • v.24 no.4
    • /
    • pp.319-328
    • /
    • 2003
  • The purpose of this paper is to develop a medical laser system using the semiconductor diode laser in order to photodynamic cancel therapy as a light source. The ideal light source for photodynamic therapy would be a homogeneous nondiverging light with variable spot size and specific wavelength with stability. After due consideration in this point, in this paper, we used a diode laser resonator of 635nm wavelength. The development laser system have a statistical laser out beam with accuracy control using the constant current control of method and clinic-friendly with compact. In order to protect the diode resonator from the over-current, the rush-current and electrical fault, we specially designed. The most importance therapeutic factor are the radiation mode for cancer therapy. So we developed the radiation mode of CW(Continuous Wave), long pulse, short pulse, and burst pulse and can adjust the exposure time from several milli-second to several minute. The experimental result shows that laser beam power was increased linear from 10mW to 300mW according to the increasing input current and the increasing exposure time. The developed new compact diode laser system have a stability of output power and specific wavelength with easy control and transportable for many applications of PDT.

WQI Class Prediction of Sihwa Lake Using Machine Learning-Based Models (기계학습 기반 모델을 활용한 시화호의 수질평가지수 등급 예측)

  • KIM, SOO BIN;LEE, JAE SEONG;KIM, KYUNG TAE
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY
    • /
    • v.27 no.2
    • /
    • pp.71-86
    • /
    • 2022
  • The water quality index (WQI) has been widely used to evaluate marine water quality. The WQI in Korea is categorized into five classes by marine environmental standards. But, the WQI calculation on huge datasets is a very complex and time-consuming process. In this regard, the current study proposed machine learning (ML) based models to predict WQI class by using water quality datasets. Sihwa Lake, one of specially-managed coastal zone, was selected as a modeling site. In this study, adaptive boosting (AdaBoost) and tree-based pipeline optimization (TPOT) algorithms were used to train models and each model performance was evaluated by metrics (accuracy, precision, F1, and Log loss) on classification. Before training, the feature importance and sensitivity analysis were conducted to find out the best input combination for each algorithm. The results proved that the bottom dissolved oxygen (DOBot) was the most important variable affecting model performance. Conversely, surface dissolved inorganic nitrogen (DINSur) and dissolved inorganic phosphorus (DIPSur) had weaker effects on the prediction of WQI class. In addition, the performance varied over features including stations, seasons, and WQI classes by comparing spatio-temporal and class sensitivities of each best model. In conclusion, the modeling results showed that the TPOT algorithm has better performance rather than the AdaBoost algorithm without considering feature selection. Moreover, the WQI class for unknown water quality datasets could be surely predicted using the TPOT model trained with satisfactory training datasets.

Temporal distritution analysis of design rainfall by significance test of regression coefficients (회귀계수의 유의성 검정방법에 따른 설계강우량 시간분포 분석)

  • Park, Jin Heea;Lee, Jae Joon
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.4
    • /
    • pp.257-266
    • /
    • 2022
  • Inundation damage is increasing every year due to localized heavy rain and an increase of rainfall exceeding the design frequency. Accordingly, the importance of hydraulic structures for flood control and defense is also increasing. The hydraulic structures are designed according to its purpose and performance, and the amount of flood is an important calculation factor. However, in Korea, design rainfall is used as input data for hydrological analysis for the design of hydraulic structures due to the lack of sufficient data and the lack of reliability of observation data. Accurate probability rainfall and its temporal distribution are important factors to estimate the design rainfall. In practice, the regression equation of temporal distribution for the design rainfall is calculated using the cumulative rainfall percentage of Huff's quartile method. In addition, the 6th order polynomial regression equation which shows high overall accuracy, is uniformly used. In this study, the optimized regression equation of temporal distribution is derived using the variable selection method according to the principle of parsimony in statistical modeling. The derived regression equation of temporal distribution is verified through the significance test. As a result of this study, it is most appropriate to derive the regression equation of temporal distribution using the stepwise selection method, which has the advantages of both forward selection and backward elimination.

Development of the forecasting model for import volume by item of major countries based on economic, industrial structural and cultural factors: Focusing on the cultural factors of Korea (경제적, 산업구조적, 문화적 요인을 기반으로 한 주요 국가의 한국 품목별 수입액 예측 모형 개발: 한국의, 한국에 대한 문화적 요인을 중심으로)

  • Jun, Seung-pyo;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.4
    • /
    • pp.23-48
    • /
    • 2021
  • The Korean economy has achieved continuous economic growth for the past several decades thanks to the government's export strategy policy. This increase in exports is playing a leading role in driving Korea's economic growth by improving economic efficiency, creating jobs, and promoting technology development. Traditionally, the main factors affecting Korea's exports can be found from two perspectives: economic factors and industrial structural factors. First, economic factors are related to exchange rates and global economic fluctuations. The impact of the exchange rate on Korea's exports depends on the exchange rate level and exchange rate volatility. Global economic fluctuations affect global import demand, which is an absolute factor influencing Korea's exports. Second, industrial structural factors are unique characteristics that occur depending on industries or products, such as slow international division of labor, increased domestic substitution of certain imported goods by China, and changes in overseas production patterns of major export industries. Looking at the most recent studies related to global exchanges, several literatures show the importance of cultural aspects as well as economic and industrial structural factors. Therefore, this study attempted to develop a forecasting model by considering cultural factors along with economic and industrial structural factors in calculating the import volume of each country from Korea. In particular, this study approaches the influence of cultural factors on imports of Korean products from the perspective of PUSH-PULL framework. The PUSH dimension is a perspective that Korea develops and actively promotes its own brand and can be defined as the degree of interest in each country for Korean brands represented by K-POP, K-FOOD, and K-CULTURE. In addition, the PULL dimension is a perspective centered on the cultural and psychological characteristics of the people of each country. This can be defined as how much they are inclined to accept Korean Flow as each country's cultural code represented by the country's governance system, masculinity, risk avoidance, and short-term/long-term orientation. The unique feature of this study is that the proposed final prediction model can be selected based on Design Principles. The design principles we presented are as follows. 1) A model was developed to reflect interest in Korea and cultural characteristics through newly added data sources. 2) It was designed in a practical and convenient way so that the forecast value can be immediately recalled by inputting changes in economic factors, item code and country code. 3) In order to derive theoretically meaningful results, an algorithm was selected that can interpret the relationship between the input and the target variable. This study can suggest meaningful implications from the technical, economic and policy aspects, and is expected to make a meaningful contribution to the export support strategies of small and medium-sized enterprises by using the import forecasting model.