Search | Korea Science

Research on Optimization Strategies for Random Forest Algorithms in Federated Learning Environments (연합 학습 환경에서의 랜덤 포레스트 알고리즘 최적화 전략 연구)

InSeo Song;KangYoon Lee
- The Journal of Bigdata / v.9 no.1 / pp.101-113 / 2024
Federated learning has garnered attention as an efficient method for training machine learning models in a distributed environment while maintaining data privacy and security. This study proposes a novel FedRFBagging algorithm to optimize the performance of random forest models in such federated learning environments. By dynamically adjusting the trees of local random forest models based on client-specific data characteristics, the proposed approach reduces communication costs and achieves high prediction accuracy even in environments with numerous clients. This method adapts to various data conditions, significantly enhancing model stability and training speed. While random forest models consist of multiple decision trees, transmitting all trees to the server in a federated learning environment results in exponentially increasing communication overhead, making their use impractical. Additionally, differences in data distribution among clients can lead to quality imbalances in the trees. To address this, the FedRFBagging algorithm selects only the highest-performing trees from each client for transmission to the server, which then reselects trees based on impurity values to construct the optimal global model. This reduces communication overhead and maintains high prediction performance across diverse data distributions. Although the global model reflects data from various clients, the data characteristics of each client may differ. To compensate for this, clients further train additional trees on the global model to perform local optimizations tailored to their data. This improves the overall model's prediction accuracy and adapts to changing data distributions. Our study demonstrates that the FedRFBagging algorithm effectively addresses the communication cost and performance issues associated with random forest models in federated learning environments, suggesting its applicability in such settings.
https://doi.org/10.36498/kbigdt.2024.9.1.101
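
The two-stage tree selection described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: `TreeSummary`, `select_client_trees`, and `build_global_forest` are hypothetical names, each tree is reduced to two made-up quality numbers, and "highest-performing" is assumed to mean highest validation accuracy on the client. The later step in which clients train additional trees on top of the global model is not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeSummary:
    """Hypothetical stand-in for one trained decision tree and its quality metrics."""
    client_id: str
    tree_id: int
    val_accuracy: float  # assumed metric: accuracy on the client's held-out data
    impurity: float      # assumed metric: an impurity value reported for the tree


def select_client_trees(trees, k):
    """Client side: keep only the k trees with the highest validation accuracy."""
    return sorted(trees, key=lambda t: t.val_accuracy, reverse=True)[:k]


def build_global_forest(uploads, n_global):
    """Server side: pool all uploaded trees and keep the n_global with the lowest impurity."""
    pooled = [tree for client_trees in uploads for tree in client_trees]
    return sorted(pooled, key=lambda t: t.impurity)[:n_global]


# Illustrative values only; they do not come from the paper.
client_a = [TreeSummary("A", 0, 0.91, 0.12),
            TreeSummary("A", 1, 0.78, 0.30),
            TreeSummary("A", 2, 0.88, 0.18)]
client_b = [TreeSummary("B", 0, 0.69, 0.35),
            TreeSummary("B", 1, 0.85, 0.10),
            TreeSummary("B", 2, 0.80, 0.22)]

uploads = [select_client_trees(client_a, 2), select_client_trees(client_b, 2)]
global_forest = build_global_forest(uploads, 3)
```

Because each client uploads only `k` trees rather than its whole forest, the number of trees sent to the server grows with `k` times the number of clients, which is the communication saving the abstract refers to.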

Performance Evaluation of Vision Transformer-based Pneumonia Detection Model using Chest X-ray Images (흉부 X-선 영상을 이용한 Vision transformer 기반 폐렴 진단 모델의 성능 평가)

Junyong Chang;Youngeun Choi;Seungwan Lee
- Journal of the Korean Society of Radiology / v.18 no.5 / pp.541-549 / 2024
Various artificial neural network architectures, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have been studied extensively and serve as the backbone of numerous models. Among these, the transformer architecture has demonstrated its potential for natural language processing and has become a subject of in-depth research. By modifying its internal structure, the transformer can be adapted for image processing, which has led to the development of Vision Transformer (ViT) models. ViTs have shown high accuracy and performance on large datasets. This study aims to develop a ViT-based model for detecting pneumonia from chest X-ray images and to evaluate its performance quantitatively. Several architectures of the ViT-based model were constructed by varying the number of encoder blocks, and different patch sizes were applied for network training. The performance of the ViT-based model was also compared with that of CNN-based models such as VGGNet, GoogLeNet, and ResNet. The results showed that the training efficiency and accuracy of the ViT-based model depended on the number of encoder blocks and the patch size, and the F1 scores of the ViT-based model ranged from 0.875 to 0.919. The training efficiency of the ViT-based model with a large patch size was superior to that of the CNN-based models, and its pneumonia detection accuracy was higher than that of VGGNet. In conclusion, the ViT-based model can potentially be used for pneumonia detection from chest X-ray images, and this study is expected to improve the clinical availability of the ViT-based model.
https://doi.org/10.7742/jksr.2024.18.5.541
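
One plausible reason patch size affects training efficiency is that it sets the number of tokens the encoder must process. The sketch below counts tokens for a square image; the function name `vit_sequence_length` and the 224-pixel input size are assumptions chosen for illustration and are not taken from the paper.

```python
def vit_sequence_length(image_size: int, patch_size: int, class_token: bool = True) -> int:
    """Number of tokens a ViT encoder sees for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    # One token per patch, plus an optional classification token.
    return patches_per_side ** 2 + (1 if class_token else 0)


# Assumed 224 x 224 input; larger patches give far fewer tokens.
tokens_small_patch = vit_sequence_length(224, 16)
tokens_large_patch = vit_sequence_length(224, 32)
```

Since self-attention cost grows roughly with the square of the token count, doubling the patch size cuts the attention work per layer substantially, at the price of coarser spatial detail.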

Prediction of Spring Flowering Timing in Forested Area in 2023 (산림지역에서의 2023년 봄철 꽃나무 개화시기 예측)

Jihee Seo;Sukyung Kim;Hyun Seok Kim;Junghwa Chun;Myoungsoo Won;Keunchang Jang
- Korean Journal of Agricultural and Forest Meteorology / v.25 no.4 / pp.427-435 / 2023
Changes in flowering time due to weather fluctuations impact plant growth and ecosystem dynamics. Accurate prediction of flowering timing is crucial for effective forest ecosystem management. This study uses a process-based model to predict flowering timing in 2023 for five major tree species in Korean forests. Models are developed based on nine years (2009-2017) of flowering data for Abeliophyllum distichum, Robinia pseudoacacia, Rhododendron schlippenbachii, Rhododendron yedoense f. poukhanense, and Sorbus commixta, distributed across 28 regions in the country, including mountains. Weather data from the Automatic Mountain Meteorology Observation System (AMOS) and the Korea Meteorological Administration (KMA) are utilized as inputs for the models. The Single Triangle Degree Days (STDD) and Growing Degree Days (GDD) models, known for their superior performance, are employed to predict flowering dates. Daily temperature readings at a 1 km spatial resolution are obtained by merging AMOS and KMA data. To improve prediction accuracy nationwide, random forest machine learning is used to generate region-specific correction coefficients. Applying these coefficients results in minimal prediction errors, particularly for Abeliophyllum distichum, Robinia pseudoacacia, and Rhododendron schlippenbachii, with root mean square errors (RMSEs) of 1.2, 0.6, and 1.2 days, respectively. Model performance is evaluated using ten random sampling tests per species, selecting the model with the highest R². The models with applied correction coefficients achieve R² values ranging from 0.07 to 0.7, except for Sorbus commixta, and exhibit a final explanatory power of 0.75-0.9. This study provides valuable insights into seasonal changes in plant phenology, aiding in identifying honey harvesting seasons affected by abnormal weather conditions, such as those of Robinia pseudoacacia. Detailed information on flowering timing for various plant species and regions enhances understanding of the climate-plant phenology relationship.
https://doi.org/10.5532/KJAFM.2023.25.4.427
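
For readers unfamiliar with degree-day models, the sketch below shows the textbook averaging form of Growing Degree Days: heat units above a base temperature are accumulated day by day, and flowering is predicted on the first day the total reaches a species-specific requirement. The base temperature, the threshold, and the temperature series are illustrative assumptions, not values from the study, and the STDD variant and the random forest correction coefficients are not covered.

```python
def daily_gdd(t_max: float, t_min: float, t_base: float) -> float:
    """Degree days contributed by one day: mean temperature above the base, never negative."""
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def predict_flowering_day(daily_temps, t_base, gdd_threshold):
    """Return the 1-based day on which accumulated GDD first reaches the threshold, else None."""
    total = 0.0
    for day, (t_max, t_min) in enumerate(daily_temps, start=1):
        total += daily_gdd(t_max, t_min, t_base)
        if total >= gdd_threshold:
            return day
    return None


# Illustrative (t_max, t_min) pairs in degrees Celsius; not observed data.
temps = [(8.0, 0.0), (12.0, 2.0), (15.0, 5.0), (18.0, 6.0), (20.0, 8.0)]
flowering_day = predict_flowering_day(temps, t_base=5.0, gdd_threshold=20.0)
```

With these numbers the daily contributions are 0, 2, 5, 7, and 9 degree days, so the running total first reaches 20 on day 5.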

Estimation of Ground-level PM10 and PM2.5 Concentrations Using Boosting-based Machine Learning from Satellite and Numerical Weather Prediction Data (부스팅 기반 기계학습기법을 이용한 지상 미세먼지 농도 산출)

Park, Seohui;Kim, Miae;Im, Jungho
- Korean Journal of Remote Sensing / v.37 no.2 / pp.321-335 / 2021
Particulate matter (PM10 and PM2.5, with diameters of less than 10 and 2.5 μm, respectively) can be absorbed by the human body and adversely affect human health. Although most PM monitoring is based on ground-based observations, these are limited to point-based measurement sites, which leads to uncertainty in PM estimation for regions without observation sites. Satellite data can overcome this spatial limitation. In this study, we developed a machine learning-based retrieval algorithm for ground-level PM10 and PM2.5 concentrations using aerosol parameters from the Geostationary Ocean Color Imager (GOCI) satellite and various meteorological parameters from a numerical weather prediction model for January to December 2019. Gradient Boosted Regression Trees (GBRT) and Light Gradient Boosting Machine (LightGBM) were used to estimate PM concentrations. Model performance was examined for two feature sets: all input parameters (Feature set 1) and a subset of input parameters without meteorological and land-cover parameters (Feature set 2). Both models showed higher accuracy (about 10% higher in R²) with Feature set 1 than with Feature set 2. The GBRT model using Feature set 1 was chosen as the final model for further analysis (PM10: R² = 0.82, nRMSE = 34.9%; PM2.5: R² = 0.75, nRMSE = 35.6%). The spatial distribution of the seasonal and annual-averaged PM concentrations was similar to in-situ observations, except for the northeastern part of China, which has bright surface reflectance. The spatial distribution and seasonal changes matched in-situ measurements well.
https://doi.org/10.7780/kjrs.2021.37.2.11
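
The abstract reports R² and nRMSE without defining them. The sketch below uses common definitions as an assumption: R² as one minus the ratio of residual to total sum of squares, and nRMSE as RMSE divided by the mean of the observations, expressed in percent. The paper may define either metric differently (for example, R² as a squared correlation), and the numbers here are made up for illustration.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nrmse_percent(observed, predicted):
    """RMSE normalized by the mean observation, in percent (assumed normalization)."""
    mean_obs = sum(observed) / len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))
    return 100.0 * rmse / mean_obs


# Illustrative concentrations only; not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
r2 = r_squared(observed, predicted)
nrmse = nrmse_percent(observed, predicted)
```

Normalizing by the mean lets errors for PM10 and PM2.5 be compared even though their typical concentrations differ.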

A Methodology for Automatic Multi-Categorization of Single-Categorized Documents (단일 카테고리 문서의 다중 카테고리 자동확장 방법론)

Hong, Jin-Sung;Kim, Namgyu;Lee, Sangwon
- Journal of Intelligence and Information Systems / v.20 no.3 / pp.77-92 / 2014
Recently, numerous documents containing unstructured data and text have been created owing to the rapid increase in the use of social media and the Internet. Each document is usually assigned a specific category for the convenience of users. In the past, categorization was performed manually. Manual categorization, however, cannot guarantee accuracy and requires a large amount of time and cost. Many studies have addressed the automatic creation of categories to overcome the limitations of manual categorization. Unfortunately, most of these methods cannot be applied to complex documents with multiple topics because they assume that one document belongs to only one category. To overcome this limitation, some studies have attempted to assign each document to multiple categories. However, these are also limited in that their learning process requires a multi-categorized training set. They therefore cannot be applied to the multi-categorization of most documents unless multi-categorized training sets are provided. To remove the requirement for a multi-categorized training set imposed by traditional multi-categorization algorithms, we propose a new methodology that extends the category of a single-categorized document to multiple categories by analyzing the relationships among categories, topics, and documents. First, we find the relationship between documents and topics using the results of topic analysis on single-categorized documents. Second, we construct a correspondence table between topics and categories by investigating the relationship between them. Finally, we calculate the matching scores of each document for multiple categories. A document is classified into a certain category if and only if its matching score is higher than the predefined threshold. For example, a document can be classified into three categories whose matching scores exceed the predefined threshold. The main contribution of our study is that the methodology can improve the applicability of traditional multi-category classifiers by generating multi-categorized documents from single-categorized documents. Additionally, we propose a module for verifying the accuracy of the proposed methodology. For performance evaluation, we performed intensive experiments with news articles. News articles are clearly categorized by theme, and they contain less vulgar language and slang than other ordinary text documents. We collected news articles from July 2012 to June 2013. The number of articles varies widely across categories, because readers have different levels of interest in each category and because the frequency of events differs by category. To minimize the distortion of the results caused by the differing numbers of articles per category, we extracted 3,000 articles from each of the eight categories, for a total of 24,000 articles used in our experiments. The eight categories were "IT Science," "Economy," "Society," "Life and Culture," "World," "Sports," "Entertainment," and "Politics." Using the collected news articles, we calculated document/category correspondence scores from the topic/category and document/topic correspondence scores. The document/category correspondence score indicates the degree to which each document corresponds to a certain category. As a result, we could present two additional categories for each of 23,089 documents. Precision, recall, and F-score were 0.605, 0.629, and 0.617, respectively, when only the top 1 predicted category was evaluated, and 0.838, 0.290, and 0.431 when the top 1-3 predicted categories were considered. Interestingly, precision, recall, and F-score varied widely across the eight categories.
https://doi.org/10.13088/jiis.2014.20.3.077
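
The scoring step in the abstract above can be sketched as follows. The abstract does not say how document/topic and topic/category scores are combined, so the weighted sum used here is an assumption, and the function names and toy weights are hypothetical. The `f_score` helper uses the standard harmonic mean of precision and recall, which reproduces the F-scores reported in the abstract from its precision and recall figures.

```python
def document_category_scores(doc_topic, topic_category):
    """Combine document/topic weights with topic/category weights (assumed: weighted sum)."""
    scores = {}
    for topic, weight in doc_topic.items():
        for category, strength in topic_category.get(topic, {}).items():
            scores[category] = scores.get(category, 0.0) + weight * strength
    return scores


def assign_categories(scores, threshold):
    """Keep every category whose matching score exceeds the threshold, best first."""
    kept = [c for c, s in scores.items() if s > threshold]
    return sorted(kept, key=lambda c: scores[c], reverse=True)


def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Toy weights for one document; not taken from the paper.
doc_topic = {"t1": 0.6, "t2": 0.4}
topic_category = {
    "t1": {"Economy": 0.8, "Politics": 0.2},
    "t2": {"Politics": 0.5, "World": 0.5},
}
scores = document_category_scores(doc_topic, topic_category)
categories = assign_categories(scores, threshold=0.25)
```

In this toy case the document scores about 0.48 for "Economy", 0.32 for "Politics", and 0.20 for "World", so a threshold of 0.25 assigns it the first two categories.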

Design of a High-Resolution Integrating Sigma-Delta ADC for Battery Capacity Measurement (배터리 용량측정을 위한 고해상도 Integrating Sigma-Delta ADC 설계)

Park, Chul-Kyu;Jang, Ki-Chang;Woo, Sun-Sik;Choi, Joong-Ho
- Journal of IKEEE / v.16 no.1 / pp.28-33 / 2012
Recently, as mobile devices proliferate and require a variety of multimedia functions, battery life has decreased. Accordingly, methods for extending battery life have been proposed. Implementing these methods requires knowing the exact status of the battery, which calls for a high-resolution analog-to-digital converter (ADC). The existing integrating sigma-delta ADC does not convert the reset-time conversion cycle into a function of resolution, so it cannot express every digital value for its number of bits. To compensate for this drawback, this paper proposes using an up-down counter so that every digital value for the number of bits can be expressed without converting an additional reset-time conversion cycle into a function of resolution. In simulation, the proposed circuit achieves an improved SNDR compared with conventional converters. It was also designed for low power, making it suitable for battery management systems, and was fabricated in a 0.35 μm process.
https://doi.org/10.7471/ikeee.2012.16.1.028

Difference of Functional Outcome Measurements between Total Knee Arthroplasty and Knee Amputation (슬관절 절단과 슬관절성형술간의 가능 수행 측정)

Sung, Paul S.
- Physical Therapy Korea / v.4 no.2 / pp.89-99 / 1997
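It is important to take a new perspective on the measurement of clinical outcomes. Because medical rehabilitation has not put sufficient effort into psychometric qualities (standardization, reliability, validity), it lacks functional assessment scales that generalize across patients and programs. The need for adequate measurement of disability arises in both patient care and clinical research, in order to document changes in functional status, assess the need for treatment, plan treatment, predict outcomes, and determine reimbursement. Drawing on the FIM, a functional assessment instrument used worldwide, this study constructs functional assessment measures on a scale similar to what is expected of physical measurement. Several points matter in measuring functional outcomes in geriatric rehabilitation. First, an approach based on functional outcomes is needed to set treatment goals. Second, the instrument should be useful for predicting functional improvement. Third, functional assessment must be considered together with adequate validity and reliability. Fourth, other functional instruments need to be evaluated alongside it. The difficulty of the FIM items varies somewhat across impairment groups. As it is the most important part, a single motor scale can be applied to all impairment groups except patients with low back pain and burns. It was important to distinguish the motor and cognitive aspects of function, and they were treated separately. Item difficulty varied across impairment groups, reflecting the distinct effects of the different types of impairment. The FIM is another instrument designed to measure functional disability, and others are intended to build an international data system for medical rehabilitation. The purposes of the FIM include identifying the outcomes of medical rehabilitation and measuring the degree of disability. The FIM rates social cognition, communication, locomotion, mobility, bladder management, and self-care on seven levels. The scale ranges from total assistance to complete independence and takes into account the extent of assistance, supervision, and device use. Recent records examining 27,009 patients show that the FIM assesses motor and cognitive function (Hinemann, 1993). The authors of the FIM expect the data to be immediately applicable in program evaluation efforts. The FSI provides information on difficulty in performing particular tasks and can be useful to clinicians working to develop modified strategies for patients to perform those tasks. Both instruments can gather more accurate information about disability associated with hip fracture than traditional scales. The results of all the studies reviewed highlight the significant level of residual disability that remains after fracture. Very few people recovered their pre-fracture walking ability. Most were dependent in basic mobility or in activities related to dressing and personal hygiene. Many could not be active in the community. The need for adequate measurement of disability arises in both patient care and clinical research, in order to track changes in functional status, measure the need for treatment, plan treatment, predict outcomes, and determine reimbursement. The field of physical therapy needs to meet and develop functional outcomes in its other areas.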

Research of Runoff Management in Urban Area using Genetic Algorithm (유전자알고리즘을 이용한 도시화 유역에서의 유출 관리 방안 연구)

Lee, Beum-Hee
- Journal of the Korean Geophysical Society / v.9 no.4 / pp.321-331 / 2006
Recently, the runoff characteristics of urban areas have been changing as impervious area expands with rapid population growth, industrialization, and urbanization. Managing water resources efficiently requires accurate topographic and hydrologic parameters of the watershed. This study therefore developed more precise input data and improved parameter estimation procedures using a GIS (Geographic Information System) and a GA (Genetic Algorithm). XP-SWMM (EXPert-Storm Water Management Model) was used to simulate urban runoff. The model was applied to the An-Yang stream basin, a typical Korean urban stream basin with several tributaries. Rules for parameter estimation were composed and applied based on the quantity parameters identified through sensitivity analysis, and the GA is composed of these rules and facts. Urban flow conditions were simulated using rainfall-runoff data from the study area. The area, slope, and width of each subcatchment and the length and slope of each stream reach were acquired from topographic maps, while the imperviousness rate, land use types, and infiltration capacities of each subcatchment were derived from land use maps and soil maps using GIS. We also present a management scheme for urbanization runoff using XP-SWMM. The parameters are estimated by the GA based on the sensitivity analysis performed on the runoff parameters.
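
As a rough illustration of parameter estimation with a genetic algorithm, the sketch below calibrates a single coefficient of a toy runoff model against observed runoff. Everything here is an assumption made for illustration: `toy_runoff` is a hypothetical stand-in for XP-SWMM (which is not called), and the selection, crossover, and mutation choices are generic rather than the rule-based procedure described in the abstract.

```python
import random


def toy_runoff(coefficient, rainfall):
    """Hypothetical stand-in for a runoff model: runoff is a fixed fraction of rainfall."""
    return [coefficient * r for r in rainfall]


def sum_squared_error(coefficient, rainfall, observed):
    """Misfit between simulated and observed runoff; lower is better."""
    simulated = toy_runoff(coefficient, rainfall)
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def calibrate(rainfall, observed, generations=40, pop_size=20, seed=0):
    """Estimate the coefficient in [0, 1] with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    population = [rng.uniform(0.0, 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the better half survives unchanged (elitism).
        population.sort(key=lambda c: sum_squared_error(c, rainfall, observed))
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0             # arithmetic crossover
            child += rng.gauss(0.0, 0.05)     # Gaussian mutation
            children.append(min(1.0, max(0.0, child)))  # clamp to the allowed range
        population = parents + children
    return min(population, key=lambda c: sum_squared_error(c, rainfall, observed))


# Synthetic data generated with a true coefficient of 0.6.
rainfall = [5.0, 10.0, 20.0, 15.0]
observed = [3.0, 6.0, 12.0, 9.0]
best = calibrate(rainfall, observed)
```

Because the surviving parents are carried over unchanged, the best misfit never gets worse from one generation to the next. A real calibration would replace `toy_runoff` with runs of the hydrologic model and search over several parameters at once.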

Delivery of Therapist's Intervention to the Education of Ayres Sensory Integration® (ASI®) (Ayres Sensory Integration (ASI®) 중재 교육에 따른 치료사의 치료 수행도 변화)

Shin, Ye-Na;Hong, Eunkyoung
- The Journal of Korean Academy of Sensory Integration / v.12 no.1 / pp.13-23 / 2014
Objective: This study provided education on the ASI® intervention to six occupational therapists and examined their delivery of the ASI® core principles through self-assessment, peer assessment, and expert assessment. Methods: The study was conducted from November 2013 to June 2014 with six occupational therapists who had not completed education on the ASI® intervention. The participants received education on the ASI® intervention for 8 weeks, and video recordings taken before and after the education were assessed. The assessments consisted of self-assessment, peer assessment, and expert assessment, and the assessment data were analyzed using the Mann-Whitney test and ICC. Results: Comparing the process factors before and after education by assessment method, the self-assessment showed significant differences in 'self-regulation,' 'collaboration,' 'ensures success,' 'play,' 'alliance,' and 'total item'. The peer assessment showed significant differences in all items except 'safety'. The expert assessment showed significant differences in all items except 'sensory opportunities'. The results of the self-assessment and the expert assessment before and after the education on the ASI® intervention were significant in 'safety'. Conclusion: The results of this study indicate that education on the ASI® intervention is needed for accurate sensory integrative intervention. Occupational therapists need to check their style of intervention.
https://doi.org/10.18064/JKASI.2014.12.1.013

Optimization of PRISM Parameters and Digital Elevation Model Resolution for Estimating the Spatial Distribution of Precipitation in South Korea (남한 강수량 분포 추정을 위한 PRISM 매개변수 및 수치표고모형 최적화)

Park, Jong-Chul;Jung, Il-Won;Chang, Hee-Jun;Kim, Man-Kyu
- Journal of the Korean Association of Geographic Information Studies / v.15 no.3 / pp.36-51 / 2012
The demand for climatological datasets on a regularly spaced grid is increasing in diverse fields such as ecological and hydrological modeling and regional climate impact studies. PRISM (Precipitation-Elevation Regressions on Independent Slopes Model) is a useful method for estimating high-altitude precipitation. However, the optimization of PRISM parameters and DEM (Digital Elevation Model) resolution has not been well discussed for South Korea. This study developed a PRISM implementation and then optimized the model parameters and DEM resolution to produce gridded annual average precipitation data for South Korea at 1 km spatial resolution for the period 2000-2005. The SCE-UA (Shuffled Complex Evolution-University of Arizona) method was employed for the optimization. In addition, a sensitivity analysis investigated the change in model output with respect to variations in the parameters and the DEM spatial resolution. The results show that the maximum radius within which the station search is conducted is 67 km, the minimum radius within which all stations are included is 31 km, and the minimum number of stations required for the cell precipitation-elevation regression is four. The optimal DEM resolution is 1 × 1 km. The study also shows that the PRISM output is very sensitive to variations in DEM spatial resolution. This study contributes to improving the accuracy of the PRISM technique as applied to South Korea.
https://doi.org/10.11108/kagis.2012.15.3.036
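
The core idea behind the parameters listed in the abstract above, a per-cell regression of precipitation on elevation using nearby stations, can be sketched as follows. This is a heavily simplified illustration, not the PRISM algorithm itself: station weighting, the minimum-radius rule, and slope orientation are all omitted, the function names and station data are hypothetical, and only the 67 km search radius and four-station minimum are taken from the abstract.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, mean_y  # all stations at one elevation: fall back to the mean
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def estimate_cell_precipitation(cell, stations, max_radius_km=67.0, min_stations=4):
    """Estimate precipitation at a grid cell from stations within the search radius.

    cell: (x_km, y_km, elevation_m); stations: list of (x_km, y_km, elevation_m, precip_mm).
    Returns None when fewer than min_stations fall inside the radius.
    """
    cx, cy, cell_elevation = cell
    nearby = [s for s in stations if math.hypot(s[0] - cx, s[1] - cy) <= max_radius_km]
    if len(nearby) < min_stations:
        return None
    slope, intercept = fit_line([s[2] for s in nearby], [s[3] for s in nearby])
    return slope * cell_elevation + intercept


# Synthetic stations where precipitation = 1000 + 0.5 * elevation, plus one distant outlier.
stations = [
    (0.0, 0.0, 100.0, 1050.0),
    (10.0, 0.0, 200.0, 1100.0),
    (0.0, 20.0, 300.0, 1150.0),
    (30.0, 30.0, 400.0, 1200.0),
    (200.0, 200.0, 900.0, 5000.0),  # outside the 67 km radius, so ignored
]
estimate = estimate_cell_precipitation((5.0, 5.0, 250.0), stations)
```

The distant station is excluded by the radius check, so the fit recovers the synthetic relationship and the estimate at 250 m elevation is 1125 mm.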
