Particulate matter(PM) among air pollutants with complex and widespread causes is classified according to particle size. Among them, PM2.5 is very small in size and can cause diseases in the human respiratory tract or cardiovascular system if inhaled by humans. In order to prepare for these risks, state-centered management and preventable monitoring and forecasting are important. This study tried to predict PM2.5 in Seoul, where high concentrations of fine dust occur frequently, using two ensemble models, random forest (RF) and extreme gradient boosting (XGB) using 15 local data assimilation and prediction system (LDAPS) weather-related factors, aerosol optical depth (AOD) and 4 chemical factors as independent variables. Performance evaluation and factor importance evaluation of the two models used for prediction were performed, and seasonal model analysis was also performed. As a result of prediction accuracy, RF showed high prediction accuracy of R2 = 0.85 and XGB R2 = 0.91, and it was confirmed that XGB was a more suitable model for PM2.5 prediction than RF. As a result of the seasonal model analysis, it can be said that the prediction performance was good compared to the observed values with high concentrations in spring. In this study, PM2.5 of Seoul was predicted using various factors, and an ensemble-based PM2.5 prediction model showing good performance was constructed.
Cho, Hyeon Jin;Na, Jeong Eun;Lee, Gun Ju;Lee, Hak Young
Korean Journal of Ecology and Environment
/
v.54
no.4
/
pp.291-302
/
2021
The phytoplankton community in the estuarine system is affected by changes of physicochemical factors easily. The present study analyzed phytoplankton community distribution and similarity, in addition to exploring factors influencing variations in phytoplankton community structure in three lakes located in the Yeongsan River estuary from March 2014 to November 2017. We carried out non-multidimensional scaling (NMDS) and random forest analysis (RF) for comparing the pattern of phytoplankton distribution and the relationship between phytoplankton distribution and environmental variables. Similarity Percentage (SIMPER) and Analysis of Similarity (ANOSIM) were performed to figure out the similarity of phytoplankton community at each site of three lakes. From NMDS, Phytoplankton community distribution differed between Yeongsan and Gumho lakes, and the factors influencing the distribution of phytoplankton communities across the three lakes were water temperature, dissolved oxygen, total nitrogen (T-N), nitrate-N (NO3-N), and conductivity. NO3-N was a key factor influencing phytoplankton community structure in the three lakes based on RF. A total of 24 species were identified as indicator species in the three lakes studied, with the highest species numbers observed in Yeongsan Lake (13) and the lowest observed in Yeongam Lake (2). According to SIMPER and ANOSIM results, the phytoplankton community in Yeongsan and Yeongam lakes were similar, and they differed from those in Gumho Lake. In addition, the phytoplankton community structure varied across the study sites in the three lakes, indicating that water channels across the lakes a minor influence phytoplankton community distribution.
Sangwon Yoon;Heegun Lee;Jong-Won Park;Minhwan Jeong;Dain Lee;Hyo Sun Jung;Julan Kim;Hye-Rim Yang;Seung Hwan Lee;Jeong-Ho Lee
Korean Journal of Fisheries and Aquatic Sciences
/
v.56
no.4
/
pp.411-418
/
2023
Genome wide association studies (GWAS) identify genetic loci associated with quantitative traits in genomic selection. Although several studies have compared performance of various algorithms, no study compares them in olive flounder Paralichthys olivaceus. This study compared the GWAS results of four mixed linear model (MLM) algorithms and one Fixed and random model Circulating Probability Unification (FarmCPU) algorithm in olive flounder. Considering gender and genetic association matrices as fixed and random effects, the MLM had stable performance without inflation for λGC (genomic inflation factor) of -log10P. The FarmCPU algorithm had some appropriate λGC of -log10P, and an upward tail was identified in quantile-quantile plots. Therefore, the models were suitable for detecting genetic variants associated with olive flounder growth traits. Moreover, significant genotypes appeared several times at chromosome 22, around which quantitative trait loci are expected to exist. Finally, in both models, some of the most genetic variants were found in genes related to growth traits, confirming their reliability. These results will be helpful when applied to the genomic selection of olive flounder growth traits in the future.
Journal of Korea Society of Industrial Information Systems
/
v.28
no.2
/
pp.1-9
/
2023
Research using AI and big data is also being actively conducted in the health and medical fields such as disease diagnosis and treatment. Most of the existing research data used cohort data from research institutes or some patient data. In this paper, the difference in the prediction rate of survival and the factors affecting survival between breast cancer patients in their 40~50s and other age groups was revealed using health insurance review claim data held by the HIRA. As a result, the accuracy of predicting patients' survival was 0.93 on average in their 40~50s, higher than 0.86 in their 60~80s. In terms of that factor, the number of treatments was high for those in their 40~50s, and age was high for those in their 60~80s. Performance comparison with previous studies, the average precision was 0.90, which was higher than 0.81 of the existing paper. As a result of performance comparison by applied algorithm, the overall average precision of Decision Tree, Random Forest, and Gradient Boosting was 0.90, and the recall was 1.0, and the precision of multi-layer perceptrons was 0.89, and the recall was 1.0. I hope that more research will be conducted using machine learning automation(Auto ML) tools for non-professionals to enhance the use of the value for health insurance review claim data held by the HIRA.
The Journal of Korea Institute of Information, Electronics, and Communication Technology
/
v.16
no.2
/
pp.87-96
/
2023
Recently, research on predicting the treatment results of diseases using deep learning technology is also active in the medical community. However, small patient data and specific deep learning algorithms were selected and utilized, and research was conducted to show meaningful results under specific conditions. In this study, in order to generalize the research results, patients were further expanded and subdivided to derive the results of a study predicting mortality after lung cancer diagnosis for men and women in their 80s, 90s, and 100s. Using AutoML, which provides large-scale medical information and various deep learning algorithms from the Health Insurance Review and Assessment Service, five algorithms such as Decision Tree, Random Forest, Gradient Boosting, XGBoost, and Logistic Registration were created to predict mortality rates for 84 months after lung cancer diagnosis. As a result of the study, men in their 80s and 90s had a higher mortality prediction rate than women, and women in their 100s had a higher mortality prediction rate than men. And the factor that has the greatest influence on the mortality rate was analyzed as the treatment period.
More than 95% of imports and exports in the World are being transported by vessels. In other words, marine transportation accounts for a large portion of share in the world trade. The purpose of this study is to analyze factors of seaborne trade volume according to items affecting on the world economy. This study conducted a linear regression analysis between seaborne trade volume and the world economy (world GDP) to estimate the correlation between them. Panel data analysis and random effects model analysis have been applied to examine the effect of seaborne trade volume. For this study, the seaborne trade volume is categorized into 10 items, and estimated how much global GDP will be affected when the trade volume changes. In addition, the granger causality test was conducted to verify the relationship between seaborne trade volume and the world GDP. As a result, seaborne trade volume and the world GDP were mutually influenced each other. However, seaborne trade volume affects the world economy more significantly. The items affecting world economic growth include petroleum products, crude oil, chemical products, and so on. The estimated value of the coefficients of petroleum products, crude oil and chemical products were 1.014, 1.013 and 1.010, respectively. The estimated value 1.014 of petroleum products means that the growth rate is 1.014 times higher than the current world GDP growth rate when the seaborne trade volume of petroleum products increased by one unit Lastly, this study examines the seaborne trade volume of 10 categories and then verifies whether the growth rate of world GDP will increase when the volume of seaborne trade increased. This study is expected to provide policy-makers with useful information about formulating policies related to international trade.
Exploratory data analysis is the process of observing and understanding data collected from various sources to identify their distributions and correlations through their structures and characterization. This process can be used to identify correlations among conditioning factors and select the most effective factors for analysis. This can help the assessment of landslide susceptibility, because landslides are usually triggered by multiple factors, and the impacts of these factors vary by region. This study compared two stages of exploratory data analysis to examine the impact of the data exploration procedure on the landslide prediction model's performance with respect to factor selection. Deep-learning-based landslide susceptibility analysis used either a combinations of selected factors or all 23 factors. During the data exploration phase, we used a Pearson correlation coefficient heat map and a histogram of random forest feature importance. We then assessed the accuracy of our deep-learning-based analysis of landslide susceptibility using a confusion matrix. Finally, a landslide susceptibility map was generated using the landslide susceptibility index derived from the proposed analysis. The analysis revealed that using all 23 factors resulted in low accuracy (55.90%), but using the 13 factors selected in one step of exploration improved the accuracy to 81.25%. This was further improved to 92.80% using only the nine conditioning factors selected during both steps of the data exploration. Therefore, exploratory data analysis selected the conditioning factors most suitable for landslide susceptibility analysis and thereby improving the performance of the analysis.
Single-channel seismic exploration has proven effective in delineating subsurface geological structures using small-scale survey systems. The seismic data acquired through zero- or near-offset methods directly capture subsurface features along the vertical axis, facilitating the construction of corresponding seismic sections. However, substantial noise in single-channel seismic data hampers precise interpretation because of the low signal-to-noise ratio. This study introduces a novel approach that integrate noise reduction and signal enhancement via matrix rank optimization to address this issue. Unlike conventional rank-reduction methods, which retain selected singular values to mitigate random noise, our method optimizes the entire singular value spectrum, thus effectively tackling both random and erratic noises commonly found in environments with low signal-to-noise ratio. Additionally, to enhance the horizontal continuity of seismic events and mitigate signal loss during noise reduction, we introduced an adaptive weighting factor computed from the eigenimage of the seismic section. To access the robustness of the proposed method, we conducted numerical experiments using single-channel Sparker seismic data from the Chukchi Plateau in the Arctic Ocean. The results demonstrated that the seismic sections had significantly improved signal-to-noise ratios and minimal signal loss. These advancements hold promise for enhancing single-channel and high-resolution seismic surveys and aiding in the identification of marine development and submarine geological hazards in domestic coastal areas.
Dong-Kyu Kim;Hee-Suk Lim;Kyoung Mi Eun;Yuju Seo;Joon Kon Kim;Young Seok Kim;Min-Kyung Kim;Siyeon Jin;Seung Cheol Han;Dae Woo Kim
Journal of Rhinology
/
v.59
no.2
/
pp.173-180
/
2021
Background: Neutrophils present as major inflammatory cells in refractory chronic rhinosinusitis with nasal polyps (CRSwNP), regardless of the endotype. However, their role in the pathophysiology of CRSwNP remains poorly understood. We investigated factors predicting the surgical outcomes of CRSwNP patients with focus on neutrophilic localization. Methods: We employed machine-learning methods such as the decision tree and random forest models to predict the surgical outcomes of CRSwNP. Immunofluorescence analysis was conducted to detect human neutrophil elastase (HNE), Bcl-2, and Ki-67 in NP tissues. We counted the immunofluorescence-positive cells and divided them into three groups based on the infiltrated area, namely, epithelial, subepithelial, and perivascular groups. Results: On machine learning, the decision tree algorithm demonstrated that the number of subepithelial HNE-positive cells, Lund-Mackay (LM) scores, and endotype (eosinophilic or non-eosinophilic) were the most important predictors of surgical outcomes in CRSwNP patients. Additionally, the random forest algorithm showed that, after ranking the mean decrease in the Gini index or the accuracy of each factor, the top three ranking factors associated with surgical outcomes were the LM score, age, and number of subepithelial HNE-positive cells. In terms of cellular proliferation, immunofluorescence analysis revealed that Ki-67/HNE-double positive and Bcl-2/HNE-double positive cells were significantly increased in the subepithelial area in refractory CRSwNP. Conclusion: Our machine-learning approach and immunofluorescence analysis demonstrated that subepithelial neutrophils in NP tissues had a high expression of Ki-67 and could serve as a cellular biomarker for predicting surgical outcomes in CRSwNP patients.
Predicting ground settlement during the improvement of soft ground and the construction of a structure is an crucial factor. Numerous studies have been conducted, and many prediction equations have been proposed to estimate settlement. Settlement can be calculated using the compression index of clay. In this study, data on water content, void ratio, liquid limit, plastic limit, and compression index from the Busan New Port area were collected to construct a dataset. Correlation analysis was conducted among the collected data. Machine learning algorithms, including Random Forest, Neural Network, Linear Regression, Ada Boost, and Gradient Boosting, were applied using the Orange mining program to propose compression index prediction models. The models' results were evaluated by comparing RMSE and MAPE values, which indicate error rates, and R2 values, which signify the models' significance. As a result, water content showed the highest correlation, while the plastic limit showed a somewhat lower correlation than other characteristics. Among the compared models, the AdaBoost model demonstrated the best performance. As a result of comparing each model, the AdaBoost model had the lowest error rate and a large coefficient of determination.
본 웹사이트에 게시된 이메일 주소가 전자우편 수집 프로그램이나
그 밖의 기술적 장치를 이용하여 무단으로 수집되는 것을 거부하며,
이를 위반시 정보통신망법에 의해 형사 처벌됨을 유념하시기 바랍니다.
[게시일 2004년 10월 1일]
이용약관
제 1 장 총칙
제 1 조 (목적)
이 이용약관은 KoreaScience 홈페이지(이하 “당 사이트”)에서 제공하는 인터넷 서비스(이하 '서비스')의 가입조건 및 이용에 관한 제반 사항과 기타 필요한 사항을 구체적으로 규정함을 목적으로 합니다.
제 2 조 (용어의 정의)
① "이용자"라 함은 당 사이트에 접속하여 이 약관에 따라 당 사이트가 제공하는 서비스를 받는 회원 및 비회원을
말합니다.
② "회원"이라 함은 서비스를 이용하기 위하여 당 사이트에 개인정보를 제공하여 아이디(ID)와 비밀번호를 부여
받은 자를 말합니다.
③ "회원 아이디(ID)"라 함은 회원의 식별 및 서비스 이용을 위하여 자신이 선정한 문자 및 숫자의 조합을
말합니다.
④ "비밀번호(패스워드)"라 함은 회원이 자신의 비밀보호를 위하여 선정한 문자 및 숫자의 조합을 말합니다.
제 3 조 (이용약관의 효력 및 변경)
① 이 약관은 당 사이트에 게시하거나 기타의 방법으로 회원에게 공지함으로써 효력이 발생합니다.
② 당 사이트는 이 약관을 개정할 경우에 적용일자 및 개정사유를 명시하여 현행 약관과 함께 당 사이트의
초기화면에 그 적용일자 7일 이전부터 적용일자 전일까지 공지합니다. 다만, 회원에게 불리하게 약관내용을
변경하는 경우에는 최소한 30일 이상의 사전 유예기간을 두고 공지합니다. 이 경우 당 사이트는 개정 전
내용과 개정 후 내용을 명확하게 비교하여 이용자가 알기 쉽도록 표시합니다.
제 4 조(약관 외 준칙)
① 이 약관은 당 사이트가 제공하는 서비스에 관한 이용안내와 함께 적용됩니다.
② 이 약관에 명시되지 아니한 사항은 관계법령의 규정이 적용됩니다.
제 2 장 이용계약의 체결
제 5 조 (이용계약의 성립 등)
① 이용계약은 이용고객이 당 사이트가 정한 약관에 「동의합니다」를 선택하고, 당 사이트가 정한
온라인신청양식을 작성하여 서비스 이용을 신청한 후, 당 사이트가 이를 승낙함으로써 성립합니다.
② 제1항의 승낙은 당 사이트가 제공하는 과학기술정보검색, 맞춤정보, 서지정보 등 다른 서비스의 이용승낙을
포함합니다.
제 6 조 (회원가입)
서비스를 이용하고자 하는 고객은 당 사이트에서 정한 회원가입양식에 개인정보를 기재하여 가입을 하여야 합니다.
제 7 조 (개인정보의 보호 및 사용)
당 사이트는 관계법령이 정하는 바에 따라 회원 등록정보를 포함한 회원의 개인정보를 보호하기 위해 노력합니다. 회원 개인정보의 보호 및 사용에 대해서는 관련법령 및 당 사이트의 개인정보 보호정책이 적용됩니다.
제 8 조 (이용 신청의 승낙과 제한)
① 당 사이트는 제6조의 규정에 의한 이용신청고객에 대하여 서비스 이용을 승낙합니다.
② 당 사이트는 아래사항에 해당하는 경우에 대해서 승낙하지 아니 합니다.
- 이용계약 신청서의 내용을 허위로 기재한 경우
- 기타 규정한 제반사항을 위반하며 신청하는 경우
제 9 조 (회원 ID 부여 및 변경 등)
① 당 사이트는 이용고객에 대하여 약관에 정하는 바에 따라 자신이 선정한 회원 ID를 부여합니다.
② 회원 ID는 원칙적으로 변경이 불가하며 부득이한 사유로 인하여 변경 하고자 하는 경우에는 해당 ID를
해지하고 재가입해야 합니다.
③ 기타 회원 개인정보 관리 및 변경 등에 관한 사항은 서비스별 안내에 정하는 바에 의합니다.
제 3 장 계약 당사자의 의무
제 10 조 (KISTI의 의무)
① 당 사이트는 이용고객이 희망한 서비스 제공 개시일에 특별한 사정이 없는 한 서비스를 이용할 수 있도록
하여야 합니다.
② 당 사이트는 개인정보 보호를 위해 보안시스템을 구축하며 개인정보 보호정책을 공시하고 준수합니다.
③ 당 사이트는 회원으로부터 제기되는 의견이나 불만이 정당하다고 객관적으로 인정될 경우에는 적절한 절차를
거쳐 즉시 처리하여야 합니다. 다만, 즉시 처리가 곤란한 경우는 회원에게 그 사유와 처리일정을 통보하여야
합니다.
제 11 조 (회원의 의무)
① 이용자는 회원가입 신청 또는 회원정보 변경 시 실명으로 모든 사항을 사실에 근거하여 작성하여야 하며,
허위 또는 타인의 정보를 등록할 경우 일체의 권리를 주장할 수 없습니다.
② 당 사이트가 관계법령 및 개인정보 보호정책에 의거하여 그 책임을 지는 경우를 제외하고 회원에게 부여된
ID의 비밀번호 관리소홀, 부정사용에 의하여 발생하는 모든 결과에 대한 책임은 회원에게 있습니다.
③ 회원은 당 사이트 및 제 3자의 지적 재산권을 침해해서는 안 됩니다.
제 4 장 서비스의 이용
제 12 조 (서비스 이용 시간)
① 서비스 이용은 당 사이트의 업무상 또는 기술상 특별한 지장이 없는 한 연중무휴, 1일 24시간 운영을
원칙으로 합니다. 단, 당 사이트는 시스템 정기점검, 증설 및 교체를 위해 당 사이트가 정한 날이나 시간에
서비스를 일시 중단할 수 있으며, 예정되어 있는 작업으로 인한 서비스 일시중단은 당 사이트 홈페이지를
통해 사전에 공지합니다.
② 당 사이트는 서비스를 특정범위로 분할하여 각 범위별로 이용가능시간을 별도로 지정할 수 있습니다. 다만
이 경우 그 내용을 공지합니다.
제 13 조 (홈페이지 저작권)
① NDSL에서 제공하는 모든 저작물의 저작권은 원저작자에게 있으며, KISTI는 복제/배포/전송권을 확보하고
있습니다.
② NDSL에서 제공하는 콘텐츠를 상업적 및 기타 영리목적으로 복제/배포/전송할 경우 사전에 KISTI의 허락을
받아야 합니다.
③ NDSL에서 제공하는 콘텐츠를 보도, 비평, 교육, 연구 등을 위하여 정당한 범위 안에서 공정한 관행에
합치되게 인용할 수 있습니다.
④ NDSL에서 제공하는 콘텐츠를 무단 복제, 전송, 배포 기타 저작권법에 위반되는 방법으로 이용할 경우
저작권법 제136조에 따라 5년 이하의 징역 또는 5천만 원 이하의 벌금에 처해질 수 있습니다.
제 14 조 (유료서비스)
① 당 사이트 및 협력기관이 정한 유료서비스(원문복사 등)는 별도로 정해진 바에 따르며, 변경사항은 시행 전에
당 사이트 홈페이지를 통하여 회원에게 공지합니다.
② 유료서비스를 이용하려는 회원은 정해진 요금체계에 따라 요금을 납부해야 합니다.
제 5 장 계약 해지 및 이용 제한
제 15 조 (계약 해지)
회원이 이용계약을 해지하고자 하는 때에는 [가입해지] 메뉴를 이용해 직접 해지해야 합니다.
제 16 조 (서비스 이용제한)
① 당 사이트는 회원이 서비스 이용내용에 있어서 본 약관 제 11조 내용을 위반하거나, 다음 각 호에 해당하는
경우 서비스 이용을 제한할 수 있습니다.
- 2년 이상 서비스를 이용한 적이 없는 경우
- 기타 정상적인 서비스 운영에 방해가 될 경우
② 상기 이용제한 규정에 따라 서비스를 이용하는 회원에게 서비스 이용에 대하여 별도 공지 없이 서비스 이용의
일시정지, 이용계약 해지 할 수 있습니다.
제 17 조 (전자우편주소 수집 금지)
회원은 전자우편주소 추출기 등을 이용하여 전자우편주소를 수집 또는 제3자에게 제공할 수 없습니다.
제 6 장 손해배상 및 기타사항
제 18 조 (손해배상)
당 사이트는 무료로 제공되는 서비스와 관련하여 회원에게 어떠한 손해가 발생하더라도 당 사이트가 고의 또는 과실로 인한 손해발생을 제외하고는 이에 대하여 책임을 부담하지 아니합니다.
제 19 조 (관할 법원)
서비스 이용으로 발생한 분쟁에 대해 소송이 제기되는 경우 민사 소송법상의 관할 법원에 제기합니다.
[부 칙]
1. (시행일) 이 약관은 2016년 9월 5일부터 적용되며, 종전 약관은 본 약관으로 대체되며, 개정된 약관의 적용일 이전 가입자도 개정된 약관의 적용을 받습니다.