Search | Korea Science

Domain-Specific Terminology Mapping Methodology Using Supervised Autoencoders (지도학습 오토인코더를 이용한 전문어의 범용어 공간 매핑 방법론)

Byung Ho Yoon;Junwoo Kim;Namgyu Kim
- Information Systems Review / v.25 no.1 / pp.93-110 / 2023
Recently, attempts have been made to convert unstructured text into vectors and to analyze vast amounts of natural language for various purposes. In particular, the demand for analyzing texts in specialized domains is rapidly increasing. Therefore, studies are being conducted to analyze specialized and general-purpose documents simultaneously. To analyze specific terms with general terms, it is necessary to align the embedding space of the specific terms with the embedding space of the general terms. So far, attempts have been made to align the embedding of specific terms into the embedding space of general terms through a transformation matrix or mapping function. However, the linear transformation based on the transformation matrix showed a limitation in that it only works well in a local range. To overcome this limitation, various types of nonlinear vector alignment methods have been recently proposed. We propose a vector alignment model that matches the embedding space of specific terms to the embedding space of general terms through end-to-end learning that simultaneously learns the autoencoder and regression model. As a result of experiments with R&D documents in the "Healthcare" field, we confirmed the proposed methodology showed superior performance in terms of accuracy compared to the traditional model.
https://doi.org/10.14329/isr.2023.25.1.093
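
The abstract above contrasts the proposed model with earlier transformation-matrix alignment. As a rough illustration of that linear baseline only (not the authors' supervised autoencoder), a mapping matrix can be fitted by least squares on anchor terms that appear in both spaces. The data below is synthetic, and the names `X`, `Y`, `W`, and all dimensions are assumptions made for this sketch.

```python
import numpy as np

# Synthetic stand-ins for anchor-term embeddings (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
n_anchor, d_specific, d_general = 200, 8, 6

X = rng.normal(size=(n_anchor, d_specific))        # domain-specific embeddings
W_true = rng.normal(size=(d_specific, d_general))  # hidden linear relation
Y = X @ W_true + 0.01 * rng.normal(size=(n_anchor, d_general))  # general embeddings

# Fit the transformation matrix W that minimizes ||X W - Y||^2.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Map the domain-specific vectors into the general-purpose space.
mapped = X @ W
```

A single matrix like `W` can only express a global linear relation, which is the limitation the abstract cites as motivation for nonlinear alignment.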

Optimal Design and Economic Evaluation of Energy Supply System from On/Off Shore Wind Farms (육/해상 풍력기반 에너지생산 공정 최적 설계 및 경제성 평가)

Kim, Minsoo;Kim, Jiyong
- Korean Chemical Engineering Research / v.53 no.2 / pp.156-163 / 2015
This paper presents a new framework for the design and economic evaluation of a wind energy-based electricity supply system. We propose a network optimization (mixed-integer linear programming) model to design the underlying energy supply system. The model includes practical constraints, such as the land limitations of onshore wind farms and the different costs of offshore wind farms, and minimizes the total annual cost. Based on the model, we also analyze the sensitivity of the total annual cost to changes in key parameters such as the available land for offshore wind farms, the required area of a wind turbine, and the unit price of wind turbines. We illustrate the applicability of the suggested model by applying it to the design of a wind turbine-based electricity supply system in Jeju. As a result, we identified the major cost drivers and the regional cost distribution of the proposed system. We also compared the economic performance of onshore and offshore wind farms in the wind energy-based electricity supply system of Jeju.
https://doi.org/10.9713/kcer.2015.53.2.156

ADMM algorithms in statistics and machine learning (통계적 기계학습에서의 ADMM 알고리즘의 활용)

Choi, Hosik;Choi, Hyunjip;Park, Sangun
- Journal of the Korean Data and Information Science Society / v.28 no.6 / pp.1229-1244 / 2017
In recent years, as demand for data-based analytical methodologies has increased in various fields, optimization methods have been developed to handle them. In particular, various constraints required for problems in statistics and machine learning can be handled by convex optimization. The alternating direction method of multipliers (ADMM) can deal effectively with linear constraints and can be used as a parallel optimization algorithm. ADMM is an approximation algorithm that solves a complex original problem by dividing it into partial problems that are easier to optimize and then combining their solutions. It is useful for optimizing non-smooth or composite objective functions. It is widely used in statistics and machine learning because it allows algorithms to be constructed systematically from duality theory and the proximal operator. In this paper, we examine applications of the ADMM algorithm in various fields related to statistics, focusing on two major points: (1) the splitting strategy for the objective function, and (2) the role of the proximal operator in explaining the Lagrangian method and its dual problem. In this context, we introduce methodologies that use regularization. Simulation results are presented to demonstrate the effectiveness of the lasso.
https://doi.org/10.7465/jkdi.2017.28.6.1229
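
The splitting strategy and proximal operator mentioned in the abstract above can be made concrete with the standard ADMM formulation of the lasso. The sketch below is a generic textbook version, not code from the paper: the objective is split as 0.5 * ||A x - b||^2 + lam * ||z||_1 subject to x = z, and the z-update is the soft-thresholding proximal operator. The synthetic data, `rho`, `lam`, and the iteration count are assumed values.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Proximal operator of kappa * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def lasso_admm(A, b, lam, rho=1.0, n_iter=500):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||z||_1 subject to x = z."""
    n = A.shape[1]
    z = np.zeros(n)
    u = np.zeros(n)  # scaled dual variable
    M = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(n_iter):
        # x-update: ridge-like least squares on the smooth part.
        x = np.linalg.solve(M, Atb + rho * (z - u))
        # z-update: proximal step on the non-smooth l1 part.
        z = soft_threshold(x + u, lam / rho)
        # Dual update: accumulate the constraint violation x - z.
        u = u + x - z
    return z


# Synthetic regression problem with a sparse true coefficient vector.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.normal(size=100)

x_hat = lasso_admm(A, b, lam=1.0)
```

Returning `z` rather than `x` gives an estimate whose small entries are exactly zero, because `z` is the output of the thresholding step.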

Analysis of Causality of the Increase in the Port Congestion due to the COVID-19 Pandemic and BDI(Baltic Dry Index) (COVID-19 팬데믹으로 인한 체선율 증가와 부정기선 운임지수의 인과성 분석)

Lee, Choong-Ho;Park, Keun-Sik
- Journal of Korea Port Economic Association / v.37 no.4 / pp.161-173 / 2021
The shipping industry plummeted and fell into depression during the global economic crisis triggered by the bankruptcy of Lehman Brothers in the US in 2008. In 2020, the shipping market again collapsed amid the unstable global economic situation caused by the COVID-19 pandemic, but it unexpectedly turned upward from the end of 2020, and in 2021 it exceeded the market level of the 2008 boom period. According to the Clarksons report published in May 2021, the cargo volume that decreased in 2020 due to the COVID-19 pandemic returned to the pre-corona level by the end of 2020, and tramper bulk carrier capacity of 103~104% of the Panamax has been held in ports due to congestion. Earnings across the bulker segments have risen to ten-year highs in recent months. In this study, the factors affecting the BDI are the capacity and congestion ratio of Cape and Panamax ships on the supply side and the seaborne tonnage of iron ore and coal on the demand side. Using a VAR model, we performed a Granger causality test, IRF (Impulse Response Function) analysis, and FEVD (Forecast Error Variance Decomposition) to analyze the impact on the BDI of congestion caused by strengthened port quarantine and by loading and discharging delays due to infection of stevedores and others during the COVID-19 pandemic, and to predict the shipping market after the pandemic. In the Granger causality test between the variables and the BDI, using time series data from January 2016 to July 2021, causality was found for the Fleet and Congestion variables. In the Impulse Response Function analysis, the Congestion variable was found to be significant at both the upper and lower limits of the confidence interval. In the Forecast Error Variance Decomposition, the Congestion variable showed explanatory power of up to 25% for the change in the BDI. If congestion in ports decreases after With Corona, downside risk is expected in the shipping market.
The COVID-19 pandemic differs from past economic crises in that it arose not from economic factors but from an ecological factor, so it needs to be analyzed from a different point of view. This study is meaningful in that it analyzes the causality and explanatory power of the congestion factor caused by the pandemic.
https://doi.org/10.38121/kpea.2021.12.37.4.161

A study on solar radiation prediction using medium-range weather forecasts (중기예보를 이용한 태양광 일사량 예측 연구)

Sujin Park;Hyojeoung Kim;Sahm Kim
- The Korean Journal of Applied Statistics / v.36 no.1 / pp.49-62 / 2023
Solar energy, whose share is rapidly increasing, continues to attract development and investment. As the Green New Deal policy for new and renewable energy takes effect and installations of home solar panels increase, the supply of solar energy in Korea is gradually expanding, and research on accurate demand prediction for power generation is actively underway. Solar radiation prediction is important because solar radiation is the factor that most influences power generation demand prediction. This study differs most from previous work in that it attempts to predict solar radiation using medium-term forecast weather data, which previous studies did not use. In this paper, we combined the multiple linear regression, KNN, random forest, and SVR models with the K-means clustering technique to predict hourly solar radiation by calculating the probability density function for each cluster. Before the medium-term forecast data were used, mean absolute error (MAE) and root mean squared error (RMSE) were used as indicators to compare the models' prediction results. The data from March 1, 2017 to February 28, 2022 were converted into daily data to match the medium-term forecast data format. Comparing the predictive performance of the models, the best-performing method predicted daily solar radiation with random forest, classified dates with similar climate factors, and calculated the probability density function of solar radiation by cluster. When the model was then fitted to the medium-term forecast data using this methodology, the prediction error was found to increase by date. This seems to be due to prediction error in the mid-term forecast weather data.
Future studies should add exogenous variables such as precipitation from among the weather factors available in the mid-term forecast data, or apply time series clustering techniques.
https://doi.org/10.5351/KJAS.2023.36.1.049
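
The two error indicators named in the abstract above, MAE and RMSE, have standard definitions that are easy to state in code. The sketch below uses made-up observed and forecast values purely for illustration; none of the numbers come from the study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical daily solar radiation values (illustrative only).
observed = [3.0, 5.0, 2.0, 7.0]
forecast = [2.0, 5.0, 4.0, 8.0]

print(mae(observed, forecast))   # 1.0
print(rmse(observed, forecast))  # sqrt(1.5), about 1.2247
```

RMSE weights large misses more heavily than MAE, so reporting both shows whether a model's error is dominated by a few bad days.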

A Study on Optimization of Perovskite Solar Cell Light Absorption Layer Thin Film Based on Machine Learning (머신러닝 기반 페로브스카이트 태양전지 광흡수층 박막 최적화를 위한 연구)

Ha, Jae-jun;Lee, Jun-hyuk;Oh, Ju-young;Lee, Dong-geun
- The Journal of the Korea Contents Association / v.22 no.7 / pp.55-62 / 2022
The perovskite solar cell is an active area of research among renewable energy fields such as solar energy, wind, hydroelectric power, marine energy, bioenergy, and hydrogen energy, which aim to replace fossil fuels such as oil, coal, and natural gas. These fuels will gradually be depleted as power demand rises with the growing use of the Internet of Things and virtual environments in the 4th industrial revolution. The perovskite solar cell is a solar cell device that uses an organic-inorganic hybrid material with a perovskite structure, and it has the advantage of replacing existing silicon solar cells with high efficiency, low-cost solutions, and low-temperature processes. To optimize the light absorption layer thin film predicted by the existing empirical method, reliability must be verified through device characteristics evaluation. However, because evaluating the characteristics of the light-absorbing layer thin film device is expensive, the number of tests is limited. To solve this problem, a clear and valid model based on machine learning or artificial intelligence is considered to have unlimited potential for development and application as an auxiliary means of optimizing the light absorption layer thin film. In this study, to estimate the light absorption layer thin-film optimization of perovskite solar cells, support vector machine regression models with linear, RBF, polynomial, and sigmoid kernels were compared to verify the difference in accuracy for each kernel function.
https://doi.org/10.5392/JKCA.2022.22.07.055
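
The four kernel functions compared in the abstract above have common textbook forms, sketched here in plain Python. The hyperparameters `gamma`, `coef0`, and `degree` and the sample vectors are assumptions for illustration; the abstract does not state the settings used in the study.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def rbf_kernel(x, y, gamma=0.5):
    # exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    # (gamma * <x, y> + coef0) ^ degree
    return (gamma * dot(x, y) + coef0) ** degree


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    # tanh(gamma * <x, y> + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)


# Two hypothetical feature vectors (illustrative only).
x = [1.0, 2.0]
y = [2.0, 0.0]
```

Each function measures similarity between two samples differently, which is why swapping the kernel changes the regression surface a support vector model can fit.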

Optimization of Multi-reservoir Operation with a Hedging Rule: Case Study of the Han River Basin (Hedging Rule을 이용한 댐 연계 운영 최적화: 한강수계 사례연구)

Ryu, Gwan-Hyeong;Chung, Gun-Hui;Lee, Jung-Ho;Kim, Joong-Hoon
- Journal of Korea Water Resources Association / v.42 no.8 / pp.643-657 / 2009
The major reason to construct large dams is to store surplus water during rainy seasons and use it for water supply in dry seasons. Reservoir storage has to meet a pre-defined target to satisfy water demands and cope with a dry season when the availability of water resources is limited temporally as well as spatially. In this study, a Hedging rule that reduces total reservoir outflow as drought starts is applied to alleviate severe water shortages. Five stages for reducing outflow based on the current reservoir storage are proposed as the Hedging rule. The objective function minimizes the total discrepancies between the target and actual reservoir storage, between water supply and demand, and between the required minimum river discharge and actual river flow. Mixed Integer Linear Programming (MILP) is used to develop a multi-reservoir operation system with the Hedging rule. The developed system is applied to the Han River basin, which includes four multi-purpose dams and one water supply reservoir. One of the four dams is primarily for power generation. Ten-day runoff from subbasins and water demand in 2003 are taken from the "Long Term Comprehensive Plan for Water Resources in Korea", and the water supply plan from the reservoirs to water users is taken from the "Practical Handbook of Dam Operation in Korea". The model was optimized with GAMS/CPLEX, an LP/MIP solver that uses a branch-and-cut algorithm. As a result, 99.99% of municipal demand, 99.91% of agricultural demand, and 100.00% of minimum river discharge were satisfied and, at the same time, dam storage efficiency increased by 10.04% compared with the actual operation data from 2003.
https://doi.org/10.3741/JKWRA.2009.42.8.643
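
The idea of a staged hedging rule, in which release is cut back as storage falls, can be sketched as a simple lookup. This is only an illustration of the concept: the stage thresholds and release fractions in `HEDGING_STAGES` are hypothetical, not values from the paper, and the paper embeds its rule inside an MILP rather than a standalone function.

```python
# Hypothetical stage table: (minimum storage ratio, fraction of demand released).
# Thresholds and fractions are illustrative, not values from the paper.
HEDGING_STAGES = [
    (0.8, 1.0),
    (0.6, 0.9),
    (0.4, 0.8),
    (0.2, 0.7),
    (0.0, 0.6),
]


def hedged_release(storage, target_storage, demand):
    """Return the release allowed for the stage matching the current storage."""
    ratio = storage / target_storage
    for min_ratio, fraction in HEDGING_STAGES:
        if ratio >= min_ratio:
            return demand * fraction
    # Negative ratios should not occur; fall back to the most restrictive stage.
    return demand * HEDGING_STAGES[-1][1]


print(hedged_release(storage=50.0, target_storage=100.0, demand=50.0))
```

Cutting release early, while storage is still moderate, trades a small shortage now for a lower risk of a severe shortage later in the dry season.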

Multi-Variate Tabular Data Processing and Visualization Scheme for Machine Learning based Analysis: A Case Study using Titanic Dataset (기계 학습 기반 분석을 위한 다변량 정형 데이터 처리 및 시각화 방법: Titanic 데이터셋 적용 사례 연구)

Juhyoung Sung;Kiwon Kwon;Kyoungwon Park;Byoungchul Song
- Journal of Internet Computing and Services / v.25 no.4 / pp.121-130 / 2024
As internet and communication technology (ICT) improves exponentially, the types and amount of available data also increase. Although data analysis, including statistics, is important for making use of this large amount of data, conventional approaches are inevitably limited in processing varied and complex data. Meanwhile, with gains in computational performance and growing demand for autonomous systems, there are many attempts to apply machine learning (ML) to problems in various fields. In particular, processing the data used as model input and designing the model to solve the objective function are critical to model performance. Many studies have presented data processing methods suited to the type and properties of the data, and ML performance varies greatly depending on the method. Nevertheless, it is difficult to decide which data processing method to use for data analysis because the types and characteristics of data have become more diverse. Specifically, multi-variate data processing is essential for solving non-linear problems with ML. In this paper, we present a multi-variate tabular data processing scheme for ML-aided data analysis using the Titanic dataset from Kaggle, which includes various kinds of data. We present methods such as input variable filtering based on statistical analysis and normalization according to the data properties. In addition, we analyze the data structure using visualization. Lastly, we design an ML model and train it by applying the proposed multi-variate data processing. We then analyze the trained model's performance in predicting passenger survival. We expect that the proposed multi-variate data processing and visualization can be extended to various environments for ML-based analysis.
https://doi.org/10.7472/jksii.2024.25.4.121
