• Title/Summary/Keyword: Analysis and Prediction System


The Study on the Confidence Building for Evaluation Methods of a Fracture System and Its Hydraulic Conductivity (단열체계 및 수리전도도의 해석신뢰도 향상을 위한 평가방법 연구)

  • Cho Sung-Il;Kim Chun-Soo;Bae Dae-Seok;Kim Kyung-Su;Song Moo-Young
    • The Journal of Engineering Geology / v.15 no.2 s.42 / pp.213-227 / 2005
  • This study assesses the problems with the investigation method and suggests complementary solutions by comparing the data predicted from the surface investigation with the data observed in the underground cavern. In the study area, only one (NE-1) of the six fracture zones predicted during the surface investigation was confirmed in the underground caverns. Therefore, the confidence level of the prediction needs to be improved. In this study, quantitative fracture classification criteria are suggested on the basis of the BHTV images of the NE-1 fracture zone. The major orientation of background fractures in the rock mass changed at the depth of the storage cavern, and their length and intensity decreased. These characteristics cause the predicted fracture properties to deviate and generate an investigation bias that depends on the borehole directions and the investigated scales. In the surface investigation stage, hydraulic connectivity needs to be evaluated from the groundwater pressures and hydrochemical properties observed in monitoring borehole(s) equipped with a double completion or multi-packer system while the test borehole is being pumped or injected. The geometric-mean hydraulic conductivities measured in the underground caverns are 2-3 times lower than those from the surface, and the geometric-mean horizontal hydraulic conductivity is six times lower than the vertical one. To improve the confidence level of the hydraulic conductivity, the orientation of the test hole should be considered during the analysis, and the methodology of hydro-testing and interpretation should be based on the characteristics of the rock mass and the purposes of the investigation.
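The geometric-mean comparison in this abstract can be illustrated with a minimal Python sketch. The conductivity values and the names `geometric_mean`, `surface_k`, and `cavern_k` are hypothetical and are not data from the study; the sketch only shows that hydraulic conductivity, which spans orders of magnitude, is averaged on a log scale before the two data sets are compared.

```python
import math

def geometric_mean(values):
    """Geometric mean of strictly positive values, computed through logarithms."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical hydraulic conductivities in m/s (illustrative only)
surface_k = [2e-7, 8e-8, 5e-7, 1e-7]
cavern_k = [1e-7, 3e-8, 2e-7, 4e-8]

ratio = geometric_mean(surface_k) / geometric_mean(cavern_k)
print(f"surface / cavern geometric-mean ratio: {ratio:.2f}")
```

The same function could be applied separately to horizontal and vertical test holes to compare conductivity by hole orientation.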

A Study on the Effect of Network Centralities on Recommendation Performance (네트워크 중심성 척도가 추천 성능에 미치는 영향에 대한 연구)

  • Lee, Dongwon
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.23-46 / 2021
  • Collaborative filtering, which is often used for personalized recommendation, is recognized as a very useful technique for finding similar customers and recommending products to them based on their purchase history. However, traditional collaborative filtering has difficulty calculating similarity for new customers or products because it calculates similarities from direct connections and common features among customers. For this reason, hybrid techniques were designed that also use content-based filtering. On the other hand, efforts have been made to solve these problems by applying the structural characteristics of social networks. This approach calculates the similarity between two customers indirectly, through the similar customers placed between them: a customer network is built from purchasing data, and the similarity between two customers is calculated from the features of the part of the network that indirectly connects them. Such similarity can be used as a measure to predict whether the target customer will accept a recommendation. Network centrality metrics can be used to calculate these similarities. Different centrality metrics are important in that they may affect recommendation performance differently. Furthermore, the effect of these centrality metrics on recommendation performance may vary depending on the recommender algorithm. In addition, recommendation techniques using network analysis can be expected to increase recommendation performance not only for new customers or products but for all customers and products. By treating a customer's purchase of an item as a link generated between the customer and the item in the network, the prediction of whether a user accepts a recommendation is solved as a prediction of whether a new link will be created between them.
Because classification models suit the binary problem of whether or not a link is formed, decision tree, k-nearest neighbors (KNN), logistic regression, artificial neural network, and support vector machine (SVM) models were selected for the study. The performance evaluation used order data collected from an online shopping mall over four years and two months. The first three years and eight months of records were organized into the social network, and the following four months' records were used to train and evaluate the recommender models. Experiments with the centrality metrics applied to each model show that the recommendation acceptance rates of the centrality metrics differ for each algorithm at a meaningful level. In this work, we analyzed only four commonly used centrality metrics: degree centrality, betweenness centrality, closeness centrality, and eigenvector centrality. Eigenvector centrality records the lowest performance in all models except the support vector machine. Closeness centrality and betweenness centrality show similar performance across all models. Degree centrality ranks in the middle across all models, while betweenness centrality always ranks higher than degree centrality. Finally, closeness centrality is characterized by distinct differences in performance according to the model. It ranks first, with numerically high performance, in logistic regression, artificial neural network, and decision tree. However, it records very low rankings and low performance in the support vector machine and k-nearest neighbors models. As the experimental results reveal, in a classification model, network centrality metrics over a subnetwork that connects two nodes can effectively predict the connectivity between those nodes in a social network. Furthermore, each metric performs differently depending on the type of classification model.
This result implies that choosing appropriate metrics for each algorithm can lead to higher recommendation performance. In general, betweenness centrality can guarantee a high level of performance in any model. Introducing closeness centrality could be considered to obtain higher performance for certain models.
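As a rough illustration of how centrality metrics can become classifier inputs, the sketch below computes degree and closeness centrality on a small hypothetical customer graph in plain Python. The graph, the function names, and the `pair_features` helper are assumptions made for illustration. The study itself computes centralities over the subnetwork connecting two nodes and also uses betweenness and eigenvector centrality, which are omitted here.

```python
from collections import deque

def degree_centrality(graph):
    """Share of the other nodes that each node is directly linked to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}

def closeness_centrality(graph):
    """(number of reachable nodes) / (sum of shortest-path lengths), found by BFS.

    For a connected graph this equals the usual (n - 1) / (sum of distances).
    """
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

def pair_features(graph, a, b):
    """Hypothetical feature vector for predicting a link between nodes a and b."""
    deg = degree_centrality(graph)
    clo = closeness_centrality(graph)
    return [deg[a], deg[b], clo[a], clo[b]]

# Hypothetical undirected customer network (adjacency sets), not data from the study
graph = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
print(pair_features(graph, "A", "D"))
```

A feature vector of this kind, paired with a 0/1 label for whether the link later appeared, is the sort of input a decision tree, KNN, logistic regression, neural network, or SVM classifier could be trained on.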

Managerial Implication of Trails in the Taebaeksan National Park Derived from the Analysis of Visitors Behaviors Using Automatic Visitor Counter Data (탐방객 자동 계수기 데이터를 활용한 태백산국립공원 탐방로 탐방 행태 분석 및 관리 방안 제언)

  • Sung, Chan Yong;Cho, Woo;Kim, Jong-Sub
    • Korean Journal of Environment and Ecology / v.34 no.5 / pp.446-453 / 2020
  • This study built models to predict the daily number of visitors to 18 trails in the Taebaeksan National Park from automatic visitor counter data, analyzed the factors affecting the daily number of visitors to each trail, and classified the trails by visitor behavior. The multiple regression models for the 18 trails indicated that events, such as the National Foundation Day celebration and the Snow Festival, affected the number of visitors to all 18 trails and were the most critical factor determining the daily number of visitors to the park. Long holidays of three days or more and other national holidays also affected the daily number of visitors to the trails. Precipitation had a negative impact on the number of visitors to trails where most visitors came for sightseeing or camping rather than hiking, whereas it had no significant impact on trails where many visitors came for hiking. This indicates that visitors who intended to hike went ahead even when the weather was poor. Temperature had a positive effect on the number of visitors who intended to hike but a negative effect on the number of visitors to the trails near Danggol Plaza, where the Snow Festival is held each winter, suggesting that the Snow Festival is the decisive factor for trail management. K-means clustering showed that the 18 trails of the Taebaeksan National Park could be classified into three types: those affected by the Snow Festival (type 1), those that have sightseeing points and so were visited mostly by non-hikers (type 2), and those visited mostly by hikers (type 3). Since visitor behaviors and illegal actions differ according to the trail type, this study's results can be used to prepare a trail management plan based on the trail characteristics.
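The clustering step can be sketched with a small plain-Python K-means (Lloyd's algorithm). Everything specific here is an assumption for illustration: the two features per trail (a festival effect and a precipitation effect), their values, and the caller-supplied starting centroids are hypothetical and do not come from the study.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(pt, centroids[k]))
            for pt in points
        ]
        # Update step: each centroid moves to the mean of its members
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids

# Hypothetical (festival effect, precipitation effect) per trail
trails = [(0.9, -0.1), (0.8, -0.2), (0.1, -0.8), (0.2, -0.9), (0.1, -0.1), (0.0, 0.0)]
labels, centroids = kmeans(trails, [trails[0], trails[2], trails[4]])
print(labels)
```

With three clusters, the labels play the role of the three trail types described in the abstract; in practice the starting centroids would usually be chosen at random and the run repeated.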

Estimation of Wheat Growth using a Microwave Scatterometer (마이크로파 산란계를 이용한 밀 생육 추정)

  • Kim, Yihyun;Hong, Sukyoung;Lee, Kyungdo;Jang, Soyeong
    • Korean Journal of Soil Science and Fertilizer / v.46 no.1 / pp.23-31 / 2013
  • Microwave remote sensing can help monitor the land surface water cycle and crop growth. It has great potential compared with conventional remote sensing in the visible and infrared regions because of its all-weather, day-and-night imaging capability. In this paper, a ground-based multi-frequency (L-, C-, and X-band) polarimetric scatterometer system capable of making observations every 10 min was developed. This system was used to monitor wheat over an entire growth cycle. The polarimetric scatterometer components were installed inside an air-conditioned shelter to maintain constant temperature and humidity during the data acquisition period. Backscattering coefficients for the crop growing season were compared with biophysical measurements. Backscattering coefficients for all frequencies and polarizations increased until day of year 137 and then decreased along with fresh weight, dry weight, plant height, and vegetation water content (VWC). The range of backscatter for X-band was lower than for L- and C-band. We examined the relationship between the backscattering coefficients of each band (frequency/polarization) and the various wheat growth parameters. The correlation between the vegetation parameters and backscatter decreased with increasing frequency. L-band HH-polarization (L-HH) was best suited for monitoring fresh weight (r=0.98), dry weight (r=0.96), VWC (r=0.98), and plant height (r=0.96). The correlation coefficients were highest for L-band observations and lowest for X-band. Also, HH-polarization had the highest correlations among the polarization channels (HH, VV and HV). Based on the correlation analysis between the backscattering coefficients in each band and the wheat growth parameters, we developed prediction equations from the observed relationships between L-HH and fresh weight, dry weight, VWC, and plant height.
The results of these analyses will be useful in determining the optimum microwave frequency and polarization for estimating vegetation parameters in wheat.
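The correlation analysis and the prediction equations can be illustrated with a short sketch. The abstract does not give the form of the equations, so a simple least-squares line is assumed here; the backscatter and fresh-weight values, and the names `pearson_r` and `fit_line`, are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical L-HH backscatter (dB) and fresh weight (g/m^2); not data from the paper
l_hh = [-22.0, -19.5, -17.0, -15.0, -13.5]
fresh_weight = [150.0, 900.0, 1800.0, 2600.0, 3100.0]

r = pearson_r(l_hh, fresh_weight)
slope, intercept = fit_line(l_hh, fresh_weight)
print(f"r = {r:.2f}; fresh_weight ~ {slope:.1f} * L_HH + {intercept:.1f}")
```

Repeating the calculation for each band and polarization would give a table of correlation coefficients like the one the abstract summarizes.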

Development of the forecasting model for import volume by item of major countries based on economic, industrial structural and cultural factors: Focusing on the cultural factors of Korea (경제적, 산업구조적, 문화적 요인을 기반으로 한 주요 국가의 한국 품목별 수입액 예측 모형 개발: 한국의, 한국에 대한 문화적 요인을 중심으로)

  • Jun, Seung-pyo;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.27 no.4 / pp.23-48 / 2021
  • The Korean economy has grown continuously for the past several decades thanks to the government's export strategy. The increase in exports has played a leading role in driving Korea's economic growth by improving economic efficiency, creating jobs, and promoting technology development. Traditionally, the main factors affecting Korea's exports have been viewed from two perspectives: economic factors and industrial structural factors. First, economic factors are related to exchange rates and global economic fluctuations. The impact of the exchange rate on Korea's exports depends on the exchange rate level and exchange rate volatility. Global economic fluctuations affect global import demand, which is an absolute factor influencing Korea's exports. Second, industrial structural factors are characteristics specific to industries or products, such as the slowing international division of labor, China's increased domestic substitution of certain imported goods, and changes in the overseas production patterns of major export industries. Among the most recent studies on global exchange, several show the importance of cultural aspects as well as economic and industrial structural factors. Therefore, this study attempted to develop a forecasting model that considers cultural factors along with economic and industrial structural factors when calculating each country's import volume from Korea. In particular, this study approaches the influence of cultural factors on imports of Korean products from the perspective of a PUSH-PULL framework. The PUSH dimension is the perspective in which Korea develops and actively promotes its own brand; it can be defined as the degree of interest in each country in the Korean brands represented by K-POP, K-FOOD, and K-CULTURE. The PULL dimension is a perspective centered on the cultural and psychological characteristics of the people of each country.
It can be defined as how inclined they are to accept the Korean Wave, given each country's cultural code as represented by the country's governance system, masculinity, risk avoidance, and short-term/long-term orientation. The unique feature of this study is that the final prediction model is selected on the basis of design principles. The design principles we present are as follows. 1) The model was developed to reflect interest in Korea and cultural characteristics through newly added data sources. 2) It was designed to be practical and convenient, so that the forecast value can be retrieved immediately by entering changes in economic factors, the item code, and the country code. 3) To derive theoretically meaningful results, an algorithm was selected that can interpret the relationship between the inputs and the target variable. This study offers meaningful technical, economic, and policy implications, and the import forecasting model is expected to make a meaningful contribution to the export support strategies of small and medium-sized enterprises.

Determination of Grades and Design Strengths of Machine Graded Lumber in Korea (국내 기계등급구조재의 등급구분체계 및 기준설계값 결정방법 연구)

  • Hong, Jung-Pyo;Lee, Jun-Jae;Park, Moon-Jae;Yeo, Hwanmyeong;Pang, Sung-Jun;Kim, Chul-Ki;Oh, Jung-Kwon
    • Journal of the Korean Wood Science and Technology / v.43 no.4 / pp.446-455 / 2015
  • Based on comparative studies of the standards and grading procedures for machine graded lumber in Korea and other countries, this study proposes a procedure for determining the grade classification and design strengths of domestic machine graded lumber. Differences between machine stress rated lumber and E-rated laminations are detailed to clarify the need for improving the procedure. For this improvement, an average MOE (modulus of elasticity) requirement for grading is introduced in place of the fixed minimum MOE requirement currently used in the Korean standards. The fixed minimum MOE requirement method was found to be easier for an inspector to apply but less efficient as a strength predictor than the average MOE requirement method. The advantage of the average MOE requirement method is that it predicts MOR (modulus of rupture) statistically from an MOR-MOE regression and is highly efficient for quality control, although it requires a computer-aided operation system at initial setup. A major weakness of the current Korean grading system is that the different strength characteristics of wood species are not reflected in the grade classification or the tabulated allowable design stresses. The proposed procedure takes advantage of the respective merits of both methods and is based on MOR-MOE regression analysis. Through this procedure, the grades of machine stress rated lumber should be revised to become interchangeable with E-rated laminations, which would benefit the cost competitiveness of the domestic machine graded lumber and glued laminated timber industries.

Prediction of patent lifespan and analysis of influencing factors using machine learning (기계학습을 활용한 특허수명 예측 및 영향요인 분석)

  • Kim, Yongwoo;Kim, Min Gu;Kim, Young-Min
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.147-170 / 2022
  • Although the number of patents, one of the core outputs of technological innovation, continues to increase, the number of low-value patents has also increased greatly. Therefore, efficient evaluation of patents has become important. Estimation of patent lifespan, which represents the private value of a patent, has been studied for a long time, but in most cases it has relied on linear models. Even where machine learning methods were used, the interpretation or explanation of the relationship between explanatory variables and patent lifespan was insufficient. In this study, patent lifespan (number of renewals) is predicted based on the idea that patent lifespan represents the value of the patent. For the research, 4,033,414 patents that were applied for between 1996 and 2017 and finally granted were collected from the USPTO (US Patent and Trademark Office). To predict patent lifespan, we use variables that reflect the characteristics of the patent, the patent owner, and the inventor. We build four different models (Ridge Regression, Random Forest, Feed Forward Neural Network, and Gradient Boosting Models) and perform hyperparameter tuning through 5-fold cross-validation. Then, the performance of the generated models is evaluated, and the relative importance of the predictors is presented. In addition, based on the Gradient Boosting Model, which has excellent performance, an Accumulated Local Effects plot is presented to visualize the relationship between the predictors and patent lifespan. Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the reasons for the evaluation of individual patents, and discuss its applicability to patent evaluation systems.
This study is academically meaningful in that it contributes cumulatively to existing research on patent lifespan estimation and addresses the limitations of linear models. It is also practically meaningful in that it suggests a method for deriving the evaluation basis for an individual patent's value and examines its applicability to patent evaluation systems.
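To make the idea of a per-patent explanation concrete, the sketch below computes exact Shapley values for a tiny model by enumerating every feature coalition. This is not the Kernel SHAP estimator used in the study: Kernel SHAP approximates these values, whereas exact enumeration is feasible only for a handful of features. The toy linear model, the feature meanings, and the all-zero baseline are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical model: predicted renewals from (claims, citations, family size)
def predict(z):
    return 2.0 * z[0] + 3.0 * z[1] - 1.0 * z[2]

patent = [1.0, 2.0, 3.0]     # hypothetical patent features
reference = [0.0, 0.0, 0.0]  # hypothetical baseline
phi = shapley_values(predict, patent, reference)
print(phi)
```

The contributions sum to the difference between the prediction for the patent and the prediction for the baseline, which is what allows each feature's share of an individual evaluation to be reported.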

Utilization of Smart Farms in Open-field Agriculture Based on Digital Twin (디지털 트윈 기반 노지스마트팜 활용방안)

  • Kim, Sukgu
    • Proceedings of the Korean Society of Crop Science Conference / 2023.04a / pp.7-7 / 2023
  • Currently, the main technologies of the Fourth Industrial Revolution are big data, the Internet of Things, artificial intelligence, blockchain, mixed reality (MR), and drones. In particular, the "digital twin," which has recently become a global technological trend, is the concept of a virtual model in a computer that mirrors a physical object. By creating and simulating a digital twin of software-virtualized assets instead of real physical assets, accurate information about the characteristics of real farming (current state, agricultural productivity, agricultural work scenarios, etc.) can be obtained. This study aims to streamline agricultural work through automatic water management, remote growth forecasting, drone control, and pest forecasting by operating an integrated control system, constructing digital twin data on the main production areas of open-field crops, and designing and building a smart farm complex. In addition, it aims to spread digital environmental control agriculture in Korea that can reduce labor and improve crop productivity while minimizing environmental load through the use of appropriate amounts of fertilizers and pesticides based on big data analysis. These open-field agricultural technologies can reduce labor through digital farming and cultivation management, optimize water use and prevent soil pollution in preparation for climate change, and enable quantitative growth management of open-field crops by securing digital data on the national cultivation environment. They are also a way to directly implement carbon-neutral RED++ activities by improving agricultural productivity. Analysis and prediction of growth status based on high-precision, high-definition image-based crop growth data are very effective in managing digital farming work.
The Southern Crop Department of the National Institute of Food Science conducted research and development on various types of open-field agricultural smart farms, such as underground point and underground drainage. In particular, from this year, commercialization is underway in earnest through the establishment of smart farm facilities and technology distribution for agricultural technology complexes across the country. In this study, we describe a case of establishing an agricultural field that combines digital twin technology with open-field smart farm technology, together with plans for its future use.


Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and is showing remarkable results in fields such as classification, summarization, and generation. Among the various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary classification, with one label from two classes; multi-class classification, with one label from several classes; and multi-label classification, with multiple labels from several classes. In particular, multi-label classification requires a different training method from binary and multi-class classification because each instance has multiple labels. In addition, since the number of labels to be predicted grows as the number of labels and classes increases, performance is difficult to improve because prediction becomes harder. To overcome these limitations, research on label embedding is being actively conducted; label embedding (i) compresses the initially given high-dimensional label space into a low-dimensional latent label space, (ii) trains a model to predict the compressed label, and (iii) restores the predicted label to the original high-dimensional label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, since these techniques consider only linear relationships between labels or compress the labels by random transformation, they have difficulty capturing non-linear relationships between labels and therefore cannot create a latent label space that sufficiently contains the information of the original labels.
Recently, there have been increasing attempts to improve performance by applying deep learning technology to label embedding. A representative example is label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration. However, traditional autoencoder-based label embedding has a limitation in that a large amount of information is lost when a high-dimensional label space with a myriad of classes is compressed into a low-dimensional latent label space. This can be traced to the gradient loss problem that occurs in the backpropagation process of learning. To solve this problem, the skip connection was devised: by adding the input of a layer to its output to prevent gradient loss during backpropagation, efficient learning is possible even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies that use skip connections in autoencoders or in the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. The proposed methodology was applied to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space. Using these, we conducted an experiment that predicts the compressed keyword vector in the latent label space from the paper abstract and evaluates multi-label classification by restoring the predicted keyword vector to the original label space. As a result, the accuracy, precision, recall, and F1 score used as performance indicators were far superior for multi-label classification based on the proposed methodology compared with traditional multi-label classification methods.
This shows that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of the multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance according to the domain characteristics and the number of dimensions of the latent label space.
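The skip connection described in this abstract, adding a layer's input to its output, can be shown with a minimal forward pass in plain Python. The layer sizes, the ReLU activation, and the names `dense`, `relu`, and `residual_block` are assumptions for illustration; the abstract does not specify the architecture of the proposed autoencoder.

```python
def dense(x, weights, bias):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]

def residual_block(x, weights, bias):
    """y = relu(W x + b) + x: the layer input is added back to the layer output."""
    h = relu(dense(x, weights, bias))
    return [hi + xi for hi, xi in zip(h, x)]

# With all-zero weights the block passes its input through unchanged, which is
# why the added identity path keeps a direct route for gradients in deep stacks.
x = [3.0, 4.0]
zero_w = [[0.0, 0.0], [0.0, 0.0]]
zero_b = [0.0, 0.0]
print(residual_block(x, zero_w, zero_b))
```

Because the input is added element-wise, this form of skip connection requires the layer's output to have the same dimension as its input; blocks that change dimension need a projection on the skip path.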

Analysis of domestic water usage patterns in Chungcheong using historical data of domestic water usage and climate variables (생활용수 실적자료와 기후 변수를 활용한 충청권역 생활용수 이용량 패턴 분석)

  • Kim, Min Ji;Park, Sung Min;Lee, Kyungju;So, Byung-Jin;Kim, Tae-Woong
    • Journal of Korea Water Resources Association / v.57 no.1 / pp.1-8 / 2024
  • Persistent droughts due to climate change will intensify water shortage problems in Korea. According to the 1st National Water Management Plan, the shortage of domestic and industrial water is projected to be 0.07 billion m³/year under a 50-year drought event. Long-term prediction of water demand is essential for responding effectively to water shortage problems. Unlike industrial water, which has a relatively constant monthly usage, domestic water is analyzed on a monthly basis because of its apparent monthly usage patterns. We analyzed monthly water usage patterns using water usage data from 2017 to 2021 in Chungcheong, South Korea. The monthly water usage rate was calculated by dividing monthly water usage by annual water usage. We also calculated the water distribution rate considering correlations between the water usage rate and climate variables. The division method, which divides the monthly water usage rate by the monthly average temperature, resulted in the smallest absolute error. Using the division method with average temperature, we calculated the water distribution rates for the Chungcheong region. We then predicted future water usage rates in the Chungcheong region by multiplying the average temperature of the SSP5-8.5 scenario by the water distribution rate. As a result, the average of the maximum water usage rate increased from 1.16 to 1.29, the average of the minimum water usage rate decreased from 0.86 to 0.84, the first quartile decreased from 0.95 to 0.93, and the third quartile increased from 1.04 to 1.06. Therefore, the variability in monthly water usage rates is expected to increase in the future.
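The arithmetic of the division method can be sketched as follows. Two assumptions are made loudly here: the usage rate is normalized by the mean monthly usage (annual usage divided by the number of months) so that rates center on 1, as the reported values do, and temperatures are nonzero. The abstract does not state either detail, and the usage and temperature values and the function names are hypothetical.

```python
def usage_rates(monthly_usage):
    """Monthly usage divided by mean monthly usage (assumed normalization)."""
    mean_monthly = sum(monthly_usage) / len(monthly_usage)
    return [u / mean_monthly for u in monthly_usage]

def distribution_rates(rates, temps):
    """Division method: usage rate per unit of monthly average temperature."""
    if any(t == 0 for t in temps):
        raise ValueError("a temperature of 0 makes the division method undefined")
    return [r / t for r, t in zip(rates, temps)]

def project_rates(dist_rates, future_temps):
    """Future usage rate = future average temperature x distribution rate."""
    return [d * t for d, t in zip(dist_rates, future_temps)]

# Hypothetical three-month example (usage in 1,000 m^3, temperature in deg C)
usage = [90.0, 100.0, 110.0]
temps_now = [10.0, 20.0, 30.0]
temps_future = [11.0, 22.0, 34.0]

rates = usage_rates(usage)
dist = distribution_rates(rates, temps_now)
future = project_rates(dist, temps_future)
print([round(v, 3) for v in future])
```

Feeding the present-day temperatures back into `project_rates` reproduces the observed usage rates, so any change in the projected rates comes entirely from the scenario temperatures.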