• Title/Summary/Keyword: Trained Model

Search Result 1,531, Processing Time 0.03 seconds

Comparison of CNN and GAN-based Deep Learning Models for Ground Roll Suppression (그라운드-롤 제거를 위한 CNN과 GAN 기반 딥러닝 모델 비교 분석)

  • Sangin Cho;Sukjoon Pyun
    • Geophysics and Geophysical Exploration
    • /
    • v.26 no.2
    • /
    • pp.37-51
    • /
    • 2023
  • The ground roll is the most common coherent noise in land seismic data and has an amplitude much larger than the reflection event we usually want to obtain. Therefore, ground roll suppression is a crucial step in seismic data processing. Several techniques, such as f-k filtering and curvelet transform, have been developed to suppress the ground roll. However, the existing methods still require improvements in suppression performance and efficiency. Various studies on the suppression of ground roll in seismic data have recently been conducted using deep learning methods developed for image processing. In this paper, we introduce three models (DnCNN (De-noiseCNN), pix2pix, and CycleGAN), based on convolutional neural network (CNN) or conditional generative adversarial network (cGAN), for ground roll suppression and explain them in detail through numerical examples. Common shot gathers from the same field were divided into training and test datasets to compare the algorithms. We trained the models using the training data and evaluated their performances using the test data. When training these models with field data, ground roll removed data are required; therefore, the ground roll is suppressed by f-k filtering and used as the ground-truth data. To evaluate the performance of the deep learning models and compare the training results, we utilized quantitative indicators such as the correlation coefficient and structural similarity index measure (SSIM) based on the similarity to the ground-truth data. The DnCNN model exhibited the best performance, and we confirmed that other models could also be applied to suppress the ground roll.

Mean Teacher Learning Structure Optimization for Semantic Segmentation of Crack Detection (균열 탐지의 의미론적 분할을 위한 Mean Teacher 학습 구조 최적화 )

  • Seungbo Shim
    • Journal of the Korea institute for structural maintenance and inspection
    • /
    • v.27 no.5
    • /
    • pp.113-119
    • /
    • 2023
  • Most infrastructure structures were completed during periods of economic growth. The number of infrastructure structures reaching their lifespan is increasing, and the proportion of old structures is gradually increasing. The functions and performance of these structures at the time of design may deteriorate and may even lead to safety accidents. To prevent this repercussion, accurate inspection and appropriate repair are requisite. To this end, demand is increasing for computer vision and deep learning technology to accurately detect even minute cracks. However, deep learning algorithms require a large number of training data. In particular, label images indicating the location of cracks in the image are required. To secure a large number of those label images, a lot of labor and time are consumed. To reduce these costs as well as increase detection accuracy, this study proposed a learning structure based on mean teacher method. This learning structure was trained on a dataset of 900 labeled image dataset and 3000 unlabeled image dataset. The crack detection network model was evaluated on over 300 labeled image dataset, and the detection accuracy recorded a mean intersection over union of 89.23% and an F1 score of 89.12%. Through this experiment, it was confirmed that detection performance was improved compared to supervised learning. It is expected that this proposed method will be used in the future to reduce the cost required to secure label images.

Estimation of the Spring and Summer Net Community Production in the Ulleung Basin using Machine Learning Methods (기계학습법을 이용한 동해 울릉분지의 봄과 여름 순군집생산 추정)

  • DOSHIK HAHM;INHEE LEE;MINKI CHOO
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY
    • /
    • v.29 no.1
    • /
    • pp.1-13
    • /
    • 2024
  • The southwestern part of the East Sea is known to have a high primary productivity compared to those in the northern and eastern parts, which is attributed to nutrients supplies either by Tsushima Warm Current or by coastal upwelling. However, research on the biological pump in this area is limited. We developed machine learning models to estimate net community production (NCP), a measure of biological pump, with high spatial and time scales of 4 km and 8 days, respectively. The models were fed with the input parameters of sea surface temperature, chlorophyll-a, mixed layer depths, and photosynthetically active radiation and trained with observed NCP derived from high resolution measurements of surface O2/Ar. The root mean square error between the predicted values by the best performing machine model and the observed NCP was 6 mmol O2 m-2 d-1, corresponding to 15% of the average of observed NCP. The NCP in the central part of the Ulleung Basin was highest in March at 49 mmol O2 m-2 d-1 and lowest in June and July at 18 mmol O2 m-2 d-1. These seasonal variations were similar to the vertical nitrate flux based on the 3He gas exchange rate and to the particulate organic carbon flux estimated by the 234Th disequilibrium method. To expand this method, which produces NCP estimate for spring and summer, to autumn and winter, it is necessary to devise a way to correct bias in NCP by the entrainment of subsurface waters during the seasons.

Predicting Potential Habitat for Hanabusaya Asiatica in the North and South Korean Border Region Using MaxEnt (MaxEnt 모형 분석을 통한 남북한 접경지역의 금강초롱꽃 자생가능지 예측)

  • Sung, Chan Yong;Shin, Hyun-Tak;Choi, Song-Hyun;Song, Hong-Seon
    • Korean Journal of Environment and Ecology
    • /
    • v.32 no.5
    • /
    • pp.469-477
    • /
    • 2018
  • Hanabusaya asiatica is an endemic species whose distribution is limited in the mid-eastern part of the Korean peninsula. Due to its narrow range and small population, it is necessary to protect its habitats by identifying it as Key Biodiversity Areas (KBAs) adopted by the International Union for Conservation of Nature (IUCN). In this paper, we estimated potential natural habitats for H. asiatica using maximum entropy model (MaxEnt) and identified candidate sites for KBA based on the model results. MaxEnt is a machine learning algorithm that can predict habitats for species of interest unbiasedly with presence-only data. This property is particularly useful for the study area where data collection via a field survey is unavailable. We trained MaxEnt using 38 locations of H. asiatica and 11 environmental variables that measured climate, topography, and vegetation status of the study area which encompassed all locations of the border region between South and North Korea. Results showed that the potential habitats where the occurrence probabilities of H. asiatica exceeded 0.5 were $778km^2$, and the KBA candidate area identified by taking into account existing protected areas was $1,321km^2$. Of 11 environmental variables, elevation, annual average precipitation, average precipitation in growing seasons, and the average temperature in the coldest month had impacts on habitat selection, indicating that H. asiatica prefers cool regions at a relatively high elevation. These results can be used not only for identifying KBAs but also for the reference to a protection plan for H. asiatica in preparation of Korean reunification and climate change.

Deep Learning Approaches for Accurate Weed Area Assessment in Maize Fields (딥러닝 기반 옥수수 포장의 잡초 면적 평가)

  • Hyeok-jin Bak;Dongwon Kwon;Wan-Gyu Sang;Ho-young Ban;Sungyul Chang;Jae-Kyeong Baek;Yun-Ho Lee;Woo-jin Im;Myung-chul Seo;Jung-Il Cho
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.25 no.1
    • /
    • pp.17-27
    • /
    • 2023
  • Weeds are one of the factors that reduce crop yield through nutrient and photosynthetic competition. Quantification of weed density are an important part of making accurate decisions for precision weeding. In this study, we tried to quantify the density of weeds in images of maize fields taken by unmanned aerial vehicle (UAV). UAV image data collection took place in maize fields from May 17 to June 4, 2021, when maize was in its early growth stage. UAV images were labeled with pixels from maize and those without and the cropped to be used as the input data of the semantic segmentation network for the maize detection model. We trained a model to separate maize from background using the deep learning segmentation networks DeepLabV3+, U-Net, Linknet, and FPN. All four models showed pixel accuracy of 0.97, and the mIOU score was 0.76 and 0.74 in DeepLabV3+ and U-Net, higher than 0.69 for Linknet and FPN. Weed density was calculated as the difference between the green area classified as ExGR (Excess green-Excess red) and the maize area predicted by the model. Each image evaluated for weed density was recombined to quantify and visualize the distribution and density of weeds in a wide range of maize fields. We propose a method to quantify weed density for accurate weeding by effectively separating weeds, maize, and background from UAV images of maize fields.

Research on hybrid music recommendation system using metadata of music tracks and playlists (음악과 플레이리스트의 메타데이터를 활용한 하이브리드 음악 추천 시스템에 관한 연구)

  • Hyun Tae Lee;Gyoo Gun Lim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.145-165
    • /
    • 2023
  • Recommendation system plays a significant role on relieving difficulties of selecting information among rapidly increasing amount of information caused by the development of the Internet and on efficiently displaying information that fits individual personal interest. In particular, without the help of recommendation system, E-commerce and OTT companies cannot overcome the long-tail phenomenon, a phenomenon in which only popular products are consumed, as the number of products and contents are rapidly increasing. Therefore, the research on recommendation systems is being actively conducted to overcome the phenomenon and to provide information or contents that are aligned with users' individual interests, in order to induce customers to consume various products or contents. Usually, collaborative filtering which utilizes users' historical behavioral data shows better performance than contents-based filtering which utilizes users' preferred contents. However, collaborative filtering can suffer from cold-start problem which occurs when there is lack of users' historical behavioral data. In this paper, hybrid music recommendation system, which can solve cold-start problem, is proposed based on the playlist data of Melon music streaming service that is given by Kakao Arena for music playlist continuation competition. The goal of this research is to use music tracks, that are included in the playlists, and metadata of music tracks and playlists in order to predict other music tracks when the half or whole of the tracks are masked. Therefore, two different recommendation procedures were conducted depending on the two different situations. When music tracks are included in the playlist, LightFM is used in order to utilize the music track list of the playlists and metadata of each music tracks. Then, the result of Item2Vec model, which uses vector embeddings of music tracks, tags and titles for recommendation, is combined with the result of LightFM model to create final recommendation list. When there are no music tracks available in the playlists but only playlists' tags and titles are available, recommendation was made by finding similar playlists based on playlists vectors which was made by the aggregation of FastText pre-trained embedding vectors of tags and titles of each playlists. As a result, not only cold-start problem can be resolved, but also achieved better performance than ALS, BPR and Item2Vec by using the metadata of both music tracks and playlists. In addition, it was found that the LightFM model, which uses only artist information as an item feature, shows the best performance compared to other LightFM models which use other item features of music tracks.

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.167-181
    • /
    • 2018
  • Over the past decade, deep learning has been in spotlight among various machine learning algorithms. In particular, CNN(Convolutional Neural Network), which is known as the effective solution for recognizing and classifying images or voices, has been popularly applied to classification and prediction problems. In this study, we investigate the way to apply CNN in business problem solving. Specifically, this study propose to apply CNN to stock market prediction, one of the most challenging tasks in the machine learning research. As mentioned, CNN has strength in interpreting images. Thus, the model proposed in this study adopts CNN as the binary classifier that predicts stock market direction (upward or downward) by using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics an experts called 'technical analysts' who examine the graph of past price movement, and predict future financial price movements. Our proposed model named 'CNN-FG(Convolutional Neural Network using Fluctuation Graph)' consists of five steps. In the first step, it divides the dataset into the intervals of 5 days. And then, it creates time series graphs for the divided dataset in step 2. The size of the image in which the graph is drawn is $40(pixels){\times}40(pixels)$, and the graph of each independent variable was drawn using different colors. In step 3, the model converts the images into the matrices. Each image is converted into the combination of three matrices in order to express the value of the color using R(red), G(green), and B(blue) scale. In the next step, it splits the dataset of the graph images into training and validation datasets. We used 80% of the total dataset as the training dataset, and the remaining 20% as the validation dataset. And then, CNN classifiers are trained using the images of training dataset in the final step. Regarding the parameters of CNN-FG, we adopted two convolution filters ($5{\times}5{\times}6$ and $5{\times}5{\times}9$) in the convolution layer. In the pooling layer, $2{\times}2$ max pooling filter was used. The numbers of the nodes in two hidden layers were set to, respectively, 900 and 32, and the number of the nodes in the output layer was set to 2(one is for the prediction of upward trend, and the other one is for downward trend). Activation functions for the convolution layer and the hidden layer were set to ReLU(Rectified Linear Unit), and one for the output layer set to Softmax function. To validate our model - CNN-FG, we applied it to the prediction of KOSPI200 for 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups in the independent variable (i.e. tomorrow's stock market movement), we selected 1,950 samples by applying random sampling. Finally, we built the training dataset using 80% of the total dataset (1,560 samples), and the validation dataset using 20% (390 samples). The dependent variables of the experimental dataset included twelve technical indicators popularly been used in the previous studies. They include Stochastic %K, Stochastic %D, Momentum, ROC(rate of change), LW %R(Larry William's %R), A/D oscillator(accumulation/distribution oscillator), OSCP(price oscillator), CCI(commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with the ones of other classification models. Experimental results showed that CNN-FG outperforms LOGIT(logistic regression), ANN(artificial neural network), and SVM(support vector machine) with the statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models using these graphs can be effective from the perspective of prediction accuracy. Thus, this paper sheds a light on how to apply deep learning techniques to the domain of business problem solving.

The Classification System and Information Service for Establishing a National Collaborative R&D Strategy in Infectious Diseases: Focusing on the Classification Model for Overseas Coronavirus R&D Projects (국가 감염병 공동R&D전략 수립을 위한 분류체계 및 정보서비스에 대한 연구: 해외 코로나바이러스 R&D과제의 분류모델을 중심으로)

  • Lee, Doyeon;Lee, Jae-Seong;Jun, Seung-pyo;Kim, Keun-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.127-147
    • /
    • 2020
  • The world is suffering from numerous human and economic losses due to the novel coronavirus infection (COVID-19). The Korean government established a strategy to overcome the national infectious disease crisis through research and development. It is difficult to find distinctive features and changes in a specific R&D field when using the existing technical classification or science and technology standard classification. Recently, a few studies have been conducted to establish a classification system to provide information about the investment research areas of infectious diseases in Korea through a comparative analysis of Korea government-funded research projects. However, these studies did not provide the necessary information for establishing cooperative research strategies among countries in the infectious diseases, which is required as an execution plan to achieve the goals of national health security and fostering new growth industries. Therefore, it is inevitable to study information services based on the classification system and classification model for establishing a national collaborative R&D strategy. Seven classification - Diagnosis_biomarker, Drug_discovery, Epidemiology, Evaluation_validation, Mechanism_signaling pathway, Prediction, and Vaccine_therapeutic antibody - systems were derived through reviewing infectious diseases-related national-funded research projects of South Korea. A classification system model was trained by combining Scopus data with a bidirectional RNN model. The classification performance of the final model secured robustness with an accuracy of over 90%. In order to conduct the empirical study, an infectious disease classification system was applied to the coronavirus-related research and development projects of major countries such as the STAR Metrics (National Institutes of Health) and NSF (National Science Foundation) of the United States(US), the CORDIS (Community Research & Development Information Service)of the European Union(EU), and the KAKEN (Database of Grants-in-Aid for Scientific Research) of Japan. It can be seen that the research and development trends of infectious diseases (coronavirus) in major countries are mostly concentrated in the prediction that deals with predicting success for clinical trials at the new drug development stage or predicting toxicity that causes side effects. The intriguing result is that for all of these nations, the portion of national investment in the vaccine_therapeutic antibody, which is recognized as an area of research and development aimed at the development of vaccines and treatments, was also very small (5.1%). It indirectly explained the reason of the poor development of vaccines and treatments. Based on the result of examining the investment status of coronavirus-related research projects through comparative analysis by country, it was found that the US and Japan are relatively evenly investing in all infectious diseases-related research areas, while Europe has relatively large investments in specific research areas such as diagnosis_biomarker. Moreover, the information on major coronavirus-related research organizations in major countries was provided by the classification system, thereby allowing establishing an international collaborative R&D projects.

A Study on the Establishment of Comparison System between the Statement of Military Reports and Related Laws (군(軍) 보고서 등장 문장과 관련 법령 간 비교 시스템 구축 방안 연구)

  • Jung, Jiin;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.109-125
    • /
    • 2020
  • The Ministry of National Defense is pushing for the Defense Acquisition Program to build strong defense capabilities, and it spends more than 10 trillion won annually on defense improvement. As the Defense Acquisition Program is directly related to the security of the nation as well as the lives and property of the people, it must be carried out very transparently and efficiently by experts. However, the excessive diversification of laws and regulations related to the Defense Acquisition Program has made it challenging for many working-level officials to carry out the Defense Acquisition Program smoothly. It is even known that many people realize that there are related regulations that they were unaware of until they push ahead with their work. In addition, the statutory statements related to the Defense Acquisition Program have the tendency to cause serious issues even if only a single expression is wrong within the sentence. Despite this, efforts to establish a sentence comparison system to correct this issue in real time have been minimal. Therefore, this paper tries to propose a "Comparison System between the Statement of Military Reports and Related Laws" implementation plan that uses the Siamese Network-based artificial neural network, a model in the field of natural language processing (NLP), to observe the similarity between sentences that are likely to appear in the Defense Acquisition Program related documents and those from related statutory provisions to determine and classify the risk of illegality and to make users aware of the consequences. Various artificial neural network models (Bi-LSTM, Self-Attention, D_Bi-LSTM) were studied using 3,442 pairs of "Original Sentence"(described in actual statutes) and "Edited Sentence"(edited sentences derived from "Original Sentence"). Among many Defense Acquisition Program related statutes, DEFENSE ACQUISITION PROGRAM ACT, ENFORCEMENT RULE OF THE DEFENSE ACQUISITION PROGRAM ACT, and ENFORCEMENT DECREE OF THE DEFENSE ACQUISITION PROGRAM ACT were selected. Furthermore, "Original Sentence" has the 83 provisions that actually appear in the Act. "Original Sentence" has the main 83 clauses most accessible to working-level officials in their work. "Edited Sentence" is comprised of 30 to 50 similar sentences that are likely to appear modified in the county report for each clause("Original Sentence"). During the creation of the edited sentences, the original sentences were modified using 12 certain rules, and these sentences were produced in proportion to the number of such rules, as it was the case for the original sentences. After conducting 1 : 1 sentence similarity performance evaluation experiments, it was possible to classify each "Edited Sentence" as legal or illegal with considerable accuracy. In addition, the "Edited Sentence" dataset used to train the neural network models contains a variety of actual statutory statements("Original Sentence"), which are characterized by the 12 rules. On the other hand, the models are not able to effectively classify other sentences, which appear in actual military reports, when only the "Original Sentence" and "Edited Sentence" dataset have been fed to them. The dataset is not ample enough for the model to recognize other incoming new sentences. Hence, the performance of the model was reassessed by writing an additional 120 new sentences that have better resemblance to those in the actual military report and still have association with the original sentences. Thereafter, we were able to check that the models' performances surpassed a certain level even when they were trained merely with "Original Sentence" and "Edited Sentence" data. If sufficient model learning is achieved through the improvement and expansion of the full set of learning data with the addition of the actual report appearance sentences, the models will be able to better classify other sentences coming from military reports as legal or illegal. Based on the experimental results, this study confirms the possibility and value of building "Real-Time Automated Comparison System Between Military Documents and Related Laws". The research conducted in this experiment can verify which specific clause, of several that appear in related law clause is most similar to the sentence that appears in the Defense Acquisition Program-related military reports. This helps determine whether the contents in the military report sentences are at the risk of illegality when they are compared with those in the law clauses.

Investigation of Factors on the Sensory Characteristics of Milk Bread with Tumeric Powder (Curcuma longa L.) Using Fractional Factorial Design Method (부분배치법을 활용한 울금 분말 첨가 우유식빵의 관능적 영향 인자 탐색)

  • Jung, Kyong Im;Park, Jae Ha;Kim, Mi Jeong
    • Journal of the Korean Society of Food Science and Nutrition
    • /
    • v.43 no.4
    • /
    • pp.592-603
    • /
    • 2014
  • We developed various recipes of turmeric powder (Curcuma longa L.) added to milk bread and assessed the individual effects of seven ingredients [milk ($X_1$), turmeric powder ($X_2$), bread improver ($X_3$), fresh yeast ($X_4$), butter ($X_5$), sugar ($X_6$), and salt ($X_7$)] as well as the 2-way interaction effects of the ingredients on the sensory characteristics of breads using fractional factorial design method. The center and end points of each component were determined via literature review and multiple test baking. Seven trained sensory test panels evaluated the outside appearance (OA), inside appearance (IA), and flavor & texture (FT) of 38 breads using 46 items of sensory evaluation. Findings are as follows: for the OA, $X_1$ (P<0.05) and $X_4$ (P<0.0001) exhibited significant individual effects, whereas $X_1*X_7$, $X_2*X_5$, $X_3*X_6$, and $X_4*X_6$ indicated significant interaction effects (P<0.05). For the IA, $X_1$ (P<0.0001), $X_4$ (P<0.0001), $X_6$ (P<0.05), $X_2*X_4$ (P<0.05), and $X_3*X_6$ (P<0.01) showed individual and interaction effects, respectively. For the FT, $X_1$ and $X_2$ showed the most significant individual effect (P<0.0001), followed by $X_4$, $X_5$ and $X_6$ (P<0.05) in descending order. $X_4*X_7$ indicated the only significant interaction effect. We computed the magnitudes of the 2-way interaction effects of the ingredients with a distinct emphasis. Model equations predicting the levels of the ingredient effects on the breads were also provided via regression analyses. In summation, $X_4$ appeared to be the most significant component affecting the sensory characteristics based on its individual and 2-way interaction effects. Further, $X_6$, $X_1$, $X_2$, and $X_5$ indicated both individual and interaction effects. $X_3$ and X7 showed only interaction effects. The center point effect appeared to be unequivocal for whole sensory characteristics. Findings of the present study may provide insights into the selection of ingredients to derive an optimal model for turmeric powder-added bread using the response surface method hereafter.