• Title/Summary/Keyword: Deep Learning System


Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

  • Kim, JaeHun;Lee, Myungjin
    • Journal of Intelligence and Information Systems, v.25 no.1, pp.43-61, 2019
  • Development of artificial intelligence technologies has accelerated with the Fourth Industrial Revolution, and AI research has been actively conducted in a variety of fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s, this research has focused on cognitive problems related to human intelligence, such as learning and problem solving, and thanks to recent interest in the technology and research on various algorithms, the field has achieved greater technological advances than ever. The knowledge-based system is a sub-domain of artificial intelligence that aims to enable AI agents to make decisions using machine-readable, processable knowledge constructed from complex and informal human knowledge and rules in various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and it is now often used together with statistical artificial intelligence such as machine learning. More recently, the purpose of a knowledge base has been to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data. Such knowledge bases support intelligent processing in many AI applications, for example the question-answering systems of smart speakers. However, building a useful knowledge base is time-consuming and still requires considerable expert effort. Much recent research on knowledge-based AI uses DBpedia, one of the largest knowledge bases, which aims to extract structured content from the various information in Wikipedia. DBpedia contains information extracted from Wikipedia such as titles, categories, and links, but its most useful knowledge comes from Wikipedia infoboxes, which present user-created summaries of some unifying aspect of an article. This knowledge is created through mapping rules between infobox structures and the DBpedia ontology schema defined in the DBpedia Extraction Framework. Because the knowledge is generated from semi-structured infobox data created by users, DBpedia can expect high reliability in terms of accuracy. However, since only about 50% of all wiki pages in the Korean Wikipedia contain an infobox, DBpedia is limited in terms of knowledge scalability. This paper proposes a method to extract knowledge from text documents according to the ontology schema using machine learning. To demonstrate the appropriateness of this method, we explain a knowledge extraction model that follows the DBpedia ontology schema by learning from Wikipedia infoboxes. Our knowledge extraction model consists of three steps: classifying documents into ontology classes, classifying the sentences suitable for triple extraction, and selecting values and transforming them into RDF triples. Wikipedia infobox structures are defined by infobox templates that provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these templates. Based on these mapping relations, we classify the input document into infobox categories, which correspond to ontology classes. After determining the document's classification, we classify the appropriate sentences according to the attributes belonging to that class. Finally, we extract knowledge from the sentences classified as appropriate and convert it into triples. To train the models, we generated a training data set from a Wikipedia dump by adding BIO tags to sentences, and trained about 200 classes and about 2,500 relations for knowledge extraction. Furthermore, we ran comparative experiments with CRF and Bi-LSTM-CRF for the knowledge extraction step. Through the proposed process, structured knowledge can be utilized by extracting it from text documents according to the ontology schema. In addition, this methodology can significantly reduce the effort experts spend constructing instances that conform to the ontology schema.
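As a hedged illustration of the final step of the pipeline described above, the sketch below converts a BIO-tagged sentence into RDF-style triples; the tag names, the example sentence, and the `dbr:Seoul` subject are illustrative assumptions, not the paper's data or tag set.

```python
# Minimal sketch: collect B-/I- tagged value spans from a sentence and emit
# (subject, property, value) triples, as in the paper's third step.

def extract_triples(subject, tokens, tags):
    """Walk BIO tags; each B-...I-... span becomes one triple's value."""
    triples, span, prop = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):             # a new value span begins
            if span:
                triples.append((subject, prop, " ".join(span)))
            prop, span = tag[2:], [token]
        elif tag.startswith("I-") and span:  # continue the current span
            span.append(token)
        else:                                # an "O" tag closes any open span
            if span:
                triples.append((subject, prop, " ".join(span)))
            span, prop = [], None
    if span:
        triples.append((subject, prop, " ".join(span)))
    return triples

tokens = ["Seoul", "is", "the", "capital", "of", "South", "Korea", "."]
tags   = ["O", "O", "O", "O", "O", "B-country", "I-country", "O"]
print(extract_triples("dbr:Seoul", tokens, tags))
# [('dbr:Seoul', 'country', 'South Korea')]
```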

Damage of Whole Crop Maize in Abnormal Climate Using Machine Learning (이상기상 시 사일리지용 옥수수의 기계학습을 이용한 피해량 산출)

  • Kim, Ji Yung;Choi, Jae Seong;Jo, Hyun Wook;Kim, Moon Ju;Kim, Byong Wan;Sung, Kyung Il
    • Journal of The Korean Society of Grassland and Forage Science, v.42 no.2, pp.127-136, 2022
  • This study estimated the damage to Whole Crop Maize (WCM) under abnormal climate using machine learning and presented the damage through mapping. A total of 3,232 WCM records were collected, and climate data were obtained from the Korea Meteorological Administration's open meteorological data portal. Deep Crossing was used as the machine learning model. The damage was calculated by machine learning from the climate data of the Automated Synoptic Observing System (95 sites) as the difference between the normal dry matter yield (DMYnormal) and the abnormal dry matter yield (DMYabnormal). The normal climate was defined as the 40 years of climate data matching the year of the WCM data (1978~2017), and the level of abnormal climate was set as a multiple of the standard deviation, applying the World Meteorological Organization (WMO) standard. DMYnormal ranged from 13,845 to 19,347 kg/ha. The damage to WCM differed by region and level of abnormal climate, ranging from -305 to 310, -54 to 89, and -610 to 813 kg/ha for abnormal temperature, precipitation, and wind speed, respectively. The maximum damage was 310 kg/ha when the abnormal temperature was at the +2 level (+1.42 ℃), 89 kg/ha when the abnormal precipitation was at the -2 level (-0.12 mm), and 813 kg/ha when the abnormal wind speed was at the -2 level (-1.60 m/s). The damage calculated by the WMO method was presented as a map using QGIS. Some areas remained blank because no data were available; to calculate the damage in those areas, the Automatic Weather System (AWS), which provides data from more sites than the Automated Synoptic Observing System (ASOS), could be used.
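A minimal sketch of the calculation described above, under stated assumptions: the abnormal-climate level shifts the 40-year mean by a multiple of the standard deviation (WMO style), and damage is DMYnormal minus DMYabnormal. The numbers and variable names below are synthetic stand-ins, not the study's data.

```python
# Hedged sketch of the damage calculation: damage = DMY_normal - DMY_abnormal.
import statistics

def abnormal_value(normals, level):
    """Shift the 40-year mean by `level` standard deviations (WMO-style)."""
    mean = statistics.mean(normals)
    sd = statistics.stdev(normals)
    return mean + level * sd

def damage(dmy_normal, dmy_abnormal):
    """Positive damage means yield loss under abnormal climate (kg/ha)."""
    return dmy_normal - dmy_abnormal

# 40 years of mean temperature for one ASOS site (synthetic values).
temps = [11.5 + 0.02 * y for y in range(40)]
t_plus2 = abnormal_value(temps, +2)             # the "+2 level" temperature
print(f"+2 level temperature: {t_plus2:.2f} C")
print(f"damage: {damage(19000, 18690)} kg/ha")  # e.g. 310 kg/ha at +2 level
```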

Data-Driven Approach to Identify Research Topics for Science and Technology Diplomacy (과학외교를 위한 데이터기반의 연구주제선정 방법)

  • Yeo, Woon-Dong;Kim, Seonho;Lee, BangRae;Noh, Kyung-Ran
    • The Journal of the Korea Contents Association, v.20 no.11, pp.216-227, 2020
  • In science and technology diplomacy, major countries actively use their scientific and technological capabilities for public diplomacy, especially to promote diplomatic relations with politically sensitive regions and countries. Recently, as the influence of science and technology on national development has grown, interest in science and technology diplomacy has increased. So far, science and technology diplomacy has relied on experts to find research topics of common interest to both countries. However, this method suffers from problems such as bias arising from experts' subjective judgment, a halo effect attributed to famous researchers, and the use of different criteria by different experts. This paper presents an objective, data-based approach to identify and recommend research topics that support science and technology diplomacy without relying on experts. The proposed approach is based on big data analysis using deep-learning techniques and bibliometric methods, with the Scopus database used to find suitable topics for collaborative research between two countries. The approach has been applied to support science and technology diplomacy between Korea and Hungary and has raised the expectations of policy makers. The paper concludes by discussing how the system should be improved in the future.
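As a rough illustration only (the paper's actual pipeline combines deep-learning text analysis with bibliometrics over Scopus records), the sketch below ranks candidate collaboration topics by keyword overlap between two countries' publication sets; the sample records and the min-count scoring rule are invented for the example.

```python
# Toy bibliometric overlap: score a keyword by the smaller of the two
# countries' publication counts, so shared strengths rank highest.
from collections import Counter

def shared_topics(papers_a, papers_b, top_n=3):
    ca = Counter(kw for p in papers_a for kw in p["keywords"])
    cb = Counter(kw for p in papers_b for kw in p["keywords"])
    scores = {kw: min(ca[kw], cb[kw]) for kw in ca.keys() & cb.keys()}
    return sorted(scores.items(), key=lambda x: -x[1])[:top_n]

korea = [{"keywords": ["batteries", "ai", "water treatment"]},
         {"keywords": ["ai", "semiconductors"]}]
hungary = [{"keywords": ["ai", "water treatment"]},
           {"keywords": ["mathematics", "ai"]}]
print(shared_topics(korea, hungary))  # [('ai', 2), ('water treatment', 1)]
```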

Timely Sensor Fault Detection Scheme based on Deep Learning (딥 러닝 기반 실시간 센서 고장 검출 기법)

  • Yang, Jae-Wan;Lee, Young-Doo;Koo, In-Soo
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.20 no.1, pp.163-169, 2020
  • Recently, with the advent of AI, big data, and the IoT, the core technologies of the Fourth Industrial Revolution, research on automated and unmanned operation of machines in industrial settings has been actively conducted. The machines in these automated processes are controlled, and the processes managed, based on data collected from the sensors attached to them. Conventionally, sensor abnormalities are checked and managed periodically. However, because of the varied environmental factors and situations in industrial settings, inspections may be missed or failures may go undetected, allowing damage from sensor faults; and even when a failure does occur, it may not be detected immediately, which worsens process losses. Therefore, to prevent damage from sudden sensor failures, it is necessary to identify a sensor failure in an embedded system in real time and to diagnose the failure and determine its type for a quick response. In this paper, a deep neural network-based fault diagnosis system is designed and implemented on a Raspberry Pi to classify typical sensor fault types: erratic, hard-over, spike, and stuck faults. To diagnose sensor failure, the network is built with the inverted residual block structure proposed in Google's MobileNetV2. The proposed scheme reduces memory usage and improves on the performance of a conventional CNN in classifying sensor faults.
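The sketch below is a minimal PyTorch rendering of MobileNetV2's inverted residual block, the structure the paper builds on: a 1x1 expansion, a depthwise 3x3 convolution, and a linear 1x1 projection, with a residual connection when shapes allow. The channel sizes, the 2-D input shape, and the small classifier head are assumptions for illustration, not the paper's exact network.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """MobileNetV2-style block: expand -> depthwise -> linear projection."""
    def __init__(self, in_ch, out_ch, stride=1, expand=6):
        super().__init__()
        hidden = in_ch * expand
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False),      # 1x1 expansion
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1,
                      groups=hidden, bias=False),          # 3x3 depthwise
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, out_ch, 1, bias=False),      # 1x1 projection
            nn.BatchNorm2d(out_ch),                        # linear bottleneck
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

# Classify the 4 fault types (erratic, hard-over, spike, stuck) from a
# sensor window reshaped to a 2-D map; the 1x32x32 shape is an assumption.
x = torch.randn(8, 1, 32, 32)
net = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), InvertedResidual(16, 16),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4))
print(net(x).shape)  # torch.Size([8, 4])
```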

A Study on the Development Trend of Artificial Intelligence Using Text Mining Technique: Focused on Open Source Software Projects on Github (텍스트 마이닝 기법을 활용한 인공지능 기술개발 동향 분석 연구: 깃허브 상의 오픈 소스 소프트웨어 프로젝트를 대상으로)

  • Chong, JiSeon;Kim, Dongsung;Lee, Hong Joo;Kim, Jong Woo
    • Journal of Intelligence and Information Systems, v.25 no.1, pp.1-19, 2019
  • Artificial intelligence (AI) is one of the main driving forces of the Fourth Industrial Revolution. AI technologies have already shown abilities equal or superior to people's in many fields, including image and speech recognition. Because AI can be applied across medical, financial, manufacturing, service, and education fields, many efforts have been made to identify current technology trends and analyze its development directions. Major platforms for developing complex AI algorithms for learning, reasoning, and recognition have been released to the public as open source projects, and the technologies and services that use them have grown rapidly; this has been confirmed as one of the main reasons for the fast development of AI technologies. The spread of the technology also owes much to open source software developed by major global companies that supports natural language recognition, speech recognition, and image recognition. This study therefore aimed to identify the practical trend of AI technology development by analyzing AI-related OSS projects developed through the online collaboration of many parties. We searched and collected a list of major AI-related projects created on Github from 2000 to July 2018, and examined the development trends of major technologies in detail by applying text mining to the topic information that characterizes the collected projects and their technical fields. The analysis showed that fewer than 100 software development projects were started per year until 2013, rising to 229 projects in 2014 and 597 in 2015. The number of AI-related open source projects then increased sharply in 2016 (2,559 projects). The 14,213 projects initiated in 2017 were almost four times the 3,555 projects generated in total from 2009 to 2016, and 8,737 projects were initiated from January to July 2018. The development trend of AI-related technologies was evaluated by dividing the study period into three phases, with the appearance frequency of topics indicating the technology trends of AI-related OSS projects. Natural language processing remained the top topic in all years, implying that its OSS has been developed continuously. Until 2015, the programming languages Python, C++, and Java were among the ten most frequent topics, but after 2016 all programming languages other than Python disappeared from the top ten; in their place, platforms supporting the development of AI algorithms, such as TensorFlow and Keras, showed high appearance frequency. Reinforcement learning algorithms and convolutional neural networks, which are used in many fields, were also frequent topics. Topic network analysis showed that the most important topics by degree centrality were similar to those by appearance frequency; the main difference was that visualization and medical imaging rose to the top of the centrality list although they had not been there from 2009 to 2012, indicating that OSS was being developed to apply AI in the medical field. Moreover, although computer vision was in the top ten by appearance frequency from 2013 to 2015, it was not in the top ten by degree centrality. The topics at the top of the degree centrality list were otherwise similar to those at the top of the appearance frequency list, with the ranks of convolutional neural networks and reinforcement learning changing only slightly. Examining the trend with both measures, machine learning showed the highest frequency and the highest degree centrality in all years. Notably, although the deep learning topic had low frequency and low degree centrality between 2009 and 2012, its rank rose abruptly between 2013 and 2015, and in recent years both machine learning and deep learning have had high appearance frequency and degree centrality. TensorFlow first appeared during the 2013-2015 phase, and its frequency and centrality soared between 2016 and 2018, placing it at the top of the lists after deep learning and Python. Computer vision and reinforcement learning showed no abrupt increase or decrease and had relatively low frequency and centrality compared with the topics above. Based on these results, it is possible to identify the fields in which AI technologies are actively developed, and the results can serve as a baseline dataset for more empirical analysis of future converging technology trends.
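To make the two trend measures concrete, the sketch below computes topic appearance frequency and degree centrality on a toy co-occurrence network using networkx; the three sample projects are invented, standing in for the mined Github topic lists.

```python
# Topics listed together on one project become edges of a co-occurrence
# graph; frequency counts raw appearances, centrality counts connections.
from collections import Counter
from itertools import combinations
import networkx as nx

projects = [["machine-learning", "tensorflow", "python"],
            ["deep-learning", "tensorflow"],
            ["machine-learning", "nlp", "python"]]

frequency = Counter(t for topics in projects for t in topics)

G = nx.Graph()
for topics in projects:
    G.add_edges_from(combinations(topics, 2))

centrality = nx.degree_centrality(G)
print(frequency.most_common(3))
print(sorted(centrality.items(), key=lambda x: -x[1])[:3])
```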

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems, v.27 no.3, pp.139-156, 2021
  • The biggest reason for using a deep learning model in image classification is that it can consider the relationships between regions by extracting each region's features from the overall information of the image. However, a CNN may not be suitable for emotional image data that lack distinctive regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures tailored to emotion images every year. Studies on the relationship between color and human emotion have also shown that different colors induce different emotions, and deep learning studies that add color information to image sentiment classification have found that using an image's color information in addition to the image itself improves the accuracy of classifying image emotions. This study proposes two ways to increase accuracy by adjusting the value produced after the model classifies an image's emotion; both modify the result based on statistics over the picture's colors. Before training, the most widely distributed two-color combination is found for every training image; at test time, the two-color combination most distributed in each test image is found, and the result values are corrected according to the color combination distribution. The correction weights the model's output using expressions based on the log and exponential functions. The image data were Emotion6, labeled with six emotions, and Artphoto, labeled with eight categories. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning. Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve accuracy by modifying the result values based on color. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black. Using Scikit-learn's clustering, the seven colors chiefly distributed in an image are identified, and the RGB coordinates of each are compared with the RGB coordinates of the 16 reference colors; that is, each is converted to the closest reference color. If combinations of three or more colors were selected, too many combinations would occur and the distribution would be scattered, so each combination would have little influence on the result value; to avoid this, two-color combinations were found and weighted into the model. The distribution of color combinations for each class was stored as a Python dictionary for use during testing. During the test, the two-color combination most distributed in each test image is found, its distribution in the training data is checked, and the result is corrected; we devised several equations to weight the model's result value based on the extracted color, as described above. The data set was randomly split 80:20, with 20% of the data held out as a test set. The remaining 80% was split into five folds for 5-fold cross-validation, so the model was trained five times using different validation datasets, and performance was finally checked on the previously separated test dataset. Adam was used as the optimizer, with the learning rate set to 0.01. Training ran for up to 20 epochs, stopping if the validation loss did not decrease during five epochs; early stopping was set to reload the model with the best validation loss. Classification accuracy was better when the extracted color information was used together with the CNN architecture than when the CNN architecture was used alone.
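A hedged sketch of the color step under stated assumptions: scikit-learn's KMeans finds the dominant RGB clusters of an image, each centroid is snapped to the nearest reference color, and the two most common names form the combination key. The reference palette below is a subset of the paper's 16 colors with illustrative RGB values, and a random array stands in for a real photo.

```python
import numpy as np
from sklearn.cluster import KMeans

REFERENCE = {"red": (255, 0, 0), "green": (0, 128, 0), "blue": (0, 0, 255),
             "yellow": (255, 255, 0), "black": (0, 0, 0),
             "white": (255, 255, 255)}   # subset of the paper's 16 colors

def nearest_color(rgb):
    """Snap an RGB centroid to the closest reference color name."""
    return min(REFERENCE,
               key=lambda n: np.sum((np.array(REFERENCE[n]) - rgb) ** 2))

def two_color_combination(image, n_clusters=7):
    """Cluster pixels, then return the two most common snapped names."""
    pixels = image.reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(pixels)
    counts = np.bincount(km.labels_)
    names = []
    for idx in np.argsort(counts)[::-1]:     # clusters by pixel share
        name = nearest_color(km.cluster_centers_[idx])
        if name not in names:
            names.append(name)
        if len(names) == 2:
            break
    return tuple(names)

image = np.random.randint(0, 256, (64, 64, 3))   # stand-in for a photo
print(two_color_combination(image))
```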

Image based Fire Detection using Convolutional Neural Network (CNN을 활용한 영상 기반의 화재 감지)

  • Kim, Young-Jin;Kim, Eun-Gyung
    • Journal of the Korea Institute of Information and Communication Engineering, v.20 no.9, pp.1649-1656, 2016
  • The performance of existing sensor-based fire detection systems is limited by factors in the environment surrounding the sensor. A number of image-based fire detection systems have been introduced to solve this problem, but systems whose algorithms directly define the characteristics of a flame can raise false alarms for objects that merely look like fire, and fire detection systems that rely on movement between video frames cannot operate as intended when the network is unstable. In this paper, we propose an image-based fire detection method using a CNN (Convolutional Neural Network). In this method, we first extract fire candidate regions from the input video frame using color information and then detect fire with a trained CNN. We also show that performance improves significantly over the detection and miss rates reported in previous studies.
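The sketch below illustrates only the candidate-extraction step under an assumed color rule (the paper does not publish its exact thresholds here): fire-colored pixels are masked and their bounding box would be cropped and passed to the CNN.

```python
import numpy as np

def fire_candidate_box(frame):
    """Mask pixels with a fire-like color (strong red, red > green > blue)."""
    r = frame[:, :, 0].astype(int)
    g = frame[:, :, 1].astype(int)
    b = frame[:, :, 2].astype(int)
    mask = (r > 180) & (r > g) & (g > b)     # illustrative thresholds
    if not mask.any():
        return None                          # no candidate in this frame
    ys, xs = np.nonzero(mask)
    return xs.min(), ys.min(), xs.max(), ys.max()   # crop fed to the CNN

frame = np.zeros((240, 320, 3), dtype=np.uint8)
frame[100:140, 150:200] = (220, 120, 30)     # synthetic flame patch
print(fire_candidate_box(frame))             # (150, 100, 199, 139)
```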

Automatic Wood Species Identification of Korean Softwood Based on Convolutional Neural Networks

  • Kwon, Ohkyung;Lee, Hyung Gu;Lee, Mi-Rim;Jang, Sujin;Yang, Sang-Yun;Park, Se-Yeong;Choi, In-Gyu;Yeo, Hwanmyeong
    • Journal of the Korean Wood Science and Technology, v.45 no.6, pp.797-808, 2017
  • Automatic wood species identification systems enable fast and accurate identification of wood species outside specialized laboratories staffed by well-trained experts. Conventional automatic systems consist of two major parts: a feature extractor and a classifier. Feature extractors require hand-engineering to obtain optimal features for quantifying the content of an image. A Convolutional Neural Network (CNN), one of the deep learning methods, trained for wood species can extract intrinsic feature representations and classify them correctly, usually outperforming classifiers built on top of hand-tuned extracted features. We developed an automatic wood species identification system utilizing CNN models such as LeNet, MiniVGGNet, and their variants. A smartphone camera was used to obtain macroscopic images of rough-sawn surfaces from cross sections of wood. Five Korean softwood species (cedar, cypress, Korean pine, Korean red pine, and larch) were classified by the CNN models. The best-performing and most stable model was LeNet3, which adds two layers to the original LeNet architecture; its species identification accuracy for the five Korean softwood species was 99.3%. The results show that the automatic wood species identification system is fast and accurate, and small enough to be deployed to a mobile device such as a smartphone.
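Since the abstract does not spell out LeNet3's exact layer sizes, the Keras sketch below is an assumption-laden illustration: classic LeNet with one extra convolution/pooling stage appended, compiled for the five-species classification. The input shape is also an assumption.

```python
from tensorflow.keras import layers, models

def lenet3(input_shape=(64, 64, 3), n_species=5):
    """LeNet-style classifier with one extra conv/pool stage appended."""
    m = models.Sequential([
        layers.Conv2D(6, 5, activation="relu", input_shape=input_shape),
        layers.MaxPooling2D(),
        layers.Conv2D(16, 5, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),   # the "extra" stage
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(120, activation="relu"),
        layers.Dense(84, activation="relu"),
        layers.Dense(n_species, activation="softmax"),
    ])
    m.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
    return m

lenet3().summary()
```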

Road Crack Detection based on Object Detection Algorithm using Unmanned Aerial Vehicle Image (드론영상을 이용한 물체탐지알고리즘 기반 도로균열탐지)

  • Kim, Jeong Min;Hyeon, Se Gwon;Chae, Jung Hwan;Do, Myung Sik
    • The Journal of The Korea Institute of Intelligent Transport Systems, v.18 no.6, pp.155-163, 2019
  • This paper proposes a new methodology for recognizing cracks on asphalt road surfaces using image data obtained with drones. The target section was Yuseong-daero, the main highway of Daejeon. Two object detection algorithms, Tiny-YOLO-V2 and Faster-RCNN, were used to recognize cracks on road surfaces and classify the crack types, and the experimental results were compared. The mean average precision of Faster-RCNN and Tiny-YOLO-V2 was 71% and 33%, respectively: Faster-RCNN, a two-stage detector, identified and separated road surface cracks better than the one-stage YOLO detector. In the future, these results can inform a plan for building an infrastructure asset-management system using drones and AI crack detection, establishing an efficient and economical road-maintenance decision-support system and its operating environment.
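The mAP figures above rest on intersection-over-union (IoU) matching between predicted and ground-truth boxes; the sketch below shows that criterion with made-up box coordinates.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

pred, truth = (10, 10, 60, 40), (15, 12, 65, 45)
print(f"IoU = {iou(pred, truth):.2f}")  # a detection counts as a true
                                        # positive above a threshold, e.g. 0.5
```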

An Artificial Intelligence Method for the Prediction of Near- and Off-Shore Fish Catch Using Satellite and Numerical Model Data

  • Yoon, You-Jeong;Cho, Subin;Kim, Seoyeon;Kim, Nari;Lee, Soo-Jin;Ahn, Jihye;Lee, Eunjeong;Joh, Seongeok;Lee, Yang-Won
    • Korean Journal of Remote Sensing, v.36 no.1, pp.41-53, 2020
  • The production of near- and off-shore fisheries in South Korea is decreasing due to rapid changes in the fishing environment, particularly higher sea temperatures in recent years. To improve the competitiveness of the fisheries, it is necessary to provide fish catch information that changes spatiotemporally according to the sea state. In this study, artificial intelligence models were developed to predict, on a 15-km grid and daily basis, the CPUE (catch per unit effort) of mackerel, anchovies, and squid (Todarodes pacificus), the three major fish species in the near- and off-shore areas of South Korea. The models were trained and validated using sea surface temperature, rainfall, relative humidity, pressure, sea surface wind velocity, significant wave height, and salinity as input data, and the fish catch statistics of Suhyup (National Federation of Fisheries Cooperatives) as observed data. A 10-fold blind test showed that the developed models achieved a correlation coefficient of 0.86. The fish catch models are expected to operate with high accuracy under various sea conditions if high-quality, large-volume data are available.
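As an illustrative stand-in for the study's setup (the paper's own model is a neural network, not the random forest used here), the sketch below wires the seven listed input variables into a regressor and scores it with 10-fold cross-validation; the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

FEATURES = ["sst", "rainfall", "humidity", "pressure",
            "wind_speed", "wave_height", "salinity"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(FEATURES)))    # stand-in for daily grid cells
y = X @ rng.normal(size=len(FEATURES)) + rng.normal(scale=0.5, size=500)

model = RandomForestRegressor(n_estimators=200, random_state=0)
scores = cross_val_score(model, X, y, cv=10, scoring="r2")  # 10-fold test
print(f"mean r2 over 10 folds: {scores.mean():.2f}")
```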