• Title/Summary/Keyword: Smart Structure


Knowledge Extraction Methodology and Framework from Wikipedia Articles for Construction of Knowledge-Base (지식베이스 구축을 위한 한국어 위키피디아의 학습 기반 지식추출 방법론 및 플랫폼 연구)

  • Kim, JaeHun;Lee, Myungjin
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.43-61 / 2019
  • Artificial intelligence technologies have been developing rapidly with the Fourth Industrial Revolution, and AI research is actively conducted in a variety of fields such as autonomous vehicles, natural language processing, and robotics. Since the 1950s, this research has focused on cognitive problems related to human intelligence, such as learning and problem solving, and thanks to recent interest in the technology and research on various algorithms, the field has advanced more than ever. The knowledge-based system is a sub-domain of artificial intelligence that aims to let AI agents make decisions using machine-readable, processable knowledge constructed from the complex, informal human knowledge and rules of various fields. A knowledge base is used to optimize information collection, organization, and retrieval, and is increasingly used together with statistical artificial intelligence such as machine learning. More recently, knowledge bases aim to express, publish, and share knowledge on the web by describing and connecting web resources such as pages and data, and they support intelligent processing in many AI applications, such as the question answering systems of smart speakers. However, building a useful knowledge base is a time-consuming task that still requires a great deal of expert effort. Much recent knowledge-based AI research relies on DBpedia, one of the largest knowledge bases, which aims to extract structured content from the various information in Wikipedia. DBpedia contains information extracted from Wikipedia such as titles, categories, and links, but its most useful knowledge comes from Wikipedia infoboxes, user-created summaries of some unifying aspect of an article. That knowledge is generated by the mapping rules between infobox structures and the DBpedia ontology schema defined in the DBpedia Extraction Framework. Because the knowledge is generated from semi-structured infobox data created by users, DBpedia can expect high reliability in terms of accuracy. However, since only about 50% of all wiki pages in Korean Wikipedia contain an infobox, DBpedia is limited in terms of knowledge scalability. This paper proposes a method to extract knowledge from text documents according to the ontology schema using machine learning. To demonstrate the appropriateness of this method, we describe a knowledge extraction model that follows the DBpedia ontology schema by learning from Wikipedia infoboxes. Our model consists of three steps: classifying the document into ontology classes, classifying the sentences suitable for triple extraction, and selecting values and transforming them into RDF triples. Wikipedia infoboxes are defined by infobox templates that provide standardized information across related articles, and the DBpedia ontology schema can be mapped to these templates. Based on these mapping relations, we classify the input document into infobox categories, which correspond to ontology classes. After determining the document's class, we classify the appropriate sentences according to the attributes belonging to that class. Finally, we extract knowledge from the sentences classified as appropriate and convert it into triples.
To train the models, we generated a training data set from a Wikipedia dump by adding BIO tags to sentences, and trained about 200 classes and about 2,500 relations for knowledge extraction. Furthermore, we ran comparative experiments with CRF and Bi-LSTM-CRF for the knowledge extraction step. Through the proposed process, structured knowledge can be obtained by extracting it from text documents according to the ontology schema, and the methodology can significantly reduce the effort experts spend constructing instances that conform to the schema.
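The abstract names two data-preparation ideas: BIO-tagging sentences with infobox values and converting tags back into RDF triples. The sketch below illustrates both in minimal form; the property name "birthPlace", the example sentence, and the helper functions are illustrative assumptions, not the authors' implementation (which trains CRF and Bi-LSTM-CRF taggers on data generated this way).

```python
# Hedged sketch: generate BIO-tagged training data from an infobox value,
# then convert a tagged sentence back into an RDF-style triple.

def bio_tag(tokens, value_tokens, relation):
    """Label tokens with B-/I-<relation> where the infobox value occurs."""
    tags = ["O"] * len(tokens)
    n = len(value_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == value_tokens:
            tags[i] = f"B-{relation}"
            for j in range(i + 1, i + n):
                tags[j] = f"I-{relation}"
            break
    return tags

def tags_to_triples(subject, tokens, tags):
    """Collect the tokens inside each B-/I- span into (s, p, o) triples."""
    spans = {}
    for token, tag in zip(tokens, tags):
        if tag != "O":
            spans.setdefault(tag[2:], []).append(token)
    return [(subject, rel, " ".join(toks)) for rel, toks in spans.items()]

tokens = "Yi Sun-sin was born in Hanseong in 1545".split()
tags = bio_tag(tokens, ["Hanseong"], "birthPlace")  # hypothetical property
print(tags_to_triples("Yi_Sun-sin", tokens, tags))
# [('Yi_Sun-sin', 'birthPlace', 'Hanseong')]
```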

Development of a complex failure prediction system using Hierarchical Attention Network (Hierarchical Attention Network를 이용한 복합 장애 발생 예측 시스템 개발)

  • Park, Youngchan;An, Sangjun;Kim, Mintae;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.26 no.4 / pp.127-148 / 2020
  • A data center is a physical facility for housing computer systems and related components, and is an essential foundation for next-generation core industries such as big data, smart factories, wearables, and smart homes. In particular, with the growth of cloud computing, proportional expansion of data center infrastructure is inevitable. Monitoring the health of data center facilities is a way to maintain and manage the system and prevent failures. If a failure occurs in some element of the facility, it may affect not only that equipment but also other connected equipment, causing enormous damage. IT facilities in particular fail irregularly because of their interdependence, which makes the cause hard to determine. Previous studies on predicting data center failures treated each server as a single, isolated state, without assuming that devices interact. In this study, therefore, data center failures were classified into failures occurring inside the server (Outage A) and failures occurring outside the server (Outage B), and the analysis focused on complex failures occurring within servers. Failures external to the server include power, cooling, and user errors; since such failures can be prevented in the early stages of data center construction, various solutions are already being developed. The causes of failures occurring inside the server, on the other hand, are difficult to determine, and adequate prevention has not yet been achieved. In particular, server failures do not occur singly: a failure may trigger failures in other servers or be triggered by failures received from other servers. In other words, while existing studies analyzed failures on the assumption that servers do not affect one another, this study assumes that failures propagate between servers. To define the complex failure situation in the data center, failure history data for each piece of equipment in the data center was used. Four major failure types are considered in this study: Network Node Down, Server Down, Windows Activation Services Down, and Database Management System Service Down. The failures occurring on each device are sorted in chronological order, and when a failure on one device is followed by a failure on another device within 5 minutes, the two failures are defined as simultaneous. After constructing sequences from the devices that failed simultaneously, five devices that frequently failed together within the constructed sequences were selected, and the cases in which the selected devices failed at the same time were confirmed through visualization. Since the server resource information collected for failure analysis is a time series with temporal flow, we used Long Short-Term Memory (LSTM), a deep learning algorithm that can predict the next state from previous states. In addition, unlike the single-server setting, a Hierarchical Attention Network structure was used to reflect the fact that each server contributes differently to a complex failure; this method improves prediction accuracy by weighting a server more heavily as its impact on the failure increases. The study began by defining the failure types and selecting the analysis targets.
In the first experiment, the same collected data was treated once as a single-server state and once as a multi-server state, and the results were compared. The second experiment improved prediction accuracy for the complex-failure case by optimizing the threshold for each server. In the first experiment, under the single-server assumption, the model predicted no failure for three of the five servers even though failures actually occurred; under the multi-server assumption, failures were predicted for all five servers. This result supports the hypothesis that servers affect one another, and confirms that prediction performance is superior when multiple servers are assumed rather than a single server. In particular, applying the Hierarchical Attention Network, which assumes that each server's influence differs, improved the analysis, and applying a different threshold for each server further improved prediction accuracy. This study shows that failures whose causes are hard to determine can be predicted from historical data, and presents a model that can predict failures occurring on servers in data centers. The results are expected to help prevent failures in advance.
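As a rough illustration of the multi-server idea described above, here is a minimal PyTorch sketch in which one shared LSTM encodes each server's resource time series and an attention layer weights the per-server summaries by estimated impact. This captures only the server-level attention idea, not the paper's full Hierarchical Attention Network; all shapes, feature counts, and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class ServerAttentionPredictor(nn.Module):
    def __init__(self, n_features=8, hidden=32):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)   # scores each server summary
        self.head = nn.Linear(hidden, 1)   # failure logit

    def forward(self, x):
        # x: (batch, n_servers, seq_len, n_features)
        b, s, t, f = x.shape
        _, (h, _) = self.encoder(x.reshape(b * s, t, f))
        h = h[-1].reshape(b, s, -1)                # per-server summary
        w = torch.softmax(self.attn(h), dim=1)     # attention over servers
        context = (w * h).sum(dim=1)               # impact-weighted mixture
        return self.head(context).squeeze(-1), w.squeeze(-1)

model = ServerAttentionPredictor()
x = torch.randn(4, 5, 60, 8)       # 4 samples, 5 servers, 60 time steps
logit, weights = model(x)
print(logit.shape, weights.shape)  # torch.Size([4]) torch.Size([4, 5])
```

The attention weights also serve as a readout of which server the model considers most influential for a predicted failure, which matches the weighting rationale in the abstract.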

Packaging Technology for the Optical Fiber Bragg Grating Multiplexed Sensors (광섬유 브래그 격자 다중화 센서 패키징 기술에 관한 연구)

  • Lee, Sang Mae
    • Journal of the Microelectronics and Packaging Society / v.24 no.4 / pp.23-29 / 2017
  • Packaged optical fiber Bragg grating (FBG) sensors, networked by multiplexing the grating sensors with WDM technology, were investigated for structural health monitoring of the marine trestle structure that transports ships. Each FBG sensor was packaged in a cylindrical housing made of aluminum tubes; the packaged sensor was then inserted into a polymeric tube and the tube was filled with epoxy so that the sensor resists and endures sea water. The packaged sensor component was tested under 0.2 MPa of hydraulic pressure and found to be robust. The number and locations of the Bragg gratings attached to the trestle were determined from the regions of high displacement obtained by finite element simulation. The strain of the part of the trestle subjected to the maximum load was analyzed to be ~1,000 με, and thus the shift in the Bragg wavelength of the sensor caused by the maximum load was found to be ~1,200 pm. According to the finite element analysis, Bragg wavelength spacings of 3~5 nm keep the grating wavelengths of neighboring sensors from overlapping when the trestle is under load, so 50 grating sensors, in modules of 5 sensors each, could be networked within the 150 nm optical window of the Bragg wavelength interrogator at 1550 nm. Shifts in the Bragg wavelengths of the 5 packaged sensors attached to a mock trestle unit were well interrogated by the grating interrogator, which used an optical fiber loop mirror, and the maximum strain was measured to be about 235.650 με. The modeling results for the sensor packaging and networking were in good agreement with the experimental results.
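The numbers above are consistent with the standard FBG strain response. The short Python check below assumes the textbook form Δλ = λ₀(1 − p_e)ε with an effective photo-elastic coefficient p_e ≈ 0.22, a typical assumed value rather than one reported in the paper.

```python
# Back-of-the-envelope check of the abstract's ~1,200 pm shift at ~1,000 microstrain.
LAMBDA0_NM = 1550.0   # Bragg wavelength (nm)
P_E = 0.22            # effective photo-elastic coefficient (assumed typical value)

def bragg_shift_pm(strain_microstrain):
    """Bragg wavelength shift in pm for a given strain in microstrain."""
    return LAMBDA0_NM * (1 - P_E) * strain_microstrain * 1e-6 * 1e3

print(bragg_shift_pm(1000))  # ~1209 pm, matching the ~1,200 pm quoted above

# With a 3 nm minimum channel spacing, a 150 nm interrogator window holds
# 150 // 3 = 50 sensors, i.e. ten modules of 5 gratings each.
print(150 // 3)
```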

A Study on Music Summarization (음악요약 생성에 관한 연구)

  • Kim Sung-Tak;Kim Sang-Ho;Kim Hoi-Rin;Choi Ji-Hoon;Lee Han-Kyu;Hong Jin-Woo
    • Journal of Broadcast Engineering / v.11 no.1 s.30 / pp.3-14 / 2006
  • Music summarization is a technique that automatically extracts the most important and representative part or parts of music content. Music summarization techniques fall into two categories according to the character of the summary: the first provides a repeated part as the summary, and the second provides a summary combining segments with different characteristics from the music content. In this paper, we propose and evaluate two music summarization techniques, one of each kind. The first algorithm, which uses multi-level vector quantization to provide a repeated part as a fixed-length summary, is evaluated by the overlap ratio between hand-labeled repeated parts and the automatically generated summary. The overlap ratios of conventional methods are 42.2% and 47.4%, while that of the proposed method with a fixed-length summary is 67.1%. An optimal-length summary, whose length varies with the music content, is evaluated by the portion of overlap between the summary and the repeated part; the results show that with an optimal length the automatically generated summary captures the representative part more effectively than a fixed-length summary. The second, cluster-based algorithm uses a 2-D similarity matrix and the k-means algorithm to provide combined segments as the summary. To evaluate this algorithm, we used a MOS test with two questions (How many similar segments are in the summarized music? How many segments belong to the same structure?), and the results show good performance.
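A minimal sketch of the cluster-based variant follows: it builds a 2-D (cosine) similarity matrix over feature frames, clusters the frames with k-means, and takes one segment per cluster as the combined summary. The feature input, cluster count, and segment-picking rule are illustrative assumptions; the paper's actual feature extraction and segment selection are more elaborate.

```python
import numpy as np
from sklearn.cluster import KMeans

def summarize(features, n_clusters=4, seg_len=50):
    # features: (n_frames, n_dims), e.g. per-frame spectral features
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    sim = (features @ features.T) / (norms @ norms.T)  # 2-D similarity matrix
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    # simplistic rule: take the first seg_len-frame span of each cluster
    summary = []
    for c in range(n_clusters):
        idx = np.flatnonzero(labels == c)
        if idx.size:
            summary.append((int(idx[0]), int(idx[0]) + seg_len))
    return sim, sorted(summary)

features = np.random.rand(1000, 13)  # stand-in for real feature frames
sim, segments = summarize(features)
print(segments)                      # [(start, end), ...] frame ranges
```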

The Method for Real-time Complex Event Detection of Unstructured Big data (비정형 빅데이터의 실시간 복합 이벤트 탐지를 위한 기법)

  • Lee, Jun Heui;Baek, Sung Ha;Lee, Soon Jo;Bae, Hae Young
    • Spatial Information Research / v.20 no.5 / pp.99-109 / 2012
  • Recently, with the growth of social media and the spread of smartphones, the amount of data has increased considerably through heavy use of SNS (Social Network Services). Accordingly, the concept of Big Data has emerged, and many researchers are seeking ways to make the best use of it. To maximize the creative value of the big data held by many companies, it must be combined with existing data. Because the physical and logical storage structures of the data sources differ widely, a system that can integrate and manage them is needed. MapReduce was developed to process big data, and has the advantage of processing data fast through distributed processing. However, it is difficult to build and store indexes for all keywords, and because data must be stored before it can be searched, real-time processing is difficult to some extent; processing complex events without a structure for handling heterogeneous data also incurs extra cost. To solve this problem, an existing Complex Event Processing (CEP) system can be used: a CEP system takes data from different sources and combines them, enabling complex event processing that is especially useful for real-time work on stream data. Nevertheless, unstructured data based on the text of SNS posts and internet articles is managed as a text type, so strings must be compared every time query processing is performed, which results in poor performance. We therefore make it possible for a CEP system to manage unstructured data and process queries fast. We extend the data-combination function to give string data a logical schema, by converting string keywords into integers through filtering against a keyword set. In addition, by processing stream data in memory in real time within the CEP system, we reduce the time spent reading data for query processing after it has been stored on disk.
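The following is a minimal sketch of the keyword-to-integer idea described above: each registered keyword is mapped to an integer ID once, so filtering the stream compares integers instead of strings on every event. The keyword set, event format, and function names are illustrative assumptions, not the paper's implementation.

```python
# Registered keyword set, mapped once to integer IDs.
KEYWORDS = {"flood": 1, "earthquake": 2, "outage": 3}

def encode(event_text):
    """Replace known keywords with integer IDs; drop unregistered tokens."""
    return [KEYWORDS[t] for t in event_text.lower().split() if t in KEYWORDS]

def matches(event_ids, query_id):
    return query_id in event_ids  # cheap integer comparison per event

stream = ["Power outage reported downtown", "Flood warning issued"]
for text in stream:
    ids = encode(text)            # done once as the event enters the system
    if matches(ids, KEYWORDS["outage"]):
        print("complex event candidate:", text)
```

Because encoding happens once at ingestion, every downstream CEP query over the in-memory stream works on small integer lists rather than repeated substring comparisons.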

Development Process for User Needs-based Chatbot: Focusing on Design Thinking Methodology (사용자 니즈 기반의 챗봇 개발 프로세스: 디자인 사고방법론을 중심으로)

  • Kim, Museong;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.221-238 / 2019
  • Recently, companies and public institutions have been actively introducing chatbot services for customer counseling and response. Introducing a chatbot service not only saves labor costs for companies and organizations but also enables rapid communication with customers. Advances in data analytics and artificial intelligence are driving the growth of these services: through machine learning and deep learning, current chatbots can understand users' questions and offer the most appropriate answers. The advancement of core chatbot technologies such as NLP, NLU, and NLG has made it possible to understand words, paragraphs, meanings, and emotions. For this reason, the value of chatbots continues to rise. However, technology-driven chatbots can be inconsistent with what users inherently want, so chatbots need to be addressed in the area of user experience, not just technology. The Fourth Industrial Revolution highlights the importance of user experience alongside the advancement of artificial intelligence, big data, cloud, and IoT technologies. The development of IT technology and the growing importance of user experience have given people a variety of environments and changed lifestyles, which means that the experience of interacting with people, services (products), and the environment becomes very important. Therefore, it is time to develop user needs-based services (products) that can provide new experiences and value. This study proposes a chatbot development process based on user needs by applying design thinking, a representative methodology in the field of user experience, to chatbot development. The proposed process consists of four steps. The first step, 'Setting up the knowledge domain', establishes the chatbot's area of expertise. The second step, 'Knowledge accumulation and insight identification', accumulates the information corresponding to the configured domain and derives insights. The third step is 'Opportunity development and prototyping', where full-scale development begins. Finally, the 'User feedback' step gathers users' feedback on the developed prototype. The outcome is a user needs-based service (product) that meets the objectives of the process: beginning with fact gathering through user observation, the process abstracts the facts to derive insights and explore opportunities, then makes them concrete by structuring the desired information and providing functions that fit the user's mental model, so that the resulting chatbot meets the user's needs. To confirm the effectiveness of the proposed process, we present an actual construction example for the domestic cosmetics market. The cosmetics market was chosen as the case because users' experiences figure strongly in it, so responses from users can be understood quickly. This study has a theoretical implication in that it proposes a new chatbot development process incorporating the design thinking methodology, and it differs from existing chatbot development research in that it focuses on user experience rather than technology. It also has practical implications in that it offers companies and institutions a realistic method that can be applied immediately.
In particular, the process proposed in this study can be accessed and utilized by anyone, since user needs-based chatbots can be developed even by non-experts. Because only one field was examined in this study, further studies are needed: in addition to the cosmetics market, research should be conducted in other fields where user experience is prominent, such as the smartphone and automotive markets. Through this, the proposal can mature into a general process for developing chatbots centered on user experience rather than on technology.

A Study on the Policy Direction of Space Composition of the Future School in Old High School - Focused on The Judgment of Space Relocation for the Application of the High School Credit System - (노후고등학교의 미래학교 공간구성 정책방향에 관한 연구 - 고교학점제 적용을 위한 공간 재배치 판단을 중심으로 -)

  • Lee, Jae-Lim
    • The Journal of Sustainable Design and Educational Environment Research / v.21 no.3 / pp.1-13 / 2022
  • This case study identifies the spatial composition and structural problems of existing old high schools, with a view to spatial innovation toward future schools that can operate the high school credit system, and to establishing mid-to-long-term arrangement plans for credit-system schools capable of diverse teaching and learning. The study results are as follows. First, most problems of the old high schools stem from very poor connectivity between buildings, as most schools are composed of single, standard-design unit buildings distributed across multiple separate structures. In addition, the classroom-and-corridor floor plan of each building prevents student exchange and rest during break periods. Second, for future school space innovation, the high school arrangement plan should adopt a clustered layout that, on the premise of a subject-classroom system, shortens movement routes and expands subject areas to accommodate student movement between classes; it is also desirable to provide a plaza-type space for rest and exchange in the central area, where communication and interaction can occur as students move between classes. Third, as evaluation criteria for relocating old high schools, a space program should be prepared based on the projected number of classes, together with a legal analysis of school land use and an analysis of land-use efficiency reflecting regional characteristics. Based on such analyses, mid-to-long-term land use plans and space arrangement plans for the entire school site, including school facility complexes, can be established.