Building a Korean Sentiment Lexicon Using Collective Intelligence (집단지성을 이용한 한글 감성어 사전 구축)
Journal of Intelligence and Information Systems, v.21 no.2, pp.49-67, 2015
The recent emergence of big data and social media has brought about a big bang of data. Social networking services are used by people around the world and have become a major communication tool for all ages. Over the last decade, as online social networking sites have grown in popularity, companies have focused on advanced social media analysis for their marketing strategies. Beyond social media analysis, companies are chiefly concerned about the propagation of negative opinions on social networking sites such as Facebook and Twitter, as well as on e-commerce sites. Online word of mouth (WOM), such as product ratings, product reviews, and product recommendations, is very influential, and negative opinions have a significant impact on product sales. This trend has drawn researchers' attention to natural language processing tasks such as sentiment analysis. Sentiment analysis, also referred to as opinion mining, is the process of identifying the polarity of subjective information, and it has been applied in various research and practical fields. However, applying natural language processing to Korean (Hangul) is difficult because Korean is an agglutinative language with rich morphology. As a result, Korean natural language processing resources such as sentiment lexicons are scarce, which has significantly limited researchers and practitioners who are considering sentiment analysis. Our study builds a Korean sentiment lexicon using collective intelligence and provides an API (Application Programming Interface) service that opens and shares the sentiment lexicon data with the public (www.openhangul.com). For pre-processing, we created a Korean lexicon database of over 517,178 words and classified them into sentiment and non-sentiment words.
To classify them, we first identified stop words, which are likely to play a negative role in sentiment analysis, and excluded them from sentiment scoring. In general, sentiment words are nouns, adjectives, verbs, and adverbs, as they can carry positive, neutral, or negative sentiment. On the other hand, non-sentiment words are interjections, determiners, numerals, postpositions, and so on, as they generally carry no sentiment. To build a reliable sentiment lexicon, we adopted collective intelligence as a model for crowdsourcing. In addition, the concept of folksonomy was applied in the classification process to support collective intelligence. To compensate for an inherent weakness of folksonomy, we adopted majority rule by building a voting system. Participants, as voters, were offered three options (positive, negative, and neutral), and the voting was conducted on one of the largest social networking sites for college students in Korea. More than 35,000 votes have been cast by college students in Korea, and we keep the voting system open by maintaining the project as a perpetual study. Moreover, any change in the sentiment score of a word can be an important observation, because it lets us track temporal changes in Korean as a natural language. Our study also offers a RESTful, JSON-based API service through a web platform to make the lexicon easier to use for researchers, companies, and developers. Finally, our study makes important contributions to both research and practice. In terms of research, our Korean sentiment lexicon serves as a resource for Korean natural language processing. In terms of practice, practitioners such as managers and marketers can implement sentiment analysis effectively by using the Korean sentiment lexicon we built.
Moreover, our study sheds new light on the value of folksonomy by combining it with collective intelligence, and we expect it to give a new direction and a fresh start to the development of Korean natural language processing.
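The voting step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how ties or sparsely voted words are handled, so the `min_votes` threshold, the tie rule, the function names, and the sample ballots below are all assumptions made for illustration; the sketch also uses the label with the most votes (a plurality), which may differ from the project's actual majority rule.

```python
from collections import Counter

# The three voting options named in the abstract.
LABELS = ("positive", "neutral", "negative")


def majority_label(votes, min_votes=3):
    """Return the label with the most votes, or None if undecided.

    Assumptions (not from the abstract): words with fewer than
    `min_votes` valid votes, or with a tie for first place, stay unlabeled.
    """
    counts = Counter(v for v in votes if v in LABELS)
    if sum(counts.values()) < min_votes:
        return None
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie for first place
    return ranked[0][0]


def build_lexicon(ballots, min_votes=3):
    """Map each word to its winning label, skipping undecided words."""
    lexicon = {}
    for word, votes in ballots.items():
        label = majority_label(votes, min_votes)
        if label is not None:
            lexicon[word] = label
    return lexicon


# Hypothetical ballots, for illustration only.
ballots = {
    "좋다": ["positive", "positive", "neutral", "positive"],
    "나쁘다": ["negative", "negative", "negative"],
    "그냥": ["neutral", "positive"],  # too few votes to decide
}
lexicon = build_lexicon(ballots)
```

Because the voting system stays open, re-running `build_lexicon` on updated ballots is one simple way to observe how a word's label changes over time.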
Jeollabuk-do is bounded by the sea, and the Mahan and Baekje cultures were established around its wide plain. To the southeast, it was close to the Gaya kingdom, where iron culture prospered at the time, and a variety of round-pommel sword handles have been excavated to date. Round-pommel sword handles are the most numerous excavated objects among the round-pommel swords and manufactured objects of the period, and they are presumed to have become the foundation for making decorated round-pommel swords. However, round-pommel sword handles without a pattern on the handle have received little scholarly attention because their production method is simple, even though their value as archaeological data is sufficient. Therefore, this study used X-rays to examine how production techniques changed over time for round-pommel sword handles of the Mahan, Baekje, and Gaya periods (before the 6th century) excavated in Jeollabuk-do, in order to clarify the variety of production techniques according to production period and excavation site. The production techniques identified by X-ray for round-pommel sword handles excavated from Mahan, Baekje, and Gaya sites show that production progressed over time in the order of the all-in-one form, the form in which the round pommel is hammer-welded to the handle, and the two-piece form. In the two-piece form in particular, the round pommel is produced separately and welded to the sword handle, which is then joined to the blade by methods such as riveting or a bottleneck joint. Although hammer welding techniques had improved, they appear not to have been used for this joint because the inlay or gilding would have been damaged; riveting or a bottleneck joint is judged to have been used instead.
Metalworking techniques such as handle decoration, sword-tip decoration, inlay, and silver plating also appear in the period when the two-piece form emerged. By clarifying these points, this study provides a fundamental database for scientific research on the production techniques of round-pommel sword handles.
The large amount of data that emerges from the initial connection environment of the Fourth Industrial Revolution is a major factor distinguishing it from the existing production environment. This environment is two-sided in that it produces data while using data, and the data so produced creates further value. Because of the massive scale of the data, future information systems will need to process more data, in terms of quantity, than existing information systems. In terms of quality, they will also need the ability to extract exactly the required information from a large amount of data. In a small-scale information system, a person can understand the system accurately and obtain the necessary information, but in varied and complex systems that are hard to understand accurately, acquiring the desired information becomes increasingly difficult. In other words, more accurate processing of large amounts of data has become a basic requirement for future information systems. This problem, which concerns the efficient performance of information systems, can be addressed by building a semantic web, which enables varied information processing by expressing the collected data as an ontology that both people and computers can understand. For example, as in most other organizations, IT has been introduced in the military, and most work is now done through information systems. As existing systems hold increasingly large amounts of data, efforts are needed to make them easier to use through better use of that data. An ontology-based system forms a large semantic network of data through connections with other systems, can draw on a wide range of databases, and has the advantage of searching more precisely and quickly through relationships between predefined concepts.
In this paper, we propose a defense ontology as a method for effective data management and decision support. To judge its applicability and effectiveness in an actual system, we reconstructed the existing air force munitions situation management system as an ontology-based system. The existing system was built to strengthen commanders' and practitioners' management and control of the logistics situation by providing real-time information on maintenance and distribution, since the complicated logistics information system, with its large amount of data, had become difficult to use. However, because it takes pre-specified information from the existing logistics system and displays it as web pages, it is difficult to check anything other than the few items specified in advance, extending it with additional functions is time-consuming, and it is organized by category with no search function. It therefore shares the disadvantage of the existing system: it can be used easily only by those who already know the system well. The ontology-based logistics situation management system is designed to provide intuitive visualization of the complex information in the existing logistics information system through the ontology. In constructing the logistics situation management system through the ontology, useful functions such as performance-based logistics support contract management and a component dictionary were additionally identified and included in the ontology. To confirm whether the constructed ontology can be used for decision support, meaningful analysis functions need to be implemented, such as calculating aircraft utilization rates and querying performance-based military contracts.
In particular, in contrast to the ontology databases built in past ontology studies, this study constructed time series data, whose values change over time (such as aircraft status by date), as an ontology, and confirmed through the constructed ontology that utilization rates can be calculated on various criteria in addition to the utilization rate that could already be calculated. In addition, data on performance-based logistics contracts, introduced as a new maintenance method for aircraft and other munitions, can be queried in various ways, and the performance indexes used in performance-based logistics contracts are easy to calculate through reasoning and functions. We also propose a new performance index that complements the limitations of the currently applied performance indicators and calculate it through the ontology, confirming the usability of the constructed ontology. Finally, the failure rate or reliability of each component can be calculated, including MTBF data for the selected fault-tolerant items based on actual part consumption records, and mission reliability and system reliability are calculated. To confirm the usability of the constructed ontology-based logistics situation management system, we evaluated it with the Technology Acceptance Model (TAM), a representative model for measuring technology acceptance, and found the proposed system to be more useful and convenient than the existing system.
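The two calculations mentioned in this abstract, a utilization rate from date-stamped aircraft status and a reliability figure from MTBF, can be sketched in plain Python. This is not the paper's ontology-based implementation: the record layout, status names, and function names are hypothetical, and the reliability formula R(t) = exp(-t / MTBF) assumes a constant failure rate (exponential model), which the abstract does not state.

```python
import math
from datetime import date

# Hypothetical daily status records: (aircraft id, date, status).
records = [
    ("A-01", date(2024, 1, 1), "available"),
    ("A-01", date(2024, 1, 2), "maintenance"),
    ("A-01", date(2024, 1, 3), "available"),
    ("A-01", date(2024, 1, 4), "available"),
    ("A-02", date(2024, 1, 1), "maintenance"),
    ("A-02", date(2024, 1, 2), "available"),
]


def utilization_rate(records, aircraft_id=None, available_states=("available",)):
    """Share of recorded aircraft-days spent in an available state.

    Passing `aircraft_id` restricts the calculation to one aircraft;
    `available_states` is the (assumed) criterion for counting a day as usable.
    """
    selected = [r for r in records if aircraft_id is None or r[0] == aircraft_id]
    if not selected:
        raise ValueError("no records match the selection")
    up_days = sum(1 for _, _, status in selected if status in available_states)
    return up_days / len(selected)


def reliability(mtbf_hours, mission_hours):
    """R(t) = exp(-t / MTBF), assuming a constant failure rate."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return math.exp(-mission_hours / mtbf_hours)


def series_reliability(component_reliabilities):
    """Reliability of a system that fails if any one component fails."""
    return math.prod(component_reliabilities)
```

Changing `available_states` or the selection filter is one way to express "utilization on various criteria"; in the paper this role is played by queries and reasoning over the ontology.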
The prevalence of chronic diseases such as obesity, diabetes, and hypertension has increased in Korea, driven by rising national income and changes in dietary habits accompanying globalization. In particular, excess sodium intake may contribute to the development of hypertension, increasing cardiovascular disease risk. The objective of this study was to investigate sodium intake from nursery school meals in Gyeonggi-do and to construct a database for sodium reduction policy. The survey consisted of 601 sodium intake samples collected in summer and winter. A weighed food record method was used to measure food intake, and the average intake of ten children per nursery school was measured. The sodium content of meals was analyzed with an ICP-OES (inductively coupled plasma-optical emission spectrometer) after microwave acid digestion. By food group, sauces (693 mg/100 g), grilled foods (689 mg/100 g), and kimchi (643 mg/100 g) had the highest sodium contents, and the average sodium intake per meal was
The objectives of this study were to examine the production processes and methods of the "Forest Type Map Actualization Production (Database (DB) Construction Work Manual)" (Work Manual), identify issues associated with them, and suggest solutions by applying evaluation items to a 1:5,000 digital forest type map. The evaluation items applied to the forest type map were divided into zoning and attributes, and the issues with the Work Manual's production processes and methods were derived by analyzing stand structure characteristics and fragmentation by administrative district. Korea is divided into five divisions, one of which is set as an area changed naturally and the other four as areas changed artificially. The naturally changed area is updated every five years, and the artificially changed areas are updated annually. Fragmentation in South Korea was analyzed to examine the consistency of the DB established for each region. The results showed that, nationwide, the number of patches increased and the mean patch size decreased, so the degree of fragmentation and the complexity of shapes increased. However, fragmentation and shape complexity decreased in four of the 17 regions (metropolitan cities and provinces), indicating spatial variation. The "Forest Classification" defines the minimum zoning area as 0.1 ha. This study examined this criterion by estimating the area of the divided objects (polygon units) in the forest type map and found that approximately 26% of objects were smaller than the minimum zoning area. These results imply that the definitions and update intervals of "Areas Changed Artificially" and "Areas Changed Naturally" need to be established and that the standard for the minimum zoning area needs to be improved.
Among the attributes in the Work Manual, the "Species Change" item classifies terrain features into 52 types, 43 of which belong to stocked land. This study examined distribution ratios by extracting species information from the forest type map. It found that 23 species, approximately 53% of all species, each occupied less than 0.1% of forested land. The top three species were pine and other species. Although undergrowth on unstocked forest land is classified in the terrain feature system, its definition and classification criteria are not established in the "Forest Classification" item. Therefore, the terrain feature system needs to be reestablished and the definition of undergrowth set.
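The patch-based figures reported in this abstract (number of patches, mean patch size, and the share of polygons below the 0.1 ha minimum zoning area) can be computed from a list of polygon areas, as in the following minimal sketch. The function name and the sample areas are hypothetical, and shape complexity is left out because the abstract does not say which index was used.

```python
def fragmentation_summary(patch_areas_ha, min_area_ha=0.1):
    """Summarize patch counts and sizes for one region.

    `patch_areas_ha` is a list of polygon areas in hectares;
    `min_area_ha` is the minimum zoning area (0.1 ha in the abstract).
    """
    if not patch_areas_ha:
        raise ValueError("at least one patch is required")
    n = len(patch_areas_ha)
    below_min = sum(1 for area in patch_areas_ha if area < min_area_ha)
    return {
        "number_of_patches": n,
        "mean_patch_size_ha": sum(patch_areas_ha) / n,
        "share_below_min": below_min / n,
    }


# Hypothetical polygon areas (ha), for illustration only.
areas = [0.05, 0.08, 0.5, 1.2, 3.0, 0.09, 2.4, 0.7]
summary = fragmentation_summary(areas)
```

Comparing summaries for the same region across two map editions shows the pattern the abstract describes: more patches with a smaller mean size indicates increased fragmentation.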
A methodology for predicting the carbon performance of newly created urban greening plans is needed, as policies based on quantified carbon performance are rapidly being introduced in response to the climate crisis caused by global warming. This study developed a tree carbon calculator that can be used for carbon reduction design in landscaping and sought to verify its effectiveness in landscape design. For practical operability, MS Excel was selected as the format, and carbon absorption and storage by tree type and size were derived for 93 representative species to reflect planting design characteristics. A database including tree unit prices was established to reflect cost constraints. To verify the calculator's performance, a planting design experiment was conducted in which four landscape designers simulated the design of parks in the central region, and causal relationships were analyzed through semi-structured interviews before and after. As a result, carbon absorption and carbon storage in designs made with the tree carbon calculator were about 17-82% and about 14-85% higher, respectively, than in designs made without it. The increase in carbon performance was attributed to additional planting actively carried out within the given budget, along with replacement by species with excellent carbon performance. Pre-interviews revealed that, before using the tree carbon calculator, designers distrusted the data and felt burdened by a new program, but their attitudes tended to become positive because of its usefulness and ease of use. To implement carbon reduction design in the landscaping field, the calculator needs to be developed further into a carbon calculator for trees and landscaping performance. This study is expected to offer a useful direction for introducing carbon reduction designs based on quantitative data in landscape design.
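The core arithmetic of such a calculator (summing absorption, storage, and cost over a planting plan and checking the budget) can be sketched as follows. The study's tool is an MS Excel workbook; this Python version, its names, and every number in the catalog are hypothetical placeholders, not values from the study's 93-species database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeSpec:
    absorption_kg_per_year: float  # annual carbon absorption per tree
    storage_kg: float              # carbon stored per tree
    unit_price: float              # cost per tree, in any currency unit


# Hypothetical catalog keyed by (species, size class).
CATALOG = {
    ("Species A", "small"): TreeSpec(5.0, 40.0, 100.0),
    ("Species A", "large"): TreeSpec(12.0, 150.0, 300.0),
    ("Species B", "small"): TreeSpec(8.0, 60.0, 120.0),
}


def evaluate_plan(plan, catalog=CATALOG, budget=None):
    """Total absorption, storage, and cost for a plan of {(species, size): quantity}."""
    absorption = storage = cost = 0.0
    for key, quantity in plan.items():
        spec = catalog[key]  # raises KeyError for species not in the catalog
        absorption += spec.absorption_kg_per_year * quantity
        storage += spec.storage_kg * quantity
        cost += spec.unit_price * quantity
    return {
        "absorption_kg_per_year": absorption,
        "storage_kg": storage,
        "cost": cost,
        "within_budget": budget is None or cost <= budget,
    }


plan = {("Species A", "small"): 10, ("Species B", "small"): 5}
result = evaluate_plan(plan, budget=2000.0)
```

Evaluating alternative plans against the same budget mirrors how designers in the experiment could swap in better-performing species or add trees while staying within cost limits.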
The wall shear stress in the vicinity of end-to-end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer (FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flows with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor in the formation of anastomotic neointimal fibrous hyperplasia (ANFH) and in graft failure. The present study suggests a correlation between regions of low wall shear stress and the development of ANFH in end-to-end anastomoses.
Air pressure decay (APD) rate and ultrafiltration rate (UFR) tests were performed on new and saline-rinsed dialyzers as well as on dialyzers reused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas the saline-rinsed and reused dialyzers showed a considerable amount of decay. C-DAK dialyzers had a larger APD (11.70