KNU Korean Sentiment Lexicon: Bi-LSTM-based Method for Building a Korean Sentiment Lexicon (Bi-LSTM 기반의 한국어 감성사전 구축 방안)
- Journal of Intelligence and Information Systems / v.24 no.4 / pp.219-240 / 2018
Sentiment analysis, one of the text mining techniques, is a method for extracting subjective content embedded in text documents. Sentiment analysis methods have recently been widely used in many fields. For example, data-driven surveys analyze the subjectivity of text data posted by users, and market research analyzes users' review posts to quantify the reputation of a target product. The basic method of sentiment analysis is to use a sentiment dictionary (or lexicon), a list of sentiment vocabulary with positive, neutral, or negative semantics. In general, the meaning of many sentiment words is likely to differ across domains. For example, the sentiment word 'sad' indicates a negative meaning in many fields but not in the movie domain. To perform accurate sentiment analysis, we need to build a sentiment dictionary for the given domain. However, building such a sentiment lexicon is time-consuming, and many sentiment words are missed unless a general-purpose sentiment lexicon is used. To address this problem, several studies have constructed sentiment lexicons suited to specific domains based on 'OPEN HANGUL' and 'SentiWordNet', which are general-purpose sentiment lexicons. However, OPEN HANGUL is no longer in service, and SentiWordNet does not work well because of language differences that arise when converting Korean words into English words. These restrictions limit the use of such general-purpose sentiment lexicons as seed data for building a domain-specific sentiment lexicon. In this article, we construct the 'KNU Korean Sentiment Lexicon (KNU-KSL)', a new general-purpose Korean sentiment dictionary that is more advanced than existing general-purpose lexicons.
The proposed dictionary, a list of domain-independent sentiment words such as 'thank you', 'worthy', and 'impressed', is built so that a sentiment dictionary for a target domain can be constructed quickly. Specifically, it builds its sentiment vocabulary by analyzing the glosses contained in the Standard Korean Language Dictionary (SKLD) through the following procedure: First, we propose a sentiment classification model based on Bidirectional Long Short-Term Memory (Bi-LSTM). Second, the proposed deep learning model automatically classifies each gloss as having either positive or negative meaning. Third, positive words and phrases are extracted from the glosses classified as positive, while negative words and phrases are extracted from the glosses classified as negative. Our experimental results show that the average accuracy of the proposed sentiment classification model is up to 89.45%. In addition, the sentiment dictionary is further extended using various external sources, including SentiWordNet, SenticNet, Emotional Verbs, and Sentiment Lexicon 0603. Furthermore, we add sentiment information about frequently used coined words and emoticons that appear mainly on the Web. The KNU-KSL contains a total of 14,843 sentiment entries, each of which is a 1-gram, a 2-gram, a phrase, or a sentence pattern. Unlike existing sentiment dictionaries, it is composed of words that are not affected by particular domains. The recent trend in sentiment analysis is to use deep learning techniques without sentiment dictionaries, and the importance of developing sentiment dictionaries has gradually declined. However, a recent study shows that the words in a sentiment dictionary can be used as features of deep learning models, resulting in sentiment analysis with higher accuracy (Teng, Z., 2016). This result indicates that a sentiment dictionary is useful not only for sentiment analysis itself but also as a source of features that improve the accuracy of deep learning models.
The proposed dictionary can be used as basic data for constructing the sentiment lexicon of a particular domain and as features of deep learning models. It is also useful for automatically and quickly building large training sets for deep learning models.
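As a rough illustration of how a lexicon containing 1-gram and 2-gram entries might be applied to text, the following Python sketch scores a pre-tokenized sentence, preferring a 2-gram match over a 1-gram match. The entries, the polarity values, and the names `TOY_LEXICON`, `score_tokens`, and `label` are hypothetical placeholders, not actual KNU-KSL contents or the authors' code, and tokenization is assumed to happen beforehand.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: keys are 1-grams or 2-grams (as token tuples),
# values are polarity scores. These are NOT actual KNU-KSL entries.
TOY_LEXICON: Dict[Tuple[str, ...], int] = {
    ("thank", "you"): 2,
    ("worthy",): 1,
    ("impressed",): 2,
    ("terrible",): -2,
    ("not", "good"): -1,
    ("good",): 1,
}


def score_tokens(tokens: List[str], lexicon: Dict[Tuple[str, ...], int]) -> int:
    """Sum polarity scores, preferring a 2-gram match over a 1-gram match."""
    total = 0
    i = 0
    while i < len(tokens):
        bigram = tuple(tokens[i:i + 2])
        if len(bigram) == 2 and bigram in lexicon:
            total += lexicon[bigram]
            i += 2  # consume both tokens of the matched 2-gram
        else:
            total += lexicon.get((tokens[i],), 0)  # unknown tokens score 0
            i += 1
    return total


def label(score: int) -> str:
    """Map a summed score to a coarse polarity label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Matching the 2-gram first keeps a phrase such as "not good" from being scored as the positive 1-gram "good"; a real lexicon with phrases and sentence patterns would need a more general matcher.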
As interest in intelligent search engines increases, various studies have been conducted to extract and utilize product-related features intelligently. In particular, when users search for goods in e-commerce search engines, the 'color' of a product is an important feature that describes the product. It is therefore necessary to handle the synonyms of color terms in order to return accurate results for users' color-related queries. Previous studies have suggested a dictionary-based approach to processing synonyms for color features. However, the dictionary-based approach has the limitation that it cannot handle unregistered color-related terms in user queries. To overcome this limitation of conventional methods, this research proposes a model that extracts RGB values from an internet search engine in real time and outputs similar color names based on the designated color information. First, a color term dictionary was constructed for basic color search; it includes color names and the R, G, B values of each color from the Korean color standard digital palette program and the Wikipedia color list. The dictionary was made more robust by adding 138 color names converted from English color names to foreign words in Korean, together with their corresponding RGB values. As a result, the final color dictionary includes a total of 671 color names and corresponding RGB values. The proposed method starts with the specific color that a user searched for. Then, the built-in color dictionary is checked for the searched color. If the color exists in the dictionary, its RGB values in the dictionary are used as the reference values of the retrieved color. If the searched color does not exist in the dictionary, the top-5 Google image search results for the searched color are crawled, and average RGB values are extracted from a certain middle area of each image.
Several approaches were tried for extracting the RGB values from images, since simply averaging the RGB values of the center area of each image has limits. As a result, clustering the RGB values in a certain area of the image and using the average value of the densest cluster as the reference values showed the best performance. Based on the reference RGB values of the searched color, the RGB values of all the colors in the previously constructed color dictionary are compared. Then a color list is created with colors within the range of
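The lookup-then-compare flow described in this abstract can be sketched in Python as follows. This is only a minimal sketch under stated assumptions: the four dictionary entries are illustrative, the `fallback` callable is a hypothetical stand-in for the image-search and clustering step, and both the Euclidean distance measure and the `max_distance` threshold are assumptions, since the source text does not give the comparison range.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

RGB = Tuple[int, int, int]

# Hypothetical miniature dictionary; the dictionary in the abstract has 671 entries.
COLOR_DICT: Dict[str, RGB] = {
    "red": (255, 0, 0),
    "crimson": (220, 20, 60),
    "navy": (0, 0, 128),
    "blue": (0, 0, 255),
}


def reference_rgb(
    name: str,
    color_dict: Dict[str, RGB],
    fallback: Optional[Callable[[str], RGB]] = None,
) -> Optional[RGB]:
    """Return the dictionary RGB if registered, else ask the fallback.

    `fallback` is a hypothetical placeholder for extracting RGB values
    from image search results; it is not implemented here.
    """
    if name in color_dict:
        return color_dict[name]
    return fallback(name) if fallback is not None else None


def similar_colors(ref: RGB, color_dict: Dict[str, RGB], max_distance: float) -> List[str]:
    """List dictionary colors within `max_distance` of `ref`, nearest first.

    Euclidean distance in RGB space and the threshold are assumptions.
    """
    scored = [(math.dist(ref, rgb), name) for name, rgb in color_dict.items()]
    return [name for dist, name in sorted(scored) if dist <= max_distance]
```

For a registered term, `similar_colors(reference_rgb("red", COLOR_DICT), COLOR_DICT, 100)` returns the term itself followed by its near neighbors; for an unregistered term, the same call works once a fallback supplies the reference values.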
Concerns have recently been raised over the unbalanced supply of crops: crop prices have been unstable, and at one point they rose so high that the word 'agflation' (agriculture + inflation) was coined. Korea in particular is a small country and needs to secure a stable supply of crops by investing in produce importation at the national level. Investment in foreign produce importation is becoming more important as a measure to secure a sufficient supply of crops, given the limited supply of domestic crops, weakened farming conditions worldwide, recent changes in the use of crops due to the development of biofuels, the influence of carbon emissions on crops, rising crop prices, and the influx of foreign hot money. However, investing in foreign produce importation faces many problems: lack of government support; lack of farming information and technology; difficulty in securing capital; no immediate payoff from the investment; and insufficient management. Although foreign produce is originally more price-competitive than domestic produce, it loses its competitiveness in the process of importation because of high tariffs and a poor distribution system, which makes it difficult to sell in Korea. The feasibility of investment in foreign produce importation is therefore being questioned; to make it viable, foreign produce must maintain its price competitiveness. In particular, the harvest of agricultural products depends on the natural and geographical conditions of each country, and the products have indigenous properties, so the distribution system for importing and exporting agricultural products should be treated more carefully than that of other industries. Distribution costs differ by item and include the costs of sorting and wrapping, wrapping materials, domestic transport, international transport, and customs clearance for import and export.
Transporting and storing agricultural products therefore generates considerable costs compared with other products. In addition, as diets have improved, demands for the stability, taste, and visible quality of food, including agricultural products, are rising; improper storage causes food to decompose and lose freshness, which makes storage more difficult than storage at room temperature, so the storage and transport of agricultural products require specialized expertise. Moreover, because a lack of expertise in distribution tasks such as storage and wrapping leaves the constraints of distance unresolved, distribution has been limited to import and export within short-distance regions. Distribution outsourcing that can provide specialized distribution management is therefore needed, and a more effective distribution system must be established. However, the existing distribution system for agricultural products is exposed to various problems in its distribution channels, in building the distribution environment, and in its distribution strategy, as follows. First, in the case of investment in overseas agriculture, a stable supply of products is difficult because production areas are widely dispersed and, since overseas distribution channels are involved, are influenced by external factors. In terms of quality, standardization of products is difficult; the distribution system is complicated and unreasonable because international trade lengthens the distribution channels; and financial and institutional support is insufficient. In particular, there are many sources of inefficiency, including a multi-level distribution process, a large gap between production cost and consumer price, a lack of physical distribution facilities, and difficulties in storage and transport due to a lack of wrapping containers.
Besides, because the import and export of agricultural products have been managed through each company's own distribution under transaction contracts between manufacturers and exporting companies, efficiency is low due to excessive investment in fixed costs, and a lack of expertise in handling agricultural products lowers the value of the products, which erodes their price competitiveness. Second, among the tangible and intangible services that promote the efficiency of distribution as a whole, a function for building the distribution environment is needed; this function includes distribution information, standards and inspection systems, distribution finance, risk diversification systems, education and training, and distribution administration and the tax system. In general, such a function is difficult to change or supplement in an innovative way because, despite its necessity, the return on investment does not appear immediately. In the distribution of agricultural products in particular, collection and distribution are performed individually through various channels, so distribution information and standardization are receiving more attention because of duplicated work and a lack of expertise. Efficient distribution management is also quite difficult because of a shortage of distribution professionals, so support for professional education is needed.
Third, although the government regards maintaining the self-sufficiency ratio of the staple food, rice, as important, the level of dependence on overseas sources for other crops is high. A plan for stably securing food resources other than the staple food is therefore also necessary. In particular, governmental organizations for agricultural product distribution in Korea are production-centered and have an unreasonable structure whose functions for distribution and consumption are quite insufficient. The development of new distribution channels that can cope with changes in the distribution environment is also lacking, and these organizations have not achieved actual results from their distribution strategy because of a passive strategy for price distribution. That is, by excessively emphasizing consumer protection in the distribution of agricultural products, the system fails to mediate the interests of all parties in the distribution channel, such as manufacturers, wholesale dealers, and vendors, which implies that the supply base may become vulnerable. Therefore, this study examined the fundamental concept and actual state of Korea's investment in overseas agriculture; identified the necessities, considerations, and problems of overseas agricultural investment; and suggested distribution-level improvements for the price competitiveness of agricultural products cultivated overseas under five aspects: indirect government support; modernization of distribution and strengthening of the distribution information function; government policy support for distribution facilities and improvement of transportation routes and loading and unloading work; securing price competitiveness; and cultivation of professional manpower through education and training. Here are some suggestions for foreign produce importation. First, the government should conduct a survey of the current distribution channels and analyze the situation in order to establish long-term development plans.
By providing each agricultural area with a guideline for planning appropriate crop production, the government can help farmers prepare for importation and prevent them from producing the same crops all at the same time. The government can sign an MOU with a foreign government and promote importation so that the development of agricultural resources can be stable and steady. Second, the government can establish a strategy for an effective distribution system by providing farmers and agriculture-related workers with distribution information such as price, production, demand, market structure and location, and the features of each crop. For such a distribution system to become feasible, the government needs to reconstruct the current distribution system, designate a public organization to provide distribution information, and set criteria for produce quality levels, trade units, and package units. Third, the government should provide financial support and a policy for finding an efficient distribution channel so that foreign produce is delivered fresh: it should expand distribution facilities (for selecting, packaging, storing, and processing) and transportation vehicles while modernizing old facilities. There should be another policy to improve the efficiency of unloading and to lower the cost of distribution. Fourth, it is necessary to enact a new law covering exceptional cases for importing produce in order to maintain price competitiveness; currently, high tariffs keep imported produce from being distributed domestically. However, the new adjustment should be made carefully within WTO regulations, since giving preferential tariffs can create problems. The government can also simplify the distribution channels in order to reduce costs in the distribution process. Fifth, the government should educate distributors to raise efficiency and to modernize the distribution system.
It is necessary to develop human resources by educating people about the foreign agricultural environment, produce quality, and management skills, and by introducing successful cases from advanced countries.
Nowadays, social networks are a huge communication platform that lets people connect with one another and brings users together to share common interests, experiences, and daily activities. Users spend hours per day maintaining personal information and interacting with other people via posting, commenting, messaging, games, social events, and applications. With the growth of information that users distribute across social networks, there is great potential to utilize social data to enhance the quality of recommender systems. Some research on social network analysis investigates how social networks can be used in the recommendation domain. Within this research, we are interested in taking advantage of the interactions between a user and others in a social network, which can be identified and treated as a social relationship. Furthermore, users' decisions before purchasing products mostly depend on suggestions from people who have either the same preferences or a closer relationship. For this reason, we believe that users' relationships in a social network can provide an effective way to improve the quality of a recommender system's predictions of users' interests. Social relationships between users found in a social network are therefore a common factor for improving the way user preferences are predicted in the conventional approach. Recommender systems are rapidly increasing in popularity and are currently used by many e-commerce sites such as Amazon.com, Last.fm, and eBay.com. The collaborative filtering (CF) method is one of the essential and powerful techniques in recommender systems for suggesting appropriate items to a user by learning the user's preferences. The CF method focuses on user data and automatically generates predictions about a user's interests by gathering information from users who share a similar background and preferences.
Specifically, the intention of the CF method is to find users who have similar preferences and to suggest to the target user the items most preferred by those nearest-neighbor users. The CF method considers two basic units: the user and the item. Each user needs to provide rating values for items (e.g., movies, products, books) to indicate their interest in those items. CF then uses the user-rating matrix to find a group of users whose ratings are similar to those of the target user, and it predicts the unknown rating values for items that the target user has not rated. CF has been successfully implemented in both information filtering and e-commerce applications. However, some important challenges remain, such as cold start, data sparsity, and scalability, which affect the quality and accuracy of prediction. To overcome these challenges, many researchers have proposed various kinds of CF methods, such as hybrid CF, trust-based CF, and social network-based CF. To improve the recommendation performance and prediction accuracy of standard CF, in this paper we propose a method that integrates the traditional CF technique with social relationships between users discovered from users' behavior in a social network, namely Facebook. We identify a user's relationships from the user's behavior, such as posts and comments exchanged with friends on Facebook. We believe that social relationships implicitly inferred from user behavior can be applied to compensate for the limitations of the conventional approach. We therefore extract the posts and comments of each user using the Facebook Graph API and calculate a feature score for each term to obtain a feature vector for computing user similarity. Then, we combine the result with the similarity value computed using the traditional CF technique. Finally, our system provides a list of recommended items according to the neighbor users who have the largest total similarity value to the target user.
To verify and evaluate our proposed method, we performed an experiment on data collected from our Movies Rating System. A prediction accuracy evaluation is conducted to demonstrate how correct our algorithm's recommendations are in terms of MAE. Then, a performance evaluation shows the effectiveness of our method in terms of precision, recall, and F1-measure. An evaluation of coverage is also included in our experiment to assess the ability to generate recommendations. The experimental results show that our proposed method outperforms the benchmark methods and is more accurate in suggesting items to users. Using users' behavior in the social network improves recommendation accuracy by up to 6%. Moreover, the recommendation performance experiment shows that incorporating social relationships observed from user behavior into CF is beneficial, generating recommendations with a 7% improvement in performance compared with benchmark methods. Finally, we confirm that interaction between users in a social network can enhance accuracy and give better recommendations than the conventional approach.
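The idea of combining a rating-based similarity with a similarity derived from users' posts and comments can be sketched in Python as below. This is a minimal sketch, not the authors' implementation: the abstract does not specify the similarity measure or how the two values are combined, so cosine similarity, the weighted sum with a hypothetical weight `alpha`, and all function names here are assumptions. Term feature vectors are taken as given rather than fetched from any API.

```python
import math
from typing import Dict, List, Optional

Vector = Dict[str, float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def combined_similarity(ratings_u: Vector, ratings_v: Vector,
                        terms_u: Vector, terms_v: Vector,
                        alpha: float = 0.5) -> float:
    """Assumed combination: weighted sum of rating and social similarity."""
    return alpha * cosine(ratings_u, ratings_v) + (1 - alpha) * cosine(terms_u, terms_v)


def predict(target: str, item: str, ratings: Dict[str, Vector],
            terms: Dict[str, Vector], alpha: float = 0.5) -> Optional[float]:
    """Similarity-weighted average of neighbors' ratings for `item`."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == target or item not in other_ratings:
            continue
        w = combined_similarity(ratings[target], other_ratings,
                                terms.get(target, {}), terms.get(other, {}), alpha)
        num += w * other_ratings[item]
        den += abs(w)
    return num / den if den else None


def mae(predicted: List[float], actual: List[float]) -> float:
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

With `alpha = 1.0` the sketch reduces to rating-only CF, which gives a simple baseline for checking whether the social term changes MAE on held-out ratings.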
Thermal analysis by mathematical model simulation makes it possible to reasonably predict the heating and/or cooling requirements of greenhouses located in various geographical and climatic environments. Another advantage of the model simulation technique is that it makes it possible to select an appropriate heating system, set up an energy utilization strategy, schedule seasonal crop patterns, and determine new greenhouse ranges. In this study, the control pattern for the greenhouse microclimate is categorized as cooling and heating. A dynamic model was adopted to simulate heating requirements and/or energy conservation effectiveness, such as energy saving by a night-time thermal curtain, estimation of Heating Degree-Hours (HDH), and long-term prediction of greenhouse thermal behavior. On the other hand, the cooling effects of ventilation, shading, and the pad & fan system were partly analyzed by a static model. By the experimental work with small size model greenhouse of 1.2m
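For readers unfamiliar with Heating Degree-Hours, the quantity is commonly computed as the sum, over hourly readings, of the amount by which the outdoor temperature falls below a base (set-point) temperature. The Python sketch below follows that common definition; the abstract does not state the formula or base temperature the study used, so both the definition and the parameter `base_temp_c` are assumptions.

```python
from typing import Iterable


def heating_degree_hours(hourly_outdoor_temps_c: Iterable[float], base_temp_c: float) -> float:
    """Sum of (base - outdoor) over the hours when outdoor is below base.

    Hours at or above the base temperature contribute zero.
    """
    return sum(max(0.0, base_temp_c - t) for t in hourly_outdoor_temps_c)
```

For example, four hourly readings of 10, 15, 20, and 5 degrees C against a 15 degree base contribute 5, 0, 0, and 10 degree-hours respectively.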
The wall shear stress in the vicinity of end-to-end anastomoses under steady flow conditions was measured using a flush-mounted hot-film anemometer (FMHFA) probe. The experimental measurements were in good agreement with numerical results except in flow with low Reynolds numbers. The wall shear stress increased proximal to the anastomosis in flow from the Penrose tubing (simulating an artery) to the PTFE graft. In flow from the PTFE graft to the Penrose tubing, low wall shear stress was observed distal to the anastomosis. Abnormal distributions of wall shear stress in the vicinity of the anastomosis, resulting from the compliance mismatch between the graft and the host artery, might be an important factor in ANFH formation and graft failure. The present study suggests a correlation between regions of low wall shear stress and the development of anastomotic neointimal fibrous hyperplasia (ANFH) in end-to-end anastomoses.
Air pressure decay (APD) rate and ultrafiltration rate (UFR) tests were performed on new and saline-rinsed dialyzers as well as those reused in patients several times. C-DAK 4000 (Cordis Dow) and CF IS-11 (Baxter Travenol) reused dialyzers obtained from the dialysis clinic were used in the present study. The new dialyzers exhibited a relatively flat APD, whereas saline-rinsed and reused dialyzers showed a considerable amount of decay. C-DAK dialyzers had a larger APD(11.70