Target-Aspect-Sentiment Joint Detection with CNN Auxiliary Loss for Aspect-Based Sentiment Analysis (Korean title: Aspect-Based Sentiment Analysis Using CNN Auxiliary Loss)
- Journal of Intelligence and Information Systems / v.27 no.4 / pp.1-22 / 2021
Aspect-Based Sentiment Analysis (ABSA), which analyzes sentiment with respect to the aspects that appear in a text, is drawing attention because it can be applied across many industries. ABSA analyzes the sentiment toward each of the multiple aspects a text contains. It is studied in various forms depending on the purpose, such as analyzing targets as well, or only aspects and sentiments. Here, the aspect refers to a property of a target, and the target refers to the text span that evokes the sentiment. For example, for restaurant reviews, the aspects could be food taste, food price, quality of service, restaurant ambience, and so on. In a review that says, "The pasta was delicious, but the salad was not," the words "pasta" and "salad," which are mentioned directly in the sentence, are the targets. So far, most ABSA studies have analyzed sentiment based only on aspects or only on targets. However, such analysis can be inaccurate even for the same aspect or target, namely when aspects or sentiments are divided or when sentiment exists without a target. Consider the sentence "Pizza and the salad were good, but the steak was disappointing." Although the aspect of this sentence is limited to "food," conflicting sentiments coexist. Likewise, in "Shrimp was delicious, but the price was extravagant," the target is "shrimp," yet opposite sentiments coexist depending on the aspect. Finally, in "The food arrived too late and is cold now," there is no target (NULL), but the sentence conveys a negative sentiment toward the aspect "service." Failing to consider both aspects and targets in such cases, where the sentiment or aspect is divided or where sentiment exists without a target, creates a dual dependency problem. To address this problem, this research analyzes sentiment by considering both aspects and targets (Target-Aspect-Sentiment Detection, hereafter TASD).
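The three example sentences above can be written out as (target, aspect, sentiment) triples to make the dual dependency concrete. The following is a minimal sketch; the `Opinion` class, the `sentiments_by` helper, and the aspect labels chosen for the shrimp sentence ("food taste", "food price") are illustrative assumptions, not annotations taken from the paper or from the SemEval datasets.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Opinion:
    """One (target, aspect, sentiment) triple; hypothetical container."""
    target: Optional[str]  # None stands for the NULL target
    aspect: str
    sentiment: str


# Illustrative annotations for the three sentences discussed in the abstract.
same_aspect = [  # "Pizza and the salad were good, but the steak was disappointing."
    Opinion("Pizza", "food", "positive"),
    Opinion("salad", "food", "positive"),
    Opinion("steak", "food", "negative"),
]
same_target = [  # "Shrimp was delicious, but the price was extravagant."
    Opinion("Shrimp", "food taste", "positive"),  # aspect label assumed
    Opinion("Shrimp", "food price", "negative"),  # aspect label assumed
]
null_target = [  # "The food arrived too late and is cold now."
    Opinion(None, "service", "negative"),
]


def sentiments_by(opinions, key):
    """Group sentiments by a single field ('aspect' or 'target')."""
    grouped = {}
    for op in opinions:
        grouped.setdefault(getattr(op, key), set()).add(op.sentiment)
    return grouped


# Looking at aspects alone, "food" carries both polarities.
print(sentiments_by(same_aspect, "aspect"))
# Looking at targets alone, "Shrimp" carries both polarities.
print(sentiments_by(same_target, "target"))
# With a NULL target, a target-only view has nothing to attach the sentiment to.
print(sentiments_by(null_target, "target"))
```

Only the full triple separates these cases, which is the motivation for detecting target, aspect, and sentiment jointly.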
This study identified limitations of existing TASD research: local contexts are not fully captured, and the F1-score drops dramatically depending on the number of epochs and the batch size. The existing model excels at capturing the overall context and the relations between words, but it struggles with phrases in the local context and learns relatively slowly. This study therefore aims to improve the model's performance. To that end, we added an auxiliary loss for aspect-sentiment classification by constructing CNN (Convolutional Neural Network) layers in parallel with the existing model. Whereas existing models analyze aspect-sentiment through BERT encoding, Pooler, and Linear layers, this research adds a CNN layer followed by adaptive average pooling, and trains the model by adding the resulting aspect-sentiment loss to the existing loss. In other words, during training, the auxiliary loss computed through the CNN layers lets the model capture the local context better. After training, the model performs aspect-sentiment analysis through the existing method. To evaluate the model, two datasets, SemEval-2015 Task 12 and SemEval-2016 Task 5, were used, and the F1-score increased compared to the existing models. With a batch size of 8 and 5 epochs, the gap between the F1-scores of the existing models and this study was largest, at 29 and 45, respectively. Even when the batch size and number of epochs were adjusted, the F1-scores remained higher than those of the existing models. This indicates that the model learns effectively even with a small batch size and few epochs, so it can be useful where resources are limited. Through this study, aspect-based sentiment can be analyzed more accurately.
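The training scheme described above (a main classification head on the pooled encoding, a parallel CNN branch with adaptive average pooling, and the two losses added together) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: random vectors stand in for BERT token encodings and the pooler output, and the function names, filter sizes, number of classes, and the `aux_weight` parameter are all hypothetical. The abstract only says the auxiliary loss is added to the existing loss, which corresponds to `aux_weight=1.0` here.

```python
import math
import random


def linear(x, weight, bias):
    """Dense layer; `weight` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]


def conv1d_relu(tokens, filters):
    """'Valid' 1-D convolution over the token axis, followed by ReLU.

    tokens:  T token vectors of size D
    filters: C filters, each made of K vectors of size D
    returns: C feature sequences of length T - K + 1
    """
    width = len(filters[0])
    out = []
    for filt in filters:
        seq = []
        for start in range(len(tokens) - width + 1):
            window = tokens[start:start + width]
            act = sum(w * v for f_vec, t_vec in zip(filt, window)
                      for w, v in zip(f_vec, t_vec))
            seq.append(max(0.0, act))
        out.append(seq)
    return out


def adaptive_avg_pool(seq, out_size=1):
    """Average `seq` into `out_size` bins regardless of its length."""
    n = len(seq)
    pooled = []
    for i in range(out_size):
        lo = (i * n) // out_size
        hi = -((-(i + 1) * n) // out_size)  # ceiling division
        pooled.append(sum(seq[lo:hi]) / (hi - lo))
    return pooled


def training_loss(tokens, pooled, label, params, aux_weight=1.0):
    """Main loss from the pooler head plus the CNN auxiliary loss."""
    main_logits = linear(pooled, params["main_w"], params["main_b"])
    features = [adaptive_avg_pool(seq)[0]
                for seq in conv1d_relu(tokens, params["filters"])]
    aux_logits = linear(features, params["aux_w"], params["aux_b"])
    main_loss = cross_entropy(main_logits, label)
    aux_loss = cross_entropy(aux_logits, label)
    return main_loss + aux_weight * aux_loss, main_loss, aux_loss


def predict(pooled, params):
    """Inference uses only the main head; the CNN branch is not consulted."""
    logits = linear(pooled, params["main_w"], params["main_b"])
    return max(range(len(logits)), key=logits.__getitem__)


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


T, D, K, C, N_CLASSES = 6, 4, 3, 2, 3  # toy sizes, chosen arbitrarily
tokens = rand_matrix(T, D)             # stand-in for BERT token encodings
pooled = rand_matrix(1, D)[0]          # stand-in for the pooler output
params = {
    "main_w": rand_matrix(N_CLASSES, D), "main_b": [0.0] * N_CLASSES,
    "filters": [rand_matrix(K, D) for _ in range(C)],
    "aux_w": rand_matrix(N_CLASSES, C), "aux_b": [0.0] * N_CLASSES,
}

total, main_loss, aux_loss = training_loss(tokens, pooled, label=1, params=params)
print(f"main={main_loss:.3f} aux={aux_loss:.3f} total={total:.3f}")
print("predicted class:", predict(pooled, params))
```

In a real model both branches would share the BERT encoder, so gradients from the auxiliary loss would also update the encoder; that shared gradient path is how the CNN branch could influence what the main head sees even though it is unused at inference time. The sketch above only computes the forward pass and the combined loss.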
Applied in business, for example in development or in establishing marketing strategies, it will help both consumers and sellers make efficient decisions. In addition, because the model uses a pre-trained model and recorded a relatively high F1-score even with limited resources, we believe it can be fully trained and used by small businesses and by those that do not have much data.
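The F1-score used in the evaluation can be illustrated for triple detection as follows. The abstract does not state how the score is computed, so this sketch assumes exact-match scoring, in which a predicted (target, aspect, sentiment) triple counts as correct only if all three fields match a gold triple. The function name `triple_f1` and the example triples are hypothetical.

```python
def triple_f1(gold, predicted):
    """F1 over exact-match (target, aspect, sentiment) triples."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


gold = [
    ("Pizza", "food", "positive"),
    ("salad", "food", "positive"),
    ("steak", "food", "negative"),
]
predicted = [
    ("Pizza", "food", "positive"),
    ("salad", "food", "positive"),
    ("steak", "food", "positive"),  # wrong sentiment, so the triple is a miss
]

# Two of three predictions are correct and two of three gold triples are found.
print(round(triple_f1(gold, predicted), 3))
```

Under this assumption a single wrong field costs the whole triple, so the metric rewards models that get target, aspect, and sentiment right together.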
Thecodiplosis japonensis is sweeping through the Pinus densiflora forests from the south-west toward the north-east, destroying almost all of the aged large trees as well as the young ones. The front line of infestation is moving slowly but ceaselessly northwards as a long battle front. It is estimated that more than 40 percent of the area of P. densiflora forest has already been damaged; however, some individuals could escape the damage and contribute to restoring the site to its previous vegetation composition. When stands are attacked by this insect, drastic openings usually result in the upper story of the tree canopy, which is formed exclusively by P. densiflora, and environmental factors such as light, temperature, litter accumulation, soil moisture and others are naturally modified. With these changes after insect invasion, phytosociological changes of the vegetation proceed gradually as time passes. If we classify the forest by the history of the insect outbreak, namely non-attacked (healthy forest), recently damaged (the outbreak occurred about 1-2 years ago), severely damaged (occurred 5-6 years ago), damage prolonged (occurred 10 years ago) and restored (occurred about 20 years ago), any directional changes of vegetation composition can be traced in line with these four progressive stages. To elucidate these changes, three survey districts were set (see Tab. 1): (1) "Gongju," where the damage was severe and the outbreak began in 1977, (2) "Buyeo," where the damage was prolonged, and (3) "Gochang," which was restored. All of these are located in the south temperate forest zone, which is delimited mainly by the temperature factor and is at present generally accepted without opposition. In view of temperature, the amount and distribution of precipitation, and various soil factors, the overall homogeneity of environmental conditions between survey districts may be accepted.
However, this does not mean that small changes in edaphic and topographic conditions and microclimates cannot induce any alteration of vegetation patterns. Four survey plots were set in each district, with an inter-plot distance of 3 to 4 km, and four subplots were set within each survey plot. The size of a subplot was