• Title/Summary/Keyword: word problem


A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok;Lee, Hyun Jun;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that meets users' interests and needs from an overflowing volume of content is becoming increasingly important. Amid this flood of information, efforts are being made to better reflect the user's intention in search results rather than treating an information request as a simple string. Large IT companies such as Google and Microsoft also focus on developing knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. Finance in particular is a field where text data analysis is expected to be useful and to have potential, because new information is generated constantly and the earlier information is obtained, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the flow of information is vast and new information continues to emerge. However, automatic knowledge extraction faces several practical difficulties. First, it is difficult to build corpora from different fields with the same algorithm and to extract good-quality triples. Second, producing human-labeled text data becomes harder as the extent and scope of knowledge increase and patterns are constantly updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because the concept of knowledge is itself ambiguous. To overcome these limits and improve the semantic performance of stock-related information search, this study attempts to extract knowledge entities using a neural tensor network and evaluates the performance. Unlike previous work, the purpose of this study is to extract knowledge entities related to individual stock items.
Various but relatively simple data processing methods are applied in the presented model to solve the problems of previous research and to enhance the effectiveness of the model. From these processes, this study makes three contributions. First, it presents a practical and simple automatic knowledge extraction method that can be applied. Second, it shows that performance evaluation is possible through a simple problem definition. Finally, the expressiveness of the knowledge is increased by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. For the empirical study to confirm the usefulness of the presented model, experts' reports on 30 individual stocks, the top 30 items by publication frequency from May 30, 2017 to May 21, 2018, are used. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% as the testing set. Before constructing the model, all reports in the training set are classified by stock, and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by appearance frequency are selected and vectorized using one-hot encoding. Then, using the neural tensor network, one score function is trained per stock. Thus, when a new entity from the testing set appears, its score is calculated with every score function, and the stock whose function gives the highest score is predicted as the item related to the entity. To evaluate the presented model, we confirm its predictive power and whether the score functions are well constructed by calculating the hit ratio for all reports in the testing set.
As a result of the empirical study, the presented model shows 69.3% hit accuracy on the testing set, which consists of 2,526 reports. This hit ratio is meaningfully high despite some constraints on the research. Looking at the prediction performance for each stock, only 3 stocks, LG ELECTRONICS, KiaMtr, and Mando, show far lower performance than average. This result may be due to interference from other similar items and the generation of new knowledge. In this paper, we propose a methodology to find the key entities, or combinations of them, that are needed to search for related information in accordance with the user's investment intention. Graph data is generated using only the named entity recognition tool and applied to the neural tensor network without a learning corpus or word vectors for the field. The empirical test confirms the effectiveness of the presented model as described above. However, some limits and points to improve remain. Notably, the fact that model performance is especially poor for only some stocks shows the need for further research. Finally, through the empirical study, we confirmed that the learning method presented in this study can be used to semantically match new text information with the related stocks.
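
The prediction step described in the abstract above (score a new entity with every per-stock score function, pick the stock with the highest score, then compute a hit ratio) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used neural tensor network score form u · tanh(e1ᵀ W e2 + V [e1; e2] + b), assumes entities are scored in pairs, and uses random, untrained parameters. The names `ntn_score`, `predict_stock`, `hit_ratio`, and the stock labels are hypothetical.

```python
import math
import random

def ntn_score(e1, e2, params):
    """Assumed NTN form: u . tanh(e1^T W[k] e2 + V[k] . [e1; e2] + b[k])."""
    W, V, b, u = params["W"], params["V"], params["b"], params["u"]
    concat = e1 + e2  # list concatenation gives [e1; e2]
    total = 0.0
    for k in range(len(u)):
        bilinear = sum(e1[i] * W[k][i][j] * e2[j]
                       for i in range(len(e1)) for j in range(len(e2)))
        linear = sum(V[k][i] * concat[i] for i in range(len(concat)))
        total += u[k] * math.tanh(bilinear + linear + b[k])
    return total

def random_params(dim, slices, rng):
    """Untrained placeholder parameters; a real model would learn these."""
    r = lambda: rng.uniform(-1.0, 1.0)
    return {
        "W": [[[r() for _ in range(dim)] for _ in range(dim)] for _ in range(slices)],
        "V": [[r() for _ in range(2 * dim)] for _ in range(slices)],
        "b": [r() for _ in range(slices)],
        "u": [r() for _ in range(slices)],
    }

def one_hot(index, dim):
    return [1.0 if i == index else 0.0 for i in range(dim)]

def predict_stock(e1, e2, score_functions):
    """Return the stock whose score function gives the highest score."""
    return max(score_functions, key=lambda s: ntn_score(e1, e2, score_functions[s]))

def hit_ratio(predicted, actual):
    """Fraction of predictions that match the true stock."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

rng = random.Random(0)
DIM, SLICES = 4, 2  # toy sizes; the abstract uses 100 entities per stock
score_functions = {name: random_params(DIM, SLICES, rng)
                   for name in ("stock_A", "stock_B", "stock_C")}  # hypothetical labels
best = predict_stock(one_hot(0, DIM), one_hot(2, DIM), score_functions)
```

With trained parameters, `hit_ratio` over all test reports would correspond to the evaluation measure the abstract reports.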

Target-Aspect-Sentiment Joint Detection with CNN Auxiliary Loss for Aspect-Based Sentiment Analysis (CNN 보조 손실을 이용한 차원 기반 감성 분석)

  • Jeon, Min Jin;Hwang, Ji Won;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.27 no.4 / pp.1-22 / 2021
  • Aspect Based Sentiment Analysis (ABSA), which analyzes sentiment based on the aspects that appear in a text, is drawing attention because it can be used in various business industries. ABSA analyzes sentiment separately for each of the multiple aspects a text contains. It is being studied in various forms depending on the purpose, such as analyzing all targets or only aspects and sentiments. Here, the aspect refers to a property of a target, and the target refers to the text that causes the sentiment. For example, for restaurant reviews, the aspects could be food taste, food price, quality of service, mood of the restaurant, etc. If a review says, "The pasta was delicious, but the salad was not," the words "pasta" and "salad," which are directly mentioned in the sentence, become the "targets." So far, most ABSA studies have analyzed sentiment based only on aspects or only on targets. However, even with the same aspects or targets, sentiment analysis may be inaccurate, for instance when aspects or sentiments are divided or when sentiment exists without a target. Consider the sentence "Pizza and the salad were good, but the steak was disappointing." Although the aspect of this sentence is limited to "food," conflicting sentiments coexist. In the sentence "Shrimp was delicious, but the price was extravagant," the target is "shrimp," yet opposite sentiments coexist depending on the aspect. Finally, in a sentence like "The food arrived too late and is cold now," there is no target (NULL), but it conveys a negative sentiment toward the aspect "service." Thus, failure to consider both aspects and targets, when sentiment or aspect is divided or when sentiment exists without a target, creates a dual dependency problem.
To address this problem, this research analyzes sentiment by considering both aspects and targets (Target-Aspect-Sentiment Detection, hereafter TASD). This study identified limitations of existing research in the field of TASD: local contexts are not fully captured, and the F1-score drops dramatically depending on the number of epochs and the batch size. The current model excels at capturing the overall context and the relations between words. However, it struggles with phrases in the local context and is relatively slow to learn. Therefore, this study tries to improve the model's performance. To this end, we additionally used an auxiliary loss in aspect-sentiment classification by constructing CNN (Convolutional Neural Network) layers parallel to the existing model. Whereas existing models analyze aspect-sentiment through BERT encoding, Pooler, and Linear layers, this research adds CNN layers followed by adaptive average pooling to the existing model, and training proceeds by adding the additional aspect-sentiment loss to the existing loss. In other words, during training, the auxiliary loss computed through the CNN layers allows the local context to be captured better. After training, the model performs aspect-sentiment analysis through the existing method. To evaluate the performance of this model, two datasets, SemEval-2015 task 12 and SemEval-2016 task 5, were used, and the F1-score increased compared to the existing models. When the batch size was 8 and the number of epochs was 5, the difference between the F1-scores of the existing models and this study was largest, at 29 and 45, respectively. Even when the batch size and number of epochs were adjusted, the F1-scores were higher than those of the existing models. That is, even with small batch sizes and few epochs, the model can be trained effectively compared to the existing models. Therefore, it can be useful in situations where resources are limited. Through this study, aspect-based sentiments can be analyzed more accurately.
Through various uses in business, such as development or establishing marketing strategies, both consumers and sellers will be able to make efficient decisions. In addition, it is believed that the model can be fully trained and utilized by small businesses that do not have much data, given that it uses a pre-trained model and recorded a relatively high F1-score even with limited resources.
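
The training scheme described in the abstract above (an auxiliary aspect-sentiment loss from a parallel branch is added to the existing loss during training, while prediction uses only the existing path) can be illustrated with a small sketch. The logits here are made-up numbers standing in for the outputs of the main classifier and the CNN branch; the function names and the `aux_weight` parameter are assumptions for illustration and do not come from the paper.

```python
import math

def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def training_loss(main_logits, aux_logits, target, aux_weight=1.0):
    """Existing loss plus the auxiliary loss from the parallel branch.

    aux_weight is a hypothetical knob; 1.0 corresponds to plain addition.
    """
    return cross_entropy(main_logits, target) + aux_weight * cross_entropy(aux_logits, target)

def predict(main_logits):
    """At inference time only the main branch is used."""
    return max(range(len(main_logits)), key=main_logits.__getitem__)

main_logits = [2.0, 0.5, -1.0]  # placeholder output of the existing classifier
aux_logits = [1.0, 1.5, 0.0]    # placeholder output of the auxiliary CNN branch
loss = training_loss(main_logits, aux_logits, target=0)
label = predict(main_logits)
```

Because the auxiliary branch only contributes to the loss, it influences the shared encoder through gradients during training and adds no cost at prediction time.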

Stress and Oral Health Care in Nonhealth-Related Majors (비보건계열 대학생의 스트레스와 구강건강관리)

  • Woo, Seung-Hee;Ju, On Ju
    • Journal of dental hygiene science / v.15 no.5 / pp.527-535 / 2015
  • The findings of the study showed that the college students felt more stress when they had to receive treatment for their oral health, and that they experienced less stress when they took good care of their oral health for preventive purposes. A self-administered survey was conducted on 235 junior college students in Jeollanam-do whose majors were unrelated to health, from March 4 to 30, 2015. A total of 27.2% of the respondents had received dental caries treatment, and 48.1% had received periodontal treatment. When the students' stress about personality, appearance, families and interpersonal relationships was measured, they were most stressed about their personality ($3.40 \pm 0.73$). Specifically, they scored highest on the item "It's such a hassle to do something" ($3.73 \pm 1.20$) and lowest on the item "I was concerned about someone else's problem" ($2.22 \pm 1.15$). The female students experienced more stress about their appearance, personality, families and interpersonal relationships than the male students. The male students felt more stress about their studies than the female students. The college students who had dental caries and periodontal diseases suffered above-average stress, and the stress level of the group that had scaling experience and/or had received toothbrushing education, namely those taking care of their oral health for preventive purposes, was further below average than that of the group that had not. In short, measures to prevent oral diseases in college students are urgently required, as the students who suffered from oral diseases and received treatment were more stressed. Systematic educational programs need to be implemented and revitalized to help college students avoid oral diseases and promote their oral health.

Comparison of Reliability and Validity of Three Korean Versions of the 20-Item Toronto Alexithymia Scale (TAS-20의 한국판 3종간의 신뢰도 및 타당도 비교)

  • Chung, Un-Sun;Rim, Hyo-Deog;Lee, Yang-Hyun;Kim, Sang-Heon
    • Korean Journal of Psychosomatic Medicine / v.11 no.1 / pp.77-88 / 2003
  • Objectives: The purpose of this study was to compare the reliability and validity of three Korean versions of the 20-item Toronto Alexithymia Scale and to identify the most reliable and valid Korean translation for both clinical and research purposes in Korea. The first was a Korean version of the 20-item Toronto Alexithymia Scale developed by Lee YH et al in 1996, designated TAS-20K(1996) in this study. This scale had a problem with one item due to the cultural difference regarding the word 'analyzing' between Western and Korean culture. The second was the version of TAS-20K(1996) revised on that point by Lee YH et al in 1996 without validation, designated TAS-20K(2003) in this study. The third was a 23-item Korean version developed by Sin HG and Won HT in 1997, which differed somewhat from the 20-item Toronto Alexithymia Scale (TAS-20) in the total number of items, the content of some items and the scoring method. This scale was designated S-TAS here. Methods: 408 medical students were tested with a single scale composed of all the distinct items from the three versions, randomly arranged. We evaluated goodness-of-fit and Cronbach's $\alpha$ coefficients of the three scales for reliability. We used confirmatory factor analysis to compare validity. Results: TAS-20K(2003) showed better internal consistency than TAS-20K(1996), which implied that cultural differences should be considered in the Korean translation. Both TAS-20K(2003) and S-TAS replicated the three-factor structure and had adequate fit, good internal consistency and acceptable validity. However, S-TAS had one item with poor item-factor correlation and did not show a high correlation between item 2 and factor 1, as before in 1997. Conclusion: Although S-TAS added 3 items and changed the content of two items, it did not show better reliability and validity than TAS-20K(2003).
Therefore, it is proposed to use TAS-20K(2003) as the Korean version of the 20-item Toronto Alexithymia Scale (TAS-20K) for international communication of the results of alexithymia research. It has good internal consistency and validity and maintains the original items and the same construct and scoring method as the 20-item Toronto Alexithymia Scale.
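
The reliability measure named in the abstract above, Cronbach's α, is the standard internal-consistency coefficient α = k/(k−1) · (1 − Σ item variances / variance of total scores) for k items. A minimal sketch of that formula follows; the response matrix is invented toy data for illustration only and has nothing to do with the study's 408 respondents.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha; scores has one row per respondent, one column per item."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])  # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Toy data: 4 hypothetical respondents answering 3 items.
responses = [
    [1, 2, 1],
    [3, 3, 4],
    [4, 5, 4],
    [2, 2, 3],
]
alpha = cronbach_alpha(responses)
```

Values closer to 1 indicate that the items vary together, which is what "better internal consistency" refers to when two versions of a scale are compared.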


Studies on Electrostatic Propensity of Fabrics (직물대전성에 관한 연구)

  • 최병희;배도규
    • Journal of Sericultural and Entomological Science / v.27 no.2 / pp.54-63 / 1985
  • This study investigated how coating with a 0.5% acrylic polymer solution, which the author had previously developed to improve the anticrease property of silk, affects the electrostatic propensity of synthetic fabrics. The working conditions were: (A) The applied synthetic polymer was acrylic polymer 525, developed by the author. (B) Electrostatic voltage for the various fabrics was measured with Korea standard abrasion partners: Korea standard (KS K 0905) cotton, nylon, polyester and the sample fabric itself. (C) The fabrics were investigated using abrasion partners of Korea standard (KS K 0905) cotton, nylon, polyester and the sample fabric itself. (D) Electrostatic propensity was investigated using silk, nylon, polyester and acrylic fabrics as samples, separately before and after finishing. (E) Washing of the finished or original fabric was carried out by the Korea standard method, KS K 0465. The investigations yielded many interesting findings, and the results are as follows. 1. Electrostatic voltage of the finished silk, nylon and acrylic fabrics was higher than that of the original fabrics; polyester was the exception. (See Table 5) 2. Electrostatic voltage of the finished polyester against K.S. polyester decreased remarkably compared with the original fabric. 3. Although silk has no problem with electrostatic propensity, it showed high electrostatic voltage in abrasion against fabric of the same kind, because silk is very weak against abrasion and because the test method was developed to be useful only for synthetic fabrics. 4. Electrostatic voltage increased more with abrasion between fabrics of different kinds than between fabrics of the same kind. 5. Electrostatic voltage of each fabric increased with repeated washing. 6. Much of the data followed the contact electrification series principle; in other words, the farther apart the abraded fabrics were located on the series, the higher the electrostatic voltage. (See Fig. 6) 7. These results call for caution in mixed-fiber spinning as far as electrification is concerned. 8. They also call attention to the increase in electrification that may accompany any finishing of silk textiles.


Improving Bidirectional LSTM-CRF model Of Sequence Tagging by using Ontology knowledge based feature (온톨로지 지식 기반 특성치를 활용한 Bidirectional LSTM-CRF 모델의 시퀀스 태깅 성능 향상에 관한 연구)

  • Jin, Seunghee;Jang, Heewon;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.253-266 / 2018
  • This paper proposes a methodology that applies sequence tagging to improve the performance of NER (Named Entity Recognition) used in a QA system. To retrieve the correct answers stored in a database, the user's query must be converted into a database language such as SQL (Structured Query Language) so that the computer can interpret the user's language. This requires identifying the class or data name contained in the database. The existing method, which looks up the words of the query in the database to recognize objects, cannot identify homophones and word phrases because it does not consider the context of the user's query. If there are multiple search results, all of them are returned, so the query can have many interpretations and the time complexity of the calculation becomes large. To overcome these problems, this study reflects the contextual meaning of the query using a Bidirectional LSTM-CRF. We also address a weakness of neural network models, which cannot identify untrained words, by using an ontology knowledge based feature. Experiments were conducted on an ontology knowledge base for the music domain, and the performance was evaluated. To accurately evaluate the L-Bidirectional LSTM-CRF proposed in this study, we converted words in the learned queries into untrained words and tested whether words that were included in the database but not in the training data were correctly identified. As a result, the model could recognize objects in context and recognize untrained words without re-training the L-Bidirectional LSTM-CRF model, and overall object recognition performance was confirmed to improve.
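
One way to picture an ontology knowledge based feature of the kind the abstract above mentions is a per-token lookup whose result is appended to the token's embedding, so that a word unseen in training still carries a signal if it exists in the knowledge base. The abstract does not say how the authors build their feature, so the sketch below is only an assumption-laden illustration: the ontology entries, class names, and the `toy_embed` stand-in are all hypothetical placeholders.

```python
# Hypothetical ontology: surface form -> ontology class (placeholder entries).
ONTOLOGY = {
    "artist_x": "Artist",
    "song_y": "Track",
    "album_z": "Album",
}
CLASSES = ["Artist", "Track", "Album"]

def ontology_feature(token):
    """One-hot vector over ontology classes; all zeros if the token is not found."""
    cls = ONTOLOGY.get(token.lower())
    return [1.0 if c == cls else 0.0 for c in CLASSES]

def featurize(tokens, embed):
    """Concatenate each token's embedding with its ontology feature."""
    return [embed(t) + ontology_feature(t) for t in tokens]

def toy_embed(token):
    # Stand-in for a learned word embedding (here just the token length).
    return [float(len(token))]

rows = featurize(["play", "song_y", "by", "artist_x"], toy_embed)
```

In a tagger, rows like these would be fed to the sequence model in place of plain embeddings; adding a new entry to the ontology changes the feature without retraining the embedding.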

Implementation of Reporting Tool Supporting OLAP and Data Mining Analysis Using XMLA (XMLA를 사용한 OLAP과 데이타 마이닝 분석이 가능한 리포팅 툴의 구현)

  • Choe, Jee-Woong;Kim, Myung-Ho
    • Journal of KIISE:Computing Practices and Letters / v.15 no.3 / pp.154-166 / 2009
  • Database query and reporting tools, OLAP tools and data mining tools are typical front-end tools in a Business Intelligence (BI) environment, which supports gathering, consolidating and analyzing data produced by business operations and gives enterprise users access to the results. Traditional reporting tools have the advantage of creating sophisticated dynamic reports that include SQL query result sets, look like documents produced by word processors, and can be published to the Web, but their data sources are limited to RDBMS. OLAP tools and data mining tools, on the other hand, provide powerful information analysis functions, each in its own way, but their built-in visualization components for analysis results are limited to tables and a few charts. Thus, this paper presents a system that integrates the three typical front-end tools so that they complement one another in a BI environment. Traditional reporting tools only have a query editor for generating SQL statements to fetch data from an RDBMS. The reporting tool presented in this paper, however, can also extract data from OLAP and data mining servers, because editors for OLAP and data mining query requests are added to the tool. Traditional systems produce all documents on the server side. This structure lets reporting tools avoid generating a document repeatedly when many clients access the same dynamic document. However, because this system is intended for a small number of users who generate documents for data analysis, the tool generates documents on the client side. Therefore, the tool has a processing mechanism for handling a large amount of data despite the limited memory capacity of the report viewer on the client side. The reporting tool also has a data structure for integrating data from the three kinds of data sources into one document. Finally, most traditional front-end tools for BI depend on the data source architecture of a specific vendor.
To overcome this problem, the system uses XMLA, a web-service-based protocol, to access data sources for OLAP and data mining services from various vendors.

A Study on Food Service Franchise Location Factors and Quality of Service Factors, The Impact on Customer Satisfaction (외식 프랜차이즈 입지요건과 서비스 품질 요인이 고객만족에 미치는 영향)

  • Kim, Jo In Seog;Cho, Kyu Youn;An, Sang
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.11 no.5 / pp.77-90 / 2016
  • This study examines the importance of site selection and service quality in the franchise business, as the food service franchise has become one of the fastest-growing service industries today. The chief findings of this study are as follows. First, a survey of the locational and service quality factors affecting food service franchises shows that respondents are more concerned with the hygiene and visibility of the store than with proximity and transportation advantages, which showed low statistical significance; thus distance did not seem to be a big problem for the respondents, given that they mostly visit nearby food franchises. Second, the examination of the influence of service quality factors on customer satisfaction shows a significant positive relation for the customer response, speed and accuracy, and accuracy factors, which reveals that respondents prefer prompt response and swift judgment toward the customer's needs and expectations and professional knowledge services over the credibility factors, for which little correlation with customer satisfaction was found. Third, the examination of the influence of service quality factors and locational factors on revisits reveals that customer response and specialty showed a statistically significant correlation with the intention of WOM (Word of Mouth) and revisit, which suggests that swift judgment and response toward the customer's needs and expectations and professional knowledge services are of great importance to both customer satisfaction and revisits. Having studied how locational and service quality factors affect customer satisfaction in the franchise industry and investigated the influence of both factors, this research seeks, based on the results of the analysis, an optimal operation strategy for a franchise business.
Food service franchises are relatively competent in business administration and in their capability to react to changes in consumption thanks to the already established market, and stores are springing up everywhere, opened by founders who are too confident of their success in the franchise business. However, franchise beginners need to work out a zone-oriented, regular-customer-oriented business strategy rather than just complying with the head office manual. Owing to an increasing trend of opening medium to large sized stores and of investments in the wake of conversion to multiple-business-type Korean food franchises, there is a growing need to set up a new concept of store development and operational management strategy in order to overcome the excessive competition and limited sales volume of the old-fashioned small-sized, small-capital franchise stores. Furthermore, as most business categories of food service franchise serve very similar menus, from a product differentiation point of view it is necessary to map out a flexible sales concept, including the adoption of a competitive, low-price strategy. In conclusion, as the analysis shows, customers' optimal choices fluctuate with preferences such as convenience and circumstances rather than insisting on a specific brand; thus franchise stores will need to draw up aggressive strategies and plans for running food service franchises to maximize their profits.


Original expression of the creative children's picture-book (창작그림동화의 독창성 연구)

  • 안경환
    • Archives of design research / v.11 no.1 / pp.185-197 / 1998
  • The domestic publishing market has been ranked No. 7 in the world publishing market (statistics material from the Culture and Gymnastics Ministry). In particular, the publishing quantity of children's books is about to reach No. 3. Such publishing conditions show that the Korean publishing world is limited in kind and genre despite its quantitative improvement. On the other hand, foreign juvenile publishing takes a multi-publishing form, with simultaneous publication of dolls, audio materials, game programs and CD-ROM titles. Even animation is considered part of the publication at the planning stage. However, when we look at domestic conditions, we find that Korean juvenile publishing has been occupied mostly by study books. Also, cautious book selection by well-educated parents in the 1990s has brought change to the juvenile publishing world. The present condition has a problem, seen in the 190 translations among the picture books published in last year's children's books. Nevertheless, there was a successful domestically planned creative picture book last year: "Puppy's shit", which sold 15,000 copies and was a best seller among children's books. The commercial success of "Puppy's shit" shows that domestic work can hold a position in the publishing market. "Puppy's shit" is a story about valuable nature with Korean-styled illustration, which indicates the preference for Korean books in the domestic publishing market. This paper was carried out under the motto "Finding the prospects of the Korean creative children's book". By searching for the creative components of picture-book planning, such as theme, story, illustration, and editorial design, through the foreign picture book "What I want to know from the little mole is who made it on top of his head" and the domestic creative picture book "Puppy's shit", this study tried to address the following: the publication of Korean creative picture books in the world; a professional and more artistic inner fabric and originality (the relationship between story and illustration); improvement of illustration through a new formative language with well-expressed content; planning improvement of Korean creative picture books including literary, artistic and educative components; and finally, examples of planning a good book with a story and illustrations which can in the long run improve the value of life for children.


The Method for Real-time Complex Event Detection of Unstructured Big data (비정형 빅데이터의 실시간 복합 이벤트 탐지를 위한 기법)

  • Lee, Jun Heui;Baek, Sung Ha;Lee, Soon Jo;Bae, Hae Young
    • Spatial Information Research / v.20 no.5 / pp.99-109 / 2012
  • Recently, due to the growth of social media and the spread of smartphones, the amount of data has increased considerably through heavy use of SNS (Social Network Service). Accordingly, the concept of Big Data has emerged, and many researchers are seeking solutions to make the best use of big data. To maximize the creative value of the big data held by many companies, it must be combined with existing data. The physical and theoretical storage structures of data sources are so different that a system which can integrate and manage them is needed. To process big data, MapReduce was developed as a system whose advantage is fast data processing through distributed processing. However, it is difficult to build and store such a system for all keywords. Because of the storage and search steps, real-time processing is to some extent difficult. In addition, processing complex events incurs extra cost when there is no structure for processing different kinds of data. To solve this problem, an existing Complex Event Processing System can be used. A complex event processing system receives data from different sources and combines them, making possible complex event processing that is useful for real-time processing, especially of stream data. Nevertheless, unstructured data based on the text of SNS posts and internet articles is managed as text type, and strings must be compared every time a query is processed, which results in poor performance. Therefore, we try to make it possible to manage unstructured data and process queries quickly in a complex event processing system. We extend the data complex function to give a theoretical schema to strings. This is accomplished by converting string keywords into integer type with filtering that uses a keyword set.
In addition, by using the Complex Event Processing System and processing stream data in memory in real time, we try to reduce the time spent reading data back for query processing after it has been stored on disk.
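
The core idea in the abstract above, replacing string keywords with integers via a keyword set so that event queries compare integers rather than strings, can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's system: the keyword set, the whitespace tokenizer, and the function names `encode` and `matches` are hypothetical.

```python
# Hypothetical keyword set; each keyword gets a fixed integer ID.
KEYWORDS = ("earthquake", "flood", "fire")
KEYWORD_IDS = {word: i for i, word in enumerate(KEYWORDS)}

def encode(text):
    """Filter a text against the keyword set and return integer IDs in order."""
    return [KEYWORD_IDS[tok] for tok in text.lower().split() if tok in KEYWORD_IDS]

def matches(encoded, required_ids):
    """A complex-event condition: all required keyword IDs occur in the message."""
    return required_ids <= set(encoded)

message = "Fire and flood reported downtown"
ids = encode(message)
alert = matches(ids, {KEYWORD_IDS["fire"], KEYWORD_IDS["flood"]})
```

The string comparison happens once, when a message enters the stream; every later query over the same message works on small integers held in memory.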