Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.111-136 / 2018
  • In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple web sources in order to expand a knowledge base. The proposed methodology consists of the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news sources for queries separated into "subject-predicate" form, and classify the proper documents. 2) Determine whether each sentence is suitable for information extraction and derive a confidence. 3) Based on the predicate feature, extract the information from the proper sentences and derive the overall confidence of the information extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from the artificial intelligence speaker of SK-Telecom. The proposed model showed higher performance indices than the baseline model. The contribution of this study is that we developed a sequence tagging model based on bi-directional LSTM-CRF using the predicate feature of the query; with it, we obtained a robust model that maintains high recall even on the various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. The proposed methodology extracted information more effectively from various types of unstructured documents than the baseline model. Previous research has the limitation that performance is poor when extracting information from document types that differ from the training data. 
In addition, by predicting the suitability of documents and sentences for information extraction before the extraction step, this study can prevent unnecessary extraction attempts on documents that do not include the answer information. It is meaningful that we provide a method by which precision can be maintained even in an actual web environment. Information extraction for knowledge base expansion cannot guarantee that a document includes the correct answer, because it targets unstructured documents on the real web. When question answering is performed on the real web, previous machine reading comprehension studies have the limitation of low precision because they frequently attempt to extract an answer even from a document that contains no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain extraction performance even in a real web environment. The limitations of this study and future research directions are as follows. The first concerns data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open source KoNLPy Python package, and information extraction can fail when the morphological analysis is incorrect. To improve the information extraction results, it is necessary to develop a more advanced morpheme analyzer. The second is entity ambiguity. The information extraction system of this study cannot distinguish identical names that refer to different entities. If several people with the same name appear in the news, the system may not extract information about the person intended by the query. 
In future research, it is necessary to take measures to distinguish people with the same name. The third concerns the evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We developed the evaluation data set using 2,800 documents (400 questions * 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether a correct answer is included or not. To ensure the external validity of the study, it is desirable to use more queries to determine the performance of the system, but this is a costly activity that must be done manually. Future research needs to evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction from multi-source web documents, to build an environment in which results can be evaluated more objectively.

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data (스마트폰 다종 데이터를 활용한 딥러닝 기반의 사용자 동행 상태 인식)

  • Kim, Kilho;Choi, Sangwoo;Chae, Moon-jung;Park, Heewoong;Lee, Jaehong;Park, Jonghun
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.163-177 / 2019
  • As smartphones become widely used, human activity recognition (HAR) tasks that recognize the personal activities of smartphone users from multimodal data have been actively studied. The research area is expanding from recognizing the simple body movements of an individual user to recognizing low-level and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. In contrast, physical sensors, including accelerometer, magnetic field, and gyroscope sensors, are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, we propose a deep learning method for detecting accompanying status using only multimodal physical sensor data from the accelerometer, magnetic field, and gyroscope sensors. The accompanying status was defined as a redefinition of a part of the user interaction behavior, covering whether the user is accompanying an acquaintance at a close distance and whether the user is actively communicating with the acquaintance. We propose a framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks for classifying accompanying and conversation. First, we introduce a data preprocessing method consisting of time synchronization of the multimodal data from different physical sensors, data normalization, and sequence data generation. We applied nearest interpolation to synchronize the timestamps of the data collected from the different sensors. 
Normalization was performed for each x, y, z axis value of the sensor data, and the sequence data was generated using the sliding window method. The sequence data then became the input to the CNN, which extracts feature maps representing local dependencies of the original sequence. The CNN consisted of 3 convolutional layers and had no pooling layer, in order to preserve the temporal information of the sequence data. Next, the LSTM recurrent networks received the feature maps, learned long-term dependencies from them, and extracted features. The LSTM recurrent networks consisted of two layers, each with 128 cells. Finally, the extracted features were classified by a softmax classifier. The loss function of the model was the cross entropy function, and the weights of the model were randomly initialized from a normal distribution with a mean of 0 and a standard deviation of 0.1. The model was trained using the adaptive moment estimation (ADAM) optimization algorithm with a mini batch size of 128. We applied dropout to the input values of the LSTM recurrent networks to prevent overfitting. The initial learning rate was set to 0.001, and it decayed exponentially at a rate of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data. We collected smartphone data from a total of 18 subjects. Using the data, the model classified accompanying and conversation with 98.74% and 98.83% accuracy, respectively. Both the F1 score and the accuracy of the model were higher than those of the majority vote classifier, support vector machine, and deep recurrent neural network. In future research, we will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences. In addition, we will further study transfer learning methods that allow models fitted to the training data to be transferred to evaluation data that follows a different distribution. 
It is expected that a model capable of exhibiting robust recognition performance against changes in data that is not considered in the model learning stage will be obtained.
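
The preprocessing and learning-rate schedule described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which normalization was used (z-score is assumed here), the window length and stride are hypothetical, and the decay is read as multiplying the rate by 0.99 after each epoch. The function names are invented for this sketch and are not from the paper.

```python
import math


def normalize_axis(values):
    """Z-score normalize one sensor axis (assumed; the abstract only says 'normalization')."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance) or 1.0  # avoid division by zero on a constant signal
    return [(v - mean) / std for v in values]


def sliding_windows(samples, window, stride):
    """Cut a time-ordered sample list into fixed-length, possibly overlapping sequences."""
    return [samples[i:i + window]
            for i in range(0, len(samples) - window + 1, stride)]


def learning_rate(epoch, initial=0.001, decay=0.99):
    """Learning rate after `epoch` completed epochs, assuming a multiplicative 0.99 decay."""
    return initial * decay ** epoch


if __name__ == "__main__":
    x_axis = [0.1, 0.4, 0.2, 0.9, 0.5, 0.3]   # hypothetical accelerometer x readings
    normalized = normalize_axis(x_axis)
    for seq in sliding_windows(normalized, window=4, stride=1):
        print([round(v, 3) for v in seq])
    print(learning_rate(10))
```

Each axis would be normalized separately before the windows from the different sensors are stacked into the CNN input.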

Color-related Query Processing for Intelligent E-Commerce Search (지능형 검색엔진을 위한 색상 질의 처리 방안)

  • Hong, Jung A;Koo, Kyo Jung;Cha, Ji Won;Seo, Ah Jeong;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.109-125 / 2019
  • As interest in intelligent search engines increases, various studies have been conducted to extract and utilize product-related features intelligently. In particular, when users search for goods in e-commerce search engines, the 'color' of a product is an important feature that describes the product. Therefore, it is necessary to deal with the synonyms of color terms in order to produce accurate results for users' color-related queries. Previous studies have suggested a dictionary-based approach to processing synonyms for color features. However, the dictionary-based approach has the limitation that it cannot handle unregistered color-related terms in user queries. To overcome this limitation of the conventional methods, this research proposes a model that extracts RGB values from an internet search engine in real time and outputs similar color names based on the designated color information. First, a color term dictionary was constructed for basic color search; it includes color names and the R, G, B values of each color from the Korean color standard digital palette program and the Wikipedia color list. The dictionary was made more robust by adding 138 color names converted from English color names into Korean loanwords, with their corresponding RGB values. The final color dictionary therefore includes a total of 671 color names and corresponding RGB values. The method proposed in this research starts from the specific color that a user searched for. Then, the presence of the searched color in the built-in color dictionary is checked. If the color exists in the dictionary, its RGB values in the dictionary are used as the reference values of the retrieved color. If the searched color does not exist in the dictionary, the top-5 Google image search results for the searched color are crawled and average RGB values are extracted from a certain middle area of each image. 
To extract the RGB values from images, a variety of approaches were attempted, since simply averaging the RGB values of the center area of the images has its limits. As a result, clustering the RGB values in a certain area of the image and taking the average value of the cluster with the highest density as the reference values showed the best performance. The reference RGB values of the searched color are then compared with the RGB values of all the colors in the previously constructed color dictionary. A color list is created from the colors within the range of ±50 for each of the R, G, and B values. Finally, using the Euclidean distance between these candidates and the reference RGB values of the searched color, the color with the highest similarity among up to five colors becomes the final outcome. To evaluate the usefulness of the proposed method, we performed an experiment in which 300 color names and their corresponding RGB values were obtained through questionnaires. They were used to compare the RGB values obtained from four different methods, including the proposed method. The average Euclidean distance in CIE-Lab using our method was about 13.85, which is relatively low compared to 3088 for the case using the synonym dictionary only and 30.38 for the case using the dictionary with the Korean synonym website WordNet. The variant of the proposed method without clustering showed an average Euclidean distance of 13.88, which implies that the DBSCAN clustering of the proposed method can reduce the Euclidean distance. This research suggests a new color synonym processing method based on RGB values that combines the dictionary method with real-time synonym processing for new color names. This method removes the limitation of the dictionary-based approach, which is the conventional synonym processing method. 
This research can contribute to improving the intelligence of e-commerce search systems, especially their color search feature.
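
The dictionary lookup step described in this abstract (keep colors within ±50 on each of R, G, B, then rank by Euclidean distance and keep up to five) can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name, the tiny color dictionary, and the reference values are hypothetical, and the distance is computed directly in RGB space.

```python
import math


def nearest_colors(reference, dictionary, tolerance=50, top_k=5):
    """Return up to `top_k` color names closest to `reference` (an (R, G, B) tuple).

    Only colors whose R, G, and B values each lie within `tolerance` of the
    reference are considered; they are ranked by Euclidean distance in RGB.
    """
    candidates = [
        (name, rgb)
        for name, rgb in dictionary.items()
        if all(abs(c - r) <= tolerance for c, r in zip(rgb, reference))
    ]
    candidates.sort(key=lambda item: math.dist(item[1], reference))
    return [name for name, _ in candidates[:top_k]]


# Hypothetical miniature dictionary; the paper's dictionary has 671 entries.
COLORS = {
    "red": (255, 0, 0),
    "crimson": (220, 20, 60),
    "tomato": (255, 99, 71),
    "navy": (0, 0, 128),
}

if __name__ == "__main__":
    # Reference RGB as it might come from the dictionary or from image clustering.
    print(nearest_colors((230, 30, 50), COLORS))
```

The first name in the returned list corresponds to the "final outcome" in the abstract; an empty list means no dictionary color fell inside the ±50 box.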

The Effect of Price Discount Rate According to Brand Loyalty on Consumer's Acquisition Value and Transaction Value (브랜드애호도에 따른 가격할인율의 차이가 소비자의 획득가치와 거래가치에 미치는 영향)

  • Kim, Young-Ei;Kim, Jae-Yeong;Shin, Chang-Nag
    • Journal of Global Scholars of Marketing Science / v.17 no.4 / pp.247-269 / 2007
  • In recent years, one of the major reasons for the fierce competition amongst firms is that they strive to increase their own market shares and customer acquisition rates in the same market with similar and apparently undifferentiated products in terms of quality and perceived benefit. Because of this change in the recent marketing environment, differentiated after-sales service and diversified promotion strategies have become more important for gaining competitive advantage. Price promotion is the favorite strategy that most retailers use to achieve short-term sales increases, induce consumers' brand switching, introduce new products into the market, and so forth. However, if marketers apply or copy an identical price promotion strategy without considering the characteristic differences in product and consumer preference, it will cause serious problems because the discounted price itself could make people skeptical about product quality, and the changes in perceived value might appear differently depending on other factors such as consumer involvement or brand attitude. Previous studies showed that price promotion would certainly increase sales, and that the discounted price compared to the regular price would enhance the consumer's perceived values. On the other hand, the discounted price itself could make people depreciate or become skeptical about product quality, and reduce consumers' positivity bias because consumers might be unsure whether the current price promotion is the retailer's best price offer. Moreover, we cannot say that a discounted price absolutely enhances the consumer's perceived values regardless of product category and purchase situation. That is, the factors that affect consumers' value perceptions and buying behavior are so diverse in reality that the results of studies on the same dependent variable come out differently depending on what variable was used or how the experimental conditions were designed. 
The majority of previous studies on the effect of price-comparison advertising have used consumers' buying behavior as the dependent variable. In order to explain consumers' buying behavior theoretically, an analysis of the value perceptions that influence buying intentions is needed. In addition, previous studies did not combine independent variables such as brand loyalty and price discount rate. For this reason, this paper examines the moderating effect of brand loyalty on the relationship between different levels of discount rate and buyers' value perception. We also provide theoretical and managerial implications: marketers need to consider variables such as product attributes, brand loyalty, and consumer involvement at the same time, and then establish a differentiated pricing strategy case by case in order to enhance consumers' perceived values properly. Three research concepts were used in our study, and each was defined based on past research. The perceived acquisition value in this study was defined as the perceived net gains associated with the products or services acquired. That is, the perceived acquisition value of the product will be positively influenced by the benefits buyers believe they are getting by acquiring and using the product, and negatively influenced by the money given up to acquire the product. The perceived transaction value was defined as the perception of psychological satisfaction or pleasure obtained from taking advantage of the financial terms of the price deal. Lastly, brand loyalty was defined as a favorable attitude towards a purchased product. Thus, a consumer loyal to a brand has an emotional attachment to the brand or firm. Repeat purchasers continue to buy the same brand even though they do not have an emotional attachment to it. 
We assumed that if the degree of brand loyalty is high, the perceived acquisition value and the perceived transaction value will increase when higher discount rate is provided. But we found that there are no significant differences in values between two different discount rates as a result of empirical analysis. It means that price reduction did not affect consumer's brand choice significantly because the perceived sacrifice decreased only a little, and customers are satisfied with product's benefits when brand loyalty is high. From the result, we confirmed that consumers with high degree of brand loyalty to a specific product are less sensitive to price change. Thus, using price promotion strategy to merely expect sale increase is not recommendable. Instead of discounting price, marketers need to strengthen consumers' brand loyalty and maintain the skimming strategy. On the contrary, when the degree of brand loyalty is low, the perceived acquisition value and the perceived transaction value decreased significantly when higher discount rate is provided. Generally brands that are considered inferior might be able to draw attention away from the quality of the product by making consumers focus more on the sacrifice component of price. But considering the fact that consumers with low degree of brand loyalty are known to be unsatisfied with product's benefits and have relatively negative brand attitude, bigger price reduction offered in experiment condition of this paper made consumers depreciate product's quality and benefit more and more, and consumer's psychological perceived sacrifice increased while perceived values decreased accordingly. We infer that, in the case of inferior brand, a drastic price-cut or frequent price promotion may increase consumers' uncertainty about overall components of product. 
Therefore, it appears that reinforcing the augmented product, one of the levels that make up a product, such as after-sale service, delivery, and credit, would be more effective in reality. This would be better than competing with a product that holds high brand loyalty by reducing the sale price. Although this study examined the moderating effect of brand loyalty on the relationship between different levels of discount rate and buyers' value perception, it has several limitations. First, this study was conducted in controlled conditions where a high involvement product and two different levels of discount rate were applied. Given the presence of low involvement product, when both pieces of information are available, it is likely that the results we have reported here may have been different. Thus, the results of this research explain only the specific situation. Second, the sample selected in this study was university students in their twenties, so we cannot say that the results hold firmly for all generations. Future research that manipulates the level of discount along with consumer involvement might lead to a more robust understanding of the effects of various discount rates. In addition, we used a cellular phone as the product stimulus, so it would be very interesting to analyze the result when the product stimulus is an intangible product such as a service. It could also be valuable to analyze whether the change in perceived value affects consumers' final buying behavior positively or negatively.

Kim Eung-hwan's Official Excursion for Drawing Scenic Spots in 1788 and his Album of Complete Views of Seas and Mountains (1788년 김응환의 봉명사경과 《해악전도첩(海嶽全圖帖)》)

  • Oh, Dayun
    • MISULJARYO - National Museum of Korea Art Journal / v.96 / pp.54-88 / 2019
  • The Album of Complete Views of Seas and Mountains comprises sixty real scenery landscape paintings depicting Geumgangsan Mountain, the Haegeumgang River, and the eight scenic views of Gwandong regions, as well as fifty-one pieces of writing. It is a rare example in terms of its size and painting style. The paintings in this album, which are densely packed with natural features, follow the painting style of the Southern School yet employ crude and unconventional elements. In them, stones on the mountains are depicted both geometrically and three-dimensionally. Since 1973, parts of this album have been published in some exhibition catalogues. The entire album was opened to the public at the special exhibition "Through the Eyes of Joseon Painters: Real Scenery Landscapes of Korea" held at the National Museum of Korea in 2019. The Album of Complete Views of Seas and Mountains was attributed to Kim Eung-hwan (1742-1789) due to the signature on the final leaf of the album and the seal reading "Bokheon" (the painter's pen name) on the currently missing album leaf of Chilbodae Peaks. However, there is a strong possibility that this signature and seal may have been added later. This paper intends to reexamine the creator of this album based on a variety of related factors. In order to understand the production background of the Album of Complete Views of Seas and Mountains, I investigated the eighteenth-century tradition of drawing scenic spots while travelling, in which scenery was depicted during private travels or official excursions. Jeong Seon(1676-1759), Sim Sa-jeong(1707-1769), Kim Yun-gyeom(1711-1775), Choe Buk(1712-after 1786), and Kang Se-hwang(1713-1791) all went on a journey to Geumgangsan Mountain, the most famous travel destination in the late Joseon period, and created paintings of the mountain, including Album of Pungak Mountain in the Sinmyo Year(1711) by Jeong Seon. 
These painters presented their versions of the traditional scenic spots of Inner Geumgangsan and newly depicted vistas they discovered for themselves. To commemorate their private visits, they produced paintings for their fellow travelers or sponsors in an album format that could include several scenes. While the production of paintings of private travels to Geumgangsan Mountain increased, King Jeongjo(r. 1776-1800) ordered Kim Eung-hwan and Kim Hong-do, court painters at the Dohwaseo(Royal Bureau of Painting), to paint scenic spots in the nine counties of the Yeongdong region and around Geumgangsan Mountain. King Jeongjo selected these two as the painters for the official excursion taking into account their relationship, their administrative experience as regional officials, and their distinct painting styles. Starting in the reign of King Yeongjo(r. 1724-1776), Kim Eung-hwan and Kim Hong-do served as court painters at the Dohwaseo, maintained a close relationship as a senior and a junior and as colleagues, and served as chalbang(chief in charge of post stations) in the Yeongnam region. While Kim Hong-do was proficient at applying soft and delicate brushstrokes, Kim Eung-hwan was skilled at depicting the beauty of robust and luxuriant landscapes. Together, the two painters produced about 100 original drawings over the fifty days of the official excursion. Based on these original drawings, they created around seventy album leaves or handscrolls. Their paintings enriched the tradition of depicting scenic spots, particularly Outer Inner Geumgang and the eight scenic views of Gwandong around Geumgangsan Mountain during private journeys in the eighteenth century. Moreover, they newly discovered places of scenic beauty in the Outer Geumgang and Yeongdong regions, establishing them as new painting themes. The Album of Complete Views of Seas and Mountains consists of four volumes. 
Volumes I and II include twenty-nine paintings of Inner Geumgangsan; Volume III, seventeen scenes of Outer Geumgangsan; and Volume IV, fourteen images of Maritime Geumgangsan and the eight scenic views of Gwandong. These paintings, produced on silk, show crowded compositions, geometrical depictions of the stones and the mountains, and a distinct presentation of the rocky peaks of Geumgangsan Mountain using white and grayish-blue pigments. This album reflects the Joseon painting style of the mid- and late eighteenth century, integrating influences from Jeong Seon, Kang Se-hwang, Sim Sa-jeong, Jeong Chung-yeop(1725-after 1800), and Kim Hong-do. In particular, some paintings in the album show similarities to Kim Hong-do's Album of Famous Mountains in Korea in terms of their compositions and painterly motifs. However, "Yeongrangho Lake," "Haesanjeong Pavilion," and "Wolsongjeong Pavilion" in Kim Eung-hwan's album differ from those in the version by Kim Hong-do. Thus, Kim Eung-hwan was influenced by Kim Hong-do, but produced his own distinctive album. The Album of Complete Views of Seas and Mountains includes scenery of "Jaundam Pool," "Baegundae Peak," "Viewing Birobong Peak at Anmunjeom groove," and "Baekjeongbong Peak," none of which are depicted in other albums. In his version, Kim Eung-hwan portrayed the characteristics of the natural features in each scenic spot in a detailed and refreshing manner. Moreover, he illustrated stones on the mountains using geometric shapes and added a sense of three-dimensionality using lines and planes. Based on the painting traditions of the Southern School, he established his own characteristics. He also turned natural features into triangular or rectangular chunks. All sixty paintings in this album appear rough and unconventional, but maintain their internal consistency. Each of the fifty-one writings included in the Album of Complete Views of Seas and Mountains is followed by a painting of a scenic spot. 
It explains the depicted landscape, thus helping viewers to understand and appreciate the painting. Intimately linked to each painting, the related text notes information on traveling from one scenic spot to the next, the origins of the place names, geographic features, and other related information. Such encyclopedic documentation began in the early nineteenth century and was common in painting albums of Geumgangsan Mountain in the mid- nineteenth century. The text following the painting of Baekhwaam Hermitage in the Album of Complete Views of Seas and Mountains documents the reconstruction of the Baekhwaam Hermitage in 1845, which provides crucial evidence for dating the text. Therefore, the owner of the Album of Complete Views of Seas and Mountains might have written the texts or asked someone else to transcribe them in the mid- or late nineteenth century. In this paper, I have inferred the producer of the Album of Complete Views of Seas and Mountains to be Kim Eung-hwan based on the painting style and the tradition of drawing scenic spots during official trips. Moreover, its affinity with the Handscroll of Pungak Mountain created by Kim Ha-jong(1793-after 1878) after 1865 is another decisive factor in attributing the album to Kim Eung-hwan. In contrast to the Album of Famous Mountains in Korea by Kim Hong-do, the Album of Complete Views of Seas and Mountains exerted only a minor influence on other painters. The Handscroll of Pungak Mountain by Kim Ha-jong is the sole example that employs the subject matter from the Album of Complete Views of Seas and Mountains and follows its painting style. In the Handscroll of Pungak Mountain, Kim Ha-jong demonstrated a painting style completely different from that in the Album of Seas and Mountains that he produced fifty years prior in 1816 for Yi Gwang-mun, the magistrate of Chuncheon. 
He emphasized the idea of "scholar thoughts" by following the compositions, painterly elements, and depictions of figures in the painting manual style from Kim Eung-hwan's Album of Complete Views of Seas and Mountains. Kim Ha-jong, a member of the Gaeseong Kim clan and the eldest grandson of Kim Eung-hwan, is presumed to have appreciated the paintings depicted in the nature of Album of Complete Views of Seas and Mountains, which had been passed down within the family, and newly transformed them. Furthermore, the contents and narrative styles of Yi Yu-won's writings attached to the paintings in the Handscroll of Pungak Mountain are similar to those of the fifty-one writings in Kim Eung-hwan's album. This suggests a possible influence of the inscriptions in Kim Eung-hwan's album, or of the original texts from which these inscriptions were quoted, upon the writings in Kim Ha-jong's handscroll. However, a closer examination will be needed to determine the order in which the writings were transcribed. The Album of Complete Views of Seas and Mountains differs from Kim Hong-do's paintings of his official trips and from other painting albums he influenced. This album is a significant artwork in that it broadens the understanding of the art world of Kim Eung-hwan and illustrates another layer of real scenery landscape paintings in the late eighteenth century.