• Title/Summary/Keyword: 공간 빅데이터 (spatial big data)

Search Result 307, Processing Time 0.031 seconds

Smart Store in Smart City: The Development of Smart Trade Area Analysis System Based on Consumer Sentiments (Smart Store in Smart City: 소비자 감성기반 상권분석 시스템 개발)

  • Yoo, In-Jin;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.25-52 / 2018
  • This study performs social network analysis based on consumer sentiment related to locations in Seoul, using data reflecting consumers' web search activities and emotional evaluations associated with commerce. The study focuses on large commercial districts in Seoul. In addition, to consider their various aspects, social network indexes were combined with the trading area's public data to verify factors affecting the area's sales. The change in R-squared shows that the model has a fairly high R-squared value even when it includes only the district's public data, which is static data. However, the R-squared of the model that also includes the network indexes derived from the social network analysis improved considerably. A regression analysis of the trading area's public data showed that five of the twenty-two variables ('number of market district,' 'residential area per person,' 'satisfaction of residential environment,' 'rate of change of trade,' and 'survival rate over 3 years') have a significant influence on the sales of the trading area. According to the results, 'residential area per person' has the highest standardized beta value and therefore the strongest influence on commercial sales. In addition, 'residential area per person,' 'number of market district,' and 'survival rate over 3 years' were found to have positive effects on the sales of all trading areas. Thus, sales increase as the number of market districts in the trading area, the residential area per person, and the three-year survival rate of stores in the trading area increase. On the other hand, 'satisfaction of residential environment' and 'rate of change of trade' were found to have a negative effect on sales. In the case of 'satisfaction of residential environment,' sales increase when the satisfaction level is low.
Therefore, as consumer dissatisfaction with the residential environment increases, sales increase. The 'rate of change of trade' result shows that sales increase as the acceleration of transaction frequency decreases. According to the social network analysis, of the 25 regional trading areas in Seoul, Yangcheon-gu has the highest degree of connection. In other words, it has common sentiments with many other trading areas. On the other hand, Nowon-gu and Jungrang-gu have the lowest degree of connection. In other words, they have relatively distinct sentiments from other trading areas. The social network indexes used in the combined model are 'density of ego network,' 'degree centrality,' 'closeness centrality,' 'betweenness centrality,' and 'eigenvector centrality.' The combined model analysis confirmed that, among the social network indexes, degree centrality and eigenvector centrality have a significant influence on sales and the highest influence in the model. 'Degree centrality' has a negative effect on the sales of the districts. This implies that sales decrease when a trading area shares various sentiments with other trading areas, which conflicts with conventional wisdom. However, this result can be interpreted to mean that a trading area with low 'degree centrality' delivers unique and special sentiments to consumers. The findings of this study can also be interpreted to mean that sales can be increased if the trading area raises consumer recognition by forming a unique sentiment and city atmosphere that distinguish it from other trading areas. On the other hand, 'eigenvector centrality' has the greatest effect on sales in the combined model, and the results confirmed that this effect is positive. This finding shows that sales increase more when a trading area is connected to others with strong centrality than when it merely shares common sentiments with many others.
This study can be used as an empirical basis for establishing and implementing city and trading area strategy plans that consider consumers' desired sentiments. In addition, we expect it to inform entrepreneurs and potential entrepreneurs entering a trading area about the sentiments associated with that area and to offer direction for entering it in light of the district-sentiment structure.
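
The two network indexes the abstract reports as significant, degree centrality and eigenvector centrality, can be illustrated with a minimal sketch. It is not the authors' pipeline: the four-district graph, the labels A to D, and the function names are hypothetical, and the eigenvector score is computed by a plain power iteration.

```python
from math import sqrt

def degree_centrality(adj):
    """Share of the other nodes that each node is directly linked to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Power iteration on (A + I); the shift avoids oscillation and
    leaves the leading eigenvector of A unchanged."""
    score = {node: 1.0 for node in adj}
    for _ in range(iterations):
        new = {node: score[node] + sum(score[nbr] for nbr in adj[node])
               for node in adj}
        norm = sqrt(sum(v * v for v in new.values()))
        new = {node: v / norm for node, v in new.items()}
        converged = max(abs(new[k] - score[k]) for k in adj) < tol
        score = new
        if converged:
            break
    return score

# Hypothetical sentiment-similarity links between four unnamed districts
# (undirected graph stored as adjacency lists).
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B"],
    "D": ["A"],
}
dc = degree_centrality(adj)
ec = eigenvector_centrality(adj)
```

In this toy graph, district A is linked to every other district and so has the highest degree centrality, while D, linked only to A, scores lowest on both indexes. In the study these indexes were regressors for sales; the regression step is omitted here.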

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been utilized in various industries for practical applications. In the field of business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies. There has been continuous demand in various fields for market information at the level of specific products. However, such information has generally been provided at the industry level or for broad categories based on classification standards, making it difficult to obtain specific and appropriate information. In this regard, we propose a new methodology that can estimate the market sizes of product groups at more detailed levels than those previously offered. We applied the Word2Vec algorithm, a neural-network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows: First, the data related to product information is collected, refined, and restructured into a form suitable for applying the Word2Vec model. Next, the preprocessed data is embedded into vector space by Word2Vec, and the product groups are then derived by extracting similar product names based on cosine similarity. Finally, the sales data on the extracted products is summed to estimate the market size of the product groups. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training.
We performed parameter optimization for training and then applied a vector dimension of 300 and a window size of 15 as the optimized parameters for further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. The product names similar to KSIC index words were extracted based on cosine similarity. The market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items. The Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of market category can be easily and efficiently adjusted according to the purpose of information use by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application since it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be utilized in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis report publishing by private firms. The limitation of our study is that the presented model needs to be improved in terms of accuracy and reliability. The semantic-based word embedding module can be advanced by giving a proper order to the preprocessed dataset or by combining another algorithm, such as Jaccard similarity, with Word2Vec.
Also, the product group clustering method can be replaced with other types of unsupervised machine learning algorithms. Our group is currently working on subsequent studies, and we expect that they can further improve the performance of the basic model conceptually proposed in this study.
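
The last two steps of the process described above (extracting product names by cosine similarity, then summing their sales) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the 2-dimensional vectors stand in for trained 300-dimensional Word2Vec embeddings, and the product names, sales figures, threshold, and function names are all hypothetical.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def estimate_market_size(index_vec, product_vecs, sales, threshold):
    """Collect products whose embedding is close to the index word,
    then sum their sales as the market size of that product group."""
    members = [name for name, vec in product_vecs.items()
               if cosine(index_vec, vec) >= threshold]
    return members, sum(sales[name] for name in members)

# Hypothetical embeddings and sales; real vectors would come from training.
index_vec = (1.0, 0.0)  # stands in for a KSIC index word
product_vecs = {
    "led lamp": (0.9, 0.1),
    "led bulb": (0.8, 0.3),
    "rice cooker": (0.1, 0.9),
}
sales = {"led lamp": 120, "led bulb": 80, "rice cooker": 300}

members, market_size = estimate_market_size(index_vec, product_vecs, sales, 0.9)
narrow_members, narrow_size = estimate_market_size(index_vec, product_vecs, sales, 0.99)
```

Raising the threshold from 0.9 to 0.99 narrows the group from two products to one, which mirrors the abstract's point that the level of market category can be adjusted through the cosine similarity threshold.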

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has been highlighted, a lot of research on unstructured data has been conducted. Social media on the Internet generate unstructured or semi-structured data every second, often written in the natural languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, resulting in incorrect search results that are far from users' intentions. Even though a lot of progress has been made in recent years in enhancing the performance of search engines to provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method that automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, avoiding expensive sense-tagging processes. It evaluates the effectiveness of the method with the Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. For the experiment of this study, the Korean standard unabridged dictionary and the Sejong Corpus were tested both in combination and separately using cross validation. Only nouns, the target subjects in word sense disambiguation, were selected.
93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged based on sense indices defined by the dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, terms were extracted from the input sentences and term vectors for the sentences were created. Given the extracted term vector and the sense vector model built during the pre-processing stage, the sense-tagged terms were determined by vector space model based word sense disambiguation. In addition, this study shows the effectiveness of the corpus merged from examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiment shows that better precision and recall are obtained with the merged corpus. This study suggests that the method can practically enhance the performance of internet search engines and help us understand the meaning of a sentence more accurately in natural language processing pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. The Naïve Bayes classifier assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence.
Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
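
A minimal sketch of Naïve Bayes word sense disambiguation, assuming add-one (Laplace) smoothing and bag-of-words context features. It does not reproduce the paper's sense vectors or morphological analysis. The training examples, the two senses, and the function names are hypothetical; in the paper the examples would come from dictionary entries and the Sejong Corpus.

```python
from collections import Counter, defaultdict
from math import log

def train(examples):
    """examples: list of (sense, context_words) pairs for one ambiguous noun."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def disambiguate(context, model):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = log(count / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best, best_score = sense, score
    return best

# Hypothetical sense-tagged contexts for one noun with two senses.
examples = [
    ("fruit", ["sweet", "eat", "fruit"]),
    ("fruit", ["peel", "eat"]),
    ("ship", ["sail", "sea", "port"]),
    ("ship", ["sea", "captain"]),
]
model = train(examples)
sense_a = disambiguate(["eat", "sweet"], model)
sense_b = disambiguate(["sea", "port"], model)
```

Each context word contributes to the score independently of the others, which is the simplifying assumption the abstract describes as unrealistic but effective in practice.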

Application of Hot Spot Analysis for Interpreting Soil Heavy-Metal Concentration Data in Abandoned Mines (폐금속 광산의 토양 중금속 오염 조사 자료 해석을 위한 핫스팟 분석의 적용)

  • LEE, Chae-Young;KIM, Sung-Min;CHOI, Yo-Soon
    • Journal of the Korean Association of Geographic Information Studies / v.22 no.2 / pp.24-35 / 2019
  • In this study, a hotspot analysis was conducted to suggest a new method for interpreting soil heavy-metal contamination data of abandoned metal mines according to statistical significance level. The spatial autocorrelation of the data was analyzed using the Getis-Ord $G_i^{\ast}$ statistic in order to check whether soil heavy-metal contamination data showing abnormal values appeared concentrated or dispersed in a specific space. As a result, the statistically significant data showing abnormal values in the mine area could be classified as follows: (1) the contamination degree and the hotspot value (z-score) were both high, (2) the contamination degree was high but the z-score was low, (3) the contamination degree was low but the z-score was high, and (4) the contamination degree and the z-score were both low. The proposed method can be used to interpret soil heavy-metal contamination data according to the statistical significance level and to support rational decisions for soil contamination management in abandoned mines.
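
The Getis-Ord statistic and the four-way classification above can be sketched in a few lines. The sketch assumes the standard z-score form of the statistic with binary weights that include the sample itself. The six-point transect, the contamination cutoff of 5, and the 1.96 z-score cutoff are hypothetical choices for illustration, not values from the paper.

```python
from math import sqrt

def getis_ord_gi_star(values, weights):
    """Gi* z-score for every location; weights[i][j] is the spatial
    weight between locations i and j (weights[i][i] is included)."""
    n = len(values)
    mean = sum(values) / n
    s = sqrt(sum(x * x for x in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(wij * wij for wij in w)
        numerator = sum(wij * xj for wij, xj in zip(w, values)) - mean * sum_w
        denominator = s * sqrt((n * sum_w2 - sum_w * sum_w) / (n - 1))
        scores.append(numerator / denominator if denominator else 0.0)
    return scores

def classify(value, z, value_cutoff, z_cutoff=1.96):
    """Assign one of the four classes (cutoffs are assumptions)."""
    level = "high" if value >= value_cutoff else "low"
    hotspot = "high" if z >= z_cutoff else "low"
    return f"contamination {level}, z-score {hotspot}"

# Hypothetical concentrations along a transect of six sampling points.
values = [1, 1, 1, 1, 10, 10]
n = len(values)
# Binary weights: a point neighbours itself and the adjacent points.
weights = [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]

scores = getis_ord_gi_star(values, weights)
labels = [classify(v, z, value_cutoff=5) for v, z in zip(values, scores)]
```

In this toy transect the last point is both highly contaminated and in a significant hotspot (class 1). The fifth point has the same high concentration but a z-score below the cutoff (class 2), because one of its neighbours is clean.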

Implementation of the Unborrowed Book Recommendation System for Public Libraries: Based on Daegu D Library (공공도서관 미대출 도서 추천시스템 구현 : 대구 D도서관을 중심으로)

  • Jin, Min-Ha;Jeong, Seung-Yeon;Cho, Eun-Ji;Lee, Myoung-Hun;Kim, Keun-Wook
    • Journal of Digital Convergence / v.19 no.5 / pp.175-186 / 2021
  • The roles and functions of domestic public libraries are diversifying, but various problems have emerged due to internally biased book lending. In addition, with the 4th Industrial Revolution, public libraries have introduced book recommendation systems focusing on popular books, but the variety of books that users can access remains limited. Therefore, to enhance the satisfaction of public library users, this study implemented an unborrowed book recommendation system for public libraries, limiting its spatial scope to Duryu Library in Daegu City. The system was built from loan records (213,093 cases), user information (35,561 people), and other data, using methods such as cluster analysis, topic modeling, and a content-based filtering recommendation algorithm. A survey of actual users' satisfaction was then conducted to present the possibilities and implications of the unborrowed book recommendation system. As a result of the analysis, the majority of users responded with high satisfaction, and satisfaction was found to be relatively high in groups classified by specific gender, age, occupation, and usual reading habits. The results of this study are expected to help alleviate problems such as biased book lending and reduced operational efficiency of public libraries. Limitations of the study are also presented.

Study on Development of LED Camping Light Design Based on IOT and Emotional Lighting Contents (IOT 및 감성조명 콘텐츠 기반의 LED 캠핑등 디자인 개발에 관한 연구)

  • Kim, Hee-Jun
    • The Journal of the Korea Contents Association / v.18 no.12 / pp.332-342 / 2018
  • This study aims to present the technical choices involved in designing LED camping lights based on emotional lighting contents, integrating IOT and design, areas that play a central role in creative and knowledge-based industries, and the procedure for realizing them. 'i-Light,' a portable LED camping light, is a 'connected lighting' product that connects people, space and emotion, and a smart camping light based on IOT and emotional lighting contents. 'i-Light' has two functions. One is lighting, which adjusts color and color temperature naturally, and the other is safety, which detects harmful gases. 'i-Light' also has various emotional functions for experiencing interaction with and the feel of light. For this purpose, portable LED camping lights were designed first, and then a high color rendering, full-color lighting module, a smart sensor module and an IOT device platform were developed. In addition, efforts were made to establish detailed data about emotional lighting contents and to develop a Web application based on them. Finally, prototypes of the portable LED camping lights were made and submitted to related organizations for test-bench and usability evaluation. According to the results, all 12 developed emotional lighting contents and three IOT safety sensors were suitable, and the prototypes were satisfactory. This paper suggests a direction for the actual technical choices in developing contents and products that integrate artificial intelligence and big data, and for the procedure for realizing them.

Derivation of Green Infrastructure Planning Factors for Reducing Particulate Matter - Using Text Mining - (미세먼지 저감을 위한 그린인프라 계획요소 도출 - 텍스트 마이닝을 활용하여 -)

  • Seok, Youngsun;Song, Kihwan;Han, Hyojoo;Lee, Junga
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.5 / pp.79-96 / 2021
  • Green infrastructure planning represents landscape planning measures to reduce particulate matter. This study aimed to derive factors that may be used in planning green infrastructure for particulate matter reduction using text mining techniques. A range of analyses were carried out by focusing on keywords such as 'particulate matter reduction plan' and 'green infrastructure planning elements'. The analyses included Term Frequency-Inverse Document Frequency (TF-IDF) analysis, centrality analysis, related word analysis, and topic modeling analysis. These analyses were carried out via text mining by collecting information on previous related research, policy reports, and laws. First, TF-IDF analysis results were used to classify major keywords relating to particulate matter and green infrastructure into three groups: (1) environmental issues (e.g., particulate matter, environment, carbon, and atmosphere), (2) target spaces (e.g., urban, park, and local green space), and (3) application methods (e.g., analysis, planning, evaluation, development, ecological aspect, policy management, technology, and resilience). Second, the centrality analysis results were found to be similar to those of TF-IDF; it was confirmed that the central connectors to the major keywords were 'Green New Deal' and 'Vacant land'. The results from the analysis of related words verified that planning green infrastructure for particulate matter reduction required planning forests and ventilation corridors. Additionally, moisture must be considered for microclimate control. It was also confirmed that utilizing vacant space, establishing mixed forests, introducing particulate matter reduction technology, and understanding the system may be important for the effective planning of green infrastructure. Topic analysis was used to classify the planning elements of green infrastructure based on ecological, technological, and social functions.
The planning elements of ecological function were classified into morphological (e.g., urban forest, green space, wall greening) and functional aspects (e.g., climate control, carbon storage and absorption, provision of habitats, and biodiversity for wildlife). The planning elements of technical function were classified into various themes, including the disaster prevention functions of green infrastructure, buffer effects, stormwater management, water purification, and energy reduction. The planning elements of the social function were classified into themes such as community function, improving the health of users, and scenery improvement. These results suggest that green infrastructure planning for particulate matter reduction requires approaches related to key concepts, such as resilience and sustainability. In particular, there is a need to apply green infrastructure planning elements in order to reduce exposure to particulate matter.
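
The TF-IDF weighting named in the abstract can be sketched as below. The abstract does not state which TF-IDF variant the authors used, so this assumes one common form: relative term frequency multiplied by the natural log of the inverse document frequency. The three tokenized documents and the function name are hypothetical.

```python
from collections import Counter
from math import log

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        term_freq = Counter(doc)
        weights.append({
            term: (count / len(doc)) * log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights

# Hypothetical tokenized documents (e.g. papers, policy reports, laws).
docs = [
    ["particulate", "matter", "forest", "forest"],
    ["particulate", "matter", "policy"],
    ["particulate", "matter", "park"],
]
weights = tf_idf(docs)
top_term = max(weights[0], key=weights[0].get)
```

Terms that occur in every document, such as 'particulate' here, receive a weight of zero under this variant. Document-specific terms such as 'forest' rank highest, which is how TF-IDF surfaces distinguishing keywords.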

The Impact of O4O Selection Attributes on Customer Satisfaction and Loyalty: Focusing on the Case of Fresh Hema in China (O4O 선택속성이 고객만족도 및 고객충성도에 미치는 영향: 중국 허마셴셩 사례를 중심으로)

  • Cui, Chengguo;Yang, Sung-Byung
    • Knowledge Management Research / v.21 no.3 / pp.249-269 / 2020
  • Recently, as the online market has matured, it has faced many problems that hinder its growth. The most common problem is the homogenization of online products, which makes it difficult to attract additional customers. Moreover, although the share of the online market has increased significantly, it has now become essential to expand offline for further development. In response, many online firms have recently sought to expand their businesses and marketing channels by securing offline spaces that can compensate for the limitations of online platforms, on top of the existing advantages of their online channels. Based on their competitive advantage in analyzing large volumes of customer data using information technologies (e.g., big data and artificial intelligence), they are also reinforcing their offline influence through this online for offline (O4O) business model. On the other hand, most existing research has focused on the online to offline (O2O) business model, and there is still a lack of research on O4O business models, which have been actively attempted in various industrial fields in recent years. Since the few O4O-related studies have been conducted only in an experience marketing setting using a case study method, it is critical to conduct an empirical study on O4O selection attributes and their impact on customer satisfaction and loyalty. Therefore, focusing on China's representative O4O business model, 'Fresh Hema,' this study attempts to identify key selection attributes specific to O4O services from the customers' viewpoint and to examine the impact of these attributes on customer satisfaction and loyalty.
The results of structural equation modeling (SEM) with 300 customers who have experienced O4O (Fresh Hema) reveal that, out of seven O4O selection attributes, four (mobile app quality, mobile payment, product quality, and store facilities) have an impact on customer satisfaction, which in turn leads to customer loyalty (reuse intention, recommendation intention, and brand attachment). This study would help managers in the O4O area adapt well to rapidly changing customer needs and provide them with guidelines for enhancing both customer satisfaction and loyalty by allocating more resources to the more significant selection attributes rather than the less significant ones.

The Analysis of Urban Park Catchment Areas - Perspectives from Quality Service of Hangang Park - (한강공원의 질적 서비스와 이용자 영향권의 상관관계 분석)

  • Lee, Seo Hyo;Kim, Harry;Lee, Jae Ho
    • Journal of the Korean Institute of Landscape Architecture / v.49 no.6 / pp.27-36 / 2021
  • At a time when the equitable use of urban parks is gradually emerging as a social issue, this study was initiated to expand the influence of urban parks by improving the quality of park services, thereby resolving areas not covered by urban park services. This study targeted the Hangang Park in Seoul, where the qualitative service of parks shows the greatest difference. The relationship between the qualitative services of the park and the user's sphere of influence, which indicates the distribution of park users, was examined to assess the effect of improvements in the quality of service. As a research method, the top three districts and the bottom three districts were selected through the Han River Park user satisfaction survey conducted from 2017 to 2019, and a qualitative service evaluation was carried out. It was derived using the data acquired in September. Afterward, by performing a spatial autocorrelation analysis on the user's sphere of influence, additional verification of the user's sphere of influence was performed numerically and visually. As a result of the study, the user influence in the top three districts, with high-quality service, was stronger and wider than that of the bottom three districts. It was confirmed that the quality of service of the park affects the user influence. This shows that to realize park equity, it is necessary to improve the quality of services through continuous management and improvement of individual parks as well as the creation of new parks. This study is significant in that it recognizes the limitations of research on park services from a supplier's point of view and evaluates the qualitative services of parks from the perspective of actual park users. We propose an alternative for lowering the park deprivation index.

Keyword Network Visualization for Text Summarization and Comparative Analysis (문서 요약 및 비교분석을 위한 주제어 네트워크 가시화)

  • Kim, Kyeong-rim;Lee, Da-yeong;Cho, Hwan-Gue
    • Journal of KIISE / v.44 no.2 / pp.139-147 / 2017
  • Most of the information prevailing in the Internet space consists of textual information. One of the main topics in the large-scale document analysis required in the "big data" era is therefore the development of an automated understanding system for textual data; accordingly, automating keyword extraction for text summarization and abstraction is a typical research problem. But simply listing a few keywords is insufficient to reveal the complex semantic structures of general texts. In this paper, a text-visualization method is developed that constructs a graph by computing the degree of relation among keywords selected from the target text. Two construction models that provide the edge relation are proposed for computing the relation degree among keywords: the influence-interval model and the word-distance model. The graph finally visualized from the keyword-derived edge relation is more flexible and useful for displaying the meaning structure of the target text; furthermore, this abstract graph enables a fast and easy understanding of the target text. The authors' experiment showed that the proposed abstract-graph model is superior to the keyword list for attaining a semantic and comparative understanding of text.
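
The idea of deriving graph edges from how close keywords appear in a text can be sketched as follows. The abstract does not define the word-distance model, so this is only a hypothetical stand-in: an edge between two keywords is weighted by how often they occur within `max_distance` tokens of each other. The sample sentence, the keyword set, and the function name are assumptions.

```python
from collections import Counter

def keyword_graph(tokens, keywords, max_distance=3):
    """Return {(kw_a, kw_b): weight}, where weight counts how many times
    the two keywords occur within max_distance tokens of each other."""
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in keywords]
    edges = Counter()
    for a, (i, u) in enumerate(positions):
        for j, v in positions[a + 1:]:
            if j - i > max_distance:
                break  # positions are sorted, so later ones are farther
            if u != v:
                edges[tuple(sorted((u, v)))] += 1
    return edges

# Hypothetical target text and pre-selected keywords.
tokens = ("big data needs text summarization and text "
          "visualization of big data").split()
keywords = {"data", "text", "summarization", "visualization"}
edges = keyword_graph(tokens, keywords)
```

The resulting weighted edge list can be passed to any graph layout tool. Keyword pairs that repeatedly appear close together, such as 'summarization' and 'text' here, get heavier edges than pairs that co-occur once.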