• 제목/요약/키워드: Web-based data model

Search Result 801, Processing Time 0.031 seconds

Smart Store in Smart City: The Development of Smart Trade Area Analysis System Based on Consumer Sentiments (Smart Store in Smart City: 소비자 감성기반 상권분석 시스템 개발)

  • Yoo, In-Jin;Seo, Bong-Goon;Park, Do-Hyung
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.1
    • /
    • pp.25-52
    • /
    • 2018
  • This study performs social network analysis based on consumer sentiment related to a location in Seoul using data reflecting consumers' web search activities and emotional evaluations associated with commerce. The study focuses on large commercial districts in Seoul. In addition, to consider their various aspects, social network indexes were combined with the trading area's public data to verify factors affecting the area's sales. According to R square's change, We can see that the model has a little high R square value even though it includes only the district's public data represented by static data. However, the present study confirmed that the R square of the model combined with the network index derived from the social network analysis was even improved much more. A regression analysis of the trading area's public data showed that the five factors of 'number of market district,' 'residential area per person,' 'satisfaction of residential environment,' 'rate of change of trade,' and 'survival rate over 3 years' among twenty two variables. The study confirmed a significant influence on the sales of the trading area. According to the results, 'residential area per person' has the highest standardized beta value. Therefore, 'residential area per person' has the strongest influence on commercial sales. In addition, 'residential area per person,' 'number of market district,' and 'survival rate over 3 years' were found to have positive effects on the sales of all trading area. Thus, as the number of market districts in the trading area increases, residential area per person increases, and as the survival rate over 3 years of each store in the trading area increases, sales increase. On the other hand, 'satisfaction of residential environment' and 'rate of change of trade' were found to have a negative effect on sales. In the case of 'satisfaction of residential environment,' sales increase when the satisfaction level is low. Therefore, as consumer dissatisfaction with the residential environment increases, sales increase. The 'rate of change of trade' shows that sales increase with the decreasing acceleration of transaction frequency. According to the social network analysis, of the 25 regional trading areas in Seoul, Yangcheon-gu has the highest degree of connection. In other words, it has common sentiments with many other trading areas. On the other hand, Nowon-gu and Jungrang-gu have the lowest degree of connection. In other words, they have relatively distinct sentiments from other trading areas. The social network indexes used in the combination model are 'density of ego network,' 'degree centrality,' 'closeness centrality,' 'betweenness centrality,' and 'eigenvector centrality.' The combined model analysis confirmed that the degree centrality and eigenvector centrality of the social network index have a significant influence on sales and the highest influence in the model. 'Degree centrality' has a negative effect on the sales of the districts. This implies that sales decrease when holding various sentiments of other trading area, which conflicts with general social myths. However, this result can be interpreted to mean that if a trading area has low 'degree centrality,' it delivers unique and special sentiments to consumers. The findings of this study can also be interpreted to mean that sales can be increased if the trading area increases consumer recognition by forming a unique sentiment and city atmosphere that distinguish it from other trading areas. On the other hand, 'eigenvector centrality' has the greatest effect on sales in the combined model. In addition, the results confirmed a positive effect on sales. This finding shows that sales increase when a trading area is connected to others with stronger centrality than when it has common sentiments with others. This study can be used as an empirical basis for establishing and implementing a city and trading area strategy plan considering consumers' desired sentiments. In addition, we expect to provide entrepreneurs and potential entrepreneurs entering the trading area with sentiments possessed by those in the trading area and directions into the trading area considering the district-sentiment structure.

A Study on the Model of Appraisal and Acquisition for Digital Documentary Heritage : Focused on 'Whole-of-Society Approach' in Canada (디지털기록유산 평가·수집 모형에 대한 연구 캐나다 'Whole-of-Society 접근법'을 중심으로)

  • Pak, Ji-Ae;Yim, Jin Hee
    • The Korean Journal of Archival Studies
    • /
    • no.44
    • /
    • pp.51-99
    • /
    • 2015
  • The purpose of the archival appraisal has gradually changed from the selection of records to the documentation of the society. In particular, the qualitative and quantitative developments of the current digital technology and web have become the driving force that enables semantic acquisition, rather than physical one. Under these circumstances, the concept of 'documentary heritage' has been re-established internationally, led by UNESCO. Library and Archives Canada (LAC) reflects this trend. LAC has been trying to develop a new appraisal model and an acquisition model at the same time to revive the spirit of total archives, which is the 'Whole-of-society approach'. Features of this approach can be summarized in three main points. First, it is for documentary heritage and the acquisition refers to semantic acquisition, not the physical one. And because the object of management is documentary heritage, the cooperation between documentary heritage institutions has to be a prerequisite condition. Lastly, it cannot only documenting what already happened, it can documenting what is happening in the current society. 'Whole-of-society approach', as an appraisal method, is a way to identify social components based on social theories. The approach, as an acquisition method, is targeting digital recording, which includes 'digitized' heritage and 'born-digital' heritage. And it makes possible to the semantic acquisition of documentary heritage based on the data linking by mapping identified social components as metadata component and establishing them into linked open data. This study pointed out that it is hard to realize documentation of the society based on domestic appraisal system since the purpose is limited to selection. To overcome this limitation, we suggest a guideline applied with 'Whole-of-society approach'.

A Study on Spam Document Classification Method using Characteristics of Keyword Repetition (단어 반복 특징을 이용한 스팸 문서 분류 방법에 관한 연구)

  • Lee, Seong-Jin;Baik, Jong-Bum;Han, Chung-Seok;Lee, Soo-Won
    • The KIPS Transactions:PartB
    • /
    • v.18B no.5
    • /
    • pp.315-324
    • /
    • 2011
  • In Web environment, a flood of spam causes serious social problems such as personal information leak, monetary loss from fishing and distribution of harmful contents. Moreover, types and techniques of spam distribution which must be controlled are varying as days go by. The learning based spam classification method using Bag-of-Words model is the most widely used method until now. However, this method is vulnerable to anti-spam avoidance techniques, which recent spams commonly have, because it classifies spam documents utilizing only keyword occurrence information from classification model training process. In this paper, we propose a spam document detection method using a characteristic of repeating words occurring in spam documents as a solution of anti-spam avoidance techniques. Recently, most spam documents have a trend of repeating key phrases that are designed to spread, and this trend can be used as a measure in classifying spam documents. In this paper, we define six variables, which represent a characteristic of word repetition, and use those variables as a feature set for constructing a classification model. The effectiveness of proposed method is evaluated by an experiment with blog posts and E-mail data. The result of experiment shows that the proposed method outperforms other approaches.

A Study on Consumer Acceptance toward the Commercialized Smart Clothing (상용화된 스마트 의류제품에 대한 소비자의 수용도)

  • Chae, Jin-Mie;Cho, Hyun-Seung;Lee, Joo-Hyeon
    • Science of Emotion and Sensibility
    • /
    • v.12 no.2
    • /
    • pp.181-192
    • /
    • 2009
  • The purpose of this study was to explain the attitudes and acceptance toward the smart clothing adopted by the TAM(Technology Acceptance Model). A theoretical model on consumer acceptance of the commercialized smart clothing was constructed on the basis of previous studies and a review of literature and tested the path analysis. For the data analysis, statistical methods of confirmatory factor analysis, correlation, and covariance structural analysis using AMOS 5.0 package were employed. The results from a web-based survey of 557 online consumers indicate that the TAM by Davis(1989) adopted by perceived value according to market price was able to explain consumer acceptance of smart clothing substantially well. Using structural equation modeling, eight of the six hypotheses investigated in this study were supported. The results of the testing of the hypothesis were as fellows: First, perceived ease of use had a side effect on attitude through perceived usefulness. In addition, the attitude was found to affect consumer acceptance. Second, perceived usefulness was found to affect consumer attitude on perceived value, however, had not significant effect on consumer acceptance toward smart clothing. Third, perceived value was found to affect both consumer acceptance and attitude toward smart clothing. Finally, the attitude toward smart clothing had a direct effect on the consumer acceptance.

  • PDF

Blooming Time of Tilia amurensis Rupr. in Mountainous Area and Prediction of its Blooming Progress Using Growing Degree Day Model (산악 지역에서의 피나무(Tilia amurensis Rupr.) 개화시기와 성장온일도를 이용한 개화 진행 예측)

  • Kim, Min-Jung;Son, Minwong;Lee, Juhyeok;Jung, Chuleui
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.24 no.1
    • /
    • pp.1-12
    • /
    • 2022
  • Tilia amurensis is an important honey plant. As T. amurensis mainly distributes mountainous area with various elevations in Korea, accurate prediction of blooming time at the different elevation would benefit forest beekeepers. In this study, we measured time-dependent blooming progress of T. amurensis in Mt. Gariwang area ranging from 500-1500m. Additionally we collected blooming data from web and published literatures and estimated the variation of blooming time relative to the geographic locations. Flowers began to bloom from July 6 to July 22 with full blooming on July 14 in location where elevation is 638m in Mt. G ariwang area in 2021. Based on these databases, a growing degree day (G DD) model was developed for prediction of T. amurensis blooming progress using average daily temperatures. Using the starting date of G DD accumulation of January 1 and base temperature of 5 ℃, blooming period ranging from 10% to 90% of cumulative blooming rate was estimated as 860-1198 degree days (DD). This corresponded to the beginning to the end of July in Mt. Gariwaning area in 2021. This model could explain the phenological variations of T. amurensis flower blooming possibly affected by elevation within geographic area, latitude or year relative to the climate change, and aid forest beekeepers for better timing of nectar foraging by honey bees.

Analyzing Changes in Consumers' Interest Areas Related to Skin under the Pandemic: Focusing on Structural Topic Modeling (팬데믹에 따른 소비자의 피부 관련 관심 영역 변화 분석: 구조적 토픽모델링을 중심으로)

  • Nakyung Kim;Jiwon Park;HyungBin Moon
    • Knowledge Management Research
    • /
    • v.25 no.1
    • /
    • pp.173-192
    • /
    • 2024
  • This study aims to understand the changes in the beauty industry due to the pandemic from the consumer's perspective based on consumers' opinions about their skin online before and after the pandemic. Furthermore, this study tries to derive strategies for companies and governments to support sustainable growth and innovation in the beauty industry. To this end, posts on social media from 2017 to 2022 that contained the keyword 'skin concerns' are collected, and after data preprocessing, 96,908 posts are used for the structural topic model. To examine whether consumers' interest areas related to skin change according to the pandemic situation, the analysis period is divided into 7 periods, and the variables that distinguish each stage are used as meta-variables for the structural topic model. As a result, it is found that consumers' interests can be divided into 22 topics, which can be categorized into four main categories: beauty manufacturing, beauty services, skin concerns, and other. The results of this study are expected to be utilized in construction of product development and marketing strategies of related companies and the establishment of economic support policies by the government in response to changes in demand in the beauty industry due to the pandemic.

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji;Kim, Yoosin;Kim, Namgyu;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.1
    • /
    • pp.95-110
    • /
    • 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news repots. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of new content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies including ours have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have common limitations in that they do not consider the flexibility of sentiment polarity, that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we have stated that sentiment polarity should be assigned, not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scrapping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies sentiment polarity of the news, and finally predicts the direction of the next day's stock index. In addition, we applied a domain-specific sentiment dictionary instead of a general purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.

Ontology Design for the Register of Officials(先生案) of the Joseon Period (조선시대 선생안 온톨로지 설계)

  • Kim, Sa-hyun
    • (The)Study of the Eastern Classic
    • /
    • no.69
    • /
    • pp.115-146
    • /
    • 2017
  • This paper is about the research on ontology design for a digital archive of seonsaengan(先生案) of the Joseon Period. Seonsaengan is the register of staff officials at each government office, along with their personal information and records of their transfer from one office to another, in addition to their DOBs, family clan, etc. A total of 176 types of registers are known to be kept at libraries and museums in the country. This paper intends to engage in the ontology design of 47 cases of such registers preserved at the Jangseogak Archives of the Academy of Korean Studies (AKS) with a focus on their content and structure including the names of the relevant government offices and posts assumed by the officials, etc. The work for the ontology design was done with a focus on the officials, the offices they belong to, and records about their transfers kept in the registers. The ontology design categorized relevant resources into classes according to the attributes common to the individuals. Each individual has defined a semantic postposition word that can explicitly express the relationship with other individuals. As for the classes, they were divided into eight categories, i.e. registers, figures, offices, official posts, state examination, records, and concepts. For design of relationships and attributes, terms and phrases such as Dublin Core, Europeana Data Mode, CIDOC-CRM, data model for database of those who passed the exam in the past, which are already designed and used, were referred to. Where terms and phrases designed in existing data models are used, the work used Namespace of the relevant data model. The writer defined the relationships where necessary. The designed ontology shows an exemplary implementation of the Myeongneung seonsaengan(明陵先生案). The work gave consideration to expected effects of information entered when a single registered is expanded to plural registers, along with ways to use it. The ontology design is not one made based on the review of all of the 176 registers. The model needs to be improved each time relevant information is obtained. The aim of such efforts is the systematic arrangement of information contained in the registers. It should be remembered that information arranged in this manner may be rearranged with the aid of databases or archives existing currently or to be built in the future. It is expected that the pieces of information entered through the ontology design will be used as data showing how government offices were operated and what their personnel system was like, along with politics, economy, society, and culture of the Joseon Period, in linkage with databases already established.

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services
    • /
    • v.14 no.6
    • /
    • pp.71-84
    • /
    • 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. Furthermore, because the HDFS (Hadoop Distributed File System) stores data by generating copies of the block units of the aggregated log data, the proposed system offers automatic restore functions for the system to continually operate after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based Mongo DB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as the MySQL databases have complex schemas that are inappropriate for processing unstructured log data. Further, strict schemas like those of relational databases cannot expand nodes in the case wherein the stored data are distributed to various nodes when the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide but can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. The data models of the NoSQL are usually classified as Key-Value, column-oriented, and document-oriented types. Of these, the representative document-oriented data model, MongoDB, which has a free schema structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module. When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects and classifies data according to the type of log data and distributes it to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require a real-time log data analysis are stored in the MySQL module and provided real-time by the log graph generator module. The aggregated log data per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are parallel-distributed and processed by the Hadoop-based analysis module. A comparative evaluation is carried out against a log data processing system that uses only MySQL for inserting log data and estimating query performance; this evaluation proves the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.

Development of Mobile Application for Ship Officers' Job Stress Measurement and Management (해기사 직무스트레스 측정 및 관리 모바일 애플리케이션 개발)

  • Yang, Dong-Bok;Kim, Joo-Sung;Kim, Deug-Bong
    • Journal of the Korean Society of Marine Environment & Safety
    • /
    • v.27 no.2
    • /
    • pp.266-274
    • /
    • 2021
  • Ship officers are subject to excessive job stress, which has negative physical and psychological impacts and may adversely affect the smooth supply and demand of human resources. In this study, a mobile web application was developed as a tool for systematic job stress measurement and management of officers and verified through quality evaluation. Requirement analysis was performed by ship officers and staff in charge of human resources of shipping companies, and the results were reflected in the application configuration step. The application was designed according to the waterfall model, which is a traditional software development method, and functions were implemented using JSP and Spring Framework. Performance evaluation on the user interface, confirmed that proper input and output results were implemented, and the respondent results and the database were configured in the administrator interface. The results of evaluation questionnaires for quality evaluation of the interface based on ISO/IEC 9126-2 metric were significant 4.60 for the user interface and 4.65 for the administrator interface in a 5-point scale. In the future, it is necessary to conduct follow-up research on the development of data analysis system through utilization of the collected big-data sets.