• Title/Summary/Keyword: Internet models

Automatic Target Recognition Study using Knowledge Graph and Deep Learning Models for Text and Image data (지식 그래프와 딥러닝 모델 기반 텍스트와 이미지 데이터를 활용한 자동 표적 인식 방법 연구)

  • Kim, Jongmo;Lee, Jeongbin;Jeon, Hocheol;Sohn, Mye
    • Journal of Internet Computing and Services / v.23 no.5 / pp.145-154 / 2022
  • Automatic Target Recognition (ATR) technology is emerging as a core technology of Future Combat Systems (FCS). Conventional ATR is performed on IMINT (image information) collected from SAR sensors, and various image-based deep learning models are used. However, even though the development of IT and sensing technology has expanded ATR-related data to HUMINT (human information) and SIGINT (signal information), ATR still uses only image-oriented IMINT data. In complex and diversified battlefield situations, it is difficult to guarantee high ATR accuracy and generalization performance with image data alone. Therefore, in this paper we propose a knowledge graph-based ATR method that can use image and text data simultaneously. The main idea of the method is to convert the ATR image and text into graphs according to the characteristics of each data type, align them to the knowledge graph, and connect the heterogeneous ATR data through the knowledge graph. To convert an ATR image into a graph, an object-tag graph with object tags as nodes is generated from the image by using a pre-trained image object recognition model and the vocabulary of the knowledge graph. For ATR text, a pre-trained language model, TF-IDF, a co-occurrence word graph, and the vocabulary of the knowledge graph are used to generate a word graph whose nodes are key vocabulary for ATR. The two types of generated graphs are connected to the knowledge graph using an entity alignment model to improve ATR performance on images and texts. To demonstrate the superiority of the proposed method, 227 web documents and 61,714 RDF triples from DBpedia were collected, and comparison experiments were performed on precision, recall, and F1-score from the perspective of entity alignment.
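
    The word-graph and evaluation steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sentences, the vocabulary, the `dbr:` identifiers, and both function names are hypothetical, sentence-level co-occurrence stands in for whatever windowing the paper uses, and the pre-trained models and TF-IDF weighting are omitted.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(sentences, vocabulary):
    """Count how often two vocabulary words appear in the same sentence."""
    edges = Counter()
    for sentence in sentences:
        # Keep only words found in the (hypothetical) knowledge-graph vocabulary.
        words = sorted({w for w in sentence.lower().split() if w in vocabulary})
        for a, b in combinations(words, 2):
            edges[(a, b)] += 1
    return edges


def precision_recall_f1(predicted, gold):
    """Score predicted entity-alignment pairs against gold pairs."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example data, for illustration only.
sentences = [
    "the tank fired near the bridge",
    "a tank crossed the bridge",
    "radar tracked the tank",
]
vocabulary = {"tank", "bridge", "radar"}
graph = cooccurrence_graph(sentences, vocabulary)

predicted = [("tank", "dbr:Tank"), ("bridge", "dbr:Bridge"), ("radar", "dbr:Sonar")]
gold = [("tank", "dbr:Tank"), ("bridge", "dbr:Bridge"), ("radar", "dbr:Radar")]
scores = precision_recall_f1(predicted, gold)
```

    Here two of the three predicted pairs match the gold alignment, so precision, recall, and F1 all equal 2/3.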

A Study on Establishment of Safety Training Center Based on Virtual Reality and Augmented Reality Technology for Military Safety and Suicide Accident Prevention (가상현실(VR/AR) 기술 기반으로 군 안전 및 자살사고 예방을 위한 안전체험훈련장 구축 방안에 관한 연구)

  • Choi, Sung-oh;Min, Yong-sik;Kim, Sung-Il;Ghoi, Jong-geun
    • Journal of Internet Computing and Services / v.21 no.2 / pp.139-148 / 2020
  • Owing to changes in circumstances since the 2000s, such as a severe decline in the birthrate and a shortened military service period, the armed forces of the Republic of Korea are shifting their focus from manpower to technology and equipment and are becoming high-tech, high-speed, and complex, resulting in an environment in which a single mistake could cause mass casualties. Considering the safety training curricula and achievements of advanced countries and private education, it is also evident that hands-on training is essential for preventing suicides and accidents in the military, and that establishing safety training centers is crucial for systematic and effective hands-on training. Soldiers now joining the army have used the Internet since birth and are comfortable with both virtual and augmented reality (VR/AR), and science and technology have advanced to the point where most public safety experience centers can be replaced by VR/AR. Therefore, considering installation space, construction costs, maintenance costs, user characteristics, and educational effects, we strongly recommend establishing VR/AR-based military safety training facilities, except for training where real models and objects are more effective, such as first aid. We derived the need for hands-on training by considering the development of VR/AR, the operating status of public safety experience centers, the characteristics of military units, and installation and maintenance costs, and we proposed a plan to establish safety training centers that can deliver effective training at a lower cost than public safety experience centers.
In addition, we suggested the required scale of the safety training center and the composition of its experience rooms, considering the number of trainees and the environment of each military unit. Building safety training centers on the basis of this analysis is expected to contribute to military safety and to the prevention of suicide.

Nutrition Education Performance of Elementary School Dietitians in North Gyeonggi Province (경기 북부 지역 초등학교 영양사의 영양 교육 실시 현황)

  • Min Kyung-Chan;Park Young-Sim;Park Hae-Won;Lee Myung-Ho;Shin Yong-Chill;Cho Kyu-Bong;Rhie Kyoung-Ik;Jeaung Koang-Ock;Shin Yim-Sook;Yoon Hee-Sun
    • The Korean Journal of Food And Nutrition / v.19 no.2 / pp.183-192 / 2006
  • The purpose of this study was to investigate the performance of elementary school dietitians in terms of nutrition education in the northern portion of Gyeonggi province. Self-administered questionnaires were given to 50 dietitians who worked in elementary schools with self-operated food service, and 35 (70%) dietitians returned the questionnaires. The results are summarized as follows: no students took part in nutrition education as a regular course, but all dietitians performed nutrition education in passive ways, such as 'using home correspondence'(39.0%), 'bulletin board/poster'(22.0%), 'using the internet'(13.4%) and 'indirectly through a classroom teacher'(12.2%). Most respondents performed nutrition education 'one time/month'(66.0%) or 'one time/week'(20.0%). The respondents thought that suitable teaching times for nutrition education were 'during a related subject'(35.5%) and 'during lunch time'(22.6%) rather than 'during an independent subject'(16.1%). Most of the dietitians(94.3%) did not perform nutrition counseling because of 'a lack of opportunity'(72.7%) and 'workload'(27.3%). Additionally, 88.6% of respondents did not have time for nutrition counseling for parents, citing 'am not a teacher'(56.7%) and 'workload'(30.0%). Information sources for nutrition education were mainly 'internet'(71.4%) and 're-educational materials'(17.1%). They possessed instructional materials in the forms of 'printed materials'(35.1%), 'exhibition/bulletin board'(31.2%), and 'electrical materials'(33.8%), but did not have 'solid materials' such as food models and dolls. Generally they had mostly 'leaflets'(82.9%), 'bulletins'(68.6%), 'internet'(57.1%), and 'CDs'(57.1%). Preferences for instructional materials used were 'printed materials'(46.2%), 'exhibition/bulletin board'(36.5%), and 'electrical materials'(17.3%). 'Leaflets'(80.0%) were mainly used; 'CD'(17.1%) use was low compared to the proportion possessing CDs. 
The topics frequently chosen by the subjects for nutrition education were 'table manners'(82.9%), 'basic concepts of food and nutrition'(80.0%), and 'proper food habits'(80.0%), but the topics helpful for practical use, such as 'how much do I eat'(20.0%) and 'nutrition labeling'(37.1%), were not included frequently. The respondents thought that 'eating only what they like'(60.0%), 'intake of processed foods'(17.8%), and 'obesity'(17.8%) were the most common nutritional problems among elementary school children. They also thought that establishing a regular course for nutrition education was an effective way to cut down on these nutritional problems. In conclusion, nutrition education programs that are combined with effective instructional materials and practical topics should be developed. Additionally, it is recommended that dietitians act as teachers who participate in regular courses as soon as possible.

A Study for Factors Influencing the Usage Increase and Decrease of Mobile Data Service: Based on The Two Factor Theory (모바일 데이터 서비스 사용량 증감에 영향을 미치는 요인들에 관한 연구: 이요인 이론(Two Factor Theory)을 바탕으로)

  • Lee, Sang-Hoon;Kim, Il-Kyung;Lee, Ho-Geun;Park, Hyun-Jee
    • Asia pacific journal of information systems / v.17 no.2 / pp.97-122 / 2007
  • Conventional networking and telecommunications infrastructure, characterized by wires, fixed locations, and inflexibility, is giving way to mobile technologies. Numerous research reports point to the ultimate domination of wireless communication. With the increasing prevalence of advanced cell phones, various mobile data services (hereafter MDS) are gaining popularity. Although cellular networks were originally introduced for voice communications, statistics indicate that data services are replacing the mature voice service as the growth engine for telecom service providers. For example, SK Telecom, Korea's largest mobile service provider, reported that 25.6% of revenue and 28.5% of profit came from MDS in 2006 and that the share is growing. Statistics also indicate that, in 2006, the average revenue per user (ARPU) for voice did not change but MDS grew seven percent from the previous year, further highlighting its growth potential. MDS is defined "as an assortment of digital data services that can be accessed using a mobile device over a wide geographic area." A variety of MDS have been deployed, with a few reaching the status of killer applications. Many of them need to access the Internet through the cellular-phone infrastructure. In the past, when the cellular network did not have acceptable bandwidth for data services, SMS (short messaging service) dominated MDS. Now, Internet-ready, next-generation cell phones are driving rich digital data services into the fabric of everyday life. These include news on various topics, Internet search, mapping and location-based information, mobile banking and gaming, downloading (e.g., screen savers), multimedia streaming, and various communication services (e.g., email, short messaging, messenger, and chatting). The huge economic stake that MDS represents for its stakeholders warrants focused research to understand the dynamics behind its adoption. 
Lyytinen and Yoo (2002) pointed out the limitations of traditional adoption models in explaining the rapid diffusion of innovations such as P2P or mobile services. Also, despite the increasing popularity of MDS, an unexpected drop in its usage is observed among some people. Intrigued by these observations, we conducted an exploratory study to examine decision factors of MDS usage. Data analysis revealed that the increase and decrease of MDS use were influenced by different forces. The findings of the exploratory study triggered our confirmatory research effort to validate the uni-directionality of the studied factors in affecting MDS usage. This differs from extant studies of IS/IT adoption, which are largely grounded on the assumption of bi-directionality of explanatory variables in determining the level of dependent variables (e.g., user satisfaction, service usage). The research goal is, therefore, to examine whether increases and decreases in the usage of MDS are explained by two separate groups of variables pertaining to information quality and system quality. For this, we investigate the following research questions: (1) Does the information quality of MDS increase service usage?; (2) Does the system quality of MDS decrease service usage?; and (3) Does user motivation for subscribing to MDS moderate the effect that information and system quality have on service usage? The research questions and subsequent analysis are grounded on the two-factor theory pioneered by Herzberg et al. (1959). To answer the research questions, an exploratory study based on 378 survey responses was first conducted to learn about important decision factors of MDS usage. It revealed a discrepancy between the forces influencing usage increase and those influencing usage decrease. Based on the findings from the exploratory study and the two-factor theory, we postulated information quality as the motivator and system quality as the de-motivator (or hygiene factor) of MDS. 
Then, a confirmatory study was undertaken on their respective roles in encouraging and discouraging the usage of mobile data services.

The Influence of Online Social Networking on Individual Virtual Competence and Task Performance in Organizations (온라인 네트워킹 활동이 가상협업 역량 및 업무성과에 미치는 영향)

  • Suh, A-Young;Shin, Kyung-Shik
    • Asia pacific journal of information systems / v.22 no.2 / pp.39-69 / 2012
  • With the advent of communication technologies, including electronic collaborative tools and conferencing systems provided over the Internet, virtual collaboration is becoming increasingly common in organizations. Virtual collaboration refers to an environment in which the people working together are interdependent in their tasks, share responsibility for outcomes, are geographically dispersed, and rely on mediated, rather than face-to-face, communication to produce an outcome. Research suggests that a new set of individual skills, knowledge, and abilities (SKAs), labeled individual virtual competence, is required to perform effectively in today's virtualized workplace. It is also argued that use of online social networking sites may influence not only individuals' daily lives but also their capability to manage their work-related relationships in organizations, which in turn leads to better performance. The existing research regarding (1) the relationship between virtual competence and task performance and (2) the relationship between online networking and task performance has been conducted from different theoretical perspectives, so little is known about how online social networking and virtual competence interplay to predict individuals' task performance. To fill this gap, this study raises the following research questions: (1) What is the individual virtual competence required for better adjustment to the virtual collaboration environment? (2) How does online networking via diverse social network service sites influence individuals' task performance in organizations? (3) How do the joint effects of individual virtual competence and online networking influence task performance? To address these research questions, we first draw on the prior literature and derive four dimensions of individual virtual competence that are related to an individual's self-concept, knowledge, and ability. 
Computer self-efficacy is defined as the extent to which an individual believes in his or her ability to use computer technology broadly. Remotework self-efficacy is defined as the extent to which an individual believes in his or her ability to work and perform joint tasks with others in virtual settings. Virtual media skill is defined as the degree of confidence of individuals to function in their work role without face-to-face interactions. Virtual social skill is an individual's skill level in using technologies to communicate in virtual settings to their full potential. It should be noted that the concept of virtual social skill is different from self-efficacy and captures an individual's cognition-based ability to build social relationships with others in virtual settings. Next, we discuss how online networking influences both individual virtual competence and task performance based on social network theory and social learning theory. We argue that online networking may enhance individuals' capability to expand their social networks at low cost. We also argue that online networking may enable individuals to learn the necessary skills regarding how to use technological functions, communicate with others, share information, and build social relations using the technical functions provided by electronic media, consequently increasing individual virtual competence. To examine the relationships among online networking, virtual competence, and task performance, we developed research models (the mediation, interaction, and additive models, respectively) by integrating social network theory and social learning theory. Using data from 112 employees of a virtualized company, we tested the proposed research models. 
The results of the analysis partly support the mediation model in that online social networking positively influences individuals' computer self-efficacy, virtual social skill, and virtual media skill, which are key predictors of individuals' task performance. Furthermore, the results of the analysis partly support the interaction model in that the level of remotework self-efficacy moderates the relationship between online social networking and task performance. The results paint a picture of people adjusting to virtual collaboration that constrains and enables their task performance. This study contributes to research and practice. First, we suggest a shift of research focus to the individual level when examining virtual phenomena and theorize that online social networking can enhance individual virtual competence in some aspects. Second, we replicate and advance the prior competence literature by linking each component of virtual competence and objective task performance. The results of this study provide useful insights into how those responsible for human resources can assess employees' weaknesses and strengths when they organize virtualized groups or projects. Furthermore, the study provides managers with insights into the kinds of development or training programs that they can engage in with their employees to advance their ability to undertake virtual work.

An Intelligent Intrusion Detection Model Based on Support Vector Machines and the Classification Threshold Optimization for Considering the Asymmetric Error Cost (비대칭 오류비용을 고려한 분류기준값 최적화와 SVM에 기반한 지능형 침입탐지모형)

  • Lee, Hyeon-Uk;Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.157-173 / 2011
  • As Internet use has exploded in recent years, malicious attacks on and hacking of networked systems occur frequently. Such intrusions can cause fatal damage to government agencies, public offices, and companies that operate various systems. For these reasons, there is growing interest in and demand for intrusion detection systems (IDS), the security systems that detect, identify, and respond appropriately to unauthorized or abnormal activities. The intrusion detection models applied in conventional IDS are generally designed by modeling experts' implicit knowledge of network intrusions or hackers' abnormal behaviors. These kinds of intrusion detection models perform well under normal situations. However, they perform poorly when they meet a new or unknown pattern of network attack. For this reason, several recent studies have tried to adopt various artificial intelligence techniques, which can respond proactively to unknown threats. In particular, artificial neural networks (ANNs) have been widely applied in prior studies because of their superior prediction accuracy. However, ANNs have some intrinsic limitations, such as the risk of overfitting, the requirement for a large sample size, and the lack of insight into the prediction process (i.e., black box theory). As a result, the most recent studies on IDS have started to adopt the support vector machine (SVM), a classification technique that is more stable and powerful than ANNs. SVM is known for its relatively high predictive power and generalization capability. Against this background, this study proposes a novel intelligent intrusion detection model that uses SVM as the classification model in order to improve the predictive ability of IDS. Our model is also designed to consider the asymmetric error cost by optimizing the classification threshold. Generally, there are two common forms of errors in intrusion detection. 
The first error type is the False-Positive Error (FPE). In the case of an FPE, the wrong judgment may lead to unnecessary corrective action. The second error type is the False-Negative Error (FNE), which mainly misjudges malware as a normal program. Compared to FPE, FNE is more fatal. Thus, when considering the total cost of misclassification in IDS, it is more reasonable to assign heavier weights to FNE than to FPE. Therefore, we designed our proposed intrusion detection model to optimize the classification threshold in order to minimize the total misclassification cost. In this case, conventional SVM cannot be applied because it is designed to generate discrete output (i.e., a class). To resolve this problem, we used the revised SVM technique proposed by Platt (2000), which is able to generate probability estimates. To validate the practical applicability of our model, we applied it to a real-world dataset for network intrusion detection. The experimental dataset was collected from the IDS sensor of an official institution in Korea from January to June 2010. We collected 15,000 log records in total and selected 1,000 samples from them by random sampling. In addition, the SVM model was compared with logistic regression (LOGIT), decision trees (DT), and ANN to confirm the superiority of the proposed model. LOGIT and DT were run using PASW Statistics v18.0, and ANN was run using Neuroshell 4.0. For SVM, LIBSVM v2.90, freeware for training SVM classifiers, was used. Empirical results showed that our proposed SVM-based model outperformed all the other comparative models in detecting network intrusions in terms of accuracy. They also showed that our model reduced the total misclassification cost compared to the ANN-based intrusion detection model. 
As a result, it is expected that the intrusion detection model proposed in this paper would not only enhance the performance of IDS, but also lead to better management of FNE.
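
    The threshold optimization described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the probability estimates are taken as given (in the paper they come from a Platt-style probabilistic SVM), the cost weights `fn_cost=5.0` and `fp_cost=1.0` are hypothetical, and a plain grid search over thresholds stands in for whatever optimizer the paper uses.

```python
def total_cost(probabilities, labels, threshold, fn_cost=5.0, fp_cost=1.0):
    """Sum the misclassification cost at a given threshold (1 = intrusion)."""
    cost = 0.0
    for p, y in zip(probabilities, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 0 and y == 1:
            cost += fn_cost  # missed intrusion: the more expensive error
        elif predicted == 1 and y == 0:
            cost += fp_cost  # false alarm
    return cost


def best_threshold(probabilities, labels, fn_cost=5.0, fp_cost=1.0):
    """Grid-search thresholds 0.01..0.99 and return the cheapest one."""
    candidates = [i / 100 for i in range(1, 100)]
    return min(
        candidates,
        key=lambda t: total_cost(probabilities, labels, t, fn_cost, fp_cost),
    )


# Made-up intrusion probabilities and true labels, for illustration only.
probabilities = [0.05, 0.20, 0.35, 0.40, 0.60, 0.90]
labels = [0, 0, 1, 0, 1, 1]

default_cost = total_cost(probabilities, labels, 0.5)
threshold = best_threshold(probabilities, labels)
tuned_cost = total_cost(probabilities, labels, threshold)
```

    On this toy data the default threshold of 0.5 misses one intrusion (cost 5.0), whereas the tuned threshold falls below 0.5 and accepts one false alarm instead (cost 1.0), which is the trade-off that a heavier FNE weight is meant to produce.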

The Adoption and Diffusion of Semantic Web Technology Innovation: Qualitative Research Approach (시맨틱 웹 기술혁신의 채택과 확산: 질적연구접근법)

  • Joo, Jae-Hun
    • Asia pacific journal of information systems / v.19 no.1 / pp.33-62 / 2009
  • Internet computing is a disruptive IT innovation. The Semantic Web can be considered an IT innovation because Semantic Web technology has the potential to reduce information overload and enable semantic integration, using capabilities such as semantics and machine-processability. How should organizations adopt the Semantic Web? What factors affect the adoption and diffusion of Semantic Web innovation? Most studies on the adoption and diffusion of innovation use empirical analysis as a quantitative research methodology in the post-implementation stage. There is criticism that the positivist approach, which requires theoretical rigor, can sacrifice relevance to practice. Rapid advances in technology require studies relevant to practice. In particular, it is realistically impossible to take a quantitative approach to the factors affecting adoption of the Semantic Web because the Semantic Web is in its infancy. However, in the early stage of introduction of the Semantic Web, it is necessary to give practitioners and researchers a model and some guidelines for adoption and diffusion of the technology innovation. Thus, the purpose of this study is to present a model of adoption and diffusion of the Semantic Web and to offer propositions as guidelines for successful adoption through a qualitative research method including multiple case studies and in-depth interviews. The researcher interviewed 15 people face-to-face and conducted 2 interviews by telephone and e-mail to collect data to saturate the categories. Nine interviews, including 2 telephone interviews, were from nine user organizations adopting the technology innovation, and the others were from three supply organizations. Semi-structured interviews were used to collect data. The interviews were recorded on a digital voice recorder and subsequently transcribed verbatim. 196 pages of transcripts were obtained from about 12 hours of interviews. 
Triangulation of evidence was achieved by examining each organization's website and various documents, such as brochures and white papers. The researcher read the transcripts several times and underlined core words, phrases, or sentences. Then, data analysis used the procedure of open coding, in which the researcher forms initial categories of information about the phenomenon being studied by segmenting information. QSR NVivo version 8.0 was used to categorize sentences including similar concepts. 47 categories derived from interview data were grouped into 21 categories, from which six factors were named. Five factors affecting adoption of the Semantic Web were identified. The first factor is demand pull, including requirements for improving search and integration services of the existing systems and for creating new services. Second, environmental conduciveness, reference models, uncertainty, technology maturity, potential business value, government sponsorship programs, promising prospects for technology demand, complexity, and trialability affect the adoption of the Semantic Web from the perspective of technology push. Third, absorptive capacity plays an important role in adoption. Fourth, supplier's competence includes communication with and training for users, and the absorptive capacity of the supply organization. Fifth, over-expectation, which results in a gap between the user's expectation level and perceived benefits, has a negative impact on the adoption of the Semantic Web. Finally, the factor including critical mass of ontology, budget, and visible effects is identified as a determinant affecting routinization and infusion. The researcher suggested a model of adoption and diffusion of the Semantic Web, representing relationships between the six factors and adoption/diffusion as dependent variables. Six propositions are derived from the adoption/diffusion model to offer some guidelines to practitioners and a research model for further studies. 
Proposition 1 : Demand pull has an influence on the adoption of the Semantic Web. Proposition 1-1 : The stronger the degree of requirements for improving existing services, the more successfully the Semantic Web is adopted. Proposition 1-2 : The stronger the degree of requirements for new services, the more successfully the Semantic Web is adopted. Proposition 2 : Technology push has an influence on the adoption of the Semantic Web. Proposition 2-1 : From the perspective of user organizations, the technology push forces such as environmental conduciveness, reference models, potential business value, and government sponsorship programs have a positive impact on the adoption of the Semantic Web while uncertainty and lower technology maturity have a negative impact on its adoption. Proposition 2-2 : From the perspective of suppliers, the technology push forces such as environmental conduciveness, reference models, potential business value, government sponsorship programs, and promising prospects for technology demand have a positive impact on the adoption of the Semantic Web while uncertainty, lower technology maturity, complexity and lower trialability have a negative impact on its adoption. Proposition 3 : The absorptive capacities such as organizational formal support systems, officer's or manager's competency analyzing technology characteristics, their passion or willingness, and top management support are positively associated with successful adoption of the Semantic Web innovation from the perspective of user organizations. Proposition 4 : Supplier's competence has a positive impact on the absorptive capacities of user organizations and technology push forces. Proposition 5 : The greater the gap of expectation between users and suppliers, the later the Semantic Web is adopted. 
Proposition 6 : The post-adoption activities such as budget allocation, reaching critical mass, and sharing ontology to offer sustainable services are positively associated with successful routinization and infusion of the Semantic Web innovation from the perspective of user organizations.

Analysis of the Effects of Radio Traffic Information on Urban Worker's Travel Choice Behavior (교통방송이 제공하는 교통정보가 직장인의 통행행태에 미치는 영향 분석)

  • 윤대식
    • Journal of Korean Society of Transportation / v.20 no.5 / pp.33-43 / 2002
  • Travel choice behavior is affected by real-time traffic information. Recently, in urban areas, real-time traffic information has been provided through several channels, such as transportation broadcasting, Internet PC networks, and variable message signs. Furthermore, urban travelers are increasingly using the real-time traffic information provided through these channels. The purpose of this study is to analyze the effects of advanced traveler information on urban workers' travel choice behavior. Among the several Advanced Traveler Information Systems (ATIS) employed in urban areas, this study focuses on examining the effects of transportation broadcasting on urban workers' travel choice behavior. This study attempts to examine travelers' mode change behavior in the pre-trip stage and travelers' route change behavior in the en-route stage. For this study, survey data collected in Daegu City in 2000 are used. For empirical analysis, several nested logit models are estimated, and the best of them are reported in this paper. Furthermore, based on the empirical models estimated for this research, important findings and their policy implications are discussed.
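
    For readers unfamiliar with the nested logit model mentioned in this abstract, the following Python sketch computes choice probabilities for a generic two-level nested logit. It is a textbook formulation, not the paper's estimated model: the alternatives, utilities, nest structure, and scale parameters are all hypothetical, and no estimation from survey data is attempted.

```python
import math


def nested_logit(utilities, nests, scale):
    """Return choice probabilities for a two-level nested logit.

    utilities: alternative -> systematic utility V
    nests:     nest name -> list of alternatives in that nest
    scale:     nest name -> scale (dissimilarity) parameter lambda
    """
    # Inclusive value of each nest: log-sum of scaled utilities.
    inclusive = {
        nest: math.log(sum(math.exp(utilities[a] / scale[nest]) for a in alts))
        for nest, alts in nests.items()
    }
    denom = sum(math.exp(scale[n] * inclusive[n]) for n in nests)
    probs = {}
    for nest, alts in nests.items():
        p_nest = math.exp(scale[nest] * inclusive[nest]) / denom
        for a in alts:
            # Probability of the alternative given that its nest is chosen.
            p_within = math.exp(utilities[a] / scale[nest]) / math.exp(inclusive[nest])
            probs[a] = p_nest * p_within
    return probs


# Hypothetical alternatives and utilities, for illustration only.
utilities = {"car_usual_route": 0.5, "car_other_route": 0.2, "bus": 0.0}
nests = {"car": ["car_usual_route", "car_other_route"], "transit": ["bus"]}
scale = {"car": 0.5, "transit": 1.0}
probs = nested_logit(utilities, nests, scale)
```

    The probabilities sum to one, and setting every scale parameter to 1.0 reduces the model to the ordinary multinomial logit, which is a convenient sanity check.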

Design and Implementation of MongoDB-based Unstructured Log Processing System over Cloud Computing Environment (클라우드 환경에서 MongoDB 기반의 비정형 로그 처리 시스템 설계 및 구현)

  • Kim, Myoungjin;Han, Seungho;Cui, Yun;Lee, Hanku
    • Journal of Internet Computing and Services / v.14 no.6 / pp.71-84 / 2013
  • Log data, which record the multitude of information created when operating computer systems, are utilized in many processes, from carrying out computer system inspection and process optimization to providing customized user optimization. In this paper, we propose a MongoDB-based unstructured log processing system in a cloud environment for processing the massive amount of log data of banks. Most of the log data generated during banking operations come from handling a client's business. Therefore, in order to gather, store, categorize, and analyze the log data generated while processing the client's business, a separate log data processing system needs to be established. However, the realization of flexible storage expansion functions for processing a massive amount of unstructured log data and executing a considerable number of functions to categorize and analyze the stored unstructured log data is difficult in existing computer environments. Thus, in this study, we use cloud computing technology to realize a cloud-based log data processing system for processing unstructured log data that are difficult to process using the existing computing infrastructure's analysis tools and management system. The proposed system uses the IaaS (Infrastructure as a Service) cloud environment to provide a flexible expansion of computing resources and includes the ability to flexibly expand resources such as storage space and memory under conditions such as extended storage or rapid increase in log data. Moreover, to overcome the processing limits of the existing analysis tool when a real-time analysis of the aggregated unstructured log data is required, the proposed system includes a Hadoop-based analysis module for quick and reliable parallel-distributed processing of the massive amount of log data. 
Furthermore, because HDFS (Hadoop Distributed File System) stores data by replicating the blocks of the aggregated log data, the proposed system offers automatic restore functions so that it can continue operating after it recovers from a malfunction. Finally, by establishing a distributed database using the NoSQL-based MongoDB, the proposed system provides methods of effectively processing unstructured log data. Relational databases such as MySQL have strict, complex schemas that are inappropriate for processing unstructured log data. Further, such strict schemas make it hard to expand nodes when the stored data must be distributed across multiple nodes as the amount of data rapidly increases. NoSQL does not provide the complex computations that relational databases may provide but can easily expand the database through node dispersion when the amount of data increases rapidly; it is a non-relational database with an appropriate structure for processing unstructured data. NoSQL data models are usually classified as key-value, column-oriented, and document-oriented types. Of these, MongoDB, a representative document-oriented database with a schema-free structure, is used in the proposed system. MongoDB is introduced to the proposed system because it makes it easy to process unstructured log data through a flexible schema structure, facilitates flexible node expansion when the amount of data is rapidly increasing, and provides an Auto-Sharding function that automatically expands storage. The proposed system is composed of a log collector module, a log graph generator module, a MongoDB module, a Hadoop-based analysis module, and a MySQL module.
When the log data generated over the entire client business process of each bank are sent to the cloud server, the log collector module collects the data, classifies them according to the type of log data, and distributes them to the MongoDB module and the MySQL module. The log graph generator module generates the results of the log analysis of the MongoDB module, the Hadoop-based analysis module, and the MySQL module per analysis time and type of the aggregated log data, and provides them to the user through a web interface. Log data that require real-time analysis are stored in the MySQL module and provided in real time by the log graph generator module. The log data aggregated per unit time are stored in the MongoDB module and plotted in a graph according to the user's various analysis conditions. The aggregated log data in the MongoDB module are processed in a parallel-distributed manner by the Hadoop-based analysis module. A comparative evaluation of log data insertion and query performance is carried out against a log data processing system that uses only MySQL; this evaluation shows the proposed system's superiority. Moreover, an optimal chunk size is confirmed through the log data insert performance evaluation of MongoDB for various chunk sizes.
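
The routing role of the log collector module can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `LogCollector`, the `type` field, and the rule that only `"transaction"` logs need real-time analysis are hypothetical, and plain in-memory containers stand in for the MySQL and MongoDB modules.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LogCollector:
    """Toy stand-in for the log collector module (illustrative only)."""

    # Hypothetical rule: these log types require real-time analysis.
    realtime_types: frozenset = frozenset({"transaction"})
    # In-memory substitutes for the MySQL and MongoDB modules.
    mysql_store: list = field(default_factory=list)
    mongo_store: dict = field(default_factory=lambda: defaultdict(list))

    def collect(self, record: dict) -> str:
        """Classify a record by its type and return the chosen store name."""
        log_type = record.get("type", "unknown")
        if log_type in self.realtime_types:
            self.mysql_store.append(record)
            return "mysql"
        # Records are kept as free-form dicts, grouped by type, so records
        # with different fields can sit side by side (schema-free storage).
        self.mongo_store[log_type].append(record)
        return "mongodb"


collector = LogCollector()
collector.collect({"type": "transaction", "amount": 120})
collector.collect({"type": "audit", "message": "login", "meta": {"ip": "10.0.0.1"}})
```

In a real deployment the two containers would be replaced by database clients; the sketch only shows the classify-then-distribute step described in the abstract.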

Multi-Dimensional Analysis Method of Product Reviews for Market Insight (마켓 인사이트를 위한 상품 리뷰의 다차원 분석 방안)

  • Park, Jeong Hyun;Lee, Seo Ho;Lim, Gyu Jin;Yeo, Un Yeong;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.57-78
    • /
    • 2020
  • With the development of the Internet, consumers can easily check product information through E-Commerce. Product reviews used in the process of purchasing goods are based on user experience, allowing consumers to act as producers of information as well as to refer to it. This can increase the efficiency of purchasing decisions from the perspective of consumers, and from the seller's point of view, it can help develop products and strengthen their competitiveness. However, it takes a lot of time and effort for consumers to read the vast number of product reviews offered by E-Commerce for the products they want to compare and to understand the overall assessment and the assessment dimensions they consider important. This is because product reviews are unstructured information, and it is difficult to grasp the sentiment and assessment dimension of a review immediately. For example, consumers who want to purchase a laptop would like to check the assessment of the compared products on each dimension, such as performance, weight, delivery, speed, and design. Therefore, in this paper, we propose a method to automatically generate multi-dimensional product assessment scores from the reviews of the products to be compared. The method presented in this study consists largely of two phases: a pre-preparation phase and an individual product scoring phase. In the pre-preparation phase, a dimensioned classification model and a sentiment analysis model are created based on reviews of the large category product group. By combining word embedding and association analysis, the dimensioned classification model addresses a limitation of existing studies, in which word embedding methods for finding the relevance between dimensions and words consider only the distance between words in sentences.
For the sentiment analysis model, a CNN model is trained on data tagged as positive or negative at the phrase level for accurate polarity detection. The individual product scoring phase then applies the pre-prepared models to reviews split into phrases. Multi-dimensional assessment scores are obtained by aggregating the results by assessment dimension, according to the proportion of reviews among the phrases judged to describe each specific dimension. In the experiment of this paper, approximately 260,000 reviews of the large category product group are collected to form a dimensioned classification model and a sentiment analysis model. In addition, reviews of laptops from companies S and L sold through E-Commerce are collected and used as experimental data. The dimensioned classification model classified individual product reviews, broken down into phrases, into six assessment dimensions and combined the existing word embedding method with an association analysis indicating the frequency between words and dimensions. As a result of combining word embedding and association analysis, the accuracy of the model increased by 13.7%. The sentiment analysis model analyzed assessments more closely when it was trained on phrases rather than on sentences. As a result, it was confirmed that its accuracy was 29.4% higher than that of the sentence-based model. Through this study, both sellers and consumers can expect efficient decision making in purchasing and product development, given that they can make multi-dimensional comparisons of products. In addition, text reviews, which are unstructured data, were transformed into objective values such as frequency and morpheme, and they were analyzed together using word embedding and association analysis to improve the objectivity and precision of the multi-dimensional analysis.
This can be an attractive analysis model, not only enabling more effective service deployment amid the evolving E-Commerce market and fierce competition, but also satisfying both sellers and consumers.
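
The aggregation step of the individual product scoring phase can be sketched in Python as follows. The abstract does not give the exact scoring formula, so this sketch assumes, purely for illustration, that the score of a dimension is the share of positive phrases among the phrases assigned to that dimension. The function name `dimension_scores` and the `"pos"`/`"neg"` labels are hypothetical, and the inputs stand in for the outputs of the dimensioned classification model and the sentiment analysis model.

```python
from collections import Counter


def dimension_scores(labelled_phrases):
    """Aggregate phrase-level labels into one score per assessment dimension.

    labelled_phrases: iterable of (dimension, polarity) pairs, where polarity
    is "pos" or "neg". The score is the fraction of positive phrases within
    each dimension (an assumed definition, not taken from the paper).
    """
    totals = Counter()
    positives = Counter()
    for dimension, polarity in labelled_phrases:
        totals[dimension] += 1
        if polarity == "pos":
            positives[dimension] += 1
    return {dim: positives[dim] / totals[dim] for dim in totals}


# Hypothetical phrase-level predictions for one laptop.
phrases = [
    ("performance", "pos"),
    ("performance", "neg"),
    ("performance", "pos"),
    ("weight", "pos"),
    ("design", "neg"),
]
scores = dimension_scores(phrases)
```

Computing the same dictionary for each compared product gives a per-dimension comparison of the kind the abstract describes for laptops.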