• Title/Summary/Keyword: Unstructured data analysis


Numerical Analysis of Flowfield around Multicopter for the Analysis of Air Data Sensor Installation (대기자료센서 장착위치 분석을 위한 멀티콥터 주변 유동장 수치해석)

  • Park, Young Min;Lee, Chang Ho;Lee, Yung Gyo
    • Journal of Aerospace System Engineering / v.11 no.5 / pp.20-27 / 2017
  • This paper presents a flow analysis around a multicopter to select the optimal position for an air data sensor. The commercial fluid dynamics solver STAR-CCM+ was used with a polygon mesh and the k-ω SST turbulence model. To simulate each of the four rotating propellers, an unstructured overset mesh method was used. Hovering, forward flight, ascending, and descending flight conditions were analyzed, and airspeed and flow angle errors were evaluated from the CFD results. The flow field analysis showed that sensor locations more than one propeller diameter away from the propeller rotation plane had airspeed errors below 1 m/s under typical multicopter flight conditions, except in descent.
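The airspeed and flow angle errors mentioned in this abstract can be illustrated with a small calculation. The sketch below is not the authors' post-processing. It assumes a body-fixed frame (u forward, v sideways, w vertical), the usual atan2/asin definitions of angle of attack and sideslip, and a non-zero speed for both vectors. The function name and the sample velocities are hypothetical.

```python
import math

def sensor_errors(local, freestream):
    """Return (airspeed error in m/s, angle-of-attack error, sideslip error in degrees).

    Both arguments are (u, v, w) velocity components in m/s.
    Assumes both vectors have non-zero magnitude.
    """
    def speed(vec):
        return math.sqrt(sum(c * c for c in vec))

    def angles(vec):
        u, v, w = vec
        alpha = math.degrees(math.atan2(w, u))          # angle of attack
        beta = math.degrees(math.asin(v / speed(vec)))  # sideslip angle
        return alpha, beta

    a_loc, b_loc = angles(local)
    a_inf, b_inf = angles(freestream)
    return speed(local) - speed(freestream), a_loc - a_inf, b_loc - b_inf

# Hypothetical sample: 10 m/s forward flight with a slightly disturbed local flow.
dv, dalpha, dbeta = sensor_errors((10.4, 0.0, -0.6), (10.0, 0.0, 0.0))
print(f"airspeed error {dv:.2f} m/s, alpha error {dalpha:.2f} deg, beta error {dbeta:.2f} deg")
```

Evaluating this at each candidate sensor position for each flight condition would give a comparison of the kind the abstract describes.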

A study on the internal reputation factors affecting the job satisfaction: Focusing on big data analysis in the social media for corporation reputation (직무만족도에 영향을 미치는 내부평판 요인에 관한 연구: 기업정보 제공 소셜 미디어 빅데이터를 중심으로)

  • Seo, Woon-Chae;Kim, Hyoung-Joong
    • Journal of Digital Contents Society / v.17 no.4 / pp.295-305 / 2016
  • This study analyzes the internal reputation factors that affect job satisfaction through big data analysis of social media for corporate reputation, and tests whether each factor differs between large corporations and small-medium corporations. The results show that 'Salaries and Benefits' is a major factor affecting job satisfaction across all corporations studied, 'Senior Management' is a major factor for large corporations, and 'Salaries and Benefits' is a major factor for small-medium corporations. The factors that differ between large and small-medium corporations are 'Job Satisfaction', 'Salaries and Benefits', and 'Work-life Balance'. The unstructured data analysis also reveals some interesting features that merit further study.

An Analytical Approach Using Topic Mining for Improving the Service Quality of Hotels (호텔 산업의 서비스 품질 향상을 위한 토픽 마이닝 기반 분석 방법)

  • Moon, Hyun Sil;Sung, David;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.21-41 / 2019
  • Thanks to the rapid development of information technologies, the data available on the Internet have grown rapidly. In this era of big data, many studies have attempted to offer insights and demonstrate the effects of data analysis. In the tourism and hospitality industry, many firms and studies have paid attention to online reviews on social media because of their large influence on customers. As tourism is an information-intensive industry, the effect of these information networks on social media platforms is more pronounced than on any other type of media. However, there are some limitations to the improvements in service quality that can be made based on opinions on social media platforms. Users on social media platforms express their opinions as text, images, and so on, so the raw data sets from these reviews are unstructured. Moreover, these data sets are too large for people to extract new information and hidden knowledge from them manually. To use them for business intelligence and analytics applications, big data techniques such as Natural Language Processing and data mining are needed. This study suggests an analytical approach that directly yields insights from these reviews to improve the service quality of hotels. The proposed approach consists of topic mining to extract the topics contained in the reviews and decision tree modeling to explain the relationship between topics and ratings. Topic mining refers to a method for finding groups of words that represent the documents in a collection. Among several topic mining methods, we adopted the Latent Dirichlet Allocation (LDA) algorithm, which is considered the most widely applicable. However, LDA alone is not enough to find insights that can improve service quality because it cannot find the relationship between topics and ratings. To overcome this limitation, we also use the Classification and Regression Tree (CART) method, which is a kind of decision tree technique. Through the CART method, we can find which topics are related to positive or negative ratings of a hotel and visualize the results. This study therefore aims to present an analytical approach for improving hotel service quality from unstructured review data sets. Through experiments on four hotels in Hong Kong, we identify the strengths and weaknesses of each hotel's services and suggest improvements to aid customer satisfaction. From positive reviews in particular, we find what these hotels should maintain in their service quality. For example, positive reviews of one hotel show that, compared with the other hotels, it has a good location and room condition. In contrast, from negative reviews we find what hotels should change in their services. For example, one hotel should improve the soundproofing of its rooms. These results indicate that our approach is useful for finding insights into the service quality of hotels. That is, from an enormous volume of review data, our approach can provide practical suggestions for hotel managers to improve their service quality. In the past, studies on improving service quality relied on surveys or interviews of customers. However, these methods are often costly and time consuming, and the results may be distorted by biased sampling or untrustworthy answers. The proposed approach obtains honest feedback directly from customers' online reviews and draws insights through a type of big data analysis, so it can be a useful tool for overcoming the limitations of surveys and interviews. Moreover, our approach can easily obtain service quality information for other hotels or services in the tourism industry because it needs only open online reviews and ratings as input data. Furthermore, the performance of our approach is expected to improve if other structured and unstructured data sources are added.
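The step that relates topics to ratings can be illustrated with the split criterion that CART-style trees commonly use for classification, Gini impurity. The sketch below is not the authors' implementation. It assumes each review has already been reduced to topic proportions (for example by LDA) and searches for only a single best split, whereas a full CART tree applies this search recursively. The topic names and numbers are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Return (weighted Gini, topic index, threshold) of the best single split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must send reviews to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

# Hypothetical topic proportions per review: [location, room noise].
topics = [[0.8, 0.2], [0.7, 0.3], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8], [0.4, 0.6]]
positive = [1, 1, 1, 0, 0, 0]  # 1 = positive rating, 0 = negative rating

score, topic, threshold = best_split(topics, positive)
print(topic, threshold, score)
```

In this toy data the split on the first topic at 0.4 separates positive from negative reviews completely, which is the kind of topic-to-rating relationship a decision tree makes visible.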

Sentiment Analyses of the Impacts of Online Experience Subjectivity on Customer Satisfaction (감성분석을 이용한 온라인 체험 내 비정형데이터의 주관도가 고객만족에 미치는 영향 분석)

  • Yeeun Seo;Sang-Yong Tom Lee
    • Information Systems Review / v.25 no.1 / pp.233-255 / 2023
  • The development of information technology (IT) has brought about so-called "online experiences" that satisfy our daily needs. The market for online experiences grew further during the COVID-19 pandemic. This study therefore analyzes how the features of online experience services affect customer satisfaction by crawling structured and unstructured data from the online experience website newly launched by Airbnb after COVID-19. The analysis found that the structured data generated by service users on a C2C online sharing platform had a positive effect on the satisfaction of other users. In addition, unstructured text data generated by service providers, such as experience introductions and host introductions, had different subjectivity scores depending on the purpose of the text. Subjective host introductions and objective experience introductions were both confirmed to affect customer satisfaction positively. The results provide implications for stakeholders of online sharing economy platforms and for researchers interested in online experience knowledge management.

The Effect of Expert Reviews on Consumer Product Evaluations: A Text Mining Approach (전문가 제품 후기가 소비자 제품 평가에 미치는 영향: 텍스트마이닝 분석을 중심으로)

  • Kang, Taeyoung;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.63-82 / 2016
  • Individuals gather information online to resolve problems in their daily lives and to make decisions about purchasing products or services. With the revolutionary development of information technology, Web 2.0 has allowed more people to easily generate and use online reviews, so the volume of information is rapidly increasing, and the usefulness and significance of analyzing unstructured data have also increased. This paper presents an analysis of the lexical features of expert product reviews to determine their influence on consumers' purchasing decisions. The focus was on how unstructured data can be organized and used in diverse contexts through text mining. In addition, diverse lexical features of expert reviews provided by a third-party review site were extracted and defined. Expert reviews are defined as evaluations in newspapers or magazines by people who have expert knowledge about specific products or services; this type of review is also called a critic review. Before the widespread use of the Internet, consumers could access expert reviews only through newspapers or magazines, so they were not able to access many of them. Recently, however, major media have also begun to provide online services, so people can access expert reviews more easily and affordably than in the past. Diverse reviews from experts in several fields are important because of information asymmetry, where some information is not shared between consumers and sellers. This information asymmetry can be resolved when third parties with expertise provide information to consumers. Consumers can then read expert reviews and make purchasing decisions by considering the abundant information on products or services. Therefore, expert reviews play an important role in consumers' purchasing decisions and in the performance of companies across diverse industries. If the influence of qualitative data, such as reviews or post-purchase assessments, can be identified separately from quantitative data, such as the actual quality or price of products, it is possible to identify which aspects of product reviews hamper or promote product sales. Previous studies have focused on the characteristics of the experts themselves, such as the expertise and credibility of the sources of expert reviews; however, these studies did not address the influence of the linguistic features of experts' product reviews on consumers' overall evaluations. In contrast, this study focused on experts' recommendations and evaluations to reveal the lexical features of expert reviews and whether such features influence consumers' overall evaluations and purchasing decisions. Real expert product reviews were analyzed based on the suggested methodology, and five lexical features of expert reviews were ultimately determined. Specifically, "review depth" (i.e., the degree of detail of the expert's product analysis) and "lack of assurance" (i.e., the degree of confidence that the expert has in the evaluation) have statistically significant effects on consumers' product evaluations. In contrast, "positive polarity" (i.e., the degree of positivity of an expert's evaluations) has an insignificant effect, while "negative polarity" (i.e., the degree of negativity of an expert's evaluations) has a significant negative effect on consumers' product evaluations. Finally, "social orientation" (i.e., the degree to which experts include social expressions in their reviews) does not have a significant effect on consumers' product evaluations. In summary, the lexical properties of the product reviews were defined according to each relevant factor, and the influence of each linguistic factor of expert reviews on consumers' final evaluations was then tested. In addition, a test was performed on whether the effect of each linguistic factor on consumers' product evaluations differs depending on the lexical features. The results of these analyses should provide guidelines on how individuals process massive volumes of unstructured data depending on lexical features in various contexts and how companies can use this mechanism from their perspective. This paper provides several theoretical and practical contributions, such as the proposal of a new methodology and its application to real data.

Spark-based Network Log Analysis System for Detecting Network Attack Pattern Using Snort (Snort를 이용한 비정형 네트워크 공격패턴 탐지를 수행하는 Spark 기반 네트워크 로그 분석 시스템)

  • Baek, Na-Eun;Shin, Jae-Hwan;Chang, Jin-Su;Chang, Jae-Woo
    • The Journal of the Korea Contents Association / v.18 no.4 / pp.48-59 / 2018
  • Network technology has recently come to be used in various fields as it has developed. However, attacks that exploit this evolving technology to target public institutions and companies have also increased. Meanwhile, existing network intrusion detection systems take a long time to process logs as the volume of network logs grows. In this paper, we therefore propose a Spark-based network log analysis system that uses Snort to detect unstructured network attack patterns. The proposed system extracts and analyzes the elements required for network attack pattern detection from large amounts of network log data. For the analysis, we propose rules for detecting attack patterns for port scanning, host scanning, DDoS, and worm activity, and show that they detect real attack patterns well when applied to real log data. Finally, our performance evaluation shows that the proposed Spark-based log analysis system is more than two times better in log data processing performance than the Hadoop-based system.
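A rule of the kind this abstract describes for port scanning can be sketched in plain Python. This is not the paper's Spark or Snort rule set. It assumes the log has already been parsed into (source IP, destination IP, destination port) tuples and flags a source that contacts many distinct ports on one host. The function name, the threshold, and the sample records are hypothetical.

```python
from collections import defaultdict

def detect_port_scans(records, min_ports=4):
    """Return sorted source IPs that hit at least `min_ports` distinct ports on one host."""
    ports_seen = defaultdict(set)  # (source, destination) -> distinct destination ports
    for src, dst, port in records:
        ports_seen[(src, dst)].add(port)
    return sorted({src for (src, _dst), ports in ports_seen.items()
                   if len(ports) >= min_ports})

# Hypothetical parsed log records: (source IP, destination IP, destination port).
log = [
    ("10.0.0.5", "192.168.1.10", 22),
    ("10.0.0.5", "192.168.1.10", 23),
    ("10.0.0.5", "192.168.1.10", 80),
    ("10.0.0.5", "192.168.1.10", 443),
    ("10.0.0.7", "192.168.1.10", 80),
    ("10.0.0.7", "192.168.1.10", 80),
]

suspects = detect_port_scans(log)
print(suspects)
```

A host-scanning rule could follow the same grouping idea with the roles swapped, counting distinct destination hosts per source and port. In a distributed setting the grouping step is what a framework such as Spark would parallelize.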

Association Analysis for Detecting Abnormal in Graph Database Environment (그래프 데이터베이스 환경에서 이상징후 탐지를 위한 연관 관계 분석 기법)

  • Jeong, Woo-Cheol;Jun, Moon-Seog;Choi, Do-Hyeon
    • Journal of Convergence for Information Technology / v.10 no.8 / pp.15-22 / 2020
  • The 4th industrial revolution and the rapid change in the data environment have revealed technical limitations in the existing relational database (RDB). Interest in graph database (GDB) technology is increasing as a new analysis method for unstructured data in all fields such as IDC, finance, and insurance. A graph database is an efficient technique for expressing interlinked data and analyzing associations across a wide range of networks. This study extended an existing RDB to a GDB model and applied machine learning algorithms (pattern recognition, clustering, path distance, core extraction) to detect new abnormal signs. The performance analysis confirmed that the performance of abnormal behavior detection improved greatly (by about 180 times or more) and that it was possible to extract abnormal sign patterns after 5 steps, which could not be analyzed with the RDB.
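The path-distance analysis named in this abstract can be illustrated with a breadth-first search that collects every entity within a fixed number of steps of a starting node. This is a generic sketch rather than the authors' method or any graph database's query API. The adjacency-dictionary representation and the account names are hypothetical.

```python
from collections import deque

def within_hops(graph, start, max_hops):
    """Return {node: hop distance} for nodes reachable from start in at most max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

# Hypothetical chain of linked accounts.
graph = {
    "acct_a": ["acct_b"],
    "acct_b": ["acct_c"],
    "acct_c": ["acct_d"],
    "acct_d": ["acct_e"],
    "acct_e": ["acct_f"],
    "acct_f": ["acct_g"],
}

reach = within_hops(graph, "acct_a", 5)
print(reach)
```

In a relational model each additional step typically costs another self-join, which is one reason multi-step associations are easier to express as graph traversals.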

Feature Analyze and Research of National Convergence R&D: With Focus on the Text Mining (국가 융합 R&D 특성 분석에 관한 연구: 텍스트분석을 중심으로)

  • Yoo, KiCheol;Lee, TaeHee;Choi, SangHyun;Lee, JungHwan
    • Journal of Information Technology Applications and Management / v.27 no.1 / pp.59-73 / 2020
  • There is growing interest in convergence. National R&D programs also provide various policies and institutional support to promote convergence research. However, the characteristics of convergence research are not clearly specified at either the academic or the government level. This study collects, refines, analyzes, models, verifies, and visualizes national R&D data obtained through the National Science and Technology Information Service (NTIS). It derives the characteristics of convergence research through text mining, focusing on the unstructured information in national R&D project data. The study confirmed a difference in perception between the definition of convergence research and how it is understood at research sites. To address this, the study suggests that the process of establishing convergence policy should consider convergence among research subjects, collaboration among research topics that reflects the various backgrounds and characteristics of researchers, and information-based analysis of the characteristics of convergence research.

A Novel Classification Model for Efficient Patent Information Research (효율적인 특허정보 조사를 위한 분류 모형)

  • Kim, Youngho;Park, Sangsung;Jang, Dongsik
    • Journal of Korea Society of Digital Industry and Information Management / v.15 no.4 / pp.103-110 / 2019
  • A patent contains detailed information about a developed technology and is published to the public. Patents can therefore be used to overcome the limitations of traditional technology trend research and prediction techniques. Recently, owing to the advantages of patent analysis methodology, IP R&D has been carried out worldwide. Patents are big data: they are huge in volume, span various domains, and have both structured and unstructured data characteristics. For this reason, collecting and researching patent information involves many difficulties. Patent research generally involves writing a search formula to collect patent documents from a database. The collected patent documents contain some noise patents that are irrelevant to the purpose of the analysis, so these are removed. However, eliminating noise patents is a manual task of reading and classifying technology, which is time consuming and expensive. In this study, we propose a model that automatically classifies noise patents for efficient patent information research. The proposed method performs patent embedding using Word2Vec and generates noise seed labels. Noise patent classification is then performed using a random forest. The experimental data consist of patents related to Ocean Surveillance & Tracking Network technology that are published and registered with the USPTO. In experiments, the proposed model showed 73% accuracy against the labels actually given by experts.
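The abstract does not say how word vectors are combined into a patent embedding. One common choice, shown here purely as an assumption, is to average the vectors of the words in a document. The toy two-dimensional vectors stand in for trained Word2Vec output, and the function name is hypothetical. The resulting vector is the kind of feature a classifier such as a random forest would take as input.

```python
def embed(tokens, vectors):
    """Average the vectors of known tokens; return None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

# Hypothetical word vectors (a real model would have far more dimensions).
vectors = {
    "sonar": [1.0, 0.0],
    "buoy": [0.8, 0.2],
    "engine": [0.0, 1.0],
}

doc_vector = embed(["sonar", "buoy", "unknown"], vectors)
print(doc_vector)
```

Tokens missing from the vocabulary are skipped rather than treated as zeros, so rare words do not pull the average toward the origin.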

Adaptation Process to Group Home Living by Older Adults (노인요양공동생활가정 입소노인의 적응과정)

  • Yoon, Hee Sook;Sok, Sohyune
    • Journal of Korean Academy of Nursing / v.46 no.6 / pp.858-870 / 2016
  • Purpose: The purpose of the study was to explore and describe the adaptation process of older people to group homes. Methods: Participants were twenty older adults aged 65 or older who were living in group homes. Data were collected from January to April, 2015. In-depth unstructured interviews were conducted with individual participants. Data were analyzed using Strauss and Corbin's grounded theory method. Results: From open coding, 100 concepts, 38 sub-categories, and 14 categories were identified. Analysis showed that the central phenomenon of the adaptation process of older people to group homes was 'gradually giving up'. Causal conditions were 'good-for-nothing body', contextual conditions were 'pushed', 'beleaguered'. Intervening conditions were 'reliable pillar: children', 'having affection (情) more than having it from family: facility workers', 'comfort - like feeling at home', 'relieved: system'. Action/interaction strategies were 'facing the unfamiliar reality', 'building relationships with other people', 'accepting reality'. Consequences were 'a good place, more than expected', 'hope for the remaining days', 'waiting for a peaceful death'. Conclusion: The results of this study provide an in-depth understanding of the experience of the adaptation process of older people to group homes. The findings from this study can be used as basic data to establish policies to increase the number of small scale facilities which can help older adults adapt easily to the facilities.