• Title/Summary/Keyword: unstructured data

Digital Forensic Investigation of HBase (HBase에 대한 디지털 포렌식 조사 기법 연구)

  • Park, Aran;Jeong, Doowon;Lee, Sang Jin
    • KIPS Transactions on Computer and Communication Systems / v.6 no.2 / pp.95-104 / 2017
  • As smart-device technology advances and social network services (SNS) become more common, the volume of data that is difficult to process with existing RDBMSs is increasing. As a result, NoSQL databases are gaining popularity as an alternative for processing massive, unstructured data generated in real time. Although digital investigation techniques for databases have been researched mainly for RDBMSs, demand for techniques to investigate NoSQL databases is growing as more businesses introduce them into their systems. New digital forensic investigation techniques are needed because a NoSQL database has no schema to normalize and its storage method differs depending on the database type and operating environment. Research has been done on document-based NoSQL databases, but it is not directly applicable to other types of NoSQL database. This paper therefore presents, for HBase, a column-based NoSQL database, its mode of operation and data model, how to identify the operating environment, the collection and analysis of artifacts, and a technique for recovering deleted data. The proposed digital forensic investigation technique for HBase is also verified with an experimental scenario.

Experimental Study of Estimating the Optimized Parameters in OI (서남해안 관측자료를 활용한 OI 자료동화의 최적 매개변수 산정 연구)

  • Gu, Bon-Ho;Woo, Seung-Buhm;Kim, Sangil
    • Journal of Korean Society of Coastal and Ocean Engineers / v.31 no.6 / pp.458-467 / 2019
  • The purpose of this study is to suggest optimized parameters for OI (Optimal Interpolation) through experiments. The observations assimilated by OI are ADCP (Acoustic Doppler Current Profiler) data from the southwestern sea of Korea. FVCOM (Finite Volume Coastal Ocean Model) is used as the barotropic model. OI estimates the gain matrix by minimizing the analysis error, given the background error covariance and the observation error covariance, using the least squares method. The scaling factor and correlation radius are very important parameters for OI. They are used to calculate the weighting between observation data and model data in the model domain. The optimized parameters from the experiments were identified using the Taylor diagram. Consequently, the parameters must be optimized for each observation point to obtain the best assimilation. Also, a highly accurate numerical model implies a low background error covariance, which allows all of the OI parameters to be reduced. This study is expected to lay a foundation for future research on selecting ocean observation points and constructing ocean prediction systems.
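
As a rough illustration of the quantities named in the abstract above, the sketch below applies the standard OI analysis update, x_a = x_b + K (y - H x_b) with K = B H^T (H B H^T + R)^-1, to a one-dimensional grid with a single observation. The Gaussian form of the background error covariance, the grid, and the names `oi_single_obs`, `scaling_factor`, `corr_radius`, and `obs_error_var` are assumptions made for illustration; they are not taken from the paper.

```python
import math

def oi_single_obs(background, positions, obs_value, obs_index,
                  scaling_factor, corr_radius, obs_error_var):
    """Optimal-interpolation update for one observation located on a grid point.

    Assumes (hypothetically) a Gaussian background error covariance:
    B[i][j] = scaling_factor * exp(-d_ij**2 / (2 * corr_radius**2)).
    """
    innovation = obs_value - background[obs_index]  # y - H x_b
    x_obs = positions[obs_index]
    analysis = []
    for x_b, x_i in zip(background, positions):
        d = x_i - x_obs
        b_ik = scaling_factor * math.exp(-d * d / (2.0 * corr_radius ** 2))
        # With a single observation, H B H^T + R reduces to a scalar.
        gain = b_ik / (scaling_factor + obs_error_var)
        analysis.append(x_b + gain * innovation)
    return analysis

# Hypothetical grid, background state, and observation.
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
background = [1.0, 1.0, 1.0, 1.0, 1.0]
analysis = oi_single_obs(background, positions, obs_value=2.0, obs_index=2,
                         scaling_factor=1.0, corr_radius=1.0, obs_error_var=1.0)
```

In this sketch a larger `scaling_factor` relative to `obs_error_var` pulls the analysis closer to the observation, and a larger `corr_radius` spreads the correction over more grid points.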

A Comparative Analysis of the Changes in Perception of the Fourth Industrial Revolution: Focusing on Analyzing Social Media Data (4차 산업혁명에 대한 인식 변화 비교 분석: 소셜 미디어 데이터 분석을 중심으로)

  • You, Jae Eun;Choi, Jong Woo
    • KIPS Transactions on Software and Data Engineering / v.9 no.11 / pp.367-376 / 2020
  • The fourth industrial revolution will greatly contribute to the entry of objects into an intelligent society through technologies such as big data and artificial intelligence. Through the revolution, it has become possible to understand human behavior and awareness, and artificial intelligence has established itself as a key tool in various fields such as medicine and science. However, alongside its positive outlook, the fourth industrial revolution also has a negative side. In this study, an analysis was conducted using text mining techniques on unstructured big data collected through social media. We examined keywords related to the fourth industrial revolution by year (2016, 2017, and 2018) to understand the meaning of each keyword. In addition, we examined how these keywords changed from year to year and used R to conduct a keyword analysis, identifying through the flow of keywords how perception of the fourth industrial revolution shifted. Finally, people's perceptions of the fourth industrial revolution were identified by looking at the positive and negative sentiment related to it by year. The analysis showed that negative opinions declined year after year, while the outlook became more positive.

An Artificial Neural Network Based Phrase Network Construction Method for Structuring Facility Error Types (설비 오류 유형 구조화를 위한 인공신경망 기반 구절 네트워크 구축 방법)

  • Roh, Younghoon;Choi, Eunyoung;Choi, Yerim
    • Journal of Internet Computing and Services / v.19 no.6 / pp.21-29 / 2018
  • In the era of the fourth industrial revolution, the concept of the smart factory is emerging. There are efforts to use data analysis to predict the occurrence of facility errors, which have negative effects on utilization and productivity. Such prediction requires data describing the circumstances and the type of each facility error, called the facility error log. However, in many manufacturing companies, the types of facility error are not precisely defined and categorized. The worker who operates the facility records the error type as unstructured text based on his or her empirical judgement, which makes it impossible to analyze the data. Therefore, this paper proposes a framework for constructing a phrase network to support the identification and classification of facility error types using facility error logs written by operators. Specifically, phrases indicating the error types are extracted from the text data using a dictionary that classifies terms by their usage. Then, a phrase network is constructed by calculating the similarity between the extracted phrases. The performance of the proposed method was evaluated using real-world facility error logs. The proposed method is expected to contribute to the accurate identification of error types and to the prediction of facility errors.
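
The network-construction step described in the abstract above (computing similarity between extracted phrases and linking similar ones) can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the phrase vectors are hand-made stand-ins for the neural-network-based representations the paper refers to, cosine similarity and the threshold are assumed choices, and the names `cosine` and `build_phrase_network` are hypothetical.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def build_phrase_network(phrase_vectors, threshold):
    """Return {(phrase_a, phrase_b): similarity} for pairs at or above threshold."""
    edges = {}
    for (p, u), (q, v) in combinations(sorted(phrase_vectors.items()), 2):
        sim = cosine(u, v)
        if sim >= threshold:
            edges[(p, q)] = sim
    return edges

# Hypothetical phrase vectors; in practice these would come from a trained model.
phrase_vectors = {
    "motor overheating": [0.9, 0.1, 0.0],
    "motor temperature high": [0.8, 0.2, 0.1],
    "sensor cable loose": [0.0, 0.1, 0.9],
}
network = build_phrase_network(phrase_vectors, threshold=0.8)
```

With these made-up vectors, only the two motor-related phrases end up connected, which is the kind of grouping that would support identifying a shared error type.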

Analysis of Major COVID-19 Issues Using Unstructured Big Data (비정형 빅데이터를 이용한 COVID-19 주요 이슈 분석)

  • Kim, Jinsol;Shin, Donghoon;Kim, Heewoong
    • Knowledge Management Research / v.22 no.2 / pp.145-165 / 2021
  • In late December 2019, COVID-19 began to spread, throwing the entire world into panic. To overcome the crisis and minimize subsequent damage, the government and its affiliated institutions must maximize the effects of existing policy support and introduce a holistic response plan that reflects the changing situation, which is why it is crucial to analyze social topics and people's interests. This study investigates people's major thoughts, attitudes, and topics surrounding the COVID-19 pandemic using social media and big data. To capture public opinion, the study period was segmented according to government countermeasures. All data were collected from NAVER blogs from 31 December 2019 to 12 December 2020. TF-IDF keyword extraction and LDA topic modeling were applied as text-mining techniques. As a result, eight major issues related to COVID-19 were derived, and policy strategies are presented based on these keywords. The significance of this study is that it provides baseline data for Korean government authorities to devise appropriate countermeasures that meet people's needs in the midst of the COVID-19 pandemic.
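
To make the TF-IDF keyword extraction mentioned in the abstract above concrete, here is a minimal pure-Python sketch. It assumes one common variant (term frequency normalized by document length, idf = log(N / df)); the paper does not state which variant or tooling it used, and the toy documents and the names `tf_idf` and `top_keywords` are hypothetical. LDA topic modeling is not covered here.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Compute TF-IDF scores for a list of tokenized documents.

    Uses tf = count / document length and idf = log(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

def top_keywords(score, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Hypothetical tokenized blog posts.
docs = [
    ["mask", "shortage", "mask", "price"],
    ["vaccine", "schedule", "mask"],
    ["vaccine", "side", "effect"],
]
scores = tf_idf(docs)
keywords = top_keywords(scores[0], 2)
```

Note that a term appearing in many documents ("mask" here) is down-weighted even when it is frequent within a single post, which is the property that makes TF-IDF useful for surfacing period-specific keywords.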

Methodology for Classifying Hierarchical Data Using Autoencoder-based Deeply Supervised Network (오토인코더 기반 심층 지도 네트워크를 활용한 계층형 데이터 분류 방법론)

  • Kim, Younha;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.28 no.3 / pp.185-207 / 2022
  • Recently, with the development of deep learning technology, research applying deep learning algorithms to the analysis of unstructured data such as text and images has been actively conducted. Text classification has long been studied in academia and industry, and various attempts have been made to exploit data characteristics to improve classification performance. In particular, the hierarchical relationship among labels has been used for hierarchical classification. However, the top-down approach mainly used for hierarchical classification has the limitation that misclassification at a higher level blocks the opportunity for correct classification at a lower level. Therefore, in this study, we propose a methodology for classifying hierarchical data using an autoencoder-based deeply supervised network in which high-level classification does not block low-level classification, while still considering the hierarchical relationship among labels. The proposed methodology attaches a main classifier that predicts a low-level label to the autoencoder's latent variable and an auxiliary classifier that predicts a high-level label to the hidden layer of the autoencoder. Experiments on 22,512 academic papers confirmed that the proposed model achieved higher classification accuracy and F1-score than the traditional supervised autoencoder and DNN models.
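
One plausible way to write the training objective implied by the abstract above is as a weighted sum of a reconstruction term and two classification terms. The abstract does not give the loss, so the additive form, the weights $\alpha$ and $\beta$, and all symbols below are assumptions for illustration: $x$ is the input, $\hat{x}$ its reconstruction, $h$ a hidden-layer activation of the encoder, $z$ the latent variable, and $f_{\text{aux}}$, $f_{\text{main}}$ the auxiliary and main classifiers.

```latex
\mathcal{L}
  = \underbrace{\mathcal{L}_{\text{rec}}(x, \hat{x})}_{\text{autoencoder reconstruction}}
  + \alpha \, \underbrace{\mathcal{L}_{\text{CE}}\bigl(y_{\text{low}},\, f_{\text{main}}(z)\bigr)}_{\text{main classifier on latent variable}}
  + \beta \, \underbrace{\mathcal{L}_{\text{CE}}\bigl(y_{\text{high}},\, f_{\text{aux}}(h)\bigr)}_{\text{auxiliary classifier on hidden layer}}
```

Under this reading, both classifiers are trained jointly rather than in a top-down cascade, so an error in predicting $y_{\text{high}}$ does not restrict the candidates for $y_{\text{low}}$.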

Art transaction using big data Artist analysis system implementation (미술품 거래 빅데이터를 이용한 작가 분석 시스템 구현)

  • SeungKyung Lee;JongTae Lim
    • Journal of Service Research and Studies / v.11 no.2 / pp.79-93 / 2021
  • As of 2018, the domestic art market had grown 21.9% over the past five years to KRW 448.2 billion, and the number of transactions had increased 31.6% to 39,367 works, maintaining growth for the fifth consecutive year. Art distribution platforms are diversifying from offline galleries and auctions to online auctions. The art market consists of three areas: production (creation), distribution (trade), and consumption (buying) of works, and as awareness of their artistic as well as economic value spreads, interest in art as a means of investment is also increasing. Consumers who purchase works as a means of investment have a growing need for objective information about the works, but collecting and analyzing objective and reliable statistics is limited because information provision in the distribution area of the art market is closed and unbalanced. This paper identifies the state of art distribution objectively and reliably through big data collection and analysis of structured and unstructured data on the distribution area of the art market. Through this, we aim to implement a system that can objectively provide analysis of artists in the current market. This study collected artist information from art distribution sites and calculated the frequency of associated words for each artist by collecting and analyzing articles about the artists from Maeil Business, a daily newspaper. It aims to provide consumers with objective and reliable information.

A Study on an Automatic Classification Model for Facet-Based Multidimensional Analysis of Civil Complaints (패싯 기반 민원 다차원 분석을 위한 자동 분류 모델)

  • Na Rang Kim
    • Journal of Korea Society of Industrial Information Systems / v.29 no.1 / pp.135-144 / 2024
  • In this study, we propose an automatic classification model for quantitative multidimensional analysis based on facet theory to understand public opinions and demands on major issues through big data analysis. Civil complaints, as a form of public feedback, are generated by various individuals on multiple topics repeatedly and continuously in real-time, which can be challenging for officials to read and analyze efficiently. Specifically, our research introduces a new classification framework that utilizes facet theory and political analysis models to analyze the characteristics of citizen complaints and apply them to the policy-making process. Furthermore, to reduce administrative tasks related to complaint analysis and processing and to facilitate citizen policy participation, we employ deep learning to automatically extract and classify attributes based on the facet analysis framework. The results of this study are expected to provide important insights into understanding and analyzing the characteristics of big data related to citizen complaints, which can pave the way for future research in various fields beyond the public sector, such as education, industry, and healthcare, for quantifying unstructured data and utilizing multidimensional analysis. In practical terms, improving the processing system for large-scale electronic complaints and automation through deep learning can enhance the efficiency and responsiveness of complaint handling, and this approach can also be applied to text data processing in other fields.

Viscous Flow Analysis around a Wind Turbine Blade with End Plate and Rake (풍력터빈 날개의 끝판과 레이크 효과에 대한 점성유동장 해석)

  • Kim, Ju-In;Kim, Wu-Joan
    • Journal of the Korean Society for Marine Environment & Energy / v.14 no.4 / pp.273-279 / 2011
  • Turbulent flow around a wind turbine blade was analyzed to evaluate the power performance of an offshore wind turbine. The Fluent package was used to solve the Reynolds-averaged Navier-Stokes equations in non-inertial rotating coordinates. The realizable k-$\varepsilon$ model was used for turbulence closure, and a grid system combining structured and unstructured grids was generated. First, the lift and drag forces of a 2-D foil section were calculated and compared with existing experimental data for validation. Then the torque and thrust of a wind turbine blade with NACA 4-series sections were calculated at a fixed pitch angle and rpm. The tip speed ratio was varied by changing the wind speed. Next, three kinds of end plate were attached at the blade tip in order to increase the power of the wind turbine. Among them, the end plate attached to the suction side of the blade was found to be the most effective. Furthermore, performance analysis with tilt angle and rake was also performed.
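
The abstract above notes that the tip speed ratio was varied by changing wind speed at fixed rpm. A small sketch of that relationship, using the standard definition (tip speed ratio = rotor angular speed × rotor radius / wind speed), is given below. The rpm, rotor radius, and wind speeds are hypothetical values chosen for illustration and are not taken from the paper; `tip_speed_ratio` is likewise a made-up helper name.

```python
import math

def tip_speed_ratio(rpm, rotor_radius_m, wind_speed_ms):
    """Tip speed ratio: blade tip speed divided by free-stream wind speed."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular speed in rad/s
    return omega * rotor_radius_m / wind_speed_ms

# Hypothetical operating point: fixed rpm and radius, varying wind speed.
rpm, radius = 12.0, 50.0
ratios = {v: tip_speed_ratio(rpm, radius, v) for v in (6.0, 8.0, 10.0, 12.0)}
```

Because rpm is held fixed, the tip speed ratio is inversely proportional to wind speed, so sweeping the wind speed covers a range of operating conditions with a single rotating-frame setup.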

A Study on New Graduate Nurses' Clinical Experience of Adaptation (신규간호사의 임상적응 경험에 관한 연구)

  • Yun, Suk-Hui
    • Journal of Korean Academy of Nursing Administration / v.8 no.1 / pp.55-72 / 2002
  • Purpose: This study aimed to understand the lived experience of new graduate nurses in a hospital setting. Method: This study was based on a phenomenological approach. Eight new graduate nurses participated. Unstructured in-depth interviews were carried out from June to October 1998, and the data were analyzed using Van Kaam's method. Results: New graduate nurses experienced clinical adaptation and endeavor, professional conflict of nursing, disability of nursing performance, work stress and management, maladaptation of human relationships, diverse emotions of interpersonal relationships, change of personal identity, and differences between the theory and reality of nursing. Clinical adaptation and endeavor included becoming familiar with hospital life, satisfaction with and a sense of worth in nursing, becoming accustomed to scoldings, calming oneself, and efforts at self-development. Professional conflict of nursing included uncertainty about the future in the hospital, dissatisfaction with work, and doubts and regrets about nursing. Disability of nursing performance included lack of knowledge and skills, desolate working, and stress from mistakes at work. Work stress and management included psychological signs and symptoms, physical signs and symptoms, and management of stress. Maladaptation of human relationships included unsuitability to the nursing unit climate and difficulties in human relationships. Diverse emotions of interpersonal relationships included regrets and absurdities toward clients, thanks and regrets toward colleagues, difficulties with doctors, and a sense of rivalry with fellow nurses. Change of personal identity included a narrowing life circle and change of personality. Differences between theoretical knowledge and practice included the need for systematic pre-education for clinical nursing practice and differences between nursing education and clinical practice. Conclusion: Human resource management that reflects new graduate nurses' experiences should be developed.
