• Title/Summary/Keyword: Structured Big Data


Big Data and U-City Services (빅데이터와 U-City 서비스)

  • Lee, Hyun-Ku;Oh, Jay In
    • The Journal of Bigdata / v.2 no.1 / pp.71-75 / 2017
  • Big data has attracted attention from both industry and academia because of the growth of social network services. This study analyzes application cases of big data according to the categories of U-City services. The results show that internal, unstructured information is used more often than external, structured information to generate big data.


Research of Knowledge Management and Reusability in Streaming Big Data with Privacy Policy through Actionable Analytics (스트리밍 빅데이터의 프라이버시 보호 동반 실용적 분석을 통한 지식 활용과 재사용 연구)

  • Paik, Juryon;Lee, Youngsook
    • Journal of Korea Society of Digital Industry and Information Management / v.12 no.3 / pp.1-9 / 2016
  • In its current sense, "Big Data" refers to all the techniques for value extraction and actionable analytics, as well as the associated management tools. In particular, advances in wireless sensor networks yield diverse patterns of digital records. These records are mostly semi-structured and unstructured data, which are usually beyond the capabilities of existing management tools. Such data are growing rapidly because of their complex data structures: the complex type effectively supports data exchangeability and heterogeneity, which is the main reason their volumes keep increasing in sensor networks. However, many errors and problems arise in applications because management solutions for the complex data model are rarely available in current big data environments. To solve these problems, we aim to provide a solution for actionable analytics and semantic reusability in sensor-web-based streaming big data using a new data structure, and thereby to strengthen competitiveness.

How to improve oil consumption forecast using google trends from online big data?: the structured regularization methods for large vector autoregressive model

  • Choi, Ji-Eun;Shin, Dong Wan
    • Communications for Statistical Applications and Methods / v.29 no.1 / pp.41-51 / 2022
  • We forecast the US oil consumption level using Google Trends data, that is, the search volumes of specific terms that people search for on Google. We focus on whether a proper selection of Google Trends terms improves forecast performance for oil consumption. As forecast models, we consider least absolute shrinkage and selection operator (LASSO) regression and the structured regularization method for the large vector autoregressive (VAR-L) model of Nicholson et al. (2017), both of which automatically select the Google Trends terms and the lags of the predictors. An out-of-sample forecast comparison reveals that reducing the high-dimensional Google Trends data set to a low-dimensional one with the LASSO and VAR-L models yields better forecasts of oil consumption than frequently used models such as the autoregressive model, the autoregressive distributed lag model, and the vector error correction model.
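
The automatic selection that the abstract attributes to LASSO can be illustrated with a minimal sketch. This is not the authors' code and it does not implement the VAR-L model; it is a plain coordinate-descent LASSO on toy data, assuming the objective (1/2n)·||y − Xβ||² + λ·||β||₁. The names `soft_threshold`, `lasso_cd`, and the toy columns are hypothetical, and in the paper's setting each column would be a (lagged) Google Trends series.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent LASSO for (1/2n)*||y - X b||^2 + lam*||b||_1.

    X is a list of rows (no intercept column); returns the coefficient list.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of column j with the partial residual
            norm = 0.0  # squared norm of column j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return beta


# Toy data (hypothetical): column 0 drives y, column 1 is an irrelevant predictor.
X = [[1.0, 0.5], [2.0, -0.3], [3.0, 0.1], [4.0, -0.2], [5.0, 0.4]]
y = [2.0 * row[0] for row in X]
beta = lasso_cd(X, y, lam=0.1)
print(beta)  # the irrelevant predictor's coefficient is shrunk to exactly 0.0
```

The L1 penalty sets the coefficient of the irrelevant column to exactly zero while only slightly shrinking the relevant one, which is the sense in which LASSO "selects" predictors.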

Analysis of the Influence Factors of Data Loading Performance Using Apache Sqoop (아파치 스쿱을 사용한 하둡의 데이터 적재 성능 영향 요인 분석)

  • Chen, Liu;Ko, Junghyun;Yeo, Jeongmo
    • KIPS Transactions on Software and Data Engineering / v.4 no.2 / pp.77-82 / 2015
  • Big Data technology has attracted much attention for its fast data processing. Research is also under way on applying Big Data technology to process large-scale structured data much faster than in a Relational Database (RDB). Although there are many studies that measure analysis performance, studies on the loading performance of structured data, the step that precedes analysis, are very rare. Thus, this study tests the performance of loading structured data from an RDB into the distributed processing platform Hadoop using Apache Sqoop. To analyze the factors that influence data loading, the tests are repeated with different loading options, and loading performance is compared across RDB-based servers. Although the data loading performance of Apache Sqoop was low in the test environment, much better performance can be expected in a large-scale Hadoop cluster environment, where more hardware resources are available. This study is expected to serve as a basis for research on improving data loading performance and on the overall process of analyzing structured data on the Hadoop platform.

A Study on Satisfaction Survey Based on Regression Analysis to Improve Curriculum for Big Data Education (빅데이터 양성 교육 교과과정 개선을 위한 회귀분석 기반의 만족도 조사에 관한 연구)

  • Choi, Hyun
    • Journal of the Korean Society of Industry Convergence / v.22 no.6 / pp.749-756 / 2019
  • Big data refers to structured and unstructured data whose sheer volume makes it difficult to collect, store, and otherwise handle. Many institutions, including universities, are building student convergence systems to foster talent for data science and AI convergence, but there is a severe lack of research on what kind of education students need. Therefore, to improve the curriculum by understanding the satisfaction and demands of the participants in the "2019 Big Data Youth Talent Training Course" held at K University, this paper conducts a correlation analysis based on a questionnaire covering basic surveys and courses, followed by a regression analysis. The results show that the higher the overall satisfaction, the satisfaction with the connection between classes and jobs, and the perceived self-development, the more positive the evaluation of program efficiency.

Applying Service Quality to Big Data Quality (빅데이터 품질 확장을 위한 서비스 품질 연구)

  • Park, Jooseok;Kim, Seunghyun;Ryu, Hocheol;Lee, Zoonky;Lee, Jangho;Lee, Junyong
    • The Journal of Bigdata / v.2 no.2 / pp.87-93 / 2017
  • Research on data quality has been conducted for a long time. However, that research has focused on structured data. With the recent digital revolution, or fourth industrial revolution, quality control of big data is becoming more important. In this paper, we analyze and classify the types of big data quality based on previous research. These types can be classified into value, data structure, process, value chain, and maturity model. Based on these comparative studies, this paper proposes a new standard: service quality of big data.


A Review on the Management of Water Resources Information based on Big Data and Cloud Computing (빅 데이터와 클라우드 컴퓨팅 기반의 수자원 정보 관리 방안에 관한 검토)

  • Kim, Yonsoo;Kang, Narae;Jung, Jaewon;Kim, Hung Soo
    • Journal of Wetlands Research / v.18 no.1 / pp.100-112 / 2016
  • Recently, the direction of water resources policy has been shifting from conventional planning for water use and flood control toward sustainable water resources management that improves quality of life. As a result, water resources information, including its collection, management, and supply, is becoming an important concern for policy decision making. Until now, structured data have been analyzed according to the purpose of providing water resources information. However, the recent trend is toward big data and cloud computing, which can create new value by linking unstructured data with structured data, so the way water resources information is managed is also changing. In line with this paradigm change in information management, this study suggests an application of big data and cloud computing in the water resources field for the efficient management and use of water. We examined the current state and direction of policy related to water resources information in Korea and in another country. We then connected volume, velocity, and variety, the three basic components of big data, with veracity and value, which have been added more recently. We also discussed rapid and flexible responses, via cloud computing, to changes in consumers and to the growth of big data related to water resources. In the future, the management of water resources information should move in a direction that enhances the value (Value) of the information through big data and cloud computing, based on the amount of data (Volume), the speed of data processing (Velocity), and the number of data types (Variety). It should also enhance that value (Value) by fusing water with other fields and by producing the accurate information (Veracity) required for water management, disaster prevention, and the protection of life and property.

Design and Implementation of a Web Crawler System for Collection of Structured and Unstructured Data (정형 및 비정형 데이터 수집을 위한 웹 크롤러 시스템 설계 및 구현)

  • Bae, Seong Won;Lee, Hyun Dong;Cho, DaeSoo
    • Journal of Korea Multimedia Society / v.21 no.2 / pp.199-209 / 2018
  • Recently, services provided to consumers, such as low-priced shopping, customized advertisement, and product recommendation, are increasingly being combined with big data. With the growing importance of big data, the web crawler that collects data from the web has also become important. However, existing web crawlers have two problems. First, if a URL is hidden behind a link, the crawler cannot reach it by URL. Second, they inefficiently fetch more data than the user wants. Therefore, in this paper, we use Casper.js, which can control the DOM in a headless browser, to generate DOM events and thereby reach the URLs behind hidden links. We also propose an intelligent web crawler system that lets users define steps to fine-tune the collection of both structured and unstructured data so that only the data they want is retrieved. Finally, we show the superiority of the proposed crawler system through a performance evaluation comparing it with an existing web crawler.

Current Status of Big Data Utilization (빅데이터의 국내.외 활용 고찰 및 시사점)

  • Lee, Seong-Hoon;Lee, Dong-Woo
    • Journal of Digital Convergence / v.11 no.2 / pp.229-233 / 2013
  • Technologies in the information and communication fields are advancing continuously, and today they are rapidly converging with other industries. As a result, a large amount of data is generated in our daily lives. Devices such as smartphones, cameras, game consoles, and tablet PCs produce various types of data, and traffic is increasing. This is called Big Data. Many efforts are being made to create new value through the use of Big Data. In this paper, we therefore describe the current trends and the future of Big Data.

Development of Data Visualization Tools for Land-Based Fish Farm Big Data Analysis System (육상 양식장 빅데이터 분석 시스템 개발을 위한 데이터 시각화 도구 개발)

  • Seoung-Bin Ye;Jeong-Seon Park;Hyi-Thaek Ceong;Soon-Hee Han
    • The Journal of the Korea institute of electronic communication sciences / v.19 no.4 / pp.763-770 / 2024
  • Land-based fish farms that use seawater have adopted and now operate various equipment, such as real-time water quality monitoring systems, facility automation systems, and automated dissolved oxygen supply devices. The data collected from this equipment form structured and unstructured big data on the water quality environment, facility operations, and visual information from the workplace. The big data generated in the operational environment of fish farms can be used to improve operational and production efficiency through the development and application of various methods. This study aims to develop a system for effectively analyzing and visualizing big data produced by land-based fish farms. It proposes a data visualization process suitable for a fish farm big data analysis system, develops big data visualization tools, and compares the results. It also presents intuitive visualization models for exploring and comparing big data with time-series characteristics.