• Title/Summary/Keyword: Big data collection

Comparison of Data Collection Methods for Big Data Analysis (빅데이터 분석을 위한 자료 수집 방안 비교)

  • Kim, Sung-kook;Oh, Chang-heon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.422-424 / 2018
  • Interest in big data analysis has grown recently, and a variety of data collection methods have been developed, yet researchers still find it difficult to collect and use such large-scale data. This paper compares and analyzes several methods of collecting big data and presents the results. The authors hope that researchers who select and use a method matching their research objectives will be able to produce research results.

A Study on Big Data Processing Technology Based on Open Source for Expansion of LIMS (실험실정보관리시스템의 확장을 위한 오픈 소스 기반의 빅데이터 처리 기술에 관한 연구)

  • Kim, Soon-Gohn
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.2 / pp.161-167 / 2021
  • A Laboratory Information Management System (LIMS) is a centralized database for storing, processing, retrieving, and analyzing laboratory data; the term refers to a computer system specially designed for laboratories that perform inspection, analysis, and testing tasks. In particular, a LIMS provides functions that support laboratory operation, and it requires workflow management or data tracking support. In this paper, we use crawling, one of the automated big data collection technologies, to collect data from websites and various other channels in support of laboratory operation. From the collected test methods and contents, those useful to the tester are recommended. In addition, we implement a complementary LIMS platform that can verify the collection channels by managing feedback.

Designing Cost Effective Open Source System for Bigdata Analysis (빅데이터 분석을 위한 비용효과적 오픈 소스 시스템 설계)

  • Lee, Jong-Hwa;Lee, Hyun-Kyu
    • Knowledge Management Research / v.19 no.1 / pp.119-132 / 2018
  • Many advanced products and services are emerging in the market thanks to data-based technologies such as the Internet of Things (IoT), big data, and AI. A data processing system for an IoT network environment is not simple to configure, and it is heavily constrained by the high cost of building a high-performance server environment. In this paper, we therefore design a low-cost, practical development environment for a large-scale data analysis computing platform using open source. Specifically, this study intends to implement a big data processing system using the Raspberry Pi, an ultra-small PC, and open source APIs. The system includes building a portable server system, building a web server for web mining, developing Python IDE classes for crawling, and developing R libraries for NLP and visualization. Through this research, we will develop a web environment that can control real-time collection and analysis of web media data in a mobile environment and present it as a curriculum for non-IT specialists.

Big Data, Business Analytics, and IoT: The Opportunities and Challenges for Business (빅데이터, 비즈니스 애널리틱스, IoT: 경영의 새로운 도전과 기회)

  • Jang, Young Jae
    • The Journal of Information Systems / v.24 no.4 / pp.139-152 / 2015
  • With the advancement of Internet and IT technologies and increased computing power, massive amounts of data can now be collected, stored, and processed. The availability of large databases has brought forth a new era in which companies are hard pressed to find innovative ways to utilize the immense amounts of data at their disposal. Indeed, data has opened a new age of business operations and management. There are already many cases of innovative businesses reaping success thanks to scientific decisions based on data analysis and mathematical algorithms. Big Data is a new paradigm in itself. In this article, Big Data is viewed as a new perspective rather than a new technology. This value-centric definition of Big Data provides new insight and opportunities. Moreover, Business Analytics, the framework for creating tangible results in management, is introduced. Then the Internet of Things (IoT), another innovative concept of data collection and networking, is presented, along with how this concept can be interpreted together with Big Data from the value-centric perspective. The challenges and opportunities that come with these new concepts are also discussed.

Efficient K-Anonymization Implementation with Apache Spark

  • Kim, Tae-Su;Kim, Jong Wook
    • Journal of the Korea Society of Computer and Information / v.23 no.11 / pp.17-24 / 2018
  • Today, we are living in the era of data and information. With the advent of the Internet of Things (IoT), the popularity of social networking sites, and the development of mobile devices, a large amount of data is being produced in diverse areas. The collection of such data generated in various areas is called big data. As the importance of big data grows, there has been a growing need to share big data containing information about individual entities. Because big data contains sensitive information about individuals, directly releasing it for public use may violate existing privacy requirements. Thus, privacy-preserving data publishing (PPDP) has been actively studied as a way to share big data containing personal information for public use while preserving the privacy of individuals. K-anonymity, the most popular method in the area of PPDP, transforms each record in a table such that at least k records have the same values for the given quasi-identifier attributes, so that each record is indistinguishable from the other records in the same class. As the size of big data continues to grow, there is a growing demand for methods that can efficiently anonymize vast amounts of data. Thus, in this paper, we develop an efficient k-anonymity method using the Spark distributed framework. Experimental results show that the developed method achieves significant gains in processing time.
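
The k-anonymity condition described in the abstract above can be checked with a short sketch. This is only an illustration of the definition, written in plain Python rather than Spark, and it is not the authors' method. The function name `is_k_anonymous`, the toy table, and its column names are all hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that appears in the table is shared by at least k records."""
    # Group records into equivalence classes keyed by their
    # quasi-identifier values and count the size of each class.
    class_sizes = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(size >= k for size in class_sizes.values())


# Hypothetical toy table whose age and ZIP values are already generalized.
records = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cold"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "asthma"},
]

print(is_k_anonymous(records, ["age", "zip"], 2))
print(is_k_anonymous(records, ["age", "zip"], 3))
```

Each equivalence class in the toy table holds two records, so the table satisfies the condition for k = 2 but not for k = 3. A distributed version would presumably compute the same group sizes with a group-by over the quasi-identifier columns, but that mapping is an assumption and is not taken from the paper.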

A Case Study on Product Production Process Optimization using Big Data Analysis: Focusing on the Quality Management of LCD Production (빅데이터 분석 적용을 통한 공정 최적화 사례연구: LCD 공정 품질분석을 중심으로)

  • Park, Jong Tae;Lee, Sang Kon
    • Journal of Information Technology Services / v.21 no.2 / pp.97-107 / 2022
  • Interest in smart factories has been increasing recently, and manufacturing plants continue to invest in improving intelligence and automation. Facility automation based on sensor data collection is now essential. In addition, we operate our factories on the basis of data generated in all areas of production, including production management, facility operation, and quality management, together with an integrated standard information system. When producing LCD polarizer products, it is most important to link trace information between the data generated by individual production processes. All systems involved in production must ensure that no data is lost and that data integrity is maintained. The large-capacity data collected from individual systems is composed of key values linked to each other, and a real-time quality analysis system based on the connected, integrated system data is required. In this study, methods for large-capacity data collection, storage, integration, and loss prevention are presented for the optimization of LCD polarizer production. An identification risk model for inspection products can be added, and the applicable product model is designed to be continuously expanded. Using big data technology and the final inspection images of the product, a quality inspection and analysis system that maximizes the yield rate was designed. Products predefined as analyzable are verified with the big data kNN analysis model, and individual analysis results are continuously applied to the actual production site so that the system operates in a virtuous cycle. Production optimization was performed by applying the system to an LCD polarizer production line currently in operation.

A Study on the Analysis Method of ICT Policy Triggering Mechanism Using Social Big Data (소셜 빅데이터 특성을 활용한 ICT 정책 격발 메커니즘 분석방법 제안)

  • Choi, Hong Gyu
    • Journal of Korea Multimedia Society / v.24 no.8 / pp.1192-1201 / 2021
  • This study focused on how to analyze the ICT policy formation process using social big data. Specifically, in this study, a method for quantifying variables that influenced policy formation using the concept of a policy triggering mechanism and elements necessary to present the analysis results were proposed. For the analysis of the ICT policy triggering mechanism, variables such as 'Scope', 'Duration', 'Interactivity', 'Diversity', 'Attention', 'Preference', 'Transmutability' were proposed. In addition, 'interpretation of results according to data level', 'presentation of differences between collection and analysis time points', and 'setting of garbage level' were suggested as elements necessary to present the analysis results.

Fishery R&D Big Data Platform and Metadata Management Strategy (수산과학 빅데이터 플랫폼 구축과 메타 데이터 관리방안)

  • Kim, Jae-Sung;Choi, Youngjin;Han, Myeong-Soo;Hwang, Jae-Dong;Cho, Wan-Sup
    • The Journal of Bigdata / v.4 no.2 / pp.93-103 / 2019
  • In this paper, we introduce a big data platform and a metadata management technique for fishery science R&D information. The big data platform collects and integrates various types of fisheries science R&D information, and we suggest how to build it in the form of a data lake. In addition to the existing data collected and accumulated in the field of fisheries science, we propose building a big data platform that supports diverse analyses by collecting unstructured big data such as satellite image data, research reports, and research data. Next, collecting and managing metadata during data extraction, preprocessing, and storage makes systematic management of fisheries science big data possible. By establishing metadata in a standard form alongside the construction of the big data platform, this work contributes a systematic and continuous big data management method that spans the data lifecycle, including data collection, storage, utilization, and distribution.

Proposal of AI-based Digital Forensic Evidence Collecting System

  • Jang, Eun-Jin;Shin, Seung-Jung
    • International Journal of Internet, Broadcasting and Communication / v.13 no.3 / pp.124-129 / 2021
  • As the 4th industrial era is in full swing, public interest in related technologies such as artificial intelligence, big data, and blockchain is increasing. As artificial intelligence technology is used in various industrial fields, the need for research methods that incorporate it in related fields is also increasing. Among digital forensic investigation techniques, evidence collection is a very important procedure in an investigation that needs to substantiate suspicions against a specific person. However, evidence may be damaged intentionally or for other physical reasons, and in such situations the collection of evidence is limited. Therefore, this paper proposes an artificial intelligence-based evidence collection system that analyzes, in real time, the numerous image files reported by citizens so that the location, user information, and shooting time of the image files can be checked visually. When this system is applied, it is expected that the potential evidence data collected in real time can actually be used as evidence, and that risk area analysis will also be possible through big data analysis.

A Study on Analysis of Problems in Data Collection for Smart Farm Construction (스마트팜 구축을 위한 데이터수집의 문제점 분석 연구)

  • Kim Song Gang;Nam Ki Po
    • Convergence Security Journal / v.22 no.5 / pp.69-80 / 2022
  • Now that climate change and food resource security are becoming issues around the world, smart farms are emerging as an alternative for addressing them. In addition, changes in the production environment are a major concern for people engaged in all primary industries (agriculture, livestock, fishery), and the resulting food shortage is an important problem that we all need to solve. To solve it, the primary industries, through both the public and private sectors, are working to improve productivity by introducing smart farms built on Fourth Industrial Revolution technologies such as ICT, BT, IoT, big data, and artificial intelligence. This paper considers the minimum requirements for a smart farm data collection system for the development and utilization of smart farms, the establishment of a sustainable agricultural management system, a sequential system construction method, and a purposeful, efficient, and usable data collection system. In particular, based on in-depth field investigations in the livestock sector (pig farming) and analysis of various cases, we analyze and improve the problems of the data collection system for building a Korean smart farm standard model, which is currently facing limitations. The goal is to propose a method for collecting big data that establishes an efficient and usable big data collection system.