• Title/Summary/Keyword: Hadoop System

Algorithm Design to Judge Fake News based on Bigdata and Artificial Intelligence

  • Kang, Jangmook;Lee, Sangwon
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.11 no.2
    • /
    • pp.50-58
    • /
    • 2019
  • The objective of this study is to design a fake news discrimination algorithm for news articles transmitted as text, together with an architecture that builds the algorithm into a system (a hardware configuration using Hadoop-based in-memory technology and a deep learning software design for big data and SNS linkage). Based on learning data from actual news, the government will submit advanced "fake news" test data as a result and complete theoretical research based on it. The need for this research arises from the social cost of rumors (including malicious comments) and written false news, caused by the flood of fake news, false reports, rumors and stabbings, among other social challenges. In addition, fake news can distort normal communication channels, undermine mutual trust between people, and reduce social capital at the same time. The ultimate aim is to extend the study to cases where it is difficult to distinguish between the false and the exaggerated, the fake and the hypocritical, the sincere and the false, fraud and error, and truth and falsehood.

User Authentication Scheme based on Secret Sharing for Distributed File System in Hadoop (하둡의 분산 파일 시스템 구조를 고려한 비밀분산 기반의 사용자 인증 기법)

  • Kim, Su-Hyun;Lee, Im-Yeong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2013.11a
    • /
    • pp.740-743
    • /
    • 2013
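  • In cloud computing environments, user data is encrypted and stored across numerous distributed servers. Global Internet service companies such as Google and Yahoo have recognized the importance of Internet service platforms and, through their own research and development, have built and deployed cloud computing platform technologies based on large-scale clusters of low-cost commodity nodes. As distributed computing environments make a wide range of data services possible, the distributed management of large-scale data has become a major issue. At the same time, the varied ways in which large-scale data is used can expose security vulnerabilities and privacy violations caused by malicious attackers or internal users. In particular, the block access token that Hadoop uses to control permissions on data blocks is also subject to various security vulnerabilities. To address these vulnerabilities, this paper proposes a block access token management scheme based on secret sharing.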
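
The abstract does not describe the proposed scheme in detail. As general background on secret sharing only, the following is a minimal sketch of Shamir's (k, n) threshold scheme, in which any k of n shares recover a secret. The field prime, the function names `split_secret` and `recover_secret`, and the idea of treating a token key as an integer are assumptions made for illustration; this is not the authors' method.

```python
import secrets

# Assumption: a prime field large enough for a small integer secret.
PRIME = 2**127 - 1  # a Mersenne prime


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, threshold=3, num_shares=5)
print(recover_secret(shares[:3]))
```

Fewer than `threshold` shares reveal nothing about the secret in this scheme, which is why it is a common building block when no single server should hold a complete key.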

Improvement of Reliability for Hadoop Distributed File System using Snapshot and Access Control (스냅샷과 접근제한 기법을 이용한 하둡 분산 파일 시스템의 신뢰성 향상)

  • Shin, Dong Hoon;Youn, Hee Yong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.11a
    • /
    • pp.137-138
    • /
    • 2009
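  • Many storage systems and file systems use snapshots to increase system reliability.[1] In addition, as the importance of information protection has drawn more attention in recent years, many systems now take care over data security. However, Hadoop, one of the representative distributed computing systems, does not provide these features, which is very likely to become a problem later. This paper describes the shortcomings that affect the reliability of the current Hadoop system and, as part of a remedy, proposes snapshot and access control features.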

An Efficient Log Data Processing Architecture for Internet Cloud Environments

  • Kim, Julie;Bahn, Hyokyung
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.8 no.1
    • /
    • pp.33-41
    • /
    • 2016
  • Big data management is becoming an increasingly important issue for the information science community in both industry and academia. One important category of big data generated by software systems is log data. Service providers generally use log data to improve their services, and it can also be used to improve system reliability. In this paper, we propose a novel big data management architecture specialized for log data. The proposed architecture provides a scalable log management system that consists of client-side and server-side modules for efficient handling of log data. To support large volumes of simultaneous log data from multiple clients, we adopt the Hadoop infrastructure in the server-side file system for storing and managing log data efficiently. We implement the proposed architecture to support various client environments and validate its efficiency through measurement studies. The results show that the proposed architecture performs better than the existing logging architecture by 42.8% on average. All components of the proposed architecture are implemented with open source software, and the developed prototypes are now publicly available.

Adaptable I/O System based I/O Reduction for Improving the Performance of HDFS

  • Park, Jung Kyu;Kim, Jaeho;Koo, Sungmin;Baek, Seungjae
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.16 no.6
    • /
    • pp.880-888
    • /
    • 2016
  • In this paper, we propose a new HDFS-AIO framework that enhances HDFS with the Adaptive I/O System (ADIOS), which supports many different I/O methods and enables applications to select optimal I/O routines for a particular platform without source-code modification or re-compilation. First, we customize ADIOS into a chunk-based storage system so that its API semantics fit the requirements of HDFS; then, we use the Java Native Interface (JNI) to bridge HDFS and the tailored ADIOS. We compare HDFS-AIO and the original HDFS under different I/O patterns, and the experimental results show the feasibility and benefits of the design. We also examine the performance of HDFS-AIO using various I/O techniques. Although many studies have used ADIOS, our research is expected to help expand the functionality of HDFS.

Hadoop Based Wavelet Histogram for Big Data in Cloud

  • Kim, Jeong-Joon
    • Journal of Information Processing Systems
    • /
    • v.13 no.4
    • /
    • pp.668-676
    • /
    • 2017
  • Recently, the importance of big data has been emphasized with the development of smartphones and web/SNS services. As a result, MapReduce, which can efficiently process big data, is receiving worldwide attention because of its excellent scalability and stability. Because big data is large in volume, generated quickly, and varied in its properties, it is more efficient to process summary information about big data than the big data itself. The wavelet histogram, a typical technique for generating data summaries, can generate optimal summary information that does not cause loss of information from the original data. Therefore, systems that apply a wavelet histogram generation technique based on MapReduce have been actively studied. However, existing research has the disadvantage that generation is slow because the wavelet histogram is generated through one or more MapReduce jobs. There is also a high possibility that the error of the data restored from the wavelet histogram becomes large. In contrast, the MapReduce-based wavelet histogram generation system developed in this paper generates the wavelet histogram through a single MapReduce job, so the generation speed can be greatly increased. In addition, because the wavelet histogram is generated by adjusting the error boundary specified by the user, the error of the data restored from the wavelet histogram can be controlled. Finally, we verified the efficiency of the wavelet histogram generation system developed in this paper through performance evaluation.
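
To make the idea of a wavelet-based summary concrete, here is a minimal single-machine sketch of an unnormalized Haar wavelet decomposition of a frequency vector, with small detail coefficients dropped. It is not the paper's MapReduce system: the function names, the power-of-two input length, and the simple magnitude cutoff `min_magnitude` (a crude stand-in for a user-specified error bound) are all assumptions for illustration.

```python
def haar_decompose(values):
    """Return [overall average, detail coefficients from coarse to fine]."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = list(values)
    details = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        diffs = [(current[i] - current[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels end up in front
        current = averages
    return current + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(current)]
        expanded = []
        for avg, diff in zip(current, level):
            expanded.extend([avg + diff, avg - diff])
        pos += len(current)
        current = expanded
    return current


def drop_small(coeffs, min_magnitude):
    """Keep the average; zero out detail coefficients below the cutoff."""
    return [coeffs[0]] + [c if abs(c) >= min_magnitude else 0.0 for c in coeffs[1:]]


freqs = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_decompose(freqs)
summary = drop_small(coeffs, min_magnitude=1.0)
print(coeffs)
print(haar_reconstruct(summary))
```

Storing only the nonzero entries of `summary` gives a compact histogram; raising the cutoff shrinks the summary at the cost of a larger reconstruction error.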

A Study on the Customized Food Menu Recommendation System Based on ICT and Big Data (ICT 및 빅데이터기반 맞춤형 음식메뉴 추천시스템 연구)

  • Ryoo, Hee-Soo;Lee, Man-ting
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.2
    • /
    • pp.339-346
    • /
    • 2021
  • In this paper, we implemented an interface that provides a better food ordering mechanism and enables real-time selection of recipe ingredient ratios for customized food orders from global customers. Rather than a system for simply selecting and ordering food menus, the order screen presents a selection of menu items, shows the basic ratio of each recipe ingredient, and offers a customized ingredient composition through a recipe graph, so that appropriate food can be provided to global customers. By enabling this interaction, the food menu ordering device lets users receive customized service by adjusting the ratios of the various recipe ingredients.

An Extraction Method of Sentiment Information from Unstructured Big Data on SNS (SNS상의 비정형 빅데이터로부터 감성정보 추출 기법)

  • Back, Bong-Hyun;Ha, Ilkyu;Ahn, ByoungChul
    • Journal of Korea Multimedia Society
    • /
    • v.17 no.6
    • /
    • pp.671-680
    • /
    • 2014
  • Recently, with the remarkable growth of social network services, it has become necessary to extract useful information from the large volume of data about individual opinions and preferences on SNS (Social Network Service). Sentiment information can be applied to various fields of society such as politics, public opinion, economics, personal services and entertainment. To extract sentiment information, it is necessary to use processing techniques that store a large amount of SNS data, extract meaningful data from it, and search for the sentiment information. This paper proposes an efficient method to extract sentiment information from various unstructured big data on social networks using the HDFS (Hadoop Distributed File System) platform and MapReduce functions. In experiments, the proposed method collects and stores data steadily as the volume of data increases. When the proposed functions are applied to sentiment analysis, the system maintains load balancing and the analysis results are very close to the results of manual work.
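
As a rough illustration of how sentiment counting can be phrased as map and reduce steps, the sketch below runs both phases in plain Python on one machine. The tiny `LEXICON`, the function names, and the whitespace tokenization are hypothetical; the paper's actual MapReduce functions and sentiment dictionary are not given in the abstract.

```python
from collections import defaultdict

# Hypothetical sentiment lexicon: word -> polarity (+1 or -1).
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}


def map_post(post):
    """Map phase: emit (label, 1) for every lexicon word found in a post."""
    for word in post.lower().split():
        token = word.strip(".,!?")
        if token in LEXICON:
            yield ("positive" if LEXICON[token] > 0 else "negative", 1)


def reduce_counts(pairs):
    """Reduce phase: sum the emitted counts per label."""
    totals = defaultdict(int)
    for label, count in pairs:
        totals[label] += count
    return dict(totals)


def sentiment_counts(posts):
    pairs = (pair for post in posts for pair in map_post(post))
    return reduce_counts(pairs)


posts = ["I love this phone, great camera!", "Awful battery. I hate it."]
print(sentiment_counts(posts))
```

In a real Hadoop job the map and reduce functions would run on separate nodes over HDFS blocks, with the framework handling the grouping by key that `reduce_counts` does here in memory.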

Development of CEP-based Real Time Analysis System Using Hospital ERP System (병원 ERP시스템을 적용한 CEP 기반 실시간 분석시스템 개발)

  • Kim, Mi-Jin;Yu, Yun-Sik;Seo, Young-Woo;Jang, Jong-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2015.05a
    • /
    • pp.290-293
    • /
    • 2015
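  • Although an individual's data may not be important from a business standpoint, data collected in bulk forms a body in which hidden new information may be discovered, and cases of big data analysis are steadily increasing. Among big data analysis technologies, Hadoop, a traditional data analysis method, has long been and still is widely used for analyzing structured and unstructured big data. However, Hadoop is a batch processing system, so response delays become more likely as the data volume grows, which makes it difficult to analyze in real time the enormous volume of high-speed event data about today's business and market environments. In this paper, as an alternative suited to rapidly changing business environments, we developed a real-time analysis system that uses open-source CEP (Complex Event Processing) technology to analyze event streams of hundreds to hundreds of thousands of events or more per second in real time without delay, and applied it to a hospital ERP system.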
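
To contrast event-at-a-time processing with batch processing, the sketch below implements one very simple CEP-style rule: raise a flag when at least a given number of events arrive within a sliding time window. The class name, the window length, and the threshold are hypothetical, and the abstract does not say which CEP engine or rules the authors used.

```python
from collections import deque


class SlidingWindowCounter:
    """Flags when `threshold` or more events fall inside the last `window_seconds`."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.timestamps = deque()

    def observe(self, timestamp):
        """Record one event (timestamps must be non-decreasing); return the flag."""
        self.timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Evict events that have slid out of the window.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps) >= self.threshold


counter = SlidingWindowCounter(window_seconds=10, threshold=3)
print([counter.observe(t) for t in [0, 4, 8, 30, 31]])
```

Each event is evaluated as it arrives and old events are evicted incrementally, so the rule responds immediately instead of waiting for a batch job to finish.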

Effective Countermeasure to APT Attacks using Big Data (빅데이터를 이용한 APT 공격 시도에 대한 효과적인 대응 방안)

  • Mun, Hyung-Jin;Choi, Seung-Hyeon;Hwang, Yooncheol
    • Journal of Convergence Society for SMB
    • /
    • v.6 no.1
    • /
    • pp.17-23
    • /
    • 2016
  • Recently, Internet services have become available through various devices, including smartphones. With the development of ICT, numerous hacking incidents have occurred, and most of those attacks have turned out to be APT attacks. An APT attack is an attack method in which a hacker continuously collects information to achieve a goal, analyzes the weaknesses of the target, infects it with malicious code, and, while remaining hidden, eventually leaks the data. In this paper, we examine the information collection methods that APT attackers use, with the help of big data, to invade a target system in a short time, and we suggest and evaluate a countermeasure that uses big data to protect against this attack method.