• Title/Summary/Keyword: big node (빅노드)

Video Big Data Processing Scheme for Spatio-Temporal Analysis of Moving Objects (움직이는 물체의 시공간 분석을 위한 동영상 빅 데이터 처리 방안)

  • Jung, Seungwon;Kim, Yongsung;Jung, Sangwon;Kim, Yoonki;Hwang, Eenjun
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.833-836 / 2017
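  • As video recording devices such as dashcams and CCTV have become commonplace, vast amounts of video data are being generated in real time. If vehicle information could be extracted from this large-scale data, applications such as tracking vehicles involved in crimes and measuring traffic congestion would become possible. Implementing this requires a system that can process video data generated in real time by numerous cars, but such systems are hard to find in practice. This paper therefore proposes a video big data processing system based on Apache Kafka and HBase. Apache Kafka handles lossless video transmission within the system and the scheduling of video processing nodes, while HBase stores the processed data in tables and answers queries sent by users. In addition, an index is built on HBase to enable fast query processing. Experimental results confirmed that the proposed system delivers better query processing performance than the same system without an index.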
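
The benefit of an index over a full table scan can be illustrated with a minimal sketch. The abstract does not describe the paper's actual schema or index layout, so the row-key format, the `plate` field, and all function names below are assumptions made purely for illustration; plain dictionaries stand in for HBase tables.

```python
from collections import defaultdict

# Hypothetical detection records keyed by "camera#timestamp" row keys.
# The real table layout used in the paper is not given in the abstract.
table = {
    "cam01#20170401T090000": {"plate": "12A3456", "lat": 37.58, "lon": 127.02},
    "cam01#20170401T090500": {"plate": "98B7654", "lat": 37.58, "lon": 127.02},
    "cam02#20170401T091000": {"plate": "12A3456", "lat": 37.59, "lon": 127.03},
}


def scan_by_plate(table, plate):
    """No index: every row must be inspected to answer the query."""
    return sorted(key for key, rec in table.items() if rec["plate"] == plate)


def build_index(table):
    """Secondary index: plate number -> sorted list of row keys."""
    index = defaultdict(list)
    for key in sorted(table):
        index[table[key]["plate"]].append(key)
    return index


def lookup_by_plate(index, plate):
    """With the index, the query is a single key lookup."""
    return index.get(plate, [])


index = build_index(table)
scan_result = scan_by_plate(table, "12A3456")
index_result = lookup_by_plate(index, "12A3456")
```

Both functions return the same row keys; the difference is that the scan touches every row, whereas the lookup touches only the index entry for the requested plate.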

Implementation of Crime Prediction Algorithm based on Crime Influential Factors (범죄발생 요인 분석 기반 범죄예측 알고리즘 구현)

  • Park, Ji Ho;Cha, Gyeong Hyeon;Kim, Kyung Ho;Lee, Dong Chang;Son, Ki Jun;Kim, Jin Young
    • Journal of Satellite, Information and Communications / v.10 no.2 / pp.40-45 / 2015
  • In this paper, we propose and implement a crime prediction algorithm based on crime influential factors. As crime-related big data, we used data collected and published by the Supreme Prosecutors' Office. The algorithm analyzes various crime patterns in Seoul from 2011 to 2013 using spatial statistical analysis. For the prediction itself, we adopted a Bayesian network that consists of various spatial, population, and social characteristics. For more precise prediction, we also considered date, time, and weather factors. Using the proposed algorithm, we identified distinct crime patterns in Seoul and confirmed the algorithm's prediction accuracy.

Transfer Impedance of Trip Chain with a Railway Mode Embedded - Using Seoul Metropolitan Transportation Card Data - (철도수단이 내재된 통행사슬의 환승저항 추정방안 - 수도권 교통카드자료를 활용하여 -)

  • Lee, Mee young;Sohn, Jhieon
    • KSCE Journal of Civil and Environmental Engineering Research / v.36 no.6 / pp.1083-1091 / 2016
  • This research uses public transportation card data to analyze the inter-regional transfer times, transfer frequencies, and transfer resistance that passengers experience when moving between metropolitan public transportation modes. Currently, the Trip Chain records up to five mode transfers between bus and rail within a single trip, which facilitates greater comprehension of intermodal movements. However, the lack of information on what happens during these transfers leads to an underestimation of transfer resistance on the Trip Chain. A path choice model that reflects passenger movements during transit is therefore created, and it gains explanatory power on transfer resistance by including transfer times and frequencies. The methodology first conceptualizes metropolitan public transportation transfer and, for cases where mode transfers include the city-rail, newly conceptualizes transfer resistance using transportation card data. The city-rail path choice model within the Trip Chain is then constructed, with transfer time and frequency used to reevaluate transfer resistance. Further, the big node method is used to align the administrative-level small-zone coordinates of bus and city-rail stations with state- and regional-level mid-zone coordinates. Finally, case studies on trip chains with at least one transfer onto the city-rail are used to assess the validity of the results.

Big data distributed processing system using RHadoop (RHadoop을 이용한 빅데이터 분산처리 시스템)

  • Shin, Ji Eun;Jung, Byung Ho;Lim, Dong Hoon
    • Journal of the Korean Data and Information Science Society / v.26 no.5 / pp.1155-1166 / 2015
  • Storing or analyzing exponentially growing big data is almost impossible with traditional technologies, and Hadoop is a newer technology that makes it possible. Recently, R has been used as an engine for big data analysis based on distributed processing with Hadoop. Using RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis on actual and simulated data of various sizes. Experimental results showed that our RHadoop system became faster as the number of data nodes increased. We also compared the performance of our RHadoop system with the lm function and the biglm package available on bigmemory. The results showed that our RHadoop system was faster than the other packages, owing to parallel processing in which the number of map tasks increases with the size of the data.
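
One common way to parallelize multiple regression in a MapReduce setting is to have each map task compute the partial sums X^T X and X^T y for its chunk, add these in the reduce step, and solve the normal equations once at the end. The sketch below shows that general idea in plain Python; it is not the authors' RHadoop code, and the function names and toy data are assumptions for illustration only.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: accumulate X^T X and X^T y for one chunk of (features, y) rows."""
    p = len(rows[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, y in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def reduce_stats(a, b):
    """Reduce step: element-wise sum of two partial (X^T X, X^T y) pairs."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - tail) / m[i][i]
    return beta


def fit(chunks):
    """Run map over all chunks, reduce the partial sums, then solve once."""
    xtx, xty = reduce(reduce_stats, map(map_chunk, chunks))
    return solve(xtx, xty)


# Toy data generated from y = 1 + 2*x1 + 3*x2, split across two "data nodes".
chunks = [
    [((0.0, 0.0), 1.0), ((1.0, 0.0), 3.0)],
    [((0.0, 1.0), 4.0), ((1.0, 1.0), 6.0), ((2.0, 1.0), 8.0)],
]
beta = fit(chunks)  # coefficients in the order: intercept, x1, x2
```

Because the partial sums are small (p by p, regardless of chunk size), only a few numbers per chunk cross the network, which is why this formulation scales with the number of map tasks.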

Performance Evaluation of Medical Big Data Analysis based on RHadoop (RHadoop 기반 보건의료 빅데이터 분석의 성능 평가)

  • Ryu, Woo-Seok
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.13 no.1 / pp.207-212 / 2018
  • As a data analysis tool that is becoming popular in the Big Data era, R is rapidly expanding its user base by providing powerful statistical analysis and data visualization functions. A major advantage of R is its functional extensibility based on open source, but its scalability to large data volumes is limited, resulting in performance degradation in large-scale data processing. RHadoop, one of the extension packages that addresses this limitation, can improve data analysis performance because it supports Hadoop-based distributed processing of programs written in R. In this paper, we assess the validity of RHadoop by evaluating its performance improvement in real medical big data analysis. A performance evaluation on medical history information provided by the National Health Insurance Service, using R and RHadoop, shows that an RHadoop cluster composed of 8 data nodes can improve performance by up to 8 times compared with R.

Structuring of unstructured big data and visual interpretation (부산지역 교통관련 기사를 이용한 비정형 빅데이터의 정형화와 시각적 해석)

  • Lee, Kyeongjun;Noh, Yunhwan;Yoon, Sanggyeong;Cho, Youngseuk
    • Journal of the Korean Data and Information Science Society / v.25 no.6 / pp.1431-1438 / 2014
  • We analyzed articles from "Kukje Shinmun" and "Busan Ilbo", two local newspapers of Busan Metropolitan City, published from January 1, 2013 to December 31, 2013. We analyzed the meaningful patterns inherent in 2889 articles whose titles include "Busan" and "Traffic", together with related data. Text mining, which is a part of data mining, was used for the social network analysis (SNA). HDFS and MapReduce (from the Hadoop ecosystem), an open-source framework based on Java, were used in a Linux environment (Ubuntu 12.04 LTS) to structure the unstructured data and to store, process, and analyze the big data. We implemented a new algorithm that gives better visualization than the default one from the R package by setting color and thickness according to the weight of each node and of each line connecting the nodes.

Big Data Processing and Monitoring System based on Vehicle Data (차량 데이터 기반 빅데이터 처리 및 모니터링 시스템)

  • Shin, Dong-Yun;Kim, Ju-Ho;Lee, Seung-Hae;Shin, Dong-Jin;Oh, Jae-Kon;Kim, Jeong-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.3 / pp.105-114 / 2019
  • As the Industrial Revolution progressed, Big Data technologies were used to develop a system that instantly identifies the consequences of older vehicles using mobile devices. First, vehicle data was collected with an OBD2 sensor and stored on a Raspberry Pi, reproducing the situation in which the Raspberry Pi itself is being driven. When vehicle data is generated, it is collected in real time, stored across multiple nodes, and then processed, refined, and visualized for output. Using Big Data in this process allows vehicle data to be processed quickly and checked effectively on mobile devices.

Mobile-based Big Data Processing and Monitoring Technology in IoT Environment (IoT 환경에서 모바일 기반 빅데이터 처리 및 모니터링 기술)

  • Lee, Seung-Hae;Kim, Ju-Ho;Shin, Dong-Youn;Shin, Dong-Jin;Park, Jeong-Min;Kim, Jeong-Joon
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.6 / pp.1-9 / 2018
  • In the fourth industrial revolution, which is now a topical issue, various Big Data technologies make it possible to receive analysis results almost instantly, far faster than before, and to conduct real-time monitoring on mobile and web platforms. First, various irregular sensor data is generated using an IoT device, the Raspberry Pi. The sensor data is collected in real time, and the collected data is distributed and stored across several nodes. The stored sensor data is then processed and refined, and after analysis the results are visualized and output. With these methods, we can train the human resources required for Big Data and mobile-related fields using IoT, and process data efficiently and quickly. We also provide information that can confirm the reliability of research results through real-time monitoring.

Study of Efficient Algorithm for Deduplication of Complex Structure (복잡한 구조의 데이터 중복제거를 위한 효율적인 알고리즘 연구)

  • Lee, Hyeopgeon;Kim, Young-Woon;Kim, Ki-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.1 / pp.29-36 / 2021
  • The amount of data generated has been growing exponentially, and the complexity of data has been increasing owing to the advancement of information technology (IT). Big data analysts and engineers have therefore been actively conducting research to minimize the analysis targets for faster processing and analysis of big data. Hadoop, which is widely used as a big data platform, provides various processing and analysis functions, including minimization of analysis targets through Hive, which is a subproject of Hadoop. However, Hive uses a vast amount of memory for data deduplication because it is implemented without considering the complexity of data. Therefore, an efficient algorithm has been proposed for data deduplication of complex structures. The performance evaluation results demonstrated that the proposed algorithm reduces the memory usage and data deduplication time by approximately 79% and 0.677%, respectively, compared to Hive. In the future, performance evaluation based on a large number of data nodes is required for a realistic verification of the proposed algorithm.
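
The abstract does not describe the proposed algorithm itself, so the following is only a generic sketch of one well-known way to deduplicate nested records with bounded memory: serialize each record into a canonical form, hash it, and keep only the fixed-size digests rather than the full records. The function names and sample records are hypothetical.

```python
import hashlib
import json


def fingerprint(record):
    """Hash a canonical JSON form so that dictionary key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Yield each distinct record once, remembering only its digest."""
    seen = set()
    for record in records:
        digest = fingerprint(record)
        if digest not in seen:
            seen.add(digest)
            yield record


records = [
    {"id": 1, "tags": ["a", "b"], "meta": {"x": 1, "y": 2}},
    {"meta": {"y": 2, "x": 1}, "tags": ["a", "b"], "id": 1},  # same content, different key order
    {"id": 1, "tags": ["b", "a"], "meta": {"x": 1, "y": 2}},  # list order differs, so it is distinct
]
unique = list(deduplicate(records))
```

In this sketch, list order is treated as significant while dictionary key order is not; whether that matches a given dataset's notion of "duplicate" is a design decision that would need to be settled per use case.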

Landmark Selection Using CNN-Based Heat Map for Facial Age Prediction (안면 연령 예측을 위한 CNN기반의 히트 맵을 이용한 랜드마크 선정)

  • Hong, Seok-Mi;Yoo, Hyun
    • Journal of Convergence for Information Technology / v.11 no.7 / pp.1-6 / 2021
  • The purpose of this study is to improve the performance of the artificial neural network system for facial image analysis through the image landmark selection technique. For landmark selection, a CNN-based multi-layer ResNet model for classification of facial image age is required. From the configured ResNet model, a heat map that detects the change of the output node according to the change of the input node is extracted. By combining a plurality of extracted heat maps, facial landmarks related to age classification prediction are created. The importance of each pixel location can be analyzed through facial landmarks. In addition, by removing the pixels with low weights, a significant amount of input data can be reduced.