• Title/Summary/Keyword: Big data Analytics

Scalable Big Data Pipeline for Video Stream Analytics Over Commodity Hardware

  • Ayub, Umer;Ahsan, Syed M.;Qureshi, Shavez M.
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.4
    • /
    • pp.1146-1165
    • /
    • 2022
  • A huge amount of data in the form of videos and images is being produced owing to advancements in sensor technology. Using low-performance commodity hardware with resource-heavy image processing and analysis approaches to extract actionable insights from this data creates a bottleneck for timely decision making. Current GPU-assisted and cloud-based video analysis architectures give significant performance gains, but their use is constrained by financial considerations and extremely complex architecture-level details. In this paper we propose a data pipeline system that uses open-source tools such as Apache Spark, Kafka and OpenCV running over commodity hardware for video stream processing and image processing in a distributed environment. Experimental results show that our proposed approach eliminates the need for GPU-based hardware and cloud computing infrastructure to achieve efficient video stream processing for face detection, with increased throughput, scalability and better performance.

Developing a National Data Metrics Framework for Learning Analytics in Korea

  • RHA, Ilju;LIM, Cheolil;CHO, Young Hoan;CHOI, Hyoseon;YUN, Haeseon;YOO, Mina;Jeong Eui-Suk
    • Educational Technology International
    • /
    • v.18 no.1
    • /
    • pp.1-25
    • /
    • 2017
  • Educational applications of big data analysis have attracted interest as a way to improve learning effectiveness and efficiency. As a basic challenge for educational applications, the purpose of this study is to develop a comprehensive data set scheme for learning analytics in the context of digital textbook usage within the K-12 school environments of Korea. On the basis of the literature review, the Start-up Mega Planning model of needs assessment methodology was used, as this study sought negotiated solutions among different stakeholders for a national-level learning metrics framework. The Ministry of Education (MOE), Seoul Metropolitan Office of Education (SMOE), and Korean Education and Research Information Service (KERIS) were involved in the discussion of the learning metrics framework scope. Finally, we propose a national learning metrics framework that reflects such considerations as the dynamic education context and the feasibility of the metrics in K-12 Korean schools. The possibilities and limitations of the suggested framework for learning metrics are discussed and future areas of study are suggested.

Analysis of public opinion in the 20th presidential election using YouTube data (유튜브 데이터를 활용한 20대 대선 여론분석)

  • Kang, Eunkyung;Yang, Seonuk;Kwon, Jiyoon;Yang, Sung-Byung
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.3
    • /
    • pp.161-183
    • /
    • 2022
  • Opinion polls have become a powerful means for election campaigns and one of the most important subjects in the media in that they predict the actual election results and influence people's voting behavior. However, the more actively polls are conducted, the more often they fail to reflect voters' minds properly when measuring the effectiveness of election campaigns, for example by repeatedly asking about the likelihood of winning or support rather than examining the pledges and policies of candidates. Even though the polls' poor predictions of election results have undermined the authority of the press, people cannot easily let go of their interest in polls because there is no clear alternative to answer the instinctive question of which candidate will ultimately win. In this regard, we attempt to retrospectively grasp public opinion on the 20th presidential election by applying the 'YouTube Analysis' function of Sometrend, which provides an environment for discovering insights through online big data. This study confirms that a result close to actual public opinion (or opinion poll results) can be easily derived from simple YouTube data, and that a high-performance public opinion prediction model can be built.

Current Status of Educational Big Data Research (교육 빅데이터 관련 연구 동향)

  • Lee, Eun-young;Park, Do-oung;Choi, In-ong
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2014.07a
    • /
    • pp.175-176
    • /
    • 2014
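  • This paper explores the concept, value, processing technologies, and analysis methods of educational big data. Educational big data, which can be defined as 'data at the national, regional, school, teacher, and student levels produced through the inputs, processes, and outputs of online and offline teaching and learning activities,' can be processed efficiently through distributed computing technologies typified by Hadoop. The analysis methods mainly used to derive meaningful and useful results from large-scale educational data are educational data mining, learning analytics, and visual data analytics. Educational data mining tends to analyze data broadly across the student, teacher, and school levels, whereas learning analytics tends to focus more on analyzing student-level data, and visual data analytics focuses on how to present analysis results effectively rather than on the analysis itself.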

A Study on Big Data Analytics Services and Standardization for Smart Manufacturing Innovation

  • Kim, Cheolrim;Kim, Seungcheon
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.14 no.3
    • /
    • pp.91-100
    • /
    • 2022
  • Major developed countries are seriously considering smart factories to increase their manufacturing competitiveness. A smart factory is a customized factory that incorporates ICT in the entire process from product planning to design, distribution and sales. This can reduce production costs and allow a flexible response to the consumer market. The smart factory converts physical signals into digital signals, connects the machines, parts, factories, manufacturing processes, people, and supply chain partners in the factory to each other, and uses the collected data to enable the smart factory platform to operate intelligently. Enhancing personalized value is the key. Therefore, the success or failure of a smart factory depends on whether big data is secured and utilized. Standardized communication and collaboration are required to acquire big data smoothly inside and outside the smart factory, and the use of big data can be maximized through big data analysis. This study examines big data analysis and standardization in the smart factory. Manufacturing innovation by country, the smart factory construction framework, key elements of smart factory implementation, and big data analysis and visualization are reviewed first. On this basis, we propose services that can be offered when building big data infrastructure in smart factories, such as a big data infrastructure construction process, big data platform components, big data modeling, big data quality management components, big data standardization, and big data implementation consulting. This proposal is expected to serve as a guide for building big data infrastructure for companies that want to introduce a smart factory.

Cloud Computing Platforms for Big Data Adoption and Analytics

  • Hussain, Mohammad Jabed;Alsadie, Deafallah
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.2
    • /
    • pp.290-296
    • /
    • 2022
  • Big Data is a data analysis technology enabled by recent advances in technology and architecture. However, big data entails a huge commitment of hardware and processing resources, making the adoption costs of big data technology prohibitive for small and medium-sized businesses. Cloud computing offers the promise of big data implementation to small and medium-sized businesses. Big Data processing is performed through a programming paradigm known as MapReduce. Typically, implementation of the MapReduce paradigm requires network-attached storage and parallel processing. The computing needs of MapReduce programming are often beyond what small and medium-sized businesses can commit. Cloud computing is on-demand network access to computing resources, provided by an outside entity. Common deployment models for cloud computing include platform as a service (PaaS), software as a service (SaaS), infrastructure as a service (IaaS), and hardware as a service (HaaS).

Implementation and Performance Analysis of Efficient Big Data Processing System Through Dynamic Configuration of Edge Server Computing and Storage Modules (BigCrawler: 엣지 서버 컴퓨팅·스토리지 모듈의 동적 구성을 통한 효율적인 빅데이터 처리 시스템 구현 및 성능 분석)

  • Kim, Yongyeon;Jeon, Jaeho;Kang, Sungjoo
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.16 no.6
    • /
    • pp.259-266
    • /
    • 2021
  • Edge computing enables real-time big data processing by performing computation close to the physical location of the user or data source. However, in an edge computing environment, various situations that affect big data processing performance may occur depending on temporary service requirements or changes in physical resources in the field. In this paper, we propose BigCrawler, a system that dynamically configures the computing module and storage module according to the big data collection status and computing resource usage in the edge computing environment. We also analyze how the characteristics of the big data processing workload vary with the arrangement of the computing and storage modules.

A Study on Phone Call Big Data Analytics (전화통화 빅데이터 분석에 관한 연구)

  • Kim, Jeongrae;Jeong, Chanki
    • Journal of Information Technology and Architecture
    • /
    • v.10 no.3
    • /
    • pp.387-397
    • /
    • 2013
  • This paper proposes an approach to big data analytics for phone call data. The analytical model for phone call data is composed of the PVPF (Parallel Variable-length Phrase Finding) algorithm for identifying verbal phrases of natural language and the word count algorithm for measuring the usage frequency of keywords. In the proposed model, we identify words using the PVPF algorithm and measure the usage frequency of the identified words using the word count algorithm in MapReduce. The results can be interpreted from various viewpoints. We design and implement the model based on HDFS (Hadoop Distributed File System) and verify the proposed approach through a case study of phone call data. We then extract useful results through analysis of keyword correlation and usage frequency.
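
The word count step mentioned in this abstract can be illustrated with a minimal single-process sketch of the map, shuffle, and reduce phases. This is not the authors' PVPF implementation and it does not run on Hadoop or HDFS; the function names and the sample transcript fragments are hypothetical, and tokens are simply split on whitespace.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Hypothetical transcript fragments, not data from the paper.
    transcripts = ["refund request for order", "order status request"]
    print(word_count(transcripts))
```

In a real MapReduce job the map and reduce functions would run in parallel on separate nodes and the framework would perform the grouping; here all three phases run in one process purely to show the data flow.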

Research Trends Analysis of Big Data: Focused on the Topic Modeling (빅데이터 연구동향 분석: 토픽 모델링을 중심으로)

  • Park, Jongsoon;Kim, Changsik
    • Journal of Korea Society of Digital Industry and Information Management
    • /
    • v.15 no.1
    • /
    • pp.1-7
    • /
    • 2019
  • The objective of this study is to examine the trends in big data. Research abstracts were extracted from 4,019 articles, published between 1995 and 2018, on Web of Science and were analyzed using topic modeling and time series analysis. The 20 single-term topics that appeared most frequently were as follows: model, technology, algorithm, problem, performance, network, framework, analytics, management, process, value, user, knowledge, dataset, resource, service, cloud, storage, business, and health. The 20 multi-term topics were as follows: sense technology architecture (T10), decision system (T18), classification algorithm (T03), data analytics (T17), system performance (T09), data science (T06), distribution method (T20), service dataset (T19), network communication (T05), customer & business (T16), cloud computing (T02), health care (T14), smart city (T11), patient & disease (T04), privacy & security (T08), research design (T01), social media (T12), student & education (T13), energy consumption (T07), supply chain management (T15). The time series data indicated that the 40 single-term topics and multi-term topics were hot topics. This study provides suggestions for future research.

The Big Data Analytics Regarding the Cadastral Resurvey News Articles

  • Joo, Yong-Jin;Kim, Duck-Ho
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.32 no.6
    • /
    • pp.651-659
    • /
    • 2014
  • With the popularization of the big data environment, big data has been highlighted as a key information strategy for establishing national spatial data infrastructure for scientific land policy and the extension of the creative economy. Of particular interest from our point of view is that cadastral information is a core national information source that forms the basis of spatial information tied to people's daily lives, including the production and consumption of information related to real estate. The purpose of our paper is to suggest a scheme of big data analytics for articles on the cadastral resurvey project in order to approach cadastral information in terms of spatial data integration. As the specific research method, the tm (text mining) package for R was used to read news reports in various formats as text, and nouns were extracted using the KoNLP package. That is, we searched for the main keywords regarding the cadastral resurvey, performing compound noun extraction and data mining analysis, and presented a visualization of the results. In addition, news reports related to the cadastral resurvey between 2012 and 2014 were searched in newspapers, and nouns were extracted from the retrieved data for the data mining analysis of cadastral information. Furthermore, the approval rating, reliability, and improvement of rules were presented through correlation analyses among the extracted compound nouns. As a result of the correlation analysis among the most frequently used extracted nouns, five groups of data consisting of 133 keywords were generated. The most frequent words were "cadastral resurvey," "civil complaint," "dispute," "cadastral survey," "lawsuit," "settlement," "mediation," "discrepant land," and "parcel." In conclusion, the cadastral resurvey performed in some local governments has been proceeding smoothly, with positive results. On the other hand, disputes raised by landowners have provoked a stream of complaints about parcel surveying for the cadastral resurvey. Through such keyword analysis, public opinion and the types of civil complaints related to the cadastral resurvey project can be identified and prevented through pre-emptive responses via direct call centres for cadastral surveying, electronic civil services, and customer counseling, and high-quality cadastral information services can be provided. This study therefore provides a stepping stone for developing an account of big data analytics that can comprehensively examine and visualize a variety of news reports and opinions on the promotion of the cadastral resurvey project. Henceforth, this will contribute to establishing the foundation for an information utilization framework that enables fast and accurate scientific decision making.
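
The keyword frequency and correlation analysis described in this abstract can be sketched as follows. The paper used R packages for this; the Python below is a swapped-in illustration, not the authors' code. It assumes each news article has already been reduced to a list of extracted nouns, and it uses the Pearson correlation between keyword presence indicators as one plausible correlation measure. The function names and sample articles are hypothetical.

```python
from collections import Counter
from math import sqrt


def keyword_frequency(docs):
    """Count how many times each keyword appears across all documents."""
    return Counter(word for doc in docs for word in doc)


def presence_correlation(docs, a, b):
    """Pearson correlation between the 0/1 presence indicators of keywords a and b."""
    n = len(docs)
    if n == 0:
        return 0.0
    xs = [1 if a in doc else 0 for doc in docs]
    ys = [1 if b in doc else 0 for doc in docs]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Undefined when a keyword occurs in every document or in none; report 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical noun lists standing in for extracted article keywords.
    articles = [
        ["dispute", "lawsuit"],
        ["dispute", "lawsuit"],
        ["parcel"],
        ["parcel", "settlement"],
    ]
    print(keyword_frequency(articles).most_common(3))
    print(presence_correlation(articles, "dispute", "lawsuit"))
```

Keywords whose presence indicators correlate strongly could then be grouped together, which is one way the keyword groups mentioned in the abstract might be formed.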