• Title/Summary/Keyword: Big data analytics

Clustering of Smart Meter Big Data Based on KNIME Analytic Platform (KNIME 분석 플랫폼 기반 스마트 미터 빅 데이터 클러스터링)

  • Kim, Yong-Gil;Moon, Kyung-Il
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.2 / pp.13-20 / 2020
  • One of the major issues surrounding big data is the availability of massive time-based or telemetry data. The appearance of low-cost capture and storage devices has made it possible to obtain very detailed time data for further analysis. These time data can be used to gain more knowledge about the underlying system or to predict future events with higher accuracy. In particular, it is very important to define tailored contract offers for the many households and businesses with smart meter records and to predict future electricity usage, so that electricity companies are protected from power shortages or surpluses. A few groups with common electricity usage behavior must be identified to make the creation of customized contract offers worthwhile. This study presents big data transformation and a clustering technique for understanding electricity usage patterns, using open smart meter data and KNIME, an open-source data analytics platform that provides a user-friendly graphical workbench for the entire analysis process. While KNIME's big data components are not open source, they are available for trial if required. After importing, cleaning, and transforming the smart meter big data, each meter's readings can be interpreted in terms of electricity usage behavior through a dynamic time warping method.
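
The key computational idea in this entry is clustering meters by a dynamic time warping (DTW) distance between their load profiles. The sketch below illustrates that idea in plain Python rather than the paper's KNIME workflow; the `profiles` array (one row per meter, one column per half-hourly reading) is hypothetical stand-in data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Hypothetical data: one normalized half-hourly load profile per meter.
rng = np.random.default_rng(0)
profiles = rng.random((20, 48))

# Condensed pairwise DTW distance matrix, then average-linkage clustering.
n_meters = len(profiles)
dists = [dtw_distance(profiles[i], profiles[j])
         for i in range(n_meters) for j in range(i + 1, n_meters)]
labels = fcluster(linkage(dists, method="average"), t=4, criterion="maxclust")
print(labels)  # cluster id per meter
```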

Analysis of Public Perception and Policy Implications of Foreign Workers through Social Big Data analysis (소셜 빅데이터분석을 통한 외국인근로자에 관한 국민 인식 분석과 정책적 함의)

  • Ha, Jae-Been;Lee, Do-Eun
    • Journal of Digital Convergence / v.19 no.11 / pp.1-10 / 2021
  • This paper aims to examine public awareness of foreign workers on social platforms by using text mining, one of the big data techniques, and to draw suggestions concerning foreign workers. To achieve this purpose, data were collected with the search keyword 'Foreign Worker' from Jan. 1 to Dec. 31, 2020, and frequency analysis, TF-IDF analysis, and degree centrality analysis were performed, with 100 main keywords drawn for comparison. Furthermore, Ucinet 6.0 and NetDraw were used to analyze semantic networks, and through CONCOR analysis the data were clustered into the following eight groups: foreigner policy issue, regional community issue, business owner's perspective issue, employment issue, working environment issue, legal issue, immigration issue, and human rights issue. Based on these results, the study identified public awareness of foreign workers and the main issues, and provided basic data for policy proposals on foreign workers and for related research.
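
As a rough illustration of the frequency / TF-IDF / degree centrality pipeline the abstract describes (the study used Ucinet 6.0 and NetDraw for the network step; this sketch substitutes scikit-learn and NetworkX, and the `posts` list is invented):

```python
import itertools

import networkx as nx
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical corpus standing in for collected posts about foreign workers.
posts = [
    "foreign worker employment permit factory wage",
    "foreign worker human rights working environment",
    "employment permit wage working environment policy",
]

# Frequency analysis: raw term counts across the corpus.
cv = CountVectorizer()
counts = cv.fit_transform(posts).sum(axis=0).A1
print(dict(zip(cv.get_feature_names_out(), counts)))

# TF-IDF analysis: terms that appear everywhere are weighted down.
tv = TfidfVectorizer()
weights = tv.fit_transform(posts).toarray().mean(axis=0)
top = np.argsort(weights)[::-1][:5]
print([tv.get_feature_names_out()[i] for i in top])

# Degree centrality on a word co-occurrence network.
g = nx.Graph()
for post in posts:
    g.add_edges_from(itertools.combinations(sorted(set(post.split())), 2))
print(sorted(nx.degree_centrality(g).items(), key=lambda kv: -kv[1])[:5])
```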

Design and Utilization of Connected Data Architecture-based AI Service of Mass Distributed Abyss Storage (대용량 분산 Abyss 스토리지의 CDA (Connected Data Architecture) 기반 AI 서비스의 설계 및 활용)

  • Cha, ByungRae;Park, Sun;Seo, JaeHyun;Kim, JongWon;Shin, Byeong-Chun
    • Smart Media Journal / v.10 no.1 / pp.99-107 / 2021
  • In addition to the 4th Industrial Revolution and Industry 4.0, the recent megatrends in the ICT field are Big Data, IoT, Cloud Computing, and Artificial Intelligence. Rapid digital transformation driven by the convergence of various industrial areas and ICT fields is therefore an ongoing trend, owing to the development of AI services suited to the era of the 4th Industrial Revolution and of subdivided technologies such as BI (Business Intelligence), IA (Intelligent Analytics, BI + AI), AIoT (Artificial Intelligence of Things), AIOps (Artificial Intelligence for IT Operations), and RPA 2.0 (Robotic Process Automation + AI). In line with these technical circumstances, this study aims to integrate and advance various machine learning services based on infrastructure-side GPUs, the CDA (Connected Data Architecture) framework, and AI on top of mass distributed Abyss storage. It also seeks to apply an AI business revenue model in various industries.

A Study on Policy Priorities for Implementing Big Data Analytics in the Social Security Sector : Adopting AHP Methodology (AHP분석을 활용한 사회보장부문 빅 데이터 활용가능 영역 탐색 연구)

  • Ham, Young-Jin;Ahn, Chang-Won;Kim, Ki-Ho;Park, Gyu-Beom;Kim, Kyoung-June;Lee, Dae-Young;Park, Sun-Mi
    • Journal of Digital Convergence / v.12 no.8 / pp.49-60 / 2014
  • The primary purpose of this paper is to find out which issues are important in the Social Security sector and then, through the AHP methodology, to analyze what kinds of big data methodologies and projects can be implemented to solve these issues. To this end, the paper first identified eight big data projects by reviewing issues in the Social Security sector, such as administrative work and social policies. According to the pairwise comparison results, policy validity is the most important factor, ahead of effectiveness and practicability. With regard to the priorities among the sub-projects, the project on preventing improper recipients turned out to be the most important in terms of validity, effectiveness, and practicability. The results also showed that the project on outreach and reducing blind spots in the welfare sector carries significant weight. The results of this paper, in particular the eight sub-projects, will be useful to anyone interested in applying big data and its methodologies to the social welfare sector.
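
AHP derives priority weights from pairwise comparison matrices (typically via the principal eigenvector) and checks a consistency ratio. A minimal sketch under assumed judgments follows; the 3x3 matrix comparing validity, effectiveness, and practicability is illustrative, not the paper's survey data.

```python
import numpy as np

# Hypothetical pairwise comparisons (Saaty 1-9 scale):
# rows/cols = validity, effectiveness, practicability.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

# Priority weights = normalized principal eigenvector.
eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
weights = np.abs(eigvecs[:, k].real)
weights /= weights.sum()

# Consistency: CI = (lambda_max - n) / (n - 1), CR = CI / RI (RI = 0.58 for n = 3).
n = A.shape[0]
ci = (eigvals[k].real - n) / (n - 1)
cr = ci / 0.58
print("weights:", weights.round(3), "CR:", round(cr, 3))  # CR < 0.1 is conventionally acceptable
```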

A study on the MD&A Disclosure Quality in real-time calculated and provided By Programming Technology

  • Shin, YeounOuk
    • International Journal of Internet, Broadcasting and Communication / v.11 no.3 / pp.41-48 / 2019
  • The Management Discussion and Analysis (MD&A) provides investors with an opportunity to gain insight into the company from a manager's perspective and enables short-term and long-term analysis of the business. MD&A is also an important channel through which companies and investors can communicate, providing a useful source of information for analyzing financial statements. MD&A is measured by the quality of disclosure, and there are many previous studies on the usefulness of disclosure information. It is therefore very important for financial analysts, the representative information-user group in the capital market, that MD&A disclosure quality be measured in real time in combination with information technology and provided to them in a timely manner. In this study, we propose a method in which MD&A disclosures are converted into digitized data through information technology and delivered to financial analysts' information environment in real time. The real-time information provided by MD&A can support financial analysts' activities and reduce information asymmetry.

A Study on Adaptive Learning Model for Performance Improvement of Stream Analytics (실시간 데이터 분석의 성능개선을 위한 적응형 학습 모델 연구)

  • Ku, Jin-Hee
    • Journal of Convergence for Information Technology / v.8 no.1 / pp.201-206 / 2018
  • Recently, as technologies for realizing artificial intelligence have become more common, machine learning has come into wide use. Machine learning yields insight by collecting large amounts of data, processing them in batches, and taking final action, but the effects of that work are not immediately integrated into the learning process. This paper proposes an adaptive learning model to improve the performance of real-time stream analysis, a major business issue. Adaptive learning generates an ensemble by adapting to the complexity of the data set, and the algorithm uses the data needed to determine the optimal data points to sample. In experiments on six standard data sets, the adaptive learning model outperformed simple machine learning models for classification in both learning time and accuracy. In particular, the support vector machine showed excellent performance at the end of all ensembles. Adaptive learning is expected to be applicable to a wide range of problems that need to be adaptively updated as various parameters change over time.
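
The abstract does not spell out the ensemble construction, so the sketch below only illustrates the underlying contrast it draws: updating a model incrementally as stream batches arrive, instead of retraining in one batch. The choice of scikit-learn's `SGDClassifier` (a linear SVM under its default hinge loss) and the synthetic stream are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Hypothetical stream: mini-batches of labeled records arriving over time.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
classes = np.unique(y)

model = SGDClassifier(random_state=0)  # hinge loss by default, i.e. a linear SVM
batch = 500
for start in range(0, len(X), batch):
    Xb, yb = X[start:start + batch], y[start:start + batch]
    model.partial_fit(Xb, yb, classes=classes)  # update with the new batch only
    print(f"seen {start + batch:4d} samples, accuracy on this batch:",
          round(model.score(Xb, yb), 3))
```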

Performance Optimization Strategies for Fully Utilizing Apache Spark (아파치 스파크 활용 극대화를 위한 성능 최적화 기법)

  • Myung, Rohyoung;Yu, Heonchang;Choi, Sukyong
    • KIPS Transactions on Computer and Communication Systems / v.7 no.1 / pp.9-18 / 2018
  • Enhancing the performance of big data analytics in distributed environments has become an issue because most big data applications, such as machine learning techniques and streaming services, generally rely on distributed computing frameworks. Optimizing the performance of such applications on Spark has therefore been actively researched. Optimization in a distributed environment is challenging because it requires not only optimizing the applications themselves but also tuning the distributed system's configuration parameters. Although prior research has made great efforts to improve execution performance, most studies focused on only one of three optimization aspects: application design, system tuning, or hardware utilization, and thus could not orchestrate all of them. In this paper, we analyze and model Spark's application processing procedure in depth. Based on this analysis, we propose performance optimization schemes for each step of the procedure: the inner stage and the outer stage. We also propose an appropriate partitioning mechanism by analyzing the relationship between partitioning parallelism and application performance. We applied these three optimization schemes to WordCount, PageRank, and K-means, which are basic big data analytics workloads, and found nearly 50% performance improvement when all of the schemes were applied.
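
One of the tuning levers the abstract names is partitioning parallelism. Below is a minimal PySpark sketch of the WordCount workload with explicit partition counts; the input path and the value 8 are placeholders, since the paper's actual tuned settings are not given in the abstract.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("wordcount-partitioning")
         .config("spark.sql.shuffle.partitions", "8")  # placeholder value
         .getOrCreate())
sc = spark.sparkContext

# WordCount with an explicit partition count on both the read and the shuffle.
counts = (sc.textFile("hdfs:///data/corpus.txt", minPartitions=8)  # placeholder path
            .flatMap(lambda line: line.split())
            .map(lambda word: (word, 1))
            .reduceByKey(lambda a, b: a + b, numPartitions=8))

print(counts.take(10))
spark.stop()
```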

A Big Data Study on Viewers' Response and Success Factors in the D2C Era Focused on tvN's Web-real Variety 'SinSeoYuGi' and Naver TV Cast Programming

  • Oh, Sejong;Ahn, Sunghun;Byun, Jungmin
    • International Journal of Advanced Culture Technology / v.4 no.2 / pp.7-18 / 2016
  • The first D2C-era web-real variety show in Korea was broadcast via tvN of CJ E&M. The web-real variety program 'SinSeoYuGi' accumulated 54 million views, along with 50 million views on the Chinese portal site QQ. This study carries out an analysis using text mining that extracts portal site blog posts, Twitter page views, and associated terms. In addition, the study derives viewers' responses by extracting keywords with opinion mining techniques that classify words as positive, neutral, or negative through customer sentiment analysis. The success factors of the web-real variety show were found to be reduced appearance fees and production costs, harmony between the actual cast members and scenario characters, mobile TV programming, and pre-roll advertising. Web-real variety broadcasting is expected to increase in value as web content in the future and to be established as a new genre, with the job of 'technical marketer' growing as well.
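
The opinion-mining step described here boils down to labeling extracted words as positive, neutral, or negative and aggregating the counts. A toy lexicon-based sketch follows; the word lists and comments are invented, and the study's actual dictionary and Korean-language processing are not shown in the abstract.

```python
from collections import Counter

# Hypothetical sentiment lexicon and viewer comments.
POSITIVE = {"funny", "fresh", "love", "great"}
NEGATIVE = {"boring", "cheap", "bad"}

comments = [
    "the cast chemistry is funny and fresh",
    "pre roll ads feel cheap and boring",
    "love the mobile only format",
]

def label(comment):
    """Classify one comment by counting lexicon hits."""
    words = set(comment.split())
    pos, neg = len(words & POSITIVE), len(words & NEGATIVE)
    return "positive" if pos > neg else "negative" if neg > pos else "neutral"

print(Counter(label(c) for c in comments))  # e.g. Counter({'positive': 2, 'negative': 1})
```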

The Venture Business Starts News and SNS Big Data Analytics (벤처창업 관련 뉴스 및 SNS 빅데이터 분석)

  • Ban, ChaeHoon;Lee, YeChan;Ahn, DaeJoong;Kwak, YoonHyeok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2017.05a / pp.99-102 / 2017
  • In the information age, in which data are produced and stored on a massive scale, the importance of big data, which makes it possible to anticipate the future and identify directions based on present and past data, is increasingly emphasized. Large volumes of unstructured data are analyzed with R, a big data analysis tool, and web crawling, and on the basis of those statistics the data are structured and the information is analyzed. In this paper, using R and web crawling, we analyze big data related to venture start-ups, a recent issue, as they appear in news and SNS, with 'venture start-up' as the main keyword. Data related to venture start-ups are collected from news articles, Facebook, and Twitter, and keywords are classified from the collected data to predict efficient methods, types, and directions for venture start-ups. By analyzing past failure factors of venture start-ups, identifying current problems, and presenting the flow and direction of venture founding through data analysis, the difficulties founders may face can be anticipated and understood in advance, which is expected to contribute substantially to practical venture start-up.
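
The pipeline described (crawl news and SNS pages, then count and classify keywords) was built by the authors in R; a rough equivalent sketch in Python is shown below, with a placeholder URL and no real query or SNS API handling.

```python
import re
from collections import Counter

import requests
from bs4 import BeautifulSoup

# Placeholder URL; the paper crawled news articles, Facebook, and Twitter.
url = "https://example.com/news?query=venture+startup"
html = requests.get(url, timeout=10).text

# Strip markup, tokenize (Korean and Latin word characters), and count keywords.
text = BeautifulSoup(html, "html.parser").get_text(" ")
tokens = re.findall(r"[0-9a-zA-Z가-힣]+", text.lower())
print(Counter(tokens).most_common(20))
```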

An Analysis of the Positive and Negative Factors Affecting Job Satisfaction Using Topic Modeling

  • Changjae Lee;Byunghyun Lee;Ilyoung Choi;Jaekyeong Kim
    • Asia Pacific Journal of Information Systems / v.34 no.1 / pp.321-350 / 2024
  • When a competent employee leaves an organization, the technical skills and know-how possessed by that employee disappear with them, which may lead to various problems, such as a decrease in organizational morale and technology leakage. To address such problems, it is important to increase employees' job satisfaction. Owing to advances in information and communication technology and social media, many former and current employees share information about the companies they have worked for, or currently work for, via job portal websites. In this study, a web crawl was used to collect reviews and job satisfaction ratings written by former and current employees working in nine industries from Job Planet, a Korean job portal site. According to this analysis, regardless of the industry in question, organizational culture, welfare support, work system, growth capability, and relationships had significant positive effects on job satisfaction, while time and attendance management, performance management, and organizational flexibility had significant negative effects. With respect to the path differences between former and current employees, time and attendance management and organizational flexibility had greater negative effects on job satisfaction for current employees than for former employees, while organizational culture, work system, and relationships had greater positive effects for current employees than for former employees.
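
As a hedged illustration of the topic modeling step, here is a minimal LDA sketch with scikit-learn; the review snippets, the two-topic setting, and the vectorizer choices are invented and do not reproduce the study's crawled Job Planet data or model settings.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical review snippets standing in for crawled job portal reviews.
reviews = [
    "good welfare support and organizational culture",
    "strict time and attendance management hurts flexibility",
    "clear work system and room for growth",
    "performance management feels unfair and rigid",
]

vec = CountVectorizer()
X = vec.fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Show the top terms per topic.
terms = vec.get_feature_names_out()
for t, comp in enumerate(lda.components_):
    top = comp.argsort()[::-1][:4]
    print(f"topic {t}:", [terms[i] for i in top])
```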