• 제목/요약/키워드: Multimedia Big Data

검색결과 144건 처리시간 0.023초

Design of Big Data Preference Analysis System (빅데이터 선호도 분석 시스템 설계)

  • Son, Sung Il;Park, Chan Khon
    • Journal of Korea Multimedia Society
    • /
    • 제17권11호
    • /
    • pp.1286-1295
    • /
    • 2014
  • This paper suggests the way that it could improve the reliability about preference of user's feedback by adding weighting factor on sentiment analysis, and efficiently make a sentiment analysis of users' emotional perspective on the big data massively generated on twitter. To solve errors on earlier studies, this paper has improved recall and precision of sensibility determination by using sensibility dictionary subdivided sentiment polarity based on the level of sensibility and given impotance to sensibility determination by populating slang, new words, emoticons and idiomatic expressions not in the system dictionary. It has considered the context through conjunctive adverbs fixed in korean characteristics which are free to the word order. It also recognize sensibility words such as TF(Term Frequency), RT(Retweet), Follower which are weighting factors of preference and has increased reliability of preference analysis considering weight on 'a very emotional tweet', 'a recognised tweet from users' and 'a tweeter influencer'

A Method on Associated Document Recommendation with Word Correlation Weights (단어 연관성 가중치를 적용한 연관 문서 추천 방법)

  • Kim, Seonmi;Na, InSeop;Shin, Juhyun
    • Journal of Korea Multimedia Society
    • /
    • 제22권2호
    • /
    • pp.250-259
    • /
    • 2019
  • Big data processing technology and artificial intelligence (AI) are increasingly attracting attention. Natural language processing is an important research area of artificial intelligence. In this paper, we use Korean news articles to extract topic distributions in documents and word distribution vectors in topics through LDA-based Topic Modeling. Then, we use Word2vec to vector words, and generate a weight matrix to derive the relevance SCORE considering the semantic relationship between the words. We propose a way to recommend documents in order of high score.

A Novel Data Prediction Model using Data Weights and Neural Network based on R for Meaning Analysis between Data (데이터간 의미 분석을 위한 R기반의 데이터 가중치 및 신경망기반의 데이터 예측 모형에 관한 연구)

  • Jung, Se Hoon;Kim, Jong Chan;Sim, Chun Bo
    • Journal of Korea Multimedia Society
    • /
    • 제18권4호
    • /
    • pp.524-532
    • /
    • 2015
  • All data created in BigData times is included potentially meaning and correlation in data. A variety of data during a day in all society sectors has become created and stored. Research areas in analysis and grasp meaning between data is proceeding briskly. Especially, accuracy of meaning prediction and data imbalance problem between data for analysis is part in course of something important in data analysis field. In this paper, we proposed data prediction model based on data weights and neural network using R for meaning analysis between data. Proposed data prediction model is composed of classification model and analysis model. Classification model is working as weights application of normal distribution and optimum independent variable selection of multiple regression analysis. Analysis model role is increased prediction accuracy of output variable through neural network. Performance evaluation result, we were confirmed superiority of prediction model so that performance of result prediction through primitive data was measured 87.475% by proposed data prediction model.

Design and Implementation of a Web Crawler System for Collection of Structured and Unstructured Data (정형 및 비정형 데이터 수집을 위한 웹 크롤러 시스템 설계 및 구현)

  • Bae, Seong Won;Lee, Hyun Dong;Cho, DaeSoo
    • Journal of Korea Multimedia Society
    • /
    • 제21권2호
    • /
    • pp.199-209
    • /
    • 2018
  • Recently, services provided to consumers are increasingly being combined with big data such as low-priced shopping, customized advertisement, and product recommendation. With the increasing importance of big data, the web crawler that collects data from the web has also become important. However, there are two problems with existing web crawlers. First, if the URL is hidden from the link, it can not be accessed by the URL. The second is the inefficiency of fetching more data than the user wants. Therefore, in this paper, through the Casper.js which can control the DOM in the headless brwoser, DOM event is generated by accessing the URL to the hidden link. We also propose an intelligent web crawler system that allows users to make steps to fine-tune both Structured and unstructured data to bring only the data they want. Finally, we show the superiority of the proposed crawler system through the performance evaluation results of the existing web crawler and the proposed web crawler.

A Study on the Classification of Unstructured Data through Morpheme Analysis

  • Kim, SungJin;Choi, NakJin;Lee, JunDong
    • Journal of the Korea Society of Computer and Information
    • /
    • 제26권4호
    • /
    • pp.105-112
    • /
    • 2021
  • In the era of big data, interest in data is exploding. In particular, the development of the Internet and social media has led to the creation of new data, enabling the realization of the era of big data and artificial intelligence and opening a new chapter in convergence technology. Also, in the past, there are many demands for analysis of data that could not be handled by programs. In this paper, an analysis model was designed and verified for classification of unstructured data, which is often required in the era of big data. Data crawled DBPia's thesis summary, main words, and sub-keyword, and created a database using KoNLP's data dictionary, and tokenized words through morpheme analysis. In addition, nouns were extracted using KAIST's 9 part-of-speech classification system, TF-IDF values were generated, and an analysis dataset was created by combining training data and Y values. Finally, The adequacy of classification was measured by applying three analysis algorithms(random forest, SVM, decision tree) to the generated analysis dataset. The classification model technique proposed in this paper can be usefully used in various fields such as civil complaint classification analysis and text-related analysis in addition to thesis classification.

A Study on Possible Construction of Big Data Analysis System Applied to the Offline Market (오프라인 마켓에 적용 가능한 빅데이터 분석 시스템 구축 방안에 관한 연구)

  • Lee, Hoo-Young;Park, Koo-Rack;Kim, Dong-Hyun
    • Journal of Digital Convergence
    • /
    • 제14권9호
    • /
    • pp.317-323
    • /
    • 2016
  • Big Data is now seen as a major asset in the company's competitiveness, its influence in the future is expected to grow. Companies that recognize the importance are already actively engaged with Big Data in product development and marketing, which are increasingly applied across sectors of society, including politics, sports. However, lack of knowledge of the system implementation and high costs are still a big obstacles to the introduction of Big Data and systems. It is an objective in this study to build a Big Data system, which is based on open source Hadoop and Hive among Big Data systems, utilizing POS sales data of small and medium-sized offline markets. This approach of convergence is expected to improve existing sales systems that have been simply focusing on profit and loss analysis. It will also be able to use it as the basis for the decisions of the executive to enable prediction of the consumption patterns of customer preference and demand in advance.

A Prediction of Number of Patients and Risk of Disease in Each Region Based on Pharmaceutical Prescription Data (의약품 처방 데이터 기반의 지역별 예상 환자수 및 위험도 예측)

  • Chang, Jeong Hyeon;Kim, Young Jae;Choi, Jong Hyeok;Kim, Chang Su;Aziz, Nasridinov
    • Journal of Korea Multimedia Society
    • /
    • 제21권2호
    • /
    • pp.271-280
    • /
    • 2018
  • Recently, big data has been growing rapidly due to the development of IT technology. Especially in the medical field, big data is utilized to provide services such as patient-customized medical care, disease management and disease prediction. In Korea, 'National Health Alarm Service' is provided by National Health Insurance Corporation. However, the prediction model has a problem of short-term prediction within 3 days and unreliability of social data used in prediction model. In order to solve these problems, this paper proposes a disease prediction model using medicine prescription data generated from actual patients. This model predicts the total number of patients and the risk of disease in each region and uses the ARIMA model for long-term predictions.

An IoT Information Security Model for Securing Bigdata Information for IoT Users (IoT 사용자의 빅데이터 정보를 안전하게 보호하기 위한 IoT 정보 보안 모델)

  • Jeong, Yoon-Su;Yoon, Deok-Byeong;Shin, Seung-Soo
    • Journal of Convergence for Information Technology
    • /
    • 제9권11호
    • /
    • pp.8-14
    • /
    • 2019
  • Due to the development of computer technology, IoT technology is being used in various fields of industry, economy, medical service and education. However, multimedia information processed through IoT equipment is still one of the major issues in the application sector. In this paper, a big data protection model for users of IoT based IoT is proposed to ensure integrity of users' multimedia information processed through IoT equipment. The proposed model aims to prevent users' illegal exploitation of big data information collected through IoT equipment without users' consent. The proposed model uses signatures and authentication information for IoT users in a hybrid cryptographic method. The proposed model feature ensuring integrity and confidentiality of users' big data collected through IoT equipment. In addition, the user's big data is not abused without the user's consent because the user's signature information is encrypted using a steganography-based cryptography-based encryption technique.

Twitter Crawling System

  • Ganiev, Saydiolim;Nasridinov, Aziz;Byun, Jeong-Yong
    • Journal of Multimedia Information System
    • /
    • 제2권3호
    • /
    • pp.287-294
    • /
    • 2015
  • We are living in epoch of information when Internet touches all aspects of our lives. Therefore, it provides a plenty of services each of which benefits people in different ways. Electronic Mail (E-mail), File Transfer Protocol (FTP), Voice/Video Communication, Search Engines are bright examples of Internet services. Between them Social Network Services (SNS) continuously gain its popularity over the past years. Most popular SNSs like Facebook, Weibo and Twitter generate millions of data every minute. Twitter is one of SNS which allows its users post short instant messages. They, 100 million, posted 340 million tweets per day (2012)[1]. Often big amount of data contains lots of noisy data which can be defined as uninteresting and unclassifiable data. However, researchers can take advantage of such huge information in order to analyze and extract meaningful and interesting features. The way to collect SNS data as well as tweets is handled by crawlers. Twitter crawler has recently emerged as a great tool to crawl Twitter data as well as tweets. In this project, we develop Twitter Crawler system which enables us to extract Twitter data. We implemented our system in Java language along with MySQL. We use Twitter4J which is a java library for communicating with Twitter API. The application, first, connects to Twitter API, then retrieves tweets, and stores them into database. We also develop crawling strategies to efficiently extract tweets in terms of time and amount.

Initial Authentication Protocol of Hadoop Distribution System based on Elliptic Curve (타원곡선기반 하둡 분산 시스템의 초기 인증 프로토콜)

  • Jeong, Yoon-Su;Kim, Yong-Tae;Park, Gil-Cheol
    • Journal of Digital Convergence
    • /
    • 제12권10호
    • /
    • pp.253-258
    • /
    • 2014
  • Recently, the development of cloud computing technology is developed as soon as smartphones is increases, and increased that users want to receive big data service. Hadoop framework of the big data service is provided to hadoop file system and hadoop mapreduce supported by data-intensive distributed applications. But, smpartphone service using hadoop system is a very vulnerable state to data authentication. In this paper, we propose a initial authentication protocol of hadoop system assisted by smartphone service. Proposed protocol is combine symmetric key cryptography techniques with ECC algorithm in order to support the secure multiple data processing systems. In particular, the proposed protocol to access the system by the user Hadoop when processing data, the initial authentication key and the symmetric key instead of the elliptic curve by using the public key-based security is improved.