http://dx.doi.org/10.9708/jksci.2015.20.8.077

Efficient Multimedia Data File Management and Retrieval Strategy on Big Data Processing System  

Lee, Jae-Kyung (Dept. of Computer Engineering, Hongik University)
Shin, Su-Mi (Dept. of Computer Engineering, Hongik University)
Kim, Kyung-Chang (Dept. of Computer Engineering, Hongik University)
Abstract
The storage and retrieval of multimedia data is becoming increasingly important in many application areas, including record management, video (CCTV) management, and the Internet of Things (IoT). In these applications, the number of multimedia files that need to be stored and managed is tremendous and constantly growing. In this paper, we propose a technique for retrieving a very large number of multimedia files using the Hadoop Framework. Our strategy is based on managing metadata that describes the characteristics of the files stored in the Hadoop Distributed File System (HDFS). The metadata schema is represented in HBase and queried using SQL-on-Hadoop engines (Hive, Tajo). HBase, Hive, and Tajo are all part of the Hadoop Ecosystem. A preliminary experiment on multimedia data files stored in HDFS shows the viability of the proposed strategy.
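The metadata-based lookup described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: the table name `file_metadata`, its columns, the sample paths, and the helper functions `build_catalog` and `find_paths` are all hypothetical, and an in-memory SQLite database stands in for HBase and the SQL-on-Hadoop engines. The paper's actual schema and queries are not given in the abstract. The sketch only shows the general idea that a SQL query over metadata returns the HDFS paths of the matching files, so the files themselves do not have to be scanned.

```python
import sqlite3

# Hypothetical metadata schema (an assumption for illustration only).
# SQLite is a stand-in here; the paper keeps metadata in HBase and
# queries it through Hive or Tajo.
SCHEMA = """
CREATE TABLE file_metadata (
    hdfs_path   TEXT PRIMARY KEY,  -- location of the media file in HDFS
    media_type  TEXT,              -- e.g. 'video' or 'image'
    source      TEXT,              -- e.g. a CCTV camera identifier
    recorded_at TEXT,              -- ISO 8601 timestamp
    size_bytes  INTEGER
)
"""

# Made-up sample records; none of these values come from the paper.
SAMPLE_ROWS = [
    ("/media/cctv/cam01/20150801_0900.mp4", "video", "cam01", "2015-08-01T09:00:00", 52428800),
    ("/media/cctv/cam01/20150801_1000.mp4", "video", "cam01", "2015-08-01T10:00:00", 50331648),
    ("/media/cctv/cam02/20150801_0900.mp4", "video", "cam02", "2015-08-01T09:00:00", 41943040),
    ("/media/records/scan_0001.jpg", "image", "scanner", "2015-07-30T14:12:00", 204800),
]


def build_catalog(rows):
    """Create an in-memory metadata catalog and load the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO file_metadata VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def find_paths(conn, media_type, source, start, end):
    """Return HDFS paths whose metadata matches, ordered by recording time.

    The time window is half-open: start <= recorded_at < end.
    """
    cursor = conn.execute(
        "SELECT hdfs_path FROM file_metadata "
        "WHERE media_type = ? AND source = ? "
        "AND recorded_at >= ? AND recorded_at < ? "
        "ORDER BY recorded_at",
        (media_type, source, start, end),
    )
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    catalog = build_catalog(SAMPLE_ROWS)
    for path in find_paths(catalog, "video", "cam01",
                           "2015-08-01T00:00:00", "2015-08-02T00:00:00"):
        print(path)
```

In a real deployment the returned paths would then be opened through an HDFS client. Keeping the lookup on the metadata side is what lets the number of stored files grow without each retrieval having to touch the files themselves.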
Keywords
Hadoop Framework; Metadata; Multimedia Data; CCTV Data; Record Management; Retrieval; Hadoop Ecosystem