
A Data Quality Management Maturity Model

  • Ryu, Kyung-Seok; Park, Joo-Seok; Park, Jae-Hong
    • ETRI Journal / v.28 no.2 / pp.191-204 / 2006
  • Many previous studies of data quality have focused on the realization and evaluation of both data value quality and data service quality. These studies revealed that poor data value quality and poor data service quality were caused by poor data structure. In this study we focus on metadata management, namely data structure quality, and introduce a data quality management maturity model as a preferred maturity model. We empirically show that data quality improves as data management matures.


A Study on Korean Data Naming Schemes (한글 데이터 명칭의 문법적 구조에 관한 연구)

  • 이춘열; 김흥수
    • The Journal of Information Technology and Database / v.6 no.2 / pp.101-114 / 1999
  • Data naming has been a long-lived issue in data management. A data name is the basic vehicle for conveying the meaning of data, so names should be organized in such a way that anyone can understand their meanings with ease; however, Korean data names have been organized based on English-like naming schemes. This paper proposes a Korean data naming scheme. First, we specify the information that data names should include. Second, we propose rules for organizing data names. For real-world applications, a database system for managing data names is proposed. The database is expected to provide a starting point from which an organization can develop its own data name repository.


Big Data Key Challenges

  • Alotaibi, Sultan
    • International Journal of Computer Science & Network Security / v.22 no.4 / pp.340-350 / 2022
  • The term big data refers to great volumes of data with complicated structure that are difficult to collect, store, process, and analyze. Big data analytics refers to the process of disclosing hidden patterns in big data. Such information and data sets can be useful and enable advanced services. However, analyzing and processing this information can reveal sensitive and personal information when the applications involved are correlated with users, such as location-based services; these concerns diminish when the applications involve general information, such as scientific results. In this work, a survey of security and privacy challenges and approaches in big data has been conducted. The challenges covered fall into the following areas: privacy, access control, encryption, and authentication in big data. Likewise, the approaches presented are privacy-preserving approaches, access control approaches, encryption approaches, and authentication approaches in big data.

Generic Multidimensional Model of Complex Data: Design and Implementation

  • Khrouf, Kais; Turki, Hela
    • International Journal of Computer Science & Network Security / v.21 no.12spc / pp.643-647 / 2021
  • The analysis of large volumes of data poses a challenge for deducing knowledge and new information. Data can be heterogeneous and complex: semi-structured data (e.g., XML), data from social networks (e.g., tweets), and factual data (e.g., the spread of Covid-19). In this paper, we propose a generic multidimensional model for analyzing complex data along several dimensions.

Obtaining bootstrap data for the joint distribution of bivariate survival times

  • Kwon, Se-Hyug
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.933-939 / 2009
  • Bivariate data in clinical research often involve two types of failure times: a mark variable for the first failure time, and the final failure time. This paper shows how to generate bootstrap data for Bayesian estimation of the joint distribution of bivariate survival times. The observed data were generated from Frank's copula family, and fake data were simulated with a Gamma prior on survival time. The bootstrap data were obtained by combining the mimic data with the observed data and the fake data simulated from the observed data.

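The abstract above describes generating bivariate survival data from Frank's copula family. As an illustrative sketch only (the paper's exact procedure, dependence parameter, and marginals are not given here), the following samples dependent uniform pairs from a Frank copula via the conditional-inverse method and maps them to exponential survival times; theta and the exponential rates are assumed values, and the exponential marginals stand in for the paper's Gamma prior.

```python
import math
import random

def frank_pair(theta, rng):
    """Sample one (u, v) pair from a Frank copula via conditional inversion."""
    u = rng.random()
    t = rng.random()
    # Solve C(v | u) = t for v in closed form (standard Frank copula result).
    num = t * (1.0 - math.exp(-theta))
    den = t * (math.exp(-theta * u) - 1.0) - math.exp(-theta * u)
    v = -math.log(1.0 + num / den) / theta
    return u, v

def bivariate_survival_sample(n, theta=5.0, rate1=1.0, rate2=0.5, seed=42):
    """Generate n dependent survival-time pairs with exponential marginals
    (assumed marginals for illustration; the paper uses a Gamma prior)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u, v = frank_pair(theta, rng)
        t1 = -math.log(1.0 - u) / rate1  # inverse-CDF transform to Exp(rate1)
        t2 = -math.log(1.0 - v) / rate2  # inverse-CDF transform to Exp(rate2)
        pairs.append((t1, t2))
    return pairs
```

A positive theta induces positive dependence between the two survival times, which is the property a bootstrap for a joint distribution needs to preserve.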

A Comparative Study of Big Data, Open Data, and My Data (빅데이터, 오픈데이터, 마이데이터의 비교 연구)

  • Park, Jooseok
    • The Journal of Bigdata / v.3 no.1 / pp.41-46 / 2018
  • With the advent of the fourth industrial revolution, data has become a very important resource, and the present era is called the 'Data Revolution Age.' It is said that the Data Revolution Age started with Big Data, accelerated with Open Data, and is completed with My Data. In this paper, we compare Big Data, Open Data, and My Data, and suggest the roles and effects of My Data as a digital resource.

Modeling and Implementation of Public Open Data in NoSQL Database

  • Min, Meekyung
    • International Journal of Internet, Broadcasting and Communication / v.10 no.3 / pp.51-58 / 2018
  • In order to utilize the various data provided by the Korean public open data portal, the data should be systematically managed using a database. Since the range of open data is enormous and the amount of data continues to increase, it is preferable to use a database capable of processing big data in order to analyze and utilize it. This paper proposes a data modeling and implementation method suitable for public data. The target data are the subway-related data provided by the public open data portal. The schema of the public data related to Seoul Metro stations is analyzed and its problems are presented. To solve these problems, this paper proposes a method to normalize and structure the subway data and model it in a NoSQL database. In addition, the implementation is demonstrated using MongoDB, a document-based database capable of processing big data.
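The abstract describes restructuring flat public-data records into a document model for MongoDB. A minimal sketch of that idea, with hypothetical field names (station, line, exit numbers are illustrative, not the paper's schema): flat rows keyed by station are folded into one nested document per station, the shape a document store favors.

```python
def to_station_documents(rows):
    """Fold flat (station, line, exit_no) rows into one nested document
    per station, the shape a document database like MongoDB expects.
    Field names here are illustrative, not the paper's actual schema."""
    docs = {}
    for row in rows:
        doc = docs.setdefault(row["station"], {
            "_id": row["station"],  # station name as the document key
            "lines": [],
            "exits": [],
        })
        if row["line"] not in doc["lines"]:
            doc["lines"].append(row["line"])
        if row["exit_no"] not in doc["exits"]:
            doc["exits"].append(row["exit_no"])
    return list(docs.values())

# With a live server, the documents could then be stored via pymongo, e.g.:
#   from pymongo import MongoClient
#   MongoClient().subway.stations.insert_many(to_station_documents(rows))
```

Nesting lines and exits inside the station document avoids the join that a normalized relational schema would require for the same lookup.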

From Multimedia Data Mining to Multimedia Big Data Mining

  • Constantin, Gradinaru Bogdanel; Mirela, Danubianu; Luminita, Barila Adina
    • International Journal of Computer Science & Network Security / v.22 no.11 / pp.381-389 / 2022
  • With the collection of huge volumes of text, image, audio, and video, or combinations of these (in a word, multimedia data), the need arose to explore them in order to discover new, unexpected, and possibly valuable information for decision making. Starting from existing data mining, but not as a mere extension of it, multimedia mining emerged as a distinct field with increased complexity and many characteristic aspects. Later, the concept of big data was extended to multimedia, resulting in multimedia big data, which in turn gave rise to the multimedia big data mining process. This paper surveys multimedia data mining, starting from the general concept and following the transition from multimedia data mining to multimedia big data mining, through an up-to-date synthesis of work in the field which, to the best of our knowledge, is a novelty.

Environmental Survey Data Analysis by Data Fusion Techniques

  • Cho, Kwang-Hyun; Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.17 no.4 / pp.1201-1208 / 2006
  • Data fusion is generally defined as the use of techniques that combine data from multiple sources in order to achieve inferences. Data fusion is also called data combination or data matching. It is divided into five types: exact matching, judgmental matching, probability matching, statistical matching, and data linking. Gyeongnam province currently conducts a social survey of its residents every year, but analysis is limited because different surveys are conducted on a three-year cycle. In this paper, we study data fusion of environmental survey data using SAS macros. The data fusion outputs can be used in environmental preservation and environmental improvement.

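Of the fusion types the abstract lists, statistical matching is the one most commonly sketched in code. The paper works in SAS macros; as an illustrative plain-Python stand-in with made-up variable names, each recipient record is matched to the nearest donor record on the variables the two surveys share, and the donor-only variable is transferred.

```python
def statistical_match(recipients, donors, common_keys, donate_key):
    """Nearest-neighbour statistical matching: for each recipient record,
    find the donor closest on the shared numeric variables and copy over
    the donor-only variable. Variable names are illustrative, not the
    paper's actual survey items."""
    def distance(a, b):
        # Squared Euclidean distance on the shared variables.
        return sum((a[k] - b[k]) ** 2 for k in common_keys)

    fused = []
    for rec in recipients:
        best = min(donors, key=lambda d: distance(rec, d))
        merged = dict(rec)
        merged[donate_key] = best[donate_key]
        fused.append(merged)
    return fused
```

In practice the shared variables would be standardized before computing distances so that no single variable dominates the match.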

Data Reduction Method in Massive Data Sets

  • Namo, Gecynth Torre; Yun, Hong-Won
    • Journal of information and communication convergence engineering / v.7 no.1 / pp.35-40 / 2009
  • Many researchers strive to improve the performance of RFID systems, and many papers have been written to address one of the major drawbacks of this potent technology: data management. As RFID systems capture billions of data records, problems arising from dirty data and large data volumes have caused an uproar in the RFID community, and researchers are seeking ways to address these issues. In particular, effective data management is important for handling large volumes of data. This paper presents data reduction techniques that attempt to address these issues and introduces a new data reduction algorithm that may serve as an alternative for reducing data in RFID systems. A process for extracting data from the reduced database is also presented. A performance study is conducted to analyze the new data reduction algorithm; our analysis shows the utility and feasibility of the categorization reduction algorithm.
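The paper's specific categorization reduction algorithm is not reproduced in the abstract. As a hedged sketch of the general idea behind RFID data reduction, repeated raw reads of the same tag at the same reader can be collapsed into a single summary record (tag, reader, first seen, last seen, count), a common way to shrink streams of duplicate reads.

```python
def reduce_reads(reads):
    """Collapse raw (tag, reader, timestamp) RFID reads into one summary
    record per (tag, reader) pair: first/last observation and read count.
    A generic duplicate-collapsing reduction, not the paper's exact algorithm."""
    summary = {}
    for tag, reader, ts in reads:
        key = (tag, reader)
        if key not in summary:
            summary[key] = {"tag": tag, "reader": reader,
                            "first_seen": ts, "last_seen": ts, "count": 1}
        else:
            rec = summary[key]
            rec["first_seen"] = min(rec["first_seen"], ts)
            rec["last_seen"] = max(rec["last_seen"], ts)
            rec["count"] += 1
    return list(summary.values())
```

A reader polling several times per second produces many identical reads per tag, so this kind of summary can cut storage by orders of magnitude while still supporting queries such as "when was tag T last seen at reader R".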