• Title/Summary/Keyword: Data journal

Search Results: 191,047

A Model Comparison for Spatiotemporal Data in Ubiquitous Environments: A Case Study

  • Noh, Seo-Young; Gadia, Shashi K.
    • Journal of Information Processing Systems / v.7 no.4 / pp.635-652 / 2011
  • In ubiquitous environments, many applications need to process data with time and space dimensions. Because of this, there is growing attention not only to gathering spatiotemporal data in ubiquitous environments, but also to processing such data in databases. In order to obtain the full benefit from spatiotemporal data, we need a data model that naturally expresses the properties of spatiotemporal data. In this paper, we introduce three spatiotemporal data models extended from temporal data models. The main goal of this paper is to determine which data model is less complex in the spatiotemporal context. To this end, we compare their query languages with respect to complexity, because the complexity of a query language is tightly coupled with its underlying data model. Through our investigation, we show that it is important to intertwine the space and time dimensions and to keep a one-to-one correspondence between an object in the real world and a tuple in the database in order to express queries in ubiquitous applications naturally.
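
To make the abstract's last point concrete, the following is a minimal sketch, not taken from the paper, of an attribute-level spatiotemporal object in Python: each attribute value carries its own time interval and region, so one real-world object remains one tuple even as its properties change over time and space. All class, field, and object names here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class STValue:
    """An attribute value stamped with the time interval and region where it holds."""
    value: object
    t_start: int      # year the value becomes valid (illustrative granularity)
    t_end: int        # year the value stops being valid
    region: str       # placeholder for a spatial extent (cell id, polygon id, ...)

@dataclass
class STObject:
    """One real-world object == one tuple; its history lives inside the attributes."""
    object_id: str
    attributes: dict = field(default_factory=dict)   # name -> list[STValue]

    def add(self, name: str, stvalue: STValue) -> None:
        self.attributes.setdefault(name, []).append(stvalue)

    def snapshot(self, t: int, region: str) -> dict:
        """Attribute values that hold at time t within the given region."""
        return {
            name: [v.value for v in vals
                   if v.t_start <= t <= v.t_end and v.region == region]
            for name, vals in self.attributes.items()
        }

# A moving sensor stays a single tuple even though its readings span cells and years.
sensor = STObject("sensor-17")
sensor.add("temperature", STValue(21.5, 2010, 2011, "cell-A"))
sensor.add("temperature", STValue(23.0, 2011, 2012, "cell-B"))
print(sensor.snapshot(2011, "cell-B"))   # {'temperature': [23.0]}
```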

Survey on Deep Learning Methods for Irregular 3D Data Using Geometric Information (불규칙 3차원 데이터를 위한 기하학정보를 이용한 딥러닝 기반 기법 분석)

  • Cho, Sung In; Park, Haeju
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.5 / pp.215-223 / 2021
  • 3D data can be categorized into two types: Euclidean data and non-Euclidean data. In general, 3D data exists in the form of non-Euclidean data. Due to irregularities in non-Euclidean data such as meshes and point clouds, early 3D deep learning studies transformed these data into regular Euclidean forms in order to use them. This approach, however, does not use memory efficiently and causes the loss of essential information about objects. Thus, various approaches have emerged that can apply deep learning architectures directly to non-Euclidean 3D data. In this survey, we introduce various deep learning methods for mesh and point cloud data. After analyzing the operating principles of these methods designed for irregular data, we compare the performance of existing methods on shape classification and segmentation tasks.
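
As a concrete illustration of the early "transform to a regular Euclidean form" strategy that the survey contrasts with direct methods, here is a minimal sketch, with a made-up point cloud and resolution, of voxelizing a point cloud into a dense occupancy grid with NumPy; the cubic memory growth and loss of fine detail are exactly the drawbacks noted above.

```python
import numpy as np

def voxelize(points: np.ndarray, resolution: int = 32) -> np.ndarray:
    """Convert an (N, 3) point cloud into a dense (R, R, R) occupancy grid.

    Memory is O(R^3) regardless of the number of points, and any surface
    detail finer than one voxel is lost.
    """
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scaled = (points - mins) / np.maximum(maxs - mins, 1e-9)   # normalize to [0, 1)
    idx = np.clip((scaled * resolution).astype(int), 0, resolution - 1)
    grid = np.zeros((resolution, resolution, resolution), dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid

# 2,048 random points become a 32^3 = 32,768-cell grid, most of it empty.
cloud = np.random.default_rng(0).random((2048, 3))
occupancy = voxelize(cloud, resolution=32)
print(occupancy.shape, int(occupancy.sum()))
```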

Big Data Platform Case Analysis and Deployment Strategies to Revitalize the Data Economy (데이터 경제 활성화를 위한 빅데이터 플랫폼 사례 분석 및 구축 전략)

  • Kim, Baehyun
    • Convergence Security Journal / v.21 no.1 / pp.73-78 / 2021
  • Big data is a key driver of the fourth industrial revolution, which is characterized by ultra-connectivity, ultra-intelligence, and ultra-convergence, and it is important to create innovation and to share, link, and utilize data in order to discover business models. However, it is difficult to secure and utilize high-quality, abundant data when big data platforms are built in a conventional manner without considering sharing and linkage. Therefore, this paper presents a development direction for big data platform infrastructure by comparing and analyzing various cases of big data platforms so as to enable data production, construction, linkage, and distribution.

Generating and Validating Synthetic Training Data for Predicting Bankruptcy of Individual Businesses

  • Hong, Dong-Suk; Baik, Cheol
    • Journal of information and communication convergence engineering / v.19 no.4 / pp.228-233 / 2021
  • In this study, we analyze the credit information (loans, delinquency information, etc.) of individual business owners to generate a large volume of training data for establishing a bankruptcy prediction model through a partial synthetic training technique. Furthermore, we evaluate the prediction performance obtained with the newly generated data compared with the actual data. In the experiments (a logistic regression task), using training data generated by conditional tabular generative adversarial networks (CTGAN) improves recall by a factor of 1.75 compared with using the actual data. The probability that the actual and generated data are sampled from an identical distribution is verified to be much higher than 80%. Providing artificial intelligence training data through data synthesis in the fields of credit rating and default risk prediction for individual businesses, where research has been relatively inactive, promotes further in-depth research on utilizing such methods.
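
The workflow described in this abstract can be approximated with the open-source ctgan package and scikit-learn. The sketch below is a rough outline under stated assumptions, not the authors' code: the credit table, its column names, and the label are fabricated stand-ins, and the epoch count and oversampling factor are arbitrary.

```python
import numpy as np
import pandas as pd
from ctgan import CTGAN                                  # pip install ctgan
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the credit-information table of individual businesses.
rng = np.random.default_rng(0)
n = 2000
real = pd.DataFrame({
    "loan_amount": rng.lognormal(10, 1, n).round(0),
    "delinquency_count": rng.poisson(0.5, n),
    "bankrupt": rng.integers(0, 2, n),                    # 0/1 label (random, demo only)
})
train, test = train_test_split(real, test_size=0.3,
                               stratify=real["bankrupt"], random_state=0)

# Fit CTGAN on the real training rows, then sample a larger synthetic table.
synthesizer = CTGAN(epochs=100)
synthesizer.fit(train, discrete_columns=["bankrupt"])
synthetic = synthesizer.sample(5 * len(train))

def recall_on_test(train_df: pd.DataFrame) -> float:
    """Train logistic regression on the given table, report recall on real test rows."""
    X, y = train_df.drop(columns="bankrupt"), train_df["bankrupt"]
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    return recall_score(test["bankrupt"], clf.predict(test.drop(columns="bankrupt")))

print("recall, trained on real data:     ", recall_on_test(train))
print("recall, trained on synthetic data:", recall_on_test(synthetic))
```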

A Research on the Energy Data Analysis using Machine Learning (머신러닝 기법을 활용한 에너지 데이터 분석에 관한 연구)

  • Kim, Dongjoo; Kwon, Seongchul; Moon, Jonghui; Sim, Gido; Bae, Moonsung
    • KEPCO Journal on Electric Power and Energy / v.7 no.2 / pp.301-307 / 2021
  • After the spread of data collection devices such as smart meters, energy data is increasingly collected in a variety of ways, and its importance continues to grow. However, due to technical or practical limitations, errors such as missing values or outliers occur during the data collection process. Especially in the case of customer-related data, billing problems may occur, so energy companies are conducting various research efforts to process such data. In addition, efforts are being made to create added value from data, but such services are difficult to provide unless the reliability of the data is guaranteed. In order to address these challenges, this research analyzes prior work on bad data processing specifically in the energy field and proposes new missing value processing methods to improve the reliability and field utilization of energy data.
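
The kind of missing-value and outlier handling discussed above can be prototyped with pandas. The sketch below is a generic illustration under assumed data, not the paper's proposed method: the 15-minute smart-meter series, the gap, the spike, and the simple median-based outlier rule are all fabricated for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical 15-minute smart-meter load readings with one gap and one spike.
idx = pd.date_range("2021-01-01", periods=96, freq="15min")
load = pd.Series(1.0 + 0.3 * np.sin(np.linspace(0, 4 * np.pi, 96)), index=idx)
load.iloc[10:14] = np.nan          # communication failure -> missing block
load.iloc[50] = 25.0               # metering error -> outlier

# 1) Flag outliers with a crude rule (> 5x the median) and blank them out.
outliers = load > 5 * load.median()
clean = load.mask(outliers)

# 2) Fill short gaps by time-based interpolation; longer gaps would need
#    day-ahead profiles or model-based imputation as studied in such work.
filled = clean.interpolate(method="time", limit=8)

print("missing before:", int(load.isna().sum()), "after:", int(filled.isna().sum()))
print("outliers removed:", int(outliers.sum()))
```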

Data Design Strategy for Data Governance Applied to Customer Relationship Management

  • Sangwon LEE; Joohyung KIM
    • International Journal of Advanced Culture Technology / v.11 no.3 / pp.338-345 / 2023
  • Nowadays, many companies are striving to turn customer value into business value. Customer Relationship Management is a management system that develops effective and efficient marketing strategies by classifying customers in detail based on their information, i.e., databases, and it consists of various information technologies. To implement this management system, a customer integration database must be established, customer characteristics (buying behavior, preferences, etc.) must be analyzed using the established databases, and the behavior of each customer must be predicted. This study addresses the systematic management of the large amount of customer data generated by companies that apply Customer Relationship Management, in order to develop the data design and data governance strategies that should be considered to increase customer value and, in turn, company value. We mainly examine the characteristics of customer relationship management and data governance, and then explore the link between the field of customer relationship management and data governance. In addition, we develop a data strategy that companies need in order to perform data governance for customer relationship management.

Privacy-Constrained Relational Data Perturbation: An Empirical Evaluation

  • Deokyeon Jang; Minsoo Kim; Yon Dohn Chung
    • Journal of Information Processing Systems / v.20 no.4 / pp.524-534 / 2024
  • The release of relational data containing personal sensitive information poses a significant risk of privacy breaches. To preserve privacy while publishing such data, it is important to implement techniques that ensure the protection of sensitive information. One popular technique used for this purpose is data perturbation, which is widely used for privacy-preserving data release due to its simplicity and efficiency. However, data perturbation has some limitations that prevent its practical application, so it is necessary to propose alternative solutions that overcome these limitations. In this study, we propose a novel approach to preserving privacy in the release of relational data containing personal sensitive information. This approach introduces an intuitive, syntactic privacy criterion for data perturbation and two perturbation methods for relational data release. Through experiments with synthetic and real data, we evaluate the performance of our methods.
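
For intuition about data perturbation as a general technique (this is not the syntactic criterion or the two methods proposed in the paper), here is a minimal sketch that adds zero-mean Gaussian noise to the numeric attributes of a relational table; the table, column names, and noise scale are assumptions made for the example.

```python
import numpy as np
import pandas as pd

def perturb_numeric(df: pd.DataFrame, columns, scale: float = 0.1,
                    seed: int = 0) -> pd.DataFrame:
    """Copy df and add zero-mean Gaussian noise to the given numeric columns.

    The noise standard deviation is `scale` times each column's own standard
    deviation, so the perturbation strength is relative to the attribute's spread.
    """
    rng = np.random.default_rng(seed)
    out = df.copy()
    for col in columns:
        out[col] = df[col] + rng.normal(0.0, scale * df[col].std(), size=len(df))
    return out

# Hypothetical microdata with sensitive numeric attributes.
people = pd.DataFrame({
    "age":     [34, 51, 29, 46, 62],
    "salary":  [52000, 87000, 43000, 99000, 61000],
    "zipcode": ["02841", "02841", "13529", "13529", "02841"],
})
released = perturb_numeric(people, ["age", "salary"], scale=0.2)

# Individual values no longer match exactly, while aggregates stay roughly stable.
print(released.round(1))
print("mean salary, original:", people["salary"].mean(),
      " perturbed:", round(released["salary"].mean(), 1))
```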

An Extended Relational Data Model for Database Uncertainty Using Data Source Reliability (데이터 제공원의 신뢰도를 고려한 확장 관계형 데이터 모델)

  • 정철용; 이석균; 서용무
    • The Journal of Information Technology and Database / v.6 no.1 / pp.15-25 / 1999
  • We propose an extended relational data model that can represent the reliability of data. In this paper, the reliability of data is defined as the reliability of the source from which the data originated. We represent the reliability of data at the level of attribute values, instead of tuples, and then define the selection, product, and join operators.
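
The idea of attaching reliability to individual attribute values rather than to whole tuples can be sketched as follows. This is only an illustration: the relation, the source reliabilities, and the rule that a joined value takes the minimum of the two input reliabilities are assumptions, not necessarily the operators defined in the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RValue:
    """An attribute value tagged with the reliability of its source (0.0 to 1.0)."""
    value: object
    reliability: float

# A relation is a list of tuples; every attribute value carries its own reliability.
employees = [
    {"name": RValue("Kim", 0.90), "dept": RValue("Sales", 0.60)},
    {"name": RValue("Lee", 0.95), "dept": RValue("R&D",   0.90)},
]
departments = [
    {"dept": RValue("R&D", 0.80), "budget": RValue(1_200_000, 0.70)},
]

def select(relation, attr, predicate):
    """Selection: keep tuples whose `attr` value satisfies the predicate."""
    return [t for t in relation if predicate(t[attr].value)]

def join(r, s, attr, combine=min):
    """Join on `attr`; the joined value's reliability is combined from both
    inputs (minimum by default, an illustrative assumption)."""
    result = []
    for t in r:
        for u in s:
            if t[attr].value == u[attr].value:
                merged = {**t, **u}
                merged[attr] = RValue(t[attr].value,
                                      combine(t[attr].reliability, u[attr].reliability))
                result.append(merged)
    return result

print(select(employees, "dept", lambda d: d == "R&D"))
print(join(employees, departments, "dept"))
```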


A Note on Quartile (4분위수에 대한 메모)

  • 박동준; 황현미
    • Journal of Korean Society for Quality Management / v.26 no.3 / pp.150-155 / 1998
  • It is necessary to describe a data set after data collection in an elementary statistics course. Two major numerical summaries of a data set are measures of central location and dispersion. There are various numerical summary methods for presenting how data are dispersed, and each method has its own advantages and disadvantages. Among several methods for describing the dispersion of a data set, quartiles are discussed. When the data type is discrete, exact quartile values are sometimes ambiguous to find, whereas exact quartile values are obtained for continuous data. Examples of both data types are given. The programs listed in the paper may be used to obtain quartiles in MINITAB and SAS.
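
The ambiguity for discrete data that the note discusses can be seen directly in NumPy, where different quantile conventions give different quartiles for the same small integer data set; this sketch uses NumPy rather than the MINITAB/SAS programs referenced in the paper.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8])   # small discrete data set

# Different conventions for the sample quantile give different Q1 and Q3.
# (The `method` keyword requires NumPy >= 1.22; older versions call it `interpolation`.)
for method in ("linear", "lower", "higher", "midpoint", "nearest"):
    q1, q3 = np.percentile(data, [25, 75], method=method)
    print(f"{method:>9}: Q1 = {q1:.2f}, Q3 = {q3:.2f}")

# For (approximately) continuous data the conventions agree far more closely,
# because observations rarely fall exactly on the quartile boundaries.
cont = np.random.default_rng(0).normal(size=10_000)
print(np.percentile(cont, [25, 75], method="linear").round(3))
print(np.percentile(cont, [25, 75], method="nearest").round(3))
```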


An Efficient Schema Extracting Technique Using DTD in XML Documents (DTD를 이용한 XML문서의 효율적인 스키마 추출 기법)

  • Ahn, Sung-Eun; Choi, Hwang-Kyu
    • Journal of Industrial Technology / v.21 no.A / pp.141-146 / 2001
  • XML is fast emerging as the dominant standard for representing and exchanging data on the Web. As the amount of data available on the Web has increased dramatically in recent years, that data now resides in forms ranging from semi-structured data to highly structured data in relational databases. As semi-structured data comes to be represented in XML, XML will increase the usability of semi-structured data. In this paper, we propose an idea for extracting a schema from XML documents using a DTD.
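
As a minimal starting point for working with DTDs programmatically, the sketch below reads element and attribute declarations from a DTD with lxml; it is only a generic illustration of the ingredients involved, not the schema extraction technique proposed in the paper, and the sample DTD is made up.

```python
from io import StringIO
from lxml import etree          # pip install lxml

# Hypothetical DTD for a small bibliography document.
dtd_text = """
<!ELEMENT bib (book+)>
<!ELEMENT book (title, author+, year?)>
<!ATTLIST book isbn CDATA #REQUIRED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year (#PCDATA)>
"""
dtd = etree.DTD(StringIO(dtd_text))

# Walk the element declarations and print a rough schema outline.
for elem in dtd.iterelements():
    attrs = [attr.name for attr in elem.iterattributes()]
    print(f"element {elem.name}: content type = {elem.type}, attributes = {attrs}")
```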
