• Title/Abstract/Keywords: Data journal

Search results: 188,479 items (processing time: 0.106 s)

A Data Design for Increasing the Usability of Subway Public Data

  • Min, Meekyung
    • International Journal of Internet, Broadcasting and Communication / Vol. 11, No. 4 / pp. 18-25 / 2019
  • The public data portal provides various public data created by the government in the form of files and open APIs. To increase the usability of public open data, a variety of information should be provided to users in a form that is convenient to use. This requires a structured data design for public data. In this paper, we propose a data design method to improve the usability of the Seoul subway public data. We first identify properties of the current subway public data and then classify the data based on these properties. The properties used as classification criteria are stored, derived, static, and dynamic properties. We also analyze the limitations of the current data for each property. Based on this analysis, we classify the currently used subway public data into code entities, base entities, and history entities and present an improved entity design according to this classification. In addition, we propose data retrieval functions to increase utilization of the data. Data designed according to the proposed method can resolve the duplication and inconsistency found in the data currently in use and yield better-structured data. As a result, more functions can be provided to users, which is the basis for increasing the usability of subway public data.
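
The entity classification named in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions: the abstract does not give the paper's schema, so every class name, field name, and sample value below is hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LineCode:
    """Code entity: a small, static lookup value (hypothetical fields)."""
    code: str
    name: str


@dataclass(frozen=True)
class Station:
    """Base entity: stored, mostly static properties of a station."""
    station_id: str
    name: str
    line: LineCode


@dataclass(frozen=True)
class RidershipRecord:
    """History entity: dynamic properties recorded per station and day."""
    station_id: str
    day: date
    boardings: int
    alightings: int

    @property
    def total(self) -> int:
        # Derived property: computed from stored values instead of being stored.
        return self.boardings + self.alightings


def total_by_station(records, station_id):
    """Retrieval helper: sum the derived totals for one station."""
    return sum(r.total for r in records if r.station_id == station_id)


# Made-up sample values, for illustration only.
line2 = LineCode(code="L2", name="Line 2")
station = Station(station_id="S001", name="Example Station", line=line2)
records = [
    RidershipRecord("S001", date(2019, 1, 1), boardings=1200, alightings=1100),
    RidershipRecord("S001", date(2019, 1, 2), boardings=1300, alightings=1250),
    RidershipRecord("S002", date(2019, 1, 1), boardings=500, alightings=480),
]
print(total_by_station(records, station.station_id))
```

Keeping the derived total as a computed property rather than a stored column is one way to avoid the kind of duplication and inconsistency the abstract mentions.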

Secure Data Sharing in The Cloud Through Enhanced RSA

  • Islam abdalla mohamed;Loay F. Hussein;Anis Ben Aissa;Tarak kallel
    • International Journal of Computer Science & Network Security / Vol. 23, No. 2 / pp. 89-95 / 2023
  • Cloud computing today provides huge computational resources, storage capacity, and many kinds of data services. Data sharing in the cloud is the practice of exchanging files between various users via cloud technology. The main difficulty with file sharing in the public cloud is maintaining privacy and integrity through data encryption. To address this issue, this paper proposes an Enhanced RSA encryption scheme (ERSA) for data sharing in the public cloud that protects privacy and strengthens data integrity. Data owners store their files in the cloud after encrypting them with ERSA, which combines the RSA algorithm, the XOR operation, and SHA-512. This approach can preserve the confidentiality and integrity of a file in any cloud system, while data owners are authorized for data access through their unique identities. Furthermore, analysis and experimental results are presented to verify the efficiency and security of the proposed scheme.

Enhanced Hybrid Privacy Preserving Data Mining Technique

  • Kundeti Naga Prasanthi;M V P Chandra Sekhara Rao;Ch Sudha Sree;P Seshu Babu
    • International Journal of Computer Science & Network Security / Vol. 23, No. 6 / pp. 99-106 / 2023
  • Nowadays, large volumes of data are accumulating in every field due to the increase in the capacity of storage devices. Data mining can be applied to these large volumes of data to find useful patterns that can be used for business growth, improving services, improving health conditions, etc. Data from different sources can be combined before applying data mining. The data thus gathered can be misused for identity theft, fake credit/debit card transactions, etc. To overcome this, data mining techniques that provide privacy are required. Several privacy-preserving data mining techniques are available in the literature, such as randomization, perturbation, and anonymization. This paper proposes an Enhanced Hybrid Privacy Preserving Data Mining (EHPPDM) technique. The proposed technique provides more data privacy than existing techniques while providing better classification accuracy. The experimental results show that classification accuracy increases with the EHPPDM technique.

Comparative Study of Evaluating the Trustworthiness of Data Based on Data Provenance

  • Gurjar, Kuldeep;Moon, Yang-Sae
    • Journal of Information Processing Systems / Vol. 12, No. 2 / pp. 234-248 / 2016
  • Due to the proliferation of data being exchanged and the increase of dependency on this data for critical decision-making, it has become imperative to ensure the trustworthiness of the data at the receiving end in order to obtain reliable results. Data provenance, the derivation history of data, is a useful tool for evaluating the trustworthiness of data. Various frameworks have been proposed to evaluate the trustworthiness of data based on data provenance. In this paper, we briefly review a history of these frameworks for evaluating the trustworthiness of data and present an overview of some prominent state-of-the-art evaluation frameworks. Moreover, we provide a comparative analysis of two key frameworks by evaluating various aspects in an executional environment. Our analysis points to various open research issues and provides an understanding of the functionalities of the frameworks that are used to evaluate the trustworthiness of data.

Data-Compression-Based Resource Management in Cloud Computing for Biology and Medicine

  • Zhu, Changming
    • Journal of Computing Science and Engineering / Vol. 10, No. 1 / pp. 21-31 / 2016
  • With the application and development of biomedical techniques such as next-generation sequencing, mass spectrometry, and medical imaging, the amount of biomedical data has been growing explosively. In processing such data, we face problems of big data, highly intensive computation, and high-dimensional data. Fortunately, cloud computing offers significant advantages in resource allocation, data storage, computation, and sharing, and offers a solution to the big data problems of biomedical research. To improve the efficiency of resource management in cloud computing, this paper proposes a clustering method and adopts the Radial Basis Function to compress comprehensive data sets found in biology and medicine with high quality, and stores these data with resource management in cloud computing. Experiments have validated that, with such data-compression-based resource management in cloud computing, large data sets from biology and medicine can be stored in less capacity. Furthermore, with the reverse operation of the Radial Basis Function, the compressed data can be reconstructed with high accuracy.

Similarity Measure Design on High Dimensional Data

  • Nipon, Theera-Umpon;Lee, Sanghyuk
    • 한국융합학회논문지 (Journal of the Korea Convergence Society) / Vol. 4, No. 1 / pp. 43-48 / 2013
  • A similarity measure for high-dimensional data was designed. The similarity measure between high-dimensional data was derived by analysing neighbor information with respect to the data sets. The obtained result could be applied to big data, because big data has multiple characteristics compared with a simple data set. Analysis of high-dimensional data could therefore serve as a preliminary study for big data. High-dimensional data analysis was also compared with the conventional similarity measure. The traditional similarity measure on overlapped data was illustrated, and its application to non-overlapped data was carried out. Its usefulness was established by mathematical proof and verified by calculating the similarity for an artificial data example.

Network-based Microarray Data Analysis Tool

  • Park, Hee-Chang;Ryu, Ki-Hyun
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 1 / pp. 53-62 / 2006
  • DNA microarray data analysis is a new technology for investigating the expression levels of thousands of genes simultaneously. Since DNA microarray data structures are varied and complicated, the data are generally stored in databases so that they can be accessed and controlled effectively. However, it is difficult to analyze and control the data when they are stored in several database management systems or in file format. The existing analysis tools for DNA microarray data pose many difficulties because of complicated instructions and dependency on data types and operating systems. In this paper, we design and implement a network-based analysis tool for obtaining useful information from DNA microarray data. With this tool, DNA microarray data can be analyzed effectively without special knowledge of or training in data types and analytical methods.

데이터 모델 재사용을 위한 사례기반추론 프레임워크 (Case-Based Reasoning Framework for Data Model Reuse)

  • 이재식;한재홍
    • 지능정보연구 (Journal of Intelligence and Information Systems) / Vol. 3, No. 2 / pp. 33-55 / 1997
  • A data model is a diagram that describes the properties of different categories of data and the associations among them within a business or information system. In spite of its importance and usefulness, data modeling requires not only a lot of time and effort but also extensive experience and expertise. The data models for similar business areas are analogous to one another. Therefore, it is reasonable to reuse already-developed data models if the target business area is similar to one that has been analyzed before. In this research, we develop a case-based reasoning system for data model reuse, which we call CB-DM Reuser (Case-Based Data Model Reuser). CB-DM Reuser consists of four subsystems: a graphical user interface to interact with the end user, a data model management system to build new data models, a case base to store past data models, and a knowledge base to store data modeling and data model reuse knowledge. We present the functionality of CB-DM Reuser and show how it works on a real-life application.

Detection and Correction Method of Erroneous Data Using Quantile Pattern and LSTM

  • Hwang, Chulhyun;Kim, Hosung;Jung, Hoekyung
    • Journal of information and communication convergence engineering / Vol. 16, No. 4 / pp. 242-247 / 2018
  • The data of K-Water waterworks is collected from various sensors and used as basic data for the operation and analysis of various devices. The sensor data is therefore very important, but it contains misleading data due to the characteristics of sensors in the external environment. However, existing cleansing methods concentrate on predicting missing data, so research on detection and prediction methods for the missing data remains scarce. This study detects wrong data by converting collected data into quintiles and patterning them. It is confirmed that the accuracy of detecting false data intentionally generated from real data is higher than that of the conventional method in all cases. In future research, we will demonstrate the proposed system's efficiency and accuracy in various environments.
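
The step of "converting collected data into quintiles and patterning them" can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not specify how bins or patterns are built, so the function names, the window length of 3, and the integer sample readings are all assumptions, and the detection and LSTM-based correction stages are not shown.

```python
from bisect import bisect_right
from statistics import quantiles


def to_quantile_bins(values, n_bins=5):
    """Map each reading to the index (0 .. n_bins-1) of its quantile bin."""
    cuts = quantiles(values, n=n_bins)  # n_bins - 1 cut points
    return [bisect_right(cuts, v) for v in values]


def sliding_patterns(bins, window=3):
    """Turn the bin sequence into overlapping fixed-length patterns (assumed window)."""
    return [tuple(bins[i:i + window]) for i in range(len(bins) - window + 1)]


# Made-up sensor readings (integers, e.g. tenths of a unit); 550 is an injected spike.
readings = [101, 103, 102, 104, 550, 105, 103, 106, 102, 104]
bins = to_quantile_bins(readings)
patterns = sliding_patterns(bins)
print(bins)
print(patterns[:3])
```

Because quantile bins only record rank, the spike lands in the top bin no matter how large it is; a detector built on these patterns would have to learn which bin sequences are unusual rather than rely on magnitude.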

가공송전 전선 자산데이터의 정제 자동화 알고리즘 개발 연구 (Automatic Algorithm for Cleaning Asset Data of Overhead Transmission Line)

  • Mun, Sung-Duk;Kim, Tae-Joon;Kim, Kang-Sik;Hwang, Jae-Sang
    • KEPCO Journal on Electric Power and Energy / Vol. 7, No. 1 / pp. 73-77 / 2021
  • As big data analysis technologies have developed worldwide, the importance of asset management for electric power facilities based on data analysis is increasing. It is essential to secure the quality of the data, which determines the performance of the RISK evaluation algorithm for asset management. To improve the reliability of asset management, asset data must be preprocessed. In particular, a process for cleaning dirty data is required, and there is an urgent need for an algorithm that reduces the time and improves the accuracy of data treatment. In this paper, we present the development of an automatic cleaning algorithm specialized for overhead transmission asset data. The data cleaning algorithm was developed to enable data cleaning by analyzing the quality and overall pattern of the raw data.