• Title/Abstract/Keywords: Data Paper

Design and application of effective data extraction technique from Web databases (웹 기반 데이터베이스로부터의 유용한 데이터 추출 기법의 설계 및 응용)

  • Hwang, Doo-Sung
    • Journal of the Korea Academia-Industrial cooperation Society / Vol. 6, No. 4 / pp. 309-314 / 2005
  • This paper analyzes techniques for extracting target information from distributed web databases for bioinformatics, based on the relationships among the information. We also discuss the design and implementation of a method for enhancing knowledge about protein information. A web data extractor can be constructed manually, semi-automatically, or automatically. A data extractor generally uses identifiers to search for and extract target information from a specified web page. This paper presents a design and implementation for the protein databases of an organism using web data extraction techniques.

A Secure Cloud Computing System by Using Encryption and Access Control Model

  • Mahmood, Ghassan Sabeeh;Huang, Dong Jun;Jaleel, Baidaa Abdulrahman
    • Journal of Information Processing Systems / Vol. 15, No. 3 / pp. 538-549 / 2019
  • Cloud computing is the concept of providing information technology services, such as software, hardware, networking, and storage, over the Internet. These services can be accessed anywhere at any time on a pay-per-use basis. However, storing data on servers is a challenging aspect of cloud computing. This paper uses cryptography and access control to ensure the confidentiality, integrity, and proper control of access to sensitive data. We propose a model that can protect data in cloud computing. The model combines an enhanced RSA encryption algorithm with a role-based access control model and the eXtensible Access Control Markup Language (XACML) to provide security while allowing data access. The model stores data in the cloud using cryptography and grants access through the access control model, with minimum time and cost for encryption and decryption.

Restricted maximum likelihood estimation of a censored random effects panel regression model

  • Lee, Minah;Lee, Seung-Chun
    • Communications for Statistical Applications and Methods / Vol. 26, No. 4 / pp. 371-383 / 2019
  • Panel data sets have been developed in various areas, and many recent studies have analyzed panel, or longitudinal, data sets. Maximum likelihood (ML) may be the most common statistical method for analyzing panel data models; however, inference based on the ML estimate will have an inflated Type I error because the ML method tends to give a downwardly biased estimate of the variance components when the sample size is small. The underestimation can be severe when the data are incomplete. This paper proposes the restricted maximum likelihood (REML) method for a random effects panel data model with a censored dependent variable. The likelihood function of the model is complex in that it includes a multidimensional integral. Many authors have proposed integral approximation methods for computing the likelihood function; however, it is well known that integral approximation methods are inadequate for high-dimensional integrals in practice. This paper proposes using the moments of a truncated multivariate normal random vector to calculate the multidimensional integral. In addition, a proper asymptotic standard error of the REML estimate is given.
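
The downward bias of ML variance estimates noted in this abstract can be illustrated in the simplest possible setting. The expressions below assume an i.i.d. normal sample $x_1, \dots, x_n$ with unknown mean and variance $\sigma^2$, which is a textbook special case and not the censored random effects panel model of the paper.

```latex
\hat{\sigma}^2_{\mathrm{ML}} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2,
\qquad
\mathbb{E}\!\left[\hat{\sigma}^2_{\mathrm{ML}}\right] = \frac{n-1}{n}\,\sigma^2 < \sigma^2,
\qquad
\hat{\sigma}^2_{\mathrm{REML}} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 .
```

In this special case the REML estimator is unbiased, and the gap between the two estimators is largest when $n$ is small, which is consistent with the small-sample concern the abstract raises.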

A Study on Security Event Detection in ESM Using Big Data and Deep Learning

  • Lee, Hye-Min;Lee, Sang-Joon
    • International Journal of Internet, Broadcasting and Communication / Vol. 13, No. 3 / pp. 42-49 / 2021
  • As cyber attacks become more intelligent, it is increasingly difficult to detect advanced attacks in fields such as industry, defense, and medical care. Individual security systems such as IPS (Intrusion Prevention System) are in use, but the need for centralized, integrated management of these security systems is increasing. In this paper, we collect big data for intrusion detection and build an intrusion detection platform using deep learning with a CNN (Convolutional Neural Network) model. We design an intelligent big data platform that collects data by observing and analyzing user visit logs and linking them with big data. We evaluated the performance of the Intrusion Detection System (IDS) using the KDD99 dataset, which is based on data developed by DARPA in 1998, and tested the four KDD99 attack categories: DoS, U2R, R2L, and probing.

Optimal filter design at the semiconductor gas sensor by using genetic algorithm (유전알고리즘을 이용한 반도체식 가스센서 최적 필터 설계)

  • Kong, Jung-Shik
    • Design & Manufacturing / Vol. 16, No. 1 / pp. 15-20 / 2022
  • This paper addresses the problem that gas sensor data become inaccurate due to temperature control when a semiconductor gas sensor is driven. Interest in semiconductor gas sensors has recently been high because they are small and can be driven with low power. Although semiconductor gas sensors have various advantages, they must operate at high temperatures. First, temperature control was configured to adjust the temperature of the heater mounted on the gas sensor. While the heater temperature is being controlled, the gas sensor data fluctuate with the controlled temperature even when the same gas concentration is supplied. To resolve this problem, gas and temperature data are extracted, and a relation function between the gas and temperature data is constructed. A low-pass filter is included to obtain stable data. The optimal gain and parameters relating the gas and temperature data are then found using a genetic algorithm.
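
The low-pass filtering step mentioned in this abstract could look roughly like the sketch below. The abstract does not say which filter the author used, so the first-order (exponential smoothing) form, the function name `low_pass`, and the parameter `alpha` are all assumptions made for illustration.

```python
def low_pass(samples, alpha):
    """Hypothetical first-order low-pass filter (not the paper's design).

    Applies y[k] = y[k-1] + alpha * (x[k] - y[k-1]), seeded with the
    first sample. Smaller alpha means stronger smoothing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    filtered = []
    state = None
    for x in samples:
        # The first sample initializes the filter state.
        state = x if state is None else state + alpha * (x - state)
        filtered.append(state)
    return filtered
```

In a setup like the one described, a smoothing coefficient such as `alpha` would be one of the parameters a genetic algorithm could tune, although the abstract does not list the parameters actually optimized.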

Developing the Korean National Archaeological Data Digital Archive: An Exploratory Study (국가 고고학 데이터 디지털 아카이브 개발을 위한 연구)

  • Rhee, Hea Lim
    • Journal of Korean Society of Archives and Records Management / Vol. 18, No. 2 / pp. 1-28 / 2018
  • Because archaeological artifacts are often destroyed during physical excavation, the data archaeologists gather in the field is rich with research potential. Few in Korea have paid attention to digital archives for archaeological data or argued for their development. This paper considers the significance and necessity of archaeological data and digital archives for its preservation and access. It also raises awareness of the need to develop a Korean national archaeological data digital archive. The paper first overviews the nature of the archaeological discipline, data, and digital archives. Then it investigates well-known, global cases involving digital archiving of archaeological data. Based on these foundations, the paper discusses principal and prior challenges to developing a Korean national archaeological data digital archive.

Collaborative Modeling of Medical Image Segmentation Based on Blockchain Network

  • Yang Luo;Jing Peng;Hong Su;Tao Wu;Xi Wu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 3 / pp. 958-979 / 2023
  • Due to laws, regulations, privacy, etc., between 70 and 90 percent of providers do not share medical data, forming "data islands". It is essential to collaborate across multiple institutions without sharing patient data. Most existing methods adopt distributed learning and a centralized federated architecture to solve this problem, but resource heterogeneity and data heterogeneity cause problems in practical application. This paper proposes a collaborative deep learning modelling method based on a blockchain network. The training process transmits encrypted parameters instead of the original source data to protect privacy. The Hyperledger Fabric blockchain is adopted so that the parties do not depend on a third-party authoritative verifier. To a certain extent, the distrust and single point of failure caused by a centralized system are avoided. Aggregation uses the FedProx algorithm to address device heterogeneity and data heterogeneity. The experiments show that the maximum improvement in segmentation accuracy in the proposed collaborative training mode is 11.179% compared to local training. In the sequential training mode, the average accuracy improvement is greater than 7%. In the parallel training mode, the average accuracy improvement is greater than 8%. The experimental results show that the proposed model can solve the current problem of centralized modelling of multicenter data. In particular, it provides ideas for preserving privacy and breaking down "data islands", and protects all data.
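
FedProx, named in this abstract, adds a proximal term (mu / 2) * ||w - w_global||^2 to each client's local loss so that local models do not drift far from the current global model. The sketch below shows one local gradient step under that objective. The function name `fedprox_step`, the plain-list weight representation, and the fixed learning rate are illustrative assumptions, not details taken from the paper.

```python
def fedprox_step(w, w_global, grad, mu, lr):
    """One local gradient step on F_k(w) + (mu / 2) * ||w - w_global||^2.

    w        : current local weights (list of floats)
    w_global : global weights received at the start of the round
    grad     : gradient of the local loss F_k evaluated at w
    mu       : proximal coefficient (mu = 0 gives a plain gradient step)
    lr       : learning rate
    """
    if not (len(w) == len(w_global) == len(grad)):
        raise ValueError("w, w_global and grad must have the same length")
    # Gradient of the proximal term is mu * (w - w_global).
    return [
        wi - lr * (gi + mu * (wi - wgi))
        for wi, wgi, gi in zip(w, w_global, grad)
    ]
```

With `mu = 0` the update is an ordinary gradient step; larger values pull each client's weights toward the global model, which is how FedProx limits the effect of heterogeneous devices and data.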

Discrimination model for cultivation origin of paper mulberry bast fiber and Hanji based on NIR and MIR spectral data combined with PLS-DA (닥나무 인피섬유와 한지의 원산지 판별모델 개발을 위한 NIR 및 MIR 스펙트럼 데이터의 PLS-DA 적용)

  • Jang, Kyung-Ju;Jung, So-Yoon;Go, In-Hee;Jeong, Seon-Hwa
    • Analytical Science and Technology / Vol. 32, No. 1 / pp. 7-16 / 2019
  • The objective of this study was to develop a discrimination model for the cultivation origin of paper mulberry bast fiber and Hanji using near infrared (NIR) and mid infrared (MIR) spectroscopy combined with partial least squares discriminant analysis (PLS-DA). Paper mulberry bast fiber was purchased from 10 different regions of Korea and used to make Hanji. PLS-DA was performed using pre-treated FT-NIR and FT-MIR spectral data for paper mulberry bast fiber and Hanji. PLS-DA of the paper mulberry bast fiber and Hanji samples using FT-NIR spectral data showed 100% performance in cross validation and in the confusion matrix (accuracy, sensitivity, and specificity). The discrimination models showed four regional groups, with clearer separation and better score plots in the NIR-based model than in the MIR-based model. Furthermore, the score morphology of the discrimination model based on the NIR spectral data of paper mulberry bast fiber was highly similar to that of the model based on the NIR spectral data of Hanji.
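
The three confusion-matrix measures named in this abstract have standard definitions, sketched below for a single class treated one-vs-rest. The function name `binary_metrics` and the count-based interface are assumptions for illustration; the abstract does not describe how the authors computed the figures.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp, fp, tn, fn are the true-positive, false-positive, true-negative
    and false-negative counts for one class versus all others.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all samples classified correctly
        "sensitivity": tp / (tp + fn),      # share of true positives recovered
        "specificity": tn / (tn + fp),      # share of true negatives recovered
    }
```

A model reported at 100% on all three measures corresponds to `fp == 0` and `fn == 0` for every class.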

Scaling of Hadoop Cluster for Cost-Effective Processing of MapReduce Applications (비용 효율적 맵리듀스 처리를 위한 클러스터 규모 설정)

  • Ryu, Woo-Seok
    • The Journal of the Korea institute of electronic communication sciences / Vol. 15, No. 1 / pp. 107-114 / 2020
  • This paper studies a method for estimating the scale of a Hadoop cluster needed to process big data in a cost-effective manner. In medical institutions, demand for cloud-based big data analysis is increasing now that medical records can be stored outside the hospital. This paper first analyzes the Amazon EMR framework, one of the popular cloud-based big data frameworks. It then presents an efficiency model for scaling the Hadoop cluster to execute a MapReduce application more cost-effectively. It also analyzes the factors that influence the execution of the MapReduce application by performing several experiments under various conditions. The cost efficiency of big data analysis can be increased by choosing the cluster scale that gives the most efficient processing time relative to operational cost.
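
The trade-off this abstract describes, processing time against operational cost, can be sketched with a deliberately simplified cost model. The abstract does not give the paper's efficiency model, so the formula below (cost = nodes × price per node-hour × runtime), the function name `cheapest_cluster`, and the example figures are all assumptions.

```python
def cheapest_cluster(runtime_hours_by_nodes, price_per_node_hour):
    """Pick the node count with the lowest total cost (simplified model).

    runtime_hours_by_nodes : dict mapping node count -> measured job runtime in hours
    price_per_node_hour    : hypothetical flat price of one node for one hour
    Returns (best_node_count, total_cost).
    """
    if not runtime_hours_by_nodes:
        raise ValueError("no measurements given")
    costs = {
        nodes: nodes * price_per_node_hour * hours
        for nodes, hours in runtime_hours_by_nodes.items()
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


# Hypothetical measurements: doubling nodes does not always halve runtime.
measurements = {2: 10.0, 4: 4.0, 8: 3.0}
best_nodes, best_cost = cheapest_cluster(measurements, 1.0)
```

With these made-up measurements the four-node cluster is cheapest, because going from four to eight nodes shortens the job by less than the added node-hours cost.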

Analysis of paper map images for acquiring 3D terrain data (3차원 지형 자료 획득을 위한 지도 영상 분석)

  • LEE, JIN SEON
    • Journal of the Korea Computer Graphics Society / Vol. 2, No. 1 / pp. 68-76 / 1996
  • One of the major problems in GIS (Geographical Information Systems) is acquiring 3-D terrain data. Because conventional methods such as land surveying or analysis of aerial photographs are costly, the use of existing paper maps has been gaining considerable attention. This method requires three processing steps: 1) extraction of contours, 2) assignment of height values to the extracted contours, and 3) reconstruction of 3-D terrain data. In this paper we systematically develop a procedure for acquiring 3-D terrain data from contour solutions. For the first two steps, we describe the necessary operations and roughly sketch solutions. For the last step, we propose an efficient raster-based algorithm and present the results of experiments with existing paper map images.
