• Title/Abstract/Keyword: Data extraction

Search results: 3,329 items (processing time: 0.032 s)

Considerations for generating meaningful HRA data: Lessons learned from HuREX data collection

  • Kim, Yochan
    • Nuclear Engineering and Technology / Vol. 52, No. 8 / pp.1697-1705 / 2020
  • To enhance the credibility of human reliability analysis, various kinds of data have recently been collected and analyzed. Although it is obvious that data quality is critical, the practices and considerations for securing data quality have not been sufficiently discussed. In this work, based on experience from recent human reliability data extraction projects, which produced more than fifty thousand data points, we derive a number of issues to be considered for generating meaningful data. As a result, thirteen considerations are presented, pertaining to four data extraction activities: preparation, collection, analysis, and application. Although the lessons were acquired from a single data collection framework, we believe these results will guide researchers toward the important issues in the process of extracting data.

Visualization for Digesting a High Volume of the Biomedical Literature

  • Lee, Chang-Su;Park, Jin-Ah;Park, Jong-C.
    • Bioinformatics and Biosystems / Vol. 1, No. 1 / pp.51-60 / 2006
  • The paradigm in biology is currently changing from that of conducting hypothesis-driven individual experiments to that of utilizing the results of massive data analysis with appropriate computational tools. We present LayMap, an implemented visualization system that helps the user deal with a high volume of the biomedical literature such as MEDLINE, through layered maps constructed on the results of an information extraction system. LayMap also utilizes filtering and granularity for an enhanced view of the results. Since a biomedical information extraction system gives rise to a focused and effective way of slicing up the data space, the combined use of LayMap with such an information extraction system can help the user navigate the data space in a speedy and guided manner. As a case study, we have applied the system to datasets of journal abstracts on 'MAPK pathway' and 'bufalin' from MEDLINE. With the proposed visualization, we successfully rediscovered pathway maps of reasonable quality for ERK, p38 and JNK. Furthermore, with respect to bufalin, we were able to identify a potentially interesting relation between the Chinese medicine Chan su and apoptosis with a high level of detail.


Neural network rule extraction for credit scoring

  • Bart Baesens;Rudy Setiono;Lille, Valerina-De;Stijn Viaene
    • Korea Intelligent Information Systems Society: Conference Proceedings / The Pacific Asian Conference on Intelligent Systems 2001 / pp.128-132 / 2001
  • In this paper, we evaluate and contrast four neural network rule extraction approaches for credit scoring. Experiments are carried out on three real-life credit scoring data sets. Both the continuous and the discretised versions of all data sets are analysed. The rule extraction algorithms, Neurolinear, Neurorule, Trepan and Nefclass, have different characteristics with respect to their perception of the neural network and their way of representing the generated rules or knowledge. It is shown that Neurolinear, Neurorule and Trepan are able to extract very concise rule sets or trees with high predictive accuracy when compared to classical decision tree (rule) induction algorithms like C4.5(rules). In particular, Neurorule extracted easy-to-understand and powerful propositional if-then rules for all discretised data sets. Hence, the Neurorule algorithm may offer a viable alternative for rule generation and knowledge discovery in the domain of credit scoring.
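
As a generic illustration of the surrogate-model idea behind tree-based extraction methods such as Trepan, the sketch below trains a small network and then fits a shallow decision tree to the network's own predictions, so that readable if-then rules approximate the network's behaviour. This is only a sketch on synthetic data using scikit-learn; it is not the paper's implementation, and the dataset, network size, and tree depth are placeholder assumptions.

```python
# Illustrative sketch (not the paper's implementation): approximate a trained
# neural network with a decision tree, then read the tree as if-then rules.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Placeholder credit-scoring-like data: 1,000 applicants, 8 numeric features.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

# 1) Train the "black-box" network on the true labels.
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0).fit(X, y)

# 2) Fit a small surrogate tree to the network's predictions (not the true labels),
#    so that the tree mimics the network's decision behaviour.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, net.predict(X))

# 3) Print the tree as readable if-then rules and check how faithfully it mimics the net.
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(8)]))
print("fidelity to network:", surrogate.score(X, net.predict(X)))
```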


CutPaste-Based Anomaly Detection Model using Multi Scale Feature Extraction in Time Series Streaming Data

  • Jeon, Byeong-Uk;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 8 / pp.2787-2800 / 2022
  • The aging society increases emergency situations among the elderly living alone as well as a variety of social crimes. To help prevent them, techniques that detect emergency situations through voice are being actively researched. This study proposes a CutPaste-based anomaly detection model using multi-scale feature extraction in time-series streaming data. In the proposed method, an audio file is converted into a spectrogram, which makes it possible to use algorithms designed for image data, such as CNNs. After that, multi-scale feature extraction is applied: three feature maps drawn from adaptive pooling layers with different-sized kernels are merged. By considering various types of anomalies, including point, contextual, and collective anomalies, the limitations of conventional anomaly models are addressed. Finally, CutPaste-based anomaly detection is conducted. Since the model is trained through self-supervised learning, it can detect a variety of emergency situations as anomalies without labeling. Therefore, the proposed model overcomes the limitation of conventional models that classify only labelled emergency situations. The proposed model is also evaluated to perform better than a conventional anomaly detection model.
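
As a rough sketch of the multi-scale feature-extraction step described above, the code below applies adaptive average pooling with several output sizes to a spectrogram-like tensor and merges the results along the channel axis. The pooling sizes, channel count, and input shape are assumptions chosen for illustration, not the paper's actual configuration.

```python
# Illustrative sketch (assumed shapes/sizes): multi-scale features from a spectrogram
# via adaptive pooling with different output sizes, merged into one tensor.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScalePool(nn.Module):
    def __init__(self, scales=(4, 8, 16)):
        super().__init__()
        # One adaptive average-pooling layer per scale (different output sizes).
        self.pools = nn.ModuleList(nn.AdaptiveAvgPool2d(s) for s in scales)
        self.out_size = max(scales)

    def forward(self, x):                      # x: (batch, channels, freq, time)
        pooled = [pool(x) for pool in self.pools]
        # Resize every pooled map to a common size and merge along the channel axis.
        resized = [F.interpolate(p, size=self.out_size, mode="bilinear",
                                 align_corners=False) for p in pooled]
        return torch.cat(resized, dim=1)

# Example: a batch of 2 single-channel spectrograms (128 mel bins x 200 frames).
spec = torch.randn(2, 1, 128, 200)
features = MultiScalePool()(spec)
print(features.shape)                          # torch.Size([2, 3, 16, 16])
```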

Study on Plastics Detection Technique using Terra/ASTER Data

  • Syoji, Mizuhiko;Ohkawa, Kazumichi
    • Korean Society of Remote Sensing: Conference Proceedings / Proceedings of ACRS 2003 ISRS / pp.1460-1463 / 2003
  • In this study, a plastics detection technique was developed, applying remote sensing technology as a method to extract plastic wastes, one of the major contributors to environmental destruction. It is possible to extract areas where plastic wastes (including polypropylene and polyethylene) are prominent using ASTER data, by taking advantage of the absorption characteristics of the ASTER/SWIR bands. The algorithm is applicable to delineating large industrial waste disposal sites and areas where plastic greenhouses are concentrated. However, the detection technique with ASTER/SWIR data still has research tasks to be tackled, including the partial selection of reference spectra depending on the condition of the plastic wastes, and detection errors in regions mixed with vegetation and water. The following results were obtained after comparing several detection methods on plastic wastes in different conditions: (a) the 'spectral extraction method' was suitable for areas where plastic wastes exist separately from other objects, such as coastal areas where plastic wastes have drifted ashore (a single plastic spectrum was used as the reference for the 'spectral extraction method'); (b) on the other hand, the 'spectral extraction method' was not suitable for sites where plastic wastes are mixed with vegetation and soil. After comparing the processing results for a mixed area, it was found that applying both the 'separation method' using un-mixing and the 'spectral extraction method' with an NDVI mask is the most appropriate way to extract plastic wastes. We also investigated the possibility of reducing the influence of vegetation and water using ASTER/TIR, and successfully extracted some places with plastics. In conclusion, we summarize the relationship between detection techniques and the conditions of plastic wastes and propose the practical application of remote sensing technology to the extraction of plastic wastes.
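
A minimal numpy sketch of the NDVI-masking idea mentioned above is given below: vegetation pixels are masked out with an NDVI threshold before each remaining pixel spectrum is compared with a plastic reference spectrum, here via the spectral angle. The band positions, thresholds, and similarity measure are assumptions for illustration and do not reproduce the paper's method.

```python
# Illustrative sketch (assumed band layout and thresholds): mask vegetation with NDVI,
# then flag pixels whose spectrum is close to a plastic reference spectrum.
import numpy as np

def ndvi(red, nir):
    # Normalized Difference Vegetation Index; small epsilon avoids division by zero.
    return (nir - red) / (nir + red + 1e-9)

def spectral_angle(pixels, reference):
    # Angle between each pixel spectrum and the reference spectrum (radians).
    dot = pixels @ reference
    norms = np.linalg.norm(pixels, axis=-1) * np.linalg.norm(reference) + 1e-9
    return np.arccos(np.clip(dot / norms, -1.0, 1.0))

# Toy cube: 100 x 100 pixels, 9 bands (e.g. ASTER VNIR+SWIR); values are reflectance.
cube = np.random.rand(100, 100, 9)
red, nir = cube[..., 1], cube[..., 2]          # assumed band positions
reference = np.random.rand(9)                  # placeholder plastic reference spectrum

veg_mask = ndvi(red, nir) > 0.3                # assumed vegetation threshold
angle = spectral_angle(cube.reshape(-1, 9), reference).reshape(100, 100)
plastic_mask = (angle < 0.1) & ~veg_mask       # assumed similarity threshold
print("candidate plastic pixels:", int(plastic_mask.sum()))
```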


Interactive Morphological Analysis to Improve Accuracy of Keyword Extraction Based on Cohesion Scoring

  • Yu, Yang Woo;Kim, Hyeon Gyu
    • Journal of the Korea Society of Computer and Information / Vol. 25, No. 12 / pp.145-153 / 2020
  • Keyword analysis of social big data has recently been used extensively to extract customer opinions and complaints. In a previous study, a method based on cohesion scores was proposed to improve the accuracy of keyword analysis, but its error rate increased when the number of reviews was small. In this paper, we improve the accuracy of keyword extraction by applying a simplified morphological analysis step, as post-processing, to the keywords extracted by the cohesion-score-based algorithm. The proposed method allows the necessary morphological analysis rules to be added incrementally whenever new input data arrive, minimizing the size of the dictionary and increasing the efficiency of the analysis. An interactive rule input system is also provided to minimize the effort of adding analysis rules. To validate the proposed method, experiments were performed on real reviews collected online; applying the proposed method reduced the error rate from 10% to 1%, and processing 5,000 reviews took 450 ms, confirming that real-time processing is feasible.
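
For context on the cohesion-score-based extraction that the proposed post-processing refines, the sketch below computes a simple prefix-based cohesion score for candidate keywords. The scoring formula and the toy corpus are simplified assumptions and may differ from the scoring used in the cited work.

```python
# Illustrative sketch (simplified, assumed formula): prefix-based cohesion score for
# candidate keywords, based on how often words continue from the first character
# all the way to the full candidate.
def cohesion(candidate, corpus):
    """Geometric mean of the forward transition probabilities along the candidate,
    which reduces to (freq(candidate) / freq(first char)) ** (1 / (len - 1))."""
    if len(candidate) < 2:
        return 0.0
    first = full = 0
    for doc in corpus:
        for word in doc.split():
            if word.startswith(candidate[0]):
                first += 1
            if word.startswith(candidate):
                full += 1
    if first == 0 or full == 0:
        return 0.0
    return (full / first) ** (1.0 / (len(candidate) - 1))

# Toy English corpus standing in for review text; real use would be on Korean reviews.
corpus = ["new camera and camping gear", "good camping spot", "campus camping trip"]
for cand in ["camp", "camping", "cam", "gear"]:
    print(cand, round(cohesion(cand, corpus), 3))
```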

A Study of 3D Design Data Extraction for Thermal Forming Information

  • Kim, Jung;Park, Jung-Seo;Jo, Ye-Hyan;Shin, Jong-Gye;Kim, Won-Don;Ko, Kwang-Hee
    • Journal of Ship and Ocean Technology / Vol. 12, No. 3 / pp.1-13 / 2008
  • In shipbuilding, diverse manufacturing techniques for automation have been developed and used in practice. Among them, however, hull forming automation has received less attention than processes such as welding and cutting. The basis for developing this process is determining how to extract thermal forming information. Various methods exist to obtain such information, and the 3D design shape to be formed must be extracted first to compute the necessary thermal forming information. Except for well-established shipyards that operate 3D design systems, most shipyards rely only on 2.5D design systems and have no easy way to obtain 3D surface design data. In this study, the various shipbuilding design systems used by shipyards are therefore investigated, and a method for extracting 3D design surface data from those systems is proposed. An example is then presented that shows the extraction of real 3D surface data using the proposed method and the computation of thermal forming information from the extracted data.

A Two-stage Process for Increasing the Yield of Prebiotic-rich Extract from Pinus densiflora

  • Jung, Ji Young;Yang, Jae-Kyung
    • Journal of the Korean Wood Science and Technology / Vol. 46, No. 4 / pp.380-392 / 2018
  • The importance of polysaccharides is increasing globally due to their role as a significant source of dietary prebiotics. In the present study, in order to maximize the yield of crude polysaccharides from Pinus densiflora, response surface methodology (RSM) was used to optimize a two-stage extraction process consisting of steam explosion and water extraction. Three independent main variables, namely the severity factor (Ro) of the steam explosion process, the water extraction temperature (°C), and the ratio of water to raw material (v/w), were studied with respect to prebiotic sugar content. A Box-Behnken design was created on the basis of the results of preliminary single-factor tests. The experimental data were fitted to a second-order polynomial equation for multiple regression analysis and examined using appropriate statistical methods. The data showed that both the severity factor (Ro) and the ratio of water to material (v/w) had significant effects on the prebiotic sugar content. The optimal conditions for the two-stage process were as follows: a severity factor (Ro) of 3.86, a water extraction temperature of 89.66 °C, and a ratio of water to material (v/w) of 39.20. Under these conditions, the prebiotic sugar content in the extract was 332.45 mg/g.
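
The second-order response-surface fit described above can be illustrated with a short sketch that fits a full quadratic model in the three coded factors by least squares and then locates the stationary point of the fitted surface. The numbers below are random placeholders, not the paper's Box-Behnken data.

```python
# Illustrative sketch (placeholder data): fit a full second-order polynomial in three
# factors by least squares and locate the stationary point of the fitted surface.
import numpy as np

def quadratic_design_matrix(X):
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    ones = np.ones(len(X))
    # Intercept, linear, interaction, and squared terms of a second-order model.
    return np.column_stack([ones, x1, x2, x3, x1*x2, x1*x3, x2*x3, x1**2, x2**2, x3**2])

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(15, 3))                 # placeholder coded factor settings
y = 300 - 20*(X[:, 0] - 0.3)**2 - 10*(X[:, 1] + 0.2)**2 \
    - 5*(X[:, 2] - 0.5)**2 + rng.normal(0, 1, 15)    # placeholder response (mg/g)

beta, *_ = np.linalg.lstsq(quadratic_design_matrix(X), y, rcond=None)

# Stationary point: solve  b + (A + A.T) x = 0  for the fitted quadratic surface.
b = beta[1:4]
A = np.array([[beta[7],   beta[4]/2, beta[5]/2],
              [beta[4]/2, beta[8],   beta[6]/2],
              [beta[5]/2, beta[6]/2, beta[9]]])
x_opt = np.linalg.solve(-2*A, b)
print("stationary point (coded units):", np.round(x_opt, 3))
```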

A TWO-STAGE SOURCE EXTRACTION ALGORITHM FOR TEMPORALLY CORRELATED SIGNALS BASED ON ICA-R

  • Zhang, Hongjuan;Shi, Zhenwei;Guo, Chonghui;Feng, Enmin
    • Journal of Applied Mathematics & Informatics / Vol. 26, No. 5-6 / pp.1149-1159 / 2008
  • Blind source extraction (BSE) is a special class of blind source separation (BSS) methods that extracts only one source, or a subset of the sources, at a time. Based on the time delay of the desired signal, a simple but important extraction algorithm (the simplified "BC algorithm") was presented by Barros and Cichocki. However, the performance of this method is not satisfactory in some cases because it only carries out a constrained minimization of the mean squared error. To overcome these drawbacks, an ICA-with-reference (ICA-R) based approach, which considers the higher-order statistics of the sources, is added as a second stage for further source extraction. Specifically, the BC algorithm is first exploited to roughly extract the desired signal. The signal extracted in the first stage is then used as the reference signal of the ICA-R method to extract the desired source as cleanly as possible. Simulations on synthetic data and real-world data show the validity and usefulness of the method.
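
The first-stage idea, extracting a source with known time structure using only second-order statistics at a chosen lag, can be sketched as an eigenvalue problem on the zero-lag and lagged covariance matrices. This is a generic illustration of delay-based extraction (in the spirit of AMUSE), not the BC or ICA-R algorithms themselves; the mixing matrix and lag are made up.

```python
# Illustrative sketch (not the BC/ICA-R algorithms): extract the mixture component
# with the strongest autocorrelation at a chosen lag via lagged covariances.
import numpy as np

rng = np.random.default_rng(1)
n = 5000
t = np.arange(n)
s1 = np.sin(2*np.pi*t/25)                      # desired source, periodic (lag 25)
s2 = rng.normal(size=n)                        # interfering white-noise source
X = np.array([[0.8, 0.6], [0.5, -0.7]]) @ np.vstack([s1, s2])   # made-up 2x2 mixing

lag = 25
C0 = np.cov(X)                                 # zero-lag covariance
C1 = (X[:, :-lag] @ X[:, lag:].T) / (n - lag)  # lagged covariance
C1 = (C1 + C1.T) / 2                           # symmetrize

# Generalized eigenproblem C1 w = lambda C0 w; the top eigenvector extracts the
# component with the largest autocorrelation at the chosen lag.
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(C0, C1))
w = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
y = w @ X
print("correlation with desired source:", round(abs(np.corrcoef(y, s1)[0, 1]), 3))
```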


EXTRACTION OF THE LEAN TISSUE BOUNDARY OF A BEEF CARCASS

  • Lee, C. H.;H. Hwang
    • Korean Society for Agricultural Machinery: Conference Proceedings / The Third International Conference on Agricultural Machinery Engineering, Vol. III / pp.715-721 / 2000
  • In this research, a rule- and neural-network-based boundary extraction algorithm was developed. Extracting the boundary of the region of interest, the lean tissue, is essential for color machine vision based quality evaluation of beef. The major quality features of beef are the size and marbling state of the lean tissue, the color of the fat, and the thickness of the back fat. To evaluate beef quality, extracting the loin part from the cross-sectional image of the beef rib is the crucial first step. Since its boundary is unclear and very difficult to trace, a neural network model was developed to isolate the loin part from the entire input image. At the network training stage, normalized color image data were used. The model reference of the boundary was determined by a binary feature extraction algorithm using the R (red) channel, and 100 sub-images (11×11 masks selected from the maximum extended boundary rectangle) were used as the training data set. Each mask carries information on the curvature of the boundary, and the basic rule in boundary extraction is adaptation to the known curvature of the boundary. The structured model reference and the neural-network-based boundary extraction algorithm were developed, applied to the beef images, and the results were analyzed.
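
A minimal sketch of the rule-based reference step, thresholding the red channel into a binary mask and tracing the largest contour as a rough boundary, is shown below. It uses OpenCV with Otsu thresholding as an assumed stand-in for the paper's binary feature extraction and does not include the neural-network refinement; the toy image is synthetic.

```python
# Illustrative sketch (assumed thresholding/morphology choices): derive a rough
# lean-tissue boundary from the red channel as a stand-in for the model-reference step.
import cv2
import numpy as np

# Toy stand-in image: a dark background with a reddish blob playing the lean tissue.
image = np.zeros((200, 200, 3), np.uint8)
cv2.circle(image, (100, 100), 60, (40, 40, 200), -1)   # OpenCV stores channels as B, G, R

red = image[:, :, 2]                                    # R (red) channel

# Binarize the red channel (Otsu picks the threshold) and remove small speckles.
_, mask = cv2.threshold(red, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

# Take the largest external contour as the rough lean-tissue boundary and draw it.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boundary = max(contours, key=cv2.contourArea)
outline = image.copy()
cv2.drawContours(outline, [boundary], -1, (0, 255, 0), 2)
cv2.imwrite("boundary_overlay.png", outline)
```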
