• Title/Summary/Keyword: data extraction


Considerations for generating meaningful HRA data: Lessons learned from HuREX data collection

  • Kim, Yochan
    • Nuclear Engineering and Technology / v.52 no.8 / pp.1697-1705 / 2020
  • To enhance the credibility of human reliability analysis, various kinds of data have recently been collected and analyzed. Although it is obvious that data quality is critical, the practices and considerations for securing it have not been sufficiently discussed. In this work, based on experience from recent human reliability data extraction projects, which produced more than fifty thousand data points, we derive a number of issues to be considered for generating meaningful data. As a result, thirteen considerations are presented, pertaining to four data extraction activities: preparation, collection, analysis, and application. Although the lessons were acquired from a single kind of data collection framework, we believe these results will guide researchers toward the important issues in the process of extracting data.

Visualization for Digesting a High Volume of the Biomedical Literature

  • Lee, Chang-Su;Park, Jin-Ah;Park, Jong-C.
    • Bioinformatics and Biosystems / v.1 no.1 / pp.51-60 / 2006
  • The paradigm in biology is currently changing from that of conducting hypothesis-driven individual experiments to that of utilizing the results of a massive data analysis with appropriate computational tools. We present LayMap, an implemented visualization system that helps the user to deal with a high volume of the biomedical literature such as MEDLINE, through the layered maps that are constructed on the results of an information extraction system. LayMap also utilizes filtering and granularity for an enhanced view of the results. Since a biomedical information extraction system gives rise to a focused and effective way of slicing up the data space, the combined use of LayMap with such an information extraction system can help the user to navigate the data space in a speedy and guided manner. As a case study, we have applied the system to datasets of journal abstracts on 'MAPK pathway' and 'bufalin' from MEDLINE. With the proposed visualization, we have successfully rediscovered pathway maps of a reasonable quality for ERK, p38 and JNK. Furthermore, with respect to bufalin, we were able to identify the potentially interesting relation between the Chinese medicine Chan su and apoptosis with a high level of detail.


Neural network rule extraction for credit scoring

  • Bart Baesens;Rudy Setiono;Valerina De Lille;Stijn Viaene
    • Proceedings of the Korea Intelligent Information Systems Society Conference / 2001.01a / pp.128-132 / 2001
  • In this paper, we evaluate and contrast four neural network rule extraction approaches for credit scoring. Experiments are carried out on three real-life credit scoring data sets. Both the continuous and the discretised versions of all data sets are analysed. The rule extraction algorithms, Neurolinear, Neurorule, Trepan and Nefclass, have different characteristics with respect to their perception of the neural network and their way of representing the generated rules or knowledge. It is shown that Neurolinear, Neurorule and Trepan are able to extract very concise rule sets or trees with high predictive accuracy when compared to classical decision tree (rule) induction algorithms like C4.5(rules). In particular, Neurorule extracted easy-to-understand and powerful propositional if-then rules for all discretised data sets. Hence, the Neurorule algorithm may offer a viable alternative for rule generation and knowledge discovery in the domain of credit scoring.
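Propositional if-then rules of the kind described above can be applied with a very small classifier. A minimal sketch; the rules and attribute names below are hypothetical illustrations, not rules extracted in the paper:

```python
def classify(applicant, rules, default="reject"):
    """Apply an ordered list of extracted if-then rules; the first
    matching rule decides, otherwise fall back to a default class."""
    for condition, label in rules:
        if condition(applicant):
            return label
    return default

# Hypothetical rules in the style Neurorule produces on discretised data
rules = [
    (lambda a: a["savings"] == "high" and a["years_employed"] >= 4, "accept"),
    (lambda a: a["purpose"] == "new_car" and a["savings"] != "none", "accept"),
]

decision = classify(
    {"savings": "high", "years_employed": 5, "purpose": "used_car"}, rules)
```

An ordered rule list like this is easy to audit, which is the main appeal of rule extraction over the underlying neural network.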


CutPaste-Based Anomaly Detection Model using Multi Scale Feature Extraction in Time Series Streaming Data

  • Jeon, Byeong-Uk;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.8 / pp.2787-2800 / 2022
  • An aging society brings an increase in emergencies involving elderly people living alone, as well as in various social crimes. To help prevent them, techniques for detecting emergency situations through voice are being actively researched. This study proposes a CutPaste-based anomaly detection model using multi-scale feature extraction in time series streaming data. In the proposed method, an audio file is converted into a spectrogram, which makes it possible to apply algorithms designed for image data, such as CNNs. After that, multi-scale feature extraction is applied: three feature maps produced by adaptive pooling layers with different kernel sizes are merged. By considering various types of anomaly, including point, contextual, and collective anomalies, the limitations of conventional anomaly detection models are addressed. Finally, CutPaste-based anomaly detection is conducted. Since the model is trained through self-supervised learning, it can detect a variety of emergency situations as anomalies without labeling. The proposed model therefore overcomes the limitation of conventional models that classify only labelled emergency situations, and it is evaluated to perform better than a conventional anomaly detection model.
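The spectrogram conversion and multi-scale pooling steps can be sketched with NumPy; the frame size, pooling scales, and toy signal below are assumed values for illustration, not the paper's configuration:

```python
import numpy as np

def spectrogram(signal, frame=64, hop=32):
    """Magnitude spectrogram via a simple windowed short-time FFT."""
    frames = [signal[i:i + frame] * np.hanning(frame)
              for i in range(0, len(signal) - frame + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1)).T  # (freq, time)

def adaptive_pool(img, out_h, out_w):
    """Average-pool a 2-D map down to a fixed (out_h, out_w) grid,
    analogous to an adaptive pooling layer."""
    rows = np.array_split(np.arange(img.shape[0]), out_h)
    cols = np.array_split(np.arange(img.shape[1]), out_w)
    return np.array([[img[np.ix_(r, c)].mean() for c in cols] for r in rows])

def multiscale_features(img, scales=((4, 4), (8, 8), (16, 16))):
    """Merge pooled maps from several scales into one feature vector."""
    return np.concatenate([adaptive_pool(img, h, w).ravel() for h, w in scales])

rng = np.random.default_rng(0)
spec = spectrogram(rng.standard_normal(1024))  # stand-in for an audio clip
feat = multiscale_features(spec)
```

Merging the three scales lets one feature vector capture both short, point-like events and longer contextual patterns in the spectrogram.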

Study on Plastics Detection Technique using Terra/ASTER Data

  • Syoji, Mizuhiko;Ohkawa, Kazumichi
    • Proceedings of the KSRS Conference / 2003.11a / pp.1460-1463 / 2003
  • In this study, a plastic detection technique was developed, applying remote sensing technology as a method to extract plastic wastes, which are a major contributor to environmental destruction. It is possible to extract areas where plastic (including polypropylene and polyethylene) wastes are prominent, using ASTER data and taking advantage of the absorptive characteristics of the ASTER/SWIR bands. The algorithm is applicable to delineating large industrial waste disposal sites and areas where plastic greenhouses are concentrated. However, the detection technique with ASTER/SWIR data still has research tasks to be tackled, including partial distortion of the reference spectra depending on the condition of the plastic wastes, and detection errors in regions mixed with vegetation and water. The following results were obtained after comparing several detection methods on plastic wastes in different conditions: (a) the 'spectral extraction method' was suitable for areas where plastic wastes are separated from other objects, such as coastal areas where plastic wastes have drifted ashore (a single plastic spectrum was used as the reference for the 'spectral extraction method'); (b) on the other hand, the 'spectral extraction method' was not suitable for sites where plastic wastes are mixed with vegetation and soil. After comparing the processing results for a mixed area, it was found that applying both the 'separation method' using unmixing and the 'spectral extraction method' with an NDVI mask is the most appropriate way to extract plastic wastes. We also investigated the possibility of reducing the influence of vegetation and water using ASTER/TIR, and successfully extracted some places with plastics. In conclusion, we summarize the relationship between detection techniques and the condition of the plastic wastes, and propose the practical application of remote sensing technology to the extraction of plastic wastes.
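The NDVI masking step used above to suppress vegetation before spectral matching can be sketched as follows; the 0.3 threshold and the toy reflectance values are assumptions, not values from the paper:

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index, safe against zero division."""
    denom = nir + red
    return np.where(denom == 0, 0.0,
                    (nir - red) / np.where(denom == 0, 1, denom))

def vegetation_mask(nir, red, thresh=0.3):
    """True where a pixel looks vegetated and should be excluded
    before matching plastic reference spectra (threshold is assumed)."""
    return ndvi(nir, red) > thresh

# Tiny 2x2 scene: top-left is vegetated, the rest is bare/mixed ground
nir = np.array([[0.6, 0.2], [0.5, 0.0]])
red = np.array([[0.1, 0.2], [0.4, 0.0]])
mask = vegetation_mask(nir, red)
```

Pixels flagged by the mask would simply be skipped when the spectral extraction method is applied, reducing false detections in mixed vegetation areas.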


Interactive Morphological Analysis to Improve Accuracy of Keyword Extraction Based on Cohesion Scoring

  • Yu, Yang Woo;Kim, Hyeon Gyu
    • Journal of the Korea Society of Computer and Information / v.25 no.12 / pp.145-153 / 2020
  • Recently, keyword extraction from social big data has been widely used to extract opinions or complaints from the user's perspective. Our previous work suggested a method to improve the accuracy of keyword extraction based on the notion of cohesion scoring, but its accuracy can degrade when the number of input reviews is relatively small. This paper presents a method to resolve this issue by applying simplified morphological analysis as a postprocessing step to the keywords produced by the algorithm of the previous work. The proposed method makes it possible to add the analysis rules necessary to process input data incrementally whenever new data arrives, which reduces the dictionary size and improves analysis efficiency. In addition, an interactive rule adder is provided to minimize the effort of adding new rules. To verify the performance of the proposed method, experiments were conducted on real social reviews collected online. The results showed that the error ratio was reduced from 10% to 1% by applying our method, and that processing 5,000 reviews took 450 milliseconds, meaning that keyword extraction can be performed in a timely manner.

A Study of 3D Design Data Extraction for Thermal Forming Information

  • Kim, Jung;Park, Jung-Seo;Jo, Ye-Hyan;Shin, Jong-Gye;Kim, Won-Don;Ko, Kwang-Hee
    • Journal of Ship and Ocean Technology / v.12 no.3 / pp.1-13 / 2008
  • In shipbuilding, diverse manufacturing techniques for automation have been developed and used in practice. Among them, however, hull forming automation has received less attention than others such as welding and cutting. The basis for developing this process is finding out how to extract thermal forming information. Various methods exist for obtaining such information, and the 3D design shape to be formed must be extracted first. Except for well-established shipyards that operate 3D design systems, most shipyards rely on 2.5D design systems and have no easy way to obtain 3D surface design data. In this study, therefore, various shipbuilding design systems used by shipyards are investigated, and a method is proposed for extracting 3D design surface data from those systems. An example is then presented showing the extraction of real 3D surface data using the proposed method and the computation of thermal forming information from the data.

A Two-stage Process for Increasing the Yield of Prebiotic-rich Extract from Pinus densiflora

  • Jung, Ji Young;Yang, Jae-Kyung
    • Journal of the Korean Wood Science and Technology / v.46 no.4 / pp.380-392 / 2018
  • The importance of polysaccharides is increasing globally due to their role as a significant source of dietary prebiotics in the human diet. In the present study, in order to maximize the yield of crude polysaccharides from Pinus densiflora, response surface methodology (RSM) was used to optimize a two-stage extraction process consisting of steam explosion and water extraction. Three independent main variables, namely, the severity factor (Ro) for the steam explosion process, the water extraction temperature (°C), and the ratio of water to raw material (v/w), were studied with respect to prebiotic sugar content. A Box-Behnken design was created on the basis of the results of these single-factor tests. The experimental data were fitted to a second-order polynomial equation for multiple regression analysis and examined using the appropriate statistical methods. The data showed that both the severity factor (Ro) and the ratio of water to material (v/w) had significant effects on the prebiotic sugar content. The optimal conditions for the two-stage process were as follows: a severity factor (Ro) of 3.86, a water extraction temperature of 89.66 °C, and a ratio of water to material (v/w) of 39.20. Under these conditions, the prebiotic sugar content in the extract was 332.45 mg/g.
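The second-order polynomial fit at the heart of the RSM analysis above can be sketched with ordinary least squares. The data below are synthetic and the response surface is invented purely to illustrate the model structure:

```python
import numpy as np

def quadratic_design(X):
    """Design matrix for a full second-order polynomial in the columns
    of X: intercept, linear, squared, and pairwise interaction terms."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]                              # linear
    cols += [X[:, i] ** 2 for i in range(k)]                         # squared
    cols += [X[:, i] * X[:, j]                                       # interactions
             for i in range(k) for j in range(i + 1, k)]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(15, 3))   # 3 coded factors, 15 experimental runs
# Invented true surface: y = 5 + 2*x0 - 3*x1^2 + x0*x2
y = 5 + 2 * X[:, 0] - 3 * X[:, 1] ** 2 + X[:, 0] * X[:, 2]
beta, *_ = np.linalg.lstsq(quadratic_design(X), y, rcond=None)
```

Because the synthetic response lies exactly in the model space, the fit recovers the coefficients; with real Box-Behnken data the same fit yields the surface whose stationary point gives the optimal conditions.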

A TWO-STAGE SOURCE EXTRACTION ALGORITHM FOR TEMPORALLY CORRELATED SIGNALS BASED ON ICA-R

  • Zhang, Hongjuan;Shi, Zhenwei;Guo, Chonghui;Feng, Enmin
    • Journal of Applied Mathematics & Informatics / v.26 no.5_6 / pp.1149-1159 / 2008
  • Blind source extraction (BSE) is a special class of blind source separation (BSS) methods, which extracts only one or a subset of the sources at a time. Based on the time delay of the desired signal, a simple but important extraction algorithm (the simplified "BC algorithm") was presented by Barros and Cichocki. However, the performance of this method is not satisfactory in some cases, because it only carries out a constrained minimization of the mean squared error. To overcome these drawbacks, an ICA with reference (ICA-R) based approach, which considers the higher-order statistics of the sources, is added as a second stage for further source extraction. Specifically, the BC algorithm is exploited to roughly extract the desired signal. The signal extracted in the first stage is then used as the reference signal of the ICA-R method to extract the desired source as cleanly as possible. Simulations on synthetic and real-world data show its validity and usefulness.
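The reference-guided extraction idea (finding mixture weights that minimize the mean squared error to a reference signal) can be sketched as follows. This is a first-stage-style least-squares step on synthetic data, not the full ICA-R algorithm:

```python
import numpy as np

def extract_by_reference(X, ref):
    """Weights w minimizing E[(w^T x - ref)^2], i.e. the Wiener-style
    solution w = Rxx^{-1} E[x * ref]; returns the extracted signal."""
    Rxx = X @ X.T / X.shape[1]        # mixture covariance
    rxr = X @ ref / X.shape[1]        # cross-correlation with reference
    w = np.linalg.solve(Rxx, rxr)
    return w @ X

rng = np.random.default_rng(2)
t = np.arange(2000)
s1 = np.sign(np.sin(2 * np.pi * t / 60))   # desired source: square wave
s2 = rng.standard_normal(2000)             # interfering source
A = np.array([[0.9, 0.4], [0.3, 0.8]])     # mixing matrix
X = A @ np.vstack([s1, s2])                # observed mixtures

ref = np.sin(2 * np.pi * t / 60)           # rough reference of the same period
y = extract_by_reference(X, ref)
corr = abs(np.corrcoef(y, s1)[0, 1])
```

Even a crude reference with the right periodicity pulls out the desired source almost perfectly; a second, higher-order-statistics stage would then refine such a rough first-stage estimate.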


EXTRACTION OF THE LEAN TISSUE BOUNDARY OF A BEEF CARCASS

  • Lee, C. H.;Hwang, H.
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 2000.11c / pp.715-721 / 2000
  • In this research, a rule- and neural-network-based boundary extraction algorithm was developed. Extracting the boundary of the region of interest, the lean tissue, is essential for color-machine-vision-based quality evaluation of beef. Major quality features of beef are the size and marbling state of the lean tissue, the color of the fat, and the thickness of the back fat. To evaluate beef quality, extracting the loin part from the sectional image of the beef rib is the crucial first step. Since its boundary is unclear and very difficult to trace, a neural network model was developed to isolate the loin part from the entire input image. At the network training stage, normalized color image data were used. The model reference of the boundary was determined by a binary feature extraction algorithm using the R (red) channel, and 100 sub-images (11×11 masks selected from the maximum extended boundary rectangle) were used as the training data set. Each mask carries information on the curvature of the boundary, and the basic rule in boundary extraction is adaptation to the known curvature of the boundary. The structured model reference and neural-network-based boundary extraction algorithm was developed, applied to beef images, and the results were analyzed.
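The R-channel thresholding and boundary isolation described above can be sketched as follows; the threshold value and the 4-neighbour boundary rule are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def lean_mask(rgb, r_thresh=120):
    """Binary lean-tissue mask from the R channel (threshold assumed)."""
    return rgb[..., 0] > r_thresh

def boundary(mask):
    """Foreground pixels with at least one background 4-neighbour,
    i.e. the one-pixel-wide outline of the mask."""
    padded = np.pad(mask, 1, constant_values=False)
    up, down = padded[:-2, 1:-1], padded[2:, 1:-1]
    left, right = padded[1:-1, :-2], padded[1:-1, 2:]
    return mask & ~(up & down & left & right)

# Toy image: a 3x3 bright-red "lean" square on a dark background
img = np.zeros((7, 7, 3), dtype=np.uint8)
img[2:5, 2:5, 0] = 200
edge = boundary(lean_mask(img))
```

The binary edge map plays the role of the model reference here; in the paper, fixed-size masks sampled along such a boundary become the training examples for the neural network.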
