• Title/Summary/Keyword: data repository

A Differential Evolution based Support Vector Clustering (차분진화 기반의 Support Vector Clustering)

  • Jun, Sung-Hae
    • Journal of the Korean Institute of Intelligent Systems, v.17 no.5, pp.679-683, 2007
  • Statistical learning theory by Vapnik comprises the support vector machine (SVM), support vector regression (SVR), and support vector clustering (SVC) for classification, regression, and clustering, respectively. Among these algorithms, SVC is an effective clustering algorithm that uses support vectors based on a Gaussian kernel function. However, like SVM and SVR, SVC requires the kernel parameters and the regularization constant to be determined optimally. In general, these parameters have been chosen by the researcher's judgment or by grid search, which demands heavy computation. In this paper, we propose a differential evolution based SVC (DESVC), which incorporates differential evolution into SVC for efficient selection of the kernel parameters and regularization constant. To verify the improved performance of DESVC, we conduct experiments using data sets from the UCI Machine Learning Repository and simulated data.
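
The parameter search described in the abstract above can be illustrated with a minimal, standard-library sketch of differential evolution (the DE/rand/1/bin variant). This is not the authors' DESVC: the objective `toy_objective` is a hypothetical stand-in for a clustering-quality loss over two parameters (here imagined as the log of a kernel width and the log of a regularization constant), and all names, bounds, and settings are assumptions chosen for illustration.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=200,
                           F=0.8, CR=0.9, seed=0):
    """Minimise `objective` over the box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialise the population uniformly inside the search box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for j in range(dim):
                if j == j_rand or rng.random() < CR:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            s = objective(trial)
            if s <= scores[i]:  # greedy selection: keep the better vector
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_objective(params):
    """Hypothetical loss with its minimum at (0.5, -1.0); NOT a real SVC criterion."""
    log_width, log_c = params
    return (log_width - 0.5) ** 2 + (log_c + 1.0) ** 2


best_params, best_score = differential_evolution(
    toy_objective, [(-3.0, 3.0), (-3.0, 3.0)])
```

In an actual study, the toy loss would be replaced by a measure computed from the clustering result for each candidate parameter pair; unlike grid search, the population concentrates its evaluations near promising regions.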

Analysis of permeability in rock fracture with effective stress at deep depth

  • Lee, Hangbok;Oh, Tae-Min;Park, Chan
    • Geomechanics and Engineering, v.22 no.5, pp.375-384, 2020
  • In this study, the application of the conventional cubic law to deep-depth conditions was experimentally evaluated. Moreover, a modified equation for estimating rock permeability at deep depth was suggested, based on precise hydraulic tests and an analysis of the effects of vertical stress, pore water pressure, and fracture roughness. An experimental apparatus capable of generating high pore water pressure (< 10 MPa) and vertical stress (< 20 MPa) was built, and the surface roughness of a cylindrical rock sample was quantitatively analyzed by means of 3D (three-dimensional) laser scanning. Experimental data on the injected pore water pressure and outflow rate obtained through the hydraulic tests were applied to the cubic law equation, which was used to estimate the permeability of the rock fracture. The rock permeability was estimated under various pressure (vertical stress and pore water pressure) and geometry (roughness) conditions. Finally, an empirical formula was proposed that accounts for nonlinear flow behavior; the formula can be applied to evaluate changes in rock permeability in deep underground facilities, such as a nuclear waste disposal repository, that are subject to high vertical stress and pore water pressure.
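
For readers unfamiliar with the cubic law mentioned above, the following sketch shows its textbook parallel-plate form, Q = w e³ ΔP / (12 μ L), and how a hydraulic aperture and fracture permeability (k = e²/12) can be back-calculated from a measured flow rate. It assumes linear flow between smooth plates; the function names and the sample values are illustrative assumptions, and the paper's modified empirical formula is not reproduced here.

```python
def cubic_law_flow(aperture, pressure_drop, length, width, viscosity=1.0e-3):
    """Volumetric flow rate Q (m^3/s) through a smooth parallel-plate fracture.

    aperture, length, width in m; pressure_drop in Pa; viscosity in Pa*s
    (default is roughly that of water at room temperature).
    """
    return width * aperture ** 3 * pressure_drop / (12.0 * viscosity * length)


def hydraulic_aperture(flow_rate, pressure_drop, length, width, viscosity=1.0e-3):
    """Invert the cubic law to get the hydraulic aperture e_h (m)."""
    return (12.0 * viscosity * flow_rate * length
            / (width * pressure_drop)) ** (1.0 / 3.0)


def fracture_permeability(aperture):
    """Parallel-plate fracture permeability k = e^2 / 12 (m^2)."""
    return aperture ** 2 / 12.0


# Illustrative (made-up) test conditions: 0.1 mm aperture, 0.1 MPa pressure drop.
q = cubic_law_flow(aperture=1.0e-4, pressure_drop=1.0e5, length=0.1, width=0.05)
e_h = hydraulic_aperture(q, pressure_drop=1.0e5, length=0.1, width=0.05)
k = fracture_permeability(e_h)
```

Because the flow rate scales with the cube of the aperture, small aperture changes under stress produce large permeability changes, which is why deviations from the ideal law matter at depth.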

Development of New Processes for the Decommissioning Decontamination and for Treatment and Disposal of the Secondary Low- and Intermediate-Level Radioactive Waste

  • John, Jan;Bartl, Pavel;Cubova, Katerina;Nemec, Mojmir;Semelova, Miroslava;Sebesta, Ferdinand;Sobova, Tereza;Sul'akova, Jana;Vetesnik, Ales;Vopalka, Dusan
    • Journal of Nuclear Fuel Cycle and Waste Technology (JNFCWT), v.19 no.1, pp.9-27, 2021
  • As an example of research activities in decontamination for decommissioning, new data are presented on the options for corrosion layer dissolution during decommissioning decontamination and for persulfate regeneration to allow re-use of decontamination solutions. For the management of spent decontamination solutions, a new method has been developed based on solvent extraction of radionuclides into an ionic liquid followed by electrodeposition of the radionuclides. Fields of application of composite inorganic-organic absorbers or solid extractants with a polyacrylonitrile (PAN) binding matrix for the treatment of liquid radioactive waste are reviewed; a method for americium separation from boric acid-containing NPP evaporator concentrates based on the TODGA-PAN material is discussed in more detail. The performance of a radionuclide transport model, developed and implemented within the GoldSim programming environment for safety studies of LLW/ILW repositories, is demonstrated on the specific case of the Richard repository (Czech Republic). Continuation and even broadening of these activities are expected in connection with the approaching end of the lifespan of the first units of the Czech NPPs.

Improving classification of low-resource COVID-19 literature by using Named Entity Recognition

  • Lithgow-Serrano, Oscar;Cornelius, Joseph;Kanjirangat, Vani;Mendez-Cruz, Carlos-Francisco;Rinaldi, Fabio
    • Genomics & Informatics, v.19 no.3, pp.22.1-22.5, 2021
  • Automatic document classification for highly interrelated classes is a demanding task that becomes more challenging when there is little labeled data for training. Such is the case of the coronavirus disease 2019 (COVID-19) clinical repository (a repository of classified and translated academic articles related to COVID-19 and relevant to clinical practice), where a 3-way classification scheme is being applied to COVID-19 literature. During the 7th Biomedical Linked Annotation Hackathon (BLAH7), we performed experiments to explore the use of named-entity recognition (NER) to improve the classification. We processed the literature with OntoGene's Biomedical Entity Recogniser (OGER) and used the identified Named Entities (NEs) and their links to major biological databases as extra input features for the classifier. We compared the results with a baseline model without the OGER-extracted features. In these proof-of-concept experiments, we observed a clear gain in COVID-19 literature classification. In particular, the origin of an NE was useful for classifying document types, and the type of an NE for classifying clinical specialties. Due to the limitations of the small dataset, we can only conclude that our results suggest that NER would benefit this classification task. To estimate this benefit accurately, further experiments with a larger dataset would be needed.

Characterization of Groundwater Colloids From the Granitic KURT Site and Their Roles in Radionuclide Migration

  • Baik, Min-Hoon;Park, Tae-Jin;Cho, Hye-Ryun;Jung, Euo Chang
    • Journal of Nuclear Fuel Cycle and Waste Technology (JNFCWT), v.20 no.3, pp.279-296, 2022
  • The fundamental characteristics of groundwater colloids, such as composition, concentration, size, and stability, were analyzed using granitic groundwater samples taken from the KAERI Underground Research Tunnel (KURT) site by analytical methods including inductively coupled plasma-mass spectrometry, field emission-transmission electron microscopy, a liquid chromatography-organic carbon detector, and dynamic light scattering. The results show that the KURT groundwater colloids are mainly composed of clay minerals, calcite, metal (Fe) oxide, and organic matter. The size and concentration of the groundwater colloids were 10-250 nm and 33-64 µg·L$^{-1}$, respectively. These values are similar to those from other studies performed in granitic groundwater. The groundwater colloids were found to be moderately stable under the groundwater conditions of the KURT site. Consequently, the groundwater colloids in the fractured granite system of the KURT site can form stable radiocolloids and increase the mobility of radionuclides if they associate with radionuclides released from a radioactive waste repository. The results provide basic data for evaluating the effects of groundwater colloids on radionuclide migration in fractured granite rock, which is necessary for the safety assessment of a high-level radioactive waste repository.

Design and Implementation of XML based Global Peer-to-Peer Engine (XML기반 전역 Peer-to-Peer 엔진 설계 및 구현)

  • Kwon Tae-suk;Lee Il-su;Lee Sung-young
    • The Journal of Korean Institute of Communications and Information Sciences, v.29 no.1B, pp.73-85, 2004
  • In this paper, we describe our experience in designing and implementing a new global XML-based Peer-to-Peer (P2P) engine that supports various P2P applications and interconnection among PC, Web, and mobile computing environments. The proposed P2P engine supports heterogeneous data exchange and Web interconnection by using text-based XML wherever message exchange is necessary. It also provides multi-level security functions and can apply different types of security algorithms. The system consists of four modules: a Message Dispatcher for scheduling and filtering messages, a SecureNet for providing security services and data transmission, a Discovery Manager for constructing the peer-to-peer network, and a Repository Manager for data management, including XML documents. As a feasibility test, we implemented various P2P services: chatting as a communication service, a whiteboard as an authoring tool for collaborative work, and a file system as a file-sharing service. We also compared the proposed system with Gnutella to measure the performance of the two systems.

Fuzzy Clustering Model using Principal Components Analysis and Naive Bayesian Classifier (주성분 분석과 나이브 베이지안 분류기를 이용한 퍼지 군집화 모형)

  • Jun, Sung-Hae
    • The KIPS Transactions: Part B, v.11B no.4, pp.485-490, 2004
  • In data representation, clustering is a grouping process that combines the given data into clusters of similar objects. Various similarity measures have been used in many studies. However, the validity of clustering results is subjective and ambiguous because objective criteria for clustering are difficult to define and in short supply. Fuzzy clustering provides a good method for subjective clustering problems. It performs clustering through a similarity matrix that holds a fuzzy membership value for assigning each object. In this paper, for objective fuzzy clustering, we propose a clustering algorithm that combines principal components analysis as a dimension reduction model with Bayesian learning as a statistical learning theory. For performance evaluation of the proposed algorithm, the Iris and Glass Identification data sets from the UCI Machine Learning Repository are used. The experimental results show favorable performance of the proposed model.

Ensemble Learning of Region Based Classifiers (지역 기반 분류기의 앙상블 학습)

  • Choi, Sung-Ha;Lee, Byung-Woo;Yang, Ji-Hoon
    • The KIPS Transactions: Part B, v.14B no.4, pp.303-310, 2007
  • In machine learning, the ensemble classifier, which is a set of classifiers, has been introduced to achieve higher accuracy than individual classifiers. We propose a new ensemble learning method that employs a set of region based classifiers. To show the performance of the proposed method, we compared it with bagging and boosting, which are existing ensemble methods. Since the distribution of data can differ across regions of the feature space, we split the data, generate a classifier for each region, and apply weighted voting among the classifiers. We used 11 data sets from the UCI Machine Learning Repository to compare the performance of our new ensemble method with that of individual classifiers as well as existing ensemble methods such as bagging and boosting. As a result, we found that our method produced improved performance, particularly when the base learner is Naive Bayes or SVM.

REVIEW AND COMPILATION OF DATA ON RADIONUCLIDE MIGRATION AND RETARDATION FOR THE PERFORMANCE ASSESSMENT OF A HLW REPOSITORY IN KOREA

  • Baik, Min-Hoon;Lee, Seung-Yeop;Lee, Jae-Kwang;Kim, Seung-Soo;Park, Chung-Kyun;Choi, Jong-Won
    • Nuclear Engineering and Technology, v.40 no.7, pp.593-606, 2008
  • In this study, data on radionuclide migration and retardation processes in the engineered and natural barriers of a High-Level Radioactive Waste (HLW) repository have been reviewed and compiled for use in the performance assessment of an HLW disposal system in Korea. The status of the database on radionuclide migration and retardation that is being developed in Korea is investigated and summarized. The solubilities of major actinides such as U, Th, Am, Np, and Pu, both in Korean bentonite porewater and in deep Korean groundwater, are calculated using the geochemical code PHREEQC (Ver. 2.0) based on the KAERI-TDB (Korea Atomic Energy Research Institute-Thermochemical Database), which is under development. Databases for the diffusion coefficients ($D^b_e$ values) and distribution coefficients ($K^b_d$ values) of some radionuclides in compacted Korean Ca-bentonite are developed based upon domestic experimental results. Databases for the rock matrix diffusion coefficients ($D^r_e$ values) and distribution coefficients ($K^r_d$ values) of some radionuclides for Korean granite rock and deep groundwater are also developed based upon domestic experimental results. Finally, data related to colloids, such as the characteristics of natural groundwater colloids and the pseudo-colloid formation constants ($K_{pc}$ values), are provided for the consideration of colloid effects in the performance assessment.

Formation of Nearest Neighbors Set Based on Similarity Threshold (유사도 임계치에 근거한 최근접 이웃 집합의 구성)

  • Lee, Jae-Sik;Lee, Jin-Chun
    • Journal of Intelligence and Information Systems, v.13 no.2, pp.1-14, 2007
  • Case-based reasoning (CBR) is one of the most widely applied data mining techniques and has proven its effectiveness in various domains. Since CBR is essentially based on the k-Nearest Neighbors (NN) method, the value of k directly affects the performance of the CBR model. Once the value of k is set, it is fixed for the lifetime of the CBR model. However, if the value is set greater or smaller than the optimal value, the performance of the CBR model will deteriorate. In this research, we propose a new method of composing the NN set that uses the similarity scores themselves rather than a fixed value of k; we call it the s-NN method. In the s-NN method, a different number of nearest neighbors can be selected for each new case. Performance evaluation using data from the UCI Machine Learning Repository shows that the CBR model adopting the s-NN method outperforms the CBR model adopting the traditional k-NN method.
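
The idea of selecting neighbors by a similarity threshold instead of a fixed k can be sketched as follows. This is only an illustration of the general principle, not the authors' s-NN implementation: the similarity function (1 / (1 + Euclidean distance)), the similarity-weighted vote, and the fallback to the single most similar case when nothing passes the threshold are all assumptions, as are the function names and the toy case base.

```python
import math
from collections import Counter


def similarity(a, b):
    """Assumed similarity in (0, 1]: 1 for identical points, decreasing with distance."""
    return 1.0 / (1.0 + math.dist(a, b))


def s_nn_predict(cases, query, threshold):
    """Classify `query` from all stored cases whose similarity is >= threshold.

    `cases` is a list of (feature_tuple, label) pairs.
    Returns (predicted_label, number_of_neighbors_used).
    """
    scored = [(similarity(x, query), label) for x, label in cases]
    neighbors = [(s, label) for s, label in scored if s >= threshold]
    if not neighbors:
        # Assumed fallback: use the single most similar case.
        neighbors = [max(scored, key=lambda t: t[0])]
    votes = Counter()
    for s, label in neighbors:
        votes[label] += s  # similarity-weighted vote (an assumption)
    return votes.most_common(1)[0][0], len(neighbors)


# Toy case base (made-up data): two well-separated groups.
cases = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b")]

label_1, n_1 = s_nn_predict(cases, (0.2, 0.2), threshold=0.5)
label_2, n_2 = s_nn_predict(cases, (5, 5.4), threshold=0.5)
label_3, n_3 = s_nn_predict(cases, (10, 10), threshold=0.5)
```

Note how the number of neighbors differs per query (three, two, and one in this toy run), which is the behavior a fixed k cannot provide.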
