• Title/Summary/Keyword: Diverse data sets


Deformable image registration in radiation therapy

  • Oh, Seungjong;Kim, Siyong
    • Radiation Oncology Journal / v.35 no.2 / pp.101-111 / 2017
  • The number of imaging data sets acquired during radiation treatment has increased significantly since a diverse range of advanced techniques was introduced into the field of radiation oncology. As a consequence, many studies have proposed meaningful applications of these imaging data sets. These applications commonly require a method to align the data sets to a reference. Deformable image registration (DIR) is a process that satisfies this requirement by locally registering image data sets to a reference image set. DIR identifies the spatial correspondence that minimizes the differences between two or among multiple sets of images. This article describes clinical applications, validation, and algorithms of DIR techniques. Applications of DIR in radiation treatment include dose accumulation, mathematical modeling, automatic segmentation, and functional imaging. The validation methods discussed are based on anatomical landmarks, physical phantoms, digital phantoms, and per-application purpose. DIR algorithms are also briefly reviewed with respect to two algorithmic components: similarity index and deformation models.
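The two algorithmic components named in the abstract can be illustrated with a toy example. The sketch below is not taken from the article: it assumes the sum of squared differences (SSD) as the similarity index and a dense per-pixel displacement field as the deformation model, and the names `ssd` and `warp_nearest` are hypothetical.

```python
import numpy as np

def ssd(fixed, moving):
    """Sum of squared differences, one simple intensity-based similarity index."""
    diff = fixed.astype(float) - moving.astype(float)
    return float(np.sum(diff ** 2))

def warp_nearest(moving, dy, dx):
    """Resample `moving` through a per-pixel displacement field (nearest neighbour)."""
    rows, cols = np.indices(moving.shape)
    src_r = np.clip(np.rint(rows + dy).astype(int), 0, moving.shape[0] - 1)
    src_c = np.clip(np.rint(cols + dx).astype(int), 0, moving.shape[1] - 1)
    return moving[src_r, src_c]

# Toy data (assumed): a 3x3 square, and the same square shifted one pixel right.
fixed = np.zeros((8, 8))
fixed[2:5, 2:5] = 1.0
moving = np.roll(fixed, shift=1, axis=1)

# A uniform displacement field is used here only for brevity; a deformable
# model would let dy and dx vary from pixel to pixel.
dy = np.zeros(fixed.shape)
dx = np.ones(fixed.shape)

before = ssd(fixed, moving)
after = ssd(fixed, warp_nearest(moving, dy, dx))
```

A registration algorithm would search for the displacement field that drives the similarity measure toward its optimum, usually with a regularization term that keeps the field smooth.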

Video augmentation technique for human action recognition using genetic algorithm

  • Nida, Nudrat;Yousaf, Muhammad Haroon;Irtaza, Aun;Velastin, Sergio A.
    • ETRI Journal / v.44 no.2 / pp.327-338 / 2022
  • Classification models for human action recognition require robust features and large training sets for good generalization. Data augmentation methods are therefore employed for imbalanced training sets to achieve higher accuracy. However, because samples generated by data augmentation only reflect existing samples within the training set, their feature representations are less diverse and hence contribute to less precise classification. This paper presents new data augmentation and action representation approaches to grow training sets. The proposed approach is based on two fundamental concepts: virtual video generation for augmentation and representation of the action videos through robust features. Virtual videos are generated from the motion history templates of action videos, which are convolved using a convolutional neural network to generate deep features. Furthermore, by observing an objective function of the genetic algorithm, the spatiotemporal features of different samples are combined to generate the representations of the virtual videos, which are then classified through an extreme learning machine classifier on the MuHAVi-Uncut, iXMAS, and IAVID-1 datasets.
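To make the idea of a motion history template concrete, here is a minimal sketch of the classic motion history image update. It assumes simple frame differencing as the motion detector; the function name `motion_history` and the parameters `tau` and `threshold` are illustrative and may differ from what the paper uses.

```python
import numpy as np

def motion_history(frames, tau=3, threshold=0.1):
    """Collapse a clip into one image whose brightness encodes motion recency."""
    frames = np.asarray(frames, dtype=float)
    mhi = np.zeros(frames.shape[1:])
    for prev, curr in zip(frames[:-1], frames[1:]):
        moving = np.abs(curr - prev) > threshold
        # Pixels that moved are reset to tau; all others decay by one, floored at 0.
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi

# Toy clip (assumed): one bright pixel travelling along a 1x4 row.
clip = [
    [[1, 0, 0, 0]],
    [[0, 1, 0, 0]],
    [[0, 0, 1, 0]],
]
template = motion_history(clip)
```

Older motion fades while recent motion stays bright, so a single image summarizes where and when movement occurred in the clip.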

The Design of Optimized Type-2 Fuzzy Neural Networks and Its Application (최적 Type-2 퍼지신경회로망 설계와 응용)

  • Kim, Gil-Sung;Ahn, Ihn-Seok;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.58 no.8 / pp.1615-1623 / 2009
  • To develop a reliable on-site partial discharge (PD) pattern recognition algorithm, we introduce Type-2 Fuzzy Neural Networks (T2FNNs) optimized by means of Particle Swarm Optimization (PSO). T2FNNs exploit Type-2 fuzzy sets, which are robust across diverse areas of intelligent systems. Because it is not easy on site to obtain the voltage phases used for PRPDA (Phase Resolved Partial Discharge Analysis), the PD data sets measured in the laboratory were artificially changed into data sets with shifted voltage phases and added noise in order to test the proposed algorithm. The results obtained by the proposed algorithm were also compared with those of conventional Neural Networks (NNs) and the existing Radial Basis Function Neural Networks (RBFNNs). The T2FNNs proposed in this study appeared to perform better than conventional NNs and RBFNNs.
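The test-data construction described above (shifting voltage phases and adding noise) can be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: it treats a PD pattern as a vector of 360 one-degree phase bins, uses a circular shift, and adds Gaussian noise. The name `shift_and_add_noise` is hypothetical.

```python
import numpy as np

def shift_and_add_noise(prpd, shift_bins, noise_std, rng):
    """Circularly shift a phase-resolved pattern and add zero-mean Gaussian noise."""
    shifted = np.roll(prpd, shift_bins, axis=-1)
    return shifted + rng.normal(0.0, noise_std, size=prpd.shape)

rng = np.random.default_rng(0)

# Toy pattern (assumed): a single discharge peak at a phase angle of 45 degrees.
pattern = np.zeros(360)
pattern[45] = 1.0

shifted_only = shift_and_add_noise(pattern, 30, 0.0, rng)   # phase shift, no noise
shifted_noisy = shift_and_add_noise(pattern, 30, 0.05, rng)  # phase shift plus noise
```

The circular shift reflects the periodicity of the voltage phase: pulses pushed past 360 degrees wrap around to the start of the cycle.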

The application of GIS in analyzing acoustical and multidimensional data related to artificial reefs ground (인공어초 어장에서 수록한 음향학적 다차원 데이터 해석을 위한 GIS의 응용)

  • Kang, Myoung-Hee;Nakamura, Takeshi;Hamano, Akira
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.47 no.3 / pp.222-233 / 2011
  • This study presents a multi-dimensional analysis of diverse data sets for artificial reefs off the coast of Shimonoseki, Yamaguchi prefecture, Japan. Various data sets recorded at the artificial reef ground were integrated in new GIS software to reveal the relationships between water temperature and fish schools, to visualize the quantitative connection between the reefs and the fish schools, and to compare the seabed types derived from two different data sources. The results suggest that applying GIS to the analysis of multi-dimensional data gives a better understanding of the characteristics of fish schools and of environmental information around artificial reefs, particularly when evaluating the effectiveness of artificial reefs.

Significant Gene Selection Using Integrated Microarray Data Set with Batch Effect

  • Kim Ki-Yeol;Chung Hyun-Cheol;Jeung Hei-Cheul;Shin Ji-Hye;Kim Tae-Soo;Rha Sun-Young
    • Genomics & Informatics / v.4 no.3 / pp.110-117 / 2006
  • In microarray technology, many diverse experimental features can cause biases, including RNA sources, microarray production or different platforms, diverse sample processing, and various experiment protocols. These systematic effects are a substantial obstacle in the analysis of microarray data. When data sets derived from different experimental processes were used together, the analysis results were mostly inconsistent and unreliable. Therefore, one of the most pressing challenges in the microarray field is how to combine data that come from two different groups. As a novel attempt to integrate two data sets with a batch effect, we simply applied standardization to the microarray data before significant gene selection. In the gene selection step, we used a newly defined measure that considers the distance between a gene and an ideal gene as well as the between-slide and within-slide variations. We also discussed the association between biological functions and different expression patterns in the selected discriminative gene set. As a result, we could confirm that the batch effect was minimized by standardization and that the genes selected from the standardized data included various expression patterns and significant biological functions.
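The abstract does not say which form of standardization was applied. One plausible reading is a per-gene z-score computed separately within each batch, sketched below. The layout (genes as rows, samples as columns) and the function name `standardize_per_batch` are assumptions made for illustration.

```python
import numpy as np

def standardize_per_batch(expr, batch):
    """Z-score each gene (row) within each batch of samples (columns)."""
    expr = np.asarray(expr, dtype=float)
    batch = np.asarray(batch)
    out = np.empty_like(expr)
    for b in np.unique(batch):
        cols = batch == b
        block = expr[:, cols]
        mean = block.mean(axis=1, keepdims=True)
        std = block.std(axis=1, ddof=1, keepdims=True)
        std[std == 0] = 1.0  # leave constant genes at zero instead of dividing by 0
        out[:, cols] = (block - mean) / std
    return out

# Toy data (assumed): 2 genes, 6 samples, with the second batch offset upward.
expr = [[1, 2, 3, 11, 12, 13],
        [5, 5, 5, 8, 9, 10]]
batch = [0, 0, 0, 1, 1, 1]
standardized = standardize_per_batch(expr, batch)
```

After standardization, the offset between the two batches disappears for gene 0, which is the kind of systematic shift a batch effect introduces.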

Applications of Diverse Data Combinations in Subsurface Characterization using D-optimality Based Pilot Point Methods (DBM)

  • Jung, Yong;Mahinthakumar, G.
    • Journal of Soil and Groundwater Environment / v.18 no.2 / pp.45-53 / 2013
  • Many strategically designed groundwater remediation projects lack information on hydraulic conductivity or permeability, which can render remediation methods inefficient. Many studies have sought to minimize this shortcoming by determining detailed hydraulic information through either direct or indirect measurements. One popular method for hydraulic characterization is the pilot point method (PPM), in which the hydraulic property is estimated at a small number of strategically selected points using secondary measurements such as hydraulic head or tracer concentration. This paper adopts a D-optimality based pilot point method (DBM), previously developed for hydraulic head measurements, and extends it to include both hydraulic head and tracer measurements. Based on different combinations of trials, our analysis showed that DBM performs well when hydraulic head is used for pilot point selection and both hydraulic head and tracer measurements are used for determining the conductivity values.
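The D-optimality criterion itself can be illustrated with a generic greedy selection. The sketch below is not the DBM formulation from the paper. It assumes each candidate point has a row of sensitivities to the unknown parameters and repeatedly picks the candidate that most increases the log-determinant of the accumulated information matrix. The name `greedy_d_optimal` and the `ridge` term are hypothetical.

```python
import numpy as np

def greedy_d_optimal(sensitivity, k, ridge=1e-6):
    """Greedily pick k rows that maximize log det of the information matrix."""
    sensitivity = np.asarray(sensitivity, dtype=float)
    n_candidates, n_params = sensitivity.shape
    info = ridge * np.eye(n_params)  # small ridge keeps the matrix invertible
    chosen = []
    for _ in range(k):
        best, best_logdet = None, -np.inf
        for i in range(n_candidates):
            if i in chosen:
                continue
            row = sensitivity[i:i + 1]
            _, logdet = np.linalg.slogdet(info + row.T @ row)
            if logdet > best_logdet:
                best, best_logdet = i, logdet
        chosen.append(best)
        row = sensitivity[best:best + 1]
        info = info + row.T @ row
    return chosen

# Toy sensitivities (assumed): 4 candidate points, 2 unknown parameters.
candidates = [[2.0, 0.0],
              [0.9, 0.1],
              [0.0, 1.0],
              [0.1, 0.1]]
selected = greedy_d_optimal(candidates, k=2)
```

The second pick favours the candidate that is informative about the parameter the first pick says nothing about, which is the behaviour the determinant criterion rewards.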

Reinforcement learning multi-agent using unsupervised learning in a distributed cloud environment

  • Gu, Seo-Yeon;Moon, Seok-Jae;Park, Byung-Joon
    • International Journal of Internet, Broadcasting and Communication / v.14 no.2 / pp.192-198 / 2022
  • Companies are building and using their own data analysis systems in the distributed cloud according to their business characteristics. However, as businesses and data types become more complex and diverse, the demand for more efficient analytics has increased. In response, this paper proposes an unsupervised learning-based data analysis agent to which reinforcement learning is applied for effective data analysis. The proposed agent consists of a reinforcement learning processing manager module and an unsupervised learning manager module. These two modules configure an agent with k-means clustering on multiple nodes and then perform distributed training on multiple data sets. This enables data analysis in a relatively short time compared to conventional systems that analyze large-scale data in one batch.

Improving Diversity of Keyword Search on Graph-structured Data by Controlling Similarity of Content Nodes (콘텐트 노드의 유사성 제어를 통한 그래프 구조 데이터 검색의 다양성 향상)

  • Park, Chang-Sup
    • The Journal of the Korea Contents Association / v.20 no.3 / pp.18-30 / 2020
  • As graph-structured data has become widely used in fields such as social networks and the semantic Web, the need for effective and efficient search over large amounts of graph data has been increasing. Previous keyword-based search methods often find results by considering only the relevance to a given query. However, they are likely to produce semantically similar results by selecting answers which have high query relevance but share the same content nodes. To improve the diversity of search results, we propose a top-k search method that finds a set of subtrees which are not only relevant but also diverse in terms of the content nodes by controlling their similarity. We define a criterion for a set of diverse answer trees and design two kinds of diversified top-k search algorithms, which are based on incremental enumeration and A* heuristic search, respectively. We also suggest an improvement to the A* search algorithm to enhance its performance. We show by experiments using real data sets that the proposed heuristic search method can efficiently find relevant answers with diverse content nodes.
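The idea of controlling content-node similarity can be sketched with a simple greedy filter. This is neither the paper's incremental enumeration algorithm nor its A* search. It assumes each answer is a pair of relevance score and content-node set, and it uses Jaccard similarity with a fixed threshold. The names `jaccard`, `diversified_top_k`, and `max_similarity` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of content nodes."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def diversified_top_k(answers, k, max_similarity):
    """Keep the most relevant answers whose content nodes are not too similar."""
    selected = []
    for relevance, nodes in sorted(answers, key=lambda a: a[0], reverse=True):
        if all(jaccard(nodes, kept) <= max_similarity for _, kept in selected):
            selected.append((relevance, nodes))
        if len(selected) == k:
            break
    return selected

# Toy answers (assumed): (relevance score, set of content nodes).
answers = [
    (0.9, {"a", "b"}),
    (0.8, {"a", "b"}),  # same content nodes as the top answer
    (0.7, {"c", "d"}),
    (0.6, {"a", "e"}),
]
top2 = diversified_top_k(answers, k=2, max_similarity=0.5)
```

A relevance-only ranking would return the first two answers, which share identical content nodes. The filter replaces the second with the next answer that adds new content.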

Generation of Pareto Sets based on Resource Reduction for Multi-Objective Problems Involving Project Scheduling and Resource Leveling (프로젝트 일정과 자원 평준화를 포함한 다목적 최적화 문제에서 순차적 자원 감소에 기반한 파레토 집합의 생성)

  • Jeong, Woo-Jin;Park, Sung-Chul;Yim, Dong-Soon
    • Journal of Korean Society of Industrial and Systems Engineering / v.43 no.2 / pp.79-86 / 2020
  • To make a satisfactory decision regarding project scheduling, a trade-off between the resource-related cost and project duration must be considered. A beneficial approach is to provide decision makers with a number of alternative schedules of diverse project durations, each with minimum resource cost. In terms of optimization, the alternative schedules form Pareto sets under the two objectives of project duration and resource cost. Assuming that resource cost is closely related to resource leveling, a heuristic algorithm for resource capacity reduction (HRCR) is developed in this study to generate the Pareto sets efficiently. The heuristic is based on the fact that resource leveling can be improved by systematically reducing the resource capacity. Once the reduced resource capacity is given, a schedule with minimum project duration can be obtained by solving a resource-constrained project scheduling problem. In HRCR, VNS (Variable Neighborhood Search) is implemented to solve the resource-constrained project scheduling problem. Extensive experiments to evaluate HRCR performance are conducted with the standard benchmark data sets of PSPLIB. Considering 5 resource leveling objective functions, it is shown that HRCR outperforms the well-known multi-objective optimization algorithm SPEA2 (Strength Pareto Evolutionary Algorithm-2) in generating dominant Pareto sets. The number of approximate Pareto-optimal solutions can also be increased by modifying the weight parameter used to reduce resource capacity in HRCR.
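The notion of a Pareto set over project duration and resource cost can be shown with a small dominance filter. The sketch below is not part of HRCR: it only extracts the non-dominated schedules from a list of hypothetical (duration, cost) pairs, both to be minimized. The name `pareto_front` is illustrative.

```python
def pareto_front(schedules):
    """Return the non-dominated (duration, resource_cost) pairs, both minimized."""
    front = []
    for i, (d, c) in enumerate(schedules):
        dominated = any(
            d2 <= d and c2 <= c and (d2 < d or c2 < c)
            for j, (d2, c2) in enumerate(schedules)
            if j != i
        )
        if not dominated:
            front.append((d, c))
    return sorted(set(front))  # drop duplicates and order by duration

# Toy schedules (assumed): (project duration, resource cost).
schedules = [(10, 50), (12, 40), (11, 55), (15, 40), (12, 40)]
front = pareto_front(schedules)
```

Here (11, 55) is dominated by (10, 50), and (15, 40) by (12, 40), so only two alternatives remain for the decision maker.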

Machine Learning Applied to Uncovering Gene Regulation

  • Craven, Mark
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.61-68 / 2000
  • Now that the complete genomes of numerous organisms have been ascertained, key problems in molecular biology include determining the functions of the genes in each organism, the relationships that exist among these genes, and the regulatory mechanisms that control their operation. These problems can be partially addressed by using machine learning methods to induce predictive models from available data. My group is applying and developing machine learning methods for several tasks that involve characterizing gene regulation. In one project, for example, we are using machine learning methods to identify transcriptional control elements such as promoters, terminators and operons. In another project, we are using learning methods to identify and characterize sets of genes that are affected by tumor promoters in mammals. Our approach to these tasks involves learning multiple models for inter-related tasks, and applying learning algorithms to rich and diverse data sources including sequence data, microarray data, and text from the scientific literature.
