• Title/Summary/Keyword: Multiple Data Sets


3D Medical Image Data Watermarking Applied to Healthcare Information Management System (헬스케어 정보 관리 시스템의 3D 의료영상 데이터 다중 워터마킹 기법)

  • Lee, Suk-Hwan;Kwon, Ki-Ryong
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.11A / pp.870-881 / 2009
  • The rapid development of healthcare information management for 3D medical digital libraries, 3D PACS, and 3D medical diagnosis has raised security issues in medical IT. This paper presents a multiple watermarking scheme for 3D medical image data that supports protection, authentication, indexing, and diagnosis-information hiding in healthcare information management. The proposed scheme, based on POCS watermarking, embeds a robust watermark carrying the doctor's digital signature and an information-retrieval indexing key into the distribution of vertex curvedness, and embeds a fragile watermark carrying diagnosis information and an authentication reference message into the vertex distance differences. The multiple embedding process designs three convex sets for robustness, fragileness, and invisibility, and projects the 3D medical image data onto the three convex sets alternately and iteratively. Experimental results confirm that the proposed scheme is simultaneously robust and fragile against various 3D geometric and mesh modifications.
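
As a rough illustration of the alternating-projection idea behind POCS, the sketch below projects a vector onto two toy convex sets (a hyperplane and a box) until it settles in their intersection; the paper's actual robustness, fragileness, and invisibility sets over mesh vertex data are not reproduced here.

```python
import numpy as np

def project_hyperplane(x, a, b):
    # Orthogonal projection of x onto the hyperplane {y : a.y = b}
    return x - ((a @ x - b) / (a @ a)) * a

def project_box(x, lo=0.0, hi=1.0):
    # Projection onto the box [lo, hi]^n
    return np.clip(x, lo, hi)

def pocs(x0, a, b, iters=100):
    # Alternate the two projections until the iterate settles in the intersection.
    x = x0.copy()
    for _ in range(iters):
        x = project_box(project_hyperplane(x, a, b))
    return x

x = pocs(np.array([2.0, -1.0, 0.5]), a=np.array([1.0, 1.0, 1.0]), b=1.5)
print(x)  # lies (approximately) in both sets
```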

Development of kNN QSAR Models for 3-Arylisoquinoline Antitumor Agents

  • Tropsha, Alexander;Golbraikh, Alexander;Cho, Won-Jea
    • Bulletin of the Korean Chemical Society / v.32 no.7 / pp.2397-2404 / 2011
  • A variable-selection k-nearest-neighbor (kNN) QSAR modeling approach was applied to a data set of 80 3-arylisoquinolines exhibiting cytotoxicity against the human lung tumor cell line A-549. All compounds were characterized with molecular topology descriptors calculated with the MolconnZ program. Seven compounds were randomly selected from the original data set and used as an external validation set. The remaining subset of 73 compounds was divided into multiple training (56 to 61 compounds) and test (12 to 17 compounds) sets using a chemical diversity sampling method developed by this group. Highly predictive models, characterized by leave-one-out cross-validated $R^2$ ($q^2$) values greater than 0.8 for the training sets and $R^2$ values greater than 0.7 for the test sets, were obtained. The robustness of the models was confirmed by the Y-randomization test: all models built using training sets with randomly shuffled activities were characterized by low $q^2{\leq}0.26$ and $R^2{\leq}0.22$ for training and test sets, respectively. The twelve best models (with the highest values of both $q^2$ and $R^2$) predicted the activities of the external validation set of seven compounds with $R^2$ ranging from 0.71 to 0.93.
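
For orientation, here is a minimal sketch of kNN regression scored by the leave-one-out cross-validated $q^2$ used above, with random placeholder descriptors standing in for the MolconnZ topology indices.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.normal(size=(73, 20))   # 73 training compounds, 20 placeholder descriptors
y = rng.normal(size=73)         # placeholder cytotoxicity values

def loo_q2(X, y, k=3):
    # Leave-one-out cross-validated R^2 (q^2) of a kNN regression model.
    preds = np.empty_like(y)
    for train_idx, test_idx in LeaveOneOut().split(X):
        model = KNeighborsRegressor(n_neighbors=k).fit(X[train_idx], y[train_idx])
        preds[test_idx] = model.predict(X[test_idx])
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

print(loo_q2(X, y))
```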

Parametrized Construction of Virtual Drivers' Reach Motion to Seat Belt (매개변수로 제어가능한 운전자의 안전벨트 뻗침 모션 생성)

  • Seo, Hye-Won;Cordier, Frederic;Choi, Woo-Jin;Choi, Hyung-Yun
    • Korean Journal of Computational Design and Engineering / v.16 no.4 / pp.249-259 / 2011
  • In this paper, we present our work on the parameterized construction of virtual drivers' reach motion toward the seat belt using motion capture data. A user can generate a new reach motion by controlling a number of parameters. We approach the problem by using multiple sets of example reach motions and learning the relation between the labeling parameters and the motion data. The work is composed of three tasks. First, we construct a motion database from multiple sets of labeled motion clips obtained with a motion capture device. This involves removing the redundancy of each motion clip by using PCA (Principal Component Analysis) and establishing temporal correspondence among different motion clips through automatic segmentation and piecewise time warping of each clip. Next, we compute motion blending functions by learning the relation between the labeling parameters (age, hip base point (HBP), and height) and the motion parameters represented by a set of PC coefficients. During runtime, on-line motion synthesis is accomplished by evaluating the motion blending function with the user-supplied control parameters.
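
A minimal sketch of the PCA-plus-blending idea, assuming toy motion clips flattened into fixed-length vectors and a single control parameter; the paper's segmentation, time warping, and (age, HBP, height) labeling are not reproduced here.

```python
import numpy as np

clips = np.random.default_rng(1).normal(size=(10, 300))   # 10 toy example clips
labels = np.linspace(0.0, 1.0, 10)                        # one control parameter per clip

# PCA: represent each clip by a few principal-component coefficients.
mean = clips.mean(axis=0)
U, S, Vt = np.linalg.svd(clips - mean, full_matrices=False)
k = 4
coeffs = (clips - mean) @ Vt[:k].T                        # (10, k) PC coefficients

# Fit a linear blending function from the label to the PC coefficients.
A = np.vstack([labels, np.ones_like(labels)]).T
W, *_ = np.linalg.lstsq(A, coeffs, rcond=None)

def synthesize(label):
    # Evaluate the blending function and reconstruct a motion vector.
    c = np.array([label, 1.0]) @ W
    return mean + c @ Vt[:k]

print(synthesize(0.35).shape)   # (300,)
```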

A Study on Classifications of Remote Sensed Multispectral Image Data using Soft Computing Technique - Stressed on Rough Sets - (소프트 컴퓨팅기술을 이용한 원격탐사 다중 분광 이미지 데이터의 분류에 관한 연구 -Rough 집합을 중심으로-)

  • Won Sung-Hyun
    • Management & Information Systems Review / v.3 / pp.15-45 / 1999
  • Computer-based processing of remotely sensed image data has become an essential technique across many fields, such as environmental observation, land cultivation, resource investigation, military trend analysis, and agricultural product estimation. In particular, accurate classification and analysis of remotely sensed image data are key elements that determine the reliability of remote sensing image processing systems, and much research has been carried out to improve the accuracy of such classification and analysis. Traditionally, remote sensing image processing systems have worked with two or three bands selected from the available bands, with statistical separability or wavelength properties as the selection criteria. However, as sensing environments shift from multispectral to hyperspectral, the need has arisen for band selection methods driven by the data distribution characteristics rather than by wavelength properties or statistical separability alone. For efficient data classification in a multispectral band environment, this paper proposes a band feature extraction method based on Rough set theory. First, a look-up table is built from training data and the properties of the experimental multispectral image data are analyzed; the efficient bands are then selected from the analysis results using the indiscernibility relation of Rough set theory. The proposed method was applied to LANDSAT TM data acquired on 2 June 1992. The resulting clustering trends are similar to those of traditional wavelength-based band selection, which verifies that the proposed data-driven method can be used to select efficient bands even as sensing environments move to hyperspectral bands.
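
The indiscernibility-based selection can be illustrated with a toy decision table (hypothetical discretized band values): find the smallest band subsets whose indiscernibility classes are pure with respect to the class label. This is a generic Rough-set sketch, not the paper's procedure.

```python
from itertools import combinations

table = [
    # band1, band2, band3, class
    (1, 0, 2, 'water'),
    (1, 0, 1, 'water'),
    (0, 1, 2, 'forest'),
    (0, 1, 1, 'forest'),
    (1, 1, 2, 'urban'),
]

def indiscernibility(rows, attrs):
    # Partition rows into classes that are indistinguishable on `attrs`.
    classes = {}
    for i, row in enumerate(rows):
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, set()).add(i)
    return list(classes.values())

def preserves_decision(rows, attrs):
    # True if every indiscernibility class is pure in the decision attribute.
    return all(len({rows[i][-1] for i in cls}) == 1
               for cls in indiscernibility(rows, attrs))

# Search for the smallest band subsets that still discriminate the classes.
bands = [0, 1, 2]
for r in range(1, len(bands) + 1):
    subsets = [s for s in combinations(bands, r) if preserves_decision(table, s)]
    if subsets:
        print("smallest discriminating band subsets:", subsets)
        break
```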


Parallel Multithreaded Processing for Data Set Summarization on Multicore CPUs

  • Ordonez, Carlos;Navas, Mario;Garcia-Alvarado, Carlos
    • Journal of Computing Science and Engineering / v.5 no.2 / pp.111-120 / 2011
  • Data mining algorithms should exploit new hardware technologies to accelerate computations. Such a goal is difficult to achieve in a database management system (DBMS) because of its complex internal subsystems and because the numeric computations of data mining over large data sets are difficult to optimize. This paper explores taking advantage of the multithreaded capabilities of multicore CPUs, as well as caching in RAM, to efficiently compute summaries of a large data set, a fundamental data mining problem. We introduce parallel algorithms working on multiple threads, which overcome the row aggregation processing bottleneck of accessing secondary storage while maintaining linear time complexity with respect to data set size. Our proposal is based on a combination of table scans and parallel multithreaded processing among the multiple cores of the CPU. We introduce several database-style and hardware-level optimizations: caching row blocks of the input table, managing available RAM, interleaving I/O and CPU processing, and tuning the number of working threads. We experimentally benchmark our algorithms with large data sets on a DBMS running on a computer with a multicore CPU. We show that our algorithms outperform existing DBMS mechanisms in computing aggregations of multidimensional data summaries, especially as dimensionality grows. Furthermore, we show that local memory allocation (RAM block size) does not have a significant impact when the thread management algorithm distributes the workload among a fixed number of threads. Our proposal is unique in the sense that we do not modify or require access to the DBMS source code; instead, we extend the DBMS with analytic functionality by developing User-Defined Functions.
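
A generic sketch of the block-scan-and-merge pattern: each thread summarizes one row block into the sufficient statistics n, L (sum), and Q (sum of squares), which are then merged. This is illustrative only, not the authors' DBMS User-Defined Function implementation (NumPy releases Python's GIL for most of this work, so the block scans do run concurrently).

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

data = np.random.default_rng(2).normal(size=(1_000_000, 8))   # rows x dimensions

def summarize_block(block):
    # Sufficient statistics of one row block: count, column sums, column sums of squares.
    return block.shape[0], block.sum(axis=0), (block ** 2).sum(axis=0)

def summarize(data, n_threads=4):
    blocks = np.array_split(data, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parts = list(pool.map(summarize_block, blocks))
    n = sum(p[0] for p in parts)
    L = sum(p[1] for p in parts)
    Q = sum(p[2] for p in parts)
    return n, L, Q

n, L, Q = summarize(data)
mean = L / n
var = Q / n - mean ** 2          # per-dimension variance recovered from n, L, Q
print(mean.shape, var.shape)
```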

Stream Data Processing based on Sliding Window at u-Health System (u-Health 시스템에서 슬라이딩 윈도우 기반 스트림 데이터 처리)

  • Kim, Tae-Yeun;Song, Byoung-Ho;Bae, Sang-Hyun
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.4 no.2 / pp.103-110 / 2011
  • Accurate and efficient management of the digital data measured by sensors is necessary in a u-health system. It is inefficient for a sensor network to store massive input stream data in a database and process it at the same time. We propose to improve the processing performance for multidimensional stream data arriving continuously from multiple sensors. We propose sliding-window-based query processing for efficient handling of the input stream, find multiple query plans with the MJoin method, and reduce the stored data using a backpropagation algorithm. As a result, we obtained an approximately 18.3% reduction in database storage using 14,324 data sets.
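
A minimal sketch of count-based sliding-window processing over a sensor stream; the MJoin query planning and backpropagation-based reduction described above are not reproduced here.

```python
from collections import deque

class SlidingWindowAverage:
    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.total = 0.0

    def push(self, value):
        # Add the newest reading and evict the oldest once the window is full.
        self.window.append(value)
        self.total += value
        if len(self.window) > self.size:
            self.total -= self.window.popleft()
        return self.total / len(self.window)

win = SlidingWindowAverage(size=5)
for reading in [72, 75, 71, 90, 88, 76, 74]:   # e.g. heart-rate samples
    print(win.push(reading))
```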

Localization Algorithm of Multiple-AUVs Utilizing Relative 3D Observations (3차원 상대 관측 정보를 통한 다중자율무인잠수정의 위치추정 알고리즘)

  • Choi, Kihwan;Lee, Gwonsoo;Lee, Phil-Yeob;Kim, Ho Sung;Lee, Hansol;Kang, Hyungjoo;Lee, Jihong
    • The Journal of Korea Robotics Society / v.17 no.2 / pp.110-117 / 2022
  • This paper describes a localization algorithm utilizing relative observations for multiple autonomous underwater vehicles (Multiple-AUVs). In order to maximize the efficiency of operation and mission accomplishment and to prevent problems such as collision and interference, the locations and directions of Multiple-AUVs must be precisely estimated. To estimate the locations and directions, we designed a localization algorithm utilizing relative observations and verified it with simulations based on sensor data sets acquired through real sea experiments. In addition, an optimal combination of relative observation information for efficient localization is identified by combining various relative observations. The proposed method shows improved localization results compared to using the navigation algorithm alone. Localization performance is improved by up to 58%, depending on the combination of relative observations.
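
As a hedged illustration of exploiting a relative 3D observation, the sketch below fuses a dead-reckoned position with the position implied by a partner vehicle's estimate plus a measured 3D offset, weighting each by its inverse covariance; the paper's full navigation filter is considerably more involved.

```python
import numpy as np

def fuse(dead_reckoned, P_dr, partner_pos, relative_obs, P_obs):
    # Position implied by the partner's position plus the relative observation.
    observed = partner_pos + relative_obs
    # Combine the two estimates, weighting each by its inverse covariance.
    W_dr, W_obs = np.linalg.inv(P_dr), np.linalg.inv(P_obs)
    P_fused = np.linalg.inv(W_dr + W_obs)
    x_fused = P_fused @ (W_dr @ dead_reckoned + W_obs @ observed)
    return x_fused, P_fused

x, P = fuse(
    dead_reckoned=np.array([10.0, 5.0, -20.0]),
    P_dr=np.eye(3) * 4.0,                       # drifting dead-reckoning estimate
    partner_pos=np.array([0.0, 0.0, -18.0]),    # partner AUV position
    relative_obs=np.array([9.0, 5.5, -1.5]),    # measured 3D offset to this AUV
    P_obs=np.eye(3) * 1.0,                      # more confident acoustic observation
)
print(x)
```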

Multi-period DEA Models Using Spanning Set and A Case Example (생성집합을 이용한 다 기간 성과평가를 위한 DEA 모델 개발 및 공학교육혁신사업 사례적용)

  • Kim, Kiseong;Lee, Taehan
    • Journal of Korean Society of Industrial and Systems Engineering / v.45 no.3 / pp.57-65 / 2022
  • DEA (data envelopment analysis) is a technique for evaluating the relative efficiency of decision making units (DMUs) that have multiple inputs and outputs. A DEA model measures the efficiency of a DMU by the relative position of the DMU's input and output in the production possibility set defined by the inputs and outputs of the DMUs being compared. In this paper, we proposed several DEA models for measuring the multi-period efficiency of a DMU. First, we defined the input and output data that make up a production possibility set as the spanning set. We proposed several spanning sets containing the inputs and outputs of all periods for measuring the multi-period efficiency of a DMU. We defined production possibility sets with the proposed spanning sets and gave DEA models under these production possibility sets. Some models measure the efficiency score of each period of a DMU, and others measure the integrated efficiency score of the DMU over the entire period. For the test, we applied the models to a sample data set from a long-term university student training project. The results show that the suggested models may have better discrimination power than CCR-based results, while the ranking of DMUs is not different.
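
For context, the classical input-oriented CCR envelopment model for a single DMU can be set up as a linear program, as in the sketch below (made-up data); the paper's multi-period models change which (input, output) columns span the production possibility set, but the LP structure is analogous.

```python
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0, 4.0, 3.0],     # inputs:  rows = input dims, cols = DMUs
              [3.0, 1.0, 5.0]])
Y = np.array([[1.0, 2.0, 2.5]])    # outputs: rows = output dims, cols = DMUs

def ccr_efficiency(X, Y, j):
    # min theta  s.t.  X lam <= theta * x_j,  Y lam >= y_j,  lam >= 0
    m, n = X.shape
    s = Y.shape[0]
    c = np.zeros(n + 1)            # decision variables: [theta, lam_1..lam_n]
    c[0] = 1.0
    A_ub = np.vstack([
        np.hstack([-X[:, [j]], X]),        #  X lam - theta x_j <= 0
        np.hstack([np.zeros((s, 1)), -Y]), # -Y lam            <= -y_j
    ])
    b_ub = np.concatenate([np.zeros(m), -Y[:, j]])
    bounds = [(0, None)] * (n + 1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0]

for j in range(X.shape[1]):
    print(f"DMU {j}: efficiency = {ccr_efficiency(X, Y, j):.3f}")
```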

Schema- and Data-driven Discovery of SQL Keys

  • Le, Van Bao Tran;Sebastian, Link;Mozhgan, Memari
    • Journal of Computing Science and Engineering / v.6 no.3 / pp.193-206 / 2012
  • Keys play a fundamental role in all data models. They allow database systems to uniquely identify data items and, therefore, promote efficient data processing in many applications. Because of this, support is required for discovering keys. These include keys that are semantically meaningful for the application domain or that are satisfied by a given database. We study the discovery of keys from SQL tables. We investigate the structural and computational properties of Armstrong tables for sets of SQL keys. Inspecting Armstrong tables enables data engineers to consolidate their understanding of semantically meaningful keys and to communicate this understanding to other stakeholders. The stakeholders may want to make changes to the tables or provide entirely different tables to communicate their views to the data engineers. For this purpose, we propose data mining algorithms that discover keys from a given SQL table. We combine the key mining algorithms with Armstrong table computations to generate informative Armstrong tables, that is, key-preserving semantic samples of existing SQL tables. Finally, we define formal measures to assess the distance between sets of SQL keys. The measures can be applied to validate the usefulness of Armstrong tables and to automate the marking of, and feedback on, non-multiple-choice questions in database courses.
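
A toy sketch of data-driven key discovery: enumerate the minimal column sets whose projection is duplicate-free in a given table. The data are hypothetical, and the paper's algorithms and its treatment of SQL NULLs are not reproduced here.

```python
from itertools import combinations

rows = [
    {"emp_id": 1, "email": "a@x.com", "dept": "R&D",   "phone": "111"},
    {"emp_id": 2, "email": "b@x.com", "dept": "R&D",   "phone": "222"},
    {"emp_id": 3, "email": "c@x.com", "dept": "Sales", "phone": "222"},
]

def is_key(rows, cols):
    # A column set is a key of the data if its projection has no duplicates.
    projections = [tuple(r[c] for c in cols) for r in rows]
    return len(set(projections)) == len(projections)

columns = list(rows[0])
minimal_keys = []
for r in range(1, len(columns) + 1):
    for cols in combinations(columns, r):
        if is_key(rows, cols) and not any(set(k) <= set(cols) for k in minimal_keys):
            minimal_keys.append(cols)

print(minimal_keys)   # e.g. [('emp_id',), ('email',), ('dept', 'phone')]
```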

Prediction System Design based on An Interval Type-2 Fuzzy Logic System using HCBKA (HCBKA를 이용한 Interval Type-2 퍼지 논리시스템 기반 예측 시스템 설계)

  • Bang, Young-Keun;Lee, Chul-Heui
    • Journal of Industrial Technology / v.30 no.A / pp.111-117 / 2010
  • To improve the performance of a prediction system, the system should properly reflect the uncertainty of nonlinear data. Thus, this paper presents multiple prediction systems based on Type-2 fuzzy sets. To construct each prediction system, an Interval Type-2 TSK Fuzzy Logic System and difference data were used, because it is generally known that a Type-2 Fuzzy Logic System can deal with the uncertainty of nonlinear data better than a Type-1 Fuzzy Logic System, and because difference data can provide steadier information than the original data. Also, to improve the rule base of each fuzzy prediction system, HCBKA (Hierarchical Correlation Based K-means clustering Algorithm) was applied, because it can consider the correlation and statistical characteristics of the data at the same time. Subsequently, a system selection method was used to reduce the complexity of the proposed prediction system. Finally, this paper analyzes and compares the performance of the Type-1 prediction system and the Interval Type-2 prediction system through simulations on three typical time series examples.
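
As a small illustration, the sketch below evaluates an interval type-2 Gaussian membership function with an uncertain mean and forms the difference data mentioned above; the HCBKA clustering and TSK rule base of the paper are not reproduced.

```python
import numpy as np

def it2_gaussian(x, m1, m2, sigma):
    # Upper/lower membership grades for a Gaussian MF whose mean lies in [m1, m2].
    g = lambda m: np.exp(-0.5 * ((x - m) / sigma) ** 2)
    upper = 1.0 if m1 <= x <= m2 else max(g(m1), g(m2))
    lower = min(g(m1), g(m2))
    return lower, upper

series = np.array([10.2, 10.8, 11.5, 11.1, 12.0])
diff = np.diff(series)   # difference data: usually steadier than the raw series
print(diff)
print(it2_gaussian(0.55, m1=0.4, m2=0.7, sigma=0.3))
```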
