• Title/Summary/Keyword: Large-scale Analysis Data


Analysis of X-ray luminosities of isolated elliptical galaxies in SDSS

  • Choi, Yun-Young;Kim, Eun-Bin;Kim, Sung-Soo S.;Park, Chang-Bom
    • The Bulletin of The Korean Astronomical Society / v.36 no.1 / pp.58.2-58.2 / 2011
  • Park, Gott, & Choi (2008) found that when a galaxy is located within the virial radius of its closest neighbor and the neighbor is an elliptical, the probability that the galaxy is an elliptical is very sensitive to the large-scale background density over scales of a few Mpc. They suggested that this large-scale dependence can arise if the temperature of the diffuse hot gas held by elliptical galaxies is higher in higher-density environments. In this study, to understand how the large-scale environment affects the X-ray properties of individual galaxies, we investigated the dependence of the X-ray luminosities of elliptical galaxies on the large-scale environment, using X-ray and optical data selected from the ROSAT All-Sky Survey and the Sloan Digital Sky Survey Data Release 7. To exclude galaxies embedded in an intra-group or intra-cluster medium, which could enhance their observed X-ray luminosity, we used isolated elliptical galaxies.


Iterative integrated imputation for missing data and pathway models with applications to breast cancer subtypes

  • Linder, Henry;Zhang, Yuping
    • Communications for Statistical Applications and Methods / v.26 no.4 / pp.411-430 / 2019
  • Tumor development is driven by complex combinations of biological elements. Recent advances suggest that molecularly distinct subtypes of breast cancers may respond differently to pathway-targeted therapies. Thus, it is important to dissect pathway disturbances by integrating multiple molecular profiles, such as genetic, genomic and epigenomic data. However, missing data are often present in the -omic profiles of interest. Motivated by genomic data integration and imputation, we present a new statistical framework for pathway significance analysis. Specifically, we develop a new strategy for imputation of missing data in large-scale genomic studies, which adapts low-rank, structured matrix completion. Our iterative strategy enables us to impute missing data in complex configurations across multiple data platforms. In turn, we perform large-scale pathway analysis integrating gene expression, copy number, and methylation data. The advantages of the proposed statistical framework are demonstrated through simulations and real applications to breast cancer subtypes. We demonstrate superior power to identify pathway disturbances, compared with other imputation strategies. We also identify differential pathway activity across different breast tumor subtypes.

Bio-inspired neuro-symbolic approach to diagnostics of structures

  • Shoureshi, Rahmat A.;Schantz, Tracy;Lim, Sun W.
    • Smart Structures and Systems / v.7 no.3 / pp.229-240 / 2011
  • Recent developments in Smart Structures with very large-scale embedded sensors and actuators have introduced new challenges in data processing and sensor fusion. These smart structures are dynamically classified as large-scale systems, with thousands of sensors and actuators that form the musculoskeletal system of the structure, analogous to the human body. In order to develop structural health monitoring and diagnostics with data provided by thousands of sensors, new sensor informatics has to be developed. The focus of our ongoing research is to develop techniques and algorithms that utilize this musculoskeletal system effectively, thus creating the intelligence for such a large-scale autonomous structure. To achieve this level of intelligence, three major research tasks are being conducted: development of bio-inspired data analysis and information extraction from thousands of sensors; development of an analytical technique for an Optimal Sensory System using Structural Observability; and creation of a bio-inspired decision-making and control system. This paper focuses on the results of our effort on the first task, namely the development of a Neuro-Morphic Engineering approach that uses neuro-symbolic data manipulation, inspired by the understanding of the human information processing architecture, for sensor fusion and structural diagnostics.

Development of the Design Methodology for Large-scale Data Warehouse based on MongoDB

  • Lee, Junho;Joo, Kyungsoo
    • Journal of the Korea Society of Computer and Information / v.23 no.3 / pp.49-54 / 2018
  • A data warehouse is a system that collectively manages and integrates a company's data and provides the basis for decision making on management strategy. Analysis data volumes are now reaching a critical size that challenges traditional data warehousing approaches. Currently implemented solutions are mainly based on relational databases, which are no longer suited to these data volumes. NoSQL solutions allow us to consider new approaches to data warehousing, especially from the point of view of multidimensional data management. In this paper, we extend the star-schema data warehouse design methodology used with relational databases and develop a consistent design methodology, from information requirement analysis to data warehouse construction, for building large-scale data warehouses on MongoDB, a NoSQL database.
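One common way to carry a star schema over to a document store is to embed the referenced dimension rows inside each fact document. The sketch below illustrates that general idea with plain Python dictionaries; it is not the methodology of the paper, and the function name `embed_dimensions`, the `_id` key convention, and the sample data are all assumptions made for illustration.

```python
def embed_dimensions(fact_rows, dimensions):
    """Turn star-schema fact rows into self-contained documents.

    Assumed convention (hypothetical): a fact column named "<dim>_id" is a
    foreign key into dimensions["<dim>"], a dict keyed by dimension id.
    """
    documents = []
    for row in fact_rows:
        doc = {}
        for key, value in row.items():
            dim_name = key[:-3] if key.endswith("_id") else None
            if dim_name in dimensions:
                # Replace the foreign key with a copy of the dimension record.
                doc[dim_name] = dict(dimensions[dim_name][value])
            else:
                # Measures and other plain columns are copied unchanged.
                doc[key] = value
        documents.append(doc)
    return documents


# Hypothetical sample data: one fact table and two dimension tables.
dimensions = {
    "product": {1: {"name": "laptop", "category": "electronics"}},
    "store": {10: {"city": "Seoul"}},
}
sales_facts = [{"product_id": 1, "store_id": 10, "amount": 1200}]

sales_documents = embed_dimensions(sales_facts, dimensions)
```

Each resulting dictionary could be stored as a single document, which trades some duplication of dimension data for reads that need no joins.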

Integration of a Large-Scale Genetic Analysis Workbench Increases the Accessibility of a High-Performance Pathway-Based Analysis Method

  • Lee, Sungyoung;Park, Taesung
    • Genomics & Informatics / v.16 no.4 / pp.39.1-39.3 / 2018
  • The rapid increase in genetic dataset volume has demanded extensive adoption of biological knowledge to reduce computational complexity, and the biological pathway is one well-known source of such knowledge. In this regard, we have introduced a novel statistical method, PHARAOH, that enables pathway-based association studies of large-scale genetic datasets. However, researcher-level application of the PHARAOH method has been limited by a lack of support for commonly used file formats and by the absence of the quality control options that are essential to practical analysis. To overcome these limitations, we introduce our integration of the PHARAOH method into our recently developed all-in-one workbench. The new PHARAOH program not only supports various de facto standard genetic data formats but also provides many quality control measures and filters based on those measures. We expect that the updated PHARAOH will make pathway-level analysis of large-scale genetic datasets more accessible to researchers.

Analyzing Characteristic of Business District in Urban Area Using GIS Methods - Focused on Large-Scale Store and Traditional Market - (GIS 기법을 활용한 도시지역 상권 특성 분석 - 대형할인점과 전통시장을 중심으로 -)

  • SONG, Bong-Geun;PARK, Kyung-Hun
    • Journal of the Korean Association of Geographic Information Studies / v.20 no.2 / pp.89-101 / 2017
  • The study used GIS methods to analyze a business district consisting of traditional markets and large-scale stores, in order to determine the level of support needed for small enterprises in an urban area of Changwon-si, Gyeongsangnam-do. Data gathered on the area were analyzed using GIS tools such as kernel density analysis, network analysis, and Huff modeling. Traditional markets are concentrated in areas where large-scale stores are located, and the analyses show that the number of consumers using large-scale stores (157,071) was about 2.6 times that of traditional markets (59,953). One explanation for these results is that the large-scale stores are located either in densely populated areas or adjacent to the traditional markets. Therefore, standards and regulations are needed to support small enterprise business districts. The results of this study can serve as a reference for planning and supporting traditional market business districts.
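For readers unfamiliar with Huff modeling, the following is a minimal sketch of the standard Huff formulation, in which the probability that a consumer chooses a store is proportional to the store's attractiveness divided by a power of the distance to it. The exponents `alpha` and `beta`, the floor areas, and the distances are illustrative assumptions, not values from the study.

```python
def huff_probabilities(attractiveness, distances, alpha=1.0, beta=2.0):
    """Return the Huff choice probability for each store.

    attractiveness: e.g. floor area of each store (assumed measure)
    distances: distance or travel time from the consumer to each store
    alpha, beta: sensitivity exponents (illustrative defaults)
    """
    utilities = [
        (a ** alpha) / (d ** beta)
        for a, d in zip(attractiveness, distances)
    ]
    total = sum(utilities)
    # Normalise so the probabilities over all candidate stores sum to 1.
    return [u / total for u in utilities]


# Hypothetical consumer: a large store 2 km away and a small market 1 km away.
probs = huff_probabilities(attractiveness=[10000.0, 2000.0], distances=[2.0, 1.0])
```

With these made-up inputs the larger store still receives the higher probability, because its size outweighs the distance penalty.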

A Study on Efficiency Estimation of Aquaculture : the Case of the Korean Seaweed Farms (해조류 양식업 규모의 효율성 추정에 관한 연구 - 부산 기장지역 미역양식을 중심으로 -)

  • Seo, Ju-Nam;Song, Jung-Hun
    • The Journal of Fisheries Business Administration / v.40 no.1 / pp.1-26 / 2009
  • Aquaculture management gives more weight to maintaining household livelihoods than to profit maximization. As the aquaculture industry has developed, enterprise farms have appeared, and small-scale and large-scale farms now coexist. The features of this coexistence can be summarized as follows. First, large-scale farms show higher net profit, while small-scale farms show higher profit per hectare and a higher earning rate. Second, for farms over 2 ha, the earning rate remains stable as scale expands. Third, in terms of processing method, dried seaweed accounts for the largest share in small-scale farms, while raw seaweed accounts for the largest share in large-scale farms. Lastly, as farm scale increases, the participation rate of household labor rises. This thesis analyzes the efficiency of Korean seaweed farms using a DEA model and suggests improvements for efficiency management. The mean technical, pure technical, and scale efficiencies were measured to be 0.88, 0.96, and 0.91, respectively. Among the 20 farms included in the analysis, 10 were technically efficient and 12 were scale efficient. In conclusion, aquaculture farms are shown to be settling into a form of coexistence, driven by efforts to reduce costs in small-scale farms and to maximize profit in large-scale farms. Middle-scale farms, on the other hand, are inefficient compared with small-scale or large-scale farms. Therefore, to achieve efficiency, it is necessary either to realize economies of scale by extending farm size or to cut expenses by reducing farm area. In other words, farms of the same scale may need to pursue efficiency in different directions.
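In standard DEA, technical efficiency under constant returns to scale factors into pure technical efficiency (variable returns to scale) multiplied by scale efficiency. The reported means are roughly consistent with this relationship, since 0.96 × 0.91 ≈ 0.87. The sketch below derives scale efficiency from per-farm scores; the farm names and scores are hypothetical and are not taken from the study.

```python
TOLERANCE = 1e-9  # treat scores within this distance of 1.0 as efficient


def scale_efficiency(te_crs, te_vrs):
    """Scale efficiency = CRS technical efficiency / VRS (pure) technical efficiency."""
    if not 0.0 < te_crs <= te_vrs <= 1.0:
        raise ValueError("expected 0 < te_crs <= te_vrs <= 1")
    return te_crs / te_vrs


# Hypothetical (technical, pure technical) efficiency scores per farm.
farms = {
    "farm_a": (1.00, 1.00),
    "farm_b": (0.72, 0.90),
    "farm_c": (0.80, 1.00),
}

report = {}
for name, (te, pte) in farms.items():
    se = scale_efficiency(te, pte)
    report[name] = {
        "scale_efficiency": se,
        "technically_efficient": te >= 1.0 - TOLERANCE,
        "scale_efficient": se >= 1.0 - TOLERANCE,
    }
```

In this made-up data, `farm_c` is efficient in the pure technical sense but not scale efficient, which is the kind of farm for which changing size, rather than changing practices, would be the relevant remedy.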


Computational analysis of large-scale genome expression data

  • Zhang, Michael
    • Proceedings of the Korean Society for Bioinformatics Conference / 2000.11a / pp.41-44 / 2000
  • With the advent of DNA microarray and "chip" technologies, gene expression in an organism can be monitored on a genomic scale, allowing the transcription levels of many genes to be measured simultaneously. Functional interpretation of massive expression data and linking such data to DNA sequences have become new challenges for bioinformatics. I will use yeast cell cycle expression data analysis as an example to demonstrate how special databases and computational methods may be used to extract functional information. I will also briefly describe a novel clustering algorithm that has been applied to the cell cycle data.


High-Resolution Tiled Display System for Visualization of Large-scale Analysis Data (초대형 해석 결과의 분석을 위한 고해상도 타일 가시화 시스템 개발)

  • 김홍성;조진연;양진오
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.34 no.6 / pp.67-74 / 2006
  • In this paper, a tiled display system is developed to obtain high-resolution images when visualizing large-scale structural analysis data, using low-resolution display devices and a low-cost cluster computer system. On the hardware side, several crucial points are investigated, and a new beam-projector positioner is designed and manufactured to resolve the keystone effect, which distorts the image. In the tiled display software, Qt and OpenGL are used for the GUI and rendering, respectively. To obtain the entire tiled image, LAM-MPI is used to synchronize the sub-images produced by each cluster computer node.
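The core bookkeeping in a tiled display is deciding which part of the full image each node renders. The sketch below splits a full-resolution frame into a grid of pixel rectangles, one per node. It is a generic illustration rather than the system's actual code; the function name, the grid layout, and the resolution are assumptions.

```python
def tile_viewports(width, height, cols, rows):
    """Split a width x height frame into cols x rows pixel rectangles.

    Returns a list of (x, y, w, h) tuples in row-major order. Integer
    division is applied to the tile boundaries, so the tiles cover the
    frame exactly even when the size is not divisible by the grid.
    """
    tiles = []
    for r in range(rows):
        y0 = r * height // rows
        y1 = (r + 1) * height // rows
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            tiles.append((x0, y0, x1 - x0, y1 - y0))
    return tiles


# Hypothetical 2 x 2 wall of projectors forming a 2048 x 1536 image.
viewports = tile_viewports(2048, 1536, cols=2, rows=2)
```

Each node would render only its own rectangle, and the nodes would then need to be synchronized so that all sub-images show the same frame.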

A Workflow Execution System for Analyzing Large-scale Astronomy Data on Virtualized Computing Environments

  • Yu, Jung-Lok;Jin, Du-Seok;Yeo, Il-Yeon;Yoon, Hee-Jun
    • International Journal of Contents / v.16 no.4 / pp.16-25 / 2020
  • The size of observation data in astronomy has been increasing exponentially with the advent of wide-field optical telescopes. This calls for changes to the way large-scale astronomy data are analyzed. However, the complexity of analysis tools and the limited extensibility of computing environments make handling such huge observation data difficult and inefficient. To address this problem, this paper proposes a workflow execution system for analyzing large-scale astronomy data efficiently. The proposed system is composed of two parts: 1) a workflow execution manager and its RESTful endpoints, which automate and control data analysis tasks based on workflow templates, and 2) an elastic resource manager, an underlying mechanism that dynamically adds and removes virtualized computing resources (i.e., virtual machines) according to the analysis requests. To realize the workflow execution system, we implement it on a testbed using the OpenStack IaaS (Infrastructure as a Service) toolkit and the HTCondor workload manager. We also perform a broad range of experiments with different resource allocation patterns, system loads, etc. to show the effectiveness of the proposed system. The results show that the resource allocation mechanism works properly according to the number of queued and running tasks, which improves resource utilization, and that the workflow execution manager can handle more than 1,000 concurrent requests within a second with reasonable average response times. Finally, we describe a case study of a data reduction system as an example application of the workflow execution system.
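The abstract says that resources are added or removed according to the number of queued and running tasks. A minimal sketch of one such scaling rule follows. The policy itself, and the parameters `slots_per_vm`, `min_vms`, and `max_vms`, are assumptions made for illustration; the sketch does not call OpenStack or HTCondor and is not the paper's mechanism.

```python
import math


def desired_vm_count(queued, running, slots_per_vm=4, min_vms=1, max_vms=16):
    """Number of VMs needed to give every queued or running task a slot."""
    needed = math.ceil((queued + running) / slots_per_vm)
    # Clamp to the pool limits so the system never scales to zero or unbounded.
    return max(min_vms, min(max_vms, needed))


def scaling_action(current_vms, queued, running, **policy):
    """Positive result: VMs to add. Negative result: VMs to remove."""
    return desired_vm_count(queued, running, **policy) - current_vms


# Hypothetical snapshot: 2 VMs up, 10 tasks queued, 3 running.
delta = scaling_action(current_vms=2, queued=10, running=3)
```

A real resource manager would typically also add a cooldown or hysteresis so that short bursts in the queue do not cause VMs to be created and destroyed repeatedly.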