• Title/Summary/Keyword: Data-based analysis

Comparison of Distributed and Parallel NGS Data Analysis Methods based on Cloud Computing

  • Kang, Hyungil;Kim, Sangsoo
    • International Journal of Contents
    • /
    • v.14 no.1
    • /
    • pp.34-38
    • /
    • 2018
  • With the rapid growth of genomic data, new requirements have emerged that are difficult to handle with big data storage and analysis techniques. Regardless of its size, an organization performing genomic data analysis finds it increasingly difficult to build its own computing environment for storing and analyzing genomic data. Recently, cloud computing has emerged as a computing environment that meets these new requirements. In this paper, we analyze and compare existing distributed and parallel NGS (Next Generation Sequencing) analysis methods based on cloud computing environments, as groundwork for future research.

Study of Digital Analysis Efficiency through a Complexity Analysis (복잡성 분석을 통한 디지털 분석의 유효성에 관한 연구)

  • 이혁준;이종석
    • Korean Institute of Interior Design Journal
    • /
    • no.31
    • /
    • pp.56-63
    • /
    • 2002
  • This study aims to develop a system that applies digital techniques to analyze the complexity of architectural forms, visualized through correlations based on a distribution chart built from profile lines. The profile lines are derived from an edge analysis of the architectural forms and simplified according to visual theory. The study was conducted as follows. First, problems with existing models for elevation analysis were examined, together with formal analysis based on visual recognition, in order to consider the profile lines derived from the forms. Second, in the elevation analysis, profile lines were derived by a digital method to measure them qualitatively. To verify the objectivity of the measured values, a survey was conducted using the adjective cataloging method, and the correlation between the survey results and the analyzed data was examined to verify the validity of the derived data. Third, remedies for the problems identified in the experiments were suggested, along with the possibility of using the method in design. The digital method has many advantages over conventional analysis systems because it derives precise data values by excluding subjectivity. It also allows various analytical methods to be applied repeatedly to large amounts of data. Diversified models and analysis methods that consider the many factors arising in the design process remain topics for future research.

The Audio Signal Classification System Using Contents Based Analysis

  • Lee, Kwang-Seok;Kim, Young-Sub;Han, Hag-Yong;Hur, Kang-In
    • Journal of information and communication convergence engineering
    • /
    • v.5 no.3
    • /
    • pp.245-248
    • /
    • 2007
  • In this paper, we study content-based analysis and classification of audio data, based on the composition of a feature parameter database, in order to implement an audio data indexing and search system. Audio data are classified into various primitive auditory types. We describe the analysis and feature extraction methods for the feature parameters available for audio data classification. We compose the feature parameter database in units of index groups, then compare and analyze the audio data across the audio categories, focusing on the inclusion level and the index criterion. Based on this result, we compose feature vectors of audio data according to the classification categories and simulate classification using a discriminant function.

Web-based DNA Microarray Data Analysis Tool

  • Ryu, Ki-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.17 no.4
    • /
    • pp.1161-1167
    • /
    • 2006
  • Since microarray data structures are varied and complex, the data are generally stored in databases so that they can be accessed and managed effectively. However, the data are difficult to analyze and manage when they are stored in several different database management systems. Existing analysis tools for DNA microarray data suffer from complicated instructions, dependency on data types and operating systems, and high cost, among other problems. In this paper, we design and implement a web-based analysis tool for obtaining useful information from DNA microarray data. With this tool, DNA microarray data can be analyzed effectively without special knowledge of, or training in, data types and analytical methods.

Development of the Design Methodology for Large-scale Data Warehouse based on MongoDB

  • Lee, Junho;Joo, Kyungsoo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.23 no.3
    • /
    • pp.49-54
    • /
    • 2018
  • A data warehouse is a system that collectively manages and integrates a company's data and provides the basis for decision making on management strategy. Analysis data volumes are now reaching a critical size that challenges traditional data warehousing approaches. Currently implemented solutions are mainly based on relational databases, which are no longer suited to these data volumes. NoSQL solutions allow us to consider new approaches to data warehousing, especially from the point of view of multidimensional data management. In this paper, we extend the data warehouse design methodology based on relational databases using a star schema, and develop a consistent design methodology, from information requirement analysis to data warehouse construction, for building large-scale data warehouses on MongoDB, a NoSQL database.
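The abstract does not spell out how the star schema is carried over to MongoDB. As a minimal sketch of one common approach (embedding the referenced dimension rows inside each fact document), the plain-Python example below may help; the table names, field names, and the `fact_to_document` helper are hypothetical and are not taken from the paper.

```python
from typing import Any

# Hypothetical star schema: two small dimension tables keyed by surrogate id.
DIM_DATE = {1: {"year": 2018, "quarter": 1}}
DIM_PRODUCT = {10: {"name": "widget", "category": "tools"}}


def fact_to_document(fact: dict[str, Any]) -> dict[str, Any]:
    """Embed the dimension rows referenced by a fact row into one nested document.

    This is an assumed denormalization strategy, not the paper's methodology.
    """
    return {
        "measures": {"quantity": fact["quantity"], "amount": fact["amount"]},
        "date": DIM_DATE[fact["date_id"]],
        "product": DIM_PRODUCT[fact["product_id"]],
    }


# One row of a hypothetical sales fact table.
sale = {"date_id": 1, "product_id": 10, "quantity": 3, "amount": 29.97}
doc = fact_to_document(sale)
```

The resulting `doc` is an ordinary nested dictionary, which is the shape a document store such as MongoDB accepts; embedding trades storage for the ability to answer multidimensional queries without joins.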

Development of web-based system for dynamic statistical analysis of clinical data (웹기반 임상자료의 동적 통계분석 시스템 개발)

  • Shin, Im Hee;Kwak, Sang Gyu;Park, Jun Woo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.1
    • /
    • pp.27-36
    • /
    • 2014
  • Statistical analysis provides information that can be used to reach final decisions in many fields. However, statistical analysis programs for PCs (personal computers) are still restricted by time and space. To reduce this limitation, server-based PC statistical analysis programs that use the internet, as well as web-based systems that allow statistical analysis, have been continually developed. However, current web-based analysis systems are limited to data saved on the server. Modified or newly inserted data must go through a server administrator before it can be used in analysis. To solve this problem, we have developed a web-based system using HTML, Java, and JSP scripts that incorporates dynamic data with few restrictions.

Ranking subjects based on paired compositional data with application to age-related hearing loss subtyping

  • Nam, Jin Hyun;Khatiwada, Aastha;Matthews, Lois J.;Schulte, Bradley A.;Dubno, Judy R.;Chung, Dongjun
    • Communications for Statistical Applications and Methods
    • /
    • v.27 no.2
    • /
    • pp.225-239
    • /
    • 2020
  • Analysis approaches for single compositional data are well established; however, effective analysis strategies for paired compositional data remain to be investigated. The current project was motivated by studies of age-related hearing loss (presbyacusis), where subjects are classified into four audiometric phenotypes that need to be ranked within these phenotypes based on their paired compositional data. We address this challenge by formulating this problem as a classification problem and integrating a penalized multinomial logistic regression model with compositional data analysis approaches. We utilize Elastic Net for a penalty function, while considering average, absolute difference, and perturbation operators for compositional data. We applied the proposed approach to the presbyacusis study of 532 subjects with probabilities that each ear of a subject belongs to each of four presbyacusis subtypes. We further investigated the ranking of presbyacusis subjects using the proposed approach based on previous literature. The data analysis results indicate that the proposed approach is effective for ranking subjects based on paired compositional data.
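The operators named in the abstract can be sketched in a few lines. The example below assumes the standard Aitchison definition of perturbation (component-wise product followed by closure) and the usual Elastic Net penalty parametrization; the paper's exact definitions may differ, and the function names and the sample left-ear and right-ear compositions are hypothetical.

```python
def closure(x):
    """Rescale non-negative parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]


def perturb(x, y):
    """Aitchison perturbation: component-wise product, then closure."""
    return closure([a * b for a, b in zip(x, y)])


def average(x, y):
    """Component-wise mean of two compositions."""
    return [(a + b) / 2 for a, b in zip(x, y)]


def abs_difference(x, y):
    """Component-wise absolute difference (the result is not a composition)."""
    return [abs(a - b) for a, b in zip(x, y)]


def elastic_net_penalty(beta, lam, alpha):
    """lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_squared)


# Hypothetical subtype probabilities for the two ears of one subject.
left = [0.5, 0.3, 0.1, 0.1]
right = [0.4, 0.4, 0.1, 0.1]
combined = perturb(left, right)
```

Each operator turns a pair of four-part compositions into one feature vector, which could then be passed to a penalized multinomial logistic regression model.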

Identification of sentiment keywords association-based hotel network of hotel review using mapper method in topological data analysis (Topological Data Analysis 기법을 활용한 호텔 리뷰데이터의 감성 키워드 기반 호텔 관계망 구축)

  • Jeon, Ye-Seul;Kim, Jeong-Jae
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.1
    • /
    • pp.75-86
    • /
    • 2020
  • Various information can be extracted from hotel review data, including the purchasing factors that lead to consumption and the advantages and disadvantages of hotels. In particular, the sentiment keywords in review data help consumers understand the pros and cons of hotels. However, it is not efficient for consumers to read a large number of reviews, so it is necessary to offer customers a summary. In this study, we suggest providing summary information on sentiment keyword associations as well as a network of hotels based on sentiment keywords. Using a sentiment keyword dictionary, the extracted sentiment keyword associations are used to construct the hotel network through the mapper method of topological data analysis. This hotel network allows a consumer to find hotels associated with specific sentiment keywords and also recommends related hotels. This summary information provides users with a summarized emotional assessment of hotels and helps hotel marketing teams understand consumers' perceptions of their hotels.

A Test Data Generation Tool based on Inter-Relation of Fields in the Menu Structure (메뉴 구조의 필드간의 상호 연관관계를 기반으로 한 테스트 데이타 자동 생성 도구)

  • 이윤정;최병주
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.9 no.2
    • /
    • pp.123-132
    • /
    • 2003
  • The quality certification test is usually conducted by a certifying organization to determine and guarantee the quality of software after the software development phase, commonly without the actual source code, by checking the software against the product's manual. In this paper, we implement a manual-based automatic test data generation tool, MaT, which applies a manual-based test technique to automate the generation of test data from analysis data of the software package and its manual. The input data of MaT are the results of analyzing the software and the manual. We propose a 'menu-based test analysis model' to generate the input data. We believe that the proposed technique and tool help improve the quality and reliability of software.

UNCERTAINTY ANALYSIS OF DATA-BASED MODELS FOR ESTIMATING COLLAPSE MOMENTS OF WALL-THINNED PIPE BENDS AND ELBOWS

  • Kim, Dong-Su;Kim, Ju-Hyun;Na, Man-Gyun;Kim, Jin-Weon
    • Nuclear Engineering and Technology
    • /
    • v.44 no.3
    • /
    • pp.323-330
    • /
    • 2012
  • The development of data-based models requires uncertainty analysis to explain the accuracy of their predictions. In this paper, an uncertainty analysis of the support vector regression (SVR) model, which is a data-based model, was performed because previous research showed that the SVR method accurately estimates the collapse moments of wall-thinned pipe bends and elbows. The uncertainty analysis method used in this study was an analytic uncertainty analysis method, and estimates with a 95% confidence interval were obtained for 370 test data points. From the results, the prediction interval (PI) was very narrow, which means that the predicted values are quite accurate. Therefore, the proposed SVR method can be used effectively to assess and validate the integrity of the wall-thinned pipe bends and elbows.