• Title/Summary/Keyword: Large-scale Analysis Data

Informatics for protein identification by tandem mass spectrometry; Focused on two most-widely applied algorithms, Mascot and SEQUEST

  • Sohn, Chang-Ho;Jung, Jin-Woo;Kang, Gum-Yong;Kim, Kwang-Pyo
    • Bioinformatics and Biosystems / v.1 no.2 / pp.89-94 / 2006
  • Mass spectrometry (MS) is widely applied in high-throughput proteomics analysis. Large-scale proteome analysis experiments generate massive amounts of data. To search these proteomics data against protein databases, fully automated database search algorithms such as Mascot and SEQUEST are routinely employed. At present, it is critical to reduce false positives and false negatives during such analysis. This review focuses on automated protein identification using tandem mass spectrometry (MS/MS) spectra and on validating the protein identifications made by the two most common automated protein identification algorithms, Mascot and SEQUEST.

Clothing Purchase Behavior according to Consumer Self-Confidence (소비자 자신감에 따른 의복구매행동)

  • Jeon, Kyung-Sook
    • Journal of the Korean Home Economics Association / v.45 no.6 / pp.1-9 / 2007
  • Although self-confidence is a personal trait, it acts as a behavioral factor in consumer behavior. This study investigated the influence of consumer self-confidence on clothing purchase behavior. A total of 284 responses were analyzed after collecting questionnaires from college students in Seoul using convenience sampling. For data analysis, the chi-square test, analysis of variance, reliability test, and factor analysis were performed with the SPSSWIN program. The results were as follows. First, the place of clothing purchase was affected by the consumer's level of self-confidence. More confident consumers preferred internet shopping and Dongdaemun market to large-scale shops, while less confident consumers chose discount stores. Second, information search was one of the main reasons more confident consumers visited internet shopping malls. Third, more confident consumers showed a higher level of clothing involvement than less confident consumers. Finally, unplanned purchases, such as pure impulse buying and reminder buying, were more likely among more confident consumers, with fewer purchase conflicts.

Comparative Analysis on Unit Price based on Historical Cost Data Estimating for Large and Small-scale Civil Engineering Works (대·소규모 토목공사의 실적공사비 비교 분석)

  • Hong, Sung Ho;Shin, Juyeoul;Kim, Chang Hak;Lee, Dong Wook
    • KSCE Journal of Civil and Environmental Engineering Research / v.33 no.4 / pp.1707-1718 / 2013
  • The historical cost data estimating system was introduced to the construction industry in 2004. Based on the contract prices of past projects, this system estimates construction cost by work type. The Korea Institute of Construction Technology (KICT) announces the historical cost data twice a year. The unit price of a small construction project is higher than that of a large one because of higher production cost per work unit, equipment cost, labor cost, and so on. However, the historical cost data estimating system is applied uniformly to project estimation regardless of project size. This study compared and analyzed the historical cost data of large and small construction projects to identify this problem with the historical cost data estimating system. The study found that the unit price of a small construction project is 21.8% higher than that of a large construction project.

Rainstorm Tracking Using Statistical Analysis Method (통계적 기법을 이용한 국지성집중호우의 이동경로 분석)

  • Kim Sooyoung;Nam Woo-Sung;Heo Jun-Haeng
    • Proceedings of the Korea Water Resources Association Conference / 2005.05b / pp.194-198 / 2005
  • Although rainstorms cause large-scale local damage, it is difficult to predict their movement exactly. To reduce rainstorm damage, it is necessary to analyze the path of a rainstorm using various statistical methods. In addition, an efficient time interval of rainfall observation for analyzing rainstorm movement can be derived by applying various statistical methods to rainfall data. In this study, rainstorm tracking using statistical methods is performed for various types of rainfall data. For the tracking, the methods of temporal distribution, inclined plane equations, and cross correlation were applied to various types of data, including electromagnetic rainfall gauge data and AWS data. The speed and direction obtained by each method were compared with those of the real rainfall movement. The effective time interval of rainfall observation for analyzing rainstorm movement was also investigated for the selected time intervals of 10, 20, 30, 40, 50, and 60 minutes. As a result, the absolute relative errors of the inclined plane equations method are smaller than those of the other methods for electromagnetic rainfall gauge data, and the absolute relative errors of the cross correlation method are smaller than those of the other methods for AWS data. The absolute relative errors for intervals of 30 minutes or less are smaller than those for the other time intervals.
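
As a rough illustration of the cross correlation idea mentioned in this abstract, the sketch below estimates the time lag between rainfall series from two gauges and converts it to a travel speed. It is a generic lag estimate, not the authors' procedure; the function names, the gauge spacing, and the rainfall values are hypothetical and chosen only for the example.

```python
import numpy as np

def estimate_lag(upstream, downstream):
    """Return the lag (in samples) at which `downstream` best matches `upstream`.

    A positive lag means the downstream gauge sees the rainfall later.
    """
    a = np.asarray(upstream, dtype=float)
    b = np.asarray(downstream, dtype=float)
    # Remove the mean so the peak reflects shape, not the overall rainfall level.
    a = a - a.mean()
    b = b - b.mean()
    # Full cross-correlation: output index i corresponds to lag i - (len(a) - 1).
    corr = np.correlate(b, a, mode="full")
    lags = np.arange(-(len(a) - 1), len(b))
    return int(lags[np.argmax(corr)])

def storm_speed_kmh(distance_km, lag_samples, interval_min):
    """Convert a lag between two gauges into a speed (assumes straight-line travel)."""
    if lag_samples == 0:
        raise ValueError("zero lag: speed cannot be resolved at this time interval")
    hours = abs(lag_samples) * interval_min / 60.0
    return distance_km / hours

# Hypothetical 10-minute rainfall depths (mm) at two gauges 10 km apart.
gauge_a = [0, 0, 1, 4, 2, 0, 0, 0, 0, 0, 0, 0]
gauge_b = [0, 0, 0, 0, 0, 1, 4, 2, 0, 0, 0, 0]

lag = estimate_lag(gauge_a, gauge_b)
speed = storm_speed_kmh(distance_km=10.0, lag_samples=lag, interval_min=10)
print(lag, speed)
```

The lag can only be resolved in whole observation intervals, which is one reason the choice of time interval matters for this kind of tracking.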

Analysis of the Effects of Population, Household, and Housing Characteristics on the Status of Empty Houses Using Population Housing Census Data (인구주택 총조사 자료를 이용한 인구, 가구, 주택 특성과 빈집 현황 분석)

  • Lee, Jimin;Choi, Won
    • Journal of The Korean Society of Agricultural Engineers / v.62 no.5 / pp.1-13 / 2020
  • The problem of empty houses is important for local revitalization and sustainability, and the phenomenon is caused by various regional factors. Population and housing census data are the most effective data available for studying this phenomenon at the level of small regions. In this study, logistic regression and multiple regression analysis were performed on population and housing census data to understand the effects of population, household, and housing characteristics on empty houses. The scale and direction of the effect of each characteristic were also compared across large cities, small cities, and rural areas. The results showed a slight difference between city and province regions in the district and housing characteristic variables. In the comparison at the Eup-Myeon-Dong level, the influential variables differed between Dong and Myeon areas. The significance of this study lies in examining the effect of population and housing characteristics on vacant houses and in confirming that the influencing factors differ by region.

Auto Regulated Data Provisioning Scheme with Adaptive Buffer Resilience Control on Federated Clouds

  • Kim, Byungsang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.11 / pp.5271-5289 / 2016
  • On large-scale data analysis platforms deployed on cloud infrastructures over the Internet, the instability of the data transfer time and the dynamics of the processing rate require a more sophisticated data distribution scheme, one that maximizes parallel efficiency by balancing the load among the participating computing elements and by eliminating the idle time of each computing element. In particular, under real-time constraints and a limited data buffer (in-memory storage), a more controllable mechanism is needed to prevent both overflow and underflow of the finite buffer. In this paper, we propose an auto-regulated data provisioning model based on a receiver-driven data pull model. On this model, we provide a synchronized data replenishment mechanism that implicitly avoids data buffer overflow and explicitly regulates data buffer underflow by adequately adjusting the buffer resilience. To estimate the optimal size of the buffer resilience, we exploit an adaptive buffer resilience control scheme that minimizes both the data buffer space and the idle time of the processing elements, based on directly measured sample path analysis. The simulation results show that the proposed scheme provides an acceptable approximation compared to the numerical results. It is also efficient enough to apply in dynamic environments where the stochastic characteristics of the data transfer time or the data processing rate cannot be postulated, or even where both fluctuate.

Extraction of Environmental Informations for Reclaimed Area using Satellite Image Data (인공위성데이타를 이용한 간척지역의 환경정보의 추출)

  • 안철호;김용일;이창노
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.7 no.1 / pp.49-57 / 1989
  • In this study, we performed land use classification using Landsat data acquired before and after reclamation, and extracted the ground temperature from infrared band (TM band 6) data. Using the satellite data, it was possible to extract land use changes caused by the reclamation effectively, and to obtain the thermal characteristics of the reclaimed area and its surroundings by converting infrared data values into surface temperatures of ground and water. The results of this analysis will be used for the land management of large-scale reclaimed areas, applying the satellite data and related information.

Predictive Analysis of Financial Fraud Detection using Azure and Spark ML

  • Priyanka Purushu;Niklas Melcher;Bhagyashree Bhagwat;Jongwook Woo
    • Asia pacific journal of information systems
    • / v.28 no.4 / pp.308-319 / 2018
  • This paper aims to provide valuable insights into financial fraud detection in mobile money transaction activity. We predict and classify transactions as normal or fraudulent with a small sample and a massive data set using Azure and Spark ML, which represent traditional systems and Big Data, respectively. Experimenting with the sample data set in Azure, we found that the Decision Forest model is the most accurate in terms of the recall value. For the massive data set using Spark ML, the Random Forest classifier proved to be the best algorithm. We show that the Spark cluster builds and evaluates models much faster as more servers are added to the cluster, with the same accuracy, which suggests that large-scale data sets can be used for prediction on a Big Data platform. Finally, we reached a recall score of 0.73, which implies satisfactory quality in predicting fraudulent transactions.
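
For readers unfamiliar with the recall metric cited in this abstract, the following minimal sketch shows how recall is computed for a binary fraud classifier: true positives divided by all actual positives. The labels are invented for illustration and are not data from the paper; the function name is hypothetical.

```python
def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN): the share of actual positives that were caught."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("no positive examples in y_true; recall is undefined")
    return true_pos / (true_pos + false_neg)

# Made-up labels: 1 = fraudulent transaction, 0 = normal transaction.
actual    = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 1]

print(recall(actual, predicted))  # 3 of 4 fraudulent transactions caught -> 0.75
```

Recall ignores false alarms on normal transactions, so it is usually read alongside precision when judging a fraud detector.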

A framework for distributed analytical and hybrid simulations

  • Kwon, Oh-Sung;Elnashai, Amr S.;Spencer, Billie F.
    • Structural Engineering and Mechanics / v.30 no.3 / pp.331-350 / 2008
  • A framework for multi-platform analytical and multi-component hybrid (testing-analysis) simulations is described in this paper and illustrated with several application examples. The framework allows the integration of various analytical platforms and geographically distributed experimental facilities into a comprehensive pseudo-dynamic hybrid simulation. The object-oriented architecture of the framework enables easy inclusion of new analysis platforms or experimental models, and the addition of a multitude of auxiliary components, such as data acquisition and camera control. Four application examples are given: (i) multi-platform analysis of a bridge with soil and structural models, (ii) multi-platform, multi-resolution analysis of a high-rise building, (iii) three-site small-scale frame hybrid simulation, and (iv) three-site large-scale bridge hybrid simulation. These simulations serve as illustrative examples of collaborative research among geographically distributed researchers employing different analysis platforms and testing equipment. The versatility of the framework, the ease of including additional modules, and the wide application potential demonstrated in the paper provide a rich research environment for structural and geotechnical engineering.

Study on the Thermal Property and Aging Prediction for Pressable Plastic Bonded Explosives through ARC(Heat-wait-search method) & Isothermal Conditions (ARC(Heat-wait-search method)와 Isothermal 조건을 이용한 압축형 복합화약의 열적 특성 및 노화 예측 연구)

  • Lee, Sojung;Kim, Seunghee;Kwon, Kuktae;Jeon, Yeongjin
    • Journal of the Korean Society of Propulsion Engineers / v.22 no.4 / pp.55-60 / 2018
  • Thermal properties are among the most important characteristics in the field of energetic materials. Because energetic materials release decomposition heat, differential scanning calorimetry (DSC) is frequently used for thermal analysis. However, thermodynamic events such as melting can interfere with DSC kinetic analysis. In this study, we use the isothermal mode for DSC measurement to avoid such thermodynamic issues. We also merge accelerating rate calorimetry (ARC) data with DSC data to obtain robust prediction results for both small-scale and large-scale samples. For the thermal property prediction, Advanced Kinetics and Technology Solutions (AKTS) programs are used.