• Title/Summary/Keyword: Map Reduce

Design and Implementation of a PCR Primer Search System on Cloud Computing Environments (클라우드 컴퓨팅 환경에서 PCR Primer 검색 시스템 설계 및 개발)

  • Park, Junho;Lim, Jongtae;Kim, Dongjoo;Lee, Yunjeong;Ryu, Eunkyung;Ahn, Minje;Cha, Jaehong;Yu, Seok Jong;Yoo, Jaesoo
    • Proceedings of the Korea Contents Association Conference / 2012.05a / pp.269-270 / 2012
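  • The design of accurate PCR primers for gene amplification is a core enabling technology. Earlier work proposed tools that can design specific PCR primers for individual genes, but those tools were not suited to large-scale design work that draws on genome information. This paper designs and implements a system that can design and search for specific PCR primers against large-scale genomes in a cloud computing environment. The proposed system is designed and implemented on the MapReduce framework of the Hadoop platform so that gene sequence searches can be carried out at scale. In a performance evaluation using 50,000 queries, the proposed method showed about a threefold performance improvement over the existing BLAST-based search method.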

Development of Clustering Algorithm based on Massive Network Compression (대용량 네트워크 압축 기반 클러스터링 알고리즘 개발)

  • Seo, Dongmin;Yu, Seok Jong;Lee, Min-Ho
    • Proceedings of the Korea Contents Association Conference / 2016.05a / pp.53-54 / 2016
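  • Big data refers to technology that extracts valuable information by using and analyzing large volumes of data and, on that basis, derives countermeasures or predicts change. The data used in big data analysis, such as social data from Facebook, biological data such as gene expression, and geographic data such as airline networks, are organized as large-scale networks. Network clustering groups data with similar characteristics within a network into the same cluster and is widely used to analyze network data and understand its properties. As big data has come into use in many fields, vast amounts of network data are being generated, and clustering techniques that process large-scale network data efficiently are becoming more important. The MCL (Markov Clustering) algorithm is a flow-based unsupervised clustering algorithm that scales well and is used in many fields. However, on large-scale networks MCL requires many clustering operations and generates too many clusters. This paper proposes a clustering algorithm based on network compression that improves on MCL in both clustering speed and accuracy. In addition, applying the CSC (Compressed Sparse Column) data structure, which stores sparse matrices efficiently, and the MapReduce technique to the proposed algorithm improves clustering speed on large-scale networks.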
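
The abstract above names the CSC (Compressed Sparse Column) layout and MCL without giving detail. As a minimal sketch, assuming a small dense input and the textbook MCL inflation step (element-wise power followed by column normalization), the Python below builds the three CSC arrays and inflates them column by column. The function names `dense_to_csc` and `inflate` and the example matrix are illustrative assumptions, not taken from the paper; the paper's network compression and MapReduce steps are not shown.

```python
def dense_to_csc(matrix):
    """Convert a dense row-major matrix into CSC arrays.

    Returns (data, row_indices, col_ptr): the non-zero values stored column
    by column, the row of each value, and the offset where each column starts.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    data, row_indices, col_ptr = [], [], [0]
    for j in range(n_cols):
        for i in range(n_rows):
            value = matrix[i][j]
            if value != 0:
                data.append(value)
                row_indices.append(i)
        col_ptr.append(len(data))  # end of column j, start of column j + 1
    return data, row_indices, col_ptr


def inflate(data, col_ptr, power):
    """Textbook MCL inflation: raise entries to `power`, then normalize columns.

    Each column is a contiguous slice of `data` in CSC, so the
    normalization touches only the stored non-zeros.
    """
    out = [value ** power for value in data]
    for j in range(len(col_ptr) - 1):
        start, end = col_ptr[j], col_ptr[j + 1]
        total = sum(out[start:end])
        if total > 0:
            for k in range(start, end):
                out[k] /= total
    return out


# Hypothetical 3x3 adjacency-like matrix used only for illustration.
example = [
    [1, 0, 2],
    [0, 0, 3],
    [4, 0, 0],
]
data, row_indices, col_ptr = dense_to_csc(example)
inflated = inflate(data, col_ptr, power=2)
```

Because empty columns occupy no space in `data` (two consecutive equal entries in `col_ptr`), storage grows with the number of edges and not with the square of the number of nodes, which is what makes the layout attractive for large sparse networks.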

An Improved Snake Algorithm Using Local Curvature (부분 곡률을 이용한 개선된 스네이크 알고리즘)

  • Lee, Jung-Ho;Choi, Wan-Sok;Jang, Jong-Whan
    • The KIPS Transactions:PartB / v.15B no.6 / pp.501-506 / 2008
  • The classical snake algorithm has difficulty detecting the boundary of an object with deep concavities. While the GVF method can successfully detect boundary concavities, it spends a lot of time computing the energy map. In this paper, we propose an algorithm that reduces the computation time and improves performance in detecting the boundary of an object with high concavity. We define the degree of complexity of the object boundary as the local curvature. If the local curvature exceeds a threshold value, new snake points are added. Simulation results on several different test images show that our method performs well in detecting object boundaries and requires less computation time.
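
The point-insertion rule in the abstract above can be sketched as follows. This is a rough illustration only: it assumes that local curvature is approximated by the absolute turning angle at each contour point and that new points are placed at the midpoints of the adjacent segments. Neither choice is stated in the abstract, and the names `turning_angle` and `refine_contour` are hypothetical.

```python
import math


def turning_angle(prev_pt, pt, next_pt):
    """Absolute angle (radians) between the incoming and outgoing segments."""
    ax, ay = pt[0] - prev_pt[0], pt[1] - prev_pt[1]
    bx, by = next_pt[0] - pt[0], next_pt[1] - pt[1]
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return abs(math.atan2(cross, dot))


def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)


def refine_contour(points, threshold):
    """Insert a midpoint on every segment that touches a high-curvature point.

    `points` is a closed contour given as a list of (x, y) tuples.
    The turning angle stands in for local curvature (an assumption).
    """
    n = len(points)
    high = [
        turning_angle(points[i - 1], points[i], points[(i + 1) % n]) > threshold
        for i in range(n)
    ]
    refined = []
    for i in range(n):
        refined.append(points[i])
        nxt = (i + 1) % n
        if high[i] or high[nxt]:
            refined.append(midpoint(points[i], points[nxt]))
    return refined


# Hypothetical contour: a square whose corners each turn by pi / 2.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
dense = refine_contour(square, threshold=1.0)      # corners exceed the threshold
unchanged = refine_contour(square, threshold=2.0)  # no corner exceeds it
```

Points are added only where the contour bends sharply, so flat stretches keep few snake points and the per-iteration cost stays low there.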

Problems and Countermeasures in Applying of Toyota Production System (도요타 생산방식의 도입적용상 문제점과 대응방안)

  • Park, Jin-Je;Lee, Dong-Hyung
    • Journal of Korean Society of Industrial and Systems Engineering / v.38 no.1 / pp.152-161 / 2015
  • Until recently, many domestic companies introduced the Toyota Production System (TPS) to remove waste and reduce manufacturing costs. However, few cases show substantial and effective improvement after its introduction. Even though many companies actively practiced TPS during that time, the outcomes have not been satisfactory. In this paper, we present the problems and the core points to consider when applying TPS. First, above all, an innovative organizational culture formed by the active participation of employees and the leadership of the CEO is very important for a successful introduction of TPS. Second, it is necessary to prepare various training programs optimized for the field in order to continuously improve the competency of employees at each level, and to train skilled personnel through those programs. Third, it is necessary to raise the maturity level of TPS application by building a sound system for evaluating the performance of the production system. In addition, problems that occur should be solved through continuous improvement activities. These results will help small and medium-sized domestic companies introduce TPS. This study will therefore contribute to strengthening global competitiveness in the related industries.

Research on Big Data Integration Method

  • Kim, Jee-Hyun;Cho, Young-Im
    • Journal of the Korea Society of Computer and Information / v.22 no.1 / pp.49-56 / 2017
  • In this paper we propose an approach to big data integration for analyzing, visualizing, and predicting market trends: building an integrated data model with the R language, which is the future of statistics, and Hadoop, which provides parallel processing of the data. Four methods that use R and Hadoop are considered: the ff package in R, R with the Hadoop Streaming utility, and Rhipe and RHadoop as interface packages between R and Hadoop. The strengths and weaknesses of the four methods are described and analyzed, and Rhipe and RHadoop are proposed as a complete data integration model. Integrating R, which is popular for processing statistical algorithms, with Hadoop, which contains a distributed file system and a resource management platform and can implement the MapReduce programming model, gives a new environment in which R code can be written and deployed in Hadoop without any data movement. This model enables high-performance predictive analysis and a deep understanding of big data.

A Study on the Pattern Classification of the EMG Signals Using Neural Network and Probabilistic Model (신경회로망과 확률모델을 이용한 근전도신호의 패턴분류에 관한 연구)

  • 장영건;권장우;장원환;장원석;홍성홍
    • Journal of the Korean Institute of Telematics and Electronics B / v.28B no.10 / pp.831-841 / 1991
  • A model combining a probabilistic model and an MLP (multilayer perceptron) model is proposed for the pattern classification of EMG (electromyogram) signals. The MLP model has two problems: it does not guarantee reaching the global minimum of the error, and the quality of its approximations to Bayesian probabilities varies. The probabilistic model, in turn, is closely tied to the estimation error of its parameters and the fidelity of its assumptions. A proper combination of the two will reduce the effects of these problems and be robust to input variations. The proposed model obtains the MAP (maximum a posteriori probability) in the probabilistic model by adaptively estimating the a priori probability distribution with the MLP model. This method minimizes the error probability of the probabilistic model as long as the realization of the MLP model is optimal, and it is a good combination of the two models in terms of using the reliability of the MLP model. Simulation results show the benefit of the proposed model compared with using the MLP and the probabilistic model separately; the average calculation time for classification is about 50 ms for combined motions on an IBM PC with a 25 MHz 386 processor.
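
The MAP decision described in the abstract above can be illustrated with a short sketch. It assumes one-dimensional Gaussian class-conditional likelihoods and treats the prior vector as if it came from an MLP output; the abstract specifies neither, and the motion labels, parameter values, and the function `map_classify` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def map_classify(x, class_params, priors):
    """Pick the class that maximizes prior * likelihood (the MAP rule).

    class_params maps label -> (mean, std) of an assumed Gaussian likelihood.
    priors maps label -> prior probability; in the combined model these
    would be supplied by the MLP (an assumption in this sketch).
    """
    scores = {
        label: priors[label] * gaussian_pdf(x, mean, std)
        for label, (mean, std) in class_params.items()
    }
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all class scores underflowed to zero")
    posteriors = {label: score / total for label, score in scores.items()}
    best = max(posteriors, key=posteriors.get)
    return best, posteriors


# Hypothetical two-motion problem; x = 1.0 is equally likely under both classes,
# so the prior alone decides the outcome.
params = {"flexion": (0.0, 1.0), "extension": (2.0, 1.0)}
label_a, post_a = map_classify(1.0, params, {"flexion": 0.2, "extension": 0.8})
label_b, post_b = map_classify(1.0, params, {"flexion": 0.8, "extension": 0.2})
```

When the likelihoods tie, the decision follows the prior, which is why an adaptively estimated prior can change the classification for ambiguous inputs.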

Rough Set-based Ambiguity Reduction of Location Recognition for Autonomous Robots (러프집합을 이용한 자율주행 로봇 위치인식의 애매성 축소)

  • Lee, In-K.;Son, Chang-S.;Kwon, Soon-H.
    • Journal of the Korean Institute of Intelligent Systems / v.18 no.4 / pp.463-470 / 2008
  • In this paper, we use rough sets to confirm that two properties contained in the information a robot acquires, 'existence of obstacles' and 'connectivity between obstacles', can be used efficiently for the robot's location recognition. Moreover, we propose a method that applies these properties to reduce the ambiguity of location recognition and to recognize the robot's location from unreliable information about the environment in which it moves. Computer simulation confirmed that a robot can move to a goal using only a map that contains insufficient information about the real environment.

Efficient k-Nearest Neighbor Join Query Processing Algorithm using MapReduce (맵리듀스를 이용한 효율적인 k-NN 조인 질의처리 알고리즘)

  • Yun, Deulnyeok;Jang, Miyoung;Chang, Jaewoo
    • Proceedings of the Korea Information Processing Society Conference / 2014.11a / pp.767-770 / 2014
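  • MapReduce-based k-NN join query processing algorithms for analyzing large-scale data have recently become very important in applications built on data mining and analysis. However, the representative approach, the Voronoi-based k-NN join query processing algorithm, is unsuitable for large-scale data because building the Voronoi index is very costly. In addition, the R-tree used to store Voronoi cell information is not suited to distributed parallel processing in a MapReduce environment. This paper therefore proposes a new grid-index-based k-NN join query processing algorithm. First, to address the high index construction cost, it proposes a dynamic grid index generation scheme that takes the data distribution into account. Second, to perform k-NN join queries efficiently in a MapReduce environment, it proposes a candidate-region search and filtering algorithm that uses adjacent-cell information as a signature. Finally, a performance evaluation shows that the proposed method achieves up to three times better query processing performance than the existing method in terms of query processing time.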
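
To make the grid-index idea in the abstract above concrete, here is a single-process sketch of a grid-based k-NN join. It is not the paper's algorithm: it assumes a uniform grid with a fixed `cell_size` (the paper's grid is dynamic), omits the adjacent-cell signatures, and only mimics the map and shuffle steps with a dictionary. All function names are hypothetical. The search widens ring by ring around the query's cell and stops once the k-th candidate is provably closer than any unvisited cell.

```python
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Grid cell (column, row) that contains a 2-D point."""
    return (math.floor(point[0] / cell_size), math.floor(point[1] / cell_size))


def build_grid(points, cell_size):
    """Map-like step: key each point by its cell; grouping mimics the shuffle."""
    grid = defaultdict(list)
    for p in points:
        grid[cell_of(p, cell_size)].append(p)
    return grid


def knn_in_grid(q, grid, cell_size, k):
    """Exact k nearest neighbors of q, searching rings of cells outward."""
    cx, cy = cell_of(q, cell_size)
    # Largest ring needed to reach every occupied cell from q's cell.
    max_ring = max((max(abs(cx - kx), abs(cy - ky)) for kx, ky in grid), default=0)
    candidates = []
    for ring in range(max_ring + 1):
        for dx in range(-ring, ring + 1):
            for dy in range(-ring, ring + 1):
                if max(abs(dx), abs(dy)) == ring:  # only the new outer ring
                    candidates.extend(grid.get((cx + dx, cy + dy), []))
        candidates.sort(key=lambda p: math.dist(q, p))
        # Every unvisited point lies more than ring * cell_size away from q,
        # so the current top k are final once the k-th is within that radius.
        if len(candidates) >= k and math.dist(q, candidates[k - 1]) <= ring * cell_size:
            break
    return candidates[:k]


def knn_join(r_points, s_points, k, cell_size):
    """For each point in R, find its k nearest neighbors in S."""
    grid = build_grid(s_points, cell_size)
    return {r: knn_in_grid(r, grid, cell_size, k) for r in r_points}
```

In a real MapReduce job, the per-cell lists would be produced by mappers and each reducer would handle the queries of one cell together with its neighboring cells; the stopping test is what lets most queries finish after visiting only a few adjacent cells.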

A Survey of Deterioration Causes of High Voltage Motors in Power Plants (발전소 고압전동기 열화 요인 분석)

  • Kim, Kyeong-Yeol;Kim, Hee-Dong;Kim, Byeong-Rae;Kong, Tae-Sik;Kim, Byong-Han;Lee, Sang-Kil;Lee, Jong-Hweon;Choi, Hong-Suck
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.24 no.10 / pp.807-811 / 2011
  • When a high voltage motor fails in a power plant, the generator's output may be reduced or the generator may trip. Despite these effects, the causes of deterioration of high voltage motors are very seldom investigated. In this paper, data collected from field tests over 10 years are treated statistically and analyzed to correlate the insulation deterioration of high voltage motors with the installation environment, the number of starts and stops, and service life. Moreover, a proper insulation test interval is developed to map out a maintenance strategy and reduce maintenance costs.

The measurement and analysis of Regenerative Pump Noise (재생펌프 소음특성의 측정 및 해석에 관한 연구)

  • Kim, Tae-Hoon;Seo, Young-Soo;Jeong, Weui-Bong;Jeong, Ho-Kyeong
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2004.11a / pp.1067-1071 / 2004
  • In this paper, the characteristics of the regenerative pump are reviewed through measurement and analysis. The dominant noise sources are harmonic components of the rotating impeller frequency. The acoustic characteristics and the noise source position at the pump are identified. To reduce the high-level peak noise, the interior flow of the pump chamber is analyzed by CFD (Computational Fluid Dynamics). Acoustic pressure is calculated with the Ffowcs Williams and Hawkings equation. Based on the analysis, a new design of the pump chamber is recommended. The recommended pump is compared with the original pump by evaluating the RMS value of the interior static pressure and the sound pressure level. The new pump chamber recommended by the analysis is validated by measurement. The overall SPL of the recommended pump is reduced by about 3 dBA.
