• Title/Summary/Keyword: join

K Nearest Neighbor Joins for Big Data Processing based on Spark (Spark 기반 빅데이터 처리를 위한 K-최근접 이웃 연결)

  • JIAQI, JI;Chung, Yeongjee
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.9 / pp.1731-1737 / 2017
  • K Nearest Neighbor Join (KNN Join) is a simple yet effective method in machine learning that has long been widely used on small datasets. As data volumes grow, running it on a single machine becomes infeasible for real applications because of memory and time restrictions. MapReduce, a popular batch processing model that runs on clusters with large numbers of computers, is now widely used for large-scale data processing. Hadoop is a framework that implements MapReduce, but its performance can be improved upon by a newer framework, Spark. In this study, we present a KNN Join implementation based on Spark. Thanks to Spark's in-memory computation capability, it is faster and more efficient than a Hadoop-based one. In our experiments, we study the influence of different factors on running time and demonstrate the robustness and efficiency of our approach.
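
The KNN join operation itself can be illustrated with a minimal single-machine sketch. This is not the authors' Spark implementation: the function name `knn_join`, the Euclidean metric, and the sample points are assumptions made for illustration only.

```python
import heapq
import math

def knn_join(R, S, k):
    """For every point r in R, return the k points of S nearest to r (Euclidean)."""
    result = {}
    for r in R:
        # nsmallest keeps only the k closest candidates from S, nearest first
        result[r] = heapq.nsmallest(k, S, key=lambda s: math.dist(r, s))
    return result

# Hypothetical sample data
R = [(0.0, 0.0), (5.0, 5.0)]
S = [(1.0, 0.0), (0.0, 2.0), (4.0, 4.0), (9.0, 9.0)]
neighbors = knn_join(R, S, k=2)
```

Every point of R is compared with every point of S, which is the cost that makes the single-machine version impractical for large inputs and motivates a distributed implementation.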

Design of Multiprocess Models for Parallel Protocol Implementation (병렬 프로토콜 구현을 위한 다중 프로세스 모델의 설계)

  • Choi, Sun-Wan;Chung, Kwang-Sue
    • The Transactions of the Korea Information Processing Society / v.4 no.10 / pp.2544-2552 / 1997
  • This paper presents three multiprocess models for parallel protocol implementation: (1) the channel communication model, (2) the fork-join model, and (3) the event polling model. A parallel programming language, Par. C System, is used to specify the parallelism of each model. To measure the performance of the multiprocess models, we implemented the Internet Protocol (IP) of the Internet Protocol Suite (IPS) for each model in the parallel language on the Transputer. After decomposing the IP functions into two parts, the sending side and the receiving side, the parallelism on both sides is exploited in the form of Multiple Instruction Single Data (MISD). The three models are evaluated and compared at the sending side and the receiving side on the basis of their run-time overheads: event sending via channels in the channel communication model, process creation in the fork-join model, and context switching in the event polling model. At the sending side, the processing delays of the event polling model are lower by about 77% and 9% than those of the channel communication model and the fork-join model, respectively. At the receiving side, the processing delays of the fork-join model are lower by about 55% and 107% than those of the channel communication model and the event polling model, respectively.


Cost Model for Parallel Spatial Joins using Fixed Grids (고정 그리드를 이용한 병렬 공간 조인을 위한 비용 모델)

  • Kim, Jin-Deog;Hong, Bong-Hee
    • Journal of KIISE:Databases / v.28 no.4 / pp.665-676 / 2001
  • The most expensive operation in a spatial database is the spatial join, which computes a combined table whose tuples each consist of two tuples, one from each input table, that satisfy a spatial predicate. Although the execution time of sequential spatial join processing has improved considerably, the response time is still not tolerable because it does not meet the requirements of interactive users. Parallel processing is usually an appropriate way to improve spatial join performance. Fixed grids, which consist of regularly partitioned cells, can be employed in spatial databases, but previous work on spatial joins has not studied parallel processing of spatial joins using fixed grids. This paper presents an analytical cost model that estimates the comparative performance of a parallel spatial join algorithm based on fixed grids in terms of the number of MBR (minimum bounding rectangle) comparisons, disk accesses, and message passing. Several experiments on synthetic and real datasets show that the proposed analytical model is very accurate. This cost model is also expected to be used in implementing a very important DBMS component, the query processing optimizer.


Spreadsheet Model Approach for Buffer-Sharing Fork-Join Production Systems with General Processing Times and Structure (일반 공정시간과 구조를 갖는 버퍼 공유 분기-접합 생산시스템의 스프레드시트 모형 분석)

  • Seo, Dong-Won
    • Journal of the Korea Society for Simulation / v.28 no.3 / pp.65-74 / 2019
  • Fork-join production systems have been widely studied for many years, but little of the literature focuses on finite buffers, whether individual or shared, combined with generally distributed processing times. Finite buffers are usually difficult to handle with a standard queueing-theoretic approach. In this study, we used a max-plus algebraic approach to study buffer-sharing fork-join production systems with general processing times. However, because this approach does not provide a practical way to compute performance measures, we developed simulation models using @RISK software and the expressions derived from max-plus algebra. In the simulation experiments, we compared properties of the waiting time with respect to buffer capacity under two blocking policies: BBS (Blocking Before Service) and BAS (Blocking After Service).
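
To give a flavor of max-plus style expressions, the sketch below evaluates the recursion D_k(n) = max(A(n), D_k(n-1)) + S_k(n) for a simplified fork-join station. It assumes unlimited buffers, so it deliberately omits the finite shared buffers and the BBS/BAS blocking policies that the paper studies; the function name and the sample times are hypothetical.

```python
def fork_join_departures(arrivals, service_times):
    """arrivals[n]: arrival time of job n.
    service_times[k][n]: service time of job n at parallel server k.
    Returns the time at which each job leaves the join."""
    servers = len(service_times)
    last_done = [0.0] * servers  # completion time of the previous job at each server
    departures = []
    for n, a in enumerate(arrivals):
        for k in range(servers):
            # max-plus form: D_k(n) = max(A(n), D_k(n-1)) + S_k(n)
            last_done[k] = max(a, last_done[k]) + service_times[k][n]
        # the join releases a job only when its slowest branch has finished
        departures.append(max(last_done))
    return departures

# Hypothetical sample: three jobs, two parallel servers
arrivals = [0.0, 1.0, 2.0]
service_times = [[2.0, 2.0, 2.0], [1.0, 3.0, 1.0]]
departures = fork_join_departures(arrivals, service_times)
sojourn = [d - a for d, a in zip(departures, arrivals)]
```

Only `max` and `+` appear in the recursion, which is what makes such systems expressible in max-plus algebra; in a simulation, the service times would be drawn from general distributions rather than fixed as here.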

Implementation of Effective Dominator Trees Using Eager Reduction Algorithm and Delay Reduction Algorithm (순차감축 알고리즘과 지연감축 알고리즘을 이용한 효과적인 지배자 트리의 구현)

  • Lee, Dae-Sik
    • Journal of Internet Computing and Services / v.6 no.6 / pp.117-125 / 2005
  • The dominator tree represents the dominance frontier of a directed graph in tree form. We present an effective algorithm for constructing the dominator tree from an arbitrary directed graph. A reducible flow graph is reduced to a dominator tree after dominator calculation, while an irreducible flow graph is first converted to a dominator-join graph using the join-edge information in an information table. To reduce the dominator-join graph to a dominator tree, we implement an eager (sequential) reduction algorithm and a delay reduction algorithm. The implementation shows that the delay reduction algorithm takes less execution time than the eager reduction algorithm. The flow graph can therefore be reduced to a dominator tree effectively.

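
For readers unfamiliar with dominator trees, the sketch below computes one with the classic iterative data-flow method. This is a swapped-in textbook technique, not the eager or delay reduction algorithms of the paper. It assumes every node is reachable from the entry and appears as a key of the graph dictionary; the function names and the sample graph are hypothetical.

```python
def dominators(graph, entry):
    """graph: dict mapping each node to a list of successors.
    Returns a dict mapping each node to the set of nodes that dominate it."""
    nodes = set(graph)
    preds = {n: set() for n in nodes}
    for n, succs in graph.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(nodes) for n in nodes}  # start from "everything dominates"
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = set(nodes)
            for p in preds[n]:
                new &= dom[p]  # a dominator must dominate every predecessor
            new.add(n)         # every node dominates itself
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def immediate_dominators(graph, entry):
    """Return the dominator tree as a dict node -> parent (immediate dominator)."""
    dom = dominators(graph, entry)
    # Strict dominators form a chain; the closest one has the largest dominator set.
    return {n: max(dom[n] - {n}, key=lambda d: len(dom[d]))
            for n in graph if n != entry}

# Hypothetical irreducible flow graph: the loop between a and b has two entries
graph = {"entry": ["a", "b"], "a": ["b"], "b": ["a", "exit"], "exit": []}
idom = immediate_dominators(graph, "entry")
```

In the sample graph neither `a` nor `b` dominates the other, so both hang directly under `entry` in the resulting tree, while `exit` hangs under `b`.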

Personal Broadcasting System Using mOBCP-based Overlay Multicast Tree Construction Method (개인 방송 시스템을 위한 mOBCP 기반의 오버레이 멀티캐스트 트리 구성 방안)

  • Nam, Ji-Seung;Kang, Mi-Young;Jeon, Jin-Han;Son, Seung-Chul
    • The Journal of Korean Institute of Communications and Information Sciences / v.32 no.8B / pp.539-546 / 2007
  • When more than one client wants to join a Personal Broadcasting System service concurrently, joining clients experience waiting periods and time-outs that degrade performance and annoy members, so the concurrent member joining mechanism needs to be improved. To make joining more efficient, this paper applies the mini-Overlay Broadcasting Control Protocol (mOBCP) algorithm, which is based on overlay multicast, to the Personal Broadcasting System. The proposed mOBCP is a performance-effective mechanism because it considers how quickly children can concurrently find and join new parents when the paths to their existing parents fail. Performance is compared through simulations in terms of tree construction time variation and latency, and the results favor the proposed mOBCP.

What Kinds of Aptitude Will Be Required for Undergraduate Students Who Want to Join Export-Oriented SMEs? (수출중소기업은 어떤 직무적성을 가진 대학생을 채용할까? -광주 지역을 중심으로-)

  • PARK, Hyun-Chae
    • THE INTERNATIONAL COMMERCE & LAW REVIEW / v.73 / pp.111-128 / 2017
  • The main objective of this study is to examine the aptitudes required of undergraduate students who want to join export-oriented small and medium enterprises (SMEs). The analysis used a dataset of 178 responses from a survey of exporting firms in Gwangju, Korea. The results are as follows. First, the most required aptitude is the capability to build human relationships, so students should learn negotiation skills in college. They should also try to join informal clubs and cultivate teamwork capabilities. Second, finding a job in an export-oriented SME requires problem-solving capabilities. To acquire them, students should study various subjects related to trade theory; holding a certificate such as 'international trade master' can also help. Third, communication capabilities, including foreign language and international business skills, are also required of students preparing to join export-oriented SMEs. However, capabilities related to information technology and basic statistics do not have a statistically significant correlation with recruitment intention. As a result, students who have the four above-mentioned aptitudes may be in a better position to find jobs in export-oriented SMEs.


An Improvement of Partition-Based Spatial Merge Join using Dynamic Object Decomposition (동적 객체 분해를 이용한 분할 기반의 공간 합병 조인의 개선)

  • Choi, Yong-Jin;Lee, Yong-Ju;Park, Ho-Hyun;Lee, Sung-Jin;Chung, Chin-Wan
    • Journal of KIISE:Databases / v.27 no.2 / pp.247-255 / 2000
  • Traditional object decomposition techniques do not decompose spatial objects dynamically during spatial joins because object decomposition is very expensive. In this paper, we propose a modified object decomposition technique that can be applied in PBSM (Partition Based Spatial Merge-Join). In real-life data, object sizes differ widely, so we decompose only the large objects that have a great effect on spatial joins. This technique decreases the cost of decomposing objects during spatial joins and enables efficient filter and refinement steps. Experiments show that PBSM with the proposed method performs significantly better than traditional PBSM.


Improving Join Performance for SPARQL Query Processing in the Clouds (클라우드에서 SPARQL 질의 처리를 위한 조인 성능 향상)

  • Choi, Gyu-Jin;Son, Yun-Hee;Lee, Kyu-Chul
    • Journal of KIISE / v.43 no.6 / pp.700-709 / 2016
  • With the recent rapid growth of LOD (Linked Open Data), existing methods based on a single machine have run into performance limitations. Existing solutions use distributed frameworks such as MapReduce to improve performance. However, processing SPARQL queries with the MapReduce framework involves multiple MapReduce jobs, which incurs additional costs, and it also leads to unnecessary data processing. In this study, we propose a method that reduces the number of MapReduce jobs during SPARQL query processing and uses bitmap-based join indexes to minimize the cost of processing unnecessary data.
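
The general idea of using a bitmap to skip data that cannot take part in a join can be sketched as follows. This is a generic single-machine illustration, not the index layout or MapReduce job design of the paper; the predicates `works_at` and `lives_in`, the integer IDs, and the helper names are hypothetical.

```python
def bitmap_of(ids):
    """Encode a collection of small non-negative integer IDs as a Python int bitmap."""
    bits = 0
    for i in ids:
        bits |= 1 << i
    return bits

# Dictionary-encoded (subject_id, object_id) pairs for two hypothetical predicates,
# standing in for the triple patterns { ?s works_at ?o1 . ?s lives_in ?o2 }
works_at = [(0, 10), (1, 11), (2, 10), (5, 12)]
lives_in = [(1, 20), (3, 21), (5, 22)]

# Only subjects whose bit is set in both bitmaps can contribute to the join on ?s.
joinable = bitmap_of(s for s, _ in works_at) & bitmap_of(s for s, _ in lives_in)

def keep(pairs):
    """Drop pairs whose subject cannot join, before any join work is done."""
    return [(s, o) for s, o in pairs if (joinable >> s) & 1]

left, right = keep(works_at), keep(lives_in)

# Plain hash join on the pre-filtered inputs
index = {}
for s, o in right:
    index.setdefault(s, []).append(o)
joined = [(s, o1, o2) for s, o1 in left for o2 in index.get(s, [])]
```

Filtering with the bitmap first means that pairs for subjects 0, 2, and 3 never reach the join step, which is the kind of unnecessary processing such an index aims to avoid.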

The Correlational Study on Health-Promoting Behavior, Self-Esteem, and Life Satisfaction of Elderly (노인의 건강증진행위, 자아존중감 및 생활만족도 와의 관계)

  • Yang, Nam Young
    • Journal of Home Health Care Nursing / v.19 no.2 / pp.112-118 / 2012
  • Purpose: This study examined the correlations among health-promoting behavior, self-esteem, and life satisfaction of the elderly. Method: The subjects were 115 elderly people. Data collected from October to December 2011 were analyzed using descriptive statistics, t-tests, ANOVA, and Pearson correlation coefficients. Result: The mean scores for health-promoting behavior ($2.33 \pm 0.34$), self-esteem ($2.87 \pm 0.58$), and life satisfaction ($2.98 \pm 0.44$) were at an average level. Health-promoting behavior differed significantly according to age, educational level, religion, spouse, living arrangement, economic status, and group participation. Self-esteem differed significantly according to religion, economic status, and group participation. Life satisfaction differed significantly according to age, economic status, and group participation. Significant correlations were found among health-promoting behavior, self-esteem, and life satisfaction. Conclusion: These findings indicate that health-promoting behavior, self-esteem, and life satisfaction may be necessary for successful aging of the elderly. These results should be reflected in programs for improving quality of life.
