External Merge Sorting in Tajo with Variable Server Configuration (매개변수 환경설정에 따른 타조의 외부합병정렬 성능 연구)

  • Lee, Jongbaeg;Kang, Woon-hak;Lee, Sang-won
    • Journal of KIISE / v.43 no.7 / pp.820-826 / 2016
  • Demand is growing for big data processing that extracts valuable information from large volumes of data. The Hadoop system employs the MapReduce framework to process big data, but MapReduce is inflexible and slow. To overcome these drawbacks, SQL query processing techniques known as SQL-on-Hadoop were developed. Apache Tajo, one of the SQL-on-Hadoop systems, was developed by a Korean development group. External merge sort is one of the most heavily used algorithms in Tajo's query processing, and its performance is influenced by two parameters: sort buffer size and fanout. In this paper, we analyze the performance of external merge sort in Tajo under various sort buffer sizes and fanouts. We identify two major causes of the performance differences: CPU cache misses, which increase as the sort buffer size grows, and the number of merge passes, which is determined by the fanout.
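
To make the role of the two parameters concrete, here is a minimal Python sketch of a textbook external merge sort. It is not Tajo's implementation: in-memory lists stand in for on-disk runs, and the names `count_merge_passes`, `external_merge_sort`, `buffer_records`, and `fanout` are hypothetical. The sort buffer size fixes how many initial sorted runs are produced, and the fanout fixes how many runs are merged at once, so together they determine the number of merge passes.

```python
import heapq
from math import ceil


def count_merge_passes(num_records, buffer_records, fanout):
    """Number of merge passes after run generation (textbook model)."""
    if buffer_records < 1 or fanout < 2:
        raise ValueError("buffer_records must be >= 1 and fanout >= 2")
    runs = ceil(num_records / buffer_records)  # initial sorted runs
    passes = 0
    while runs > 1:
        runs = ceil(runs / fanout)  # each pass merges `fanout` runs into one
        passes += 1
    return passes


def external_merge_sort(records, buffer_records, fanout):
    """Sort `records` using buffer-sized runs and fanout-way merges.

    Assumption: lists in memory stand in for runs spilled to disk.
    """
    if buffer_records < 1 or fanout < 2:
        raise ValueError("buffer_records must be >= 1 and fanout >= 2")
    # Run generation: sort one buffer's worth of records at a time.
    runs = [sorted(records[i:i + buffer_records])
            for i in range(0, len(records), buffer_records)]
    # Merge phase: repeat fanout-way merges until one run remains.
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fanout]))
                for i in range(0, len(runs), fanout)]
    return runs[0] if runs else []
```

Under this model, 1000 records with a 10-record buffer yield 100 initial runs; a fanout of 10 merges them in 2 passes, while a fanout of 2 needs 7.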

Crack Detection and Sorting of Eggs by Image Processing (영상처리에 의한 계란의 파란 검출 및 선별)

  • Cho, H.K.;Kwon, Y.;Cho, S.K.
    • Korean Journal of Poultry Science / v.22 no.4 / pp.233-238 / 1995
  • A computer vision system was built to generate images of a single, stationary egg. The system includes a CCD camera, a frame grabber, and an incandescent back-lighting system. Image processing algorithms were developed to inspect egg shells and to sort eggs. The gray level and area of dark spots in the egg image were used as criteria to detect holes, and the area and roundness of dark spots were used to detect cracks. For a sample of 300 eggs, the system correctly determined whether an egg had a defect 97.5% of the time. Egg weight was found to be linearly related to both the projected area and the perimeter of eggs viewed from above, and these two values were used as criteria to sort eggs. The coefficients of determination ($r^2$) for the regression equations between weight and these two values were 0.967 and 0.972 in the two sets of experiments. Grading accuracies were 95.6% and 96.7% compared with sizing by an electronic weight scale.

Quality Inspection and Sorting in Eggs by Machine Vision

  • Cho, Han-Keun;Yang Kwon
    • Proceedings of the Korean Society for Agricultural Machinery Conference / 1996.06c / pp.834-841 / 1996
  • Egg production in Korea is becoming automated on large-scale farms. Although many operations in egg production have been [...] and cracks are regarded as a critical problem. A computer vision system was built to generate images of a single, stationary egg. The system includes a CCD camera, a frame grabber board, a personal computer (IBM PC AT 486), and an incandescent back-lighting system. Image processing algorithms were developed to inspect egg shells and to sort eggs. The gray level and area of dark spots in the egg image were used as criteria to detect holes, and the area and roundness of dark spots were used to detect cracks. For a sample of 300 eggs, the system correctly determined whether an egg had a defect 97.5% of the time. Egg weight was found to be linearly related to both the projected area and the perimeter of eggs viewed from above, and these two values were used as criteria to sort eggs. Grading accuracy was 96.7% compared with weighing on an electronic scale.

Sort-Based Distributed Parallel Data Cube Computation Algorithm using MapReduce (맵리듀스를 이용한 정렬 기반의 데이터 큐브 분산 병렬 계산 알고리즘)

  • Lee, Suan;Kim, Jinho
    • Journal of the Institute of Electronics and Information Engineers / v.49 no.9 / pp.196-204 / 2012
  • Many recent applications perform OLAP (On-Line Analytical Processing) over very large volumes of data, and the multidimensional data cube is regarded as a core tool in OLAP analysis. This paper focuses on how to compute data cubes efficiently in parallel using a popular parallel processing tool, MapReduce. We investigate efficient ways to implement the PipeSort algorithm, a well-known data cube computation method, on the MapReduce framework. PipeSort computes several (descendant) cuboids that share the same sort order at the same time, as a pipeline, by scanning one (ancestor) cuboid once. This paper proposes four ways of implementing the PipeSort pipeline on a MapReduce framework running across 20 servers. Our experiments show that the PipeMap-NoReduce algorithm outperforms the other algorithms for high-dimensional data, whereas Post-Pipe performs best for low-dimensional data.
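
The pipeline idea can be illustrated on a single machine with a short Python sketch. It is not one of the paper's four MapReduce variants; the function name `pipeline_cuboids`, the row layout, and the choice of a sum aggregate are assumptions made for illustration. Once the rows are sorted by their dimension values, every prefix of the dimension list is a cuboid in the same sort order, so all of them can be aggregated in one scan by emitting a group whenever its prefix changes.

```python
def pipeline_cuboids(rows, num_dims):
    """Aggregate all prefix cuboids of sorted rows in a single scan.

    rows: list of (dims, measure), where dims is a tuple of length num_dims.
    Returns results[k] = list of (prefix of length k, summed measure).
    Assumption: sum is the aggregate; one sort stands in for the ancestor cuboid.
    """
    rows = sorted(rows, key=lambda r: r[0])
    results = [[] for _ in range(num_dims + 1)]
    current = [None] * (num_dims + 1)  # group currently open at each level
    totals = [0] * (num_dims + 1)      # running sum for each open group
    for dims, measure in rows:
        for k in range(num_dims + 1):
            prefix = dims[:k]
            if current[k] is not None and prefix != current[k]:
                # Prefix changed: the previous group at level k is complete.
                results[k].append((current[k], totals[k]))
                totals[k] = 0
            current[k] = prefix
            totals[k] += measure
    # Flush the last open group at every level.
    for k in range(num_dims + 1):
        if current[k] is not None:
            results[k].append((current[k], totals[k]))
    return results
```

For rows over two dimensions, `results[2]` holds the full group-by, `results[1]` the group-by on the first dimension, and `results[0]` the grand total, all produced from the same pass over the sorted input.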

Proposal of Fast Counting Sort (빠른 계수 정렬법의 제안)

  • Lee, Sang-Un
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.15 no.5 / pp.61-68 / 2015
  • No comparison sort can beat the established lower bound of O(n log n) operations. Quicksort, the fastest of its kind, has a complexity of O(n log n) in the best and average cases and $O(n^2)$ in the worst case. This paper therefore presents two methods: the first is a simple O(n+k) counting sort that runs much faster than the conventional O(n+k) counting sort (k = maximum value), and the second is an O(ln) radix counting sort that counts the frequency of the values at digit l of the data, stores each count in a corresponding virtual bucket in an array, and then virtually partitions the array by radix digit. On six experimental data sets, the proposed algorithm reduces Quicksort's O(n log n) or $O(n^2)$ to O(n+k) or O(ln). Overall, the proposed sorting algorithm proved to be much faster than both the counting sort and Quicksort.
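
For reference, the two baseline techniques the abstract builds on can be sketched in Python as follows. These are the textbook counting sort and least-significant-digit radix sort, not the paper's proposed variants; both assume non-negative integers, and reading l as the number of digits of the largest value is an interpretation of the abstract's O(ln).

```python
def counting_sort(values):
    """Textbook counting sort: O(n + k), where k is the maximum value."""
    if not values:
        return []
    if min(values) < 0:
        raise ValueError("this sketch assumes non-negative integers")
    counts = [0] * (max(values) + 1)
    for v in values:
        counts[v] += 1                 # tally each value
    out = []
    for v, c in enumerate(counts):
        out.extend([v] * c)            # emit each value `c` times, in order
    return out


def radix_sort(values, base=10):
    """Textbook LSD radix sort: one stable bucket pass per digit."""
    if not values:
        return []
    if min(values) < 0:
        raise ValueError("this sketch assumes non-negative integers")
    out = list(values)
    largest = max(values)
    place = 1
    while largest // place > 0:
        buckets = [[] for _ in range(base)]
        for v in out:
            buckets[(v // place) % base].append(v)  # stable within a bucket
        out = [v for bucket in buckets for v in bucket]
        place *= base
    return out
```

Counting sort is attractive when k is small relative to n; radix sort trades the dependence on k for one pass per digit.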

Development of Automatic Sorting System for Green pepper Using Machine Vision (기계시각에 의한 풋고추 자동 선별시스템 개발)

  • Cho, N.H.;Chang, D.I.;Lee, S.H.;Hwang, H.;Lee, Y.H.;Park, J.R.
    • Journal of Biosystems Engineering / v.31 no.6 s.119 / pp.514-523 / 2006
  • Production of green pepper has increased in Korea owing to consumer preference and a projected ten-year boom in the industry. This study was carried out to develop an automatic grading and sorting system for green pepper using machine vision. The system consisted of a feeding mechanism, a segregation section, an image inspection chamber, an image processing section, a system control section, a grading section, and a discharging section. Green peppers were separated using a bowl feeder with a vibrator and transported on a belt conveyor. Images were taken using color CCD cameras and a color frame grabber. An on-line grading algorithm was developed using Visual C/C++. The green peppers could be graded into four classes by activating air nozzles located at the discharging section. The length and curvature of each green pepper were measured after removing its stem: the first derivative of the thickness profile was used to remove the stem area from the segmented image of the pepper. With peppers moving at 0.45 m/s, the grading and sorting accuracies for large, medium, and small peppers were 86.0%, 81.3%, and 90.6%, respectively. Sorting throughput was 121 kg/hour, about five times that of manual sorting. The developed system was also economically feasible for grading and sorting green peppers, at a cost about 40% lower than that of manual operation.

An Algorithm for Sorting Cucumbers using a T-shaped Array of Sensors (T형 센서배열을 이용한 오이형상분류 앨고리즘)

  • Yang, Moon-Hee;Chang, Kyung
    • Journal of Korean Institute of Industrial Engineers / v.24 no.4 / pp.613-625 / 1998
  • This paper presents a fundamental theoretical model for the shape of a cucumber, together with human-oriented definitions of its length and curvature, in order to sort cucumbers electronically. In addition, we design a T-shaped array of sensors to minimize the number of sensors and the number of A/D (Analog/Digital) conversions, and we analyze the regular patterns in the series of 1's and 0's produced by an A/D module. Finally, we propose an algorithm that measures length and curvature using a rule derived from these regular patterns. The methodology proposed in this paper could be applied to the electronic classification of crops and fruits such as tomatoes and apples, and can serve as a basis for developing other sorting machines.

Pinched Flow Fractionation Microchannel to Sort Microring-Containing Immiscible Emulsion Droplets (마이크로 링이 함유된 비혼합성 에멀젼 액적의 분류를 위한 Pinched Flow Fractionation 마이크로 채널)

  • Ye, Woojun;Kim, Hyunggun;Byun, Doyoung
    • Journal of the Korean Society of Visualization / v.15 no.2 / pp.41-47 / 2017
  • Microring/nanoring structures are highly applicable to nano-antennas and biosensors thanks to their superior optical characteristics. Although coiling nanowires using immiscible emulsion droplets has an advantage in mass production, the process also forms nanowire bundles. In this study, we solved the nanowire bundle problem by size-selective sorting of the emulsion droplets in a pinched flow fractionation microchannel. Using silver nanowires and immiscible emulsion droplets, we investigated the correlation between the sizes of ring droplets and bundle droplets. We visualized the sorting process for glass particles and microring-containing emulsion droplets. Droplets were sorted based on their size, and the ratio of bundle droplets in solution decreased. This droplet-sorting strategy has the potential to aid printing and coating processes for manufacturing ring structure patterns and developing functional materials.

Sorting of the Human Folate Receptor in MDCK Cells

  • Kim, Chong-Ho;Park, Young-Soon;Chung, Koong-Nah;Elwood, P.C.
    • BMB Reports / v.37 no.3 / pp.362-369 / 2004
  • The human folate receptor (hFR) is a glycosylphosphatidylinositol (GPI)-linked plasma membrane protein that mediates delivery of folates into cells. We studied the sorting of the hFR by transfecting the hFR cDNA into MDCK cells. MDCK cells are polarized epithelial cells that preferentially sort GPI-linked proteins to their apical membrane. Unlike other GPI-tailed proteins, we found that in MDCK cells, hFR is functional on both the apical and basolateral surfaces. We verified that the same hFR cDNA, when transfected into CHO cells, produces an hFR protein that is GPI-linked. We also measured hFR expression on the plasma membrane of type III paroxysmal nocturnal hemoglobinuria (PNH) human erythrocytes. PNH is a disease characterized by the inability of cells to express membrane proteins requiring a GPI anchor. Despite this defect, and unlike other GPI-tailed proteins, we found similar levels of hFR in normal and type III PNH human erythrocytes. The results suggest that there may be multiple mechanisms for targeting hFR to the plasma membrane.

Improvement of Practical Suffix Sorting Algorithm (실용적인 접미사 정렬 알고리즘의 개선)

  • Jeong, Tae-Young;Lee, Tae-Hyung;Park, Kun-Soo
    • Journal of KIISE:Computer Systems and Theory / v.36 no.2 / pp.68-72 / 2009
  • The suffix array is a data structure that stores all suffixes of a string in lexicographical order. It is widely used in string problems in place of the suffix tree, which uses a large amount of memory. Many studies have shown not only that the suffix array can be built in O(n) time, but also that it can be constructed with little time and space for real-world inputs. In this paper, we analyze a practical suffix sorting algorithm due to Maniscalco and Puglisi [1], and we propose an efficient algorithm that improves on the running time of the Maniscalco-Puglisi algorithm.
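
As a point of reference for what a suffix array contains, the following Python sketch builds one by directly sorting suffix start positions and then uses it for substring search. This naive construction is for illustration only: it is neither the Maniscalco-Puglisi algorithm nor an O(n) method, and the function names are hypothetical.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions by the suffix itself."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return all start positions of `pattern`, via binary search on the array."""
    m = len(pattern)
    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[start:lo])
```

For the string "banana" the suffix array is [5, 3, 1, 0, 4, 2], and searching for "ana" returns positions 1 and 3, because all suffixes sharing a prefix sit next to each other in the sorted order.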