• Title/Summary/Keyword: Large-Size Data Processing


Utilizing the Effect of Market Basket Size for Improving the Practicality of Association Rule Measures (연관규칙 흥미성 척도의 실용성 향상을 위한 장바구니 크기 효과 반영 방안)

  • Kim, Won-Seo;Jeong, Seung-Ryul;Kim, Nam-Gyu
    • The KIPS Transactions:PartD
    • /
    • v.17D no.1
    • /
    • pp.1-8
    • /
    • 2010
  • Association rule mining techniques enable us to acquire knowledge about sales patterns among individual items from voluminous transactional data. One of the major purposes of association rule mining is to use the acquired knowledge for marketing strategies such as catalogue design, cross-selling, and shop allocation. However, extracting only the actionable and profitable knowledge from the tremendous number of discovered patterns takes much time and cost. In the literature, a number of interest measures have been devised to accelerate and systematize the process of pattern evaluation. Unfortunately, most such measures, including support and confidence, are prone to yielding impractical results because they are calculated only from the sales frequencies of items. For instance, traditional measures cannot differentiate between purchases made from a small basket and those made from a large shopping cart. Some adjustment should therefore be made for the size of market baskets, because mutually irrelevant items are quite likely to appear together in a large shopping cart. In contrast to previous approaches, we consider the market basket's size when calculating interest measures. Because the devised measure assigns different weights to individual purchases according to their basket sizes, we expect it to minimize the distortion of results caused by accidental patterns. Additionally, we performed intensive computer simulations under various environments and real-case analyses to verify the correctness and consistency of the devised measure.
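To make the weighting idea concrete, here is a minimal sketch of a basket-size-weighted support measure. The abstract does not give the actual weighting function, so the 1/|basket| decay below is purely an illustrative assumption:

```python
def weighted_support(transactions, itemset, weight=lambda n: 1.0 / n):
    """Support where a transaction t contributes weight(len(t)) instead of 1.

    The paper assigns lower weights to purchases made in larger baskets;
    the exact weighting function is not given in the abstract, so the
    1/|basket| decay used here is only an illustrative assumption."""
    itemset = set(itemset)
    total = sum(weight(len(t)) for t in transactions)
    hits = sum(weight(len(t)) for t in transactions if itemset <= set(t))
    return hits / total if total else 0.0

# {milk, beer} co-occur only in the six-item cart; unweighted support would
# be 0.5, but the size weighting discounts the accidental-looking pattern.
baskets = [{"milk", "bread"},
           {"milk", "bread", "beer", "soap", "pens", "tape"}]
print(weighted_support(baskets, {"milk", "beer"}))  # 0.25
```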

Lightweight Deep Learning Model for Real-Time 3D Object Detection in Point Clouds (실시간 3차원 객체 검출을 위한 포인트 클라우드 기반 딥러닝 모델 경량화)

  • Kim, Gyu-Min;Baek, Joong-Hwan;Kim, Hee Yeong
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.9
    • /
    • pp.1330-1339
    • /
    • 2022
  • 3D object detection generally aims to detect relatively large objects such as automobiles, buses, persons, and furniture, so it tends to be weak at detecting small objects. In addition, in resource-limited environments such as embedded devices, the huge amount of computation makes such models difficult to apply. In this paper, the accuracy of small-object detection was improved by focusing on local features using only one layer, and the inference speed was improved through a proposed knowledge distillation method, which transfers from a large pre-trained network to a small network, together with an adaptive quantization method based on parameter size. The proposed model was evaluated on the SUN RGB-D validation set and a self-made apple-tree dataset. It achieved 62.04% at mAP@0.25 and 47.1% at mAP@0.5, with an inference speed of 120.5 scenes per second, showing fast real-time processing.
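The abstract does not specify the exact distillation loss, so the sketch below uses the generic soft-target formulation (temperature-scaled KL divergence plus a ground-truth term) as a stand-in for the teacher-to-student transfer it describes; the temperature T and mixing weight alpha are assumed values:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Temperature-scaled KL term: push the student toward the teacher's
    # soft output distribution (scaled by T*T to keep gradient magnitudes).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Ordinary supervised term against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```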

Software Equation Based on Function Points (기능점수 기반 소프트웨어 공식)

  • Lee, Sang-Un
    • The KIPS Transactions:PartD
    • /
    • v.17D no.5
    • /
    • pp.327-336
    • /
    • 2010
  • This paper proposes a software equation relating effort and duration to software size measured in function points (FP). Existing software equations are based on lines of code (LOC), but LOC varies greatly with the development language, which makes software size estimation difficult. We first considered converting LOC to FP; however, the conversion ratio between LOC and FP is not firmly established for each development language, and deriving a software equation through such a conversion failed because the development language was often unspecified. Therefore, we derived the software equation directly from data on large projects whose size was measured in FP. First, we selected projects with reasonable development periods. Second, we performed regression analysis relating FP to effort on these data and derived the relation between FP and duration. Finally, the software equation was derived from these relations. The proposed model resolves the application problems of LOC-based models and has the advantage of being easily applicable in practice.
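As a sketch of the kind of regression involved, the snippet below fits the conventional power-law form Effort = a · FP^b by least squares in log-log space. The data points are made up for illustration; the paper fits real project data that the abstract does not reproduce:

```python
import numpy as np

# Hypothetical (FP, effort in person-months) pairs for illustration only;
# the paper fits real large-project data not reproduced in the abstract.
fp = np.array([120.0, 450.0, 800.0, 1500.0, 3200.0])
effort = np.array([14.0, 60.0, 115.0, 240.0, 560.0])

# Fit Effort = a * FP^b by ordinary least squares in log-log space,
# the usual power-law form of a software equation.
b, log_a = np.polyfit(np.log(fp), np.log(effort), 1)
print(f"Effort ~= {np.exp(log_a):.3f} * FP^{b:.3f}")
```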

On the Privacy Preserving Mining Association Rules by using Randomization (연관규칙 마이닝에서 랜덤화를 이용한 프라이버시 보호 기법에 관한 연구)

  • Kang, Ju-Sung;Cho, Sung-Hoon;Yi, Ok-Yeon;Hong, Do-Won
    • The KIPS Transactions:PartC
    • /
    • v.14C no.5
    • /
    • pp.439-452
    • /
    • 2007
  • We study privacy-preserving data mining (PPDM for short) using randomization. Theoretical PPDM based on secure multi-party computation techniques is impractical because of its computational inefficiency, so we concentrate on a practical PPDM approach, namely randomization. We survey various privacy measures and study the privacy-preserving mining of association rules using randomization. We propose a new randomization operator, the binomial selector, as a privacy-preserving technique for association rule mining. A binomial selector is a special case of the select-a-size operator of Evfimievski et al. [3]. Moreover, we present simulation results for choosing an appropriate parameter for a binomial selector. Randomization by the so-called cut-and-paste method of [3] is inefficient and has high variance in the recovered support values for large itemsets. Our randomization by a binomial selector makes up for these defects of the cut-and-paste method.
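On the natural reading that a binomial selector keeps each item of a transaction independently with probability p (the binomial special case of select-a-size), a minimal sketch looks like the following; the parameter name p and the omission of any fake-item insertion step are assumptions on my part:

```python
import random

def binomial_selector(transaction, p, rng=random.Random(0)):
    """Randomize one transaction by keeping each item independently with
    probability p. Drawing the subset size from Binomial(|t|, p) and then
    a uniform subset of that size is equivalent, which is why this is a
    special case of select-a-size. Any insertion of fake items that the
    full scheme may require is omitted here."""
    return [item for item in transaction if rng.random() < p]

print(binomial_selector(["milk", "bread", "beer", "soap"], 0.5))
```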

SSQUSAR : A Large-Scale Qualitative Spatial Reasoner Using Apache Spark SQL (SSQUSAR : Apache Spark SQL을 이용한 대용량 정성 공간 추론기)

  • Kim, Jonghoon;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.2
    • /
    • pp.103-116
    • /
    • 2017
  • In this paper, we present the design and implementation of a large-scale qualitative spatial reasoner, which can efficiently derive new qualitative spatial knowledge representing both topological and directional relationships between two arbitrary spatial objects using Apache Spark SQL. Apache Spark SQL is well known as a distributed parallel programming environment that provides both efficient join operations and query processing over a variety of data in Hadoop cluster computer systems. In our spatial reasoner, the overall reasoning process is divided into six jobs: knowledge encoding, inverse reasoning, equal reasoning, transitive reasoning, relation refining, and knowledge decoding; the execution order of these jobs is determined in consideration of both logical causal relationships and computational efficiency. The knowledge encoding job reduces the size of the knowledge base to reason over by transforming the input XML/RDF knowledge into a more precise form. Repetition of the transitive reasoning and relation refining jobs usually consumes most of the computation time and storage of the overall reasoning process. To improve these jobs, our reasoner finds the minimal disjunctive relations for qualitative spatial reasoning and, based upon them, not only reduces the composition table used for the transitive reasoning job but also optimizes the relation refining job. Through experiments using a large-scale benchmark spatial knowledge base, the proposed reasoner showed high performance and scalability.
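The heart of the transitive reasoning job is a self-join of the relation facts against a composition table, which maps naturally onto Spark SQL. The toy schema below (tables triples and comp, columns s/rel/o and r1/r2/r3) is an assumed encoding for illustration, not the reasoner's actual one:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("qsr-sketch").getOrCreate()

# Toy relation facts and a one-row composition table ("in" composed with
# "in" yields "in"); real schemas and relation names will differ.
spark.createDataFrame([("A", "in", "B"), ("B", "in", "C")],
                      ["s", "rel", "o"]).createOrReplaceTempView("triples")
spark.createDataFrame([("in", "in", "in")],
                      ["r1", "r2", "r3"]).createOrReplaceTempView("comp")

# One transitive-reasoning step expressed as joins; iterating this query
# to a fixed point is what the transitive reasoning job repeats.
spark.sql("""
    SELECT a.s, c.r3 AS rel, b.o
    FROM triples a
    JOIN triples b ON a.o = b.s
    JOIN comp c ON a.rel = c.r1 AND b.rel = c.r2
""").show()  # derives (A, in, C)
```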

A study on Cavity Closure Behavior During Hot Open Die Forging Process (열간 자유단조 공정시 내부 공극 압착 거동에 관한 연구)

  • Kwon, Y.C.;Lee, J.H.;Lee, S.W.;Jung, Y.S.;Kim, N.S.;Lee, Y.S.
    • Transactions of Materials Processing
    • /
    • v.16 no.4 s.94
    • /
    • pp.293-298
    • /
    • 2007
  • Recently, there is a need to produce large forged parts for the aircraft, shipbuilding, energy, and military industries. Therefore, an open die forging technique for cast ingots is required to obtain large forged parts of higher quality. Cogging is one of the primary stages in many open die forging processes, and in the cogging stage prior to open die forging, internal cavities have to be eliminated to obtain defect-free parts. The present work is concerned with the elimination of internal cavities in large ingots so as to obtain sound products. In this study, hot compression tests were carried out to obtain the flow stress of the cast microstructure at different temperatures and strain rates. FEM analysis was performed to investigate the overlap defect of cast ingots during the cogging stage, and the measured flow stress data were used to simulate the cogging process of a cast ingot with practical material properties. The analysis of cavity closure was performed using DEFORM™-3D. The calculated cavity closure behavior was compared with results measured before and after cogging by an X-ray scanner. From these results, criteria for the amount of deformation required for cavity closure can be investigated by comparing the practical experiments with the numerical analysis.

Efficient Implementation of Convolutional Neural Network Using CUDA (CUDA를 이용한 Convolutional Neural Network의 효율적인 구현)

  • Ki, Cheol-Min;Cho, Tai-Hoon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.6
    • /
    • pp.1143-1148
    • /
    • 2017
  • Currently, artificial intelligence and deep learning are attracting wide attention, and these technologies are being applied to various fields. Among the many algorithms in artificial intelligence, a notable one is the convolutional neural network (CNN), which is a multilayer neural network with added convolution layers. When a CNN is used on a small amount of data, or when the layer structure is not complicated, speed is not a concern; but training takes a long time when the training data are large and the layer structure is complicated. In such cases, GPU-based parallel processing is frequently needed. In this paper, we implemented convolutional neural networks using CUDA and show that their training is faster and more efficient than training with some other frameworks or programs.
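The paper's contribution is a hand-written CUDA implementation, which an abstract cannot show; purely as an illustration of the GPU speedup it reports, the sketch below times the same convolution on CPU and GPU using PyTorch (batch size and layer shape are arbitrary choices, not the paper's setup):

```python
import time
import torch

x = torch.randn(64, 3, 224, 224)              # arbitrary input batch
conv = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1)

def bench(device):
    xd, cd = x.to(device), conv.to(device)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    with torch.no_grad():
        for _ in range(10):
            cd(xd)
    if device == "cuda":
        torch.cuda.synchronize()
    return time.perf_counter() - start

print("cpu :", bench("cpu"))
if torch.cuda.is_available():
    print("cuda:", bench("cuda"))             # typically far faster
```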

Performance improvement for Streaming of High Capacity Panoramic Video (대용량 파노라마 비디오 스트리밍의 성능개선)

  • Kim, Young-Back;Kim, Tae-Ho;Lee, Dae-Gyu;Kim, Jae-Joon
    • Journal of Internet Computing and Services
    • /
    • v.11 no.2
    • /
    • pp.143-153
    • /
    • 2010
  • Providing high-quality panoramic video across the Internet, mobile communications, and broadcasting requires a video codec that satisfies both high compression efficiency and random access. High compression efficiency is needed to enable streaming of high-volume panoramic data, while random access allows the user to move the viewpoint and viewing direction freely. In this paper, we propose a parallel processing scheme over cell units to improve the performance of streaming large-screen panoramic video within 10 Mbps of bandwidth, based on H.264/AVC with its high compression rate. The algorithm divides a frame into cells of at most 256×256 pixels, encodes them, and decodes only the cells in the present view, with encoding/decoding parallel-processed per cell unit. Also, since only the cells included in the present view are packed and transmitted, the feasibility of processing without extracting the blocks outside the view is demonstrated by experiment.
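A small sketch of the cell-selection step: given the current view rectangle, compute which 256×256 cells must be decoded and transmitted. The grid layout and parameter names are illustrative assumptions, not the paper's codec interface:

```python
CELL = 256  # maximum cell edge from the paper

def visible_cells(view_x, view_y, view_w, view_h, frame_w, frame_h):
    """Return (row, col) indices of the grid cells overlapping the view;
    only these cells need to be decoded and transmitted."""
    c0, c1 = view_x // CELL, (view_x + view_w - 1) // CELL
    r0, r1 = view_y // CELL, (view_y + view_h - 1) // CELL
    cols = range(max(c0, 0), min(c1, (frame_w - 1) // CELL) + 1)
    rows = range(max(r0, 0), min(r1, (frame_h - 1) // CELL) + 1)
    return [(r, c) for r in rows for c in cols]

# A 640x360 view into a 4096x1024 panorama touches just 6 of 64 cells.
print(visible_cells(300, 100, 640, 360, 4096, 1024))
```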

A Study on Area Detection Using Transfer-Learning Technique (Transfer-Learning 기법을 이용한 영역검출 기법에 관한 연구)

  • Shin, Kwang-seong;Shin, Seong-yoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2018.10a
    • /
    • pp.178-179
    • /
    • 2018
  • Recently, machine learning methods in artificial intelligence, for applications such as autonomous navigation and speech recognition, have been actively studied. Classical image processing methods, such as boundary detection and pattern recognition, have many limitations in recognizing a specific object or area in a digital image, whereas machine learning methods such as deep learning can obtain much better results. Fundamentally, however, machine learning methods such as deep learning require a large amount of training data, so they are difficult to apply to area classification when very little data is available, as with aerial photographs for environmental analysis. In this study, we apply a transfer-learning technique that can be used when the input dataset is small and the shape of the input image is not included in the categories of the training dataset.
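A typical transfer-learning recipe of the kind described: freeze an ImageNet-pretrained backbone and train only a new classification head on the small target set. The choice of ResNet-18, a two-class head, and the random stand-in batch below are all illustrative assumptions; the abstract names neither a backbone nor the dataset:

```python
import torch
import torchvision

# Load an ImageNet-pretrained backbone and freeze its feature extractor.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
for p in model.parameters():
    p.requires_grad = False

# Replace the head with a fresh trainable layer for the new categories.
model.fc = torch.nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

x = torch.randn(8, 3, 224, 224)        # stand-in batch of aerial patches
y = torch.randint(0, 2, (8,))          # stand-in area labels
loss = loss_fn(model(x), y)            # one illustrative training step
loss.backward()
optimizer.step()
```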


Image Separation of Talker from a Background by Differential Image and Contours Information (차영상 및 윤곽선에 의한 배경에서 화자분리)

  • Park Jong-Il;Park Young-Bum;Yoo Hyun-Joong
    • The KIPS Transactions:PartB
    • /
    • v.12B no.6 s.102
    • /
    • pp.671-678
    • /
    • 2005
  • In this paper, we suggest an algorithm that extracts the important object from motion pictures and then replaces the background with arbitrary images. The suggested technique can be used not only for protecting privacy and reducing the amount of data to be transferred by removing the background of each frame, but also for replacing the background with a user-selected image in video communication systems, including mobile phones. Because of the relatively large size of image data, digital image processing usually consumes substantial resources such as memory and CPU time; this can cause trouble especially for mobile video phones, which typically have restricted resources. In our experiments, we reduced the time and memory needed to process the images by restricting the search area to the vicinity of the major object's contour found in the previous frame, based on the fact that the major object generally does not move widely or rapidly between frames. Specifically, we detected edges and used the edge image of the initial frame to locate candidate object areas. Then, in the located areas, we computed the difference image between adjacent frames and used it to determine and trace the major object that might be moving. We then computed the contour of the major object and used it to separate the major object from the background. We could successfully separate the major object from the background and replace the background with arbitrary images.
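A bare-bones sketch of this pipeline using OpenCV: difference adjacent frames, threshold, take the largest contour as the talker, and composite it onto a new background. It omits the edge-based candidate regions and the previous-frame search restriction the authors describe, so it illustrates the idea rather than their exact algorithm:

```python
import cv2
import numpy as np

def replace_background(prev_gray, curr_gray, curr_bgr, new_bg, thresh=25):
    """Difference adjacent frames, keep the largest contour as the talker,
    and composite it onto new_bg (same size as curr_bgr)."""
    diff = cv2.absdiff(curr_gray, prev_gray)
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((7, 7), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        talker = np.zeros_like(mask)
        cv2.drawContours(talker, [max(contours, key=cv2.contourArea)],
                         -1, 255, cv2.FILLED)
        mask = talker
    out = new_bg.copy()
    out[mask == 255] = curr_bgr[mask == 255]   # paste the talker pixels
    return out
```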