• Title/Summary/Keyword: Large Data Set

SUPPORT VECTOR MACHINE USING K-MEANS CLUSTERING

  • Lee, S.J.;Park, C.;Jhun, M.;Koo, J.Y.
    • Journal of the Korean Statistical Society / v.36 no.1 / pp.175-182 / 2007
  • The support vector machine has been successful in many applications because of its flexibility and high accuracy. However, when the training data set is large or imbalanced, the support vector machine may suffer from a heavy computational burden or a loss of accuracy in predicting minority classes. We propose a modified version of the support vector machine based on K-means clustering that exploits the information in class labels during the clustering process. For large data sets, our method saves computation time by reducing the number of data points without a significant loss of accuracy. Moreover, our method deals effectively with imbalanced data sets by alleviating the influence of the dominant class.
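
The data-reduction idea in this abstract can be illustrated with a minimal sketch. It assumes one simple way of using class labels during clustering, namely running K-means separately inside each class and keeping only the labelled centroids; the abstract does not say that this is the authors' exact procedure. The names `kmeans`, `reduce_by_class`, and `k_per_class` are hypothetical, and the SVM itself is left out: the reduced set would be passed to any SVM trainer.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            groups[j].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        for j, g in enumerate(groups):
            if g:
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def reduce_by_class(X, y, k_per_class):
    """Assumed scheme: cluster each class on its own, return labelled centroids."""
    reduced_X, reduced_y = [], []
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        k = min(k_per_class, len(members))
        for c in kmeans(members, k):
            reduced_X.append(c)
            reduced_y.append(label)
    return reduced_X, reduced_y

# Toy imbalanced data: 200 majority points and 10 minority points.
rng = random.Random(1)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)] + \
    [(rng.gauss(4, 1), rng.gauss(4, 1)) for _ in range(10)]
y = [0] * 200 + [1] * 10

Xr, yr = reduce_by_class(X, y, k_per_class=5)
print(len(Xr), yr.count(0), yr.count(1))  # prints "10 5 5"
```

In this toy run, 210 points shrink to 10 and both classes end up with the same number of representatives, which is one way the dominant class's influence could be reduced.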

Evaluating the effect of the size of brand consideration set upon the Gutenberg's monopolistic price interval (고려상표군 크기에 따른 구텐베르그의 가격독점영역에 관한 연구)

  • 백지원;황선진;이수진
    • Journal of the Korean Society of Clothing and Textiles / v.27 no.8 / pp.1004-1013 / 2003
  • This study addressed a poorly understood issue: the price response model and the monopolistic price interval of fashion goods. The concept of the monopolistic price interval introduced by Gutenberg has rarely been applied to fashion goods, which are known to be price sensitive. This study therefore examined the price-insensitive zone of blue jeans. Data from 268 respondents were analyzed using Choice-based Conjoint (CBC) analysis and t-tests. Treating the brand consideration set as a price determinant, we found evidence of a monopolistic price interval for jeans. The CBC results showed that the larger the brand consideration set, the shorter the monopolistic interval. This implies that a consumer with a small brand consideration set is more likely to have a longer monopolistic price interval than one with a large set, since a consumer with a small consideration set tends to value the brand itself more than price. Although significant monopolistic price intervals appeared for only three of the seven jean brands, reducing the size of the brand consideration set and increasing brand loyalty were found to be important for maximizing firms' financial profits.

Algorithm for the Constrained Chebyshev Estimation in Linear Regression

  • Kim, Bu-yong
    • Communications for Statistical Applications and Methods / v.7 no.1 / pp.47-54 / 2000
  • This article is concerned with an algorithm for Chebyshev estimation with or without linear equality and/or inequality constraints. The algorithm employs a linear scaling transformation scheme to reduce the computational burden that arises when the data set is large. The convergence of the proposed algorithm is proved. In addition, updating and orthogonal decomposition techniques are considered to improve computational efficiency and numerical stability.
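
For orientation, Chebyshev (minimax) estimation chooses the coefficient vector that minimizes the largest absolute residual. The standard linear-programming form is sketched below. The symbols are assumptions, since the abstract gives no notation: $y_i$ and $x_i$ denote the responses and regressors, $\beta$ the coefficients, and $A, b, C, d$ the optional equality and inequality constraints. This is the textbook formulation, not the article's scaling algorithm.

```latex
\begin{aligned}
\min_{\beta,\, t}\quad & t \\
\text{subject to}\quad & -t \le y_i - x_i^{\top}\beta \le t, \qquad i = 1, \dots, n, \\
& A\beta = b, \qquad C\beta \le d .
\end{aligned}
```

At the optimum, $t = \max_i \lvert y_i - x_i^{\top}\beta \rvert$. The program has two inequality constraints per observation, so it grows with the size of the data set.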

A Robotic Medical Palpation using Contact Pressure Distribution (접촉 압력 분포를 이용한 로봇 의료 촉진)

  • Kim, Hyoungkyun;Choi, Seungmoon;Chung, Wan Kyun
    • The Journal of Korea Robotics Society / v.12 no.3 / pp.322-331 / 2017
  • In this paper, we present a novel robotic palpation method that estimates lump shape from the contact pressure distribution. Many previous studies of robotic palpation have used a stiffness map, which is not well suited to obtaining geometric information about a lump. As a result, they require a large data set and a long palpation time to estimate the lump shape. Instead of a stiffness map, the proposed method uses the difference between the normal force direction and the surface normal to detect the lump boundary and estimate its normal. The palpation trajectory is generated from the normal of the lump boundary so that it tracks the boundary in real time. The proposed approach requires only a small data set and a short palpation time, since the shape can be estimated directly from the optimally generated palpation trajectory. Experimental results show that our method finds the lump shape accurately in real time with little data and in a short time.
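
The boundary cue described in this abstract, a mismatch between the measured normal force direction and the surface normal, can be sketched as an angle test. This is only an illustration under assumptions. The function names and the 10-degree threshold are hypothetical, and the abstract does not say how the authors quantify the difference.

```python
import math

def angle_between(u, v):
    """Angle in radians between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))

def on_lump_boundary(force, surface_normal, threshold_deg=10.0):
    """Assumed rule: flag a boundary when the force tilts away from the normal."""
    return math.degrees(angle_between(force, surface_normal)) > threshold_deg

# Force aligned with the surface normal: no boundary expected.
flat = on_lump_boundary((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
# Force tilted by roughly 17 degrees: treated as a boundary contact.
edge = on_lump_boundary((0.3, 0.0, 1.0), (0.0, 0.0, 1.0))
print(flat, edge)  # prints "False True"
```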

Compression of Simulation Results by Sampling (샘플링에 의한 시뮬레이션 결과의 압축)

  • 안태균;최기영
    • Journal of the Korean Institute of Telematics and Electronics A / v.31A no.5 / pp.158-169 / 1994
  • It is very common in today's design practice to simulate a big design with a large set of test vectors, thereby generating a huge set of data (simulation results) to be analyzed. As the design grows, the simulation results grow and become harder to handle. In this paper, we present algorithms for the compression and regeneration of simulation results. Compression is performed by sampling nets in a circuit. If the user wants to examine the discarded part of the data, it is quickly regenerated by applying an incremental simulation technique. Experimental results for several practical circuits show that a compression ratio of 10 is easily obtained while maintaining reasonably fast regeneration of data on a 15.7 MIPS workstation. Using the proposed method, we can effectively reduce debug cycle time.

A Method for the Extraction of a Subset of Points from a Large Set of Points Affecting the Distribution of Surface Data - A Case Study of Market Area and Competitive Power Analysis by Sales Data of Micro Scale Retail Stores - (평면 데이터 분포에 영향을 끼치는 점 분포의 부분집합 추출 방법 - 소규모 소매점포의 매출자료를 이용한 상권 및 경쟁력 분석기법을 사례로 -)

  • Lee, Jung-Eun;Sadahiro, Yukio
    • Journal of the Korean Association of Geographic Information Studies / v.9 no.1 / pp.1-12 / 2006
  • Approaches to spatial analysis differ according to the type of spatial objects to be treated. Here we consider the case in which two spatial data sets coexist. The goal in such a case is to detect, out of a large set of spatial objects, the subset that affects the distribution of the other object. However, it is not easy to extract a subset from a large set by visualization alone, even with the help of GIS, given the huge amount of data now available. In this research, therefore, the relationship between two different spatial data sets is analyzed with a quantitative measure in a case study of marketing geography. The purchase history data of a small retail store and the locations of its competitors are given as source data for the analysis. The goal of the analysis in this case study is to extract, from among many competitors, the strong competitors that affect the store's sales. The results are expected to help explain the market area pattern and competitive power of stores in a micro-scale retail environment through a quantitative measure.

Surface Water Mapping of Remote Sensing Data Using Pre-Trained Fully Convolutional Network

  • Song, Ah Ram;Jung, Min Young;Kim, Yong Il
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.36 no.5 / pp.423-432 / 2018
  • Surface water mapping has been widely used in various remote sensing applications. Water indices have been commonly used to distinguish water bodies from land; however, determining the optimal threshold and discriminating water bodies from similar objects such as shadows and snow is difficult. Deep learning algorithms have greatly advanced image segmentation and classification. In particular, the FCN (Fully Convolutional Network) is state-of-the-art in per-pixel image segmentation and is used in most benchmarks such as PASCAL VOC2012 and Microsoft COCO (Common Objects in Context). However, these data sets are designed for everyday scenes, and few studies have applied FCNs to large-scale remotely sensed data sets. This paper aims to fine-tune a pre-trained FCN using the CRMS (Coastwide Reference Monitoring System) data set for surface water mapping. The CRMS provides color infrared aerial photos and ground truth maps for the monitoring and restoration of wetlands in Louisiana, USA. To learn the characteristics of surface water effectively, we used the pre-trained DeepWaterMap network, which classifies water, land, snow, ice, clouds, and shadows using Landsat satellite images. The DeepWaterMap network was then fine-tuned on the CRMS data set using two classes: water and land. The fine-tuned network then classifies surface water without any additional learning process. The experimental results show that the proposed method enables high-quality surface mapping from the CRMS data set and demonstrate the suitability of pre-trained FCNs using remote sensing data for surface water mapping.
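
The water-index baseline that this abstract contrasts with deep learning can be sketched as follows. The example assumes the commonly used NDWI, (green - NIR) / (green + NIR), with a threshold of 0; the abstract does not name a specific index or threshold. The reflectance values are made up for illustration.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index for one pixel."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total

def water_mask(green_band, nir_band, threshold=0.0):
    """Mark a pixel as water when its NDWI exceeds the (assumed) threshold."""
    return [[ndwi(g, n) > threshold for g, n in zip(g_row, n_row)]
            for g_row, n_row in zip(green_band, nir_band)]

# Hypothetical 2x2 reflectance grids: left column water-like, right column land-like.
green = [[0.30, 0.10], [0.25, 0.08]]
nir = [[0.05, 0.40], [0.04, 0.35]]

mask = water_mask(green, nir)
print(mask)  # prints "[[True, False], [True, False]]"
```

The single `threshold` parameter is the weak point the abstract mentions: the best value varies from scene to scene.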

Building Method an Image Dataset for Tracking Objects in a Video (동영상 내 객체 추적을 위한 영상 데이터셋 구축 방법)

  • Kim, Ji-Seong;Heo, Gyeongyong;Jang, Si-Woong
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.12 / pp.1790-1796 / 2021
  • Deep learning on images requires a large amount of image data, and the methods for obtaining images and constructing image data sets differ considerably depending on the type of object. In this paper, we present a method of constructing an image data set for deep learning and analyze how performance varies with the object to be tracked. We recorded a video while rotating the object and then created a data set by segmenting the video using the proposed construction method. In the performance analysis, the detection rate was more than 95%, and it was higher for objects with little change in shape. The data set construction method presented in this paper is therefore considered effective when image data are difficult to obtain and when the object tracked in a video changes little in shape.

Performance Evaluation and Enhancement of Transmission Technique in Wireless Sensor Networks (무선센서네트워크에서 성능측정을 통한 전송방식의 문제점 분석 및 개선)

  • Lim, Dong-Sun;Lee, Joa-Hyoung;Jung, In-Bum
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.5 / pp.1311-1321 / 2010
  • Sensor networks are used to obtain sensing data in various areas. The sensing interval depends on the type of target application, and the amount of data generated by sensor nodes is not constant. Many applications use a long sensing interval to extend the lifetime of the network, but some applications require a very short interval to obtain fine-grained, high-precision sensing data. If the number of nodes in the network increases and the sensing interval is shortened, the amount of generated data grows greatly, which increases the number of packets to be transferred over the network. To transfer a large number of packets quickly, the delay between successive packet transmissions should be minimized. In this paper, we propose SET (SendDone Event based Transmission Technique), which reduces the delay between successive packet transmissions by using the SendDone event, which signals that a packet transmission has been completed. In SET, the delay between successive packet transmissions is greatly shortened because the transmission of the next packet starts as soon as the transmission of the previous packet has completed, irrespective of the transmission time. SET can therefore provide a high packet transmission rate when there is a large number of packets.
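
The timing argument in this abstract can be illustrated with a small calculation. It compares an assumed baseline, a fixed timer that must be long enough for the slowest packet, with starting each packet as soon as the previous one completes. The baseline, the function names, and the transmission times are hypothetical. The abstract describes only the event-driven behaviour.

```python
def timer_based_total(tx_times, fixed_delay):
    """Assumed baseline: each send starts fixed_delay after the previous send started."""
    if fixed_delay < max(tx_times):
        raise ValueError("fixed_delay must cover the slowest transmission")
    # The last packet still needs its own transmission time after its start.
    return fixed_delay * (len(tx_times) - 1) + tx_times[-1]

def event_based_total(tx_times):
    """Event-driven: the next send starts the moment the previous one completes."""
    return sum(tx_times)

# Hypothetical per-packet transmission times in milliseconds.
tx_ms = [4, 6, 5, 9, 4]

timer_total = timer_based_total(tx_ms, fixed_delay=10)
event_total = event_based_total(tx_ms)
print(timer_total, event_total)  # prints "44 28"
```

The gap between the two totals grows with the number of packets and with the spread of transmission times, which is why a completion event helps most when many packets are queued.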