• Title/Abstract/Keywords: K-means clustering technique

151 search results; processing time 0.031 s

Sales Forecasting Model for Apparel Products Using Machine Learning Technique - A Case Study on Forecasting Outerwear Items -

  • 채진미;김은희
    • 한국의류산업학회지 / Vol. 23, No. 4 / pp. 480-490 / 2021
  • Sales forecasting is crucial for many retail operations. For apparel retailers, an accurate sales forecast for the next season is critical for managing inventory and planning supply chains. The task is difficult because apparel products are new every season and have numerous variations, short life cycles, long lead times, and seasonal trends. In this study, a sales forecasting model for apparel products is proposed using machine learning techniques. Four years of sales data for outerwear items were collected from a Korean sports brand and filtered to remove outliers. The data were then standardized by removing the effects of exogenous variables. The sales patterns of outerwear items were clustered by applying K-means clustering, and the outerwear attributes associated with each sales-pattern type were determined using a decision tree classifier. Six sales-pattern clusters were derived and classified using a hybrid model of clustering and a decision tree algorithm, revealing the relationship between outerwear attributes and sales patterns. Each sales pattern can be used to predict stock-keeping-unit-level sales based on item attributes.
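
The clustering step in this abstract can be illustrated with a minimal sketch. It assumes each item's sales curve is first rescaled to shares of its total, which is only a simple stand-in for the paper's standardization, and it uses hypothetical weekly figures with k=2 rather than the six clusters the authors report. The function names and the farthest-point initialization are illustrative choices, not the authors' implementation.

```python
def normalize(curve):
    """Rescale a sales curve to shares of its total, keeping shape and dropping volume."""
    total = sum(curve)
    return [v / total for v in curve] if total else list(curve)

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialization (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

# Hypothetical weekly unit sales for six items (not data from the paper).
sales = [
    [90, 60, 30, 10],  # early peak
    [80, 50, 20, 5],
    [45, 30, 15, 5],
    [5, 20, 50, 80],   # late peak
    [10, 30, 60, 90],
    [2, 10, 25, 45],
]
curves = [normalize(s) for s in sales]
centroids, labels = kmeans(curves, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The resulting labels could then serve as the target of a classifier trained on item attributes, which is the role the decision tree plays in the abstract.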

EEIRI: Efficient Encrypted Image Retrieval in IoT-Cloud

  • Abduljabbar, Zaid Ameen;Ibrahim, Ayad;Hussain, Mohammed Abdulridha;Hussien, Zaid Alaa;Al Sibahee, Mustafa A.;Lu, Songfeng
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 13, No. 11 / pp. 5692-5716 / 2019
  • One of the best ways to safeguard the confidentiality, security, and privacy of an image within the IoT-Cloud is encryption. However, searching encrypted data is difficult. Several techniques for searching encrypted data have been devised, but some of them cannot be used in the IoT-Cloud because they are not lightweight. We propose EEIRI, a lightweight scheme that performs content-based search over encrypted images. In this scheme, images are represented using local features. We develop and validate a secure scheme for measuring the Euclidean distance between two descriptor sets. To improve search efficiency, we employ the k-means clustering technique to construct a searchable tree-based index. Our index construction process ensures the privacy of the stored data and search requests. Compared with more familiar techniques for searching images over plaintexts, EEIRI remains efficient, incurring a 7% higher search cost and a 1.7% decrease in search accuracy. Extensive experiments on real image collections support our work.
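
The idea of a k-means tree index can be sketched as follows. This is a plaintext illustration only: it omits the encryption and the secure distance computation that EEIRI relies on, and the branching factor of 2, the leaf size, and all function names are assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def split(items, c0, c1):
    """Partition (item_id, vector) pairs by their nearer centroid."""
    left = [it for it in items if sq_dist(it[1], c0) <= sq_dist(it[1], c1)]
    right = [it for it in items if sq_dist(it[1], c0) > sq_dist(it[1], c1)]
    return left, right

def two_means(items, iters=20):
    """Small deterministic k-means with k=2 over (item_id, vector) pairs."""
    vecs = [v for _, v in items]
    c0 = vecs[0]
    c1 = max(vecs, key=lambda v: sq_dist(v, c0))  # farthest-point init
    for _ in range(iters):
        left, right = split(items, c0, c1)
        if not left or not right:
            break
        n0, n1 = mean([v for _, v in left]), mean([v for _, v in right])
        if n0 == c0 and n1 == c1:
            break
        c0, c1 = n0, n1
    left, right = split(items, c0, c1)
    return (c0, left), (c1, right)

def build(items, leaf_size=2):
    """Recursively split items until each leaf holds at most leaf_size entries."""
    if len(items) <= leaf_size:
        return {"leaf": items}
    (c0, left), (c1, right) = two_means(items)
    if not left or not right:  # cannot split further, e.g. duplicate vectors
        return {"leaf": items}
    return {"children": [(c0, build(left, leaf_size)), (c1, build(right, leaf_size))]}

def search(node, query):
    """Greedy descent to a single leaf, then an exact scan inside it."""
    while "children" in node:
        _, node = min(node["children"], key=lambda child: sq_dist(child[0], query))
    return min(node["leaf"], key=lambda item: sq_dist(item[1], query))[0]

# Hypothetical 2-D descriptors; real local features have far more dimensions.
items = [("a", [0, 0]), ("b", [0, 1]), ("c", [1, 0]),
         ("d", [10, 10]), ("e", [10, 11]), ("f", [11, 10])]
index = build(items)
print(search(index, [0.9, 0.1]))    # c
print(search(index, [10.2, 10.9]))  # e
```

Because the search visits one leaf only, it compares the query against a handful of entries instead of the whole collection, at the price of occasionally missing the true nearest neighbor. That trade-off is the kind an index of this sort makes between search cost and accuracy.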

Preprocessing Technique for Lane Detection Using Image Clustering and HSV Color Model

  • 최나래;최상일
    • 한국멀티미디어학회논문지 / Vol. 20, No. 2 / pp. 144-152 / 2017
  • Among the technologies for implementing autonomous vehicles, the advanced driver assistance system is a key technology for supporting safe driving. In approaches based on the widely used vision sensor, various preprocessing methods are applied before feature extraction for lane detection. However, existing methods produce false positives from unnecessary lane candidates in the road area, such as cars, lawns, and road separators. In addition, lane candidates sometimes cannot be extracted at all: in areas under an overpass, for lanes within dark shadows, for yellow center lanes, and for faint lanes. In this paper, we propose an efficient preprocessing method that uses k-means clustering for image segmentation together with the HSV color model. When the proposed preprocessing method is applied, true positive regions are preserved as far as possible during lane detection and many false positive regions are removed.
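
The HSV part of such a preprocessing step might look like the sketch below. It covers color filtering only, not the k-means image segmentation, and the threshold values and the function name `is_lane_color` are hypothetical choices for illustration rather than values from the paper.

```python
import colorsys

def is_lane_color(rgb):
    """Return True if an (R, G, B) pixel in 0..255 looks white or yellow in HSV."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # each component is in 0..1
    is_white = s < 0.2 and v > 0.8                      # bright and unsaturated
    is_yellow = 0.10 <= h <= 0.20 and s > 0.4 and v > 0.5  # hue near 36 to 72 degrees
    return is_white or is_yellow

GRAY, WHITE, YELLOW, GRASS = (90, 90, 90), (255, 255, 255), (255, 220, 0), (40, 140, 40)

# A tiny hypothetical image: asphalt, a lane marking, and a lawn in each row.
image = [
    [GRAY, WHITE, GRASS],
    [GRAY, YELLOW, GRASS],
]
mask = [[is_lane_color(pixel) for pixel in row] for row in image]
print(mask)  # [[False, True, False], [False, True, False]]
```

Separating hue from brightness is what makes yellow center lanes distinguishable from green lawns even when both have similar intensity in a grayscale image.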

A Study on Lip Detection based on Eye Localization for Visual Speech Recognition in Mobile Environment

  • 송민규;;김진영;황성택
    • 한국지능시스템학회논문지 / Vol. 19, No. 4 / pp. 478-484 / 2009
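  • Speech recognition is an attractive technology for HMI given the current pursuit of convenience in daily life. Although much research on speech recognition is under way, its performance in noisy environments remains weak. To address this, visual speech recognition, which uses visual as well as auditory information, is now being actively studied. This paper proposes a lip detection method for visual speech recognition in a mobile environment. Visual speech recognition requires accurate lip detection. Because eyes are easier to find in the input image than lips, we first detect the eye positions and then use this information to obtain an approximate lip image. We segment the resulting lip image with the K-means clustering algorithm, select the largest of the segmented regions, and obtain the two end points and the center of the lips. Finally, experiments confirm the performance of the proposed method.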
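
The last step described above, picking the largest segmented region and reading off the lip end points and center, can be sketched as follows. The sketch assumes the clustering has already produced a binary mask of lip-colored pixels. The 4-connectivity, the midpoint definition of the center, and the function names are assumptions, not details from the paper.

```python
from collections import deque

def components(mask):
    """Yield 4-connected regions of truthy cells as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                seen.add((r, c))
                queue, region = deque([(r, c)]), []
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    region.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                yield region

def lip_landmarks(mask):
    """Return (left end, right end, center) of the largest region in the mask."""
    region = max(components(mask), key=len)
    left = min(region, key=lambda p: p[1])   # smallest column index
    right = max(region, key=lambda p: p[1])  # largest column index
    center = ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    return left, right, center

# Hypothetical mask: one lip-shaped blob plus an isolated noise pixel at the top right.
mask = [
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
]
print(lip_landmarks(mask))  # ((2, 1), (2, 5), (2.0, 3.0))
```

Keeping only the largest region discards small misclassified patches, such as the single noise pixel here, before the end points are measured.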

An eigenspace projection clustering method for structural damage detection

  • Zhu, Jun-Hua;Yu, Ling;Yu, Li-Li
    • Structural Engineering and Mechanics / Vol. 44, No. 2 / pp. 179-196 / 2012
  • An eigenspace projection clustering method is proposed for structural damage detection by combining a projection algorithm with a fuzzy clustering technique. The integrated procedure for structural damage assessment comprises data selection, data normalization, projection, damage feature extraction, and clustering. The frequency response functions (FRFs) of the healthy and the damaged structure are used as initial data, median values of the projections are taken as damage features, and the fuzzy c-means (FCM) algorithm is used to categorize these features. The performance of the proposed method has been validated using a three-story frame structure built and tested by Los Alamos National Laboratory, USA. Two projection algorithms, principal component analysis (PCA) and kernel principal component analysis (KPCA), are compared for extracting damage features, and six kinds of distances adopted in the FCM process are studied and discussed. The results reveal that the choice of distance depends on the distribution of the features. For the optimal choice of projections, the Cosine distance is recommended for PCA, while the Seuclidean and Cityblock distances are suitable for KPCA. The PCA method is recommended when a large amount of data needs to be processed because it yields more correct decisions at lower computational cost.
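
A minimal sketch of fuzzy c-means with a pluggable distance function is given below, to show where the choice of distance enters the algorithm. The feature values and initial centers are hypothetical, only two of the six distances are included, and the weighted-mean center update is kept for both distances as a simplification (it is the exact minimizer only for the Euclidean case).

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cityblock(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def memberships(point, centers, dist, m=2.0):
    """Standard FCM membership of one point in every cluster (sums to 1)."""
    d = [dist(point, c) for c in centers]
    if 0.0 in d:  # the point sits exactly on at least one center
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in d) for di in d]

def fcm(points, centers, dist, m=2.0, iters=100, tol=1e-9):
    """Alternate membership and center updates until the centers stop moving."""
    dims = len(points[0])
    for _ in range(iters):
        u = [memberships(p, centers, dist, m) for p in points]
        updated = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]
            total = sum(w)
            updated.append([sum(wj * p[k] for wj, p in zip(w, points)) / total
                            for k in range(dims)])
        shift = max(dist(a, b) for a, b in zip(centers, updated))
        centers = updated
        if shift < tol:
            break
    return centers, [memberships(p, centers, dist, m) for p in points]

def hard_labels(u):
    """Index of the largest membership in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in u]

# Hypothetical 2-D damage features: three "healthy" and three "damaged" samples.
features = [[0.10, 0.20], [0.12, 0.18], [0.09, 0.22],
            [0.90, 0.80], [0.95, 0.85], [1.00, 0.78]]
start = [[0.0, 0.0], [1.0, 1.0]]

_, u_euclid = fcm(features, start, euclidean)
_, u_city = fcm(features, start, cityblock)
print(hard_labels(u_euclid))  # [0, 0, 0, 1, 1, 1]
print(hard_labels(u_city))    # [0, 0, 0, 1, 1, 1]
```

On well-separated data like this both distances agree. Differences between distances only become visible when the feature distribution is less clean, which is consistent with the abstract's remark that the choice depends on the distribution of features.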

Preliminary Study on Image Processing Method for Concrete Temperature Monitoring using Thermal Imaging Camera

  • 문성환;김태훈;조규만
    • 한국건축시공학회:학술대회논문집 / 한국건축시공학회 2020 Spring Conference / pp. 206-207 / 2020
  • Accurate estimation of concrete strength development at early ages is critical for securing structural stability and for speeding up the construction process. The temperature generated by the heat of hydration is considered a key parameter in predicting early-age strength. Conventionally, concrete temperature has been measured by temperature sensors installed inside the concrete. However, for building structures with multiple floors, this method requires reinstalling and repositioning hardware such as sensors, data loggers, and routers for data transfer, which makes temperature monitoring cumbersome and inefficient. Concrete temperature monitoring using thermal remote sensing can be an effective alternative that overcomes these shortcomings. In this study, image processing was carried out using the K-means clustering technique, an unsupervised learning method, and the classification results were analyzed accordingly. Future research will address how to automatically recognize concrete among various objects by using deep learning techniques.

An Efficient Clustering Algorithm based on Heuristic Evolution

  • 류정우;강명구;김명원
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 29, No. 1_2 / pp. 80-90 / 2002
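  • Clustering groups data so that the members of each cluster share similar properties. It is widely applied in engineering fields such as pattern recognition and image processing, and is actively used as a core technique of data mining, which has recently attracted much attention. In this paper, we address the problems of existing clustering algorithms such as K-means and FCM (Fuzzy C-means), namely convergence to local optima and the need to fix the number of clusters in advance, and define the characteristics of a clustering in terms of dispersion and separation. Dispersion measures how far the data in a cluster are scattered from the cluster center, whereas separation measures the distance between cluster centers, taking into account the degree of membership obtained from the ratio of the distances between a data point and all cluster centers. Using these two measures, the appropriate number of clusters is determined automatically. In addition, the increase in running time caused by the enlarged search space, a known problem of evolutionary algorithms, is greatly reduced by applying heuristic operations. Experiments on two-dimensional and multi-dimensional test data show that the proposed algorithm performs well.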
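
The two quality measures can be illustrated with simplified stand-ins. The sketch below computes dispersion as the mean distance of points to their own cluster center and separation as the smallest distance between any two centers. The paper's separation additionally weights by degree of membership, which is not reproduced here, so these definitions are assumptions made for illustration.

```python
import math

def dispersion(points, labels, centers):
    """Mean distance from each point to the center of its own cluster."""
    return sum(math.dist(p, centers[lab]) for p, lab in zip(points, labels)) / len(points)

def separation(centers):
    """Smallest distance between any pair of cluster centers."""
    return min(math.dist(a, b)
               for i, a in enumerate(centers)
               for b in centers[i + 1:])

# Hypothetical data: two tight pairs of points, ten units apart.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = [0, 0, 1, 1]
centers = [(0, 1), (10, 1)]

print(dispersion(points, labels, centers))  # 1.0
print(separation(centers))                  # 10.0
```

A candidate clustering with low dispersion and high separation is compact and well spread out, so comparing these values across candidate cluster counts is one way to choose the number of clusters automatically.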

Optimizing Clustering and Predictive Modelling for 3-D Road Network Analysis Using Explainable AI

  • Rotsnarani Sethy;Soumya Ranjan Mahanta;Mrutyunjaya Panda
    • International Journal of Computer Science & Network Security / Vol. 24, No. 9 / pp. 30-40 / 2024
  • Building an accurate 3-D spatial road network model has become an active area of research. It promises a new paradigm for developing smart roads and intelligent transportation systems (ITS) that will help public and private road operators improve road mobility and eco-routing, so that smoother traffic, lower carbon emissions, and road safety can be ensured. Handling such large-scale 3-D road network data makes it challenging to obtain the accurate elevation information needed to better estimate CO2 emissions and to route vehicles accurately in an Internet of Vehicles (IoV) scenario. Clustering and regression techniques are found suitable for recovering missing elevation information at some points of a 3-D spatial road network dataset, which is expected to give the public a better eco-routing experience. Further, Explainable Artificial Intelligence (xAI) has recently drawn researchers' attention because it makes models more interpretable, transparent, and comprehensible, enabling efficient model choices that depend on user requirements. The 3-D road network dataset, comprising spatial attributes (longitude, latitude, altitude) of North Jutland, Denmark, and collected from the publicly available UCI repository, is preprocessed through feature engineering and scaling to ensure optimal accuracy for the clustering and regression tasks. K-Means clustering and regression using a Support Vector Machine (SVM) with a radial basis function (RBF) kernel are employed for 3-D road network analysis. Silhouette scores and the number of clusters are used to measure cluster quality, whereas error metrics such as MAE (Mean Absolute Error) and RMSE (Root Mean Square Error) are used to evaluate the regression method. To improve the interpretability of the clustering and regression models, SHAP (Shapley Additive Explanations), a powerful xAI technique, is employed in this research. Extensive experiments show that SHAP analysis validated the importance of latitude and altitude in predicting longitude, particularly in the four-cluster setup, providing critical insights into model behavior and feature contributions, with an accuracy of 97.22% and strong performance metrics across all classes, including an MAE of 0.0346 and an MSE of 0.0018. On the other hand, the ten-cluster setup, while faster in SHAP analysis, presented challenges in interpretability due to increased clustering complexity. Hence, the hybrid model of K-Means clustering with K=4 and SVM demonstrated superior performance and interpretability, highlighting the importance of careful cluster selection to balance model complexity and predictive accuracy.
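
The evaluation metrics named in this abstract can be written out in a few lines. The sketch below uses hypothetical toy data, not the North Jutland dataset, and the function names are illustrative. It also shows how the reported MSE relates to RMSE, since the abstract names RMSE as a metric but reports an MSE value.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error, i.e. the square root of the MSE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    scores = []
    for i, (p, li) in enumerate(zip(points, labels)):
        own = [math.dist(p, q) for j, (q, lj) in enumerate(zip(points, labels))
               if lj == li and j != i]
        if not own:  # singleton cluster: score defined as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lq in zip(points, labels) if lq == other)
            / labels.count(other)
            for other in set(labels) if other != li
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = [0, 0, 1, 1]
y_true, y_pred = [1.0, 2.0, 3.0], [1.5, 2.0, 2.5]

print(round(silhouette(points, labels), 3))
print(round(mae(y_true, y_pred), 4), round(rmse(y_true, y_pred), 4))
print(round(math.sqrt(0.0018), 4))  # RMSE implied by the reported MSE of 0.0018
```

A silhouette value close to 1 indicates compact, well-separated clusters, which is why it is a natural criterion for comparing the four-cluster and ten-cluster setups.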

Clustering-based Statistical Machine Translation Using Syntactic Structure and Word Similarity

  • 김한경;나휘동;이금희;이종혁
    • 한국정보과학회논문지:소프트웨어및응용 / Vol. 37, No. 4 / pp. 297-304 / 2010
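  • In statistical machine translation, one way to improve translation quality is to cluster sentences by type or genre and attempt domain-specific translation. However, no previous study has used sentence-type information and genre information at the same time. In this paper, we combine the two existing techniques by applying type classification based on the similarity of each sentence's syntactic structure together with genre classification based on word similarity. Translation performance is improved by interpolating the domain-specific models extracted from the classified corpora with the model extracted from the whole corpus. A kernel and cosine similarity are used to compute syntactic-structure similarity and word similarity, respectively, and a machine learning technique similar to the K-Means algorithm is used to partition the corpus with the two similarities. In experiments on Japanese-English patent documents, a relative performance improvement of about 2.5% was obtained in the best case.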
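
Two ingredients of this approach, cosine similarity over word counts and interpolation of a domain-specific model with a general one, can be sketched as below. The bag-of-words representation, the linear form of the interpolation, the weight of 0.6, and the toy probabilities are all assumptions for illustration. The abstract does not specify these details.

```python
import math
from collections import Counter

def cosine_similarity(words_a, words_b):
    """Cosine similarity between two sentences represented as word-count vectors."""
    a, b = Counter(words_a), Counter(words_b)
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def interpolate(p_domain, p_general, lam):
    """Linear interpolation: lam * domain model + (1 - lam) * general model."""
    keys = set(p_domain) | set(p_general)
    return {k: lam * p_domain.get(k, 0.0) + (1.0 - lam) * p_general.get(k, 0.0)
            for k in keys}

# Hypothetical tokenized sentences.
s1 = "the device comprises a sensor".split()
s2 = "the device comprises a housing".split()
s3 = "we enjoyed lunch today".split()
print(round(cosine_similarity(s1, s2), 2))  # 0.8
print(cosine_similarity(s1, s3))            # 0.0

# Hypothetical translation probabilities for one source phrase.
p_domain = {"x": 0.8, "y": 0.2}
p_general = {"x": 0.5, "y": 0.3, "z": 0.2}
mixed = interpolate(p_domain, p_general, lam=0.6)
print({k: round(v, 2) for k, v in sorted(mixed.items())})  # {'x': 0.68, 'y': 0.24, 'z': 0.08}
```

Interpolation keeps the sharper estimates of the domain-specific model while falling back on the general model for options the smaller domain corpus never saw, such as `z` in this toy case.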

Analysis of Apartment Power Consumption and Forecast of Power Consumption Based on Deep Learning

  • 유남조;이은애;정범진;김동식
    • 전기전자학회논문지 / Vol. 23, No. 4 / pp. 1373-1380 / 2019
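  • To increase the efficiency of energy production, the advanced metering infrastructure (AMI), one of the smart grid technologies, is being actively developed. Analyzing power consumption data and predicting consumption patterns are core parts of AMI. In this paper, we analyze collected power consumption data, summarize the errors that can occur, and analyze monthly consumption patterns using the k-means clustering algorithm. We also predict consumption patterns using a deep neural network. To overcome the difficulty of predicting each household's daily power usage, power usage is grouped into 100 clusters, and each cluster's daily average is used to predict that cluster's average for the next day. The errors were analyzed using power data from an actual AMI, and introducing the clustering method made it possible to predict power consumption successfully.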
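
The data preparation implied by the cluster-average approach can be sketched as follows. The sketch covers only the computation of per-cluster daily averages and the pairing of each day with the next; the deep neural network itself is not shown. The usage figures, cluster labels, and function names are hypothetical.

```python
def cluster_daily_means(usage, labels):
    """usage[h][d] is household h's consumption on day d; returns {cluster: [mean per day]}."""
    groups = {}
    for series, label in zip(usage, labels):
        groups.setdefault(label, []).append(series)
    return {label: [sum(day) / len(day) for day in zip(*rows)]
            for label, rows in groups.items()}

def next_day_pairs(series):
    """Pair each day's cluster average with the following day's (input, target)."""
    return list(zip(series[:-1], series[1:]))

# Hypothetical daily usage in kWh for three households over three days.
usage = [
    [10, 12, 11],
    [14, 16, 13],
    [3, 4, 5],
]
labels = [0, 0, 1]  # cluster assignment per household, e.g. from k-means

means = cluster_daily_means(usage, labels)
print(means)                     # {0: [12.0, 14.0, 12.0], 1: [3.0, 4.0, 5.0]}
print(next_day_pairs(means[0]))  # [(12.0, 14.0), (14.0, 12.0)]
```

Averaging over a cluster smooths out the erratic behavior of single households, which is the difficulty the abstract cites as the reason for predicting at the cluster level.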