• Title/Summary/Keyword: K-means++ algorithm

[Retracted] Hot Spot Analysis of Tourist Attractions Based on Stay Point Spatial Clustering

  • Liao, Yifan
    • Journal of Information Processing Systems / v.16 no.4 / pp.750-759 / 2020
  • The wide application of integrated location-based services (LBS), social, and tourism applications (apps) has generated a large amount of trajectory space data. The trajectory data are used to identify popular tourist attractions with a high density of tourists, and they are of great significance to the smart service and emergency management of scenic spots. A hot spot analysis method is proposed based on spatial clustering of trajectory stay points. The DBSCAN algorithm is studied for its fast clustering speed, its noise handling, and its ability to find clusters of arbitrary shape. To address the shortcoming that its parameters must be selected manually, an improved method is proposed that adaptively determines the parameters from the statistical distribution characteristics of the data. Clustering analysis and comparison experiments are carried out on three different datasets: an artificial two-dimensional synthetic dataset, the four-dimensional Iris dataset, and the scenic-area trajectory stay points. The experimental results show that the method automatically generates a reasonable clustering division and is superior to traditional algorithms such as DBSCAN and k-means. Finally, based on the spatial clustering results of the trajectory stay points, Getis-Ord Gi* hotspot analysis and mapping are conducted in ArcGIS. The hot spots of different tourist attractions are classified according to the analysis results, and the resulting distribution of popular scenic spots is consistent with the actual heat of the scenic spots.
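
As a rough illustration of the adaptive-parameter idea, the sketch below derives eps for DBSCAN from k-distance statistics (mean plus one standard deviation) before clustering the stay points. The statistical rule, the minPts value, and the random stay-point data are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch: DBSCAN on stay points with eps chosen from k-distance statistics.
# The rule eps = mean + std of the k-distances and min_pts = 4 are assumed.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

def adaptive_dbscan(stay_points, min_pts=4):
    # k-distance: distance of each point to its min_pts-th nearest neighbour
    nn = NearestNeighbors(n_neighbors=min_pts).fit(stay_points)
    k_dist = nn.kneighbors(stay_points)[0][:, -1]
    eps = k_dist.mean() + k_dist.std()          # data-driven eps (assumed rule)
    labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(stay_points)
    return labels, eps

# Example with random stay points (projected x, y coordinates)
points = np.random.rand(500, 2)
labels, eps = adaptive_dbscan(points)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(f"eps={eps:.3f}, clusters found={n_clusters}")
```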

EDGE: An Enticing Deceptive-content GEnerator as Defensive Deception

  • Li, Huanruo;Guo, Yunfei;Huo, Shumin;Ding, Yuehang
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.5 / pp.1891-1908 / 2021
  • Cyber deception defense mitigates Advanced Persistent Threats (APTs) by deploying deceptive entities such as the Honeyfile. The Honeyfile distracts attackers from valuable digital documents and attracts unauthorized access by deliberately exposing fake content. The effectiveness of the distraction and the trap lies in the enticement of the fake content. However, existing studies on the Honeyfile pay little attention to this perspective. In this work, we seek to improve the enticement of fake text content by enhancing its readability, indistinguishability, and believability. Hence, an enticing deceptive-content generator, EDGE, is presented. EDGE is constructed in three steps: extracting key concepts with a semantics-aware K-means clustering algorithm, searching for candidate deceptive concepts within the Word2Vec model, and generating deceptive text content under the Integrated Readability Index (IR). Furthermore, readability and believability performance analyses are undertaken. The experimental results show that EDGE generates indistinguishable deceptive text content without decreasing readability. In all, EDGE proves effective at generating enticing deceptive text content as a deception defense against APTs.
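
A minimal sketch of the first two steps described above: key concepts are taken as the words nearest each k-means centroid in embedding space, and candidate deceptive concepts are their Word2Vec neighbours. The toy corpus, cluster count, and neighbour count are assumptions; the semantics-aware weighting and the Integrated Readability Index are not reproduced.

```python
# Sketch of key-concept extraction and candidate-concept search only.
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

docs = [["network", "intrusion", "firewall", "policy"],
        ["budget", "quarterly", "revenue", "forecast"],
        ["server", "backup", "credential", "rotation"]]

w2v = Word2Vec(docs, vector_size=50, min_count=1, epochs=50, seed=1)

# 1) Key concepts: words closest to each k-means centroid in embedding space
vocab = list(w2v.wv.index_to_key)
vectors = np.array([w2v.wv[w] for w in vocab])
km = KMeans(n_clusters=3, n_init=10, random_state=1).fit(vectors)
key_concepts = [vocab[np.argmin(np.linalg.norm(vectors - c, axis=1))]
                for c in km.cluster_centers_]

# 2) Candidate deceptive concepts: nearest neighbours of each key concept
candidates = {w: [s for s, _ in w2v.wv.most_similar(w, topn=3)] for w in key_concepts}
print(key_concepts, candidates)
```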

Method for Estimating Intramuscular Fat Percentage of Hanwoo(Korean Traditional Cattle) Using Convolutional Neural Networks in Ultrasound Images

  • Kim, Sang Hyun
    • International journal of advanced smart convergence / v.10 no.1 / pp.105-116 / 2021
  • In order to preserve the breeding stock of excellent Hanwoo (Korean traditional cattle) and secure quality competitiveness in the intense competition with imported beef, production of high-quality Hanwoo beef is absolutely necessary. %IMF (intramuscular fat percentage) is one of the most important factors in evaluating the value of high-quality meat, although standards vary according to food culture and industrial conditions by country. It is therefore necessary to develop a %IMF estimation algorithm suitable for Hanwoo. In this study, we propose a method for estimating the %IMF of Hanwoo using a CNN on ultrasound images. First, the proposed method classifies the chemically measured %IMF into 10 classes using the k-means clustering method so that a CNN can be applied. Next, ROI images are obtained at regular intervals from each ultrasound image and used for CNN training and estimation. The proposed CNN model is composed of three convolutional stages and a fully connected layer. Experimental results confirm that the %IMF of Hanwoo is estimated with an accuracy of 98.2%. The correlation coefficient between the %IMF estimated by the proposed method and the real %IMF is 0.97, about 10% better than the 0.88 of the previous method.
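
A small sketch of the two ingredients named above: k-means grouping of chemically measured %IMF values into 10 classes, followed by a generic CNN classifier over ROI patches. The %IMF data, the 64x64 ROI size, and the layer widths are placeholder assumptions rather than the paper's architecture.

```python
# Sketch: k-means class labelling of %IMF values plus a generic CNN stand-in.
import numpy as np
from sklearn.cluster import KMeans
import tensorflow as tf

imf_values = np.random.uniform(2.0, 25.0, size=(300, 1))          # placeholder %IMF data
classes = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(imf_values)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 1)),                     # ROI patch size (assumed)
    tf.keras.layers.Conv2D(16, 3, activation="relu"), tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"), tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"), tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),              # one output per %IMF class
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```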

A Study on Image Analysis for Determination of Wear Area in Accelerated Durability Test (가속내구시험 마모영역 판별에 대한 이미지 분석 연구)

  • Cheon, Min-Woo;Lee, Chul-Hee
    • Tribology and Lubricants / v.38 no.4 / pp.128-135 / 2022
  • In the product development process, the reliability of a product can be secured through durability tests. However, since durability testing is expensive and time consuming, a method that saves time and cost by utilizing virtual product development (VPD) is required, and the accuracy of VPD results in turn needs to be verified. In this paper, an accelerated durability test was designed and conducted using a planetary gear reducer, and an analysis model under the same conditions was created and simulated. To correlate the experimental results with those of the analysis model, a model that discriminates the wear region was created using the k-means algorithm, a data mining method, together with the HSV (hue, saturation, value) color space. The wear areas are compared by counting the number of pixels classified as wear by the discrimination model, and a similarity ratio is calculated from the proportion of wear pixels in the entire area. The similarity ratio was about 70%, indicating that the discrimination method needs further improvement.
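
A rough sketch of the wear-discrimination step as described: pixels are converted to HSV, grouped with k-means, and the proportion of "wear" pixels is compared between images. The cluster count, the rule for picking the wear cluster, and the synthetic input images are illustrative assumptions.

```python
# Sketch: k-means on HSV pixel values to separate a worn region, then a
# pixel-ratio comparison between test and simulation images.
import numpy as np
from sklearn.cluster import KMeans
import cv2

def wear_pixel_ratio(image_bgr, n_clusters=2):
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    pixels = hsv.reshape(-1, 3).astype(np.float32)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(pixels)
    # Assumed rule: the cluster with the higher mean V (brightness) is the worn metal
    mean_value = [pixels[labels == c, 2].mean() for c in range(n_clusters)]
    wear_cluster = int(np.argmax(mean_value))
    return np.count_nonzero(labels == wear_cluster) / labels.size

# Synthetic stand-ins for a photographed test part and a rendered simulation result
test_img = np.random.randint(0, 256, (80, 80, 3), dtype=np.uint8)
sim_img = np.random.randint(0, 256, (80, 80, 3), dtype=np.uint8)
ratio_test, ratio_sim = wear_pixel_ratio(test_img), wear_pixel_ratio(sim_img)
print(f"test wear ratio {ratio_test:.2%}, simulation wear ratio {ratio_sim:.2%}")
```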

Developing efficient model updating approaches for different structural complexity - an ensemble learning and uncertainty quantifications

  • Lin, Guangwei;Zhang, Yi;Liao, Qinzhuo
    • Smart Structures and Systems / v.29 no.2 / pp.321-336 / 2022
  • Model uncertainty is a key factor that can influence the accuracy and reliability of numerical model-based analysis. It is necessary to have an appropriate updating approach that can search for and determine realistic model parameter values from measurements. In this paper, Bayesian model updating theory combined with the transitional Markov chain Monte Carlo (TMCMC) method and K-means cluster analysis is utilized to update the structural model parameters. Kriging and polynomial chaos expansion (PCE) are employed to generate surrogate models that reduce the computational burden in TMCMC. The selected updating approaches are applied to three structural examples of different complexity: a two-storey frame, a ten-storey frame, and the national stadium model. These models represent a low-dimensional linear model, a high-dimensional linear model, and a nonlinear model, respectively. The performance of updating in these three models is assessed in terms of prediction uncertainty, numerical effort, and prior information. This study also investigates updating scenarios using the analytical approach and the surrogate models. The uncertainty quantification in the Bayesian approach is further discussed to verify the validity and accuracy of the surrogate models. Finally, the advantages and limitations of the surrogate model-based updating approaches are discussed for different structural complexity. The possibility of utilizing the boosting algorithm as an ensemble learning method for improving the surrogate models is also presented.
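
Two of the building blocks mentioned above are easy to sketch in isolation: a Kriging surrogate for a structural response and k-means grouping of posterior parameter samples. The toy two-parameter model, the RBF kernel settings, and the stand-in "posterior" samples are assumptions; the TMCMC sampler, the PCE surrogate, and the Bayesian likelihood are not reproduced.

```python
# Sketch: Kriging surrogate for a structural response + k-means on parameter samples.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Training data for the surrogate: stiffness parameters -> first natural frequency
theta_train = rng.uniform(0.5, 1.5, size=(40, 2))                   # assumed parameter ranges
freq_train = 2.0 * np.sqrt(theta_train[:, 0] * theta_train[:, 1])   # toy structural model

kriging = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), normalize_y=True)
kriging.fit(theta_train, freq_train)

# Stand-in for TMCMC posterior samples of the updated parameters
posterior_samples = rng.normal([1.0, 1.1], 0.05, size=(500, 2))
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(posterior_samples)
for c in range(2):
    mean_theta = posterior_samples[labels == c].mean(axis=0)
    pred = kriging.predict(mean_theta.reshape(1, -1))[0]
    print(f"cluster {c}: mean parameters {mean_theta}, predicted frequency {pred:.3f}")
```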

Key-based dynamic S-Box approach for PRESENT lightweight block cipher

  • Yogaraja CA;Sheela Shobana Rani K
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.12 / pp.3398-3415 / 2023
  • The Internet of Things (IoT) is an emerging technology that interconnects millions of small devices to enable communication between them. It is heavily deployed across small- to large-scale industries because of its wide range of applications. These devices transfer data over the internet, including critical data in some applications. Such data are exposed to various security threats and thereby raise privacy concerns, and the devices themselves can be compromised by an attacker. Modern cryptographic algorithms running on traditional machines readily provide authentication, confidentiality, integrity, and non-repudiation, but IoT devices have numerous constraints related to memory, storage, processors, operating systems, and power. Researchers have therefore proposed several hardware and software implementations that address security attacks through lightweight encryption mechanisms. Several works on lightweight block ciphers aim to improve confidentiality by providing resistance against cryptanalysis techniques. With advances in cipher-breaking techniques, it is important to raise the security level still further. This paper focuses on securing critical data transmitted over the internet with PRESENT using a key-based dynamic S-Box. Security analysis of the proposed algorithm against other lightweight block ciphers shows a significant improvement against linear and differential attacks and the biclique attack, as well as in the avalanche effect. The novel key-based dynamic S-Box approach for PRESENT strongly withstands cryptanalytic attacks in the IoT network.
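
A minimal sketch of what a key-dependent S-Box means in this setting: the standard 4-bit PRESENT S-Box is permuted with a key-seeded shuffle so that each key yields a different bijective substitution. This is only an illustration; a plain shuffle does not preserve the nonlinearity and differential uniformity that the paper's construction is evaluated on, and the key value shown is an arbitrary example.

```python
# Sketch: key-dependent permutation of the standard PRESENT S-Box (illustrative only).
import hashlib
import random

PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]   # standard PRESENT S-Box

def dynamic_sbox(key_bytes: bytes):
    """Return a bijective 4-bit S-Box (and its inverse) derived from the key."""
    seed = int.from_bytes(hashlib.sha256(key_bytes).digest()[:8], "big")
    sbox = PRESENT_SBOX.copy()
    random.Random(seed).shuffle(sbox)          # key-dependent permutation of 0..15
    inv = [0] * 16
    for x, y in enumerate(sbox):
        inv[y] = x
    return sbox, inv

key = bytes.fromhex("00112233445566778899")    # 80-bit PRESENT key (example value)
sbox, inv_sbox = dynamic_sbox(key)
assert sorted(sbox) == list(range(16))         # still a valid bijection on 4-bit values
print([hex(v) for v in sbox])
```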

Analysis method of patent document to Forecast Patent Registration (특허 등록 예측을 위한 특허 문서 분석 방법)

  • Koo, Jung-Min;Park, Sang-Sung;Shin, Young-Geun;Jung, Won-Kyo;Jang, Dong-Sik
    • Journal of the Korea Academia-Industrial cooperation Society / v.11 no.4 / pp.1458-1467 / 2010
  • Recently, imitation and infringement of intellectual property rights have come to be recognized as impediments to national industrial growth. To prevent the huge losses that come from these impediments, many researchers are studying the protection and efficient management of intellectual property in various ways. In particular, predicting patent registration is a very important part of protecting and asserting intellectual property rights. In this study, we propose a patent document analysis method that uses text mining to predict whether a patent will be registered or rejected. First, the proposed method builds a database from the word frequencies of rejected patent documents. Comparing other patent documents with the built database then yields a similarity value between each patent document and the database. We used k-means, a partitioning clustering algorithm, to select the criterion value for patent rejection. As a result, we found that patents similar to previously rejected patents have a strong possibility of being rejected. U.S. patent documents on Bluetooth technology, solar battery technology, and display technology were used as experimental data.
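
A compact sketch of the pipeline as described: a word-frequency (here TF-IDF) profile is built from rejected patents, new documents are scored by cosine similarity to that profile, and k-means on the similarity values supplies a criterion value. The toy documents, k = 2, and the "midpoint of cluster centers" rule are assumptions for illustration.

```python
# Sketch: similarity to a rejected-patent profile, with a k-means criterion value.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

rejected_docs = ["bluetooth pairing protocol claim rejected over prior art",
                 "solar battery electrode coating claim rejected as obvious"]
new_docs = ["bluetooth pairing handshake method for low energy devices",
            "wireless pairing of bluetooth audio devices",
            "organic display panel driving circuit with reduced flicker"]

vec = TfidfVectorizer()
rejected_tfidf = vec.fit_transform(rejected_docs)
profile = np.asarray(rejected_tfidf.mean(axis=0))          # word-frequency profile of rejections
similarities = cosine_similarity(vec.transform(new_docs), profile).ravel()

# k-means (k = 2) splits the similarity scores into "close to rejections" and
# "far from rejections"; the midpoint of the two centers acts as the criterion.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(similarities.reshape(-1, 1))
criterion = km.cluster_centers_.mean()
for doc, s in zip(new_docs, similarities):
    verdict = "likely rejected" if s > criterion else "likely registered"
    print(f"{s:.3f} {verdict}: {doc}")
```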

Efficient Dummy Generation for Protecting Location Privacy (개인의 위치를 보호하기 위한 효율적인 더미 생성)

  • Cai, Tian-Yuan;Song, Doo-Hee;Youn, Ji-Hye;Lee, Won-Gyu;Kim, Yong-Kab;Park, Kwang-Jin
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.9 no.6 / pp.526-533 / 2016
  • Research on protecting users' locations in location-based services (LBS) has received much attention. In particular, k-anonymity is the most popular privacy preservation method. k-anonymization selects k-1 other dummies or clients to form a cloaking region, which reduces the probability of the query issuer's location being exposed to untrusted parties to 1/k. However, the query issuer's location may still be exposed to an adversary when the k-1 dummies are concentrated around the query location or when a dummy is placed where a real query could not exist. Therefore, we propose a dummy system model and algorithm that take the real environment into account to protect users' location privacy, and we demonstrate the efficiency of our method through experimental results.
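
A minimal sketch of the dummy-generation idea outlined above: k-1 dummy locations are drawn so that they avoid infeasible cells and are not concentrated around the real user. The grid size, the infeasible region, and the minimum-spread rule are illustrative assumptions, not the paper's algorithm.

```python
# Sketch: k-anonymity dummy generation that respects environment constraints.
import random

GRID = 100                                   # assumed service area: 100 x 100 cells
INFEASIBLE = {(x, y) for x in range(40, 60) for y in range(40, 60)}   # e.g., a lake
MIN_DIST = 10                                # minimum spread around the real location

def generate_dummies(real_loc, k, seed=0):
    rng = random.Random(seed)
    dummies = []
    while len(dummies) < k - 1:
        cand = (rng.randrange(GRID), rng.randrange(GRID))
        if cand in INFEASIBLE:               # dummy must be a plausible location
            continue
        if abs(cand[0] - real_loc[0]) + abs(cand[1] - real_loc[1]) < MIN_DIST:
            continue                         # avoid clustering around the real user
        dummies.append(cand)
    return dummies

real = (20, 30)
print(generate_dummies(real, k=5))           # 4 dummies + the real location -> k-anonymity
```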

A Study on Stroke Extraction for Handwritten Korean Character Recognition (필기체 한글 문자 인식을 위한 획 추출에 관한 연구)

  • Choi, Young-Kyoo;Rhee, Sang-Burm
    • The KIPS Transactions: Part B / v.9B no.3 / pp.375-382 / 2002
  • Handwritten character recognition is classified into on-line and off-line handwritten character recognition. On-line handwritten character recognition has achieved remarkable results compared to off-line handwritten character recognition, because it can acquire dynamic writing information, such as the stroke order and stroke positions, through a pen-based electronic input device such as a tablet. In contrast, no dynamic information can be acquired in off-line handwritten character recognition; there is extreme overlapping between consonants and vowels and heavy noise between strokes, so recognition performance depends strongly on the result of preprocessing. This paper proposes a method that effectively extracts strokes, including dynamic information of the characters, for off-line handwritten Korean character recognition. First, the method enhances and binarizes the input handwritten character image as a preprocessing step using the watershed algorithm. Next, the skeleton is extracted using a modified Lu and Wang thinning algorithm, and segment pixel arrays are extracted from the feature points of the characters. Vectorization is then performed with a maximum permissible error method. When several strokes are bound into one segment, the segment pixel array is divided into two or more segment vectors. To reconstruct the extracted segment vectors into complete strokes, the directional component of each vector is modified using a right-handed writing coordinate system. By combining adjacent segment vectors that can be merged, complete strokes suitable for character recognition are reconstructed. Experiments verify that the proposed method is suitable for handwritten Korean character recognition.
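
A rough sketch of the skeletonization and vectorization stages using off-the-shelf routines: the image is binarized, thinned to a skeleton, and the skeleton pixels are simplified into segment vectors under a maximum permissible error. The watershed preprocessing and the modified Lu and Wang thinning from the paper are replaced here with library equivalents, and a real implementation would trace connected pixel runs before simplification.

```python
# Sketch: binarize -> skeletonize -> simplify skeleton pixels under a max error.
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import skeletonize
from skimage.measure import approximate_polygon

def extract_segments(gray_image, max_error=2.0):
    binary = gray_image < threshold_otsu(gray_image)        # dark ink on light paper
    skeleton = skeletonize(binary)
    coords = np.argwhere(skeleton)                           # skeleton pixels (row, col)
    # Polyline simplification plays the role of the "maximum permissible error"
    # vectorization; tracing connected pixel runs first is omitted in this sketch.
    return approximate_polygon(coords.astype(float), tolerance=max_error)

char_img = np.random.rand(64, 64)                            # stand-in for a scanned character
print(extract_segments(char_img).shape)
```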

Speaker-Adaptive Speech Synthesis based on Fuzzy Vector Quantizer Mapping and Neural Networks (퍼지 벡터 양자화기 사상화와 신경망에 의한 화자적응 음성합성)

  • Lee, Jin-Yi;Lee, Gwang-Hyeong
    • The Transactions of the Korea Information Processing Society / v.4 no.1 / pp.149-160 / 1997
  • This paper is concerned with a speaker-adaptive speech synthesis method using a mapped codebook designed by fuzzy mapping on FLVQ (fuzzy learning vector quantization). FLVQ is used to design both the input and the reference speaker's codebooks. The algorithm incorporates a fuzzy membership function into the LVQ (learning vector quantization) network. Unlike the LVQ algorithm, it minimizes the network output errors, which are the differences between the target class memberships and the actual membership values, and thereby minimizes the distances between training patterns and competing neurons. Speaker adaptation in speech synthesis is performed as follows: the input speaker's codebook is mapped to the reference speaker's codebook using fuzzy concepts. The fuzzy VQ mapping replaces a codevector while preserving its fuzzy membership function. The codevector correspondence histogram is obtained by accumulating the vector correspondences along the DTW optimal path. We use the fuzzy VQ mapping to design a mapped codebook, defined as a linear combination of the reference speaker's vectors using each fuzzy histogram as a weighting function with membership values. In the adaptive speech synthesis stage, the input speech is fuzzy vector-quantized by the mapped codebook, and FCM arithmetic is then used to synthesize speech adapted to the input speaker. The speaker adaptation experiments are carried out using the speech of males in their thirties as the input speakers' speech and the speech of a female in her twenties as the reference speaker's speech. The sentences used in the experiments are /anyoung hasim nika/ and /good morning/. As a result of the experiments, we obtained synthesized speech adapted to the input speaker.
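
A small sketch of the mapped-codebook computation as described: each mapped codevector is a histogram-weighted linear combination of the reference speaker's codevectors, with the correspondence histogram assumed to have been accumulated along DTW optimal paths. The codebook sizes and the random stand-in data are assumptions; FLVQ training, DTW alignment, and FCM synthesis are not reproduced.

```python
# Sketch: mapped codebook = row-normalized correspondence histogram x reference codebook.
import numpy as np

rng = np.random.default_rng(0)
n_codes, dim = 8, 12                       # small codebooks for illustration

input_codebook = rng.normal(size=(n_codes, dim))
reference_codebook = rng.normal(size=(n_codes, dim))
# correspondence_hist[i, j]: accumulated fuzzy correspondence between input
# codevector i and reference codevector j along the DTW optimal paths (stand-in data)
correspondence_hist = rng.random((n_codes, n_codes))

weights = correspondence_hist / correspondence_hist.sum(axis=1, keepdims=True)
mapped_codebook = weights @ reference_codebook     # linear combination per input codevector

# At synthesis time an input frame would be (fuzzy) quantized with mapped_codebook.
frame = rng.normal(size=dim)
nearest = np.argmin(np.linalg.norm(mapped_codebook - frame, axis=1))
print("mapped codebook shape:", mapped_codebook.shape, "nearest codevector:", nearest)
```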
