Search | Korea Science

Machine Learning-Based Transactions Anomaly Prediction for Enhanced IoT Blockchain Network Security and Performance

Nor Fadzilah Abdullah;Ammar Riadh Kairaldeen;Asma Abu-Samah;Rosdiadee Nordin
- KSII Transactions on Internet and Information Systems (TIIS)
- /
- v.18 no.7
- /
- pp.1986-2009
- /
- 2024
The integration of blockchain technology with the rapid growth of Internet of Things (IoT) devices has enabled secure and decentralised data exchange. However, security vulnerabilities and performance limitations remain significant challenges in IoT blockchain networks. This work proposes a novel approach that combines transaction representation and machine learning techniques to address these challenges. Various clustering techniques, including k-means, DBSCAN, Gaussian Mixture Models (GMM), and Hierarchical clustering, were employed to effectively group unlabelled transaction data based on their intrinsic characteristics. Anomaly transaction prediction models based on classifiers were then developed using the labelled data. Performance metrics such as accuracy, precision, recall, and F1-measure were used to identify the minority class representing specious transactions or security threats. The classifiers were also evaluated on their performance using balanced and unbalanced data. Compared to unbalanced data, balanced data resulted in an overall average improvement of approximately 15.85% in accuracy, 88.76% in precision, 60% in recall, and 74.36% in F1-score. This demonstrates the effectiveness of each classifier as a robust classifier with consistently better predictive performance across various evaluation metrics. Moreover, the k-means and GMM clustering techniques outperformed other techniques in identifying security threats, underscoring the importance of appropriate feature selection and clustering methods. The findings have practical implications for reinforcing security and efficiency in real-world IoT blockchain networks, paving the way for future investigations and advancements.
https://doi.org/10.3837/tiis.2024.07.014 인용 PDF HTML

An Application of the Clustering Threshold Gradient Descent Regularization Method for Selecting Genes in Predicting the Survival Time of Lung Carcinomas

Lee, Seung-Yeoun;Kim, Young-Chul
- Genomics & Informatics
- /
- v.5 no.3
- /
- pp.95-101
- /
- 2007
In this paper, we consider the variable selection methods in the Cox model when a large number of gene expression levels are involved with survival time. Deciding which genes are associated with survival time has been a challenging problem because of the large number of genes and relatively small sample size (n<
PDF KSCI

Classification of Wind Sector in Pohang Region Using Similarity of Time-Series Wind Vectors (시계열 풍속벡터의 유사성을 이용한 포항지역 바람권역 분류)

Kim, Hyun-Goo;Kim, Jinsol;Kang, Yong-Heack;Park, Hyeong-Dong
- Journal of the Korean Solar Energy Society
- /
- v.36 no.1
- /
- pp.11-18
- /
- 2016
The local wind systems in the Pohang region were categorized into wind sectors. Still, thorough knowledge of wind resource assessment, wind environment analysis, and atmospheric environmental impact assessment was required since the region has outstanding wind resources, it is located on the path of typhoon, and it has large-scale atmospheric pollution sources. To overcome the resolution limitation of meteorological dataset and problems of categorization criteria of the preceding studies, the high-resolution wind resource map of the Korea Institute of Energy Research was used as time-series meteorological data; the 2-step method of determining the clustering coefficient through hierarchical clustering analysis and subsequently categorizing the wind sectors through non-hierarchical K-means clustering analysis was adopted. The similarity of normalized time-series wind vector was proposed as the Euclidean distance. The meteor-statistical characteristics of the mean vector wind distribution and meteorological variables of each wind sector were compared. The comparison confirmed significant differences among wind sectors according to the terrain elevation, mean wind speed, Weibull shape parameter, etc.
https://doi.org/10.7836/kses.2016.36.1.011 인용 PDF KSCI

Clustering-based Hierarchical Scene Structure Construction for Movie Videos (영화 비디오를 위한 클러스터링 기반의 계층적 장면 구조 구축)

Choi, Ick-Won;Byun, Hye-Ran
- Journal of KIISE:Software and Applications
- /
- v.27 no.5
- /
- pp.529-542
- /
- 2000
Recent years, the use of multimedia information is rapidly increasing, and the video media is the most rising one than any others, and this field Integrates all the media into a single data stream. Though the availability of digital video is raised largely, it is very difficult for users to make the effective video access, due to its length and unstructured video format. Thus, the minimal interaction of users and the explicit definition of video structure is a key requirement in the lately developing image and video management systems. This paper defines the terms and hierarchical video structure, and presents the system, which construct the clustering-based video hierarchy, which facilitate users by browsing the summary and do a random access to the video content. Instead of using a single feature and domain-specific thresholds, we use multiple features that have complementary relationship for each other and clustering-based methods that use normalization so as to interact with users minimally. The stage of shot boundary detection extracts multiple features, performs the adaptive filtering process for each features to enhance the performance by eliminating the false factors, and does k-means clustering with two classes. The shot list of a result after the proposed procedure is represented as the video hierarchy by the intelligent unsupervised clustering technique. We experimented the static and the dynamic movie videos that represent characteristics of various video types. In the result of shot boundary detection, we had almost more than 95% good performance, and had also rood result in the video hierarchy.
PDF

A Study of Similarity Measure Algorithms for Recomendation System about the PET Food (반려동물 사료 추천시스템을 위한 유사성 측정 알고리즘에 대한 연구)

Kim, Sam-Taek
- Journal of the Korea Convergence Society
- /
- v.10 no.11
- /
- pp.159-164
- /
- 2019
Recent developments in ICT technology have increased interest in the care and health of pets such as dogs and cats. In this paper, cluster analysis was performed based on the component data of pet food to be used in various fields of the pet industry. For cluster analysis, the similarity was analyzed by analyzing the correlation between components of 300 dogs and cats in the market. In this paper, clustering techniques such as Hierarchical, K-Means, Partitioning around medoids (PAM), Density-based, Mean-Shift are clustered and analyzed. We also propose a personalized recommendation system for pets. The results of this paper can be used for personalized services such as feed recommendation system for pets.
https://doi.org/10.15207/JKCS.2019.10.11.159 인용 PDF KSCI

V-mask Type Criterion for Identification of Outliers In Logistic Regression

Kim Bu-Yong
- Communications for Statistical Applications and Methods
- /
- v.12 no.3
- /
- pp.625-634
- /
- 2005
A procedure is proposed to identify multiple outliers in the logistic regression. It detects the leverage points by means of hierarchical clustering of the robust distances based on the minimum covariance determinant estimator, and then it employs a V-mask type criterion on the scatter plot of robust residuals against robust distances to classify the observations into vertical outliers, bad leverage points, good leverage points, and regular points. Effectiveness of the proposed procedure is evaluated on the basis of the classic and artificial data sets, and it is shown that the procedure deals very well with the masking and swamping effects.
https://doi.org/10.5351/CKSS.2005.12.3.625 인용 PDF KSCI

Development of a Knowledge Discovery System using Hierarchical Self-Organizing Map and Fuzzy Rule Generation

Koo, Taehoon;Rhee, Jongtae
- Proceedings of the Korea Inteligent Information System Society Conference
- /
- 2001.01a
- /
- pp.431-434
- /
- 2001
Knowledge discovery in databases(KDD) is the process for extracting valid, novel, potentially useful and understandable knowledge form real data. There are many academic and industrial activities with new technologies and application areas. Particularly, data mining is the core step in the KDD process, consisting of many algorithms to perform clustering, pattern recognition and rule induction functions. The main goal of these algorithms is prediction and description. Prediction means the assessment of unknown variables. Description is concerned with providing understandable results in a compatible format to human users. We introduce an efficient data mining algorithm considering predictive and descriptive capability. Reasonable pattern is derived from real world data by a revised neural network model and a proposed fuzzy rule extraction technique is applied to obtain understandable knowledge. The proposed neural network model is a hierarchical self-organizing system. The rule base is compatible to decision makers perception because the generated fuzzy rule set reflects the human information process. Results from real world application are analyzed to evaluate the system\`s performance.
PDF

Comparison of clustering methods of microarray gene expression data (마이크로어레이 유전자 발현 자료에 대한 군집 방법 비교)

Lim, Jin-Soo;Lim, Dong-Hoon
- Journal of the Korean Data and Information Science Society
- /
- v.23 no.1
- /
- pp.39-51
- /
- 2012
Cluster analysis has proven to be a useful tool for investigating the association structure among genes and samples in a microarray data set. We applied several cluster validation measures to evaluate the performance of clustering algorithms for analyzing microarray gene expression data, including hierarchical clustering, K-means, PAM, SOM and model-based clustering. The available validation measures fall into the three general categories of internal, stability and biological. The performance of clustering algorithms is evaluated using simulated and SRBCT microarray data. Our results from simulated data show that nearly every methods have good results with same result as the number of classes in the original data. For the SRBCT data the best choice for the number of clusters is less clear than the simulated data. It appeared that PAM, SOM, model-based method showed similar results to simulated data under Silhouette with of internal measure as well as PAM and model-based method under biological measure, while model-based clustering has the best value of stability measure.
https://doi.org/10.7465/jkdi.2012.23.1.039 인용 PDF KSCI

Comparison of Clustering Techniques in Flight Approach Phase using ADS-B Track Data (공항 근처 ADS-B 항적 자료에서의 클러스터링 기법 비교)

Jong-Chan Park;Heon Jin Park
- The Journal of Bigdata
- /
- v.6 no.2
- /
- pp.29-38
- /
- 2021
Deviation of route in aviation safety management is a dangerous factor that can lead to serious accidents. In this study, the anomaly score is calculated by classifying the tracks through clustering and calculating the distance from the cluster center. The study was conducted by extracting tracks within 100 km of the airport from the ADS-B track data received for one year. The wake was vectorized using linear interpolation. Latitude, longitude, and altitude 3D coordinates were used. Through PCA, the dimension was reduced to an axis representing more than 90% of the overall data distribution, and k-means clustering, hierarchical clustering, and PAM techniques were applied. The number of clusters was selected using the silhouette measure, and an abnormality score was calculated by calculating the distance from the cluster center. In this study, we compare the number of clusters for each cluster technique, and evaluate the clustering result through the silhouette measure.
https://doi.org/10.36498/kbigdt.2021.6.2.29 인용 PDF KSCI

Re-evaluation of Obesity Syndrome Differentiation Questionnaire Based on Real-world Survey Data Using Data Mining (데이터 마이닝을 이용한 한의비만변증 설문지 재평가: 실제 임상에서 수집한 설문응답 기반으로)

Oh, Jihong;Wang, Jing-Hua;Choi, Sun-Mi;Kim, Hojun
- Journal of Korean Medicine for Obesity Research
- /
- v.21 no.2
- /
- pp.80-94
- /
- 2021
Objectives: The purpose of this study is to re-evaluate the importance of questions of obesity syndrome differentiation (OSD) questionnaire based on real-world survey and to explore the possibility of simplifying OSD types. Methods: The OSD frequency was identified, and variance threshold feature selection was performed to filter the questions. Filtered questions were clustered by K-means clustering and hierarchical clustering. After principal component analysis (PCA), the distribution patterns of the subjects were identified and the differences in the syndrome distribution were compared. Results: The frequency of OSD in spleen deficiency, phlegm (PH), and blood stasis (BS) was lower than in food retention (FR), liver qi stagnation (LS), and yang deficiency. We excluded 13 questions with low variance, 7 of which were related to BS. Filtered questions were clustered into 3 groups by K-means clustering; Cluster 1 (17 questions) mainly related to PH, BS syndromes; Cluster 2 (11 questions) related to swelling, and indigestion; Cluster 3 (11 questions) related to overeating or emotional symptoms. After PCA, significant different patterns of subjects were observed in the FR, LS, and other obesity syndromes. The questions that mainly affect the FR distribution were digestive symptoms. And emotional symptoms mainly affect the distribution of LS subjects. And other obesity syndrome was partially affected by both digestive and emotional symptoms, and also affected by symptoms related to poor circulation. Conclusions: In-depth data mining analysis identified relatively low importance questions and the potential to simplify OSD types.
https://doi.org/10.15429/jkomor.2021.21.2.80 인용 PDF KSCI

Search Result 88, Processing Time 0.027 seconds

이메일무단수집거부

이용약관

제 1 장 총칙

제 2 장 이용계약의 체결

제 3 장 계약 당사자의 의무

제 4 장 서비스의 이용

제 5 장 계약 해지 및 이용 제한

제 6 장 손해배상 및 기타사항

Detail Search

Image Search (β)