• Title/Summary/Keyword: $k$-NN

Fast k-NN based Malware Analysis in a Massive Malware Environment

  • Hwang, Jun-ho;Kwak, Jin;Lee, Tae-jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.12 / pp.6145-6158 / 2019
  • Responding to the large volume of indiscriminately distributed malicious code, as well as to intelligent APT attacks, is a challenge for the current security industry. As a result, studies are applying machine learning algorithms for proactive prevention rather than post-incident processing. The k-NN algorithm is widely used because it is intuitive and well suited to handling malicious code as unstructured data. In addition, in the malicious code analysis domain, k-NN makes it easy to classify malicious code based on previously analyzed samples. For example, it is possible to classify malicious code families or analyze malicious code variants through similarity analysis against existing malicious code. However, the main disadvantage of the k-NN algorithm is that search time increases as the training data grows. We propose a fast k-NN algorithm that addresses this computation speed problem while retaining the benefits of k-NN. In the test environment, the algorithm needed an average of only 19.71 similarity comparisons to search 6.25 million malicious codes. Given the way the algorithm works, Fast k-NN can be used to search not only malware and SSDEEP but any data that can be vectorized. In the future, wherever a k-NN approach is needed and central nodes can be selected effectively for clustering large amounts of data in various environments, it is expected that sophisticated machine learning based systems can be designed.
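The abstract does not spell out how Fast k-NN selects or uses its central nodes, so the following is only a generic sketch of the underlying idea: compare a query against a small set of centers first, then scan only the closest cluster. The random choice of centers, the Euclidean distance, and the names `build_index` and `search` are assumptions made for illustration, not the authors' method.

```python
import math
import random

def build_index(points, n_centers, seed=0):
    """Pick random centers (an assumption) and assign each point to its nearest one."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_centers)
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(n_centers), key=lambda i: math.dist(p, centers[i]))
        clusters[nearest].append(p)
    return centers, clusters

def search(query, centers, clusters, k=1):
    """Approximate k-NN: compare with the centers, then scan only the closest cluster."""
    best = min(range(len(centers)), key=lambda i: math.dist(query, centers[i]))
    candidates = clusters[best]
    # Number of distance computations actually performed for this query.
    comparisons = len(centers) + len(candidates)
    neighbors = sorted(candidates, key=lambda p: math.dist(query, p))[:k]
    return neighbors, comparisons

# Synthetic 2-D data standing in for vectorized samples.
rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(5000)]
centers, clusters = build_index(points, n_centers=50)
neighbors, comparisons = search((0.5, 0.5), centers, clusters, k=3)
print(neighbors, comparisons, "comparisons instead of", len(points))
```

The price of the reduced comparison count is that the search becomes approximate: a true neighbor that falls in an adjacent cluster is missed.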

Quality Characteristics of Cookies Added with Nelumbo nucifera G. powder (연근분말을 첨가한 쿠키의 품질특성)

  • Lee, Eun-Jun;Kim, Hyeong-Il;Hong, Geum-Ju
    • Journal of the Korean Society of Food Culture / v.26 no.4 / pp.394-399 / 2011
  • This study was conducted to investigate the effect of Nelumbo nucifera G. (NN) powder on cookie quality characteristics. The cookies were made with various NN powder levels (1, 3, and 5%). The crude fiber, crude ash, and Mg contents of cookies with added NN powder were higher than those of the control group. The salinity of the groups with added NN powder was not significantly different from that of the control group. No significant difference among the groups was observed for specific volume, but the width determined by water content in the dough decreased as the amount of added NN powder increased. The L-value of the cookies was significantly larger than that of the control group. The a- and b-values were the highest for the 5% substituted NN flour. According to the sensory evaluation of the cookies, scores for color, flavor, and texture increased with increasing amounts of added NN powder. The overall acceptance of the cookies with 3% added NN powder was greater than that of the 1% and 5% cookies.

Density Adaptive Grid-based k-Nearest Neighbor Regression Model for Large Dataset (대용량 자료에 대한 밀도 적응 격자 기반의 k-NN 회귀 모형)

  • Liu, Yiqi;Uk, Jung
    • Journal of Korean Society for Quality Management / v.49 no.2 / pp.201-211 / 2021
  • Purpose: This paper proposes a density adaptive grid algorithm for the k-NN regression model to reduce the computation time for large datasets without significant prediction accuracy loss. Methods: The proposed method utilizes the concept of the grid with centroid to reduce the number of reference data points so that the required computation time is much reduced. Since the grid generation process in this paper is based on quantiles of original variables, the proposed method can fully reflect the density information of the original reference data set. Results: Using five real-life datasets, the proposed k-NN regression model is compared with the original k-NN regression model. The results show that the proposed density adaptive grid-based k-NN regression model is superior to the original k-NN regression in terms of data reduction ratio and time efficiency ratio, and provides a similar prediction error if the appropriate number of grids is selected. Conclusion: The proposed density adaptive grid algorithm for the k-NN regression model is a simple and effective model which can help avoid a large loss of prediction accuracy with faster execution speed and fewer memory requirements during the testing phase.
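A rough sketch of the grid-with-centroid idea may help here. It assumes that each variable is cut at its empirical quantiles, that every non-empty cell is replaced by the centroid of its points and the mean of their targets, and that plain k-NN regression then runs on the reduced set. The function names and the synthetic data are hypothetical, and the paper's exact grid construction may differ.

```python
import bisect
import math
import random
from collections import defaultdict
from statistics import fmean, quantiles

def quantile_grid_reduce(X, y, bins=4):
    """Replace each non-empty quantile-grid cell with its centroid and mean target."""
    dims = len(X[0])
    # Interior cut points per variable; quantile edges give narrow cells where data are dense.
    edges = [quantiles([row[d] for row in X], n=bins) for d in range(dims)]
    cells = defaultdict(list)
    for row, target in zip(X, y):
        key = tuple(bisect.bisect_right(edges[d], row[d]) for d in range(dims))
        cells[key].append((row, target))
    centroids, targets = [], []
    for members in cells.values():
        centroids.append(tuple(fmean(r[d] for r, _ in members) for d in range(dims)))
        targets.append(fmean(t for _, t in members))
    return centroids, targets

def knn_predict(query, X, y, k=3):
    """Plain k-NN regression: average the targets of the k closest reference points."""
    order = sorted(range(len(X)), key=lambda i: math.dist(query, X[i]))[:k]
    return fmean(y[i] for i in order)

# Synthetic data: two Gaussian features, target is their sum.
rng = random.Random(0)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
y = [a + b for a, b in X]
Xr, yr = quantile_grid_reduce(X, y, bins=10)
full = knn_predict((0.2, -0.1), X, y)
reduced = knn_predict((0.2, -0.1), Xr, yr)
print(len(X), "->", len(Xr), "reference points;", full, "vs", reduced)
```

With 10 bins per variable the reference set shrinks to at most 100 centroids, which is where the savings in time and memory at prediction time would come from.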

Location Positioning System Based on K-NN for Sensor Networks (센서네트워크를 위한 K-NN 기반의 위치 추정 시스템)

  • Kim, Byoung-Kug;Hong, Won-Gil
    • Journal of Korea Multimedia Society / v.15 no.9 / pp.1112-1125 / 2012
  • GPS is the technology most commonly used to realize LBS (Location Based Service). However, GPS can only be used outdoors. Furthermore, GPS is not efficient in sensor networks because of their low power consumption requirements. Hence, in this paper we propose location positioning methods that work indoors. The proposed methods build a location positioning system by applying the K-NN (K-Nearest Neighbour) algorithm, together with its intermediate values, on IEEE 802.15.4 technology, which is widely used for sensor networks. In principle, the accuracy of K-NN location positioning is proportional to the number of RSS samples taken from sensor nodes. However, extensive sampling consumes a lot of the sensor network's resources. To reduce the number of samples, we instead use the intermediate values of K-NN's signal boundaries, so that the proposed methods position almost twice as accurately as conventional K-NN.
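For orientation, conventional K-NN positioning on RSS samples can be sketched as follows: a database maps known positions to RSS vectors, and a new reading is located at the average position of its k closest entries in signal space. The fingerprint values, anchor layout, and the `locate` helper are made up for illustration, and the intermediate-value refinement the paper proposes is not reproduced.

```python
import math

# Hypothetical fingerprint database: known (x, y) position -> RSS in dBm from three nodes.
fingerprints = [
    ((0.0, 0.0), (-40, -70, -70)),
    ((5.0, 0.0), (-70, -40, -75)),
    ((0.0, 5.0), (-70, -75, -40)),
    ((5.0, 5.0), (-75, -65, -65)),
    ((2.5, 2.5), (-60, -60, -60)),
]

def locate(rss, database, k=3):
    """Estimate a position as the mean of the k fingerprints closest in signal space."""
    nearest = sorted(database, key=lambda entry: math.dist(rss, entry[1]))[:k]
    x = sum(pos[0] for pos, _ in nearest) / len(nearest)
    y = sum(pos[1] for pos, _ in nearest) / len(nearest)
    return x, y

print(locate((-62, -61, -59), fingerprints, k=3))
```

In this baseline, accuracy depends on how densely the fingerprints cover the area, which is the sampling cost the abstract aims to reduce.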

An Improvement in K-NN Graph Construction using re-grouping with Locality Sensitive Hashing on MapReduce (MapReduce 환경에서 재그룹핑을 이용한 Locality Sensitive Hashing 기반의 K-Nearest Neighbor 그래프 생성 알고리즘의 개선)

  • Lee, Inhoe;Oh, Hyesung;Kim, Hyoung-Joo
    • KIISE Transactions on Computing Practices / v.21 no.11 / pp.681-688 / 2015
  • The k nearest neighbor (k-NN) graph construction is an important operation with many web-related applications, including collaborative filtering, similarity search, and many others in data mining and machine learning. Despite its many elegant properties, the brute force k-NN graph construction method has a computational complexity of $O(n^2)$, which is prohibitive for large scale data sets. Thus, MapReduce, a (Key, Value)-based distributed framework, is increasingly used with Locality Sensitive Hashing, which is efficient for high-dimensional and sparse data. Following a two-stage strategy, we use the locality sensitive hashing technique to divide users into small subsets, and then calculate the similarity between pairs within each small subset using a brute force method on MapReduce. The candidate group generation stage is particularly important because brute-force calculation is performed in the following step. However, existing methods do not prevent large candidate groups. In this paper, we propose an efficient algorithm for approximate k-NN graph construction that regroups candidate groups. Experimental results show that our approach is more effective than existing methods in terms of graph accuracy and scan rate.
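The two-stage strategy described above can be sketched on a single machine as follows. The sketch assumes random-hyperplane signatures as the hash family and cosine similarity as the measure; neither is confirmed by the abstract, and the MapReduce distribution and the regrouping of large candidate groups are left out. All names are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def lsh_groups(vectors, n_planes=4, seed=0):
    """Stage 1: bucket items by the sign pattern of projections onto random hyperplanes."""
    rng = random.Random(seed)
    dim = len(next(iter(vectors.values())))
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
    groups = defaultdict(list)
    for name, vec in vectors.items():
        signature = tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in planes)
        groups[signature].append(name)
    return groups

def knn_graph(vectors, k=2, n_planes=4, seed=0):
    """Stage 2: brute-force similarity only inside each candidate group."""
    edges = {name: [] for name in vectors}
    for members in lsh_groups(vectors, n_planes, seed).values():
        for a, b in combinations(members, 2):
            s = cosine(vectors[a], vectors[b])
            edges[a].append((s, b))
            edges[b].append((s, a))
    return {name: [n for _, n in sorted(e, reverse=True)[:k]] for name, e in edges.items()}

vectors = {"a": (1.0, 0.0), "b": (2.0, 0.0), "c": (-1.0, 0.0), "d": (-3.0, 0.0)}
graph = knn_graph(vectors, k=2)
print(graph)
```

Because the pairwise work in stage 2 grows quadratically with group size, a single oversized group can dominate the running time, which is the problem regrouping is meant to address.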

Antioxidant and Anti-wrinkling Effects of Extracts from Nelumbo nucifera leaves (하엽(荷葉) 추출물이 항산화 효능 및 피부노화에 미치는 영향)

  • Park, Chan-Ik;Park, Geun-Hye
    • The Korea Journal of Herbology / v.31 no.4 / pp.53-60 / 2016
  • Objective: The purpose of this study was to investigate the anti-aging and antioxidant effects on skin of ethanol extracts of Nelumbo nucifera leaves (NN-L). Methods: The leaves (NN-L), flowers (NN-F), and stems (NN-S) were each extracted with 70% ethanol. We performed radical scavenging assays (DPPH, ABTS+, superoxide anion radical), an elastase inhibition assay, and a collagenase inhibition assay. NN-L extracts were tested for cell viability (MTT assay), MMP-1 inhibition, and MMP-1 protein expression in CCD-986sk cells (a human fibroblast line). Results: Recently, many studies have reported that elastin is also involved in inhibiting or repairing wrinkle formation, although collagen is a major factor in skin wrinkle formation. We measured free radical scavenging activity, elastase inhibitory activity, and expression of MMP-1 (matrix metalloprotease-1) in human fibroblast cells. Among the parts of Nelumbo nucifera, NN-L showed the highest antioxidant and radical scavenging activities. The DPPH, ABTS+, and superoxide anion radical scavenging activities of NN-L at a concentration of 1,000 μg/mL were 91.43%, 99.31%, and 73.7%, respectively. The in vitro elastase and collagenase inhibition effects of NN-L at a concentration of 1,000 μg/mL were 42.8% and 55.3%, respectively. The ethanol extract of NN-L showed cell viability of 95.4% at a concentration of 50 μg/mL. In addition, the results of the Western blot assay showed that NN-L decreased the expression of MMP-1 protein in a dose-dependent manner (by up to 35.0% at 50 μM). Conclusion: The findings suggest that NN-L has great potential as a cosmeceutical ingredient with antioxidant and anti-wrinkling effects.

kNN Query Processing Algorithm based on the Encrypted Index for Hiding Data Access Patterns (데이터 접근 패턴 은닉을 지원하는 암호화 인덱스 기반 kNN 질의처리 알고리즘)

  • Kim, Hyeong-Il;Kim, Hyeong-Jin;Shin, Youngsung;Chang, Jae-woo
    • Journal of KIISE / v.43 no.12 / pp.1437-1457 / 2016
  • In outsourced databases, the cloud provides an authorized user with querying services on the outsourced database. However, sensitive data, such as financial or medical records, should be encrypted before being outsourced to the cloud. Meanwhile, k-Nearest Neighbor (kNN) query is the typical query type which is widely used in many fields and the result of the kNN query is closely related to the interest and preference of the user. Therefore, studies on secure kNN query processing algorithms that preserve both the data privacy and the query privacy have been proposed. However, existing algorithms either suffer from high computation cost or leak data access patterns because retrieved index nodes and query results are disclosed. To solve these problems, in this paper we propose a new kNN query processing algorithm on the encrypted database. Our algorithm preserves both data privacy and query privacy. It also hides data access patterns while supporting efficient query processing. To achieve this, we devise an encrypted index search scheme which can perform data filtering without revealing data access patterns. Through the performance analysis, we verify that our proposed algorithm shows better performance than the existing algorithms in terms of query processing times.

A study on the imputation solution for missing speed data on UTIS by using adaptive k-NN algorithm (적응형 k-NN 기법을 이용한 UTIS 속도정보 결측값 보정처리에 관한 연구)

  • Kim, Eun-Jeong;Bae, Gwang-Soo;Ahn, Gye-Hyeong;Ki, Yong-Kul;Ahn, Yong-Ju
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.13 no.3 / pp.66-77 / 2014
  • UTIS (Urban Traffic Information System) directly collects link travel times in urban areas by using probe vehicles. Therefore, it can estimate link travel speed more accurately than other traffic detection systems. However, UTIS data include some missing values caused by the lack of probe vehicles and RSEs on the road network, system failures, and other factors. In this study, we suggest a new model, based on the k-NN algorithm, for imputing missing data to provide more accurate travel time information. The new imputation model is an adaptive k-NN that can flexibly adjust the number of nearest neighbors (NN) depending on the distribution of candidate objects. The evaluation results indicate that the new model successfully imputed missing speed data and significantly reduced the imputation error compared with other models (ARIMA, etc.). We plan to use the new imputation model to improve the traffic information service by applying it at the UTIS Central Traffic Information Center.
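The abstract does not say how the number of neighbors is adapted, so the sketch below shows just one plausible rule, chosen purely for illustration: keep only those candidates whose distance is within a fixed multiple of the closest candidate's distance, up to a maximum count. The `ratio` and `k_max` parameters, the feature encoding, and the sample records are all hypothetical.

```python
import math
from statistics import fmean

def adaptive_knn_impute(query, history, k_max=5, ratio=2.0):
    """Impute a missing speed from historical records of (feature_vector, speed).

    Assumed rule: use the neighbors within `ratio` times the nearest distance,
    so tight groups of candidates contribute more neighbors than scattered ones.
    """
    ranked = sorted(history, key=lambda rec: math.dist(query, rec[0]))[:k_max]
    nearest_distance = math.dist(query, ranked[0][0])
    cutoff = max(nearest_distance * ratio, 1e-9)
    chosen = [speed for feat, speed in ranked if math.dist(query, feat) <= cutoff]
    return fmean(chosen), len(chosen)

# Hypothetical records: a one-dimensional feature (e.g. time of day) and a speed in km/h.
history = [((10.0,), 30.0), ((10.5,), 32.0), ((40.0,), 80.0)]
value, used = adaptive_knn_impute((10.2,), history)
print(value, "imputed from", used, "neighbors")
```

Here the distant third record is ignored even though `k_max` would allow it, which is the kind of flexibility a fixed k cannot provide.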

Expanded Korean Chunking by $k$-NN ($k$-NN으로 확장된 한국어 단위화)

  • 박성배;장병탁;김영택
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.182-184 / 2000
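  • In most natural language processing, chunking is a very basic processing step that precedes syntactic parsing and divides a text sentence into grammatically related units. Using chunking can therefore efficiently reduce the memory and time needed for syntactic analysis, semantic analysis, and similar tasks. In general, relatively high chunking performance can be obtained even with rules derived from insight, but in this paper we implement more accurate chunking by using k-NN, a machine learning technique. After training on 1,273 sentences obtained from Internet homepages, extending chunking with k-NN increased accuracy by 2.3% compared with chunking without the extension.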

Load Fidelity Improvement of Piecewise Integrated Composite Beam by Construction Training Data of k-NN Classification Model (k-NN 분류 모델의 학습 데이터 구성에 따른 PIC 보의 하중 충실도 향상에 관한 연구)

  • Ham, Seok Woo;Cheon, Seong S.
    • Composites Research / v.33 no.3 / pp.108-114 / 2020
  • A Piecewise Integrated Composite (PIC) beam is composed of different stacking sequences, each chosen for the loading type at its location. The aim of the current study is to assign stacking sequences that are robust against external loading to every corresponding part of the PIC beam, based on the value of stress triaxiality at generated reference points, using k-NN (k-Nearest Neighbor) classification, one of the representative machine learning techniques, in order to obtain superior bending characteristics. The stress triaxiality at the reference points is obtained by three-point bending analysis of the Al beam, with training data categorizing the type of external loading, i.e., tension, compression, or shear. Loading types of each plane of the beam were classified by an independent plane scheme as well as a total beam scheme. Also, loading fidelities were calibrated for each case with variation of the hyper-parameters. The most effective stacking sequences were mapped onto the PIC beam based on the k-NN classification model with the highest loading fidelity. FE analysis results show that the PIC beam has superior external loading resistance and energy absorption compared to a conventional beam.
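The classification step can be illustrated with a minimal one-dimensional k-NN on stress triaxiality. Stress triaxiality is about 1/3 for uniaxial tension, about -1/3 for uniaxial compression, and 0 for pure shear; the training values below are made-up points scattered around those levels, not data from the paper, and the `classify` helper is hypothetical.

```python
from collections import Counter

# Hypothetical training data: (stress triaxiality, loading type).
training = [
    (0.33, "tension"), (0.30, "tension"), (0.36, "tension"),
    (-0.33, "compression"), (-0.30, "compression"), (-0.36, "compression"),
    (0.00, "shear"), (0.03, "shear"), (-0.03, "shear"),
]

def classify(triaxiality, data, k=3):
    """Label a reference point by majority vote among its k nearest training values."""
    nearest = sorted(data, key=lambda rec: abs(triaxiality - rec[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

for value in (0.28, -0.05, -0.31):
    print(value, classify(value, training))
```

The choice of k and the composition of the training set are the kind of hyper-parameters and training-data construction the abstract varies when calibrating loading fidelity.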