Classifying Windows Executables using API-based Information and Machine Learning (API 정보와 기계학습을 통한 윈도우 실행파일 분류)

  • Cho, DaeHee;Lim, Kyeonghwan;Cho, Seong-je;Han, Sangchul;Hwang, Young-sup
    • Journal of KIISE / v.43 no.12 / pp.1325-1333 / 2016
  • Software classification has several applications, such as copyright infringement detection, malware classification, and automatic software categorization in software repositories. It can also be employed by software filtering systems to prevent the transmission of illegal software. If a software filtering system identifies illegal software by measuring software similarity, classification can reduce the average number of comparisons by shrinking the search space. In this study, we focused on the classification of Windows executables using API call information and machine learning. We evaluated the classification performance of machine learning-based classifiers according to the refinement method for API information and the machine learning algorithm. The results showed that the classification success rate of SVM (Support Vector Machine) with PolyKernel was higher than that of the other algorithms. Since API call information can be extracted from binary executables and machine learning-based classifiers can identify tampered executables, software classifiers based on API call information and machine learning are suitable for software filtering systems.

Deep Structured Learning: Architectures and Applications

  • Lee, Soowook
    • International Journal of Advanced Culture Technology / v.6 no.4 / pp.262-265 / 2018
  • Deep learning, a sub-field of machine learning, is changing the prospects of artificial intelligence (AI) because of its recent advancements and applications in various fields. Deep learning deals with algorithms, called artificial neural networks, that are inspired by the structure and function of the brain. This work reviews the basic architectures and recent advancements of deep structured learning. It also describes contemporary applications of deep structured learning and its advantages over traditional learning in artificial intelligence. This study is useful for general readers and students who are in the early stages of deep learning studies.

Forecasting Sow's Productivity using the Machine Learning Models (머신러닝을 활용한 모돈의 생산성 예측모델)

  • Lee, Min-Soo;Choe, Young-Chan
    • Journal of Agricultural Extension & Community Development / v.16 no.4 / pp.939-965 / 2009
  • Machine learning has been identified as a promising approach to knowledge-based system development. This study aims to examine the ability of machine learning techniques to support farmers' decision making and to develop a reference model for using pig farm data. We compared five machine learning techniques: logistic regression, decision tree, artificial neural network, k-nearest neighbor, and ensemble. All models performed well in predicting sow productivity across all parities, showing over 87.6% predictability. The predictability of total litter size was highest, at 91.3%, in the third parity and decreased as parity increased. The ensemble performed well in predicting sow productivity. The neural network and logistic regression were excellent classifiers for all parities, whereas the decision tree and k-nearest neighbor were not good classifiers across the parities. Performance varied across the models used, showing up to a 104% difference in lift values. The artificial neural network and ensemble models yielded the highest lift values, implying the best performance among the models.
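The abstract compares models by lift but does not say how lift was computed. As a rough illustration only, the sketch below assumes the common definition: the positive rate among the top-scored fraction of cases divided by the overall positive rate. The function name `lift_at` and the `fraction` parameter are hypothetical and are not taken from the paper.

```python
def lift_at(labels, scores, fraction=0.1):
    """Lift of the top `fraction` of cases ranked by model score.

    Assumed definition (not from the paper):
    lift = positive rate in the top-scored group / overall positive rate.
    `labels` are 0/1 outcomes; `scores` are model scores (higher = more likely).
    """
    if not labels or len(labels) != len(scores):
        raise ValueError("labels and scores must be non-empty and equal in length")
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")
    # Rank cases from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(fraction * len(ranked)))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    return top_rate / base_rate


# Toy data: 3 positives out of 10 cases, scores already in descending order.
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
top_fifth_lift = lift_at(toy_labels, toy_scores, fraction=0.2)
```

A lift above 1 means the model concentrates positive cases near the top of its ranking. Scoring the whole data set (`fraction=1.0`) always gives a lift of 1.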

Effectiveness of Normalization Pre-Processing of Big Data to the Machine Learning Performance (빅데이터의 정규화 전처리과정이 기계학습의 성능에 미치는 영향)

  • Jo, Jun-Mo
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.14 no.3 / pp.547-552 / 2019
  • Recently, the massive growth in the scale of data has emerged as a major issue in Big Data. Furthermore, because Big Data also serves as the input to machine learning, it should be preprocessed through normalization to achieve high machine learning performance. The performance varies with many factors, such as the scope of the columns in the Big Data set or the normalization preprocessing method. In this paper, various normalization preprocessing methods and column scopes are applied to an SVM (Support Vector Machine), used as the machine learning method, to find an efficient environment for normalization preprocessing. The machine learning experiment was programmed in Python using Jupyter Notebook.
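The abstract does not name the normalization methods it tested. As a minimal sketch, assuming two widely used ones (min-max scaling and z-score standardization), the code below shows how a method can be applied to a chosen scope of columns. All function names are illustrative and do not come from the paper.

```python
from statistics import mean, pstdev


def min_max_scale(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: no spread to rescale
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def z_score_scale(column):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:  # constant column
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def normalize_columns(rows, columns, scaler):
    """Apply `scaler` only to the listed column indices; other columns are untouched."""
    table = [list(row) for row in rows]
    for j in columns:
        scaled = scaler([row[j] for row in table])
        for row, value in zip(table, scaled):
            row[j] = value
    return table


# Normalize only the second column of a small toy table.
toy_rows = [[1, 10.0], [3, 20.0], [5, 30.0]]
scaled_rows = normalize_columns(toy_rows, columns=[1], scaler=min_max_scale)
```

Changing the `columns` argument and the `scaler` corresponds to the two factors the abstract varies: the column scope and the normalization method.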

RFA: Recursive Feature Addition Algorithm for Machine Learning-Based Malware Classification

  • Byeon, Ji-Yun;Kim, Dae-Ho;Kim, Hee-Chul;Choi, Sang-Yong
    • Journal of the Korea Society of Computer and Information / v.26 no.2 / pp.61-68 / 2021
  • Recently, various technologies that use machine learning to classify malicious code have been studied. To enhance the effectiveness of machine learning, it is most important to extract properties that distinguish malicious code from normal binaries. In this paper, we propose a recursive feature extraction method for use in machine learning. The proposed method selects the final feature set by recursively evaluating individual features to maximize machine learning performance. In detail, at each stage we extract the best-performing feature among the individual features and then combine the extracted features. We extract features with the proposed method and apply them to machine learning algorithms such as Decision Tree, SVM, Random Forest, and KNN to validate that machine learning performance improves as the stages continue.
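One way to read the stage-by-stage procedure in this abstract is as greedy forward selection: at each stage, add the single feature that most improves the score of the features chosen so far. The sketch below illustrates that reading and is not the authors' implementation. The stopping rule (stop when no candidate improves the score), the feature names, and the toy scoring function are all assumptions; in practice `score` would train and evaluate a classifier.

```python
def recursive_feature_addition(features, score, max_features=None):
    """Greedy forward selection (one possible reading of the abstract).

    `score(subset)` returns a number where higher is better, for example
    a validation accuracy. Returns the selected features and the score
    reached after each stage.
    """
    selected, history = [], []
    remaining = list(features)
    best_score = float("-inf")
    while remaining and (max_features is None or len(selected) < max_features):
        # Evaluate every remaining feature combined with the ones already chosen.
        candidate, candidate_score = max(
            ((f, score(selected + [f])) for f in remaining),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best_score:  # assumed stopping rule
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
        history.append((candidate, candidate_score))
    return selected, history


# Hypothetical stand-in for "train a model and measure its performance":
# each feature contributes a fixed gain, and every added feature costs 5 points.
TOY_GAIN = {"imports": 60, "entropy": 25, "strings": 10, "size": 0}


def toy_score(subset):
    return sum(TOY_GAIN[f] for f in subset) - 5 * len(subset)


chosen, stages = recursive_feature_addition(list(TOY_GAIN), toy_score)
```

With the toy score, the stage scores rise from 55 to 75 to 80, and the procedure stops before adding `size`, which would lower the score.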

A Study on Protecting Privacy of Machine Learning Models

  • Lee, Younghan;Han, Woorim;Cho, Yungi;Kim, Hyunjun;Paek, Yunheung
    • Proceedings of the Korea Information Processing Society Conference / 2021.11a / pp.61-63 / 2021
  • Machine learning models have gained popularity in recent years as multinational companies have incorporated machine learning into their services. Such a service is called machine learning as a service (MLaaS). These services are provided to users on a charge-per-query basis, which motivates adversaries to steal the trained victim model to reduce the cost of using the service. Therefore, it is important for companies that provide MLaaS to protect their intellectual property (IP) against adversaries. There has been an arms race between attack and defence in the context of the privacy of machine learning models. In this paper, we provide a comprehensive study of recent developments in protecting the privacy of machine learning models.

Evaluation performance of machine learning in merging multiple satellite-based precipitation with gauge observation data

  • Nhuyen, Giang V.;Le, Xuan-hien;Jung, Sungho;Lee, Giha
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.143-143 / 2022
  • Precipitation plays an essential role in water resources management and disaster prevention. Therefore, an understanding of the spatiotemporal characteristics of rainfall is necessary. Nowadays, highly accurate precipitation data are mainly obtained from gauge observation systems. However, gauge stations are sparsely and unevenly distributed in mountainous areas. With the proliferation of technology, satellite-based precipitation sources are becoming increasingly common and can provide rainfall information in regions with complex topography. Nevertheless, satellite-based data remain uncertain. To overcome this limitation, this study aims to use the strengths of machine learning to generate a new precipitation reanalysis product by fusing multiple satellite precipitation products (SPPs) with gauge observation data. Several machine learning algorithms (i.e., Random Forest, Support Vector Regression, and Artificial Neural Network) were adopted. To investigate the robustness of the new reanalysis product, observed data were collected to evaluate its accuracy through the Kling-Gupta efficiency (KGE), probability of detection (POD), false alarm rate (FAR), and critical success index (CSI). As a result, the new precipitation product generated by the machine learning model showed higher accuracy than the original satellite rainfall products and reflected spatiotemporal variability better than the others. Thus, a machine learning-based reanalysis of satellite precipitation products can be a useful source of input data for hydrological simulations in ungauged river basins.
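The four evaluation scores named in this abstract can be computed from paired gauge and estimated values. The sketch below uses the standard contingency-table definitions of POD, FAR, and CSI and the original three-component form of KGE. It assumes FAR means false alarms divided by all estimated rain events, and the 0.1 rain/no-rain threshold is a placeholder; the abstract specifies neither.

```python
from math import sqrt
from statistics import mean, pstdev


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def categorical_scores(observed, estimated, threshold=0.1):
    """POD, FAR and CSI from a rain/no-rain contingency table.

    `threshold` (assumed, same unit as the data) separates rain from no rain.
    """
    hits = misses = false_alarms = 0
    for obs, est in zip(observed, estimated):
        obs_rain, est_rain = obs >= threshold, est >= threshold
        if obs_rain and est_rain:
            hits += 1
        elif obs_rain:
            misses += 1
        elif est_rain:
            false_alarms += 1
    return {
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, hits + false_alarms),
        "CSI": _ratio(hits, hits + misses + false_alarms),
    }


def kling_gupta_efficiency(observed, estimated):
    """KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2).

    r is the Pearson correlation, alpha the ratio of standard deviations,
    and beta the ratio of means. Undefined if the observations have zero
    mean or either series is constant.
    """
    mu_o, mu_e = mean(observed), mean(estimated)
    sd_o, sd_e = pstdev(observed), pstdev(estimated)
    cov = mean([(o - mu_o) * (e - mu_e) for o, e in zip(observed, estimated)])
    r = cov / (sd_o * sd_e)
    alpha, beta = sd_e / sd_o, mu_e / mu_o
    return 1 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Toy daily rainfall: two hits, one miss, one false alarm.
gauge = [0.0, 1.0, 2.0, 0.0, 3.0]
product = [0.0, 1.0, 0.0, 0.5, 3.0]
scores = categorical_scores(gauge, product)
kge = kling_gupta_efficiency(gauge, product)
```

A perfect product gives POD = 1, FAR = 0, CSI = 1, and KGE = 1. Lower KGE values indicate errors in correlation, variability, or bias.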

Face Recognition using Correlation Filters and Support Vector Machine in Machine Learning Approach

  • Long, Hoang;Kwon, Oh-Heum;Lee, Suk-Hwan;Kwon, Ki-Ryong
    • Journal of Korea Multimedia Society / v.24 no.4 / pp.528-537 / 2021
  • Face recognition has gained significant attention because of its applications in many sectors, such as security, healthcare, and marketing. In this paper, we present a recognition method that combines correlation filters (CF) and a Support Vector Machine (SVM). First, we evaluate and compare the performance of four correlation filters: minimum average correlation energy (MACE), maximum average correlation height (MACH), unconstrained minimum average correlation energy (UMACE), and optimal-tradeoff (OT). Second, we propose a machine learning approach that uses the OT correlation filter for feature extraction and SVM for classification. The numerical results on the National Cheng Kung University (NCKU) and Pointing'04 face databases show that the proposed OT-SVM method achieves higher accuracy in face recognition than other machine learning methods. Our approach does not require a graphics card for training. As a result, it can run well on low-end hardware such as an embedded system.

Improving Performance of Machine Learning-based Haze Removal Algorithms with Enhanced Training Database

  • Ngo, Dat;Kang, Bongsoon
    • Journal of IKEEE / v.22 no.4 / pp.948-952 / 2018
  • Haze removal is a subject of keen scientific interest due to its various practical applications. Existing algorithms are founded upon histogram equalization, contrast maximization, or the growing trend of applying machine learning to image processing. Since machine learning-based algorithms solve problems based on data, they usually perform better than those based on traditional image processing and computer vision techniques. However, achieving such high performance requires a large and reliable training database, which seems unattainable owing to the difficulty of acquiring real pairs of hazy and haze-free images. As a result, researchers currently use synthetic databases, obtained by adding synthetic haze drawn from the standard uniform distribution to clear images. In this paper, we propose enhanced equidistribution, which improves upon our previous study on equidistribution, and use it to build a new database for training machine learning-based haze removal algorithms. A large number of experiments verify the effectiveness of the proposed methodology.

Short-term Wind Power Prediction Based on Empirical Mode Decomposition and Improved Extreme Learning Machine

  • Tian, Zhongda;Ren, Yi;Wang, Gang
    • Journal of Electrical Engineering and Technology / v.13 no.5 / pp.1841-1851 / 2018
  • Accurate wind power prediction is of great significance for the safe and stable operation of the power system. This paper proposes a wind power prediction method based on empirical mode decomposition and an improved extreme learning machine. First, the wind power time series is decomposed by empirical mode decomposition into several components with different frequencies, which reduces the non-stationarity of the time series. The decomposed components remove the long-range correlation and highlight the different local characteristics of the original wind power time series. Second, an improved extreme learning machine prediction model is introduced to overcome the disadvantages of the standard extreme learning machine in updating sample data. A separate improved extreme learning machine prediction model is established for each component. Finally, the predicted values of the components are superimposed to obtain the final result. The simulation results demonstrate that the proposed method achieves better prediction accuracy for wind power than other prediction models.