Search | Korea Science

Research on text mining based malware analysis technology using string information (문자열 정보를 활용한 텍스트 마이닝 기반 악성코드 분석 기술 연구)

Ha, Ji-hee;Lee, Tae-jin
- Journal of Internet Computing and Services / v.21 no.1 / pp.45-55 / 2020
With the development of information and communication technology, the number of new and variant malicious codes is increasing rapidly every year, and various types of malicious code are spreading with the growth of Internet of Things and cloud computing technology. In this paper, we propose a malware analysis method based on string information, which can be used regardless of operating system environment and which represents library call information related to malicious behavior. Attackers can easily create malware by reusing existing code or by using automated authoring tools, and the generated malware operates in a similar way to existing malware. Since most of the strings that can be extracted from malicious code consist of information closely related to malicious behavior, the strings are processed with a text-mining-based feature weighting method to extract effective features for malware analysis. Based on the processed data, models are built with various machine learning algorithms to run experiments on malware detection and malware group classification. The data were compared and verified against all files used on Windows and Linux operating systems. The accuracy of malware detection is about 93.5%, and the accuracy of group classification is about 90%. The proposed technique has a wide range of applications because it is relatively simple, fast, and operating system independent, and because it works as a single model: there is no need to build a model for each group when classifying malicious groups. In addition, since the string information is extracted through static analysis, it can be processed faster than analysis methods that directly execute the code.
https://doi.org/10.7472/jksii.2020.21.1.45
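
The string-based pipeline described in the abstract above can be sketched roughly as follows. The abstract does not name the weighting scheme, so TF-IDF is used here purely as an assumed stand-in for "weighting data features using a text mining based method"; the function names, the minimum string length of 4, and the toy byte blobs are all hypothetical.

```python
import math
import re
from collections import Counter

# Runs of 4+ printable ASCII bytes, similar in spirit to the Unix `strings` tool.
# The minimum length of 4 is an assumption, not a value taken from the paper.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def extract_strings(blob: bytes) -> list[str]:
    """Return lowercased printable ASCII strings found in a binary blob.

    The blob is only scanned, never executed (static analysis).
    """
    return [m.group().decode("ascii").lower() for m in PRINTABLE_RUN.finditer(blob)]


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Weight each string by term frequency times a smoothed inverse document frequency."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        total = sum(counts.values()) or 1
        weighted.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return weighted


# Hypothetical toy "binaries"; real inputs would be file contents read in binary mode.
samples = [
    b"\x00\x01CreateRemoteThread\x00WriteProcessMemory\x00\x02kernel32.dll\x00",
    b"\x7fELF\x00printf\x00\x00kernel32.dll\x00GetTickCount\x00",
]
docs = [extract_strings(blob) for blob in samples]
weights = tfidf(docs)
```

Strings that occur in every sample (here `kernel32.dll`) receive a lower weight than strings that occur in only one, which is the usual motivation for this kind of weighting. The resulting per-file weight dictionaries could then be fed to any classifier; that step is omitted here.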

The Prediction of Survival of Breast Cancer Patients Based on Machine Learning Using Health Insurance Claim Data (건강보험 청구 데이터를 활용한 머신러닝 기반유방암 환자의 생존 여부 예측)

Doeggyu Lee;Kyungkeun Byun;Hyungdong Lee;Sunhee Shin
- Journal of Korea Society of Industrial Information Systems / v.28 no.2 / pp.1-9 / 2023
Research using AI and big data is being actively conducted in health and medical fields such as disease diagnosis and treatment. Most existing studies used cohort data from research institutes or data from a subset of patients. In this paper, the difference in the survival prediction rate and in the factors affecting survival between breast cancer patients in their 40s to 50s and other age groups was examined using health insurance review claim data held by HIRA. As a result, the accuracy of predicting patients' survival was 0.93 on average for those in their 40s to 50s, higher than the 0.86 for those in their 60s to 80s. Among the influencing factors, the number of treatments was high for those in their 40s to 50s, and age was high for those in their 60s to 80s. In a performance comparison with previous studies, the average precision was 0.90, higher than the 0.81 of the existing paper. In a performance comparison by algorithm, the overall average precision of Decision Tree, Random Forest, and Gradient Boosting was 0.90 with a recall of 1.0, and the precision of the multi-layer perceptron was 0.89 with a recall of 1.0. We hope that more research will be conducted using automated machine learning (AutoML) tools for non-specialists to increase the value derived from the health insurance review claim data held by HIRA.
https://doi.org/10.9723/jksiis.2023.28.2.001
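
For readers comparing the accuracy, precision, and recall figures quoted in the abstract above, the following is a minimal sketch of how the three metrics are computed from a binary confusion matrix. The counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Of everything predicted positive, how much was truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of everything truly positive, how much was found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts (NOT from the paper): 90 positives, 10 negatives,
# and a classifier that labels every case as positive.
metrics = classification_metrics(tp=90, fp=10, fn=0, tn=0)
```

With these made-up counts, recall is 1.0 and precision is 0.90 even though no negative case is identified, which is one reason the metrics are usually read together with the class balance. This is an illustration of the definitions only and makes no claim about how the reported figures arose.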

Hypernetwork Classifiers for Microarray-Based miRNA Module Analysis (마이크로어레이 기반 miRNA 모듈 분석을 위한 하이퍼망 분류 기법)

Kim, Sun;Kim, Soo-Jin;Zhang, Byoung-Tak
- Journal of KIISE:Software and Applications / v.35 no.6 / pp.347-356 / 2008
High-throughput microarrays are among the most popular tools in molecular biology, and various computational methods have been developed for microarray data analysis. While these methods easily extract significant features, they have difficulty inferring modules of multiple co-regulated genes. Hypernetworks are motivated by biological networks, which handle all elements based on their combinatorial processes. Hence, hypernetworks can naturally analyze the biological effects of gene combinations. In this paper, we introduce a hypernetwork classifier for microRNA (miRNA) profile analysis based on microarray data. The hypernetwork classifier uses miRNA pairs as elements, and evolutionary learning is performed to model the microarray profiles. miRNA modules are easily extracted from the hypernetworks, and users can directly evaluate whether the miRNA modules are significant. In the experiments, the hypernetwork classifier showed 91.46% accuracy on miRNA expression profiles of multiple human cancers, outperforming other machine learning methods. The hypernetwork-based analysis showed that our approach could find biologically significant miRNA modules.
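
The idea of classifying with pairs of miRNAs as elements can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: expression values are assumed to be binarized (1 = high, 0 = low), hyperedges are sampled at random rather than evolved, and the function names and toy profiles are hypothetical.

```python
import random
from collections import defaultdict


def build_library(profiles, labels, edges_per_sample, rng):
    """Sample order-2 hyperedges (a feature pair, its observed values, and the class label)
    from each training profile. The evolutionary refinement of the library is omitted."""
    n_features = len(profiles[0])
    library = []
    for row, label in zip(profiles, labels):
        for _ in range(edges_per_sample):
            i, j = rng.sample(range(n_features), 2)
            library.append(((i, j), (row[i], row[j]), label))
    return library


def classify(library, row):
    """Vote with every hyperedge whose two feature values match the query profile."""
    votes = defaultdict(int)
    for (i, j), (vi, vj), label in library:
        if row[i] == vi and row[j] == vj:
            votes[label] += 1
    if not votes:
        return None  # no hyperedge matched the query
    return max(votes, key=votes.get)


# Hypothetical binarized profiles over four miRNAs; labels 1 and 0 are two made-up classes.
profiles = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 0, 1, 1], [0, 0, 1, 0]]
labels = [1, 1, 0, 0]
library = build_library(profiles, labels, edges_per_sample=5, rng=random.Random(0))
prediction = classify(library, [1, 1, 0, 0])
```

Because each hyperedge is an explicit pair of features with values and a label, the pairs that accumulate the most matching votes can be read off directly as candidate modules, which mirrors the interpretability claim in the abstract.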

Implementation and performance estimation of interferometer-type linear scale with high-resolution (고분해능을 갖는 간섭계형 리니어 스케일 제작 및 성능 평가)

김수진;은재정;최평석;권오영
- Journal of the Institute of Convergence Signal Processing / v.2 no.3 / pp.86-92 / 2001
Position control is very important in semiconductor manufacturing devices, machine tools, precision measuring instruments, and similar equipment, which must measure the travel distance of moving objects in minute units; the accuracy of that distance measurement affects the performance of the whole device. Therefore, these precision instruments require a sensing device that can measure travel distance with high resolution. An optical encoder, which offers an easy digital interface, an economical price, and a resolution similar to that of laser interferometers, can be used for this purpose. In this paper, an interferometer-type linear scale with an easy digital interface and high resolution was set up, and the travel distance was measured based on the diffraction principle. Interference signals produced in the optical setup of the linear scale were digitized through fabricated photodetectors and purpose-designed signal processing circuits. A resolution of 0.5 $\mu\mathrm{m}$ for the movement of the scale was obtained from the experimental interferometer-type linear scale without any additional dividing circuits. The experiment shows that a high-resolution distance measurement device can be designed with a simple optical setup.

Modeling of surface roughness in electro-discharge machining using artificial neural networks

Cavaleri, Liborio;Chatzarakis, George E.;Trapani, Fabio Di;Douvika, Maria G.;Roinos, Konstantinos;Vaxevanidis, Nikolaos M.;Asteris, Panagiotis G.
- Advances in materials Research / v.6 no.2 / pp.169-184 / 2017
Electro-discharge machining (EDM) is a thermal process comprising a complex metal removal mechanism. The method works by forming a plasma channel between the tool and the workpiece electrodes, leading to the melting and evaporation of the material to be removed. EDM is considered especially suitable for machining complex contours with high accuracy, as well as for materials that are not amenable to conventional removal methods. However, several phenomena can arise and adversely affect the surface integrity of EDMed workpieces. These have to be taken into account and studied in order to optimize the process. Recently, artificial neural networks (ANN) have emerged as a novel modeling technique that can provide reliable results and can readily be integrated into several technological areas. In this paper, we use an ANN, namely the multi-layer perceptron and the back propagation neural network (BPNN), to predict the mean surface roughness of electro-discharge machined surfaces. Comparison of the derived results with experimental findings demonstrates the promising potential of BPNNs for obtaining a reliable and robust approximation of the surface roughness of electro-discharge machined components.
https://doi.org/10.12989/amr.2017.6.2.169

Drug-Drug Interaction Prediction Using Krill Herd Algorithm Based on Deep Learning Method

Al-Marghilani, Abdulsamad
- International Journal of Computer Science & Network Security / v.21 no.6 / pp.319-328 / 2021
Concurrent administration of multiple drugs increases the likelihood of drug-drug interaction (DDI), because one drug may affect the activity of others. DDI can have negative or positive effects on therapeutic outcomes, so DDIs need to be identified to improve drug safety. Although several DDI systems exist to predict interactions, it has become infeasible to keep up with the rapidly growing volume of biomedical text. Most existing DDI systems address classification and rely heavily on handcrafted features, some of which depend on domain-specific tools. The objective of this paper is to predict DDI so as to avoid adverse effects caused by the drugs consumed. Our methodology combines long short-term memory (LSTM) with the Krill Herd Algorithm (KHA), denoted LSTM-KHA, for DDI detection. Similarities among drugs are measured with a drug pair similarity calculation. KHA is used to find the optimal weights, which the LSTM then uses to predict DDI. Experiments were conducted on three datasets, DS1 (CYP), DS2 (NCYP), and DS3, taken from the DrugBank database. The proposed work is evaluated with performance metrics such as accuracy, recall, precision, F-measure, AUPR, AUC, and AUROC. The experimental results indicate that the proposed method outperforms other existing methods for predicting DDI, and that LSTM-KHA produces reasonable performance metrics compared with the existing DDI prediction model.
https://doi.org/10.22937/IJCSNS.2021.21.6.41

A Multiclass Classification of the Security Severity Level of Multi-Source Event Log Based on Natural Language Processing (자연어 처리 기반 멀티 소스 이벤트 로그의 보안 심각도 다중 클래스 분류)

Seo, Yangjin
- Journal of the Korea Institute of Information Security & Cryptology / v.32 no.5 / pp.1009-1017 / 2022
Log data has been used as a basis for understanding and determining the main functions and state of information systems, and as an important input for various cybersecurity applications. Extracting the necessary information from log data, making decisions with it, and taking suitable countermeasures are essential for protecting systems and operating them stably and reliably. However, because of the explosive increase in the types and volume of logs, it is quite challenging to handle this effectively and efficiently with existing tools. Therefore, this study proposes a multiclass classification of the security severity level of multi-source event logs using machine learning based on natural language processing. Experimental results with 472,972 training and test samples show that our approach achieved an accuracy of 99.59%.
https://doi.org/10.13089/JKIISC.2022.32.5.1009

Nanotechnology in early diagnosis of gastro intestinal cancer surgery through CNN and ANN-extreme gradient boosting

Y. Wenjing;T. Yuhan;Y. Zhiang;T. Shanhui;L. Shijun;M. Sharaf
- Advances in nano research / v.15 no.5 / pp.451-466 / 2023
Gastrointestinal cancer (GC) is a prevalent malignant tumor of the digestive system that poses a severe health risk to humans. Due to the specific organ structure of the gastrointestinal system, both endoscopic and MRI diagnoses of GIC have limited sensitivity. The primary factors influencing curative efficacy in GIC patients are drug inefficacy and high recurrence rates in surgical and pharmacological therapy. Due to its unique optical features, good biocompatibility, surface effects, and small size effects, nanotechnology is a developing and advanced area of study for the detection and treatment of cancer. Because of its deep location and complex surgery, gastrointestinal cancer is very difficult to diagnose and treat. Nanotechnology enables the early diagnosis and urgent treatment of gastrointestinal illness. As diagnostic and therapeutic tools, nanoparticles directly target tumor cells, allowing their detection and removal. XGBoost, a classification method known for producing numerous winning solutions in data analysis competitions, was used to capture nonlinear relations among many input variables and outcomes using the boosting approach to machine learning. The research sample included 300 GC patients, comprising 190 males (72.2% of the sample) and 110 females (27.8%). Using convolutional neural networks (CNN) and artificial neural networks (ANN)-eXtreme Gradient Boosting (XGBoost), the patients' mean ± SD age was 50.42 ± 13.06. High-risk behaviors (P = 0.070), age at diagnosis (P = 0.037), distant metastasis (P = 0.004), and tumor stage (P = 0.015) were shown to have a statistically significant link with GC patient survival. AUC was 0.92, sensitivity was 81.5%, specificity was 90.5%, and accuracy was 84.7 when analyzing stomach images.
https://doi.org/10.12989/anr.2023.15.5.451

Students' Performance Prediction in Higher Education Using Multi-Agent Framework Based Distributed Data Mining Approach: A Review

M.Nazir;A.Noraziah;M.Rahmah
- International Journal of Computer Science & Network Security / v.23 no.10 / pp.135-146 / 2023
An effective educational program calls for an innovative structure that improves the efficacy of higher education, accelerating the achievement of desired results and reducing the risk of failure. Educational Decision Support Systems (EDSS) are currently a prominent topic in educational systems, as they allow student results to be monitored and evaluated as students progress. Inadequate information systems have difficulty taking full advantage of EDSS owing to insufficient accuracy, incorrect analysis of characteristics, and inadequate databases. Data Mining Techniques (DMTs) provide helpful tools for finding models or patterns in data and are extremely useful in decision-making. Several researchers have worked on distributed data mining with multi-agent technology. The rapid growth of network technology and IT use has led to the widespread use of distributed databases. This article explains the available data mining technology and the distributed data mining system framework. A distributed data mining approach is used in this work to construct a classifier capable of predicting the success of students in the economic domain. This research also discusses the Intelligent Knowledge Base Distributed Data Mining framework, which assesses student performance on mid-term and final-term exams using multi-agent system-based educational mining techniques. Using single and ensemble-based classifiers, this study investigates the factors that influence student performance in higher education and constructs a classification model that can predict academic achievement. We also discuss the importance of multi-agent systems and comparative machine learning approaches in EDSS development.
https://doi.org/10.22937/IJCSNS.2023.23.10.17

Development of Classification Model for hERG Ion Channel Inhibitors Using SVM Method (SVM 방법을 이용한 hERG 이온 채널 저해제 예측모델 개발)

Gang, Sin-Moon;Kim, Han-Jo;Oh, Won-Seok;Kim, Sun-Young;No, Kyoung-Tai;Nam, Ky-Youb
- Journal of the Korean Chemical Society / v.53 no.6 / pp.653-662 / 2009
Developing effective tools for predicting the absorption, distribution, metabolism, excretion, and toxicity (ADME/T) properties of new chemical entities at an early stage of drug design is one of the most important tasks in drug discovery and development today. As one such attempt, support vector machines (SVM) have recently been used to predict ADME/T-related properties. However, two problems in SVM modeling, feature selection and parameter setting, are still far from solved. Both have been shown to be crucial to the efficiency and accuracy of SVM classification. In particular, feature selection and optimal SVM parameter setting influence each other, which indicates that they should be dealt with simultaneously. In this account, we present an integrated practical solution in which a genetic algorithm (GA) is used for feature selection and a grid search (GS) method for parameter optimization. hERG ion-channel inhibitor classification models for ADME/T-related properties were built to assess and test the proposed GA-GS-SVM. We generated six models, three single models and three ensemble models, using a training set of 1,891 compounds, and validated them with an external test set of 175 compounds. We compared the single models with the ensemble models to address the data imbalance problem. Using the ensemble models improved prediction accuracy.
https://doi.org/10.5012/jkcs.2009.53.6.653
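
One way to handle feature selection and parameter setting together, as the abstract above argues is necessary, is to run a grid search inside the fitness function of a genetic algorithm. The sketch below shows only that control flow. How the paper actually couples GA and GS is not stated in the abstract, so the nesting is an assumption, and the SVM is replaced by a hypothetical `toy_score` function so that the example stays self-contained; the feature count, grids, and GA settings are likewise made up.

```python
import itertools
import random

N_FEATURES = 8
USEFUL = {0, 2, 5}                 # hypothetical "informative" descriptors
C_GRID = [0.1, 1.0, 10.0]          # hypothetical grid for the SVM cost parameter
GAMMA_GRID = [0.01, 0.1, 1.0]      # hypothetical grid for the RBF kernel width


def toy_score(mask, c, gamma):
    """Stand-in for a cross-validated SVM accuracy. This is NOT a real model."""
    hits = sum(1 for i in USEFUL if mask[i])
    extras = sum(mask) - hits
    param_penalty = abs(c - 1.0) / 100 + abs(gamma - 0.1) / 10
    return hits / len(USEFUL) - 0.05 * extras - param_penalty


def grid_search(mask):
    """Return the best (score, C, gamma) for one feature subset."""
    return max((toy_score(mask, c, g), c, g)
               for c, g in itertools.product(C_GRID, GAMMA_GRID))


def evolve(generations=30, pop_size=20, mutation_rate=0.1, seed=0):
    """Genetic algorithm over feature masks; fitness is the best grid-search score."""
    rng = random.Random(seed)
    population = [tuple(rng.randint(0, 1) for _ in range(N_FEATURES))
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: grid_search(m)[0], reverse=True)
        parents = ranked[: pop_size // 2]          # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_FEATURES)     # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit ^ (rng.random() < mutation_rate) for bit in child)
            children.append(child)
        population = parents + children
    best = max(population, key=lambda m: grid_search(m)[0])
    return best, grid_search(best)


best_mask, (best_score, best_c, best_gamma) = evolve()
```

In a real setting `toy_score` would be replaced by a cross-validated SVM evaluation on the selected descriptor columns, which makes each fitness call considerably more expensive than in this sketch.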
