• Title/Summary/Keyword: Decision Tree Regression

Activity Recognition of Workers and Passengers onboard Ships Using Multimodal Sensors in a Smartphone (선박 탑승자를 위한 다중 센서 기반의 스마트폰을 이용한 활동 인식 시스템)

  • Piyare, Rajeev Kumar;Lee, Seong Ro
    • The Journal of Korean Institute of Communications and Information Sciences / v.39C no.9 / pp.811-819 / 2014
  • Activity recognition is a key component in identifying the context of a user in order to provide services in applications such as medical, entertainment, and tactical scenarios. Instead of applying numerous sensor devices, as observed in many previous investigations, we propose using a smartphone with its built-in multimodal sensors as an unobtrusive sensor device for recognizing six daily physical activities. As an improvement on previous work, accelerometer, gyroscope, and magnetometer data are fused to recognize activities more reliably. The evaluation indicates that the IBK classifier using a window size of 2 s with 50% overlap yields the highest accuracy (up to 99.33%). To achieve this peak accuracy, simple time-domain and frequency-domain features were extracted from the raw sensor data of the smartphone.
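
The windowing step mentioned in the abstract above (2 s windows with 50% overlap, followed by simple feature extraction) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the 50 Hz sampling rate, the synthetic signal, and the function names `sliding_windows` and `time_domain_features` are assumptions, and only two time-domain features are shown.

```python
from statistics import mean, pstdev


def sliding_windows(samples, rate_hz, window_s=2.0, overlap=0.5):
    """Split a 1-D signal into fixed-length windows with fractional overlap."""
    size = int(rate_hz * window_s)             # samples per window
    step = max(1, int(size * (1 - overlap)))   # hop between window starts
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, step)]


def time_domain_features(window):
    """Two simple time-domain features: mean and population standard deviation."""
    return {"mean": mean(window), "std": pstdev(window)}


# Hypothetical single accelerometer axis, sampled at an assumed 50 Hz for 6 s.
signal = [float(i % 10) for i in range(300)]
windows = sliding_windows(signal, rate_hz=50)
features = [time_domain_features(w) for w in windows]
```

Each feature dictionary would then form one row of the training set passed to a classifier; fusing several sensors amounts to concatenating the per-sensor features for the same window.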

A Data Mining Approach to Intelligent College Road Map Advice Service (데이터 마이닝을 이용한 지능형 전공지도시스템 연구)

  • Choe, Deok-Won;Jo, Gyeong-Pil;Sin, Jin-Gyu
    • Proceedings of the Korea Intelligent Information System Society Conference / 2005.05a / pp.266-273 / 2005
  • Data mining techniques enable us to generate useful information for decision support from the data that are generated and accumulated in the course of routine organizational management activities. A college administration system is a typical example: it produces a warehouse of student records as each student enters a college and undertakes curricular and extracurricular activities. So far, these data have been used for very limited student service purposes, such as issuance of transcripts, graduation evaluation, and GPA calculation. In this paper, we use Holland career search test results, TOEIC scores, course work lists, and GPA scores as the input for data mining and for generating student advisory information. Factor analysis, AHP (Analytic Hierarchy Process), artificial neural network, and CART (Classification And Regression Tree) techniques are deployed in the data mining process. Since these data mining techniques are very powerful in processing and discovering useful knowledge and information from large-scale student databases, we can expect highly sophisticated student advisory knowledge and services that may not be obtainable from human student advisors.

Effects of soft occlusal appliance therapy for patients with masticatory muscle pain

  • Kashiwagi, Kosuke;Noguchi, Tomoyasu;Fukuda, Kenichi
    • Journal of Dental Anesthesia and Pain Medicine / v.21 no.1 / pp.71-80 / 2021
  • Background: The options for stabilization appliance therapy for masticatory muscle pain include soft occlusal and hard stabilization appliances. A previous study suggested that hard stabilization appliance therapy was effective for patients with local myalgia who developed long facets on their occlusal appliances. The objective of this study was to identify patients in whom a soft occlusal appliance should be used to treat masticatory muscle pain by analyzing the type of muscle pain present and patient factors that influenced the effectiveness of this treatment. Methods: The study included 42 patients diagnosed with local myalgia or myofascial pain according to the Diagnostic Criteria for Temporomandibular Disorders Diagnostic Decision Tree. The analysis of patient factors included variables believed to be associated with temporomandibular disorders. First, a temporary screening appliance was used for 2 weeks to assess each patient for bruxism during sleep. Soft appliance therapy was then started. For each patient, the effectiveness of the appliance was evaluated according to the intensity of tenderness during muscle palpation and the treatment satisfaction score at one month after starting treatment. Results: Data from 37 of the 42 patients were available for analysis. Twenty-five patients reported satisfaction with the appliance. In logistic regression analysis, the odds ratio for reduction of facet length was 1.998. Nineteen patients showed at least a 30% improvement in the visual analog scale score. The odds ratio for local myalgia was 18.148. Conclusion: Soft appliance therapy may be used in patients with local myalgia. Moreover, patients who develop short facets on the appliance surface are likely to be satisfied with soft appliance therapy. Soft appliance therapy may be appropriate for patients with local myalgia who develop short facets on their occlusal appliance.

Ensemble Machine Learning Model Based YouTube Spam Comment Detection (앙상블 머신러닝 모델 기반 유튜브 스팸 댓글 탐지)

  • Jeong, Min Chul;Lee, Jihyeon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.5 / pp.576-583 / 2020
  • This paper proposes a technique for identifying spam comments on YouTube, which have recently seen tremendous growth. Because it is possible to monetize through advertising, spammers on YouTube promote their channels or videos in the comments of popular videos or leave comments unrelated to the video. YouTube operates its own spam-blocking system, but it still fails to block them properly and efficiently. Therefore, we examined related studies on YouTube spam comment screening and conducted classification experiments with six different machine learning techniques (decision tree, logistic regression, Bernoulli Naive Bayes, random forest, support vector machine with linear kernel, and support vector machine with Gaussian kernel) and an ensemble model combining these techniques, using comment data from popular music videos by Psy, Katy Perry, LMFAO, Eminem, and Shakira.
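
One common way to combine several classifiers into an ensemble is hard (majority) voting. The abstract above does not say which combination rule the authors used, so the following is only a sketch under that assumption; the function name `majority_vote` and the toy predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by hard voting."""
    combined = []
    # zip(*...) regroups the labels so that each tuple holds all votes for one sample.
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers on four comments (1 = spam, 0 = not spam).
model_outputs = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
ensemble = majority_vote(model_outputs)
```

With an odd number of voters and two classes no ties can occur; with an even number of voters, a tie-breaking rule would have to be chosen explicitly.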

A study on forecasting attendance rate of reserve forces training based on Data Mining (데이터마이닝에 기반한 예비군훈련 입소율 예측에 관한 연구)

  • Cho, Sangjoon;Ma, Jungmok
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.3 / pp.261-267 / 2021
  • The mission of a reserve forces unit is to prepare good training for reserve forces during peacetime. To train well, units need to organize support agents properly, but they have difficulty doing so because they lack unit members. For that reason, units forecast the monthly attendance rate of reserve forces (using the previous year's result) to organize support agents and the unit schedule. However, the existing planning method can produce large errors relative to the actual attendance rate, which has a negative effect on training performance. Therefore, more accurate forecast models are needed to reduce attendance rate errors. This paper proposes an attendance rate forecast model using data mining. To verify the proposed data mining based model, the existing planning method was compared with the proposed model using real data. The results showed that the proposed model outperforms the existing planning method.

Prediction of drowning person's route using machine learning for meteorological information of maritime observation buoy

  • Han, Jung-Wook;Moon, Ho-Seok
    • Journal of the Korea Society of Computer and Information / v.27 no.3 / pp.1-12 / 2022
  • In the event of a maritime distress accident, rapid search and rescue operations using rescue assets are very important for ensuring the safety and lives of drowning persons at sea. In this paper, we analyzed the surface layer current in the sea area northwest of Ulleungdo by applying machine learning methods such as multiple linear regression, decision tree, support vector machine, vector autoregression, and LSTM to the meteorological information collected from a maritime observation buoy. We constructed a prediction model with each method and predicted the drowning person's route at sea from the predicted current direction and speed. Comparing the machine learning models applied in this paper using the MAE and RMSE performance measures, the LSTM model performed best. In addition, the LSTM model showed superior performance compared to the other models in terms of the distance between the actual and predicted positions of the drowning person.
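
The two evaluation measures named in the abstract above, MAE and RMSE, follow their standard definitions and can be computed as in this short sketch. The current-speed values are made up for illustration and do not come from the paper.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the average squared difference."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical observed and predicted current speeds (m/s).
observed = [0.30, 0.50, 0.40, 0.60]
forecast = [0.25, 0.55, 0.50, 0.60]
mae_value = mae(observed, forecast)
rmse_value = rmse(observed, forecast)
```

RMSE penalizes large individual errors more heavily than MAE, so it is never smaller than MAE on the same data; reporting both gives a sense of how unevenly the errors are distributed.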

Cross-Technology Localization: Leveraging Commodity WiFi to Localize Non-WiFi Device

  • Zhang, Dian;Zhang, Rujun;Guo, Haizhou;Xiang, Peng;Guo, Xiaonan
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.11 / pp.3950-3969 / 2021
  • Radio Frequency (RF)-based indoor localization technologies play significant roles in various Internet of Things (IoT) services (e.g., location-based services). Most such technologies require that all devices comply with a specified technology (e.g., WiFi, ZigBee, or Bluetooth). However, this requirement limits their application scenarios in today's IoT context, where multiple devices complying with different standards coexist in a shared environment. To bridge this gap, we propose a cross-technology localization approach that is able to localize target nodes using a different type of device. Specifically, the proposed framework reuses the existing WiFi infrastructure, without introducing additional cost, to localize non-WiFi devices (i.e., ZigBee). The key idea is to leverage the interference between devices that share the same operating frequency (e.g., 2.4 GHz). Such interference exhibits unique patterns that depend on the target device's location, so it can be leveraged for cross-technology localization. The proposed framework uses Principal Components Analysis (PCA) to extract salient features of the received WiFi signals, and leverages Dynamic Time Warping (DTW) and Gradient Boosting Regression Tree (GBRT) to improve the robustness of the system. We conduct experiments in a real scenario and investigate the impact of different factors. Experimental results show that the average localization accuracy of our prototype can reach 1.54 m, which demonstrates a promising direction for building cross-technology technologies to fulfill the needs of the modern IoT context.
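
Dynamic Time Warping, one of the components named in the abstract above, measures how well two sequences match when one may be stretched or compressed in time. The following is a minimal textbook sketch of the classic dynamic-programming formulation with absolute difference as the local cost. It is not the authors' implementation, and the signal values are made up.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences (absolute-difference local cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched to an earlier element of b
                cost[i][j - 1],      # b[j-1] also matched to an earlier element of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical signal-strength traces: the second is a time-stretched copy of the first.
reference = [1.0, 2.0, 3.0]
stretched = [1.0, 2.0, 2.0, 3.0]
distance = dtw_distance(reference, stretched)
```

Because the stretched trace only repeats a value, the warping path absorbs the difference and the cost stays at zero, whereas a point-by-point comparison would not even be defined for sequences of unequal length.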

Crop Yield Estimation Utilizing Feature Selection Based on Graph Classification (그래프 분류 기반 특징 선택을 활용한 작물 수확량 예측)

  • Ohnmar Khin;Sung-Keun Lee
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.6 / pp.1269-1276 / 2023
  • Crop estimation is essential for meeting multinational food demand, and it depends on numerous factors such as soil, rain, climate, atmosphere, and their interactions. Climate change affects farming yields. We use a dataset that includes temperature, rainfall, humidity, etc. The current research focuses on feature selection with multiple classifiers to assist farmers and agriculturalists. Crop yield estimation using the feature selection approach achieves 96% accuracy. Feature selection affects a machine learning model's performance. Additionally, the current graph classifier achieves 81.5%. Finally, the random forest regressor without feature selection achieves 78% accuracy, and the decision tree regressor without feature selection achieves 67% accuracy. The merit of our research is that it reports experimental results with and without feature selection for the ten proposed algorithms, showing its significance. These findings support learners and students in choosing appropriate models for crop classification studies.
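
A decision tree regressor, as mentioned in the abstract above, is built by repeatedly choosing the split that most reduces the squared error of the target. The sketch below shows that core step for a single feature and a single split (a depth-1 tree). It illustrates the general technique only; the data, the feature choice, and the function names `best_split` and `predict` are hypothetical and not taken from the paper.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing total squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical feature values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        err = sse(left) + sse(right)
        if best is None or err < best[0]:
            threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best = (err, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(x, split):
    """Predict with the depth-1 tree: the mean of the side that x falls on."""
    threshold, left_mean, right_mean = split
    return left_mean if x <= threshold else right_mean


# Hypothetical rainfall (mm) and crop yield (t/ha) observations.
rainfall = [10, 20, 30, 40, 50, 60]
crop_yield = [1.0, 1.2, 1.1, 3.0, 3.2, 3.1]
split = best_split(rainfall, crop_yield)
```

A full tree applies the same search recursively to each side and over all features; a random forest averages many such trees trained on resampled data.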

Data-driven Model Prediction of Harmful Cyanobacterial Blooms in the Nakdong River in Response to Increased Temperatures Under Climate Change Scenarios (기후변화 시나리오의 기온상승에 따른 낙동강 남세균 발생 예측을 위한 데이터 기반 모델 시뮬레이션)

  • Gayeon Jang;Minkyoung Jo;Jayun Kim;Sangjun Kim;Himchan Park;Joonhong Park
    • Journal of Korean Society on Water Environment / v.40 no.3 / pp.121-129 / 2024
  • Harmful cyanobacterial blooms (HCBs) are caused by the rapid proliferation of cyanobacteria and are believed to be exacerbated by climate change. However, the extent to which HCBs will be stimulated in the future due to increased temperature remains uncertain. This study aims to predict the future occurrence of cyanobacteria in the Nakdong River, which has the highest incidence of HCBs in South Korea, based on temperature rise scenarios. Representative Concentration Pathways (RCPs) were used as the basis for these scenarios. Data-driven model simulations were conducted, and out of the four machine learning techniques tested (multiple linear regression, support vector regressor, decision tree, and random forest), the random forest model was selected for its relatively high prediction accuracy. The random forest model was used to predict the occurrence of cyanobacteria. The results of boxplot and time-series analyses showed that under the worst-case scenario (RCP8.5 (2100)), where temperature increases significantly, cyanobacterial abundance across all study areas was greatly stimulated. The study also found that the frequencies of HCB occurrences exceeding certain thresholds (100,000 and 1,000,000 cells/mL) increased under both the best-case scenario (RCP2.6 (2050)) and worst-case scenario (RCP8.5 (2100)). These findings suggest that the frequency of HCB occurrences surpassing a certain threshold level can serve as a useful diagnostic indicator of vulnerability to temperature increases caused by climate change. Additionally, this study highlights that water bodies currently susceptible to HCBs are likely to become even more vulnerable with climate change compared to those that are currently less susceptible.

Protecting Accounting Information Systems using Machine Learning Based Intrusion Detection

  • Biswajit Panja
    • International Journal of Computer Science & Network Security / v.24 no.5 / pp.111-118 / 2024
  • In general, a network-based intrusion detection system is designed to detect malicious behavior directed at a network or its resources. The key goal of this paper is to examine network data and identify whether it is normal or anomalous traffic, specifically for accounting information systems. Today there are a variety of principles for detecting various forms of network-based intrusion. In this paper, we use supervised machine learning techniques. Classification models are used to train on and validate the data. We train the system on a training dataset and then use the trained system to detect intrusions in the testing dataset. In our proposed method, we detect whether the network data is normal or an anomaly. Using this method, we can prevent unauthorized activity on the network and on the systems under that network. Decision Tree and K-Nearest Neighbor are applied in the proposed model to classify the behavior of network traffic data as abnormal or normal. In addition, Logistic Regression Classifier and Support Vector Classification algorithms are used in our model to support the proposed concepts. Furthermore, a feature selection method is used to collect valuable information from the dataset and enhance the efficiency of the proposed approach. The Random Forest machine learning algorithm is used to help the system identify the crucial features and focus on them rather than on all features. The experimental findings revealed that the suggested method for network intrusion detection has a negligible false alarm rate, with the accuracy of the result expected to be between 95% and 100%. Given this high precision, the concept can be used to detect network data intrusion and prevent vulnerabilities on the network.