• Title/Summary/Keyword: Decision-trees

Syntactic Category Prediction for Improving Parsing Accuracy in English-Korean Machine Translation (영한 기계번역에서 구문 분석 정확성 향상을 위한 구문 범주 예측)

  • Kim Sung-Dong
    • The KIPS Transactions:PartB / v.13B no.3 s.106 / pp.345-352 / 2006
  • A practical English-Korean machine translation system should be able to translate long sentences quickly and accurately. The intra-sentence segmentation method has been proposed and has contributed to speeding up syntactic analysis. This paper proposes a syntactic category prediction method using decision trees to obtain accurate parsing results. In parsing with segmentation, each segment is parsed separately and the results are combined to generate the sentence structure. Syntactic category prediction can help select more accurate analysis structures after partial parsing, and thus can improve parsing accuracy. We construct features for predicting syntactic categories from the parsed Wall Street Journal corpus and generate decision trees. In the experiments, we compare the performance with predictions made by human-built rules, trigram probabilities, and neural networks. We also show how much category prediction contributes to improving translation quality.

How to Define the Content of a Job-Specific Worker's Health Surveillance for Hospital Physicians?

  • Ruitenburg, Martijn M.;Frings-Dresen, Monique H.W.;Sluiter, Judith K.
    • Safety and Health at Work / v.7 no.1 / pp.18-31 / 2016
  • Background: A job-specific Worker's Health Surveillance (WHS) for hospital physicians is a preventive occupational health strategy aiming at early detection of their diminished work-related health in order to improve or maintain physicians' health and quality of care. This study addresses what steps should be taken to determine the content of a job-specific WHS for hospital physicians and outlines that content. Methods: Based on four questions, decision trees were developed for physical and psychological job demands and for biological, chemical, and physical exposures to decide whether or not to include work-related health effects related to occupational exposures or aspects of health reflecting insufficient job requirements. Information was gathered locally through self-reporting and systematic observations at the workplace and from evidence in international publications. Results: Information from the decision trees on the prevalence and impact of the health- or work-functioning effect led to inclusion of occupational exposures (e.g., biological agents, emotionally demanding situations), job requirements (e.g., sufficient vision, judging ability), or health effects (e.g., depressive symptoms, neck complaints). Additionally, following the Dutch guideline for occupational physicians and based on specific job demands, screening for cardiovascular diseases, work ability, drug use, and alcohol consumption was included. Targeted interventions were selected when a health or work functioning problem existed and were chosen based on evidence for effectiveness. Conclusion: The process of developing a job-specific WHS for hospital physicians was described and the content presented, which might serve as an example for other jobs. Before implementation, it must first be tested for feasibility and acceptability.

Asian Ethnic Group Classification Model Using Data Mining (데이터마이닝 방법을 이용한 아시아 민족 분류 모형 구축)

  • Kim, Yoon Geon;Lee, Ji Hyun;Cho, Sohee;Kim, Moon Young;Lee, Soong Deok;Ha, Eun Ho;Ahn, Jae Joon
    • The Korean Journal of Legal Medicine / v.41 no.2 / pp.32-40 / 2017
  • In addition to identifying genetic differences between target populations, it is also important to determine the impact of those differences on the respective populations. In recent years, the number of cases requiring this approach has been increasing, and thus various statistical methods must be considered. In this study, genetic data from populations of Southeast and Southwest Asia were collected, and several statistical approaches were evaluated on the Y-chromosome short tandem repeat data. To develop a more accurate and practical classification model, we applied gradient boosting and ensemble techniques. In distinguishing between the Southeast and Southwest Asian populations, these classification models performed better overall than the decision trees and regression models used in the past. In conclusion, this study suggests that additional statistical approaches, such as data mining techniques, could provide more useful interpretations for forensic analyses. These trials are expected to serve as the basis for further studies that extend from the target regions to the entire continent of Asia and that use additional genes, such as mitochondrial genes.

IoT Enabled Intelligent System for Radiation Monitoring and Warning Approach using Machine Learning

  • Muhammad Saifullah;Imran Sarwar Bajwa;Muhammad Ibrahim;Mutyyba Asgher
    • International Journal of Computer Science & Network Security / v.23 no.5 / pp.135-147 / 2023
  • The Internet of Things has revolutionized every field of life through the use of artificial intelligence and machine learning. It is being used successfully for radiation monitoring and for the prediction of ultraviolet and electromagnetic rays. However, no dedicated system is available that can monitor and detect these waves. Therefore, in the present study, an IoT-enabled intelligent system based on machine learning was developed for predicting radiation and its effects on human beings. A sensor-based system was installed to detect harmful radiation present in the environment; it alerts people within the danger zone with a buzz so that they can move to a safer place. Along with this automatic sensor system, a dataset of recorded sensor values was created. To study the effects of these rays, the researchers used Support Vector Machine, Gaussian Naïve Bayes, Decision Trees, Extra Trees, Bagging Classifier, Random Forests, Logistic Regression, and Adaptive Boosting classifiers. The results show high accuracy and indicate that the proposed system is reliable and accurate for the detection and monitoring of waves. For outcome prediction, the Adaptive Boosting Classifier showed the best accuracy, 81.77%, compared with the other classifiers.

Decision Tree based Disambiguation of Semantic Roles for Korean Adverbial Postpositions in Korean-English Machine Translation (한영 기계번역에서 결정 트리 학습에 의한 한국어 부사격 조사의 의미 중의성 해소)

  • Park, Seong-Bae;Zhang, Byoung-Tak;Kim, Yung-Taek
    • Journal of KIISE:Software and Applications / v.27 no.6 / pp.668-677 / 2000
  • In Korean, case postpositions determine the syntactic roles of phrases, and a postposition may have more than one meaning. In particular, adverbial postpositions make translation from Korean to English difficult because they can have various meanings. In this paper, we describe a method for resolving such semantic ambiguities of Korean adverbial postpositions using decision trees. The training examples for decision tree induction are extracted from a corpus of 0.5 million words, and the semantic roles of adverbial postpositions are classified into 25 classes. The lack of training examples in decision tree induction is overcome by clustering words into classes using a greedy clustering algorithm. Cross-validation results show that the presented method achieved an average precision of 76.2%, a 26.0% improvement over the baseline that assigns each adverbial postposition its most frequently appearing semantic role.

Study on the Classification Methodology for DSRC Travel Speed Patterns Using Decision Trees (의사결정나무 기법을 적용한 DSRC 통행속도패턴 분류방안)

  • Lee, Minha;Lee, Sang-Soo;Namkoong, Seong;Choi, Keechoo
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.13 no.2 / pp.1-11 / 2014
  • In this paper, travel speed patterns were derived from historical DSRC travel speed data using the decision tree technique, in order to improve the usability of the massive amount of historical data. The patterns were designed to reflect real spatio-temporal variations by generating pattern units classified by month, time of day, and highway section. The study area covered the sections from Seoul TG to Ansung IC on the Gyung-bu highway, a stretch in South Korea where peak-period traffic frequently occurs. The decision tree technique was applied to categorize travel speed according to day of week. As a result, five pattern groups were generated: (Mon)(Tue Wed Thu)(Fri)(Sat)(Sun). Statistical verification was conducted to test the validity of the patterns on nine different highway sections, and the accuracy of fitting was found to be 93%. To reduce the error of the travel patterns against individual travel speed data, the inclusion of four additional variables was also tested. Among those variables, the 'traffic condition on previous month' variable improved the pattern grouping accuracy by reducing speed variance by 50% in the decision tree model developed.

Development of Cartographic Models of Openspace Management for Practical Use of GIS (GIS를 활용한 녹지관리 지도모델의 개발)

  • Gwak, Haeng-Goo;Cho, Young-Hwan
    • Journal of Korean Society for Geospatial Information Science / v.5 no.2 s.10 / pp.45-54 / 1997
  • A methodology for managing urban open space effectively using GIS (Geographic Information System) was developed, focusing on landscaped trees. The cartographic modeling technique was used for the practical application of GIS in a case study of the Children's Park in Kwangju city. First, spatial and attribute information for efficient landscaped tree management was acquired through the development of a tree management cartographic model. Second, the location and attribute information of individual trees can be applied as a means of decision making in tree management. Third, the optimal path of tree management and the priority of management in the work process for the selected urban open space could be determined according to the objective of park management.

Prediction Models of Mild Cognitive Impairment Using the Korea Longitudinal Study of Ageing (고령화연구패널조사를 이용한 경도인지장애 예측모형)

  • Park, Hyojin;Ha, Juyoung
    • Journal of Korean Academy of Nursing / v.50 no.2 / pp.191-199 / 2020
  • Purpose: The purpose of this study was to compare the sociodemographic characteristics of a normal cognitive group and a mild cognitive impairment group, and to establish prediction models of Mild Cognitive Impairment (MCI). Methods: This study was a secondary data analysis using data from "the 4th Korea Longitudinal Study of Ageing" of the Korea Employment Information Service. A total of 6,405 individuals, including 1,329 individuals with MCI and 5,076 individuals with normal cognitive abilities, were part of the study. Based on the panel survey items, the research used 28 variables. The methods of analysis included a χ2-test, logistic regression analysis, decision tree analysis, predicted error rate, and an ROC curve, calculated using SPSS 23.0 and SAS 13.2. Results: In the MCI group, the mean age was 71.4 and 65.8% of the participants were women. There were statistically significant differences in gender, age, and education between the two groups. Predictors of MCI determined by logistic regression analysis were gender, age, education, instrumental activity of daily living (IADL), perceived health status, participation group, cultural activities, and life satisfaction. Decision tree analysis identified education, age, life satisfaction, and IADL as predictors of MCI. Conclusion: The accuracy of the logistic regression model for MCI is slightly higher than that of the decision tree model. The prediction model for MCI established in this study may be used to identify middle-aged and elderly people at risk of MCI. Therefore, this study may contribute to the prevention and reduction of dementia.

The study of foreign exchange trading revenue model using decision tree and gradient boosting (외환거래에서 의사결정나무와 그래디언트 부스팅을 이용한 수익 모형 연구)

  • Jung, Ji Hyeon;Min, Dae Kee
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.161-170 / 2013
  • FX (Foreign Exchange) is a form of exchange for the global decentralized trading of international currencies. In simple terms, Forex is the simultaneous purchase and sale of currency, or the exchange of one country's currency for another's. We can find consistent trading rules by comparing the gradient boosting method and the decision tree method. Methods such as time series analysis, which are used for the prediction of financial markets, have an advantage in long-term forecasting. On the other hand, it is difficult for them to reflect rapidly changing price fluctuations in the short term. Therefore, in this study, the gradient boosting method and the decision tree method are applied to short-term data in order to derive rules for the revenue structure of the FX market, and the stability and predictive performance of the model are evaluated.

Pattern Analysis of Clinical Signs in Cultured Olive Flounder, Paralichthys Olivaceus, with Edwardsielosis using the Decision Tree Technique (의사결정 나무 기법을 이용한 양식넙치의 에드워드병 증상 패턴 분석)

  • Kim, Kyeong-Im;Jung, Sung-Ju;Kim, Sung-Hyun;Han, Soon-Hee;Ceong, Hee-Taek;Kim, Tae-Ho;Park, Jeong-Seon
    • The Journal of the Korea institute of electronic communication sciences / v.16 no.4 / pp.661-674 / 2021
  • Edwardsiellosis is difficult to treat in cultured olive flounder, Paralichthys olivaceus. It is present in the fish for a long period during all growth stages, and it often leads to mass mortalities. In this paper, the clinical patterns of Edwardsiellosis were analyzed with a decision tree technique, based on various clinical signs of diseased cultured olive flounder, by dividing the data into whole-water temperature, low-water temperature, low-high water temperature, high-water temperature, and high-low water temperature groups. In the clinical sign patterns of the decision trees analyzed in the experiment, clinical signs in the liver, such as liver nodules, liver hemorrhages, and liver degeneration, were selected as the criteria for determining Edwardsiellosis. The selected clinical signs are known as the major clinical signs of Edwardsiellosis, and consultation with fishery disease experts confirmed that the clinical signs of Edwardsiellosis were successfully identified in this study.