• Title/Summary/Keyword: score rule

Network Anomaly Detection Technologies Using Unsupervised Learning AutoEncoders (비지도학습 오토 엔코더를 활용한 네트워크 이상 검출 기술)

  • Kang, Koohong
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.4 / pp.617-629 / 2020
  • To overcome the limitations of rule-based intrusion detection systems caused by changes in Internet computing environments, the emergence of new services, and the creativity of attackers, network anomaly detection (NAD) using machine learning and deep learning technologies has received much attention. Most existing machine learning and deep learning technologies for NAD use supervised learning on training data sets labeled 'normal' and 'attack'. This paper presents the feasibility of applying the unsupervised learning AutoEncoder (AE) to NAD, using data sets collected from secured network traffic without labeled responses. To verify the performance of the proposed AE model, we present experimental results in terms of accuracy, precision, recall, f1-score, and ROC AUC value on the NSL-KDD training and test data sets. In particular, we derive a reference AE through in-depth analysis of diverse AEs with varying hyper-parameters, such as the number of layers, while also considering regularization and denoising effects. The reference model achieves f1-scores of 90.4% and 89% for binary classification on the KDDTest+ and KDDTest-21 test data sets, respectively, using a threshold set at the 82nd percentile of the AE reconstruction error on the training data set.
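
The thresholding step mentioned at the end of this abstract can be illustrated with a minimal sketch. It assumes the reconstruction errors have already been computed by some autoencoder; the error values, the function names, and the nearest-rank percentile definition are all illustrative assumptions, not details taken from the paper.

```python
import math

def percentile_nearest_rank(values, q):
    """Return the q-th percentile (0 < q <= 100) using the nearest-rank method."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]

def flag_anomalies(train_errors, test_errors, q=82):
    """Flag test samples whose reconstruction error exceeds the q-th
    percentile of the training reconstruction errors."""
    threshold = percentile_nearest_rank(train_errors, q)
    return threshold, [err > threshold for err in test_errors]

# Made-up reconstruction errors (assumption): 0.01, 0.02, ..., 1.00
train_errors = [i / 100 for i in range(1, 101)]
test_errors = [0.05, 0.50, 0.90, 1.20]

threshold, flags = flag_anomalies(train_errors, test_errors)
```

With these made-up values, the two test samples whose error exceeds the training threshold are flagged as anomalous. Other percentile definitions (for example, linear interpolation) would give slightly different thresholds.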

The Prevalence Subjective Symptom of Cumulative Trauma Disorders and Related Risk Factors among Workers in Automobile Assembly Plant (자동차 조립공장 근로자의 누적외상성질환 자각증상 호소율과 관련 위험요인)

  • Kim, Chang-Sun;Kim, Kwang-Jong;Choi, Jae-wook;Yoon, Soo-Jong
    • Journal of Korean Society of Occupational and Environmental Hygiene / v.11 no.1 / pp.85-91 / 2001
  • Background: Cumulative trauma disorders are spreading to various occupations in many advanced countries, including the United States, and already account for a considerable share of all occupational disorders. As a result, the seriousness of worker health problems and the economic losses from lost production, recuperation expenses, and the like are increasing across society. In Korea, neither accurate fact-finding survey data on cumulative trauma disorders nor related measures have been prepared in detail, which implies that the problem may become serious in the future. Purpose: This study aims to identify the risk factors of cumulative trauma disorders in a production plant, to forecast the rate of occurrence of cumulative trauma disorders on the basis of subjective symptoms, and to propose realistic and effective prevention measures. It targets a domestic passenger car assembly plant and analyzes, from an ergonomic perspective, the risk factors that cause cumulative trauma disorders. Methods: Work type, on-duty hours, rest time, and subjective symptoms of cumulative trauma disorders were investigated by questionnaire among workers in the press, car body, coating, and outfitting plants. Results: 81.2% of workers reported a physical burden from their present work, and the body part with the highest prevalence was the waist. The higher the rule score, the greater the number of workers who complained about working intensity; and the higher the age, work duration, and tool use time, the higher the prevalence of subjective symptoms by body part. Among workers who sought treatment for subjective symptoms, the largest number did so at a drugstore.

A Cause Analysis of Missed Fractures in an Emergency Medical Center (응급센터에 내원한 외상환자에서 간과된 골절의 요인 분석)

  • Park, Deuk-Hyun;Lee, Sung-Sil;Kim, Dong-Un;Cho, Hyun-Young;Lee, Young-Geun;Kim, Jun-Su;Jun, Jin;Kim, Young-Sik;Ha, Young-Rock;Sin, Tae-Yong
    • Journal of Trauma and Injury / v.22 no.1 / pp.37-43 / 2009
  • Purpose: A missed fracture is a very common occurrence in the Emergency Department (ED) and can have serious consequences because delays in treatment may result in long-term disability. It is also one of the most common causes of medicolegal issues. We analyzed the causes of missed fractures by using bone scans, which are known to be an effective tool for diagnosing bony lesions. Methods: We reviewed the medical records of trauma patients who underwent a bone scan after being discharged from the ED from September 2006 to March 2008. Cases of missed fractures were identified by using electronic medical records to review each diagnosis. A missed fracture was defined on the basis of the radiologist's reading of the bone scan. We decided that there was no fracture if the radiologist's reading was 'trauma-related lesion' or 'cannot rule out fracture'. Enrolled patients were analyzed by age, sex, time until bone scan, and Injury Severity Score (ISS). Patients were divided into two groups by mental status, alert and not alert, and were also split into a diagnosis group and a missed fracture group. The ISS was also used to determine the severity of the patient's injury upon discharge from the ED. Results: A total of 532 patients were enrolled in this study. Of those, 487 patients were in the diagnosis group, and 45 patients (8.4%) were discovered to have had a fracture. Of the 45 missed fracture patients, 34 patients (6.4%) had one-site fractures, 8 patients (1.5%) had two-site fractures, and 3 patients (0.6%) had three-site fractures. The most commonly missed fracture was multiple rib fractures (18 patients, 30.5%), followed by lumbosacral (LS) spine fractures (10 patients, 16.9%), thoracic spine fractures (8 patients, 13.6%), and clavicle fractures (6 patients, 10.2%). Mean age was $50.12{\pm}18.54$ years in the diagnosis group and $57.38{\pm}16.88$ years in the missed fracture group.
For the diagnosis group, the mean ISS was $9.03{\pm}8.26$, but in the missed fracture group it was $17.53{\pm}9.69$. Missed fractures were much more frequent in the not-alert mentality group (p<0.01) and in the high ISS ($\mathrm{ISS}{\geq}16$) group (p<0.01). Conclusion: Missed fractures occur most frequently in patients of old age, with not-alert mentality, and with a high ISS. Multiple rib and spine fractures were found to be the most frequently missed fractures, regardless of trauma severity. This study also shows a high possibility of clavicle and scapula fractures in patients with severe trauma.

The structure of consciousness of board system collegian who attended in fisheries and maritime college (수, 해운 승선계열 대학생의 의식 구조)

  • Lee, Kil-Rae;Bae, Seok-Je;Hong, Sung-Kun
    • Journal of Fisheries and Marine Sciences Education / v.6 no.2 / pp.143-160 / 1994
  • We surveyed, by questionnaire, the values and structure of consciousness of students in shipboard-training programs who will be employed in the fisheries and maritime industries. The results were as follows. 38.9% of students had a firm conviction of their own about their motive for enrolling and their choice of department; students had chosen their department on the recommendation of those around them or according to their own test scores rather than their aptitude and prospects. In addition, 55% were not satisfied with attending their college, which undermined the effectiveness of education. 74.3% responded that education at fisheries and maritime colleges was appropriately divided between theory and practice, and 77.1% that it balanced technical education and character education, but as a general rule only 7.1% pointed to the education of ocean-going licensed officers, so the curriculum and content of education need to be reformed. As to the relationship between professors and students, 38.4% cited a professor who understands them well, 18.1% a professor who teaches well, 13.3% a personal relationship with the professor, and 30.6% a benefit to their choice of vocation; consequently, professors should foster harmony between education and industry. As for what students considered worthwhile, 43.6% cited friendships, 30.3% club activities, and 6.6% attending lectures; leisure activities after school were spent with friends by 74.7% and alone by 16.4%, so professors should give guidance on leisure activities after class. Most students had a good relationship with their parents (91.6%), while 8.4% reported a bad relationship. The most serious worry among students was career problems (48.9%), followed by problems with the opposite sex (22.5%). Those they consulted about their worries were acquaintances (54.9%), parents (5.1%), and professors (2.3%), and 20.9% did not consult anyone. Parents and professors were thus not the people students consulted about their worries, so professors need to strengthen character education. As to jobs after graduation, 50.5% of students wished to work in the fisheries and maritime industries, whereas 29.1% wished to take a job unrelated to their major; in particular, 26.3% wanted to go to sea (fisheries 23.3%, maritime 30.5%). We must therefore adopt countermeasures for globalization and effective investment in fisheries and maritime colleges.

Deletion-Based Sentence Compression Using Sentence Scoring Reflecting Linguistic Information (언어 정보가 반영된 문장 점수를 활용하는 삭제 기반 문장 압축)

  • Lee, Jun-Beom;Kim, So-Eon;Park, Seong-Bae
    • KIPS Transactions on Software and Data Engineering / v.11 no.3 / pp.125-132 / 2022
  • Sentence compression is a natural language processing task that generates concise sentences that preserve the important meaning of the original sentence. For grammatically appropriate sentence compression, early studies utilized human-defined linguistic rules. Furthermore, as sequence-to-sequence models perform well on various natural language processing tasks such as machine translation, there have been studies that use them for sentence compression. However, the linguistic rule-based studies require all rules to be defined by humans, and the sequence-to-sequence model-based studies require a large amount of parallel data for model training. To address these challenges, Deleter, a sentence compression model that leverages the pre-trained language model BERT, was proposed. Because Deleter uses a perplexity-based score computed with BERT to compress sentences, neither linguistic rules nor a parallel dataset is required. However, because Deleter compresses sentences considering only perplexity, it does not reflect the linguistic information of the words in the sentences. Furthermore, since the datasets used for pre-training BERT are far from compressed sentences, this can lead to incorrect sentence compression. To address these problems, this paper proposes a method that quantifies the importance of linguistic information and reflects it in perplexity-based sentence scoring. In addition, by fine-tuning BERT with a corpus of news articles, which often contain proper nouns and often omit unnecessary modifiers, we allow BERT to measure perplexity in a way appropriate for sentence compression. Evaluations on English and Korean datasets confirm that the sentence compression performance of sentence-scoring-based models can be improved by the proposed method.
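
As a rough illustration of deletion-based compression driven by a perplexity score, the sketch below greedily removes the token whose deletion yields the lowest perplexity. It is not the Deleter model or the paper's method: a hypothetical toy unigram table (`TOY_PROBS`) stands in for BERT, and the function names and stopping rule are assumptions made for the example.

```python
import math

# Hypothetical unigram probabilities standing in for a real language model.
TOY_PROBS = {
    "the": 0.20, "cat": 0.10, "sat": 0.10, "on": 0.15,
    "mat": 0.10, "very": 0.02, "fluffy": 0.01,
}
UNKNOWN_PROB = 0.001  # assumed probability for out-of-vocabulary tokens

def perplexity(tokens):
    """Perplexity of a token sequence under the toy unigram model."""
    log_sum = sum(math.log(TOY_PROBS.get(t, UNKNOWN_PROB)) for t in tokens)
    return math.exp(-log_sum / len(tokens))

def compress(tokens, target_len):
    """Greedily delete one token at a time, keeping the candidate with the
    lowest perplexity, until the sentence has target_len tokens."""
    tokens = list(tokens)
    while len(tokens) > max(target_len, 1):
        candidates = [tokens[:i] + tokens[i + 1:] for i in range(len(tokens))]
        tokens = min(candidates, key=perplexity)
    return tokens

sentence = "the very fluffy cat sat on the mat".split()
compressed = compress(sentence, 6)
```

Under this toy model the two rarest modifiers are deleted first. A score based only on perplexity ignores the linguistic role of each word, which is the limitation the abstract describes.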

The Comparison of Image Quality between Computed Radiography(CR) and Direct Digital Radiography(DDR) which Follows the Proper Exposure Conditions in General Photographing under the Digital Radiography(DR) (Digital Radiography 환경하에서 일반촬영시 적정 노출조건에 따른 CR과 DDR의 Image Quality 비교)

  • Kim, Jin-Bae;Kang, Chung-Hwan;Kang, Sung-Jin;Park, Soo-In;Park, Jong-Won;Kim, Yeong-Su;Kim, Seung-Sik
    • Korean Journal of Digital Imaging in Medicine / v.5 no.1 / pp.64-77 / 2002
  • DR has had an important effect not only on the department of radiology but also on the productivity and work efficiency of the whole hospital. The DR environment offers a wider variety of parameters than CR, so it can provide high-quality medical services. The radiology departments of hospitals have been changing from Film-Screen systems to DR through Full-PACS. Our hospital, which uses Full-PACS, studied the proper exposure conditions for CR and DDR and how their image quality is expressed among general radiography systems in the DR environment. In this experiment, the image quality of DDR was better than that of CR under the same exposure conditions. In the DDR system, the image score with AEC was slightly higher than the score without it. In particular, the AEC function of DDR was useful for improving image quality for the skull and chest. (The AEC function detects the ionization current of the X-rays passing through the object by means of the ion chamber in the detector, and controls the X-ray exposure when the proper density is reached.) Because this system can reproduce the proper density, radiographs can be taken much more easily without adjusting the exposure conditions for objects of various thicknesses. The results of this experiment show that selecting the proper exposure conditions plays an important role in obtaining good image quality. More research will be necessary on the DDR system, which has future potential.

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.83-102 / 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, providing the public with a great opportunity through the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea, and it is strongly committed to backing export companies through various systems. Nevertheless, there are still few cases of realized business models based on big-data analyses. Against this background, this paper aims to develop a new business model that can be applied to the ex-ante prediction of the likelihood of a credit guarantee insurance accident. We use internal data from KSURE, which supports export companies in Korea, and apply machine learning models. We then compare the performance of predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. Altman proposes a score model that uses five key financial ratios to predict the probability of bankruptcy in the next two years. Ohlson (1980) introduces a logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have also studied the prediction of financial distress or bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques, including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and a logit model. Kim and Kim (2001) use artificial neural network techniques for the ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely with diverse models such as Random Forest or SVM. One major distinction between our research and previous research is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model over the entire sample. Most predictive models in this paper show a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy of 71.1% and the Logit model the lowest accuracy of 69%. However, we confirm that these results are open to multiple interpretations. In the business context, more emphasis must be placed on minimizing type 2 error, which causes more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy after splitting the predicted probability of default into ten equal intervals. When we examine the classification accuracy for each interval, the Logit model has the highest accuracy of 100% for the 0~10% interval of the predicted probability of default, but a relatively low accuracy of 61.5% for the 90~100% interval.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN indicate more desirable results since they indicate a higher level of accuracy for both 0~10% and 90~100% of the predicted probability of the default but have a lower level of accuracy around 50% of the predicted probability of the default. When it comes to the distribution of samples for each predicted probability of the default, both LightGBM and XGBoost models have a relatively large number of samples for both 0~10% and 90~100% of the predicted probability of the default. Although Random Forest model has an advantage with regard to the perspective of classification accuracy with small number of cases, LightGBM or XGBoost could become a more desirable model since they classify large number of cases into the two extreme intervals of the predicted probability of the default, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance. Next, Random Forest and LightGBM show good results, but logistic regression shows the worst performance. However, each predictive model has a comparative advantage in terms of various evaluation standards. For instance, Random Forest model shows almost 100% accuracy for samples which are expected to have a high level of the probability of default. Collectively, we can construct more comprehensive ensemble models which contain multiple classification machine learning models and conduct majority voting for maximizing its overall performance.
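
Two ideas from this abstract, accuracy measured within ten equal intervals of the predicted default probability and majority voting across classifiers, can be sketched as follows. The probabilities, labels, per-model predictions, the 0.5 cutoff, and the function names are illustrative assumptions and do not reproduce the paper's data or models.

```python
def accuracy_by_decile(probs, labels, cutoff=0.5):
    """Classification accuracy within each of ten equal probability intervals.

    Returns a list of ten values; an interval with no samples yields None.
    """
    bins = [[0, 0] for _ in range(10)]  # [correct, total] per interval
    for p, y in zip(probs, labels):
        idx = min(int(p * 10), 9)       # p == 1.0 falls into the last interval
        pred = 1 if p >= cutoff else 0
        bins[idx][0] += int(pred == y)
        bins[idx][1] += 1
    return [correct / total if total else None for correct, total in bins]

def majority_vote(predictions_per_model):
    """Combine binary predictions from several models by strict majority."""
    n_models = len(predictions_per_model)
    return [1 if sum(votes) * 2 > n_models else 0
            for votes in zip(*predictions_per_model)]

# Made-up predicted default probabilities and true labels (1 = default).
probs = [0.05, 0.08, 0.45, 0.55, 0.92, 0.97]
labels = [0, 0, 1, 0, 1, 1]
per_interval = accuracy_by_decile(probs, labels)

# Made-up binary predictions from three hypothetical models.
ensemble = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

In this toy data the two extreme intervals are classified perfectly while the middle intervals are not, which mirrors the pattern the abstract reports for the tree-based models and the DNN.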

Building a Korean Sentiment Lexicon Using Collective Intelligence (집단지성을 이용한 한글 감성어 사전 구축)

  • An, Jungkook;Kim, Hee-Woong
    • Journal of Intelligence and Information Systems / v.21 no.2 / pp.49-67 / 2015
  • Recently, the emergence of big data and social media has brought about a data "big bang". Social networking services are widely used by people around the world and have become a major communication tool for all ages. Over the last decade, as online social networking sites have become increasingly popular, companies have tended to focus on advanced social media analysis for their marketing strategies. In addition to social media analysis, companies are mainly concerned about the propagation of negative opinions on social networking sites such as Facebook and Twitter, as well as on e-commerce sites. Online word of mouth (WOM), such as product ratings, product reviews, and product recommendations, is very influential, and negative opinions have a significant impact on product sales. This trend has increased researchers' attention to natural language processing tasks such as sentiment analysis. Sentiment analysis, also referred to as opinion mining, is the process of identifying the polarity of subjective information and has been applied to various research and practical fields. However, obstacles arise when the Korean language (Hangul) is used in natural language processing, because it is an agglutinative language whose rich morphology poses problems. As a result, there is a lack of Korean natural language processing resources such as sentiment lexicons, which has resulted in significant limitations for researchers and practitioners who are considering sentiment analysis. Our study builds a Korean sentiment lexicon with collective intelligence and provides an API (Application Programming Interface) service to open and share the sentiment lexicon data with the public (www.openhangul.com). For pre-processing, we created a Korean lexicon database with over 517,178 words and classified them into sentiment and non-sentiment words.
In order to classify them, we first identified stop words, which are quite likely to play a negative role in sentiment analysis, and excluded them from our sentiment scoring. In general, sentiment words are nouns, adjectives, verbs, and adverbs, as they carry sentimental expressions such as positive, neutral, and negative. On the other hand, non-sentiment words are interjections, determiners, numerals, postpositions, etc., as they generally carry no sentimental expression. To build a reliable sentiment lexicon, we adopted the concept of collective intelligence as a model for crowdsourcing. In addition, the concept of folksonomy was implemented in the taxonomy process to support collective intelligence. To make up for an inherent weakness of folksonomy, we adopted majority rule by building a voting system. Participants, as voters, were offered three voting options (positivity, negativity, and neutrality), and the voting was conducted on one of the largest social networking sites for college students in Korea. More than 35,000 votes have been cast by college students in Korea, and we keep this voting system open by maintaining the project as a perpetual study. Moreover, any change in the sentiment score of a word can be an important observation because it enables us to keep track of temporal changes in Korean as a natural language. Lastly, our study offers a RESTful, JSON-based API service through a web platform to make support easier for users such as researchers, companies, and developers. Our study makes important contributions to both research and practice. In terms of research, our Korean sentiment lexicon plays an important role as a resource for Korean natural language processing. In terms of practice, practitioners such as managers and marketers can implement sentiment analysis effectively by using the Korean sentiment lexicon we built.
Moreover, our study sheds new light on the value of folksonomy by combining collective intelligence, and we also expect to give a new direction and a new start to the development of Korean natural language processing.
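
The majority-rule voting described in this abstract might look like the sketch below. The three labels follow the voting options named in the abstract; the function name, the example votes, and the choice to fall back to "neutral" on a tie or when there are no votes are assumptions, since the abstract does not say how such cases were handled.

```python
from collections import Counter

def majority_label(votes, default="neutral"):
    """Return the label with the most votes.

    Falls back to `default` when there are no votes or the top two labels
    are tied (an assumed tie-breaking policy).
    """
    if not votes:
        return default
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return default
    return top_label

# Made-up votes for three hypothetical lexicon entries.
clear_case = majority_label(["positive", "positive", "negative"])
tied_case = majority_label(["positive", "negative"])
empty_case = majority_label([])
```

Because the voting system stays open, recomputing the label as new votes arrive is one simple way to track changes in a word's sentiment over time.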

A Study on the Explanation of the Title of 'Siyongjeongdaeeopbo' in Daeakhubo Volume 2 (『대악후보』 권2 시용정대업보(時用定大業譜) 편명(篇名) 해설 고찰)

  • Lee, Jong-Sook;Nam, Sang-Sook
    • Korean Journal of Heritage: History & Science / v.49 no.4 / pp.80-95 / 2016
  • This study sought to disclose the problems surrounding the erroneous explanation of the title of the musical script 'Jeongdaeeop,' which is Jongmyojeryeak (Jongmyo Shrine ritual music), shown in Daeakhubo, Korea's Treasure No. 1291. Daeakhubo imitated and adopted expressions like 1 Byeon (change) and 1 Pyeon (edition), shown in the music written in the Annals of King Sejong, the foundation of Jongmyojeryeakbo music. Originally, 'Jeongdaeeop' as recorded during the reign of King Sejong consisted of 6-Byeon and 13-Pyeon compositions, excluding the Inlet and Outlet tunes. King Sejo, however, while rearranging this music into Jongmyo Shrine Mumuak music, reduced it to 9 tunes. When registering these arrangements in the musical scripts in the Annals of King Sejo, he did not list the explanation of the titles as in the Annals of King Sejong; he listed only the nine tunes. In contrast to the musical scripts in the Annals of King Sejo, in Daeakhubo the details of Byeon and Pyeon are listed under the nine tune titles, as in the Annals of King Sejong. This study revealed that the Byeon and Pyeon expressed in Daeakhubo were the result of arbitrarily transcribing the different Byeon and Pyeon of 'Jeongdaeeop' and 'Balsang' in the Annals of King Sejong into the 'Jeongdaeeop' revised during the reign of King Sejo. Thus, under the title of each score in 'Jeongdaeeop' of Daeakhubo are written the explanations of the musical scores shown in both 'Jeongdaeeop' and 'Balsang' of the Annals of King Sejong. As a result, the story of the son Ikjo is described ahead of the story of the father Mokjo, and stories totally different from the original movements are described, creating overall errors. Such errors were presumably caused by those who created the false musical script 'Sokakwonbo' during the Japanese colonial rule of Korea and disguised it as a traditional musical script.

A Development of Evaluation Indicators for Performance Improvement of Horticultural Therapy Garden (원예치료정원의 성능개선을 위한 평가지표 개발)

  • Ahn, Je-Jun;Park, Yool-Jin
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.36 no.4 / pp.113-123 / 2018
  • The purpose of this research is to develop evaluation indicators for performance improvement of horticultural therapy gardens. Horticultural therapy is gardening activity conducted by a trained horticultural therapist in order to achieve a therapeutic purpose. Moreover, horticultural therapy is 'a medical model' for treatment, and the basic premise of the research was that a horticultural therapy garden is a specialized area that supports the activities of patients and horticultural therapists functionally and efficiently. For this study, three rounds of Delphi and AHP techniques were carried out with expert panels recruited by purposive sampling. Through these techniques, it was possible to derive evaluation indicators that maximize the performance of the horticultural therapy garden. The evaluation items were prioritized by classifying and stratifying the indicators. The results and discussion are as follows. Firstly, an expert questionnaire was administered to horticultural therapists and civil servants in charge of horticultural therapy. From the results (horticultural therapists: 87.8%, civil servants: 75.2%), it is possible to conclude that both groups have high recognition of, and agree on the necessity of, horticultural therapy. Secondly, the Delphi investigation was conducted three times in order to develop the evaluation indicators for performance evaluation. After the Delphi analysis, a total of 34 evaluation elements for improving the performance of the horticultural therapy garden were derived on the basis of reliability and validity analysis. Thirdly, AHP analysis of each evaluation indicator was conducted to determine relative importance and weighting. The results showed 'interaction between nature and human' as the most important element, followed in order by 'plan of the program', 'social interaction', 'sustainable environmental', and 'universal design rule'.
On the other hand, the experts from universities and research institutes rated 'interaction between nature and human' as most important, while horticultural therapists chose 'plan of the program' as the most important element. Fourthly, the total weight was used to develop a weighted evaluation indicator for the performance evaluation of the horticultural therapy garden. The weighted value of each evaluation index is generally calculated by multiplying the evaluation score by the total weight obtained from the AHP analysis. Finally, 'the evaluation indicator and evaluation score sheet for performance improvement of the horticultural therapy garden' was drawn up on the basis of the relative priority among evaluation indicators and the analysis of the weights. When the 'weight applied evaluation sheet' is used to identify points for improving the efficiency of an already established horticultural therapy garden, the sheet can be applied more widely by judging importance according to the determined priorities, because the item importance decided by experts is reflected. Moreover, when a new garden is being established, it is expected to help suggest ways to improve performance and set guidelines by clarifying the major indicators of performance improvement in horticultural therapy activity.
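
The weighted scoring described in this abstract, multiplying each evaluation score by its AHP weight, can be sketched as below. The five element names come from the abstract, but the weights, the raw scores, the 1-to-5 scale, and the function name are hypothetical values chosen only to show the arithmetic.

```python
# Hypothetical AHP weights (assumed to sum to 1); not the paper's values.
AHP_WEIGHTS = {
    "interaction between nature and human": 0.30,
    "plan of the program": 0.25,
    "social interaction": 0.20,
    "sustainable environmental": 0.15,
    "universal design rule": 0.10,
}

def weighted_scores(scores, weights):
    """Multiply each element's evaluation score by its weight and
    return the per-element values together with their total."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    per_element = {name: scores[name] * w for name, w in weights.items()}
    return per_element, sum(per_element.values())

# Made-up evaluation scores for one garden on an assumed 1-to-5 scale.
garden_scores = {
    "interaction between nature and human": 4,
    "plan of the program": 3,
    "social interaction": 5,
    "sustainable environmental": 2,
    "universal design rule": 4,
}

per_element, total = weighted_scores(garden_scores, AHP_WEIGHTS)
```

Ranking the per-element values shows where a low raw score costs the most once expert-assigned importance is taken into account.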