Sentiment Analysis of Korean Reviews Using CNN: Focusing on Morpheme Embedding (CNN을 적용한 한국어 상품평 감성분석: 형태소 임베딩을 중심으로)

  • Park, Hyun-jung;Song, Min-chae;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.59-83 / 2018
  • With the increasing importance of sentiment analysis for grasping the needs of customers and the public, various types of deep learning models have been actively applied to English texts. In deep learning-based sentiment analysis of English texts, the natural language sentences in the training and test datasets are usually converted into sequences of word vectors before being fed into the models. Here, word vectors generally refer to vector representations of words obtained by splitting a sentence on space characters. There are several ways to derive word vectors, one of which is Word2Vec, which was used to produce the 300-dimensional Google word vectors from about 100 billion words of Google News data. These vectors have been widely used in studies of sentiment analysis of reviews from various fields such as restaurants, movies, laptops, and cameras. Unlike in English, the morpheme plays an essential role in sentiment analysis and sentence structure analysis in Korean, a typical agglutinative language with well-developed postpositions and endings. A morpheme can be defined as the smallest meaningful unit of a language, and a word consists of one or more morphemes. For example, the word '예쁘고' consists of the morphemes '예쁘' (adjective) and '고' (connective ending). Given the significance of Korean morphemes, it seems reasonable to adopt the morpheme as the basic unit in Korean sentiment analysis. Therefore, in this study, we use a 'morpheme vector' as the input to a deep learning model rather than the 'word vector' mainly used for English text. A morpheme vector is a vector representation of a morpheme and can be derived by applying an existing word vector derivation mechanism to sentences divided into their constituent morphemes. This raises several questions. What is the desirable range of POS (Part-Of-Speech) tags when deriving morpheme vectors to improve the classification accuracy of a deep learning model? 
Is it appropriate to apply a typical word vector model, which relies primarily on the form of words, to Korean, which has a high homonym ratio? Will text preprocessing such as correcting spelling or spacing errors affect the classification accuracy, especially when deriving morpheme vectors from Korean product reviews with many grammatical mistakes and variations? We seek empirical answers to these fundamental issues, which are likely to be encountered first when applying various deep learning models to Korean texts. As a starting point, we summarize these issues as three central research questions. First, which is more effective as the initial input of a deep learning model: morpheme vectors from grammatically correct texts of a domain other than the analysis target, or morpheme vectors from considerably ungrammatical texts of the same domain? Second, what is an appropriate morpheme vector derivation method for Korean with regard to the range of POS tags, homonyms, text preprocessing, and minimum frequency? Third, can we achieve a satisfactory level of classification accuracy when applying deep learning to Korean sentiment analysis? To address these research questions, we generate various types of morpheme vectors reflecting the questions and then compare the classification accuracy of a non-static CNN (Convolutional Neural Network) model that takes the morpheme vectors as input. For the training and test datasets, 17,260 cosmetics product reviews from Naver Shopping are used. To derive morpheme vectors, we use data from the same domain as the target and data from another domain: about 2 million cosmetics product reviews from Naver Shopping and 520,000 items of Naver News data, arguably corresponding to Google's News data. The six primary sets of morpheme vectors constructed in this study differ in terms of the following three criteria. 
First, they come from two types of data source: Naver News, with high grammatical correctness, and Naver Shopping's cosmetics product reviews, with low grammatical correctness. Second, they differ in the degree of data preprocessing, namely, sentence splitting only, or sentence splitting followed by additional spelling and spacing corrections. Third, they vary in the form of input fed into the word vector model: whether the morphemes are entered by themselves or with their POS tags attached. The morpheme vectors further vary depending on the range of POS tags considered, the minimum frequency of morphemes included, and the random initialization range. All morpheme vectors are derived through the CBOW (Continuous Bag-Of-Words) model with a context window of 5 and a vector dimension of 300. The results suggest that better classification accuracy is obtained by using text from the same domain even when its grammatical correctness is lower, by performing spelling and spacing corrections in addition to sentence splitting, and by incorporating morphemes of all POS tags, including the incomprehensible category. The POS tag attachment, which was devised to handle the high proportion of homonyms in Korean, and the minimum frequency threshold for a morpheme to be included do not seem to have any definite influence on the classification accuracy.
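
The POS tag attachment and minimum frequency criteria described in this abstract can be illustrated with a minimal Python sketch. It assumes the reviews have already been split into (morpheme, POS tag) pairs by some morphological analyzer; the sample pairs, the tag names, the `/` separator, and the helper names `to_tokens` and `filter_by_min_count` are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

# Hypothetical pre-analyzed sentences: lists of (morpheme, POS tag) pairs.
# The tags (VA, EC, NNG, ...) are assumed examples of a Korean tag set.
SENTENCES = [
    [("예쁘", "VA"), ("고", "EC"), ("좋", "VA"), ("아요", "EF")],
    [("색", "NNG"), ("이", "JKS"), ("예쁘", "VA"), ("아요", "EF")],
]

def to_tokens(sentence, attach_pos):
    """Turn (morpheme, tag) pairs into tokens for a word vector model."""
    if attach_pos:
        # Attaching the tag keeps homonyms with different tags apart.
        return [f"{morph}/{tag}" for morph, tag in sentence]
    return [morph for morph, _ in sentence]

def filter_by_min_count(corpus, min_count):
    """Drop tokens that occur fewer than min_count times in the corpus."""
    counts = Counter(tok for sent in corpus for tok in sent)
    return [[tok for tok in sent if counts[tok] >= min_count] for sent in corpus]

plain = [to_tokens(s, attach_pos=False) for s in SENTENCES]
tagged = [to_tokens(s, attach_pos=True) for s in SENTENCES]
frequent = filter_by_min_count(tagged, min_count=2)
```

Token lists such as `plain`, `tagged`, or `frequent` would then be the input to a CBOW model (context window 5 and dimension 300 in the study); the training step itself is left out of this sketch.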

A Study on the Prediction Model of Stock Price Index Trend based on GA-MSVM that Simultaneously Optimizes Feature and Instance Selection (입력변수 및 학습사례 선정을 동시에 최적화하는 GA-MSVM 기반 주가지수 추세 예측 모형에 관한 연구)

  • Lee, Jong-sik;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.23 no.4 / pp.147-168 / 2017
  • Accurate stock market forecasting has long been studied in academia, and a variety of forecasting models based on various techniques now exist. Recently, many attempts have been made to predict the stock index using machine learning methods, including deep learning. Although both fundamental analysis and technical analysis are used in traditional stock investment, technical analysis is more useful for short-term trading prediction and for the application of statistical and mathematical techniques. Most studies using technical indicators have modeled stock price prediction as a binary classification, rising or falling, of future market fluctuations (usually on the next trading day). However, this binary classification has many drawbacks for predicting trends, identifying trading signals, or signaling portfolio rebalancing. In this study, we try to predict the stock index by extending the existing binary scheme to a multi-class system of stock index trends (upward trend, boxed, downward trend). Techniques such as Multinomial Logistic Regression Analysis (MLOGIT), Multiple Discriminant Analysis (MDA), and Artificial Neural Networks (ANN) can be used to solve this multi-classification problem; we propose an optimization model that uses a Genetic Algorithm as a wrapper to improve the performance of Multi-classification Support Vector Machines (MSVM), which have proved to be superior in prediction performance. In particular, the proposed model, named GA-MSVM, is designed to maximize model performance by optimizing not only the kernel function parameters of MSVM but also the selection of input variables (feature selection) and of training instances (instance selection). 
To verify the performance of the proposed model, we applied it to real data. The results show that the proposed method is more effective than the conventional multi-class SVM, which has been known to show the best prediction performance so far, as well as existing artificial intelligence and data mining techniques such as MDA, MLOGIT, and CBR. In particular, 'instance selection' was confirmed to play a very important role in predicting the stock index trend, and its contribution to model improvement was larger than that of the other factors. To verify the usefulness of GA-MSVM, we applied it to trend forecasting of Korea's KOSPI200 stock index. Our research is primarily aimed at predicting trend segments in order to capture trading signals or short-term trend transition points. The experimental data set includes technical indicators such as the price and volatility index (2004 ~ 2017) and macroeconomic data (interest rate, exchange rate, S&P 500, etc.) for the KOSPI200 stock index in Korea. Using a variety of statistical methods, including one-way ANOVA and stepwise MDA, 15 indicators were selected as candidate independent variables. The dependent variable, trend classification, takes three states: 1 (upward trend), 0 (boxed), and -1 (downward trend). For each class, 70% of the data was used for training and the remaining 30% for validation. To verify the performance of the proposed model, comparative experiments with MDA, MLOGIT, CBR, ANN, and MSVM were conducted. MSVM adopted the One-Against-One (OAO) approach, which is known as the most accurate among the various MSVM approaches. Although there are some limitations, the final experimental results demonstrate that the proposed model, GA-MSVM, performs at a significantly higher level than all comparative models.
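
The simultaneous optimization of feature selection, instance selection, and kernel parameters can be pictured as decoding one GA chromosome into three parts. The sketch below is only an illustration under stated assumptions: the abstract does not give the actual encoding, so the bit layout, the single `kernel_param` value, its range, and the function name `decode_chromosome` are all hypothetical.

```python
def decode_chromosome(bits, n_features, n_instances, param_bits=8,
                      param_range=(1.0, 100.0)):
    """Split a bit list into selected features, selected instances,
    and one kernel parameter (assumed layout, not the paper's)."""
    expected = n_features + n_instances + param_bits
    if len(bits) != expected:
        raise ValueError(f"expected {expected} bits, got {len(bits)}")

    feature_mask = bits[:n_features]
    instance_mask = bits[n_features:n_features + n_instances]
    param_part = bits[n_features + n_instances:]

    # Map the binary-coded integer linearly onto the parameter range.
    raw = int("".join(str(b) for b in param_part), 2)
    low, high = param_range
    kernel_param = low + (high - low) * raw / (2 ** param_bits - 1)

    selected_features = [i for i, b in enumerate(feature_mask) if b]
    selected_instances = [i for i, b in enumerate(instance_mask) if b]
    return selected_features, selected_instances, kernel_param

# Toy chromosome: 3 feature bits, 4 instance bits, 8 parameter bits.
bits = [1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
features, instances, kernel_param = decode_chromosome(
    bits, n_features=3, n_instances=4)
```

In a wrapper setup of this kind, each decoded chromosome would be scored by training the classifier on the selected instances and features and measuring validation accuracy; that fitness step is omitted here.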

Study on Medical Records in 『The Historical Records of the Three Kingdoms』 ("삼국사기(三國史記)"에 기록된 의약내용(醫藥內容) 분석)

  • Shin, Soon-Shik;Choi, Hwan-Soo
    • Journal of The Association for Neo Medicine / v.2 no.1 / pp.35-54 / 1997
  • We tried to observe the features of ancient medical practice by analysing the records related to medicine in 『The Historical Records of the Three Kingdoms』, whose content covers medicine in mythology, plague, delivery of twins, drugs, the medical system, shamanism, constitutional medicine, psychiatry, forensic medicine, deformity, spas, medical phrases, health and welfare work, religion, death, physiological anatomy, Taoist medicine, acupuncture, the occult art of transformation, and so on. Our initial concern was where to draw the line of the medical field, and we defined medicine in a broad sense. 『The Historical Records of the Three Kingdoms』 describes the world of mythology by way of a medicine that is not clearly a conventional one. Births of multiple offspring, in all cases triplets or more, are recorded 7 times. Multiple births were a rare phenomenon, and such fertility was highly admired. This shows one aspect of ancient states, in which a larger population meant greater national power. The medical records in the book include stories of childbirth, such as giving birth to a son after praying, giving birth to Kim Yoo-shin 20 months after his mother's dream of conception, and a song longing for a laudable child. Plagues were prevalent from winter through spring, and various symptoms of plague can be observed in the records. Of these epidemic diseases, the cold type might have been more common than the heat type. Epidemic diseases frequently coincided with natural disasters, which suggests a linkage between plague and the underlying doctrine of the five elements' motion and the six kinds of natural factors. Only a few names of diseases exist, such as epidemic disease, wind disease, and a syndrome characterized by dyspnea. 
Otherwise, only unspecified afflictions appear, so there are no clues for tracking which diseases were prevalent. Since 『The Historical Records of the Three Kingdoms』 was not a medical book, the words and terms used were not technical, and most were those in general use among lay people. Therefore, the mechanisms of diseases were hardly mentioned. Medicinal substances such as Calculus Bovis, Radix Ginseng, Ganoderma lucidum, and Magnetitum were also in use in those days. 53 kinds of dietary items appear in the records, and some of these might have been used for medicinal purposes. Records concerning the discipline of one's body include activities such as hunting, archery, and horseback riding. In the Shilla dynasty there were positions such as professor of medicine, Naekongbong(內供奉), Kongbong's doctor(供奉醫師), and Kongbong's diviner(供奉卜師). As an educational facility, a medical school was built in the first year of King Hyoso's reign, and its curriculum included 『Shin Nong's Herbal Classic』, 『Kabeul Classic of Acupuncture and Moxibustion』, 『The Plain Questions of the Yellow Emperor's Classic of Internal Medicine』, 『Classic of Acupuncture』, 『The Pulse Classic』, 『Classic of Channels and Acupuncture Points』 and 『Difficult Classic』. Two medical professors were in charge of education. To establish a pharmacopoeia, 2 Shaji(舍知), 6 Sha(史), and 2 Jongshaji(從舍知) were appointed. In the Baekje dynasty, the Department of Herb was maintained. Praying for health and practicing phrenology can also be regarded as part of the medical arena. People who lived past 100 years of age appear 3 times in the record, and one aged 98 appears once. 
The earliest psychiatrist, Nokjin, differentiated symptoms in order to apply either acupuncture and drug therapy or psychotherapy. There appear a case of rape, a case of burying the living with the dead, and 8 cases of suicide, which can be characterized as a prototype of forensic medicine. Deformity-related records include phrases such as 'there seems to be a protruding bone behind the head', 'a body which has two heads, two trunks, four arms', and 'a body equipped with two heads'. In those times, the spa appears to have been used as a place for healing, convalescence, and relaxation, judging from a record describing a person who feigned illness and went to a spa to enjoy himself with his friends. According to the record written by Mookhoja(墨胡子) and Hoonkyeom(訓謙), priest doctors and military surgeons were in charge of medical service in the period of the Three Kingdoms. Poor diet and regimen make people more vulnerable to diseases, so there existed charity services for poor people who could not support themselves, such as single parents, orphans, the aged with no one to care for them, and the ill. The cause of affliction was frequently linked to human relations. When a king was ill, prisoners were released and people were allowed to become priests. In addition, sutra-chanting was performed as a healing procedure. There appear 10 death-related records, which range from death by drowning or freezing, death caused by animals, death in war, and death from weight loss to killing oneself upon a spouse's death. There also exist records that suggest knowledge of physiology and anatomy in those times. Since Taoist books such as 『Book of the Way and Its Power(老子道德經)』 were introduced in the period of the Three Kingdoms, it can be considered that medicine was also influenced by Taoism. There were also records of a high level of acupuncture and records linking medicine with the occult art of transformation. 
Although the records are limited, we could figure out the medical state of ancient society.

Effect of 6-Hydroxydopamine (6-OHDA) on the Expression of Hypothalamus-Pituitary Axis Hormone Genes in Male Rats (수컷 흰쥐의 시상하부-뇌하수체 축 호르몬 유전자 발현에 미치는 6-Hydroxydopamine(6-OHDA)의 영향)

  • Heo, Hyun-Jin;Ahn, Ryun-Sup;Lee, Sung-Ho
    • Development and Reproduction / v.13 no.4 / pp.257-264 / 2009
  • The neurotoxin 6-hydroxydopamine (6-OHDA) has been widely used to create animal models of Parkinson's disease (PD) because of its specific toxicity against dopaminergic (DA) neurons. Since DA signals modulate a broad spectrum of CNS physiology, one can expect profound alterations in the neuroendocrine activities of both PD patients and 6-OHDA-treated animals. However, the 6-OHDA injection model has seen limited application in studies of hypothalamus-pituitary neuroendocrine circuits. The present study was performed to examine whether blockade of brain catecholamine (CA) biosynthesis with 6-OHDA alters the transcriptional activities of hypothalamus-pituitary hormone genes in adult male rats. Three-month-old male rats (SD strain) received 6-OHDA (200 μg in 10 μL of saline/animal) by intracerebroventricular (icv) injection and were sacrificed after two weeks. To determine the mRNA levels of hypothalamus-pituitary hormone genes, total RNAs were extracted and subjected to semi-quantitative RT-PCR. The mRNA levels of tyrosine hydroxylase (TH), the rate-limiting enzyme for catecholamine biosynthesis, were significantly lower than those of the control group (control:6-OHDA=1:0.72±0.02 AU, p<0.001), confirming the efficacy of the 6-OHDA injection. The mRNA levels of gonadotropin-releasing hormone (GnRH) and corticotropin-releasing hormone (CRH) in the hypothalami of the 6-OHDA group were significantly lower than those of the control group (GnRH, control:6-OHDA=1:0.39±0.03 AU, p<0.001; CRH, control:6-OHDA=1:0.76±0.07 AU, p<0.01). 
There were significant decreases in the mRNA levels of the common alpha subunit of glycoprotein hormones (Cgα), LH beta subunit (LH-β), and FSH beta subunit (FSH-β) in pituitaries from the 6-OHDA group compared to control values (Cgα, control:6-OHDA=1:0.81±0.02 AU, p<0.001; LH-β, control:6-OHDA=1:0.68±0.04 AU, p<0.001; FSH-β, control:6-OHDA=1:0.84±0.05 AU, p<0.001). Similarly, the level of adrenocorticotrophic hormone (ACTH) transcripts from the 6-OHDA group was significantly lower than that from the control group (control:6-OHDA=1:0.86±0.04 AU, p<0.01). The present study demonstrated that a centrally injected DA neurotoxin could downregulate the transcriptional activities of the two hypothalamus-pituitary neuroendocrine circuits, i.e., the GnRH-gonadotropin and CRH-ACTH systems. These results suggest that hypothalamic CA input might affect the activities of the gonads and adrenal glands through modulation of hypothalamus-pituitary function, providing a plausible explanation for the frequent occurrence of sexual dysfunction and poor stress response in PD patients.

Genetic Environments of the High-purity Limestone in the Upper Zone of the Daegi Formation at the Jeongseon-Samcheok Area (정선-삼척 일대 대기층 상부 고품위 석회석의 생성환경)

  • Kim, Chang Seong;Choi, Seon-Gyu;Kim, Gyu-Bo;Kang, Jeonggeuk;Kim, Kyeong Bae;Kim, Hagsoo;Lee, Jeongsang;Ryu, In-Chang
    • Economic and Environmental Geology / v.50 no.4 / pp.287-302 / 2017
  • The carbonate rocks of the Daegi Formation consist of limestone in the upper and lower zones and dolomite in the middle zone; the upper zone has a higher CaO content than the others. The colors of the carbonate rocks in the Daegi Formation can be divided into five types: white, light brown, light gray, gray, and dark gray. The white to light gray rocks correspond to high-purity limestone with 53.15 ~ 55.64 wt.% CaO, and the light brown rocks contain 20.71 ~ 21.67 wt.% MgO. The bleaching of the carbonate rocks is not related to their CaO composition, as the light gray rocks tend to have a higher CaO content than the white rocks in the lower zone. Pelitic components are also occasionally more abundant in white limestone than in light gray limestone. $Al_2O_3$ is one of the most difficult components to remove during hydrothermal processes, so the interpretation that the limestone was purified together with hydrothermal bleaching has little merit. A wide range (over 16 ‰) of $\delta^{18}O_{SMOW}$ and a smaller variation (within 2 ‰) of $\delta^{13}C_{PDB}$ are apparent in both the upper and lower zones, which indicates that the Daegi Formation as a whole was affected by hydrothermal fluids. The K-Ar isotopic age of hydrothermal alteration in the GMI limestone mine is $85.1 \pm 1.7$ Ma. The gradual change from gray through light gray to white limestone is accompanied by lower oxygen stable isotope values, which is major evidence that hydrothermal activity is the main process behind the bleaching. Although the Daegi Formation has undergone hydrothermal activity and an increase in whiteness, there is no clear evidence of a relationship between bleaching and the high purity of the limestone. The purification of limestone has nothing to do with the hydrothermal activity in this area. 
Instead, it should be considered that a change in sedimentary environment related to sea-level fluctuation, which can prevent the deposition of pelitic components, especially $Al_2O_3$, contributed to the formation of the high-purity limestone in the upper zone of the Daegi Formation. Considering evidence such as the increase in CaO content of limestone with depth, the gradual change from calcite to dolomite in the lower zones, and the occurrence of high-purity limestone in the upper zone, a sequence stratigraphic interpretation of the formation of the high-purity Daegi limestone appears more suitable than a hydrothermal alteration origin.

Impact of Shortly Acquired IPO Firms on ICT Industry Concentration (ICT 산업분야 신생기업의 IPO 이후 인수합병과 산업 집중도에 관한 연구)

  • Chang, YoungBong;Kwon, YoungOk
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.51-69 / 2020
  • It is now a stylized fact that a small number of technology firms such as Apple, Alphabet, Microsoft, Amazon, Facebook and a few others have become larger and dominant players in their industries. Coupled with the rise of these leading firms, a large number of young firms have become acquisition targets in their early IPO stages. This has resulted in a sharp decline in the number of new entries on public exchanges, even though a series of policy reforms have been promulgated to foster competition through an increase in new entries. Given this industry trend in recent decades, a number of studies have reported increased concentration in most developed countries. However, what caused the increase in industry concentration is less well understood. In this paper, we uncover the mechanisms by which industries have become concentrated over the last decades by tracing the changes in industry concentration associated with a firm's status change in its early IPO stages. To this end, we put emphasis on the case in which firms are acquired shortly after they go public. In particular, with the transition to digital-based economies, it is imperative for incumbent firms to adapt and keep pace with new ICT and related intelligent systems. For instance, after acquiring a young firm equipped with AI-based solutions, an incumbent firm may better respond to changes in customer tastes and preferences by integrating the acquired AI solutions and analytics skills into multiple business processes. Accordingly, it is not unusual for young ICT firms to become attractive acquisition targets. To examine the role of M&As involving young firms in reshaping the level of industry concentration, we identify a firm's status in its early post-IPO stages over a sample period spanning 1990 to 2016 as follows: i) being delisted, ii) remaining a standalone firm, and iii) being acquired. 
According to our analysis, firms that have conducted IPOs since the 2000s have been acquired by incumbent firms more quickly than those that went public in earlier generations. We also show a greater acquisition rate for IPO firms in the ICT sector compared with their counterparts in other sectors. Our results based on multinomial logit models suggest that a large number of IPO firms have been acquired in their early post-IPO lives despite their financial soundness. Specifically, we show that IPO firms are more likely to be acquired than to be delisted due to financial distress in their early IPO stages when they are more profitable, more mature, or less leveraged. IPO firms with venture capital backing have also become acquisition targets more frequently. As a larger number of firms are acquired shortly after their IPO, our results show increased concentration. While providing limited evidence on the impact of large incumbent firms in explaining the change in industry concentration, our results show that the large firms' effect on industry concentration is pronounced in the ICT sector. This result possibly captures the current trend that a few tech giants such as Alphabet, Apple and Facebook continue to increase their market share. In addition, compared with the acquisitions of non-ICT firms, the concentration impact of IPO firms in early stages becomes larger when ICT firms are acquired as targets. Our study makes new contributions. To the best of our knowledge, this is one of the few studies that link a firm's post-IPO status to associated changes in industry concentration. Although some studies have addressed concentration issues, their primary focus was on market power or proprietary software. In contrast to earlier studies, we are able to uncover the mechanism by which industries have become concentrated by placing emphasis on M&As involving young IPO firms. 
Interestingly, the concentration impact of IPO firm acquisitions is magnified when a large incumbent firm is involved as the acquirer. This leads us to infer the underlying reasons why industries have become more concentrated in favor of large firms in recent decades. Overall, our study sheds new light on the literature by providing a plausible explanation as to why industries have become concentrated.

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions. Such strict assumptions include linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Moreover, SVM does not require many data samples for training, since it builds prediction models using only some representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can potentially degrade SVM's performance. 
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class classification problems as much as SVM does in binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they may deteriorate classification performance. Third, a difficulty in multi-class prediction problems is the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built because of a skewed boundary, and thus reduce the classification accuracy of the classifier. SVM ensemble learning is one of the machine learning methods for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus, boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training on misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. 
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across the classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation is performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, the cross-validated folds are tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In terms of arithmetic mean-based prediction accuracy, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
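
The gap between the arithmetic and geometric mean-based accuracies in this abstract reflects how strongly a geometric mean penalizes a poorly predicted class. The following is a minimal Python sketch, assuming both measures are means of per-class recall (the abstract does not spell out the definitions); the toy labels and the function names are hypothetical.

```python
import math
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Fraction of correctly predicted observations within each true class."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}

def arithmetic_mean_accuracy(y_true, y_pred):
    recalls = list(per_class_recall(y_true, y_pred).values())
    return sum(recalls) / len(recalls)

def geometric_mean_accuracy(y_true, y_pred):
    recalls = list(per_class_recall(y_true, y_pred).values())
    # A single class with zero recall drives the whole score to zero.
    return math.prod(recalls) ** (1 / len(recalls))

# Toy two-class example: class "A" is predicted well, class "B" poorly.
y_true = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_pred = ["A", "A", "A", "A", "B", "A", "A", "A"]

arith = arithmetic_mean_accuracy(y_true, y_pred)
geo = geometric_mean_accuracy(y_true, y_pred)
```

Here the per-class recalls are 1.0 and 0.25, so the geometric mean falls below the arithmetic mean, which is the kind of imbalance effect a geometric mean-based criterion is meant to expose.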

Analysis on Factors Influencing Welfare Spending of Local Authority : Implementing the Detailed Data Extracted from the Social Security Information System (지방자치단체 자체 복지사업 지출 영향요인 분석 : 사회보장정보시스템을 통한 접근)

  • Kim, Kyoung-June;Ham, Young-Jin;Lee, Ki-Dong
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.141-156
    • /
    • 2013
  • Research on local government welfare services in Korea has tended to focus on isolated issues such as disability, childcare, and aging (Kang, 2004; Jung et al., 2009). Lately, however, local officials have realized that they need more comprehensive welfare services for all residents, not just for the above-mentioned target groups. Still, studies taking the target-group approach have remained the main research stream for various reasons (Jung et al., 2009; Lee, 2009; Jang, 2011). The Social Security Information System is an information system that comprehensively manages 292 welfare benefits provided by 17 ministries and 40 thousand welfare services provided by 230 local authorities in Korea. The purpose of the system is to improve the efficiency of the social welfare delivery process. Studies of local government expenditure have been on the rise over the last few decades since the restart of local autonomy, but these studies have been limited in data collection. Measurement of a local government's welfare efforts (spending) has relied primarily on per-capita expenditures or budgets set aside for welfare. This practice of using per-capita monetary value as a "proxy value" for welfare effort (spending) is based on the assumption that expenditure is directly linked to welfare efforts (Lee et al., 2007). This expenditure/budget approach commonly uses the total welfare amount or a percentage figure as the dependent variable (Wildavsky, 1985; Lee et al., 2007; Kang, 2000). However, the current practice of using the actual amount spent or a percentage figure as a dependent variable has limitations: since budget or expenditure is greatly influenced by the total budget of a local government, relying on such monetary values may inflate or deflate the true "welfare effort" (Jang, 2012). In addition, government budgets usually contain a large amount of administrative cost, i.e., salaries for local officials, which is largely unrelated to actual welfare expenditure (Jang, 2011). 
This paper used local government welfare service data from the detailed data sets linked to the Social Security Information System. The purpose of this paper is to analyze the factors that affected the social welfare spending of 230 local authorities in 2012. The paper applied a multiple regression model to the pooled financial data from the system, and the regression analysis identified the factors affecting self-funded welfare spending. In our research model, we use the ratio of welfare budget to total budget (%) of a local government as the measure of its welfare effort (spending). In doing so, we exclude central government subsidies or support used for local welfare services, because central government welfare support does not truly reflect the welfare effort (spending) of a local government. The dependent variable is the volume of welfare spending, and the independent variables fall into three categories: socio-demographic characteristics, the local economy, and the financial capacity of the local government. This paper categorized local authorities into three groups: districts, cities, and suburban areas. The model used a dummy variable as the control variable (local political factor). This paper demonstrated that the volume of welfare spending is commonly influenced by the ratio of welfare budget to total local budget, the infant population, the self-reliance ratio, and the level of unemployment. Interestingly, the influential factors differ by the size of the local government. In analyzing the determinants of local governments' self-funded welfare spending, we found significant effects of local government financial characteristics (the degree of financial independence, the financial independence rate, and the share of the social welfare budget), of the regional economy (the opening-to-application ratio), and of socio-demographics (the proportion of infants). 
The result means that local authorities should adopt differentiated welfare strategies according to their conditions and circumstances. The contribution of this paper is that it identifies the significant factors influencing the welfare spending of local governments in Korea.
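
The difference between absolute welfare spending and the budget-share measure described above can be illustrated with a small sketch. This is only one possible reading: the function name, the choice to subtract the central subsidy from the welfare budget before dividing by the total budget, and all figures are assumptions for illustration, since the abstract does not give the exact formula.

```python
def self_funded_welfare_share(welfare_budget, central_subsidy, total_budget):
    """Self-funded welfare spending as a percentage of the total budget.

    Assumed formula (hypothetical): remove the central government subsidy
    from the welfare budget, then express the remainder as a share of the
    local government's total budget.
    """
    self_funded = welfare_budget - central_subsidy
    return 100.0 * self_funded / total_budget


# Hypothetical local authorities (arbitrary currency units).
large = self_funded_welfare_share(welfare_budget=300, central_subsidy=200, total_budget=1000)
small = self_funded_welfare_share(welfare_budget=60, central_subsidy=30, total_budget=200)
print(large, small)
```

In this toy case the larger authority spends more in absolute terms (100 versus 30 units of its own money), yet the smaller one devotes a larger share of its budget to welfare, which is the kind of distortion the ratio measure is meant to avoid.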

Measuring the Public Service Quality Using Process Mining: Focusing on N City's Building Licensing Complaint Service (프로세스 마이닝을 이용한 공공서비스의 품질 측정: N시의 건축 인허가 민원 서비스를 중심으로)

  • Lee, Jung Seung
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.35-52
    • /
    • 2019
  • As public services are provided in various forms, including e-government, public expectations for the quality of public services are rising. Although continuous measurement and improvement are needed to raise the quality of public services, traditional surveys are costly, time-consuming, and limited. Therefore, there is a need for an analytical technique that can measure the quality of public services quickly and accurately at any time based on the data generated by those services. In this study, we analyzed the quality of public services from data, applying process mining techniques to civil licensing services in N city. N city's building license complaint service was chosen because it can provide the data necessary for analysis and because the approach can be spread to other institutions through public service quality management. This study conducted process mining on a total of 3678 building license complaint services in N city over two years from January 2014, and identified process maps and the departments with high frequency and long processing times. According to the analysis results, there were cases where a department's workload was concentrated or relatively light at certain points in time. In addition, there was reason to suspect that an increase in the number of complaints increases the time required to complete them. The time required to complete a complaint varied from the same day to a year and 146 days. The cumulative frequency of the top four departments (the Sewage Treatment Division, the Waterworks Division, the Urban Design Division, and the Green Growth Division) exceeded 50%, and the cumulative frequency of the top nine departments exceeded 70%. The high-frequency departments were few, and the load was heavily unbalanced among departments. Most complaint services follow a variety of different process patterns. 
The analysis shows that the number of 'complement' decisions has the greatest impact on the time taken to process a complaint. This is interpreted to be because a 'complement' decision requires a physical period in which the complainant supplements and resubmits the documents, which lengthens the time until the entire complaint is completed. To address this, the overall processing time can be drastically reduced if complainants prepare thoroughly before or while filing, drawing on the 'complement' decisions issued for other complaints. Clarifying and disclosing the causes of and remedies for such decisions, which are among the important data in the system, helps complainants prepare in advance and gives them confidence that documents prepared according to the disclosed information will pass. This makes complaint handling more transparent and predictable. Documents prepared according to pre-disclosed information are likely to be processed without problems, which not only shortens the processing period but also improves work efficiency by sparing the processor renegotiation or repeated work. The results of this study can be used to find departments with a high burden of civil complaints at certain points in time and to manage workforce allocation between departments flexibly. In addition, the patterns of departments participating in consultation, analyzed by complaint characteristics, can be used for automation or recommendation when selecting the department to consult. Furthermore, applying machine learning techniques to the various data generated during the complaint process can reveal patterns in that process. Turning these patterns into an algorithm and applying it to the system can support the automation and intelligent handling of civil complaint processing. 
This study is expected to help suggest future improvements in public service quality through process mining analysis of civil services.
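
Two of the measures discussed above, the cumulative frequency of departments and the time taken to complete a complaint, can be sketched from an event log as follows. This is a minimal illustration and not the study's method: the log layout, the function names, and the records are hypothetical, and only the department names are taken from the abstract.

```python
from collections import Counter
from datetime import date

# Hypothetical event log: (complaint_id, department, date handled).
EVENT_LOG = [
    ("c1", "Sewage Treatment Division", date(2014, 1, 6)),
    ("c1", "Waterworks Division", date(2014, 1, 9)),
    ("c2", "Sewage Treatment Division", date(2014, 1, 7)),
    ("c2", "Urban Design Division", date(2014, 2, 10)),
    ("c3", "Sewage Treatment Division", date(2014, 1, 8)),
]


def cumulative_department_share(log):
    """List (department, cumulative share of events), busiest department first."""
    counts = Counter(dept for _, dept, _ in log)
    total = sum(counts.values())
    running = 0
    shares = []
    for dept, n in counts.most_common():
        running += n
        shares.append((dept, running / total))
    return shares


def completion_days(log):
    """Days between the first and last recorded event of each complaint."""
    first, last = {}, {}
    for case, _, day in log:
        first[case] = min(first.get(case, day), day)
        last[case] = max(last.get(case, day), day)
    return {case: (last[case] - first[case]).days for case in first}


print(cumulative_department_share(EVENT_LOG))
print(completion_days(EVENT_LOG))
```

Reading the cumulative shares from the top shows how few departments account for most of the workload, and the per-complaint durations give the spread of completion times.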

The Change of The Effect on The Subcutaneous Fat Area and Visceral Fat Area by The Functional Electrical Stimulation and Aerobic Exercise (기능적 전기 자극과 유산소 운동이 복부비만의 피하지방과 내장지방에 미치는 효과)

  • Oh Sung-tae;Lee Mun-hwan;Park Rae-Joon
    • The Journal of Korean Physical Therapy
    • /
    • v.16 no.1
    • /
    • pp.85-123
    • /
    • 2004
  • Background: Subcutaneous fat area is the main factor involved in replacement disease and arteriosclerosis. Simple weight control is the appropriate medical treatment. It is understood that weight reduction not only reduces fat concentrations in the blood but also reduces blood pressure, improves glucose levels in diabetes patients, and reduces the incidence of heart disease. There are several methods for reducing fat in the abdominal region, but their effectiveness is not fully understood. One method is electrical stimulation of the problem areas. Method: The study ran from May 1st, 2002 to October 31st. The 15 subjects who received a medical examination were aged between 25 and 53 and were of mixed gender. The subjects were divided into two groups, one receiving functional electrical stimulation and the other serving as a control group. Using Broca's criterion for judging fat grades, we analysed the differences between the two groups before and after the treatment. Subjects received functional electrical stimulation on the abdominal muscles at 50Hz, 4 days a week for 40 minutes a day. For aerobic exercise, a treadmill was used at an intensity of $75\%$ of maximum heart rate (220-age). Results: 1) After functional electrical stimulation, the male subjects showed reductions in weight of 1.93kg, obesity of $2.60\%$, fat mass of 2.73kg, percent body fat of $4.40\%$, waist circumference of 6.53cm, and hip circumference of 5.53cm. On the other hand, muscle mass increased by 1.03kg, but not at a significant level. The subcutaneous fat area was reduced by $26.63cm^2$ and the visceral fat area by $43.00cm^2$. The female subjects showed reductions in fat grade of $26.63cm^2$, body fat mass of 1.5kg, percent body fat of $1.77\%$, waist circumference of 4.02cm, hip circumference of 3.67cm, and weight of 1.40kg, while muscle mass increased by 0.72kg. 
Reductions were also seen in the subcutaneous fat area, by $24.03cm^2$, and the visceral fat area, by $25.36cm^2$. 2) After aerobic exercise, the male subjects showed reductions in weight of 3.36kg, obesity of $4.00\%$, and fat mass of 2.83kg, and an increase in soft lean mass of 2.96kg; they also showed reductions in percent body fat of $3.03\%$, fat distribution of $0.023\%$, waist circumference of 3.10cm, and hip circumference of 2.23cm. The female subjects showed a reduction in weight of 2.48kg and in percent body fat of $2.20\%$, and an increase in soft lean mass of 1.54kg. They also showed reductions in fat mass of 2.32kg, percent body fat of $2.80\%$, waist circumference of 2.16cm, hip circumference of 2.68cm, fat distribution of $0.016\%$, subcutaneous fat area of $15.25cm^2$, and visceral fat area of $11.52cm^2$. After aerobic exercise, there was no significant change in total cholesterol, triglyceride, high density lipoprotein cholesterol, or low density lipoprotein cholesterol. 3) Comparing body composition after functional electrical stimulation and aerobic exercise in the male subjects, weight reduction and increased muscle mass were seen in the aerobic exercise group. The electrical stimulation group showed significant differences in abdominal fat rate, waist circumference, and hip circumference. On the other hand, there were no significant differences between the two groups in obesity rate, body fat mass, or percent body fat. There was no significant difference in the subcutaneous fat area; in contrast, there was a significant difference in the visceral fat area in the electrical stimulation group. There was no significant difference in total cholesterol, triglyceride, high density lipoprotein cholesterol, or low density lipoprotein cholesterol between electrical stimulation and aerobic exercise. 
4) Comparing body composition after functional electrical stimulation and aerobic exercise in the female subjects, weight reduction and increased muscle mass were seen in the aerobic exercise group. The group that received electrical stimulation showed significant differences in obesity rate, abdominal fat rate, and circumference. However, there were no significant differences between the two groups in body fat mass or hip circumference. The subcutaneous fat area showed no significant difference. In contrast, the visceral fat area differed considerably in the electrical stimulation group. Conclusion: The results show that functional electrical stimulation and aerobic exercise do not differ significantly when it comes to total cholesterol, triglyceride, high density lipoprotein cholesterol, and low density lipoprotein cholesterol, though body composition changes favorably after both electrical stimulation and aerobic exercise. Functional electrical stimulation is more effective on the subcutaneous fat area and in changing the visceral fat area. Therefore, it is concluded that physical therapy is more effective in the treatment of abdominal obesity.
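
The treadmill intensity quoted in the method, 75% of maximum heart rate with the maximum estimated as 220 minus age, works out as in the sketch below. The function name is hypothetical; the two ages are the youngest and oldest subject ages given in the abstract.

```python
def target_heart_rate(age, intensity=0.75):
    """Target heart rate in beats per minute.

    Uses the age-predicted maximum (220 - age) quoted in the abstract and
    scales it by the exercise intensity (0.75 for the 75% level).
    """
    max_heart_rate = 220 - age
    return intensity * max_heart_rate


# Youngest and oldest subject ages reported in the abstract.
for age in (25, 53):
    print(age, target_heart_rate(age))
```

For the reported age range this gives targets of roughly 146 beats per minute for a 25-year-old and 125 for a 53-year-old.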
