• Title/Summary/Keyword: Classification Problem


Self-optimizing feature selection algorithm for enhancing campaign effectiveness (캠페인 효과 제고를 위한 자기 최적화 변수 선택 알고리즘)

  • Seo, Jeoung-soo;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.26 no.4 / pp.173-198 / 2020
  • For a long time, many studies have been conducted on predicting the success of campaigns for customers in academia, and prediction models applying various techniques are still being studied. Recently, as campaign channels have been expanded in various ways due to the rapid revitalization of online, various types of campaigns are being carried out by companies at a level that cannot be compared to the past. However, customers tend to perceive it as spam as the fatigue of campaigns due to duplicate exposure increases. Also, from a corporate standpoint, there is a problem that the effectiveness of the campaign itself is decreasing, such as increasing the cost of investing in the campaign, which leads to the low actual campaign success rate. Accordingly, various studies are ongoing to improve the effectiveness of the campaign in practice. This campaign system has the ultimate purpose to increase the success rate of various campaigns by collecting and analyzing various data related to customers and using them for campaigns. In particular, recent attempts to make various predictions related to the response of campaigns using machine learning have been made. It is very important to select appropriate features due to the various features of campaign data. If all of the input data are used in the process of classifying a large amount of data, it takes a lot of learning time as the classification class expands, so the minimum input data set must be extracted and used from the entire data. In addition, when a trained model is generated by using too many features, prediction accuracy may be degraded due to overfitting or correlation between features. Therefore, in order to improve accuracy, a feature selection technique that removes features close to noise should be applied, and feature selection is a necessary process in order to analyze a high-dimensional data set. 
Among greedy algorithms, SFS (Sequential Forward Selection), SBS (Sequential Backward Selection), and SFFS (Sequential Floating Forward Selection) are widely used as traditional feature selection techniques. However, when there are many features, these methods are limited by poor classification performance and long learning times. Therefore, in this study, we propose an improved feature selection algorithm to enhance campaign effectiveness. The purpose of this study is to improve the existing SFFS sequential method for searching feature subsets, which are the basis for improving machine learning model performance, by using statistical characteristics of the data processed in the campaign system. Features that strongly influence performance are derived first, features that have a negative effect are removed, and then the sequential method is applied, which increases search efficiency and enables generalized prediction. The proposed model showed better search and prediction performance than the traditional greedy algorithm. Campaign success prediction was also higher than with the original data set, the greedy algorithm, a genetic algorithm (GA), and recursive feature elimination (RFE). In addition, the improved feature selection algorithm was found to help in analyzing and interpreting the prediction results by providing the importance of the derived features. These include important features such as age, customer rating, and sales, which were already known statistically. 
Unexpectedly, features that campaign planners had rarely used to select campaign targets, such as the combined product name, the average 3-month data consumption rate, and wireless data usage over the last 3 months, were also selected as important features for campaign response. It was confirmed that base attributes can also be very important features depending on the type of campaign. This makes it possible to analyze and understand the important characteristics of each campaign type.
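
For readers unfamiliar with the baseline that this abstract builds on, the following is a minimal sketch of plain SFFS, not the authors' improved algorithm. The function names `sffs` and `toy_score`, and the feature weights, are hypothetical and chosen only for illustration; in practice `score` would be something like the cross-validated accuracy of a classifier trained on the candidate subset.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set


def sffs(features: Iterable[str],
         score: Callable[[FrozenSet[str]], float],
         k: int) -> Set[str]:
    """Plain Sequential Floating Forward Selection of up to k features."""
    remaining = set(features)
    selected: Set[str] = set()
    best_by_size: Dict[int, float] = {}  # best score seen for each subset size

    while len(selected) < k and remaining:
        # Forward step: add the single feature that gives the highest score.
        best = max(sorted(remaining),
                   key=lambda f: score(frozenset(selected | {f})))
        selected.add(best)
        remaining.discard(best)
        size = len(selected)
        current = score(frozenset(selected))
        best_by_size[size] = max(current, best_by_size.get(size, float("-inf")))

        # Floating step: drop a feature only while doing so strictly beats
        # the best subset of the smaller size found so far.
        while len(selected) > 2:
            worst = max(sorted(selected),
                        key=lambda f: score(frozenset(selected - {f})))
            reduced = score(frozenset(selected - {worst}))
            if reduced > best_by_size.get(len(selected) - 1, float("-inf")):
                selected.discard(worst)
                remaining.add(worst)
                best_by_size[len(selected)] = reduced
            else:
                break
    return selected


# Hypothetical additive score: the weights are made up for this example.
WEIGHTS = {"age": 3.0, "rating": 2.0, "sales": 2.0, "noise": -1.0}


def toy_score(subset: FrozenSet[str]) -> float:
    return sum(WEIGHTS[f] for f in subset)


result = sffs(WEIGHTS.keys(), toy_score, k=3)
```

With this toy score the harmful `noise` feature is never selected. The floating step matters when the score is not additive, for example when two features are redundant with each other.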

Target-Aspect-Sentiment Joint Detection with CNN Auxiliary Loss for Aspect-Based Sentiment Analysis (CNN 보조 손실을 이용한 차원 기반 감성 분석)

  • Jeon, Min Jin;Hwang, Ji Won;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.27 no.4 / pp.1-22 / 2021
  • Aspect-Based Sentiment Analysis (ABSA), which analyzes sentiment based on aspects that appear in the text, is drawing attention because it can be used in various business industries. ABSA analyzes sentiment separately for each of the multiple aspects that a text has. It is being studied in various forms depending on the purpose, such as analyzing all targets or just aspects and sentiments. Here, the aspect refers to the property of a target, and the target refers to the text that causes the sentiment. For example, for restaurant reviews, the aspects could be food taste, food price, quality of service, mood of the restaurant, etc. Also, if there is a review that says, "The pasta was delicious, but the salad was not," the words "pasta" and "salad," which are directly mentioned in the sentence, become the "target." So far, in ABSA, most studies have analyzed sentiment only based on aspects or targets. However, even with the same aspects or targets, sentiment analysis may be inaccurate. This happens when aspects or sentiments are divided or when sentiment exists without a target. Consider a sentence like, "Pizza and the salad were good, but the steak was disappointing." Although the aspect of this sentence is limited to "food," conflicting sentiments coexist. In addition, in the case of sentences such as "Shrimp was delicious, but the price was extravagant," although the target here is "shrimp," opposite sentiments coexist that depend on the aspect. Finally, in sentences like "The food arrived too late and is cold now," there is no target (NULL), but the sentence conveys a negative sentiment toward the aspect "service." In such cases, failure to consider both aspects and targets - when sentiment or aspect is divided or when sentiment exists without a target - creates a dual dependency problem. 
To address this problem, this research analyzes sentiment by considering both aspects and targets (Target-Aspect-Sentiment Detection, hereafter TASD). This study identified the limitations of existing research in the field of TASD: local contexts are not fully captured, and the F1-score drops dramatically depending on the number of epochs and the batch size. The current model excels at capturing the overall context and the relations between words. However, it struggles with phrases in the local context and is relatively slow to learn. Therefore, this study tries to improve the model's performance. To achieve this objective, we additionally used an auxiliary loss in aspect-sentiment classification by constructing CNN (Convolutional Neural Network) layers parallel to existing models. Whereas existing models analyze aspect-sentiment through BERT encoding, pooler, and linear layers, this research added a CNN layer with adaptive average pooling to the existing models, and training proceeded by adding the additional aspect-sentiment loss to the existing loss. In other words, during training, the auxiliary loss computed through the CNN layers allowed the local context to be captured better. After training, the model performs aspect-sentiment analysis through the existing method. To evaluate the performance of this model, two datasets, SemEval-2015 task 12 and SemEval-2016 task 5, were used, and the F1-score increased compared to the existing models. When the batch size was 8 and the number of epochs was 5, the difference between the F1-scores of existing models and this study was largest, at 29 and 45, respectively. Even when batch size and epochs were adjusted, the F1-scores were higher than those of the existing models. In other words, even with small batch sizes and few epochs, the model can be trained effectively compared to the existing models. Therefore, it can be useful in situations where resources are limited. Through this study, aspect-based sentiments can be analyzed more accurately. 
Through various uses in business, such as development or establishing marketing strategies, both consumers and sellers will be able to make efficient decisions. In addition, it is believed that the model can be fully trained and utilized by small businesses that do not have much data, given that it uses a pre-trained model and recorded a relatively high F1-score even with limited resources.
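
The idea of adding an auxiliary loss to the existing loss can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `aux_weight` parameter (the abstract only says the losses are added, so a weight of 1.0 is assumed), and the example logits are hypothetical, and plain Python stands in for a deep learning framework.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Softmax cross-entropy for one example, computed in a stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def joint_loss(main_logits: Sequence[float],
               aux_logits: Sequence[float],
               target: int,
               aux_weight: float = 1.0) -> float:
    """Training loss: main-head loss plus the weighted auxiliary-head loss.

    main_logits: output of the existing path (encoder, pooler, linear layer).
    aux_logits:  output of the parallel CNN path (assumed to predict the
                 same aspect-sentiment label).
    """
    main = cross_entropy(main_logits, target)
    aux = cross_entropy(aux_logits, target)
    return main + aux_weight * aux


# Hypothetical logits for a 3-class aspect-sentiment label.
loss = joint_loss([2.0, 0.5, -1.0], [1.0, 1.0, 0.0], target=0)
```

At inference time only `main_logits` would be used, which matches the abstract's statement that the trained model predicts through the existing method; the auxiliary head influences only the gradients during training.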

Updating DEM for Improving Geomorphic Details (미기복 지형 표현을 위한 DEM 개선)

  • Kim, Nam-Shin
    • Journal of the Korean Association of Geographic Information Studies / v.12 no.1 / pp.64-72 / 2009
  • Generating a digital elevation model (DEM) from contour lines has the problem that low-relief landforms cannot be clearly represented, because the representation of micro landform elements depends strongly on the contour interval. Thus, this study develops a landcover burning method that recovers the micro-relief landforms of the DEM by assigning elevation information to the landcover and applying buffering and map algebra methods. The process consisted of generating the primary DEM, making the landcover map, and recovering the DEM for the landcover elements that represent micro landforms, using the buffering method and elevation information through map algebra. The recovery of micro landforms was applied to stream landforms. Recovery by the buffering method was performed for bars, which are polygonal elements, and wetlands according to their concave or convex properties, by generating contours at a uniform interval to which the elevation information of the recovered landform was applied. Linear elements, such as banks, roads, waterways, and tributaries, can be recovered by applying elevation information through a map algebra function. Because polygonal elements such as stream channels, river terraces, and artificial objects (farmlands) are treated as flat, they are recovered by inputting constant elevation values. The original DEM and the recovered DEM were then compared in terms of the degree of landform expression. The DEM produced by the conventional method expressed few micro landform elements, whereas the method developed in this study described wetlands, bars, landforms around rivers, farmland, banks, river terraces, and artificial objects well. 
The results of this study are expected to contribute to the classification and analysis of micro landforms and plains, and to ecological and environmental studies that require the recovery of micro landforms around streams and rivers.
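
As a rough illustration of the map algebra step, the sketch below burns landcover classes into a small elevation grid cell by cell. It is a simplified assumption, not the paper's procedure: the class names, the offset for banks, and the constant elevations are hypothetical, and buffering and contour generation are not shown.

```python
from typing import Callable, Dict, List

Grid = List[List[float]]
Rule = Callable[[float], float]


def burn(dem: Grid, landcover: List[List[str]], rules: Dict[str, Rule]) -> Grid:
    """Return a new DEM where each cell is updated by its landcover rule.

    Cells whose landcover class has no rule keep their original elevation.
    """
    return [
        [rules[lc](z) if lc in rules else z for z, lc in zip(dem_row, lc_row)]
        for dem_row, lc_row in zip(dem, landcover)
    ]


# Hypothetical 2 x 3 example: a flat primary DEM at 10 m.
dem = [[10.0, 10.0, 10.0],
       [10.0, 10.0, 10.0]]
landcover = [["bank", "none", "channel"],
             ["bank", "farmland", "channel"]]

rules = {
    "bank": lambda z: z + 2.0,     # linear element: raise by an assumed offset
    "channel": lambda z: 8.0,      # flat polygon: assumed constant elevation
    "farmland": lambda z: 10.5,    # flat polygon: assumed constant elevation
}

recovered = burn(dem, landcover, rules)
```

Keeping the rules in a dictionary makes it easy to treat flat polygonal elements (constant values) and linear elements (functions of the existing elevation) with the same cell-wise operation.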

Development of the Information Delivery System for the Home Nursing Service (가정간호사업 운용을 위한 정보전달체계 개발 I (가정간호 데이터베이스 구축과 뇌졸중 환자의 가정간호 전산개발))

  • Park, J.H;Kim, M.J;Hong, K.J;Han, K.J;Park, S.A;Yung, S.N;Lee, I.S;Joh, H.;Bang, K.S
    • Journal of Korean Academic Society of Home Health Care Nursing / v.4 / pp.5-22 / 1997
  • The purpose of the study was to develop an information delivery system for the home nursing service (HNS), and to demonstrate and evaluate its efficiency. The research was conducted from September 1996 to August 31, 1997. In the first stage, an assessment tool was first developed through literature review for patients with cerebral vascular disease, who have the first priority for HNS among patients with various health problems at home. Second, after the home care nurse identified the patient's nursing problems with the assessment tool, the patient classification system developed by Park (1988), which comprises 128 nursing activities under 6 categories, was used to identify the home care nurse's activities for patients with CVA at home. The research team held several workshops with 5 clinical nurse experts to refine it. Finally, 110 nursing activities under 11 categories for patients with CVA were derived. In the second stage, algorithms were developed to connect the 110 nursing activities with the patient nursing problems identified by the assessment tool. The computerizing process of the algorithms is as follows. The algorithms are realized as a computer program using software engineering techniques. Development follows the prototyping method, which serves as the requirement analysis for the software specifications. The basic features of usability, compatibility, adaptability, and maintainability are taken into consideration. Particular emphasis is given to the efficient construction of the database. To enhance database efficiency and establish structural cohesion, the data fields are categorized by their weight of relevance to the particular disease. This approach permits easy adaptation when numerous diseases are added in the future. In parallel with this, expandability and maintainability are stressed throughout the program development, which leads to a modular design. 
Since the number of diseases to be covered increases as the project progresses, and since they are interrelated and coupled with each other, expandability and maintainability should be given high priority. Furthermore, since the system is to be integrated with other medical systems in the future, these properties are very important. The prototype developed in this project is to be evaluated through the system testing stage. There are various evaluation metrics, such as cohesion, coupling, and adaptability. Unfortunately, direct measurement of these metrics is very difficult, and accordingly, analytical and quantitative evaluations are almost impossible. Therefore, instead of analytical evaluation, experimental evaluation is to be applied through test runs by various users. This system testing will provide an analysis from the user's point of view, and the detailed and additional requirement specifications arising from the users' real situations will be fed back into the system modeling. Also, the degree of freedom of the input and output will be improved, and the hardware limitations will be investigated. After refinement, the prototype system will be used as a design template and will be used to develop a more extensive system. In detail, the relevant modules will be developed for the various diseases, and the modules will be integrated by a macroscopic design process focusing on inter-modularity, generality of the database, and compatibility with other systems. The Home Care Evaluation System comprises three main modules: (1) general information on a patient, (2) general health status of a patient, and (3) cerebrovascular disease patient. The general health status module has five sub-modules: physical measurement, vitality, nursing, pharmaceutical description, and emotional/cognitive ability. 
The CVA patient module is divided into ten sub-modules, such as subjective sense, consciousness, memory, and language pattern. The typical sub-modules are described in appendix 3.

Review of a Plant-Based Health Assessment Methods for Lake Ecosystems (식물에 의한 호수생태계 건강성 평가법에 대한 고찰)

  • Choung, Yeonsook;Lee, Kyungeun
    • Korean Journal of Ecology and Environment / v.46 no.2 / pp.145-153 / 2013
  • It is a global trend that the water management policy is shifting from a water quality-oriented assessment to the aquatic ecosystem-based assessment. The majority of aquatic ecosystem assessment systems were developed solely based on physicochemical factors (e.g., water quality and bed structure) and a limited number of organisms (e.g., plankton and benthic organisms). Only a few systems use plants for a health assessment, although plants are sensitive indicators reflecting long-term disturbances and alterations in water regimes. The development of an assessment system is underway to evaluate and manage lakes as ecosystem units in the Korean Ministry of Environment. We reviewed the existing multivariate health assessment methods of other leading countries, and discussed their applicability to Korean lakes. The application of multivariate assessment methods is costly and time consuming, in addition to the correlation problem among variables. However, a single variable is not available at this moment, and the multivariate method is an appropriate system due to its multidimensional evaluation and cumulative data generation. We, therefore, discussed multivariate assessment methods in three steps: selecting metrics, scoring metrics and assessing indices. In the step of selecting metrics, the best available metrics are species-related variables, such as composition and abundance, as well as richness and diversity. Indicator species, such as sensitive species, are the most frequently used in other countries, but their system of classification in Korea is not yet complete. In terms of scoring metrics, the lack of reference lakes with little anthropogenic impact make this step difficult, and therefore, the use of relative scores among the investigated lakes is a suitable alternative. Overall, in spite of several limitations, the development of a plant-based multivariate assessment method in Korea is possible using mostly field research data. 
Later, it could be improved based on qualitative metrics on plant species, and with the emergence of further survey data.

THE CLASSIFICATION OF ADOLESCENTS IN RUNAWAY SHELTERS BY THE EVALUATION OF THEIR PSYCHOPATHOLOGY (보호시설 가출청소년의 정신병리에 대한 평가와 분류)

  • Lee, Jong-Sung;Kwack, Young-Sook
    • Journal of the Korean Academy of Child and Adolescent Psychiatry / v.12 no.2 / pp.192-217 / 2001
  • Objective: This study was carried out to classify adolescents in runaway shelters by evaluating their psychopathology. The ultimate purpose is to offer basic data for preventing adolescents from running away and for diversifying runaway shelters to suit the problems of individual adolescents. Method: 128 adolescents staying in runaway shelters were asked to complete self-report questionnaires including basic sociodemographic data, the Child Behavior Check List (CBCL), the Minnesota Multiphasic Personality Inventory (MMPI), and the Symptom Check List-90-Revised (SCL-90-R). The Korean Wechsler Adult Intelligence Scale (K-WAIS) [or the Korean Educational Developmental Institute-Wechsler Intelligence Scale for Children (KEDI-WISC)] and the Bender-Gestalt test (BGT) were also administered by clinical psychologists. Results: The most common age of the subjects was 15 years, and they had most commonly dropped out of school during middle school. Most were from middle-class families, and their parents' educational level was most commonly high school graduate. The first runaway episode was most common in the middle school period, and runaways were repeated. The most common frequency of runaways was more than 10 times. About 10% of them abused drugs and about 80% of them abused alcohol. One third of them had experienced legal problems and 10% of them had engaged in sexual activity for money. 95 adolescents (83%) in CBCL, 42 adolescents (36%) in SCL-90-R, and 70 adolescents (69.3%) in MMPI showed clinical significance. In the intelligence test, 22 adolescents (22%) were mentally retarded. In BGT, 35 adolescents (39.4%) manifested signs of brain dysfunction. Conclusion: Runaway adolescents in the shelters have variable and severe psychopathology. Their psychopathology is classified as follows: the behavior disorder group, the mood disorder group with anxiety/depression, the somatic disorder group with somatic symptoms, and the psychosis group with the possibility of severe psychopathology. 
Therefore it is very important to evaluate psychiatric problems of runaway adolescents, and specific therapeutic interventions according to their problems are required.

Why Your Manuscripts Were Rejected or Required a Major Revision: An Analysis of Asia Pacific Journal of Information Systems (MIS 논문의 '게재 불가' 및 '수정 후 재심사' 사유: Asia Pacific Journal of Information Systems 심사소견서 분석)

  • Lee, Choong-C.;Yun, Hae-Jung;Hwang, Seong-Hoon
    • Asia Pacific Journal of Information Systems / v.19 no.2 / pp.179-193 / 2009
  • As the common saying "publish or perish" attests, publishing is absolutely critical for academic researchers' successful careers. It is the most objectively accepted criterion of academic performance and the most viable way to attain public and academic recognition. Asia Pacific Journal of Information Systems (APJIS) has been recognized as the most influential domestic journal in the Korean MIS field since July, 1991. Therefore, publishing in APJIS means your research is original, valid, and contributive. While most researchers learn how to publish an article in APJIS through a repetitive review process, thereby improving their chance of being 'accepted' through personal trial and error, such valuable lessons and know-how tend to be kept private and are rarely shared. However, useful insights into research and publication skills can also be gained from sharing others' errors, neglect, and misjudgments, which are equally critical in improving researchers' knowledge in the field (Murthy and Wiggins, 2002). For this reason, other academic disciplines make systematic efforts to examine the paper review process of major journals and share the findings from these studies with the rest of the research community (Beyer et al., 1995; Cummings et al., 1985; Daft, 1995; Jauch and Wall, 1989; Murthy and Wiggins, 2002). Recognizing the urgent need to provide this type of information to the MIS research community in Korea, we have chosen the most influential academic journal, APJIS, with the intention of sharing the answer to the following research question: "What are the common problems found in the manuscripts either 'rejected' or 'required a major revision' by APJIS reviewers?" This study analyzes the review results of manuscripts submitted to APJIS (from January, 2006 to October, 2008), particularly those that were 'rejected' or required a 'major revision' in the first round. 
Based on Daft's(1995) study, twelve most-likelihood problems were defined and used to analyze the reviews. The twelve criteria for classification, or "twelve problems", are as follows: No theory, Concepts and operationalization not in alignment, Insufficient definition--theory, Insufficient rationale--design, Macrostructure--organization and flow, Amateur style and tone, Inadequate research design, Not relevant to the field, Overengineering, Conclusions not in alignment, Cutting up the data, and Poor editorial practice. Upon the approval of the editorial board of APJIS, the total 252 reviews, including 11 cases of 2005 and 241 cases from July, 2006 to October, 2008, were received without any information about manuscripts, authors, or reviewers. Eleven cases of 2005 were used in the pilot test because the data of 2005 were not in complete enumeration, and the 241 reviews (113 cases of 'rejection' and 128 ones of 'major revision') of 2006, 2007, and 2008 were examined in this study. Our findings show that insufficient rationale-design(20.25%), no theory(18.45%), and insufficient definition--theory(15.69%) were the three leading reasons of 'rejection' and 'major revision.' Between these two results, the former followed the same order of three major reasons as an overall analysis (insufficient rationale-design, no theory, and insufficient definition-theory), but the latter followed the order of insufficient rationale--design, insufficient definition--theory, and no theory. Using Daft's three major skills-- 'theory skills', 'design skills', and 'communication skills'-- twelve criteria were reclassified into 'theory problems', 'design problems', and 'communication problems' to derive more practical implications of our findings. Our findings show that 'theory problems' occupied 43.48%, 'design problems' were 30.86%, and 'communication problems' were 25.86%. In general, the APJIS reviewers weigh each of these three problem areas almost equally. 
Compared with other disciplines, such as the management field shown in Daft's study, the portions of 'design problems' and 'communication problems' are much higher in manuscripts submitted to APJIS than in those of Administrative Science Quarterly and Academy of Management Journal, even though 'theory problems' are the most predominant in both disciplines.

Operative treatment for Proximal Humeral Fracture (상완골 근위부 골절의 수술적 요법)

  • Park Jin-Young;Park Hee-Gon
    • Journal of Korean Orthopaedic Sports Medicine / v.2 no.2 / pp.168-175 / 2003
  • Fractures of the proximal humerus may be classified by the segment involved: the articular segment or anatomical neck, the greater tuberosity, the lesser tuberosity, and the shaft or surgical neck. The commonly used Neer classification is based on the number of displaced segments, defined as more than 1 cm of displacement or more than 45 degrees of angulation, rather than on the number of fracture lines. Absolute indications for operative treatment are open fracture, fracture with vascular or nerve injury, and irreducible fracture-dislocation. Conversely, for patients with severe osteoporosis and elderly patients in whom strong internal fixation cannot be achieved, arthroplasty with primary prosthetic replacement and an early rehabilitation program is preferable to open reduction and internal fixation. The surgeon must decide which patients should undergo open reduction and internal fixation, because anatomical morphology, bone density, and patient condition differ from case to case. The surgeon also chooses the operative procedure, for example percutaneous pinning, open reduction, plate and screws, or wire tension bands combined with an intramedullary device. Poor general health due to other medical problems, fracture with unstable vital signs, and severe osteoporosis are relative contraindications. A stable fracture without dislocation is not an indication for surgery. A preoperative radiograph of the proximal humerus may not be sufficient to evaluate the fracture. Good radiographs are necessary to evaluate the fracture pattern. The trauma series is better than other radiographs for evaluation. Accessory radiologic examinations can help evaluate the bone fragments and anatomy. CT can be helpful in evaluating these injuries, especially if the exact fracture type cannot be determined from plain roentgenograms of the proximal humerus and humeral head. 
If the dislocation is anatomically severe, three-dimensional remodelling can be considered. Preoperative MRI is not better than CT for observing bony morphology. If vascular injury is suspected, angiography can be considered.

Classification of Domestic Freight Data and Application for Network Models in the Era of 'Government 3.0' ('정부 3.0' 시대를 맞이한 국내 화물 자료의 집계 수준에 따른 분류체계 구축 및 네트워크 모형 적용방안)

  • YOO, Han Sol;KIM, Nam Seok
    • Journal of Korean Society of Transportation / v.33 no.4 / pp.379-392 / 2015
  • Freight flow data in Korea have been collected for a variety of purposes by various organizations. However, since the representation and format of the data vary, they have not been substantially used for freight analyses or for freight policies. To increase the applicability of those data sets, they need to be brought together in a table and compared to find the differences. It is then shown that the raw data can be aggregated by a particular criterion such as mode, origin and destination, and commodity type. This study examines the freight data issue from three different points of view. First, we investigated various freight volume data sets released by several organizations. Second, we tried to develop formulations for freight volume data. Third, we discussed how to apply the formulations to network models in which particular OR (Operations Research) techniques are used. The results emphasized that some data might be useless for modeling once they are aggregated. In examining the freight volume data, this study found that 14 organizations share their data sets at various aggregation levels. This study is not an ordinary research article, which normally includes data analysis, because it seems impossible to conduct extensive case studies given the diversity of the data dealt with here. Nevertheless, this study might guide the direction of research in the freight transport research community with respect to data issues. In particular, this study is timely because the government has emphasized the importance of sharing data with the public for research purposes through 'Government 3.0'.

Evaluation of clinical status of removable partial dentures (가철성 국소의치의 임상적 상태에 대한 평가)

  • Yang, Dong-Seok;Cho, Uk;Jeong, Chang-Mo;Jeon, Young-Chan;Yun, Mi-Jung
    • The Journal of Korean Academy of Prosthodontics / v.47 no.3 / pp.320-327 / 2009
  • Statement of Problem: Although many efforts have been made to estimate the long-term prognosis of removable partial dentures, complications of removable partial dentures are still found because of inaccurate fabrication procedures and improper maintenance care. Purpose: The purpose of this study was to evaluate the clinical status of removable partial dentures. Material and methods: A total of 112 individuals with 153 removable partial dentures (35 - 87 years, 64 women and 48 men) were examined by intra-oral examination, diagnostic cast, and radiographic examination. Results and conclusion: The results of this study were as follows: 1. Length of service of removable partial dentures was $5.3{\pm}4.3$ years (mean), 4.0 years (median). 2. A total of 45 removable partial dentures were considered failures. Loss of 18 of 369 abutments was found. 3. Type of arch, Kennedy classification, and type of opposing dentition were found to have no influence on the longevity and success rate of removable partial dentures (P > .05). 4. The most common major connector in the maxilla was the palatal plate, and the numbers of lingual bars and linguoplates designed in the mandible were similar. 5. The circumferential type retainer was the most commonly used retainer. 6. Sixty-three percent of the class I and II removable partial dentures incorporated indirect retention into the design. 7. Approximately 81% of the removable partial dentures had at least one defect. Excessive wear of posterior teeth (27.9%), lack of integrity (23.2%), and lack of stability (22.6%) were frequent defects of removable partial dentures.