A Proposal of a Keyword Extraction System for Detecting Social Issues (사회문제 해결형 기술수요 발굴을 위한 키워드 추출 시스템 제안)
Journal of Intelligence and Information Systems, v.19 no.3, pp.1-23, 2013
The existing approach to discovering significant social issues that modern society urgently needs to solve, such as unemployment, economic crisis, and social welfare, is to collect opinions from experts and scholars through online or offline surveys. However, this method is not always effective. Because of the expense, a large number of survey replies is seldom gathered, and in some cases it is hard to find experts who deal with a specific social issue. Thus, the sample set is often small and may be biased. Furthermore, several experts may reach totally different conclusions about the same social issue because each has a subjective point of view and a different background. It is then considerably hard to figure out what the current social issues are and which of them are really important. To overcome these shortcomings, in this paper we develop a prototype system that semi-automatically detects keywords representing social issues and problems from about 1.3 million news articles published by about 10 major domestic news outlets in Korea from June 2009 to July 2012. Our proposed system consists of (1) collecting news articles and extracting their text, (2) identifying only the news articles related to social issues, (3) analyzing the lexical items of Korean sentences, (4) finding a set of topics regarding social keywords over time based on probabilistic topic modeling, (5) matching relevant paragraphs to a given topic, and (6) visualizing social keywords for easy understanding. In particular, we propose a novel matching algorithm relying on generative models. The goal of the matching algorithm is to find the paragraphs that best match each topic.
Technically, using a topic model such as Latent Dirichlet Allocation (LDA), we can obtain a set of topics, each of which consists of relevant terms and their probability values. In our problem, given a set of text documents (e.g., news articles), LDA produces a set of topic clusters, and each topic cluster is then labeled by human annotators, where each topic label stands for a social keyword. For example, suppose there is a topic (e.g., Topic1 = {(unemployment, 0.4), (layoff, 0.3), (business, 0.3)}) and a human annotator labels Topic1 as "Unemployment Problem". From this label alone, it is non-trivial to understand what happened regarding the unemployment problem in our society. In other words, by looking only at social keywords, we have no idea of the detailed events occurring in our society. To tackle this matter, we develop a matching algorithm that computes the probability of a paragraph given a topic, relying on (i) the topic terms and (ii) their probability values. Given a set of text documents, we segment each document into paragraphs and, in parallel, use LDA to extract a set of topics from the documents. Our matching process then assigns each paragraph to the topic it best matches, so that each topic ends up with several best-matched paragraphs. For example, suppose there are a topic (e.g., Unemployment Problem) and its best-matched paragraph (e.g., "Up to 300 workers lost their jobs in XXX company at Seoul"). In this case, we can grasp the details behind the social keyword, such as "300 workers", "unemployment", "XXX company", and "Seoul". In addition, our system visualizes social keywords over time. Therefore, through our matching process and keyword visualization, researchers will be able to detect social issues easily and quickly.
Through this prototype system, we detected various social issues appearing in our society, and our experimental results show the effectiveness of the proposed methods. Our proof-of-concept system is available at http://dslab.snu.ac.kr/demo.html.
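The paragraph-to-topic matching step can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the sketch below assumes a simple unigram generative model: each paragraph is scored by the length-normalised log-probability of its words under a topic's term distribution, with an assumed floor probability for words the topic does not list. The topic contents, the `SMOOTHING` value, and all function names are hypothetical.

```python
import math
import re

# Hypothetical topics (term -> probability), in the shape an LDA run might return
# after human annotators have attached labels.
TOPICS = {
    "Unemployment Problem": {"unemployment": 0.4, "layoff": 0.3, "business": 0.3},
    "Social Welfare": {"welfare": 0.5, "pension": 0.3, "elderly": 0.2},
}

SMOOTHING = 1e-6  # assumed floor probability for terms a topic does not list


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def log_likelihood(tokens, topic_terms):
    """Length-normalised log P(paragraph | topic) under a unigram model."""
    if not tokens:
        return float("-inf")
    total = sum(math.log(topic_terms.get(tok, SMOOTHING)) for tok in tokens)
    return total / len(tokens)


def best_topic(paragraph, topics=TOPICS):
    """Return (label, score) for the topic that best matches the paragraph."""
    tokens = tokenize(paragraph)
    scores = {label: log_likelihood(tokens, terms) for label, terms in topics.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]


def match_paragraphs(document, topics=TOPICS):
    """Split a document on blank lines and group paragraphs by best-matching topic."""
    matched = {label: [] for label in topics}
    for para in (p.strip() for p in document.split("\n\n")):
        if para:
            label, score = best_topic(para, topics)
            matched[label].append((score, para))
    for label in matched:
        matched[label].sort(reverse=True)  # best-scoring paragraphs first
    return matched
```

Calling `match_paragraphs` on an article yields, for each labeled topic, a ranked list of paragraphs that can be shown next to the social keyword to supply the concrete details. A real system would replace the whitespace tokenizer with the Korean morphological analysis described in step (3).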
As the movement to develop private brand (PB) goods grows, big domestic retailers are facing new problems. Studies are therefore needed on PB products and on how consumers come to include PB products in their consideration set. It is also worthwhile, for its implications for marketing strategy, to evaluate how customers who buy PB grocery goods differ in demographic characteristics and purchasing behavior. PB has advantages for both customers and retailers. However, according to ACNielsen's report (2005), PB sales in Asian and emerging markets are about one fifth of those in Western countries. From this result we can assume that emerging markets have the greatest growth potential. Several other studies indicate that it is necessary not only to raise the share of PB products in sales temporarily, but also to analyze the characteristics of customers who use big retailers and to segment them into groups, so that PB products become part of their consideration set. A variety of marketing activities is also needed. Studies on PB report a prejudice that cheap products have low quality, yet evaluations by customers who have used the products are neutral, and one study finds that building trust between the retailers selling PB products and the consumers using them matters most for accurate evaluation and purchase intention. In addition, analyses of the characteristics of customers buying PB products suggest that the higher the income and education level, the stronger the preference for PB products. In particular, according to TNS's research, the primary targets of PB products are people in their 30s, who seek value for money and have planned spending habits, and people in their 40s, who have teenage children and are interested in self-improvement. This paper used a Probit model to analyze the characteristics of consumers.
This model lets us analyze variables representing the demographic characteristics of consumers (gender, age, educational level, occupation, income level, living area) together with variables related to purchasing behavior (frequency of visits to big retailers, average amount spent there, and whether they check which brand made the goods). Data were collected in Seoul and Gyunggi Province for about one month from the beginning of February 2008, through face-to-face interviews (89%) and an online survey (11%). We had assumed that people buy more PB products the more often they go shopping, but the results were not significant for the groups we had expected to be frequent visitors. Although we expected women to buy more PB products than men, gender was not significant. The results do suggest that married people buy more PB goods than singles. Occupation variables were also not significant; although housewives are exposed to supermarkets of all kinds more often than workers, we found no relationship. Moreover, we could not show that the younger generation prefers big retailers more than people in their 50s and 60s. Education level does not affect the purchase of PB products either. For living area, the result on whether or not a customer lives in Seoul differed from what we expected. The results show no relationship between retailer-brand preference and PB purchases, which is consistent with the TNS (2008) finding that customers tend to buy PB products impulsively, regardless of brand and location, even at the big retailer where they usually shop. The variables with significant results were income level and living area.
That is, customers with an average monthly income of 3,000,000 to 6,000,000 won are more willing to buy PB products than customers whose income is over 6,000,000 won, and residents outside Seoul prefer PB goods more than residents of Seoul. To explain further, when the frequency of visits to big retailers is considered alone, the more customers are exposed to PB products, the more often they purchase them. This gives the important insight that large retailers must find ways to make customers visit more often in order to raise the sales share of PB products. In summary, the variables that most affect PB purchases are the demographic variables marital status, income level, and residential area, together with the frequency of visits to large retailers among the purchasing habits. Specifically, married customers rather than singles, middle-income customers rather than high-income customers, and residents outside Seoul rather than residents of Seoul are more likely to purchase PB goods. In addition, when a customer visits two times more, the purchase rate of PB products increases by over 5.3%. Therefore, retailers would do well to make their stores fun and comfortable places to shop. To overcome the idea that PB products are just cheap, one-time purchases, retailers need to build loyalty to these goods as with national brand (NB) products, make PB products part of the consideration set, and generate sustainable sales. In particular, as this paper suggests, there is a strong need to identify the characteristics of customers who prefer PB, segment those customers, select the main target, and position PB products with well-planned marketing strategies.
By identifying differences in customers' lifestyles and shopping habits, this work can then offer meaningful implications for marketing strategy and help develop the field of PB research.
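To make the Probit analysis more concrete, the sketch below shows how such a model turns consumer characteristics into a purchase probability, P(buy PB | x) = Φ(β₀ + x·β), and how a marginal effect such as the effect of extra store visits is read off as φ(β₀ + x·β)·βⱼ. The coefficients, the intercept, and the variable names are hypothetical values chosen only for illustration; they are not the paper's estimates.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def linear_index(x, beta, intercept):
    """Latent index: intercept + sum_j beta_j * x_j."""
    return intercept + sum(b * x[name] for name, b in beta.items())


def probit_probability(x, beta, intercept):
    """P(buys PB | x) = Phi(intercept + x . beta)."""
    return normal_cdf(linear_index(x, beta, intercept))


def marginal_effect(x, beta, intercept, variable):
    """Change in purchase probability per unit change in one variable."""
    return normal_pdf(linear_index(x, beta, intercept)) * beta[variable]


# Hypothetical coefficients for illustration only (not the paper's estimates).
BETA = {
    "married": 0.30,
    "middle_income": 0.25,
    "lives_in_seoul": -0.20,
    "visits_per_month": 0.07,
}
INTERCEPT = -0.50

shopper = {"married": 1, "middle_income": 1, "lives_in_seoul": 0, "visits_per_month": 4}
p = probit_probability(shopper, BETA, INTERCEPT)
me = marginal_effect(shopper, BETA, INTERCEPT, "visits_per_month")
print(f"P(buy PB) = {p:.3f}, marginal effect of one more visit = {me:.3f}")
```

In practice the coefficients would be estimated from the survey data by maximum likelihood with a statistics package. The point of the sketch is only that the sign of each coefficient gives the direction of the effect, while the size of the effect on the purchase rate depends on where the customer sits on the normal curve.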
Purpose: When a new test is introduced or reagents are changed in the laboratory of a medical institution, the characteristics of the test should be analyzed according to procedure and the reagents should be assessed. However, several conditions must be met to perform all the required comparative evaluations: first, enough samples must be prepared for each test, and second, the various reagents subject to comparative evaluation must be supplied. Even when enough comparative evaluations have been done, there is a limit to how well the data variation for the new reagent represents the overall patient data variation, which makes changing a reagent a burden for the laboratory. Because of these difficulties, reagent changes in the laboratory are limited. In order to introduce competitive bidding, our institute conducted a full investigation of radioimmunoassay (RIA) reagents for each test and established the range of reagents usable in the laboratory through comparative evaluations. We share this process here. Materials and Methods: Excluding consignment tests, 20 test items are performed in our laboratory. For each test, all usable RIA reagents were investigated with reference to the external quality control report, and the manual for each reagent was obtained. Each manual was checked for the test method, incubation time, and sample volume needed for the test. A primary selection was then made according to whether the reagent could be used in this laboratory. For each reagent that passed the primary selection, two kits (based on 100 tests) were supplied, and a data correlation test, sensitivity measurement, recovery rate measurement, and dilution test were conducted. The secondary selection was made according to the results of the comparative evaluation. Reagents that passed both the primary and secondary selections were submitted to the competitive bidding list.
When only a single reagent was designated, we submitted an explanatory statement with the data obtained during the primary and secondary selection processes. Results: Reagents were excluded in the primary selection when the turnaround time (TAT) was expected to be delayed or when the large volume of reagent used during the test made it impossible to apply to our equipment. After the primary selection, only one reagent was available for five items (squamous cell carcinoma Ag (SCC Ag), β-human chorionic gonadotropin (β-HCG), vitamin B12, folate, free testosterone); two reagents were available for eight items (CA19-9, CA125, CA72-4, ferritin, thyroglobulin antibody (TG Ab), microsomal antibody (Mic Ab), thyroid stimulating hormone-receptor-antibody (TSH-R-Ab), calcitonin); three reagents were available for five items (triiodothyronine (T3), Free T3, Free T4, TSH, intact parathyroid hormone (intact PTH)); and four reagents were available for two items (carcinoembryonic antigen (CEA), TG). After the secondary selection, only one reagent was available for eight items (ferritin, TG, CA19-9, SCC, β-HCG, vitamin B12, folate, free testosterone); two reagents were available for seven items (TG Ab, Mic Ab, TSH-R-Ab, CA125, CA72-4, intact PTH, calcitonin); and three reagents were available for five items (T3, Free T3, Free T4, TSH, CEA). Reagents were excluded in the secondary selection because of insufficient reagent supply for comparative evaluation, problems with data reproducibility, or unacceptable data variation. The most problematic part of the comparative evaluations was sample collection. Collection was not a problem when the test was frequently requested and the volume needed per test was small. It was difficult to collect samples across a range of concentrations for tests with few requests (100 cases per month or fewer), and it was difficult to conduct a recovery rate test when a single test required a relatively large sample volume (more than 100 uL).
In addition, the lack of diluent or zero standard for sensitivity measurement or dilution tests was one of the problems. Conclusion: Comparative evaluation for changing test reagents requires adequate preparation time to collect diverse and sufficient samples. In addition, setting the range of total sample volume and reagent volume required for the comparative evaluations, based on the sample volume and reagent volume required for one test, will reduce the burden of sample collection and of planning each comparative evaluation.
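The comparative evaluations named in the methods (data correlation, recovery rate, and dilution tests) reduce to simple arithmetic. The sketch below uses the commonly used textbook definitions, which the abstract itself does not spell out, so the formulas, function names, and example values should be read as assumptions rather than as the laboratory's actual procedure.

```python
import math


def recovery_rate(spiked_result, base_result, added_amount):
    """Percent recovery = (spiked - base) / added * 100."""
    return (spiked_result - base_result) / added_amount * 100.0


def dilution_recovery(measured, undiluted, dilution_factor):
    """Percent of the expected value after dilution: measured * factor / undiluted * 100."""
    return measured * dilution_factor / undiluted * 100.0


def pearson_r(xs, ys):
    """Pearson correlation between results from the current and candidate reagents."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical measurements, for illustration only.
current_reagent = [1.2, 3.4, 5.1, 8.0, 12.5]
candidate_reagent = [1.3, 3.3, 5.4, 7.8, 12.9]

print(f"correlation r     = {pearson_r(current_reagent, candidate_reagent):.3f}")
print(f"recovery rate     = {recovery_rate(14.5, 5.0, 10.0):.1f}%")
print(f"dilution recovery = {dilution_recovery(24.0, 100.0, 4):.1f}%")
```

Each recovery or dilution check consumes extra sample volume beyond the routine test, which is why the conclusion recommends fixing the total sample and reagent volume needed per evaluation in advance.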
This project was a service-cum-research effort with a quasi-experimental study design to examine the health benefits of an integrated Family Planning (FP)/Maternal & Child Health (MCH) service approach that provides crucial factors missing from the present ongoing programs. The specific objectives were: 1) To test the effectiveness of trained nurse/midwives (MW) assigned as change agents in the Health Sub-Center (HSC) in bringing about changes in eight FP/MCH indicators, namely: (i) FP/MCH contacts between field workers and their clients, (ii) the use of effective FP methods, (iii) the inter-birth interval and/or open interval, (iv) prenatal care by medically qualified personnel, (v) medically supervised deliveries, (vi) the rate of induced abortion, (vii) maternal and infant morbidity, and (viii) perinatal & infant mortality. 2) To measure the integrative linkage (contacts) between MW & HSC workers and between HSC and clients. 3) To examine the organizational or administrative factors influencing integrative linkage between health workers. Study design: The above objectives called for a quasi-experimental design with a study area and a control area, with and without a midwife respectively. An active intervention program (the FP/MCH minimum 'package' program) was conducted for a 2-year period from June 1982 to July 1984 in Seosan County, and 'before and after' surveys were conducted to measure the change. Service input: This study was undertaken by Soonchunhyang University in collaboration with WHO. After a baseline survey in 1981, trained nurse/midwives were introduced into two health sub-centers in a rural setting (Seosan County) for a 2-year period from 1982 to 1984. A major service input was the establishment of midwifery services in the existing health delivery system, with emphasis on the nurse/midwife's role as the link between health workers (nurse aides) and village health workers, and the referral of at-risk patients to a private physician (OBGY specialist).
An evaluation survey was made in August 1984 to assess the effectiveness of this alternative integrated approach in the study areas in comparison with the control area, which had normal government services. Method of evaluation: a. The primary objective of this study was to examine to what extent the FP/MCH package program brought about changes in the eight pre-determined indicators (outcome and impact measures), and the following relationship was analyzed first; b. Nevertheless, this project did not automatically accept the assumption that integrating two or more activities would produce better results than a non-integrated or categorical program. There is a need to assess the 'integration process' itself within the package program. The process of integration was measured in terms of interactive linkages, that is, the quantity & quality of contacts between workers & clients and among workers. Integrative linkages were hypothesized to be influenced by organizational factors at the HSC clinic level, including HSC goals, structure, authority, leadership style, resources, and personal characteristics of HSC staff. The extent or degree of integration, as measured by the intensity of integrative linkages, was in turn presumed to influence programme performance. Thus, as indicated diagrammatically below, organizational factors constituted the independent variables, integration the intervening variable, and programme performance with respect to family planning and health services the dependent variable: Concerning organizational factors, however, because of the limited number of HSCs (2 in the study area and 3 in the control area), they were studied through participatory observation by an anthropologist who was independent of the project. In this observation, we examined whether the assumed integration process actually occurred and, if not, what the constraints were in producing an effective integration process.
Summary of Findings: A) Program effects and impact 1. Effects on FP use: During this 2-year action period, FP acceptance increased from 58% in 1981 to 78% in 1984 in both the study and control areas. This increase in both areas was mainly due to the new family planning campaign run by the Government during the same period. Therefore, there was no increase in the FP acceptance rate attributable to the additional input of MW to the ongoing FP program. In the study area, however, quality aspects of FP were somewhat improved, with a better continuation rate for IUDs & pills and more use of effective contraceptive methods than in the control area. 2. Effects on use of MCH services: Between the study and control areas, however, there was a significant difference in maternal and child health care. For example, the coverage of prenatal care increased from 53% for the 1981 birth cohort to 75% for the 1984 birth cohort in the study area. In the control area, it increased from 41% (1981) to 65% (1984). It is noteworthy that almost two thirds of the recent birth cohort received prenatal care even in the control area, indicating a growing demand for MCH care as the family size norm becomes smaller. 3. There has been a substantive increase in delivery care by medical professionals in the study area, with an annual increase rate of 10% due to the midwives' input. The project had about two times greater effect on postnatal care (68% vs. 33%); for delivery care the figures were 45.2% vs. 26.1%. 4. The study area had better reproductive efficiency (wanted pregnancies with FP practice & healthy live births surviving to one year of age) than the control area, especially among women under 30 (14.1% vs. 9.6%). The proportion of women who preferred the 1st trimester for their first prenatal care rose significantly in the study area as compared to the control area (24% vs. 13%). B) Effects on Interactive Linkage 1.
This project contributed several useful steps in the direction of service integration, namely: i) the health workers have become familiar with procedures for working together with each other (especially with a midwife) in carrying out their FP/MCH work, and ii) the health workers have come to appreciate the usefulness of family health records (statistical integration) in identifying targets in their own work and in caring for family health. 2. On the other hand, because of a lack of required organizational factors, complete linkage was not obtained as the project intended. i) With regard to the government health workers' home-visiting activities, there was not much difference between the study & control areas, though the MW did more home visiting than government health workers. ii) In assessing the service performance of MW & health workers, the midwives balanced their workload between 40% FP, 40% MCH & 20% other activities (mainly immunization). However,