• Title/Summary/Keyword: system use

Image Watermarking for Copyright Protection of Images on Shopping Mall (쇼핑몰 이미지 저작권보호를 위한 영상 워터마킹)

  • Bae, Kyoung-Yul
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.147-157 / 2013
  • With the introduction of high-speed networks and the advent of a digital environment that can be accessed anytime and anywhere, the free distribution and use of digital content became possible. Ironically, this environment has given rise to a variety of copyright infringements, and product images used in online shopping malls are frequently pirated. Whether shopping mall images are creative works remains controversial. According to a Supreme Court decision in 2001, advertising photographs of ham products merely reproduced the appearance of the objects and were not creative expression; the photographer's losses were nevertheless recognized, and damages were estimated from the typical cost of an advertising photo shoot. According to Seoul District Court precedents in 2003, a work should be protected by copyright law if the photographer's personality and creativity are present in the selection of the subject, the composition of the set, the direction and amount of light, the camera angle, the shutter speed, the shutter chance, other shooting methods, and the developing and printing process. To receive copyright protection under the law, a shopping mall image must therefore do more than convey the state of the product: it takes effort for the photographer's personality and creativity to be recognizable in it. Accordingly, the cost of producing mall images increases, and so does the need for copyright protection. Product images in online shopping malls have a very distinctive composition, unlike general pictures such as portraits and landscape photos, and therefore general image watermarking techniques cannot satisfy their watermarking requirements.
Because the background of product images commonly used in shopping malls is white, black, or a grayscale gradient, it is difficult to use that space to embed a watermark, and the area is very sensitive to even a slight change. In this paper, the characteristics of images used in shopping malls are analyzed and a watermarking technology suited to them is proposed. The proposed technology divides a product image into small blocks, transforms the blocks by DCT (Discrete Cosine Transform), and inserts the watermark information by quantizing the DCT coefficients. Because uniform quantization of the DCT coefficients causes visible blocking artifacts, the proposed algorithm uses a weighted mask that quantizes finely the coefficients located at block boundaries and coarsely the coefficients located in the center area of the block. This mask improves the subjective visual quality as well as the objective quality of the images. In addition, to improve the security of the algorithm, the blocks in which the watermark is embedded are selected randomly, and a turbo code is used to reduce the BER when extracting the watermark. The PSNR (Peak Signal to Noise Ratio) of shopping mall images watermarked by the proposed algorithm is 40.7~48.5 dB, and the BER (Bit Error Rate) after JPEG compression with QF = 70 is 0. This means that the watermarked image is of high quality and that the algorithm is robust to the JPEG compression generally used by online shopping malls. The BER is also 0 for a 40% change in size and for 40 degrees of rotation. Shopping malls generally use compressed images with a QF higher than 90. Because pirated images are replicated from the original image, the proposed algorithm can identify copyright infringement in most cases. As the experimental results show, the proposed algorithm is suitable for shopping mall images with a simple background.
However, future work should enhance the robustness of the proposed algorithm, because some robustness is lost after the mask process.
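
The block-DCT quantization idea described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the 8x8 block size, the quantization step, the coefficient position `pos`, and all function names are hypothetical, and the weighted mask and turbo code mentioned above are omitted. Each bit is stored as the parity of one quantized DCT coefficient in a block chosen pseudo-randomly from a secret key.

```python
import numpy as np

BLOCK = 8  # assumed block size; the abstract does not state it


def dct_matrix(n=BLOCK):
    """Orthonormal DCT-II matrix, so the inverse transform is its transpose."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


D = dct_matrix()


def chosen_blocks(shape, count, key):
    """Pick `count` block origins pseudo-randomly from a secret integer key."""
    rows, cols = shape[0] // BLOCK, shape[1] // BLOCK
    order = np.random.default_rng(key).permutation(rows * cols)[:count]
    return [(int(b // cols) * BLOCK, int(b % cols) * BLOCK) for b in order]


def embed(image, bits, key, step=16.0, pos=(2, 1)):
    """Embed one bit per selected block by quantizing one DCT coefficient."""
    out = image.astype(float).copy()
    for (r, c), bit in zip(chosen_blocks(image.shape, len(bits), key), bits):
        coeffs = D @ out[r:r + BLOCK, c:c + BLOCK] @ D.T
        # Move the coefficient to the nearest multiple of `step` whose
        # quantization index has the same parity as the bit.
        index = 2 * np.round((coeffs[pos] / step - bit) / 2) + bit
        coeffs[pos] = index * step
        out[r:r + BLOCK, c:c + BLOCK] = D.T @ coeffs @ D
    return out  # floating-point image; no clipping or rounding is applied


def extract(image, n_bits, key, step=16.0, pos=(2, 1)):
    """Read each bit back as the parity of the quantization index."""
    bits = []
    for r, c in chosen_blocks(image.shape, n_bits, key):
        coeffs = D @ image[r:r + BLOCK, c:c + BLOCK].astype(float) @ D.T
        bits.append(int(np.round(coeffs[pos] / step)) % 2)
    return bits


def psnr(original, marked, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((original.astype(float) - marked.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)
```

A larger `step` makes the embedded bit survive stronger distortion at the cost of a lower PSNR, which is the quality/robustness trade-off the abstract reports.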

Preparation of Vitamin E Acetate Nano-emulsion and In Vitro Research Regarding Vitamin E Acetate Transdermal Delivery System which Use Franz Diffusion Cell (Vitamin E Acetate를 함유한 Nano-emulsion 제조와 Franz Diffusion Cell을 이용한 Vitamin E Acetate의 경표피 흡수에 관한 In Vitro 연구)

  • Park, Soo-Nam;Kim, Jai-Hyun;Yang, Hee-Jung;Won, Bo-Ryoung;Ahn, You-Jin;Kang, Myung-Kyu
    • Journal of the Society of Cosmetic Scientists of Korea / v.35 no.2 / pp.91-101 / 2009
  • In this study, a stable skin-toner-type nano-emulsion was prepared containing vitamin E acetate (VEA, tocopheryl acetate), a lipid-soluble vitamin that is widely used in the cosmetics and medical supply fields as an antioxidant. To evaluate skin permeation, experiments were performed on VEA permeation through the skin of ICR outbred albino mice (12 weeks, about 50 g, female) and on differences in solubility as a function of receptor formulation. Analysis of nano-emulsions containing 0.07 % VEA showed that the particles became larger as the ethanol content increased and smaller as the surfactant content increased. A certain content of ethanol in the receptor phase increased VEA solubility. When the ethanol contents were 10.0 % and 20.0 %, VEA solubility was higher than at 5.0 % and 40.0 %, respectively. The type of surfactant in the receptor solution also influenced VEA solubility. A comparison of three surfactants with different chemical structures and HLB values showed that VEA solubility followed the order sorbitan sesquioleate (Arlacel 83; HLB 3.7) > POE (10) hydrogenated castor oil (HCO-10; HLB 6.5) > sorbitan monostearate (Arlacel 60; HLB 4.7). VEA solubility also differed according to the type of antioxidant. Early on, the solubility of the sample including ascorbic acid was similar to that of samples including other types of antioxidants. After 24 h, however, the solubility of the sample including ascorbic acid was 2 times higher than that of the others. A Franz diffusion cell experiment using mouse skin was performed with four nano-emulsion samples that had different VEA contents. The emulsion with 10 wt% ethanol was the most permeable, with a permeated amount of 128.8 ${\mu}g/cm^2$.
Compared with the initial input of 220.057 ${\mu}g/cm^2$, the permeated amount at 10 % ethanol content was 58.53 %, which was 45.0 % and 15.0 % higher than at ethanol contents of 1.0 and 20.0 wt%, respectively. The emulsion particle size with 0.5 % surfactant (HCO-60) was 26.0 nm, about one twentieth of the size obtained with 0.007 % surfactant (HCO-60) at the same ethanol content. Transepidermal permeation of VEA was 54.848 ${\mu}g/cm^2$, which is smaller than that at a particle size of 590.7 nm. From these results, the skin permeation of nano-emulsions containing VEA and the differences in VEA solubility as a function of receptor phase formulation were determined. These results are considered to establish optimal conditions for the transepidermal permeation of VEA.
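
The 58.53 % figure can be checked directly from the two amounts quoted in the abstract (the permeated amount at 10 wt% ethanol and the initial input); as a quick arithmetic check:

$$\frac{128.8\ {\mu}g/cm^2}{220.057\ {\mu}g/cm^2} \times 100\,\% \approx 58.53\,\%$$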

The Effect of Nasal BiPAP Ventilation in Acute Exacerbation of Chronic Obstructive Airway Disease (만성 기도폐쇄환자에서 급성 호흡 부전시 BiPAP 환기법의 치료 효과)

  • Cho, Young-Bok;Kim, Ki-Beom;Lee, Hak-Jun;Chung, Jin-Hong;Lee, Kwan-Ho;Lee, Hyun-Woo
    • Tuberculosis and Respiratory Diseases / v.43 no.2 / pp.190-200 / 1996
  • Background : Mechanical ventilation is the last therapeutic resort for acute respiratory failure when oxygen therapy and medical treatment fail to improve the patient's respiratory status. This invasive ventilation, classically administered by endotracheal intubation or by tracheostomy, is associated with significant mortality and morbidity. Consequently, any less invasive method that avoids endotracheal ventilation would appear to be useful in high-risk patients. Over recent years, the efficacy of nasal mask ventilation has been demonstrated in the treatment of chronic restrictive respiratory failure, particularly in patients with neuromuscular diseases. More recently, this method has been used successfully in the treatment of acute respiratory failure due to parenchymal disease. Method : We assessed the efficacy of bilevel positive airway pressure (BiPAP) in the treatment of acute exacerbation of chronic obstructive pulmonary disease (COPD). This study prospectively evaluated the clinical effectiveness of a treatment schedule with positive pressure ventilation via nasal mask (Respironics BiPAP device) in 22 patients with acute exacerbations of COPD. Eleven patients were treated for 3 consecutive days with nasal pressure support ventilation delivered via a nasal ventilatory support system plus standard treatment. An additional 11 control patients received standard treatment only. The standard treatment consisted of medical and oxygen therapy. Nasal BiPAP was delivered by a pressure support ventilator in spontaneous timed mode at an inspiratory positive airway pressure of $6-8cmH_2O$ and an expiratory positive airway pressure of $3-4cmH_2O$. Patients were evaluated by physical examination (respiratory rate), modified Borg scale, and arterial blood gas analysis before and after the acute therapeutic intervention.
Results : Before treatment and after 3 days of treatment, mean $PaO_2$ was 56.3mmHg and 79.1mmHg (p<0.05) in the BiPAP group and 56.9mmHg and 70.2mmHg (p<0.05) in the conventional treatment (CT) group, and $PaCO_2$ was 63.9mmHg and 56.9mmHg (p<0.05) in the BiPAP group and 53mmHg and 52.8mmHg in the CT group, respectively. pH was 7.36 and 7.41 (p<0.05) in the BiPAP group and 7.37 and 7.38 in the CT group, respectively. Before and after treatment, mean respiratory rate was 28 and 23 breaths/min in the BiPAP group and 25 and 20 breaths/min in the CT group, respectively. Borg scale was 7.6 and 4.7 in the BiPAP group and 6.4 and 3.8 in the CT group, respectively. There were significant differences between the two groups in the changes of mean $PaO_2$, $PaCO_2$ and pH. Conclusion : We conclude that short-term nasal pressure-support ventilation delivered via nasal BiPAP is an efficient mode of assisted ventilation for improving blood gas values and dyspnea sensation in the treatment of acute exacerbation of COPD, and may reduce the need for endotracheal intubation with mechanical ventilation.

Evaluation of Tuberculosis Activity in Patients with Anthracofibrosis by Use of Serum Levels of IL-2 $sR{\alpha}$, IFN-${\gamma}$ and TBGL(Tuberculous Glycolipid) Antibody (Anthracofibrosis의 결핵활동성 지표로서 혈청 IL-2 $sR{\alpha}$, IFN-${\gamma}$, 그리고 TBGL(tuberculous glycolipid) antibody 측정의 의의)

  • Jeong, Do Young;Cha, Young Joo;Lee, Byoung Jun;Jung, Hye Ryung;Lee, Sang Hun;Shin, Jong Wook;Kim, Jae-Yeol;Park, In Won;Choi, Byoung Whui
    • Tuberculosis and Respiratory Diseases / v.55 no.3 / pp.250-256 / 2003
  • Background : Anthracofibrosis, a descriptive term for multiple black pigmentation with fibrosis on bronchoscopic examination, has a close relationship with active tuberculosis (TB). However, in some cases of anthracofibrosis, TB activity is determined only at a later stage from the TB culture results. Therefore, it is necessary to identify early markers of TB activity in anthracofibrosis. There have been several reports investigating the serum levels of IL-2 $sR{\alpha}$, IFN-${\gamma}$ and TBGL antibody for the evaluation of TB activity. In the present study, we measured these serologic markers to evaluate TB activity in patients with anthracofibrosis. Methods : Anthracofibrosis was defined as deep pigmentation (in more than two lobar bronchi) with fibrotic stenosis of the bronchi on bronchoscopic examination. Serum from patients with anthracofibrosis was collected and stored under refrigeration before the start of anti-TB medication. Serum from healthy volunteers (N=16) and from patients with active TB before (N=22) and after (N=13) 6 months of medication was also collected and stored. Serum IL-2 $sR{\alpha}$ and IFN-${\gamma}$ were measured with an ELISA kit (R&D system, USA), and serum TBGL antibody was measured with a TBGL EIA kit (Kyowa Inc, Japan). Results : Serum levels of IL-2 $sR{\alpha}$ in healthy volunteers, active TB patients before and after medication, and patients with anthracofibrosis were $640{\pm}174$, $1,611{\pm}2,423$, $953{\pm}562$, and $863{\pm}401$ pg/ml, respectively. Serum IFN-${\gamma}$ levels were 0, $8.16{\pm}17.34$, $0.70{\pm}2.53$, and $2.33{\pm}6.67$ pg/ml, and TBGL antibody levels were $0.83{\pm}0.80$, $5.91{\pm}6.71$, $6.86{\pm}6.85$, and $3.22{\pm}2.59$ U/ml, respectively. The serum level of TBGL antibody in healthy volunteers was lower than in the other groups (p<0.05). There was no significant difference in serum IL-2 $sR{\alpha}$ and IFN-${\gamma}$ levels among the four groups.
Conclusion : The serum levels of IL-2 $sR{\alpha}$, IFN-${\gamma}$ and TBGL antibody were not useful in the evaluation of TB activity in patients with anthracofibrosis. More useful ways need to be developed for the differentiation of active TB in patients with anthracofibrosis.

Clinical and radiographic evaluation of Neoplan® implant with a sandblasted and acid-etched surface and external connection (SLA 표면 처리 및 외측 연결형의 국산 임플랜트에 대한 임상적, 방사선학적 평가)

  • An, Hee-Suk;Moon, Hong-Suk;Shim, Jun-Sung;Cho, Kyu-Sung;Lee, Keun-Woo
    • The Journal of Korean Academy of Prosthodontics / v.46 no.2 / pp.125-136 / 2008
  • Statement of problem: Since the concept of osseointegration in dental implants was introduced by Brånemark et al, high long-term success rates have been achieved. Though the use of dental implants has increased dramatically, there are few studies on domestic implants with clinical and objective long-term data. Purpose: The aim of this retrospective study was to provide long-term data on the Neoplan® implant, which features a sandblasted and acid-etched surface and external connection. Material and methods: 96 Neoplan® implants placed in 25 patients at Yonsei University Hospital were examined through clinical and radiographic results over an 18 to 57 month period to determine the effect of various factors on marginal bone loss. Results: 1. Out of a total of 96 implants placed in 25 patients, two fixtures were lost, resulting in a cumulative survival rate of 97.9%. 2. Throughout the study period, the survival rates were 96.8% in the maxilla and 98.5% in the mandible. The survival rates were 97.6% in the posterior regions and 100% in the anterior regions. 3. The mean bone loss for the first year after prosthesis placement and the mean annual bone loss after the first year were significantly higher in men than in women (P<0.05). 4. The group of partial edentulism with no posterior teeth distal to the implant prosthesis showed significantly more bone loss than the group of partial edentulism with posterior teeth distal to the implant prosthesis, in terms of mean bone loss both for the first year and after the first year (P<0.05). 5. The mean annual bone loss after the first year was more pronounced in posterior regions than in anterior regions (P<0.05). 6. No significant difference in marginal bone loss was found for the following factors: jaw, type of prosthesis, type of opposing dentition, and submerged/non-submerged implants (P>0.05).
Conclusion: On the basis of these results, the factors influencing marginal bone loss were gender, type of edentulism, and location in the arch, while factors such as arch, type of prosthesis, type of opposing dentition, and submerged/non-submerged implants had no significant effect on bone loss. In the present study, the cumulative survival rate of the Neoplan® implant with a sandblasted and acid-etched surface was 97.9% over a period of up to 57 months. Further long-term investigations of this type of implant system and evaluations of other domestic implant systems are needed in future studies.

Impact of Semantic Characteristics on Perceived Helpfulness of Online Reviews (온라인 상품평의 내용적 특성이 소비자의 인지된 유용성에 미치는 영향)

  • Park, Yoon-Joo;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.29-44 / 2017
  • In Internet commerce, consumers are heavily influenced by product reviews written by other users who have already purchased the product. However, as product reviews accumulate, it takes a lot of time and effort for consumers to individually check the massive number of reviews. Moreover, carelessly written product reviews actually inconvenience consumers. Thus many online vendors provide mechanisms to identify reviews that customers perceive as most helpful (Cao et al. 2011; Mudambi and Schuff 2010). For example, some online retailers, such as Amazon.com and TripAdvisor, allow users to rate the helpfulness of each review, and use this feedback to rank and re-order the reviews. However, many reviews have only a little feedback or none at all, which makes it hard to identify their helpfulness. It also takes time for feedback to accumulate, so newly authored reviews do not have enough of it. For example, only 20% of the reviews in Amazon Review Dataset (Mcauley and Leskovec, 2013) have more than 5 reviews (Yan et al, 2014). The purpose of this study is to analyze the factors affecting the usefulness of online product reviews and to derive a forecasting model that selectively provides product reviews that can be helpful to consumers. To do this, we extracted the various linguistic, psychological, and perceptual elements included in product reviews using text-mining techniques and identified the determinants among these elements that affect the usefulness of product reviews. In particular, considering that the characteristics of product reviews and the determinants of usefulness can differ between apparel products (which are experience goods) and electronic products (which are search goods), the characteristics of the product reviews were compared within each product group and the determinants were established for each. This study used 7,498 apparel product reviews and 106,962 electronic product reviews from Amazon.com.
To understand a review text, we first extract linguistic and psychological characteristics, such as word count and the level of emotional tone and analytical thinking embedded in the text, using the widely adopted text analysis software LIWC (Linguistic Inquiry and Word Count). Then, we explore the descriptive statistics of the review text for each category and statistically compare their differences using a t-test. Lastly, we perform regression analysis using the data mining software RapidMiner to find the determinant factors. Comparing the review characteristics of electronic products and apparel products, we found that reviewers used more words as well as longer sentences when writing reviews for electronic products. As for content characteristics, the electronic product reviews included many analytic words, carried more clout, and related to cognitive processes (CogProc) more than the apparel product reviews did, in addition to including many words expressing negative emotions (NegEmo). On the other hand, the apparel product reviews included more personal, authentic, positive emotions (PosEmo) and perceptual processes (Percept) compared to the electronic product reviews. Next, we analyzed the determinants of the usefulness of product reviews in the two product groups. As a result, it was found that, in both product groups, the reviews perceived as useful were those with high product ratings from reviewers, a larger number of total words, many expressions involving perceptual processes, and fewer negative emotions. In addition, apparel product reviews with a large number of comparative expressions, a low expertise index, and concise content with fewer words in each sentence were perceived to be useful.
In the case of electronic product reviews, those that were analytical with a high expertise index, along with containing many authentic expressions, cognitive processes, and positive emotions (PosEmo) were perceived to be useful. These findings are expected to help consumers effectively identify useful product reviews in the future.

A Study on the Regional Characteristics of Broadband Internet Termination by Coupling Type using Spatial Information based Clustering (공간정보기반 클러스터링을 이용한 초고속인터넷 결합유형별 해지의 지역별 특성연구)

  • Park, Janghyuk;Park, Sangun;Kim, Wooju
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.45-67 / 2017
  • According to the Internet Usage Research performed in 2016, the number of internet users and the amount of internet usage have been increasing. The smartphone is taking a more dominant role than the computer as an internet access device. Some take the view that the demand for high-speed internet will decrease as the number of smart devices grows; however, despite the increase in smart devices, the high-speed internet market is expected to grow slightly for a while owing to the speedup of Giga Internet and the growth of the IoT market. As the broadband internet market saturates, telecom operators are competing excessively to win new customers, but if they knew the causes of customer exit, they could be expected to reduce marketing costs through more effective marketing. In this study, we analyzed the relationship between the cancellation rates of telecommunication products and the factors affecting them by combining a telecommunication company's data for three cities, Anyang, Gunpo, and Uiwang, with regional data from KOSIS (Korean Statistical Information Service). In particular, assuming that neighboring areas affect the distribution of cancellation rates by coupling type, we conducted spatial cluster analysis on the three types of cancellation rates of each region using the spatial analysis tool SaTScan and analyzed the various relationships between the cancellation rates and the regional data. In the analysis phase, we first summarized the characteristics of the clusters derived by combining spatial information and the cancellation data. Next, based on the results of the cluster analysis, analysis of variance, correlation analysis, and regression analysis were used to analyze the relationship between the cancellation rate data and the regional data. Based on the results of the analysis, we proposed appropriate marketing methods for each region.
Unlike previous studies on regional characteristics, this study is academically distinctive in that it performs clustering based on spatial information, so that adjacent regions with similar cancellation types are grouped together. In addition, few previous studies on the determinants of subscription to high-speed internet services have considered regional characteristics; in this study, we analyzed the relationship between the clusters and the regional characteristics data, assuming that different factors are at work in different regions. We also sought more efficient marketing methods for new subscriptions and customer management in high-speed internet that take the characteristics of each region into account. Analysis of variance confirmed that there were significant differences in regional characteristics among the clusters. Correlation analysis showed a stronger correlation within the clusters than across all regions. Regression analysis was used to analyze the relationship between the cancellation rate and the regional characteristics. As a result, we found that the cancellation rate differs depending on the regional characteristics and that it is possible to target differentiated marketing at each region. The biggest limitation of this study is that it was difficult to obtain enough data to carry out the analysis. In particular, it is difficult to find variables that represent regional characteristics at the Dong level. Most of the data was disclosed at the city level rather than the Dong level, which limited detailed analysis. Data such as income, card usage information, and telecommunications company policies or characteristics that could affect cancellation were not available at the time. The most urgent requirement for a more sophisticated analysis is to obtain Dong-level data on regional characteristics.
Future studies should carry out target marketing based on these results. It would also be meaningful to analyze the effect of marketing by comparing the results before and after target marketing. Using clusters based on new subscription data as well as cancellation data would also be effective.

Analysis of the Time-dependent Relation between TV Ratings and the Content of Microblogs (TV 시청률과 마이크로블로그 내용어와의 시간대별 관계 분석)

  • Choeh, Joon Yeon;Baek, Haedeuk;Choi, Jinho
    • Journal of Intelligence and Information Systems / v.20 no.1 / pp.163-176 / 2014
  • Social media is becoming the platform for users to communicate their activities, status, emotions, and experiences to other people. In recent years, microblogs such as Twitter have gained in popularity because of their ease of use, speed, and reach. Compared to a conventional web blog, a microblog lowers users' effort and investment in content generation by encouraging shorter posts. There has been a lot of research into capturing social phenomena and analyzing the chatter of microblogs. However, measuring television ratings has been given little attention so far. Currently, the most common method of measuring TV ratings uses an electronic metering device installed in a small number of sampled households. Microblogs allow users to post short messages, share daily updates, and conveniently keep in touch. In a similar way, microblog users interact with each other while watching television or movies, or visiting a new place. For measuring TV ratings, some features are significant during certain hours of the day or days of the week, whereas the same features are meaningless during other time periods. Thus, the importance of features can change during the day, and a model capturing this time-sensitive relevance is required to estimate TV ratings. Modeling the time-related characteristics of features should therefore be key when measuring TV ratings through microblogs. We show that capturing the time-dependency of features is vital for improving the accuracy of TV ratings measurement. To explore the relationship between the content of microblogs and TV ratings, we collected Twitter data using the Get Search component of the Twitter REST API from January 2013 to October 2013. There are about 300 thousand posts in our data set for the experiment. After excluding data such as advertising or promoted tweets, we selected 149 thousand tweets for analysis.
The number of tweets reaches its maximum level on the broadcasting day and increases rapidly around the broadcasting time. This result stems from the characteristics of the public channel, which broadcasts the program at a predetermined time. From our analysis, we find that count-based features such as the number of tweets or retweets have a low correlation with TV ratings. This result implies that a simple tweet rate does not reflect the satisfaction with or response to the TV programs. Content-based features extracted from the content of tweets have a relatively high correlation with TV ratings. Further, some emoticons or newly coined words that are not tagged in the morpheme extraction process have a strong relationship with TV ratings. We find that there is a time-dependency in the correlation of features between the periods before and after the broadcasting time. Since the TV program is broadcast regularly at a predetermined time, users post tweets expressing their expectation for the program or disappointment over not being able to watch it. The features that are highly correlated before the broadcast differ from those after the broadcast. This result shows that the relevance of words to TV programs can change according to the time of the tweets. Among the 336 words that fulfill the minimum requirements for candidate features, 145 words have their highest correlation before the broadcasting time, whereas 68 words reach their highest correlation after broadcasting. Interestingly, some words that express the impossibility of watching the program show a high relevance, despite containing a negative meaning. Understanding the time-dependency of features can help improve the accuracy of TV ratings measurement. This research provides a basis for estimating the response to or satisfaction with broadcast programs using the time dependency of words in Twitter chatter.
More research is needed to refine the methodology for predicting or measuring TV ratings.
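
The time-dependency check described in this abstract (whether a word's frequency tracks ratings better before or after the broadcast) can be illustrated with a small sketch. The window labels, the per-episode counts, and the function names are hypothetical, and plain Pearson correlation is an assumption; the abstract does not say which correlation measure the authors used.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])


def peak_window(counts_by_window, ratings):
    """Return (window, r) for the time window whose per-episode word counts
    correlate most strongly (by absolute value) with the episode ratings."""
    scores = {w: pearson(c, ratings) for w, c in counts_by_window.items()}
    best = max(scores, key=lambda w: abs(scores[w]))
    return best, scores[best]


# Hypothetical data: one rating per episode, and how often a single word
# appeared in tweets posted before and after each episode aired.
ratings = [5.0, 7.0, 6.0, 9.0, 8.0]
word_counts = {
    "before": [10, 14, 12, 18, 16],
    "after": [3, 1, 4, 1, 5],
}
window, r = peak_window(word_counts, ratings)
print(window, round(r, 3))
```

Running this per candidate word and tallying the winning window is one way to arrive at counts like the 145 "before" and 68 "after" words reported above.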

Development of the Accident Prediction Model for Enlisted Men through an Integrated Approach to Datamining and Textmining (데이터 마이닝과 텍스트 마이닝의 통합적 접근을 통한 병사 사고예측 모델 개발)

  • Yoon, Seungjin;Kim, Suhwan;Shin, Kyungshik
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.1-17 / 2015
  • In this paper, we report what we have observed with regard to a prediction model for the military based on enlisted men's internal data (cumulative records) and external data (SNS data). This work is significant for the military's efforts to supervise them. In spite of their efforts, many commanders have failed to prevent accidents by their subordinates. One of the important duties of officers is to take care of their subordinates and prevent unexpected accidents. However, it is hard to prevent accidents, so we must attempt to determine a proper method. Our motivation for presenting this paper is to make it possible to predict accidents using enlisted men's internal and external data. The biggest issue facing the military is the occurrence of accidents by enlisted men related to maladjustment and the relaxation of military discipline. The core method of preventing accidents by soldiers is to identify problems and manage them quickly. Commanders predict accidents by interviewing their soldiers and observing their surroundings. This requires considerable time and effort, and the results differ significantly depending on the capabilities of the commanders. In this paper, we seek to predict accidents with objective data that can easily be obtained. Recently, records of enlisted men, as well as SNS communication between commanders and soldiers, have made it possible to predict and prevent accidents. This paper concerns the application of data mining to identify their interests, predict accidents, and make use of internal and external data (SNS). We propose both a topic analysis and a decision tree method. The study is conducted in two steps. First, topic analysis is conducted on the SNS of enlisted men. Second, the decision tree method is used to analyze the internal data together with the results of the first analysis. The dependent variable for these analyses is the presence of any accidents.
To analyze their SNS, we require tools such as text mining and topic analysis. We used SAS Enterprise Miner 12.1, which provides a text miner module. Our approach for finding their interests is composed of three main phases: collecting, topic analysis, and converting the topic analysis results into points for use as independent variables. In the first phase, we collect enlisted men's SNS data by commander's ID. After the unstructured SNS data is gathered, the topic analysis phase extracts issues from it. For simplicity, 5 topics (vacation, friends, stress, training, and sports) are extracted from 20,000 articles. In the third phase, using these 5 topics, we quantify them as personal points. After quantifying their topics, we include these results in the independent variables, which are composed of 15 internal data sets. Then, we make two decision trees. The first tree is composed of their internal data only. The second tree is composed of their external data (SNS) as well as their internal data. After that, we compare the misclassification results from SAS E-miner. The first model's misclassification is 12.1%. On the other hand, the second model's misclassification is 7.8%. This method predicts accidents with an accuracy of approximately 92%. The gap between the two models is 4.3 percentage points. Finally, we test whether the difference between them is meaningful, using the McNemar test. The result of the test is considered significant (p-value: 0.0003). This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of enlisted men's data. Additionally, various independent variables used in the decision tree model are used as categorical variables instead of continuous variables, so the model suffers a loss of information. In spite of extensive efforts to provide prediction models for the military, commanders' predictions are accurate only when they have sufficient data about their subordinates.
Our proposed methodology can provide support to decision-making in the military. This study is expected to contribute to the prevention of accidents in the military based on scientific analysis of enlisted men and proper management of them.
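
The model comparison described in this abstract can be illustrated with a small sketch of an exact McNemar test, which looks only at the cases where the two models disagree. This is an assumption-laden illustration, not the authors' procedure: the abstract does not say which McNemar variant was used or how many discordant cases there were, and the function names and toy labels below are hypothetical.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count cases where exactly one of the two models is correct.

    Returns (b, c): b = model A right and model B wrong,
                    c = model A wrong and model B right.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    Under the null hypothesis, each discordant case is equally likely
    to favour either model, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree, so there is no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy labels (hypothetical, not the study's data): 1 = accident, 0 = no accident.
y_true = [1, 1, 0, 0, 1]
internal_only = [1, 0, 0, 1, 1]   # predictions of the tree built on internal data
with_sns = [1, 1, 0, 0, 0]        # predictions of the tree that also uses SNS topics

b, c = discordant_counts(y_true, internal_only, with_sns)
p_value = mcnemar_exact(b, c)
```

With real predictions from the two decision trees, a small p-value would indicate that the difference in misclassification is unlikely to be due to chance alone.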

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.105-129
    • /
    • 2020
  • This study uses corporate data from 2012 to 2018, the period in which K-IFRS was applied in earnest, to predict default risks. The data used in the analysis totaled 10,545 rows and 160 columns, including 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial ratio indices. Unlike most prior studies, which used the default event as the basis for learning about default risk, this study calculated default risk from the market capitalization and stock price volatility of each company based on the Merton model. This resolved the data imbalance caused by the scarcity of default events, which had been pointed out as a limitation of the existing methodology, and made it possible to reflect the differences in default risk that exist among ordinary companies. Because the models were trained only on corporate information that is also available for unlisted companies, the default risks of unlisted companies without stock price information can be derived appropriately. As a result, the approach can provide stable default risk assessment services to companies whose default risk is difficult to determine with traditional credit rating models, such as small and medium-sized companies and startups. Although predicting corporate default risk with machine learning has been studied actively in recent years, model bias issues exist because most studies make predictions based on a single model. A stable and reliable valuation methodology is required for the calculation of default risk, given that an entity's default risk information is very widely used in the market and the sensitivity to differences in default risk is high. Strict standards are also required for the calculation methods. 
The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for the preparation of evaluation methods, including verification of the adequacy of evaluation methods, in consideration of past statistical data and experiences on credit ratings and changes in future market conditions. This study reduced the bias of individual models by using a stacking ensemble technique that synthesizes various machine learning models. This makes it possible to capture complex nonlinear relationships between default risk and various corporate information while keeping the advantage of machine learning-based default risk prediction models, which take less time to calculate. To produce the sub-model forecasts used as input data for the Stacking Ensemble model, the training data were divided into seven pieces, and the sub-models were trained on the divided sets to produce forecasts. To compare the predictive power of the Stacking Ensemble model, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was then verified on the test set. The analysis showed that the Stacking Ensemble model exceeded the predictive power of the Random Forest model, which performed best among the single models. Next, to check for statistically significant differences between the forecasts of the Stacking Ensemble model and those of each individual model, pairs were constructed between the Stacking Ensemble model and each individual model. Because the Shapiro-Wilk normality test showed that none of the pairs followed a normal distribution, we used the nonparametric Wilcoxon rank sum test to check whether the two model forecasts that make up each pair showed statistically significant differences. The analysis showed that the forecasts of the Stacking Ensemble model were statistically significantly different from those of the MLP model and the CNN model. 
In addition, this study provides a methodology that allows existing credit rating agencies to apply machine learning-based bankruptcy risk prediction, given that traditional credit rating models can also be included as sub-models to calculate the final default probability. The Stacking Ensemble technique proposed in this study can also help design models that meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope that this research will be used as a resource to increase practical use by overcoming and improving the limitations of existing machine learning-based models.
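
The seven-way split described in this abstract can be sketched as out-of-fold stacking: each sub-model forecasts only the rows it was not trained on, and a meta-model is then fitted on those forecasts. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The two toy sub-models (a one-feature linear fit and a nearest-neighbour average), the blending-weight meta-model, the synthetic data, and all function names are hypothetical stand-ins for the study's actual models.

```python
import random
from statistics import mean


def kfold_indices(n, k, seed=0):
    """Shuffle row indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_linear(xs, ys):
    """Toy sub-model 1: one-feature least squares; returns a predict function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def fit_knn(xs, ys, k=3):
    """Toy sub-model 2: average target of the k nearest training points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)

    return predict


def out_of_fold(xs, ys, fitters, k=7):
    """For each fold, train every sub-model on the other folds and forecast the held-out rows."""
    n = len(xs)
    oof = [[0.0] * len(fitters) for _ in range(n)]
    for fold in kfold_indices(n, k):
        held = set(fold)
        tr_x = [xs[i] for i in range(n) if i not in held]
        tr_y = [ys[i] for i in range(n) if i not in held]
        for j, fit in enumerate(fitters):
            model = fit(tr_x, tr_y)
            for i in fold:
                oof[i][j] = model(xs[i])
    return oof


def fit_blend_weight(oof, ys):
    """Simplified meta-model: least-squares weight w for w*p1 + (1-w)*p2, clipped to [0, 1]."""
    num = sum((p[0] - p[1]) * (y - p[1]) for p, y in zip(oof, ys))
    den = sum((p[0] - p[1]) ** 2 for p in oof)
    w = num / den if den else 0.5
    return min(1.0, max(0.0, w))


def stack_predict(x, models, w):
    """Final forecast: blend the two sub-model forecasts with the learned weight."""
    p1, p2 = (m(x) for m in models)
    return w * p1 + (1 - w) * p2


# Synthetic data (hypothetical): one feature and a nonlinear target.
xs = [i / 10 for i in range(70)]
ys = [x * x for x in xs]

oof = out_of_fold(xs, ys, [fit_linear, fit_knn], k=7)   # seven pieces, as in the abstract
w = fit_blend_weight(oof, ys)                           # meta-model sees only out-of-fold forecasts
final_models = [fit_linear(xs, ys), fit_knn(xs, ys)]    # refit sub-models on all training rows
forecast = stack_predict(3.0, final_models, w)
```

Fitting the meta-model on out-of-fold forecasts rather than in-sample forecasts is the usual reason for the split: it keeps the meta-model from rewarding sub-models that merely memorize the training rows.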