A Study on the Application of Outlier Analysis for Fraud Detection: Focused on Transactions of Auction Exception Agricultural Products (부정 탐지를 위한 이상치 분석 활용방안 연구 : 농수산 상장예외품목 거래를 대상으로)

  • Kim, Dongsung;Kim, Kitae;Kim, Jongwoo;Park, Steve
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.3
    • /
    • pp.93-108
    • /
    • 2014
  • To support business decision making, interest and effort in analyzing and using transaction data from different perspectives are increasing. Such efforts are not limited to customer management or marketing; they also extend to monitoring and detecting fraudulent transactions. Fraud transactions evolve into various patterns by taking advantage of information technology. To keep pace with this evolution, much work has gone into fraud detection methods and advanced application systems that improve the accuracy and ease of fraud detection. As a case of fraud detection, this study aims to provide effective fraud detection methods for auction exception agricultural products in the largest Korean agricultural wholesale market. The auction exception products policy exists to complement auction-based trades in the agricultural wholesale market. That is, most trades in agricultural products are performed by auction; however, specific products are designated as auction exception products when their total volume is relatively small, the number of wholesalers is small, or wholesalers have difficulty purchasing them. This policy, however, raises problems of fairness and transparency of transactions, which calls for fraud detection. In this study, to generate fraud detection rules, real agricultural product trade transaction data in the market from 2008 to 2010 were analyzed, comprising more than 1 million transactions and over 1 billion US dollars in transaction volume. Agricultural transaction data have unique characteristics, such as frequent changes in supply volume and turbulent time-dependent changes in price. Since this was the first attempt to identify fraud transactions in this domain, there was no training data set for supervised learning, so fraud detection rules were generated using an outlier detection approach. We assume that outlier transactions are more likely to be fraudulent than normal transactions. Outlier transactions are identified by comparing daily, weekly, and quarterly average unit prices of product items; quarterly average unit prices of product items for specific wholesalers are also used. The reliability of the generated fraud detection rules was confirmed by domain experts. To determine whether a transaction is fraudulent, the normal distribution and the normalized Z-value concept are applied: a transaction's unit price is transformed into a Z-value to calculate its occurrence probability, approximating the distribution of unit prices by a normal distribution. A modified Z-value of the unit price is used rather than the original Z-value, because for auction exception agricultural products the Z-values are influenced by the outlier fraud transactions themselves, the number of wholesalers being small. The modified Z-values are called Self-Eliminated Z-scores because they are calculated excluding the unit price of the specific transaction being checked for fraud. To show the usefulness of the proposed approach, a prototype fraud transaction detection system was developed in Delphi. The system consists of five main menus and related submenus. Its first functionality is to import transaction databases; the next important functions set up fraud detection parameters. By changing the fraud detection parameters, system users can control the number of potential fraud transactions. Execution functions provide the fraud detection results found under the given parameters, and the potential fraud transactions can be viewed on screen or exported as files. This study is an initial attempt to identify fraud transactions in auction exception agricultural products, and many research topics remain. First, the scope of the analyzed data was limited by data availability; more data on transactions, wholesalers, and producers are needed to detect fraud transactions more accurately. Next, the scope of fraud transaction detection should be extended to fishery products. There are also many possibilities for applying other data mining techniques; for example, a time series approach is a promising candidate for this problem. Finally, although outlier transactions are detected here based on unit prices, it is also possible to derive fraud detection rules based on transaction volumes.
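
The paper describes the Self-Eliminated Z-score only in prose; the following minimal Python sketch (the function name, the threshold of 3.0, and the sample prices are illustrative assumptions, not the authors' Delphi code) shows the leave-one-out idea: each unit price is standardized against the statistics of the other transactions, so a fraudulent outlier cannot dampen its own score.

```python
import numpy as np

def self_eliminated_z_scores(prices, threshold=3.0):
    """For each unit price, compute a Z-score against the mean and
    standard deviation of the OTHER prices, so an outlier cannot
    mask itself by inflating the statistics it is measured against."""
    prices = np.asarray(prices, dtype=float)
    n = len(prices)
    z_scores = np.empty(n)
    for i in range(n):
        others = np.delete(prices, i)          # exclude the checked transaction
        mu, sigma = others.mean(), others.std(ddof=1)
        z_scores[i] = (prices[i] - mu) / sigma if sigma > 0 else 0.0
    return z_scores, np.abs(z_scores) > threshold

# Hypothetical unit prices for one product item on one day
prices = [1000, 1020, 980, 995, 3500, 1010]    # one suspicious unit price
z, flagged = self_eliminated_z_scores(prices)
print(np.round(z, 2), flagged)                 # the 3500 entry is flagged
```

Raising or lowering the threshold plays the role of the system's fraud detection parameters: a lower threshold yields more potential fraud transactions for review.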

Analysis of Twitter for 2012 South Korea Presidential Election by Text Mining Techniques (텍스트 마이닝을 이용한 2012년 한국대선 관련 트위터 분석)

  • Bae, Jung-Hwan;Son, Ji-Eun;Song, Min
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.3
    • /
    • pp.141-156
    • /
    • 2013
  • Social media is a representative form of Web 2.0 that shapes changes in users' information behavior by allowing users to produce their own content without any expert skills. In particular, as a new communication medium, it has a profound impact on social change by enabling users to communicate their opinions and thoughts to the masses and to acquaintances. Social media data play a significant role in the emerging Big Data arena, so research areas such as social network analysis and opinion mining have paid attention to discovering meaningful information from the vast amounts of data buried in social media. Social media has recently become a main focus of Information Retrieval and Text Mining because it not only produces massive unstructured textual data in real time but also serves as an influential channel for opinion leading. Most previous studies, however, have adopted broad-brush and limited approaches, which make it difficult to find and analyze new information. To overcome these limitations, we developed a real-time Twitter trend mining system that captures trends by processing big Twitter stream datasets in real time. The system offers term co-occurrence retrieval, visualization of Twitter users by query, similarity calculation between two users, topic modeling to track changes in topical trends, and mention-based user network analysis. In addition, we conducted a case study on the 2012 Korean presidential election. We collected 1,737,969 tweets containing the candidates' names and the word 'election' on Twitter in Korea (http://www.twitter.com/) for one month in 2012 (October 1 to October 31). The case study shows that the system provides useful information and detects social trends effectively. The system retrieves the list of terms that co-occur with given query terms; we compare the co-occurrence retrieval results for the influential candidates' names 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn' as query terms. General terms related to the presidential election, such as 'Presidential Election', 'Proclamation in Support', and 'Public opinion poll', appear frequently. The results also show specific terms that differentiate each candidate, such as 'Park Jung Hee' and 'Yuk Young Su' for the query 'Geun Hae Park', 'a single candidacy agreement' and 'Time of voting extension' for the query 'Jae In Moon', and 'a single candidacy agreement' and 'down contract' for the query 'Chul Su Ahn'. Our system not only extracts 10 topics along with related terms but also shows the topics' dynamic changes over time by employing the multinomial Latent Dirichlet Allocation technique. Each topic shows one of two types of patterns, a rising tendency or a falling tendency, depending on the change in its probability distribution. To determine the relationship between topic trends on Twitter and social issues in the real world, we compare topic trends with related news articles, and find that Twitter can track issues faster than other media such as newspapers. The user network on Twitter differs from those of other social media because of the distinctive way relationships are made on Twitter: users form relationships by exchanging mentions. We visualize and analyze the mention-based networks of 136,754 users, using the three candidates' names, 'Geun Hae Park', 'Jae In Moon', and 'Chul Su Ahn', as query terms. The results show that Twitter users mention all candidates' names regardless of their political tendencies. This case study demonstrates that Twitter can be an effective tool for detecting and predicting dynamic changes in social issues, and that mention-based user networks, a network type unique to Twitter, can reveal different aspects of user behavior.
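
As a rough illustration of the multinomial LDA topic modeling step, the sketch below fits an LDA model with scikit-learn over a handful of invented English tweet-like strings (the paper's own system processed Korean tweets in real time and extracted 10 topics; the corpus, the 3 topics, and all parameters here are placeholder assumptions).

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Invented stand-ins for election-period tweets
tweets = [
    "park pledges economic democratization in presidential election",
    "moon proposes single candidacy agreement with ahn",
    "ahn discusses single candidacy agreement and new politics",
    "public opinion poll shows tight presidential election race",
    "voters debate time of voting extension for the election",
    "candidates trade proclamations of support before the poll",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(tweets)            # term-frequency matrix

lda = LatentDirichletAllocation(n_components=3, random_state=0)
lda.fit(X)

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[:-6:-1]]   # 5 strongest terms
    print(f"topic {k}: {', '.join(top)}")
```

Tracking rising or falling tendencies, as the paper does, would amount to refitting (or incrementally updating) the model over time windows and comparing each topic's probability mass across windows.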

Analysis of Success Cases of InsurTech and Digital Insurance Platform Based on Artificial Intelligence Technologies: Focused on Ping An Insurance Group Ltd. in China (인공지능 기술 기반 인슈어테크와 디지털보험플랫폼 성공사례 분석: 중국 평안보험그룹을 중심으로)

  • Lee, JaeWon;Oh, SangJin
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.3
    • /
    • pp.71-90
    • /
    • 2020
  • Recently, the global insurance industry has been rapidly pursuing digital transformation through artificial intelligence technologies such as machine learning, natural language processing, and deep learning. More and more foreign insurers have succeeded with AI technology-based InsurTech and platform businesses, and Ping An Insurance Group Ltd., China's largest private company, is leading China's fourth industrial revolution with remarkable achievements in InsurTech and digital platforms, the result of constant innovation under its corporate keywords 'finance and technology' and 'finance and ecosystem'. This study therefore analyzed the InsurTech and platform business activities of Ping An Insurance Group Ltd. through the ser-M analysis model to provide strategic implications for revitalizing the AI technology-based businesses of domestic insurers. The ser-M model is a frame in which the vision and leadership of the CEO, the historical environment of the enterprise, the utilization of various resources, and the firm's unique mechanism relationships can be interpreted in an integrated manner in terms of subject, environment, resource, and mechanism. The case analysis shows that Ping An Insurance Group Ltd. has achieved cost reduction and improved customer service by digitally innovating its entire business, including sales, underwriting, claims, and loan services, using core AI technologies such as face, voice, and facial expression recognition. In addition, 'online data in China' and 'the vast offline data and insights accumulated by the company' were combined with new technologies such as artificial intelligence and big data analysis to build a digital platform that integrates financial services and digital service businesses. Ping An Insurance Group Ltd. pursued constant innovation, and as of 2019 its sales reached $155 billion, ranking seventh among all companies in the Global 2000 list compiled by Forbes. Analyzing the background of this success from the ser-M perspective, founder Ma Mingzhe quickly grasped the development of digital technology, market competition, and demographic change in the era of the fourth industrial revolution, established a new vision, and displayed agile, digital technology-focused leadership. Based on this strong founder-led leadership in response to environmental change, the company successfully led InsurTech and platform businesses through innovation of internal resources, such as investment in AI technology, recruitment of excellent professionals, and strengthened big data capabilities, combined with external absorptive capacity and strategic alliances across industries. This success story of Ping An Insurance Group Ltd. offers the following implications for domestic insurance companies preparing for digital transformation. First, CEOs of domestic companies also need to recognize the paradigm shift in the industry brought about by digital technology and quickly arm themselves with digital technology-oriented leadership to spearhead the digital transformation of their enterprises. Second, the Korean government should urgently overhaul related laws and systems to further promote the use of data across industries, and provide bold support such as deregulation, tax benefits, and platform provision to help the domestic insurance industry secure global competitiveness. Third, Korean companies also need to invest more boldly in AI technology development so that the systematic acquisition of internal and external data, the training of technical personnel, and patent applications can be expanded, and they should quickly establish digital platforms so that diverse customer experiences can be integrated through trained AI technology. Finally, since generalization from a single case of an overseas insurer has its limits, I hope that future research will examine various management strategies related to AI technology more extensively, by analyzing cases from multiple industries or companies or by conducting empirical research.

A Methodology to Develop a Curriculum based on National Competency Standards - Focused on Methodology for Gap Analysis - (국가직무능력표준(NCS)에 근거한 조경분야 교육과정 개발 방법론 - 갭분석을 중심으로 -)

  • Byeon, Jae-Sang;Ahn, Seong-Ro;Shin, Sang-Hyun
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.43 no.1
    • /
    • pp.40-53
    • /
    • 2015
  • To train manpower that meets the requirements of the industrial field, the introduction of the National Qualification Frameworks (hereinafter NQF), based on National Competency Standards (hereinafter NCS), was decided in 2001, led by the Office for Government Policy Coordination. For landscape architecture in the construction field, a pilot "NCS - Landscape Architecture" was developed in 2008 and test-operated for 3 years starting in 2009. In particular, as the 'realization of a competence-based society, not an educational-background-based one' was adopted as one of the major projects of the Park Geun-Hye government (inaugurated in 2013), the NCS system was built on a nationwide scale as a concrete means of realizing it. However, the NCS developed by the state specifies ideal job performance abilities, so it cannot reflect practical operational constraints: differences in student levels between universities, problems in securing equipment and professors, and limits on the number of current courses. For a soft landing to a practical curriculum, the gap between the current curriculum and the NCS must first be clearly analyzed. Gap analysis is the initial-stage methodology for reorganizing an existing curriculum into an NCS-based one: for each NCS ability unit, the degree of agreement between the department's existing courses and the ability unit elements and performance criteria is rated on a 5-point Likert scale and analyzed. Universities wishing to operate NCS can thus measure the degree of agreement and the gap between their current curriculum and the NCS, securing a basic tool to verify the applicability of NCS and the effectiveness of further development and operation. The advantages of reorganizing a curriculum through gap analysis are, first, that it provides a quantitative index of the NCS adoption rate for each department, which can be linked to government financial support projects, and second, that it provides an objective standard of what is sufficient or insufficient when reorganizing into an NCS-based curriculum. In other words, when adopting the relevant NCS subdivision, the insufficient ability units and ability unit elements can be extracted, and at the same time the supplementary matters for each ability unit element in each existing course can be identified; this provides direction for detailed class programs and for opening basic courses. The Ministry of Education and the Ministry of Employment and Labor must bring in people from industry to actively develop and supply NCS at a practical level, so that the requirements of the industrial field are systematically reflected in education, training, and qualification, and universities wishing to apply NCS must reorganize their curricula to connect work and qualification based on NCS. To this end, universities must consider the prospects of the relevant industry and the relationship between the faculty resources within the university and local industry in order to clearly select the NCS subdivision to be applied. Afterwards, gap analysis should be used to set the direction of the NCS-based curriculum reorganization more objectively and rationally, in order to participate efficiently in the process-evaluation-type qualification system.
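
A hypothetical illustration of how the 5-point Likert gap scores might be tabulated (the ability unit names, element IDs, and ratings below are invented; the paper defines the method only in prose):

```python
import pandas as pd

# Coincidence ratings (5 = an existing course fully covers the NCS
# ability unit element, 1 = not covered at all)
ratings = pd.DataFrame({
    "ability_unit": ["planting design"] * 3 + ["construction mgmt"] * 2,
    "unit_element": ["E1", "E2", "E3", "E1", "E2"],
    "coincidence":  [4, 2, 1, 5, 3],
})
ratings["gap"] = 5 - ratings["coincidence"]

# Elements with the largest gaps are candidates for new or
# supplemented subjects in the reorganized curriculum.
print(ratings.sort_values("gap", ascending=False))
print(ratings.groupby("ability_unit")["gap"].mean())
```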

The Framework of Research Network and Performance Evaluation on Personal Information Security: Social Network Analysis Perspective (개인정보보호 분야의 연구자 네트워크와 성과 평가 프레임워크: 소셜 네트워크 분석을 중심으로)

  • Kim, Minsu;Choi, Jaewon;Kim, Hyun Jin
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.177-193
    • /
    • 2014
  • Over the past decade, the rapid diffusion of electronic commerce and the rising number of interconnected networks have resulted in an escalation of security threats and privacy concerns. Electronic commerce has a built-in trade-off between the need to provide at least some personal information to consummate an online transaction and the risk of negative consequences from providing such information. More recently, frequent disclosures of private information have raised concerns about privacy and its impacts, motivating researchers in various fields to explore information privacy issues. Accordingly, the need for information privacy policies and technologies for collecting and storing data has increased, and information privacy research has grown in fields such as medicine, computer science, business, and statistics. Various information security incidents have made finding experts in the information security field an important issue, and objective measures for finding such experts are required, as the process is currently rather subjective. Based on social network analysis, this paper proposes a framework to evaluate the process of finding experts in the information security field. We collected data from the National Digital Science Library (NDSL) database, initially gathering about 2,000 papers covering the period between 2005 and 2013. After dropping outliers and irrelevant papers, 784 papers remained for testing the suggested hypotheses. The co-authorship network data on co-author relationships, publishers, affiliations, and so on were analyzed using social network measures including centrality and structural holes. The results of our model estimation are as follows. With the exception of Hypothesis 3, which concerns the relationship between eigenvector centrality and performance, all of our hypotheses were supported. In line with our hypothesis, degree centrality (H1) had a positive influence on researchers' publishing performance (p<0.001), indicating that as the degree of cooperation increased, so did publishing performance. Closeness centrality (H2) was also positively associated with publishing performance (p<0.001), suggesting that as the efficiency of information acquisition increased, so did publishing performance. This paper identified differences in publishing performance among researchers. The analysis can be used to identify core experts and evaluate their performance in the information privacy research field, and the co-authorship network for information privacy can aid in understanding the deep relationships among researchers. In addition, by extracting characteristics of publishers and affiliations, this paper suggests how social network measures can be understood and used for finding experts in the information privacy field. Social concern about securing the objectivity of experts has increased, because experts in the information privacy field frequently participate in political consultation and in business education support and evaluation. In terms of practical implications, this research suggests an objective framework for identifying experts in the information privacy field and is useful for those in charge of managing research human resources. This study has some limitations, which provide opportunities and suggestions for future research. The small sample size makes it difficult to generalize the findings on differences in information diffusion according to media and proximity; further studies could therefore consider a larger sample and greater media diversity, and explore in more detail the differences in information diffusion according to media type and information proximity. Moreover, previous network research has commonly assumed a causal relationship between the independent and dependent variables (Kadushin, 2012). In this study, degree centrality as an independent variable might have a causal relationship with performance as a dependent variable; however, in network analysis research, network indices can only be computed after the network relationships have been created. An annual analysis could help mitigate this limitation.
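
As an illustration of the network measures used, the sketch below computes degree, closeness, and eigenvector centrality on a toy co-authorship graph with networkx, and shows Burt's constraint as one common proxy for structural holes (the edge list is invented, and the paper does not specify its computation tools).

```python
import networkx as nx

# Hypothetical co-authorship edges: an edge links two researchers
# who have co-authored at least one paper.
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
])

degree = nx.degree_centrality(G)          # H1: degree of cooperation
closeness = nx.closeness_centrality(G)    # H2: efficiency of information acquisition
eigenvector = nx.eigenvector_centrality(G)  # H3: connections to well-connected peers
# Lower constraint suggests the node brokers more structural holes.
constraint = nx.constraint(G)

for node in sorted(G):
    print(node, round(degree[node], 2), round(closeness[node], 2),
          round(eigenvector[node], 2), round(constraint[node], 2))
```

In this toy graph, C sits between the {A, B} cluster and the D-E chain, so it scores highest on degree and closeness and lowest on constraint, the profile the paper associates with core experts.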

Bankruptcy Forecasting Model using AdaBoost: A Focus on Construction Companies (적응형 부스팅을 이용한 파산 예측 모형: 건설업을 중심으로)

  • Heo, Junyoung;Yang, Jin Yong
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.1
    • /
    • pp.35-48
    • /
    • 2014
  • According to the 2013 construction market outlook report, liquidations of construction companies are expected to continue due to the ongoing residential construction recession. Bankruptcies of construction companies have a greater social impact than those in other industries, yet because of their distinctive capital structure and debt-to-equity ratios, they are harder to forecast. The construction industry operates on greater leverage, with high debt-to-equity ratios and project cash flows concentrated in the second half of a project. The economic cycle greatly influences construction companies, so downturns tend to rapidly increase their bankruptcy rates, and high leverage coupled with rising bankruptcy rates places a growing burden on the banks that lend to them. Nevertheless, bankruptcy prediction models have concentrated mainly on financial institutions, and construction-specific studies are rare. Bankruptcy forecasting models based on corporate financial data have been studied for many years in various ways, but these models target companies in general and may not be appropriate for forecasting the bankruptcy of construction companies, which carry disproportionately large liquidity risks. The construction industry is capital-intensive and runs long-timeline, large-scale investment projects with comparatively long payback periods, so criteria used to judge the financial risk of companies in general are difficult to apply to construction firms. The Altman Z-score, first published in 1968, is a commonly used bankruptcy forecasting model. It forecasts the likelihood of bankruptcy with a simple formula, classifying the results into three categories and evaluating corporate status as dangerous, moderate, or safe: a company in the "dangerous" category has a high likelihood of bankruptcy within two years, one in the "safe" category has a low likelihood, and for companies in the "moderate" category the risk is difficult to forecast. Many of the construction firms in this study fell into the "moderate" category, which made forecasting their risk difficult. With the development of machine learning, recent studies of corporate bankruptcy forecasting have adopted this technology. Pattern recognition, a representative application area of machine learning, is applied to bankruptcy forecasting: patterns are extracted from a company's financial information and then classified into the bankruptcy risk group or the safe group. Representative machine learning models previously used in bankruptcy forecasting are Artificial Neural Networks, Adaptive Boosting (AdaBoost), and the Support Vector Machine (SVM), and there are many hybrid studies combining these models. Existing studies, whether using the traditional Z-score technique or machine learning, focus on companies in no specific industry, so industry-specific characteristics are not considered. In this paper, we confirm that AdaBoost is the most appropriate forecasting model for construction companies by analyzing performance by company size. We classified construction companies into three groups, large, medium, and small, based on capital, and analyzed the predictive ability of AdaBoost for each group. The experimental results showed that AdaBoost has more predictive ability than the other models, especially for the group of large companies with capital of more than 50 billion won.
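
For reference, the original 1968 Altman model is Z = 1.2X1 + 1.4X2 + 3.3X3 + 0.6X4 + 1.0X5, where X1-X5 are working capital, retained earnings, EBIT, market value of equity, and sales, each scaled by total assets (X4 by book value of total liabilities), with roughly Z > 2.99 "safe", 1.81-2.99 "moderate", and Z < 1.81 "dangerous". The machine learning side can be sketched as follows with scikit-learn; the synthetic financial-ratio features and all hyperparameters are illustrative assumptions, not the authors' setup.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical financial-ratio features (e.g., debt-to-equity,
# current ratio, ...) and bankruptcy labels for one size group.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = AdaBoostClassifier(n_estimators=100, random_state=0)
model.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
```

Repeating this fit separately for the large, medium, and small capital groups, as the paper does, would reveal whether the boosted ensemble's advantage is concentrated in any one group.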

Comparison of CT-based CTV plan and CT-based ICRU 38 plan in brachytherapy planning of uterine cervix cancer (자궁경부암 강내조사 시 CT를 이용한 CTV에 근거한 치료계획과 ICRU 38에 근거할 치료계획의 비교)

  • Shim JinSup;Jo JungKun;Si ChangKeun;Lee KiHo;Lee DuHyun;Choi KyeSuk
    • The Journal of Korean Society for Radiation Therapy
    • /
    • v.16 no.2
    • /
    • pp.9-17
    • /
    • 2004
  • Purpose: Despite advances in CT and MRI diagnostics and in radiation therapy planning, the 2D film-based ICRU 38 planning system is still widely used. A 3-dimensional, CT image-based ICR plan not only provides tumor and normal tissue doses but also supports DVH information. In this study, we used CT images to generate plans prescribing the goal dose to the CTV (CTV plan) and to the ICRU 38 reference point (ICRU 38 plan), compared tumor, rectal, and bladder doses between the two plans, and analyzed the DVHs. Methods and Materials: Eleven patients treated with Ir-192 HDR brachytherapy were sampled. After 40 Gy of external radiation therapy, the ICR plan was established. All patients were scanned on a CT simulator, and the PLATO (Nucletron) v14.2 planning system was used. The CTV, rectum, and bladder were contoured on the CT images, and plans were established to deliver 100% of the dose to the CTV (CTV plan) or to point A (ICRU 38 plan). Results: Across the 11 patients, the CTV volume (average ± SD) was 21.8 ± 26.6 cm³, the rectal volume 60.9 ± 25.0 cm³, and the bladder volume 116.1 ± 40.1 cm³. The volume receiving 100% of the dose was 126.7 ± 18.9 cm³ in the ICRU plan and 98.2 ± 74.5 cm³ in the CTV plan. In the ICRU plan, the 22.0 cm³ CTV of the one patient whose residual tumor exceeded 4 cm was not covered by the 100% isodose, while the tumor volumes (12.9 ± 5.9 cm³) of the 8 patients with residual tumors below 4 cm received 100% of the dose. The bladder reference-point dose (as recommended by ICRU 38) was 90.1 ± 21.3% in the ICRU plan versus 68.7 ± 26.6% in the CTV plan, and the rectal reference-point dose was 86.4 ± 18.3% versus 76.9 ± 15.6%. Maximum bladder and rectal doses were 137.2 ± 50.1% and 101.1 ± 41.8% in the ICRU plan versus 107.6 ± 47.9% and 86.9 ± 30.8% in the CTV plan; the CTV plan therefore delivered less dose to normal tissue than the ICRU plan. In the one patient whose residual tumor exceeded 4 cm, however, the normal tissue dose in the CTV plan was remarkably higher than the critical dose. The rectal volume receiving over 80% of the dose (V80rec) was 1.8 ± 2.4 cm³ in the ICRU plan and 0.7 ± 1.0 cm³ in the CTV plan, and the corresponding bladder volume (V80bla) was 12.2 ± 8.9 cm³ versus 3.5 ± 4.1 cm³; again, the CTV plan irradiated less normal tissue than the ICRU 38 plan. Conclusion: Although the effectiveness and stability of the conventional ICRU plan are established, a CT image-based CTV plan can reduce the normal tissue dose while delivering the goal dose to the residual tumor in cases with small residual tumors. For larger residual tumors, further research on effective 3D planning is needed.
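
The V80 metrics reported above can be illustrated with a minimal DVH-style computation; the function below and its randomly generated dose grid are purely hypothetical placeholders, not the PLATO system's method.

```python
import numpy as np

def v_dose(dose_pct, voxel_volume_cm3, threshold_pct=80.0):
    """Volume (cm^3) of an organ receiving at least `threshold_pct` of
    the prescription dose, given per-voxel doses as percentages of the
    prescription and a uniform voxel volume."""
    dose_pct = np.asarray(dose_pct, dtype=float)
    return np.count_nonzero(dose_pct >= threshold_pct) * voxel_volume_cm3

# Hypothetical rectum dose grid: a 60.9 cm^3 organ sampled at 0.1 cm^3/voxel
rng = np.random.default_rng(1)
rectum_dose = rng.normal(loc=65, scale=15, size=609)   # % of prescription
print("V80rec:", round(v_dose(rectum_dose, 0.1), 1), "cm^3")
```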

Development of the Accident Prediction Model for Enlisted Men through an Integrated Approach to Datamining and Textmining (데이터 마이닝과 텍스트 마이닝의 통합적 접근을 통한 병사 사고예측 모델 개발)

  • Yoon, Seungjin;Kim, Suhwan;Shin, Kyungshik
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.3
    • /
    • pp.1-17
    • /
    • 2015
  • In this paper, we report our observations regarding a prediction model for the military based on enlisted men's internal data (cumulative records) and external data (SNS data). This work is significant for the military's efforts to look after its soldiers. Despite their efforts, many commanders have failed to prevent accidents caused by their subordinates. One of the important duties of an officer is to take care of subordinates and prevent unexpected accidents; because accidents are hard to prevent, a proper method must be found. Our motivation for this paper is to make it possible to predict accidents using enlisted men's internal and external data. The biggest issues facing the military are accidents by enlisted men related to maladjustment and the relaxation of military discipline, and the core of preventing such accidents is to identify problems and manage them quickly. Commanders predict accidents by interviewing their soldiers and observing their surroundings; this requires considerable time and effort, and the results differ significantly depending on the capabilities of the commander. In this paper, we seek to predict accidents with objective data that can easily be obtained. Records of enlisted men, as well as SNS communication between commanders and soldiers, now make it possible to predict and prevent accidents. This paper applies data mining to identify soldiers' interests and predict accidents, making use of internal and external (SNS) data. We propose a combination of topic analysis and a decision tree method. The study is conducted in two steps: first, topic analysis is conducted on the SNS posts of enlisted men; second, a decision tree is used to analyze the internal data together with the results of the first analysis. The dependent variable for these analyses is the occurrence of any accident. To analyze the SNS data, we require tools for text mining and topic analysis; we used SAS Enterprise Miner 12.1, which provides a text miner module. Our approach for finding soldiers' interests is composed of three main phases: collecting, topic analysis, and converting the topic analysis results into scores for use as independent variables. In the first phase, we collect enlisted men's SNS data by commander ID. After gathering the unstructured SNS data, the topic analysis phase extracts issues from them; for simplicity, 5 topics (vacation, friends, stress, training, and sports) are extracted from 20,000 articles. In the third phase, these 5 topics are quantified as personal scores and added to the independent variables, which comprise 15 internal data items. We then build two decision trees: the first uses only the internal data, while the second uses the external (SNS) data as well. Comparing the misclassification rates reported by SAS Enterprise Miner, the first model's misclassification rate is 12.1%, while the second model's is 7.8%, i.e., the second model predicts accidents with an accuracy of approximately 92%; the gap between the two models is 4.3 percentage points. Finally, we test whether the difference between them is meaningful using the McNemar test; the difference is statistically significant (p-value: 0.0003). This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small amount of enlisted men's data. Second, various independent variables used in the decision tree model are treated as categorical rather than continuous variables, so information is lost. Despite extensive efforts to provide prediction models for the military, commanders' predictions are accurate only when they have sufficient data about their subordinates. Our proposed methodology can support decision-making in the military, and this study is expected to contribute to the prevention of accidents through scientific analysis of enlisted men and their proper management.
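
A rough sketch of the two-tree comparison and the McNemar test, using scikit-learn and statsmodels on synthetic stand-ins for the internal records and topic scores (the real study used SAS Enterprise Miner 12.1; all data and parameters below are invented):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from statsmodels.stats.contingency_tables import mcnemar

# Hypothetical data: 15 internal features plus 5 SNS topic scores
# (vacation, friends, stress, training, sports).
rng = np.random.default_rng(0)
n = 2000
internal = rng.normal(size=(n, 15))
topics = rng.normal(size=(n, 5))
y = (internal[:, 0] + topics[:, 2] + rng.normal(size=n) > 1.5).astype(int)

X_int = internal                          # model 1: internal data only
X_all = np.hstack([internal, topics])     # model 2: internal + SNS topics
idx_tr, idx_te = train_test_split(np.arange(n), test_size=0.3, random_state=0)

tree_int = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_int[idx_tr], y[idx_tr])
tree_all = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_all[idx_tr], y[idx_tr])

ok_int = tree_int.predict(X_int[idx_te]) == y[idx_te]
ok_all = tree_all.predict(X_all[idx_te]) == y[idx_te]

# McNemar's test on the 2x2 table of correct/incorrect test predictions:
# it checks whether the two models' disagreements are asymmetric.
table = [[np.sum(ok_int & ok_all), np.sum(ok_int & ~ok_all)],
         [np.sum(~ok_int & ok_all), np.sum(~ok_int & ~ok_all)]]
print(mcnemar(table, exact=False))
```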

NFC-based Smartwork Service Model Design (NFC 기반의 스마트워크 서비스 모델 설계)

  • Park, Arum;Kang, Min Su;Jun, Jungho;Lee, Kyoung Jun
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.157-175
    • /
    • 2013
  • Since the Korean government announced its 'Smartwork promotion strategy' in 2010, Korean firms and government organizations have started to adopt smartwork. However, smartwork has been implemented mostly in large enterprises and government organizations rather than in SMEs (small and medium enterprises). In the USA, both Yahoo! and Best Buy have stopped their flexible work programs because of reportedly low productivity and job loafing. From the literature on smartwork, we drew the obstacles to smartwork adoption and categorized them into three types: institutional, organizational, and technological. The first category, institutional obstacles, includes the difficulty of defining smartwork performance evaluation metrics, the lack of readiness of organizational processes, the limited range of smartwork types and models, the lack of employee participation in the adoption procedure, the high cost of building a smartwork system, and insufficient government support. The second category, organizational obstacles, includes the limitations of the organizational hierarchy, misperceptions by employees and employers, difficulty in close collaboration, low productivity with remote coworkers, insufficient understanding of remote working, and lack of training about smartwork. The third category, technological obstacles, includes security concerns about mobile work, the lack of specialized solutions, and the lack of adoption and operation know-how. To overcome the current problems of smartwork in practice and the obstacles reported in the literature, we suggest a novel smartwork service model based on NFC (Near Field Communication). This paper proposes an NFC-based smartwork service model composed of an NFC-based smartworker networking service and an NFC-based smartwork space management service. The NFC-based smartworker networking service comprises an NFC-based communication/SNS service and an NFC-based recruiting/job-seeking service. The NFC-based communication/SNS service model addresses key shortcomings of existing smartwork service models: by connecting to a company's existing legacy system through NFC tags and systems, low productivity and the difficulties of collaboration and attendance management can be overcome, since managers can obtain employees' work processing, work time, and work space information, while employees can communicate with coworkers in real time and obtain their location information. In short, this service model features affordable system cost, provision of location-based information, and the possibility of knowledge accumulation. The NFC-based recruiting/job-seeking service provides new value by linking NFC tag services with sharing economy sites. This service model features easy attachment and removal of the service, efficient space-based work provision, easy search of location-based recruiting/job-seeking information, and system flexibility. It combines the advantages of sharing economy sites with those of NFC: through cooperation with sharing economy sites, the model can provide recruiters with human resources seeking not only long-term but also short-term work, and SMEs can easily find job seekers by attaching NFC tags to any space where qualified people may be located. In short, this service model supports efficient human resource distribution by providing the locations of job hunters and job applicants. The NFC-based smartwork space management service can promote smartwork by linking NFC tags attached to the workspace with existing smartwork systems. This service features low cost, provision of indoor and outdoor location information, and customized service. In particular, this model can help small companies adopt smartwork because it is lightweight and cost-effective compared to existing smartwork systems. This paper presents scenarios for the service models, the roles and incentives of the participants, and a comparative analysis. The superiority of the NFC-based smartwork service model is shown by comparing the new service models with existing ones. The service model can expand the range of enterprises and organizations that adopt smartwork and the range of employees who benefit from it.

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Model (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.141-154
    • /
    • 2019
  • Internet technology and social media are growing rapidly, and data mining technology has evolved to enable unstructured document representation in a variety of applications. Sentiment analysis, an important technology that can distinguish poor from high-quality content through the text data of products, has proliferated within text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined categories such as positive and negative, and its accuracy has been studied from many directions, from simple rule-based approaches to dictionary-based approaches using predefined labels. Sentiment analysis is in fact one of the most active research areas in natural language processing and is widely studied in text mining. Real online reviews are openly available and easy to collect, and they affect business: in marketing, real-world information from customers is gathered from websites rather than surveys, and because positive or negative posts on a website are reflected in sales, firms try to identify this information. However, many reviews on a website are not clear-cut and are difficult to classify. Earlier studies in this area used review data from the Amazon.com shopping mall, while more recent studies use data on stock market trends, blogs, news articles, weather forecasts, IMDB, Facebook, etc. Still, a lack of accuracy is recognized, because sentiment calculations change according to the subject, the paragraph, the direction of the sentiment lexicon, and sentence strength. This study aims to classify sentiment polarity into positive and negative categories and to increase the prediction accuracy of polarity analysis using the IMDB review data set. First, for text classification related to sentiment analysis, we adopt popular machine learning algorithms such as NB (naive Bayes), SVM (support vector machines), XGBoost, RF (random forests), and gradient boosting as comparative models. Second, deep learning has demonstrated the ability to extract complex, discriminative features from data; representative algorithms are CNNs (convolutional neural networks), RNNs (recurrent neural networks), and LSTM (long short-term memory). A CNN processing a sentence in vector format can be used similarly to BoW, but it does not consider the sequential attributes of the data. An RNN handles ordered data well because it takes the temporal information of the data into account, but it suffers from the long-term dependency problem; LSTM is used to solve this problem. For comparison, CNN and LSTM were chosen as simple deep learning models, and in addition to the classical machine learning algorithms, CNN, LSTM, and the integrated model were analyzed. Although the algorithms have many parameters, we examined the relationship between parameter values and precision to find the optimal combination, and tried to understand how well these models work for sentiment analysis and why. This study proposes an integrated CNN-LSTM algorithm to extract the positive and negative features in text analysis. The reasons for combining these two algorithms are as follows. A CNN can extract features for classification automatically by applying convolution layers and massively parallel processing, whereas an LSTM is not capable of highly parallel processing. Like faucets, the LSTM's input, output, and forget gates can be opened and closed at the desired time, and these gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it solves the long-term dependency problem. Furthermore, when an LSTM is placed after the CNN's pooling layer, the model has an end-to-end structure in which spatial and temporal features can be learned simultaneously. The combined CNN-LSTM achieved 90.33% accuracy; it is slower than the CNN but faster than the LSTM, and it was more accurate than the other models. In addition, the word embedding layer can be improved as the kernels are trained step by step. CNN-LSTM compensates for the weaknesses of each individual model and improves learning layer by layer through the end-to-end structure with the LSTM. For these reasons, this study seeks to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.
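
A minimal Keras sketch of the kind of integrated CNN-LSTM classifier described above, on the IMDB dataset: a convolution layer extracts local n-gram features, pooling shortens the sequence, and an LSTM models the remaining temporal structure. The hyperparameters are illustrative guesses, not the paper's configuration.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

vocab_size, max_len = 20000, 400
(x_tr, y_tr), (x_te, y_te) = imdb.load_data(num_words=vocab_size)
x_tr = pad_sequences(x_tr, maxlen=max_len)
x_te = pad_sequences(x_te, maxlen=max_len)

model = models.Sequential([
    layers.Embedding(vocab_size, 128),          # word embedding layer
    layers.Conv1D(64, 5, activation="relu"),    # local n-gram features
    layers.MaxPooling1D(4),                     # LSTM runs on pooled feature maps
    layers.LSTM(64),                            # temporal structure
    layers.Dense(1, activation="sigmoid"),      # positive/negative polarity
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(x_tr, y_tr, batch_size=128, epochs=2,
          validation_data=(x_te, y_te))
```

Because pooling shortens the sequence before the LSTM sees it, this hybrid trains faster than a pure LSTM on the raw token sequence while retaining order information that a pure CNN would discard, which mirrors the speed/accuracy trade-off the paper reports.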