• Title/Summary/Keyword: Automated analysis system

Search Results: 849

An Investigation on the Periodical Transition of News related to North Korea using Text Mining (텍스트마이닝을 활용한 북한 관련 뉴스의 기간별 변화과정 고찰)

  • Park, Chul-Soo
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.63-88 / 2019
  • The goal of this paper is to investigate changes in North Korea's domestic and foreign policies through automated text analysis of the image of North Korea represented in South Korean mass media. Based on that data, we analyze the status of text mining research, using text mining techniques to find the topics, methods, and trends of the field, and we investigate the characteristics of the text mining techniques as confirmed by analysis of the data. The R program, free software for statistical computing and graphics, was used to apply the text mining techniques. Text mining methods allow one to highlight the most frequently used keywords in a body of text and to create a word cloud, also referred to as a text cloud or tag cloud. This study proposes a procedure to find meaningful tendencies based on a combination of word clouds and co-occurrence networks. It aims to more objectively explore the images of North Korea represented in South Korean newspapers by quantitatively reviewing the patterns of language use related to North Korea in newspaper big data from November 1, 2016 to May 23, 2019. Considering recent inter-Korean relations, we divided the data into three periods. The period before January 1, 2018 was set as the Before Phase of Peace Building. The period from January 1, 2018 to February 24, 2019 was set as the Peace Building Phase, when Kim Jong-un's New Year's message and the PyeongChang Olympics formed an atmosphere of peace on the Korean Peninsula. The third period, after the Hanoi summit, was marked by silence in the relationship between North Korea and the United States and was therefore called the Depression Phase of Peace Building. This study analyzes news articles related to North Korea from the Korea Press Foundation database (www.bigkinds.or.kr) through text mining, to investigate the characteristics of the Kim Jong-un regime's South Korea policy and unification discourse.
The main results of this study show that trends in the North Korean national policy agenda can be discovered based on clustering and visualization algorithms. In particular, it examines changes in international circumstances, domestic conflicts, the living conditions of North Korea, the South's aid projects for the North, inter-Korean conflicts, the North Korean nuclear issue, and the North Korean refugee problem through co-occurrence word analysis. It also offers an analysis of the South Korean mentality toward North Korea in terms of semantic prosody. In the Before Phase of Peace Building, the analysis showed the order 'Missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', and 'South-North Korean'. The Peace Building Phase yielded the order 'Panmunjom', 'Unification', 'North Korea Nuclear', 'Diplomacy', and 'Military'. The Depression Phase of Peace Building yielded the order 'North Korea Nuclear', 'North and South Korea', 'Missile', 'State Department', and 'International'. There are 16 words adopted in all three periods, in the following order: 'Missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', 'North and South Korea', 'Military', 'Kaesong Industrial Complex', 'Defense', 'Sanctions', 'Denuclearization', 'Peace', 'Exchange and Cooperation', and 'South Korea'. We expect the results of this study to contribute to analyzing trends in news content about North Korea associated with North Korea's provocations. Future research on North Korean trends will be conducted based on these results, and we will continue to study a model for North Korea risk measurement that can anticipate and respond to North Korea's behavior in advance. We expect the text mining analysis method and scientific data analysis techniques to be applied to the field of North Korea and unification research.
We hope that such academic studies will continue to make important contributions to the nation.
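The word-frequency and co-occurrence counting underlying the word cloud and network analysis above can be sketched in a few lines. The abstract's analysis used R; this is an illustrative Python equivalent over hypothetical article keyword lists, not the paper's actual pipeline.

```python
from collections import Counter
from itertools import combinations

# Toy corpus: each item is the extracted keyword list of one news article
# (hypothetical data standing in for the BigKinds newspaper corpus).
articles = [
    ["missile", "north_korea_nuclear", "diplomacy"],
    ["unification", "north_korea_nuclear", "diplomacy"],
    ["missile", "diplomacy", "military"],
]

# Keyword frequencies: the input to a word cloud.
freq = Counter(word for doc in articles for word in doc)

# Co-occurrence counts: unordered keyword pairs within the same article,
# the edge weights of a co-occurrence network.
cooc = Counter()
for doc in articles:
    for a, b in combinations(sorted(set(doc)), 2):
        cooc[(a, b)] += 1
```

Ranking `freq` per period and thresholding `cooc` edge weights reproduces the kind of period-by-period keyword orderings reported in the abstract.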

Statistically Analyzed Effects of Coal-Fired Power Plants in West Coast on the Surface Air Pollutants over Seoul Metropolitan Area (통계적 기법을 활용한 서해안 화력발전소 오염물질 배출에 따른 수도권 지표면 대기오염농도 영향의 분석)

  • Ju, Jaemin;Youn, Daeok
    • Journal of the Korean earth science society / v.40 no.6 / pp.549-560 / 2019
  • The effects of coal-fired power plant emissions, the biggest point source of air pollutants, on spatiotemporal surface air pollution over a remote area are investigated in this study, based on a set of data selection and statistical techniques that account for meteorological and geographical effects in the emission-concentration (source-receptor) relationship. We propose a sophisticated data processing technique to separate and quantify these effects. The technique comprises a data selection procedure and statistical analyses: data selection criteria depending on meteorological conditions, and statistical methods such as the Kolmogorov-Zurbenko filter (K-Z filter) and empirical orthogonal function (EOF) analysis. The data selection procedure is important for filtering the measurement data so that meteorological and geographical effects on the emission-concentration relationship can be considered. Together with meteorological data from the new high-resolution ECMWF reanalysis 5 (ERA5) and the Korea Meteorological Administration automated surface observing system, air pollutant emission data from the telemonitoring system (TMS) of the Dangjin and Taean power plants, as well as spatio-temporal air pollutant concentrations from the air quality monitoring system, are used for the four-year period 2014-2017. All data used in this study have a temporal resolution of 1 hour. The first EOF mode of spatio-temporal changes in air pollutant concentrations over the Seoul metropolitan area (SMA) due to power plant emissions was found to explain over 97% of total variability under favorable meteorological conditions. It is concluded that SO2, NO2, and PM10 concentrations over the SMA would decrease by 0.468 ppb, 1.050 ppb, and 2.045 ㎍ m-3, respectively, if SO2, NO2, and TSP emissions from the Dangjin power plant were reduced by 10%.
In the same way, a 10% reduction in Taean power plant emissions would decrease SO2, NO2, and PM10 over the SMA by 0.284 ppb, 0.842 ppb, and 1.230 ㎍ m-3, respectively. Under the same meteorological conditions, emissions from the Dangjin power plant affect air pollution over the SMA by a larger amount, but with a lower R value, than those from Taean.
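The K-Z filter used above is conceptually simple: an iterated centered moving average, where the window width and iteration count set the cutoff scale. A minimal sketch (the window and iteration values below are illustrative, not the paper's actual parameters):

```python
def kz_filter(series, window, iterations):
    """Kolmogorov-Zurbenko filter: apply a centered moving average of
    width `window` repeatedly, `iterations` times. Edges use a
    truncated window."""
    half = window // 2
    x = list(series)
    for _ in range(iterations):
        x = [
            sum(x[max(0, i - half):i + half + 1])
            / len(x[max(0, i - half):i + half + 1])
            for i in range(len(x))
        ]
    return x

# A constant signal passes through unchanged...
flat = kz_filter([5.0] * 8, window=3, iterations=3)
# ...while an oscillating one is damped toward its mean.
damped = kz_filter([0.0, 10.0] * 3, window=3, iterations=2)
```

Separating short-term (weather-driven) from long-term (emission-driven) variability in the hourly concentration series is exactly the kind of decomposition this filter provides.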

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.1-32 / 2018
  • Corporate defaults have a ripple effect on the local and national economy, beyond stakeholders of the bankrupt companies such as managers, employees, creditors, and investors. Before the Asian financial crisis, the Korean government analyzed only SMEs and tried to improve the forecasting power of a single default prediction model rather than developing various corporate default models. As a result, even large corporations, the so-called 'chaebol enterprises', went bankrupt. Even after that, analysis of past corporate defaults focused on specific variables, and when the government restructured companies immediately after the global financial crisis, it focused only on certain main variables such as the debt ratio. A multifaceted study of corporate default prediction models is essential to serve diverse interests and to avoid situations like the 'Lehman Brothers case' of the global financial crisis, in which everything collapses in a single moment. The key variables used in corporate default prediction vary over time: comparing Beaver's (1967, 1968) and Altman's (1968) analyses with Deakin's (1972) study shows that the major factors affecting corporate failure have changed, and Grice (2001) likewise found shifts in the importance of the predictive variables of Zmijewski's (1984) and Ohlson's (1980) models. However, past studies use static models, and most do not consider changes that occur over time. Therefore, to construct consistent prediction models, it is necessary to compensate for time-dependent bias by means of a time series algorithm reflecting dynamic change. Motivated by the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009. The data are divided into training, validation, and test sets covering 7, 2, and 1 years, respectively.
To construct a bankruptcy model that is consistent over time, we first train a deep learning time series model using the data before the financial crisis (2000-2006). Parameter tuning of the existing models and the deep learning time series algorithm is conducted with validation data covering the financial crisis period (2007-2008). As a result, we construct a model that shows a pattern similar to the training results and excellent prediction power. Each bankruptcy prediction model is then retrained on the combined training and validation data (2000-2008), applying the optimal parameters found in validation. Finally, each corporate default prediction model, trained over nine years of data, is evaluated and compared using the test data (2009), demonstrating the usefulness of the corporate default prediction model based on the deep learning time series algorithm. In addition, by adding Lasso regression to the existing variable selection methods (multiple discriminant analysis and the logit model), we show that the deep learning time series model based on the three bundles of variables is useful for robust corporate default prediction. The definition of bankruptcy used is the same as that of Lee (2015). Independent variables include financial information such as the financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable groups. The performance of the multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), non-time-series machine learning algorithms, and deep learning time series algorithms is compared. Corporate data pose the challenges of nonlinear variables, multi-collinearity among variables, and lack of data.
The logit model handles nonlinearity, the Lasso regression model mitigates the multi-collinearity problem, and the deep learning time series algorithm, using a variable data generation method, compensates for the lack of data. Big data technology, a leading technology of the future, is moving from simple human analysis to automated AI analysis and, eventually, toward intertwined AI applications. Although the study of corporate default prediction models using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis at corporate default prediction modeling and more effective in prediction power. Amid the Fourth Industrial Revolution, the Korean government and governments overseas are working hard to integrate such systems into the everyday life of their nations and societies, yet deep learning time series research for the financial industry remains insufficient. This is an initial study of deep learning time series analysis of corporate defaults, and we hope it will serve as comparative analysis material for non-specialists who begin studies combining financial data and deep learning time series algorithms.
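The 7/2/1-year chronological split described above (train on 2000-2006, validate through the crisis years 2007-2008, test on 2009) can be sketched as follows. The record format is hypothetical; the point is that the split is by calendar year, not random, so the crisis period falls entirely in validation.

```python
def split_by_year(records):
    """Chronological 7/2/1 split: (year, features) pairs are routed to
    train (2000-2006), validation (2007-2008), or test (2009)."""
    buckets = {"train": [], "validation": [], "test": []}
    for year, features in records:
        if 2000 <= year <= 2006:
            buckets["train"].append(features)
        elif 2007 <= year <= 2008:
            buckets["validation"].append(features)
        elif year == 2009:
            buckets["test"].append(features)
    return buckets

# One toy record per year, 2000-2009 (a real dataset would have many firms).
data = [(y, {"debt_ratio": 0.1 * (y - 2000)}) for y in range(2000, 2010)]
splits = split_by_year(data)
```

After parameter tuning on the validation bucket, the abstract's procedure merges train and validation (2000-2008) for final retraining before scoring the test year.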

A Multimodal Profile Ensemble Approach to Development of Recommender Systems Using Big Data (빅데이터 기반 추천시스템 구현을 위한 다중 프로파일 앙상블 기법)

  • Kim, Minjeong;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.93-110 / 2015
  • A recommender system recommends products to the customers who are likely to be interested in them. Based on automated information filtering technology, various recommender systems have been developed. Collaborative filtering (CF), one of the most successful recommendation algorithms, has been applied in a number of different domains, such as recommending Web pages, books, movies, music, and products. However, CF has a critical shortcoming. CF finds neighbors whose preferences are like those of the target customer and recommends the products those neighbors have most liked. Thus, CF works properly only when there is a sufficient number of ratings on common products from customers. When customer ratings are scarce, CF forms neighborhoods inaccurately, resulting in poor recommendations. To improve the performance of CF-based recommender systems, most related studies have focused on developing novel algorithms under the assumption of a single profile created from users' item ratings, purchase transactions, or Web access logs. With the advent of big data, companies have come to collect more data and to use a greater variety of large-scale information. Many companies consider utilizing big data important because it enables them to improve their competitiveness and create new value. In particular, the use of personal big data in recommender systems is on the rise, because personal big data facilitates more accurate identification of users' preferences and behaviors. The proposed recommendation methodology is as follows. First, multimodal user profiles are created from personal big data to grasp the preferences and behavior of users from various viewpoints. We derive five user profiles based on personal information: ratings, site preference, demographics, Internet usage, and topics in text.
Next, the similarity between users is calculated based on the profiles, and neighbors of each user are found from the results. One of three ensemble approaches is applied to calculate the similarity: the similarity of the combined profile, the average similarity across profiles, or the weighted average similarity across profiles. Finally, the products that the neighborhood prefers most are recommended to the target users. For the experiments, we used demographic data and a very large volume of Web log transactions for 5,000 panel users of a company specializing in ranking Web sites. R and SAS E-Miner were used to implement the proposed recommender system and to conduct the topic analysis using keyword search, respectively. To evaluate recommendation performance, we used 60% of the data for training and 40% for testing, and 5-fold cross validation was conducted to enhance the reliability of the experiments. The widely used F1 metric, which gives equal weight to recall and precision, was employed for evaluation. The results show that the proposed methodology achieved a significant improvement over the single-profile-based CF algorithm. In particular, the ensemble approach using weighted average similarity showed the highest performance: the rate of improvement in F1 is 16.9 percent for the ensemble approach using weighted average similarity and 8.1 percent for the approach using the average similarity of each profile. From these results, we conclude that the multimodal profile ensemble approach is a viable solution to the problems encountered when customer ratings are scarce. This study is significant in suggesting what kinds of information can be used to create profiles in a big data environment and how they can be combined and utilized effectively.
However, our methodology requires further study before real-world application. The differences in recommendation accuracy should be compared by applying the proposed method to different recommendation algorithms, to identify which combination shows the best performance.
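The weighted-average ensemble, which performed best above, simply averages per-profile similarities with weights. A minimal sketch with cosine similarity and made-up profile vectors (the paper's actual profile encodings and weighting scheme are not reproduced here):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def weighted_ensemble_similarity(user_a, user_b, weights):
    """Weighted average of similarities computed profile-by-profile,
    one weight per profile (e.g. ratings, site preference, ...)."""
    sims = [cosine(pa, pb) for pa, pb in zip(user_a, user_b)]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)

# Two users, each described by two profiles (hypothetical encodings).
a = [[1.0, 0.0, 1.0], [0.5, 0.5]]
b = [[1.0, 0.0, 1.0], [0.5, 0.5]]  # identical to a on both profiles
c = [[0.0, 1.0, 0.0], [0.5, 0.5]]  # differs from a on the first profile
```

Neighborhood formation then proceeds as in ordinary CF, ranking candidate neighbors by this blended similarity instead of a single-profile one.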

A Study on the Intelligent Quick Response System for Fast Fashion(IQRS-FF) (패스트 패션을 위한 지능형 신속대응시스템(IQRS-FF)에 관한 연구)

  • Park, Hyun-Sung;Park, Kwang-Ho
    • Journal of Intelligence and Information Systems / v.16 no.3 / pp.163-179 / 2010
  • Recently, the concept of fast fashion has drawn attention as customer needs diversify and supply lead times shorten in the fashion industry. As competition has intensified, how quickly and efficiently customer needs are satisfied is emphasized as one of the critical success factors in the industry. Because fast fashion is inherently susceptible to trends, it is very important for fashion retailers to make quick decisions regarding which items to launch, in what quantity based on demand prediction, and when to respond. These planning decisions must also be executed through the business processes of procurement, production, and logistics in real time. To adapt to this trend, the fashion industry urgently needs support from an intelligent quick response (QR) system; however, the traditional functions of QR systems have not been able to fully satisfy the demands of the fast fashion industry. This paper proposes an intelligent quick response system for fast fashion (IQRS-FF). Models are presented for the QR process, QR principles and execution, and QR quantity and timing computation. IQRS-FF models support decision makers by providing useful information through automated, rule-based algorithms: if the predefined conditions of a rule are satisfied, the actions defined in the rule are automatically taken or reported to the decision makers. In IQRS-FF, QR decisions are made in two stages: pre-season and in-season. In pre-season, master demand prediction is first performed based on macro-level analysis of the local and global economy, fashion trends, and competitors. The prediction then feeds the master production and procurement planning. Checking the availability and delivery of materials for production, decision makers must make reservations or request procurement; for outsourced materials, they must check the availability and capacity of partners.
Thanks to the master plans, QR performance during the in-season is greatly enhanced, and QR items are selected with full consideration of material availability in the warehouse as well as partners' capacity. During the in-season, decision makers must find the right time for QR as actual sales occur in stores. They then decide which items to QR based not only on qualitative criteria, such as opinions from salespeople, but also on quantitative criteria, such as sales volume, the recent sales trend, inventory level, the remaining period, the forecast for the remaining period, and competitors' performance. To calculate QR quantity in IQRS-FF, two methods are designed: QR Index based calculation, and attribute-similarity based calculation using demographic clusters. In the early period of a new season, the attribute-similarity based calculation is preferred because there are not enough historical sales data; by analyzing the sales trends of categories or items with similar attributes, QR quantity can be computed. On the other hand, when there is enough information to analyze sales trends or to forecast, the QR Index based calculation can be used. Having defined the decision-making models for QR, we design KPIs (Key Performance Indicators) to test the reliability of the models in critical decision making: the difference in sales volume between QR and non-QR items, the accuracy rate of QR, and the lead time spent on QR decision making. To verify the effectiveness and practicality of the proposed models, a case study was performed for a representative fashion company that recently developed and launched IQRS-FF. The case study shows that the average sales rate of QR items increased by 15%, the difference in sales rate between QR and non-QR items increased by 10%, QR accuracy was 70%, and the lead time for QR decreased dramatically from 120 hours to 8 hours.
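The rule-based, two-part flavor of the in-season QR decision can be illustrated with a toy rule. This is not the paper's QR Index formula (which the abstract does not give); it is a hypothetical stand-in showing the shape of such rules: trigger QR when recent sales run ahead of plan, then order enough to cover the remaining-period forecast not already met by stock.

```python
def qr_triggered(recent_trend, threshold=1.2):
    """Toy trigger rule: launch QR only when the recent sales trend
    (actual / planned) exceeds a threshold. Threshold is illustrative."""
    return recent_trend >= threshold

def qr_quantity(forecast_remaining, on_hand, in_transit=0):
    """Toy quantity rule: cover the forecast for the remaining period,
    net of inventory on hand and in transit; never order negative."""
    return max(0, forecast_remaining - on_hand - in_transit)

# Example: sales 50% ahead of plan, 100 units forecast, 30 in stock,
# 20 already in transit -> QR fires and orders the 50-unit gap.
fire = qr_triggered(1.5)
qty = qr_quantity(forecast_remaining=100, on_hand=30, in_transit=20)
```

In IQRS-FF terms, rules like these are the "predefined conditions" whose satisfaction automatically takes or reports an action to the decision makers.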

Prevalence of Diabetes Mellitus and Associated Diseases in Yeungnam Province Area (영남지방에서의 당뇨병 유병율과 이에 관련돈 질환의 빈도에 관한 연구)

  • Cho, Ihn-Ho;Choi, Jung-Gyu;Yun, Sung-Chul;Choi, Soo-Bong
    • Journal of Yeungnam Medical Science / v.4 no.2 / pp.65-73 / 1987
  • To determine the prevalence of diabetes mellitus and associated diseases, we analyzed data from 3,088 subjects examined with the Computed Automated Medi-Screening Test System, which consists of 65 parameters including blood glucose determination fasting and one hour after a 100 g oral glucose load. Subjects were grouped by the modified criteria of the National Diabetes Data Group. The results of the analyses are as follows:
    1. The prevalence of diabetes mellitus and impaired glucose tolerance is 2.27% and 18.26%, respectively.
    2. The prevalence of diabetes mellitus is 2.63% in males and 1.66% in females; the difference between males and females is not statistically significant.
    3. The prevalence of diabetes mellitus tends to increase with age. From the second to the eighth decade, the prevalence increases as 0.0, 0.45, 0.67, 2.28, 3.47, 5.36, and 10.00%, respectively.
    4. There is no statistically significant difference in the prevalence of obesity between normal and diabetic subjects: 18.03% and 22.86%, respectively (P≥0.1).
    5. There is no statistically significant difference in the prevalence of impaired glucose tolerance and diabetes between non-obese and obese groups (P≥0.1).
    6. The frequency of proteinuria, azotemia, and hypertension increases significantly as glucose tolerance decreases (P≤0.05).


A Study on the Selection and Applicability Analysis of 3D Terrain Modeling Sensor for Intelligent Excavation Robot (지능형 굴삭 로봇의 개발을 위한 로컬영역 3차원 모델링 센서 선정 및 현장 적용성 분석에 관한 연구)

  • Yoo, Hyun-Seok;Kwon, Soon-Wook;Kim, Young-Suk
    • KSCE Journal of Civil and Environmental Engineering Research / v.33 no.6 / pp.2551-2562 / 2013
  • Since 2006, an Intelligent Excavation Robot that automatically performs earthwork without an operator has been under development in Korea. Technologies for automatically recognizing the terrain of the work environment and detecting objects such as obstacles or dump trucks are essential for its work quality and safety. In several countries, terrestrial 3D laser scanners and stereo vision cameras have been used to model the local area around the workspace of automated construction equipment. However, these attempts have problems: the sensor systems are costly to build, or long processing times are needed to eliminate noise from the 3D model output. The objectives of this study are to analyze the advantages of existing 3D modeling sensors and to examine their applicability for practical use with the Analytic Hierarchy Process (AHP). In this study, the 3D modeling quality and accuracy of the sensors were tested in a real earthwork environment.
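AHP, used above to rank the sensors, derives priority weights from a pairwise comparison matrix. A common approximation is the row geometric-mean method, sketched below with an illustrative 3x3 matrix; the criteria and judgment values are hypothetical, as the abstract does not state the paper's actual ones.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights from a reciprocal pairwise
    comparison matrix using the row geometric-mean method: take the
    geometric mean of each row, then normalize to sum to 1."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical criteria: modeling accuracy vs. cost vs. processing speed.
# Entry [i][j] > 1 means criterion i matters more than criterion j.
matrix = [
    [1.0,       3.0, 1.0 / 3.0],   # accuracy
    [1.0 / 3.0, 1.0, 1.0 / 5.0],   # cost
    [3.0,       5.0, 1.0],          # speed
]
weights = ahp_weights(matrix)
```

Scoring each candidate sensor against each criterion and combining with these weights yields the overall ranking AHP is used for.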

Simultaneous Characterization of Sofalcone and Its Metabolite in Human Plasma by Liquid Chromatography -Tandem Mass Spectrometry

  • Han, Sang-Beom;Jang, Moon-Sun;Lee, Hee-Joo;Lee, Ye-Rie;Yu, Chong-Woo;Lee, Kyung-Ryul;Kim, Ho-Hyun
    • Bulletin of the Korean Chemical Society / v.26 no.5 / pp.729-734 / 2005
  • A sensitive and selective method for the quantitation of sofalcone and its active metabolite in human plasma has been established using liquid chromatography-electrospray ionization tandem mass spectrometry (LC-ESI/MS/MS). Plasma samples were transferred into a 96-well plate using an automated sample handling system and spiked with 10 μL of 2 μg/mL d3-sofalcone and d3-sofalcone metabolite solutions (internal standards), respectively. After adding 0.5 mL of acetonitrile to the 96-well plate, the plasma samples were vortexed for 30 sec. After centrifugation, the supernatant was transferred into another 96-well plate and completely evaporated at 40 °C under a stream of nitrogen. Dry residues were reconstituted with mobile phase and injected into a C18 reversed-phase column. The limit of quantitation of sofalcone and its metabolite was 2 ng/mL, using a sample volume of 0.2 mL for analysis. The reproducibility of the method was evaluated by analyzing 10 replicates over the concentration range of 2 ng/mL to 1000 ng/mL. The validation experiments showed that the assay has good precision and accuracy. Sofalcone and its metabolite produced protonated precursor ions ([M+H]+) of m/z 451 and 453 and corresponding product ions of m/z 315 and 317, respectively. The internal standards (d3-sofalcone and d3-sofalcone metabolite) produced protonated precursor ions ([M+H]+) of m/z 454 and 456 and corresponding product ions of m/z 315 and 317, respectively. The method has been successfully applied to a pharmacokinetic study of sofalcone and its active metabolite in human plasma.

Applications of Improved Low-Flow Mortar Type Grouting Method for Road Safety and Constructability in Dangerous Steep Slopes (급경사지 붕괴 위험지역의 도로 안전 및 시공성을 고려한 개선된 저유동 몰탈형 그라우팅공법 적용성 분석)

  • Choi, Gisung;Kim, Seokhyun;Kim, Nakseok
    • KSCE Journal of Civil and Environmental Engineering Research / v.40 no.4 / pp.409-415 / 2020
  • Low-flow mortar injection grouting was selected, preserving the traffic area as much as possible, to secure road traffic safety where outflow and subsidence of landfill had occurred due to groundwater and other causes. In particular, the existing method was improved because excessive injection pressure during construction carries risks of damage such as hydraulic fracturing beneath the road, spilling of soil particles on steep slopes, and bumps on the road. This study was carried out at a road reinforcement site, part of maintenance work for a steep-slope collapse danger zone on the 00 hill, ordered by 00 city, 00 province. The improved low-flow mortar type grouting method adopted a new automated grouting management system; in particular, it combines a method for deciding grouting conditions through high-pressure pre-grouting tests with AGS-controlled injection technology, and the grouting effect achieved with the new technology was analyzed. By applying the improved method, it was possible to lay the groundwork for road maintenance work such as preventing the subsidence of old roads, uneven subsidence of buildings and civil engineering structures, and soil leakage from groundwater spills. Furthermore, the method's applicability to future grouting work was confirmed, not only for construction that prevents subsidence of old roads but also for various buildings and civil engineering structures such as railroads, subways, bridges, underground structures, and boulder stone and limestone areas.

Assessment of Surface Temperature Mitigation Effects of Wetlands During Heat and Cold Waves Using Daytime and Nighttime MODIS Land Surface Temperature (Terra/Aqua MODIS LST를 이용한 폭염 및 한파기간 동안 습지의 지면온도 완화효과 분석)

  • Chung, Jeehun;Lee, Yonggwan;Kim, Seongjoon
    • Journal of Wetlands Research / v.21 no.spc / pp.123-133 / 2019
  • This study analyzed the surface temperature mitigation effect of wetlands during cold waves (below -12℃ from January to February) and heat waves (above 33℃ from July to August) in 2018. We used the Terra/Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) daytime and nighttime Land Surface Temperature (LST) products, together with the maximum and minimum air temperatures observed at 86 stations of the Korea Meteorological Administration (KMA). In the cold wave analysis, the Terra MODIS nighttime LST was highest in the forest area at -12.7℃, followed by the upland crop and wetland areas at -12.9℃ and -13.0℃, respectively; the urban area showed the lowest value, -14.4℃. During the heat wave, the urban area had the highest Aqua MODIS daytime LST, +34.6℃, while the wetland area was +33.0℃, 1.6℃ lower than the urban area.
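The land-use comparison above reduces to averaging LST pixels by class. A minimal sketch over hypothetical (class, LST) pixel pairs, standing in for the MODIS raster and land-cover map:

```python
from collections import defaultdict

def mean_lst_by_class(pixels):
    """Average land surface temperature per land-use class, given
    (land_use, lst_celsius) pixel pairs; rounded to 0.1 degC."""
    totals = defaultdict(lambda: [0.0, 0])
    for land_use, lst in pixels:
        totals[land_use][0] += lst
        totals[land_use][1] += 1
    return {cls: round(s / n, 1) for cls, (s, n) in totals.items()}

# Toy nighttime pixels during a cold wave (values are illustrative).
pixels = [("forest", -12.6), ("forest", -12.8),
          ("wetland", -13.1), ("wetland", -12.9),
          ("urban", -14.3), ("urban", -14.5)]
means = mean_lst_by_class(pixels)
```

Comparing the class means between heat-wave and cold-wave composites is what quantifies the mitigation effect reported above.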