• Title/Summary/Keyword: Input system


Bankruptcy Prediction Modeling Using Qualitative Information Based on Big Data Analytics (빅데이터 기반의 정성 정보를 활용한 부도 예측 모형 구축)

  • Jo, Nam-ok;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.2
    • /
    • pp.33-56
    • /
    • 2016
  • Many researchers have focused on developing bankruptcy prediction models using modeling techniques, such as statistical methods including multiple discriminant analysis (MDA) and logit analysis, or artificial intelligence techniques such as artificial neural networks (ANN), decision trees, and support vector machines (SVM), to secure enhanced performance. Most of the bankruptcy prediction models in academic studies have used financial ratios as the main input variables. The bankruptcy of firms is associated with both a firm's financial state and the external economic situation. However, the inclusion of qualitative information, such as the economic atmosphere, has not been actively discussed, despite the fact that exploiting only financial ratios has some drawbacks. Accounting information, such as financial ratios, is based on past data, and it is usually determined one year before bankruptcy. Thus, a time lag exists between the point of closing financial statements and the point of credit evaluation. In addition, financial ratios do not contain environmental factors, such as external economic situations. Therefore, using only financial ratios may be insufficient for constructing a bankruptcy prediction model, because they essentially reflect past corporate internal accounting information while neglecting recent information. Thus, qualitative information must be added to the conventional bankruptcy prediction model to supplement accounting information. Due to the lack of an analytic mechanism for obtaining and processing qualitative information from various information sources, however, previous studies have made only limited use of qualitative information. Recently, big data analytics, such as text mining techniques, have been drawing much attention in academia and industry, with an increasing amount of unstructured text data available on the web. A few previous studies have sought to adopt big data analytics in business prediction modeling. 
Nevertheless, the use of qualitative information on the web for business prediction modeling is still in its early stage, restricted to limited applications such as stock prediction and movie revenue prediction. Thus, it is necessary to apply big data analytics techniques, such as text mining, to various business prediction problems, including credit risk evaluation. Analytic methods are required for processing qualitative information represented in unstructured text form, due to the complexity of managing and processing unstructured text data. This study proposes a bankruptcy prediction model for Korean small- and medium-sized construction firms using both quantitative information, such as financial ratios, and qualitative information acquired from economic news articles. The performance of the proposed method depends on how well qualitative information is transformed into quantitative information suitable for incorporation into the bankruptcy prediction model. We employ big data analytics techniques, especially text mining, as a mechanism for processing qualitative information. A sentiment index is constructed at the industry level by extracting sentiment from a large amount of text data, to quantify the external economic atmosphere represented in the media. The proposed method involves keyword-based sentiment analysis using a domain-specific sentiment lexicon to extract sentiment from economic news articles. The generated sentiment lexicon is designed to represent sentiment for the construction business by considering the relationship between an occurring term and the actual economic condition of the industry, rather than the inherent semantics of the term. The experimental results showed that incorporating qualitative information based on big data analytics into the traditional bankruptcy prediction model based on accounting information is effective for enhancing predictive performance. 
The sentiment variable extracted from economic news articles had an impact on corporate bankruptcy. In particular, a negative sentiment variable improved the accuracy of corporate bankruptcy prediction because the corporate bankruptcy of construction firms is sensitive to poor economic conditions. The bankruptcy prediction model using qualitative information based on big data analytics contributes to the field, in that it reflects not only relatively recent information but also environmental factors, such as external economic conditions.
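The keyword-based sentiment indexing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's method: the lexicon below is a tiny hypothetical stand-in for the domain-specific construction-industry lexicon the study derives from news text.

```python
# Hypothetical miniature lexicon; the study builds a much larger,
# construction-industry-specific one from economic news articles.
POSITIVE = {"boom", "growth", "recovery"}
NEGATIVE = {"default", "slump", "downturn"}

def sentiment_index(articles):
    """Keyword-based sentiment index over a set of news articles:
    (positive hits - negative hits) / (all hits), in [-1, 1]."""
    pos = neg = 0
    for text in articles:
        for token in text.lower().split():
            if token in POSITIVE:
                pos += 1
            elif token in NEGATIVE:
                neg += 1
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

index = sentiment_index([
    "Construction boom drives regional growth",
    "Housing slump deepens for builders",
])  # 2 positive hits, 1 negative hit -> 1/3
```

In the study, an index computed this way per period quantifies the economic atmosphere and enters the prediction model alongside the financial ratios.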

Measurement of Backscattering Coefficients of Rice Canopy Using a Ground Polarimetric Scatterometer System (지상관측 레이다 산란계를 이용한 벼 군락의 후방산란계수 측정)

  • Hong, Jin-Young;Kim, Yi-Hyun;Oh, Yi-Sok;Hong, Suk-Young
    • Korean Journal of Remote Sensing
    • /
    • v.23 no.2
    • /
    • pp.145-152
    • /
    • 2007
  • The polarimetric backscattering coefficients of a wet-land rice field, an experimental plot belonging to the National Institute of Agricultural Science and Technology in Suwon, were measured using ground-based polarimetric scatterometers at 1.8 and 5.3 GHz throughout a growth year from the transplanting period to the harvest period (May to October 2006). The polarimetric scatterometers consist of a vector network analyzer with a time-gating function and a polarimetric antenna set, and were calibrated to obtain VV-, HV-, VH-, and HH-polarized backscattering coefficients from the measurements, based on a single-target calibration technique using a trihedral corner reflector. The polarimetric backscattering coefficients were measured at incidence angles of 30°, 40°, 50°, and 60°, with 30 independent samples for each incidence angle at each frequency. During the measurement periods, ground truth data including fresh and dry biomass, plant height, stem density, leaf area, specific leaf area, and moisture contents were also collected for each measurement. The temporal variations of the measured backscattering coefficients, as well as the measured plant height, LAI (leaf area index), and biomass, were analyzed. Then, the measured polarimetric backscattering coefficients were compared with the rice growth parameters. The measured plant height increases monotonically, while the measured LAI increases only until the ripening period and decreases thereafter. The measured backscattering coefficients were fitted with polynomial expressions as functions of growth age, LAI, and plant height for each polarization, frequency, and incidence angle. At larger incidence angles, the correlation of the L-band signatures with rice growth was higher than that of the C-band signatures. It was found that the HH-polarized backscattering coefficients are more sensitive than the VV-polarized ones to growth age and the other input parameters. 
It is necessary to divide the data according to the growth stages that mark qualitative changes in growth, such as panicle initiation, flowering, or heading, in order to derive functions for estimating rice growth.
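The polynomial fitting step mentioned above (backscattering coefficient as a function of growth age, LAI, or plant height) can be sketched as an ordinary least-squares quadratic fit via the normal equations. The sample points below are illustrative only, not the paper's measurements.

```python
def polyfit2(xs, ys):
    """Least-squares quadratic fit y = c0 + c1*x + c2*x**2 via the 3x3
    normal equations, solved by Gaussian elimination with partial pivoting."""
    # Power sums S[k] = sum(x**k) build the normal-equation matrix A c = b
    S = [sum(x ** k for x in xs) for k in range(5)]
    A = [[S[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    c = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back-substitution
        c[i] = (b[i] - sum(A[i][j] * c[j] for j in range(i + 1, 3))) / A[i][i]
    return c

# Illustrative data lying exactly on y = 1 + x**2
coeffs = polyfit2([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 5.0, 10.0])
```

In practice one such fit would be computed per polarization, frequency, and incidence-angle combination, possibly split by growth stage as the abstract suggests.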

A study of Artificial Intelligence (AI) Speaker's Development Process in Terms of Social Constructivism: Focused on the Products and Periodic Co-revolution Process (인공지능(AI) 스피커에 대한 사회구성 차원의 발달과정 연구: 제품과 시기별 공진화 과정을 중심으로)

  • Cha, Hyeon-ju;Kweon, Sang-hee
    • Journal of Internet Computing and Services
    • /
    • v.22 no.1
    • /
    • pp.109-135
    • /
    • 2021
  • This study classified the development process of artificial intelligence (AI) speakers through analysis of news texts on AI speakers in traditional news reports, and identified the characteristics of each product by period. The theoretical background used in the analysis is news frames and topic frames. Topic modeling using the LDA method and semantic network analysis were used as analysis methods, within an overall content analysis approach. First, 2,710 news articles related to AI speakers published from 2014 to 2019 were collected; second, topic frames were analyzed using the NodeXL algorithm. The results of this study are as follows. First, the trend of topic frames by AI speaker provider type differed according to the characteristics of the four operator types (communication service provider, online platform, OS provider, and IT device manufacturer). Specifically, online platform operators (Google, Naver, Amazon, Kakao) exhibited a frame that uses AI speakers as 'search or input devices'. On the other hand, telecommunications operators (SKT, KT) showed prominent frames for IPTV, the parent company's flagship business, and for the 'auxiliary device' role in the telecommunication business. Furthermore, the frame of 'personalization of products and voice service' was remarkable for OS operators (MS, Apple), and the frame for the IT device manufacturer (Samsung) was 'Internet of Things (IoT) integrated intelligence system'. Second, the trend of topic frames by AI speaker development period (by year) showed a tendency to develop around AI technology in the first phase (2014-2016); in the second phase (2017-2018), the frames related to the social interaction between AI technology and users; and in the third phase (2019), there was a shift from AI technology-centered to user-centered frames. 
As a result of QAP analysis, it was found that news frames by business operator and by development period in AI speaker development are socially constituted by determinants of media discourse. The implication of this study is that the evolution of AI speakers was shaped by the characteristics of the parent companies and by a process of co-evolution arising from interactions with users across operators and development periods. These results provide important indicators for predicting the future prospects of AI speakers and for setting directions accordingly.
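The LDA topic modeling used above can be illustrated with a minimal collapsed Gibbs sampler over toy documents. This is a didactic sketch only: the study would use a full-featured implementation (e.g. a library such as gensim), and the four miniature "articles" below are invented for illustration.

```python
import random

def lda_topics(docs, K, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Minimal collapsed Gibbs sampler for LDA; returns top-3 words per topic."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)
    wid = {w: i for i, w in enumerate(vocab)}
    ndk = [[0] * K for _ in docs]      # document-topic counts
    nkw = [[0] * V for _ in range(K)]  # topic-word counts
    nk = [0] * K                       # topic totals
    z = []                             # topic assignment per token
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(K)
            zs.append(k)
            ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
        z.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]  # remove current assignment, then resample
                ndk[d][k] -= 1; nkw[k][wid[w]] -= 1; nk[k] -= 1
                weights = [(ndk[d][j] + alpha) * (nkw[j][wid[w]] + beta) / (nk[j] + V * beta)
                           for j in range(K)]
                r = rng.random() * sum(weights)
                for k, wt in enumerate(weights):
                    r -= wt
                    if r <= 0:
                        break
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][wid[w]] += 1; nk[k] += 1
    return {k: [vocab[v] for v in sorted(range(V), key=lambda v: -nkw[k][v])[:3]]
            for k in range(K)}

topics = lda_topics(
    [["speaker", "voice", "assistant"], ["iptv", "telecom", "speaker"],
     ["voice", "assistant", "search"], ["telecom", "iptv", "network"]],
    K=2)
```

The per-topic word lists play the role of the "topic frames" the study interprets per operator and per period.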

A Study on the Designer's Post-Evaluation of Gyeongui Line Forest Park Based on Ground Theory - Focused on Yeonnam-dong Section - (근거이론을 활용한 설계자의 경의선숲길공원 사후평가 - 연남동 구간을 중심으로 -)

  • Kim, Eun-Young;Hong, Youn-Soon
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.47 no.3
    • /
    • pp.39-48
    • /
    • 2019
  • This research is based on the analysis of in-depth interviews with the designers who participated in the design of the Yeonnam-dong section, which was completed in 2016. The case study site has received many domestic and foreign awards and is receiving very positive reviews from actual users. Fifty-three concepts were derived from the open coding of the grounded theory methodology; these were incorporated into 34 higher categories, which were reintegrated into 18 categories. The six categories of grounded theory were then interpreted as the paradigm, and it was determined that 'will of the client', 'work efficiency', 'site resources', and 'field manager's specialty' were the categories that had the greatest positive impact on the park's construction. The key category of this park's construction was interpreted as "a park-construction model with active empathy and communication." The results of the study lead to the following research proposals: first, the need to improve trust between the client and the landscape designer and to improve customary administrative procedures; second, the importance of the input of landscape experts into the park construction process; third, the importance of all efforts to develop the design; fourth, the importance of on-site circular resources and landscape preservation; and fifth, the importance of active social participation to increase opportunities. In contrast to conventional quantitative post-evaluation, this study seeks to grasp the facts behind the construction of a park that received excellent internal and external evaluations, and offers a qualitative, objective, and structural interpretation of the social network related to the park's construction. It is expected that landscape-related administration and systems will be further improved through the continuation of in-depth post-evaluation studies.

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.175-197
    • /
    • 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis is being actively conducted, showing remarkable results in various fields such as classification, summarization, and generation. Among the various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary classification with one label among two classes, multi-class classification with one label among several classes, and multi-label classification with multiple labels among several classes. In particular, multi-label classification requires a different training method from binary and multi-class classification because of the characteristic of having multiple labels. In addition, since the number of labels to be predicted increases as the number of labels and classes increases, performance improvement is difficult due to the increased prediction difficulty. To overcome these limitations, research on label embedding is being actively conducted, in which (i) the initially given high-dimensional label space is compressed into a low-dimensional latent label space, (ii) training is performed to predict the compressed label, and (iii) the predicted label is restored to the high-dimensional original label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, since these techniques consider only linear relationships between labels or compress the labels by random transformation, they have difficulty capturing the non-linear relationships between labels and thus cannot create a latent label space that sufficiently contains the information of the original label space. 
Recently, there have been increasing attempts to improve performance by applying deep learning technology to label embedding. Label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration, is representative. However, traditional autoencoder-based label embedding has a limitation in that a large amount of information loss occurs when compressing a high-dimensional label space with a myriad of classes into a low-dimensional latent label space. This is related to the vanishing gradient problem that occurs during backpropagation. To solve this problem, the skip connection was devised: by adding a layer's input to its output, gradient loss during backpropagation is prevented, so efficient learning is possible even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies using skip connections in autoencoders or in the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. In addition, the proposed methodology was applied to actual paper keywords to derive a high-dimensional keyword label space and a low-dimensional latent label space. Using this, we conducted an experiment to predict the compressed keyword vector in the latent label space from the paper abstract and to evaluate the multi-label classification by restoring the predicted keyword vector back to the original label space. As a result, the accuracy, precision, recall, and F1 score used as performance indicators showed far superior performance for multi-label classification based on the proposed methodology compared to traditional multi-label classification methods. 
This shows that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of the multi-label classification itself. In addition, the utility of the proposed methodology was confirmed by comparing its performance across domain characteristics and across the number of dimensions of the latent label space.
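The shape of a skip-connected label-embedding autoencoder can be sketched as a forward pass in plain Python. This is only an architectural sketch under assumed dimensions (an 8-dim label space, a 3-dim latent space, one residual block per side); the study's actual model is trained with backpropagation in a deep learning framework.

```python
import math
import random

def dense(x, W, b):
    """Fully connected layer; W has shape (out_dim, in_dim)."""
    return [sum(w * xi for w, xi in zip(row, x)) + bb for row, bb in zip(W, b)]

def relu(v):
    return [max(0.0, u) for u in v]

def residual(x, W, b):
    """Skip connection: the layer's input is added to its (ReLU) output."""
    return [h + xi for h, xi in zip(relu(dense(x, W, b)), x)]

def init(out_dim, in_dim, rng):
    return ([[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)],
            [0.0] * out_dim)

def autoencode(label_vec, p):
    """Encoder compresses an 8-dim label vector to a 3-dim latent code;
    the decoder restores it; both sides contain one residual (skip) block."""
    z = dense(residual(label_vec, *p["enc_res"]), *p["enc_out"])  # latent label
    g = residual(relu(dense(z, *p["dec_in"])), *p["dec_res"])     # back to 8 dims
    recon = [1.0 / (1.0 + math.exp(-u)) for u in dense(g, *p["dec_out"])]
    return z, recon

rng = random.Random(0)
params = {"enc_res": init(8, 8, rng), "enc_out": init(3, 8, rng),
          "dec_in": init(8, 3, rng), "dec_res": init(8, 8, rng),
          "dec_out": init(8, 8, rng)}
latent, recon = autoencode([1, 0, 0, 1, 0, 1, 0, 0], params)
```

Because the residual blocks keep their input and output widths equal, the element-wise addition is well defined, which is the structural constraint the skip connection imposes on the encoder and decoder.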

Feasibility of Tax Increase in Korean Welfare State via Estimation of Optimal Tax burden Ratio (적정조세부담률 추정을 통한 한국 복지국가 증세가능성에 관한 연구)

  • Kim, SeongWook
    • 한국사회정책
    • /
    • v.20 no.3
    • /
    • pp.77-115
    • /
    • 2013
  • The purpose of this study is to present empirical evidence for the discussion of financing social welfare via estimating the optimal tax burden in the main member countries of the OECD, using the Hausman-Taylor method to account for the endogeneity of explanatory variables. The author also produced an international tax comparison (ITC) index reflecting theoretical hypotheses on the revenue-expenditure nexus within a model, in order to compare real tax burdens across countries and to examine the feasibility of a tax increase in Korea. As a result of the analysis, the higher the level of tax burden, the higher the level of welfare expenditure, indicating a connection between high burden and high welfare in terms of scale. The results also indicated that the subject countries recently entered a state of low tax burden. Meanwhile, Korea had maintained a low burden until the late 1990s, but the tax burden soared after the financial crisis related to the IMF. However, due to the impact of the foreign economy and tax reduction policy, it re-entered the low-burden state after 2009. On the other hand, the degree to which social welfare expenditure reduces the tax burden has gradually increased since the crisis. In this context, the current optimal tax burden ratio of Korea as of 2010 may be 25.8%~26.5% of GDP based on the input of welfare expenditure variables. At this level, Korea was found to be a 'high tax burden-low ITC' country for which a tax increase of 0.7~1.4%p may be feasible, and the probability of a successful tax-increase reform might be higher than in other countries. However, increasing social security contributions or consumption tax was analyzed to be improper from the aspect of managing finance when compared to increases in other tax items, considering their relatively higher ITC. 
There may be room for a tax increase, but an increase is not necessarily required; the optimal tax burden ratio should be understood as the level achievable on average relative to other nations, not as the "proper" level. Thus, discussion of tax increases should be accompanied by a comprehensive understanding of differences in economic development models across nations and of the institutional and historical attributes embedded in specific tax mixes.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.237-262
    • /
    • 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and the sentence classification task on the training set was performed by manually reading each sentence. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the machine learning model calculated the percentages of positive, neutral, and negative sentences in each prospectus. To predict the price movement of an IPO stock, four different machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use both the quantitative variables and the prospectus tone variables show higher accuracy than models that use only the quantitative variables. More specifically, the prediction accuracy was improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. After testing the performance of these machine learning techniques, the Artificial Neural Network model using both quantitative variables and prospectus tone variables was the model with the highest prediction accuracy, at 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify the statistically significant difference between the models. The model using only quantitative variables and the model using both the quantitative variables and the prospectus tone variables were compared, and it was confirmed that the predictive performance improved significantly at the 1% significance level.
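The TF-IDF weighting at the heart of the sentence representation can be sketched in pure Python. The toy token lists below are invented stand-ins for tokenized Risk Factors sentences; the study feeds vectors like these to an SVM tone classifier, and the smoothed idf variant used here (idf = ln(N/df) + 1) is one common convention, not necessarily the paper's exact formula.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: list of token lists. Returns one sparse TF-IDF dict per document,
    with tf = term count / doc length and idf = ln(N / df) + 1 (smoothed)."""
    df = Counter()
    for d in docs:
        df.update(set(d))           # document frequency counts each doc once
    n = len(docs)
    idf = {w: math.log(n / df[w]) + 1.0 for w in df}
    return [{w: (c / len(d)) * idf[w] for w, c in Counter(d).items()} for d in docs]

vecs = tfidf([
    ["good", "growth"],   # e.g. a positive risk-factor sentence
    ["bad", "loss"],      # negative
    ["good", "profit"],   # positive
])
```

Within the first document, the rare term "growth" receives a higher weight than the more common term "good", which is exactly the discriminative effect TF-IDF is meant to provide.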

Assessment of Region Specific Angstrom-Prescott Coefficients on Uncertainties of Crop Yield Estimates using CERES-Rice Model (작물모형 입력자료용 일사량 추정을 위한 지역 특이적 AP 계수 평가)

  • Young Sang, Joh;Jaemin, Jung;Shinwoo, Hyun;Kwang Soo, Kim
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.24 no.4
    • /
    • pp.256-266
    • /
    • 2022
  • Empirical models, including the Angstrom-Prescott (AP) model, have been used to estimate solar radiation at sites, supporting the wide use of crop models. The objective of this study was to produce two sets of solar radiation estimates using the AP coefficients derived for a climate zone (APFrere) and for specific sites (APChoi), respectively. Daily solar radiation was estimated at 18 sites in Korea where long-term measurements of solar radiation were available. In the present study, daily solar radiation and sunshine duration were collected for the period from 2012 to 2021. Daily weather data, including maximum and minimum temperatures and rainfall, were also obtained to prepare input data for a process-based crop model, the CERES-Rice model included in the Decision Support System for Agrotechnology Transfer (DSSAT). It was found that the daily solar radiation estimates using the climate-zone-specific coefficients, SFrere, had significantly less error than those using the site-specific coefficients, SChoi (p<0.05). The cumulative values of SFrere for the period from March to September also had less error than those of SChoi at 55% of the study sites. Still, using SFrere and SChoi as inputs to the CERES-Rice model resulted in only slight, statistically non-significant differences in the outcomes of the crop growth simulations. These results suggest that the AP coefficients for the temperate climate zone would be preferable for the estimation of solar radiation. This merits further evaluation studies comparing the AP model with more sophisticated approaches, such as models based on satellite data.
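The AP model itself is a one-line empirical relation between relative sunshine duration and solar radiation. The sketch below uses the widely cited FAO-56 default coefficients (a=0.25, b=0.50) for illustration; these are assumptions here, not the Frère or Choi coefficient sets evaluated in the study.

```python
def angstrom_prescott(Ra, n, N, a=0.25, b=0.50):
    """Estimate daily global solar radiation Rs (MJ m-2 day-1) from
    extraterrestrial radiation Ra (MJ m-2 day-1), actual sunshine duration
    n (h), and maximum possible sunshine duration N (h):
        Rs = (a + b * n / N) * Ra
    Defaults a=0.25, b=0.50 are the common FAO-56 fallback values."""
    return (a + b * n / N) * Ra

# Half of the possible sunshine hours -> Rs = (0.25 + 0.25) * Ra
rs = angstrom_prescott(Ra=30.0, n=6.0, N=12.0)
```

Swapping in calibrated (a, b) pairs per climate zone or per site is precisely what distinguishes the APFrere and APChoi estimates compared in the study.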

Estimation of ecological flow and fish habitats for Andong Dam downstream reach using 1-D and 2-D physical habitat models (1차원 및 2차원 물리서식처 모형을 활용한 안동댐 하류 하천의 환경생태유량 및 어류서식처 추정)

  • Kim, Yongwon;Lee, Jiwan;Woo, Soyoung;Kim, Soohong;Lee, Jongjin;Kim, Seongjoon
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.12
    • /
    • pp.1041-1052
    • /
    • 2022
  • This study estimates the optimal ecological flow and analyzes the spatial distribution of fish habitat for the Andong Dam downstream reach (4,565.7 km2) using PHABSIM (Physical Habitat Simulation System) and River2D. To establish the habitat models, cross-section information and hydraulic input data were collected using the Nakdong River basic plan report. The model domain of PHABSIM was set to about 410.0 m from the Gudam streamflow gauging station (GD), and that of River2D to about 6.0 km including GD. To select the representative fish species and construct the HSI (Habitat Suitability Index), a fish survey was performed at Pungji bridge, which well represents the physical characteristics of the target stream located downstream of GD. As a result of the fish survey, Zacco platypus showed high relative abundance and was selected as the representative fish species, and the HSI was constructed using the physical habitat characteristics of Zacco platypus. The optimal range of the HSI was 0.3~0.5 m/s for the velocity suitability index and 0.4~0.6 m for the depth suitability index, and the substrate was sand to fine gravel. As a result of estimating the optimal ecological flow by applying the HSI to PHABSIM, the optimal ecological flow for the target stream was 20.0 m3/sec. In the two-dimensional spatial analysis of fish habitat using River2D, the WUA (Weighted Usable Area) was estimated at 107,392.0 m2/1000 m under the ecological flow condition, showing that fish habitat was secured throughout the target stream compared with the Q355 condition.
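The WUA computation underlying this kind of habitat analysis can be sketched as follows. This is a generic sketch of the standard product-form composite suitability used in PHABSIM/River2D-style studies; the two cells and their suitability values below are invented for illustration.

```python
def composite_suitability(v_si, d_si, s_si):
    """Composite suitability of a computational cell as the product of the
    velocity, depth, and substrate suitability indices (each in [0, 1])."""
    return v_si * d_si * s_si

def weighted_usable_area(cells):
    """WUA (m2) = sum over cells of (cell area x composite suitability).
    cells: iterable of (area_m2, velocity_SI, depth_SI, substrate_SI)."""
    return sum(a * composite_suitability(v, d, s) for a, v, d, s in cells)

# Two illustrative cells: one fully suitable, one only partially suitable.
wua = weighted_usable_area([(10.0, 1.0, 1.0, 1.0), (20.0, 0.5, 0.5, 1.0)])
```

Evaluating WUA over the hydraulic solution for a range of discharges, and picking the discharge that maximizes it, is how an optimal ecological flow such as the 20.0 m3/sec above is identified.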

Improvement of turbid water prediction accuracy using sensor-based monitoring data in Imha Dam reservoir (센서 기반 모니터링 자료를 활용한 임하댐 저수지 탁수 예측 정확도 개선)

  • Kim, Jongmin;Lee, Sang Ung;Kwon, Siyoon;Chung, Se Woong;Kim, Young Do
    • Journal of Korea Water Resources Association
    • /
    • v.55 no.11
    • /
    • pp.931-939
    • /
    • 2022
  • In Korea, about two-thirds of the precipitation is concentrated in the summer season, so the severity of the turbidity problem in the summer flood season varies from year to year. Concentrated rainfall due to abnormal rainfall and extreme weather is on the rise, and inflows of turbid water cause sudden increases in turbidity in dam reservoirs. In particular, in Korea, where rivers and dam reservoirs supply most of the annual average water consumption, prolonged turbidity problems cause social and environmental problems for agriculture, industry, and aquatic ecosystems in downstream areas. To cope with such problems, research on turbidity modeling is being actively conducted. Flow rate, water temperature, and SS (suspended solids) data are required to model turbid water. To this end, the national measurement network measures SS in rivers and dam reservoirs, but the data resolution is low due to insufficient facilities, and there are unmeasured periods depending on each dam and on weather conditions. Sensors for measuring turbidity include the Optical Backscatter Sensor (OBS) and YSI instruments, and SS can be measured with equipment such as Laser In-Situ Scattering and Transmissometry (LISST). However, such high-tech sensors are limited by the stability of the equipment. Therefore, it is necessary to develop, through analysis of the acquired flow rate, water temperature, SS, and turbidity data, a relational expression to calculate the SS used as input data during unmeasured periods. In this study, the AEM3D model used in the Korea Water Resources Corporation SURIAN system was applied to improve the accuracy of turbidity prediction through a turbidity-SS relationship developed based on measurement data near the dam outlet.
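A turbidity-SS relational expression of the kind described above is often a simple regression fitted to paired sensor measurements. The sketch below fits a linear form SS = a + b x turbidity by ordinary least squares; the paired samples are invented, and the study's actual relationship may take a different functional form.

```python
def fit_linear(turbidity, ss):
    """Ordinary least-squares fit SS = a + b * turbidity from paired samples."""
    n = len(turbidity)
    mx = sum(turbidity) / n
    my = sum(ss) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(turbidity, ss))
         / sum((x - mx) ** 2 for x in turbidity))
    return my - b * mx, b

# Illustrative paired samples: turbidity (NTU) vs. measured SS (mg/L)
a, b = fit_linear([5.0, 20.0, 60.0, 120.0], [8.0, 26.0, 74.0, 146.0])

def ss_from_turbidity(ntu):
    """Estimate SS for input to the reservoir model from a turbidity reading."""
    return a + b * ntu
```

A function like `ss_from_turbidity` would then convert continuous turbidity sensor records into the SS series that the AEM3D model needs as input during periods without direct SS measurements.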