• Title/Summary/Keyword: Prediction Accuracy


Landslide Susceptibility Mapping Using Deep Neural Network and Convolutional Neural Network (Deep Neural Network와 Convolutional Neural Network 모델을 이용한 산사태 취약성 매핑)

  • Gong, Sung-Hyun; Baek, Won-Kyung; Jung, Hyung-Sup
    • Korean Journal of Remote Sensing / v.38 no.6_2 / pp.1723-1735 / 2022
  • Landslides are among the most prevalent natural disasters, threatening both people and property. Because landslides can cause damage at the national level, effective prediction and prevention are essential. Research on producing landslide susceptibility maps with high accuracy is steadily being conducted, and various models have been applied to landslide susceptibility analysis. Pixel-based machine learning models such as frequency ratio models, logistic regression models, ensemble models, and artificial neural networks have mainly been applied. Recent studies have shown that the kernel-based convolutional neural network (CNN) technique is effective and that the spatial characteristics of the input data have a significant effect on the accuracy of landslide susceptibility mapping. For this reason, the purpose of this study is to analyze landslide susceptibility using a pixel-based deep neural network model and a patch-based convolutional neural network model. The research area was set in Gangwon-do, including Inje, Gangneung, and Pyeongchang, where landslides have occurred frequently and caused damage. The landslide-related factors used were slope, curvature, stream power index (SPI), topographic wetness index (TWI), topographic position index (TPI), timber diameter, timber age, lithology, land use, soil depth, soil parent material, lineament density, fault density, normalized difference vegetation index (NDVI), and normalized difference water index (NDWI). These factors were built into a spatial database through data preprocessing, and landslide susceptibility maps were predicted using the deep neural network (DNN) and CNN models. The models and landslide susceptibility maps were verified using average precision (AP) and root mean square error (RMSE); as a result, the patch-based CNN model showed a 3.4% performance improvement over the pixel-based DNN model. The results of this study can be used to predict landslides and are expected to serve as a scientific basis for establishing land use and landslide management policies.
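
A minimal sketch, not the authors' code, contrasting the two approaches the abstract describes: a pixel-based DNN that classifies each cell from its 15 factor values alone, and a patch-based CNN that also sees the surrounding neighborhood. The patch size, layer widths, and toy data are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

N_FACTORS = 15          # slope, curvature, SPI, TWI, TPI, NDVI, NDWI, etc.
PATCH = 16              # assumed neighborhood size for the CNN input

def build_pixel_dnn():
    """Pixel-based model: one factor vector in, landslide probability out."""
    return models.Sequential([
        layers.Input(shape=(N_FACTORS,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])

def build_patch_cnn():
    """Patch-based model: a PATCH x PATCH window of factor layers in."""
    return models.Sequential([
        layers.Input(shape=(PATCH, PATCH, N_FACTORS)),
        layers.Conv2D(32, 3, activation="relu", padding="same"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu", padding="same"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(1, activation="sigmoid"),
    ])

# Toy data stands in for the preprocessed spatial database.
X_pix = np.random.rand(1000, N_FACTORS).astype("float32")
X_patch = np.random.rand(1000, PATCH, PATCH, N_FACTORS).astype("float32")
y = np.random.randint(0, 2, size=(1000,)).astype("float32")

for name, model, X in [("pixel-based DNN", build_pixel_dnn(), X_pix),
                       ("patch-based CNN", build_patch_cnn(), X_patch)]:
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC(curve="PR", name="ap_proxy")])
    model.fit(X, y, epochs=1, batch_size=32, verbose=0)
    print(name, "trained")
```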

Study on Tourism Demand Forecast and Influencing Factors in Busan Metropolitan City (부산 연안도시 관광수요 예측과 영향요인에 관한 연구)

  • Kyu Won Hwang; Sung Mo Nam; Ah Reum Jang; Moon Suk Lee
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.7 / pp.915-929 / 2023
  • Improvements in quality of life, diversification of leisure activities, and changes in population structure have led to an increase in tourism demand and a diversification of tourism activities. In particular, for coastal cities where land and marine tourism elements coexist, various factors influence tourism demand. Tourism requires infrastructure construction and content development according to the demand at the tourist destination. This study aims to improve prediction accuracy and explore influencing factors through time series analysis of tourism scale using agent-based data. Basic local governments in the Busan area were examined, and the data used were the monthly number of tourists and the monthly amount of tourism consumption. Univariate time series analysis, a deterministic model, was used together with SARIMAX analysis to identify the influencing factors. Tourism consumption propensity, measured as the consumption amount by business type and the number of mentions on SNS, was set as the influencing factor. The difference in accuracy (in terms of RMSE) between the time series models that did and did not consider COVID-19 was very wide, ranging from 1.8 times to 32.7 times depending on the region. In addition, when the influencing factors were considered, the tourism consumption business type and SNS trends were found to significantly affect the number of tourists and the amount of tourism consumption. Therefore, to predict future demand, external influences as well as tourists' consumption tendencies and interest in local tourism must be considered. This study aimed to predict future tourism demand in a coastal city such as Busan and to identify factors affecting tourism scale, thereby contributing to policy decision-making that prepares for tourism demand in consideration of government tourism policies and tourism trends.
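
The sketch below illustrates the modeling idea under stated assumptions, not the study's actual code: a seasonal ARIMA of monthly visitor counts with exogenous regressors standing in for consumption by business type, SNS mention volume, and a COVID-19 period dummy. The column names, (S)ARIMA orders, and simulated data are assumptions.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
idx = pd.date_range("2017-01-01", periods=72, freq="MS")   # monthly data
df = pd.DataFrame({
    "visitors": rng.poisson(200_000, 72),
    "spend_food": rng.gamma(5.0, 1e8, 72),        # consumption by business type
    "sns_mentions": rng.poisson(5_000, 72),       # SNS mention volume
    "covid": (idx >= "2020-02-01").astype(int),   # pandemic-period dummy
}, index=idx)

exog_cols = ["spend_food", "sns_mentions", "covid"]
train, test = df.iloc[:60], df.iloc[60:]

model = SARIMAX(train["visitors"], exog=train[exog_cols],
                order=(1, 1, 1),                  # assumed non-seasonal order
                seasonal_order=(1, 1, 1, 12))     # yearly seasonality in monthly data
res = model.fit(disp=False)

forecast = res.forecast(steps=len(test), exog=test[exog_cols])
rmse = float(np.sqrt(((test["visitors"] - forecast) ** 2).mean()))
print(res.params.filter(exog_cols))               # estimated effect of each regressor
print("holdout RMSE:", round(rmse, 1))
```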

Association between Texture Analysis Parameters and Molecular Biologic KRAS Mutation in Non-Mucinous Rectal Cancer (원발성 비점액성 직장암 환자에서 자기공명영상 기반 텍스처 분석 변수와 KRAS 유전자 변이와의 연관성)

  • Sung Jae Jo; Seung Ho Kim; Sang Joon Park; Yedaun Lee; Jung Hee Son
    • Journal of the Korean Society of Radiology / v.82 no.2 / pp.406-416 / 2021
  • Purpose To evaluate the association between magnetic resonance imaging (MRI)-based texture parameters and Kirsten rat sarcoma viral oncogene homolog (KRAS) mutation in patients with non-mucinous rectal cancer. Materials and Methods Seventy-nine patients who had pathologically confirmed rectal non-mucinous adenocarcinoma with or without KRAS mutation and had undergone rectal MRI were divided into a training dataset (n = 46) and a validation dataset (n = 33). Texture analysis was performed on the axial T2-weighted images. The association was analyzed statistically using the Mann-Whitney U test. To extract an optimal cut-off value for the prediction of KRAS mutation, receiver operating characteristic curve analysis was performed. The cut-off value was then verified on the validation dataset. Results In the training dataset, skewness in the mutant group (n = 22) was significantly higher than in the wild-type group (n = 24) (0.221 ± 0.283 vs. -0.006 ± 0.178; p = 0.003). The area under the curve for skewness was 0.757 (95% confidence interval, 0.606 to 0.872), with a maximum accuracy of 71%, a sensitivity of 64%, and a specificity of 78%. None of the other texture parameters were associated with KRAS mutation (p > 0.05). When a cut-off value of 0.078 was applied to the validation dataset, it yielded an accuracy of 76%, a sensitivity of 86%, and a specificity of 68%. Conclusion Skewness was associated with KRAS mutation in patients with non-mucinous rectal cancer.
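
A minimal sketch of the statistical workflow on simulated data rather than the study's measurements: compare skewness between groups with the Mann-Whitney U test, choose a cut-off from the training ROC curve, then apply that fixed cut-off to a validation set. The Youden-index rule for picking the cut-off and all numbers below are assumptions.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.metrics import roc_curve, roc_auc_score, accuracy_score

rng = np.random.default_rng(0)
skew_mutant = rng.normal(0.22, 0.28, 22)     # training set, KRAS-mutant group
skew_wild = rng.normal(-0.01, 0.18, 24)      # training set, wild-type group

u_stat, p_value = mannwhitneyu(skew_mutant, skew_wild, alternative="two-sided")
print(f"Mann-Whitney U p-value: {p_value:.4f}")

y_train = np.r_[np.ones(len(skew_mutant)), np.zeros(len(skew_wild))]
x_train = np.r_[skew_mutant, skew_wild]
print("training AUC:", round(roc_auc_score(y_train, x_train), 3))

# Pick the cut-off that maximizes sensitivity + specificity - 1 (Youden index).
fpr, tpr, thresholds = roc_curve(y_train, x_train)
cutoff = thresholds[np.argmax(tpr - fpr)]
print("chosen skewness cut-off:", round(float(cutoff), 3))

# Verify the fixed cut-off on an independent (here simulated) validation set.
y_val = rng.integers(0, 2, 33)
x_val = rng.normal(0.22, 0.28, 33) * y_val + rng.normal(-0.01, 0.18, 33) * (1 - y_val)
print("validation accuracy:", round(accuracy_score(y_val, x_val >= cutoff), 2))
```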

Safety and Efficacy of Ultrasound-Guided Percutaneous Core Needle Biopsy of Pancreatic and Peripancreatic Lesions Adjacent to Critical Vessels (주요 혈관 근처의 췌장 또는 췌장 주위 병변에 대한 초음파 유도하 경피적 중심 바늘 생검의 안전성과 효율성)

  • Sun Hwa Chung; Hyun Ji Kang; Hyo Jeong Lee; Jin Sil Kim; Jeong Kyong Lee
    • Journal of the Korean Society of Radiology / v.82 no.5 / pp.1207-1217 / 2021
  • Purpose To evaluate the safety and efficacy of ultrasound-guided percutaneous core needle biopsy (USPCB) of pancreatic and peripancreatic lesions adjacent to critical vessels. Materials and Methods Data were collected retrospectively from 162 patients who underwent USPCB of the pancreas (n = 98), the peripancreatic area adjacent to the portal vein and the paraaortic area adjacent to the pancreatic uncinate process (n = 34), and lesions in the third duodenal portion (n = 30) during a 10-year period. An automated biopsy gun with an 18-gauge needle was used for the biopsies under US guidance. The USPCB results were compared with those of the final follow-up imaging performed postoperatively. The diagnostic accuracy and major complication rate of USPCB were calculated. Multiple factors were evaluated for the prediction of successful biopsies using univariate and multivariate analyses. Results The histopathologic diagnosis from USPCB was correct in 149 (92%) patients. The major complication rate was 3%: four mesenteric hematomas and one intramural hematoma of the duodenum occurred during the study period. The following factors were significantly associated with successful biopsies: a transmesenteric rather than a transgastric or transenteric biopsy route, good visualization of the target, and evaluation of the entire US pathway. In addition, fewer biopsy passes were required when the biopsy was successful. Conclusion USPCB demonstrated high diagnostic accuracy and a low complication rate for the histopathologic diagnosis of pancreatic and peripancreatic lesions adjacent to critical vessels.
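
As a rough illustration of the univariate and multivariate analyses mentioned above, the sketch below screens candidate predictors of biopsy success one at a time and then fits a multivariable logistic regression. The variable names and simulated data are hypothetical placeholders, not the study's dataset or method of record.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 162
df = pd.DataFrame({
    "success": rng.integers(0, 2, n),                  # 1 = diagnostic biopsy
    "transmesenteric_route": rng.integers(0, 2, n),
    "good_target_visibility": rng.integers(0, 2, n),
    "entire_path_evaluated": rng.integers(0, 2, n),
    "n_needle_passes": rng.integers(1, 5, n),
})

# Univariate screening: one candidate predictor per model.
for col in df.columns.drop("success"):
    m = sm.Logit(df["success"], sm.add_constant(df[[col]])).fit(disp=0)
    print(f"{col}: OR={np.exp(m.params[col]):.2f}, p={m.pvalues[col]:.3f}")

# Multivariable model with all candidates entered together.
X = sm.add_constant(df.drop(columns="success"))
multivariable = sm.Logit(df["success"], X).fit(disp=0)
print(multivariable.summary2().tables[1][["Coef.", "P>|z|"]])
```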

The Classification System and Information Service for Establishing a National Collaborative R&D Strategy in Infectious Diseases: Focusing on the Classification Model for Overseas Coronavirus R&D Projects (국가 감염병 공동R&D전략 수립을 위한 분류체계 및 정보서비스에 대한 연구: 해외 코로나바이러스 R&D과제의 분류모델을 중심으로)

  • Lee, Doyeon; Lee, Jae-Seong; Jun, Seung-pyo; Kim, Keun-Hwan
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.127-147 / 2020
  • The world is suffering numerous human and economic losses due to the novel coronavirus infection (COVID-19). The Korean government has established a strategy to overcome the national infectious disease crisis through research and development. It is difficult to find distinctive features of, and changes in, a specific R&D field when using the existing technical classification or the science and technology standard classification. Recently, a few studies have attempted to establish a classification system that provides information about Korea's investment research areas in infectious diseases through a comparative analysis of Korean government-funded research projects. However, these studies did not provide the information needed to establish cooperative research strategies among countries in the infectious disease field, which is required as an execution plan for achieving the goals of national health security and fostering new growth industries. It is therefore necessary to study information services based on a classification system and a classification model for establishing a national collaborative R&D strategy. Seven classification categories - Diagnosis_biomarker, Drug_discovery, Epidemiology, Evaluation_validation, Mechanism_signaling pathway, Prediction, and Vaccine_therapeutic antibody - were derived by reviewing South Korea's government-funded research projects related to infectious diseases. A classification model was trained by combining Scopus data with a bidirectional RNN model, and the final model achieved robust classification performance with an accuracy of over 90%. For the empirical study, the infectious disease classification system was applied to the coronavirus-related research and development projects of major countries, drawn from STAR Metrics (National Institutes of Health) and the NSF (National Science Foundation) of the United States (US), CORDIS (Community Research & Development Information Service) of the European Union (EU), and KAKEN (Database of Grants-in-Aid for Scientific Research) of Japan. The coronavirus R&D of these major countries is mostly concentrated in the Prediction category, which covers predicting success in clinical trials at the new drug development stage or predicting toxicity that causes side effects. An intriguing result is that, for all of these nations, the share of national investment in the Vaccine_therapeutic antibody category, the area aimed at developing vaccines and treatments, was very small (5.1%), which indirectly explains the slow development of vaccines and treatments. A comparative analysis of coronavirus-related research investment by country showed that the US and Japan invest relatively evenly across all infectious disease research areas, while Europe invests relatively heavily in specific areas such as Diagnosis_biomarker. Moreover, the classification system provided information on the major coronavirus-related research organizations in these countries, thereby supporting the establishment of international collaborative R&D projects.
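
A minimal sketch, with assumptions throughout, of a bidirectional RNN text classifier over the seven categories named above. The vocabulary size, sequence length, layer sizes, and the two toy training texts are placeholders; the paper's model was trained on Scopus records.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

CLASSES = ["Diagnosis_biomarker", "Drug_discovery", "Epidemiology",
           "Evaluation_validation", "Mechanism_signaling pathway",
           "Prediction", "Vaccine_therapeutic antibody"]
VOCAB, MAXLEN = 20_000, 200                     # assumed preprocessing choices

texts = ["spike protein neutralizing antibody candidate for treatment",
         "transmission dynamics of the outbreak across regions"]
labels = np.array([CLASSES.index("Vaccine_therapeutic antibody"),
                   CLASSES.index("Epidemiology")])

vectorizer = layers.TextVectorization(max_tokens=VOCAB,
                                      output_sequence_length=MAXLEN)
vectorizer.adapt(texts)
X = vectorizer(tf.constant(texts))              # (batch, MAXLEN) token ids

model = models.Sequential([
    layers.Input(shape=(MAXLEN,), dtype="int64"),
    layers.Embedding(VOCAB, 128, mask_zero=True),
    layers.Bidirectional(layers.LSTM(64)),      # the bidirectional RNN layer
    layers.Dense(len(CLASSES), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, labels, epochs=1, verbose=0)       # toy fit; real training uses Scopus data

query = vectorizer(tf.constant(["phase 1 vaccine trial immunogenicity results"]))
print("predicted category:", CLASSES[int(np.argmax(model.predict(query, verbose=0)))])
```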

Predicting the Direction of the Stock Index by Using a Domain-Specific Sentiment Dictionary (주가지수 방향성 예측을 위한 주제지향 감성사전 구축 방안)

  • Yu, Eunji; Kim, Yoosin; Kim, Namgyu; Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.19 no.1 / pp.95-110 / 2013
  • Recently, the amount of unstructured data being generated through a variety of social media has been increasing rapidly, resulting in the increasing need to collect, store, search for, analyze, and visualize this data. This kind of data cannot be handled appropriately by using the traditional methodologies usually used for analyzing structured data because of its vast volume and unstructured nature. In this situation, many attempts are being made to analyze unstructured data such as text files and log files through various commercial or noncommercial analytical tools. Among the various contemporary issues dealt with in the literature of unstructured text data analysis, the concepts and techniques of opinion mining have been attracting much attention from pioneer researchers and business practitioners. Opinion mining or sentiment analysis refers to a series of processes that analyze participants' opinions, sentiments, evaluations, attitudes, and emotions about selected products, services, organizations, social issues, and so on. In other words, many attempts based on various opinion mining techniques are being made to resolve complicated issues that could not have otherwise been solved by existing traditional approaches. One of the most representative attempts using the opinion mining technique may be the recent research that proposed an intelligent model for predicting the direction of the stock index. This model works mainly on the basis of opinions extracted from an overwhelming number of economic news reports. News content published on various media is obviously a traditional example of unstructured text data. Every day, a large volume of new content is created, digitalized, and subsequently distributed to us via online or offline channels. Many studies have revealed that we make better decisions on political, economic, and social issues by analyzing news and other related information. In this sense, we expect to predict the fluctuation of stock markets partly by analyzing the relationship between economic news reports and the pattern of stock prices. So far, in the literature on opinion mining, most studies, including ours, have utilized a sentiment dictionary to elicit sentiment polarity or sentiment value from a large number of documents. A sentiment dictionary consists of pairs of selected words and their sentiment values. Sentiment classifiers refer to the dictionary to formulate the sentiment polarity of words, sentences in a document, and the whole document. However, most traditional approaches have a common limitation in that they do not consider the flexibility of sentiment polarity; that is, the sentiment polarity or sentiment value of a word is fixed and cannot be changed in a traditional sentiment dictionary. In the real world, however, the sentiment polarity of a word can vary depending on the time, situation, and purpose of the analysis. It can also be contradictory in nature. The flexibility of sentiment polarity motivated us to conduct this study. In this paper, we have stated that sentiment polarity should be assigned, not merely on the basis of the inherent meaning of a word but on the basis of its ad hoc meaning within a particular context. To implement our idea, we presented an intelligent investment decision-support model based on opinion mining that performs the scraping and parsing of massive volumes of economic news on the web, tags sentiment words, classifies the sentiment polarity of the news, and finally predicts the direction of the next day's stock index.
In addition, we applied a domain-specific sentiment dictionary instead of a general-purpose one to classify each piece of news as either positive or negative. For the purpose of performance evaluation, we performed intensive experiments and investigated the prediction accuracy of our model. For the experiments to predict the direction of the stock index, we gathered and analyzed 1,072 articles about stock markets published by "M" and "E" media between July 2011 and September 2011.
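
A toy sketch of the dictionary-based scoring step described above: each article is scored with a domain-specific sentiment dictionary in which a word's polarity reflects the stock-market context, daily net sentiment is aggregated, and its sign is read as the predicted direction of the next day's index. The dictionary entries and articles are invented for illustration and are not the paper's data.

```python
from collections import defaultdict

# In a general-purpose dictionary these words might be neutral jargon; in a
# domain-specific dictionary they carry explicit market polarity.
DOMAIN_DICT = {"surge": 1, "rally": 1, "beat": 1,
               "plunge": -1, "slump": -1, "downgrade": -1}

articles = [
    ("2011-07-01", "exporters rally as earnings beat forecasts"),
    ("2011-07-01", "brokerage downgrade weighs on builders"),
    ("2011-07-02", "index futures plunge on debt fears"),
]

daily_score = defaultdict(int)
for date, text in articles:
    for token in text.lower().split():
        daily_score[date] += DOMAIN_DICT.get(token, 0)

for date, score in sorted(daily_score.items()):
    direction = "UP" if score > 0 else "DOWN" if score < 0 else "FLAT"
    print(date, f"net sentiment={score:+d}", "-> predicted next-day index:", direction)
```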

A Study on Knowledge Entity Extraction Method for Individual Stocks Based on Neural Tensor Network (뉴럴 텐서 네트워크 기반 주식 개별종목 지식개체명 추출 방법에 관한 연구)

  • Yang, Yunseok; Lee, Hyun Jun; Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.25-38 / 2019
  • Selecting high-quality information that meets users' interests and needs from the flood of available content is becoming ever more important. In this flood of information, efforts are being made to better reflect the user's intention in search results, rather than treating the information request as a simple string. Large IT companies such as Google and Microsoft focus on developing knowledge-based technologies, including search engines, that provide users with satisfaction and convenience. In particular, finance is one of the fields where text data analysis is expected to be useful and to have potential, because new information is constantly generated and the earlier the information, the more valuable it is. Automatic knowledge extraction can be effective in areas such as the financial sector, where the flow of information is vast and new information keeps emerging. However, automatic knowledge extraction faces several practical difficulties. First, it is hard to build corpora from different fields with the same algorithm and to extract high-quality triples. Second, producing human-labeled text data becomes more difficult as the extent and scope of knowledge increase and patterns are constantly updated. Third, performance evaluation is difficult because of the characteristics of unsupervised learning. Finally, defining the problem of automatic knowledge extraction is not easy because of the conceptually ambiguous nature of knowledge. To overcome these limits and improve the semantic performance of stock-related information search, this study attempts to extract knowledge entities using a neural tensor network and to evaluate their performance. Unlike previous work, the purpose of this study is to extract knowledge entities related to individual stock items. Various but relatively simple data processing methods are applied in the presented model to address the problems of previous research and to enhance the model's effectiveness. This study therefore has three significances. First, it presents a practical and simple automatic knowledge extraction method that can be applied directly. Second, it shows that performance can be evaluated through a simple problem definition. Finally, it increases the expressiveness of the knowledge by generating input data on a sentence basis without complex morphological analysis. The results of the empirical analysis and an objective performance evaluation method are also presented. For the empirical study confirming the usefulness of the presented model, analysts' reports on 30 individual stocks, the top 30 items by publication frequency from May 30, 2017 to May 21, 2018, are used. The total number of reports is 5,600; 3,074 reports, about 55% of the total, are designated as the training set, and the remaining 45% as the testing set. Before constructing the model, all training-set reports are grouped by stock, and their entities are extracted using the KKMA named entity recognition tool. For each stock, the top 100 entities by frequency of appearance are selected and vectorized with one-hot encoding. Then, using the neural tensor network, one score function per stock is trained. Thus, when a new entity from the testing set appears, its score can be calculated with every score function, and the stock whose function gives the highest score is predicted as the item related to that entity. To evaluate the presented model, we assess its predictive power and whether the score functions are well constructed by calculating the hit ratio over all reports in the testing set. The presented model achieves a 69.3% hit ratio on the testing set of 2,526 reports, which is meaningfully high despite some constraints on conducting the research. Looking at prediction performance by stock, only three stocks, LG ELECTRONICS, KiaMtr, and Mando, perform far below average, possibly because of interference from other similar items and the generation of new knowledge. In this paper, we propose a methodology for finding the key entities, or combinations of them, needed to retrieve related information in line with the user's investment intention. Graph data are generated using only the named entity recognition tool and fed to the neural tensor network without a domain-specific corpus or word vectors. The empirical test confirms the effectiveness of the presented model as described above. However, some limitations remain; notably, the especially poor performance on only a few stocks indicates the need for further research. Finally, the empirical study confirms that the learning method presented here can be used to semantically match new text information with the related stocks.
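
An illustrative numpy sketch, not the authors' implementation, of the neural tensor network scoring idea: each stock has its own score function over entity vectors, and a new entity is assigned to the stock whose function scores it highest. The single-entity bilinear form, dimensions, and random (untrained) parameters are simplifying assumptions.

```python
import numpy as np

D, K = 100, 4            # one-hot entity dimension (top-100 entities), tensor slices
rng = np.random.default_rng(0)

def make_score_fn():
    """Randomly initialized NTN parameters for one stock (trained in practice)."""
    return {"W": rng.normal(size=(K, D, D)) * 0.01,   # bilinear tensor slices
            "V": rng.normal(size=(K, D)) * 0.01,      # linear term
            "b": np.zeros(K),                         # bias
            "u": rng.normal(size=K)}                  # output weights

def score(params, e):
    """g(e) = u . tanh(e^T W[k] e + V e + b) for a single entity vector e."""
    bilinear = np.einsum("kij,i,j->k", params["W"], e, e)
    return float(params["u"] @ np.tanh(bilinear + params["V"] @ e + params["b"]))

stocks = ["LG ELECTRONICS", "KiaMtr", "Mando"]
score_fns = {s: make_score_fn() for s in stocks}

new_entity = np.zeros(D)
new_entity[17] = 1.0                                  # one-hot encoded test entity
best_stock = max(stocks, key=lambda s: score(score_fns[s], new_entity))
print("entity assigned to:", best_stock)
```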

Business Application of Convolutional Neural Networks for Apparel Classification Using Runway Image (합성곱 신경망의 비지니스 응용: 런웨이 이미지를 사용한 의류 분류를 중심으로)

  • Seo, Yian; Shin, Kyung-shik
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.1-19 / 2018
  • A large amount of data is now available for the research and business sectors to extract knowledge from. This data can take the form of unstructured data such as audio, text, and images and can be analyzed with deep learning methodology. Deep learning is now widely used for various estimation, classification, and prediction problems. In particular, the fashion business adopts deep learning techniques for apparel recognition, apparel search and retrieval engines, and automatic product recommendation. The core model of these applications is image classification using Convolutional Neural Networks (CNN). A CNN is made up of neurons that learn parameters such as weights as inputs pass through the network to the outputs. Its layer structure is well suited to image classification, comprising convolutional layers for generating feature maps, pooling layers for reducing the dimensionality of the feature maps, and fully connected layers for classifying the extracted features. However, most classification models have been trained on online product images, which are taken under controlled conditions, such as images of the apparel itself or of professional models wearing it. Such images may not train the classification model effectively when one wants to classify street fashion or walking images, which are taken in uncontrolled situations and involve people's movement and unexpected poses. We therefore propose to train the model with a runway apparel image dataset, which captures mobility. This allows the classification model to be trained on far more variable data and improves adaptation to diverse query images. To achieve both convergence and generalization of the model, we apply Transfer Learning to our training network. As Transfer Learning in CNN consists of pre-training and fine-tuning stages, we divide training into two steps. First, we pre-train our architecture on a large-scale dataset, the ImageNet dataset, which consists of 1.2 million images in 1,000 categories including animals, plants, activities, materials, instruments, scenes, and foods. We use GoogLeNet as our main architecture, as it achieved high accuracy with efficiency in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Second, we fine-tune the network with our own runway image dataset. Because no publicly available runway dataset existed, we collected 2,426 images of 32 major fashion brands from Google Image Search, including Anna Molinari, Balenciaga, Balmain, Brioni, Burberry, Celine, Chanel, Chloe, Christian Dior, Cividini, Dolce and Gabbana, Emilio Pucci, Ermenegildo, Fendi, Giuliana Teso, Gucci, Issey Miyake, Kenzo, Leonard, Louis Vuitton, Marc Jacobs, Marni, Max Mara, Missoni, Moschino, Ralph Lauren, Roberto Cavalli, Sonia Rykiel, Stella McCartney, Valentino, Versace, and Yves Saint Laurent. We performed 10-fold experiments to account for the random generation of training data, and our proposed model achieved an accuracy of 67.2% on the final test. Our research offers several advantages over related studies: to the best of our knowledge, no previous study has trained a network for apparel image classification on a runway image dataset. We suggest training the model with images capturing all possible postures, which we denote as mobility, by using our own runway apparel image dataset. Moreover, by applying Transfer Learning with the checkpoints and parameters provided by TensorFlow Slim, we reduced the time spent training the classification model to about 6 minutes per experiment. This model can be used in many business applications where the query image may be a runway image, product image, or street fashion image. Specifically, runway query images can support a mobile application service during fashion week to facilitate brand search, street-style query images can be classified during fashion editorial work to label the brand or style, and website query images can be processed by e-commerce services that provide item information or recommend similar items.
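
A hedged sketch of the transfer-learning recipe described above: load a network pre-trained on ImageNet, train a new 32-brand head with the base frozen, then fine-tune the upper layers at a lower learning rate. Keras's InceptionV3 stands in for the GoogLeNet/TF-Slim checkpoint the authors used; the image size, learning rates, and data pipeline are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_BRANDS = 32
base = tf.keras.applications.InceptionV3(include_top=False, weights="imagenet",
                                         input_shape=(299, 299, 3), pooling="avg")
base.trainable = False          # pre-training stage: keep ImageNet features fixed

model = models.Sequential([
    base,
    layers.Dropout(0.5),
    layers.Dense(N_BRANDS, activation="softmax"),   # new runway-brand head
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)  # runway image dataset

# Fine-tuning stage: unfreeze the top of the base at a much lower learning rate.
base.trainable = True
for layer in base.layers[:-30]:
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=...)
```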

A Comparative Study on Failure Prediction Models for Small and Medium Manufacturing Company (중소제조기업의 부실예측모형 비교연구)

  • Hwangbo, Yun; Moon, Jong Geon
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.11 no.3 / pp.1-15 / 2016
  • This study analyzed the prediction capabilities of a multivariate discriminant model, a logistic regression model, and an artificial neural network model based on the financial information of small and medium-sized companies listed on KOSDAQ. Eighty-three companies delisted from 2009 to 2012 and 83 normal companies, 166 firms in total, were sampled for the analysis. The models were trained on 100 companies, comprising 50 delisted and 50 normal firms selected at random from the 166; the remaining 66 companies were used to verify the accuracy of the models. Each model was designed by carrying out T-tests on 79 financial ratios over the last 5 years and identifying 9 significant variables. The T-tests showed that profitability variables were the major predictors of financial risk at an early stage, while stability and cash-flow variables became significant at a later stage of insolvency. When the models' prediction capabilities were compared, the logistic regression model exhibited the highest accuracy on the training data, while the artificial neural network model provided the most accurate results on the test data. This study differs from previous research as follows. First, it considered the time-series aspect, in light of the fact that failure proceeds gradually. Second, while previous studies constructed multivariate discriminant models without regard to normality, this study reviewed the normality of the independent variables and compared the results with those of the other models. The policy implication of this study is that the reliability of disclosure documents is important, because the symptoms of a firm's failure appear in its financial statements. Therefore, institutional arrangements to restrain moral laxity by accounting firms and their staff should be strengthened.
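
A minimal sketch, on simulated data, of the model comparison described: a linear discriminant model, a logistic regression, and a small neural network are trained on the same financial-ratio features and compared on a holdout set. The nine T-test-selected ratios are represented by placeholder columns, and all settings are assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(166, 9))                 # 9 significant financial ratios
y = np.r_[np.ones(83), np.zeros(83)]          # 1 = delisted (failed), 0 = normal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=100,
                                          stratify=y, random_state=0)
candidates = {
    "multivariate discriminant": LinearDiscriminantAnalysis(),
    "logistic regression": LogisticRegression(max_iter=1000),
    "artificial neural network": MLPClassifier(hidden_layer_sizes=(16,),
                                               max_iter=2000, random_state=0),
}
for name, clf in candidates.items():
    pipe = make_pipeline(StandardScaler(), clf)
    pipe.fit(X_tr, y_tr)
    print(f"{name}: train={pipe.score(X_tr, y_tr):.2f}, test={pipe.score(X_te, y_te):.2f}")
```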


Development of a Window Program for Searching CpG Island (CpG Island 검색용 윈도우 프로그램 개발)

  • Kim, Ki-Bong
    • Journal of Life Science / v.18 no.8 / pp.1132-1139 / 2008
  • A CpG island is a short stretch of DNA in which the frequency of the CG dinucleotide is higher than in other regions. CpG islands are present in the promoters and exonic regions of approximately 30~60% of mammalian genes, so they are useful markers for genes in organisms containing 5-methylcytosine in their genomes. Recent evidence supports the notion that hypermethylation of CpG islands, by silencing tumor suppressor genes, plays a major causal role in cancer and has been described in almost every tumor type. In this respect, CpG island searching by computational methods is very helpful for cancer research and for computational promoter and gene prediction. I therefore developed a Windows program (called CpGi) based on the CpG island criteria defined by D. Takai and P. A. Jones. The program CpGi was implemented in Visual C++ 6.0 and can determine the locations of CpG islands using diverse parameters (%GC, Obs(CpG)/Exp(CpG), window size, step size, gap value, number of CpGs, length) specified by the user. The analysis result of CpGi provides a graphical map of CpG islands and a G+C% plot, and more detailed information on each CpG island can be obtained through a pop-up window. Two human contigs, AP00524 (from chromosome 22) and NT_029490.3 (from chromosome 21), were used to compare the accuracy of CpGi's search results with those of two other public programs, Emboss-CpGPlot and CpG Island Searcher, which are web-based CpG island search programs. The comparison showed that CpGi is on a par with or outperforms Emboss-CpGPlot and CpG Island Searcher. With its simple and easy-to-use interface, CpGi should be a very useful tool for genome analysis and CpG island research. To obtain a copy of CpGi for academic use only, contact the corresponding author.
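
A simple sliding-window sketch of Takai-and-Jones-style criteria like those the program implements: report windows whose G+C content and observed/expected CpG ratio exceed the thresholds. The window handling, step size, and lack of island merging are simplifications relative to the actual CpGi program.

```python
def cpg_island_windows(seq, window=200, step=1, gc_min=55.0, oe_min=0.65):
    """Yield (start, end, %GC, Obs/Exp CpG) for windows meeting the criteria."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        win = seq[start:start + window]
        g, c = win.count("G"), win.count("C")
        gc_pct = 100.0 * (g + c) / window
        observed = win.count("CG")
        expected = (g * c) / window            # expected CpG count in the window
        obs_exp = observed / expected if expected else 0.0
        if gc_pct >= gc_min and obs_exp >= oe_min:
            yield start, start + window, round(gc_pct, 1), round(obs_exp, 2)

if __name__ == "__main__":
    # Toy sequence: a CpG-rich stretch followed by an AT-rich stretch.
    test_seq = "CGCGGC" * 100 + "ATATAT" * 100
    for hit in list(cpg_island_windows(test_seq, step=50))[:3]:
        print("candidate island (start, end, %GC, Obs/Exp):", hit)
```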