• Title/Summary/Keyword: Future technology

Search Results: 12,579

Influence analysis of Internet buzz to corporate performance : Individual stock price prediction using sentiment analysis of online news (온라인 언급이 기업 성과에 미치는 영향 분석 : 뉴스 감성분석을 통한 기업별 주가 예측)

  • Jeong, Ji Seon; Kim, Dong Sung; Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.37-51 / 2015
  • Due to the development of internet technology and the rapid growth of internet data, various studies are actively investigating how to use and analyze such data for different purposes. In particular, a number of recent studies have applied text mining techniques in order to overcome the limitations of analyses restricted to structured data. Among them, many studies concern sentiment analysis, which scores opinions based on the distribution of polarity, such as the positivity or negativity of the vocabulary or sentences in documents. As part of this line of research, this study attempts to predict the ups and downs of companies' stock prices by performing sentiment analysis on online news about those companies. A variety of news on companies is produced online by different economic agents, and it diffuses quickly and is easily accessed on the Internet. Thus, based on the inefficient market hypothesis, we can expect that news about an individual company can be used to predict fluctuations in that company's stock price if proper data analysis techniques are applied. However, because companies operate in different business areas, machine-learning analysis of text data must consider the characteristics of each company. In addition, since news containing positive or negative information on a certain company can affect other companies or industries in various ways, a separate analysis is needed to predict the stock price of each company. Therefore, this study attempted to predict changes in the stock prices of individual companies by applying sentiment analysis to online news data. The study chose the top companies in the KOSPI 200 as subjects and collected and analyzed two years of online news data for each company from Naver, a representative domestic search portal. In addition, considering that the meaning of a vocabulary item can differ across economic subjects, the study aims to improve performance by building a sentiment lexicon for each individual company and applying it in the analysis. The prediction accuracy differs across companies, averaging 56%. Comparing prediction accuracy across industry sectors, 'energy/chemical', 'consumer goods for living' and 'consumer discretionary' showed relatively higher accuracy, while sectors such as 'information technology' and 'shipbuilding/transportation' showed lower accuracy. Since only five representative companies were collected per industry, it is somewhat difficult to generalize, but a difference in prediction accuracy across industry sectors could be confirmed. At the individual company level, companies such as 'Kangwon Land', 'KT&G' and 'SK Innovation' showed relatively high prediction accuracy, while companies such as 'Young Poong', 'LG', 'Samsung Life Insurance' and 'Doosan' had prediction accuracy below 50%. In summary, this paper predicted the stock price movements of individual companies using company-specific sentiment lexicons built from online news, with the aim of improving stock price prediction performance. Building on this, future work can improve prediction accuracy by addressing the problem of unnecessary words being added to the sentiment dictionaries.
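
A rough illustration of the lexicon-based approach described in this abstract (not the authors' actual implementation): the sketch below scores news articles against a hypothetical company-specific sentiment lexicon and turns the aggregate polarity into an up/down signal. The lexicon contents, tokenizer, and decision rule are all assumptions for illustration.

```python
# Minimal sketch of per-company lexicon-based sentiment scoring for
# up/down prediction. The lexicon, tokenizer, and decision threshold
# are illustrative assumptions, not the paper's actual resources.
import re
from typing import Dict, List

# Hypothetical company-specific lexicon: term -> polarity weight
company_lexicon: Dict[str, float] = {
    "record profit": 1.0, "expansion": 0.6, "upgrade": 0.8,
    "lawsuit": -0.9, "recall": -0.7, "downgrade": -0.8,
}

def tokenize(text: str) -> List[str]:
    # Very naive lowercase word tokenizer, sufficient for the sketch.
    return re.findall(r"[a-z&]+", text.lower())

def sentiment_score(text: str, lexicon: Dict[str, float]) -> float:
    tokens = " ".join(tokenize(text))
    # Sum the polarity of every lexicon term found in the article.
    return sum(w for term, w in lexicon.items() if term in tokens)

def predict_direction(articles: List[str], lexicon: Dict[str, float]) -> str:
    total = sum(sentiment_score(a, lexicon) for a in articles)
    return "up" if total > 0 else "down"   # simple sign rule

if __name__ == "__main__":
    news = ["The firm posted a record profit and announced an expansion.",
            "Analysts issued a downgrade after the product recall."]
    print(predict_direction(news, company_lexicon))
```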

The Pattern Analysis of Financial Distress for Non-audited Firms using Data Mining (데이터마이닝 기법을 활용한 비외감기업의 부실화 유형 분석)

  • Lee, Su Hyun; Park, Jung Min; Lee, Hyoung Yong
    • Journal of Intelligence and Information Systems / v.21 no.4 / pp.111-131 / 2015
  • Only a handful of studies have analyzed patterns of corporate distress, compared with the volume of research on bankruptcy prediction, and the few that exist mainly focus on audited firms because financial data are easier to collect for them. In reality, however, corporate financial distress is a far more common and critical phenomenon for non-audited firms, which are mainly small and medium-sized firms. The purpose of this paper is to classify distressed non-audited firms according to their financial ratios using a data mining technique, the Self-Organizing Map (SOM). A SOM is a type of artificial neural network that is trained using unsupervised learning to produce a lower-dimensional, discretized representation of the input space of the training samples, called a map. It differs from other artificial neural networks in that it applies competitive learning rather than error-correction learning such as backpropagation with gradient descent, and in that it uses a neighborhood function to preserve the topological properties of the input space. It is one of the most popular and successful clustering algorithms. In this study, we classify types of financially distressed firms, specifically non-audited firms. In the empirical test, we collected 10 financial ratios of 100 non-audited firms that were under distress in 2004, covering the preceding two years (2002 and 2003). Using these financial ratios and the SOM algorithm, five distinct patterns were distinguished. In pattern 1, financial distress was very serious in almost all financial ratios; 12% of the firms fell into this pattern. In pattern 2, financial distress was weak in almost all financial ratios; 14% of the firms fell into this pattern. In pattern 3, the growth ratio was the worst among all patterns; these firms may be under distress due to severe competition in their industries. Approximately 30% of the firms fell into this group. In pattern 4, the growth ratio was higher than in any other pattern, but the cash ratio and profitability ratio did not keep pace with it; these firms appear to have fallen into distress while pursuing business expansion. About 25% of the firms were in this pattern. Finally, pattern 5 encompassed very solvent firms, which were perhaps distressed because of a bad short-term strategic decision or problems with the firms' entrepreneurs. Approximately 18% of the firms were in this pattern. This study makes both academic and empirical contributions. Academically, non-audited companies, which tend to go bankrupt easily and have unstructured or easily manipulated financial data, are classified with a data mining technique (the Self-Organizing Map), rather than large audited firms with well-prepared, reliable financial data. Empirically, even though only the financial data of non-audited firms are analyzed, the results are useful for detecting the first-order symptoms of financial distress, which supports bankruptcy prediction and early-warning management for these firms. A limitation of this research is that only 100 corporations were analyzed, owing to the difficulty of collecting financial data on non-audited firms, which made it hard to break the analysis down by category or firm size. Also, non-financial qualitative data are crucial for the analysis of bankruptcy, so non-financial qualitative factors should be taken into account in future studies. This study sheds some light on distress prediction for non-audited small and medium-sized firms.
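
As a minimal sketch of the competitive-learning idea behind the SOM used in this abstract (not the authors' implementation), the code below trains a small map on standardized financial ratios and assigns each firm to its best-matching node. The grid size, learning schedule, and synthetic data are assumptions.

```python
# Minimal NumPy sketch of a Self-Organizing Map for clustering firms by
# financial ratios. Grid size, learning-rate schedule, and the synthetic
# data below are illustrative assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))            # stand-in for 100 firms x 10 ratios
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each ratio

rows, cols, dim = 5, 5, X.shape[1]
weights = rng.normal(size=(rows, cols, dim))
grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

n_iter, lr0, sigma0 = 2000, 0.5, 2.0
for t in range(n_iter):
    x = X[rng.integers(len(X))]
    # Best-matching unit: the node whose weight vector is closest to x.
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(np.argmin(dists), (rows, cols))
    # Decay learning rate and neighborhood radius over time.
    lr = lr0 * np.exp(-t / n_iter)
    sigma = sigma0 * np.exp(-t / n_iter)
    # Neighborhood function preserves topology: nearby nodes move too.
    grid_dist = np.linalg.norm(grid - np.array(bmu), axis=-1)
    h = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))[..., None]
    weights += lr * h * (x - weights)

# Assign each firm to its best-matching node (its cluster / "pattern").
assignments = [np.unravel_index(
    np.argmin(np.linalg.norm(weights - x, axis=-1)), (rows, cols)) for x in X]
print(assignments[:5])
```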

Health Assessment of the Nakdong River Basin Aquatic Ecosystems Utilizing GIS and Spatial Statistics (GIS 및 공간통계를 활용한 낙동강 유역 수생태계의 건강성 평가)

  • JO, Myung-Hee; SIM, Jun-Seok; LEE, Jae-An; JANG, Sung-Hyun
    • Journal of the Korean Association of Geographic Information Studies / v.18 no.2 / pp.174-189 / 2015
  • The objective of this study was to reconstruct spatial information from the results of investigating and evaluating the health of living organisms, habitat, and water quality at the survey points of the Nakdong River basin aquatic ecosystem, to support rational decision making for the basin's aquatic ecosystem preservation and restoration policies using spatial analysis techniques, and to present efficient management methods. To analyze the aquatic ecosystem health of the Nakdong River basin, point data were constructed from the location of each survey point together with the aquatic ecosystem health investigation and evaluation results of 250 survey sections. To apply spatial analysis techniques, these data had to be reconstructed into areal data; for this purpose, spatial influence and trends were analyzed using Kriging interpolation (ArcGIS 10.1, Geostatistical Analysis) and the results were converted into areal data. To analyze the spatial distribution characteristics of basin health based on these results, hotspot (Getis-Ord $G_i^*$), LISA (Local Indicator of Spatial Association), and standard deviational ellipse analyses were used. The hotspot analysis showed that the hotspot basins of the biotic indices (TDI, BMI, FAI) were the Andong Dam upstream, Wangpicheon, and the Imha Dam basin, and that the health grades of their biotic indices were good. The coldspot basins were Nakdong River Namhae, the Nakdong River mouth, and the Suyeong River basin. The LISA analysis showed that the exceptional areas were Gahwacheon, the Hapcheon Dam, and the Yeong River upstream basin; these areas had high bio-health indices, but their surrounding basins were low and required management for aquatic ecosystem health. The hotspot basins of the physicochemical factor (BOD) were the Nakdong River downstream basin, the Suyeong River, the Hoeya River, and the Nakdong River Namhae basin, whereas the coldspot basins were the upstream basins of the Nakdong River tributaries, including Andong Dam, Imha Dam, and the Yeong River. The hotspots of the habitat and riverside environment factor (HRI) differed from the hotspots and coldspots of the other factors in the LISA analysis. In general, the habitat and riverside environment of the Nakdong River mainstream and tributaries, including the Nakdong River upstream, Andong Dam, Imha Dam, and the Hapcheon Dam basin, were in good health. The coldspot basins of the habitat and riverside environment also showed low health indices for the biotic indices and physicochemical factors, and thus require management of the habitat and riverside environment. The time-series analysis with the standard deviational ellipse showed that areas with good aquatic ecosystem health in terms of organisms, habitat, and riverside environment tended to move northward, while the BOD results showed different directions and concentrations by survey year. These aquatic ecosystem health analysis results, based on spatial information, can provide not only health management information for each survey spot but also information for managing the aquatic ecosystem at the catchment level, both for working-level staff and for water environment researchers in the future.
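
As a rough, self-contained illustration of the hotspot statistic named above (not the ArcGIS workflow the authors used), the sketch below computes Getis-Ord $G_i^*$ z-scores for point observations with a binary k-nearest-neighbor weight matrix. The coordinates, health scores, and k are synthetic assumptions.

```python
# Minimal sketch of the Getis-Ord Gi* hotspot statistic with binary
# k-nearest-neighbour weights (self included). Coordinates, health
# scores and k are synthetic assumptions for illustration only.
import numpy as np

def getis_ord_gi_star(coords: np.ndarray, values: np.ndarray, k: int = 8) -> np.ndarray:
    n = len(values)
    x_bar = values.mean()
    s = np.sqrt((values ** 2).mean() - x_bar ** 2)
    # Pairwise distances; the k nearest neighbours of i (plus i itself) get weight 1.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    z = np.empty(n)
    for i in range(n):
        w = np.zeros(n)
        w[np.argsort(d[i])[:k + 1]] = 1.0   # includes i (distance 0)
        w_sum = w.sum()
        num = (w * values).sum() - x_bar * w_sum
        den = s * np.sqrt((n * (w ** 2).sum() - w_sum ** 2) / (n - 1))
        z[i] = num / den
    return z  # large positive z => hotspot, large negative z => coldspot

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    pts = rng.uniform(0, 100, size=(250, 2))   # stand-in for 250 survey points
    health = rng.normal(50, 10, size=250)      # stand-in health index
    health[pts[:, 0] < 20] += 15               # artificial "hot" region
    gi = getis_ord_gi_star(pts, health)
    print("hotspots:", int(np.sum(gi > 1.96)), "coldspots:", int(np.sum(gi < -1.96)))
```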

The Difference between Career Barrier Recognition and Career Preparation Behavior by Mandatory military service Planning Level among Male College Students (남자대학생의 군 의무복무계획 수준에 따른 진로장벽인식과 진로준비행동의 차이)

  • Hong, Hye-Young; Kang, Hye-Young
    • 대한공업교육학회지 / v.38 no.2 / pp.218-239 / 2013
  • This study aims to describe the status of mandatory military service planning and career barrier recognition, and to analyze how career barrier recognition (the extent to which students perceive mandatory military service as a barrier to their future careers) and career preparation behavior differ by the level of mandatory military service planning among male college students. For this purpose, the research questions were set up as follows. 1. What are the levels of mandatory military service planning and career barrier recognition? 2. Is there a difference in career barrier recognition depending on the level of mandatory military service planning? 3. Is there a difference in career preparation behavior depending on the level of mandatory military service planning? The study measured the levels of mandatory military service planning, career barrier recognition, and career preparation behavior of 284 male students from 4 universities in the Daejeon and Chungnam area. Descriptive statistics, correlation analysis, and t-tests were conducted with the SPSS 17.0 program. The results are as follows. First, 79.2% of male students scored higher than the average value on mandatory military service planning. Among the three sub-factors of mandatory military service planning, however, the proportion scoring high on practicality was lower than on importance and concreteness. This suggests that their awareness of practical, concrete behaviors such as data collection is low; in other words, their recognition of the importance of planning has not developed into concrete behavior. Also, 73.9% of male students scored higher than the average value on career barrier recognition, which shows that they perceive mandatory military service as a relatively strong barrier. In particular, those who answered "Very much" (7 points) on every career barrier recognition item accounted for 16.9%, the largest single group, and the item-level responses show that they regard the absence caused by mandatory military service as the biggest difficulty. Second, the difference in career barrier recognition between the top 30% and bottom 30% of mandatory military service planning was not statistically significant. However, for the importance sub-factor of mandatory military service planning, a significant inter-group difference in career barrier recognition was found. In other words, military service is recognized as an obstacle regardless of the level of mandatory military service planning, and the group that considers planning more important perceives the military as a bigger obstacle than the other group. Third, the difference in career preparation behavior between the top 30% and bottom 30% of mandatory military service planning was statistically significant: those with a high level of mandatory military service planning showed more career preparation behavior than those without plans, which underlines the need for such planning. The study therefore suggests the need for mandatory military service planning to both male students and career counselors who consider mandatory military service from a career perspective grounded in the Korean context. Also, as there are currently few precedent studies on pre-inducted men, this study is significant in accumulating empirical data on mandatory military service, a unique characteristic of the Korean career development process.

Image Watermarking for Copyright Protection of Images on Shopping Mall (쇼핑몰 이미지 저작권보호를 위한 영상 워터마킹)

  • Bae, Kyoung-Yul
    • Journal of Intelligence and Information Systems / v.19 no.4 / pp.147-157 / 2013
  • With the advent of a digital environment that can be accessed anytime and anywhere thanks to high-speed networks, the free distribution and use of digital content have become possible. Ironically, this environment is also giving rise to various forms of copyright infringement, and product images used in online shopping malls are pirated frequently. Whether shopping mall images are creative works is a controversial issue. According to a Supreme Court decision in 2001, advertising photographs of ham products merely reproduce the appearance of the objects in order to convey them and are therefore not creative expression; nevertheless, the photographer's losses were recognized, and damages were estimated at the typical cost of the advertising photo shoot. According to a Seoul District Court precedent in 2003, if the photographer's personality and creativity are expressed in the selection of the subject, the composition of the set, the direction and amount of light, the camera angle, shutter speed, shutter chance, other shooting methods, and the developing and printing process, the work should be protected by copyright law. In order for shopping mall images to receive copyright protection under the law, they must not simply convey the state of the product; the photographer's personality and creativity must be recognizable, which requires effort. Accordingly, the cost of producing shopping mall images increases, and the need for copyright protection becomes greater. The product images of online shopping malls have a very distinctive composition, unlike general pictures such as portraits and landscape photos, so general image watermarking techniques cannot satisfy their watermarking requirements. Because the background of product images commonly used in shopping malls is white, black, or a gray-scale gradient, there is little space available for embedding a watermark, and these areas are very sensitive to even slight changes. In this paper, the characteristics of images used in shopping malls are analyzed and a watermarking technique suitable for shopping mall images is proposed. The proposed technique divides a product image into small blocks, transforms each block with the DCT (Discrete Cosine Transform), and then inserts the watermark information by quantizing the DCT coefficients. Because uniform quantization of the DCT coefficients causes visible blocking artifacts, the proposed algorithm uses a weighted mask that quantizes the coefficients located at block boundaries finely and the coefficients located in the center of the block coarsely. This mask improves the subjective visual quality as well as the objective quality of the images. In addition, to improve the security of the algorithm, the blocks in which the watermark is embedded are selected randomly, and a turbo code is used to reduce the BER when extracting the watermark. The PSNR (Peak Signal to Noise Ratio) of a shopping mall image watermarked with the proposed algorithm is 40.7~48.5 dB, and the BER (Bit Error Rate) after JPEG compression with QF = 70 is 0. This means the watermarked image is of high quality and the algorithm is robust to the JPEG compression generally used at online shopping malls; in general, shopping malls use compressed images with a QF higher than 90. The BER is also 0 for a 40% change in size and a 40-degree rotation. Because a pirated image is replicated from the original image, the proposed algorithm can identify copyright infringement in most cases. As the experimental results show, the proposed algorithm is suitable for shopping mall images with simple backgrounds. However, future work should enhance the robustness of the proposed algorithm, because some robustness is lost after the mask process.
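
The following sketch illustrates the general idea of embedding watermark bits by quantizing block-DCT coefficients (quantization index modulation on a single mid-frequency coefficient per 8x8 block). It omits the paper's weighted mask, random block selection, and turbo coding; the step size and coefficient position are assumptions.

```python
# Simplified sketch of watermark embedding via quantization of block-DCT
# coefficients (QIM on one mid-frequency coefficient per 8x8 block).
# The paper's weighted mask, random block selection and turbo coding are
# omitted; step size and coefficient position are illustrative choices.
import numpy as np
from scipy.fft import dctn, idctn

BLOCK, STEP, POS = 8, 24.0, (3, 2)   # block size, quantization step, coefficient index

def embed_bit(block: np.ndarray, bit: int) -> np.ndarray:
    c = dctn(block, norm="ortho")
    q = np.round(c[POS] / STEP)
    if int(q) % 2 != bit:            # force the coefficient onto an even/odd lattice point
        q += 1
    c[POS] = q * STEP
    return idctn(c, norm="ortho")

def extract_bit(block: np.ndarray) -> int:
    c = dctn(block, norm="ortho")
    return int(np.round(c[POS] / STEP)) % 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(64, 64)).astype(float)
    bits = [1, 0, 1, 1, 0, 0, 1, 0]
    marked = image.copy()
    for k, bit in enumerate(bits):   # one bit per block along the first row of blocks
        c0 = k * BLOCK
        marked[0:BLOCK, c0:c0+BLOCK] = embed_bit(marked[0:BLOCK, c0:c0+BLOCK], bit)
    recovered = [extract_bit(marked[0:BLOCK, k*BLOCK:(k+1)*BLOCK]) for k in range(len(bits))]
    print(bits, recovered)           # the recovered bits should match the embedded ones
```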

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang; Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.23 no.3 / pp.69-94 / 2017
  • Recently, growing demand for big data analysis has been driving vigorous development of related technologies and tools. In addition, the development of IT and the increased penetration of smart devices are producing large amounts of data. As a result, data analysis technology is rapidly becoming popular, and attempts to acquire insights through data analysis are continuously increasing; big data analysis will therefore become more important in various industries for the foreseeable future. Big data analysis is generally performed by a small number of experts and delivered to those who request the analysis. However, growing interest in big data analysis is stimulating computer programming education and the development of many data analysis programs. Accordingly, the entry barriers to big data analysis are gradually lowering and data analysis technology is spreading, so big data analysis is increasingly expected to be performed by the demanders of analysis themselves. Along with this, interest in various kinds of unstructured data is continually increasing, and much attention is focused on using text data. The emergence of new web-based platforms and techniques has brought about the mass production of text data and active attempts to analyze it, and the results of text analysis are being utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Among the many text mining techniques used for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a large set of documents, identifies the documents that correspond to each issue, and provides the identified documents as clusters. It is regarded as very useful in that it reflects the semantic elements of documents. Traditional topic modeling is based on the distribution of key terms across the entire document collection; thus it is essential to analyze the entire collection at once to identify the topic of each document. This makes the analysis time-consuming when topic modeling is applied to a large number of documents, and it creates a scalability problem: processing time increases sharply as the number of analysis objects grows. The problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, a divide-and-conquer approach can be applied to topic modeling: a large number of documents is divided into sub-units, and topics are derived by repeating topic modeling on each unit. This method allows topic modeling on a large number of documents with limited system resources and can improve processing speed. It can also significantly reduce analysis time and cost, because documents can be analyzed in each location without first combining them. Despite these advantages, the method has two major problems. First, the relationship between the local topics derived from each unit and the global topics derived from the entire collection is unclear: local topics can be identified in each unit, but global topics cannot. Second, a method for measuring the accuracy of such a methodology needs to be established; that is, assuming the global topics are the ideal answer, the deviation of the local topics from the global topics needs to be measured. Owing to these difficulties, this approach has been studied far less than other topic modeling approaches. In this paper, we propose a topic modeling approach that addresses the two problems above. First, we divide the entire document cluster (global set) into sub-clusters (local sets) and generate a reduced global set (RGS) consisting of delegate documents extracted from each local set. We address the first problem by mapping RGS topics to local topics. We then verify the accuracy of the proposed methodology by checking whether documents are assigned to the same topic in the global and local results. Using 24,000 news articles, we conduct experiments to evaluate the practical applicability of the proposed methodology. Through an additional experiment, we confirm that the proposed methodology can provide results similar to topic modeling on the entire collection, and we also propose a reasonable method for comparing the results of the two approaches.
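
A rough sketch of the divide-and-conquer idea (not the authors' exact procedure): the code below runs LDA separately on each local sub-cluster and on a reduced "global" set of delegate documents, then maps each local topic to its most similar global topic by cosine similarity of the topic-word distributions. The corpus, topic counts, and vectorizer settings are assumptions.

```python
# Sketch of mapping local topics to global topics: LDA is run on each
# local sub-cluster and on a reduced global set, then each local topic
# is matched to the most similar global topic by cosine similarity of
# topic-word distributions. Corpus, topic counts and vectorizer settings
# are illustrative assumptions.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "stock market price rises on earnings news",
    "investors watch stock price and trading volume",
    "river basin water quality and ecosystem health",
    "aquatic ecosystem monitoring of water quality",
    "deep learning model improves image captioning",
    "neural network training for image recognition",
]
local_sets = [docs[0:2], docs[2:4], docs[4:6]]      # pre-divided sub-clusters
reduced_global_set = [s[0] for s in local_sets]     # one delegate document per local set

# Shared vocabulary so topic-word vectors are comparable across models.
vectorizer = CountVectorizer().fit(docs)

def lda_topics(texts, n_topics):
    dtm = vectorizer.transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(dtm)
    # Normalize rows to obtain topic-word probability distributions.
    return lda.components_ / lda.components_.sum(axis=1, keepdims=True)

global_topics = lda_topics(reduced_global_set, n_topics=3)
for i, local in enumerate(local_sets):
    local_topics = lda_topics(local, n_topics=2)
    sim = cosine_similarity(local_topics, global_topics)
    mapping = sim.argmax(axis=1)                    # local topic -> closest global topic
    print(f"local set {i}: local topics map to global topics {mapping.tolist()}")
```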

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul; Kim, Jaeseong; Choi, Sangok
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.105-129 / 2020
  • This study uses corporate data from 2012 to 2018, the period when K-IFRS was applied in earnest, to predict default risk. The data used in the analysis total 10,545 rows and 160 columns, including 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial-ratio indices. Unlike most prior studies, which used the default event itself as the learning target, this study calculated default risk from each company's market capitalization and stock price volatility based on the Merton model. This solves the data imbalance problem caused by the scarcity of default events, which has been pointed out as a limitation of the existing methodology, and also reflects differences in default risk that exist among ordinary (non-defaulted) companies. Because learning was conducted using only corporate information that is also available for unlisted companies, the default risk of unlisted companies without stock price information can be derived appropriately. This makes it possible to provide stable default risk assessment services to unlisted companies, such as small and medium-sized companies and startups, whose default risk is difficult to determine with traditional credit rating models. Although corporate default risk prediction using machine learning has been actively studied recently, model bias issues exist because most studies make predictions based on a single model. A stable and reliable valuation methodology is required for default risk, given that default risk information is used very widely in the market and sensitivity to differences in default risk is high, and strict standards are also required for the calculation method. The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for the preparation of evaluation methods, including verification of their adequacy, in consideration of past statistical data and experience with credit ratings and of changes in future market conditions. This study reduces the bias of individual models by using stacking ensemble techniques that combine various machine learning models. This captures complex nonlinear relationships between default risk and various corporate information while retaining the advantages of machine learning-based default risk prediction models, which take little time to compute. To calculate the forecasts of the sub-models used as input to the stacking ensemble model, the training data were divided into seven pieces and the sub-models were trained on the divided sets to produce forecasts. To compare predictive power, Random Forest, MLP, and CNN models were trained on the full training data, and the predictive power of each model was then verified on the test set. The analysis showed that the stacking ensemble model exceeded the predictive power of the Random Forest model, which performed best among the single models. Next, to check for statistically significant differences between the stacking ensemble model and each individual model, pairs of forecasts from the stacking ensemble model and each individual model were constructed. Because the Shapiro-Wilk normality test showed that none of the pairs followed a normal distribution, the nonparametric Wilcoxon rank-sum test was used to check whether the two forecasts in each pair differed significantly. The analysis showed that the forecasts of the stacking ensemble model differed significantly from those of the MLP and CNN models. In addition, this study provides a methodology that allows existing credit rating agencies to adopt machine learning-based bankruptcy risk prediction, given that traditional credit rating models can also be included as sub-models in calculating the final default probability. The stacking ensemble techniques proposed in this study can also help designs meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope that this research will be used as a resource to increase practical adoption by overcoming and improving the limitations of existing machine learning-based models.
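
A minimal sketch of the stacking idea described above, using scikit-learn's StackingClassifier with out-of-fold predictions from seven splits feeding a logistic-regression meta-model. The base learners, synthetic features, and labels are assumptions and do not reproduce the paper's CNN sub-model or Merton-based targets.

```python
# Minimal sketch of a stacking ensemble for default-risk classification.
# Base learners, synthetic data and the 7-fold split mirror the idea only;
# the paper's CNN sub-model and Merton-based targets are not reproduced.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for 160 financial-statement and ratio features.
X, y = make_classification(n_samples=2000, n_features=160, n_informative=30,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0, stratify=y)

base_models = [
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)),
]
# cv=7: base-model forecasts for the meta-model come from 7 out-of-fold splits.
stack = StackingClassifier(estimators=base_models,
                           final_estimator=LogisticRegression(max_iter=1000),
                           cv=7)
stack.fit(X_tr, y_tr)
print("stacking AUC:", roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1]))

# Single-model baseline for comparison.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("random forest AUC:", roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))
```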

Deep Learning-based Professional Image Interpretation Using Expertise Transplant (전문성 이식을 통한 딥러닝 기반 전문 이미지 해석 방법론)

  • Kim, Taejin; Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.79-104 / 2020
  • Recently, as deep learning has attracted attention, it is being considered as a method for solving problems in various fields. In particular, deep learning is known to perform very well on unstructured data such as text, sound, and images, and many studies have proven its effectiveness. Owing to the remarkable development of text and image deep learning, interest in image captioning technology and its applications is rapidly increasing. Image captioning is a technique that automatically generates relevant captions for a given image by handling image comprehension and text generation simultaneously. In spite of its high entry barrier, since analysts must be able to process both image and text data, image captioning has established itself as one of the key fields of AI research owing to its wide applicability, and much research has been conducted to improve its performance in various respects. Recent studies attempt to create advanced captions that not only describe an image accurately but also convey the information contained in the image more sophisticatedly. Despite these many efforts, it is difficult to find research that interprets images from the perspective of domain experts in each field rather than from the perspective of the general public. Even for the same image, the parts of interest may differ depending on the professional field of the viewer, and the way the image is interpreted and expressed also differs with the level of expertise. The public tends to recognize an image from a holistic, general perspective, that is, by identifying its constituent objects and their relationships; domain experts, by contrast, tend to focus on the specific elements necessary to interpret the image based on their expertise. This implies that the meaningful parts of an image differ with the viewer's perspective even for the same image, and image captioning should reflect this. Therefore, in this study, we propose a method to generate domain-specialized captions for an image by utilizing the expertise of experts in the corresponding domain. Specifically, after pre-training on a large amount of general data, the domain expertise is transplanted through transfer learning with a small amount of expertise data. However, a naive application of transfer learning on expertise data may cause another problem: simultaneous learning with captions of various characteristics can invoke a so-called 'inter-observation interference' problem, which makes it difficult to learn each characteristic point of view purely. When learning from a vast amount of data, most of this interference is self-purified and has little impact on the results; in fine-tuning on a small amount of data, however, its impact can be relatively large. To solve this problem, we propose a novel 'Character-Independent Transfer-learning' that performs transfer learning independently for each characteristic. To confirm the feasibility of the proposed methodology, we performed experiments utilizing the results of pre-training on the MSCOCO dataset, which comprises 120,000 images and about 600,000 general captions. Additionally, following the advice of an art therapist, about 300 pairs of images and expertise captions were created, and these data were used for the expertise transplantation experiments. The experiments confirmed that captions generated with the proposed methodology reflect the implanted expertise, whereas captions generated by learning on general data alone contain much content irrelevant to expert interpretation. In this paper, we propose a novel approach to specialized image interpretation by using transfer learning to generate captions specialized for a specific domain. In the future, by applying the proposed methodology to expertise transplantation in various fields, we expect that many studies will be conducted to address the lack of expertise data and to improve the performance of image captioning.
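
The sketch below illustrates only the general pre-train-then-fine-tune pattern referred to above, not the authors' captioning architecture or their character-independent scheme: a pretrained CNN encoder stands in for general pre-training and is frozen, while a small caption decoder is fine-tuned on a tiny "expertise" dataset. The model sizes, vocabulary, and random stand-in data are assumptions.

```python
# Sketch of the pre-train / fine-tune ("expertise transplant") pattern:
# a pretrained CNN encoder is frozen and only a small caption decoder is
# fine-tuned on a tiny expertise dataset. Model sizes, vocabulary and the
# random stand-in data are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn
from torchvision import models

VOCAB, EMBED, HIDDEN, MAXLEN = 1000, 256, 512, 12

class CaptionDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMBED)
        self.init_h = nn.Linear(512, HIDDEN)   # map image feature to initial state
        self.gru = nn.GRU(EMBED, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def forward(self, feats, tokens):
        h0 = torch.tanh(self.init_h(feats)).unsqueeze(0)
        x, _ = self.gru(self.embed(tokens), h0)
        return self.out(x)

# Encoder pretrained on general data (ImageNet weights as a stand-in for
# the general pre-training stage); it is frozen during expertise fine-tuning.
encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = nn.Identity()            # expose 512-d image features
for p in encoder.parameters():
    p.requires_grad = False
encoder.eval()

decoder = CaptionDecoder()
opt = torch.optim.Adam(decoder.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Tiny random stand-in for ~300 image / expertise-caption pairs.
images = torch.randn(8, 3, 224, 224)
captions = torch.randint(0, VOCAB, (8, MAXLEN))

for epoch in range(3):                # fine-tune only the decoder
    with torch.no_grad():
        feats = encoder(images)       # (8, 512) frozen features
    logits = decoder(feats, captions[:, :-1])          # predict the next token
    loss = loss_fn(logits.reshape(-1, VOCAB), captions[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"epoch {epoch}: loss={loss.item():.3f}")
```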

A Review of Personal Radiation Dose per Radiological Technologists Working at General Hospitals (전국 종합병원 방사선사의 개인피폭선량에 대한 고찰)

  • Jung, Hong-Ryang; Lim, Cheong-Hwan; Lee, Man-Koo
    • Journal of radiological science and technology / v.28 no.2 / pp.137-144 / 2005
  • To determine the personal radiation dose of radiological technologists, a survey was conducted of 623 radiological technologists who had worked at 44 general hospitals in Korea's 16 cities and provinces from 1998 to 2002. A total of 2,624 personal radiation dose records were collected and analyzed by region, year, and hospital, with the following results. 1. The average radiation dose per capita by region and year over the 5 years was 1.61 mSv. By region, Daegu showed the highest value (4.74 mSv), followed by Gangwon (4.65 mSv) and Gyeonggi (2.15 mSv); the lowest values were recorded in Chungbuk (0.91 mSv), Jeju (0.94 mSv), and Busan (0.97 mSv). By year, 2000 showed the highest dose (1.80 mSv), followed by 2002 (1.77 mSv), 1999 (1.55 mSv), 2001 (1.50 mSv), and 1998 (1.36 mSv). 2. In 1998, Gangwon had the highest per-capita dose (3.28 mSv), followed by Gwangju (2.51 mSv) and Daejeon (2.25 mSv), while Jeju (0.86 mSv) and Chungbuk (0.85 mSv) were below 1.0 mSv. In 1999, Gangwon again topped the list with 5.67 mSv, followed by Daegu (4.35 mSv) and Gyeonggi (2.48 mSv); in the same year, doses remained below 1.0 mSv in Ulsan (0.98 mSv), Gyeongbuk (0.95 mSv), and Jeju (0.91 mSv). 3. In 2000, Gangwon was again at the top with 5.73 mSv. Ulsan had been below 1.0 mSv in 1998 and 1999, but its dose rose sharply to 5.20 mSv; Chungbuk remained below 1.0 mSv at 0.79 mSv. 4. In 2001, Daegu recorded the highest dose of the entire 5-year period (9.05 mSv), followed by Gangwon (4.01 mSv); areas below 1.0 mSv included Gyeongbuk (0.99 mSv) and Jeonbuk (0.92 mSv). In 2002, Gangwon again led the list with 4.65 mSv, while Incheon (0.88 mSv), Jeonbuk (0.96 mSv), and Jeju (0.68 mSv) were below 1.0 mSv. 5. By hospital, KMH in Daegu showed the highest average dose over the 5 years (6.82 mSv), followed by GAH in Gangwon (5.88 mSv) and CAH in Seoul (3.66 mSv). YSH in Jeonnam (0.36 mSv) had the lowest average dose, followed by GNH in Gyeongnam (0.39 mSv) and DKH in Chungnam (0.51 mSv). A limitation of the present study is that it focuses on radiological technologists working at tertiary referral hospitals, which are regarded as stable in terms of working conditions, while technologists working at small hospitals are excluded. In addition, some hospitals established less than 5 years before the survey were included, and technologists who had worked at a hospital for less than 5 years were also surveyed. Nor can we exclude the possibility that the differences in average personal radiation dose by region, hospital, and year reflect differences in working conditions and facilities across medical institutions. It therefore seems desirable to develop standardized instruments to measure the working environment objectively and methods to compare and analyze it by region and hospital more accurately in the future.


Development of a Stock Trading System Using M & W Wave Patterns and Genetic Algorithms (M&W 파동 패턴과 유전자 알고리즘을 이용한 주식 매매 시스템 개발)

  • Yang, Hoonseok; Kim, Sunwoong; Choi, Heung Sik
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.63-83 / 2019
  • Investors prefer to look for trading points based on the shapes shown in charts rather than complex analyses such as corporate intrinsic value analysis or technical indicator analysis. However, pattern analysis is difficult to computerize, and existing tools fall short of users' needs. In recent years there have been many studies of stock price patterns using various machine learning techniques, including neural networks, in the field of artificial intelligence (AI). In particular, the development of IT has made it easier to analyze huge volumes of chart data to find patterns that can predict stock prices. Although short-term price forecasting performance has improved, long-term forecasting power remains limited, so such methods are used for short-term trading rather than long-term investment. Other studies have focused on mechanically and accurately identifying patterns that earlier technology could not recognize, but these can be vulnerable in practice, because whether the patterns found are suitable for trading is a separate matter. When a meaningful pattern is found, a point that matches the pattern is located, and performance is measured after n days on the assumption that a purchase was made at that point. Since this approach calculates virtual returns, there can be large disparities with reality. Existing research tries to discover patterns with predictive power; this study instead proposes to define the patterns first and to trade when a pattern with a high success probability appears. The M & W wave patterns published by Merrill (1980) are simple because they can be distinguished by five turning points. Although some of these patterns have been reported to have price predictability, no performance reports from actual markets exist. The simplicity of a pattern consisting of five turning points has the advantage of reducing the cost of improving pattern recognition accuracy. In this study, 16 up-conversion patterns and 16 down-conversion patterns are reclassified into ten groups so that they can be easily implemented in a system, and only the one pattern with the highest success rate per group is selected for trading. Patterns that had a high probability of success in the past are likely to succeed in the future, so we trade when such a pattern occurs. The evaluation is realistic because returns are measured assuming that both the buy and the sell orders are actually executed. We tested three ways of calculating turning points. The first, the minimum change rate zig-zag method, removes price movements below a certain percentage and computes the vertices. In the second, the high-low line zig-zag method, a high price that touches the n-day high line is taken as a peak, and a low price that touches the n-day low line is taken as a valley. In the third, the swing wave method, a central high price that is higher than the n high prices to its left and right is taken as a peak, and a central low price that is lower than the n low prices to its left and right is taken as a valley. The swing wave method was superior to the other methods in the tests, which suggests that trading after the pattern is confirmed complete is more effective than trading while the pattern is still unfinished. Because the number of parameter combinations was too large to search exhaustively in this simulation, genetic algorithms (GA) were the most suitable way to find patterns with high success rates. We also performed the simulation using the walk-forward analysis (WFA) method, which tests the optimization section and the application section separately, so we were able to respond appropriately to market changes. In this study, we optimize at the level of a stock portfolio, because optimizing the variables for each individual stock carries a risk of over-optimization; we set the number of constituent stocks to 20 to increase the effect of diversified investment while avoiding over-fitting. We tested the KOSPI market by dividing it into six categories. The small-cap portfolio was the most successful, and the high-volatility portfolio was the second best. This shows that some price volatility is needed for patterns to form, but that higher volatility is not always better.
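
To illustrate the swing wave turning-point rule described above (a bar is a peak if its high exceeds the n highs on each side, a valley if its low is below the n lows on each side), here is a small sketch. The window n and the sample price series are assumptions, and the paper's zig-zag variants and GA search are not included.

```python
# Sketch of the "swing wave" turning-point rule: a bar is a peak if its
# high exceeds the n highs on each side, and a valley if its low is below
# the n lows on each side. Window size and prices are illustrative.
import numpy as np

def swing_turning_points(high: np.ndarray, low: np.ndarray, n: int = 3):
    peaks, valleys = [], []
    for i in range(n, len(high) - n):
        side_highs = np.concatenate([high[i - n:i], high[i + 1:i + n + 1]])
        side_lows = np.concatenate([low[i - n:i], low[i + 1:i + n + 1]])
        if high[i] > side_highs.max():
            peaks.append(i)            # central high above the n highs on both sides
        elif low[i] < side_lows.min():
            valleys.append(i)          # central low below the n lows on both sides
    return peaks, valleys

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    close = np.cumsum(rng.normal(0, 1, 120)) + 100       # synthetic price path
    high = close + rng.uniform(0, 1, 120)
    low = close - rng.uniform(0, 1, 120)
    p, v = swing_turning_points(high, low, n=3)
    print("peaks:", p[:5], "valleys:", v[:5])
    # Five consecutive alternating turning points form one candidate M or W
    # pattern that the trading rules described above would then classify.
```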