A Study on the Current Status and Utilization of Old Map in Library and Museums in Korea (국내 도서관·박물관 소장 고지도의 현황 및 활용에 관한 연구)

  • Gi Young Kim
    • Journal of the Korean BIBLIA Society for Library and Information Science / v.35 no.1 / pp.97-125 / 2024
  • The purpose of this study is to improve access to information on old maps and to discuss efficient ways of utilizing them, such as providing services and information based on old maps. To this end, the information search systems of domestic institutions that provide old map information were investigated, and the methods of searching for old map data and accessing information on their websites were examined. In addition, the current status of old map collections in domestic libraries and museums was analyzed by referring to the homepage, books, research reports, and publications of each institution. The analysis found that about 2,200 old maps are housed in 76 institutions, including national, public, and university libraries and museums nationwide. Each institution holding old maps carries out publication projects, such as English manuscripts, exhibitions and books, and edited research documents such as lists and summaries. However, reading and use of the original documents are limited because old maps are rare and each one is a unique item. To utilize old maps effectively, first, access to old map information services must be improved and academic information services expanded. Second, old maps should be used as source material for building archives that reflect the identity of a region. Third, professional staff must be trained who can select and provide information based on knowledge of old map data and humanities literacy.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and each sentence in the training set was labeled manually after being read in person. Then, based on the training set, a Support Vector Machine model was used to predict the tone of the sentences in the test set. Finally, the machine learning model calculated the percentages of positive, neutral, and negative sentences in each prospectus. To predict the price movement of an IPO stock, four different machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use quantitative variables using technical analysis and prospectus tone variables together show higher accuracy than models that use only quantitative variables. More specifically, the prediction accuracy improved by 1.45 percentage points in the Random Forest model, 4.34 percentage points in the Artificial Neural Network model, and 5.07 percentage points in the Support Vector Machine model. Among the machine learning techniques tested, the Artificial Neural Network model using both quantitative variables and prospectus tone variables had the highest prediction accuracy, at 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify whether the difference between the models is statistically significant. The model using only quantitative variables was compared with the model using both the quantitative variables and the prospectus tone variables, and it was confirmed that the predictive performance improved significantly at the 1% significance level.
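The two text-derived quantities in this abstract, TF-IDF term weights and the per-prospectus tone percentages, can be illustrated with a minimal sketch. The abstract does not state the tokenization or the exact TF-IDF variant used, so the plain `tf * log(N / df)` form, the function names, and the toy English sentences below are all assumptions for illustration. The Support Vector Machine step is omitted, and the tone labels are simply given as input.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: relative term frequency times log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def tone_percentages(labels):
    """Share (in %) of positive, neutral, and negative labels (non-empty input)."""
    counts = Counter(labels)
    total = len(labels)
    return {tone: 100.0 * counts[tone] / total
            for tone in ("positive", "neutral", "negative")}


# Toy sentences standing in for Risk Factors sentences (hypothetical data).
sentences = [
    "demand may decline".split(),
    "demand may grow".split(),
    "revenue may grow".split(),
]
weights = tf_idf(sentences)

# Hypothetical sentence-level tone labels for one prospectus.
tones = tone_percentages(["negative", "positive", "positive", "neutral"])
```

In this toy input, a term that appears in every sentence ("may") gets zero weight, while a term unique to one sentence ("decline") gets the largest weight in that sentence. This is the property that makes TF-IDF vectors usable as classifier features.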

Methodology for Identifying Issues of User Reviews from the Perspective of Evaluation Criteria: Focus on a Hotel Information Site (사용자 리뷰의 평가기준 별 이슈 식별 방법론: 호텔 리뷰 사이트를 중심으로)

  • Byun, Sungho;Lee, Donghoon;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.22 no.3 / pp.23-43 / 2016
  • As a result of the growth of Internet data and the rapid development of Internet technology, "big data" analysis has gained prominence as a major approach for evaluating and mining enormous data for various purposes. Especially, in recent years, people tend to share their experiences related to their leisure activities while also reviewing others' inputs concerning their activities. Therefore, by referring to others' leisure activity-related experiences, they are able to gather information that might guarantee them better leisure activities in the future. This phenomenon has appeared throughout many aspects of leisure activities such as movies, traveling, accommodation, and dining. Apart from blogs and social networking sites, many other websites provide a wealth of information related to leisure activities. Most of these websites provide information of each product in various formats depending on different purposes and perspectives. Generally, most of the websites provide the average ratings and detailed reviews of users who actually used products/services, and these ratings and reviews can actually support the decision of potential customers in purchasing the same products/services. However, the existing websites offering information on leisure activities only provide the rating and review based on one stage of a set of evaluation criteria. Therefore, to identify the main issue for each evaluation criterion as well as the characteristics of specific elements comprising each criterion, users have to read a large number of reviews. In particular, as most of the users search for the characteristics of the detailed elements for one or more specific evaluation criteria based on their priorities, they must spend a great deal of time and effort to obtain the desired information by reading more reviews and understanding the contents of such reviews. 
Although some websites break down the evaluation criteria and direct users to enter their reviews according to different levels of criteria, the excessive number of input sections makes the whole process inconvenient for users. Further, problems may arise if a user does not follow the instructions for the input sections or fills in the wrong input sections. Finally, treating the evaluation criteria breakdown as a realistic alternative is difficult, because identifying all the detailed criteria for each evaluation criterion is a challenging task. For example, when a review of a certain hotel is written, people tend to write only one-stage reviews for various components such as accessibility, rooms, services, or food. These may address the most frequently asked questions, such as the distance to the nearest subway station or the condition of the bathroom, but they still lack detailed information on these questions. In addition, if a breakdown of the evaluation criteria were provided along with various input sections, the user might fill in only the evaluation criterion for accessibility, or enter the wrong information, such as information about rooms under the evaluation criterion for accessibility. Thus, the reliability of the segmented review would be greatly reduced. In this study, we propose an approach to overcome the limitations of the existing leisure activity information websites, namely, (1) the reliability of reviews for each evaluation criterion and (2) the difficulty of identifying the detailed contents that make up the evaluation criteria. In our proposed methodology, we first identify the review content and construct the lexicon for each evaluation criterion by using the terms that are frequently used for each criterion. Next, the sentences in the review documents containing the terms in the constructed lexicon are decomposed into review units, which are then reconstructed by using the evaluation criteria. 
Finally, the issues of the constructed review units are derived for each evaluation criterion and the summary results are provided. Apart from the derived issues, the review units themselves are also provided. This approach therefore aims to help users save time and effort, because they read only the information relevant to each evaluation criterion rather than going through the entire text of each review. Our proposed methodology is based on topic modeling, which is actively used in text analysis. The review is decomposed into sentence units rather than treated as a single document unit. After being decomposed into individual review units, the review units are reorganized according to each evaluation criterion and then used in the subsequent analysis. In this respect, this work differs substantially from existing topic modeling-based studies. In this paper, we collected 423 reviews from hotel information websites and decomposed these reviews into 4,860 review units. We then reorganized the review units according to six different evaluation criteria. By applying our methodology to these review units, we present the analysis results and demonstrate the utility of the proposed methodology.
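The decomposition step described above (split each review into sentences, then regroup the sentences by evaluation criterion through a lexicon) can be sketched as follows. This is only an illustration under stated assumptions: the lexicon entries, the regex-based sentence splitter, and the names `LEXICON`, `split_sentences`, and `group_review_units` are hypothetical, and the paper builds its lexicon from frequently used terms rather than by hand. The sketch stops before the topic modeling step that derives the issues.

```python
import re
from collections import defaultdict

# Hypothetical hand-made lexicon: evaluation criterion -> indicative terms.
LEXICON = {
    "accessibility": {"subway", "station", "airport", "walk"},
    "rooms": {"room", "bed", "bathroom"},
    "services": {"staff", "service", "check-in"},
}


def split_sentences(review):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def group_review_units(reviews, lexicon):
    """Assign each sentence (review unit) to every criterion whose terms it contains."""
    units = defaultdict(list)
    for review in reviews:
        for sentence in split_sentences(review):
            tokens = set(re.findall(r"[a-z\-]+", sentence.lower()))
            for criterion, terms in lexicon.items():
                if tokens & terms:
                    units[criterion].append(sentence)
    return dict(units)


reviews = [
    "The subway station is a short walk away. The bathroom was spotless.",
    "Friendly staff at check-in! Nothing else to report.",
]
units = group_review_units(reviews, LEXICON)
```

Sentences that match no lexicon term are dropped, and a sentence that matches several criteria is listed under each of them. Whether the original study handles these two cases the same way is not stated in the abstract.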

A Study on the Persons Enjoying the Landscape of Daegodae in Hamyang and Space Hegemony through Analysis of Poetry and Letters Carved on the Rocks (시문과 바위글씨로 본 함양 대고대(大孤臺)의 경관 향유자와 장소패권(場所覇權))

  • Rho, Jae-Hyun;Lee, Jung-Han
    • Journal of the Korean Institute of Traditional Landscape Architecture / v.32 no.1 / pp.10-21 / 2014
  • This study focuses on the landscape of Daegodae(大孤臺), a prominent rock at the side of Namgae Stream in Hamyang, and on the people who enjoyed that landscape. Through the analysis of ancient poetry and of letters, such as names, carved on the rocks and stone walls, the study examines the characteristics of the landscape and space of Daegodae and the phases of hegemony over enjoying that landscape and space. The results of this study are as follows. There are 5 Seowon(書院: lecture halls) near Daegodae, as identified in the ancient map, and the three-dimensional volume and eccentricity of Daegodae are impressive. According to an analysis of the ancient document Go-dae-il-Loc(孤臺日錄) written by Jung Kyung-Woon(鄭慶雲: 1556~?), Daegodae, named by Noh Jin(1518~1578) in the 16th century, was used in a variety of ways, including viewing, games, recreation, and meetings, by the staff of the lecture halls including Namgae Seowon(南溪書院). In terms of layout, Chunggeunchung(淸近亭) stands on the rock face at the top and Sanangjae(山仰齋) lies to the west, around the memorial stone for Yang Hee(梁喜: 1515~1581). The upper part of the foundation of Daegodae, 11m high and $10m^2$ wide from east to west, was widely used for lecturing and poetry reading. To the north and west of the foundation are the writing of Kim Jeong-Hee(金正喜: 1786~1856), with the words 'Seoksong Chusa(石松 秋史)' carved on the rock, and the remains of a dead tree presumed to have been called 'Seoksong'. These landscape elements further enhance the history and authenticity of this place. The two rock inscriptions 'Daegodae Gaeeunseo(大高臺 介隱書)' and 'Mukheon JungGeunSang(鄭近相: 1893~1934)' were recorded by Jung Jae-Gi(1811~1879) and his grandson Jung Geun-Sang, respectively; as the outcome of exclusive space possession and space hegemony, they are signatures indicating that these two men enjoyed this place during the late Joseon and Japanese colonial eras. 
In other words, Daegodae had carried an implied meaning of prior occupancy of the place as Gujolyangseonsengjangguso since the middle of Joseon, and the place was passed down as a Buddhism lecturing and memorial venue called "Dungbukganghoiso Cheonryungjaeseonhyunjangguso" after going through the space hegemony of Jung Jae-Gi and Jung Geun-Sang during the late Joseon and Japanese colonial eras, respectively. Nevertheless, a number of letters carved on the rock also imply that 'Hadong Jung(河東鄭氏)' and 'Pungcheon Noh(豊川盧氏)' were those who enjoyed the landscape of Daegodae and stood at the center of the space hegemony. The "letters carved on the rock of Daegodae" are another case of cultural landscape and traditional gardening space that represents the will to enjoy the landscape in this place and the history of space hegemony.