• Title/Summary/Keyword: online document (온라인 문서)

Search Result 215, Processing Time 0.028 seconds

The Lowest Price Matching Service Using Cosine Similarity Analysis (코사인 유사도 분석을 이용한 최저가 매칭 서비스)

  • Yoo, Songeun;Kang, Byungoh;Kim, Jimin;Lee, Ganghyeok;Lee, Minwoo;Koh, Seokju
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2020.07a / pp.624-629 / 2020
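  • As the online shopping market has grown, consumers can now easily find and buy a wide range of goods online. Alongside this, services such as Interpark's 'Talk Jipsa' (톡집사) and Naver Shopping aggregate price information from many shopping malls to help consumers buy products at reasonable prices. This paper proposes a system that uses such price-comparison systems to provide a service aimed at sellers. Cosine similarity analysis, a technique previously used to compare document similarity, is adapted to analyze shopping mall product names. Cosine similarity analysis is run on real product-name data, and products with low relevance are excluded based on the resulting similarity values. A lowest-price analysis is then performed on the remaining products to extract and present an appropriate selling price. Applying the proposed method to product analysis should therefore make it possible to extract products in a similar category and then suggest an optimal price.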
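The filter-then-price idea in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: product names are tokenized on whitespace, the vectors are raw token counts, and the function names, the 0.7 threshold, and the sample listings are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between whitespace-token count vectors of two names."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lowest_relevant_price(query, listings, threshold=0.7):
    """Drop listings whose name is not similar enough to the query,
    then return the lowest remaining price (None if nothing is left).
    The threshold is an assumed value, not one taken from the paper."""
    prices = [price for name, price in listings
              if cosine_similarity(query, name) >= threshold]
    return min(prices) if prices else None


# Made-up listings: (product name, price)
listings = [
    ("acme wireless mouse m100", 19.9),
    ("acme wireless mouse m100 black", 18.5),
    ("acme mouse pad", 4.0),  # shares tokens but is a different product
]
print(lowest_relevant_price("acme wireless mouse m100", listings))  # prints 18.5
```

The mouse pad scores about 0.58 against the query and is excluded, so its lower price does not distort the result. Real product names would likely need better tokenization and weighting (for example TF-IDF) than this sketch uses.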


Web Information Extraction and Multidimensional Analysis Using XML (XML을 이용한 웹 정보 추출 및 다차원 분석)

  • Park, Byung-Kwon
    • Journal of Korea Multimedia Society / v.11 no.5 / pp.567-578 / 2008
  • To analyze the huge number of web pages available on the Internet, we need to extract the information encoded in them. In this paper, we propose a method to extract web information from web pages and convert it into XML documents for multidimensional analysis. For extracting information from web pages, we propose two languages: one for describing web information extraction rules based on the object-oriented model, and another for describing regular expressions of HTML tag patterns to search for target information. For multidimensional analysis of XML documents, we propose a method for constructing an XML warehouse and various XML cubes from it, in the same way as for relational data. Finally, we show the validity of our method by applying it to US patent web pages.
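As a rough illustration of matching HTML tag patterns with regular expressions and emitting XML, the sketch below uses Python's standard `re` and `xml.etree.ElementTree` modules. It does not reproduce the two languages proposed in the paper: the row pattern, the `record` and `field` element names, and the sample page are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical tag pattern: a table row holding a label cell and a value cell.
ROW_PATTERN = re.compile(
    r"<tr>\s*<th>(.*?)</th>\s*<td>(.*?)</td>\s*</tr>", re.S
)


def html_to_xml(html: str, root_tag: str = "record") -> str:
    """Extract (label, value) pairs matched by ROW_PATTERN and
    serialize them as a small XML document."""
    root = ET.Element(root_tag)
    for label, value in ROW_PATTERN.findall(html):
        field = ET.SubElement(root, "field", name=label.strip())
        field.text = value.strip()
    return ET.tostring(root, encoding="unicode")


# Made-up page fragment, loosely shaped like a patent summary table.
page = (
    "<table>"
    "<tr><th>Inventor</th><td>Example Inventor</td></tr>"
    "<tr><th>Filed</th><td>2001</td></tr>"
    "</table>"
)
print(html_to_xml(page))
```

Regular expressions are brittle on real-world HTML, so this only fits pages with a regular, known layout, which is the setting the abstract describes.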


Analysis of Security Vulnerabilities and Personal Resource Exposure Risks in Overleaf

  • Suzi Kim;Jiyeon Lee
    • Journal of the Korea Society of Computer and Information / v.29 no.7 / pp.109-115 / 2024
  • Overleaf is a cloud-based LaTeX editor that lets users easily create and collaborate on documents without a separate LaTeX installation or configuration. Thanks to this convenience, users from various fields worldwide write, edit, and collaborate on academic papers, reports, and more via web browsers. However, the caching that occurs while documents written on Overleaf are converted to PDF format poses a risk of exposing sensitive information. This could expose users' work to others, so security measures and vigilance are needed to guard against such incidents. This paper presents an in-depth analysis of Overleaf's security vulnerabilities and proposes various measures to enhance the protection of intellectual property.

Message Interoperability in e-Logistics System (e-Logistics시스템의 메시지 상호운용성)

  • Seo Sungbo;Lee Young Joon;Hwang Jaegak;Ryu Keun Ho
    • Journal of KIISE:Computing Practices and Letters / v.11 no.5 / pp.436-450 / 2005
  • Existing B2B and B2C computer systems and applications that executed business transactions used a client-server architecture consisting of heterogeneous hardware and software, including personal computers and mainframes. With the boom in electronic business, the integration and compatibility of exchanged data, applications, and hardware have emerged as a hot issue. This paper designs and implements a message transport system and a document transformation system to solve the interoperability problem of an integrated logistics system in e-Business. The message transport system integrates ebMS 2.0, the standard business message exchange format of ebXML (the international standard electronic commerce framework), with JMS of J2EE to ensure reliable messaging. The document transformation system converts non-standard XML documents into standard XML documents and provides web services after integration with the message system. Using a suggested business scenario and various test data, our message-oriented system proved to be interoperable and stable. We participated in the ebXML messaging interoperability test organized by the ebXML Asia Committee ITG in order to evaluate and certify the suitability of the message system.

A Case Study on Metadata Extraction for Records Management Using ChatGPT (챗GPT를 활용한 기록관리 메타데이터 추출 사례연구)

  • Minji Kim;Sunghee Kang;Hae-young Rieh
    • Journal of Korean Society of Archives and Records Management / v.24 no.2 / pp.89-112 / 2024
  • Metadata is a crucial component of record management, playing a vital role in properly managing and understanding the record. In cases where automatic metadata assignment is not feasible, manual input by records professionals becomes necessary. This study aims to alleviate the challenges associated with manual entry by proposing a method that harnesses ChatGPT technology for extracting records management metadata elements. To employ ChatGPT technology, a Python program utilizing the LangChain library was developed. This program was designed to analyze PDF documents and extract metadata from records through questions, both with a locally installed instance of ChatGPT and the ChatGPT online service. Multiple PDF documents were subjected to this process to test the effectiveness of metadata extraction. The results revealed that while using LangChain with ChatGPT-3.5 turbo provided a secure environment, it exhibited some limitations in accurately retrieving metadata elements. Conversely, the ChatGPT-4 online service yielded relatively accurate results despite being unable to handle sensitive documents for security reasons. This exploration underscores the potential of utilizing ChatGPT technology to extract metadata in records management. With advancements in ChatGPT-related technologies, safer and more accurate results are expected to be achieved. Leveraging these advantages can significantly enhance the efficiency and productivity of tasks associated with managing records and metadata in archives.

Detecting Spelling Errors by Comparison of Words within a Document (문서내 단어간 비교를 통한 철자오류 검출)

  • Kim, Dong-Joo
    • Journal of the Korea Society of Computer and Information / v.16 no.12 / pp.83-92 / 2011
  • Typographical errors caused by the author's mistyping occur frequently in documents being prepared with word processors, unlike in finished publications. When preparing such online documents, the most common orthographical errors are spelling errors that result from hitting keys adjacent to the intended keys on the keyboard. Typical spelling checkers detect and correct these errors using a morphological analyzer. In other words, the morphological analysis module of a speller checks the well-formedness of input words, and all words rejected by the analyzer are regarded as misspelled. However, if the morphological analyzer accepts a mistyped word, the word is treated as correctly spelled. In this paper, I propose a simple method capable of detecting and correcting errors that previous methods cannot detect. The proposed method is based on the observation that typographical errors are generally not repeated and so tend to have very low frequency. If words generated by deletion, exchange, and transposition operations on each phoneme of a low-frequency word are in the list of high-frequency words, some of them are considered to be the correctly spelled words. Some heuristic rules are also presented to reduce the number of candidates. The proposed method can detect some semantic errors, though not syntactic errors, and is useful for scoring candidates.
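The frequency-based idea in this abstract can be sketched as follows. This is an illustrative approximation, not the author's method: it operates on Latin characters rather than Korean phonemes, it reads "exchange" as single-character substitution, the frequency cutoffs `low` and `high` are assumed values, and the paper's heuristic rules for pruning candidates are not included.

```python
import string
from collections import Counter


def variants(word, alphabet=string.ascii_lowercase):
    """Words one deletion, substitution, or adjacent transposition away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {l + r[1:] for l, r in splits if r}
    swaps = {l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1}
    subs = {l + c + r[1:] for l, r in splits if r for c in alphabet}
    return (deletes | swaps | subs) - {word}


def suspect_typos(tokens, low=1, high=3):
    """Map each low-frequency word to the high-frequency words that are
    one edit away from it within the same document."""
    counts = Counter(tokens)
    frequent = {w for w, n in counts.items() if n >= high}
    suspects = {}
    for word, n in counts.items():
        if n <= low:
            candidates = sorted(variants(word) & frequent)
            if candidates:
                suspects[word] = candidates
    return suspects


# Made-up text: "from" is a valid word, but here it is a typo for "form".
text = ("the parser reads the form and the parser checks the from field "
        "before the parser stores the form data in the form table")
print(suspect_typos(text.split()))  # prints {'from': ['form']}
```

A dictionary-based checker would accept "from" because it is a well-formed word; comparing against the document's own frequent words is what flags it here.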

A Narrative Inquiry of Elementary School Science and Online Class Experiences (초등학교 교사의 과학과 온라인 수업 경험에 대한 내러티브 탐구)

  • Kim, Yoon-Kyung
    • Journal of the Korean Society of Earth Science Education / v.15 no.2 / pp.273-284 / 2022
  • This study examines the practical and educational implications of how teachers run the curriculum in online science classes, based on data collected over 4 months from 4 homeroom teachers of the 3rd to 6th grades of elementary school in D city who had experience teaching science online. The study was conducted through narrative inquiry. Interviews and in-depth interviews were based on the participants' online class experiences in the Earth Science unit, together with related documents such as online class materials and teacher journals. The results showed that, compared with traditional face-to-face classes, teachers spent more time preparing online classes and had difficulty adapting to the new medium used in them. In addition, they called for science materials produced for the pandemic situation and a teaching platform for smooth class operation. In particular, in the case of experimental classes, there is a burden of completing the planned curriculum, and in a pandemic situation, students felt the need for individual experimental tools for intensive science classes. As a result, it is necessary to introduce a blended learning system that combines the advantages of face-to-face and online classes as a new class form for the transition to future education in preparation for pandemics. Continued teacher research on this format and on online class experience is required.

Developing XML Messaging System for Supply Chain Management (공급사슬관리를 위한 XML 메시징 시스템 개발)

  • 김용수;임종선;주경준;주경수
    • Journal of Internet Computing and Services / v.3 no.5 / pp.1-8 / 2002
  • Because XML is a W3C standard and has characteristics such as platform independence, it plays a critical role in e-commerce. Business rules and procedures should be standardized for efficient B2B integration, but many companies use their own XML documents instead of standard documents. Therefore, many organizations are trying to create framework-based standards for e-commerce. Also, because a supply chain management system transmits documents between corporations online, it needs a messaging system that can transfer XML messages. In this paper, an XML messaging system is designed and implemented. In this system, the client requests a service in XML and the server returns the response in XML, so companies can exchange documents more easily and efficiently through this XML messaging system.


Time-Series based Dataset Selection Method for Effective Text Classification (효율적인 문헌 분류를 위한 시계열 기반 데이터 집합 선정 기법)

  • Chae, Yeonghun;Jeong, Do-Heon
    • The Journal of the Korea Contents Association / v.17 no.1 / pp.39-49 / 2017
  • As Internet technology advances, data on the web is increasing sharply. Many studies have examined incremental learning for classifying effectively as data grows. Web documents contain time-series data such as the publication date. Reflecting this time-series data in classification should make classification more effective. In this study, we analyze the time-series variation of words. We propose an efficient classification method that divides the dataset based on the analysis of time-series information. For the experiment, we collected 1 million online news articles including time-series information. We divide the dataset and classify it using SVM and Naïve Bayes. For each model, we show that classification performance increases. Through this study, we show that reflecting time-series information can improve classification performance.

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
  • Since the stock market is driven by the expectations of traders, studies have been conducted to predict stock price movements through analysis of various sources of text data. To predict stock price movements, research has examined not only the relationship between text data and fluctuations in stock prices, but also trading stocks based on news articles and social media responses. Studies that predict stock price movements have also applied classification algorithms by constructing a term-document matrix, in the same way as other text mining approaches. Because a document contains many words, it is better to select the words that contribute more when building a term-document matrix. Based on word frequency, words with too little frequency or importance are removed. Words are also selected according to their contribution, by measuring the degree to which a word contributes to correctly classifying a document. The basic idea of constructing a term-document matrix has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock item and select the words that are irrelevant to all categories as neutral words. We extract the words around each selected neutral word and use them to generate the term-document matrix. The approach starts from the idea that stock movement is less related to the presence of the neutral words themselves, and that the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then applied to an algorithm that classifies stock price fluctuations. In this study, we first removed stop words and selected neutral words for each stock. We then excluded, from the selected words, those that also appear in news articles about other stocks.
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used the first three months of news articles as training data and applied the remaining month of news articles to the model to predict the next day's stock price movements. We used SVM, Boosting, and Random Forest to build models and predict stock price movements. The stock market was open for a total of 80 days over the four months (2016/02/01 ~ 2016/05/31); the first 60 days were used as the training set and the remaining 20 days as the test set. The word-based algorithm proposed in this study showed better classification performance than the word selection method based on sparsity. This study predicted stock price volatility by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also news about other stocks to determine which words to extract. In other words, it removed not only the words that appeared in both the rise and fall categories but also the words that appeared in common in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. The limitations of this study are that the stock price prediction was set up only to classify rises and falls, and that the experiment was conducted only for the top ten stocks. The 10 stocks used in the experiment do not represent the entire stock market. In addition, it is difficult to show investment performance because stock price fluctuation and profit rate may differ. Therefore, further research is needed using more stocks and predicting yield through trading simulation.
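The step of building a term-document matrix from the words around neutral words can be sketched as below. This is a minimal illustration, not the authors' pipeline: the neutral-word set is simply given as input (the selection procedure is not implemented), the window size is an assumed parameter that the abstract does not specify, and the documents are made up and assumed to have stop words already removed.

```python
from collections import Counter


def context_terms(tokens, neutral_words, window=1):
    """Return the tokens within `window` positions of any neutral word,
    excluding the neutral words themselves. Overlapping windows are
    merged so that no token position is counted twice."""
    keep = set()
    for i, tok in enumerate(tokens):
        if tok in neutral_words:
            keep.update(range(max(0, i - window),
                              min(len(tokens), i + window + 1)))
    return [tokens[i] for i in sorted(keep) if tokens[i] not in neutral_words]


def term_document_matrix(docs, neutral_words, window=1):
    """Build (vocabulary, count matrix) using only neutral-word context."""
    rows = [Counter(context_terms(d.split(), neutral_words, window))
            for d in docs]
    vocab = sorted(set().union(*rows))
    return vocab, [[row[t] for t in vocab] for row in rows]


# Hypothetical inputs: two short documents and one assumed neutral word.
docs = ["acme shares rose sharply today", "acme shares fell weak earnings"]
vocab, matrix = term_document_matrix(docs, neutral_words={"shares"})
print(vocab)   # prints ['acme', 'fell', 'rose']
print(matrix)  # prints [[1, 0, 1], [1, 1, 0]]
```

The resulting rows could then be passed to a classifier such as the SVM, Boosting, or Random Forest models mentioned in the abstract.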