• Title/Summary/Keyword: document topic

Search Result 190, Processing Time 0.024 seconds

A Study on multi-translation system for e-business collaboration (e-비즈니스 협업에 적합한 다중변환 시스템 연구)

  • Ahn, Kyeong-Rim;Chung, Jin-Wook
    • Journal of Internet Computing and Services / v.7 no.6 / pp.123-130 / 2006
  • In the early stages of e-business, transactions took place within a single business entity or a single marketplace. They have since grown into more complex forms. In particular, the need for business collaboration between business entities or marketplaces has emerged as a core topic. Because the exchanged documents come in various formats, format translation between documents is a very important factor. In this paper, we define ebXML as the basic format of exchanged documents, in line with object-oriented business transactions. We also design a multi-format translation system to support the translation of various document formats. The proposed system is designed with a model-driven method and can be built in various structures depending on the system environment. It can handle a document in any format once the corresponding parsing module is added. We also increase the reusability of data by using a common data set. Finally, we demonstrate the superiority of the proposed system by comparing its performance with a legacy system for translation across various formats.

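The architecture described in this abstract (one parsing module per format, feeding a common data set) can be illustrated with a minimal sketch. This is not the authors' system: the flat `Record` dictionary stands in for the ebXML-based common data set, and the names `PARSERS`, `SERIALIZERS`, and `translate` are hypothetical.

```python
import csv
import io
import json
from typing import Callable, Dict

# Hypothetical stand-in for the common data set: flat field -> value.
Record = Dict[str, str]

# One parsing module per source format, one serializer per target format.
PARSERS: Dict[str, Callable[[str], Record]] = {}
SERIALIZERS: Dict[str, Callable[[Record], str]] = {}


def parse_kv(text: str) -> Record:
    """Parse 'key=value' lines into the common record."""
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {key.strip(): value.strip() for key, value in pairs}


def parse_csv(text: str) -> Record:
    """Parse a header row plus one data row into the common record."""
    return dict(next(csv.DictReader(io.StringIO(text))))


def to_kv(record: Record) -> str:
    """Serialize the common record as 'key=value' lines."""
    return "\n".join(f"{key}={value}" for key, value in record.items())


PARSERS.update({"json": json.loads, "kv": parse_kv, "csv": parse_csv})
SERIALIZERS.update({"json": lambda r: json.dumps(r, sort_keys=True), "kv": to_kv})


def translate(text: str, source: str, target: str) -> str:
    """Translate via the common record: parse once, serialize to any target."""
    record = PARSERS[source](text)
    return SERIALIZERS[target](record)
```

Under these assumptions, supporting a new document format means registering one parser in `PARSERS`; every existing target format then becomes reachable without writing a pairwise converter.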

Object detection in financial reporting documents for subsequent recognition

  • Sokerin, Petr;Volkova, Alla;Kushnarev, Kirill
    • International journal of advanced smart convergence / v.10 no.1 / pp.1-11 / 2021
  • Document page segmentation is an important step in building a quality optical character recognition module. The study examined existing work on page segmentation and focused on developing a segmentation model that has greater functional significance for application in an organization, as well as broad capabilities for managing the quality of the model. The main problems of document segmentation were highlighted, which include a complex background of intersecting objects. As classes for detection, not only the classic text, table and figure were selected, but also additional types, such as signature, logo and table without borders (or with partially missing borders). This made it possible to pose a non-trivial task of detecting non-standard document elements. The authors compared existing neural network architectures for object detection based on published research data. The most suitable architecture was RetinaNet. To ensure the possibility of quality control of the model, a method based on neural network modeling using the RetinaNet architecture is proposed. During the study, several models were built, the quality of which was assessed on the test sample using the mean average precision (mAP) metric. The best result among the constructed algorithms was shown by a model that includes four neural networks: the first focuses on detecting tables and tables without borders, the second on seals and signatures, the third on pictures and logos, and the fourth on text. The analysis showed that the approach based on four neural networks gave the best results on the test sample for most detection classes, in accordance with the objectives of the study. The method proposed in the article can be used to recognize other objects. A promising direction in which the analysis can be continued is the segmentation of tables; the areas of the table that differ in function will act as classes: heading, cell with a name, cell with data, empty cell.
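The evaluation metric named in this abstract can be sketched as follows. This is a simplified, non-interpolated form of average precision for one class on one page at a single IoU threshold; it is an assumption for illustration, not the authors' evaluation code, and detection benchmarks typically use interpolated variants. The function names and the `threshold` default are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    preds: List[Tuple[float, Box]], truths: List[Box], threshold: float = 0.5
) -> float:
    """Non-interpolated AP: mean of precision at each true-positive rank."""
    matched = set()
    hits = 0
    total = 0.0
    ranked = sorted(preds, key=lambda p: -p[0])  # highest confidence first
    for rank, (_, box) in enumerate(ranked, start=1):
        # Best still-unmatched ground-truth box for this prediction.
        best = max(
            (i for i in range(len(truths)) if i not in matched),
            key=lambda i: iou(box, truths[i]),
            default=None,
        )
        if best is not None and iou(box, truths[best]) >= threshold:
            matched.add(best)
            hits += 1
            total += hits / rank  # precision at this recall level
    return total / len(truths) if truths else 0.0


def mean_average_precision(ap_per_class: Dict[str, float]) -> float:
    """mAP: unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```

With per-class AP values in hand (for example for table, signature, logo and text), `mean_average_precision` averages them into the single score used to compare models.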

Cross-Domain Text Sentiment Classification Method Based on the CNN-BiLSTM-TE Model

  • Zeng, Yuyang;Zhang, Ruirui;Yang, Liang;Song, Sujuan
    • Journal of Information Processing Systems / v.17 no.4 / pp.818-833 / 2021
  • To address the problems of low precision, insufficient feature extraction, and poor use of context in existing text sentiment analysis methods, a hybrid CNN-BiLSTM-TE (convolutional neural network, bidirectional long short-term memory, and topic extraction) model was proposed. First, Chinese text data was converted into vectors through transfer learning with Word2Vec. Second, local features were extracted by the CNN model. Then, contextual information was extracted by the BiLSTM neural network and the emotional tendency was obtained using softmax. Finally, topics were extracted using term frequency-inverse document frequency (TF-IDF) and K-means. Compared with the CNN, BiLSTM, and gated recurrent unit (GRU) models, the CNN-BiLSTM-TE model's F1-score was higher by 0.0147, 0.006, and 0.0052, respectively. Compared with the CNN-LSTM, LSTM-CNN, and BiLSTM-CNN models, the F1-score was higher by 0.0071, 0.0038, and 0.0049, respectively. Experimental results showed that the CNN-BiLSTM-TE model can effectively improve various indicators in application. Lastly, scalability was verified on a takeaway dataset, which has great value in practical applications.
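The TF-IDF weighting used in the topic-extraction step can be sketched in a few lines. This is a minimal illustration under assumed conventions (relative term frequency, natural-log IDF without smoothing), not the paper's implementation; the function names are hypothetical, and the K-means step that would cluster the resulting vectors is not shown.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by its in-document frequency times log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


def top_terms(doc_weights: Dict[str, float], k: int = 2) -> List[str]:
    """Return the k highest-weighted terms, breaking ties alphabetically."""
    return sorted(doc_weights, key=lambda term: (-doc_weights[term], term))[:k]
```

A term that appears in every document gets weight zero, which is why TF-IDF surfaces review-specific words rather than words common to the whole corpus.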

An Optimized e-Lecture Video Search and Indexing framework

  • Medida, Lakshmi Haritha;Ramani, Kasarapu
    • International Journal of Computer Science & Network Security / v.21 no.8 / pp.87-96 / 2021
  • The demand for e-learning through video lectures is rapidly increasing due to its diverse advantages over traditional learning methods. This has led to massive volumes of web-based lecture videos. Indexing and retrieval of a lecture video or a lecture video topic has thus proved to be an exceptionally challenging problem. Many techniques reported in the literature were either visual or audio based, but not both. Since the visual and audio components are equally important for content-based indexing and retrieval, the current work focuses on both. A framework for automatic topic-based indexing and search based on the innate content of the lecture videos is presented. The text from the slides is extracted using the proposed Merged Bounding Box (MBB) text detector. The audio component text extraction is done using Google Speech Recognition (GSR) technology. This hybrid approach generates the indexing keywords from the merged transcripts of both the video and audio component extractors. The search within the indexed documents is optimized based on the Naïve Bayes (NB) classification and K-means clustering models. This optimized search retrieves results by searching only the relevant document cluster in the predefined categories and not the whole lecture video corpus. The work is carried out on a dataset generated by assigning categories to lecture video transcripts gathered from e-learning portals. The performance of search is assessed based on accuracy and time taken. Further, the improved accuracy of the proposed indexing technique is compared with the accepted chain indexing technique.
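The idea of searching only the relevant category rather than the whole corpus can be sketched as below. This is an illustrative assumption, not the authors' framework: a small multinomial Naive Bayes classifier with add-one smoothing picks the category for a query, and a token-overlap score (a hypothetical stand-in for the real ranking) is applied only to transcripts in that category. The K-means clustering stage is omitted.

```python
import math
from collections import Counter, defaultdict
from typing import List, Tuple


def train_nb(labeled: List[Tuple[str, List[str]]]):
    """Collect per-category document and term counts for multinomial Naive Bayes."""
    class_docs = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for label, tokens in labeled:
        class_docs[label] += 1
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, term_counts, vocab


def classify(model, tokens: List[str]) -> str:
    """Return the category with the highest log-posterior (add-one smoothing)."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())

    def log_posterior(label: str) -> float:
        total_terms = sum(term_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for token in tokens:
            score += math.log((term_counts[label][token] + 1) / (total_terms + len(vocab)))
        return score

    return max(sorted(class_docs), key=log_posterior)


def search(query: List[str], corpus: List[Tuple[str, str, List[str]]], model) -> List[str]:
    """Rank by token overlap, scanning only documents in the predicted category."""
    category = classify(model, query)
    hits = [
        (len(set(query) & set(tokens)), doc_id)
        for doc_id, label, tokens in corpus
        if label == category
    ]
    return [doc_id for overlap, doc_id in sorted(hits, key=lambda h: (-h[0], h[1])) if overlap > 0]
```

Here `corpus` is a list of `(doc_id, category, tokens)` triples; documents outside the predicted category are never scored, which is where the time saving described in the abstract would come from.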

Research on Brand Value Dimensions of Employers: Based on Online Reviews by the Employees

  • XU, Meng
    • The Journal of Asian Finance, Economics and Business / v.9 no.10 / pp.215-225 / 2022
  • This study investigates employees' online reviews, conducts in-depth text topic mining, summarizes the dimensions of employer brand value, and seeks effective ways to build employer brands from a multi-dimensional perspective. The study takes samples of employer reviews, filters keywords according to term frequency-inverse document frequency, builds a network of reviews that share keywords, detects communities in it, and summarizes the theme dimensions. It also makes a dynamic comparison and analysis of the employer brand value dimensions of different industries and enterprises. The study shows that the themes found through community detection can be summarized into 11 dimensions of employer brand value, and that these dimensions differ significantly across industries and among enterprises within the same industry. Attention to the employer brand value dimensions also changes significantly over time. Various industries pay increasing attention to the dimensions of work intensity and career development, while employers pay steady attention to the dimension of welfare benefits. The findings of this study suggest that seeking the heterogeneity of employer brand resources from multi-dimensional differences and changes is an effective way to improve the competitiveness of enterprises in the human capital market.

Comparison of Topics Related to Nurse on the Internet Portals and Social Media Before and During the COVID-19 era Using Topic Modeling (토픽 모델링을 활용한 COVID-19 발생 전후 간호사 관련 토픽 비교: 인터넷 포털과 소셜미디어를 중심으로)

  • Yoon, Young Mi;Kim, Seong Kwang;Kim, Hye Kyeong;Kim, Eun Joo;Jeong, Yuneui
    • Journal of muscle and joint health / v.27 no.3 / pp.255-267 / 2020
  • Purpose: The purpose of this study is to compare topics, through keywords related to nurses on internet portals and social media, before and during the coronavirus disease (COVID-19) era. Methods: For six months before and during the outbreak of COVID-19 in Korea, "nurse" was searched on the internet. For data collection, we implemented web crawlers in programming languages such as Python and collected keywords. The collected keywords were classified into three domains through topic modeling. Results: The keyword "nurse" increased by 15% during the COVID-19 era. Before COVID-19, the keywords that ranked high in term frequency-inverse document frequency (TF-IDF) values included "nurse" and "C-section". During COVID-19, however, they included not only "nurse" but also pandemic-related keywords such as "emergency" and "gown". Conclusion: Various topics were being uploaded to internet media. Nursing professionals should pay attention to the text that appears in internet media and continuously try to identify and address problems.

Practical Implications on Delivery of Goods under the Rotterdam Rules (로테르담규칙상 운송물 인도와 실무상 유의점)

  • YANG, Jung-Ho
    • THE INTERNATIONAL COMMERCE & LAW REVIEW / v.74 / pp.55-79 / 2017
  • The Rotterdam Rules introduce new issues that were ignored by previous international transport conventions. Among them, the provisions on delivery of goods have been a much-debated topic, as they deviate from well-established principles. The Rotterdam Rules provide several alternatives in order to resolve uncertainty regarding delivery practice. The carrier has to make a reasonable effort to deliver the goods following the required procedure, which differs according to the transport document issued. Where the goods are not deliverable, the carrier can be discharged from its obligation to deliver the goods when it delivers them according to the delivery instructions of the shipper. In addition, it can take actions reasonably required by the circumstances if it is impossible to deliver the goods. These alternatives are not ideal, but they seem to help, in part, to solve practical problems arising in the process of delivery. However, the delivery regime under the Rotterdam Rules could cause confusion about the traditional delivery principle. It also places a new burden on the parties concerned. In conclusion, the parties concerned should consider the practical implications when issuing and transferring transport documents, as well as when requesting and instructing delivery of goods.


An Experimental Study on the Usefulness of Structure Hints in the Leaf Node Language Model-Based XML Document Retrieval (단말노드 언어모델 기반의 XML문서검색에서 구조 제한의 유용성에 관한 실험적 연구)

  • Jung, Young-Mi
    • Journal of the Korean Society for Information Management / v.24 no.1 s.63 / pp.209-226 / 2007
  • The XML document format on the Web provides a mechanism for expressing both content and logical structure information, so an XML processor can provide access to both. The purpose of this study is to investigate the usefulness of structural hints in leaf node language model-based XML document retrieval. To this end, the experiment compared the performance of the leaf node language model-based XML retrieval system on two kinds of queries for a topic: those with content-only constraints and those with both content constraints and structure constraints. A newly designed and implemented leaf node language model-based XML retrieval system was used. We participated in the ad hoc track of INEX 2005 and conducted the experiment using the large-scale XML test collection provided by INEX 2005.

Design and Implementation of Video Documents Management System (비디오 문서 관리시스템의 설계 및 구현)

  • Kweon, Jae-Gil;Bae, Jong-Min
    • The Transactions of the Korea Information Processing Society / v.7 no.8 / pp.2287-2297 / 2000
  • Video documents, which carry audio-visual and other semantic information, have complex relationships among media. While user requests for topic retrieval or specific region retrieval are increasing, it is difficult to meet these requests with existing design methodologies. To support systematic management and varied retrieval capabilities for video documents, we must formulate a structural and systematic model of metadata using semantic and structural information that is abstracted automatically or manually. This paper suggests a generic metadata model with which we analyze the characteristics of video documents, and which supports various query types and serves as a generic framework for video applications. We propose the generic integrated management model (GIMM) for generic metadata, design a video documents management system (VDMS), and implement it using GIMM.


Groupware: Current Status Analysis II (그룹웨어의 현황 분석 II)

  • Kim, Sun-Uk;Gim, Bong-Jin
    • IE interfaces / v.11 no.2 / pp.211-225 / 1998
  • As mentioned in Part I, all groupware products have been categorized into three areas: cooperation/document management systems (CMS), collaborative writing systems (CWS), and decision-making/meeting systems (DMS). This study presents a comparative analysis of the last two areas, adding to the analysis of the first. It turns out that DMS has a higher market share than CWS. However, since effective collaboration requires the functions inherent to both systems, they should be integrated somehow. The systems' functions that have been implemented in response to design issues are described. Each group of functions is divided into three parts: basic functions, quasi-basic functions, and others. This division is based on how frequently the functions are provided in the products. The basic functions in CWS include collaborative writing beyond restrictions of time and place, group awareness, version control, and others, while those in DMS include real-time collaboration, brainstorming, presentation, various task support, policy formation, document management, multimedia, subgroup communication, topic commenter, categorizer, screen capture, and various file transfer. The basic functions are merged into the integrated functional model proposed in Part I. Since the model is flexible enough to partially include the quasi-basic functions in addition to the basic functions, a large number of products may stem from modifications of the functional model.
