• Title/Summary/Keyword: informal document

Feature Expansion based on LDA Word Distribution for Performance Improvement of Informal Document Classification (비격식 문서 분류 성능 개선을 위한 LDA 단어 분포 기반의 자질 확장)

  • Lee, Hokyung;Yang, Seon;Ko, Youngjoong
    • Journal of KIISE / v.43 no.9 / pp.1008-1014 / 2016
  • Data such as Twitter posts, Facebook posts, and customer reviews belong to the informal document group, whereas newspapers, which go through a grammar correction step, belong to the formal document group. Finding consistent rules or patterns is harder in informal documents than in formal ones, so additional approaches are needed to improve informal document analysis. In this study, we classified Twitter data, a representative type of informal document, into ten categories. To improve performance, we revised and expanded the features based on the LDA (Latent Dirichlet Allocation) word distribution. Using the top-ranked LDA words, the other words were separated or bundled, and the feature set was expanded repeatedly in this way. Finally, we conducted document classification with the expanded features. Experimental results indicated that the proposed method improved the micro-averaged F1-score by 7.11 percentage points compared with the results before the feature expansion step.
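
The abstract reports its gain as a micro-averaged F1-score. As a rough illustration of that metric only (not the authors' evaluation code), the sketch below pools true positives, false positives, and false negatives over all categories before computing precision and recall. The function name `micro_f1` and the toy labels are assumptions made for this example.

```python
from collections import Counter


def micro_f1(gold, predicted):
    """Micro-averaged F1 for single-label classification.

    Counts are pooled over all categories before precision and
    recall are computed, so frequent categories weigh more.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for category g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true category but was missed
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    if total_tp == 0:
        return 0.0
    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, not data from the paper.
gold = ["sports", "politics", "sports", "tech"]
pred = ["sports", "sports", "sports", "tech"]
score = micro_f1(gold, pred)
```

A change reported in percentage points is the plain difference between two such scores multiplied by 100, not a relative change.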

Document Management System based on SGML (SGML 을 기반으로 하는 문서관리시스템 개발)

  • Park, Nam-Kyu;Shin, D.S.
    • IE interfaces / v.10 no.3 / pp.109-116 / 1997
  • A document management system is a tool, based on the document life cycle concept, for the structured management of the various documents within an organization. In this paper, we describe the development process of a document management system based on SGML. The system we developed supports a variety of document types, such as informal data, HTML, and CGI. With it, users can access documents through an internet browser and can also add new documents or modify existing ones.

Feature Selection for a Hangul Text Document Classification System (한글 텍스트 문서 분류시스템을 위한 속성선택)

  • Lee, Jae-Sik;Cho, You-Jung
    • Proceedings of the Korea Intelligent Information System Society Conference / 2003.05a / pp.435-442 / 2003
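  • An information retrieval system is a tool that helps users find the information they need within a huge volume of information. It must deliver the information a user requests more accurately, more effectively, and more efficiently. To do so, it is essential to select, from the countless features in a document, only those that reflect the document's characteristics well, and to use them appropriately. This study therefore focuses on improving the existing Hangul document classification system (CB_TFIDF)[1] in two respects: accuracy and speed. From the various feature selection techniques that have been applied to English text document classification systems, we take three well-known ones (Information Gain, Odds Ratio, and Document Frequency Thresholding), use them to build selective case bases, apply these to the Hangul text document classification system, and compare and evaluate the resulting performance. We then aim to propose the feature selection technique best suited to Hangul document classification, together with guidelines for feature selection.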
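
Two of the feature selection techniques named in the abstract can be sketched in a few lines. The code below is a minimal illustration using textbook definitions of Information Gain and Document Frequency Thresholding over binary term presence. It is not the CB_TFIDF system, and the function names and toy documents are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of category labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in category entropy from knowing whether `term` occurs."""
    with_term = [l for d, l in zip(docs, labels) if term in d]
    without_term = [l for d, l in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def document_frequency_filter(docs, min_df):
    """Keep only terms that occur in at least `min_df` documents."""
    df = Counter(term for d in docs for term in set(d))
    return {term for term, count in df.items() if count >= min_df}


# Hypothetical toy corpus: each document is a set of terms.
docs = [{"goal", "match"}, {"goal", "team"}, {"vote", "party"}, {"vote", "team"}]
labels = ["sports", "sports", "politics", "politics"]
ig_goal = information_gain(docs, labels, "goal")   # term separates the classes
ig_team = information_gain(docs, labels, "team")   # term is uninformative
frequent_terms = document_frequency_filter(docs, min_df=2)
```

In this toy corpus "goal" perfectly predicts the category while "team" appears equally in both, which is why the two terms sit at opposite ends of the Information Gain ranking even though both pass the same document frequency threshold.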

A Feature Selection Technique for an Efficient Document Automatic Classification (효율적인 문서 자동 분류를 위한 대표 색인어 추출 기법)

  • 김지숙;김영지;문현정;우용태
    • The Journal of Information Technology and Database / v.8 no.1 / pp.117-128 / 2001
  • Recently, there has been much research on text mining to find interesting patterns or association rules in large collections of textual documents. However, the words extracted from informal documents tend to be irregular and include too many general words, so existing methods have difficulty retrieving knowledge effectively. In this paper, we propose a new feature extraction method that classifies large document collections using association rules, based on an unsupervised learning technique. In experiments, we show the efficiency of the suggested method by extracting features and classifying documents.
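
The abstract does not say how its association rules are scored, so the sketch below only shows the two standard measures, support and confidence, computed over documents treated as sets of words. The function names and the toy documents are assumptions made for this example, not the authors' method.

```python
def support(docs, items):
    """Fraction of documents that contain every word in `items`."""
    items = set(items)
    return sum(1 for d in docs if items <= d) / len(docs)


def confidence(docs, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(docs, antecedent)
    if base == 0:
        return 0.0
    return support(docs, set(antecedent) | set(consequent)) / base


# Hypothetical toy documents, each reduced to its set of words.
docs = [
    {"data", "mining", "text"},
    {"data", "mining"},
    {"data", "text"},
    {"web"},
]
pair_support = support(docs, {"data", "mining"})
rule_confidence = confidence(docs, {"data"}, {"mining"})
```

Word sets whose rules clear chosen support and confidence thresholds could then serve as candidate features, which is one plausible reading of the approach described above.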

A Design on Informal Big Data Topic Extraction System Based on Spark Framework (Spark 프레임워크 기반 비정형 빅데이터 토픽 추출 시스템 설계)

  • Park, Kiejin
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.521-526 / 2016
  • Because online informal text data are massive in volume and unstructured in nature, there are limits to applying traditional relational data model technologies to data storage and data analysis jobs. Moreover, with massive social data being generated dynamically, real-time analysis of social users' reactions is hard to accomplish. In this paper, to capture the semantics of massive, informal online documents easily with an unsupervised learning mechanism, we design and implement an automatic topic extraction system based on the mass of words that make up a document. The input data set for the proposed system is generated first, using an N-gram algorithm to build multi-word units that capture the meaning of sentences more precisely, and Hadoop and Spark (an in-memory distributed computing framework) are adopted to run the topic model. In the experiment phase, terabyte-scale input data are preprocessed and the proposed topic extraction steps are applied. We conclude that the proposed system performs well in extracting meaningful topics in a timely manner, because intermediate results come directly from main memory rather than being read from an HDD.
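
The N-gram step mentioned in the abstract can be illustrated without any cluster framework. The sketch below builds word-level n-grams from a token list in plain Python. It makes no claim about how the paper's Hadoop and Spark pipeline does this, and the function name `word_ngrams` and the sample tokens are assumptions made for this example.

```python
def word_ngrams(tokens, n):
    """Return all runs of n consecutive tokens, each joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical tokenized sentence.
tokens = ["big", "data", "topic", "model"]
bigrams = word_ngrams(tokens, 2)
trigrams = word_ngrams(tokens, 3)
```

Feeding multi-word units such as "topic model" to a topic model, rather than the separate words "topic" and "model", is what lets the extracted topics keep phrase-level meaning.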

Supporting Media using XML-based Messages on Online Conversational Activity (온라인 대화 행위에서 XML 기반 메시지를 이용한 미디어 지원)

  • Kim, Kyung-Deok
    • The KIPS Transactions:PartB / v.11B no.1 / pp.91-98 / 2004
  • This paper proposes a way to support various media in online conversational activity using XML (Extensible Markup Language). The method converts media information into XML-based messages and handles them like conventional text-based messages. The XML-based messages are merged into a single XML document, and an HTML document is then generated on the server from the XML document and an XSLT document. A user at each client can play or present media through a hyperlink on the HTML document that is associated with the media information. The suggested method supports the use of various media (text, image, audio, video, documents, etc.) and the efficient maintenance of font size, color, and style in messages through the extension and modification of XML tags. As an application, this paper implemented a client-server system that supports media in online conversational activity. A user at each client inputs a text- or media-based message using a Java applet and servlet, and the conversational messages on every user's interface are updated automatically whenever a user inputs a new message. Media in conversational messages are played or presented when a user clicks the hyperlink. Applications for the media presentation include distance learning, online games, and collaboration.
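
The message flow described above (media wrapped in XML messages, then rendered as HTML hyperlinks) can be sketched as follows. This is only an illustration: the element and attribute names (`conversation`, `message`, `media`, `user`, `href`) are hypothetical, and plain Python string building stands in for the XSLT step because the Python standard library has no XSLT processor.

```python
import xml.etree.ElementTree as ET
from html import escape


def messages_to_html(xml_text):
    """Render a conversation XML document as an HTML list.

    Text messages become plain list items; media messages become
    hyperlinks that a client could click to play or present the media.
    """
    root = ET.fromstring(xml_text)
    items = []
    for msg in root.findall("message"):
        user = escape(msg.get("user", ""))
        media = msg.find("media")
        if media is not None:
            href = escape(media.get("href", ""), quote=True)
            label = escape(media.text or href)
            body = f'<a href="{href}">{label}</a>'
        else:
            body = escape(msg.text or "")
        items.append(f"<li><b>{user}</b>: {body}</li>")
    return "<ul>" + "".join(items) + "</ul>"


# Hypothetical conversation with one text message and one media message.
sample = (
    '<conversation>'
    '<message user="kim">hello</message>'
    '<message user="lee"><media type="image" href="pic.png">photo</media></message>'
    '</conversation>'
)
page = messages_to_html(sample)
```

Keeping media as ordinary messages in one XML document means that adding a new media type only requires a new tag and a matching rendering rule, which is the maintenance benefit the abstract attributes to XML tags.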

Electronic Journal: Replacement or New Paradigm? (전자저널 : 점진적인 대체인가, 새로운 패러다임인가?)

  • 남수현;설성수;윤배현
    • Journal of Korea Technology Innovation Society / v.1 no.1 / pp.83-95 / 1998
  • Although the academic journal has played an important role in the diffusion of knowledge and the confirmation of knowledge advancement, it has revealed several difficulties: long processing time for publishing, unidirectional communication, a closed review process, and so on. An electronic journal (E-Journal) can solve these problems and adds several new advantages, such as multimedia expression of documents and pre- and post-publication review. Moreover, it should be noted that an E-Journal is not an alternative medium that simply replaces an existing journal, but a new paradigm for scholarly communication. Among the several E-Journal issues this paper reviews, the introduction of the concept of universal service and of an informal document style for flexible communication is especially notable. For the effective diffusion of E-Journals, information infrastructure such as high-speed telecommunication networks and digital libraries is urgently needed. Government subsidies, as in Great Britain, are necessary for E-Journal publishing.

A Qualitative Case Study of an Exemplary Science Teacher's Earth Systems Education Experiences

  • Lee, Hyon-Yong
    • Journal of the Korean earth science society / v.31 no.5 / pp.500-520 / 2010
  • The purposes of this case study were (1) to explore one experienced teacher's views on Earth Systems Education and (2) to describe and document the characteristics of the Earth Systems Education (ESE) curriculum provided by an exemplary middle school science teacher, Dr. J. All the essential pieces of evidence were collected from observations, interviews with the experienced teacher and his eighth grade students, informal conversations, document analysis, and field notes. NUD*IST for MS Windows was used for the initial data reduction process and to narrow the focus of the analysis. All transcriptions and written documents were reviewed carefully and repeatedly to find rich evidence through inductive and content analysis. The findings revealed that ESE provided a conceptual focus and theme for organizing his school curriculum. The curriculum offered opportunities for students to learn relevant local topics and to connect classroom learning to the real world. The curriculum also played an important role in developing students' valuing and appreciation of Earth systems and their concern for the local environment. His instructional strategies were highly compatible with the recommendations of constructivist theory. His major teaching methods and strategies were hands-on learning, authentic activity-based learning, cooperative learning, project-based learning (e.g., mini-projects), and science field trips. With respect to his views on the benefits and difficulties associated with ESE, the most important benefit was that the curriculum provided authentic, hands-on activities and made connections between students and their everyday life experiences. In addition, he believed that it was not difficult to teach using ESE. However, the lack of time devoted to field trips and a lack of suitable resource materials were obstacles to the implementation of the curriculum. Implications for science education and future research are suggested.

Appropriate Technology and the Triple-Helix Model: A Case Study of Korea-Tanzania Appropriate Technology Center (적정기술과 트리플 헬릭스 모델: 한국-탄자니아 적정기술거점센터 사례 연구)

  • Lee, Sooa
    • Journal of Appropriate Technology / v.5 no.1 / pp.38-45 / 2019
  • In 2017, aiming to develop, teach, and commercialize innovative appropriate technologies suited to the Tanzanian environment, the Ministry of Science and ICT in Korea established an innovative technology and energy center at a Tanzanian university. Using qualitative methodologies such as an ethnography of a research project, document analyses of memoranda of understanding, journal articles, reports, announcements, and newspaper articles, participant observation of formal and informal meetings, and semi-structured interviews with participants in an appropriate technology center, this study examines how the triple-helix model of S&T innovation has been applied to the development of the Korea-Tanzania appropriate technology center. Despite its growing importance in national S&T policies, only a few studies have discussed official development aid (ODA) in association with innovation. The analysis of the appropriate technology center within the framework of the triple-helix model shows the close tie between ODA and the cross-national innovation promoted in Korea. This study also contributes to understanding the embedded organizational structure, conflicts, and barriers of an ODA project in Korea.

Status of Supplier Selection Status and the Practical Use of Purchase Specifications for Self-operated School Foodservices in the Seoul Area (서울 지역 직영 학교 급식의 공급 업체 선정 및 식재료 규격서 사용 실태 조사)

  • Ryu, Kyung
    • The Korean Journal of Food And Nutrition / v.20 no.2 / pp.226-239 / 2007
  • The purpose of this study was to identify the problems in the purchasing processes of school foodservices that should be corrected for foodservice safety, by examining the purchasing processes and the status of supplier selection. A questionnaire was given to 300 dietitians working at self-operated foodservices. Ninety-eight responses, excluding incomplete answers, were used for the statistical analysis. The survey consisted of three parts: the general characteristics of the school foodservice and dietitian, purchasing processes and supplier selection, and the purchase specifications. We found that 84% of contracts were made by informal purchasing, and the contract period was 6 months or one year. For supplier selection, the problems with the document screening systems were the superficiality of the content (45.7%) and the absence or lack of clarity of the appraisal criteria (34.8%). The main problems with the facility and equipment standards of suppliers were unclear evaluation methods for content (41.1%) and inappropriate appraisal lists (21.1%), while unclear evaluation methods for content (41.9%) and the absence or lack of clarity of the appraisal criteria (20.4%) were the problems with the supplier evaluation checklist. When the Food Labeling Standards were used to select suppliers, confirmation of the sell-by date and the storage method had the highest score, at 3.85 out of 5. For supplier selection, only 25% of contracts were made using the purchase specifications. The levels of satisfaction with Kimchi and rice cake suppliers differed significantly by employment type and educational background, respectively. Depending on working experience, satisfaction differed significantly for the use of document screening as a standard for the selection and management of suppliers, and for the facility and equipment standards of suppliers. The use of purchase specifications differed by employment type, while the use of purchase specifications for contracts differed by working experience. These results imply that the specialization of suppliers is necessary to ensure food safety. Therefore, objective methods for evaluating suppliers should be developed by the government, and appropriate education programs for dietitians should be prepared to increase the use of purchase specifications.