• Title/Summary/Keyword: 텍스트화 (textualization)

Improvement Plan of Web Site FAQ using Text Mining : Focused on the S University Case (텍스트마이닝을 활용한 웹사이트 FAQ 개선방안: S대학교 사례를 중심으로)

  • Ahn, su-hyun;Jo, jeong-hyun;Lee, sang-jun
    • Proceedings of the Korea Contents Association Conference / 2018.05a / pp.361-362 / 2018
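  • This study collects unstructured data posted on the Q&A board of a university website and uses text mining and network analysis to identify association patterns among frequently occurring keywords. Building the FAQ board from the analysis results is expected to reduce repetitive inquiries, thereby improving user convenience and administrative efficiency and, further, enabling smooth two-way communication.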
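
The keyword-association step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' procedure; the tokenized posts, the `cooccurrence` helper, and the choice to count each keyword once per post are all assumptions made here for illustration. The pair counts could serve as edge weights in a keyword network.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already-tokenized Q&A posts (illustrative data, not from the study).
posts = [
    ["tuition", "refund", "deadline"],
    ["tuition", "scholarship"],
    ["tuition", "refund"],
    ["dormitory", "application", "deadline"],
]

def cooccurrence(docs):
    """Count keyword frequencies and keyword-pair co-occurrences per post."""
    freq = Counter()
    pairs = Counter()
    for tokens in docs:
        unique = sorted(set(tokens))  # count each keyword once per post
        freq.update(unique)
        # Every unordered keyword pair in the same post is one co-occurrence.
        pairs.update(combinations(unique, 2))
    return freq, pairs

freq, pairs = cooccurrence(posts)
print(freq.most_common(3))   # most frequent keywords
print(pairs.most_common(3))  # strongest keyword associations
```

Frequent pairs such as a fee-related keyword appearing with a deadline-related one would be natural candidates for a single FAQ entry.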

JDBC based Distributed Image search Web Agent (JDBC를 이용한 분산 환경에서의 이미지 검색 웹 에이전트)

  • 차상환;황병곤
    • Proceedings of the Korea Multimedia Society Conference / 2004.05a / pp.644-651 / 2004
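  • This paper presents a system that collects and searches images on the web using a multi-threaded distributed architecture. From the text of web documents, it extracts image names, file extensions, and the text attached to links, and stores the image data in a database via JDBC. To improve query efficiency, the stored images can be searched from a web browser either by a sketch drawn by the user or by an example-image query. The multi-threaded distributed architecture also reduces the time needed to build the database.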

Expanded Korean Chunking by $k$-NN ($k$-NN으로 확장된 한국어 단위화)

  • 박성배;장병탁;김영택
    • Proceedings of the Korean Information Science Society Conference / 2000.10b / pp.182-184 / 2000
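  • In most natural language processing, chunking is a very basic step performed before syntactic parsing: it divides a text sentence into grammatically related units. Chunking can therefore reduce the memory and time required for syntactic or semantic analysis. Although hand-crafted rules based on intuition generally achieve relatively high chunking performance, this paper uses k-NN, a machine learning technique, to achieve more accurate chunking. Trained on 1,273 sentences collected from Internet home pages, chunking extended with k-NN showed a 2.3% increase in accuracy over chunking without the extension.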
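
As a rough illustration of how k-NN can assign chunk tags, the sketch below classifies a token from a window of part-of-speech tags. The abstract does not specify the paper's features, tag set, distance measure, or how k-NN is combined with the rules; the part-of-speech window, the B-/I- chunk labels, the Hamming distance, and the training examples are all assumptions made here.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length feature tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k=3):
    """Return the majority chunk tag among the k training examples nearest to query."""
    nearest = sorted(train, key=lambda ex: hamming(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training data: (previous POS, current POS, next POS) -> chunk tag.
train = [
    (("BOS", "NOUN", "JOSA"), "B-NP"),
    (("NOUN", "JOSA", "VERB"), "I-NP"),
    (("JOSA", "VERB", "EOS"), "B-VP"),
    (("BOS", "NOUN", "NOUN"), "B-NP"),
    (("NOUN", "NOUN", "JOSA"), "I-NP"),
    (("ADV", "VERB", "EOS"), "B-VP"),
]

# Tag a sentence-initial noun followed by a verb.
print(knn_predict(train, ("BOS", "NOUN", "VERB")))
```

In a setup like the one the abstract describes, such a classifier would presumably handle the cases the rules get wrong, but that division of labor is not stated in the abstract.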

FolksoViz: A Subsumption-based Folksonomy Visualization Using the Wikipedia (FolksoViz: Wikipedia 본문을 이용한 상하위 관계 기반 폭소노미 시각화 기법)

  • Lee, Kang-Pyo;Kim, Hyun-Woo;Jang, Chung-Su;Kim, Hyoung-Joo
    • Journal of KIISE:Computing Practices and Letters / v.14 no.4 / pp.401-411 / 2008
  • Folksonomy, which is created through collaborative tagging by many users, is one of the driving factors of Web 2.0. Tags can be regarded as web metadata describing a web document. Finding the semantic subsumption relationships between tags created through collaborative tagging can help users understand the metadata more intuitively. In this paper, targeting del.icio.us tag data, we propose a method named FolksoViz for deriving subsumption relationships between tags by using Wikipedia texts. For this purpose, we propose a statistical model for deriving subsumption relationships based on the frequency of each tag in the Wikipedia texts, and a TSD (Tag Sense Disambiguation) method for mapping each tag to a corresponding Wikipedia text. The derived subsumption pairs are visualized on the screen. The experiment shows that the proposed algorithm found the correct subsumption pairs with high accuracy.

Natural Scene Text Binarization using Tensor Voting and Markov Random Field (텐서보팅과 마르코프 랜덤 필드를 이용한 자연 영상의 텍스트 이진화)

  • Choi, Hyun Su;Lee, Guee Sang
    • Smart Media Journal / v.4 no.4 / pp.18-23 / 2015
  • In this paper, we propose a method for detecting the number of clusters. By using tensor voting, the method can improve the performance of the Gaussian mixture model used in the conventional Markov random field method. The key point of the proposed method is that it extracts the number of centers from the continuity of the saliency map of the tensor voting input tokens. First, we separate the foreground and background candidate regions in a given natural image. Next, we extract the appropriate number of clusters for each candidate region by applying tensor voting. The detected number of clusters allows the Gaussian mixture model to be fitted accurately. We then obtain the binarized text image of the natural scene by calculating the unary term and the pairwise term of the Markov random field. Experiments confirm that the proposed method returns the optimal number of clusters and improves the text binarization results.

A Study on Learners' Needs Analysis Using Text Mining Techniques : Focusing on SNS (텍스트 마이닝 기법을 이용한 학습 수요자 요구에 관한 연구 : SNS를 중심으로)

  • Lee, Myung-Suk;Lee, Kyung-Mi;Lim, Youg-Kyu;Han, Kyung-Im;Park, Hye-Jung
    • Proceedings of the Korean Society of Computer Information Conference / 2016.01a / pp.259-261 / 2016
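  • This study examines the gap between learners' needs regarding liberal arts education and the liberal arts courses currently offered. Learners' varied opinions are collected from SNS, useful information is extracted with text mining techniques, and learners' needs are presented through visualization analysis. The analysis shows that learners preferred classes with good interaction with the instructor, classes in which learners can participate, and self-directed learning. As for course offerings, learners asked for foreign languages needed for employment, courses leading to certifications, and practical courses applicable to everyday life, which differs from the courses actually offered with a balanced curriculum in mind.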

A Study on the Method for Extracting the Purpose-Specific Customized Information from Online Product Reviews based on Text Mining (텍스트 마이닝 기반의 온라인 상품 리뷰 추출을 통한 목적별 맞춤화 정보 도출 방법론 연구)

  • Kim, Joo Young;Kim, Dong soo
    • The Journal of Society for e-Business Studies / v.21 no.2 / pp.151-161 / 2016
  • In the era of Web 2.0, characterized by openness, sharing, and participation, it is easy for internet users to produce and share data. The amount of unstructured data, which accounts for most of the digital world's data, has increased exponentially. One kind of unstructured data, personal online product reviews, is valuable both to the companies that produce the products and to potential customers interested in them. Extracting useful information from large volumes of scattered review data requires collecting, storing, preprocessing, and analyzing the data and drawing conclusions. We therefore introduce a text-mining methodology that applies natural language processing technology to text data such as product reviews in order to extract structured data using R. We also introduce data mining to derive purpose-specific customized information from the structured review information obtained by text mining.

Identifying Social Relationships using Text Analysis for Social Chatbots (소셜챗봇 구축에 필요한 관계성 추론을 위한 텍스트마이닝 방법)

  • Kim, Jeonghun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.85-110 / 2018
  • A chatbot is an interactive assistant that utilizes many communication modes: voice, images, video, or text. It is an artificial intelligence-based application that responds to users' needs or solves problems during user-friendly conversation. However, current chatbots focus on understanding and performing tasks requested by the user; their ability to generate personalized conversation suitable for relationship-building is limited. Recognizing the need to build a relationship and making suitable conversation is more important for social chatbots, which require social skills similar to those of problem-solving chatbots like the intelligent personal assistant. The purpose of this study is to propose a text analysis method that evaluates relationships between chatbots and users based on content input by the user and adapted to the communication situation, enabling the chatbot to conduct suitable conversations. To evaluate the performance of this method, we examined learning and verified the results using actual SNS conversation records. The results of the analysis will aid in implementing social chatbots, as this method yields excellent results even when the private profile information of the user is excluded for privacy reasons.

A study on Customized Foreign Language Learning Contents Construction (사용자 맞춤형 외국어학습 콘텐츠 구성을 위한 연구)

  • Kim, Gui-Jung;Yi, Jae-Il
    • Journal of Digital Convergence / v.17 no.1 / pp.189-194 / 2019
  • This paper studies a methodology for building content customized to each user's tendencies by developing learning content that utilizes IT. Learners around the world use mobile devices and mobile learning content for learning activities in various fields, and foreign language learning is one of the typical areas of mobile learning. The foreign language learning content proposed in this study is constructed from the learner's spoken and text information in accordance with the user's vocal tendencies. A suitable method is needed to translate the user's native-language text into the target language and turn it into user-friendly content.

Suggestions on how to convert official documents to Machine Readable (공문서의 기계가독형(Machine Readable) 전환 방법 제언)

  • Yim, Jin Hee
    • The Korean Journal of Archival Studies / no.67 / pp.99-138 / 2021
  • In the era of big data, analyzing not only structured data but also unstructured data is emerging as an important task. Official documents produced by government agencies are also subject to big data analysis as large text-based unstructured data. From the perspective of internal work efficiency, knowledge management, records management, etc., it is necessary to apply big data analysis to public documents to derive useful implications. However, since many of the public documents currently held by public institutions are not in an open format, a preprocessing step that extracts text from the bitstream is required before big data analysis. In addition, since contextual metadata is not sufficiently stored in the document file, separate efforts to secure metadata are required for high-quality analysis. In conclusion, current official documents have a low level of machine readability, which makes big data analysis expensive.