• Title/Summary/Keyword: Web Contents Mining


A study on the Analysis and Forecast of Effect Factors in e-Learning Reuse Intention Using Rule Induction Techniques (규칙유도기법을 이용한 이러닝 시스템의 재이용의도 영향요인 분석 및 예측에 관한 연구)

  • Bae, Jae-Kwon;Kim, Jin-Hwa;Jeong, Hwa-Min
    • Journal of Information Technology Applications and Management / v.17 no.2 / pp.71-90 / 2010
  • Electronic learning (e-learning) has generated considerable hype among companies, universities, and other educational institutions. It has led to phenomenal growth in web-based learning and in experimentation with multimedia, video conferencing, and internet-based technologies. Many researchers are interested in the factors that affect the performance of e-learning or e-learning services. This study proposes prediction models for e-learning system reuse, in which ten influence factors (system accessibility, system stability, information clarity, information validity, self-regulated efficacy, computer self-efficacy, perceived usefulness, perceived ease of use, flow, and parental expectation) positively affect e-learners' intention to reuse. A web survey was conducted among the full members of e-learning education institute A in Seoul, Republic of Korea, an exclusively e-learning company that provides real-time video lectures via a desktop conferencing system. The survey ran for 20 days from November 5, 2009, through the company's e-learning web site. Three data mining techniques were used: multivariate discriminant analysis, CART, and the C5.0 algorithm. Based on the data mining results, the study aims to provide e-learning service providers, e-learning operators, and content developers with marketing and management strategies for improving e-learning services.


Dynamic Link Recommendation Based on Anonymous Weblog Mining (익명 웹로그 탐사에 기반한 동적 링크 추천)

  • Yoon, Sun-Hee;Oh, Hae-Seok
    • The KIPS Transactions:PartC / v.10C no.5 / pp.647-656 / 2003
  • In Webspace, mining traversal patterns means understanding users' path traversal patterns. This kind of mining has a unique characteristic: objects (for example, URLs) may be visited because of their positions rather than their contents, since users move from object to object according to the information services provided. As a consequence, extracting meaningful information from these data becomes very complex. Discovering traversal patterns has recently become an important problem in data mining, because research on improving the quality of information services has been increasing. This paper presents a Dynamic Link Recommendation (DLR) algorithm that recommends link sets on a Web site by mining frequent traversal patterns. It can be applied to any Web site with massive amounts of data. Our experiments with two real Weblog data sets indicate that the method outperforms the traditional method.
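
The idea of recommending links from frequent traversal patterns can be illustrated with a minimal sketch. This is not the DLR algorithm from the paper, whose details the abstract does not give; it only counts page-to-page transitions across sessions and recommends the most frequent next pages. The function names, the `min_support` threshold, and the sample sessions are all assumptions made for illustration.

```python
from collections import Counter

def mine_transitions(sessions, min_support=2):
    """Keep page-to-page transitions that occur in at least min_support sessions."""
    counts = Counter()
    for path in sessions:
        # Count each transition once per session so a looping user is not over-weighted.
        counts.update(set(zip(path, path[1:])))
    return {pair: n for pair, n in counts.items() if n >= min_support}

def recommend_links(frequent, current_page, top_k=2):
    """Return up to top_k next pages for current_page, most frequent first."""
    candidates = [(n, nxt) for (src, nxt), n in frequent.items() if src == current_page]
    # Sort by descending count, then by URL so ties are resolved deterministically.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [nxt for _, nxt in candidates[:top_k]]

# Hypothetical anonymous sessions reconstructed from a web log.
sessions = [
    ["/home", "/news", "/sports"],
    ["/home", "/news", "/weather"],
    ["/home", "/shop"],
    ["/news", "/sports"],
]
frequent = mine_transitions(sessions)
recommended = recommend_links(frequent, "/news")
```

With these sample sessions, only the transitions `/home` to `/news` and `/news` to `/sports` reach the support threshold, so `/sports` is the single recommendation for a visitor on `/news`. A real system would likely mine longer paths and update the counts as new log entries arrive.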

Detecting spam mails using Text Mining Techniques (광고성 메일을 자동으로 구별해내는 Text Mining 기법 연구)

  • 이종호
    • Proceedings of the Korean Society for Cognitive Science Conference / 2002.05a / pp.35-39 / 2002
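  • Each person receives around 10 advertising mails a day on average, and it is difficult to remove them efficiently on the basis of the subject line alone. The difficulty arises mainly because advertisers cleverly disguise subject lines as greetings or replies, and such efforts to conceal advertisements so that they cannot be deleted by subject are likely to continue. A method is therefore needed that adapts to changes in subject lines and blocks spam by automatically grasping the meaning of the content as well as the subject. This study approaches the problem as classification of normal mail versus spam. Learning the criteria for such classification automatically would require grasping meaning by reading sentences as a person does, but the cost of machine sentence comprehension is enormous, so we applied and compared text mining and web contents mining techniques that efficiently substitute for such comprehension at the word level and similar levels. About 500 advertising mails were used as the sample, and the same techniques were applied to a set of normal mails (500) to measure false alarms as well. According to the comparison, the variability of mail patterns was too large to be handled by the wrapper generation method, and association rule analysis and link analysis were evaluated as superior.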
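
As a rough illustration of word-level association rules for spam classification and of measuring false alarms on normal mail, consider the sketch below. It is not the study's procedure: it mines only single-word rules of the form "word implies spam" using the usual support and confidence measures, and the thresholds, function names, and sample messages are invented for illustration.

```python
from collections import Counter

def tokenize(text):
    """Lowercased set of whitespace-separated words (each word counted once per mail)."""
    return set(text.lower().split())

def mine_spam_rules(spam, normal, min_support=0.25, min_confidence=0.8):
    """Return words w for which the rule 'w -> spam' meets both thresholds."""
    spam_df = Counter(w for m in spam for w in tokenize(m))
    normal_df = Counter(w for m in normal for w in tokenize(m))
    total = len(spam) + len(normal)
    rules = set()
    for word, n_spam in spam_df.items():
        support = n_spam / total                          # share of all mails that are spam and contain the word
        confidence = n_spam / (n_spam + normal_df[word])  # share of mails with the word that are spam
        if support >= min_support and confidence >= min_confidence:
            rules.add(word)
    return rules

def is_spam(message, rules):
    """Flag a message when it contains any rule word."""
    return bool(tokenize(message) & rules)

def false_alarm_rate(normal, rules):
    """Fraction of normal mails wrongly flagged as spam."""
    return sum(is_spam(m, rules) for m in normal) / len(normal)

# Invented training mails (assumption: already labeled by hand).
train_spam = ["free prize click now", "click now for free offer", "re: hello free prize"]
train_normal = ["re: hello meeting at noon", "lunch now", "project report attached", "free lunch today"]
rules = mine_spam_rules(train_spam, train_normal)

# Invented held-out normal mails used only to measure false alarms.
heldout_normal = ["team prize ceremony at noon", "see you at lunch"]
rate = false_alarm_rate(heldout_normal, rules)
```

On this toy data the mined rule words are `prize` and `click`, and one of the two held-out normal mails is flagged, which shows why false alarms need to be measured on normal mail separately from the spam sample.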


A Study on Web Mining System for Real-Time Monitoring of Opinion Information Based on Web 2.0 (의견정보 모니터링을 위한 웹 마이닝 시스템에 관한 연구)

  • Joo, Hae-Jong;Hong, Bong-Hwa;Jeong, Bok-Cheol
    • Journal of the Korea Society of Computer and Information / v.15 no.1 / pp.149-157 / 2010
  • As Internet use has increased, so has the demand for opinion information posted on the Internet. However, such information exists only on individual websites, and people searching for it find it inconvenient to visit each site. This paper focuses on a system that extracts and analyzes opinion information through Web mining, based on statistics collected from Web contents. That is, users' opinions scattered across several websites can be automatically extracted and analyzed. The system provides an opinion search service that lets users search for positive and negative opinions in real time and check their statistics. Users can also search for and monitor other opinion information in real time by entering keywords. In tests, the proposed technologies performed better than existing ones. The ability to extract positive and negative opinion information was assessed; specifically, movie review sentences were used as test data and the results were analyzed.

A Designing for Successful Learning on the Web

  • Ahn, Jeong-Yong;Han, Kyung-Soo;Han, Beom-Soo
    • Journal of the Korean Data and Information Science Society / v.14 no.4 / pp.1083-1090 / 2003
  • Web-based learning is currently an active area of research, and a considerable number of studies have been conducted on its application in the learning environment. However, despite many advances in the research and development of educational content, questions about how the environment affects learning remain largely unanswered. In this article, we propose a Web-based learning environment to improve the educational effect. The goal of this article is not to provide a complete system to support Web-based learning but rather to describe some meaningful strategies and fundamental design concepts that use information technologies to support teaching and learning.


Improvement Plan of Web Site FAQ using Text Mining : Focused on the S University Case (텍스트마이닝을 활용한 웹사이트 FAQ 개선방안: S대학교 사례를 중심으로)

  • Ahn, su-hyun;Jo, jeong-hyun;Lee, sang-jun
    • Proceedings of the Korea Contents Association Conference / 2018.05a / pp.361-362 / 2018
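  • This study collects unstructured data posted on the Q&A board of a university web page and uses text mining and network analysis to identify association patterns among frequently occurring keywords. If an FAQ (frequently asked questions) board is organized on the basis of the analysis results, we expect that it will reduce repetitive inquiries, thereby improving convenience for users and administrative efficiency, and in turn enable smooth two-way communication.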


Design and Adaptation for Internet News Data Extraction Middleware(INDEM) System

  • Sun, Bok-Keun
    • Journal of the Korea Society of Computer and Information / v.21 no.4 / pp.55-62 / 2016
  • In this paper, we propose the INDEM (Internet News Data Extraction Middleware) system for removing unnecessary data from internet news. Although data on the internet can be used in various fields, such as a data source for IR (Information Retrieval), data mining, and knowledge information services, it contains a lot of unnecessary information. Removing the unnecessary data is a problem that must be solved before studying knowledge-based information services built on web page data. The INDEM system parses HTML and explores XPaths to perform the analysis. Users can use INDEM simply by implementing an abstract class that INDEM provides, and thereby obtain the analysis information. Through this process, INDEM delivers to users the analysis information, including the main contents of a news site. In this paper, the INDEM system was adapted to a stand-alone system and a web service system and evaluated on 16 news sites. The results show that the performance of the INDEM system is affected more by the size of the HTML source data and the complexity of the HTML grammar used than by the size of the main news data.
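
To make the notion of parsing HTML and exploring XPaths concrete, here is a small sketch that groups visible text by an XPath-like path and picks the path carrying the most text as the likely main content. It is only an assumption about how such an analysis could look, not the INDEM implementation or its abstract class; the class name `PathTextCollector`, the "most text wins" heuristic, and the sample page are hypothetical, and only the Python standard library is used.

```python
from collections import defaultdict
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}  # tags that never get a closing tag
SKIP_TAGS = {"script", "style"}                           # content that is not visible text

class PathTextCollector(HTMLParser):
    """Collect visible text under the XPath-like path of its enclosing elements."""

    def __init__(self):
        super().__init__()
        self.stack = []                       # open elements as (tag, path step) pairs
        self.text_by_path = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        cls = dict(attrs).get("class")
        step = f"{tag}[@class='{cls}']" if cls else tag
        self.stack.append((tag, step))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; unclosed inner tags are discarded.
        if any(t == tag for t, _ in self.stack):
            while self.stack.pop()[0] != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text or any(t in SKIP_TAGS for t, _ in self.stack):
            return
        path = "/" + "/".join(step for _, step in self.stack)
        self.text_by_path[path].append(text)

def main_content(html):
    """Return (path, text) for the path that holds the most visible text."""
    collector = PathTextCollector()
    collector.feed(html)
    collector.close()
    groups = collector.text_by_path
    best = max(groups, key=lambda p: sum(len(t) for t in groups[p]))
    return best, " ".join(groups[best])

# Hypothetical news page with navigation, article body, footer, and a script.
page = """
<html><body>
<div class="nav"><a href="/">Home</a> <a href="/sports">Sports</a></div>
<div class="article">
<p>The city council approved the new transit plan on Monday.</p>
<p>Construction is expected to begin next spring.</p>
</div>
<div class="footer"><p>Copyright notice</p></div>
<script>var tracking = "a long string that should never count as article text at all";</script>
</body></html>
"""
path, text = main_content(page)
```

For the sample page the selected path is the paragraph path under the `article` div, so the navigation links, footer, and script text are left out. A text-length heuristic like this is fragile on pages with long comment sections or sidebars, which is one reason per-site analysis rules may still be needed.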

Text Mining and Visualization of Papers Reviews Using R Language

  • Li, Jiapei;Shin, Seong Yoon;Lee, Hyun Chang
    • Journal of information and communication convergence engineering / v.15 no.3 / pp.170-174 / 2017
  • In the era of Web 2.0 and big data, people share and discuss scientific papers on social media such as online forums, blogs, Twitter, Facebook, and scholarly communities. In addition to metrics such as numbers of citations, downloads, and recommendations, paper review text is also an effective resource for studying scientific impact. Social media tools improve the research process by recording a series of online scholarly behaviors. This paper examines the large volume of paper reviews generated on social media platforms in order to explore implicit information about research papers. We implemented text mining on review texts using the R language and present the results. We found that Zika virus was the research hotspot and that association research methods were widely used in 2016. We also mined the news reviews about one paper and derived the public opinion.

Personal Sentiment Analysis and Opinion Mining (개인감정분석과 마이닝)

  • Lee, Hyun Chang;Shin, Seong Yoon
    • Proceedings of the Korean Society of Computer Information Conference / 2017.07a / pp.344-345 / 2017
  • Opinion mining and sentiment analysis (OMSA) has emerged as a research discipline during the last 15 years and provides a methodology for computationally processing unstructured data, mainly to extract opinions and identify their sentiments. This relatively new but fast-growing discipline has changed considerably during these years. This paper presents a scientometric analysis of research work on OMSA during 2007-2016. For the literature analysis, research publications indexed in the Web of Science (WoS) database are used as input data. The publication data are analyzed computationally to identify the year-wise publication pattern, the rate of growth of publications, and research areas.


Text Extraction Algorithm using the HTML Logical Structure Analysis (HTML 논리적 구조분석을 통한 본문추출 알고리즘)

  • Jeon, Hyun-Gee;KOH, Chan
    • Journal of Digital Contents Society / v.16 no.3 / pp.445-455 / 2015
  • As internet and computer technology develops, the amount of information has increased exponentially; owing to a variety of web authoring tools, the emergence of new web standards, and more convenient access to a wide variety of web content, web documents are produced very quickly. However, web documents cover a variety of topics and are divided into blocks, each of which may deal with a topic unrelated to the others, and they also contain content unrelated to the body, such as navigation menus, simple decorations, advertisements, and copyright notices. To solve this problem and meet user requirements, we extract only the exact body area of the web document so that the information can be studied effectively. Building on this, we propose a reconstruction method and a web search system that can be optimized to manage documents systematically.