• Title/Summary/Keyword: Web-Blog


Logistic Regression Ensemble Method for Extracting Significant Information from Social Texts (소셜 텍스트의 주요 정보 추출을 위한 로지스틱 회귀 앙상블 기법)

  • Kim, So Hyeon;Kim, Han Joon
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.5
    • /
    • pp.279-284
    • /
    • 2017
  • Currently, in the era of big data, text mining and opinion mining are used in many domains, and one of their most important research issues is extracting significant information from social media. In this paper, we propose a logistic regression ensemble method for finding the main body text in blog HTML. First, we extract structural features and text features from blog HTML tags. Then we construct a classification model, combining logistic regression with an ensemble, that decides whether a given tag contains main body text. One of our important findings is that the main body text can be located through 'depth' features extracted from HTML tags. In an experiment on blog data of diverse topics collected from the web, our tag classification model achieved 99% accuracy and recalled 80.5% of the documents whose tags contain the main body text.
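
A minimal sketch of the tag-classification step described above, assuming BeautifulSoup and scikit-learn; the feature set (tag depth, text length, link density), the bagging-style ensemble of logistic regressions, and the labeler function are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: classify HTML tags as "main body text" vs. not, using
# structural features (including tag depth) and an ensemble of logistic
# regression models. Feature choices here are illustrative only.
from bs4 import BeautifulSoup
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression

def tag_features(tag):
    """Structural features for one tag: depth, text length, link density."""
    depth = len(list(tag.parents))          # the 'depth' feature from the abstract
    text = tag.get_text(" ", strip=True)
    n_links = len(tag.find_all("a"))
    return [depth, len(text), n_links / (len(text) + 1)]

def build_dataset(html_pages, labeler):
    """labeler(tag) -> 1 if the tag holds main body text, else 0 (hypothetical)."""
    X, y = [], []
    for html in html_pages:
        soup = BeautifulSoup(html, "html.parser")
        for tag in soup.find_all(["div", "p", "td", "span"]):
            X.append(tag_features(tag))
            y.append(labeler(tag))
    return X, y

# Bagging is used here only as one plausible ensembling scheme over
# logistic regression base classifiers.
model = BaggingClassifier(LogisticRegression(max_iter=1000), n_estimators=10)
# model.fit(X_train, y_train); model.predict(X_new)
```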

Blog-based Schedule Risk Management System (블로그 기반 공정리스크 관리시스템)

  • Jin, Soo-Myeong;Yoon, You-Sang;Jang, Myung-Hoon;Suh, Sang-Wook
    • Korean Journal of Construction Engineering and Management
    • /
    • v.10 no.5
    • /
    • pp.47-56
    • /
    • 2009
  • As contemporary society becomes information-oriented, major construction companies have recently invested in information management, but small and medium-sized ones have not. Although small and medium-sized companies recognize that information management is necessary and important, they lack the technical expertise to implement it. The purpose of this study is to develop a blog-based schedule risk management system, using IT and web technologies, for construction managers in small and medium-sized construction companies. The system provides an opportunity to identify risk factors and prepare responses to risks arising in the preconstruction phase, and it enables managers and headquarters to predict and handle schedule risks by sharing risk information on blogs.

A Splog Detection System Using Support Vector Machines (지지벡터기계를 이용한 스팸 블로그(Splog) 판별 시스템)

  • Lee, Song-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.1
    • /
    • pp.163-168
    • /
    • 2011
  • Blogs are an easy way to publish information, engage in discussions, and form communities on the Internet. Recently, several varieties of spam blog have appeared whose purpose is to host ads or raise the PageRank of target sites. Our purpose is to develop a system that automatically detects these spam blogs (splogs) among blogs on the Web. After removing the HTML of each blog, the text is tagged by a part-of-speech (POS) tagger. Words and their POS tags are used as feature types. Among these features, we select useful ones with the χ² statistic and train an SVM on the selected features. Our system achieved an F1 measure of 90.5% on the SPLOG data set.
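
A rough sketch of the pipeline described above, with NLTK and scikit-learn standing in for the paper's POS tagger, χ² feature selection, and SVM; the actual feature set, tagger, and parameter choices are assumptions.

```python
# Minimal sketch: word+POS features -> chi-square feature selection -> linear SVM.
# NLTK and scikit-learn are stand-ins for the tools used in the paper.
# (Requires nltk data packages: 'punkt' and 'averaged_perceptron_tagger'.)
import nltk
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

def word_pos_tokens(text):
    """Turn raw text into 'word/POS' tokens, e.g. 'cheap/JJ'."""
    tagged = nltk.pos_tag(nltk.word_tokenize(text))
    return [f"{w.lower()}/{t}" for w, t in tagged]

splog_clf = Pipeline([
    ("vec", CountVectorizer(analyzer=word_pos_tokens)),  # word/POS features
    ("chi2", SelectKBest(chi2, k=5000)),                 # keep top-k features
    ("svm", LinearSVC()),                                # linear SVM classifier
])
# splog_clf.fit(train_texts, train_labels)   # labels: 1 = splog, 0 = normal blog
# splog_clf.predict(["Buy cheap meds now!!!"])
```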

The study for the analysis and implementation of Alternate Reality Game by media convergence (미디어 융합에 따른 대체현실게임의 분석 및 구축에 관한 연구)

  • Kim, Jae-Ha
    • Journal of Broadcast Engineering
    • /
    • v.13 no.5
    • /
    • pp.752-759
    • /
    • 2008
  • The recent convergence of broadcasting and communication blurs the boundaries between media. It gives rise to a new kind of culture in which an unspecified number of participants pursue the same goal in a real-time environment. The ARG (Alternate Reality Game) is a new game genre created by media convergence, and it builds realistic interactivity among users through ARS, SMS, blogs, cafes, and messages. This paper analyzes the implementation, design, and construction schemes of ARGs from the perspectives of human thinking and media. We also discuss how broadcasting can evolve and what role it can play, in combination with ARGs, in activating them.

TRIB: A Clustering and Visualization System for Responding comments on WebBlog (TRIB: 웹블로그 댓글분류 시각화 시스템)

  • Bae, Min-Jung;Lee, Yun-Jung;Ji, Jeong-Hoon;Woo, Gyun;Cho, Hwan-Gyu
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2009.04a
    • /
    • pp.226-229
    • /
    • 2009
  • Recently, Internet bulletin boards and personal blogs have become important media for sharing information and exchanging opinions online. Many blogs reflect issues that are currently drawing social attention. In addition, as more Internet users actively express their opinions through comments or browse other people's opinions, Internet news articles and blog posts attract large numbers of comments. However, since most blogs and Internet portal sites present articles and comments as sequential lists, it is difficult to search for comments of interest or to grasp the overall picture of all the comments. Therefore, this paper proposes TRIB (Telescope for Responding comments for Internet Blog), a system that classifies and visualizes the many comments attached to an article. TRIB classifies comments by content using a predefined user-defined dictionary and visualizes them, so users can see personalized views according to their own interests. Experiments on news articles with more than 1,000 comments demonstrate the comment classification and visualization performance of the TRIB system.
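
TRIB's dictionary-based grouping could look roughly like the following sketch; the categories and keywords below are invented placeholders, not the dictionary used in the paper, and the visualization step is reduced to per-category counts.

```python
# Minimal sketch: classify comments by a user-defined keyword dictionary,
# then count comments per category as input for a visualization.
from collections import Counter

# Hypothetical user-defined dictionary: category -> keywords.
CATEGORY_DICT = {
    "politics": ["election", "policy", "government"],
    "economy": ["price", "market", "tax"],
    "sports": ["match", "goal", "league"],
}

def classify_comment(comment):
    """Return the first category whose keyword appears in the comment, else 'other'."""
    text = comment.lower()
    for category, keywords in CATEGORY_DICT.items():
        if any(kw in text for kw in keywords):
            return category
    return "other"

def summarize(comments):
    """Category -> comment count, e.g. for a bar-chart style view."""
    return Counter(classify_comment(c) for c in comments)

# summarize(["The government raised the tax rate", "Great goal last night!"])
# -> Counter({'politics': 1, 'sports': 1})
```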

Comparison of the transformation methods for Flash Videos to Web Videos (플래시 비디오에서 웹비디오로의 변환기법 비교)

  • Lee, Hyun-Lee;Kim, Kyoung-Soo;Ceong, Hee-Taek
    • Journal of Digital Contents Society
    • /
    • v.11 no.4
    • /
    • pp.579-588
    • /
    • 2010
  • The generalization of the web, the growth of one-person media such as blogs and mini-homepages, and the integration of digital video devices have made multimedia video services on the web commonplace. However, flash videos, the previously common bitmap-based multimedia videos, exhibit problems such as waterfall and lag artifacts and loss of audio/video synchronization. This study therefore suggests a conversion technique for providing efficient web video service by solving the problems of bitmap-based flash video through file-format conversion software and video editing programs. The paper converts five videos with 13 CODECs and comparatively analyzes the results. Considering the characteristics of each video, the recommended method is to use the MainConcept H.264 Video CODEC with SWF2Video Pro. The results of this research can be used to produce web videos more effectively.
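
As a rough present-day analogue of this conversion workflow (not the SWF2Video Pro / MainConcept tools compared in the paper), the sketch below transcodes a Flash-era FLV file to an H.264/AAC MP4 with ffmpeg; the file names and encoder settings are placeholders.

```python
# Minimal sketch: transcode an FLV (Flash-era) video to an H.264/AAC MP4
# suitable for web playback, calling ffmpeg via subprocess. ffmpeg and the
# chosen settings are stand-ins for the conversion tools used in the paper.
import subprocess

def flv_to_web_mp4(src="input.flv", dst="output.mp4"):
    cmd = [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",            # H.264 video encoder
        "-c:a", "aac",                # AAC audio for broad browser support
        "-movflags", "+faststart",    # move metadata up front for streaming
        dst,
    ]
    subprocess.run(cmd, check=True)

# flv_to_web_mp4("lecture.flv", "lecture.mp4")
```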

The Evaluation of Web Contents by User 'Likes' Count: The Usefulness of the hT-index for Topic Preference Measurement

  • Song, Yeseul;Park, Ji-Hong;Shim, Jiyoung
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.49 no.2
    • /
    • pp.27-49
    • /
    • 2015
  • The purpose of this study is to suggest an appropriate index for evaluating preferences for Web contents by examining the h-index and its variants. It focuses on how successfully each index represents relative user preference toward topical subjects. Based on data obtained from a popular IT blog (engadget.com), subject values of the h-index and its variants were calculated over 53 subject categories, using the article counts and the 'Likes' counts aggregated in each category. These values were compared through critical analysis of the indices and Spearman rank correlation analysis. A PFNet (Pathfinder Network) of subjects weighted by $h_T$ values was drawn, and cluster analysis was conducted. Based on the four criteria suggested for the evaluation of Web contents, we conclude that the $h_T$-index is a relatively appropriate tool for Web content preference evaluation. After establishing its relative appropriateness, the $h_T$-index was applied to visually represent the relative weight (topic preference by user 'Likes' count) of each subject category of real online contents. Applying scientometric indicators to Web information could provide new insights into, and potential methods for, Web content evaluation. In addition, information on the focus of users' attention would help online informants plan more effective content strategies. The study tries to expand the application area of h-type indices to non-academic online environments. The research procedure enables examination of the appropriateness of the index and highlights considerations for applying such indicators to Web contents.
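
For reference, a minimal sketch of how the h-index and the tapered $h_T$-index could be computed from per-article 'Likes' counts in a category. It assumes the standard tapered h-index definition, in which the j-th 'Like' of the i-th ranked item contributes 1/(2·max(i, j) − 1); the sample counts are hypothetical, and this is not the authors' code.

```python
# Minimal sketch: h-index and tapered h_T-index over 'Likes' counts.
# Here each "paper" is an article in a subject category and each "citation"
# is one user 'Like'; cell (i, j) of the citation matrix contributes
# 1 / (2 * max(i, j) - 1) to h_T (the usual tapered h-index definition).

def h_index(likes):
    """Classic h-index: largest h with at least h items having >= h likes."""
    ranked = sorted(likes, reverse=True)
    return sum(1 for i, c in enumerate(ranked, start=1) if c >= i)

def h_t_index(likes):
    """Tapered h_T-index: every 'Like' contributes a fractional score."""
    ranked = sorted(likes, reverse=True)
    score = 0.0
    for i, c in enumerate(ranked, start=1):
        for j in range(1, c + 1):
            score += 1.0 / (2 * max(i, j) - 1)
    return score

likes_per_article = [25, 14, 9, 9, 3, 1]   # hypothetical category data
print(h_index(likes_per_article))          # -> 4
print(round(h_t_index(likes_per_article), 2))
```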

Library Information Service on the Web 2.0 (웹 2.0 기반의 도서관 정보서비스)

  • Yang, Byeong-Hoon
    • Journal of Information Management
    • /
    • v.39 no.1
    • /
    • pp.199-220
    • /
    • 2008
  • These days, most people turn to an Internet search engine rather than the library when they search for information, and many users do not know what content the library homepage offers. How can libraries draw users to the library homepage? This study aims to suggest a direction for library homepage services based on Web 2.0. For this study, the author analyzed library homepages that have adopted representative Web 2.0 features, as well as other Web 2.0 sites. AJAX, RSS, Open API, mashups, wikis, and blogs are the main technologies of Web 2.0. These technologies become tools for building user-centered library homepages. More important, however, is the information presented to users. Web 2.0 offers libraries a good channel for delivering information, so libraries need to produce information that stimulates library users. In this sense, Web 2.0 gives libraries a good opportunity to act as information producers.

Intelligent Web Crawler for Supporting Big Data Analysis Services (빅데이터 분석 서비스 지원을 위한 지능형 웹 크롤러)

  • Seo, Dongmin;Jung, Hanmin
    • The Journal of the Korea Contents Association
    • /
    • v.13 no.12
    • /
    • pp.575-584
    • /
    • 2013
  • The data types used for big data analysis vary widely, including news, blogs, SNS posts, papers, patents, and sensor data. In particular, the use of web documents, which offer reliable data in real time, is gradually increasing, and web crawlers that collect web documents automatically have grown in importance because big data is used in many different fields and web data grow exponentially every year. However, existing web crawlers cannot collect all of the web documents in a web site, because they follow only the URLs contained in documents already collected from some web sites. In addition, existing crawlers may re-collect documents that other crawlers have already gathered, because information about the documents collected by each crawler is not efficiently shared among crawlers. Therefore, this paper proposes a distributed web crawler. To resolve these problems, the proposed crawler collects web documents through the RSS feeds of each web site and the Google search API, and it achieves fast crawling performance with a client-server model based on RMI and NIO that minimizes network traffic. Furthermore, the crawler extracts the core content of a web document by comparing keyword similarity over the tags contained in the document. Finally, to verify the superiority of our web crawler, we compare it with existing web crawlers in various experiments.
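
A minimal sketch of the RSS-seeded collection step, using the feedparser and requests libraries as stand-ins; the RMI/NIO client-server architecture, Google search API seeding, and tag-based content extraction described in the paper are not reproduced here, and the feed URL is hypothetical.

```python
# Minimal sketch: seed a crawl from a site's RSS feed and fetch each entry,
# keeping a shared 'seen' set so the same document is not collected twice.
import feedparser
import requests

def crawl_from_rss(feed_url, seen):
    """Fetch every not-yet-seen entry linked from an RSS feed."""
    pages = {}
    feed = feedparser.parse(feed_url)
    for entry in feed.entries:
        url = entry.link
        if url in seen:
            continue                      # skip documents already collected
        resp = requests.get(url, timeout=10)
        if resp.ok:
            pages[url] = resp.text
            seen.add(url)
    return pages

# seen_urls = set()
# docs = crawl_from_rss("https://example.com/rss.xml", seen_urls)  # hypothetical feed
```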

A Self-Directed Learning System for a Learner-Centered e-Learning 2.0 Environment (e-Learning 2.0 환경에서의 학습자 중심의 자기주도적 학습 시스템)

  • Sung, Kyung
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2007.10a
    • /
    • pp.687-690
    • /
    • 2007
  • The spread of Web 2.0 technologies, characterized by openness, participation, and sharing, has popularized tools such as blogs and UCC (user-created content). Conventional e-Learning has been instructor-driven, centered on delivering existing knowledge, and learner participation has remained passive. e-Learning 2.0, which applies Web 2.0 technologies, offers an environment that can overcome this limitation. Because e-Learning benefits from Web 2.0, there is a growing need to develop e-Learning 2.0 systems built on learner participation and sharing. e-Learning 2.0 can dramatically improve the effectiveness of learning by letting learners exchange study methods, know-how, information, and summary notes through participation and sharing.
