How Do People Evaluate a Web Site's Credibility (이용자들의 웹 사이트 신뢰성 평가 방법에 관한 연구)

  • Kim, Young-Ki
    • Journal of Korean Library and Information Science Society
    • /
    • v.38 no.3
    • /
    • pp.53-72
    • /
    • 2007
  • The Internet is now an integral part of the everyday lives of a majority of people. They demand web sites that offer credible information just as much as they want sites that are easy to navigate. But the online reality today is that few Internet users say they can trust the web sites that have products for sale or the sites that offer advice about which products and services to buy. Users want the web sites they visit to provide clear information that allows them to judge the site's credibility. Users want to know who runs the site; how to reach those people; the site's privacy policy; and how the site deals with mistakes. In the eyes of users, all sites are not equal. Users have different credibility standards for different types of sites. For news and information sites, users want advertising clearly labeled as advertising, and they want the site to provide a list of the editors responsible for the site's contents, including each editor's email address. For e-commerce sites, user expectations and demands are just about as high as they can be. They say that it is very important that these sites provide specific, accurate information about the site's policies and practices.

A Study of Perception of Golfwear Using Big Data Analysis (빅데이터를 활용한 골프웨어에 관한 인식 연구)

  • Lee, Areum;Lee, Jin Hwa
    • Fashion & Textile Research Journal
    • /
    • v.20 no.5
    • /
    • pp.533-547
    • /
    • 2018
  • The objective of this study is to examine the perception of golfwear and related trends based on major keywords and associated words related to golfwear, utilizing big data. For this study, data containing the keywords 'Golfwear' and 'Golf clothes' were collected from blogs, Jisikin and Tips, news articles, and web cafés on the two most commonly used search engines (Naver & Daum). Frequency and matrix data for the period from January 1, 2016 to December 31, 2017 were extracted through Textom. From the matrix created by Textom, degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality were calculated and analyzed using Netminer 4.0. The analysis found that the keyword 'brand' ranked highest in web visibility, followed by 'woman', 'size', 'man', 'fashion', 'sports', 'price', 'store', 'discount', and 'equipment' in the top 10 frequency rankings. For centrality calculations, only the top 30 keywords were included because the density was extremely high due to the high frequency of co-occurring keywords. The results of the centrality calculations showed that the top-ranked keywords were similar to those in the raw frequency data. When the frequency was adjusted by subtracting 100 and 500 words, the results differed: low-ranking keywords in the frequency analysis, such as J. Lindberg, ranked high, and the rankings changed in all centrality calculations. These findings provide a basis for marketing strategies and for ways to increase awareness and web visibility of golfwear brands.
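
The centrality measures named in this abstract can be illustrated with a small sketch. The study itself used Textom and Netminer 4.0; the code below is only a plain-Python approximation that computes degree and closeness centrality on an unweighted keyword graph. The co-occurrence counts are invented for illustration, and the counts are used only to decide which keywords are linked.

```python
from collections import deque

# Hypothetical co-occurrence counts between keywords (illustrative, not the study's data).
cooccurrence = {
    ("brand", "woman"): 12,
    ("brand", "size"): 9,
    ("brand", "price"): 7,
    ("woman", "size"): 5,
    ("price", "discount"): 4,
}


def build_graph(pairs):
    """Build an undirected adjacency map from co-occurring keyword pairs."""
    graph = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Degree centrality: number of neighbours divided by (n - 1)."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Closeness centrality: reachable nodes divided by the sum of BFS distances."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


graph = build_graph(cooccurrence)
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
```

In this toy graph 'brand' is linked to the most keywords, so it has the highest degree centrality. Betweenness and eigenvector centrality need more machinery and are left out of the sketch.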

The University Guidance System using the Alexa (Alexa를 이용한 대학안내 시스템)

  • Kim, Tae Jin;Kim, Dong Hyun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.11
    • /
    • pp.2061-2066
    • /
    • 2017
  • Voice recognition technology recognizes the voice of a user and executes the spoken command. Recently, voice recognition has been evolving into artificial intelligence (AI) voice recognition through the addition of natural language processing. AI voice recognition is used to control IoT devices or to provide information such as the news or the weather. University information, one of the fields served by information providers, is mainly presented on the web. However, since too much information is presented on the web, it is difficult for a user to efficiently find the specific information the user wants. In this paper, we design and implement a university guidance system that recognizes the voice of a user searching for information and provides the result by voice. To do this, we classify the university data and design a lambda function to provide the data.
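
A lambda function of the kind described might look roughly like the sketch below. It is not the authors' implementation. The intent name `UniversityInfoIntent`, the slot name `topic`, and the sample data are hypothetical. The response envelope follows the standard JSON format for Alexa custom skills.

```python
# Hypothetical university data, classified by topic (illustrative only).
UNIVERSITY_INFO = {
    "library": "The library is open from 9 a.m. to 10 p.m. on weekdays.",
    "cafeteria": "The cafeteria serves lunch from 11:30 a.m. to 1:30 p.m.",
}


def build_response(text, end_session=True):
    """Wrap plain text in the Alexa custom-skill JSON response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point invoked by AWS Lambda with the Alexa request as `event`."""
    request = event.get("request", {})
    if request.get("type") == "LaunchRequest":
        return build_response(
            "What would you like to know about the university?", end_session=False
        )
    if request.get("type") == "IntentRequest":
        intent = request.get("intent", {})
        # "UniversityInfoIntent" and the "topic" slot are hypothetical names.
        if intent.get("name") == "UniversityInfoIntent":
            slot = intent.get("slots", {}).get("topic", {})
            topic = (slot.get("value") or "").lower()
            answer = UNIVERSITY_INFO.get(topic)
            if answer:
                return build_response(answer)
            return build_response("Sorry, I have no information on that topic.")
    return build_response("Sorry, I did not understand the request.")
```

Keeping the classified data in a lookup table separate from the handler means new topics can be added without touching the request-handling logic.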

A Study on VoiceXML Application of User-Controlled Form Dialog System (사용자 주도 폼 다이얼로그 시스템의 VoiceXML 어플리케이션에 관한 연구)

  • Kwon, Hyeong-Joon;Roh, Yong-Wan;Lee, Hyon-Gu;Hong, Hwang-Seok
    • The KIPS Transactions:PartB
    • /
    • v.14B no.3 s.113
    • /
    • pp.183-190
    • /
    • 2007
  • VoiceXML is a new XML-based markup language designed for navigating web resources by voice. Applications using VoiceXML are classified into mutual-controlled and machine-controlled form dialog structures. Such dialog structures cannot provide a service that lets the user navigate web resources freely, because the scenario is decided by the application developer. In this paper, we propose a VoiceXML application structure using a user-controlled form dialog system, which decides the service scenario according to the user's intention. The proposed application automatically detects recognition candidates in the information requested by the user, and the system then uses each recognition candidate as a voice-anchor. The system also connects each voice-anchor with a new voice-node. As an example of the proposed system, we implement a news service with an IT term dictionary, confirm the detection and registration of voice-anchors, and estimate the hit rate for successively offering information according to the user's intention, as well as the response speed. The experimental results confirm that the proposed system allows freer navigation of web resources than existing VoiceXML form dialog systems.
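
The voice-anchor idea can be sketched as a dictionary lookup over the requested text. The abstract does not describe the authors' detection method, so this is only one plausible reading. The term dictionary, the function names, and the sample sentence are all hypothetical.

```python
import re

# Hypothetical IT term dictionary; each entry becomes the content of a voice-node.
IT_TERMS = {
    "voicexml": "VoiceXML is an XML-based markup language for voice dialogs.",
    "rss": "RSS is a feed format for syndicating updated web content.",
}


def detect_voice_anchors(text, dictionary):
    """Return dictionary terms found in the text, in order of first appearance."""
    anchors = []
    for word in re.findall(r"[A-Za-z]+", text.lower()):
        if word in dictionary and word not in anchors:
            anchors.append(word)
    return anchors


def link_voice_nodes(anchors, dictionary):
    """Connect each voice-anchor to a voice-node holding the content to read out."""
    return {anchor: dictionary[anchor] for anchor in anchors}


news = "A new VoiceXML gateway reads RSS headlines over the phone."
anchors = detect_voice_anchors(news, IT_TERMS)
nodes = link_voice_nodes(anchors, IT_TERMS)
```

In a full system each detected anchor would also be registered in the speech recognizer's active grammar, so that a user who says the term is taken to the linked node.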

Analysis and Utilization of Housing Information based on Open API and Web Scraping (오픈API와 웹스크래핑에 기반한 주택정보 분석 및 활용방안)

  • Shin-Hyeong Choi
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.17 no.5
    • /
    • pp.323-329
    • /
    • 2024
  • In an era of low interest rates around the world, interest in real estate has increased. We can collect real estate information using the Internet, but it takes a lot of time to find. In this paper, real estate information from January 2015 to April 2024 is collected from three places to help users more easily collect real estate information of interest and use it for sales. First, by analyzing HTML documents using web scraping techniques, information on real estate of interest is automatically extracted from the website of the platform company. Second, the actual transaction price of the real estate is additionally collected through the open API provided by the Ministry of Land, Infrastructure and Transport. Third, real estate-related news is provided so that users can learn about the future value and prospects of real estate. The simulation results for the data collected in this study show that the lowest price predicted by the ARIMA model is expected to be in May 2024 among the next eight months. Therefore, by following this procedure, real estate buyers can make more efficient home sales by referring to related information including the predicted transaction price.
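
The first step, extracting listings from HTML, can be sketched with the Python standard library. The markup, class names, and prices below are invented for illustration, and a real platform site would need its own selectors. The open API call and the ARIMA forecast are not shown.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect name and price text from each <li class="listing"> element."""

    def __init__(self):
        super().__init__()
        self.listings = []
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "listing":
            self.listings.append({})
        elif tag == "span" and cls in ("name", "price") and self.listings:
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self.listings[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


# Hypothetical page fragment; a real site would use different markup.
html = """
<ul>
  <li class="listing"><span class="name">Apartment A</span><span class="price">350,000,000</span></li>
  <li class="listing"><span class="name">Apartment B</span><span class="price">420,000,000</span></li>
</ul>
"""

parser = ListingParser()
parser.feed(html)
prices = [int(item["price"].replace(",", "")) for item in parser.listings]
```

Once prices are parsed into integers, they can be joined with the transaction prices from the open API and passed to a time-series model.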

Multi-Category Sentiment Analysis for Social Opinion Related to Artificial Intelligence on Social Media (소셜 미디어 상에서의 인공지능 관련 사회적 여론에 대한 다 범주 감성 분석)

  • Lee, Sang Won;Choi, Chang Wook;Kim, Dong Sung;Yeo, Woon Young;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.51-66
    • /
    • 2018
  • As AI (Artificial Intelligence) technologies have evolved swiftly, many products and services are under development in various fields to improve the user experience. Alongside this technological advance, the negative effects of AI technologies are being actively discussed, even as positive expectations for them persist. For instance, many social issues such as the trolley dilemma and system security are being debated, whereas autonomous vehicles based on artificial intelligence have drawn attention for their potential to increase stability. Therefore, major social issues concerning artificial intelligence need to be examined and analysed to support its development and societal acceptance. In this paper, multi-category sentiment analysis is conducted on online public opinion about artificial intelligence after identifying the trending topics related to artificial intelligence over the two years from January 2016 to December 2017, a period that includes the match between Lee Sedol and AlphaGo. Online news, news headlines, and news comments were crawled from the largest web portal in South Korea. Considering the importance of trending topics, online public opinion was analysed by topic into seven sentiment categories (anger, dislike, fear, happiness, neutrality, sadness, and surprise) rather than into only positive and negative sentiment. As a result, it was found that the top sentiment is "happiness" for most events, yet the sentiments for each keyword differ. In addition, when the research period was divided into four periods (the first half of 2016, the second half of 2016, the first half of 2017, and the second half of 2017), it was confirmed that the sentiment of 'anger' decreases over time. Based on the results of this analysis, it is possible to grasp the various topics and trends currently discussed concerning artificial intelligence, and the results can be used to prepare countermeasures. 
We hope to measure public opinion more precisely in the future by integrating the empathy level of news comments.
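
The per-period comparison described above amounts to computing, for each period, the share of comments in each of the seven categories. The sketch below assumes the comments have already been classified. The labels and periods are invented sample data, not results from the study.

```python
from collections import Counter, defaultdict

CATEGORIES = ("anger", "dislike", "fear", "happiness", "neutrality", "sadness", "surprise")

# Hypothetical (period, label) pairs for already-classified comments; not the study's data.
labeled_comments = [
    ("2016-H1", "anger"), ("2016-H1", "happiness"), ("2016-H1", "anger"), ("2016-H1", "fear"),
    ("2017-H2", "happiness"), ("2017-H2", "happiness"), ("2017-H2", "anger"), ("2017-H2", "surprise"),
]


def sentiment_shares(pairs):
    """Return, per period, the share of comments falling in each sentiment category."""
    counts = defaultdict(Counter)
    for period, label in pairs:
        counts[period][label] += 1
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {cat: counter[cat] / total for cat in CATEGORIES}
    return shares


shares = sentiment_shares(labeled_comments)
```

Comparing `shares[period]["anger"]` across periods gives the kind of trend the abstract reports. The classifier that assigns the seven labels is outside the scope of this sketch.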

Development of Information Extraction System from Multi Source Unstructured Documents for Knowledge Base Expansion (지식베이스 확장을 위한 멀티소스 비정형 문서에서의 정보 추출 시스템의 개발)

  • Choi, Hyunseung;Kim, Mintae;Kim, Wooju;Shin, Dongwook;Lee, Yong Hun
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.111-136
    • /
    • 2018
  • In this paper, we propose a methodology for extracting answer information for queries from various types of unstructured documents collected from multiple sources on the web in order to expand a knowledge base. The proposed methodology is divided into the following steps. 1) Collect relevant documents from Wikipedia, Naver encyclopedia, and Naver news sources for "subject-predicate" separated queries and classify the proper documents. 2) Determine whether each sentence is suitable for information extraction and derive a confidence score. 3) Based on the predicate feature, extract the information from the proper sentence and derive the overall confidence of the information extraction result. To evaluate the performance of the information extraction system, we selected 400 queries from the artificial intelligence speaker of SK-Telecom. The proposed model showed higher performance indices than the baseline model. The contribution of this study is a sequence tagging model based on bi-directional LSTM-CRF that uses the predicate feature of the query; with it we developed a robust model that maintains high recall even on various types of unstructured documents collected from multiple sources. Information extraction for knowledge base expansion must take into account the heterogeneous characteristics of source-specific document types. The proposed methodology extracted information effectively from various types of unstructured documents compared with the baseline model. Previous research has the limitation that performance is poor when extracting information from document types that differ from the training data. 
In addition, by predicting the suitability of documents and sentences before the information extraction step, this study prevents unnecessary extraction attempts on documents that do not include the answer information. It is meaningful that we provide a method by which precision can be maintained even in an actual web environment. Information extraction for knowledge base expansion cannot guarantee that a document includes the correct answer, because it targets unstructured documents on the real web. When question answering is performed on the real web, previous machine reading comprehension studies show low precision because they frequently attempt to extract an answer even from a document that contains no correct answer. The policy of predicting the suitability of documents and sentences for information extraction is meaningful in that it helps maintain extraction performance even in a real web environment. The limitations of this study and future research directions are as follows. The first is a problem of data preprocessing. In this study, the unit of knowledge extraction is determined through morphological analysis based on the open source KoNLPy Python package, and information extraction can fail when morphological analysis is not performed properly. To improve the information extraction results, it is necessary to develop a more advanced morpheme analyzer. The second is a problem of entity ambiguity. The information extraction system of this study cannot distinguish identical names that refer to different entities. If several people with the same name appear in the news, the system may not extract information about the one the query intends. 
In future research, it is necessary to take measures to distinguish people with the same name. The third is a problem of evaluation query data. In this study, we selected 400 user queries collected from SK Telecom's interactive artificial intelligence speaker to evaluate the performance of the information extraction system. We developed an evaluation data set using 2,800 documents (400 questions * 7 articles per question: 1 Wikipedia, 3 Naver encyclopedia, 3 Naver news) by judging whether a correct answer is included or not. To ensure the external validity of the study, it is desirable to use more queries to determine the performance of the system, but this is a costly activity that must be done manually. Future research needs to evaluate the system on more queries. It is also necessary to develop a Korean benchmark data set for information extraction for queries over multi-source web documents, to build an environment in which results can be evaluated more objectively.
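
The gating idea in this abstract, attempting extraction only on documents and sentences judged suitable, can be sketched as follows. The three callables stand in for the paper's document classifier, sentence suitability model, and BiLSTM-CRF tagger, which are not reproduced here. Multiplying the confidences and the 0.5 threshold are assumptions made for illustration.

```python
def extract_answer(documents, doc_score, sent_score, extract, threshold=0.5):
    """Try extraction only on documents and sentences judged suitable.

    doc_score, sent_score and extract are hypothetical callables standing in for
    the document classifier, the sentence suitability model and the tagger.
    """
    best = None
    for doc in documents:
        d_conf = doc_score(doc)
        if d_conf < threshold:
            continue  # skip documents unlikely to contain the answer
        for sentence in doc.split(". "):
            s_conf = sent_score(sentence)
            if s_conf < threshold:
                continue  # skip sentences unsuitable for extraction
            answer, e_conf = extract(sentence)
            overall = d_conf * s_conf * e_conf  # one simple way to combine confidences
            if answer and (best is None or overall > best[1]):
                best = (answer, overall)
    return best  # None means no answer was attempted, which protects precision


# Toy stand-ins for the three models (illustrative only).
docs = ["Seoul is the capital of Korea. It is a large city", "Kimchi is a fermented dish"]
doc_score = lambda d: 0.9 if "capital" in d else 0.1
sent_score = lambda s: 0.8 if "capital" in s else 0.2
extract = lambda s: (s.split(" is ")[0], 0.7)

result = extract_answer(docs, doc_score, sent_score, extract)
```

Returning `None` when nothing passes the gate is what distinguishes this design from a reader that always emits an answer.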

VoiceXML Dialog System Based on RSS for Contents Syndication (콘텐츠 배급을 위한 RSS 기반의 VoiceXML 다이얼로그 시스템)

  • Kwon, Hyeong-Joon;Kim, Jung-Hyun;Lee, Hyon-Gu;Hong, Kwang-Seok
    • The KIPS Transactions:PartB
    • /
    • v.14B no.1 s.111
    • /
    • pp.51-58
    • /
    • 2007
  • This paper suggests a prototype dialog system combining VXML (VoiceXML), the W3C's standard XML format for specifying interactive voice dialogues between human and computer, and RSS (RDF Site Summary or Really Simple Syndication), a representative semantic web technology for syndication and subscription of updated web contents. The merits of the proposed system are as follows: 1) It is a new method that recognizes spoken input over wired and wireless telephone networks and then provides contents to the user via STT (Speech-to-Text) and TTS (Text-to-Speech), instead of the traditional web-only method. 2) It retains the advantages of RSS, because subscribed, updated contents are converted to VXML without modifying the traditional way of providing the RSS service. 3) For users, it reduces the time and space restrictions on searching contents provided by RSS, because it uses wired and wireless telephone networks rather than an internet environment. 4) For information providers, it requires no special component for syndicating the newest contents using speech recognition and synthesis technology. We implemented a news service system using VXML and RSS to evaluate the performance of the proposed system. In the experiments, we measured the response time and the speech recognition rate for subscription and search of actual contents, and confirmed that the proposed system can provide the contents that are provided through an RSS feed.
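
Converting an RSS feed to VXML can be sketched with the standard library alone. This is a minimal illustration of the conversion step, not the authors' system. It assumes an RSS 2.0 feed, and the form id `headlines` and the sample feed are made up. Speech recognition and telephony are not covered.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def rss_to_vxml(rss_text):
    """Turn the item titles of an RSS 2.0 feed into a VoiceXML form that reads them aloud."""
    channel = ET.fromstring(rss_text).find("channel")
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "headlines"})
    block = ET.SubElement(form, "block")
    for item in channel.findall("item"):
        prompt = ET.SubElement(block, "prompt")
        prompt.text = item.findtext("title", default="")
    return ET.tostring(vxml, encoding="unicode")


# Hypothetical feed used only to exercise the function.
rss = """<rss version="2.0"><channel><title>News</title>
<item><title>First headline</title></item>
<item><title>Second headline</title></item>
</channel></rss>"""

vxml_doc = rss_to_vxml(rss)
```

Because the feed is read as-is and only the output format changes, the provider's existing RSS service needs no modification, which matches the second merit listed in the abstract.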

A Study on the Enhancement of Information & Communication Technology Literacy Capacity through Web News-data in Education (웹 신문학습을 통한 정보통신기술 소양 능력 신장에 관한 연구)

  • Bang, Ju-Hye;Lee, Yong-Bae
    • Korean Association of Information Education: Conference Proceedings (한국정보교육학회:학술대회논문집)
    • /
    • 2006.01a
    • /
    • pp.77-82
    • /
    • 2006
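  • Through literacy education, learners should acquire basic knowledge of information and communication technology (ICT) and the ability to use it, and on that basis be able to apply ICT in each school subject. ICT skills grow most effectively when these two kinds of education are linked. Web newspaper learning is education that takes place at the junction of literacy education and application education. However, no study had verified whether web newspaper learning improves ICT literacy, so its educational effect remained uncertain. Accordingly, after carrying out web newspaper learning, a teaching and learning method that can efficiently achieve the goals of computer education, we measured ICT literacy and verified the educational effect with objective figures.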

A Hybrid Sentence Alignment Method for Building a Korean-English Parallel Corpus (한영 병렬 코퍼스 구축을 위한 하이브리드 기반 문장 자동 정렬 방법)

  • Park, Jung-Yeul;Cha, Jeong-Won
    • MALSORI
    • /
    • v.68
    • /
    • pp.95-114
    • /
    • 2008
  • The recent growing popularity of statistical methods in machine translation requires much larger parallel corpora. A Korean-English parallel corpus, however, is not yet sufficiently available, and little research on this subject is being conducted. In this paper we present a hybrid method of aligning sentences for Korean-English parallel corpora. We use bilingual news wire web pages, reading comprehension materials for English learners, computer-related technical documents and help files of localized software for building a Korean-English parallel corpus. Our hybrid method combines sentence-length based and word-correspondence based methods. We show the results of experimentation and evaluate them. Alignment results from using a full translation model are very encouraging, especially when we apply the alignment results to an SMT system: improvements of 0.66% in BLEU score and 9.94% in NIST score compared to the previous method.
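
A hybrid of length-based and word-correspondence-based alignment can be sketched as a scored dynamic program. This is a simplified illustration, not the authors' method or their translation model. The seed lexicon, the equal weighting, the skip penalty, and the restriction to one-to-one matches and skips are all assumptions.

```python
# Hypothetical Korean-to-English seed lexicon (illustrative only).
LEXICON = {"나는": "i", "학교에": "school", "간다": "go", "비가": "rain", "온다": "falls"}


def pair_score(ko, en, weight=0.5):
    """Mix a length-ratio score with a dictionary-overlap score, both in [0, 1]."""
    length = min(len(ko), len(en)) / max(len(ko), len(en), 1)
    en_words = set(en.lower().split())
    ko_words = ko.split()
    hits = sum(1 for w in ko_words if LEXICON.get(w) in en_words)
    overlap = hits / max(len(ko_words), 1)
    return weight * length + (1 - weight) * overlap


def align(ko_sents, en_sents, skip_penalty=0.2):
    """Monotonic alignment by dynamic programming over 1-1 matches and skips."""
    n, m = len(ko_sents), len(en_sents)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i and j:
                score = pair_score(ko_sents[i - 1], en_sents[j - 1])
                options.append((best[i - 1][j - 1] + score, "match"))
            if i:
                options.append((best[i - 1][j] - skip_penalty, "skip_ko"))
            if j:
                options.append((best[i][j - 1] - skip_penalty, "skip_en"))
            best[i][j], back[i][j] = max(options)
    # Walk the back-pointers from the end to recover the aligned index pairs.
    pairs, i, j = [], n, m
    while i or j:
        move = back[i][j]
        if move == "match":
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif move == "skip_ko":
            i -= 1
        else:
            j -= 1
    return pairs[::-1]


ko = ["나는 학교에 간다", "비가 온다"]
en = ["I go to school", "Extra sentence here", "Rain falls"]
alignment = align(ko, en)
```

In this toy input the unmatched English sentence is skipped and the two Korean sentences align with the first and third English sentences. A length signal alone would be weaker here, which is the motivation for combining it with word correspondences.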
