• Title/Summary/Keyword: Text-confidence features


Text-Confidence Feature Based Quality Evaluation Model for Knowledge Q&A Documents (텍스트 신뢰도 자질 기반 지식 질의응답 문서 품질 평가 모델)

  • Lee, Jung-Tae;Song, Young-In;Park, So-Young;Rim, Hae-Chang
    • Journal of KIISE:Software and Applications / v.35 no.10 / pp.608-615 / 2008
  • In Knowledge Q&A services, where information is created by unspecified users, document quality is an important factor in user satisfaction with search results. Previous work on quality prediction for Knowledge Q&A documents evaluates document quality using non-textual information, such as click counts and recommendation counts, and focuses on enhancing retrieval performance by incorporating the quality measure into the retrieval model. Although experiments have shown this non-textual information to be useful, a data sparseness problem may occur when it is used to predict the quality of newly created documents. To address the data sparseness problem of non-textual features, this paper proposes new features for document quality prediction, namely text-confidence features, which indicate how trustworthy the content of a document is. The proposed features are extracted directly from the document content and are therefore more robust to data sparseness than non-textual features, which can be collected only through the participation of service users. Experiments conducted on real-world Knowledge Q&A documents suggest that text-confidence features perform comparably to the non-textual features. We believe the proposed features can serve as effective features for document quality prediction and improve the performance of Knowledge Q&A services in the future.

Learner-Generated Digital Listening Materials Using Text-to-Speech for Self-Directed Listening Practice

  • Moon, Dosik
    • International Journal of Internet, Broadcasting and Communication / v.12 no.4 / pp.148-155 / 2020
  • This study investigated learners' perceptions of using self-generated listening materials based on Text-to-Speech (TTS). After taking an online training session on how to make listening materials for extensive listening practice outside the classroom, the learners practiced with self-generated listening materials for 10 weeks in a self-directed way. The results show that a majority of the learners found the TTS-based listening materials helpful in reducing listening anxiety and enhancing self-confidence and motivation, with a positive effect on their listening ability. The learners' general satisfaction can be attributed to several beneficial features of TTS-based listening materials, including the freedom to choose what they want to learn, convenient access to the materials, the availability of various native speakers' voices, and the novelty of digital tools. This suggests that TTS-based digital listening materials can be a useful educational tool to support learners' self-directed listening practice outside the classroom in EFL settings.

Generation of Natural Referring Expressions by Syntactic Information and Cost-based Centering Model (구문 정보와 비용기반 중심화 이론에 기반한 자연스러운 지시어 생성)

  • Roh Ji-Eun;Lee Jong-Hyeok
    • Journal of KIISE:Software and Applications / v.31 no.12 / pp.1649-1659 / 2004
  • Text generation is the process of producing comprehensible texts in human languages from an underlying non-linguistic representation of information. Among the sub-processes needed to generate coherent texts, this paper concerns referring expression generation, which produces different types of expressions to refer to previously mentioned entities in a discourse. Specifically, we focus on pronominalization by zero pronouns, which occur frequently in Korean. To build a generation model of referring expressions for Korean, we identify several features based on grammatical information and a cost-based centering model and apply them to various machine learning techniques. Using 95 texts from three genres (descriptive texts, news, and short Aesop's fables), we demonstrate that the proposed features are well suited to explaining pronominalization, especially pronominalization by zero pronouns in Korean. We also show by a t-test that our model significantly outperforms previous ones at a 99.9% confidence level.

The Effect of Expert Reviews on Consumer Product Evaluations: A Text Mining Approach (전문가 제품 후기가 소비자 제품 평가에 미치는 영향: 텍스트마이닝 분석을 중심으로)

  • Kang, Taeyoung;Park, Do-Hyung
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.63-82 / 2016
  • Individuals gather information online to resolve problems in their daily lives and to make decisions about purchasing products or services. With the rapid development of information technology, Web 2.0 has allowed more people to easily generate and use online reviews, so the volume of information is rapidly increasing, and the usefulness and significance of analyzing this unstructured data have increased as well. This paper presents an analysis of the lexical features of expert product reviews to determine their influence on consumers' purchasing decisions. The focus was on how unstructured data can be organized and used in diverse contexts through text mining. In addition, diverse lexical features of expert reviews provided by a third-party review site were extracted and defined. Expert reviews are defined as evaluations, published in newspapers or magazines, by people who have expert knowledge about specific products or services; this type of review is also called a critic review. Before the widespread use of the Internet, consumers could access expert reviews only through newspapers or magazines and therefore could not access many of them. Major media now also provide online services, so people can access expert reviews more easily and affordably than in the past. Diverse reviews from experts in several fields are important because of information asymmetry, in which some information is not shared between consumers and sellers. This asymmetry can be resolved when third parties with expertise provide information to consumers. Consumers can then read expert reviews and make purchasing decisions by considering the abundant information on products or services. Therefore, expert reviews play an important role in consumers' purchasing decisions and in the performance of companies across diverse industries.
If the influence of qualitative data, such as reviews or post-purchase assessments, can be identified separately from quantitative factors, such as the actual quality or price of products, it becomes possible to identify which aspects of product reviews hamper or promote product sales. Previous studies have focused on the characteristics of the experts themselves, such as the expertise and credibility of the sources of expert reviews; however, these studies did not examine the influence of the linguistic features of experts' product reviews on consumers' overall evaluations. In contrast, this study focused on experts' recommendations and evaluations to reveal the lexical features of expert reviews and whether such features influence consumers' overall evaluations and purchasing decisions. Real expert product reviews were analyzed using the suggested methodology, and five lexical features of expert reviews were ultimately determined. Specifically, "review depth" (i.e., the degree of detail of the expert's product analysis) and "lack of assurance" (i.e., the degree of confidence that the expert has in the evaluation) have statistically significant effects on consumers' product evaluations. In contrast, "positive polarity" (i.e., the degree of positivity of an expert's evaluations) has an insignificant effect, while "negative polarity" (i.e., the degree of negativity of an expert's evaluations) has a significant negative effect on consumers' product evaluations. Finally, "social orientation" (i.e., the extent to which experts include social expressions in their reviews) does not have a significant effect on consumers' product evaluations. In summary, the lexical properties of the product reviews were defined according to each relevant factor. Then, the influence of each linguistic factor of expert reviews on consumers' final evaluations was tested.
In addition, a test was performed on whether each linguistic factor influencing consumers' product evaluations differs depending on the lexical features. The results of these analyses should provide guidelines on how individuals process massive volumes of unstructured data depending on lexical features in various contexts and how companies can use this mechanism from their perspective. This paper provides several theoretical and practical contributions, such as the proposal of a new methodology and its application to real data.

A Distance Approach for Open Information Extraction Based on Word Vector

  • Liu, Peiqian;Wang, Xiaojie
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.6 / pp.2470-2491 / 2018
  • Web-scale open information extraction (Open IE) plays an important role in NLP tasks such as acquiring common-sense knowledge, learning selectional preferences, and automatic text understanding. A large number of Open IE approaches have been proposed in the last decade, and the majority of them are based on supervised learning or dependency parsing. In this paper, we present a novel method for web-scale open information extraction, which employs cosine distance based on Google word vectors as the confidence score of an extraction. The proposed method is a purely unsupervised learning algorithm that requires neither hand-labeled training data nor dependency parse features. We also present a mathematically rigorous proof for the new method using Bayesian inference and artificial neural network theory. It turns out that the proposed algorithm is equivalent to maximum likelihood estimation of the joint probability distribution over the elements of the candidate extraction. The proof itself also theoretically suggests a typical usage of word vectors for other NLP tasks. Experiments on three benchmark datasets show that the distance-based method improves on recently presented Open IE systems in terms of both effectiveness and efficiency.
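
The idea of using a cosine measure over word vectors as an extraction confidence score can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how the elements of a candidate extraction are combined, so the averaging over element pairs below is an assumption, and the function names and the three-dimensional toy vectors are hypothetical stand-ins for real pretrained word vectors.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def extraction_confidence(elements, vectors):
    """Assumed scoring rule: mean pairwise cosine similarity of the
    elements (e.g. arg1, relation, arg2) of a candidate extraction."""
    pairs = list(combinations(elements, 2))
    if not pairs:
        return 0.0
    total = sum(cosine_similarity(vectors[a], vectors[b]) for a, b in pairs)
    return total / len(pairs)


# Hypothetical toy vectors; real systems would load pretrained embeddings.
TOY_VECTORS = {
    "Paris": [0.9, 0.1, 0.3],
    "capital": [0.8, 0.2, 0.4],
    "France": [0.85, 0.15, 0.35],
    "banana": [0.0, 1.0, 0.0],
}

good = extraction_confidence(("Paris", "capital", "France"), TOY_VECTORS)
bad = extraction_confidence(("Paris", "capital", "banana"), TOY_VECTORS)
print(f"plausible triple: {good:.3f}, implausible triple: {bad:.3f}")
```

With these toy vectors the plausible triple scores higher than the implausible one. Cosine distance is simply one minus this similarity, so ranking by low distance and ranking by high similarity are equivalent.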

Quality Prediction of Knowledge Search Documents Using Text-Confidence Features (신뢰도 자질을 이용한 지식검색 문서의 품질 평가)

  • Lee, Jung-Tae;Song, Young-In;Rim, Hae-Chang
    • Annual Conference on Human and Language Technology / 2007.10a / pp.62-67 / 2007
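  • In knowledge search services, where information is generated through the participation of unspecified users, document quality is one of the important factors in search satisfaction. Previous studies on quality evaluation of knowledge search documents evaluated document quality using non-textual information, such as view counts and recommendation counts, and focused on improving retrieval performance by incorporating this quality into the retrieval model. Although the usefulness of such non-textual information has been demonstrated experimentally, it has the drawback that a serious data sparseness problem can arise in cases such as newly written documents. This paper proposes confidence features, which reflect the trustworthiness of document content, as new document quality evaluation features that can alleviate the data sparseness problem of non-textual information. The proposed features are extracted directly from the document content and are therefore more robust to data sparseness than non-textual features, such as recommendation counts or view counts, which require participation or use by service users. In experiments, the proposed confidence features also showed performance similar to or better than that of non-textual features known to be useful for document quality evaluation, and they are expected to serve as effective quality evaluation features if the feature extraction method is improved in the future.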
