• Title/Summary/Keyword: Multilingual Classification

Multilingual SPLOG classification using language independent features (언어 독립적인 자질을 이용한 다국어 스플로그 분류)

  • Hong, Seong-Hak
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.284-287 / 2011
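  • Blogs are one of the main types of user-generated content exposed through search services, and they have long been a major target for spam and SEO. As Internet access has become widespread and the number of blog users in non-English-speaking countries has grown, blogs and spam written in many languages now appear in blog search results. Typical blog search engines build and operate a separate spam filter for each country or language, but this approach consumes resources and makes it difficult to detect in advance blog spam written in the various languages that arrive through crawling. To detect splogs in an internationalized blog search engine that crawls and serves blogs, this paper presents a method for building a common multilingual splog detection model that uses attribute-based and word-based features. Experiments conducted to verify the method and its effectiveness confirmed that the approach is feasible.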
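As a rough illustration of what "language-independent" attribute features could look like, the sketch below computes a few statistics that need no tokenizer or dictionary. The abstract does not list the paper's actual features, so every feature name here (`url_count`, `distinct_hosts`, `url_char_ratio`, `trigram_repetition`) is an assumption chosen for illustration, not the authors' feature set.

```python
import re
from urllib.parse import urlparse

# Simple URL pattern; an illustrative assumption, not a full URL grammar.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def attribute_features(text: str) -> dict:
    """Compute hypothetical language-independent features from raw post text."""
    urls = URL_RE.findall(text)
    hosts = {urlparse(u).netloc.lower() for u in urls}

    # Character trigrams work for any script, including ones without spaces.
    trigrams = [text[i:i + 3] for i in range(len(text) - 2)]
    if trigrams:
        repetition = 1.0 - len(set(trigrams)) / len(trigrams)
    else:
        repetition = 0.0

    return {
        "char_length": len(text),
        "url_count": len(urls),
        "distinct_hosts": len(hosts),
        # Share of the post's characters that belong to URLs.
        "url_char_ratio": sum(len(u) for u in urls) / len(text) if text else 0.0,
        # 0.0 means no repeated trigrams; values near 1.0 mean heavy repetition.
        "trigram_repetition": repetition,
    }
```

A feature dictionary like this could be fed to any standard classifier; which classifier the paper used is not stated in the abstract.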

Component Analysis for Constructing an Emotion Ontology (감정 온톨로지의 구축을 위한 구성요소 분석)

  • Yoon, Ae-Sun;Kwon, Hyuk-Chul
    • Korean Journal of Cognitive Science / v.21 no.1 / pp.157-175 / 2010
  • In human communication, understanding a dialogue participant's emotion is as important as decoding the explicit message. It is well known that non-verbal elements are more suitable than verbal elements for conveying a speaker's emotions. Written texts, however, contain a variety of linguistic units that express emotions. This study analyzes the components needed to construct an emotion ontology, which has numerous applications in Human Language Technology. Most previous work in text-based emotion processing focused on classifying emotions, constructing a dictionary describing emotions, and retrieving those lexical items in texts through keyword spotting and/or syntactic parsing. The emotions retrieved or computed by that process did not achieve good accuracy. This study therefore proposes a more sophisticated component analysis and introduces linguistic factors. (1) Five linguistic types of emotion expression are differentiated in terms of target (verbal/non-verbal) and method (expressive/descriptive/iconic). The correlations among them, as well as their correlation with the non-verbal expressive type, are also determined. This characteristic is expected to make our ontology more adaptable to multi-modal environments. (2) As emotion-related components, this study proposes 24 emotion types, a 5-scale intensity (-2 to +2), and a 3-scale polarity (positive/negative/neutral), which can describe a variety of emotions in more detail and in a standardized way. (3) We introduce verbal expression-related components, such as 'experiencer', 'description target', 'description method', and 'linguistic features', which can appropriately classify and tag verbal expressions of emotion. (4) By adopting the linguistic tag sets proposed by ISO and TEI and providing a mapping table between our classification of emotions and Plutchik's, our ontology can be easily employed for multilingual processing.
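The components listed in (1) to (3) can be pictured as a single annotation record. The sketch below is one possible encoding, not the authors' schema: the class and field names are assumptions, the 24 emotion types are not enumerated in the abstract and so are left as a free string, and the example values are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Polarity(Enum):
    """The 3-scale polarity named in the abstract."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


class Method(Enum):
    """The description methods named in the abstract."""
    EXPRESSIVE = "expressive"
    DESCRIPTIVE = "descriptive"
    ICONIC = "iconic"


@dataclass(frozen=True)
class EmotionAnnotation:
    emotion_type: str           # one of the 24 types (not listed in the abstract)
    intensity: int              # 5-scale intensity, -2 to +2
    polarity: Polarity
    experiencer: str
    description_target: str
    description_method: Method
    linguistic_features: tuple = ()

    def __post_init__(self):
        # Reject values outside the 5-scale range described in the abstract.
        if not isinstance(self.intensity, int) or not -2 <= self.intensity <= 2:
            raise ValueError("intensity must be an integer from -2 to +2")


# Placeholder example; "joy" and the other values are illustrative only.
example = EmotionAnnotation(
    emotion_type="joy",
    intensity=2,
    polarity=Polarity.POSITIVE,
    experiencer="speaker",
    description_target="exam result",
    description_method=Method.EXPRESSIVE,
)
```

Using enumerations for polarity and method keeps tags within the closed value sets the abstract describes, while the range check enforces the intensity scale.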

A study on Survive and Acquisition for YouTube Partnership of Entry YouTubers using Machine Learning Classification Technique (머신러닝 분류기법을 활용한 신생 유튜버의 생존 및 수익창출에 관한 연구)

  • Hoik Kim;Han-Min Kim
    • Information Systems Review / v.25 no.2 / pp.57-76 / 2023
  • This study classifies the success of creators who have recently opened channels on YouTube, the most influential digital platform. Using five factors for YouTubers in the science and technology category (disclosure of actual information, video upload cycle, video length, number of selectable multilingual subtitles, and information from other social network channels they operate), we applied machine learning to classify and analyze YouTuber success, using the measure of success closest to the YouTube revenue structure. Our findings show that the neural network algorithm performed best at predicting the success or failure of YouTubers. In addition, the five factors contributed to improving classification performance. This study offers implications and suggests various approaches for new individual entrepreneurs who want to start on YouTube, influencers who currently operate YouTube channels, and companies that want to use such digital platforms. We also discuss future directions for using digital platforms.

Analysis of LinkedIn Jobs for Finding High Demand Job Trends Using Text Processing Techniques

  • Kazi, Abdul Karim;Farooq, Muhammad Umer;Fatima, Zainab;Hina, Saman;Abid, Hasan
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.223-229 / 2022
  • LinkedIn is one of the most widely used job-hunting and career-growth applications in the world, and it offers many opportunities and jobs. According to statistics, LinkedIn has 738M+ members, 14M+ open jobs, and 55M+ listed companies, and many vacancies are posted daily. LinkedIn data was used for the research carried out in this paper. This work can in turn help address the challenges faced by LinkedIn and other job-posting applications in improving the level of jobs available in the industry. The research applies text processing, a natural language processing technique, to LinkedIn datasets to find the jobs that appear most often in a given month and/or year. The large dataset is thereby converted into a usable resource. The study uses Multinomial Naïve Bayes and Linear Support Vector Machine learning algorithms for text classification and develops a trained multilingual dataset. The results indicate the most needed job vacancies in any field. This will help students, job seekers, and entrepreneurs with their career decisions.
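To make the Multinomial Naïve Bayes step concrete, here is a minimal from-scratch sketch with Laplace smoothing. It is not the authors' pipeline: the whitespace tokenization, the class name `TinyMultinomialNB`, and the four training examples are all assumptions made for illustration, and the Linear SVM half of the study is not shown.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            score = math.log(n / n_docs)  # log prior
            for word in doc.lower().split():
                if word not in self.vocab:
                    continue  # skip words never seen in training
                # Smoothed log likelihood of the word given the class
                score += math.log(
                    (counts[word] + self.alpha) / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical job titles and categories, not data from the paper.
model = TinyMultinomialNB().fit(
    [
        "senior python developer",
        "java backend developer",
        "registered nurse night shift",
        "staff nurse pediatric ward",
    ],
    ["software", "software", "healthcare", "healthcare"],
)
```

Counting the predicted categories per month would then give the kind of "most frequent jobs" summary the abstract describes.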

Classification and Evaluation of Service Requirements in Mobile Tourism Application Using Kano Model and AHP

  • Choedon, Tenzin;Lee, Young-Chan
    • The Journal of Information Systems / v.27 no.1 / pp.43-65 / 2018
  • Purpose: The emergence of mobile applications has simplified our lives in various ways. In tourism, mobile applications already provide personalized tourism-related information efficiently and are very effective for booking hotels, flights, etc. However, very few studies classify the actual service requirements of mobile tourism applications or address how to improve customer satisfaction with them. The purpose of this study is to implement a practical mobile tourism application. To that end, we classify and categorize the service requirements of mobile tourism applications in Korea using the Kano model and the analytic hierarchy process (AHP). Specifically, we conducted a focus group study to identify the service requirements of mobile tourism applications. Design/methodology/approach: The data for this study were collected from Koreans and foreigners who have experience using mobile tourism applications. Participants needed to be familiar with mobile tourism applications because such users are likely to be more aware of the services these applications offer. We analyzed 147 valid responses using the Kano model and conducted an AHP analysis with five experts in the field of tourism using Expert Choice software. Findings: We identified 17 service quality requirements for mobile tourism applications. The results reveal that service requirements such as a geo-location map, a multilingual option, and compatibility with different operating systems are indispensable services whose absence leads to dissatisfaction. Based on the integrated application of the Kano model and AHP analysis, this study provides specific implications for improving the service quality of mobile tourism applications in Korea.
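The Kano classification step can be sketched as a table lookup. The abstract does not describe the questionnaire, so the sketch assumes the commonly used five-answer Kano evaluation table, in which each respondent answers a functional question (feature present) and a dysfunctional question (feature absent). The function names and the most-frequent-category aggregation rule are assumptions, and the AHP weighting step is not shown.

```python
from collections import Counter

# Answer options shared by the functional and dysfunctional questions.
ANSWERS = ("like", "must-be", "neutral", "live-with", "dislike")

# Rows: functional answer; columns: dysfunctional answer.
# A = attractive, O = one-dimensional, M = must-be,
# I = indifferent, R = reverse, Q = questionable.
KANO_TABLE = (
    "QAAAO",  # functional: like
    "RIIIM",  # functional: must-be
    "RIIIM",  # functional: neutral
    "RIIIM",  # functional: live-with
    "RRRRQ",  # functional: dislike
)


def kano_category(functional: str, dysfunctional: str) -> str:
    """Look up one respondent's answer pair in the evaluation table."""
    row = ANSWERS.index(functional)
    col = ANSWERS.index(dysfunctional)
    return KANO_TABLE[row][col]


def classify_requirement(responses) -> str:
    """Return the most frequent category across (functional, dysfunctional) pairs.

    Ties are resolved in favor of the category seen first.
    """
    counts = Counter(kano_category(f, d) for f, d in responses)
    return counts.most_common(1)[0][0]


# Hypothetical responses for one requirement, e.g. a multilingual option.
sample = [("neutral", "dislike"), ("must-be", "dislike"), ("like", "dislike")]
```

A requirement that lands in the must-be category (M) matches the abstract's description of services whose absence causes dissatisfaction.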

Development of the Rule-based Smart Tourism Chatbot using Neo4J graph database

  • Kim, Dong-Hyun;Im, Hyeon-Su;Hyeon, Jong-Heon;Jwa, Jeong-Woo
    • International Journal of Internet, Broadcasting and Communication / v.13 no.2 / pp.179-186 / 2021
  • We have developed a smart tourism app along with Instagram and YouTube content to provide personalized tourism information and travel product information to individual tourists. In this paper, we develop a rule-based smart tourism chatbot with the khaiii (Kakao Hangul Analyzer III) morphological analyzer and the Neo4J graph database. To understand the intention of a user's question, the proposed chatbot system uses a morpheme analyzer, a proper noun dictionary that includes tourist destination names, and a general noun dictionary containing words frequently used in tourist information searches. The tourism knowledge base, built using the Neo4J graph database, provides adequate answers to tourists' questions. The Neo4J nodes are Area, based on the tourist destination address; Contents, which holds tourist information as properties; and Service, which includes service attribute data frequently used in searches. A Neo4J query is created from the analyzed intention of a tourist's question together with the properties of the nodes and relationships in the Neo4J database, and the answer is produced by searching the tourism knowledge base. We create the tourism knowledge base from more than 1,300 items of Jeju tourism information used in the smart tourism app. We plan to develop a multilingual smart tour chatbot using named entity recognition (NER), intention classification using a conditional random field (CRF), and transfer learning using pretrained language models.
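The dictionary-lookup-then-query flow described above can be sketched as follows. This is a simplified stand-in, not the authors' system: whitespace splitting replaces the khaiii morphological analyzer, the dictionaries hold English placeholder entries, and the property names (`name`, `opening_hours`, `admission_fee`, `parking`) and the `LOCATED_IN` relationship are hypothetical. Only the node labels Area and Contents come from the abstract.

```python
# Hypothetical proper noun dictionary: surface form -> node label.
PROPER_NOUNS = {"hallasan": "Contents", "seongsan": "Area"}

# Hypothetical general noun dictionary: keyword -> node property to return.
INTENT_WORDS = {
    "hours": "opening_hours",
    "fee": "admission_fee",
    "parking": "parking",
}


def analyze(question: str):
    """Find the first known entity and intent keyword in the question."""
    tokens = [t.strip("?.,!").lower() for t in question.split()]
    entity = next((t for t in tokens if t in PROPER_NOUNS), None)
    intent = next((INTENT_WORDS[t] for t in tokens if t in INTENT_WORDS), None)
    return entity, intent


def build_query(question: str):
    """Return (cypher, parameters), or None if no rule matches."""
    entity, intent = analyze(question)
    if entity is None or intent is None:
        return None
    # The property name comes from a fixed dictionary, so interpolating it is
    # safe; the user-supplied name is passed as a query parameter.
    if PROPER_NOUNS[entity] == "Contents":
        cypher = f"MATCH (c:Contents {{name: $name}}) RETURN c.{intent} AS answer"
    else:
        cypher = (
            f"MATCH (c:Contents)-[:LOCATED_IN]->(a:Area {{name: $name}}) "
            f"RETURN c.name AS name, c.{intent} AS answer"
        )
    return cypher, {"name": entity}
```

The returned query string and parameter dictionary could then be handed to a Neo4J client; returning `None` lets the chatbot fall back to a default reply when no rule applies.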