• Title/Summary/Keyword: real-world dataset

The Impact of Mobile Channel Adoption on Video Consumption: Are We Watching More and for Longer? (모바일 채널 수용이 고객의 동영상 소비에 미치는 영향에 관한 실증 연구)

  • SangA Choi;Minhyung Lee;HanByeol Stella Choi;Heeseok Lee
    • Information Systems Review / v.25 no.3 / pp.121-138 / 2023
  • Advances in mobile technology have brought disruptive innovation to the media industry. The introduction of mobile devices removed spatial and temporal restrictions on media consumption. This study investigates the impact of mobile channel adoption on video viewing behavior, using a real-world dataset obtained from an on-demand service provider in South Korea. We find that adopting a mobile channel significantly increases both the total viewing time of video-on-demand via TV and the number of contents viewed. Our results suggest that mobile channels act as a complement to conventional TV channels. We provide theoretical and practical insights into consumer usage in the emerging over-the-top market.

Time to presentation and mortality outcomes among patients with diabetes and acute myocardial infarction

  • Min-A Shin;Seok Oh;Min Chul Kim;Doo Sun Sim;Young Joon Hong;Ju Han Kim;Youngkeun Ahn;Myung Ho Jeong
    • The Korean journal of internal medicine / v.39 no.1 / pp.110-122 / 2024
  • Background/Aims: Due to limited real-world evidence on the association between time to presentation (T2P) and outcomes following acute myocardial infarction and diabetes (AMI-DM), we investigated the characteristics of patients with AMI-DM and their outcomes based on their T2P. Methods: 4,455 patients with AMI-DM from a Korean nationwide observational cohort (2011-2015) were divided into early and late presenters according to symptom-to-door time. The effects of T2P on three-year all-cause mortality were estimated using inverse probability of treatment weighting (IPTW) and survival analysis. Results: The incidence of all-cause mortality was consistently higher in late presenters than in early presenters (11.4 vs. 17.2%; p < 0.001). In the IPTW-adjusted dataset, the incidence of all-cause mortality was numerically higher in late presenters than in early presenters (9.1 vs. 12.4%; p = 0.072). In the survival analysis, the cumulative incidence of all-cause mortality was significantly higher in late presenters than in early presenters before and after IPTW. In the subgroup with ST-elevation myocardial infarction, late presenters had a higher incidence of cardiac death than early presenters before (4.8 vs. 10.5%; p < 0.001) and after IPTW (4.2 vs. 9.7%; p = 0.034). In the initial glycated hemoglobin (HbA1c)-stratified analysis, these effects were attenuated in patients with HbA1c ≥ 9.0% before (adjusted hazard ratio [HR]: 1.45, 95% confidence interval [CI]: 0.80-2.64) and after IPTW (adjusted HR: 0.82, 95% CI: 0.40-1.67). Conclusions: Late presentation was associated with higher mortality in patients with AMI-DM; therefore, multifaceted and systematic interventions are needed to decrease pre-hospital delays.
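
The IPTW step named in the abstract above can be illustrated with a minimal sketch. It assumes a single binary confounder and estimates the propensity score as the share of "treated" (late-presenting) subjects within each confounder stratum, which stands in for the fitted propensity model a real analysis would use. The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def iptw_weights(treated, confounder):
    """Return inverse-probability-of-treatment weights.

    Assumption: the propensity score is the observed share of treated
    subjects in each confounder stratum (a stand-in for a fitted model).
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n_total, n_treated]
    for t, c in zip(treated, confounder):
        counts[c][0] += 1
        counts[c][1] += t
    weights = []
    for t, c in zip(treated, confounder):
        n_total, n_treated = counts[c]
        p = n_treated / n_total  # propensity of being treated in this stratum
        weights.append(1 / p if t else 1 / (1 - p))
    return weights

def weighted_rate(outcome, weights, group_mask):
    """Weighted event rate within the subjects selected by group_mask."""
    num = sum(w * y for y, w, m in zip(outcome, weights, group_mask) if m)
    den = sum(w for w, m in zip(weights, group_mask) if m)
    return num / den

# Toy data, invented for illustration only.
late = [1, 1, 1, 0, 1, 0, 0, 0]   # 1 = late presenter ("treated")
older = [1, 1, 1, 1, 0, 0, 0, 0]  # single binary confounder
died = [1, 0, 1, 0, 0, 0, 1, 0]   # 1 = death during follow-up

w = iptw_weights(late, older)
rate_late = weighted_rate(died, w, [t == 1 for t in late])
rate_early = weighted_rate(died, w, [t == 0 for t in late])
print(f"weighted mortality: late={rate_late:.3f}, early={rate_early:.3f}")
```

After weighting, each group represents a pseudo-population of the full sample size, so the confounder is balanced between late and early presenters before the rates are compared.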

Research on Ocular Data Analysis and Eye Tracking in Divers

  • Ye Jun Lee;Yong Kuk Kim;Da Young Kim;Jeongtack Min;Min-Kyu Kim
    • Journal of the Korea Society of Computer and Information / v.29 no.8 / pp.43-51 / 2024
  • This paper proposes a method for acquiring and analyzing ocular data using a special-purpose mask designed for divers, who primarily engage in underwater activities. The method tracks the user's gaze with the help of a custom-built ocular dataset and a YOLOv8-nano model developed for this purpose. The model achieved an average processing time of 45.52 ms per frame and recognized whether the eyes were open or closed with 99% accuracy. Based on the analysis of the ocular data, a gaze tracking algorithm was developed that can map gaze to real-world coordinates. Validation of this algorithm showed an average error rate of about 1% on the x-axis and about 6% on the y-axis.

Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce gives customers many advantageous purchase opportunities. In this situation, customers who lack sufficient knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect a user's preferences and provide a recommendation list to the user. Thus, the product recommender system in an online store is known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect a user's preferences cause disappointment and waste the user's time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to improve recommendation performance by reflecting users' preferences precisely. The research data were collected from a real-world online shopping store that sells products from famous art galleries and museums in Korea. The data initially contained 5759 transactions, of which 3167 remained after records with null values were deleted. We transformed the categorical variables into dummy variables and excluded outliers. The proposed model consists of two steps. The first step predicts customers who have a high likelihood of purchasing products in the online shopping store. In this step, we use logistic regression, decision trees, and artificial neural networks to predict customers with a high likelihood of purchasing products in each product group. We apply these data mining techniques using SAS E-Miner software. We partition the dataset into modeling and validation sets for the logistic regression and decision trees, and into training, test, and validation sets for the artificial neural network model. The validation dataset is the same for all experiments. 
Then we combine the results of each predictor using multi-model ensemble techniques such as bagging and bumping. Bagging is an abbreviation of "Bootstrap Aggregation"; it combines the outputs of several machine learning techniques to raise the performance and stability of prediction or classification. This technique is a special form of the averaging method. Bumping is an abbreviation of "Bootstrap Umbrella of Model Parameter"; it considers only the model with the lowest error value. The results show that bumping outperforms bagging and the other predictors except for the "Poster" product group. For the "Poster" product group, the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the values of the Lift, Support, and Confidence measures. We set the minimum transaction frequency to support an association to 5%, the maximum number of items in an association to 4, and the minimum confidence for rule generation to 10%. We also exclude extracted association rules with a lift value below 1. After excluding duplicate rules, we finally obtain fifteen association rules. Of these fifteen rules, eleven contain associations between products in the "Office Supplies" product group, one includes an association between the "Office Supplies" and "Fashion" product groups, and the other three contain associations between the "Office Supplies" and "Home Decoration" product groups. Finally, the proposed product recommender system provides a list of recommendations to the appropriate customers. We test the usability of the proposed system using a prototype and real-world transaction and profile data. To this end, we construct the prototype system using ASP, JavaScript, and Microsoft Access. 
In addition, we survey user satisfaction with the product list recommended by the proposed system and with randomly selected product lists. The survey participants are 173 people who use MSN Messenger, Daum Café, and P2P services. We evaluate user satisfaction on a five-point Likert scale and perform a paired-sample t-test on the survey results. The results show that the proposed model outperforms the random selection model at the 1% statistical significance level, meaning that users were significantly more satisfied with the recommended product list. The results also suggest that the proposed system may be useful in a real-world online shopping store.
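
The rule-filtering criteria in the abstract above (minimum support 5%, minimum confidence 10%, lift of at least 1) can be sketched as follows. This is only an illustration of how the three measures are computed for a single candidate rule. The function names and the toy baskets are hypothetical, and the study itself reports using SAS E-Miner rather than code like this.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Compute (support, confidence, lift) for the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= set(t))         # baskets with antecedent
    n_c = sum(1 for t in transactions if c <= set(t))         # baskets with consequent
    n_ac = sum(1 for t in transactions if (a | c) <= set(t))  # baskets with both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

def keep_rule(support, confidence, lift,
              min_support=0.05, min_confidence=0.10, min_lift=1.0):
    """Apply the thresholds quoted in the abstract (defaults mirror them)."""
    return (support >= min_support
            and confidence >= min_confidence
            and lift >= min_lift)

# Toy baskets, invented for illustration only.
baskets = [
    {"pen", "notebook"},
    {"pen", "notebook", "scarf"},
    {"pen"},
    {"scarf"},
    {"notebook", "pen"},
]

support, confidence, lift = rule_metrics(baskets, {"pen"}, {"notebook"})
print(support, confidence, lift, keep_rule(support, confidence, lift))
```

A lift above 1 means the two items are bought together more often than would be expected if their purchases were independent, which is why rules with lift below 1 are discarded.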

Development of Crack Detection System for Highway Tunnels using Imaging Device and Deep Learning (영상장비와 딥러닝을 이용한 고속도로 터널 균열 탐지 시스템 개발)

  • Kim, Byung-Hyun;Cho, Soo-Jin;Chae, Hong-Je;Kim, Hong-Ki;Kang, Jong-Ha
    • Journal of the Korea institute for structural maintenance and inspection / v.25 no.4 / pp.65-74 / 2021
  • To efficiently inspect the rapidly increasing number of aging tunnels in many developed countries, many inspection methodologies using imaging equipment and image processing have been proposed. However, most existing methodologies were evaluated on clean concrete surfaces of limited area where no other objects exist. Therefore, this paper proposes a six-step framework for developing a deep learning model for tunnel crack detection. The proposed method is mainly based on negative sample (non-crack object) training and Cascade Mask R-CNN. The framework consists of six steps: searching for cracks in images captured from real tunnels, labeling cracks at the pixel level, training a deep learning model, collecting non-crack objects, retraining the deep learning model with the collected non-crack objects, and constructing the final training dataset. To implement the framework, Cascade Mask R-CNN, an instance segmentation model, was trained with 1561 general crack images and 206 non-crack images. To examine the applicability of the trained model to real-world tunnel crack detection, field testing was conducted on tunnel spans about 200 m long where electric wires and lights are prevalent. In the experimental results, the trained model showed 99% precision and 92% recall, which demonstrates the field applicability of the proposed framework.

A Hybrid Recommender System based on Collaborative Filtering with Selective Use of Overall and Multicriteria Ratings (종합 평점과 다기준 평점을 선택적으로 활용하는 협업필터링 기반 하이브리드 추천 시스템)

  • Ku, Min Jung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.85-109 / 2018
  • A recommender system recommends items that a customer is expected to purchase in the future based on his or her previous purchase behavior. It has served as a tool for realizing one-to-one personalization for e-commerce companies. Traditional recommender systems, especially those based on collaborative filtering (CF), the most popular recommendation algorithm in both academia and industry, are designed to generate the recommendation list using a single criterion, the 'overall rating'. However, this approach has critical limitations in understanding customers' preferences in detail. Recently, to mitigate these limitations, some leading e-commerce companies have begun to collect feedback from their customers in the form of 'multicriteria ratings'. Multicriteria ratings enable companies to understand their customers' preferences from multidimensional viewpoints. Moreover, multidimensional ratings are easy to handle and analyze because they are quantitative. However, recommendation using multicriteria ratings also has the limitation that it may omit detailed information on a user's preference, because in most cases it considers only three to five predetermined criteria. Against this background, this study proposes a novel hybrid recommendation system that selectively uses the results from 'traditional CF' and 'CF using multicriteria ratings'. Our proposed system is based on the premise that some people have a holistic preference scheme, whereas others have a composite preference scheme. Thus, our system is designed to use traditional CF with overall ratings for users with a holistic preference, and CF with multicriteria ratings for users with a composite preference. To validate the usefulness of the proposed system, we applied it to a real-world dataset on recommendations for POIs (points of interest). 
Providing personalized POI recommendations is attracting more attention as the popularity of location-based services such as Yelp and Foursquare increases. The dataset was collected from university students via a Web-based online survey system. Using the survey system, we collected the overall ratings as well as the ratings for each criterion for 48 POIs located near K university in Seoul, South Korea. The criteria include 'food or taste', 'price', and 'service or mood'. As a result, we obtained 2,878 valid ratings from 112 users. Among the 48 items, 38 items (80%) were used as the training dataset, and the remaining 10 items (20%) were used as the validation dataset. To examine the effectiveness of the proposed system (i.e., the hybrid selective model), we compared its performance with that of two comparison models: the traditional CF and the CF with multicriteria ratings. The performance of the recommender systems was evaluated using two metrics: average MAE (mean absolute error) and precision-in-top-N. Precision-in-top-N represents the percentage of truly high overall ratings among the items that the model predicted would be the N most relevant for each user. The experimental system was developed using Microsoft Visual Basic for Applications (VBA). The experimental results showed that our proposed system (avg. MAE = 0.584) outperformed traditional CF (avg. MAE = 0.591) as well as multicriteria CF (avg. MAE = 0.608). We also found that multicriteria CF performed worse than traditional CF on our dataset, which contradicts the results of most previous studies. This result supports the premise of our study that people have two different types of preference schemes: holistic and composite. Besides MAE, the proposed system outperformed all the comparison models in precision-in-top-3, precision-in-top-5, and precision-in-top-7. 
The results of the paired-samples t-test showed that, in terms of average MAE, our proposed system outperformed traditional CF at the 10% statistical significance level and multicriteria CF at the 1% statistical significance level. The proposed system sheds light on how to understand and utilize users' preference schemes in the recommender systems domain.
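
The two evaluation metrics described in the abstract above, MAE and precision-in-top-N, can be sketched for a single user as follows. The cut-off for a "truly high" rating (4.0 on an assumed five-point scale) is an assumption, since the abstract does not state it. The function names and the toy ratings are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def precision_in_top_n(actual, predicted, n, high=4.0):
    """Share of the N items with the highest predicted rating whose actual
    rating is at least `high` (assumed threshold for a "truly high" rating)."""
    ranked = sorted(range(len(predicted)),
                    key=lambda i: predicted[i], reverse=True)[:n]
    hits = sum(1 for i in ranked if actual[i] >= high)
    return hits / len(ranked)

# Toy ratings for one user, invented for illustration only.
actual = [5, 3, 4, 2, 1]
predicted = [4.5, 3.5, 3.0, 2.5, 2.0]

mae = mean_absolute_error(actual, predicted)
p_at_3 = precision_in_top_n(actual, predicted, n=3)
print(mae, p_at_3)
```

Averaging these per-user values over all users would give figures comparable in kind to the average MAE and precision-in-top-N reported in the abstract.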

A Two-Phase On-Device Analysis for Gender Prediction of Mobile Users Using Discriminative and Popular Wordsets (모바일 사용자의 성별 예측을 위한 식별 및 인기 단어 집합 기반 2단계 기기 내 분석)

  • Choi, Yerim;Park, Kyuyon;Kim, Solee;Park, Jonghun
    • The Journal of Society for e-Business Studies / v.21 no.1 / pp.65-77 / 2016
  • As respecting privacy becomes an important issue in mobile device data analysis, on-device analysis, in which data are analyzed inside a mobile device without being sent outside the device, is gaining attention. One possible application of on-device analysis is gender prediction using text data in mobile devices, such as text messages, search keywords, website bookmarks, and contacts, which are highly private. The limited computing power of mobile devices can be addressed by the word comparison method, in which words are selected beforehand and delivered to a user's mobile device, and the user's gender is determined by matching the mobile text data against the selected words. Moreover, it is known that performing prediction after filtering instances using definite evidence increases accuracy and reduces computational complexity. In this regard, we propose a two-phase approach to on-device gender prediction in which the discriminability and the popularity of a word are considered sequentially. The proposed method first performs predictions for all instances using a few highly discriminative words, and then uses popular words for the instances left unclassified by the first prediction. In experiments conducted on a real-world dataset, the proposed method outperformed the compared methods.
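
The two-phase matching idea described in the abstract above can be sketched roughly as follows. This is an assumed simplification: each pre-selected word carries a signed vote (+1 or -1 for the two classes), phase 1 consults only the discriminative wordset, and phase 2 falls back to the popular wordset when phase 1 yields no decision. The scoring rule, the function name, and the placeholder tokens are hypothetical and are not taken from the paper.

```python
def predict_gender(words, discriminative, popular):
    """Two-phase word matching.

    Each wordset maps a word to +1 (votes "female") or -1 (votes "male").
    Phase 1 uses the discriminative wordset; phase 2 uses the popular
    wordset only if phase 1 leaves the instance unclassified.
    Returns None when neither phase reaches a decision.
    """
    for wordset in (discriminative, popular):
        score = sum(wordset.get(w, 0) for w in words)
        if score != 0:
            return "female" if score > 0 else "male"
    return None

# Placeholder wordsets, invented for illustration only.
discriminative = {"w1": 1, "w2": -1}
popular = {"w3": 1, "w4": -1}

print(predict_gender(["w1", "w9"], discriminative, popular))  # decided in phase 1
print(predict_gender(["w4", "w9"], discriminative, popular))  # decided in phase 2
print(predict_gender(["w9"], discriminative, popular))        # unclassified
```

Because only dictionary lookups over small pre-selected wordsets run on the device, the text data never needs to leave it, which is the privacy motivation stated in the abstract.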

SWAT: A Study on the Efficient Integration of SWRL and ATMS based on a Distributed In-Memory System (SWAT: 분산 인-메모리 시스템 기반 SWRL과 ATMS의 효율적 결합 연구)

  • Jeon, Myung-Joong;Lee, Wan-Gon;Jagvaral, Batselem;Park, Hyun-Kyu;Park, Young-Tack
    • Journal of KIISE / v.45 no.2 / pp.113-125 / 2018
  • Recently, with the advent of the Big Data era, we have gained the capability of acquiring vast amounts of knowledge from various fields. The collected knowledge is expressed as well-formed formulas; in particular, OWL, a standard ontology language, is a typical form of well-formed formula. Symbolic reasoning over large amounts of ontology data is being actively studied as a way to extract intrinsic information. However, most studies of such reasoning support only restricted rule expressions based on Description Logic and have limited applicability to the real world. Moreover, knowledge management for inaccurate information is required, since knowledge inferred from wrong information generates further incorrect information through the dependencies between inference rules. Therefore, this paper proposes SWAT, a knowledge management system that combines SWRL (Semantic Web Rule Language) reasoning with an ATMS (Assumption-based Truth Maintenance System). The system was built on a distributed in-memory framework so that it can manage large ontology data. On this basis, the ATMS monitoring system allows users to easily detect and correct wrong knowledge. We evaluated the proposed method, which manages knowledge by retracting wrong SWRL inference data on large data, using the LUBM (Lehigh University Benchmark) dataset.

Illumination Estimation Based on Nonnegative Matrix Factorization with Dominant Chromaticity Analysis (주색도 분석을 적용한 비음수 행렬 분해 기반의 광원 추정)

  • Lee, Ji-Heon;Kim, Dae-Chul;Ha, Yeong-Ho
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.8 / pp.89-96 / 2015
  • The human visual system uses chromatic adaptation to determine the color of an object regardless of illumination, whereas a digital camera records illumination and reflectance together, so the color appearance of a scene varies under different illumination. NMFsc (nonnegative matrix factorization with sparseness constraint) was recently introduced to estimate the original object color by using a sparseness constraint. In NMFsc, a low sparseness constraint is used to estimate illumination and a high sparseness constraint is used to estimate reflectance. However, NMFsc produces illumination estimation errors for images with a large uniform area, which is regarded as dominant chromaticity. To overcome this defect of NMFsc, we propose illumination estimation via nonnegative matrix factorization with dominant chromaticity analysis. First, the image is converted to a chromaticity color space and analyzed using a chromaticity histogram. The chromaticity histogram segments the original image into regions of similar chromaticity. The segmented region with the lowest standard deviation is taken as the dominant chromaticity region. Next, the dominant chromaticity is removed from the original image. Then, illumination estimation using nonnegative matrix factorization is performed on the image without the dominant chromaticity. The proposed method was evaluated by average angular error on a real-world dataset; with an average angular error of 5.5, it achieves better illuminant estimation than the previous method, whose average angular error is 5.7.
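
The angular error used for evaluation in the abstract above is commonly computed as the angle between the estimated and ground-truth illuminant vectors. The sketch below shows that calculation under the assumption that the reported values are in degrees and that illuminants are RGB triples; the function name and the sample vectors are hypothetical.

```python
import math

def angular_error_deg(estimated, ground_truth):
    """Angle in degrees between two illuminant vectors (e.g. RGB triples)."""
    dot = sum(e * g for e, g in zip(estimated, ground_truth))
    norm = (math.sqrt(sum(e * e for e in estimated))
            * math.sqrt(sum(g * g for g in ground_truth)))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

# Sample vectors, invented for illustration only.
print(angular_error_deg((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))  # orthogonal vectors
print(angular_error_deg((1.0, 1.0, 1.0), (2.0, 2.0, 2.0)))  # same direction
```

The measure depends only on the direction of the vectors, so an estimate that differs from the ground truth only in overall brightness has an error of approximately zero.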

Improved Focused Sampling for Class Imbalance Problem (클래스 불균형 문제를 해결하기 위한 개선된 집중 샘플링)

  • Kim, Man-Sun;Yang, Hyung-Jeong;Kim, Soo-Hyung;Cheah, Wooi Ping
    • The KIPS Transactions:PartB / v.14B no.4 / pp.287-294 / 2007
  • Many classification algorithms for real-world data suffer from the class imbalance problem. To solve this problem, various methods have been proposed, such as altering the training balance and designing better sampling strategies. However, previous methods do not adequately account for the distribution of the input data and the constraint. In this paper, we propose a focused sampling method that outperforms previous methods. To solve the problem, a useful subset of data must be selected from the full training set. To obtain this subset, the proposed method divides the region according to scores computed from the distribution of a SOM over the input data. The scores are sorted in ascending order. They represent the distribution of the input data, which may in turn represent the characteristics of the whole data. A new training dataset is obtained by eliminating the data that are not useful, namely those located in the region between an upper bound and a lower bound. The proposed method gives better or at least similar classification accuracy compared with previous approaches. It also offers several benefits: a reduced class imbalance ratio, smaller training sets, and prevention of over-fitting. The proposed method was tested with a kNN classifier. An experimental result on the ecoli dataset shows that this method achieves precision up to 2.27 times that of the other methods.