• Title/Abstract/Keywords: data crawling

Search results: 195 items (processing time: 0.022 seconds)

The Effect of Discomfort Index on Outfielder's Game Record Data

  • 김세민;신좌철
    • 한국정보통신학회논문지 / Vol. 24, No. 8 / pp. 978-984 / 2020
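  • This study uses big data analysis methods to analyze the correlation between sports records and weather data. To this end, data were collected through APIs and crawling, processed, summarized statistically, and then visualized. The subjects were outfielders who played in the 2019 KBO League and reached the required number of plate appearances. The discomfort index was used as the weather variable, and the analysis was split at a threshold of 70 (70 or above versus below 70). The results show that in the batting indicators in which the pitcher is involved, outfielders' records improved as the discomfort index rose, whereas in records in which the pitcher is not involved, such as stolen bases, walks, number of pitches, stolen base success rate, pitches per plate appearance, and pitches per game, outfielders made things harder for pitchers. The study is expected to contribute to the development of the sports data industry and to the performance of baseball players, clubs, and coaching staff.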
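The split at a discomfort index of 70 can be illustrated with a minimal sketch. The abstract does not state which discomfort index formula was used, so the temperature and humidity form below is an assumption, and the game records, field names, and the helper `batting_average_by_group` are hypothetical and not taken from the paper's data.

```python
# Assumed formula (not stated in the abstract): a commonly used
# temperature-humidity form of the discomfort index, with temperature
# in degrees Celsius and relative humidity as a fraction in [0, 1].
def discomfort_index(temp_c, rel_humidity):
    t = 1.8 * temp_c
    return t - 0.55 * (1.0 - rel_humidity) * (t - 26.0) + 32.0

# Hypothetical game records for illustration only.
games = [
    {"temp_c": 30.0, "humidity": 0.70, "hits": 2, "at_bats": 4},
    {"temp_c": 18.0, "humidity": 0.40, "hits": 1, "at_bats": 4},
    {"temp_c": 27.0, "humidity": 0.80, "hits": 1, "at_bats": 3},
    {"temp_c": 15.0, "humidity": 0.50, "hits": 0, "at_bats": 4},
]

def batting_average_by_group(records, threshold=70.0):
    """Aggregate hits and at-bats for DI >= threshold and DI < threshold."""
    totals = {"high": [0, 0], "low": [0, 0]}
    for g in records:
        di = discomfort_index(g["temp_c"], g["humidity"])
        key = "high" if di >= threshold else "low"
        totals[key][0] += g["hits"]
        totals[key][1] += g["at_bats"]
    # Batting average is total hits divided by total at-bats in each group.
    return {k: (h / ab if ab else None) for k, (h, ab) in totals.items()}

averages = batting_average_by_group(games)
print(averages)
```

The same grouping could be applied to any other per-game indicator by swapping the aggregated fields.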

Annotation Technique Development based on Apparel Attributes for Visual Apparel Search Technology

  • 이은경;김양원;김선숙
    • 한국의류산업학회지 / Vol. 17, No. 5 / pp. 731-740 / 2015
  • Mobile (smartphone) search engine marketing is increasingly important, so visual apparel search technology that gives easier and faster access to visual information in the apparel field is urgently needed. This study helps establish a suitable classification system for apparel search, based on an analysis of the search techniques used by apparel search applications and by existing domestic and overseas apparel sites. An annotation technique is developed in accordance with visual attributes and apparel categories, using data collected through web crawling and apparel image collection. The categorical composition of apparel is divided into wearing, image, and style. A web evaluation site traces the correlations between the apparel category and the apparel factors that depend on visual attributes. An appraisal team of 10 individuals evaluated 2,860 merchandise images. Data analysis covered the correlations between apparel, sleeve length, and apparel category (based on an average analysis) and the correlation between fastener and apparel category (based on an average analysis). The results can support a mobile apparel search system that enhances consumer convenience, since it enables an effective search of type, price, distributor, and apparel image from a mobile photograph of a worn garment.

Keyword Research on Players' Experience and Presence in the FPS Genre: Focusing on Game Play Time and Steam Reviews

  • 최영우;유승호
    • 한국게임학회 논문지 / Vol. 21, No. 6 / pp. 13-30 / 2021
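  • This paper uses Steam review data to analyze users' sense of presence and player experience in FPS games according to play time. The data were obtained by crawling with Python. The analysis showed that in the group with little play time, the issues concerned controllable physical presence and uncontrollable social presence, whereas in the group with much play time, the physical presence factor seen in the former group had shifted to controllable social presence. Furthermore, the player experience analysis showed that the keyword "recoil", a gameplay factor, is important.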
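As a rough illustration of comparing keywords between low and high play-time groups, the sketch below counts word frequencies per group. The review texts, the 50-hour threshold, the stopword list, and the function names are all hypothetical; the abstract does not describe the paper's actual grouping rule or text-processing pipeline.

```python
import re
from collections import Counter

# Hypothetical reviews; a real crawl would supply play time and text.
reviews = [
    {"hours": 3.0, "text": "Great graphics but the recoil is hard to control"},
    {"hours": 5.5, "text": "Teammates are toxic, graphics are nice"},
    {"hours": 320.0, "text": "Recoil patterns reward practice; teammates communicate well"},
    {"hours": 800.0, "text": "Learn the recoil and play with teammates"},
]

# Assumed values for illustration only.
PLAYTIME_THRESHOLD = 50.0
STOPWORDS = {"the", "is", "to", "but", "are", "and", "with", "a"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def keywords_by_group(items, threshold):
    """Count keyword frequencies separately for low and high play-time reviews."""
    groups = {"low": Counter(), "high": Counter()}
    for item in items:
        key = "high" if item["hours"] >= threshold else "low"
        groups[key].update(tokenize(item["text"]))
    return groups

groups = keywords_by_group(reviews, PLAYTIME_THRESHOLD)
print(groups["low"].most_common(3))
print(groups["high"].most_common(3))
```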

A Study on Key Factors Influencing Customers' Ratings of Restaurants by Using Data Mining Method

  • 김선주;김병수
    • 한국정보시스템학회지:정보시스템연구 / Vol. 31, No. 2 / pp. 1-18 / 2022
  • Purpose: Customer reviews are a major factor in choosing a restaurant. This study investigates the key factors affecting customers' evaluations of restaurants. With the recent intensification of competition among restaurants in the service industry, the analysis results are expected to provide in-depth insights for enhancing customer experiences. Design/methodology/approach: We collected restaurant information and reviews from the Kakao Map platform, covering 3,785 restaurants in Daegu registered on Kakao Map. From this information, seven independent variables were used: number of ratings registered, number of reviews, whether the restaurant is designated a safe restaurant, and the presence or absence of postings about facilities, business hours, hashtags, and break times. The dependent variable is the restaurant rating. Multiple regression between the independent variables and the restaurant rating was carried out. Findings: The results confirmed that the number of ratings registered, the presence of a posting about business hours, and the presence of a posting about hashtags have a positive effect on the restaurant rating. The number of reviews had a negative effect on the restaurant rating. In addition, to confirm the role of customer reviews, we carried out LDA topic modeling and divided the topics into positive and negative reviews.

Development of Dataset Items for Commercial Space Design Applying AI

  • Jung Hwa SEO;Segeun CHUN;Ki-Pyeong, KIM
    • 한국인공지능학회지 / Vol. 11, No. 1 / pp. 25-29 / 2023
  • The purpose of this paper is to create a standard AI training dataset type for commercial space design. As the space design market continues to grow and people spend more time indoors after COVID-19, interest in space is expanding throughout society. In addition, more and more consumers are getting used to the digital environment. Therefore, identifying trends and preemptively proposing the atmosphere and specifications that customers require, quickly and easily, can increase customer trust and make sales more effective. As for the dataset type, commercial districts were divided into a total of 8 categories, and images that could be processed were derived by refining 4,009,30MB JPG format images collected through web crawling. Then, by performing bounding and labeling operations, we developed a 'Dataset for AI Training' of 3,356 commercial space image data in CSV format with a size of 2.08MB. Through this study, elements of spatial images such as place type, space classification, and furniture can be extracted and used when developing AI algorithms, and it is expected that images requested by clients can be collected easily and quickly through spatial image input information.

Analysis of Media Trends and Social Perceptions on Nursing Law Legislation

  • 이승희;주민호
    • 대한간호학회지 / Vol. 53, No. 4 / pp. 439-452 / 2023
  • Purpose: This study aimed to derive considerations for the enactment of nursing law by analyzing the trends and social perceptions of nursing law mentioned in major daily newspapers, cafes, and blogs. Methods: Main texts and comments that included nursing law as a keyword were collected from major daily news and online postings from January 2021 to August 2022. The data collected through web crawling were analyzed using a TousFlux program used for big data analysis. Results: During the period of study, the awareness level around nursing law enactment increased. In particular, public concern over nursing law enactment intensified due to the two political parties' policy pledges related to nursing law in January 2022 and the failure to introduce the nursing law to the national assembly judiciary committee in May 2022. Except in December 2021, public perception of nursing law enactment was generally favorable, with public opinion tilting more in favor of than against enactment. Conclusion: Public opinion should be considered when drafting and implementing the nursing law to make it easier for the people to understand what the law constitutes. In addition, it is necessary to pay attention to and continuously promote the relationship between medical care and nursing in the nursing law system of developed nations. Lastly, nursing law enactment can enhance nurses' retention intention and provide a sense of efficacy to medical services.

Machine Learning for Flood Prediction in Indonesia: Providing Online Access for Disaster Management Control

  • Reta L. Puspasari;Daeung Yoon;Hyun Kim;Kyoung-Woong Kim
    • 자원환경지질 / Vol. 56, No. 1 / pp. 65-73 / 2023
  • Indonesia is one of the countries most vulnerable to floods, so accurate and reliable flood forecasting is increasingly necessary there. Therefore, a new prediction model using a machine learning algorithm is proposed to provide daily flood prediction in Indonesia. Data crawling was conducted to obtain daily rainfall, streamflow, land cover, and flood data from 2008 to 2021. The model was built using a Random Forest (RF) algorithm for classification to predict future floods by inputting three days of rainfall rate, forest ratio, and stream flow. The accuracy, specificity, precision, recall, and F1-score on the test dataset using the RF algorithm are approximately 94.93%, 68.24%, 94.34%, 99.97%, and 97.08%, respectively. Moreover, the AUC (Area Under the Curve) of the ROC (Receiver Operating Characteristic) curve is 71%. The objective of this research is to provide a model that accurately predicts flood events in Indonesian regions 3 months prior to the day of the flood. As a trial, we used the month of June 2022, and the model predicted the flood events accurately. The prediction results are then published to a website as a warning system, as a form of flood mitigation.
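The evaluation metrics listed in the abstract all derive from a binary confusion matrix. The sketch below shows the standard definitions; the confusion-matrix counts are hypothetical, since the abstract reports only the final percentages, and the function names are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }

def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=8, fp=2, tn=6, fn=4)
print(metrics)

# Recomputing F1 from the precision and recall reported in the abstract.
reported_f1 = f1_from(0.9434, 0.9997)
print(round(reported_f1, 4))
```

Recomputing F1 from the reported precision and recall gives roughly 0.97, in line with the reported F1-score. A recall near 100% combined with a specificity near 68% would mean that flood days are rarely missed while non-flood days are misclassified more often.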

Analysis of Shipping and Logistics News Articles using Topic Modeling

  • 윤희영;곽일엽
    • 무역학회지 / Vol. 46, No. 4 / pp. 61-76 / 2021
  • This study focuses on three logistics-related news outlets (Logistics Newspaper, Korea Shipping Gazette, and Korea Shipping Newspaper) in order to present changes in logistics issues, centering on COVID-19, which has recently had the greatest impact worldwide. For data collection, two years of news articles from 2019 and 2020 (title, article content, date, article classification, article URL) were collected through web crawling (using Python's BeautifulSoup and requests modules) from the homepages of the three representative logistics-related media companies. The data analysis methods were basic statistical analysis, Latent Dirichlet Allocation (LDA) for topic modeling, and Scattertext. The analysis results were as follows. First, among the three logistics-related news outlets, the Korea Shipping Newspaper was the most active. Second, topic modeling with LDA identified eight logistics-related topics, and the keywords and significant issues of each topic were presented. Third, the keywords were visualized with Scattertext. This is the first study to present changes in the logistics field by focusing on articles from representative logistics-related media in 2019 and 2020. In particular, 2019 and 2020 can be divided into before and after the outbreak of COVID-19, which has had a great impact not only on the logistics field but also on our lives as a whole. For future work, a multi-faceted approach is required, such as comparative studies of logistics issues between countries or implications drawn from long-term time-series articles.
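The collection step, extracting title, date, and URL from article listings, might look like the sketch below. The abstract names BeautifulSoup and requests; to stay self-contained and offline, this sketch swaps in the standard library's `html.parser` and parses an inline HTML string instead of fetching a page. The markup, class names, and `parse_articles` helper are hypothetical, and real news sites will use different structures.

```python
from html.parser import HTMLParser

# Hypothetical listing markup for illustration only.
SAMPLE_HTML = """
<ul>
  <li class="article"><a href="/news/1">Port congestion eases</a><span class="date">2020-03-02</span></li>
  <li class="article"><a href="/news/2">Freight rates climb</a><span class="date">2020-04-15</span></li>
</ul>
"""

class ArticleListParser(HTMLParser):
    """Collects title, URL, and date from each <li class="article"> item."""

    def __init__(self):
        super().__init__()
        self.articles = []
        self._current = None  # article currently being built
        self._field = None    # field that incoming text belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "article":
            self._current = {"title": "", "url": "", "date": ""}
        elif self._current is not None and tag == "a":
            self._current["url"] = attrs.get("href", "")
            self._field = "title"
        elif self._current is not None and tag == "span" and attrs.get("class") == "date":
            self._field = "date"

    def handle_data(self, data):
        if self._current is not None and self._field:
            self._current[self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None
        elif tag == "li" and self._current is not None:
            self.articles.append(self._current)
            self._current = None

def parse_articles(html):
    parser = ArticleListParser()
    parser.feed(html)
    parser.close()
    return parser.articles

articles = parse_articles(SAMPLE_HTML)
for article in articles:
    print(article["date"], article["title"], article["url"])
```

The resulting list of dictionaries could then feed a topic-modeling step; that step is left out here because it depends on third-party libraries.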

Gift-giving Behaviors via SNS Mobile App: An Exploratory Study of Fashion Products

  • Ji Yoon Kim;Jiyeon Lee;Kyu-Hye Lee
    • 패션비즈니스 / Vol. 27, No. 6 / pp. 110-123 / 2023
  • As social distancing strengthened after the COVID-19 outbreak, people looked for things they could do alone. Additionally, as people have more financial resources, they purchase products they had previously only considered, and the phenomenon of giving gifts to oneself has appeared. Accordingly, this study analyzed fashion product reviews from KakaoTalk Gift, a service for exchanging gifts via an SNS mobile app, to examine self-gifting and how it differs from interpersonal gifting. For the post-hoc data, 18,354 reviews were collected with a Python-based web crawling technique after excluding unnecessary data. Self-gifting behavior on KakaoTalk Gift differed from that reported in previous self-gift studies. Regardless of the gift-giving context, most self-gift products were material items. Product types and price levels differed between gifts for others and gifts for oneself. As self-gifts, people typically buy luxury jewelry and branded bags and wallets to wear and show off. As interpersonal gifts, among fashion products, people usually buy beauty products, which reflect less personal taste. When giving to others, people buy appropriately priced products to reduce the burden on both parties. When giving to themselves, people buy the products they want regardless of price. This study is significant because it suggests a new direction for self-gift research by limiting the setting to online gift-giving places.

Creating a Smartphone User Recommendation System Using Clustering

  • Jin Hyoung AN
    • Journal of Korea Artificial Intelligence Association / Vol. 2, No. 1 / pp. 1-6 / 2024
  • In this paper, we develop an AI-based recommendation system that matches the specifications of smartphones from company 'S'. The system aims to simplify the complex decision-making process of consumers and guide them to choose the smartphone that best suits their daily needs. The recommendation system analyzes five specifications of smartphones (price, battery capacity, weight, camera quality, capacity) to help users make informed decisions without searching for extensive information. This approach not only saves time but also improves user satisfaction by ensuring that the selected smartphone closely matches the user's lifestyle and needs. The system utilizes unsupervised learning, i.e. clustering (K-MEANS, DBSCAN, Hierarchical Clustering), and provides personalized recommendations by evaluating them with silhouette scores, ensuring accurate and reliable grouping of similar smartphone models. By leveraging advanced data analysis techniques, the system can identify subtle patterns and preferences that might not be immediately apparent to consumers, enhancing the overall user experience. The ultimate goal of this AI recommendation system is to simplify the smartphone selection process, making it more accessible and user-friendly for all consumers. This paper discusses the data collection, preprocessing, development, implementation, and potential impact of the system using Pandas, crawling, scikit-learn, etc., and highlights the benefits of helping consumers explore the various options available and confidently choose the smartphone that best suits their daily lives.
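Evaluating a clustering with silhouette scores can be sketched as follows. The abstract mentions scikit-learn; this sketch instead hand-rolls the mean silhouette coefficient in plain Python so that it runs without dependencies. The feature vectors (imagined as two already-scaled specifications) and both label assignments are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_distance(point, others):
    return sum(euclidean(point, o) for o in others) / len(others)

def silhouette_score(points, labels):
    """Mean silhouette coefficient; requires at least two distinct labels."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    scores = []
    for idx, (point, label) in enumerate(zip(points, labels)):
        same = [points[j] for j in clusters[label] if j != idx]
        if not same:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean_distance(point, same)  # cohesion within own cluster
        b = min(                        # separation from nearest other cluster
            mean_distance(point, [points[j] for j in members])
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical, pre-scaled two-feature vectors for four phone models.
phones = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # groups nearby models together
bad_labels = [0, 1, 0, 1]   # mixes distant models in each cluster

good_score = silhouette_score(phones, good_labels)
bad_score = silhouette_score(phones, bad_labels)
print(good_score, bad_score)
```

A score closer to 1 indicates compact, well-separated clusters, which is why it can be used to compare the output of different clustering algorithms or different cluster counts on the same data.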