• Title/Summary/Keyword: data crawling

A Study on Sentiment Analysis of Media and SNS response to National Policy: focusing on policy of Child allowance, Childbirth grant (국가 정책에 대한 언론과 SNS 반응의 감성 분석 연구 -아동 수당, 출산 장려금 정책을 중심으로-)

  • Yun, Hye Min;Choi, Eun Jung
    • Journal of Digital Convergence / v.17 no.2 / pp.195-200 / 2019
  • As the use of mobile communication devices such as smartphones and tablets, as well as computers, expands, data is accumulating exponentially on the Internet. In addition, with the development of SNS, users can freely communicate with each other and share information in various fields, so diverse opinions accumulate in the form of big data. Accordingly, big data analysis techniques are being used to find differences between the response of the general public and the response of the media. In this paper, we analyzed the public response on SNS to the child allowance and childbirth grant policies, as well as the response of the media. To do so, we gathered articles and user comments posted on Twitter over a certain period, crawled news articles, and applied sentiment analysis. From these data, we compared the opinions of the public posted on SNS with the response of the media expressed in news articles. As a result, we found that the public and the media respond differently to some national policies.

A Study on the Development of Product Planning Prediction Model Using Logistic Regression Algorithm (로지스틱 회귀 알고리즘을 활용한 상품 기획 예측 모형 개발에 관한 연구)

  • Ahn, Yeong-Hwil;Park, Koo-Rack;Kim, Dong-Hyun;Kim, Do-Yeon
    • Journal of the Korea Convergence Society / v.12 no.9 / pp.39-47 / 2021
  • This study proposes a product planning prediction model that uses a logistic regression algorithm to predict seasonal factors and rapidly changing product trends. First, we collected unstructured consumer data from portal sites and online markets using web crawling, and extracted meaningful information about products through preprocessing that transforms the data into a standardized form. The 11,200 collected data items were analyzed with logistic regression to examine consumer satisfaction, frequency, and the advantages and disadvantages of products. The analysis showed that consumer satisfaction was 92%, and defect issues of products were identified through frequency analysis. An evaluation of the developed product planning prediction program on use satisfaction, system efficiency, and system effectiveness showed high satisfaction. Defect issues are highly meaningful data in that they provide the information needed to quickly recognize current product problems and establish improvement strategies.

Image Super-Resolution for Improving Object Recognition Accuracy (객체 인식 정확도 개선을 위한 이미지 초해상도 기술)

  • Lee, Sung-Jin;Kim, Tae-Jun;Lee, Chung-Heon;Yoo, Seok Bong
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.6 / pp.774-784 / 2021
  • Object detection and recognition is a very important task in computer vision, and related research is actively being conducted. In practice, however, recognition accuracy is often degraded by a resolution mismatch between the training image data and the test image data. To solve this problem, we propose an image super-resolution technique to improve object recognition accuracy and, based on it, design and develop an integrated object recognition and super-resolution framework. Specifically, we built 11,231 license plate training images through web crawling and artificial data generation, and trained the image super-resolution artificial neural network with an objective function defined to be robust to image flips. To verify the performance of the proposed algorithm, we ran the trained super-resolution and recognition on 1,999 test images and confirmed that the proposed super-resolution technique improves the accuracy of character recognition.

Correlations among Motor Function, Quality of Life, and Caregiver Depression Levels in Children with Cerebral Palsy

  • Yoo, Ji-Na
    • The Journal of Korean Physical Therapy / v.28 no.6 / pp.385-392 / 2016
  • Purpose: This study aimed to evaluate the relationships among quality of life, caregiver depression levels, and disease severity, especially motor function, in children with cerebral palsy. Methods: Data were collected through survey questionnaires and interviews from 80 caregivers of children with cerebral palsy. The caregivers' quality of life was measured using the Medical Outcomes Study 36-Item Short Form Health Survey, and their level of depression was scored using the Beck Depression Inventory. In addition, the children's motor function was evaluated using Gross Motor Function Measure-88 and Functional Independence Measure scores. Results: Among the 8 domains of the Medical Outcomes Study 36-Item Short Form Health Survey, the "physical functioning," "physical role functioning," "mental health," and "bodily pain" domains were significantly correlated with the "total" percentage scores of the Gross Motor Function Measure-88. In addition, the "mental health" and "bodily pain" domains were correlated with each sub-dimension, including "lying and rolling," "sitting," "crawling and kneeling," "standing," and "walking, running, and jumping." Similarly, the "running" and "jumping" dimensions including motor function measures correlated with "transfer," "locomotion," and "motor subtotal" of the Functional Independence Measure scores. The Beck Depression Inventory scores were negatively correlated with "lying and rolling," "sitting," "crawling and kneeling," and the "total" percentage scores of the Gross Motor Function Measure-88. They were also negatively correlated with the "sphincter control," "communication," "social cognition," "cognitive subtotal," and "total" Functional Independence Measure scores. Conclusion: It is necessary to consider the quality of life and emotional problems of caregivers of children with cerebral palsy and to support them both physically and psychologically through comprehensive rehabilitation.

Comparison of Data Collection Methods for Big Data Analysis (빅데이터 분석을 위한 자료 수집 방안 비교)

  • Kim, Sung-kook;Oh, Chang-heon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2018.10a / pp.422-424 / 2018
  • Interest in big data analysis has grown recently, and a variety of data collection methods have been developed, yet researchers still find it difficult to collect and use such large-scale data. This paper compares and analyzes several methods of collecting big data and presents the results. We hope that researchers will be able to produce research results by selecting and using the methods that match their research objectives.

Analysis of Current Status of Marine Products and Characteristics of Processed Products Seafood in Joseon - via the Veritable Records of the Joseon Dynasty based data - (『조선왕조실록(朝鮮王朝實錄)』 속 수산물 현황과 가공식품 특성 분석)

  • Kim, Mi-Hye
    • Journal of the Korean Society of Food Culture / v.37 no.1 / pp.26-38 / 2022
  • This study used a big data method to analyze the chronological frequency and variety of seafood mentioned in the Veritable Records of the Joseon Dynasty. The findings will be used as a basis for research on the food culture of the Joseon period. Web crawling was used to scrape the Veritable Records of the Joseon Dynasty from Joseon's first king to its twenty-seventh. Of the 384,582 articles, a total of 9,536 cases mentioned seafood. Seafood appeared as the collective noun "seafood" 107 times (1.12%), as 27 types of fish 8,372 times (87.79%), as 3 types of mollusca (1.28%), as 18 types of shellfish 213 times (2.23%), as 6 types of crustacean 188 times (1.97%), and as 9 types of seaweed 534 times (5.60%). Fish appeared most frequently of all the recorded seafood, and sea fish appeared more frequently than freshwater fish. The kings who showed the highest Strong Interest Inventory (SII) were, in order, Sungjong (15th century), Sehjo (15th), Youngjo (18th), Sehjong (15th), and Jungjo (18th). The kings of Joseon were most interested in seafood in the 15th and 18th centuries.

Designing Cost Effective Open Source System for Bigdata Analysis (빅데이터 분석을 위한 비용효과적 오픈 소스 시스템 설계)

  • Lee, Jong-Hwa;Lee, Hyun-Kyu
    • Knowledge Management Research / v.19 no.1 / pp.119-132 / 2018
  • Many advanced products and services are emerging in the market thanks to data-based technologies such as the Internet of Things (IoT), big data, and AI. Building a system for data processing in an IoT network environment is not simple to configure, and it is heavily constrained by the high cost of building a high-performance server environment. Therefore, in this paper, we design a development environment for a large-scale data analysis computing platform that is low-cost and practical, using open source. Specifically, this study implements a big data processing system using Raspberry Pi, an ultra-small PC environment, and an open source API. The system includes building a portable server system, building a web server for web mining, developing Python IDE classes for crawling, and developing R libraries for NLP and visualization. Through this research, we will develop a web environment that can control real-time data collection and analysis of web media in a mobile environment and present it as a curriculum for non-IT specialists.

Implementation and Performance Analysis of Efficient Big Data Processing System Through Dynamic Configuration of Edge Server Computing and Storage Modules (BigCrawler: 엣지 서버 컴퓨팅·스토리지 모듈의 동적 구성을 통한 효율적인 빅데이터 처리 시스템 구현 및 성능 분석)

  • Kim, Yongyeon;Jeon, Jaeho;Kang, Sungjoo
    • IEMEK Journal of Embedded Systems and Applications / v.16 no.6 / pp.259-266 / 2021
  • Edge computing enables real-time big data processing by performing computation close to the physical location of the user or data source. However, in an edge computing environment, various situations that affect big data processing performance may arise from temporary service requirements or changes in physical resources in the field. In this paper, we propose BigCrawler, a system that dynamically configures the computing module and the storage module according to the big data collection status and the computing resource usage status in the edge computing environment. We also analyze the characteristics of big data processing workloads according to the arrangement of the computing and storage modules.

Implementation of place recommendation site based on user's location (사용자 위치에 기반한 장소 추천 사이트의 구현)

  • Yong, Seunglim;Ji, Changeon
    • Proceedings of the Korean Society of Computer Information Conference / 2018.07a / pp.345-346 / 2018
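  • In this paper, we implement and propose a site that takes the user's location as input and recommends nearby restaurants and attractions. The user's location is entered through a web page, and places recommended on SNS are crawled to build and analyze a database, from which restaurants and attractions are recommended. Recommended places are shown to the user on a map, with brief information about each place displayed on the map.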

Design and Implementation of Shoes Navigator, a Big Data-Based Fashion Recommendation Assistant (빅데이터 기반 패션 추천 도우미 Shoes Navigator 설계 및 구현)

  • 조현우;장지완;최현선;정목동
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.389-390 / 2023
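  • In this paper, to ease the difficulty of fashion matching, we propose Shoes Navigator, a big-data-based fashion assistant. It is an image-based fashion matching system that focuses on shoes, one of the key elements of fashion style, and uses a dataset crawled from the Musinsa (무신사) shopping mall and then cleaned. To this end, computer vision and deep learning techniques are used to automatically detect clothing items in images and extract fashion attributes such as style and color. In addition, because the system proposes the best match in consideration of the user's personal style, it can readily solve fashion coordination problems.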