Search | Korea Science

A proposal on a proactive crawling approach with analysis of state-of-the-art web crawling algorithms (최신 웹 크롤링 알고리즘 분석 및 선제적인 크롤링 기법 제안)

Na, Chul-Won;On, Byung-Won
- Journal of Internet Computing and Services / v.20 no.3 / pp.43-59 / 2019
Today, with the spread of smartphones and the growth of social networking services, the volume of stored structured and unstructured big data has increased exponentially. Analyzed well, these data yield useful information for predicting the future. To analyze big data, large amounts of data must first be collected, and the web is the repository where most of these data are stored. Because of the sheer size, however, data carrying unneeded information are as plentiful as data carrying useful information. Efficient collection, which filters out data with unnecessary information and gathers only data with useful information, has therefore become important. Web crawlers cannot download all pages because of constraints such as network bandwidth, operating time, and data storage. A crawler should thus avoid visiting pages irrelevant to its goal and download important pages as early as possible. This paper seeks to help resolve these issues. First, we introduce basic web crawling algorithms, describing and comparing the time complexity and the pros and cons of each. Next, we introduce state-of-the-art web crawling algorithms that address the shortcomings of the basic ones. Recent research trends also show active study of web crawling algorithms with special purposes, such as collecting sentiment words. As one such special-purpose algorithm, we introduce a sentiment-aware web crawling technique, which is a proactive web crawling technique. The results showed that the larger the data, the higher the performance and the more space is saved.
https://doi.org/10.7472/jksii.2019.20.3.43
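The idea of downloading important pages first can be illustrated with a best-first crawl frontier. The following is a minimal sketch, not the algorithm from the paper: the link graph, the relevance scores, and the function name `best_first_crawl` are all hypothetical, and no network access is performed.

```python
import heapq

def best_first_crawl(graph, scores, seed, budget):
    """Visit up to `budget` pages, always expanding the highest-scoring known page."""
    # heapq is a min-heap, so scores are negated to pop the highest score first.
    frontier = [(-scores.get(seed, 0.0), seed)]
    seen = {seed}
    visited = []
    while frontier and len(visited) < budget:
        _, page = heapq.heappop(frontier)
        visited.append(page)  # stands in for downloading the page
        for link in graph.get(page, []):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-scores.get(link, 0.0), link))
    return visited

# Toy link graph and made-up relevance scores (assumptions for illustration only).
graph = {"seed": ["a", "b"], "a": ["c"], "b": ["d"], "c": [], "d": []}
scores = {"seed": 1.0, "a": 0.2, "b": 0.9, "c": 0.1, "d": 0.8}
order = best_first_crawl(graph, scores, "seed", budget=3)
```

With a budget of three pages, the low-scoring branch through `a` is never downloaded, which is the kind of saving in bandwidth and storage the abstract describes.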

A Web Application for Open Data Visualization Using R (R 이용 오픈데이터 시각화 웹 응용)

Kim, Kwang-Seob;Lee, Ki-Won
- Journal of the Korean Association of Geographic Information Studies / v.17 no.2 / pp.72-81 / 2014
Big data has become a major topic in recent years, and interest in its technologies is growing accordingly. Among the underlying technologies, this study focuses on data visualization and on R, which is open source. In general, data visualization can be summarized as the set of web technologies for constructing, manipulating, and displaying various types of graphic objects interactively. R is an environment and language for statistical data analysis, from basic to advanced levels. In this study, a web application combining these technologies and components is implemented and demonstrated with visualization of geo-based open data provided by public organizations or government agencies. This application model does not require users to build their own data or install proprietary software. Furthermore, it is designed for users in the geo-spatial application field who have little experience with or knowledge of R. The visualization results from this application can support the decision-making of web users who have access to the service. More practical and varied applications, with R-based geo-statistical analysis functions and complex operations linked to big data, are expected to broaden the scope and range of geo-spatial applications.
https://doi.org/10.11108/kagis.2014.17.2.072

Recent progress in astronomy education in Makerspace situation

Kim, Yonggi;Kim, Hyoungbum
- The Bulletin of The Korean Astronomical Society / v.46 no.2 / pp.73.2-73.2 / 2021
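This study explores how intelligent information technologies can be applied to astronomy education. Developing astronomy education programs in a makerspace equipped with intelligent information technologies such as 3D printers, laser cutters, big data, artificial intelligence, and drones is expected to contribute greatly to cultivating the core competencies of the Fourth Industrial Revolution era. In addition, centered on the Chungbuk National University Pro Makers Center, which was selected in August 2021 as a Ministry of SMEs and Startups project, a variety of astronomy teaching aids and programs that use them will be developed in a makerspace environment and applied in both formal and informal education settings. Views on how public astronomy programs will develop in makerspace environments will also be presented and discussed.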

A Sampling based Pruning Approach for Efficient Angular Space Partitioning based Skyline Query Processing (효율적인 각 기반 공간 분할 병렬 스카이라인 질의 처리를 위한 데이터 샘플링 기반 프루닝 기법)

Choi, Woo-Sung;Min, Jong-Hyeon;Chung, Jaehwa;Jung, SoonYoung
- Proceedings of the Korea Information Processing Society Conference / 2016.04a / pp.55-58 / 2016
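A skyline query requests the 'preferable' options among many candidates. Because it greatly reduces the number of options a user must review, the skyline query is very useful in big data environments where data volume grows explosively. Against this background, techniques for distributed parallel processing of skyline queries over large data sets have attracted attention, and MapReduce-based approaches in particular are being actively studied. To improve the parallelism of MapReduce-based algorithms, three problems must be resolved: load imbalance, redundant computation, and excessive network cost. A MapReduce-based skyline query processing technique that uses angle-based space partitioning was recently proposed to resolve load imbalance and redundant computation, but it is not optimized in terms of network cost. This paper proposes MR-SEAP (MapReduce sample Skyline object Equality Angular Partitioning), a new MapReduce-based parallel skyline query processing technique that resolves load imbalance and redundant computation while also reducing network cost through pruning. MR-SEAP samples the data to extract sample skyline objects, partitions the space by the angles that distribute those objects evenly, and computes the skyline query in parallel; it also uses the sample skyline to prune many objects in advance, thereby reducing network cost. The paper verifies the superiority of the proposed technique by experimentally evaluating its performance under various data cardinalities and distributions.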
https://doi.org/10.3745/PKIPS.y2016m04a.55
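The pruning step described in the abstract can be sketched on a single machine. This is an illustrative sketch only, not MR-SEAP itself: it omits the angular partitioning and the MapReduce stages, assumes smaller values are preferable in every dimension, and uses hypothetical function names and toy data.

```python
import random

def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly better in one (smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def skyline_with_sample_pruning(points, sample_size, seed=0):
    """Compute the skyline after discarding points dominated by a sample skyline."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    sample_sky = skyline(sample)
    # A point dominated by any real data point cannot be in the skyline,
    # so it can be dropped before the (more expensive) main computation.
    survivors = [p for p in points if not any(dominates(s, p) for s in sample_sky)]
    return skyline(survivors), len(points) - len(survivors)

# Toy two-dimensional data (an assumption for illustration).
points = [(1, 9), (2, 7), (3, 8), (4, 4), (6, 5), (7, 2), (8, 8), (9, 1), (9, 9)]
result, pruned_count = skyline_with_sample_pruning(points, sample_size=4)
```

Because the sample skyline objects are themselves data objects, the pruned result always equals the full skyline; in a distributed setting, the objects counted in `pruned_count` are the ones that would not need to be sent over the network.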

A Study on the Change of Smart City's Issues and Perception : Focus on News, Blog, and Twitter (스마트도시의 이슈와 인식변화에 관한 연구 : 뉴스, 블로그, 트위터 자료를 중심으로)

Jang, Hwan-Young
- Journal of Cadastre & Land InformatiX / v.49 no.2 / pp.67-82 / 2019
The purpose of this study is to analyze the issues and perceptions surrounding smart cities. First, using a big data analysis platform, big data analysis on smart cities was conducted to derive keywords by year, word clouds, and the frequency of smart city keywords over time. Second, trends and flows by area were analyzed by reclassifying the major keywords for each year under meta-keywords. Third, the flow of emotional perception of smart cities and the major emotional keywords were derived. According to the analysis, while U-City in the past centered mostly on creating infrastructure for new towns, recent smart cities focus on citizen-led sustainable urban development. In addition, while infrastructure, services, and technology were emphasized in the past, management and methodology have been emphasized recently, and positive perception of smart cities is growing. The study could serve as basic data on the past, present, and future of smart cities in Korea at a time when smart city services are being built across the country.
https://doi.org/10.22640/lxsiri.2019.49.2.67

A study of the vitalization strategy for public sports facility through big-data (빅데이터 분석을 활용한 기금지원 체육시설 활성화 방안)

Kim, Mi-ok;Ko, Jin-soo;Noh, Seung-Chul;Chung, Jae-Hoon
- Journal of Digital Convergence / v.15 no.2 / pp.527-535 / 2017
As interest in health promotion through sports increases, demand for public sports facilities is growing steadily. However, research on their operation and management is lacking compared with research on supply planning. In this context, this study aims to address problems in the management of public sports centers and to suggest strategies for vitalizing the facilities through big data. The data were collected from web sources such as news, blogs, and cafes over one year, 2015. The big data show that national sports centers and open gyms have similar user behavior but different needs. Both types of facility are used as sports and leisure areas and have a high percentage of visitors who come for other purposes such as walking and picnics. However, the national sports facilities were used for more specialized programs, whereas the open sports centers were used as leisure space.
https://doi.org/10.14400/JDC.2017.15.2.527

The Analysis of the Recent News on Domestic Drought Situation by National Drought Information-Analysis System (국가가뭄정보분석시스템을 활용한 최근 가뭄관련 언론현황 분석 및 고찰)

Lee, Ho Sun;Chun, Gun Il;Park, Jae Young
- Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.340-340 / 2017
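Droughts caused by climate change have recently become frequent worldwide, and Korea also suffered considerable hardship from the prolonged drought of 2014-2015. Because drought progresses relatively slowly and its impacts are complex, it causes significant damage unless appropriate advance measures are taken. Recently, there have been attempts to apply the big data concept, moving beyond the conventional collection and analysis of water resources information and pursuing linkage with other social systems. The K-water National Drought Information-Analysis Center derived a methodology for using news as an auxiliary means of early drought recognition and impact assessment, implemented it in a system, and analyzed its usefulness. So that the occurrence, impact, and response to drought can be searched comprehensively, news information is divided, following the order of drought progression, into five categories (drought signs and forecasts, drought occurrence, drought impact, drought response, and drought preparedness and relief) with 69 associated detailed keywords, and these were reflected in the system. Big data functions are applied to collect internet news automatically using those keywords; duplicate or irrelevant news is excluded, and the remaining items are classified spatially by region of occurrence and displayed on a GIS map. Using the system, a total of 448 news items collected for 2016 were analyzed. The locations, timing, and damage details in the news, together with the drought assessment results in the system that reflect the 2016 water supply scheme, were found to represent the 2016 water supply and demand situation well. Going forward, the center plans to secure diverse drought-related information from social media and SNS in addition to news through big data collection, to develop ways of using it as reference material for drought factors and impact assessment, and to continue verification through application in the system.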

Research of Water-related Disaster Monitoring Using Satellite Bigdata Based on Google Earth Engine Cloud Computing Platform (구글어스엔진 클라우드 컴퓨팅 플랫폼 기반 위성 빅데이터를 활용한 수재해 모니터링 연구)

Park, Jongsoo;Kang, Ki-mook
- Korean Journal of Remote Sensing / v.38 no.6_3 / pp.1761-1775 / 2022
Due to unpredictable climate change, both the frequency of water-related disasters and the scale of the damage are continuously increasing. For disaster management, it is essential to identify damaged areas over a wide region and to monitor them for mid- and long-term forecasting. In the field of water disasters, research on remote sensing with Synthetic Aperture Radar (SAR) satellite images for wide-area monitoring is being actively conducted. Time-series analysis for monitoring requires collecting a large number of images and a complex preprocessing step that accounts for noisy radar characteristics, which takes considerable time. With the recent development of cloud computing, many platforms capable of spatiotemporal analysis on satellite big data have been proposed. Google Earth Engine (GEE) is a representative platform that provides about 600 satellite datasets for free and enables semi-real-time spatiotemporal analysis based on analysis-ready satellite image data. In this study, therefore, immediate detection of water disaster damage and mid- to long-term time-series observation were carried out using GEE. Using the Otsu technique, which is mainly used for change detection, changes in river width and flooded area caused by river flooding were confirmed, focusing on the torrential rains of 2020. In addition, for disaster management purposes, the trend of change in the time-series water body from 2018 to 2022 was confirmed. The short processing time of JavaScript-based coding and the strengths in spatiotemporal analysis and presentation of results are expected to enable its use in the field of water disasters. The range of applications is also expected to expand through connection with various satellite big data in the future.
https://doi.org/10.7780/kjrs.2022.38.6.3.2
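The Otsu technique mentioned in the abstract picks the threshold that best separates a histogram into two classes. The following is a minimal pure-Python sketch under stated assumptions: it works on a flat list of integer intensities rather than on GEE imagery, the sample values are made up, and treating the darker class as water is an assumption based on open water typically returning low SAR backscatter. It is not the authors' GEE JavaScript code.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t maximizing between-class variance; class 0 is values <= t."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of values in class 0 (<= t)
    sum0 = 0  # sum of intensities in class 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # class 0 is still empty
        w1 = total - w0
        if w1 == 0:
            break     # class 1 is empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Made-up bimodal intensities: a dark cluster and a bright cluster.
values = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
t = otsu_threshold(values)
water_mask = [v <= t for v in values]  # assumption: dark values are water
```

Comparing such masks between two acquisition dates is one simple way to estimate a change in flooded area.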

Development of big data-based water supply and demand analysis technique for digital twin (디지털 트윈을 위한 빅데이터 기반 물수급 분석 기법 개발)

Kim, Jang-Gyeong;Moon, Soo-Jin;Yeo, In-Hee;Kim, Tae-Jeong;Nam, Woo-Sung
- Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.224-224 / 2022
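Efficient water management of issues such as water shortage, water pollution, and algal blooms requires integrated water information, but because the information is produced and managed separately by ministry and by purpose, it is difficult to respond effectively to water management issues. Decision-making on these issues requires an accurate understanding of the current situation and a forecasting and analysis framework that considers future (1- and 3-month) water resource conditions. This in turn requires a framework for analyzing the status of water allocation based on actual water use information, such as available water by source, regional water use, and return flow, linking local governments, basins, and rivers. In this study, a water supply and demand analysis algorithm that connects the locations of supply and demand facilities was developed. It sets the topological relationships in the geospatial information to determine the computation order of the analysis and, with a time-series DB as input, a system was built that produces more than about 400,000 items of daily water supply and demand analysis information nationwide. The water supply and demand analysis model developed in this study is expected to be used for evaluating water supply capacity in regions with water issues and for various water resource policy evaluations, such as digital twins.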
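Deriving a computation order from topological relationships between facilities can be sketched with a topological sort. This is an illustrative assumption about how such an ordering might be obtained, not the authors' algorithm; the facility names and the network are hypothetical.

```python
from collections import deque

def topological_order(edges):
    """Kahn's algorithm: order nodes so every upstream node precedes its downstream nodes."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    downstream = {n: [] for n in nodes}
    for up, down in edges:
        downstream[up].append(down)
        indegree[down] += 1
    # Start from facilities with no upstream dependency (sorted for a stable order).
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in downstream[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("network contains a cycle")
    return order

# Hypothetical supply/demand network: (upstream, downstream) pairs.
edges = [
    ("reservoir", "intake"),
    ("intake", "city"),
    ("intake", "farm"),
    ("city", "river_outlet"),
    ("farm", "river_outlet"),
]
order = topological_order(edges)
```

Processing facilities in this order means each node's inflow, including return flow from upstream users, is already known when its own balance is computed.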

Hadoop-based Large Data Management and Analysis for Parking Enforcement System (주정차 단속 시스템을 위한 하둡 기반 대용량 데이터 관리 및 분석)

Baek, Na-Eun;Song, Youngho;Shin, Jaehwan;Chang, Jae-Woo
- Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.429-432 / 2017
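Rising car ownership has led to social problems such as traffic congestion and illegal parking and stopping. Illegal parking in particular causes additional problems such as traffic congestion and a shortage of parking space. Local governments are therefore studying measures to address illegal parking, but these measures are limited by initial costs and staff shortages. Meanwhile, with advances in information and communications, research is under way on efficiently processing large volumes of public data in public administration as well. This research too is limited by a lack of big data processing platforms and by inadequate analysis systems. This paper therefore proposes a Hadoop-based large data management and analysis system for parking enforcement, to efficiently process public data such as illegal parking data. First, when parking enforcement is carried out, the proposed system stores the enforcement data through Hive, looks up the owner of the cited vehicle, and notifies the owner of the enforcement or imposes a fine. Second, it performs various analyses on the collected enforcement data through a web interface and provides visualizations of the analyzed data using R.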
https://doi.org/10.3745/PKIPS.y2017m04a.429

Search Result 308, Processing Time 0.031 seconds
