• Title/Abstract/Keywords: Science and Technology Data

Search results: 15,621 items (processing time: 0.058 s)

SVM-Based Incremental Learning Algorithm for Large-Scale Data Stream in Cloud Computing

  • Wang, Ning;Yang, Yang;Feng, Liyuan;Mi, Zhenqiang;Meng, Kun;Ji, Qing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 8, No. 10
    • /
    • pp.3378-3393
    • /
    • 2014
  • We have witnessed the rapid development of information technology in recent years. One of the key phenomena is the fast, near-exponential increase of data. Consequently, most of the traditional data classification methods fail to meet the dynamic and real-time demands of today's data processing and analyzing needs--especially for continuous data streams. This paper proposes an improved incremental learning algorithm for a large-scale data stream, which is based on SVM (Support Vector Machine) and is named DS-IILS. The DS-IILS takes the load condition of the entire system and the node performance into consideration to improve efficiency. The threshold of the distance to the optimal separating hyperplane is given in the DS-IILS algorithm. The samples of the history sample set and the incremental sample set that are within the scope of the threshold are all reserved. These reserved samples are treated as the training sample set. To design a more accurate classifier, the effects of the data volumes of the history sample set and the incremental sample set are handled by weighted processing. Finally, the algorithm is implemented in a cloud computing system and is applied to study user behaviors. The results of the experiment are provided and compared with other incremental learning algorithms. The results show that the DS-IILS can improve training efficiency and guarantee relatively high classification accuracy at the same time, which is consistent with the theoretical analysis.
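
The sample-retention step described in this abstract can be illustrated with a minimal sketch. It assumes a linear separating hyperplane `w·x + b = 0` that has already been trained; the function names, the example data, and the threshold value are hypothetical and not taken from the paper, and the weighting of history versus incremental samples is left out.

```python
import math

def distance_to_hyperplane(x, w, b):
    """Geometric distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def retain_near_hyperplane(samples, w, b, threshold):
    """Keep only the (x, label) samples that lie within `threshold` of the hyperplane."""
    return [(x, y) for x, y in samples if distance_to_hyperplane(x, w, b) <= threshold]

# Hypothetical hyperplane x1 + x2 - 1 = 0 and hypothetical sample sets.
w, b = (1.0, 1.0), -1.0
history = [((0.0, 0.0), -1), ((3.0, 3.0), 1), ((0.6, 0.6), 1)]
incremental = [((0.5, 0.4), -1), ((-2.0, -2.0), -1)]

# Samples from both sets that fall inside the threshold form the next training set.
training_set = retain_near_hyperplane(history + incremental, w, b, threshold=1.0)
```

Samples far from the hyperplane are unlikely to become support vectors, so discarding them shrinks the retraining set while keeping the points most likely to move the decision boundary.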

An Automatic Urban Function District Division Method Based on Big Data Analysis of POI

  • Guo, Hao;Liu, Haiqing;Wang, Shengli;Zhang, Yu
    • Journal of Information Processing Systems
    • /
    • Vol. 17, No. 3
    • /
    • pp.645-657
    • /
    • 2021
  • Along with the rapid development of the economy, the urban scale has extended rapidly, leading to the formation of different types of urban function districts (UFDs), such as central business, residential and industrial districts. Recognizing the spatial distributions of these districts is of great significance to manage the evolving role of urban planning and further help in developing reliable urban planning programs. In this paper, we propose an automatic UFD division method based on big data analysis of point of interest (POI) data. Considering that the distribution of POI data is unbalanced in a geographic space, a dichotomy-based data retrieval method was used to improve the efficiency of the data crawling process. Further, a POI spatial feature analysis method based on the mean shift algorithm is proposed, where data points with similar attributive characteristics are clustered to form the function districts. The proposed method was thoroughly tested in an actual urban case scenario and the results show its superior performance. Further, the suitability of fit to practical situations reaches 88.4%, demonstrating a reasonable UFD division result.
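
As a rough illustration of the mean shift clustering mentioned in this abstract, the sketch below implements a flat-kernel mean shift in plain Python. It assumes POIs are represented only by 2-D coordinates, whereas the paper clusters on attributive characteristics; the bandwidth and the sample points are hypothetical.

```python
import math

def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: move each point to the mean of its neighbours until it stops moving."""
    modes = []
    for p in points:
        center = p
        for _ in range(max_iter):
            neighbours = [q for q in points if math.dist(center, q) <= bandwidth]
            new_center = tuple(sum(c) / len(neighbours) for c in zip(*neighbours))
            shift = math.dist(center, new_center)
            center = new_center
            if shift < tol:
                break
        modes.append(center)

    # Points whose modes lie within one bandwidth of each other share a cluster label.
    labels, cluster_modes = [], []
    for m in modes:
        for i, cm in enumerate(cluster_modes):
            if math.dist(m, cm) <= bandwidth:
                labels.append(i)
                break
        else:
            cluster_modes.append(m)
            labels.append(len(cluster_modes) - 1)
    return labels, cluster_modes

# Hypothetical POI coordinates forming two spatial groups.
pois = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, cluster_modes = mean_shift(pois, bandwidth=1.0)
```

Mean shift does not need the number of clusters in advance, which suits district division where the number of function districts is not known beforehand; the bandwidth is the main parameter to tune.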

SATELLITE-DERIVED SENSIBLE HEAT FLUX OVER THE OCEAN

  • Kubota Masahisa;Ohnishi Keisuke;Iwasaki Shinsuke;Tomita Hiroyuki
    • Korean Society of Remote Sensing: Conference Proceedings
    • /
    • Korean Society of Remote Sensing, 2005, Proceedings of ISRS 2005
    • /
    • pp.30-33
    • /
    • 2005
  • Although sensible heat flux is one of the heat flux components, it is generally considered less important than the other components because of its small value. Indeed, sensible heat flux over the tropical ocean is extremely small, less than $100\;W/m^2$. However, it should be noted that sensible heat flux in boreal winter over the western boundary current regions is considerably large, about $100\;W/m^2$, and cannot be neglected. In this study we carry out an intercomparison of various global sensible heat flux data sets, including not only satellite-derived data but also reanalysis data, in order to clarify the characteristics of those data.


Global Open Research Commons and Implications

  • 송사광;조민희;이미경;임형준
    • Korea Information Processing Society: Conference Proceedings
    • /
    • Korea Information Processing Society 2023 Fall Conference
    • /
    • pp.85-88
    • /
    • 2023
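  • With the growth of the open science movement, efforts to ensure interoperability among diverse research-related resources are intensifying. In particular, standardization work on a Global Open Research Commons (GORC) model has been led by the Research Data Alliance (RDA), the world's largest research data community, and the GORC Working Group recently released version 1.0 of the model. This paper examines the model, compares it with KRDC (Korea Research Data Commons), the domestic research data commons, discusses the implications, and outlines directions for future research.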

Term Ontology Modeling for Linked Data using SKOS

  • 김평;이승우;서동민;정한민;성원경
    • Korea Contents Association: Conference Proceedings
    • /
    • Korea Contents Association 2010 Spring Conference Proceedings
    • /
    • pp.456-458
    • /
    • 2010
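  • Unlike the existing HTML-based web, which represents data for human readers, the Semantic Web expresses the meaning of data on the web, enabling data integration through data exchange among diverse applications, greater reusability, and automated processing by machines. An ontology is a means of expressing the meaning of data, and does so by naming resources with identifiers (URIs); Linked Data provides an environment in which web data can be interlinked and used through links between RDF-format data. As a way to share and link term information effectively, this study presents a method for connecting term information to Linked Data through SKOS-based term ontology modeling.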
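
To make the idea of a SKOS-based term record concrete, here is a small sketch that serializes one term as N-Triples using standard SKOS properties (`skos:Concept`, `skos:prefLabel`, `skos:broader`, `skos:exactMatch`). The `example.org` URIs and the helper name are hypothetical, the paper's actual ontology model is not reproduced here, and label escaping is omitted for brevity.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

def term_to_ntriples(uri, pref_label, lang, broader=None, exact_match=None):
    """Serialize one term as SKOS statements in N-Triples syntax."""
    lines = [
        f"<{uri}> <{RDF_TYPE}> <{SKOS}Concept> .",
        f'<{uri}> <{SKOS}prefLabel> "{pref_label}"@{lang} .',
    ]
    if broader:
        # skos:broader points to the more general term in the same scheme.
        lines.append(f"<{uri}> <{SKOS}broader> <{broader}> .")
    if exact_match:
        # skos:exactMatch links the term to an equivalent concept in another dataset.
        lines.append(f"<{uri}> <{SKOS}exactMatch> <{exact_match}> .")
    return lines

# Hypothetical term URIs on example.org.
triples = term_to_ntriples(
    "http://example.org/term/ontology",
    "ontology",
    "en",
    broader="http://example.org/term/knowledge-representation",
    exact_match="http://example.org/other/ontology",
)
```

The mapping statement (`skos:exactMatch`) is what connects a local term to an external dataset, which is the linking role Linked Data relies on.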


Collection and Analysis of Electricity Consumption Data in POSTECH Campus

  • 류도현;김광재;고영명;김영진;송민석
    • Journal of Korean Society for Quality Management
    • /
    • Vol. 50, No. 3
    • /
    • pp.617-634
    • /
    • 2022
  • Purpose: This paper introduces the Pohang University of Science and Technology (POSTECH) advanced metering infrastructure (AMI) and the Open Innovation Big Data Center (OIBC) platform, and presents analysis results for electricity consumption data collected via the AMI on the POSTECH campus. Methods: We installed 248 sensors in seven buildings at POSTECH for the AMI and collected electricity consumption data from the buildings. To identify the amounts and trends of electricity consumption of the seven buildings, electricity consumption data collected from March to June 2019 were analyzed. In addition, this study compared the amounts and trends of electricity consumption of the seven buildings before and after the COVID-19 outbreak by using electricity consumption data collected from March to June of 2019 and 2020. Results: Users can monitor, visualize, and download electricity consumption data collected via the AMI on the OIBC platform. The analysis results show that the seven buildings consume different amounts of electricity and have different consumption trends. In addition, the amounts consumed by most buildings were significantly reduced after the COVID-19 outbreak. Conclusion: The POSTECH AMI and OIBC platform can be a good reference for other universities that are preparing their own microgrids. The analysis results provide evidence that POSTECH needs to establish customized strategies for reducing electricity consumption in each building. Such results would be useful for energy-efficient operation and for preparing for unusual energy consumption caused by unexpected situations like the COVID-19 pandemic.

QSDB: An Encrypted Database Model for Privacy-Preserving in Cloud Computing

  • Liu, Guoxiu;Yang, Geng;Wang, Haiwei;Dai, Hua;Zhou, Qiang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 12, No. 7
    • /
    • pp.3375-3400
    • /
    • 2018
  • With the advent of database-as-a-service (DAAS) and cloud computing, more and more data owners are motivated to outsource their data to cloud databases for reasons of convenience and cost. However, securing the database-as-a-service model in cloud computing has become challenging, because adversaries may try to gain access to sensitive data, and curious or malicious administrators may capture and leak data. To preserve privacy, sensitive data should be encrypted before outsourcing. In this paper, we present a secure and practical system over encrypted cloud data, called QSDB (queryable and secure database), which simultaneously supports SQL query operations. The proposed system can store and process floating point numbers without compromising the security of data. To balance the tradeoff between data privacy protection and query processing efficiency, QSDB utilizes three different encryption models to encrypt data. Our strategy is to process as many queries as possible at the cloud server. Encryption of queries and decryption of encrypted query results are performed at the client. Experiments on real-world data sets were conducted to demonstrate the efficiency and practicality of the proposed system.

Current Status and Proposal of University Library Research Data Management Service: Focused on Science and Technology Specialized Universities

  • 김주섭;김선태
    • Journal of the Korean Society for Library and Information Science
    • /
    • Vol. 57, No. 3
    • /
    • pp.279-301
    • /
    • 2023
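  • The research environment is rapidly becoming data-centric. Accordingly, university libraries in Korea are preparing to build and operate research data management services to support their researchers. This study was designed to propose a research data management service with which libraries at science and technology specialized universities can support researchers. To that end, 11 science and technology specialized universities in Korea and abroad were selected and their research data management services were analyzed. The analysis yielded three core categories: research data management, electronic research notebooks, and RDM education. The 'research data management' category in particular comprises DMP, data collection, data management, data preservation, data sharing and publication, data reuse, infrastructure and tools, and RDM guides and policies. The results of this study will help libraries at science and technology specialized universities introduce and operate research data management services.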

Detection of multi-type data anomaly for structural health monitoring using pattern recognition neural network

  • Gao, Ke;Chen, Zhi-Dan;Weng, Shun;Zhu, Hong-Ping;Wu, Li-Ying
    • Smart Structures and Systems
    • /
    • Vol. 29, No. 1
    • /
    • pp.129-140
    • /
    • 2022
  • The effectiveness of system identification, damage detection, condition assessment and other structural analyses relies heavily on the accuracy and reliability of the measured data in structural health monitoring (SHM) systems. However, data anomalies often occur in SHM systems, leading to inaccurate and untrustworthy analysis results. Therefore, anomalies in the raw data should be detected and cleansed before further analysis. Previous studies on data anomaly detection mainly focused on a single type of data anomaly, for denoising or removing outliers, while the existing methods of detecting multiple data anomalies are usually time consuming. For these reasons, recognising multiple anomaly patterns for real-time alarm and analysis in field monitoring remains a challenge. Aiming to achieve efficient and accurate detection of multi-type data anomalies for field SHM, this study proposes a pattern-recognition-based data anomaly detection method that consists of three main steps: feature extraction from the long time-series data samples, training of a pattern recognition neural network (PRNN) using the features, and finally detection of data anomalies. The feature extraction step remarkably reduces the time cost of the network training, making the detection process very fast. The performance of the proposed method is verified on the basis of the SHM data of two practical long-span bridges. Results indicate that the proposed method recognises multiple data anomalies with very high accuracy and low calculation cost, demonstrating its applicability in field monitoring.
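
The feature extraction step is what lets a small network work on long records. The sketch below shows the general idea only: the abstract does not say which features the authors use, so the four statistics here (including the hypothetical `flat_ratio`) and the example series are assumptions for illustration.

```python
import statistics

def extract_features(series):
    """Reduce one long time series to a short, fixed-length feature vector."""
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),
        "range": max(series) - min(series),
        # Fraction of samples that repeat the previous value; high for a "stuck" sensor.
        "flat_ratio": sum(a == b for a, b in zip(series, series[1:])) / (len(series) - 1),
    }

# Hypothetical sensor records: one normal-looking, one with a constant (stuck) reading.
normal = [0.1, -0.2, 0.3, -0.1, 0.2, -0.3]
stuck = [0.5] * 6

normal_features = extract_features(normal)
stuck_features = extract_features(stuck)
```

A classifier trained on such vectors sees a handful of numbers per record instead of the raw samples, which is consistent with the reduced training cost the abstract reports.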

Research on Science DMZ scalability for the high performance research data networking

  • 이찬균;장민석;노민기;석우진
    • KNOM Review
    • /
    • Vol. 22, No. 2
    • /
    • pp.22-28
    • /
    • 2019
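  • Science DeMilitarized Zone (DMZ) is a dedicated network technology for large-volume research data, optimized for the characteristics of such data. Science DMZ builds a closed network that guarantees trust among the researchers using it, and ensures end-to-end performance by excluding security appliances and similar equipment that can degrade transfer performance. The Data Transfer Node (DTN) is an essential component of Science DMZ that handles only the sending and receiving of research data and ensures the performance and security of the network. In the current Science DMZ architecture, a DTN server is deployed for each network user, which limits scalability in terms of excessive network management burden, barriers to entry for new users, and network-wide CAPEX. To address this scalability problem, this paper presents an architecture in which research network users are grouped and share a centralized common DTN server. In particular, by applying the cost-versus-performance trend of commodity computing equipment and comparing the cost of configuring network equipment under different network loads, the paper predicts and analyzes the effect of the proposed shared DTN approach.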