• Title/Summary/Keyword: Large Scale Data

FINER-SCALE SST FRONT OF THE SOUTHERN ECS IN WINTERTIME FROM SATELLITE AND SHIPBOARD DATA

  • Chang, Yi;Shimada, Theruhisa;Sakaida, Futoki;Kawamura, Hiroshi;Chan, Jui-Wen;Liu, Dong-Chan;Lee, Ming-An
    • Proceedings of the KSRS Conference / v.2 / pp.740-743 / 2006
  • We identify two distinct finer-scale frontal bands: the 'Mainland China Coastal Front' (MCCF) and the 'Kuroshio Front' (KF). The MCCF lies along the 50-m isobath and has a large temperature gradient. It is the boundary between the Mainland China Coastal Current and the offshore shelf waters. The KF, by contrast, extends northeastward from the northeastern coast of Taiwan into the shelf of the southern ECS. It forms a broad semicircle curving along the 100-m isobath, deviates eastward at around 26.5N-122E, and leaves the ECS shelf. This front should be the boundary between the Kuroshio water and the other shelf waters.

Analysis of Traffic Accident by Circular Intersection Type in Korea Using Count Data Model (가산자료 모형을 이용한 국내 원형교차로 유형별 교통사고 분석)

  • Kim, Tae Yang;Lee, Min Yeong;Park, Byung Ho
    • Journal of the Korean Society of Safety / v.32 no.5 / pp.129-134 / 2017
  • This study develops traffic accident models by circular intersection type using count data models. The number of accidents, the number of fatal and injured persons (FSI), and EPDO are calculated from the traffic accident data of TAAS. The circular intersection accident models are developed through Poisson and negative binomial regression analysis. The main results are as follows. First, the null hypotheses that there are differences in the number of traffic accidents, FSI and EPDO by type of circular intersection are rejected. Second, the scale of the intersection (medium, large), the number of approach roads, the mean width and length of the exit roads, and the areas of the circulating roadway and central island are selected as factors influencing the number of traffic accidents, FSI and EPDO at rotaries. Third, the scale of the intersection (medium), guide signs (speed limit, direction, roundabout), the number of approach roads, the entry angle, and the areas of the intersection and central island are adopted as factors influencing the number of traffic accidents, FSI and EPDO at roundabouts. Finally, converting rotaries to roundabouts can be expected to reduce accidents.
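
A common reason to fit both a Poisson and a negative binomial model is that Poisson regression assumes the variance of the counts equals their mean, while the negative binomial model allows the variance to exceed it. The sketch below shows a simple dispersion check on raw counts. It is an illustration only, not the procedure used in the study: the function names, the sample counts, and the 1.5 threshold are all hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean (close to 1 for Poisson-like counts)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m


def suggest_count_model(counts, threshold=1.5):
    """Rough heuristic; the threshold is an arbitrary assumption, not a standard."""
    if dispersion_index(counts) > threshold:
        return "negative binomial"
    return "Poisson"


# Hypothetical accident counts per intersection (not data from the study).
overdispersed = [0, 1, 0, 2, 1, 0, 9, 0, 1, 12]
even = [2, 3, 2, 3, 2, 3, 2, 3]

print(suggest_count_model(overdispersed))
print(suggest_count_model(even))
```

In practice the choice is usually confirmed with a formal overdispersion test or a comparison of model fit rather than a fixed cutoff.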

Enhanced Locality Sensitive Clustering in High Dimensional Space

  • Chen, Gang;Gao, Hao-Lin;Li, Bi-Cheng;Hu, Guo-En
    • Transactions on Electrical and Electronic Materials / v.15 no.3 / pp.125-129 / 2014
  • A dataset can be clustered by merging the bucket indices that come from the random projection of locality sensitive hashing functions. It should be noted that for this to work the merging interval must be calculated first. To improve the feasibility of large scale data clustering in high dimensional space we propose an enhanced Locality Sensitive Hashing Clustering Method. Firstly, multiple hashing functions are generated. Secondly, data points are projected to bucket indices. Thirdly, bucket indices are clustered to get class labels. Experimental results showed that on synthetic datasets this method achieves high accuracy at much improved cluster speeds. These attributes make it well suited to clustering data in high dimensional space.
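
The three steps named in the abstract (generate hash functions, project points to bucket indices, cluster bucket indices into labels) can be sketched roughly as follows. This is a simplified illustration, not the authors' method: the hash form floor((a.x + b) / width), the rule that merges buckets whose indices differ by at most `merge_interval` in every hash, and all parameter values are assumptions. The paper computes its merging interval, which is not reproduced here.

```python
import math
import random
from itertools import combinations


def make_hashes(dim, n_hashes, width, rng):
    """Step 1: draw random projections (a, b) for h(x) = floor((a.x + b) / width)."""
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
            for _ in range(n_hashes)]


def bucket_key(point, hashes, width):
    """Step 2: project one point to a tuple of bucket indices, one per hash."""
    return tuple(
        math.floor((sum(a_i * x_i for a_i, x_i in zip(a, point)) + b) / width)
        for a, b in hashes
    )


def lsh_cluster(points, n_hashes=4, width=4.0, merge_interval=1, seed=0):
    """Step 3: merge nearby buckets with union-find and return one label per point."""
    rng = random.Random(seed)
    hashes = make_hashes(len(points[0]), n_hashes, width, rng)
    keys = [bucket_key(p, hashes, width) for p in points]
    distinct = sorted(set(keys))
    parent = {k: k for k in distinct}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path compression
            k = parent[k]
        return k

    # Pairwise comparison of buckets is quadratic; acceptable for a sketch.
    for u, v in combinations(distinct, 2):
        if max(abs(i - j) for i, j in zip(u, v)) <= merge_interval:
            parent[find(u)] = find(v)

    roots = {}
    return [roots.setdefault(find(k), len(roots)) for k in keys]


# Two well-separated synthetic blobs in 8 dimensions (hypothetical data).
data_rng = random.Random(1)
blob_a = [[data_rng.uniform(-0.1, 0.1) for _ in range(8)] for _ in range(20)]
blob_b = [[100 + data_rng.uniform(-0.1, 0.1) for _ in range(8)] for _ in range(20)]
labels = lsh_cluster(blob_a + blob_b)
print(sorted(set(labels)))
```

Only bucket indices are compared, never the raw points, which is where the speed-up over distance-based clustering would come from.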

Evaluating and Mitigating Malicious Data Aggregates in Named Data Networking

  • Wang, Kai;Bao, Wei;Wang, Yingjie;Tong, Xiangrong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.9 / pp.4641-4657 / 2017
  • Named Data Networking (NDN) has emerged as one of the most promising architectures for the future Internet. However, like the traditional IP-based networking paradigm, NDN may not evade some typical network threats such as malicious data aggregates (MDA), which may lead to bandwidth exhaustion, traffic congestion and router overload. This paper first analyzes the damage caused by MDA using realistic simulations in a large-scale network topology, showing that the threat is not just theoretical, and then designs a fine-grained MDA mitigation mechanism (MDAM) based on cooperation between routers via alert messages. Simulation results show that MDAM can significantly reduce Pending Interest Table overload in the involved routers and restore a normal data-returning rate and data-retrieval delay.

Reinforcement learning multi-agent using unsupervised learning in a distributed cloud environment

  • Gu, Seo-Yeon;Moon, Seok-Jae;Park, Byung-Joon
    • International Journal of Internet, Broadcasting and Communication / v.14 no.2 / pp.192-198 / 2022
  • Companies are building and using their own data analysis systems in the distributed cloud according to their business characteristics. However, as businesses and data types become more complex and diverse, the demand for more efficient analytics has increased. In response, this paper proposes an unsupervised learning-based data analysis agent to which reinforcement learning is applied. The proposed agent consists of a reinforcement learning processing manager module and an unsupervised learning manager module. These two modules configure an agent with k-means clustering on multiple nodes and then perform distributed training on multiple data sets. This enables data analysis in a relatively short time compared to conventional systems that analyze large-scale data in one batch.

A guideline for the statistical analysis of compositional data in immunology

  • Yoo, Jinkyung;Sun, Zequn;Greenacre, Michael;Ma, Qin;Chung, Dongjun;Kim, Young Min
    • Communications for Statistical Applications and Methods / v.29 no.4 / pp.453-469 / 2022
  • The study of immune cellular composition has been of great scientific interest in immunology because of the generation of multiple large-scale data. From the statistical point of view, such immune cellular data should be treated as compositional. In compositional data, each element is positive, and all the elements sum to a constant, which can be set to one in general. Standard statistical methods are not directly applicable for the analysis of compositional data because they do not appropriately handle correlations between the compositional elements. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses using log-ratio transformations and the alternative approach using Dirichlet regression analysis, discuss their theoretical foundations, and illustrate their applications with immune cellular fraction data generated from colorectal cancer patients.
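
As a brief illustration of the log-ratio transformations mentioned in the abstract, the sketch below implements the centered log-ratio (clr) and additive log-ratio (alr) transforms for a single composition. The function names and the example fractions are hypothetical, and the code assumes every part is strictly positive; handling of zeros is not covered.

```python
import math


def closure(parts):
    """Rescale strictly positive parts so they sum to one."""
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in closure(parts)]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def alr(parts, ref=-1):
    """Additive log-ratio: log of each part relative to a reference part."""
    comp = closure(parts)
    ref_index = ref % len(comp)
    return [math.log(p / comp[ref_index])
            for i, p in enumerate(comp) if i != ref_index]


# Hypothetical cell-type fractions for one sample (not data from the paper).
fractions = [1.0, 2.0, 4.0]
print(clr(fractions))
print(alr(fractions))
```

The clr values sum to zero and both transforms are unchanged when all parts are multiplied by the same constant, which is why they suit data that carry only relative information.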

Environmental Equity Analysis of Fine Dust in Daegu Using MGWR and KT Sensor Data (다중 스케일 지리가중회귀 모형과 KT 측정기 자료를 활용한 대구시 미세먼지에 대한 환경적 형평성 분석)

  • Euna CHO;Byong-Woon JUN
    • Journal of the Korean Association of Geographic Information Studies / v.26 no.4 / pp.218-236 / 2023
  • This study analyzed the environmental equity of fine dust (PM10) in Daegu using MGWR (Multi-scale Geographically Weighted Regression) and KT (Korea Telecom Corporation) sensor data. Existing national monitoring network data for fine dust are collected at a small number of ground-based stations that are sparsely distributed over a large area. To compensate for these drawbacks, this study used KT sensor data from a large number of densely distributed IoT (Internet of Things) stations. The MGWR model was used to deal with spatial heterogeneity and multi-scale contextual effects in the spatial relationships between fine dust concentration and socioeconomic variables. Results indicate that there was environmental inequity by land value and foreigner ratio in the spatial distribution of fine dust in Daegu metropolitan city. The MGWR model also showed better explanatory power than the Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR) models in explaining the spatial relationships between fine dust concentration and socioeconomic variables. This study demonstrated the potential of KT sensor data as a supplement to the existing national monitoring network data for measuring fine dust.

Characteristics and Application of Large-area Multi-temporal Remote Sensing Data (광역 시계열 원격탐사자료 분석의 특성과 응용)

  • 성정창
    • Korean Journal of Remote Sensing / v.16 no.1 / pp.1-11 / 2000
  • Multi-temporal data have been used frequently for analyzing the dynamic characteristics of the ecological environment. Little research, however, addresses the characteristics and problems of analyzing continental- or global-scale, multi-temporal satellite data. This research investigated the characteristics of large-area, multi-temporal data analysis and the problems of phenological differences in ground vegetation and scarcity of training data over a long-term period. It suggested a latitudinal image segmentation method and an invariant pixel method. As an application, the image segmentation and invariant pixel methods were applied to a set of AVHRR data covering most of Asia from 1982 to 1993. Fuzzy classification results showed a decrease of forests and an increase of croplands in densely populated areas, whereas the opposite trend was detected in sparsely populated or depopulated areas.

An Analysis of Fuzzy Survey Data Based on the Maximum Entropy Principle (최대 엔트로피 분포를 이용한 퍼지 관측데이터의 분석법에 관한 연구)

  • 유재휘;유동일
    • Journal of the Korea Society of Computer and Information / v.3 no.2 / pp.131-138 / 1998
  • In usual statistical data analysis, we describe statistical data by exact values. However, in modern complex and large-scale systems, it is difficult to treat the systems using only exact data. In this paper, we define these data as fuzzy data (i.e., linguistic variables are used to construct the membership function) and propose a new method for analyzing fuzzy survey data based on the maximum entropy principle. We also propose a new method of discrimination that measures the distance between the distribution of the stable state and the estimated distribution of the present state using the Kullback-Leibler information. Furthermore, we investigate the validity of our method by computer simulations under realistic situations.
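
The Kullback-Leibler information used for discrimination can be computed for two discrete distributions as in the sketch below. The direction of the comparison (present state measured against the stable state), the function name, and the example probabilities are assumptions for illustration; the abstract does not specify them.

```python
import math


def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i must be positive wherever p_i is.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue
        if q_i <= 0:
            raise ValueError("q must be positive wherever p is positive")
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical distributions over two survey categories.
present = [0.5, 0.5]
stable = [0.9, 0.1]
print(kl_divergence(present, stable))
print(kl_divergence(stable, stable))
```

The measure is zero only when the two distributions coincide and is not symmetric, so swapping the arguments generally gives a different value.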

Analysis of Overviews of Working Environment Measurement and its Results in Korean Industry (우리나라 사업장의 작업환경측정 및 노출기준 초과실태 분석)

  • 김정호;원정일
    • Journal of Environmental and Sanitary Engineering / v.11 no.3 / pp.53-61 / 1996
  • This study analysed companies exceeding the TLV by industry and hazardous factor, and estimated the number of companies measured in 1993 and the implementation rate of working environment measurement under the Industrial Safety and Health Act. The results were as follows. 1. A total of 5,937 companies exceeded the TLV. By number of workers, small companies with under 49 workers accounted for 3,150 companies (53.0%), medium companies with 50-299 workers for 2,248 companies (37.9%), and large companies with over 300 workers for 539 companies (9.1%). By industry, the rate was highest in the manufacture of fabricated metal products (except machinery and equipment) and the manufacture of textiles, with 1,048 companies (17.7%) and 1,018 companies (17.1%) respectively. By area, the rate was high in the Kyeongki area with 1,679 companies (28.3%) and in the Daegu-Kyeongbuk area with 1,417 companies (23.9%). By hazardous factor, noise was exceeded in 5,160 companies (86.9%), dust in 1,245 companies (21.0%), and organic solvents in 130 companies (7.9%). The number of excess factors per company was 1.2. The larger the company, the more excess factors it had and the higher its excess rate for noise, organic solvents, heavy metals, etc. 2. Institutes measured 1,596 companies during 1994, of which 157 companies (9.8%) exceeded the TLV. By number of workers, the TLV was exceeded at 190 (17.9%) of 1,064 small companies with under 49 workers, 127 (27.9%) of 463 medium companies with 50-299 workers, and 31 (44.9%) of 69 large companies with over 300 workers. By industry, the excess rate was highest in the manufacture of basic metals, where 20 of 53 companies (37.7%) exceeded the TLV, followed by the manufacture of pulp and paper products, where 14 of 40 companies (35.0%) did; the excess rate was higher at larger companies. 3. The number of companies estimated from the data on excess cases and excess rates in 1993 was 30,474, and the estimated implementation rate of working environment measurement was 34.3% of companies in Korean industry. These results also indicated the comparative measurement rate in the manufacture of pulp and paper products and the manufacture of machinery and equipment n.e.c., and a high measurement rate and excess rate in the manufacture of electrical machinery and apparatus and the manufacture of basic metals.
