• Title/Summary/Keyword: Large Scale Data


Scale Efficiency and Fishing Capacity Analysis for Large Pair-Trawl Vessels in Korean Waters (한국 근해 쌍끌이 대형기선저인망어선의 규모별 효율성과 어획능력 활용도 평가)

  • Lee, Dong-Woo;Lee, Jae-Bong;Jung, Suk-Geun;Kim, Yeong-Hye
    • Korean Journal of Fisheries and Aquatic Sciences / v.41 no.6 / pp.485-492 / 2008
  • To propose proper vessel characteristics for sustainable fisheries in Korean waters, we analyzed the fishing capacity, scale efficiency, and capacity utilization of large pair-trawl vessels by applying data envelopment analysis (DEA) to a 1990 database of catch, effort, and vessel characteristics (gross tonnage and engine power). The input factors were gross tonnage, horsepower, and days operated, whereas the output factor was the expected catch by vessel characteristics. The optimal vessel type, selected based on input-oriented technical efficiency and gross tonnage, was 100 GT with engine power <600 HP. The output-oriented unbiased estimate of capacity utilization (CD) decreased with increasing vessel tonnage, and for vessels of the same tonnage, CD decreased with increasing engine power.
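
The core computation here, input-oriented technical efficiency under DEA, reduces to one linear program per vessel (decision-making unit). A minimal sketch with SciPy follows, using hypothetical toy numbers rather than the paper's 1990 fleet database:

```python
import numpy as np
from scipy.optimize import linprog

def dea_input_efficiency(X, Y, o):
    """Input-oriented CCR efficiency of unit o.
    X: (m, n) inputs; Y: (s, n) outputs; columns are vessels (DMUs)."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.zeros(n + 1)
    c[0] = 1.0                                   # minimise theta
    A_in = np.hstack([-X[:, [o]], X])            # sum_j l_j x_ij <= theta * x_io
    A_out = np.hstack([np.zeros((s, 1)), -Y])    # sum_j l_j y_rj >= y_ro
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([np.zeros(m), -Y[:, o]])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.fun                               # theta in (0, 1]; 1 = efficient

# Toy data (hypothetical): 3 vessels; inputs = gross tonnage, horsepower,
# days operated; output = catch.
X = np.array([[100, 120, 140], [600, 700, 900], [200, 210, 220]], dtype=float)
Y = np.array([[500, 520, 530]], dtype=float)
print([round(dea_input_efficiency(X, Y, o), 3) for o in range(3)])
```

A score of 1 marks a vessel on the efficient frontier; smaller values indicate how far inputs could be scaled down while keeping the same catch.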

3D Segmentation for High-Resolution Image Datasets Using a Commercial Editing Tool in the IoT Environment

  • Kwon, Koojoo;Shin, Byeong-Seok
    • Journal of Information Processing Systems / v.13 no.5 / pp.1126-1134 / 2017
  • A variety of medical service applications in the field of the Internet of Things (IoT) are being studied. Segmentation is important for identifying meaningful regions in images and is also required for 3D images. Previous methods have been based on gray value and shape. The Visible Korean dataset consists of serially sectioned high-resolution color images. Unlike computed tomography or magnetic resonance images, color images are hard to segment automatically because object boundaries are much more difficult to detect than in grayscale images. Therefore, skilled anatomists usually segment color images manually or semi-automatically. We present an out-of-core 3D segmentation method for large-scale image datasets. Our method can segment significant regions in the coronal and sagittal planes, as well as the axial plane, to produce a 3D image, and our system verifies the result interactively with a multi-planar reconstruction view and a 3D view. The system can be used to train unskilled anatomists and medical students. It also allows a skilled anatomist to segment an image remotely, since transferring such large amounts of data is difficult.
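
The out-of-core requirement means a slice in any plane must be readable without loading the full color volume into memory. A hedged sketch of that access pattern, assuming a hypothetical raw RGB file layout (not the authors' actual format):

```python
import numpy as np

# The sectioned RGB volume lives on disk and is memory-mapped, so a
# slice in any plane touches only the pages it needs instead of pulling
# the whole dataset into RAM.  Tiny stand-in dimensions for the demo.
depth, height, width = 64, 48, 32
vol = np.memmap("volume_rgb.raw", dtype=np.uint8, mode="w+",
                shape=(depth, height, width, 3))   # creates the demo file

axial    = vol[depth // 2]                   # axial slice    (height, width, 3)
coronal  = vol[:, height // 2]               # coronal slice  (depth, width, 3)
sagittal = vol[:, :, width // 2]             # sagittal slice (depth, height, 3)
print(axial.shape, coronal.shape, sagittal.shape)
```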

DDS/SDN integration architecture with real-time support for large-scale distributed simulation environments (대규모 분산 시뮬레이션 환경을 위한 실시간성 지원 DDS/SDN 통합 아키텍쳐)

  • Kim, Daol;Joe, Inwhee;Kim, Wontae
    • Journal of IKEEE / v.22 no.1 / pp.136-142 / 2018
  • As systems under development have grown larger, sequential simulation has become impractical for verifying systems that run for a long time or require real-time results. Distributed simulation systems, which split a simulation across several processes, have therefore been studied. Simulating real-time systems requires efficient data exchange between the distributed nodes. Data Distribution Service (DDS) is data-centric communication middleware proposed by the Object Management Group that provides efficient data exchange and a variety of QoS policies. However, in a large-scale distributed simulation environment spread over a wide area, domain separation causes problems with participant discovery and QoS guarantees during data exchange. In this paper, we therefore propose a DDS/SDN architecture that can guarantee QoS and effective participant discovery in an SDN-based network.
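
For context, DDS participant discovery (SPDP) is built on periodic UDP multicast announcements to a well-known group; in a wide-area deployment such packets do not cross separated domains, which is the discovery gap the proposed DDS/SDN integration targets. A conceptual Python sketch of one such announcement (illustrative only, not the DDS API):

```python
import socket

# SPDP-style announcement: DDS participants advertise themselves via
# UDP multicast; SDN flow rules can steer this traffic across domains.
GROUP, PORT = "239.255.0.1", 7400    # default RTPS discovery group, domain 0

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
sock.sendto(b"participant-announcement", (GROUP, PORT))
sock.close()
```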

An Analysis for Price Determinants of Small and Medium-sized Office Buildings Using Data Mining Method in Gangnam-gu (데이터마이닝기법을 활용한 강남구 중소형 오피스빌딩의 매매가격 결정요인 분석)

  • Mun, Keun-Sik;Choi, Jae-Gyu;Lee, Hyun-seok
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.414-427 / 2015
  • Most studies of the office market have focused on large-scale office buildings; research on small and medium-sized office buildings is scarce, largely because of a lack of data. This study uses a self-collected dataset of 1,056 transactions in Gangnam-gu and estimates price models using not only linear regression but also data mining methods. The results provide investors with information on the price determinants of small and medium-sized office buildings compared with large-scale office buildings. The important variables include street frontage condition, commercial zoning, and distance to the subway station.
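
As an illustration of the data-mining side of such an analysis, a hedged sketch on synthetic data (variable names mirror the determinants listed above; the study's actual models and dataset are not reproduced):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic hedonic data: price driven by frontage, zoning, subway distance.
rng = np.random.default_rng(0)
X = rng.random((1056, 3))            # frontage, commercial zoning, subway distance
y = 5 + 3*X[:, 0] + 2*X[:, 1] - 1.5*X[:, 2] + rng.normal(0, 0.1, 1056)

# A tree ensemble yields the kind of variable-importance ranking the
# abstract reports, complementing linear-regression coefficients.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, imp in zip(["frontage", "commercial_zoning", "subway_distance"],
                     model.feature_importances_):
    print(f"{name}: {imp:.2f}")
```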

An Energy Efficient RF Protocol Structure for a Large-Scale In-Home Display Deployment (대규모 In-Home Display 보급을 위한 에너지 효율적 RF 통신 프로토콜 체계)

  • Lee, Seung-Min;Son, Sung-Yong
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.4 no.1 / pp.53-60 / 2011
  • In-Home Display (IHD) is one of the most popular ways to induce voluntary customer participation in energy savings. Various communication technologies are used in recent IHD implementations, but most IHD systems are designed per house because of limitations such as communication coverage and operational complexity. In this study, 400 MHz RF communication is used for the economical large-scale deployment of IHDs, especially in the apartment complexes that represent the typical residential environment in Korea. Since internal batteries are essential to the usability of an IHD, frequent battery replacement must be avoided. By dividing communication data into three types (common data, long-term data, and short-term data) according to their update periods, an energy-efficient communication protocol is designed and proposed. As a result, the quantity of data and the battery consumption of the IHD are reduced to 23.4% and 31.5%, respectively, without harming service quality.
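
The key design idea, grouping messages by update period so the radio wakes only rarely for slow-changing data, can be sketched as follows (class names and periods are assumptions, not the paper's actual parameters):

```python
from dataclasses import dataclass

@dataclass
class MessageClass:
    name: str
    period_s: int          # broadcast interval in seconds

# Slow-changing data is broadcast rarely, cutting radio-on time and
# battery drain; fast-changing data keeps its short period.
CLASSES = [
    MessageClass("common",     3600),   # e.g. tariff table, rarely changes
    MessageClass("long_term",   900),   # e.g. cumulative usage totals
    MessageClass("short_term",   10),   # e.g. current power draw
]

def due(now_s: int):
    """Return the message classes scheduled for transmission at second now_s."""
    return [c.name for c in CLASSES if now_s % c.period_s == 0]

print(due(3600))   # ['common', 'long_term', 'short_term']
print(due(10))     # ['short_term']
```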

Processing large-scale data with Apache Spark (Apache Spark를 활용한 대용량 데이터의 처리)

  • Ko, Seyoon;Won, Joong-Ho
    • The Korean Journal of Applied Statistics / v.29 no.6 / pp.1077-1094 / 2016
  • Apache Spark is a fast and general-purpose cluster computing package. It provides a new abstraction, the resilient distributed dataset (RDD), which supports fault tolerance while keeping data in memory. This abstraction yields a significant speedup over MapReduce, the legacy large-scale data framework. In particular, Spark is suitable for iterative machine learning applications, such as logistic regression and K-means clustering, and for interactive data querying. Thanks to its versatility, Spark also provides high-level libraries for machine learning, streaming data processing, database querying, and graph mining. In this work, we introduce the concept and programming model of Spark and show implementations of simple statistical computing applications. We also review the machine learning package MLlib and the R language interface SparkR.
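
As a minimal illustration of the iterative workload Spark accelerates, a PySpark sketch of K-means in which the cached DataFrame keeps data in memory across iterations (the paper demonstrates R/SparkR; this Python equivalent is for illustration only):

```python
from pyspark.sql import SparkSession
from pyspark.ml.clustering import KMeans
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("kmeans-sketch").getOrCreate()

# Tiny in-memory dataset standing in for a large distributed one.
df = spark.createDataFrame([(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)],
                           ["x", "y"])
features = VectorAssembler(inputCols=["x", "y"],
                           outputCol="features").transform(df)

# cache() keeps the data in memory across Lloyd iterations -- the
# source of Spark's speedup over MapReduce for iterative algorithms.
model = KMeans(k=2, seed=1).fit(features.cache())
print(model.clusterCenters())
spark.stop()
```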

Harmony Search for Virtual Machine Replacement (화음 탐색법을 활용한 가상머신 재배치 연구)

  • Choi, Jae-Ho;Kim, Jang-Yeop;Seo, Young Jin;Kim, Young-Hyun
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.2 / pp.26-35 / 2019
  • Data centers consume a great deal of power, not only for servers, storage, and networking devices but also for cooling, air conditioning, and emergency power facilities. In the United States, data centers accounted for 1.8% of total power consumption in 2004. The data center industry has grown to a large scale, and the number of large hyperscale data centers is expected to grow further. However, data center servers are not used effectively: the average occupancy rate is only about 15% to 20%. To address this, we propose a virtual machine reallocation method that uses the virtual machine migration function. We apply a meta-heuristic, harmony search, to reallocate virtual machines effectively, formulating the reallocation problem with the goal of maximizing the number of idle servers and solving it through experiments. By solving this problem, the study aims to reduce the under-utilization of data center servers and their power consumption simultaneously.
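
A compact sketch of harmony search applied to the reallocation objective described above, maximizing idle servers under a capacity constraint (all loads, capacities, and parameters are hypothetical):

```python
import random

VM_LOAD  = [15, 20, 10, 30, 25, 5, 40, 10]   # CPU % each VM needs (assumed)
SERVERS  = 5
CAPACITY = 100
HMS, HMCR, PAR, ITERS = 10, 0.9, 0.3, 2000   # harmony-search parameters

def fitness(assign):
    """Number of idle servers; -1 if any server is over capacity."""
    used = [0] * SERVERS
    for vm, srv in enumerate(assign):
        used[srv] += VM_LOAD[vm]
    if any(u > CAPACITY for u in used):
        return -1
    return used.count(0)

def random_harmony():
    return [random.randrange(SERVERS) for _ in VM_LOAD]

memory = [random_harmony() for _ in range(HMS)]
for _ in range(ITERS):
    new = []
    for vm in range(len(VM_LOAD)):
        if random.random() < HMCR:            # memory consideration
            val = random.choice(memory)[vm]
            if random.random() < PAR:         # pitch adjustment
                val = (val + random.choice([-1, 1])) % SERVERS
        else:                                 # random selection
            val = random.randrange(SERVERS)
        new.append(val)
    worst = min(range(HMS), key=lambda i: fitness(memory[i]))
    if fitness(new) > fitness(memory[worst]):
        memory[worst] = new                   # replace worst harmony

best = max(memory, key=fitness)
print("idle servers:", fitness(best), "assignment:", best)
```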

Movable-Bed Modeling Law for Beach Response Experiments Using Equilibrium Beach Profile Formula (평형해빈단면식을 이용한 해빈반응실험에 대한 이동상 모형법)

  • Kim, Jin Hoon;Kim, In Ho;Lee, Jung Lyul
    • Journal of Ocean Engineering and Technology / v.32 no.5 / pp.351-360 / 2018
  • The construction of large-scale harbor structures at Maengbang beach, on the eastern coast of Korea, is of great concern because it may cause disastrous beach erosion in the vicinity. A hydraulic model experiment was therefore conducted to examine the morphological changes after such construction. The water depth was scaled using the well-known scaling method of Van Rijn (2010), but the results appeared to be overestimated. The present study developed a new scale law that applies an equilibrium beach profile formula to scale the model evolution up to the prototype scale. When compared with survey data observed at Maengbang beach, the proposed method showed better agreement than the method of Van Rijn (2010).
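
For reference, the equilibrium beach profile such scale laws commonly build on is Dean's (1977) form h(x) = A·x^(2/3). A hedged numeric sketch (the A values and scales below are assumptions, not the paper's own scale law):

```python
def dean_profile(x, A):
    """Equilibrium depth h at distance x offshore (Dean, 1977): h = A * x^(2/3)."""
    return A * x ** (2 / 3)

A_proto, A_model = 0.10, 0.18     # assumed sediment scale parameters (m^(1/3))
x_scale = 1 / 100                 # assumed horizontal model scale

x_proto = 500.0                   # distance offshore in the prototype (m)
h_proto = dean_profile(x_proto, A_proto)
h_model = dean_profile(x_proto * x_scale, A_model)
print(f"prototype depth {h_proto:.2f} m -> model depth {h_model:.3f} m")
```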

Evaluating Perceived Smartness of Product from Consumer's Point of View: The Concept and Measurement

  • Lee, Won-Jun
    • The Journal of Asian Finance, Economics and Business / v.6 no.1 / pp.149-158 / 2019
  • Due to the rapid development of information technology (IT) and the internet, products are becoming smart: they can collect, process, and produce information, and can reason for themselves to provide better service to consumers. However, research on the characteristics of smart products is still sparse. In this paper, we report the systematic development of a scale to measure perceived product smartness. The study follows the standard scale-development process of item generation, item reduction, scale validation, and reliability and validity testing. After acquiring a large amount of qualitative interview data on definitions of a smart product, we add a distinctive step that reduces the initial items using both text mining with the R software and traditional reliability and validity tests, including factor analysis. Based on the initial qualitative inquiry and a subsequent quantitative survey, an eight-factor scale of product smartness is developed: multi-functionality, human-like touch, ability to cooperate, autonomy, situatedness, network connectivity, integrity, and learning capability. Results from Korean samples support the proposed measure in terms of reliability, validity, and dimensionality. Implications and directions for further study are discussed; the developed scale offers important theoretical and pragmatic implications for researchers and practitioners.
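
One of the item-reduction checks mentioned above is reliability testing; a brief sketch of Cronbach's alpha on synthetic Likert-style responses (illustrative only, not the study's data or R code):

```python
import numpy as np

def cronbach_alpha(items):
    """items: (respondents, k) response matrix.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)

# Four correlated items driven by one latent trait (synthetic data).
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 1))
responses = latent + rng.normal(0, 0.5, size=(200, 4))
print(f"alpha = {cronbach_alpha(responses):.2f}")   # high alpha -> items kept
```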

RHadoop platform for K-Means clustering of big data (빅데이터 K-평균 클러스터링을 위한 RHadoop 플랫폼)

  • Shin, Ji Eun;Oh, Yoon Sik;Lim, Dong Hoon
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.609-619 / 2016
  • RHadoop is a collection of R packages that allow users to manage and analyze data with Hadoop. In this paper, we implement the K-means algorithm on the MapReduce framework with RHadoop to make the clustering method applicable to large-scale data. The main idea is to introduce a combiner on the map output to decrease the amount of data that the reducers must process. We show that our K-means implementation with a combiner is faster than the version without one as the data size increases. We also implement the elbow method with MapReduce to find the optimal number of clusters for K-means on large datasets. On small data, our MapReduce implementation of the elbow method and the classical kmeans() in R gave similar results.
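
The combiner idea can be sketched without Hadoop: each map-side split emits per-cluster partial sums and counts instead of raw points, so the reduce step aggregates only a handful of records per split (a pure-Python illustration, not the authors' RHadoop code):

```python
import numpy as np

def map_combine(points, centers):
    """Map + combine: emit {cluster: (partial_sum, count)} for one split."""
    keys = np.argmin(((points[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    return {k: (points[keys == k].sum(0), int((keys == k).sum()))
            for k in np.unique(keys)}

def reduce_step(partials, k, dim):
    """Reduce: merge partial sums/counts into new cluster centers."""
    sums, counts = np.zeros((k, dim)), np.zeros(k)
    for part in partials:
        for key, (s, c) in part.items():
            sums[key] += s
            counts[key] += c
    return sums / np.maximum(counts, 1)[:, None]

rng = np.random.default_rng(0)
data = rng.normal(size=(10000, 2)) + rng.integers(0, 3, 10000)[:, None] * 5
centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
for _ in range(10):                           # Lloyd iterations
    splits = np.array_split(data, 4)          # 4 simulated mappers
    centers = reduce_step([map_combine(s, centers) for s in splits], 3, 2)
print(centers.round(2))
```

Without the combiner, every mapper would ship all of its points to the reducers; with it, each split contributes at most one record per cluster, which is the shuffle-volume saving the abstract reports.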