• Title/Summary/Keyword: Science-based Cluster

Progress Report on the Relationship Between the Bright and Faint Galaxies in Abell 3659

  • Lee, Hye-Ran;Lee, Joon Hyeop;Kim, Minjin;Oh, Seulhee;Ree, Chang Hee;Jeong, Hyunjin;Kyeong, Jaemann;Kim, Sang Chul;Lee, Jong Chul;Ko, Jongwan;Park, Byeong-Gon;Sheen, Yun-Kyeong
    • The Bulletin of The Korean Astronomical Society / v.38 no.2 / pp.55.1-55.1 / 2013
  • The properties of bright galaxies are closely related to those of their neighbors and satellite galaxies. However, the effects of nearby companions are known to be very weak in a galaxy cluster when the companions are bright galaxies. On the other hand, it has so far been unclear whether the properties of bright galaxies are affected by their faint satellites in a galaxy cluster. Recently, J. H. Lee et al. (in preparation) have found that the colors of bright galaxies in WHL J085910.0+294957, a galaxy cluster at z = 0.3, show a measurable correlation with the mean colors of faint galaxies around them. To confirm that result and to investigate how the host-satellite relationship depends on cluster properties, we carry out follow-up studies of a few galaxy clusters, beginning with Abell 3659 (z ~ 0.0907), imaged in the g' and r' bands using IMACS on the Magellan (Baade) 6.5m telescope. Cluster members are selected based on the distributions of color, size, and concentration as a function of magnitude, and on their spatial distribution. In this poster, we present some preliminary results: marginal correlations in color between bright galaxies and their faint companions are found in the central region of Abell 3659. The implication of these results is discussed.

Cluster Based Routing Protocol Using Fixed Cell in Mobile Ad hoc Networks (MANET)

  • 정종광;김재훈
    • Proceedings of the Korean Information Science Society Conference / 2002.04a / pp.583-585 / 2002
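  • In a mobile ad hoc network, wirelessly connected hosts can move freely, and, unlike a cellular network that relies on pre-installed wired infrastructure, the network consists solely of communication among the mobile hosts. Because the nodes in a mobile ad hoc network are highly mobile, determining the routing path for each node is important. Accordingly, many routing protocols for mobile ad hoc networks have been proposed. In this paper, we modify the previously proposed Cluster Based Routing Protocol (CBRP): a fixed location is defined as a cell, much like a cell in a cellular network, and each cell forms a cluster, which reduces routing overhead.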
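
The fixed-cell idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the cell shape, cell size, or how a cluster head is chosen, so the square grid, the `CELL_SIZE` value, and the function names below are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical edge length of a square cell, in metres (not from the paper).
CELL_SIZE = 250.0


def cell_of(x, y, cell_size=CELL_SIZE):
    """Return the (column, row) index of the fixed square cell containing (x, y)."""
    # Floor division keeps negative coordinates in the correct cell.
    return (int(x // cell_size), int(y // cell_size))


def form_clusters(positions, cell_size=CELL_SIZE):
    """Group node ids by the fixed cell they currently occupy.

    positions: dict mapping node id -> (x, y) coordinates.
    Each occupied cell is treated as one cluster.
    """
    clusters = defaultdict(list)
    for node_id, (x, y) in positions.items():
        clusters[cell_of(x, y, cell_size)].append(node_id)
    return dict(clusters)


positions = {"a": (10.0, 20.0), "b": (240.0, 100.0), "c": (260.0, 100.0)}
clusters = form_clusters(positions)
```

Because cluster membership depends only on a node's current position, a node that moves changes cluster only when it crosses a cell boundary, which is one plausible reading of how fixed cells could reduce cluster maintenance traffic.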

Semantic-Based K-Means Clustering for Microblogs Exploiting Folksonomy

  • Heu, Jee-Uk
    • Journal of Information Processing Systems / v.14 no.6 / pp.1438-1444 / 2018
  • Recently, with the development of Internet technologies and the spread of smart devices, the use of microblogs such as Facebook, Twitter, and Instagram has been rapidly increasing. Many users check for new information on microblogs because the content on their timelines is continually updated. Therefore, clustering algorithms are necessary to arrange the content of microblogs by grouping it for a user who wants to get the newest information. However, microblogs have word limits, so there is not enough information to analyze for content clustering. In this paper, we propose a semantic-based K-means clustering algorithm that not only measures the similarity between the data represented as a vector space model, but also measures the semantic similarity between the data by exploiting the TagCluster for clustering. Through experimental results on the RepLab2013 Twitter dataset, we show the effectiveness of the semantic-based K-means clustering algorithm.
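
As a rough illustration of combining a vector-space similarity with a tag-based semantic similarity, the Python sketch below mixes cosine similarity over term weights with Jaccard overlap of tag sets. The abstract does not describe how TagCluster scores are computed or combined, so the Jaccard stand-in, the `alpha` weight, and the post layout (`"terms"`, `"tags"`) are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(s, t):
    """Jaccard overlap of two tag sets (a stand-in for a semantic measure)."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def combined_similarity(post_a, post_b, alpha=0.5):
    """Weighted mix of lexical and tag-based similarity; alpha is hypothetical."""
    lexical = cosine(post_a["terms"], post_b["terms"])
    semantic = jaccard(post_a["tags"], post_b["tags"])
    return alpha * lexical + (1.0 - alpha) * semantic


p1 = {"terms": {"galaxy": 1.0, "cluster": 2.0}, "tags": {"astronomy"}}
p2 = {"terms": {"nebula": 1.0}, "tags": {"astronomy"}}
score = combined_similarity(p1, p2)  # no shared words, but a shared tag
```

A K-means variant could use `1 - combined_similarity(...)` as its distance when assigning posts to centroids, so that two short posts with no words in common can still be grouped by shared tags.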

Scaling of Hadoop Cluster for Cost-Effective Processing of MapReduce Applications

  • Ryu, Woo-Seok
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.15 no.1 / pp.107-114 / 2020
  • This paper studies a method for estimating the scale of a Hadoop cluster to process big data in a cost-effective manner. In the case of medical institutions, demand for cloud-based big data analysis is increasing because medical records can now be stored outside the hospital. This paper first analyzes the Amazon EMR framework, one of the popular cloud-based big data frameworks. It then presents an efficiency model for scaling the Hadoop cluster to execute a MapReduce application more cost-effectively. It also analyzes the factors that influence the execution of the MapReduce application by performing several experiments under various conditions. The cost efficiency of big data analysis can be increased by choosing the cluster scale that gives the most efficient processing time relative to the operational cost.

A Novel Node Management in Hadoop Cluster by using DNA

  • Balaraju. J;PVRD. Prasada Rao
    • International Journal of Computer Science & Network Security / v.23 no.9 / pp.134-140 / 2023
  • Distributed systems play a vital role in storing and processing big data, and data generation from various sources is increasing rapidly every second. Hadoop is a scalable and efficient distributed system that supports commodity hardware by combining different networks in the topographical locality. The number of nodes supported in a Hadoop cluster has grown rapidly across versions, which makes clusters difficult to manage. Hadoop does not provide node management features for adding and deleting nodes. Node identification in a cluster depends entirely on DHCP servers, which manage IP addresses and hostnames based on the physical (MAC) address of each node. This leaves scope for an attacker to steal data using an IP address or hostname, and to disrupt the distributed system by adding a malicious node or assigning a duplicate IP address. This paper proposes a novel node management scheme for distributed systems that uses DNA hiding and generates a unique key from the unique physical (MAC) address and hostname of each node. The proposed mechanism provides better node management for the Hadoop cluster, offering a way to add and delete nodes with limited computation, and better protects nodes from attackers. The main goal of this paper is to propose an algorithm that hides node information in DNA sequences to increase node security against attackers.
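
The abstract does not specify how node information is mapped onto DNA sequences. One common convention, assumed here purely for illustration, packs two bits into each nucleotide (00→A, 01→C, 10→G, 11→T). The Python sketch below encodes a MAC address and hostname that way; the function names and the `|` separator are hypothetical, and the encoding alone provides no cryptographic protection.

```python
# Assumed 2-bit mapping: index 0b00 -> A, 0b01 -> C, 0b10 -> G, 0b11 -> T.
BASES = "ACGT"


def to_dna(data: bytes) -> str:
    """Encode bytes as a nucleotide string, four bases per byte (high bits first)."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def from_dna(seq: str) -> bytes:
    """Decode a nucleotide string produced by to_dna back into bytes."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def node_sequence(mac: str, hostname: str) -> str:
    """Hypothetical helper: represent a node's MAC address and hostname as DNA."""
    return to_dna(f"{mac}|{hostname}".encode("utf-8"))


seq = node_sequence("00:1a:2b:3c:4d:5e", "datanode-01")
```

Decoding with `from_dna(seq)` recovers the original bytes, so a scheme along these lines would still need a secret key or a one-way step to keep the node identity hidden from an attacker.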

Structural Parameters of Galaxies in the Virgo Cluster

  • Kim, Suk;Yi, Wonhyeong;Rey, Soo-Chang;Sung, Eon-Chang;Jerjen, Helmut;Lisker, Thorsten;Lee, Youngdae;Lee, Woong;Chung, Jiwon;Pak, Mina
    • The Bulletin of The Korean Astronomical Society / v.38 no.2 / pp.47.1-47.1 / 2013
  • We present structural parameters of galaxies in the Extended Virgo Cluster Catalog (EVCC), a new catalog of galaxies in the Virgo cluster using homogeneous Sloan Digital Sky Survey (SDSS) Data Release 7 (DR7) data. The EVCC covers a more extended region of the Virgo cluster than the Virgo Cluster Catalog (VCC) and presents updated morphologies of galaxies using multi-band images and spectral features. We obtain the surface brightness profiles of galaxies using the ellipse task in IRAF. Based on the analysis of the surface brightness profiles, we construct a catalog of various structural parameters of galaxies, i.e., central surface brightness, effective radius, Sérsic index, effective surface brightness, and mean effective surface brightness. Taking advantage of these structural parameters in various parameter spaces, we refine the criteria for dividing giant elliptical and dwarf elliptical galaxies. In addition, we find that bulge-dominated galaxies have larger Sérsic indices and brighter central surface brightness than disk-dominated galaxies. At fixed magnitude, dwarf elliptical galaxies, dwarf lenticular galaxies, and dwarf irregular low surface brightness (LSB) galaxies show larger effective radii than giant elliptical galaxies, giant lenticular galaxies, and irregular high surface brightness (HSB) galaxies, respectively. Dwarf elliptical galaxies and dwarf irregular LSB galaxies occupy similar regions of structural parameter space. We suggest that giant elliptical galaxies and dwarf elliptical galaxies may have different origins.
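
For reference, the structural parameters named in this abstract are conventionally tied together by the Sérsic profile. The standard form is given below as background; the abstract does not state the exact fitting function the authors used. Here $R_e$ is the effective (half-light) radius, $I_e$ and $\mu_e$ are the intensity and surface brightness at $R_e$, $n$ is the Sérsic index, $\mu_0$ is the central surface brightness, and $b_n$ is the constant chosen so that $R_e$ encloses half of the total light.

```latex
\begin{align}
  I(R)   &= I_e \exp\left\{ -b_n \left[ \left( \frac{R}{R_e} \right)^{1/n} - 1 \right] \right\} \\
  \mu(R) &= \mu_e + \frac{2.5\, b_n}{\ln 10} \left[ \left( \frac{R}{R_e} \right)^{1/n} - 1 \right] \\
  \mu_0  &= \mu(0) = \mu_e - \frac{2.5\, b_n}{\ln 10}
\end{align}
```

Since $b_n$ grows with $n$, the last line shows that at fixed $\mu_e$ a larger Sérsic index implies a brighter (numerically smaller) central surface brightness.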

HyperDB - A High Performance Data Analysis System Based on Grid Computing Technology

  • Kim, Tae-Kyung;Na, Jong-Hwa;Chon, Wan-Sup
    • Journal of the Korean Data and Information Science Society / v.18 no.1 / pp.161-174 / 2007
  • In this paper, we propose a high-performance database cluster system called HyperDB to process OLAP queries efficiently. HyperDB is a virtual database system running on top of Internet-connected PCs; the PCs are used for their own purposes during normal hours, but they can participate in the database cluster system outside office hours. We propose a fully logical replication technique and an optimal parallel intra-query routing technique for extensibility and performance. Experiments with the TPC-R benchmark show a significant performance improvement over conventional approaches.

On the Categorical Variable Clustering

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.7 no.2 / pp.219-226 / 1996
  • The basic objective of cluster analysis is to discover natural groupings of items or variables. In general, variable clustering has been conducted using similarity measures between variables that have binary characteristics. We propose a variable clustering method for variables that have more than two categories, ordered in some sense. We also consider some measures of association as a similarity between variables. A numerical example is included.

Back-end Prefetching Scheme for Improving the Performance of Cluster-based Web Servers

  • Park, Seon-Yeong;Park, Do-Hyeon;Lee, Joon-Won;Cho, Jung-Wan
    • Journal of KIISE: Computer Systems and Theory / v.29 no.5 / pp.265-273 / 2002
  • With the explosive growth of WWW traffic, there is an increasing demand for high-performance Web servers that provide a stable Web service to users. The cluster-based Web server is a solution for coping with heavy access from users, since the server can easily be scaled according to the load. In a cluster-based Web server, a back-end node may not be able to serve some HTTP requests directly because it does not have the requested contents in its main memory. In this case, the back-end node has to retrieve the requested contents from its local disk or from other back-end nodes in the cluster. To reduce service latency, we introduce a new prefetch scheme. The back-end nodes predict the next HTTP requests and prefetch the contents of the predicted requests before those requests arrive. We develop three prefetch algorithms based on useful information gathered from many clients' HTTP requests. In trace-driven simulation, the prefetch scheme reduces service latency by 10 to 25% compared with no prefetching. Among the proposed prefetch algorithms, the Time and Access Probability-based Prefetch (TAP2) algorithm, which uses the access probability and the inter-reference time of Web objects, shows the best performance.
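
The prediction step described here can be sketched in Python with a simple access-probability model. This is not the paper's TAP2 algorithm, whose details (including its use of inter-reference time) are not given in the abstract; the first-order transition counts, the `threshold` parameter, and the class name are assumptions for illustration only.

```python
from collections import Counter, defaultdict


class NextRequestPredictor:
    """Predict the next requested URL from observed request sequences."""

    def __init__(self, threshold=0.5):
        # Hypothetical cut-off: prefetch only when the estimated
        # access probability of the best candidate reaches this value.
        self.threshold = threshold
        self.transitions = defaultdict(Counter)

    def observe(self, trace):
        """Count how often each URL is followed by each other URL."""
        for current, following in zip(trace, trace[1:]):
            self.transitions[current][following] += 1

    def predict(self, current):
        """Return the URL worth prefetching after `current`, or None."""
        counts = self.transitions.get(current)
        if not counts:
            return None
        url, hits = counts.most_common(1)[0]
        probability = hits / sum(counts.values())
        return url if probability >= self.threshold else None


predictor = NextRequestPredictor(threshold=0.5)
predictor.observe(["/a", "/b", "/a", "/b", "/a", "/c"])
candidate = predictor.predict("/a")  # "/b" followed "/a" in 2 of 3 cases
```

A back-end node could call `predict` after serving each request and load the returned object into main memory ahead of time; raising `threshold` trades fewer wasted prefetches for fewer hits.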

Known-Item Retrieval Performance of a PICO-based Medical Question Answering Engine

  • Vong, Wan-Tze;Then, Patrick Hang Hui
    • Asia Pacific Journal of Information Systems / v.25 no.4 / pp.686-711 / 2015
  • The performance of a novel medical question-answering engine called CliniCluster and of existing search engines, such as CQA-1.0, Google, and Google Scholar, was evaluated using known-item searching. A known-item is a document that has been critically appraised to be highly relevant to a therapy question. Results show that, using CliniCluster, known-items were retrieved on average at rank 2 (MRR@10 ≈ 0.50), and most of the known-items could be identified from the top-10 document lists. In response to ill-defined questions, the known-items were ranked lower by CliniCluster and CQA-1.0, whereas for Google and Google Scholar, no significant difference in ranking was found between well- and ill-defined questions. Less than 40% of the known-items could be identified from the top-10 documents retrieved by CQA-1.0, Google, and Google Scholar. An analysis of the top-ranked documents by strength of evidence revealed that CliniCluster outperformed the other search engines by providing a higher number of recent publications with the highest study design. In conclusion, the overall results support the use of CliniCluster in answering therapy questions by ranking highly relevant documents in the top positions of the search results.
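
The MRR@10 figure quoted in this abstract is a mean reciprocal rank truncated at the top 10 results. A minimal Python sketch of that metric follows; the function name and the input convention (one 1-based rank per question, `None` when the known-item was not retrieved) are assumptions, not the authors' evaluation code.

```python
def mrr_at_k(ranks, k=10):
    """Mean reciprocal rank of the known-item, counting only the top k results.

    ranks: one entry per question, the 1-based rank at which the known-item
    was retrieved, or None if it was not retrieved at all.
    Questions whose known-item falls outside the top k contribute 0.
    """
    if not ranks:
        return 0.0
    total = sum(1.0 / r for r in ranks if r is not None and r <= k)
    return total / len(ranks)


# If every known-item is found at rank 2, the score is exactly 0.5.
example = mrr_at_k([2, 2, 2])
```

This shows why a score near 0.50 is consistent with known-items typically appearing at rank 2, although other mixes of ranks can produce the same mean.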