PERFORMANCE-GUARANTEED DATA REPLICATION FOR DATA-INTENSIVE SCIENTIFIC APPLICATIONS

Anirban Sam, Arpit Solanki

doi:10.64149/gjaets.12.4.19-23

Keywords:

Data Grid, Data Replication, Scheduling

Abstract

A Data Grid is composed of multiple interconnected sites organized in a hierarchical structure with one top-level site and several institutional sites. The top-level site functions as the central management unit and is responsible for maintaining the Replica Catalogue (RC), which stores metadata about data files and their replicas across different sites. Each site possesses both computational and storage capabilities, enabling job execution and data storage within the Grid environment. Communication within a site is assumed to have negligible delay due to high internal bandwidth.

In the data replication framework, multiple data files are generated by designated source sites, and a single site may act as the source for more than one file. Since each Grid node has limited storage capacity, it can replicate and store only a limited number of data files. Efficient replication strategies are therefore essential to optimize storage utilization, data availability, and overall system performance. This study focuses on addressing the data file replication problem in such a distributed Grid environment while considering storage constraints and system architecture.
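The model described above can be illustrated with a minimal sketch. It assumes a simplified setting that is not taken from the paper: storage is counted in abstract units, and the names `Site`, `ReplicaCatalogue`, `register`, and `lookup` are hypothetical. The sketch only shows how a catalogue kept by the top-level site might track replicas while each site enforces its storage limit; it is not the replication strategy studied in the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A Grid site with limited storage (units are hypothetical)."""
    name: str
    capacity: int                              # total storage available
    files: dict = field(default_factory=dict)  # file name -> size held here

    def free_space(self) -> int:
        return self.capacity - sum(self.files.values())


class ReplicaCatalogue:
    """Kept by the top-level site: maps each file to the sites holding it."""

    def __init__(self):
        self.locations = {}  # file name -> set of site names

    def register(self, file_name: str, site: Site, size: int) -> bool:
        """Store a copy at `site` if it fits; return whether it is held there."""
        if file_name in site.files:
            return True
        if size > site.free_space():
            return False  # storage constraint: the replica does not fit
        site.files[file_name] = size
        self.locations.setdefault(file_name, set()).add(site.name)
        return True

    def lookup(self, file_name: str) -> set:
        """Return the names of all sites that hold a copy of the file."""
        return set(self.locations.get(file_name, set()))


# Example: one source site producing two files, one small institutional site.
rc = ReplicaCatalogue()
source = Site("source", capacity=100)
inst = Site("institute-A", capacity=10)

rc.register("f1", source, 8)
rc.register("f2", source, 6)
stored_f1 = rc.register("f1", inst, 8)  # fits in the 10 available units
stored_f2 = rc.register("f2", inst, 6)  # rejected: only 2 units remain
```

In this sketch the institutional site can hold a replica of only one of the two files, so a replication strategy has to decide which file is worth the space.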
