Data Transfer Nodes (DTNs)

Overview

Data Transfer Nodes (DTNs) at the Center for Computational Sciences (CCS) provide dedicated, high-performance endpoints for moving large datasets into and out of our HPC systems. They are optimized for wide-area data movement, inter-cluster transfers, and transfers between CCS storage and external collaborators.

DTNs are part of the CCS Research Network and follow a Science DMZ design pattern, enabling predictable, high-throughput transfers without the bottlenecks present on general-purpose campus networks.


Why Use a DTN?

Login nodes and compute nodes are not designed for data movement.

DTNs provide:

  • High-bandwidth network paths tuned for scientific data

  • Optimized TCP settings and parallel I/O tools

  • Dedicated connectivity to external research networks (Internet2)

  • Access to shared storage systems (GPFS, NAS, Ceph where applicable)

  • Isolation from user workloads to ensure stability

Using a DTN keeps the login nodes responsive for other users and ensures your transfers run efficiently.


Available DTN Systems

CCS operates multiple DTNs associated with its clusters:

Cluster                           DTN Hostname
Lipscomb Compute Cluster (LCC)    dtn.ccs.uky.edu
Morgan Compute Cluster (MCC)      mcc-dtn.ccs.uky.edu
EduceLab Compute Cluster (ECC)    ecc-dtn.educelab.uky.edu

These systems provide identical functionality; choose the DTN associated with the cluster where your data resides.
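DTNs accept standard SSH logins with your CCS account credentials (the rsync example later on this page assumes the same access); replace username with your own account name:

ssh username@dtn.ccs.uky.edu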


Network Performance

  • Intra-campus research network: up to 100 Gbps connectivity between local compute/storage systems

  • External transfers (off-campus): routed through a 40 Gbps wide-area link connecting CCS to campus border routers and Internet2

  • DTNs are positioned to maximize throughput across both local and external paths

Actual performance depends on the remote endpoint, protocol, and storage system involved.
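As a rough back-of-the-envelope estimate that ignores protocol and storage overhead: moving 1 TB at a sustained 10 Gbps takes about 8×10^12 bits ÷ 10^10 bits/s ≈ 800 seconds (roughly 13 minutes), so multi-terabyte transfers are feasible in hours when the end-to-end path is healthy.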


Recommended Tools

DTNs support several optimized data-transfer tools:

Globus (Preferred)

  • Reliable, restartable, high-performance transfers

  • Suitable for large datasets or cross-institution collaborations

  • CCS Collections:

    • Lipscomb Compute Cluster (LCC)

    • Morgan Compute Cluster (MCC)

    • EduceLab Compute Cluster (ECC)

Globus installation and documentation: https://ukyrcd.atlassian.net/wiki/spaces/RCDDocs/pages/162108436

Rclone

  • Useful for S3, Google Drive, OneDrive, Box, and other cloud storage

  • Supports parallel transfers and checksum validation

  • Pre-installed on DTNs

Rclone documentation: https://ukyrcd.atlassian.net/wiki/spaces/RCDDocs/pages/162108905
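A sketch of a typical rclone transfer from CCS storage to a cloud bucket; the remote name s3remote and all paths here are placeholders, and remotes must first be configured with rclone config:

# Show remotes already configured for your account
rclone listremotes

# Copy a project directory to a bucket using 8 parallel streams with checksum verification
rclone copy /project/myproj s3remote:mybucket/myproj --transfers 8 --checksum --progress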

Command-Line Tools

  • rsync (flags tuned for high-latency links recommended; see the example after this list)

  • scp / sftp (not recommended for very large datasets)
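A minimal, restartable rsync invocation suited to high-latency wide-area links; the flags shown are common choices rather than a CCS-mandated set, and the hostname and paths are placeholders:

# -a archive, -v verbose, -h human-readable sizes, -z compress in transit;
# --partial keeps partially transferred files so an interrupted run can resume
rsync -avhz --partial --progress /local/data/ username@dtn.ccs.uky.edu:/project/myproj/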


Storage Access From DTNs

DTNs can access major CCS storage systems, including:

  • GPFS filesystems (/home, /project, /scratch, /pscratch)

  • NAS systems used for research groups or condo storage

  • Ceph object storage (only if explicitly configured for a project)

Access patterns depend on the cluster environment and user permissions.


Best Practices

  • Use DTNs for all large data transfers.

  • Avoid transferring large datasets on login nodes.

  • Prefer Globus for any multi-GB or multi-TB transfer.

  • Use parallel transfer options when available (rclone --transfers, rsync --partial --append-verify).

  • When moving many small files, consider creating a tarball first (see the sketch after this list), since per-file overhead dominates transfers of many small objects.
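A minimal sketch of the tarball approach; all paths are placeholders:

# Bundle the small files into a single compressed archive
tar -czf myproj_small_files.tar.gz -C /project/myproj many_small_files/

# Transfer one large archive instead of thousands of small files
rsync -avhP myproj_small_files.tar.gz username@dtn.ccs.uky.edu:/project/myproj/

# Unpack on the destination system
tar -xzf myproj_small_files.tar.gz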


Example Workflows

Transfer data to/from CCS using Globus

  1. Log in to https://www.globus.org

  2. Select a CCS endpoint (e.g., Lipscomb Compute Cluster (LCC))

  3. Select remote endpoint (e.g., national lab, collaborator institution, cloud bucket)

  4. Start transfer

  5. Globus handles retries, parallelism, and integrity checks automatically
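The same workflow can be scripted with the Globus CLI (globus-cli), assuming it is installed in your environment; the endpoint UUIDs below are placeholders you would look up first:

# Authenticate with Globus (opens a browser-based login flow)
globus login

# Look up endpoint UUIDs by display name
globus endpoint search "Lipscomb Compute Cluster"

# Start a recursive directory transfer; SRC_UUID and DST_UUID are placeholder endpoint IDs
globus transfer SRC_UUID:/project/myproj/ DST_UUID:/incoming/myproj/ --recursive --label "LCC transfer"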

Transfer data from a local machine using DTN + rsync

Run the following from your local machine to push a directory to project storage, replacing username and the paths as appropriate:

rsync -avhP /local/data/ username@dtn.ccs.uky.edu:/project/myproj/
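To pull data from CCS to your local machine instead, reverse the source and destination (again, run from the local machine):

rsync -avhP username@dtn.ccs.uky.edu:/project/myproj/ /local/data/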

Important Notes

  • No CCS storage system is backed up.
    Users are responsible for maintaining redundant copies of critical data.

  • DTNs are shared resources.
    Very long-running or extremely heavy transfers should be scheduled during off-peak hours.

  • If your transfer consistently underperforms, contact CCS for help benchmarking paths or tuning workflows.


Need Help?

If you need assistance choosing the best transfer method, optimizing throughput, or troubleshooting performance, submit a support request through the CCS Ticketing System.

Center for Computational Sciences