Platform R: Data Ingress/Egress

Overview

What are the approved methods of data ingress?

Data may be transferred into Platform R from both Restricted Drive and Research Drive.

What is the approved method of data egress?

Currently, the only approved method of data egress from Platform R is through Restricted Drive.

Why is it important to transfer my data?

The scratch storage within Platform R is temporary and is not backed up, so any data left there may be lost.

Why are there limitations to how I transfer my data?

Due to the nature of the data that can be utilized within Platform R, it's essential to restrict and minimize unnecessary connections. By reducing the number of access points present, we lower the likelihood of data leaks as well as limit possible attack surfaces for individuals with malicious intent.

Looking for more information about data classifications or approved tools for moving and storing PHI?

Processes

Data Ingress

Transferring data into Platform R from Research Drive or Restricted Drive is as simple as using rsync to copy files and folders into your scratch group.

 ex: $ rsync -a /mnt/researchdrive/netid/sample_data /mnt/scratch/group/netid
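
A slightly fuller version of the ingress command is sketched below. It assumes sample_data is a folder, so it uses -a (archive mode) to copy recursively. The function name ingress and the SRC and DEST paths are placeholders based on the example above, not fixed Platform R locations.

```shell
#!/usr/bin/env bash
# Sketch: copy a folder from Research Drive into scratch and report the result.
# SRC and DEST are placeholder paths; substitute your own NetID and group.
SRC="/mnt/researchdrive/netid/sample_data"
DEST="/mnt/scratch/group/netid"

ingress() {
    # -a  archive mode (recursive copy that keeps timestamps and permissions)
    # -v  list each file as it is copied
    # -h  print sizes in human-readable units
    rsync -a -v -h "$SRC" "$DEST"
    local rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "Transfer complete: $SRC -> $DEST"
    else
        echo "Transfer failed (rsync exit code $rc)" >&2
        return 1
    fi
}
```

After setting SRC and DEST to your own paths, run ingress from the head node.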

Data Egress

Transferring data out of Platform R to Restricted Drive also uses rsync, this time to copy data from your scratch group.

 ex: $ rsync -a /mnt/scratch/group/netid/sample_data /mnt/restricteddrive/netid/processed
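
Because scratch storage is not backed up, it can be worth confirming that an egress copy is complete before relying on it. The sketch below is one possible approach, not an official procedure: it copies the folder, then runs a second rsync pass as a dry run that compares checksums and lists anything that still differs. The function name egress_and_verify and the paths are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: copy results out to Restricted Drive, then confirm the copy matches.
# SRC and DEST are placeholder paths; substitute your own NetID and group.
SRC="/mnt/scratch/group/netid/sample_data"
DEST="/mnt/restricteddrive/netid/processed"

egress_and_verify() {
    # Copy the folder in archive mode.
    rsync -a "$SRC" "$DEST" || return 1

    # Second pass: dry run (-n) that compares checksums (-c) and itemizes (-i)
    # every file that would still be transferred. No output means no differences.
    local diffs
    diffs=$(rsync -a -n -c -i "$SRC" "$DEST") || return 1

    if [ -z "$diffs" ]; then
        echo "Verified: destination matches source"
    else
        echo "Differences remain:" >&2
        echo "$diffs" >&2
        return 1
    fi
}
```

The checksum pass reads every file on both sides, so it can take a while on large folders.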

Helpful Tips

Important Notes

  • Data transfers to and from Platform R can only be run from the head node.
  • By default, files that already exist in the destination location will be overwritten.
  • When transferring, a trailing "/" on the source path copies the contents of a folder rather than the folder itself.
    • /mnt/researchdrive/netid/sample_data will transfer the folder "sample_data".
    • /mnt/researchdrive/netid/sample_data/ will transfer the contents within "sample_data".
  • You can test a transfer before running it by adding "-n" or "--dry-run" to the command. Combine it with "-v" to list the files that would be copied.
 ex: $ rsync -a -v -n /mnt/researchdrive/netid/sample_data /mnt/scratch/group/netid
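
The trailing "/" rule in the notes above is easy to try out safely with throwaway folders. The sketch below is illustrative only: the function name trailing_slash_demo and the folder names with_folder and contents_only are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: show the effect of a trailing "/" on the source path.
# Pass in an empty working folder, for example one created with mktemp -d.
trailing_slash_demo() {
    local demo="$1"
    mkdir -p "$demo/sample_data" "$demo/with_folder" "$demo/contents_only"
    echo "example" > "$demo/sample_data/file.txt"

    # No trailing slash: the folder itself is copied.
    rsync -a "$demo/sample_data" "$demo/with_folder"

    # Trailing slash: only the contents of the folder are copied.
    rsync -a "$demo/sample_data/" "$demo/contents_only"

    ls "$demo/with_folder"      # prints: sample_data
    ls "$demo/contents_only"    # prints: file.txt
}
```

Run it with: trailing_slash_demo "$(mktemp -d)"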

Common Command Line Options

Command Line Argument    Function the Argument Provides
-a                       Enables archive mode, which copies files and folders recursively and preserves their attributes
-v                       Shows more detailed output
-h                       Makes output easier to read, including file sizes (in KB, MB, etc.)
--progress               Shows the progress of the transfer
-u                       Skips files that are newer in the destination location, which prevents overwriting them
--ignore-existing        Skips files that already exist in the destination
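
To see how -u and --ignore-existing differ from the default overwrite behavior, the sketch below sets up one old source file and three destinations that each hold a newer, edited copy. The function name overwrite_demo and all folder and file names are invented for the illustration.

```shell
#!/usr/bin/env bash
# Sketch: compare default overwriting with -u and --ignore-existing.
# Pass in an empty working folder, for example one created with mktemp -d.
overwrite_demo() {
    local work="$1" d
    mkdir -p "$work/src"
    echo "from source" > "$work/src/notes.txt"
    # Give the source file an old timestamp so the destination copies are newer.
    touch -t 202001010000 "$work/src/notes.txt"

    for d in dst_default dst_update dst_ignore; do
        mkdir -p "$work/$d"
        echo "edited at destination" > "$work/$d/notes.txt"
    done

    rsync -a "$work/src/" "$work/dst_default/"                    # overwrites
    rsync -a -u "$work/src/" "$work/dst_update/"                  # keeps newer destination file
    rsync -a --ignore-existing "$work/src/" "$work/dst_ignore/"   # keeps any existing file

    cat "$work/dst_default/notes.txt"   # prints: from source
    cat "$work/dst_update/notes.txt"    # prints: edited at destination
    cat "$work/dst_ignore/notes.txt"    # prints: edited at destination
}
```

Note that -u only protects destination files that are newer than the source, while --ignore-existing protects every file that already exists, whatever its age.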

Looking for more? Check out rsync's official documentation here.

Repository Mirroring

Repository mirroring is the process of automatically copying a Git repository from one location to another so that both remain synchronized. Because GitHub cannot be accessed from Platform R for security reasons, mirroring is particularly useful: you can mirror a GitHub repository into your personal UW GitLab space to make it accessible from Platform R. Repository mirroring can also be useful for other purposes, such as maintaining backups and integrating with external systems without exposing the primary repository.

See GitLab's documentation on repository mirroring here
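
The idea behind mirroring can be illustrated with plain git and local folders. This is only a conceptual sketch: GitLab's mirroring feature is configured in GitLab itself (see its documentation) and is not what the commands below set up. Here "upstream" stands in for the GitHub repository and "mirror.git" for the GitLab copy; those names, the function name mirror_demo, and the example identity are all placeholders.

```shell
#!/usr/bin/env bash
# Sketch: a mirror is a full copy of a repository that is refreshed from its source.
# Pass in an empty working folder, for example one created with mktemp -d.
mirror_demo() {
    local work="$1"
    # Placeholder identity so the example commits work without global git config.
    local id=(-c user.name="Example" -c user.email="example@example.com")

    # Create a stand-in upstream repository with one commit.
    git init -q "$work/upstream"
    git -C "$work/upstream" "${id[@]}" commit -q --allow-empty -m "first commit"

    # Create the mirror: a bare copy of every branch and tag.
    git clone -q --mirror "$work/upstream" "$work/mirror.git"

    # Upstream moves on; refreshing the mirror brings it back in sync.
    git -C "$work/upstream" "${id[@]}" commit -q --allow-empty -m "second commit"
    git -C "$work/mirror.git" remote update >/dev/null 2>&1

    # Both commands print the same commit hash once the mirror is current.
    git -C "$work/upstream" rev-parse HEAD
    git -C "$work/mirror.git" rev-parse HEAD
}
```

With GitLab mirroring in place, the refresh step happens automatically, and on Platform R you would clone or pull from the GitLab copy rather than from GitHub.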



Keywords: data, moving, platform r, ingress, egress
Doc ID: 150962
Owned by: Nolan J. in SMPH Research Applications
Created: 2025-05-19
Updated: 2025-06-02
Sites: SMPH Research Applications