ECC Storage

See File System Basics for an introduction.

 

What are $HOME, $SCRATCH, $PROJECT, and $PSCRATCH directories, and what are they used for?

These directories are integral components of the file system on LCC and MCC, each serving a distinct purpose:

  • $HOME: Your home directory, a personal space for files, configuration settings, and executable scripts. It is accessible only to you. Allocated quota: 10 GB.

  • $SCRATCH: Temporary storage for large data sets, intermediate files, and computational results. It is accessible only to you and has a much larger quota than $HOME. Files in $SCRATCH should be short-lived: any file that has not been accessed for 90 days is automatically purged. Allocated quota: 25 TB.

  • $PROJECT: Shared storage for project-related files and data. It is accessible to all members of your group, so multiple users can work on the same project data simultaneously. Its quota is larger than that of $HOME, which makes it suitable for large datasets. Allocated quota: 1 TB.

  • $PSCRATCH: Temporary storage, similar to $SCRATCH, but shared with all members of your group. It is intended for project-related computational workloads, such as temporary files generated by MPI jobs or distributed data processing tasks. Like $SCRATCH, files that have not been accessed for 90 days are automatically purged. Allocated quota: 50 TB.
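To see where these four variables point in your own session, a minimal sketch such as the following can help. It assumes the variables are set by the login environment; the function name show_storage_paths is only illustrative and is not an ECC command.

```shell
# Print where each storage variable points and whether the directory exists.
# show_storage_paths is a hypothetical helper name, not an ECC tool.
show_storage_paths() {
    for name in HOME SCRATCH PROJECT PSCRATCH; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "path=\${$name:-}"
        if [ -z "$path" ]; then
            printf '%-9s (not set)\n' "\$$name"
        elif [ -d "$path" ]; then
            printf '%-9s %s\n' "\$$name" "$path"
        else
            printf '%-9s %s (missing)\n' "\$$name" "$path"
        fi
    done
}

show_storage_paths
```

A variable reported as "(not set)" or "(missing)" usually means you are not on a cluster login or compute node.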

 

How do I check my storage usage and quotas?

ECC:

ecc quotas

Example output

[XXX@ecc-login001 ~]$ ecc quotas
========== xxxxx ==========
Path                  Data Used   Max Data   Notes
/home/xxxxx           400K        10G        Permanent
/scratch/xxxxx        46.08G      25T        Transient (90-day deletion)
========== xxxx_uksr ==========
Path                  Data Used   Max Data   Notes
/project/xxxx_uksr    50G         1T         Permanent
/pscratch/xxxx_uksr   50G         50T        Transient (90-day deletion)

Note: ‘Data Used’ represents the cumulative size of all files within the specified path, while ‘Max Data’ denotes the maximum storage capacity available within the particular path.

 

ECC Storage Quotas/Limits

Remember: any file in /scratch or /pscratch that has not been accessed for 90 days is automatically purged.

Name                   Location                    Quota/Limit                                     90-day purge?  Is data backed up?
User scratch           $SCRATCH                    25 TB (per user)                                Yes            No
Group project scratch  $PSCRATCH/PI-username_uksr  50 TB (shared among all members in a PI group)  Yes            No
User home              $HOME                       10 GB (per user)                                No             No
Group project home     $PROJECT/PI-username_uksr   1 TB (shared among all members in a PI group)   No             No
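Because the purge is based on last access time, it can be useful to list the scratch files that are getting close to the 90-day limit. The following is a minimal sketch using the standard find command; the function name stale_files and the 60-day default threshold are illustrative choices, not ECC settings.

```shell
# List files under a directory whose last access was more than N days ago.
# stale_files is a hypothetical helper; the 60-day default is an arbitrary
# early-warning threshold, not a value configured by ECC.
stale_files() {
    dir=${1:-$SCRATCH}
    days=${2:-60}
    # -atime +N matches files last accessed more than N whole days ago.
    find "$dir" -type f -atime +"$days" -print
}
```

For example, stale_files "$SCRATCH" 60 lists files in your scratch space that have not been accessed for more than 60 days, and stale_files "$PSCRATCH/PI-username_uksr" 75 does the same for a group scratch directory with a tighter threshold.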

 

Note: We do not back up any user data on the systems. Please make your own backups to other resources.

Please see Paths to Use in Your Job Scripts for updated paths to your allocated storage spaces.

For all large data transfers, please log in to our data transfer node (DTN) and run your transfers (rclone, scp, etc.) from there. The login nodes are on a slower external uplink.

We also provide a Globus endpoint on the DTN for external data transfers.

 

Combating 'Disk quota exceeded'

ECC enforces storage quotas, meaning certain directories have an upper limit on how much data you can store in them. If you exceed this limit, you'll receive a 'Disk quota exceeded' error, and your job will crash. You can review the established quotas for ECC above.

Some applications silently create directories under /home and store large files there. Because /home/$USER has the strictest quota, this is the most frequent cause of the error. For example, Singularity creates a /home/$USER/.singularity directory and downloads large image files into it, which quickly fills your home quota. There is a simple solution to this problem.

The solution is to move the directory in question to your /scratch/$USER space and create a symlink from home to scratch. Applications will still look under /home/$USER but will actually be directed to /scratch/$USER, where the quota is much more forgiving. The following two commands show this, using .singularity as an example.

mv /home/$USER/.singularity $SCRATCH/.singularity
ln -s $SCRATCH/.singularity /home/$USER/.singularity
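The two commands above assume that the source directory exists and that the target does not. A slightly more defensive version can be sketched as a shell function; the name relocate_dir is hypothetical, and the checks shown are one reasonable choice rather than an ECC-provided tool.

```shell
# Move a directory to another file system and leave a symlink behind.
# Usage: relocate_dir SOURCE_DIR TARGET_DIR
# relocate_dir is a hypothetical helper name, not an ECC command.
relocate_dir() {
    src=$1
    dest=$2
    if [ -L "$src" ]; then
        echo "$src is already a symlink; nothing to do" >&2
        return 0
    fi
    if [ -e "$dest" ]; then
        echo "$dest already exists; refusing to overwrite" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")" || return 1
    if [ -d "$src" ]; then
        # Existing directory: move its contents to the new location.
        mv "$src" "$dest" || return 1
    elif [ -e "$src" ]; then
        echo "$src exists but is not a directory" >&2
        return 1
    else
        # Nothing to move yet: create an empty target so the link resolves.
        mkdir -p "$dest" || return 1
    fi
    ln -s "$dest" "$src"
}
```

With this function defined, relocate_dir "$HOME/.singularity" "$SCRATCH/.singularity" performs the same move and link as the two commands above, and running it a second time is harmless.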

 

For users installing conda packages:

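Apply the same approach to conda, which installs all packages under $HOME/.conda. Because conda packages are usually needed long term, link that directory to your shared project space rather than to scratch, where files are purged. If $HOME/.conda already exists, move it into the project space first; otherwise the ln command below will place the link inside the existing directory instead of replacing it.

For example, if your shared group project space is /project/xxxx_uksr: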

      mkdir -p /project/xxxx_uksr/$USER/my_conda
      ln -s /project/xxxx_uksr/$USER/my_conda /home/$USER/.conda
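To find out which other directories under /home are worth relocating in this way, it helps to list the largest top-level entries, including hidden ones. The following is a minimal sketch using find, du, and sort; the function name largest_entries is hypothetical.

```shell
# Show the largest top-level entries (including hidden ones) in a directory.
# Defaults to $HOME and the top 10; sizes are in kilobytes, largest first.
# largest_entries is a hypothetical helper name, not an ECC command.
largest_entries() {
    dir=${1:-$HOME}
    count=${2:-10}
    # -mindepth/-maxdepth 1 picks up dot-directories that a plain "*" glob skips.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + 2>/dev/null |
        sort -rn | head -n "$count"
}
```

Running largest_entries "$HOME" 5 prints the five largest files or directories directly under your home directory, which typically reveals caches such as .singularity or .conda.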

Center for Computational Sciences