File System Basics

What are $HOME, $SCRATCH, $PROJECT, and $PSCRATCH directories, and what are they used for?

These directories are integral components of the file system on LCC and MCC, each serving a distinct purpose:

  • $HOME: This directory represents your home directory. It is a personal space allocated to each user for storing personal files, configuration settings, and executable scripts. Your home directory is accessible only to you and is often used for managing your user-specific data and settings. Allocated quota: 10GB.

  • $SCRATCH: The $SCRATCH directory is intended for temporary storage of large data sets, intermediate files, and computational results. It is accessible only to you and offers ample storage space, with a quota much larger than the $HOME directory. Files in $SCRATCH should be short-lived: they are automatically purged 90 days after they were last accessed. Allocated quota: 25TB.

  • $PROJECT: The $PROJECT directory is designated for storing project-related files and data. This directory is accessible to all members of your group. It provides a shared space for collaborative research projects, allowing multiple users to access and work on project data simultaneously. The $PROJECT directory usually has a larger quota than $HOME, facilitating the storage of large datasets and project-related files. Allocated quota: 1TB.

  • $PSCRATCH: Similar to $SCRATCH, the $PSCRATCH directory serves as temporary storage for large-scale parallel computing tasks. However, $PSCRATCH is specifically allocated for project-related computational workloads that should be accessible to all your group members. It is ideal for storing temporary files generated during parallel computations, such as MPI jobs or distributed data processing tasks. Allocated quota: 50TB.
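On a login node you can check where these variables point on your account. A minimal sketch (the fallback text is only there so the snippet also runs outside the cluster, where some variables are unset):

```shell
# Print the path each storage variable resolves to; "<not set>" appears
# only when run outside the cluster environment
printf 'HOME=%s\n'     "${HOME:-<not set>}"
printf 'SCRATCH=%s\n'  "${SCRATCH:-<not set>}"
printf 'PROJECT=%s\n'  "${PROJECT:-<not set>}"
printf 'PSCRATCH=%s\n' "${PSCRATCH:-<not set>}"
```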

 

How do I check my storage usage and quotas?

LCC:

Enter the following command to see the storage usage and quotas:

lcc quotas

Example output: (screenshot showing per-path usage and quotas)

MCC:

Enter the following command to see the storage usage and quotas:

projects quotas

Example output: (screenshot showing per-path usage and quotas)

Note: ‘Data Used’ represents the cumulative size of all files within the specified path, while ‘Max Data’ denotes the maximum storage capacity available within the particular path.
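If you want a per-directory breakdown rather than the overall quota report, `du` works anywhere, though it is slower than the quota commands. A quick sketch:

```shell
# Total size of your home directory
du -sh "$HOME" 2>/dev/null

# Ten largest items at the top level of your home directory
du -sh "$HOME"/* 2>/dev/null | sort -rh | head -n 10
```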

Paths to use in your job scripts

As of January 31, 2020, we have released a module called ccs/lcc-user that is automatically loaded for all users. It sets up some environment variables that point to some important paths on the system. It is vital that you use these environment variables instead of hard-coding paths in your job scripts. These environment variables allow our system to be storage-agnostic so that we can move users between different storage appliances based on performance or other factors.

You should update all of your existing job scripts using the following guide, or you may find your jobs breaking in the future.

In the following table, assume your username is user123 and that you are in a project called pi456_uksr.

DON'T USE               DO USE                  Command to show the actual path
/home/user123           $HOME                   echo $HOME
/project/pi456_uksr     $PROJECT/pi456_uksr     echo $PROJECT/pi456_uksr
/pscratch/pi456_uksr    $PSCRATCH/pi456_uksr    echo $PSCRATCH/pi456_uksr
/scratch/user123        $SCRATCH                echo $SCRATCH
/share                  $SHARE                  echo $SHARE
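A minimal SLURM job-script sketch using these variables (pi456_uksr is the hypothetical group name from the table above; the /tmp and "manual" fallbacks are only so the sketch also runs off-cluster, and the directory names are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --time=00:10:00

# Use the environment variables instead of hard-coded /home, /scratch, or
# /project paths so the script survives storage migrations.
WORK_DIR="${SCRATCH:-/tmp}/myjob_${SLURM_JOB_ID:-manual}"
INPUT_DIR="${PROJECT:-/project}/pi456_uksr/data"   # hypothetical group path

mkdir -p "$WORK_DIR"
cd "$WORK_DIR"
echo "reading shared input from: $INPUT_DIR"
echo "writing results to:        $WORK_DIR"
```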

If you run module list, you'll notice the ccs/lcc-user module is loaded with an (S) next to it. The (S) means sticky: the module won't be unloaded when you run module purge, which many users do at the beginning of their job scripts. This helps ensure that the module stays loaded and that your environment is configured correctly.

[jdch223@login001 ~]$ module list

Currently Loaded Modules:
  1) autotools   2) prun/1.3   3) intel/19.0.4.243   4) impi/2019.4.243
  5) ohpc        6) ccs/lcc-user (S)

  Where:
   S:  Module is Sticky, requires --force to unload or purge

[jdch223@login001 ~]$ module purge
The following modules were not unloaded:
  (Use "module --force purge" to unload all):

  1) ccs/lcc-user

[jdch223@login001 ~]$ module list

Currently Loaded Modules:
  1) ccs/lcc-user (S)

  Where:
   S:  Module is Sticky, requires --force to unload or purge

*Note: the command echo $SCRATCH prints the path to your scratch space; alternatively, cd $SCRATCH ; pwd takes you there and shows the path.

 

LCC Storage Quotas/Limit

Remember, all data in /scratch and /pscratch is automatically purged 90 days after it was last accessed.

Name                    Location                     Quota/Limit                                      90-day purge?   Is data backed up?
User scratch            $SCRATCH                     25 TB (per user)                                 Yes             No
Group project scratch   $PSCRATCH/PI_linkblue_uksr   50 TB (shared among all members of a PI group)   Yes             No
User home               $HOME                        10 GB (per user)                                 No              No
Group project home      $PROJECT/PI_linkblue_uksr    1 TB (shared among all members of a PI group)    No              No
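Because the purge is based on access time, you can preview which of your scratch files are approaching the cutoff. A sketch (the 75-day threshold is an arbitrary early-warning margin, and the /tmp fallback is only so this also runs off-cluster):

```shell
# List up to 20 files not accessed for more than 75 days, i.e. within
# roughly 15 days of the 90-day purge cutoff
find "${SCRATCH:-/tmp}" -type f -atime +75 2>/dev/null | head -n 20
```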

  • We provide an additional 10TB of storage per PI on our object store, which is not connected to the HPC cluster. This storage is available by request only, not by default; please contact us if you need an account on our object storage.

Note: We do not back up any user data on the systems. Please make your own backups to other resources.

Please see Paths to Use in Your Job Scripts for updated paths to your allocated storage spaces.

For all large data transfers, please log in to our data transfer node (DTN) and perform your transfers there (rclone, scp, etc.). The login nodes are on a slower external uplink connection.

We also provide a Globus endpoint on our DTN for external data transfers; see our Globus documentation for details.

 

Combating 'Disk quota exceeded'

LCC enforces storage quotas, meaning certain directories have an upper limit on how much data you can store in them. If you exceed this limit, you'll receive a 'Disk quota exceeded' error, and your job will crash. You can review the established quotas for LCC above.

Some applications silently create directories under /home and store large files there. Because /home/$USER has the strictest quota, it is the most frequent culprit for these errors. For example, when running Singularity containers, Singularity creates a /home/$USER/.singularity directory and downloads large image files there, which will quickly hit your home quota. Fortunately, there is a simple solution to this problem.
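Before moving anything, it helps to find out which hidden directory is actually consuming the space. A quick diagnostic sketch:

```shell
# Show the largest hidden directories/files at the top level of your home
# directory; typical offenders are caches such as .singularity, .conda,
# .cache, and .julia
du -sh "$HOME"/.[!.]* 2>/dev/null | sort -rh | head -n 10
```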

The solution is to move the directory in question to your /scratch/$USER space, and provide a symlink from home to scratch. This will allow applications to look under /home/$USER but actually be directed to /scratch/$USER where the quotas are much more forgiving. Following is a two-command example of this, using .singularity as a case study.

mv /home/$USER/.singularity $SCRATCH/.singularity
ln -s $SCRATCH/.singularity /home/$USER/.singularity

 

For users installing conda packages:

Please do the same as above, since conda installs all of its packages in the $HOME/.conda folder in your home directory. Because you need conda packages for long-term use, put them in your shared project space instead of your scratch folder (which is purged).

E.g., if your shared group project space is /project/xxxx_uksr, then:

      mkdir -p /project/xxxx_uksr/$USER/my_conda
      ln -s /project/xxxx_uksr/$USER/my_conda /home/$USER/.conda

 

For users installing R packages:

Please do the same as above, since R installs all of its packages in the $HOME/R folder in your home directory. Because you need R packages for long-term use, put them in your shared project space instead of your scratch folder (which is purged).

E.g., if your shared group project space is /project/xxxx_uksr, then:

      mkdir -p /project/xxxx_uksr/$USER/my_R
      ln -s /project/xxxx_uksr/$USER/my_R /home/$USER/R

Also make sure you load the same R module (e.g., module load ccs/conda/r-base-4.0.0) each time you want to use the R packages that were installed with that version of R.

 

For users installing Julia packages:

Please do the same as above, since Julia installs all of its packages and registries in the $HOME/.julia folder in your home directory. Because you need Julia packages for long-term use, put them in your shared project space instead of your scratch folder (which is purged).

E.g., if your shared group project space is /project/xxxx_uksr, then:

      mkdir -p /project/xxxx_uksr/$USER/my_julia
      ln -s /project/xxxx_uksr/$USER/my_julia /home/$USER/.julia

Also make sure you load the same Julia module (e.g., module load ccs/julia/1.6.1) each time you want to use the Julia packages that were installed with that version of Julia.

Center for Computational Sciences