High Performance Computing (HPC) Systems
This documentation is intended for users of the Summit High Performance Computing system, a joint activity of Colorado State University (CSU) and the University of Colorado Boulder (CU). Summit is housed at CU and operated by CU IT staff. This manual is written primarily for new CSU users. Some of the information presented here is particular to CSU’s environment and some to CU’s environment, because Summit resides on the CU network and behind the CU IT security infrastructure. All user and application support for CSU users is provided by Graduate Research Assistants (GRAs) resident at CSU or by CU staff dedicated to CSU user support. CU staff provide all system administration support. All Summit help requests should be sent to email@example.com and will be routed accordingly.
The Summit High-Performance Computing (HPC) system was established under NSF MRI Award #1532235 to both CSU and CU. The $3.55 million system was awarded under the auspices of the Rocky Mountain Advanced Computing Consortium (RMACC; www.rmacc.org), an activity in which ISTeC (istec.colostate.edu) at CSU participates. ISTeC coordinated the proposal for CSU. The system went into full production on February 2, 2017. Summit allocations are partitioned in proportion to the amount invested by each institution, with 10% reserved for the RMACC, as follows: RMACC – 10% of the total; CSU – 25% of the remaining 90% = 22.5% of the total; CU – 75% of the remaining 90% = 67.5% of the total. Allocations operate approximately on a monthly “use it or lose it” basis and are somewhat oversubscribed, so that users who need additional resources in a given month can draw on resources left unused by other users in that month. The initial large Summit system has since been augmented with nodes under the “condo” model, in which users buy in to add nodes that they own to the system. Additionally, the Summit system has been augmented with a small Knights Landing system, using Intel’s Many Integrated Core (MIC) technology.
The Summit system architecture comprises general compute nodes, GPU compute nodes, high-memory compute nodes, and Phi nodes. For detailed information on the system, see the CU Boulder Research Computing resources page.
Allocations of time on Summit are based on Service Units (SU) and CPU core-hours as follows:
- 1 SU = 1 CPU core-hour (i.e., a single fully utilized CPU core on 1 compute node for 1 hour).
- CSU users as a whole have an allocation of roughly 25% of the total SUs/yr for Summit (21,996,360 SU/yr).
- All new Summit users will be added to the csu-general allocation, which amounts to about 50,000 SU/yr.
If users require additional compute resources, they can apply for an allocation. We strongly encourage users to apply for an allocation once they have a sense of how many core hours their project requires.
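To get a rough sense of how many core hours a project requires, the SU definition above can be turned into simple arithmetic. The sketch below is illustrative only: the job size (24 cores for 10 hours) and the number of jobs are made-up values, and it assumes every requested core is charged for the full run time. It compares the result with the roughly 50,000 SU/yr csu-general figure quoted above.

```shell
# Illustrative job size; replace with your own numbers.
cores_per_job=24      # CPU cores requested per job (hypothetical)
hours_per_job=10      # wall-clock hours per job (hypothetical)
jobs_planned=100      # number of such jobs in the project (hypothetical)

# 1 SU = 1 CPU core-hour, so a job costs cores x hours.
su_per_job=$((cores_per_job * hours_per_job))
su_total=$((su_per_job * jobs_planned))

# csu-general figure quoted in this guide (about 50,000 SU/yr).
csu_general_su=50000

echo "SU per job: $su_per_job"
echo "SU for all planned jobs: $su_total"
if [ "$su_total" -le "$csu_general_su" ]; then
    echo "Within the csu-general figure"
else
    echo "Exceeds the csu-general figure; consider applying for an allocation"
fi
```

An estimate like this is a reasonable starting point for the numbers requested in an allocation proposal.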
To apply for an allocation, submit the Summit Allocation Request Form. Log in using your Summit credentials, formatted for Duo authentication just as they would be when logging in to Summit (username: <eID>@colostate.edu, password: <eID_password>,push).
- The form will ask you to provide details about your project.
- Then, you can request an allocation for your project by uploading a proposal based on this template. For more details on allocations, see CU Research Computing’s documentation.
The Summit Management and Allocations Review Committee will review and assess these applications and respond within 1 week on the status of the request.
The Summit system offered an optional Condominium Computing Model (“Condo Model”). In the Condo Model, costs are split between researchers and Central IT. Researchers purchase their own compute nodes and Central IT provides the hosting environment and support services for those nodes.
Due to the age of the system, the final opportunity to buy into the Condo Model on Summit concluded in January 2018. We expect to offer Condo Model buy-in opportunities on future HPC systems.
Condo jobs have the following privileges:
- can request longer run times (up to 168 hours, or 7 days)
- get a queue priority boost (equal to a 1-day boost)
- can access all compute nodes
To properly activate Condo shares, Condo users should send the following information to firstname.lastname@example.org:
- full name
- condo group ID (see table below)
You will receive an email note when your condo group ID assignment is complete. You’ll then be able to submit jobs using your Condo allocation.
The table below shows the condo group ID that has been assigned to each principal investigator and their department affiliation.
| PI | Dept. | Condo group ID |
| Asa Ben-Hur | Computer Science | hal |
| Stephen Guzik | Mechanical Engineering | cfd |
| Chris Weinberger | Mechanical Engineering | crw |
Condo Job Submission
To submit jobs using your Condo allocation, include the following lines in your Slurm batch job file:
#SBATCH --qos condo
#SBATCH -A csu-summit-xxx
where “xxx” is your condo group ID from the table above. Note the double dash before the “qos” parameter. When you submit a Slurm batch job file with these parameters, the job will run with the additional privileges described above.
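Put together, a complete batch file might look like the sketch below. Only the --qos and -A lines come from the instructions above. The condo group ID hal is taken from the table purely as an example, and the job name, node count, task count, and time limit are illustrative assumptions to adjust for your own work.

```shell
#!/bin/bash
# Condo-specific settings described above ("hal" is an example group ID).
#SBATCH --qos condo
#SBATCH -A csu-summit-hal
# Illustrative resource requests (assumptions, not site requirements).
#SBATCH --job-name=condo-example
#SBATCH --nodes=1
#SBATCH --ntasks=1
# Condo jobs may request up to 168 hours; this example asks for 24.
#SBATCH --time=24:00:00

# Replace the lines below with the commands for your own job.
job_host="$(uname -n)"
echo "Condo job running on ${job_host}"
```

If the file were saved as condo_job.sh (a hypothetical name), it would be submitted with: sbatch condo_job.sh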
The following statement may be used for Summit Acknowledgements:
“This work utilized the RMACC Summit supercomputer, which is supported by the National Science Foundation (awards ACI-1532235 and ACI-1532236), the University of Colorado Boulder and Colorado State University. The RMACC Summit supercomputer is a joint effort of the University of Colorado Boulder and Colorado State University.”
The following document includes information that may be used for grants, RFPs and other solicitations:
CSU User's Guide
Logging in to Summit requires DUO two-factor authentication. To set up DUO, follow the instructions at http://authenticate.colostate.edu
Access to Research Computing resources is available by way of the Secure Shell, or ssh, protocol. Access is provided via a dedicated login node.
The ssh command can be run from the Linux and macOS command line:
ssh csu_eID@colostate.edu@login.rc.colorado.edu
where you should replace csu_eID with your eID.
When logging in from Windows, we recommend the PuTTY application.
If you are using the DUO smartphone app: When it asks for your password, type in your CSU password followed by a comma followed by the word “push”.
Don’t forget the comma! The DUO app on your phone will ask you to approve the request.
Alternatively, you can ask the DUO app to generate a 6-digit code called a DUO key. Use your CSU password, followed by a comma, followed by the 6-digit number generated by your app.
Don’t forget the comma!
NOTE: The DUO key mentioned above cycles every 15 seconds. If you do not log in within 15 seconds of generating the key, it will expire and you’ll have to generate another key.
To log in without the smartphone app:
When it asks for your password, type in your CSU password followed by a comma followed by the word “phone”.
You will receive a call at your registered phone number and will be asked to use the keypad to authenticate.
If you are having trouble logging in and you suspect DUO is the issue, contact the Central IT Support Helpdesk for DUO support.
Files are typically transferred to Summit using sftp or Globus.
ssh File transfer
SSH file transfer (sftp) is recommended for smaller files. A good rule of thumb is that if you’re willing to sit through the file transfer, sftp is a good option. You can use sftp from the command line or through other software. The login credentials are the same ones you use to ssh into the login node. Please see the section on remote login for more details.
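As a minimal sketch of command-line use, the example below builds the sftp target from the username format and login host given elsewhere in this guide. The eID jdoe and the file names in the comments are hypothetical. The function is defined but not called, because an sftp session is interactive and prompts for your password and DUO factor.

```shell
# Hypothetical eID; replace with your own.
csu_eID="jdoe"
sftp_target="${csu_eID}@colostate.edu@login.rc.colorado.edu"

# Opens an interactive sftp session when called.
open_summit_sftp() {
    sftp "$sftp_target"
}

# Inside the session, typical commands are:
#   put results.tar.gz      upload a local file (hypothetical name)
#   get output.log          download a remote file (hypothetical name)
#   bye                     end the session
echo "To connect, run: sftp $sftp_target"
```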
Other sftp clients
You can also use other file-transfer software, such as:
- FileZilla, a multi-protocol, multi-platform file-transfer application.
- WinSCP, a basic SCP/SFTP file-transfer application for Windows.
Generally, the information you need to use this method is as follows:
- Host: login.rc.colorado.edu
- Protocol: choose “SFTP – SSH File Transfer Protocol”
- User and password: the same credentials that you use to ssh into Summit (see remote login section)
Note: Because Research Computing uses one-time passwords for authentication, you must disable password retention / saving in your file-transfer client if you are using the DUO key authentication method. Failure to do so may cause your account to be temporarily disabled after the client attempts and fails to authenticate repeatedly in the background.
For more information about using Globus, including populating transfer endpoints (such as Research Computing, a local machine, etc.), see the documentation provided by CU Boulder Research Computing.
CSU has joined Globus to give researchers a means to transfer and share data between institutions. To find out more and start using the system, go to the Globus website and log in with your eID.
- On the Globus home page select “Log In” at the top right-hand corner.
- Select “Colorado State University” as your organization.
- Select “Continue” to go to the CILogin page.
- Use your eID and eID password to log on to CILogin.
- You will land on the Globus File Manager page.
- Click in the “Collection” field, where it says “Start here.”
- Follow the prompts from Globus. You may need to install Globus Connect Personal to make your local computer an endpoint. If Globus Connect Personal is already installed, click “Your Collections” to see the endpoints you already have.
Summit users have 3 main directories: Home, Projects and Scratch.
The home directory (/home/csu_eID@colostate.edu) has 2 GB of storage that is backed up locally into a hidden directory (.snapshot/) at 2-hour, daily, and weekly intervals and to a second site for disaster recovery nightly. Because the home directory is not on high-performance storage, it should not be written to by compute jobs.
The projects directory (/projects/csu_eID@colostate.edu) is intended to store software builds and smaller datasets and to share data and software with other users. It has 250 GB of storage that is backed up locally into a hidden directory (.snapshot/) at 6-hour, daily, and weekly intervals and to a second site for disaster recovery nightly. Like the home directory, projects is not on high-performance storage, so it should not be written to by compute jobs.
Summit Scratch directory
The Summit scratch directory (/scratch/summit/csu_eID@colostate.edu) is intended for input and output for compute jobs running on Summit and uses GPFS (General Parallel File System) for fast parallel I/O. Each user is limited to 10 TB of scratch storage and a total of 20 million files and directories. If you need a larger allocation, email email@example.com. Files in the scratch directory are automatically purged 90 days after their creation, are NOT backed up, and may be purged at any time. Transfer your data to the projects directory or to permanent data storage soon after your job completes.
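Because scratch files are purged automatically, it can help to check periodically for files that are getting old. The sketch below shows one way to list them with the standard find command. It rests on several assumptions: the 60-day threshold is arbitrary, the eID jdoe is hypothetical, and find's -mtime test looks at modification time, which may not match the creation-based purge rule exactly.

```shell
# List regular files under a directory that were last modified more than
# a given number of days ago.
list_old_files() {
    dir="$1"
    days="$2"
    find "$dir" -type f -mtime +"$days"
}

# Hypothetical eID; replace with your own.
csu_eID="jdoe"
scratch_dir="/scratch/summit/${csu_eID}@colostate.edu"

# Only scan if the scratch directory exists on this machine.
if [ -d "$scratch_dir" ]; then
    list_old_files "$scratch_dir" 60
else
    echo "Scratch directory not found: $scratch_dir"
fi
```

Files that show up in the listing are candidates to copy to the projects directory or to permanent storage.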
For more in-depth information, and a table of backup frequencies for each directory, see the CU Boulder RC User Guide topic on Filesystems.
Summit uses a module system that allows multiple versions of common software packages to be installed side by side, so users can switch between them. To use a software package, its module must be loaded first. Loading a module alters aspects of your environment, such as the $PATH variable. Note that you must be on a Summit compile node to load modules.
After logging in to Summit, type:
ssh scompile
to move to a compile node.
Summit uses the Lmod environment module system to simplify shell configuration and software application management. See the CU Boulder RC Summit User Guide for more information about the module system, including commands for module loading and exploration, and the Lmod documentation to learn how to see which modules are installed and how to load them.
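As a quick orientation, the commands below are the standard Lmod commands for exploring and loading modules, meant to be run interactively on a compile node. This is a sketch: the module name gcc is only an example and may not match what is installed on Summit, and the guard simply skips the commands on machines where the module command is unavailable.

```shell
# Example module name (an assumption; check "module avail" for real names).
example_module="gcc"

if command -v module >/dev/null 2>&1; then
    module avail                    # list modules that can be loaded now
    module load "$example_module"   # add the software to your environment
    module list                     # show currently loaded modules
    module_status="checked"
else
    echo "module command not found; run this on a Summit compile node"
    module_status="unavailable"
fi
```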
Installing custom software
Users should install custom software in the projects directory. Users may also take advantage of the module system to publish local module files that configure a running environment for the software. Such a module could be adopted as a centrally supported module if it has wide community use.
Most custom software should be installed using one of the Summit compile nodes. These nodes are identical to the general Summit compute nodes, which is ideal for compiling software to run on the system. Once you are connected to the login node via ssh, you can connect to a compile node by running the following:
ssh scompile
For information on how to compile your software on Summit, please see the CU Boulder RC Summit User Guide.
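A common pattern for this kind of install is to pick a prefix under the projects directory and add its bin directory to PATH. The sketch below only sets up the paths. The eID jdoe, the software/mytool layout, and the configure-style build shown in the comment are all hypothetical and will differ by package.

```shell
# Hypothetical eID and package name; replace with your own.
csu_eID="jdoe"
package="mytool"

# Install prefix inside the projects directory (layout is an assumption).
projects_dir="/projects/${csu_eID}@colostate.edu"
prefix="${projects_dir}/software/${package}"

# On a compile node, a configure-based package would typically be built with:
#   ./configure --prefix="$prefix" && make && make install

# Make the installed programs visible to your shell.
PATH="${prefix}/bin:${PATH}"
export PATH
echo "Install prefix: $prefix"
```

Keeping each package under its own prefix makes it straightforward to remove or rebuild one package without touching the others.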
For more information about using Summit, see the following Resources:
Note that the CU examples will include information about how they log in to Summit. Please substitute the information from our Remote login section.
Help & Additional Information:
Summit support requests: firstname.lastname@example.org
Cray support requests: email@example.com
To receive Summit system updates and other announcements, send a message to firstname.lastname@example.org
To receive information on training and events associated with Summit, email email@example.com.
To receive information on Cray system status, email firstname.lastname@example.org.
CSU provides educational resources for data planning, analysis, and archiving at https://lib.colostate.edu/services/data-management/