# OpenStack Cloud

*Local Flexible Computing Infrastructure*
The OpenStack-based cloud computing system gives scientists flexibility in designing workflows and maintaining pipelines. Virtualising the underlying system also provides a reproducible platform for development - a crucial tenet of best scientific practice.
We offer versatile modes of interaction with the cloud - through the command line, accessible on the internal network, or via a web-based GUI - to ensure the system is as accessible as possible for all users. Usage of the system - just as for the farm - is well documented on the internal Confluence platform.
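For command-line use, the standard OpenStack client reads its credentials from a `clouds.yaml` file. The sketch below is illustrative only: the cloud name `sanger`, the auth URL, and the region are placeholders, not real values - take the actual endpoint and project details from the web GUI or the internal documentation.

```yaml
# clouds.yaml - illustrative fragment; all values here are placeholders
clouds:
  sanger:                     # hypothetical cloud name
    auth:
      auth_url: https://openstack.example:5000/v3   # placeholder endpoint
      username: your-username
      project_name: your-project
      user_domain_name: Default
      project_domain_name: Default
    region_name: RegionOne    # placeholder region
```

With such a file in place, commands like `openstack --os-cloud sanger server list` and `openstack --os-cloud sanger flavor list` operate against your project.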
## Instances and flavours
Users can draw from a virtualised pool of resources to create instances of particular flavours. As on the farm, the different flavours support different use cases.
The flavours and their generations are set out in the table below; each cell gives the CPU count and memory of that size in that generation.
| Flavour | m1 - dedicated CPU (1st generation) | m2/m4 - dedicated CPU (2nd/4th generation) | s2 - dedicated CPU (short-lived) | o2 - 8x oversubscribed CPU (2nd generation) |
|---|---|---|---|---|
| tiny | 1 CPU - 8GB | 1 CPU - 11GB | 1 CPU - 11GB | 1 CPU - 1GB |
| small | 2 CPU - 17GB | 2 CPU - 23GB | 2 CPU - 23GB | 2 CPU - 2GB |
| medium | 4 CPU - 34GB | 4 CPU - 47GB | 4 CPU - 47GB | 4 CPU - 5GB |
| large | 8 CPU - 68GB | 8 CPU - 95GB | 8 CPU - 95GB | 8 CPU - 11GB |
| xlarge | 16 CPU - 137GB | 16 CPU - 190GB | 16 CPU - 190GB | 16 CPU - 23GB |
| 2xlarge | 26 CPU - 223GB | 24 CPU - 285GB | 24 CPU - 285GB | 30 CPU - 44GB |
| 3xlarge | 54 CPU - 464GB | 54 CPU - 357GB | 54 CPU - 357GB | 60 CPU - 88GB |
| 4xlarge | - | 60 CPU - 714GB | 60 CPU - 714GB | 60 CPU - 177GB |
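Choosing a size is a matter of finding the smallest flavour that satisfies a job's CPU and memory needs. A minimal sketch of that lookup, using the m2 column from the table above - the flavour names (`m2.tiny` etc.) are assumptions for illustration, so confirm the real names with `openstack flavor list`:

```python
# m2 flavour specs transcribed from the table above: name -> (CPUs, memory GB).
# The "m2.<size>" naming is hypothetical; check `openstack flavor list`.
M2_FLAVOURS = {
    "m2.tiny":    (1, 11),
    "m2.small":   (2, 23),
    "m2.medium":  (4, 47),
    "m2.large":   (8, 95),
    "m2.xlarge":  (16, 190),
    "m2.2xlarge": (24, 285),
    "m2.3xlarge": (54, 357),
    "m2.4xlarge": (60, 714),
}

def pick_flavour(cpus_needed, mem_gb_needed):
    """Return the smallest m2 flavour meeting both requirements, or None."""
    candidates = [
        (cpus, mem, name)
        for name, (cpus, mem) in M2_FLAVOURS.items()
        if cpus >= cpus_needed and mem >= mem_gb_needed
    ]
    # Sorting by (CPUs, memory) picks the least over-provisioned match.
    return min(candidates)[2] if candidates else None

print(pick_flavour(4, 30))    # -> m2.medium
print(pick_flavour(20, 100))  # -> m2.2xlarge
```

The same logic applies to the other generations; only the specs table changes.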
The s* instances should be treated as ephemeral, since they run directly
on the hypervisor. This enables fast resource allocation for jobs with short lifetimes, e.g.
CI/CD runner machines for the internal GitLab server (see software for more).
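As an illustration of that use case, a GitLab runner hosted on an ephemeral s2 instance needs little more than a minimal `config.toml`. The fragment below is a sketch only - the runner name, URL, and token are placeholders, and the executor choice is an assumption, not a statement of how the internal runners are configured:

```toml
# /etc/gitlab-runner/config.toml - illustrative fragment; placeholder values
concurrent = 4

[[runners]]
  name = "s2-ephemeral-runner"            # hypothetical name
  url = "https://gitlab.example.internal" # placeholder URL
  token = "RUNNER-TOKEN"                  # placeholder token
  executor = "docker"                     # assumed executor
  [runners.docker]
    image = "ubuntu:22.04"
```

Because the instance is disposable, losing it costs nothing: a replacement runner can be provisioned and registered afresh in minutes.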
## Interfacing to native storage
The cloud computing service provides access to the data storage platforms native to the Sanger. This includes instances with secure access to Lustre for intra-pipeline data reading and writing, as well as client images for iRODS, which holds archived sequencer results.
Through this innovation - allowing secure access to Lustre from cloud-based virtual machines - the RTP team has played an influential role in industry standards and has contributed to Lustre documentation and best practices. See the following slides from the Lustre User Group meeting for details: Learn more
## Kubernetes
For the creation and management of Kubernetes clusters we have an internal Rancher instance, which can be accessed with Sanger credentials at rancher.internal.sanger.ac.uk.
Alternatively, users can instantiate clusters in a self-service manner, which is useful for trying out a number of configuration options. The configuration is maintained by the RTP and RSE teams.
Dedicated RSE and RTP staff are available to consult on use of the Kubernetes platform and to provide general assistance with using the cloud.