Difference between revisions of "Hints for purchase of computing resources"

Latest revision as of 14:34, 24 January 2022

Dynamo users often ask us questions like "Which computer do I need to use Dynamo?" or "Should I buy some special equipment?". This page serves as a guide to the points you should consider when deciding whether you need to purchase resources, and which ones.

First, bear in mind that Dynamo can perform very different tasks, and they run optimally on different machines. Interactive navigation of tomograms is a very different operation from parallel number crunching of thousands of particles!

Visualization Workstation

If you are interested in using Dynamo for interacting with your tomograms (visualization, annotation, etc.), you should get a reasonably good workstation. A suitable machine should be able to fit tomograms in its memory comfortably. At least 128 GB of RAM, or even 256 GB, is not so expensive these days, and will save you a lot of frustration in front of your screen.

This is a very reasonable investment. Operating on tomograms on a remotely located machine makes interaction sluggish and frustrating. A powerful workstation will also be useful for many other tasks in structural biology.
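To get a rough feeling for what "fitting a tomogram in memory" means, the following sketch estimates the size of a volume held uncompressed in RAM. It assumes 4 bytes per voxel (32-bit floats); the volume dimensions are hypothetical examples, and the actual footprint inside Dynamo may be larger because of working copies.

```python
def tomogram_size_gib(nx, ny, nz, bytes_per_voxel=4):
    """Return the in-memory size of an nx x ny x nz volume in GiB.

    Assumes the volume is held uncompressed; bytes_per_voxel=4
    corresponds to 32-bit floating point voxels (an assumption).
    """
    return nx * ny * nz * bytes_per_voxel / 2**30


# Hypothetical volume dimensions, for illustration only.
for dims in [(1024, 1024, 512), (4096, 4096, 1024)]:
    print(dims, f"{tomogram_size_gib(*dims):.1f} GiB")
```

Under these assumptions, a single unbinned volume can take up a large fraction of the RAM of a modest machine, which is why a generous amount of memory pays off for interactive work.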

Cores

A good workstation will probably provide at least 8 cores without straining your budget. Although cores are not strictly necessary for visualization, some ancillary operations frequently needed in Dynamo benefit from parallelization: particle cropping, averaging, some steps in PCA analysis, manual management of models. Even alignment projects with a modest number of particles can be reasonably executed on a local workstation. Thus, having 24 cores in your local machine will not be a useless luxury. You will use them, although probably not as often as other features of your workstation, so if you want to keep your budget low, do not buy too many cores.

Screens

We strongly encourage you to run Dynamo on two screens when you are visualizing tomograms. Dynamo tends to create lots of graphical windows to offer different functionalities, and a Dynamo session can easily overcrowd a single screen.

Price estimation

Depending on the specifications, the price range for a suitable workstation would be 3500-9000 US$.

Dedicated GPU servers for number crunching

For real number crunching we recommend trying to buy your own dedicated server. Our typical setup is a rack-mounted unit with an 8-core CPU driving 8 GPUs.

Requirements on the CPU

Dynamo itself does not need a lot of memory. If you are planning to buy a machine just for Dynamo, 16 GB of RAM should be enough for the CPU that drives the GPU devices. Bear in mind, however, that other cryo-EM software (such as Relion) may require much more memory. 512 to 1024 GB of RAM is very expensive, so if you are on a tight budget you should plan according to the intended purpose of the machine.

Requirements on the GPU

Dynamo uses CUDA 9.0 or later.
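A quick way to check this requirement on an existing machine is to look at the release reported by the CUDA compiler. The sketch below parses text in the format printed by `nvcc --version`; the helper names `cuda_release` and `meets_minimum` are illustrative and not part of Dynamo, and the sample string is only an example of that output format.

```python
import re


def cuda_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in the given text")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(nvcc_output, minimum=(9, 0)):
    """Return True if the reported release is at least `minimum`."""
    return cuda_release(nvcc_output) >= minimum


# Example line in the format produced by `nvcc --version`.
sample = "Cuda compilation tools, release 11.4, V11.4.120"
print(meets_minimum(sample))
```

On a real system the text could be obtained with `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`, provided the CUDA toolkit is installed and `nvcc` is on the path.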

As of January 2022, we have set up servers with RTX 3090 cards. These belong to the cheaper line of NVIDIA cards that can be used for scientific computing. When considering the particular GPU type you want to install, you will notice that NVIDIA also provides cards like the A100. They are considerably more expensive (for the same number of CUDA cores) because of the ECC memory check (https://en.wikipedia.org/wiki/ECC_memory). However, this is a functionality you don't necessarily need for running subtomogram averaging projects, and purchasing more economical models like the ones described above is deemed safe.

Some performance measurements we have recorded experimentally are reported at https://doi.org/10.1107/S2059798317003369, together with a further discussion of the different types of available cards.

Practical considerations

We suggest contacting your IT admins directly to make certain that the data center in your institution will be able to host the GPUs you want to buy, in terms of rack space, appropriate cooling, power supply, etc.

Price estimations

Our estimate for the price of a dedicated server is in the range of 8000-20000 US$.

File systems

While you don't need any special type of file system, it is convenient to have a shared file system that is simultaneously accessible from both the local visualization workstation and the computing server. This way, you can use the visualization workstation for tomogram annotation, cropping particles, and any preprocessing related to the creation of an alignment project. Once the project has been created (as a folder and an execution script), it should ideally be accessed from the computing server for execution. Moreover, while the computing server produces its results, it is useful to be able to access the project from your visualization workstation, so that you can view the averages and tables as they get computed inside the project folder.

Note that if you have separate file systems in your machines, you will probably invest large amounts of time transferring projects and data between systems.
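When a transfer cannot be avoided, one generic approach is to pack the whole project folder into a single archive before copying it. The following is a minimal sketch using Python's standard `tarfile` module; it is not Dynamo's own tooling, and the function names and paths are hypothetical.

```python
import tarfile
from pathlib import Path


def pack_project(project_dir, archive_path):
    """Pack a project folder into a gzip-compressed tar archive."""
    project_dir = Path(project_dir)
    with tarfile.open(archive_path, "w:gz") as archive:
        # Store paths relative to the folder name, not the absolute path.
        archive.add(project_dir, arcname=project_dir.name)
    return Path(archive_path)


def unpack_project(archive_path, destination):
    """Extract a project archive into the destination folder."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(destination)


# Hypothetical usage:
# pack_project("my_project", "my_project.tar.gz")
# ... copy the archive to the other file system ...
# unpack_project("my_project.tar.gz", "/scratch/projects")
```

Moving one archive is usually simpler than copying a folder containing many small files, but it still costs time and disk space on both sides.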

A generous amount of SSD storage is always a good idea; 4 TB in a workstation is something you will certainly appreciate.

External computing resources

If you don't wish to buy a dedicated GPU server, there are other alternatives for the use of Dynamo.

Large scale clusters

Many research institutions offer access to computing clusters, both for CPU and/or GPU machines. They are normally accessed through a queue that distributes the available resources among many users.

This option requires some tedious adaptation to the particular syntax of the queuing system, and competing for computing time with the rest of the users. Moreover, in many cases you may be forced to transfer projects and data between different file systems.
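As an illustration of what adapting to a queuing system involves, the sketch below builds a batch script for SLURM. SLURM is only an assumption here: your cluster may use a different scheduler with different directives. The job name and the command `bash ./my_project.sh` are hypothetical placeholders for whatever launches your project.

```python
def slurm_script(job_name, command, gpus=1, cpus=8, hours=24):
    """Build the text of a SLURM batch script (SLURM is an assumption)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --gres=gpu:{gpus}",        # number of GPUs requested
        f"#SBATCH --cpus-per-task={cpus}",   # CPU cores for the task
        f"#SBATCH --time={hours:02d}:00:00", # wall-clock limit, hh:mm:ss
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


# Hypothetical job: the command stands in for your own execution script.
print(slurm_script("my_project", "bash ./my_project.sh", gpus=2))
```

The resulting text would be saved to a file and submitted with `sbatch`. The resource limits and partition names that are actually allowed depend on your institution's configuration.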

Cloud computing

Dynamo will shortly deliver a (free) virtual environment for the Amazon EC2 cloud. In this setting, you don't need to purchase and maintain any hardware, or install any software. Users just pay Amazon for the time they actually use.