Comparison of cluster software
The following tables compare general and technical information for notable computer cluster software. This software can be broadly divided into four categories: job scheduler, node management, node installation, and integrated stack (all of the above).
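To give a concrete sense of the first category, a job scheduler is typically driven by a batch submission. The sketch below is illustrative only: it assumes a Slurm installation with the `sbatch` command on the PATH, and the job name and resource requests are placeholders.

```python
import subprocess

# Illustrative batch script; the #SBATCH directives ask the scheduler
# for resources (job name, 4 tasks, 10-minute wall-clock limit).
batch_script = """#!/bin/bash
#SBATCH --job-name=example
#SBATCH --ntasks=4
#SBATCH --time=00:10:00
srun hostname
"""

# sbatch reads the job script from standard input when no file is given.
result = subprocess.run(
    ["sbatch"], input=batch_script, text=True,
    capture_output=True, check=True,
)
print(result.stdout.strip())  # e.g. "Submitted batch job 12345"
```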
General information
Software | Maintainer | Category | Development status | Latest release | Architecture | High-Performance / High-Throughput Computing | License | Platforms supported | Cost | Paid support available |
---|---|---|---|---|---|---|---|---|---|---|
Amoeba | | | No active development | | | | MIT | | | |
Base One Foundation Component Library | | | | | | | Proprietary | | | |
DIET | INRIA, SysFera, Open Source | All in one | | | GridRPC, SPMD, Hierarchical and distributed architecture, CORBA | HTC/HPC | CeCILL | Unix-like, Mac OS X, AIX | Free | |
DxEnterprise | DH2i | Nodes management | Actively developed | v23.0 | | | Proprietary | Windows 2012R2/2016/2019/2022 and 8+, RHEL 7/8/9, CentOS 7, Ubuntu 16.04/18.04/20.04/22.04, SLES 15.4 | Cost | Yes |
Enduro/X | Mavimax, Ltd. | Job/Data Scheduler | actively developed | | SOA Grid | HTC/HPC/HA | GPLv2 or Commercial | Linux, FreeBSD, MacOS, Solaris, AIX | Free / Cost | Yes |
Ganglia | | Monitoring | actively developed | | | | BSD | Unix, Linux, Microsoft Windows NT/XP/2000/2003/2008, FreeBSD, NetBSD, OpenBSD, DragonflyBSD, Mac OS X, Solaris, AIX, IRIX, Tru64, HPUX | Free | |
Grid MP | Univa (formerly United Devices) | Job Scheduler | no active development | | Distributed master/worker | HTC/HPC | Proprietary | Windows, Linux, Mac OS X, Solaris | Cost | |
Apache Mesos | Apache | | actively developed | | | | Apache license v2.0 | Linux | Free | Yes |
Moab Cluster Suite | Adaptive Computing | Job Scheduler | actively developed | | | HPC | Proprietary | Linux, Mac OS X, Windows, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD & other UNIX platforms | Cost | Yes |
NetworkComputer | Runtime Design Automation | | actively developed | | | HTC/HPC | Proprietary | Unix-like, Windows | Cost | |
OpenHPC | OpenHPC project | all in one | actively developed | v2.61, February 2, 2023 | | HPC | | Linux (CentOS / OpenSUSE Leap) | Free | No |
OpenLava | None. Formerly Teraproc | Job Scheduler | Halted by injunction | | Master/Worker, multiple admin/submit nodes | HTC/HPC | Illegal due to being a pirated version of IBM Spectrum LSF | Linux | Not legally available | No |
PBS Pro | Altair | Job Scheduler | actively developed | | Master/worker distributed with fail-over | HPC/HTC | AGPL or Proprietary | Linux, Windows | Free or Cost | Yes |
Proxmox Virtual Environment | Proxmox Server Solutions | Complete | actively developed | | | | Open-source AGPLv3 | Linux, Windows, other operating systems are known to work and are community supported | Free | Yes |
Rocks Cluster Distribution | Open Source/NSF grant | All in one | actively developed | (Manzanita) | | HTC/HPC | OpenSource | CentOS | Free | |
Popular Power | ||||||||||
ProActive | INRIA, ActiveEon, Open Source | All in one | actively developed | | Master/Worker, SPMD, Distributed Component Model, Skeletons | HTC/HPC | GPL | Unix-like, Windows, Mac OS X | Free | |
RPyC | Tomer Filiba | | actively developed | | | | MIT License | *nix/Windows | Free | |
SLURM | SchedMD | Job Scheduler | actively developed | v23.11.3, January 24, 2024 | | HPC/HTC | GPL | Linux/*nix | Free | Yes |
Spectrum LSF | IBM | Job Scheduler | actively developed | | Master node with failover/exec clients, multiple admin/submit nodes, Suite add-ons | HPC/HTC | Proprietary | Unix, Linux, Windows | Cost and Academic model - Academic, Express, Standard, Advanced and Suites | Yes |
Oracle Grid Engine (Sun Grid Engine, SGE) | Altair | Job Scheduler | Active; development moved to Altair Grid Engine | | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Proprietary | *nix/Windows | Cost | |
Some Grid Engine / Son of Grid Engine / Sun Grid Engine | daimh | Job Scheduler | actively developed (stable/maintenance) | | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Open-source SISSL | *nix | Free | No |
SynfiniWay | Fujitsu | | actively developed | | | HPC/HTC | ? | Unix, Linux, Windows | Cost | |
Techila Distributed Computing Engine | Techila Technologies Ltd. | All in one | actively developed | | Master/worker distributed | HTC | Proprietary | Linux, Windows | Cost | Yes |
TORQUE Resource Manager | Adaptive Computing | Job Scheduler | actively developed | | | | Proprietary | Linux, *nix | Cost | Yes |
UniCluster | Univa | All in One | Functionality and development moved to UniCloud | | | | | | Free | Yes |
UNICORE | ||||||||||
Xgrid | Apple Computer | |||||||||
Warewulf | | Provisioning and cluster management | actively developed | v4.4.1, July 6, 2023 | | HPC | Open Source | Linux | Free | |
xCAT | | Provisioning and cluster management | actively developed | v2.16.5, March 7, 2023 | | HPC | Eclipse Public License | Linux | Free | |
Table explanation
- Software: The name of the application that is described
Technical information
Software | Implementation Language | Authentication | Encryption | Integrity | Global File System | Global File System + Kerberos | Heterogeneous/ Homogeneous exec node | Jobs priority | Group priority | Queue type | SMP aware | Max exec node | Max job submitted | CPU scavenging | Parallel job | Job checkpointing | Python interface |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Enduro/X | C/C++ | OS Authentication | GPG, AES-128, SHA1 | None | Any cluster Posix FS (gfs, gpfs, ocfs, etc.) | Any cluster Posix FS (gfs, gpfs, ocfs, etc.) | Heterogeneous | OS Nice level | OS Nice level | SOA Queues, FIFO | Yes | OS Limits | OS Limits | Yes | Yes | No | No |
HTCondor | C++ | GSI, SSL, Kerberos, Password, File System, Remote File System, Windows, Claim To Be, Anonymous | None, Triple DES, BLOWFISH | None, MD5 | None, NFS, AFS | Not official, hack with ACL and NFS4 | Heterogeneous | Yes | Yes | Fair-share with some programmability | basic (hard separation into different node) | tested ~10000? | tested ~100000? | Yes | MPI, OpenMP, PVM | Yes | Yes, and native Python Binding |
PBS Pro | C/Python | OS Authentication, Munge | | | Any, e.g., NFS, Lustre, GPFS, AFS | Limited availability | Heterogeneous | Yes | Yes | Fully configurable | Yes | tested ~50,000 | Millions | Yes | MPI, OpenMP | Yes | Yes |
OpenLava | C/C++ | OS authentication | None | | NFS | | Heterogeneous Linux | Yes | Yes | Configurable | Yes | | | Yes, supports preemption based on priority | Yes | Yes | No |
Slurm | C | Munge, None, Kerberos | | | | | Heterogeneous | Yes | Yes | Multifactor Fair-share | Yes | tested 120k | tested 100k | No | Yes | Yes | PySlurm |
Spectrum LSF | C/C++ | Multiple - OS Authentication/Kerberos | Optional | Optional | Any - GPFS/Spectrum Scale, NFS, SMB | Any - GPFS/Spectrum Scale, NFS, SMB | Heterogeneous - HW and OS agnostic (AIX, Linux or Windows) | Policy based - no queue to compute node binding | Policy based - no queue to compute group binding | Batch, interactive, checkpointing, parallel and combinations | Yes, and GPU aware (GPU license free) | > 9,000 compute hosts | > 4 million jobs a day | Yes, supports preemption based on priority, supports checkpointing/resume | Yes, e.g. parallel submissions for job collaboration over e.g. MPI | Yes, with support for user, kernel or library level checkpointing environments | Yes |
Torque | C | SSH, munge | | | None, any | | Heterogeneous | Yes | Yes | Programmable | Yes | tested | tested | Yes | Yes | Yes | Yes |
Table explanation
- Software: The name of the application that is described
- SMP aware:
  - basic: hard split into multiple virtual hosts
  - basic+: hard split into multiple virtual hosts with some minimal/incomplete communication between virtual hosts on the same computer
  - dynamic: split the resources of the computer (CPU/RAM) on demand
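As an illustration of the Python interface column in the table above (HTCondor is listed with native Python bindings), the following sketch submits a single job through the htcondor package. It is a minimal sketch only: it assumes the htcondor Python bindings are installed and a schedd daemon is reachable from the submitting host, the executable and file names are placeholders, and the exact binding API varies between HTCondor versions.

```python
import htcondor

# Describe the job using submit-file keywords expressed as a Python dict.
submit = htcondor.Submit({
    "executable": "/bin/sleep",   # placeholder payload
    "arguments": "60",
    "output": "sleep.out",
    "error": "sleep.err",
    "log": "sleep.log",
})

schedd = htcondor.Schedd()             # connect to the local scheduler daemon
result = schedd.submit(submit, count=1)  # queue one instance of the job
print("submitted cluster", result.cluster())
```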
See also
- List of volunteer computing projects
- List of cluster management software
- Computer cluster
- Grid computing
- World Community Grid
- Distributed computing
- Distributed resource management
- High-Throughput Computing
- Job Processing Cycle
- Batch processing
- Fallacies of Distributed Computing