A lightweight GPU monitor designed for real-time web-based viewing of GPU server status.
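As a hedged sketch of how such a monitor might work (the repository's own implementation is not shown here), the following polls nvidia-smi and serves the readings as JSON over HTTP; the query fields and port are illustrative choices:

```python
# Minimal sketch of a web-based GPU monitor: poll nvidia-smi, serve JSON.
# Assumes nvidia-smi is on PATH; field list and port 8000 are illustrative.
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

QUERY = "index,name,utilization.gpu,memory.used,memory.total,temperature.gpu"

def read_gpus():
    # nvidia-smi's --query-gpu/--format=csv interface prints one CSV row per GPU.
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        text=True,
    )
    keys = QUERY.split(",")
    return [dict(zip(keys, line.split(", "))) for line in out.strip().splitlines()]

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps(read_gpus()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), Handler).serve_forever()
```

A real monitor would cache readings and serve a dashboard page, but the polling loop above is the core of the approach.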
CUDA® is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). With CUDA, developers are able to dramatically speed up computing applications by harnessing the power of GPUs.
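For a minimal taste of the CUDA programming model from Python, here is a vector-add kernel written with Numba's CUDA JIT (Numba is one of several Python routes to CUDA; the grid and block sizes are arbitrary):

```python
# Vector addition on the GPU via Numba's CUDA JIT (requires numba and a CUDA GPU).
import numpy as np
from numba import cuda

@cuda.jit
def vector_add(x, y, out):
    i = cuda.grid(1)          # global thread index across the whole grid
    if i < x.size:            # guard against out-of-range threads
        out[i] = x[i] + y[i]

n = 1 << 20
x = np.arange(n, dtype=np.float32)
y = 2 * x
out = np.zeros_like(x)

threads = 256
blocks = (n + threads - 1) // threads
vector_add[blocks, threads](x, y, out)  # Numba copies arrays to/from the device
assert np.allclose(out, x + y)
```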
A high-throughput and memory-efficient inference and serving engine for LLMs
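A minimal offline-generation example with vLLM's Python API (the model name and sampling settings are placeholders):

```python
# Offline batched generation with vLLM (pip install vllm; needs a CUDA GPU).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # any Hugging Face model id works here
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```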
Open Voice OS Status Page
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
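A hedged sketch of the FP8 workflow with Transformer Engine's PyTorch API (layer sizes are illustrative; FP8 execution requires a Hopper- or Ada-class GPU):

```python
# FP8 linear layer with Transformer Engine's PyTorch module API.
import torch
import transformer_engine.pytorch as te

layer = te.Linear(768, 768, bias=True).cuda()
x = torch.randn(16, 768, device="cuda")

# Inside fp8_autocast, supported ops execute in 8-bit floating point.
with te.fp8_autocast(enabled=True):
    y = layer(x)
y.sum().backward()
```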
Implementations of various simulations for integrate-and-fire models, as well as conductance-based models with synaptic neurotransmission
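To make the integrate-and-fire idea concrete, here is a sketch of a single leaky integrate-and-fire neuron under constant input current, using forward-Euler integration (parameter values are illustrative, not taken from the repository):

```python
# Leaky integrate-and-fire neuron, forward-Euler integration.
# dV/dt = (-(V - V_rest) + R*I) / tau; spike and reset when V crosses threshold.

tau, R = 10.0, 1.0                           # time constant (ms), resistance (MOhm)
V_rest, V_reset, V_th = -65.0, -70.0, -50.0  # resting, reset, threshold voltage (mV)
dt, T, I = 0.1, 100.0, 20.0                  # step (ms), duration (ms), current (nA)

V = V_rest
spikes = []
for step in range(int(T / dt)):
    V += dt * (-(V - V_rest) + R * I) / tau  # integrate the membrane equation
    if V >= V_th:                            # threshold crossing: spike and reset
        spikes.append(step * dt)
        V = V_reset

print(f"{len(spikes)} spikes, first at t = {spikes[0]:.1f} ms" if spikes else "no spikes")
```

Conductance-based models replace the constant R*I term with voltage-dependent synaptic and ionic conductances, but the integrate-then-threshold loop stays the same.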
A high-performance inference system for large language models, designed for production environments.
A retargetable MLIR-based machine learning compiler and runtime toolkit.
cuML - RAPIDS Machine Learning Library
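cuML mirrors scikit-learn's estimator API while running on the GPU; a short clustering example (synthetic data, illustrative parameters):

```python
# GPU K-means with cuML (installed via RAPIDS; requires a CUDA GPU).
import cupy as cp
from cuml.cluster import KMeans

X = cp.random.random((10_000, 16), dtype=cp.float32)  # data lives on the GPU

km = KMeans(n_clusters=8, random_state=0)
labels = km.fit_predict(X)
print(labels[:10], km.cluster_centers_.shape)
```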
CEED Library: Code for Efficient Extensible Discretizations
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
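As an illustration of what writing kernels in Triton looks like (a generic element-wise kernel, not code from the repository):

```python
# A minimal Triton kernel: element-wise add over 1-D tensors.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    pid = tl.program_id(axis=0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n                      # guard the tail of the tensor
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK=1024)
assert torch.allclose(out, x + y)
```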
High-performance CUDA/Python library for computing quantum chemistry density-based descriptors for large systems using GPUs
NVIDIA GPU Operator creates, configures, and manages GPUs atop Kubernetes
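Once the operator has configured a node, workloads request GPUs through the nvidia.com/gpu resource; a hedged sketch using the official Kubernetes Python client (the pod name and image tag are placeholders):

```python
# Request a GPU on a cluster managed by the GPU Operator
# (pip install kubernetes; pod and image names are illustrative placeholders).
from kubernetes import client, config

config.load_kube_config()  # use the current kubeconfig context

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="cuda-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda",
                image="nvidia/cuda:12.4.1-base-ubuntu22.04",
                command=["nvidia-smi"],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # resource exposed by the operator
                ),
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```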
PygmalionAI's large-scale inference engine
A fast, scalable, high-performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression, and other machine learning tasks, with APIs for Python, R, Java, and C++. Supports computation on CPU and GPU.
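A short example of GPU-accelerated training with CatBoost's Python package (synthetic data; hyperparameters are illustrative):

```python
# GPU training with CatBoost (pip install catboost; task_type="GPU" needs CUDA).
import numpy as np
from catboost import CatBoostClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

model = CatBoostClassifier(
    iterations=200,
    learning_rate=0.1,
    task_type="GPU",   # drop this line to train on CPU instead
    verbose=False,
)
model.fit(X, y)
print(model.predict(X[:5]))
```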
CUDA C++ Core Libraries
CUDA was created by NVIDIA and first released on June 23, 2007.