Making large AI models cheaper, faster and more accessible
A high-throughput and memory-efficient inference and serving engine for LLMs
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
A universal scalable machine learning model deployment solution
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
AICI: Prompts as (Wasm) Programs
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
📚 Jupyter notebook tutorials for OpenVINO™
A platform that enables users to perform private benchmarking of machine learning models. The platform facilitates the evaluation of models based on different trust levels between the model owners and the dataset owners.
A high-performance inference system for large language models, designed for production environments.
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Cross-platform, customizable ML solutions for live and streaming media.
I hope this repo gives you inspiration to learn and grow in the world of Statistics. I'm not perfect at everything, so if you have a suggestion I will gladly accept it. I hope you enjoy what you find on this page. **This repo is still under construction**
TensorRT C++ API Tutorial
🏗️ Fine-tune, build, and deploy open-source LLMs easily!
Utilities to use the Hugging Face Hub API