Making large AI models cheaper, faster and more accessible
A high-throughput and memory-efficient inference and serving engine for LLMs
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
A universal scalable machine learning model deployment solution
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
AICI: Prompts as (Wasm) Programs
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
📚 Jupyter notebook tutorials for OpenVINO™
A platform that enables users to perform private benchmarking of machine learning models. The platform facilitates the evaluation of models based on different trust levels between the model owners and the dataset owners.
A high-performance inference system for large language models, designed for production environments.
Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Cross-platform, customizable ML solutions for live and streaming media.
I hope this repo gives you inspiration to learn and grow in the world of Statistics. I'm not perfect at everything, so if you have a suggestion I will gladly accept it. I hope you enjoy what you find on this page. **This repo is still under construction**
TensorRT C++ API Tutorial
🏗️ Fine-tune, build, and deploy open-source LLMs easily!
Utilities to use the Hugging Face Hub API