Neural Magic
1.59K subscribers
52:35
vLLM Office Hours - Advanced Techniques for Maximizing vLLM Performance - September 19, 2024
Neural Magic
79 views • 12 hours ago
1:13:14
vLLM Office Hours - Using NVIDIA CUTLASS for High-Performance Inference - September 05, 2024
Neural Magic
1.6K views • 2 weeks ago
48:13
vLLM Office Hours - vLLM on AMD GPUs and Google TPUs - August 21, 2024
Neural Magic
435 views • 4 weeks ago
50:03
vLLM Office Hours - Multimodal Models in vLLM with Roblox - August 8, 2024
Neural Magic
400 views • 1 month ago
50:37
vLLM Office Hours - Model Quantization for Efficient vLLM Inference - July 25, 2024
Neural Magic
680 views • 1 month ago
33:21
Deploy LLMs More Efficiently with vLLM and Neural Magic
Neural Magic
563 views • 2 months ago
56:09
vLLM Office Hours - FP8 Quantization Deep Dive - July 9, 2024
Neural Magic
848 views • 2 months ago
53:19
vLLM Office Hours - June 20, 2024
Neural Magic
407 views • 3 months ago
44:47
vLLM and Neural Magic Office Hours - June 5, 2024
Neural Magic
406 views • 3 months ago
6:31
Are MLOps disappearing?
Neural Magic
351 views • 1 year ago
1:06
5x Faster YOLOv8 on CPUs
Neural Magic
4K views • 1 year ago
47:52
Deploy Fast and Accurate YOLOv8 Object Detection Models on CPUs You Already Have
Neural Magic
3.5K views • 1 year ago
42:27
Unlock Faster and More Efficient LLMs with SparseGPT
Neural Magic
2.1K views • 1 year ago
52:31
Pruning and Quantizing ML Models With One Shot Without Retraining
Neural Magic
1.9K views • 1 year ago
8:15
Sparse Transferring Hugging Face Models With SparseML
Neural Magic
493 views • 1 year ago
41:42
Apply Second-Order Pruning Algorithms for SOTA Model Compression
Neural Magic
879 views • 1 year ago
6:53
Use Sparse Transfer Learning to Create Sparse Models Fine-Tuned to Your Datasets
Neural Magic
422 views • 1 year ago
5:02
Intro to SparseML
Neural Magic
631 views • 1 year ago
4:23
Accelerate Image Segmentation Tasks With Sparsity and the DeepSparse Runtime
Neural Magic
185 views • 1 year ago
4:20
Accelerate Image Classification Tasks With Sparsity and the DeepSparse Runtime
Neural Magic
170 views • 1 year ago
4:50
Accelerate Object Detection Tasks With Sparsity and the DeepSparse Runtime
Neural Magic
823 views • 1 year ago
7:38
Intro to DeepSparse Runtime
Neural Magic
1.4K views • 1 year ago
7:08
Intro to Deep Learning Model Sparsification
Neural Magic
869 views • 1 year ago
6:16
Intro to SparseZoo
Neural Magic
316 views • 1 year ago
9:56
Intro to Neural Magic & Software-Delivered AI
Neural Magic
1K views • 1 year ago
4:51
Accelerate NLP Tasks With Sparsity and the DeepSparse Runtime
Neural Magic
195 views • 1 year ago
41:38
Sparse Training of Neural Networks Using AC/DC
Neural Magic
670 views • 1 year ago
50:27
How Well Do Sparse Models Transfer?
Neural Magic
430 views • 1 year ago
59:10
How to Achieve the Fastest CPU Inference Performance for Object Detection YOLO Models
Neural Magic
3.3K views • 2 years ago
1:12:21
Workshop: How to Optimize Deep Learning Models for Production
Neural Magic
2.2K views • 2 years ago