Goal: Reduce model size and speed up inference by converting floating-point (FP32) weights and operations to 8-bit integers (INT8).
import torch
from torchreid.models import build_model
# Load a pre-trained OSNet Re-ID model. num_classes is required by torchreid's
# build_model; 1000 is a placeholder, since only the feature extractor is used for Re-ID.
model = build_model("osnet_x0_25", num_classes=1000, pretrained=True)
model.eval()
# Dynamic quantization (CPU only): INT8 weights for nn.Linear layers, activations stay FP32
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized_model.state_dict(), "osnet_quantized.pt")
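Note that this state_dict only loads back into a model that has been dynamically quantized again; loading it into the plain FP32 model will fail. A minimal reload-and-embed sketch, assuming the same num_classes as above (the 256x128 dummy crop size is an assumption about OSNet's usual input):
import torch
from torchreid.models import build_model

# Rebuild the architecture, re-apply dynamic quantization, then load the saved weights.
reloaded = build_model("osnet_x0_25", num_classes=1000, pretrained=False)
reloaded = torch.quantization.quantize_dynamic(reloaded, {torch.nn.Linear}, dtype=torch.qint8)
reloaded.load_state_dict(torch.load("osnet_quantized.pt"))
reloaded.eval()

# Extract an appearance embedding from a dummy person crop (NCHW).
with torch.no_grad():
    crop = torch.rand(1, 3, 256, 128)
    embedding = reloaded(crop)
print(embedding.shape)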
import tensorflow as tf
# Convert to TFLite with post-training quantization
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # dynamic-range quantization: INT8 weights, FP32 activations
tflite_model = converter.convert()
# Save the quantized model
with open("model_quant.tflite", "wb") as f:
    f.write(tflite_model)
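Optimize.DEFAULT on its own only quantizes the weights; activations stay FP32. Full INT8, which most mobile NPUs/DSPs require, additionally needs a representative dataset for calibration. A sketch, assuming a generator rep_data_gen that yields inputs shaped like your crops (the 1x256x128x3 shape is an assumption):
import numpy as np
import tensorflow as tf

def rep_data_gen():
    # Yield ~100 representative inputs so the converter can calibrate activation ranges.
    for _ in range(100):
        yield [np.random.rand(1, 256, 128, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = rep_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8   # fully integer input/output
converter.inference_output_type = tf.uint8

tflite_full_int8 = converter.convert()
with open("model_full_int8.tflite", "wb") as f:
    f.write(tflite_full_int8)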
Impact: roughly 4x smaller weights (FP32 → INT8) and faster CPU inference, typically with only a minor drop in Re-ID embedding quality.
Goal: Leverage mobile GPUs/TPUs via platform-specific backends.
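On Android this usually means the TFLite GPU or NNAPI delegate configured through the Interpreter options in Java/Kotlin; from Python, a fully-INT8 model compiled for a Coral Edge TPU can be loaded with the Edge TPU delegate. A sketch, assuming the libedgetpu runtime is installed and the model has been compiled with the Edge TPU compiler (filenames are placeholders):
import numpy as np
import tensorflow as tf

# Load the Edge TPU delegate so supported ops run on the TPU instead of the CPU.
delegate = tf.lite.experimental.load_delegate("libedgetpu.so.1")
interpreter = tf.lite.Interpreter(
    model_path="model_full_int8_edgetpu.tflite",
    experimental_delegates=[delegate],
)
interpreter.allocate_tensors()

# Run one inference on a dummy uint8 crop to confirm the delegate works end to end.
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
dummy = np.zeros(input_details["shape"], dtype=input_details["dtype"])
interpreter.set_tensor(input_details["index"], dummy)
interpreter.invoke()
embedding = interpreter.get_tensor(output_details["index"])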
Goal: Reduce compute by downsizing Re-ID input crops.
import cv2

# In your DeepSORT embedder class:
def preprocess(self, crop):
    # Resize the crop from e.g. 128x64 down to 64x32
    resized = cv2.resize(crop, (64, 32), interpolation=cv2.INTER_AREA)
    return resized  # Smaller input → faster inference
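To verify the speed-up on your own hardware, a quick micro-benchmark of the embedding model at both crop sizes helps; a minimal sketch reusing quantized_model from the PyTorch example above (the crop sizes are the example values from the snippet):
import time
import torch

def time_embedder(model, h, w, iters=50):
    # Average single-crop forward latency at an input size of h x w.
    x = torch.rand(1, 3, h, w)
    with torch.no_grad():
        for _ in range(5):  # warm-up
            model(x)
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
    return (time.perf_counter() - start) / iters

print("128x64 crop:", time_embedder(quantized_model, 128, 64))
print("64x32 crop :", time_embedder(quantized_model, 64, 32))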
Impact: