Yes, Re-ID models expect specific input dimensions (e.g., 128x64
for pedestrian Re-ID), but you can resize intelligently while preserving aspect ratio and minimizing distortion. Here’s how to do it correctly:
Re-ID models (like OSNet or Mars-small128) are trained on fixed aspect ratios (e.g., 2:1 for pedestrians). Downsizing must maintain this ratio to avoid stretching.
```python
import cv2

def resize_crop(crop, target_height=64, target_width=32):
    # Original crop: e.g., 128x64 (HxW)
    h, w = crop.shape[:2]

    # Calculate new dimensions (maintain aspect ratio)
    scale = min(target_height / h, target_width / w)
    new_h, new_w = int(h * scale), int(w * scale)

    # Resize with anti-aliasing
    resized = cv2.resize(crop, (new_w, new_h), interpolation=cv2.INTER_AREA)

    # Pad to target size (if needed)
    delta_h = target_height - new_h
    delta_w = target_width - new_w
    top = delta_h // 2
    bottom = delta_h - top
    left = delta_w // 2
    right = delta_w - left

    # Pad with mean pixel values (or zero)
    padded = cv2.copyMakeBorder(
        resized, top, bottom, left, right,
        cv2.BORDER_CONSTANT, value=(128, 128, 128)  # Gray padding
    )
    return padded
```
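A quick sanity check, assuming a hypothetical `frame` array and detector box `(x1, y1, x2, y2)` (both illustrative, not from your pipeline):

```python
import numpy as np

# Hypothetical frame and detection box, for illustration only
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
x1, y1, x2, y2 = 100, 50, 164, 178  # a 128x64 pedestrian crop

crop = frame[y1:y2, x1:x2]
out = resize_crop(crop, target_height=64, target_width=32)
print(out.shape)  # (64, 32, 3) -- always the target size, regardless of input
```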
Why This Works:
- The `min(...)` scale factor fits the crop inside the target box, so the 2:1 aspect ratio is preserved rather than stretched.
- `INTER_AREA` interpolation (best for downsizing) averages source pixels instead of skipping them, which reduces aliasing.
- Constant gray padding fills the leftover margin, so the output is always exactly the target size.

After resizing, ensure pixel values match the model's training preprocessing:
```python
import numpy as np

def preprocess(crop):
    # Resize (as above)
    resized = resize_crop(crop, target_height=64, target_width=32)

    # Normalize to [0, 1] or model-specific range (e.g., -1 to 1)
    normalized = resized.astype("float32") / 255.0

    # Convert to channel-first (if needed, e.g., PyTorch)
    normalized = np.transpose(normalized, (2, 0, 1))  # [C, H, W]
    return normalized
```
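A minimal batching sketch for PyTorch-style models, using random crops as stand-ins for real detections (illustrative only; your crops come from the detector):

```python
import numpy as np
import torch

# Dummy detections, for illustration only
crops = [np.random.randint(0, 256, (128, 64, 3), dtype=np.uint8) for _ in range(8)]

batch = torch.from_numpy(np.stack([preprocess(c) for c in crops]))
print(batch.shape)  # torch.Size([8, 3, 64, 32]) -- ready to pass to your Re-ID model
```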
| Resolution | Speed | Re-ID Accuracy | Use Case |
|---|---|---|---|
| Original (e.g., `128x64`) | Slow | High | High-end devices |
| Half-size (`64x32`) | 2x faster | Slight drop (~5% mAP) | Mobile/edge |
| Quarter-size (`32x16`) | 4x faster | Significant drop | Only for large objects |
Rule of Thumb: start with `64x32`; per the table above, it is roughly 2x faster with only a slight accuracy drop.
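To pick the right point on this trade-off for your own hardware, a rough timing harness sketch (the real gains come from the model forward pass, so plug your Re-ID model into the commented line and keep it inside the timed region):

```python
import time
import numpy as np

# Dummy crops standing in for detector output, for illustration only
crops = [np.random.randint(0, 256, (200, 100, 3), dtype=np.uint8) for _ in range(100)]

for th, tw in [(128, 64), (64, 32), (32, 16)]:
    start = time.perf_counter()
    batch = np.stack([resize_crop(c, target_height=th, target_width=tw) for c in crops])
    # embeddings = model(batch)  # plug in your Re-ID model here to time end-to-end
    elapsed = (time.perf_counter() - start) * 1000
    print(f"{th}x{tw}: {len(crops)} crops in {elapsed:.1f} ms")
```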