Model Card: Whisper
This is the official codebase for running the Whisper automatic speech recognition (ASR) models trained and released by OpenAI.
Following Model Cards for Model Reporting (Mitchell et al.), we're providing some information about the automatic speech...
Find all the people in a video,
Count how many there are in shot,
Count how many there have been in total (with no duplication when someone leaves and re-enters), and
Track the movements of each individual person.
Get the output as annotated images or videos, and as JSON metadata.
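The counting and metadata steps above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: it assumes an upstream tracker has already assigned each person a stable ID that survives leaving and re-entering the shot, and the function name `summarize_tracks`, the input layout, and the JSON field names are all hypothetical.

```python
import json


def summarize_tracks(frames):
    """Summarize per-frame detections.

    `frames` is a list with one entry per video frame; each entry is a list of
    (track_id, (x, y)) tuples. The track IDs are assumed to be stable, so a
    person who leaves and re-enters keeps the same ID (hypothetical input).
    """
    seen = set()    # every track ID observed so far, for the no-duplication total
    paths = {}      # track ID -> list of positions over time
    per_frame = []  # in-shot count for each frame

    for index, detections in enumerate(frames):
        ids_in_shot = sorted({track_id for track_id, _ in detections})
        seen.update(ids_in_shot)
        for track_id, position in detections:
            paths.setdefault(track_id, []).append(
                {"frame": index, "position": list(position)}
            )
        per_frame.append(
            {"frame": index, "in_shot": len(ids_in_shot), "ids": ids_in_shot}
        )

    return {"total_unique": len(seen), "frames": per_frame, "paths": paths}


# Person 1 leaves in the second frame and re-enters in the third.
frames = [
    [(1, (10, 20)), (2, (50, 60))],
    [(2, (52, 61))],
    [(1, (12, 22)), (2, (55, 63)), (3, (90, 15))],
]
summary = summarize_tracks(frames)
metadata_json = json.dumps(summary, indent=2)
```

Because the total is the size of a set of IDs, a person who re-enters the shot is not counted twice, provided the tracker re-identifies them under the same ID.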