MIT traffic data set is for research on activity analysis and crowded scenes. It includes a traffic video sequence of 90 minutes long. It is recorded by a stationary camera. The size of the scene is 720 by 480. It is divided into 20 clips and can be downloaded from the following links.
In order to evaluate the performance of human detection on this data set, ground truth of pedestrians of some sampled frames are manually labeled. It can be downloaded below. A readme file provides the instructions of how to use it.
Ground truth of pedestrians