A Cheat Sheet For Multi-Object Tracking


Abstract

unfair to the re-ID or object tracking task. FairMOT treats the object detection and re-ID tasks equally.<figure id="1ecd"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/0*Hq6YRB6T4Iggx-r8.png"><figcaption>FairMOT (Source: <a href="https://arxiv.org/pdf/2004.01888.pdf">FairMOT</a>)</figcaption></figure>The input image is fed to an encoder-decoder network to extract high-resolution feature maps. FairMOT then adds two homogeneous branches, one for detecting objects and one for extracting re-ID features, to obtain a good trade-off between detection and re-ID. Read this <a href="https://readmedium.com/9fd6249a76b6">article</a> for a detailed understanding of different MOT algorithms.<h1 id="81e1">ByteTrack Algorithm</h1><blockquote id="ce51"><a href="https://readmedium.com/86f1f3632a85">ByteTrack</a> performs MOT on a video using the high-performance detector <a href="https://arshren.medium.com/yolox-new-improved-yolo-d430c0e4cf20">YOLOX</a> and associates the detection boxes with the tracks using BYTE.</blockquote>BYTE keeps all detection boxes and separates them into high-score boxes (Dʰᶦᵍʰ) and low-score boxes (Dˡᵒʷ). BYTE uses a Kalman filter to predict the location in the current frame of each track in T. The first association in BYTE is performed between the high-score detection boxes Dʰᶦᵍʰ and all the tracklets. 
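The score split and the first association can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not ByteTrack's implementation: the thresholds (`high_thresh`, `low_thresh`, `iou_thresh`) are illustrative values, the track boxes are assumed to be already predicted by a Kalman filter (which is not implemented here), IoU is assumed as the similarity, and a simple greedy matcher stands in for the assignment step. All function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def split_by_score(dets: List[Detection], high_thresh: float = 0.6,
                   low_thresh: float = 0.1):
    """Split detections into D_high and D_low; thresholds are assumed values."""
    d_high = [d for d in dets if d[1] >= high_thresh]
    d_low = [d for d in dets if low_thresh <= d[1] < high_thresh]
    return d_high, d_low


def first_association(track_boxes: List[Box], d_high: List[Detection],
                      iou_thresh: float = 0.3):
    """Greedy IoU matching (a simplification) of predicted tracks to D_high."""
    candidates = sorted(
        ((iou(t, d[0]), ti, di)
         for ti, t in enumerate(track_boxes)
         for di, d in enumerate(d_high)),
        reverse=True,
    )
    matches, used_tracks, used_dets = [], set(), set()
    for score, ti, di in candidates:
        if score < iou_thresh:
            break  # all remaining pairs overlap too little
        if ti in used_tracks or di in used_dets:
            continue
        matches.append((ti, di))
        used_tracks.add(ti)
        used_dets.add(di)
    unmatched_tracks = [ti for ti in range(len(track_boxes)) if ti not in used_tracks]
    unmatched_dets = [di for di in range(len(d_high)) if di not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


# Boxes of two tracks, assumed to be Kalman-predicted for the current frame.
predicted_tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [((1, 1, 11, 11), 0.9), ((21, 19, 31, 29), 0.3), ((50, 50, 60, 60), 0.05)]

d_high, d_low = split_by_score(detections)
matches, unmatched_tracks, unmatched_dets = first_association(predicted_tracks, d_high)
print(matches, unmatched_tracks, unmatched_dets)
```

In this toy frame the second track finds no high-score box, so it stays unmatched after the first association, while the 0.3-score box lands in the low-score set instead of being discarded.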
Similarity for the first association is computed using IoU or the Re-ID feature distances between the detection boxes Dʰᶦᵍʰ and the predicted boxes of the tracks T. Some tracklets remain unmatched because no high-score detection box Dʰᶦᵍʰ fits them, which happens under occlusion, motion blur, or size change.<figure id="0c15"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/0*B95LMEDCIxNkFS_M.png"><figcaption>Inspired by <a href="https://arshren.medium.com/bytetrack-a-simple-yet-effective-multi-object-tracking-technique-86f1f3632a85">ByteTrack: A Simple Yet Effective Multi-Object Tracking Technique</a></figcaption></figure>The second association is then performed between the low-score detection boxes Dˡᵒʷ and the remaining unmatched tracklets (Tʳᵉᵐᵃᶤⁿ) to recover the objects in low-score detection boxes and filter out the background. The tracks that are still unmatched are kept in Tʳᵉ-ʳᵉᵐᵃᶤⁿ, and all unmatched low-score detection boxes are deleted because they are considered background.<h2 id="d239">Characteristics of MOT Evaluation Metrics</h2>MOT evaluation metrics need to exhibit two significant properties:<ol><li>MOT evaluation metrics need to address five error types in MOT: False Negatives (FN), False Positives (FP), Fragmentation, Mergers (ID Switch), and Deviation.</li></ol><figure id="02c8"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*rsXOUXkenRpa1PrhaXUuHQ.png"><figcaption>Source: <a href="https://arxiv.org/pdf/1603.00831.pdf">MOT16: A Benchmark for Multi-Object Tracking</a></figcaption></figure>2. MOT evaluation metrics should be monotonic, and the error types should be differentiable, so that the metrics convey the tracker’s performance on each of the five basic error types.<h2 id="44d8">Commonly used MOT evaluation metrics</h2><h2 id="ed70">Track-mAP</h2>Track-mAP performs both matching and association at a trajectory level and is biased toward measuring association. 
It operates on confidence-ranked potential tracking results. Track-mAP is non-monotonic in detection.<h2 id="63e2">Multi-Object Tracking Accuracy: MOTA</h2>MOTA is the most widely used metric and closely represents human visual assessment. In MOTA, matching is done at a detection level. Association is measured in MOTA using Identity Switches (IDSW), which occur when a tracker wrongfully swaps object identities or when a track is lost and reinitialized with a different identity. MOTA measures three types of tracking errors: False Positives, False Negatives, and ID Switches.<figure id="5e75"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*dsPQ4VJ9BxLuvvrf9gakAw.png"><figcaption></figcaption></figure><h2 id="f788">The Identification Metrics: IDF1</h2>IDF1 emphasizes association accuracy rather than detection. IDF1 uses IDTP (Identity True Positives), where prID is matched with gtID on matched trajectories when S ≥ α. IDF1 is the ratio of correctly identified detections over the average number of ground-truth and computed detections. The Hungarian algorithm selects which trajectories to match so as to minimize the sum of IDFP and IDFN.<figure id="83f8"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*Gg9-Y7U4kB0NP--RF3388A.png"><figcaption>A tracking example displaying the single best trajectory matching performed by IDF1 (Source: <a href="https://arxiv.org/pdf/2009.07736v2.pdf">HOTA: A Higher-Order Metric for Evaluating Multi-Object Tracking</a>)</figcaption></figure>
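To make the MOTA and IDF1 definitions above concrete, here is a small Python sketch that computes both from error counts. It assumes the standard CLEAR MOT form of MOTA, 1 - (FN + FP + IDSW) / GT, and writes the IDF1 ratio described above as 2·IDTP / (2·IDTP + IDFP + IDFN). The function names and the example counts are made up for illustration, and the counts are assumed to come from an earlier matching step that is not shown.

```python
def mota(fn: int, fp: int, idsw: int, num_gt: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, with counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (fn + fp + idsw) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Correctly identified detections over the average of GT and computed detections."""
    num_gt = idtp + idfn        # ground-truth detections
    num_computed = idtp + idfp  # detections produced by the tracker
    average = (num_gt + num_computed) / 2
    return idtp / average if average > 0 else 0.0


# Illustrative counts only (not from any benchmark).
mota_score = mota(fn=10, fp=5, idsw=2, num_gt=100)
idf1_score = idf1(idtp=80, idfp=10, idfn=20)
print(f"MOTA={mota_score:.3f}, IDF1={idf1_score:.3f}")
```

Note that MOTA can go negative when the tracker makes more errors than there are ground-truth detections, whereas IDF1 always stays between 0 and 1.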

IDF1 combines IDP (ID Precision) and IDR (ID Recall).<figure id="c47b"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*VHrLjbva-VdBFjc_8NkFmQ.png"><figcaption></figcaption></figure><h2 id="dd23">Higher-Order Tracking Accuracy: HOTA</h2>HOTA is a single unified metric for ranking trackers. HOTA can be decomposed into five components, each corresponding to one error type: Detection Recall, Detection Precision, Association Recall, Association Precision, and Localization Accuracy. As a result, HOTA’s error types are differentiable and the metric is strictly monotonic, providing information about the tracker’s performance on each of the basic error types. HOTA tracking errors are categorized into Detection errors, Association errors, and Localization errors.<ol><li>Detection errors occur when a tracker predicts detections that don’t exist in the ground truth or fails to predict detections that are in the ground truth. Detection errors are further categorized into errors of detection recall (measured by FNs) and detection precision (measured by FPs).</li><li>Association errors occur when a tracker assigns the same prID to two detections with different gtIDs or assigns different prIDs to two detections that should have the same gtID. 
Association errors are further categorized into errors of association recall (measured by FNAs) and association precision (measured by FPAs).</li><li>Localization errors occur when prDets are not perfectly spatially aligned with gtDets.</li></ol><figure id="13a8"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*EGmDDLNOUjl7jybIj08zQQ.png"><figcaption>Source: <a href="https://arxiv.org/pdf/2009.07736v2.pdf">HOTA: A Higher-Order Metric for Evaluating Multi-Object Tracking</a></figcaption></figure>MOTA performs both matching and association scoring at a local detection level but accentuates detection accuracy, whereas IDF1 performs both at a trajectory level and emphasizes the effect of association. Track-mAP is similar to IDF1 in that it performs both matching and association at a trajectory level and is biased toward measuring association. HOTA balances the two: it is an explicit combination of a detection score and an association score, performing matching at the detection level while scoring association globally over trajectories.<figure id="1bcf"><img src="https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*EtmSc3kmWG0KiWitKakWKQ.png"><figcaption>An overview of different evaluation metrics for MOT (Source: <a href="https://arxiv.org/pdf/2009.07736v2.pdf">HOTA: A Higher-Order Metric for Evaluating Multi-Object Tracking</a>)</figcaption></figure>Read <a href="https://arshren.medium.com/evaluation-metrics-for-multiple-object-tracking-7b26ef23ef5f">this</a> article for a detailed understanding of different MOT evaluation metrics.<h2 id="2eb3">References:</h2>
<a href="https://arxiv.org/pdf/2110.06864.pdf">ByteTrack: Multi-Object Tracking by Associating Every Detection Box, by Yifu Zhang, Peize Sun, Yi Jiang, Dongdong Yu, Fucheng Weng, Zehuan Yuan, Ping Luo, Wenyu Liu, Xinggang Wang</a>
<a href="https://www.kalmanfilter.net/modeling.html">Online Kalman Filter Tutorial (www.kalmanfilter.net)</a>
<a href="https://arxiv.org/pdf/1602.00763.pdf">Simple Online and Realtime Tracking, by Alex Bewley</a>
<a href="https://arxiv.org/pdf/1703.07402.pdf">Simple Online and Realtime Tracking with a Deep Association Metric</a>
<a href="http://mdpi.com/2076-3417/12/3/1319">Sort and Deep-SORT Based Multi-Object Tracking for Mobile Robotics: Evaluation with New Data Association Metrics</a>
<a href="https://arxiv.org/pdf/2004.01888.pdf">FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking, by Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu</a>
<a href="https://arxiv.org/pdf/2009.07736v2.pdf">HOTA: A Higher-Order Metric for Evaluating Multi-Object Tracking</a>
<a href="https://autonomousvision.github.io/hota-metrics/">How to evaluate tracking with the HOTA metrics</a>
<a href="https://arxiv.org/pdf/1603.00831.pdf">MOT16: A Benchmark for Multi-Object Tracking</a>
<a href="https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.367.6279&rep=rep1&type=pdf">Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics</a>
<a href="https://www.idiap.ch/~odobez/publications/SmithGaticaOdobezBa-cvpr-eemcv05.pdf">Evaluating Multi-Object Tracking</a>
<a href="https://arshren.medium.com/an-introduction-to-object-tracking-9fd6249a76b6">An Introduction to Object Tracking</a>
<a href="https://arshren.medium.com/bytetrack-a-simple-yet-effective-multi-object-tracking-technique-86f1f3632a85">ByteTrack: A Simple Yet Effective Multi-Object Tracking Technique</a>
<a href="https://arshren.medium.com/evaluation-metrics-for-multiple-object-tracking-7b26ef23ef5f">Evaluation Metrics for Multiple Object Tracking</a></article></body>