rohit.vision

Object Detection

R-CNN family, YOLO variants, DETR, anchor-free detectors, and open-vocabulary detection

1. Selective Search (WIP)
   Region proposal algorithm for object detection
2. OverFeat
   Sliding window + bbox regression + classification
3. R-CNN Family
   R-CNN, SPPNet, Fast R-CNN, Faster R-CNN: the evolution of region-based detectors
4. YOLO
   You Only Look Once: single-shot grid-based object detection
5. SSD (WIP)
   Single Shot MultiBox Detector
6. RetinaNet
   Focal loss for dense object detection, addressing class imbalance
7. CornerNet
   Anchor-free detection using top-left and bottom-right corner heatmaps
8. CenterNet
   Center-based keypoint detection using corners as proposals
9. ExtremeNet
   Object detection via extreme point prediction
10. DETR
    End-to-end object detection with transformers using bipartite matching
11. Swin Transformer (WIP)
    Hierarchical vision transformer with shifted windows
12. YOLOE (WIP)
    Efficient YOLO variant
13. YOLO-World
    Open-vocabulary YOLO with vision-language modeling and RepVL-PAN
14. DINO & Grounding DINO (WIP)
    DINO (DETR with Improved deNoising anchOr boxes) and open-set grounding with Grounding DINO
15. OWLv2 (WIP)
    Open-World Localization with vision-language models
16. InternImage (WIP)
    Large-scale vision foundation model with deformable convolutions

© 2026 Rohit Kumar. rohit.vision