OWLv2 (WIP)

Tags: cv, computer-vision, object-detection, open-vocabulary, owl, transformer, vision-language

Open-World Localization with vision-language models

Overview

TODO: Add content