To address the challenges in generating multimodal instruction data, we developed ProVision, a scalable, programmatic framework that employs scene graphs and human-written programs to systematically synthesize vision-centric instruction data.
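As a rough illustration of the idea, the sketch below shows how a hand-written program might turn a scene graph into question-answer pairs. It is not ProVision's actual code or API: the `SceneGraph` class, the two generator functions, and the `synthesize` helper are hypothetical names, and the example image content is made up.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class SceneGraph:
    """Hypothetical scene graph: objects with attributes, plus relations."""
    # object id -> {"name": str, "attributes": [str, ...]}
    objects: Dict[str, dict]
    # (subject id, predicate, object id)
    relations: List[Tuple[str, str, str]]


def attribute_questions(graph: SceneGraph) -> List[dict]:
    """Emit one QA pair for every (object, attribute) pair in the graph."""
    pairs = []
    for obj in graph.objects.values():
        for attr in obj["attributes"]:
            pairs.append({
                "question": f"What is one attribute of the {obj['name']}?",
                "answer": attr,
            })
    return pairs


def relation_questions(graph: SceneGraph) -> List[dict]:
    """Emit one QA pair for every relation edge in the graph."""
    pairs = []
    for subj_id, predicate, obj_id in graph.relations:
        subj = graph.objects[subj_id]["name"]
        obj = graph.objects[obj_id]["name"]
        pairs.append({
            "question": f"What is the relation between the {subj} and the {obj}?",
            "answer": predicate,
        })
    return pairs


def synthesize(graph: SceneGraph,
               generators: List[Callable[[SceneGraph], List[dict]]]) -> List[dict]:
    """Run every generator program over the graph and collect the results."""
    data = []
    for generate in generators:
        data.extend(generate(graph))
    return data


# Made-up scene graph for a single image.
example = SceneGraph(
    objects={
        "o1": {"name": "dog", "attributes": ["brown"]},
        "o2": {"name": "sofa", "attributes": ["red", "large"]},
    },
    relations=[("o1", "sitting on", "o2")],
)

instructions = synthesize(example, [attribute_questions, relation_questions])
for item in instructions:
    print(item["question"], "->", item["answer"])
```

Because each generator is an ordinary function over the graph, adding a new question type means writing one more function and appending it to the list, which is what makes this style of synthesis scale with the number of annotated images.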
From new open-source models to evaluation frameworks, our AI Research team has been moving the needle in AI. Take a look at some of our 2024 highlights.