Today, we’re publicly releasing the Video Joint Embedding Predictive Architecture (V-JEPA) model, a crucial step in advancing machine intelligence with a more grounded understanding of the world.

This early example of a physical world model excels at detecting and understanding highly detailed interactions between objects.

In the spirit of responsible open science, we’re releasing this model under a Creative Commons NonCommercial license for researchers to further explore.
