EPIC-KITCHENS-100 · Video-Text Retrieval Demo

Model: AVION ViT-L fine-tuned with Symmetric Multi-Similarity Loss

Score: 69.68 nDCG AVG on Codabench EK-100 Multi-Instance Retrieval

Type any kitchen action to retrieve the most relevant video clip narrations from the official EPIC-KITCHENS-100 challenge test set (9,668 video clips). Your query is embedded with the fine-tuned AVION text encoder, and results are ranked by cosine similarity between that embedding and the pre-encoded narration embeddings.
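The ranking step can be illustrated with a minimal sketch. It assumes the query and narrations are already embedded as plain lists of floats; the function names (`l2_normalize`, `rank_by_cosine`), the toy three-dimensional vectors, and the sample narrations are hypothetical and are not taken from the demo's actual code or from the AVION encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def rank_by_cosine(query_emb, narration_embs, top_k=5):
    """Return (index, score) pairs for the top_k narrations, best match first."""
    q = l2_normalize(query_emb)
    scored = []
    for idx, emb in enumerate(narration_embs):
        n = l2_normalize(emb)
        score = sum(a * b for a, b in zip(q, n))  # cosine similarity
        scored.append((idx, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): real embeddings would come from the text encoder.
narrations = ["open fridge", "cut onion", "wash plate"]
narration_embs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.2],
    [0.1, 0.1, 1.0],
]
query_emb = [0.1, 0.9, 0.1]  # stands in for the embedding of a typed query

results = rank_by_cosine(query_emb, narration_embs, top_k=2)
for idx, score in results:
    print(f"{score:.3f}  {narrations[idx]}")
```

In practice the narration embeddings would be normalized once ahead of time, so that each query needs only a single matrix-vector product against the stored embeddings.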


Top Matching Narrations from EK-100 Test Set

Dataset: EPIC-KITCHENS-100 · Annotations: epic-kitchens-100-annotations · Base Model: AVION pre-trained on Ego4D via LaViLa · Loss: SMS Loss (Wang et al., 2024)