Interpretation of the paper X-CLIP: Expanding Language-Image Pretrained Models for General Video Recognition

Origin blog.csdn.net/flyingluohaipeng/article/details/126648783