Playing with Vision Embeddings

(prestonbjensen.com)

24 points | by prestoj 2 days ago

2 comments

  • jcattle 33 minutes ago
    Very nice visualizations, thanks for that!

    One thing I still struggle with in my head is how these vision embeddings can then be used to give LLMs eyes.

    Because you somehow need a giant training set which describes images in natural language, no? Is that actually how it works, or is there some smart trick so you don't need to pay labellers a bunch of money to look at pictures and describe them?