Using “Crossmodal” search to index text, image, audio, and video.
Do you have a wide range of media that you want to make easily searchable through your new vector index?
Let's say you run an e-commerce store and your customers can submit text reviews, images, and videos of them using the products, and you want to make all of those searchable via a single interface.
Or perhaps you want your AI agents to be able to search them.
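Either way, you need “Crossmodal” embeddings. This means a single model maps every type of media into the same vector space, so one index can hold all of it and a query in one modality (say, text) can pull back results in another (say, video).
On Amazon Bedrock, one option is the amazon.nova-2-multimodal-embeddings-v1:0 model.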
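To make the "one index, many media types" idea a bit more concrete, here is a minimal sketch in Python. It does not call Bedrock at all: the three-number vectors are toy stand-ins for real embeddings, and the names `Item`, `cosine`, and `search` are made up for illustration. The only assumption is that whatever model you use returns vectors of the same length for every modality.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    modality: str          # "text", "image", "audio", or "video"
    vector: list           # embedding from the (hypothetical) crossmodal model


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query_vector, top_k=3):
    """Rank every item, regardless of modality, against one query vector."""
    scored = [(cosine(query_vector, item.vector), item) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real embeddings of a review, a photo, and a clip.
index = [
    Item("review-17", "text", [0.9, 0.1, 0.0]),
    Item("photo-42", "image", [0.8, 0.2, 0.1]),
    Item("clip-03", "video", [0.0, 0.1, 0.9]),
]

# Pretend this is the embedding of a customer's text query.
query = [1.0, 0.0, 0.0]
results = search(index, query, top_k=2)
for score, item in results:
    print(f"{item.item_id} ({item.modality}): {score:.3f}")
```

The point is that `search` never looks at `modality`: because everything lives in one vector space, a text query can surface a photo or a video just as easily as another piece of text. In a real setup you would swap the toy vectors for model output and the list for an actual vector index.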
Fun side note: I would bet you could index even more types of media.
For example, if you had a large collection of 3D models, or perhaps CAD or G-code files, you could probably encode those too, though I doubt Nova’s crossmodal/multimodal model has been trained on them… yet.
If anyone knows of a vector embedding model for 3D files, I would love to hear about it.