“ImageBind is part of Meta’s efforts to create multimodal AI systems that learn from all possible types of data around them. As the number of modalities increases, ImageBind opens the floodgates for researchers to try to develop new, holistic systems, such as combining 3D and IMU sensors to design or experience immersive, virtual worlds. ImageBind could also provide a rich way to explore memories — searching for pictures, videos, audio files or text messages using a combination of text, audio, and image. ”
Source : ImageBind: Holistic AI learning across six modalities