https://github.com/JusperLee/Looking-to-Listen-at-the-Cocktail-Party
http://sound-of-pixels.csail.mit.edu/
https://andrewowens.com/multisensory/
- fully define technical requirements for deployment
- figure out a metric for user experience (SNR)
- → how well it works, in engineering terms
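A minimal sketch of the SNR metric mentioned above, assuming a clean reference signal and a separated estimate are available as equal-length sample sequences. The function name `snr_db` and the sample values are illustrative only, not part of any of the linked projects.

```python
import math


def snr_db(reference, estimate):
    """Return the SNR in dB of `estimate` measured against a clean `reference`.

    Noise is defined here as the sample-wise difference between the two
    signals (an assumption; other definitions exist).
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if noise_power == 0:
        return math.inf   # estimate is identical to the reference
    if signal_power == 0:
        return -math.inf  # silent reference, everything counts as noise
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical samples: the estimate is the reference scaled by 0.9.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
print(round(snr_db(reference, estimate), 2))
```

Comparing the SNR of the raw mixture with the SNR of the separated output, both against the same reference, would give an improvement figure in dB.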
Technical evaluation:
- use Weka to make MATCH
- → once visual attention + audio attention both match (to a certain threshold) → provide augmentation option
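The gating rule above could be sketched as follows, assuming each attention match is already available as a score in [0, 1]. The function name, the score scale, and the default thresholds of 0.5 are all placeholders, not values from these notes.

```python
def should_offer_augmentation(visual_score, audio_score,
                              visual_threshold=0.5, audio_threshold=0.5):
    """Return True only when both attention scores reach their thresholds.

    Scores and thresholds are assumed to lie in [0, 1]; the defaults are
    placeholders to be tuned during technical evaluation.
    """
    return visual_score >= visual_threshold and audio_score >= audio_threshold


# Hypothetical scores: both modalities agree, so the option is offered.
print(should_offer_augmentation(0.8, 0.7))
# Only the visual score passes, so the option is withheld.
print(should_offer_augmentation(0.8, 0.3))
```

Keeping a separate threshold per modality lets each one be tuned independently if one signal turns out to be noisier than the other.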