If I understand correctly, you need the applause and other noise cut from the audio, with all the other voices staying where they are. Should the video stay as it is on the parts where the audio is removed, or should I cut the video on those parts too?
Either way is no problem, I just want to be sure.
I can do it right away.
I can do it in any program you like, for example After Effects, Premiere Pro, Camtasia or Final Cut Pro.
I have put down 5 hours, because this video needs 2-3 hours of rendering,