NCA-GENM Exam Dumps


Question 29/192

You are working on a project that involves generating music from video. The approach uses a pre-trained video encoder and a pre-trained music decoder. You find that the generated music often lacks a clear connection to the visual content of the video. To improve the coherence between the video and the generated music, which of the following steps would be the MOST effective? (Select TWO)

A. Train the video encoder and music decoder separately on larger datasets.

B. Introduce a cross-modal attention mechanism to allow the music decoder to attend to relevant visual features during music generation.

C. Remove the video encoder and generate music directly from random noise.

D. Fine-tune the entire system end-to-end with a loss function that encourages temporal alignment between video and music features.

E. Only use videos that are shorter than 5 seconds.
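
Options B and D each name a concrete mechanism, and a small sketch may help make them tangible. The code below is an illustrative toy in plain Python, not a reference implementation: the function names (`cross_attention`, `temporal_alignment_loss`) are hypothetical, the learned query/key/value projections and multi-head structure of a real attention layer are omitted, and the alignment loss assumes (as a simplification) that video and music features have already been projected to the same dimension and resampled to the same number of time steps. A real system would implement both pieces in a deep learning framework so they can be trained with gradients.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(music_queries, video_keys, video_values):
    """Scaled dot-product cross-attention (option B, simplified).

    Each music-side query attends over all video frames and receives a
    weighted average of the video value vectors. Learned projections are
    omitted here for brevity (an assumption of this sketch).
    """
    scale = math.sqrt(len(video_keys[0]))
    contexts, weights = [], []
    for q in music_queries:
        w = softmax([dot(q, k) / scale for k in video_keys])
        ctx = [
            sum(wj * v[i] for wj, v in zip(w, video_values))
            for i in range(len(video_values[0]))
        ]
        contexts.append(ctx)
        weights.append(w)
    return contexts, weights


def temporal_alignment_loss(video_feats, music_feats, eps=1e-8):
    """Mean (1 - cosine similarity) over time-aligned steps (option D, simplified).

    Assumes both sequences have the same length and feature dimension.
    Lower values mean the two modalities agree more closely at each step.
    """
    assert len(video_feats) == len(music_feats), "sequences must be time-aligned"
    total = 0.0
    for v, m in zip(video_feats, music_feats):
        cos = dot(v, m) / (math.sqrt(dot(v, v)) * math.sqrt(dot(m, m)) + eps)
        total += 1.0 - cos
    return total / len(video_feats)


# Toy usage: two video frames, one music step whose query resembles frame 0.
frames = [[1.0, 0.0], [0.0, 1.0]]
contexts, weights = cross_attention([[10.0, 0.0]], frames, frames)
print(weights[0])  # attention weights over the two frames
print(temporal_alignment_loss(frames, frames))  # near zero for identical features
```

In this toy, the attention weights show which frames a given music step draws on, while the loss gives an end-to-end training signal that penalizes music features drifting away from the video features at the same time step.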


