NCA-GENL Exam Dumps | In transformer-based LLMs, how does the use of multi-head attention improve model performance compared

Question 19/23

In transformer-based LLMs, how does the use of multi-head attention improve model performance compared to single-head attention, particularly for complex NLP tasks?

A. Multi-head attention reduces the model's memory footprint by sharing weights across heads.

B. Multi-head attention allows the model to focus on multiple aspects of the input sequence simultaneously.

C. Multi-head attention eliminates the need for positional encodings in the input sequence.

D. Multi-head attention simplifies the training process by reducing the number of parameters.
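
For readers who want to see the mechanism the options refer to, here is a minimal sketch of multi-head self-attention in pure Python. It is an illustration under stated assumptions, not a reference implementation: the helper names (`random_matrix`, `attention`, `multi_head_attention`), the toy sizes, and the random (untrained) projection weights are all hypothetical. Each head gets its own query, key, and value projections of width `d_model / num_heads`, computes its own attention weights over the same sequence, and the head outputs are concatenated and mixed by an output projection.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights: Gaussian random values."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def attention(q, k, v):
    """Scaled dot-product attention; returns (output, attention weights)."""
    d_k = len(q[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def multi_head_attention(x, num_heads, rng):
    """Run num_heads independent attention heads over x and merge them."""
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    head_outputs, head_weights = [], []
    for _ in range(num_heads):
        # Each head has its own projections (assumed random, not trained).
        w_q = random_matrix(d_model, d_head, rng)
        w_k = random_matrix(d_model, d_head, rng)
        w_v = random_matrix(d_model, d_head, rng)
        out, weights = attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
        head_outputs.append(out)
        head_weights.append(weights)

    # Concatenate the per-token outputs of all heads, then mix them.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    w_o = random_matrix(d_model, d_model, rng)
    return matmul(concat, w_o), head_weights


rng = random.Random(0)
x = random_matrix(4, 8, rng)  # toy input: 4 tokens, d_model = 8
output, head_weights = multi_head_attention(x, num_heads=2, rng=rng)
```

In this sketch `head_weights[0]` and `head_weights[1]` are two separate 4 x 4 attention patterns over the same four tokens, one per head. With `d_head = d_model / num_heads`, the per-head projections together hold `3 * d_model * d_model` weights, the same count a single head of full width would use, so the split in this setup changes how the attention is organized rather than how many projection parameters there are.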
