Correct Answer: D
Explanation
Tasks in a stage may be executed by multiple machines at the same time.
This is correct. Within a single stage, tasks do not depend on each other. Executors on multiple machines may simultaneously execute tasks belonging to the same stage, each on the partitions it holds.
Different stages in a job may be executed in parallel.
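No. Stages in a job generally depend on each other: a stage cannot start until the stage that produces its input has finished, so they run one after another rather than in parallel. The nuance is that the tasks within a single stage may be executed in parallel by multiple machines.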
For example, if a job consists of Stage A and Stage B, where Stage B consumes the output of Stage A, tasks from Stage A and tasks from Stage B may not run at the same time. However, tasks from Stage A may be executed on multiple machines at the same time, with each machine running its task on a different partition of the same dataset. Once Stage A has finished, tasks from Stage B may likewise be executed on multiple machines at the same time.
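The Stage A / Stage B ordering can be imitated with a small plain-Python sketch. It is not Spark code: the thread pool stands in for executors, and run_stage, shuffle, stage_a_task, and stage_b_task are hypothetical names chosen for illustration. It assumes Stage B reads the shuffled output of Stage A.

```python
from concurrent.futures import ThreadPoolExecutor


def run_stage(task, partitions, max_workers=4):
    """Run one task per partition concurrently; return once every task is done."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input partitions.
        return list(pool.map(task, partitions))


def stage_a_task(partition):
    # Narrow work: each task only touches its own partition.
    return [x + 1 for x in partition]


def stage_b_task(partition):
    # Runs only on the repartitioned (shuffled) data.
    return sum(partition)


def shuffle(partitions, num_partitions):
    """Toy shuffle: redistribute records by value modulo num_partitions."""
    out = [[] for _ in range(num_partitions)]
    for part in partitions:
        for x in part:
            out[x % num_partitions].append(x)
    return out


partitions = [[1, 2], [3, 4], [5, 6]]

# Stage A: three tasks may run at the same time, one per partition.
stage_a_out = run_stage(stage_a_task, partitions)

# Stage boundary: the shuffle needs the complete output of Stage A.
shuffled = shuffle(stage_a_out, num_partitions=2)

# Stage B: starts only after Stage A has finished; its tasks again run in parallel.
stage_b_out = run_stage(stage_b_task, shuffled)
print(stage_b_out)
```

The call to run_stage for Stage B cannot begin until the call for Stage A has returned, which mirrors the sequential relationship between dependent stages, while the tasks inside each call run concurrently.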
Stages may contain multiple actions, narrow, and wide transformations.
No, stages may not contain multiple wide transformations, and they do not contain actions either: an action triggers a job, which is then split into stages. A wide transformation requires a shuffle, and a shuffle typically ends a stage because data has to be exchanged across the cluster. This exchange changes how records are distributed across partitions, so the tasks that follow cannot start until the shuffle has completed and are therefore placed in a new stage.
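As a rough illustration of how shuffles cut a chain of transformations into stages, here is a plain-Python sketch. It is a simplification, not how Spark's scheduler is implemented: split_into_stages and the WIDE set are hypothetical, and the sketch places each wide transformation at the start of the stage that follows its shuffle.

```python
# Illustrative (not exhaustive) set of transformations that require a shuffle.
WIDE = {"groupBy", "join", "repartition", "distinct", "orderBy"}


def split_into_stages(transformations):
    """Group transformation names into stages, starting a new stage at each shuffle."""
    stages = [[]]
    for name in transformations:
        if name in WIDE:
            # The shuffle ends the current stage; work after it belongs to a new one.
            stages.append([name])
        else:
            # Narrow transformations are pipelined into the current stage.
            stages[-1].append(name)
    # Drop an empty leading stage if the plan started with a wide transformation.
    return [stage for stage in stages if stage]


plan = ["filter", "select", "groupBy", "withColumn", "orderBy", "select"]
stages = split_into_stages(plan)
for i, stage in enumerate(stages):
    print(f"Stage {i}: {stage}")
```

Under this model, every stage ends up with at most one wide transformation, followed by any number of narrow ones.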
Stages ephemerally store transactions, before they are committed through actions.
No, this does not make sense. Stages do not "store" any data. Transactions are not "committed" in Spark.
Stages consist of one or more jobs.
No, it is the other way around: jobs consist of one or more stages.
More info: Spark: The Definitive Guide, Chapter 15.