Associate-Developer-Apache-Spark Exam Dumps | Which of the following code blocks creates a new DataFrame with two columns season and wind_speed

Question 40/63

Which of the following code blocks creates a new DataFrame with two columns season and wind_speed_ms where column season is of data type string and column wind_speed_ms is of data type double?

A. spark.DataFrame({"season": ["winter","summer"], "wind_speed_ms": [4.5, 7.5]})

B. spark.createDataFrame([("summer", 4.5), ("winter", 7.5)], ["season", "wind_speed_ms"])

C. from pyspark.sql import types as T
spark.createDataFrame((("summer", 4.5), ("winter", 7.5)), T.StructType([T.StructField("season", T.CharType()), T.StructField("season", T.DoubleType())]))

D. spark.newDataFrame([("summer", 4.5), ("winter", 7.5)], ["season", "wind_speed_ms"])

E. spark.createDataFrame({"season": ["winter","summer"], "wind_speed_ms": [4.5, 7.5]})

Correct Answer: B

Explanation
spark.createDataFrame([("summer", 4.5), ("winter", 7.5)], ["season", "wind_speed_ms"])
Correct. This command uses the SparkSession's createDataFrame method to create a new DataFrame. Notice how rows, columns, and column names are passed in: the rows are specified as a Python list, and every entry in the list is one row. Each row is a Python tuple (for example ("summer", 4.5)), and every entry in the tuple is the value for one column.
The column names are specified as the second argument to createDataFrame(). The documentation (see "More info" below) states that "when schema is a list of column names, the type of each column will be inferred from data" (the first argument). Since 4.5 and 7.5 are both Python floats, Spark infers the double type for column wind_speed_ms. Since all values in column season are strings, Spark infers the string type for that column.
Find out more about SparkSession.createDataFrame() via the link below.
spark.newDataFrame([("summer", 4.5), ("winter", 7.5)], ["season", "wind_speed_ms"])
No, the SparkSession does not have a newDataFrame method.
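from pyspark.sql import types as T
spark.createDataFrame((("summer", 4.5), ("winter", 7.5)), T.StructType([T.StructField("season", T.CharType()), T.StructField("season", T.DoubleType())]))
No. In PySpark 3.1, the version covered by the documentation linked below, pyspark.sql.types does not have a CharType type. Note also that both fields in this schema are named "season", so the result would not contain a wind_speed_ms column in any case. See the link below for the available data types in Spark.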
spark.createDataFrame({"season": ["winter","summer"], "wind_speed_ms": [4.5, 7.5]})
No, this is not correct Spark syntax. If you considered this option to be correct, you may have some experience with Python's pandas package, where passing a dictionary of columns is correct syntax. To create a Spark DataFrame from a pandas DataFrame, you can use spark.createDataFrame(pandasDf), where pandasDf is the pandas DataFrame.
Find out more about Spark syntax options using the examples in the documentation for SparkSession.createDataFrame linked below.
spark.DataFrame({"season": ["winter","summer"], "wind_speed_ms": [4.5, 7.5]})
No, the SparkSession (indicated by spark in the code above) does not have a DataFrame method.
More info: pyspark.sql.SparkSession.createDataFrame - PySpark 3.1.1 documentation and Data Types - Spark 3.1.2 Documentation