Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
623 views
in Technique[技术] by (71.8m points)

scala - Create new Dataframe with empty/null field values

I am creating a new Dataframe from an existing dataframe, but need to add new column ("field1" in below code) in this new DF. How do I do so? Working sample code example will be appreciated.

val edwDf = omniDataFrame 
  .withColumn("field1", callUDF((value: String) => None)) 
  .withColumn("field2",
    callUdf("devicetypeUDF", (omniDataFrame.col("some_field_in_old_df")))) 

edwDf
  .select("field1", "field2")
  .save("odsoutdatafldr", "com.databricks.spark.csv"); 
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

It is possible to use lit(null):

import org.apache.spark.sql.functions.{lit, udf}

case class Record(foo: Int, bar: String)
val df = Seq(Record(1, "foo"), Record(2, "bar")).toDF

val dfWithFoobar = df.withColumn("foobar", lit(null: String))

One problem here is that the column type is null:

scala> dfWithFoobar.printSchema
root
 |-- foo: integer (nullable = false)
 |-- bar: string (nullable = true)
 |-- foobar: null (nullable = true)

and it is not retained by the csv writer. If it is a hard requirement you can cast column to the specific type (lets say String), with either DataType

import org.apache.spark.sql.types.StringType

df.withColumn("foobar", lit(null).cast(StringType))

or string description

df.withColumn("foobar", lit(null).cast("string"))

or use an UDF like this:

val getNull = udf(() => None: Option[String]) // Or some other type

df.withColumn("foobar", getNull()).printSchema
root
 |-- foo: integer (nullable = false)
 |-- bar: string (nullable = true)
 |-- foobar: string (nullable = true)

A Python equivalent can be found here: Add an empty column to spark DataFrame


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...