How to convert a JSON string to a DataFrame in Spark

For Spark 2.2+:

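// spark is the SparkSession (predefined in spark-shell); the import enables .toDS
import spark.implicits._
val jsonStr = """{ "metadata": { "key": 84896, "value": 54 }}"""
// read.json accepts a Dataset[String] since Spark 2.2
val df = spark.read.json(Seq(jsonStr).toDS)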
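Outside spark-shell there is no predefined spark value, so the snippet above needs a session of its own. Here is a minimal sketch of a standalone version, assuming Spark 2.2+ on the classpath; the object name, the toDataFrame helper, the app name and the local[*] master are illustrative choices, not part of the original answer.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object JsonStringToDataFrame {
  // Hypothetical helper: turns in-memory JSON documents into a DataFrame
  // using the Dataset[String] overload of read.json (Spark 2.2+).
  def toDataFrame(spark: SparkSession, docs: Seq[String]): DataFrame = {
    import spark.implicits._ // provides the String encoder needed by toDS
    spark.read.json(docs.toDS())
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("json-string-to-df")
      .master("local[*]")
      .getOrCreate()

    val jsonStr = """{ "metadata": { "key": 84896, "value": 54 }}"""
    val df = toDataFrame(spark, Seq(jsonStr))

    // The schema is inferred from the JSON; metadata becomes a struct column.
    df.printSchema()
    // Nested fields can be selected with dot notation.
    df.select("metadata.key", "metadata.value").show()

    spark.stop()
  }
}
```

Each element of the sequence is treated as one JSON document, so passing several strings yields one row per string.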

For Spark 2.1.x:

// sc is the SparkContext; each element of the RDD is one JSON document
val events = sc.parallelize("""{"action":"create","timestamp":"2016-01-07T00:01:17Z"}""" :: Nil)
val df = sqlContext.read.json(events)

Hint: this uses the sqlContext.read.json(jsonRDD: RDD[String]) overload.
There is also sqlContext.read.json(path: String), which reads a JSON file directly.
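To make the difference between the two overloads concrete, here is a rough sketch that reads the same event once from an RDD[String] and once from a file. It assumes a Spark 2.x build where a SparkSession is available (sqlContext.read and spark.read return the same kind of reader); the object name, helper names and temp file are hypothetical. Note that the RDD[String] overload is deprecated from Spark 2.2 onward, and that the path overload by default expects one JSON document per line.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object JsonOverloads {
  // RDD[String] overload: every RDD element is parsed as one JSON document.
  def fromStrings(spark: SparkSession, docs: Seq[String]): DataFrame =
    spark.read.json(spark.sparkContext.parallelize(docs))

  // Path overload: by default every line of the file is one JSON document.
  def fromFile(spark: SparkSession, path: String): DataFrame =
    spark.read.json(path)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("json-overloads")
      .master("local[*]")
      .getOrCreate()

    val event = """{"action":"create","timestamp":"2016-01-07T00:01:17Z"}"""

    // Write the same document to a temporary file for the path-based read.
    val tmp = Files.createTempFile("events", ".json")
    Files.write(tmp, event.getBytes(StandardCharsets.UTF_8))

    fromStrings(spark, Seq(event)).show(false)
    fromFile(spark, tmp.toUri.toString).show(false)

    spark.stop()
  }
}
```

Both calls should produce a DataFrame with the columns action and timestamp; the path overload is the better fit when the JSON already lives on disk or in HDFS.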

For older versions (Spark 1.4 and later, where sqlContext.read is available):

val jsonStr = """{ "metadata": { "key": 84896, "value": 54 }}"""
val rdd = sc.parallelize(Seq(jsonStr))
val df = sqlContext.read.json(rdd)
