PySpark - converting a JSON string to a DataFrame

You can wrap the JSON string in an RDD and pass that RDD to spark.read.json:

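from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()  # reuses the active session, if any
newJson = '{"Name":"something","Url":"https://stackoverflow.com","Author":"jangcy","BlogEntries":100,"Caller":"jangcy"}'
# spark.read.json accepts an RDD of JSON strings, one document per element
df = spark.read.json(spark.sparkContext.parallelize([newJson]))
df.show(truncate=False)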

which should print the following (the inferred columns come out sorted by name):

+------+-----------+------+---------+-------------------------+
|Author|BlogEntries|Caller|Name     |Url                      |
+------+-----------+------+---------+-------------------------+
|jangcy|100        |jangcy|something|https://stackoverflow.com|
+------+-----------+------+---------+-------------------------+
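
The same approach extends to several JSON strings, since each element of the RDD becomes one row. The following is a minimal sketch rather than a definitive implementation: the helper names `to_json_strings` and `json_strings_to_df` and the second record are hypothetical, and `spark` is assumed to be an existing SparkSession.

```python
import json


def to_json_strings(records):
    """Serialize each dict to a compact, single-line JSON string."""
    return [json.dumps(record, separators=(",", ":")) for record in records]


def json_strings_to_df(spark, json_strings):
    """Build a DataFrame from JSON strings, one row per string.

    `spark` is assumed to be an existing SparkSession.
    """
    rdd = spark.sparkContext.parallelize(json_strings)
    return spark.read.json(rdd)


records = [
    {
        "Name": "something",
        "Url": "https://stackoverflow.com",
        "Author": "jangcy",
        "BlogEntries": 100,
        "Caller": "jangcy",
    },
    # Hypothetical second record, added only to show the multi-row case.
    {
        "Name": "something else",
        "Url": "https://stackoverflow.com",
        "Author": "jangcy",
        "BlogEntries": 5,
        "Caller": "jangcy",
    },
]

json_strings = to_json_strings(records)

# With a live session available:
# df = json_strings_to_df(spark, json_strings)
# df.show(truncate=False)
```

Keeping each document on a single line matters if the strings are later written to a file, because spark.read.json expects one JSON document per line by default when reading from a path.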
