Read XML in Spark

"hierarchy" should be the rootTag and "att" should be the rowTag:

    df = spark.read \
        .format("com.databricks.spark.xml") \
        .option("rootTag", "hierarchy") \
        .option("rowTag", "att") \
        .load("test.xml")

You should then get:

    +-----+------+----------------------------+
    |Order|attval|children                    |
    +-----+------+----------------------------+
    |1    |Data  |[[[1, Studyval], [2, Site]]]|
    |2    |Info  |[[[1, age], [2, gender]]]   |
    +-----+------+----------------------------+

and the schema:

    root
     |-- Order: long (nullable = true)
     |-- ...
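To make the root/row split concrete, here is a minimal sketch that uses only Python's standard library, not spark-xml itself. The original test.xml is not shown, so SAMPLE_XML is an assumed layout reconstructed from the output table, and the helpers att_to_row and read_rows are hypothetical names introduced for illustration. It treats the "hierarchy" element as the root and turns each top-level "att" element into one row.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: the real test.xml is not shown in the original answer.
# This layout is a guess reconstructed from the output table.
SAMPLE_XML = """
<hierarchy>
  <att>
    <Order>1</Order>
    <attval>Data</attval>
    <children>
      <att><Order>1</Order><attval>Studyval</attval></att>
      <att><Order>2</Order><attval>Site</attval></att>
    </children>
  </att>
  <att>
    <Order>2</Order>
    <attval>Info</attval>
    <children>
      <att><Order>1</Order><attval>age</attval></att>
      <att><Order>2</Order><attval>gender</attval></att>
    </children>
  </att>
</hierarchy>
"""


def att_to_row(element, row_tag):
    """Convert one row element into a dict, recursing into <children>."""
    row = {
        "Order": int(element.findtext("Order")),
        "attval": element.findtext("attval"),
    }
    children = element.find("children")
    if children is not None:
        # Nested row elements become a list of smaller rows.
        row["children"] = [
            att_to_row(child, row_tag) for child in children.findall(row_tag)
        ]
    return row


def read_rows(xml_text, root_tag="hierarchy", row_tag="att"):
    """Return one dict per direct <row_tag> child of the <root_tag> element."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != root_tag:
        raise ValueError(f"expected root <{root_tag}>, found <{root.tag}>")
    # findall with a bare tag name matches direct children only, so the
    # nested <att> elements are not promoted to top-level rows.
    return [att_to_row(element, row_tag) for element in root.findall(row_tag)]


rows = read_rows(SAMPLE_XML)
for row in rows:
    print(row)
```

Under the assumed layout this yields two rows, one per top-level "att" element, with the nested "att" elements kept inside each row's children list. That mirrors the shape of the DataFrame shown above, though spark-xml infers its own schema and types.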