If you want to view the contents of an RDD, one way is to use collect():
myRDD.collect().foreach(println)
That’s not a good idea, though, when the RDD has billions of lines: collect() materializes the entire RDD in the driver’s memory and can cause an OutOfMemoryError. Use take() to fetch just a few elements to print:
myRDD.take(n).foreach(println)
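A minimal self-contained sketch of the two approaches. This assumes a local SparkSession (in spark-shell, `sc` is already provided for you); `myRDD` is a hypothetical RDD built from sample data for illustration:

```scala
import org.apache.spark.sql.SparkSession

object PeekRDD {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in spark-shell you already have `sc`.
    val spark = SparkSession.builder()
      .appName("peek-rdd")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    val myRDD = sc.parallelize(1 to 1000000)

    // Safe: ships only the first 5 elements back to the driver.
    myRDD.take(5).foreach(println)

    // Risky on large data: pulls the whole RDD into driver memory.
    // myRDD.collect().foreach(println)

    spark.stop()
  }
}
```

Note that take(n) returns elements from the first partition(s) it scans, so the result is not a random sample; if you want a random peek, takeSample(withReplacement = false, n) serves that purpose at the cost of a full scan.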