data = sc.textFile('path_to_data')
header = data.first() #extract header
data = data.filter(lambda row: row != header) #filter out header
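The header-skipping logic above can be sketched without a cluster using a plain Python list in place of the RDD (`sc.parallelize(lines)` would replace the list in a real job; the filter predicate is identical):

```python
# Simulate an RDD as a plain list of CSV lines to illustrate the
# header-skipping pattern; Spark's filter() applies the same
# predicate lazily across partitions.
lines = ["id,name", "1,alice", "2,bob"]

header = lines[0]                                # analogous to data.first()
rows = [row for row in lines if row != header]   # analogous to data.filter(...)

print(rows)  # ['1,alice', '2,bob']
```

Note that this predicate drops *every* line identical to the header, which also removes repeated header lines when several files with the same header are read together.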