Read a CSV File Using Spark Scala


CSV files are a widely used format for storing data, and Spark provides powerful capabilities for reading and processing them efficiently. Here, we will walk through the steps to read a CSV file using Spark with Scala.

Use the read method of the SparkSession to load the CSV file into a DataFrame. As an example, consider the following sample data:

Sample Data

Roll  Name    Age
1     Rahul   30
2     Sanjay  67
3     Ranjan  67

import org.apache.spark.sql.SparkSession

object ReadCSV {

  def main(args: Array[String]): Unit = {

    // Create a local SparkSession
    val sparkSession = SparkSession
      .builder()
      .appName("read a csv file")
      .master("local")
      .getOrCreate()

    // With no options, Spark treats every row as data and assigns
    // default column names (_c0, _c1, ...)
    val rawDF = sparkSession.read.csv("data/csv/test.csv")
    rawDF.show()
  }
}

Output:

+----+------+---+
| _c0|   _c1|_c2|
+----+------+---+
|Roll|  Name|Age|
|   1| Rahul| 30|
|   2|Sanjay| 67|
|   3|Ranjan| 67|
+----+------+---+

Notice that the header row is read as ordinary data and the columns get the default names _c0, _c1, _c2.
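If you want meaningful column names and types instead of the default _c0-style columns, you can supply an explicit schema while reading. The following is a minimal sketch (the object name ReadCSVWithSchema and the column types are my assumptions based on the sample data above):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object ReadCSVWithSchema {

  def main(args: Array[String]): Unit = {

    val sparkSession = SparkSession
      .builder()
      .appName("read a csv file with an explicit schema")
      .master("local")
      .getOrCreate()

    // Declare column names and types up front, matching the sample data
    val schema = StructType(Seq(
      StructField("Roll", IntegerType, nullable = true),
      StructField("Name", StringType, nullable = true),
      StructField("Age", IntegerType, nullable = true)
    ))

    // header=true skips the header row; schema(...) avoids any inference pass
    val typedDF = sparkSession.read
      .option("header", "true")
      .schema(schema)
      .csv("data/csv/test.csv")

    typedDF.printSchema()
    typedDF.show()
  }
}
```

Supplying a schema is also faster on large files, because Spark does not need an extra pass over the data to infer types.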

Read a CSV File Using Spark When Your CSV File Has a Header


  val salesDF = sparkSession.read.option("header", "true").csv("data/csv/test.csv")
  salesDF.show()

Output:

+----+------+---+
|Roll|  Name|Age|
+----+------+---+
|   1| Rahul| 30|
|   2|Sanjay| 67|
|   3|Ranjan| 67|
+----+------+---+
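With only the header option, every column is still read as a string. Adding the inferSchema option makes Spark scan the data and guess the column types, as in this short sketch (same assumed file path as above):

```scala
  val typedSalesDF = sparkSession.read
    .option("header", "true")
    .option("inferSchema", "true") // Spark samples the data to pick Integer/String/etc.
    .csv("data/csv/test.csv")

  // printSchema() should now report Roll and Age as integer columns
  typedSalesDF.printSchema()
```

Inference requires an extra pass over the file, so for large datasets an explicit schema is usually preferable.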

The list below contains some of the most commonly used options when reading a CSV file:
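As a combined sketch of how several of these options are typically chained together (the specific values shown are illustrative assumptions, not requirements of the sample file):

```scala
  val df = sparkSession.read
    .option("header", "true")      // first line contains column names
    .option("inferSchema", "true") // infer column types from the data
    .option("delimiter", ",")      // field separator (default is ",")
    .option("quote", "\"")         // quote character (default is '"')
    .option("nullValue", "")       // string to interpret as null
    .option("mode", "PERMISSIVE")  // malformed-row handling: PERMISSIVE, DROPMALFORMED, or FAILFAST
    .csv("data/csv/test.csv")
```

Each option call returns the DataFrameReader itself, which is why they can be chained fluently before the final csv(...) call.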