SparkSession PySpark example
Creating a Spark session is a crucial step when working with PySpark for big data processing tasks. This guide will walk you through the process of setting up a Spark session in PySpark.
A SparkSession is the entry point for using Spark with the DataFrame and Dataset API. It provides a unified interface for interacting with Spark's various functionalities. Prior to Spark 2.0, SparkContext was the main entry point, but SparkSession now integrates the functionalities of SparkContext and provides additional features for easier data processing.
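For instance, the legacy SparkContext is still reachable through a session's sparkContext attribute. The snippet below is a quick illustration of that relationship (the variable and app names are placeholders):

from pyspark.sql import SparkSession

# Create (or reuse) a session; "demo" is just a placeholder app name
spark = SparkSession.builder.appName("demo").getOrCreate()

# The pre-2.0 entry point is exposed on the session itself
sc = spark.sparkContext
print(sc.appName)  # prints the application name, e.g. "demo"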
Now that we've set up PySpark on our local machine, it's time to write our very first program, which will create a SparkSession.
from pyspark.sql import SparkSession

# Build a local SparkSession named "testing" (or reuse one if it already exists)
spark_session = SparkSession.builder.master("local").appName("testing").getOrCreate()

# Run a trivial SQL query to confirm the session is working
spark_session.sql("select 1").show()
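If everything is set up correctly, the script prints a one-row, one-column DataFrame (the column is named after the literal 1):

+---+
|  1|
+---+
|  1|
+---+

Note that getOrCreate() returns the SparkSession already running in the process if there is one, and only builds a new session otherwise.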
Note: This SparkSession looks minimal because we have not passed any additional configuration. See the Spark configuration documentation for the full list of available properties.
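As a rough sketch of what a more heavily configured builder might look like (the specific keys and values below are illustrative choices, not requirements):

from pyspark.sql import SparkSession

# Example configuration values; adjust them for your own workload
spark = (
    SparkSession.builder
    .master("local[4]")  # use four local cores instead of one
    .appName("configured_example")
    .config("spark.sql.shuffle.partitions", "8")  # fewer shuffle partitions suit small local jobs
    .config("spark.sql.session.timeZone", "UTC")  # pin the session time zone
    .getOrCreate()
)

Any property from Spark's configuration list can be passed the same way through .config(key, value).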