Explanations of all the Spark SQL, RDD, DataFrame, and Dataset examples in this project are available at https://sparkbyexamples.com/. All of these examples are written in Scala and tested in our development environment.

Table of Contents (Spark Examples in Scala)

Spark RDD Examples

Create a Spark RDD using Parallelize
Spark – Read multiple text files into single RDD?
Spark load CSV file into RDD
Different ways to create Spark RDD
Spark – How to create an empty RDD?
Spark RDD Transformations with examples
Spark RDD Actions with examples
Spark Pair RDD Functions
Spark Repartition() vs Coalesce()
Spark Shuffle Partitions
Spark Persistence Storage Levels
Spark RDD Cache and Persist with Example
Spark Broadcast Variables
Spark Accumulators Explained
Convert Spark RDD to DataFrame | Dataset

Spark SQL Tutorial

Spark Create DataFrame with Examples
Spark DataFrame withColumn
Ways to Rename column on Spark DataFrame
Spark – How to Drop a DataFrame/Dataset column
Working with Spark DataFrame Where Filter
Spark SQL “case when” and “when otherwise”
Collect() – Retrieve data from Spark RDD/DataFrame
Spark – How to remove duplicate rows
How to Pivot and Unpivot a Spark DataFrame
Spark SQL Data Types with Examples
Spark SQL StructType & StructField with examples
Spark schema – explained with examples
Spark Groupby Example with DataFrame
Spark – How to Sort DataFrame column explained
Spark SQL Join Types with examples
Spark DataFrame Union and UnionAll
Spark map vs mapPartitions transformation
Spark foreachPartition vs foreach | what to use?
Spark DataFrame Cache and Persist Explained
Spark SQL UDF (User Defined Functions)
Spark SQL DataFrame Array (ArrayType) Column
Working with Spark DataFrame Map (MapType) column
Spark SQL – Flatten Nested Struct column
Spark – Flatten nested array to single array column
Spark explode array and map columns to rows

Spark SQL Functions

Spark SQL String Functions Explained
Spark SQL Date and Time Functions
Spark SQL Array functions complete list
Spark SQL Map functions – complete list
Spark SQL Sort functions – complete list
Spark SQL Aggregate Functions
Spark Window Functions with Examples

Spark Data Source API

Spark Read CSV file into DataFrame
Spark Read and Write JSON file into DataFrame
Spark Read and Write Apache Parquet
Spark Read XML file using Databricks API
Read & Write Avro files using Spark DataFrame
Using Avro Data Files From Spark SQL 2.3.x or earlier
Spark Read from & Write to HBase table | Example
Create Spark DataFrame from HBase using Hortonworks
Spark Read ORC file into DataFrame
Spark 3.0 Read Binary File into DataFrame

Spark Streaming & Kafka

Spark Streaming – Different Output modes explained
Spark Streaming files from a directory
Spark Streaming – Reading data from TCP Socket
Spark Streaming with Kafka Example
Spark Streaming – Kafka messages in Avro format
Spark SQL Batch Processing – Produce and Consume Apache Kafka Topic
