How to Avoid Throwing Exception in

Published Jun 16, 2022

We can read data programmatically in Spark using

How can we prevent Spark from throwing an Exception when a file is not found?

Suppose we want to use an instance of SparkSession called spark to read from S3.

We can wrap our command inside a try-catch block to handle the errors manually. Let’s check out some errors we may run into.

Handling FileNotFoundException

If we specify a non-existent bucket in the S3 path, then we’ll hit a FileNotFoundException.

Bucket fake-bucket does not exist

Handling AnalysisException

If our glob does not match any files, we’ll get an AnalysisException.

org.apache.spark.sql.AnalysisException: Path does not exist: s3a://real-bucket/fake/path/*.json;

Avoid exceptions in

In this scenario, let’s return an empty Dataset<Row> when no files match our S3 path.

try {
  Dataset<Row> dataset =;
} catch (Exception e) {
  if (e instanceof AnalysisException || e instanceof FileNotFoundException) {
    return spark.emptyDataFrame();
  }
  throw new RuntimeException(e);
}

In the case of any other exception, we’ll throw a RuntimeException.
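The same catch, classify, and fall back logic can be pulled into a small reusable helper. The sketch below is only an illustration under stated assumptions: the class name SafeRead and the method readOrDefault are hypothetical, not part of Spark, and the helper uses only the Java standard library so that it can be tried without a cluster. It runs a read action, returns a fallback value when the failure is one of the exception types we consider recoverable, and wraps anything else in a RuntimeException.

```java
import java.util.concurrent.Callable;

// Hypothetical helper (not a Spark API): runs a read action and substitutes
// a fallback value when the failure is one of the "recoverable" types.
final class SafeRead {
    private SafeRead() {}

    @SafeVarargs
    static <T> T readOrDefault(Callable<T> reader, T fallback,
                               Class<? extends Exception>... recoverable) {
        try {
            return reader.call();
        } catch (Exception e) {
            // Return the fallback only for exception types the caller listed.
            for (Class<? extends Exception> type : recoverable) {
                if (type.isInstance(e)) {
                    return fallback;
                }
            }
            // Any other failure is rethrown, wrapped as an unchecked exception.
            throw new RuntimeException(e);
        }
    }
}
```

With Spark, a call might look like SafeRead.readOrDefault(() -> spark.read().json(path), spark.emptyDataFrame(), AnalysisException.class, FileNotFoundException.class), where path is a placeholder for the S3 path. Listing the recoverable types at the call site keeps the decision about which errors are safe to swallow visible to the reader.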