Thursday, November 19, 2015

How do I unit test PySpark programs?

My current Java/Spark unit-testing approach (detailed here) works by instantiating a SparkContext with the "local" master and running the tests with JUnit.

The code has to be organized so that all I/O happens in one function, which then calls another function that takes the RDDs as arguments and does the actual transformation.
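In PySpark, I imagine that separation would look something like the sketch below. This is just an illustration; the word-count logic, function names, and file paths are mine, not my actual job:

from pyspark import SparkContext


def count_words(lines_rdd):
    # Pure transformation: takes an RDD, returns an RDD, no I/O.
    return (lines_rdd.flatMap(lambda line: line.split())
                     .map(lambda word: (word, 1))
                     .reduceByKey(lambda a, b: a + b))


def main():
    # All I/O lives here; the transformation above stays testable.
    sc = SparkContext("local", "word-count")
    lines = sc.textFile("input.txt")
    count_words(lines).saveAsTextFile("output")
    sc.stop()


if __name__ == "__main__":
    main()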

This works great. I have a thoroughly tested data transformation written in Java + Spark.

Can I do the same with Python?

How would I run Spark unit tests with Python?
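My guess is that the same pattern should translate: a standard unittest test case that builds a SparkContext against the "local" master and feeds small in-memory RDDs to the transformation. Here is a rough, untested sketch, assuming the count_words function above lives in a module named wordcount:

import unittest

from pyspark import SparkContext

from wordcount import count_words


class WordCountTest(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        # One local SparkContext shared by all tests in this class.
        cls.sc = SparkContext("local", "wordcount-test")

    @classmethod
    def tearDownClass(cls):
        cls.sc.stop()

    def test_count_words(self):
        lines = self.sc.parallelize(["a b a", "b c"])
        counts = dict(count_words(lines).collect())
        self.assertEqual(counts, {"a": 2, "b": 2, "c": 1})


if __name__ == "__main__":
    unittest.main()

Presumably this would run with python -m unittest as long as pyspark and py4j are on the PYTHONPATH, but that is exactly the part I am unsure about.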
