There are many ways to test your Spark code. For the non-Spark pieces, plain unit tests work well: JUnit for Java code, ScalaTest for Scala code. You can also write full integration tests by running Spark locally or on a small test cluster, as in the sketch below.
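As a rough idea of the local-mode approach, the sketch below needs nothing beyond Spark and ScalaTest: it creates a SparkContext against `local[2]` before the suite runs and stops it afterwards. The class, app name, and test are placeholders, not code from any particular project.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.{BeforeAndAfterAll, FunSuite}

// A suite-scoped local SparkContext: started before the tests run,
// stopped afterwards so later suites can create their own.
class LocalModeIntegrationTest extends FunSuite with BeforeAndAfterAll {
  @transient private var sc: SparkContext = _

  override def beforeAll(): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("local-mode-test")
    sc = new SparkContext(conf)
  }

  override def afterAll(): Unit = {
    if (sc != null) sc.stop()
  }

  test("word count on a tiny in-memory dataset") {
    val counts = sc.parallelize(Seq("a", "b", "a"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .collectAsMap()
    assert(counts("a") === 2)
    assert(counts("b") === 1)
  }
}
```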
Another awesome option is Holden Karau's spark-testing-base.
There are a few presentations and articles about doing so:
Add it to your SBT build for Spark 1.6.0:
"com.holdenkarau" %% "spark-testing-base" % "1.6.0_0.3.1" parallelExecution in Test := false
Check out the spark-testing-base wiki for usage details.
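To give a feel for what a suite looks like, here is a minimal sketch modeled on the project's README example; `tokenize` is just a stand-in for your own code under test. The `SharedSparkContext` trait hands the suite a ready-made `sc`, so there is no manual setup or teardown.

```scala
import com.holdenkarau.spark.testing.SharedSparkContext
import org.apache.spark.rdd.RDD
import org.scalatest.FunSuite

// SharedSparkContext provides a SparkContext (`sc`) shared by the tests
// in this suite.
class TokenizeTest extends FunSuite with SharedSparkContext {

  // Stand-in for the real code under test.
  def tokenize(lines: RDD[String]): RDD[List[String]] =
    lines.map(_.split(" ").toList)

  test("tokenize splits lines on spaces") {
    val input = sc.parallelize(List("hi", "hi holden", "bye"))
    val expected = List(List("hi"), List("hi", "holden"), List("bye"))
    assert(tokenize(input).collect().toList === expected)
  }
}
```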
Use RDDComparisons to see if your RDD is as expected.
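For example, a comparison test might look like the sketch below. It assumes the trait-based API (`RDDComparisons` with `assertRDDEquals`) found in more recent spark-testing-base releases; the method names have shifted between versions, so verify against the wiki for the release you pin.

```scala
import com.holdenkarau.spark.testing.{RDDComparisons, SharedSparkContext}
import org.scalatest.FunSuite

class WordCountComparisonTest extends FunSuite
    with SharedSparkContext with RDDComparisons {

  test("word counts match the expected RDD, ignoring order") {
    val result = sc.parallelize(Seq("a", "b", "a"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    val expected = sc.parallelize(Seq(("a", 2), ("b", 1)))

    // Fails the test with a descriptive message if the two RDDs differ.
    assertRDDEquals(expected, result)
  }
}
```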
Some other Testing Resources for Apache Spark:
There are many options; I suggest trying a few, but use spark-testing-base and ScalaTest at a minimum. Always run locally first and try a subset of your data before moving to production. Develop test-driven and iteratively, just as you would any other program.