SparkPipelineFramework.Testing

PyPI: https://pypi.org/project/sparkpipelineframework.testing/
Source Code: https://github.com/icanbwell/SparkPipelineFramework.Testing

This package makes it easy to unit test each part of your data pipeline on a developer’s machine and in GitHub (or another source control system). Engineers create a unit test by providing a set of input files and the expected output files. The package then runs the data processing on the input files and verifies that the output matches the expected output files.

from pathlib import Path

from pyspark.sql import SparkSession

from spark_pipeline_framework_testing.test_runner import SparkPipelineFrameworkTestRunner


def test_folder(spark_session: SparkSession) -> None:
    data_dir: Path = Path(__file__).parent.joinpath('./')

    SparkPipelineFrameworkTestRunner.run_tests(spark_session=spark_session, folder_path=data_dir)
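
The idea behind this kind of test (run the processing on input files, then compare the result with expected output files) can be pictured with a small, Spark-free sketch. This is only an illustration of the general pattern, not how `SparkPipelineFrameworkTestRunner` is implemented. The `input/` and `expected/` subfolder names, the JSON-lines file format, and the `load_rows` and `check_folder` helpers are all assumptions made up for this example.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def load_rows(path: Path) -> List[Row]:
    """Read a JSON-lines file into a list of dicts, skipping blank lines."""
    with path.open() as f:
        return [json.loads(line) for line in f if line.strip()]


def check_folder(folder: Path, transform: Callable[[List[Row]], List[Row]]) -> None:
    """Apply `transform` to each input file and compare with its expected output.

    Hypothetical layout (an assumption for this sketch only):
        folder/input/<name>.json     rows fed to the transform
        folder/expected/<name>.json  rows the transform should produce
    """
    for input_file in sorted((folder / "input").glob("*.json")):
        expected_file = folder / "expected" / input_file.name
        actual = transform(load_rows(input_file))
        expected = load_rows(expected_file)
        # Fail with a message that names the file and shows both sides.
        assert actual == expected, f"{input_file.name}: {actual} != {expected}"
```

A test would then call something like `check_folder(Path(__file__).parent, my_transform)`, where `my_transform` is whatever processing step is under test. Keeping the expected output in files next to the test means a change in behavior shows up as a reviewable difference in source control.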