snowmobile.core.qa
Derived Statement classes.
These objects derive from snowmobile.core.statement.Statement and override its process() method to perform additional post-processing of the statement’s results in conjunction with any parameters provided within the statement’s tags.
s.process() modifies a statement’s outcome attribute (bool) on which an assertion is run before continuing execution of the script.
Note
The on_exception and on_failure parameters of script.run() are passed through directly and apply only to these derived statement classes.
on_exception controls the handling of errors encountered in the post-processing invoked by s.process().
on_failure controls the handling of a failed assertion run on the outcome of the post-processing invoked by s.process().
Module Contents
Classes
QA – Base class for QA statements.
Empty – QA class for verification that a statement’s results are empty.
Diff – QA class for comparison of values within a table based on partitioning on a field.
class snowmobile.core.qa.QA(sn: snowmobile.core.connection.Snowmobile, **kwargs)
Bases: snowmobile.core.Statement
Base class for QA statements.
set_outcome(self)
Updates ._outcome upon completion of processing invoked by .process().
class snowmobile.core.qa.Empty(sn: snowmobile.core.connection.Snowmobile, **kwargs)
Bases: snowmobile.core.qa.QA
QA class for verification that a statement’s results are empty.
The most widely applicable use of Empty is for simple verification that a table’s dimensions are as expected.
process(self) → snowmobile.core.qa.QA
Override method; checks whether the results are empty and updates outcome accordingly.
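As a rough picture of what such an override does, the sketch below sets outcome from the emptiness of a result set. EmptySketch and its rows argument are hypothetical and stand in for the real class and its query results; the duplicate-check scenario is likewise an assumed use case for illustration.

```python
from typing import Any, List, Sequence


class EmptySketch:
    """Hypothetical stand-in for an emptiness check (not snowmobile's Empty)."""

    def __init__(self, rows: Sequence[Any]) -> None:
        self.rows: List[Any] = list(rows)  # rows returned by the statement
        self.outcome = False

    def process(self) -> "EmptySketch":
        # The check passes only when the statement returned no rows.
        self.outcome = len(self.rows) == 0
        return self


# Assumed scenario: a query that selects duplicated keys should return nothing.
no_duplicates = EmptySketch(rows=[]).process()
has_duplicates = EmptySketch(rows=[("id_1", 2)]).process()
```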
class snowmobile.core.qa.Diff(sn: snowmobile.core.connection.Snowmobile = None, **kwargs)
Bases: snowmobile.core.qa.QA
QA class for comparison of values within a table based on partitioning on a field.
partition_on – Column name to partition data on before comparing the partitioned datasets; defaults to ‘src_description’.
end_index_at – Column name that marks the last column to use as an index column when joining the partitioned datasets back together.
compare_patterns – Regex patterns to match columns on that should be included in comparison (numeric columns you’re running QA on).
ignore_patterns – Regex patterns to match columns on that should be ignored both for the comparison and the index.
generic_metric_col_nm – Column name to use for the melted field names; defaults to ‘Metric’.
compare_cols – Columns that are used in comparison once the statement is executed and parsing is applied.
idx_cols – Columns that are used for the index to join the data back together once the statement is executed and parsing is applied.
ub_raw – Maximum absolute raw difference (upper bound) by which two compared fields can differ without causing a failure.
ub_perc – Maximum absolute percentage difference (upper bound) by which two compared fields can differ without causing a failure.
Instantiates a qa-diff statement.
Parameters
delta_column_suffix (str) – Suffix to add to columns that comparison is being run on; defaults to ‘Delta’.
partition_on (str) – Column to partition the data on in order to compare.
end_index_at (str) – Column name that marks the last column to use as an index when joining the partitioned datasets back together.
compare_patterns (list) – Regex patterns matching columns to be included in comparison.
ignore_patterns (list) – Regex patterns to match columns on that should be ignored both for the comparison and the index.
generic_metric_col_nm (str) – Column name to use for the melted field names; defaults to ‘Metric’.
raw_upper_bound (float) – Maximum absolute raw difference by which two compared fields can differ without causing a failure.
percentage_upper_bound (float) – Maximum absolute percentage difference by which two compared fields can differ without causing a failure.
split_cols(self) → snowmobile.core.qa.Diff
Post-processes results returned from a qa-diff statement.
Executes private methods to split columns into:
- Index columns
- Drop columns
- Comparison columns
Then runs checks needed to ensure minimum requirements are met in order for a valid partition/comparison to be made.
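The three-way split described above can be illustrated with a small stand-alone function. This is a sketch under assumptions, not the library's private methods: split_cols_sketch is a hypothetical name, and the rules it applies (ignore patterns take precedence, index columns are those up to and including end_index_at, comparison columns are the remaining pattern matches) are one plausible reading of the parameter descriptions.

```python
import re
from typing import Dict, List


def split_cols_sketch(
    columns: List[str],
    partition_on: str,
    end_index_at: str,
    compare_patterns: List[str],
    ignore_patterns: List[str],
) -> Dict[str, List[str]]:
    """Hypothetical split of column names into index, drop, and compare groups."""

    def matches(col: str, patterns: List[str]) -> bool:
        return any(re.search(p, col, flags=re.IGNORECASE) for p in patterns)

    # Minimum requirements: both marker columns must be present.
    for required in (partition_on, end_index_at):
        if required not in columns:
            raise ValueError(f"column '{required}' not found")

    last_idx = columns.index(end_index_at)
    index: List[str] = []
    drop: List[str] = []
    compare: List[str] = []
    for pos, col in enumerate(columns):
        if col == partition_on:
            continue  # the partition column is handled separately
        if matches(col, ignore_patterns):
            drop.append(col)      # ignored for both index and comparison
        elif pos <= last_idx:
            index.append(col)     # at or before the end_index_at marker
        elif matches(col, compare_patterns):
            compare.append(col)   # columns the QA check is run on
        # Columns matching no rule are left out of every group in this sketch.

    if not compare:
        raise ValueError("no comparison columns matched compare_patterns")
    return {"index": index, "drop": drop, "compare": compare}


groups = split_cols_sketch(
    columns=["src_description", "region", "product", "loaded_at",
             "order_count", "revenue_amount"],
    partition_on="src_description",
    end_index_at="product",
    compare_patterns=[r".*_count$", r".*_amount$"],
    ignore_patterns=[r"loaded_at"],
)
```

Here region and product become index columns, loaded_at is dropped, and the two metric columns are compared.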
property partitioned_by(self) → Set[Any]
Distinct values within the partition_on column that data is partitioned by.
static partitions_are_equal(partitions: Dict[str, pd.DataFrame], abs_tol: float, rel_tol: float) → bool
Evaluates whether the DataFrames in a dictionary are identical.
Parameters
partitions (Dict[str, pd.DataFrame]) – A dictionary of DataFrames returned by snowmobile.DataFrame().
abs_tol (float) – Absolute tolerance for difference in any value amongst the DataFrames being compared.
rel_tol (float) – Relative tolerance for difference in any value amongst the DataFrames being compared.
Returns (bool): Indication of equality amongst all the DataFrames contained in partitions.
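A tolerance-based equality check of this kind can be sketched with the standard library alone. The function below is hypothetical and simplified: it compares lists of floats rather than DataFrames, and it relies on math.isclose, which accepts a pair of values when either tolerance is satisfied. Whether the library combines abs_tol and rel_tol in the same way is not stated here, so treat that as an assumption.

```python
import math
from itertools import combinations
from typing import Dict, List


def partitions_are_equal_sketch(
    partitions: Dict[str, List[float]],
    abs_tol: float,
    rel_tol: float,
) -> bool:
    """Hypothetical pairwise comparison of partitions within given tolerances."""
    for left, right in combinations(partitions.values(), 2):
        if len(left) != len(right):
            return False  # partitions of different shape cannot be equal
        for a, b in zip(left, right):
            # math.isclose passes if either the relative or absolute bound holds.
            if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


partitions = {"old": [100.0, 250.0], "new": [100.4, 250.0]}
within_abs = partitions_are_equal_sketch(partitions, abs_tol=0.5, rel_tol=0.0)
outside_abs = partitions_are_equal_sketch(partitions, abs_tol=0.1, rel_tol=0.0)
within_rel = partitions_are_equal_sketch(partitions, abs_tol=0.0, rel_tol=0.01)
```

The same pair of partitions passes or fails depending only on the tolerances, which mirrors the role of the raw and percentage upper bounds described for Diff.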
process(self) → snowmobile.core.qa.Diff
Post-processing for Diff-specific results.