snowmobile.core.qa
Derived Statement classes.

These objects derive from snowmobile.core.statement.Statement and override its process() method to perform additional post-processing of the statement’s results in conjunction with any parameters provided within the statement’s tags.

s.process() modifies a statement’s outcome attribute (bool), on which an assertion is run before continuing execution of the script.
Note
The on_exception and on_failure parameters of script.run() are passed directly to, and apply only to, these derived statement classes.
on_exception controls how errors encountered in the post-processing invoked by s.process() are handled.
on_failure controls how a failed assertion, run on the outcome of the post-processing invoked by s.process(), is handled.
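The control flow described in the note can be sketched roughly as follows. This is a minimal illustration and not snowmobile's implementation: SketchStatement, run_statement, and the string values "raise" and "continue" are hypothetical names chosen for the example, and the values that script.run() actually accepts may differ.

```python
class SketchStatement:
    """Hypothetical stand-in for a derived statement; not snowmobile's class."""

    def __init__(self, check):
        self.check = check    # callable that performs the post-processing check
        self.outcome = None   # becomes a bool once process() has run

    def process(self):
        # Post-processing: evaluate the check and record the outcome.
        self.outcome = bool(self.check())
        return self


def run_statement(statement, on_exception="raise", on_failure="raise"):
    """Run post-processing, then assert on the outcome.

    "raise" and "continue" are assumed values used only for this sketch.
    """
    try:
        statement.process()
    except Exception:
        if on_exception == "raise":
            raise
        return None  # an error inside post-processing was tolerated
    if not statement.outcome and on_failure == "raise":
        raise AssertionError("QA check failed")
    return statement.outcome
```

The two parameters cover different situations: on_exception applies when the check itself errors, while on_failure applies when the check completes but its outcome is False.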
Module Contents
Classes
QA | Base class for QA statements.
Empty | QA class for verification that a statement’s results are empty.
Diff | QA class for comparison of values within a table based on partitioning on a field.
- class snowmobile.core.qa.QA(sn: snowmobile.core.connection.Snowmobile, **kwargs)
  Bases: snowmobile.core.Statement
  Base class for QA statements.
  - set_outcome(self)
    Updates ._outcome upon completion of processing invoked by .process().
- class snowmobile.core.qa.Empty(sn: snowmobile.core.connection.Snowmobile, **kwargs)
  Bases: snowmobile.core.qa.QA
  QA class for verification that a statement’s results are empty.
  The most widely applicable use of Empty is for simple verification that a table’s dimensions are as expected.
  - process(self) → snowmobile.core.qa.QA
    Override method; checks whether the results are empty and updates outcome.
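One common way to use an emptiness check is to write the query so that it returns only offending rows, which means an empty result is a pass. The sketch below illustrates that idea in plain Python, with a list of dicts standing in for the query results; duplicate_keys and is_empty are hypothetical helpers, not part of snowmobile.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return the values of `key` that occur on more than one row."""
    counts = Counter(row[key] for row in rows)
    return [value for value, n in counts.items() if n > 1]


def is_empty(results):
    """The outcome of an emptiness check: True when nothing was returned."""
    return len(results) == 0


rows = [{"id": 1}, {"id": 2}, {"id": 2}]
offenders = duplicate_keys(rows, "id")
outcome = is_empty(offenders)  # False here, because id 2 is duplicated
```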
- class snowmobile.core.qa.Diff(sn: snowmobile.core.connection.Snowmobile = None, **kwargs)
  Bases: snowmobile.core.qa.QA
  QA class for comparison of values within a table based on partitioning on a field.
  - partition_on
    Column name to partition data on before comparing the partitioned datasets; defaults to ‘src_description’.
  - end_index_at
    Column name that marks the last column to use as an index column when joining the partitioned datasets back together.
  - compare_patterns
    Regex patterns matching the columns to include in the comparison (the numeric columns you’re running QA on).
  - ignore_patterns
    Regex patterns matching the columns to ignore for both the comparison and the index.
  - generic_metric_col_nm
    Column name to use for the melted field names; defaults to ‘Metric’.
  - compare_cols
    Columns used in the comparison once the statement is executed and parsing is applied.
  - idx_cols
    Columns used as the index to join the data back together once the statement is executed and parsing is applied.
  - ub_raw
    Maximum absolute raw difference (upper bound) allowed between two compared fields without causing a failure.
  - ub_perc
    Maximum absolute percentage difference (upper bound) allowed between two compared fields without causing a failure.
  Instantiates a qa-diff statement.
  - Parameters
    - delta_column_suffix (str) – Suffix to add to columns that the comparison is being run on; defaults to ‘Delta’.
    - partition_on (str) – Column to partition the data on in order to compare.
    - end_index_at (str) – Column name that marks the last column to use as an index when joining the partitioned datasets back together.
    - compare_patterns (list) – Regex patterns matching columns to be included in the comparison.
    - ignore_patterns (list) – Regex patterns matching columns to ignore for both the comparison and the index.
    - generic_metric_col_nm (str) – Column name to use for the melted field names; defaults to ‘Metric’.
    - raw_upper_bound (float) – Maximum absolute raw difference allowed between two compared fields without causing a failure.
    - percentage_upper_bound (float) – Maximum absolute percentage difference allowed between two compared fields without causing a failure.
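To make the interplay of these parameters concrete, here is a minimal sketch of a partition-and-compare check in plain Python. It is not snowmobile's implementation: diff_within_bounds is a hypothetical function, a list of dicts stands in for the query results, exactly two partitions are assumed, the percentage difference is assumed to be measured relative to the first partition's value, and exceeding either bound is assumed to count as a failure.

```python
def diff_within_bounds(rows, partition_on, index_cols, compare_cols,
                       raw_upper_bound, percentage_upper_bound):
    """Hypothetical sketch of a partitioned comparison with upper bounds."""
    # Group rows by partition value, keyed by their index-column values.
    partitions = {}
    for row in rows:
        key = tuple(row[col] for col in index_cols)
        partitions.setdefault(row[partition_on], {})[key] = row
    if len(partitions) != 2:
        raise ValueError("this sketch expects exactly two partitions")

    left, right = partitions.values()
    if left.keys() != right.keys():
        return False  # the partitions cannot be joined one-to-one

    for key in left:
        for col in compare_cols:
            a, b = left[key][col], right[key][col]
            raw = abs(a - b)
            # Assumption: percentage is relative to the first partition.
            if a:
                perc = raw / abs(a)
            else:
                perc = 0.0 if raw == 0 else float("inf")
            if raw > raw_upper_bound or perc > percentage_upper_bound:
                return False
    return True


rows = [
    {"src_description": "old", "region": "EU", "amount": 100.0},
    {"src_description": "new", "region": "EU", "amount": 101.0},
]
ok = diff_within_bounds(rows, "src_description", ["region"], ["amount"],
                        raw_upper_bound=2.0, percentage_upper_bound=0.02)
```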
  - split_cols(self) → snowmobile.core.qa.Diff
    Post-processes results returned from a qa-diff statement.
    Executes private methods to split columns into:
    - Index columns
    - Drop columns
    - Comparison columns
    Then runs the checks needed to ensure that the minimum requirements for a valid partition/comparison are met.
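The three-way split described above could look something like the following sketch. The function name, the use of re.search, and the rule that index columns are the remaining columns up to and including end_index_at are all assumptions made for illustration; the private methods that Diff actually uses are not shown in this reference.

```python
import re


def split_columns(columns, partition_on, end_index_at,
                  compare_patterns, ignore_patterns):
    """Hypothetical sketch: split columns into (index, drop, compare)."""

    def matches(name, patterns):
        return any(re.search(pattern, name) for pattern in patterns)

    # Ignored columns are dropped from both the comparison and the index.
    drop = [c for c in columns if matches(c, ignore_patterns)]
    compare = [c for c in columns
               if c not in drop and matches(c, compare_patterns)]

    # Assumed rule: index columns run up to and including end_index_at.
    cutoff = columns.index(end_index_at)
    index = [c for i, c in enumerate(columns)
             if i <= cutoff and c != partition_on
             and c not in drop and c not in compare]

    # Minimum requirements for a valid partition/comparison.
    if not index or not compare:
        raise ValueError("need at least one index and one comparison column")
    return index, drop, compare


columns = ["src_description", "region", "month", "amount_usd", "units", "load_ts"]
index, drop, compare = split_columns(
    columns,
    partition_on="src_description",
    end_index_at="month",
    compare_patterns=[r"amount", r"units"],
    ignore_patterns=[r"_ts$"],
)
```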
  - property partitioned_by(self) → Set[Any]
    Distinct values within the partition_on column that data is partitioned by.
  - static partitions_are_equal(partitions: Dict[str, pd.DataFrame], abs_tol: float, rel_tol: float) → bool
    Evaluates whether the DataFrames in a dictionary are identical.
    - Parameters
      - partitions (Dict[str, pd.DataFrame]) – A dictionary of DataFrames returned by snowmobile.DataFrame().
      - abs_tol (float) – Absolute tolerance for difference in any value amongst the DataFrames being compared.
      - rel_tol (float) – Relative tolerance for difference in any value amongst the DataFrames being compared.
    - Returns (bool):
      Indication of equality amongst all the DataFrames contained in partitions.
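A tolerance-based equality check of this kind can be sketched with the standard library's math.isclose, which takes parameters with the same names. The example below is an illustration under assumptions, not the method's actual body: partitions_close is a hypothetical name, and lists of numeric rows stand in for the DataFrames.

```python
import math
from itertools import combinations


def partitions_close(partitions, abs_tol, rel_tol):
    """Hypothetical sketch: True if every pair of partitions matches
    element-wise within the given tolerances."""
    for (_, left), (_, right) in combinations(partitions.items(), 2):
        if len(left) != len(right):
            return False  # different number of rows
        for row_l, row_r in zip(left, right):
            if len(row_l) != len(row_r):
                return False  # different number of columns
            for a, b in zip(row_l, row_r):
                if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                    return False
    return True


partitions = {"old": [[1.0, 2.0]], "new": [[1.0, 2.05]]}
loose = partitions_close(partitions, abs_tol=0.1, rel_tol=0.0)
strict = partitions_close(partitions, abs_tol=0.01, rel_tol=0.0)
```

With math.isclose, two values are considered close when their difference is within either tolerance, so a generous abs_tol can pass values that a small rel_tol alone would reject.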
  - process(self) → snowmobile.core.qa.Diff
    Post-processing for Diff-specific results.