snowmobile.core

snowmobile lives in snowmobile.core to keep from cluttering intellisense/autocomplete while interacting with the API.

Subpackages

Submodules

Package Contents

Classes

Generic

Generic dunder implementation for snowmobile objects.

ExceptionHandler

All snowmobile classes contain a ExceptionHandler.

Configuration

A parsed snowmobile.toml file.

Snowmobile

Primary method of statement execution and accessor to parsed snowmobile.toml.

connect

Primary method of statement execution and accessor to parsed snowmobile.toml.

Section

Represents any (1-6 level) header section within Script Name (doc).md.

Scope

Handles the scope/context for Statement objects and derived classes.

Name

Handles the decomposition/parsing of statement name.

Statement

Base class for all Statement objects.

Column

A single column within a SnowFrame.

Diff

QA class for comparison of values within a table based on

Empty

QA class for verification that a statement’s results are empty.

SnowFrame

Extends a DataFrame with a .snf entry point.

SQL

SQL class for generation & execution of common sql commands.

Markup

Contains all sections within the context of a Script.

Script

Parser and operator of local sql files.

Table

Constructed with a DataFrame and a table name to load into.

class snowmobile.core.Generic

Bases: object

Generic dunder implementation for snowmobile objects.

Base class for all snowmobile objects that do not inherit from pydantic’s BaseModel or configuration class, Config.

class snowmobile.core.ExceptionHandler(within: snowmobile.core.errors.Optional[Any] = None, ctx_id: snowmobile.core.errors.Optional[int] = None, in_context: bool = False, children: snowmobile.core.errors.Dict[int, Any] = None, is_active_parent: bool = False, to_mirror: snowmobile.core.errors.Optional[List[Any]] = None)

Bases: snowmobile.core.Generic

All snowmobile classes contain a ExceptionHandler.

Parameters
  • within (Optional[Any]) – Class for which the ExceptionHandler is intended.

  • ctx_id (Optional[int]) – Context ID; set/unset by methods when entering/exiting certain contexts.

  • in_context (bool) – Class is currently within a specific ctx_id

  • children (Dict[int, Any]) – Attributes of the within class for which the ExceptionHandler should mirror the methods called on the parent class. # TODO: Refactor this out; it’s essentially janky multi inheritance

  • is_active_parent (bool) – The within class is currently enforcing the context rules on its children

  • to_mirror (Optional[List[Any]]) – Methods called in the attr:`within class that should be applied to its children (i.e. set/reset context ID, etc)

property current(self)

All exceptions in the current context.

collect(self, e: Any[snowmobile_errors])

Stores an exception.

property first(self)snowmobile.core.errors.Error

First exception encountered.

property last(self)snowmobile.core.errors.Error

Last exception encountered.

seen(self, from_ctx: snowmobile.core.errors.Optional[int] = None, of_type: snowmobile.core.errors.Optional[Any[snowmobile_errors], List[snowmobile_errors]] = None, to_raise: snowmobile.core.errors.Optional[bool] = None, with_ids: snowmobile.core.errors.Optional[int, List[int], Set[int]] = None, all_time: bool = False)bool

Boolean indicator of if an exception has been seen.

get(self, from_ctx: snowmobile.core.errors.Optional[int] = None, of_type: snowmobile.core.errors.Optional[Any[snowmobile_errors], List[snowmobile_errors]] = None, to_raise: snowmobile.core.errors.Optional[bool] = None, with_ids: snowmobile.core.errors.Optional[int, List[int], Set[int]] = None, all_time: bool = False, last: bool = False, first: bool = False, _raise: bool = False)

Boolean indicator of if an exception has been seen.

property ctx_id(self)

Current context id.

set(self, ctx_id: snowmobile.core.errors.Optional[int] = None, in_context: bool = False, outcome: snowmobile.core.errors.Optional[int] = None)

Set attributes on self.

set_from(self, other: snowmobile.core.exception_handler.ExceptionHandler)snowmobile.core.exception_handler.ExceptionHandler

Updates attributes of self with those from ‘other’.

reset(self, ctx_id: bool = False, in_context: bool = False, outcome: bool = False)snowmobile.core.exception_handler.ExceptionHandler

Resets attributes on self.

property by_tmstmp(self)

All exceptions by timestamp, ordered by most to least recent.

class snowmobile.core.Configuration(creds: Optional[str] = None, config_file_nm: Optional[str] = None, from_config: Optional[Path, str] = None, export_dir: Optional[Path, str] = None, silence: bool = False)

Bases: snowmobile.core.base.Generic

A parsed snowmobile.toml file.

All keyword arguments optional.

Parameters
  • config_file_nm (Optional[str]) – Name of configuration file to use; defaults to snowmobile.toml.

  • creds (Optional[str]) – Alias for the set of credentials to authenticate with; default behavior will fall back to the connection.default-creds specified in snowmobile.toml, or the first set of credentials stored if this configuration option is left blank.

  • from_config (Optional[str, Path]) – A full path to a specific configuration file to use; bypasses any checks for a cached file location and can be useful for container-based processes with restricted access to the local file system.

  • export_dir (Optional[Path]) – Path to save a template snowmobile.toml file to; if pr, the file will be exported within the __init__ method and nothing else will be instantiated.

file_nm

Configuration file name; defaults to ‘snowmobile.toml’.

Type

str

cache

Persistent cache; caches location.

Type

snowmobile.core.cache.Cache

location

Full path to configuration file.

Type

pathlib.Path

connection :Optional[cfg.Connection]

[connection] from snowmobile.toml.

Type

snowmobile.core.cfg.Connection

loading :Optional[cfg.Loading]

[loading] from snowmobile.toml.

Type

snowmobile.core.cfg.Loading

script :Optional[cfg.Script]

[script] from snowmobile.toml.

Type

snowmobile.core.cfg.Script

sql :Optional[cfg.SQL]

[sql] from snowmobile-ext.toml.

Type

snowmobile.core.cfg.SQL

ext_sources :Optional[cfg.Location]

[external-sources] from snowmobile.toml.

Type

snowmobile.core.cfg.Location

property markdown(self)snowmobile.core.cfg.Markup

Accessor for cfg.script.markdown.

property attrs(self)snowmobile.core.cfg.Attributes

Accessor for cfg.script.markdown.attributes.

property wildcards(self)snowmobile.core.cfg.Wildcard

Accessor for cfg.script.patterns.wildcards.

static batch_set_attrs(obj: Any, attrs: dict, to_none: bool = False)

Batch sets attributes on an object from a dictionary.

Parameters
  • obj (Any) – Object to set attributes on.

  • attrs (dict) – Dictionary containing attributes.

  • to_none (bool) – Set all of the object’s attributes batching a key in wrap to None; defaults ot False.

Returns (Any):

Object post-setting attributes.

static attrs_from_obj(obj: Any, within: Optional[List[str]] = None)Dict[str, MethodType]

Utility to return attributes/properties from an object as a dictionary.

static methods_from_obj(obj: Any, within: Optional[List[str]] = None)Dict[str, MethodType]

Returns callable components of an object as a dictionary.

property scopes(self)

All combinations of scope type and scope attribute.

scopes_from_kwargs(self, only_populated: bool = False, **kwargs)Dict

Turns script.filter() arguments into a valid set of kwargs for Scope.

Returns dictionary of all combinations of ‘arg’ (“kw”, “obj”, “desc”, “anchor” and “nm”), including empty sets for any ‘arg’ not included in the keyword arguments provided.

scopes_from_tag(self, t: Any)

Generates list of keyword arguments to instantiate all scopes for a wrap.

json(self, by_alias: bool = False, **kwargs)

Serialization method for core object model.

class snowmobile.core.Snowmobile(creds: Optional[str] = None, delay: bool = False, ensure_alive: bool = True, config_file_nm: Optional[str] = None, from_config: Optional[str, Path] = None, silence: bool = False, **connect_kwargs)

Bases: snowmobile.core.sql.SQL

Primary method of statement execution and accessor to parsed snowmobile.toml.

Parameters
  • creds (Optional[str]) – Alias for the set of credentials to authenticate with; default behavior will fall back to the connection.default-creds specified in snowmobile.toml, or the first set of credentials stored if this configuration option is left blank.

  • delay (bool) – Optionally delay establishing a connection when the object is instantiated, enabling access to the configuration object model through the Connection.cfg attribute; defaults to False.

  • ensure_alive (bool) – Establish a new connection if a method requiring a connection against the database is called while alive is False; defaults to True.

  • config_file_nm (Optional[str]) – Name of configuration file to use; defaults to snowmobile.toml.

  • from_config (Optional[str, Path]) – A full path to a specific configuration file to use; bypasses any checks for a cached file location and can be useful for container-based processes with restricted access to the local file system.

  • **connect_kwargs

    Additional arguments to provide to snowflake.connector.connect(); arguments provided here will over-ride connection arguments specified in snowmobile.toml, including:

    • Connection parameters in connection.default-arguments

    • Credentials parameters associated with a given alias

    • Connection parameters associated with a given alias

Initializes a snowmobile.SQL object.

cfg :Configuration

snowmobile.toml

Type

snowmobile.core.configuration.Configuration

con :Optional[SnowflakeConnection]

Can be None until set by Snowmobile.connect()

Type

SnowflakeConnection

e :ExceptionHandler

Exception / context management

Type

snowmobile.core.exception_handler.ExceptionHandler

ensure_alive :bool

Reconnect to Snowflake if connection is lost

Type

bool

connect(self, **kwargs)snowmobile.core.connection.Snowmobile

Establishes connection to Snowflake.

Re-implements snowflake.connector.connect() with connection arguments sourced from snowmobile’s object model, specifically:

  • Credentials from snowmobile.toml.

  • Default connection arguments from snowmobile.toml.

  • Optional keyword arguments either passed to snowmobile.connect() or directly to this method.

kwargs:

Optional keyword arguments to pass to snowflake.connector.connect(); arguments passed here will over-ride connection.default-arguments specified in snowmobile.toml.

disconnect(self)snowmobile.core.connection.Snowmobile

Disconnect from connection with which Connection() was instantiated.

property alive(self)bool

Check if the connection is alive.

property cursor(self)snowflake.connector.connection.SnowflakeCursor

SnowflakeCursor accessor.

property dictcursor(self)snowflake.connector.DictCursor

DictCursor accessor.

ex(self, sql: str, on_error: Optional[str] = None, **kwargs)snowflake.connector.connection.SnowflakeCursor

Executes a command via SnowflakeCursor.

Parameters
  • sql (str) – sql command as a string.

  • on_error (str) – String value to impose a specific behavior if an error occurs during the execution of sql.

  • **kwargs – Optional keyword arguments for SnowflakeCursor.execute().

Returns (SnowflakeCursor):

SnowflakeCursor object that executed the command.

exd(self, sql: str, on_error: Optional[str] = None, **kwargs)snowflake.connector.DictCursor

Executes a command via DictCursor.

Parameters
  • sql (str) – sql command as a string.

  • on_error (str) – String value to impose a specific behavior if an error occurs during the execution of sql.

  • **kwargs – Optional keyword arguments for SnowflakeCursor.execute().

Returns (DictCursor):

DictCursor object that executed the command.

query(self, sql: str, as_df: bool = False, as_cur: bool = False, as_dcur: bool = False, as_scalar: bool = False, lower: bool = True, on_error: Optional[str] = None)Union[pd.DataFrame, SnowflakeCursor]

Execute a query and return results.

Default behavior of results=True will return results as a pandas.DataFrame, otherwise will execute the sql provided with a SnowflakeCursor and return the cursor object.

Parameters
  • sql (str) – Raw SQL to execute.

  • as_df (bool) – Return results in DataFrame.

  • as_cur (bool) – Return results in Cursor.

  • as_dcur (bool) – Return results in a DictCursor.

  • as_scalar (bool) – Return results as a single scalar value.

  • lower (bool) – Boolean value indicating whether or not to return results with columns lower-cased.

  • on_error (str) – String value to impose a specific behavior if an error occurs during the execution of sql.

Returns (Union[pd.DataFrame, SnowflakeCursor]):

Results from sql as a DataFrame by default or the SnowflakeCursor object if results=False.

class snowmobile.core.connect(creds: Optional[str] = None, delay: bool = False, ensure_alive: bool = True, config_file_nm: Optional[str] = None, from_config: Optional[str, Path] = None, silence: bool = False, **connect_kwargs)

Bases: snowmobile.core.sql.SQL

Primary method of statement execution and accessor to parsed snowmobile.toml.

Parameters
  • creds (Optional[str]) – Alias for the set of credentials to authenticate with; default behavior will fall back to the connection.default-creds specified in snowmobile.toml, or the first set of credentials stored if this configuration option is left blank.

  • delay (bool) – Optionally delay establishing a connection when the object is instantiated, enabling access to the configuration object model through the Connection.cfg attribute; defaults to False.

  • ensure_alive (bool) – Establish a new connection if a method requiring a connection against the database is called while alive is False; defaults to True.

  • config_file_nm (Optional[str]) – Name of configuration file to use; defaults to snowmobile.toml.

  • from_config (Optional[str, Path]) – A full path to a specific configuration file to use; bypasses any checks for a cached file location and can be useful for container-based processes with restricted access to the local file system.

  • **connect_kwargs

    Additional arguments to provide to snowflake.connector.connect(); arguments provided here will over-ride connection arguments specified in snowmobile.toml, including:

    • Connection parameters in connection.default-arguments

    • Credentials parameters associated with a given alias

    • Connection parameters associated with a given alias

Initializes a snowmobile.SQL object.

cfg :Configuration

snowmobile.toml

Type

snowmobile.core.configuration.Configuration

con :Optional[SnowflakeConnection]

Can be None until set by Snowmobile.connect()

Type

SnowflakeConnection

e :ExceptionHandler

Exception / context management

Type

snowmobile.core.exception_handler.ExceptionHandler

ensure_alive :bool

Reconnect to Snowflake if connection is lost

Type

bool

connect(self, **kwargs)snowmobile.core.connection.Snowmobile

Establishes connection to Snowflake.

Re-implements snowflake.connector.connect() with connection arguments sourced from snowmobile’s object model, specifically:

  • Credentials from snowmobile.toml.

  • Default connection arguments from snowmobile.toml.

  • Optional keyword arguments either passed to snowmobile.connect() or directly to this method.

kwargs:

Optional keyword arguments to pass to snowflake.connector.connect(); arguments passed here will over-ride connection.default-arguments specified in snowmobile.toml.

disconnect(self)snowmobile.core.connection.Snowmobile

Disconnect from connection with which Connection() was instantiated.

property alive(self)bool

Check if the connection is alive.

property cursor(self)snowflake.connector.connection.SnowflakeCursor

SnowflakeCursor accessor.

property dictcursor(self)snowflake.connector.DictCursor

DictCursor accessor.

ex(self, sql: str, on_error: Optional[str] = None, **kwargs)snowflake.connector.connection.SnowflakeCursor

Executes a command via SnowflakeCursor.

Parameters
  • sql (str) – sql command as a string.

  • on_error (str) – String value to impose a specific behavior if an error occurs during the execution of sql.

  • **kwargs – Optional keyword arguments for SnowflakeCursor.execute().

Returns (SnowflakeCursor):

SnowflakeCursor object that executed the command.

exd(self, sql: str, on_error: Optional[str] = None, **kwargs)snowflake.connector.DictCursor

Executes a command via DictCursor.

Parameters
  • sql (str) – sql command as a string.

  • on_error (str) – String value to impose a specific behavior if an error occurs during the execution of sql.

  • **kwargs – Optional keyword arguments for SnowflakeCursor.execute().

Returns (DictCursor):

DictCursor object that executed the command.

query(self, sql: str, as_df: bool = False, as_cur: bool = False, as_dcur: bool = False, as_scalar: bool = False, lower: bool = True, on_error: Optional[str] = None)Union[pd.DataFrame, SnowflakeCursor]

Execute a query and return results.

Default behavior of results=True will return results as a pandas.DataFrame, otherwise will execute the sql provided with a SnowflakeCursor and return the cursor object.

Parameters
  • sql (str) – Raw SQL to execute.

  • as_df (bool) – Return results in DataFrame.

  • as_cur (bool) – Return results in Cursor.

  • as_dcur (bool) – Return results in a DictCursor.

  • as_scalar (bool) – Return results as a single scalar value.

  • lower (bool) – Boolean value indicating whether or not to return results with columns lower-cased.

  • on_error (str) – String value to impose a specific behavior if an error occurs during the execution of sql.

Returns (Union[pd.DataFrame, SnowflakeCursor]):

Results from sql as a DataFrame by default or the SnowflakeCursor object if results=False.

class snowmobile.core.Section(cfg: snowmobile.core.Configuration, is_marker: bool = None, h_contents: Optional[str] = None, index: Optional[int] = None, parsed: Optional[Dict] = None, raw: Optional[str] = None, sql: Optional[str] = None, results: Optional[pd.DataFrame] = None, incl_sql_tag: bool = False, is_multiline: bool = False, result_wrap: Optional[str] = None)

Bases: snowmobile.core.Generic

Represents any (1-6 level) header section within Script Name (doc).md.

Class is created with a call to the as_section() method or by the snowmobile.core.markup.Markup class in the case of a Marker.

In order to include execution metadata if available without sacrificing base-case parsing, the below implementation heavily relies properties over attributes to reconcile what’s populated in the st vs executed attributes of Script.

Parameters
  • is_marker (bool) – Information provided is associated with a marker as opposed to a statement; defaults to False.

  • h_contents (str) – String representation of header contents.

  • index (int) – Statement index position or None if marker.

  • parsed (dict) – Parsed arguments from the statement or marker within the script.

  • raw (str) – Raw wrap as parsed was parsed from.

  • sql (str) – Statement’s raw sql or None if marker.

  • results (pd.DataFrame) – Results returned by statement’s sql as a DataFrame; will be None if statement hasn’t been executed or if a marker.

hx

String representation of header level (e.g. ‘#’ for h1), based on the script/statement header-level specifications in snowmobile.toml.

Type

str

Instantiation of a script.Section object.

reorder_attrs(self, parsed: dict, cfg: snowmobile.core.Configuration)Dict

Re-orders parsed attributes based on configuration.

parse_contents(self, cfg: snowmobile.core.Configuration)List[Item]

Unpacks sorted dictionary of parsed attributes into formatted Items.

property header(self)str

Constructs the header for a section.

Uses specifications in snowmobile.toml to determine:
  1. The level of the header depending on whether it’s a statement section or a script section.

  2. Whether or not to include the statement index as part of the header.

Returns

Formatted header line as a string.

property sql_md(self)str

Returns renderable sql or an empty string if script-level section.

property body(self)str

All section content except for header.

property md(self)str

Constructs a full section as a string from various components.

Returns

Full string of valid markdown for the section.

class snowmobile.core.Scope(arg: str, base: str)

Bases: snowmobile.core.Generic

Handles the scope/context for Statement objects and derived classes.

Should never be interacted with from the user-facing API.

base

The left-most word within a statement wrap. For generic st this will be the keyword and for QA statements this will be the literal word qa.

Type

str

component

The component within a given wrap that is being evaluated; this will be exactly one of kw, obj, anchor, desc, or nm.

Type

str

incl_arg

The keyword argument that would be used to exclude a given component;

Type

str

excl_arg

The keyword argument that would be used to exclude a given component; this would be the same as the above example except the value would be excl_kw as opposed to incl_kw.

Type

str

fallback_to

The default values to fall back to for incl_arg and excl_arg if they are not passed as a keyword argument by the user in Script; defaults to including the base and excluding an empty list.

type

dict

provided_args (dict):

The set of keyword arguments provided at the time of the last call to eval().

check_against_args (dict):

The set of keyword arguments checked against at the time of the last call to eval(); will use provided arguments if they exist and the arguments from fallback_to otherwise.

is_included (bool):

Name is included based on the results of the last call to eval().

is_excluded (bool):

Name is excluded based on the results of the last call to eval().

Instantiates a Scope object.

parse_kwargs(self, **kwargs)None

Parses all filter arguments looking for those that match its base.

Looks for include/exclude arguments within kwargs, populating provided_args with those that were provided and populates check_against_args with the same values if they were provided and fills in defaults from fallback_to otherwise.

Parameters

**kwargs – Keyword arguments passed to Script.filter() (e.g. incl_kw, excl_kw, ..)

matches_patterns(self, arg: str)bool

Returns indication of if base matches a given set of patterns.

Parameters

arg (str) – Will either be the value of incl_arg or exclude_arg.

Returns (bool):

Indication of whether any matches were found.

property included(self)

Name is included based on results of last eval().

eval(self, **kwargs)bool

Evaluates filter arguments and updates context accordingly.

Updates the values of is_included, is_excluded, and included.

Parameters

**kwargs – Keyword arguments passed to Script.filter() (e.g. incl_kw, excl_kw, ..)

Returns (bool):

Indicator of whether or not the statement should be included/excluded based on the context/keyword arguments pr.

class snowmobile.core.Name(configuration: snowmobile.core.Configuration, nm_pr: Optional[str] = None, sql: Optional[str] = None, index: Optional[int] = None)

Bases: snowmobile.core.Generic

Handles the decomposition/parsing of statement name.

Should never be instantiated directly by the user-facing API but its attributes are likely to be accessed often as part of Statement and derived classes.

cfg

snowmobile.Configuration object; represents fully parsed snowmobile.toml file.

Type

snowmobile.Configuration

patt

snowmobile.Schema.Pattern object; represents script.patterns section of snowmobile.toml.

Type

snowmobile.Schema.Pattern

_nm_pr

Provided wrap name for a given Statement; can be empty.

Type

str

index

Statement index position within Script; can be empty.

Type

int

is_included

Indicator of whether or not the combination of all scopes for this statement wrap is included within a given context.

Type

bool

incl_idx_in_desc

Indicator of whether or not to include the statement index in the description component of the wrap; defaults to True so that all generated statement tags are guaranteed to be unique for a given script.

  • Mainly included for testing purposes where setting to False enables comparing generated to provided statement tags without having to change the index position of the hard-coded/pr statement wrap when adding/removing tests.

Type

bool

first_line_remainder

The remainder of the first line once excluding the first_keyword and stripping repeating whitespace.

Type

str

scopes

Combination of all scopes for a given wrap; this is essentially the all possible combinations of including/excluding any of the kw, nm, obj, desc, and anchor for a given instance of Name.

Type

set[Scope]

scope(self, **kwargs)bool

Evaluates all component’s of a wrap’s scope against a set of filter args.

**kwargs:

Keyword arguments passed to Script.filter() (e.g. incl_kw, excl_kw, ..)

Returns (bool):

Value indicating whether or not the statement should be included based on the outcome of the evaluation of all of its components.

nm(self, ge: bool = False, pr: bool = False, og: bool = True)str

The final statement’s name that is used by the API.

This will be the full statement name if a tag exists and a parsed/generated name otherwise.

kw(self, ge: bool = False, pr: bool = False)

The final statement’s keyword that is used by the API.

This will be the provided keyword if a statement wrap exists and a parsed/ge keyword otherwise.

obj(self, ge: bool = False, pr: bool = False)

The final statement’s object that is used by the API.

This will be the object within a wrap if a statement wrap exists and follows the correct structure and a parsed/ge object otherwise.

desc(self, ge: bool = False, pr: bool = False)

The final statement’s description that is used by the API.

This will be the description within a wrap if a statement wrap exists and follows the correct structure and a parsed/ge description otherwise.

anchor(self, ge: bool = False, pr: bool = False)

The final statement’s anchor that is used by the API.

This will be the anchor within a wrap if a statement wrap exists and follows the correct structure and a parsed/ge wrap name otherwise.

set(self, key, value)snowmobile.core.name.Name

Custom attribute setting.

class snowmobile.core.Statement(sn: snowmobile.core.connection.Snowmobile, statement: Union[sqlparse.sql.Statement, str], index: Optional[int] = None, attrs_raw: Optional[str] = None, e: Optional[ExceptionHandler] = None, **kwargs)

Bases: snowmobile.core.tag.Attrs, snowmobile.core.Name, snowmobile.core.Generic

Base class for all Statement objects.

Home for attributes and methods that are associated with all statement objects, generic or QA.

sn

snowmobile.connect object.

Type

snowmobile.connect

statement

A sqlparse.sql.Statement object.

Type

Union[sqlparse.sql.Statement, str]

index

The context-specific index position of a statement within a script; can be None.

Type

int

patterns

config.Pattern object for more succinct access to values specified in snowmobile.toml.

Type

config.Pattern

results

The results of the statement if executed as a pandas.DataFrame.

Type

pd.DataFrame

outcome

Numeric indicator of outcome; defaults to 0 and is modified based on the outcome of statement execution and/or QA validation for derived classes.

Type

int

outcome_txt

Plain text of outcome (‘skipped’, ‘failed’, ‘completed’, ‘passed’).

Type

str

outcome_html

HTML text for the outcome as an admonition/information banner based on the following mapping of outcome_txt to admonition argument:

  • failed ——> warning

  • completed –> info

  • passed —–> success

Type

str

start_time

Unix timestamp of the query start time if executed; 0 otherwise.

Type

int

end_time

Unix timestamp of the query end time if executed; 0 otherwise.

Type

int

execution_time

Execution time of the query in seconds if executed; 0 otherwise.

Type

int

execution_time_txt

Plain text description of execution time if executed; returned in seconds if execution time is less than 60 seconds, minutes otherwise.

Type

str

first_keyword

The first keyword within the statement as a sqlparse.sql.Token.

Type

sqlparse.sql.Token

sql

The sql associated with the statement as a raw string.

Type

str

Initialize self. See help(type(self)) for accurate signature.

sql(self, set_as: Optional[str] = None, tag: bool = False)Union[str, Statement]

Raw sql from statement, including result limit if enabled.

parse(self)Tuple[Dict, str]

Parses tag contents into a valid dictionary.

Uses the values specified in snowmobile.toml to parse a raw string of statement attributes into a valid dictionary.

Note

  • If is_multiline is True and name is not included within the arguments, an assertion error will be thrown.

  • If is_multiline is False, the raw string within the wrap will be treated as the name.

  • The wrap attribute is set once parsing is completed and name has been validated.

Returns (dict):

Parsed wrap arguments as a dictionary.

start(self)

Sets start_time attribute.

end(self)

Updates execution time attributes.

In namespace, sets:
trim(self)str

Statement as a string including only the sql and a single-line wrap name.

Note

The wrap name used here will be the user-pr wrap from the original script or a generated Name.nm if a wrap was not provided for a given statement.

property is_derived(self)

Indicates whether or not it’s a generic or derived (QA) statement.

property lines(self)List[str]

Returns each line within the statement as a list.

as_section(self, incl_sql_tag: Optional[bool] = None, result_wrap: Optional[str] = None)snowmobile.core.Section

Returns current statement as a Section object.

set_state(self, index: Optional[int] = None, ctx_id: Optional[int] = None, in_context: Optional[bool] = None, filters: dict = None)snowmobile.core.statement.Statement

Sets current state/context on a statement object.

Parameters
  • ctx_id (int) – Unix timestamp the script.filter() context manager was invoked.

  • filters (dict) – Kwargs passed to script.filter().

  • index (int) – Integer to set as the statement’s index position.

reset(self, index: bool = False, ctx_id: bool = False, in_context: bool = False, scope: bool = False)snowmobile.core.statement.Statement

Resets attributes on the statement object to reflect as if read from source.

In its current form, includes:
  • Resetting the statement/wrap’s index to their original values.

  • Resetting the is_included attribute of the statement’s wrap to True.

  • Populating error_last with errors from current context.

  • Caching current context’s timestamp and resetting back to None.

process(self)

Used by derived classes for post-processing the returned results.

run(self, as_df: bool = True, lower: bool = True, render: bool = False, on_error: Optional[str] = None, on_exception: Optional[str] = None, on_failure: Optional[str] = None, ctx_id: Optional[int] = None)snowmobile.core.statement.Statement

Run method for all statement objects.

Parameters
  • as_df (bool) – Store results of query as pandas.DataFrame or SnowflakeCursor.

  • lower (bool) – Lower case column names in results DataFrame if as_df=True.

  • render (bool) – Render the sql executed as markdown.

  • on_error (str) –

    Behavior if an execution/database error is encountered
    • None: default behavior, exception will be raised

    • c: continue with execution

  • on_exception (str) –

    Behavior if an exception is raised in the post-processing of results from a derived class of Statement ( Empty and Diff).

    • None: default behavior, exception will be raised

    • c: continue with execution

  • on_failure (str) –

    Behavior if no error is encountered in execution or post-processing but the result of the post-processing has turned the statement’s outcome attribute to False, indicating the results returned by the statement have failed validation.

    • None: default behavior, exception will be raised

    • c: continue with execution

Returns (Statement):

Statement object post-executing query.

outcome_txt(self, _id: Optional[int] = None)str

Outcome as a string.

property outcome_html(self)str

Outcome as an html admonition banner.

class snowmobile.core.Column(original: str, current: Optional[str] = None, prior: Optional[str] = None, src: Optional[str] = None)

Bases: snowmobile.core.Generic

A single column within a SnowFrame.

original

Original column name.

Type

str

current

Current version of the column name.

Type

str

prior

Prior version of the column name (version n-1).

Type

str

src

Column source (original df or added by snowmobile).

Type

str

original

Original column name.

Type

str

src

Source of column; ‘df’ if from source DataFrame, ‘snowmobile’ otherwise.

Type

str

current

Current column name.

Type

str

prior

Prior (version of) column name.

Type

str

update(self)

Migrate from prior to current context within this context.

lower(self)str

Lower case column.

upper(self)str

Upper case column.

static dedupe(current: str, char: Optional[str] = None)str

Dedupes consecutive characters within a string.

Note

  • Must iterate through matches and perform replacements in the order of the largest to the smallest by number of characters; this is to avoid altering the matches found before replacing them.

Parameters
  • current (str) – String containing characters to dedupe.

  • char (str) – Character to dedupe.

reformat(self, fill_char: Optional[str] = None, dedupe_special: bool = True)str

Reformat column for a load to the database.

Parameters
  • fill_char (str) – Character to replace special characters and whitespace with; defaults to _.

  • dedupe_special (bool) – Dedupe consecutive special characters; defaults to True.

class snowmobile.core.Diff(sn: snowmobile.core.connection.Snowmobile = None, **kwargs)

Bases: snowmobile.core.qa.QA

QA class for comparison of values within a table based on partitioning on a field.

partition_on

Column name to partition data on before comparing the partitioned datasets; defaults to ‘src_description`.

Type

str

end_index_at

Column name that marks the last column to use as an index column when joining the partitioned datasets back together.

Type

str

compare_patterns

Regex patterns to match columns on that should be included in comparison (numeric columns you’re running QA on).

Type

list

ignore_patterns

Regex patterns to match columns on that should be ignored both for the comparison and the index.

Type

list

generic_metric_col_nm

Column name to use for the melted field names; defaults to ‘Metric’.

Type

str

compare_cols

Columns that are used in comparison once statement is executed and parsing is applied.

Type

list

drop_cols

Columns that are dropped once statement is executed and parsing is applied.

Type

list

idx_cols

Columns that are used for the index to join the data back together once statement is executed and parsing is applied.

Type

list

ub_raw

Maximum absolute raw difference (upper bound) that two fields that are being compared can differ from each other without causing a failure.

Type

float

ub_perc

Maximum absolute percentage difference (upper bound) that two comparison fields can differ from each other without causing a failure.

Type

float

Instantiates a qa-diff statement.

Parameters
  • delta_column_suffix (str) – Suffix to add to columns that comparison is being run on; defaults to ‘Delta’.

  • partition_on (str) – Column to partition the data on in order to compare.

  • end_index_at (str) – Column name that marks the last column to use as an index when joining the partitioned datasets back together.

  • compare_patterns (list) – Regex patterns matching columns to be included in comparison.

  • ignore_patterns (list) – Regex patterns to match columns on that should be ignored both for the comparison and the index.

  • generic_metric_col_nm (str) – Column name to use for the melted field names; defaults to ‘Metric’.

  • raw_upper_bound (float) – Maximum absolute raw difference that two fields that are being compared can differ from each other without causing a failure.

  • percentage_upper_bound (float) – Maximum absolute percentage difference that two comparison fields can differ from each other without causing a failure.

split_cols(self)snowmobile.core.qa.Diff

Post-processes results returned from a qa-diff statement.

Executes private methods to split columns into:
  • Index columns

  • Drop columns

  • Comparison columns

Then runs checks needed to ensure minimum requirements are met in order for a valid partition/comparison to be made.

property partitioned_by(self)Set[Any]

Distinct values within the partition_on column that data is partitioned by.

static partitions_are_equal(partitions: Dict[str, pd.DataFrame], abs_tol: float, rel_tol: float)bool

Evaluates if a dictionary of DataFrames are identical.

Parameters
  • partitions (Dict[str, pd.DataFrame]) – A dictionary of DataFrames returned by snowmobile.DataFrame().

  • abs_tol (float) – Absolute tolerance for difference in any value amongst the DataFrames being compared.

  • rel_tol (float) – Relative tolerance for difference in any value amongst the DataFrames being compared.

Returns (bool):

Indication of equality amongst all the DataFrames contained in partitions.

process(self)snowmobile.core.qa.Diff

Post-processing for Diff-specific results.

class snowmobile.core.Empty(sn: snowmobile.core.connection.Snowmobile, **kwargs)

Bases: snowmobile.core.qa.QA

QA class for verification that a statement’s results are empty.

The most widely applicable use of Empty is for simple verification that a table’s dimensions are as expected.

Initialize self. See help(type(self)) for accurate signature.

process(self)snowmobile.core.qa.QA

Over-ride method; checks if results are empty and updates outcome

class snowmobile.core.SnowFrame(df: pandas.DataFrame)

Bases: snowmobile.core.Generic

Extends a DataFrame with a .snf entry point.

shared_cols(self, df2: pandas.DataFrame)List[Tuple[pd.Series, pd.Series]]

Returns list of tuples containing column pairs that are common between two DataFrames.

static series_max_diff_abs(col1: pandas.Series, col2: pandas.Series, tolerance: float)bool

Determines if the max absolute difference between two pandas.Series is within a tolerance level.

static series_max_diff_rel(col1: pandas.Series, col2: pandas.Series, tolerance: float)bool

Determines if the maximum relative difference between two pandas.Series is within a tolerance level.

df_max_diff_abs(self, df2: pandas.DataFrame, tolerance: float)bool

Determines if the maximum absolute difference between any value in the shared columns of 2 DataFrames is within a tolerance level.

df_max_diff_rel(self, df2: pandas.DataFrame, tolerance: float)bool

Determines if the maximum relative difference between any value in the shared columns of 2 DataFrames is within a tolerance level.

df_diff(self, df2: pandas.DataFrame, abs_tol: Optional[float] = None, rel_tol: Optional[float] = None)bool

Determines if the column-wise difference between two DataFrames is within a relative or absolute tolerance level.

Note

  • df1 and df2 are assumed to have a shared, pre-defined index.

  • Exactly one of abs_tol and rel_tol is expected to be a a valid float; the other is expected to be None.

  • If valid float values are provided for both abs_tol and rel_tol, the outcome of the maximum absolute difference with respect to abs_tol will be returned regardless of the value of rel_tol.

Parameters
  • df2 (pd.DataFrame) – 2nd DataFrame for comparison.

  • abs_tol (float) – Absolute tolerance; default is None.

  • rel_tol (float) – Relative tolerance; default is None.

Returns (bool):

Boolean indicating whether or not difference is within tolerance.

partitions(self, on: str)Dict[str, pd.DataFrame]

Returns a dictionary of DataFrames given a DataFrame and a partition column.

Note

  • The number of distinct values within partition_on column will be 1:1 with the number of partitions that are returned.

  • The partition_on column is dropped from the partitions that are returned.

  • The depth of a vertical concatenation of all partitions should equal the depth of the original DataFrame.

Parameters

on (str) – The column name to use for partitioning the data.

Returns (Dict[str, pd.DataFrame]):

Dictionary of {(str) partition_value: (pd.DataFrame) associated subset of df}

ddl(self, table: str)str

Returns a string containing ‘create table’ DDL given a table name

lower(self, col: Optional[str] = None)pandas.DataFrame

Lower cases all column names or all values within col if pr.

upper(self, col: Optional[str] = None)pandas.DataFrame

Upper cases all column names or all values within col if pr.

reformat(self)

Re-formats DataFrame’s columns via Column.reformat().

append_dupe_suffix(self)

Adds a trailing index number ‘_i’ to duplicate column names.

to_list(self, col: Optional[str] = None, n: Optional[int] = None)List

Succinctly retrieves a column as a list.

Parameters
  • col (str) – Name of column.

  • n (int) – Number of records to return; defaults to full depth of column.

add_tmstmp(self, col_nm: Optional[str] = None)pandas.DataFrame

Adds a column containing the current timestamp to a DataFrame.

Parameters

col_nm (str) – Name for column; defaults to LOADED_TMSTMP.

property original(self)pandas.DataFrame

Returns the DataFrame in its original form (drops columns added by SnowFrame and reverts to original column names).

property has_dupes(self)bool

DataFrame has duplicate column names.

cols_matching(self, patterns: List[str], ignore_patterns: List[str] = None)List[str]

Returns a list of columns given a list of patterns to find.

Parameters
  • patterns (List[str]) – List of regex patterns to match columns on.

  • ignore_patterns (List[str]) – Optional list of regex patterns to exclude.

Returns (List[str]):

List of columns found/excluded.

cols_ending(self, nm: str, ignore_patterns: Optional[List] = None)List[str]

Returns all columns up to nm in a DataFrame.

Parameters
  • nm (str) – Name of column to end index at.

  • ignore_patterns (List[str]) – Optional list of regex patterns to exclude in the list that’s returned; primarily used to for getting end-index-at list while excluding src_description.

Returns (List[str]):

List of column names matching criterion.

class snowmobile.core.SQL(_query_func: Callable, _cfg: snowmobile.core.configuration.Configuration, nm: Optional[str] = None, schema: Optional[str] = None, obj: Optional[str] = None, auto_run: Optional[bool] = True)

Bases: snowmobile.core.Generic

SQL class for generation & execution of common sql commands.

Intended to be interacted with as a parent of Snowmobile.

Note

  • All arguments except for sn are optional.

  • The benefit of setting the other attributes on an instance of SQL is to (optionally) avoid passing the same information to multiple methods when generating a variety of statements around the same object.

nm

Object name to use in generated sql (e.g. ‘some_table_name’)

Type

str

obj

Object type to use in generated sql (e.g. ‘table’)

Type

str

schema

Schema to use when dot-prefixing sql; defaults to the schema with which the sn is connected to.

Type

str

auto_run

Indicates whether to automatically execute the sql generated by a given method; defaults to True

Type

bool

Initializes a snowmobile.SQL object.

info_schema(self, loc: str, where: Optional[List[str]] = None, fields: Optional[List[str]] = None, order_by: Optional[List] = None, run: Optional[bool] = None)Union[str, pd.DataFrame]

Generic case of selecting from information schema location.

table_info(self, nm: Optional[str] = None, fields: List[str] = None, restrictions: Dict[str, str] = None, order_by: List[Optional[str, int]] = None, all_schemas: bool = False, run: Optional[bool] = None)Union[str, pd.DataFrame]

Query information_schema.tables for a given table or view.

Parameters
  • nm (str) – Table name, including schema if creating a stage outside of the current schema.

  • fields (List[str]) – List of fields to include in returned results (e.g. [‘table_name’, ‘table_type’, ‘last_altered’])

  • restrictions (List[str]) – List of conditionals typed as literal components of a where clause (e.g. [“table_type = ‘base table’”, ‘last_altered::date = current_date()’] ).

  • order_by (List[str]) – List of fields or their ordinal positions to order the results by.

  • all_schemas (bool) – Include tables/views from all schemas; defaults to False.

  • run (bool) – Determines whether to run the generated sql or not; defaults to None which will reference the current value of the auto_run attribute which defaults to True.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

column_info(self, nm: Optional[str] = None, fields: Optional[List] = None, restrictions: Optional[Dict] = None, order_by: Optional[List] = None, all_schemas: bool = False, run: Optional[bool] = None)Union[str, pd.DataFrame]

Query information_schema.columns for a given table or view.

Parameters
  • nm (str) – Table name, including schema if creating a stage outside of the current schema.

  • fields (List[str]) – List of fields to include in returned results (e.g. [‘ordinal_position’, ‘column_name’, ‘data_type’])

  • restrictions (List[str]) – List of conditionals typed as literal components of a where clause (e.g.[“regexp_count(lower(column_name), ‘tmstmp’) = 0”]).

  • order_by (List[str]) – List of fields or their ordinal positions to order the results by.

  • all_schemas (bool) – Include tables/views from all schemas; defaults to False.

  • run (bool) – Determines whether to run the generated sql or not; defaults to None which will reference the current value of the auto_run attribute which defaults to True.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

columns(self, nm: Optional[str] = None, from_info_schema: bool = False, lower: bool = False, run: Optional[bool] = None)Union[str, List]

Returns an ordered list of columns for a table or view.

Note

  • Default behavior is to retrieve the columns for a table or view by selecting a single sample record and returning the column index from the DataFrame that’s returned which is much faster than selecting the column_names from information_schema.columns pulling column names from the information schema

  • This can be changed by passing from_info_schema=True.

Parameters
  • nm (str) – Name of table or view, including schema if the table or view is outside of the current schema.

  • from_info_schema (bool) – Indicates whether to retrieve columns via the information_schema.columns or by selecting a sample record from the table or view; defaults to False.

  • lower (bool) – Lower case each column in the list that’s returned.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

Returns (Union[str, List]):
Either:
  1. An ordered list of columns for the table or view, or

  2. The query against the table or view as a str of sql.

select(self, nm: Optional[str] = None, fields: Optional[List[str]] = None, apply: Optional[List[Tuple[str, str]]] = None, n: Optional[int] = None, run: Optional[bool] = None, **kwargs)Union[str, pd.DataFrame]

Generic select statement.

Parameters
  • nm (str) – Table to select from, including schema if the table is outside of the current schema

  • fields (Optional[List[str]]) – Select these fields (optional).

  • apply (Optional[List[Tuple[str, str]]]) –

    Select aggregations of these fields.

    apply [

    (this_func, to_this_field, [as_alias]), (.., .., [..]),

    ]

    • apply should be provided as a list of tuples, each containing a minimum of 2 items (respectively) representing the aggregate function to apply and the field to which it should be applied

    • By default, the aggregated result inherits the name of the field being aggregated, including any qualifier (optionally) provided with the field name or an explicit alias included as a 3rd item within the tuple

      The following snippet exhaustively illustrates the functionality described above

    sn.select(
       nm='sandbox.sample_table',
       apply=[
           ('count', 'col1'),
           ('count', 'distinct col1'),
           ('count', 'distinct col1', 'col1_dst'),
       ],
       run=False,
    )
    >>>
    select
     count(col1)
       as col1
     ,count(distinct col1)
       as distinct_col1
     ,count(distinct col1)
       as col1_dst
    from sandbox.sample_table
    

  • n (int) – Number of records to return, implemented as a ‘limit’ clause in the query; defaults to 1.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

Returns (Union[str, pd.DataFrame]):

Either:

  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

exists(self, nm: Optional[str] = None)bool

Checks the existence of a table or view.

Parameters

nm (str) – Name of table or view, including schema if the table or view is outside of the current schema.

Returns (bool):

Boolean indication of whether or not the table or view exists.

is_distinct(self, nm: Optional[str] = None, field: Optional[str] = None)bool

Checks if table nm is distinct on column on_col

Parameters
  • nm (str) – Table name.

  • field (str) – Column name.

count(self, nm: Optional[str] = None, of: Optional[str] = None, dst_of: Optional[str] = None, as_perc: Optional[bool] = None, run: Optional[bool] = None)Union[int, float]

Number of records within a table or view.

Parameters
  • nm (str) – Table name, including schema if querying outside current schema.

  • of (str) – Column name (indistinct).

  • dst_of (str) – Column name (distinct).

  • as_perc (bool) – Option to return distinct count of the dst_of column as a percentage of the namespace depth of the table or view.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

show(self, obj: str, in_loc: Optional[str] = None, names: bool = False, run: Optional[bool] = None, **kwargs)Union[pd.DataFrame, List[str], str]

Show schema objects of typ ‘obj’, optionally ‘in_loc’.

Parameters
  • obj (str) – Schema object type (‘tables’, ‘file formats’, etc).

  • in_loc (str) – Snowflake location (‘in schema sandbox’, ‘in database prod’, etc).

  • names (bool) – Return a list of schema object names only (‘name’ field).

  • run (bool) – Execute the generated sql or return it as a string.

Returns (Union[pd.DataFrame, str]):
Either:
  1. The results of the query as a pandas.DataFrame

  2. The ‘names’ column of the results returned as a list

  3. The generated query as a str of sql

ddl(self, nm: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None)str

Query the DDL for an schema object.

Parameters
  • nm (str) – Name of the object to get DDL for, including schema if object is outside of the current schema.

  • obj (str) – Type of object to get DDL for (e.g. ‘table’, ‘view’, ‘file-format’).

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

Returns (str):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

comment(self, nm: Optional[str] = None, obj: Optional[str] = None, set_as: Optional[str] = None, from_json: bool = False, as_json: bool = False, run: Optional[bool] = None, **kwargs)Union[str, Dict]

Get or set comment on a schema object.

Parameters
  • nm (str) – Name of the schema object, including schema prefix if object is outside implicit scope of the current connection.

  • obj (str) – Type of schema object (e.g. ‘table’, ‘schema’, etc).

  • set_as (str) – Content to set as comment on schema object.

  • from_json (bool) – Parse schema object comment as a string of json and return it as a dictionary.

  • as_json (bool) – Dump contents of ‘set_as’ to a string of json prior to setting comment.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

  • **kwargs – Keyword argument to pass to json.loads(comment) if from_json=True.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The schema object comment as a str

  2. The generated query as a str of sql.

  3. The schema object comment as a dictionary if from_json=True

last_altered(self, nm: Optional[str] = None, run: Optional[bool] = None)Union[str, pd.Timestamp]

Last altered timestamp for a table or view.

Parameters
  • nm (str) – Table name, including schema if creating a stage outside of the current schema.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

truncate(self, nm: Optional[str] = None, run: Optional[bool] = None)Union[str, pd.DataFrame]

Truncate a table.

Parameters
  • nm (str) – Name of table, including schema if the table is outside of the current schema.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

drop(self, nm: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None)Union[str, pd.DataFrame]

Drop a Snowflake object.

Parameters
  • nm (str) – Schema object’s name.

  • obj (str) – Type of schema object (e.g. ‘table’, ‘view’, or ‘schema’)

  • run (bool) – Execute generated statement; defaults to True, otherwise returns sql as a string.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

clone(self, nm: Optional[str] = None, to: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None, replace: bool = False)Union[str, pd.DataFrame]

Clone a Snowflake object.

Warning

Note

  • In this specific method, the value provided to nm and to can be a single object name, a single schema, or both in the form of obj_schema.obj_name depending on the desired outcome.

  • Additionally, at least one of the nm or to arguments must be pr.

  • The defaults for the target object are constructed such that users can either:

    1. Clone objects to other schemas that inherit the source object’s name without specifying so in the to argument, or

    2. Clone objects within the current schema that inherit the source object’s schema without specifying so in the to argument.

  • If providing a schema without a name to either argument, prefix the value provided with __ to signify it’s a schema and not a lower-level object to be cloned.

    • e.g. providing nm=’sample_table’ and to=’__sandbox’ will clone sample_table from the current schema to sandbox.sample_table.

  • An assertion error will be raised raised if neither argument is specified as this would result in a command to clone an object and store it in an object that has the same name & schema as the object being cloned.

Parameters
  • nm (str) – Name of the object to clone, including schema if cloning an object outside of the current schema.

  • to (str) – Target name for cloned object, including schema if cloning an object outside of the current schema.

  • obj (str) – Type of object to clone (e.g. ‘table’, ‘view’, ‘file-format’); defaults to table.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

  • replace (bool) – Indicates whether to replace an existing stage if pre-existing; default is False.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

create_stage(self, nm_stage: str, nm_format: str, replace: bool = False, run: Optional[bool] = None)Union[str, pd.DataFrame]

Create a staging table.

Parameters
  • nm_stage (str) – Name of stage to create, including schema if creating a stage outside of the current schema.

  • nm_format (str) – Name of file format to specify for the stage, including schema if using a format from outside of the current schema.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

  • replace (bool) – Indicates whether to replace an existing stage if pre-existing; default is False.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

put_file_from_stage(self, path: Union[Path, str], nm_stage: str, options: Optional[Dict] = None, ignore_defaults: bool = False, run: Optional[bool] = None)Union[str, pd.DataFrame]

Generates a ‘put’ command into a staging table from a local file.

Parameters
  • path (Union[Path, str]) – Path to local data file as a pathlib.Path or string.

  • nm_stage (str) – Name of the staging table to load into.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

  • options (dict) – Optional arguments to add to put statement in addition to the values specified in the loading.put section of snowmobile.toml.

  • ignore_defaults (bool) – Option to ignore the values specified in snowmobile.toml; defaults to False.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

copy_into_table_from_stage(self, nm: str, nm_stage: str, options: Optional[Dict] = None, ignore_defaults: bool = False, run: Optional[bool] = None)Union[str, pd.DataFrame]

Generates a command to copy data into a table from a staging table.

Parameters
  • nm (str) – Name of the object to drop, including schema if creating a stage outside of the current schema.

  • nm_stage (str) – Name of the staging table to load from.

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

  • options (dict) – Optional arguments to add to put statement in addition to the values specified in the loading.put section of snowmobile.toml.

  • ignore_defaults (bool) – Option to ignore the values specified in snowmobile.toml; defaults to False.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

current(self, obj: str, run: Optional[bool] = None)Union[str, Union[str, int]]

Generic implementation of ‘select current’ for session-based objects.

Parameters
  • obj (str) – Type of object to retrieve information for (schema, session, ..).

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

current_session(self, run: Optional[bool] = None)Union[str, pd.DataFrame]

Select the current session.

current_schema(self, run: Optional[bool] = None)Union[str, pd.DataFrame]

Select the current schema.

current_database(self, run: Optional[bool] = None)Union[str, pd.DataFrame]

Select the current database.

current_warehouse(self, run: Optional[bool] = None)Union[str, pd.DataFrame]

Select the current warehouse.

current_role(self, run: Optional[bool] = None)Union[str, pd.DataFrame]

Select the current role.

use(self, obj: str, nm: str, run: Optional[bool] = None)

Generic implementation of ‘use’ command for schema objects.

Parameters
  • nm (str) – Name of object to use (schema name, warehouse name, role name, ..).

  • obj (str) – Type of object to use (schema, warehouse, role, ..).

  • run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.

Returns (Union[str, pd.DataFrame]):
Either:
  1. The results of the query as a pandas.DataFrame, or

  2. The generated query as a str of sql.

use_schema(self, nm: Optional[str] = None, run: Optional[bool] = None)Union[str, pd.DataFrame]

Use schema command.

use_database(self, nm: Optional[str] = None, run: Optional[bool] = None)Union[str, pd.DataFrame]

Use database command.

use_warehouse(self, nm: Optional[str] = None, run: Optional[bool] = None)Union[str, pd.DataFrame]

Use warehouse command.

use_role(self, nm: Optional[str] = None, run: Optional[bool] = None)Union[str, pd.DataFrame]

Use role command.

static order(by: List[Union[int, str]])str

Generates ‘order by’ clause from a list of fields or field ordinal positions.

static where(restrictions: Dict)str

Generates a ‘where’ clause based on a dictionary of restrictions.

Parameters

restrictions (dict) – A dictionary of conditionals where each key/value pair respectively represents the left/right side of a condition within a ‘where’ clause.

Returns (str):

Formatted where clause.

static fields(fields: Optional[List[str]] = None)str

Utility to generate fields within a ‘select’ statement.

class snowmobile.core.Markup(sn: snowmobile.core.connection.Snowmobile, path: pathlib.Path, contents: Dict[int, Union[Statement, Marker]], nm: Optional[str] = None, prefix: Optional[str] = None, suffix: Optional[str] = None, root_dir: Optional[Union[str, Path]] = None, sub_dir: Optional[str] = None, incl_sql: bool = True, incl_markers: bool = True, incl_sql_tag: bool = False, incl_exp_ctx: bool = True, result_wrap: Optional[str] = None)

Bases: snowmobile.core.Generic

Contains all sections within the context of a Script.

Parameters
  • sn (Snowmobile) – A Snowmobile instance.

  • path (Path) – A full path to the sql file that script was instantiated from.

  • contents (Dict[int, Union[Statement, Marker]]) – A dictionary of the script’s contents (st and markers) by index position.

  • nm (Optional[str]) – Alternate file name to use; defaults to path.name.

  • prefix (Optional[str]) – Prefix to prepend to original file name when exporting.

  • suffix (Optional[str]) – Suffix to append to original file name when exporting.

  • root_dir (Optional[Union[str, Path]]) – Alternate target directory for exports; defaults to ./.snowmobile where . is the directory containing the sql file that the script was created from.

  • sub_dir (Optional[str]) – Alternate sub-directory name; defaults to path.name where path is a full Path to the sql file that the script was created from.

  • incl_sql (bool) – Include statements in export.

  • incl_markers (bool) – Include markers in export.

  • incl_sql_tag (bool) – Include the raw wrap in the sql that is rendered in the md export.

  • incl_exp_ctx (bool) – Include (configurable) disclaimer at the top of exported sql file.

exported

List of file paths that current instance has exported to.

Type

List[Path]

created

List of directory paths that current instance has created (should mostly apply for initial scaffolding build on first run only).

Type

List[Path]

property export_dir(self)pathlib.Path

Documentation sub-directory; .snowmobile by default.

property sections(self)Dict[int, Section]

Dictionary of all sections by index position.

property markdown(self)str

Full markdown file as a string.

property sql(self)

SQL for save.

save(self, md: bool = True, sql: bool = True)None

Save files to disk.

Parameters
  • md (bool) – Export a generated markdown file.

  • sql (bool) – Export a generated sql file.

class snowmobile.core.Script(sn: Optional[Snowmobile] = None, path: Optional[Path, str] = None, sql: Optional[str] = None, as_generic: bool = False, delay: bool = True, **kwargs)

Bases: snowmobile.core.Generic

Parser and operator of local sql files.

Parameters
  • sn (snowmobile.core.connection.Snowmobile) – An instance of Snowmobile.

  • path (Optional[str]) – A full path to a sql file or readable text file containing valid sql code.

  • path – A raw string of valid sql code as opposed to reading from a path.

  • as_generic (bool) – Instantiate all statements as generic st; skips all checks for a mapping of a statement anchor to a derived statement class to instantiate in the place of a generic Statement.

  • delay (bool) – Delay connection of the Snowmobile; only applicable if the sn argument is omitted and Script is instantiating a Snowmobile in its absence.

  • **kwargs – Any keyword arguments to pass to Snowmobile; only applicable if the sn argument is omitted and Script is instantiating a Snowmobile in its absence

sn

An instance of Snowmobile

Type

snowmobile.core.connection.Snowmobile

patterns

Configured patterns from snowmobile.toml.

Type

snowmobile.core.cfg.script.Pattern

as_generic

Instantiate all statements as generic st; skips all checks for a mapping of a statement anchor to a derived statement class to instantiate in the place of a generic Statement.

Type

bool

filters

Dictionary of filters that have been passed to the current instance of snowmobile.core.Script.

Type

Dict[Any[str, int], Dict[str, Set]]

markers

Dictionary of all markers found in the script.

Type

Dict[int, cfg.Marker]

path

Path to sql file (e.g. full/path/to/script.sql).

Type

Path

name

Name of sql file (e.g. script.sql).

Type

str

source

Raw sql text of script; will be the text contained in the raw sql file when initially read from source and reflect any modifications to the script’s contents made post-instantiation.

Type

str

read(self, path: pathlib.Path = None)snowmobile.core.script.Script

Runs quick path validation and reads in a sql file as a string.

A valid path must be provided if the script.path attribute hasn’t been set; ValueErrors will be thrown if neither is valid.

Parameters

path (pathlib.Path) – Full path to a sql object.

from_str(self, sql: str, name: str, directory: pathlib.Path = Path.cwd())snowmobile.core.script.Script

Instantiates a raw string of sql as a script.

source(self, original: bool = False)str

The script’s sql as a raw string.

parse_one(self, s: Union[sqlparse.sql.Statement, str], index: Optional[int] = None, nm: Optional[str] = None)None

Adds a statement object to the script.

Default behavior will only add sqlparse.sql.Statement objects returned from script.source_stream.

clean_parse() utility function is utilized so that generated sql within Python can be inserted back into the script as raw strings.

Parameters
  • s (Union[sqlparse.sql.Statement, str]) – A sqlparse.sql.Statement object or a raw string of SQL for an individual statement.

  • index (int) – Index position of the statement within the script; defaults to n + 1 if index is not provided where n is the number of statements within the script at the time parse_one() is called.

  • nm (Optional[str]) – Optionally provided the name of the statement being added; the script instance will treat this value as if it were provided within an in-script wrap.

parse_stream(self, stream: str)None

Parses a stream of sql and adds onto existing Script contents.

filter(self, incl_kw: Optional[List[str], str] = None, incl_obj: Optional[List[str], str] = None, incl_desc: Optional[List[str], str] = None, incl_anchor: Optional[List[str], str] = None, incl_nm: Optional[List[str], str] = None, excl_kw: Optional[List[str], str] = None, excl_obj: Optional[List[str], str] = None, excl_desc: Optional[List[str], str] = None, excl_anchor: Optional[List[str], str] = None, excl_nm: Optional[List[str], str] = None, as_id: Optional[Union[str, int]] = None, from_id: Optional[Union[str, int]] = None, last: bool = False)ContextManager[Script]

Subset the script based on attributes of its st.

script.filter() returns a modified instance of script that can be operated on within the context defined.

Note

Keyword arguments beginning with incl or excl expect a string or a list of strings containing regex patterns with which to check for a match against the associated attribute of its st’ Name.

Parameters
  • incl_kw – Include only kw

  • incl_obj – Include only obj

  • incl_desc – Include only desc

  • incl_anchor – Include only anchor

  • incl_nm – Include only nm

  • excl_kw – Exclude kw

  • excl_obj – Exclude obj

  • excl_desc – Exclude desc

  • excl_anchor – Exclude anchor

  • excl_nm – Exclude nm

  • as_id – ID to assign the filters passed to method; used to populated the filters attribute

  • from_id – ID previously used on the same instance of Script from which to populate filtered arguments

  • last – Re-use the last set of filters passed to context manager.

Returns (Script):

The instance of script based on the context imposed by arguments pr.

property depth(self)int

Count of statements in the script.

property lines(self)int

Number of lines in the script

property excluded(self)

All statements by index position excluded from the current context.

property executed(self)Dict[int, Statement]

Executed statements by index position included in the current context.

reset(self, index: bool = False, ctx_id: bool = False, in_context: bool = False, scope: bool = False, _filter: bool = False)snowmobile.core.script.Script

Resets indices and scope on all statements to their state as read from source.

Invoked before exiting filter() context manger to reverse the revised indices set by index_to() and inclusion/ exclusion scope set by Statement.Name.scope().

property duplicates(self)Dict[str, int]

Dictionary of indistinct statement names/tags within script.

s(self, _id: Optional[str, int] = None)Any[Statement, Empty, Diff]

Fetch a single statement by _id.

property st(self)Dict[Union[int, str], Statement]

Accessor for all statements.

dtl(self, full: bool = False, excluded: bool = False, title: bool = True, r: bool = False)Union[str, None]

Prints summary of statements within the current scope to console.

property first_s(self)

First statement by index position.

property last_s(self)

Last statement by index position

property first(self)Union[Statement, Empty, Diff]

First statement executed.

property last(self)Union[Statement, Empty, Diff]

Last statement executed.

doc(self, nm: Optional[str] = None, prefix: Optional[str] = None, suffix: Optional[str] = None, incl_markers: Optional[bool] = True, incl_sql: Optional[bool] = True, incl_exp_ctx: Optional[bool] = True, result_wrap: Optional[str] = None)snowmobile.core.Markup

Returns a Markup from the script.

Parameters
  • nm (Optional[str]) – Alternate file name to use.

  • prefix (Optional[str]) – Prefix for file name.

  • suffix (Optional[str]) – Suffix for file name.

  • incl_markers (Optional[bool]) – Include markers in exported files.

  • incl_sql (Optional[bool]) – Include sql in exported files.

  • incl_exp_ctx (Optional[bool]) – Include disclaimer of programmatic save in exported sql file.

Returns

A Markup instance based on the contents included in the script’s context.

ids(self, _id: Optional[Union[Tuple, List]] = None)List[int]

Utility function to get a list of statement IDs given an _id.

Invoked within script.run() if the _id parameter is either a:
  1. tuple of integers (lower and upper bound of statement indices to run)

  2. list of integers or strings (statement names or indices to run)

  3. default=None; returns all statement indices within scope if so

Parameters

_id (Union[Tuple, List]) – _id field provided to script.run() if it’s neither an integer or a string.

Returns (List[int]):

A list of statement indices to run.

run(self, _id: Optional[str, int, Tuple[int, int], List] = None, as_df: bool = True, on_error: Optional[str] = None, on_exception: Optional[str] = None, on_failure: Optional[str] = None, lower: bool = True, render: bool = False, **kwargs)None

Performs statement-by-statement execution of the script’s contents.

Executes script’s contents that are included within its current context and any (optional) value passed to the _id argument.

Note

Keyword arguments on_exception and on_failure are only applicable to derived classes of Statement (e.g., those within snowmobile.core.qa by default).

Parameters
  • _id (Optional[str, int, Tuple[int, int], List]) –

    Identifier for statement(s) to execute, can be either:
    • None (default); execute all statements

    • A single statement’s nm

    • A single statement’s index position

    • A tuple of lower/upper index bounds of statements to execute

    • A list of statement names or index positions to execute

  • as_df (bool) – Store statement’s results as a DataFrame; defaults to True

  • on_error (Optional[str]) – Action to take on execution error; providing c will continue execution as opposed to raising exception.

  • on_exception (Optional[str]) – Action to take on post-processing error from a derived Statement; providing c will continue execution as opposed to raising exception.

  • on_failure (Optional[str]) – Action to take on failure of post-processing assertion from a derived Statement; providing c will continue execution as opposed to raising exception.

  • lower (bool) – Lower-case columns in results returned if as_df=True.

  • render (bool) – Render sql executed as markdown; only applicable in Jupyter/iPython environments.

  • **kwargs

items(self, by_index: bool = True, ignore_scope: bool = False, statements: bool = True, markers: bool = False, validate: bool = True)ItemsView[Union[int, str], Union[Statement, Marker]]

Dunder items.

keys(self, **kwargs)KeysView[Union[int, str]]

Access keys of items only.

values(self, **kwargs)ValuesView[Union[int, str]]

Access values of items only.

dict(self, **kwargs)Dict

Unpacking items view into an actual dictionary.

class snowmobile.core.Table(df: pandas.DataFrame, table: str, sn: Optional[Snowmobile] = None, if_exists: Optional[str] = None, as_is: bool = False, path_ddl: Optional[Union[str, Path]] = None, path_output: Optional[str, Path] = None, file_format: Optional[str] = None, incl_tmstmp: Optional[bool] = None, tmstmp_col_nm: Optional[str] = None, reformat_cols: Optional[bool] = None, validate_format: Optional[bool] = None, validate_table: Optional[bool] = None, upper_case_cols: Optional[bool] = None, lower_case_table: Optional[bool] = None, keep_local: Optional[bool] = None, on_error: Optional[str] = None, check_dupes: Optional[bool] = None, load_copy: Optional[bool] = None, **kwargs)

Bases: snowmobile.core.Generic

Constructed with a DataFrame and a table name to load into.

The df and table’s compatibility can be inspected prior to calling the Table.load() method or by providing as_is=True` when instantiating the object; the latter will kick off the loading process invoked by .load() based on the parameters provided to snowmobile.Table().

Parameters
  • df (DataFrame) – The DataFrame to load.

  • table (str) – The table name to load df into.

  • sn (Optional[Snowmobile]) – An instance of Snowmobile; can be used to load a table on a specific connection or from a specific snowmobile.toml file.

  • if_exists (Optional[str]) – Action to take if table already exists - options are fail, replace, append, and truncate; defaults to append.

  • as_is (bool) – Load df into table based on the parameters provided to Table without further pre-inspection by the user; defaults to False.

  • path_ddl (Optional[Path]) – Alternate path to file format DDL to use for load.

  • keep_local (Optional[bool]) – Keep local file that is written out as part of the bulk loading process; defaults to False.

  • path_output (Optional[str Path]) – Path to write output local file to; defaults to a generated file name exported in the current working directory.

  • file_format (Optional[str]) – The name of the file_format to use when loading df; defaults to snowmobile_default_psv.

  • incl_tmstmp (Optional[bool]) – Include timestamp of load as part of table; defaults to True.

  • tmstmp_col_nm (Optional[str]) – Name to use for load timestamp if incl_tmstmp=True; defaults to loaded_tmstmp.

  • upper_case_cols (Optional[bool]) – Upper case columns of df when loading into table; defaults to True.

  • reformat_cols (Optional[bool]) –

    Reformat applicable columns of df to be DB-compliant; defaults to True.

    Reformatting primarily entails:
    • Replacing spaces and special characters with underscores

    • De-duping consecutive special characters

    • De-duping repeated column names; adds an _i suffix to duplicate fields where i is the nth duplicate name for a field

  • validate_format (Optional[bool]) –

    Validate the file format being used prior to kicking off the load; defaults to True.

    Validation entails:
    • Checking if the file format being used already exists based on formats accessible to the current connection

    • Executing DDL for the file format being used if not, pulled from the DDL ext-location and the statement name create file format~{format name}

    Tip

    Providing validate_format=False will speed up loading time when batch-loading into an existing table by skipping this step

  • validate_table (Optional[bool]) –

    Perform validations of df against table prior to kicking off the loading process; defaults to True.

    Validation entails:
    • Checking the existence of table; no further validation is performed if it does not exist

    • Compares the columns of df to the columns of table and stores results for use during loading process

    Note

    Table validation results are used in conjunction with the if_exists parameter to determine the desired behavior based on the (potential) existence of table and its compatibility with df.

    Tip

    Providing validate_table=False will speed up loading time time when batch-loading into an existing table

  • lower_case_table (Optional[bool]) – Lower case table name; defaults to False.

  • on_error (Optional[str]) – Action to take if an exception is encountered as part of the validating or loading process - providing on_error='c' will continue past an exception as opposed to raising it; defaults to None meaning any exception encountered will be raised

  • check_dupes (Optional[bool]) – Check for duplicate field names in df; defaults to True.

  • load_copy (Optional[bool]) – Alter and load a deep copy of df as opposed to the df in-memory as passed to the parameter; defaults to True.

db_responses

Responses from database during loading process.

Type

Dict[str, str]

loaded

Table was loaded successfully.

Type

bool

load(self, if_exists: Optional[str] = None, from_script: pathlib.Path = None, verbose: bool = True, **kwargs)snowmobile.core.table.Table

Loads df into table.

Parameters
  • if_exists (Optional[str]) – Determines behavior to take if the table being loaded into already exists; defaults to append; options are replace, append, truncate, and fail

  • from_script (Optional[Union[Path, str]]) – Path to sql file containing custom DDL for table; DDL is assumed to have a valid statement name as is parsed by Script and following the naming convention of create table~TABLE where TABLE is equal to the value provided to the table keyword argument

  • verbose (bool) – Verbose console output; defaults to True

Returns (Table):

The Table after attempting load of df into table; a successful load can be verified by inspecting loaded

property exists(self)bool

Indicates if the target table exists.

col_diff(self, mismatched: bool = False)Dict[int, Tuple[str, str]]

Returns diff detail of local DataFrame to in-warehouse table.

property cols_match(self)bool

Indicates if columns match between DataFrame and table.

load_statements(self, from_script: pathlib.Path)List[str]

Generates exhaustive list of the statements to execute for a given instance of loading a DataFrame.

to_local(self, quote_all: bool = True)None

Export to local file via configuration in snowmobile.toml.

property tm_load(self)int

Seconds elapsed during loading.

property tm_validate_load(self)int

Seconds elapsed during validation.

property tm_total(self)int

Total seconds elapsed for load.

validate(self, if_exists: str)None

Validates load based on current state through a variety of operations.

Parameters

if_exists (str) – Desired behavior if table already exists; intended to be passed in from table.load() by default.