snowmobile
¶
An analytics-focused wrapper around the snowflake.connector for Python.
Subpackages¶
Package Contents¶
Classes¶
SQL class for generation & execution of common sql commands. | |
A parsed snowmobile.toml file. | |
Primary method of statement execution and accessor to parsed snowmobile.toml. | |
Primary method of statement execution and accessor to parsed snowmobile.toml. | |
Constructed with a | |
Parser and operator of local sql files. | |
Base class for all |
- class
snowmobile.
SQL
(_query_func: Callable, _cfg: snowmobile.core.configuration.Configuration, nm: Optional[str] = None, schema: Optional[str] = None, obj: Optional[str] = None, auto_run: Optional[bool] = True)¶ Bases:
snowmobile.core.Generic
SQL class for generation & execution of common sql commands.
Intended to be interacted with as a parent of
Snowmobile
.Note
All arguments except for
sn
are optional.The benefit of setting the other attributes on an instance of
SQL
is to (optionally) avoid passing the same information to multiple methods when generating a variety of statements around the same object.
-
schema
¶ Schema to use when dot-prefixing sql; defaults to the schema with which the
sn
is connected to.- Type
-
auto_run
¶ Indicates whether to automatically execute the sql generated by a given method; defaults to True
- Type
Initializes a
snowmobile.SQL
object.-
info_schema
(self, loc: str, where: Optional[List[str]] = None, fields: Optional[List[str]] = None, order_by: Optional[List] = None, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Generic case of selecting from information schema location.
-
table_info
(self, nm: Optional[str] = None, fields: List[str] = None, restrictions: Dict[str, str] = None, order_by: List[Optional[str, int]] = None, all_schemas: bool = False, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Query
information_schema.tables
for a given table or view.- Parameters
nm (str) – Table name, including schema if creating a stage outside of the current schema.
fields (List[str]) – List of fields to include in returned results (e.g. [‘table_name’, ‘table_type’, ‘last_altered’])
restrictions (List[str]) – List of conditionals typed as literal components of a where clause (e.g. [“table_type = ‘base table’”, ‘last_altered::date = current_date()’] ).
order_by (List[str]) – List of fields or their ordinal positions to order the results by.
all_schemas (bool) – Include tables/views from all schemas; defaults to False.
run (bool) – Determines whether to run the generated sql or not; defaults to None which will reference the current value of the
auto_run
attribute which defaults to True.
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
column_info
(self, nm: Optional[str] = None, fields: Optional[List] = None, restrictions: Optional[Dict] = None, order_by: Optional[List] = None, all_schemas: bool = False, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Query
information_schema.columns
for a given table or view.- Parameters
nm (str) – Table name, including schema if creating a stage outside of the current schema.
fields (List[str]) – List of fields to include in returned results (e.g. [‘ordinal_position’, ‘column_name’, ‘data_type’])
restrictions (List[str]) – List of conditionals typed as literal components of a where clause (e.g.[“regexp_count(lower(column_name), ‘tmstmp’) = 0”]).
order_by (List[str]) – List of fields or their ordinal positions to order the results by.
all_schemas (bool) – Include tables/views from all schemas; defaults to False.
run (bool) – Determines whether to run the generated sql or not; defaults to None which will reference the current value of the
auto_run
attribute which defaults to True.
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
columns
(self, nm: Optional[str] = None, from_info_schema: bool = False, lower: bool = False, run: Optional[bool] = None) → Union[str, List]¶ Returns an ordered list of columns for a table or view.
Note
Default behavior is to retrieve the columns for a table or view by selecting a single sample record and returning the column index from the DataFrame that’s returned which is much faster than selecting the column_names from
information_schema.columns
pulling column names from the information schemaThis can be changed by passing from_info_schema=True.
- Parameters
nm (str) – Name of table or view, including schema if the table or view is outside of the current schema.
from_info_schema (bool) – Indicates whether to retrieve columns via the
information_schema.columns
or by selecting a sample record from the table or view; defaults to False.lower (bool) – Lower case each column in the list that’s returned.
run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.
- Returns (Union[str, List]):
- Either:
An ordered list of columns for the table or view, or
The query against the table or view as a
str
of sql.
-
select
(self, nm: Optional[str] = None, fields: Optional[List[str]] = None, apply: Optional[List[Tuple[str, str]]] = None, n: Optional[int] = None, run: Optional[bool] = None, **kwargs) → Union[str, pd.DataFrame]¶ Generic select statement.
- Parameters
nm (str) – Table to select from, including schema if the table is outside of the current schema
fields (Optional[List[str]]) – Select these fields (optional).
apply (Optional[List[Tuple[str, str]]]) –
Select aggregations of these fields.
- apply [
(this_func, to_this_field, [as_alias]), (.., .., [..]),
]
apply
should be provided as a list of tuples, each containing a minimum of 2 items (respectively) representing the aggregate function to apply and the field to which it should be appliedBy default, the aggregated result inherits the name of the field being aggregated, including any qualifier (optionally) provided with the field name or an explicit alias included as a 3rd item within the tuple
The following snippet exhaustively illustrates the functionality described above
sn.select( nm='sandbox.sample_table', apply=[ ('count', 'col1'), ('count', 'distinct col1'), ('count', 'distinct col1', 'col1_dst'), ], run=False, ) >>> select count(col1) as col1 ,count(distinct col1) as distinct_col1 ,count(distinct col1) as col1_dst from sandbox.sample_table
n (int) – Number of records to return, implemented as a ‘limit’ clause in the query; defaults to 1.
run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.
- Returns (Union[str, pd.DataFrame]):
Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
exists
(self, nm: Optional[str] = None) → bool¶ Checks the existence of a table or view.
- Parameters
nm (str) – Name of table or view, including schema if the table or view is outside of the current schema.
- Returns (bool):
Boolean indication of whether or not the table or view exists.
-
is_distinct
(self, nm: Optional[str] = None, field: Optional[str] = None) → bool¶ Checks if table nm is distinct on column on_col
-
count
(self, nm: Optional[str] = None, of: Optional[str] = None, dst_of: Optional[str] = None, as_perc: Optional[bool] = None, run: Optional[bool] = None) → Union[int, float]¶ Number of records within a table or view.
- Parameters
nm (str) – Table name, including schema if querying outside current schema.
of (str) – Column name (indistinct).
dst_of (str) – Column name (distinct).
as_perc (bool) – Option to return distinct count of the dst_of column as a percentage of the namespace depth of the table or view.
run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
show
(self, obj: str, in_loc: Optional[str] = None, names: bool = False, run: Optional[bool] = None, **kwargs) → Union[pd.DataFrame, List[str], str]¶ Show schema objects of typ ‘obj’, optionally ‘in_loc’.
- Parameters
- Returns (Union[pd.DataFrame, str]):
- Either:
The results of the query as a
pandas.DataFrame
The ‘names’ column of the results returned as a list
The generated query as a
str
of sql
-
ddl
(self, nm: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None) → str¶ Query the DDL for an schema object.
- Parameters
- Returns (str):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
comment
(self, nm: Optional[str] = None, obj: Optional[str] = None, set_as: Optional[str] = None, from_json: bool = False, as_json: bool = False, run: Optional[bool] = None, **kwargs) → Union[str, Dict]¶ Get or set comment on a schema object.
- Parameters
nm (str) – Name of the schema object, including schema prefix if object is outside implicit scope of the current connection.
obj (str) – Type of schema object (e.g. ‘table’, ‘schema’, etc).
set_as (str) – Content to set as comment on schema object.
from_json (bool) – Parse schema object comment as a string of json and return it as a dictionary.
as_json (bool) – Dump contents of ‘set_as’ to a string of json prior to setting comment.
run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.
**kwargs – Keyword argument to pass to json.loads(comment) if from_json=True.
-
last_altered
(self, nm: Optional[str] = None, run: Optional[bool] = None) → Union[str, pd.Timestamp]¶ Last altered timestamp for a table or view.
- Parameters
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
truncate
(self, nm: Optional[str] = None, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Truncate a table.
- Parameters
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
drop
(self, nm: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Drop a
Snowflake
object.- Parameters
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
clone
(self, nm: Optional[str] = None, to: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None, replace: bool = False) → Union[str, pd.DataFrame]¶ Clone a
Snowflake
object.Warning
Make sure to read Snowflake’s documentation for restrictions and considerations when cloning objects.
Note
In this specific method, the value provided to
nm
andto
can be a single object name, a single schema, or both in the form of obj_schema.obj_name depending on the desired outcome.Additionally, at least one of the
nm
orto
arguments must be pr.The defaults for the target object are constructed such that users can either:
Clone objects to other schemas that inherit the source object’s name without specifying so in the
to
argument, orClone objects within the current schema that inherit the source object’s schema without specifying so in the
to
argument.
If providing a schema without a name to either argument, prefix the value provided with __ to signify it’s a schema and not a lower-level object to be cloned.
e.g. providing nm=’sample_table’ and to=’__sandbox’ will clone sample_table from the current schema to sandbox.sample_table.
An assertion error will be raised raised if neither argument is specified as this would result in a command to clone an object and store it in an object that has the same name & schema as the object being cloned.
- Parameters
nm (str) – Name of the object to clone, including schema if cloning an object outside of the current schema.
to (str) – Target name for cloned object, including schema if cloning an object outside of the current schema.
obj (str) – Type of object to clone (e.g. ‘table’, ‘view’, ‘file-format’); defaults to table.
run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.
replace (bool) – Indicates whether to replace an existing stage if pre-existing; default is False.
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
create_stage
(self, nm_stage: str, nm_format: str, replace: bool = False, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Create a staging table.
- Parameters
nm_stage (str) – Name of stage to create, including schema if creating a stage outside of the current schema.
nm_format (str) – Name of file format to specify for the stage, including schema if using a format from outside of the current schema.
run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.
replace (bool) – Indicates whether to replace an existing stage if pre-existing; default is False.
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
put_file_from_stage
(self, path: Union[Path, str], nm_stage: str, options: Optional[Dict] = None, ignore_defaults: bool = False, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Generates a ‘put’ command into a staging table from a local file.
- Parameters
path (Union[Path, str]) – Path to local data file as a
pathlib.Path
or string.nm_stage (str) – Name of the staging table to load into.
run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.
options (dict) – Optional arguments to add to put statement in addition to the values specified in the
loading.put
section of snowmobile.toml.ignore_defaults (bool) – Option to ignore the values specified in snowmobile.toml; defaults to False.
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
copy_into_table_from_stage
(self, nm: str, nm_stage: str, options: Optional[Dict] = None, ignore_defaults: bool = False, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Generates a command to copy data into a table from a staging table.
- Parameters
nm (str) – Name of the object to drop, including schema if creating a stage outside of the current schema.
nm_stage (str) – Name of the staging table to load from.
run (bool) – Execute generated sql; defaults to True, otherwise returns sql as a string.
options (dict) – Optional arguments to add to put statement in addition to the values specified in the
loading.put
section of snowmobile.toml.ignore_defaults (bool) – Option to ignore the values specified in snowmobile.toml; defaults to False.
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
current
(self, obj: str, run: Optional[bool] = None) → Union[str, Union[str, int]]¶ Generic implementation of ‘select current’ for session-based objects.
- Parameters
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
current_session
(self, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Select the current session.
-
current_schema
(self, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Select the current schema.
-
current_database
(self, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Select the current database.
-
current_warehouse
(self, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Select the current warehouse.
-
use
(self, obj: str, nm: str, run: Optional[bool] = None)¶ Generic implementation of ‘use’ command for schema objects.
- Parameters
- Returns (Union[str, pd.DataFrame]):
- Either:
The results of the query as a
pandas.DataFrame
, orThe generated query as a
str
of sql.
-
use_schema
(self, nm: Optional[str] = None, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Use schema command.
-
use_database
(self, nm: Optional[str] = None, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Use database command.
-
use_warehouse
(self, nm: Optional[str] = None, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Use warehouse command.
-
use_role
(self, nm: Optional[str] = None, run: Optional[bool] = None) → Union[str, pd.DataFrame]¶ Use role command.
- static
order
(by: List[Union[int, str]]) → str¶ Generates ‘order by’ clause from a list of fields or field ordinal positions.
- static
where
(restrictions: Dict) → str¶ Generates a ‘where’ clause based on a dictionary of restrictions.
- Parameters
restrictions (dict) – A dictionary of conditionals where each key/value pair respectively represents the left/right side of a condition within a ‘where’ clause.
- Returns (str):
Formatted where clause.
- class
snowmobile.
Configuration
(creds: Optional[str] = None, config_file_nm: Optional[str] = None, from_config: Optional[Path, str] = None, export_dir: Optional[Path, str] = None, silence: bool = False)¶ Bases:
snowmobile.core.base.Generic
A parsed snowmobile.toml file.
All keyword arguments optional.
- Parameters
config_file_nm (Optional[str]) – Name of configuration file to use; defaults to snowmobile.toml.
creds (Optional[str]) – Alias for the set of credentials to authenticate with; default behavior will fall back to the connection.default-creds specified in snowmobile.toml, or the first set of credentials stored if this configuration option is left blank.
from_config (Optional[str, Path]) – A full path to a specific configuration file to use; bypasses any checks for a cached file location and can be useful for container-based processes with restricted access to the local file system.
export_dir (Optional[Path]) – Path to save a template snowmobile.toml file to; if pr, the file will be exported within the __init__ method and nothing else will be instantiated.
-
location
¶ Full path to configuration file.
- Type
-
connection
:Optional[cfg.Connection]¶ [connection] from snowmobile.toml.
-
loading
:Optional[cfg.Loading]¶ [loading] from snowmobile.toml.
-
script
:Optional[cfg.Script]¶ [script] from snowmobile.toml.
-
sql
:Optional[cfg.SQL]¶ [sql] from snowmobile-ext.toml.
-
ext_sources
:Optional[cfg.Location]¶ [external-sources] from snowmobile.toml.
- property
markdown
(self) → snowmobile.core.cfg.Markup¶ Accessor for cfg.script.markdown.
- property
attrs
(self) → snowmobile.core.cfg.Attributes¶ Accessor for cfg.script.markdown.attributes.
- property
wildcards
(self) → snowmobile.core.cfg.Wildcard¶ Accessor for cfg.script.patterns.wildcards.
- static
batch_set_attrs
(obj: Any, attrs: dict, to_none: bool = False)¶ Batch sets attributes on an object from a dictionary.
- Parameters
- Returns (Any):
Object post-setting attributes.
- static
attrs_from_obj
(obj: Any, within: Optional[List[str]] = None) → Dict[str, MethodType]¶ Utility to return attributes/properties from an object as a dictionary.
- static
methods_from_obj
(obj: Any, within: Optional[List[str]] = None) → Dict[str, MethodType]¶ Returns callable components of an object as a dictionary.
- property
scopes
(self)¶ All combinations of scope type and scope attribute.
-
scopes_from_kwargs
(self, only_populated: bool = False, **kwargs) → Dict¶ Turns script.filter() arguments into a valid set of kwargs for
Scope
.Returns dictionary of all combinations of ‘arg’ (“kw”, “obj”, “desc”, “anchor” and “nm”), including empty sets for any ‘arg’ not included in the keyword arguments provided.
-
scopes_from_tag
(self, t: Any)¶ Generates list of keyword arguments to instantiate all scopes for a wrap.
- class
snowmobile.
Snowmobile
(creds: Optional[str] = None, delay: bool = False, ensure_alive: bool = True, config_file_nm: Optional[str] = None, from_config: Optional[str, Path] = None, silence: bool = False, **connect_kwargs)¶ Bases:
snowmobile.core.sql.SQL
Primary method of statement execution and accessor to parsed snowmobile.toml.
- Parameters
creds (Optional[str]) – Alias for the set of credentials to authenticate with; default behavior will fall back to the
connection.default-creds
specified in snowmobile.toml, or the first set of credentials stored if this configuration option is left blank.delay (bool) – Optionally delay establishing a connection when the object is instantiated, enabling access to the configuration object model through the
Connection.cfg
attribute; defaults to False.ensure_alive (bool) – Establish a new connection if a method requiring a connection against the database is called while
alive
is False; defaults to True.config_file_nm (Optional[str]) – Name of configuration file to use; defaults to snowmobile.toml.
from_config (Optional[str, Path]) – A full path to a specific configuration file to use; bypasses any checks for a cached file location and can be useful for container-based processes with restricted access to the local file system.
**connect_kwargs –
Additional arguments to provide to snowflake.connector.connect(); arguments provided here will over-ride connection arguments specified in snowmobile.toml, including:
Connection parameters in connection.default-arguments
Credentials parameters associated with a given alias
Connection parameters associated with a given alias
Initializes a
snowmobile.SQL
object.-
cfg
:Configuration¶ snowmobile.toml
-
con
:Optional[SnowflakeConnection]¶ Can be None until set by
Snowmobile.connect()
- Type
SnowflakeConnection
-
e
:ExceptionHandler¶ Exception / context management
-
connect
(self, **kwargs) → snowmobile.core.connection.Snowmobile¶ Establishes connection to Snowflake.
Re-implements snowflake.connector.connect() with connection arguments sourced from snowmobile’s object model, specifically:
Credentials from snowmobile.toml.
Default connection arguments from snowmobile.toml.
Optional keyword arguments either passed to
snowmobile.connect()
or directly to this method.
- kwargs:
Optional keyword arguments to pass to snowflake.connector.connect(); arguments passed here will over-ride
connection.default-arguments
specified insnowmobile.toml
.
-
disconnect
(self) → snowmobile.core.connection.Snowmobile¶ Disconnect from connection with which Connection() was instantiated.
- property
cursor
(self) → snowflake.connector.connection.SnowflakeCursor¶ SnowflakeCursor
accessor.
- property
dictcursor
(self) → snowflake.connector.DictCursor¶ DictCursor
accessor.
-
ex
(self, sql: str, on_error: Optional[str] = None, **kwargs) → snowflake.connector.connection.SnowflakeCursor¶ Executes a command via
SnowflakeCursor
.- Parameters
- Returns (SnowflakeCursor):
SnowflakeCursor
object that executed the command.
-
exd
(self, sql: str, on_error: Optional[str] = None, **kwargs) → snowflake.connector.DictCursor¶ Executes a command via
DictCursor
.- Parameters
- Returns (DictCursor):
DictCursor
object that executed the command.
-
query
(self, sql: str, as_df: bool = False, as_cur: bool = False, as_dcur: bool = False, as_scalar: bool = False, lower: bool = True, on_error: Optional[str] = None) → Union[pd.DataFrame, SnowflakeCursor]¶ Execute a query and return results.
Default behavior of results=True will return results as a
pandas.DataFrame
, otherwise will execute the sql provided with aSnowflakeCursor
and return the cursor object.- Parameters
sql (str) – Raw SQL to execute.
as_df (bool) – Return results in DataFrame.
as_cur (bool) – Return results in Cursor.
as_dcur (bool) – Return results in a DictCursor.
as_scalar (bool) – Return results as a single scalar value.
lower (bool) – Boolean value indicating whether or not to return results with columns lower-cased.
on_error (str) – String value to impose a specific behavior if an error occurs during the execution of
sql
.
- Returns (Union[pd.DataFrame, SnowflakeCursor]):
Results from
sql
as aDataFrame
by default or theSnowflakeCursor
object if results=False.
- class
snowmobile.
connect
(creds: Optional[str] = None, delay: bool = False, ensure_alive: bool = True, config_file_nm: Optional[str] = None, from_config: Optional[str, Path] = None, silence: bool = False, **connect_kwargs)¶ Bases:
snowmobile.core.sql.SQL
Primary method of statement execution and accessor to parsed snowmobile.toml.
- Parameters
creds (Optional[str]) – Alias for the set of credentials to authenticate with; default behavior will fall back to the
connection.default-creds
specified in snowmobile.toml, or the first set of credentials stored if this configuration option is left blank.delay (bool) – Optionally delay establishing a connection when the object is instantiated, enabling access to the configuration object model through the
Connection.cfg
attribute; defaults to False.ensure_alive (bool) – Establish a new connection if a method requiring a connection against the database is called while
alive
is False; defaults to True.config_file_nm (Optional[str]) – Name of configuration file to use; defaults to snowmobile.toml.
from_config (Optional[str, Path]) – A full path to a specific configuration file to use; bypasses any checks for a cached file location and can be useful for container-based processes with restricted access to the local file system.
**connect_kwargs –
Additional arguments to provide to snowflake.connector.connect(); arguments provided here will over-ride connection arguments specified in snowmobile.toml, including:
Connection parameters in connection.default-arguments
Credentials parameters associated with a given alias
Connection parameters associated with a given alias
Initializes a
snowmobile.SQL
object.-
cfg
:Configuration¶ snowmobile.toml
-
con
:Optional[SnowflakeConnection]¶ Can be None until set by
Snowmobile.connect()
- Type
SnowflakeConnection
-
e
:ExceptionHandler¶ Exception / context management
-
connect
(self, **kwargs) → snowmobile.core.connection.Snowmobile¶ Establishes connection to Snowflake.
Re-implements snowflake.connector.connect() with connection arguments sourced from snowmobile’s object model, specifically:
Credentials from snowmobile.toml.
Default connection arguments from snowmobile.toml.
Optional keyword arguments either passed to
snowmobile.connect()
or directly to this method.
- kwargs:
Optional keyword arguments to pass to snowflake.connector.connect(); arguments passed here will over-ride
connection.default-arguments
specified insnowmobile.toml
.
-
disconnect
(self) → snowmobile.core.connection.Snowmobile¶ Disconnect from connection with which Connection() was instantiated.
- property
cursor
(self) → snowflake.connector.connection.SnowflakeCursor¶ SnowflakeCursor
accessor.
- property
dictcursor
(self) → snowflake.connector.DictCursor¶ DictCursor
accessor.
-
ex
(self, sql: str, on_error: Optional[str] = None, **kwargs) → snowflake.connector.connection.SnowflakeCursor¶ Executes a command via
SnowflakeCursor
.- Parameters
- Returns (SnowflakeCursor):
SnowflakeCursor
object that executed the command.
-
exd
(self, sql: str, on_error: Optional[str] = None, **kwargs) → snowflake.connector.DictCursor¶ Executes a command via
DictCursor
.- Parameters
- Returns (DictCursor):
DictCursor
object that executed the command.
-
query
(self, sql: str, as_df: bool = False, as_cur: bool = False, as_dcur: bool = False, as_scalar: bool = False, lower: bool = True, on_error: Optional[str] = None) → Union[pd.DataFrame, SnowflakeCursor]¶ Execute a query and return results.
Default behavior of results=True will return results as a
pandas.DataFrame
, otherwise will execute the sql provided with aSnowflakeCursor
and return the cursor object.- Parameters
sql (str) – Raw SQL to execute.
as_df (bool) – Return results in DataFrame.
as_cur (bool) – Return results in Cursor.
as_dcur (bool) – Return results in a DictCursor.
as_scalar (bool) – Return results as a single scalar value.
lower (bool) – Boolean value indicating whether or not to return results with columns lower-cased.
on_error (str) – String value to impose a specific behavior if an error occurs during the execution of
sql
.
- Returns (Union[pd.DataFrame, SnowflakeCursor]):
Results from
sql
as aDataFrame
by default or theSnowflakeCursor
object if results=False.
- class
snowmobile.
Table
(df: pandas.DataFrame, table: str, sn: Optional[Snowmobile] = None, if_exists: Optional[str] = None, as_is: bool = False, path_ddl: Optional[Union[str, Path]] = None, path_output: Optional[str, Path] = None, file_format: Optional[str] = None, incl_tmstmp: Optional[bool] = None, tmstmp_col_nm: Optional[str] = None, reformat_cols: Optional[bool] = None, validate_format: Optional[bool] = None, validate_table: Optional[bool] = None, upper_case_cols: Optional[bool] = None, lower_case_table: Optional[bool] = None, keep_local: Optional[bool] = None, on_error: Optional[str] = None, check_dupes: Optional[bool] = None, load_copy: Optional[bool] = None, **kwargs)¶ Bases:
snowmobile.core.Generic
Constructed with a
DataFrame
and a table name to load into.The
df
andtable
’s compatibility can be inspected prior to calling theTable.load()
method or by providing as_is=True` when instantiating the object; the latter will kick off the loading process invoked by.load()
based on the parameters provided tosnowmobile.Table()
.- Parameters
df (DataFrame) – The
DataFrame
to load.table (str) – The table name to load
df
into.sn (Optional[Snowmobile]) – An instance of
Snowmobile
; can be used to load a table on a specific connection or from a specificsnowmobile.toml
file.if_exists (Optional[str]) – Action to take if
table
already exists - options are fail, replace, append, and truncate; defaults toappend
.as_is (bool) – Load
df
intotable
based on the parameters provided toTable
without further pre-inspection by the user; defaults to False.path_ddl (Optional[Path]) – Alternate path to file format DDL to use for load.
keep_local (Optional[bool]) – Keep local file that is written out as part of the bulk loading process; defaults to False.
path_output (Optional[str Path]) – Path to write output local file to; defaults to a generated file name exported in the current working directory.
file_format (Optional[str]) – The name of the file_format to use when loading
df
; defaults tosnowmobile_default_psv
.incl_tmstmp (Optional[bool]) – Include timestamp of load as part of
table
; defaults to True.tmstmp_col_nm (Optional[str]) – Name to use for load timestamp if
incl_tmstmp=True
; defaults to loaded_tmstmp.upper_case_cols (Optional[bool]) – Upper case columns of
df
when loading intotable
; defaults to True.reformat_cols (Optional[bool]) –
Reformat applicable columns of
df
to be DB-compliant; defaults to True.- Reformatting primarily entails:
Replacing spaces and special characters with underscores
De-duping consecutive special characters
De-duping repeated column names; adds an
_i
suffix to duplicate fields wherei
is the nth duplicate name for a field
validate_format (Optional[bool]) –
Validate the file format being used prior to kicking off the load; defaults to True.
- Validation entails:
Checking if the file format being used already exists based on formats accessible to the current connection
Executing DDL for the file format being used if not, pulled from the
DDL
ext-location and the statement namecreate file format~{format name}
Tip
Providing validate_format=False will speed up loading time when batch-loading into an existing table by skipping this step
validate_table (Optional[bool]) –
Perform validations of
df
againsttable
prior to kicking off the loading process; defaults to True.- Validation entails:
Checking the existence of
table
; no further validation is performed if it does not existCompares the columns of
df
to the columns oftable
and stores results for use during loading process
Note
Table validation results are used in conjunction with the
if_exists
parameter to determine the desired behavior based on the (potential) existence oftable
and its compatibility withdf
.Tip
Providing validate_table=False will speed up loading time time when batch-loading into an existing table
lower_case_table (Optional[bool]) – Lower case
table
name; defaults to False.on_error (Optional[str]) – Action to take if an exception is encountered as part of the validating or loading process - providing
on_error='c'
will continue past an exception as opposed to raising it; defaults to None meaning any exception encountered will be raisedcheck_dupes (Optional[bool]) – Check for duplicate field names in
df
; defaults to True.load_copy (Optional[bool]) – Alter and load a deep copy of
df
as opposed to thedf
in-memory as passed to the parameter; defaults to True.
-
load
(self, if_exists: Optional[str] = None, from_script: pathlib.Path = None, verbose: bool = True, **kwargs) → snowmobile.core.table.Table¶ Loads
df
intotable
.- Parameters
if_exists (Optional[str]) – Determines behavior to take if the table being loaded into already exists; defaults to append; options are replace, append, truncate, and fail
from_script (Optional[Union[Path, str]]) – Path to sql file containing custom DDL for
table
; DDL is assumed to have a valid statement name as is parsed byScript
and following the naming convention ofcreate table~TABLE
whereTABLE
is equal to the value provided to thetable
keyword argumentverbose (bool) – Verbose console output; defaults to True
-
col_diff
(self, mismatched: bool = False) → Dict[int, Tuple[str, str]]¶ Returns diff detail of local DataFrame to in-warehouse table.
-
load_statements
(self, from_script: pathlib.Path) → List[str]¶ Generates exhaustive list of the statements to execute for a given instance of loading a DataFrame.
- class
snowmobile.
Script
(sn: Optional[Snowmobile] = None, path: Optional[Path, str] = None, sql: Optional[str] = None, as_generic: bool = False, delay: bool = True, **kwargs)¶ Bases:
snowmobile.core.Generic
Parser and operator of local sql files.
- Parameters
sn (snowmobile.core.connection.Snowmobile) – An instance of
Snowmobile
.path (Optional[str]) – A full path to a sql file or readable text file containing valid sql code.
path – A raw string of valid sql code as opposed to reading from a
path
.as_generic (bool) – Instantiate all statements as generic st; skips all checks for a mapping of a statement anchor to a derived statement class to instantiate in the place of a generic
Statement
.delay (bool) – Delay connection of the
Snowmobile
; only applicable if thesn
argument is omitted andScript
is instantiating aSnowmobile
in its absence.**kwargs – Any keyword arguments to pass to
Snowmobile
; only applicable if thesn
argument is omitted andScript
is instantiating aSnowmobile
in its absence
-
sn
¶ An instance of
Snowmobile
-
patterns
¶ Configured patterns from snowmobile.toml.
-
as_generic
¶ Instantiate all statements as generic st; skips all checks for a mapping of a statement anchor to a derived statement class to instantiate in the place of a generic
Statement
.- Type
-
filters
¶ Dictionary of filters that have been passed to the current instance of
snowmobile.core.Script
.
-
markers
¶ Dictionary of all markers found in the script.
- Type
Dict[int, cfg.Marker]
-
path
¶ Path to sql file (e.g. full/path/to/script.sql).
- Type
Path
-
source
¶ Raw sql text of script; will be the text contained in the raw sql file when initially read from source and reflect any modifications to the script’s contents made post-instantiation.
- Type
-
read
(self, path: pathlib.Path = None) → snowmobile.core.script.Script¶ Runs quick path validation and reads in a sql file as a string.
A valid path must be provided if the script.path attribute hasn’t been set;
ValueErrors
will be thrown if neither is valid.- Parameters
path (pathlib.Path) – Full path to a sql object.
-
from_str
(self, sql: str, name: str, directory: pathlib.Path = Path.cwd()) → snowmobile.core.script.Script¶ Instantiates a raw string of sql as a script.
-
parse_one
(self, s: Union[sqlparse.sql.Statement, str], index: Optional[int] = None, nm: Optional[str] = None) → None¶ Adds a statement object to the script.
Default behavior will only add
sqlparse.sql.Statement
objects returned fromscript.source_stream
.clean_parse()
utility function is utilized so that generated sql within Python can be inserted back into the script as raw strings.- Parameters
s (Union[sqlparse.sql.Statement, str]) – A sqlparse.sql.Statement object or a raw string of SQL for an individual statement.
index (int) – Index position of the statement within the script; defaults to
n + 1
if index is not provided wheren
is the number of statements within the script at the timeparse_one()
is called.nm (Optional[str]) – Optionally provided the name of the statement being added; the script instance will treat this value as if it were provided within an in-script wrap.
-
parse_stream
(self, stream: str) → None¶ Parses a stream of sql and adds onto existing Script contents.
-
filter
(self, incl_kw: Optional[List[str], str] = None, incl_obj: Optional[List[str], str] = None, incl_desc: Optional[List[str], str] = None, incl_anchor: Optional[List[str], str] = None, incl_nm: Optional[List[str], str] = None, excl_kw: Optional[List[str], str] = None, excl_obj: Optional[List[str], str] = None, excl_desc: Optional[List[str], str] = None, excl_anchor: Optional[List[str], str] = None, excl_nm: Optional[List[str], str] = None, as_id: Optional[Union[str, int]] = None, from_id: Optional[Union[str, int]] = None, last: bool = False) → ContextManager[Script]¶ Subset the script based on attributes of its st.
script.filter()
returns a modified instance of script that can be operated on within the context defined.Note
Keyword arguments beginning with
incl
orexcl
expect a string or a list of strings containing regex patterns with which to check for a match against the associated attribute of its st’Name
.- Parameters
incl_kw – Include only
kw
incl_obj – Include only
obj
incl_desc – Include only
desc
incl_anchor – Include only
anchor
incl_nm – Include only
nm
excl_kw – Exclude
kw
excl_obj – Exclude
obj
excl_desc – Exclude
desc
excl_anchor – Exclude
anchor
excl_nm – Exclude
nm
as_id – ID to assign the filters passed to method; used to populated the
filters
attributefrom_id – ID previously used on the same instance of
Script
from which to populate filtered argumentslast – Re-use the last set of filters passed to context manager.
- Returns (Script):
The instance of script based on the context imposed by arguments pr.
- property
excluded
(self)¶ All statements by index position excluded from the current context.
- property
executed
(self) → Dict[int, Statement]¶ Executed statements by index position included in the current context.
-
reset
(self, index: bool = False, ctx_id: bool = False, in_context: bool = False, scope: bool = False, _filter: bool = False) → snowmobile.core.script.Script¶ Resets indices and scope on all statements to their state as read from source.
Invoked before exiting
filter()
context manger to reverse the revised indices set byindex_to()
and inclusion/ exclusion scope set byStatement.Name.scope()
.
- property
duplicates
(self) → Dict[str, int]¶ Dictionary of indistinct statement names/tags within script.
-
s
(self, _id: Optional[str, int] = None) → Any[Statement, Empty, Diff]¶ Fetch a single statement by _id.
-
dtl
(self, full: bool = False, excluded: bool = False, title: bool = True, r: bool = False) → Union[str, None]¶ Prints summary of statements within the current scope to console.
- property
first_s
(self)¶ First statement by index position.
- property
last_s
(self)¶ Last statement by index position
-
doc
(self, nm: Optional[str] = None, prefix: Optional[str] = None, suffix: Optional[str] = None, incl_markers: Optional[bool] = True, incl_sql: Optional[bool] = True, incl_exp_ctx: Optional[bool] = True, result_wrap: Optional[str] = None) → snowmobile.core.Markup¶ Returns a
Markup
from the script.- Parameters
nm (Optional[str]) – Alternate file name to use.
prefix (Optional[str]) – Prefix for file name.
suffix (Optional[str]) – Suffix for file name.
incl_markers (Optional[bool]) – Include markers in exported files.
incl_sql (Optional[bool]) – Include sql in exported files.
incl_exp_ctx (Optional[bool]) – Include disclaimer of programmatic save in exported sql file.
- Returns
A
Markup
instance based on the contents included in the script’s context.
-
ids
(self, _id: Optional[Union[Tuple, List]] = None) → List[int]¶ Utility function to get a list of statement IDs given an _id.
- Invoked within script.run() if the _id parameter is either a:
tuple of integers (lower and upper bound of statement indices to run)
list of integers or strings (statement names or indices to run)
default=None; returns all statement indices within scope if so
- Parameters
_id (Union[Tuple, List]) – _id field provided to script.run() if it’s neither an integer or a string.
- Returns (List[int]):
A list of statement indices to run.
-
run
(self, _id: Optional[str, int, Tuple[int, int], List] = None, as_df: bool = True, on_error: Optional[str] = None, on_exception: Optional[str] = None, on_failure: Optional[str] = None, lower: bool = True, render: bool = False, **kwargs) → None¶ Performs statement-by-statement execution of the script’s contents.
Executes script’s contents that are included within its current context and any (optional) value passed to the
_id
argument.Note
Keyword arguments
on_exception
andon_failure
are only applicable to derived classes ofStatement
(e.g., those withinsnowmobile.core.qa
by default).- Parameters
_id (Optional[str, int, Tuple[int, int], List]) –
- Identifier for statement(s) to execute, can be either:
None (default); execute all statements
A single statement’s
nm
A single statement’s index position
A tuple of lower/upper index bounds of statements to execute
A list of statement names or index positions to execute
as_df (bool) – Store statement’s results as a
DataFrame
; defaults to Trueon_error (Optional[str]) – Action to take on execution error; providing c will continue execution as opposed to raising exception.
on_exception (Optional[str]) – Action to take on post-processing error from a derived
Statement
; providing c will continue execution as opposed to raising exception.on_failure (Optional[str]) – Action to take on failure of post-processing assertion from a derived
Statement
; providing c will continue execution as opposed to raising exception.lower (bool) – Lower-case columns in results returned if
as_df=True
.render (bool) – Render sql executed as markdown; only applicable in Jupyter/iPython environments.
**kwargs –
-
items
(self, by_index: bool = True, ignore_scope: bool = False, statements: bool = True, markers: bool = False, validate: bool = True) → ItemsView[Union[int, str], Union[Statement, Marker]]¶ Dunder items.
-
dict
(self, **kwargs) → Dict¶ Unpacking items view into an actual dictionary.
- class
snowmobile.
Statement
(sn: snowmobile.core.connection.Snowmobile, statement: Union[sqlparse.sql.Statement, str], index: Optional[int] = None, attrs_raw: Optional[str] = None, e: Optional[ExceptionHandler] = None, **kwargs)¶ Bases:
snowmobile.core.tag.Attrs
,snowmobile.core.Name
,snowmobile.core.Generic
Base class for all
Statement
objects.Home for attributes and methods that are associated with all statement objects, generic or QA.
-
sn
¶ snowmobile.connect
object.- Type
-
statement
¶ A
sqlparse.sql.Statement
object.- Type
Union[sqlparse.sql.Statement, str]
-
patterns
¶ config.Pattern
object for more succinct access to values specified in snowmobile.toml.- Type
config.Pattern
-
results
¶ The results of the statement if executed as a
pandas.DataFrame
.- Type
pd.DataFrame
-
outcome
¶ Numeric indicator of outcome; defaults to 0 and is modified based on the outcome of statement execution and/or QA validation for derived classes.
- Type
-
outcome_html
¶ HTML text for the outcome as an admonition/information banner based on the following mapping of
outcome_txt
to admonition argument:failed ——> warning
completed –> info
passed —–> success
- Type
-
execution_time_txt
¶ Plain text description of execution time if executed; returned in seconds if execution time is less than 60 seconds, minutes otherwise.
- Type
-
first_keyword
¶ The first keyword within the statement as a
sqlparse.sql.Token
.- Type
Initialize self. See help(type(self)) for accurate signature.
-
sql
(self, set_as: Optional[str] = None, tag: bool = False) → Union[str, Statement]¶ Raw sql from statement, including result limit if enabled.
-
parse
(self) → Tuple[Dict, str]¶ Parses tag contents into a valid dictionary.
Uses the values specified in snowmobile.toml to parse a raw string of statement attributes into a valid dictionary.
Note
If
is_multiline
is True and name is not included within the arguments, an assertion error will be thrown.If
is_multiline
is False, the raw string within the wrap will be treated as the name.The
wrap
attribute is set once parsing is completed and name has been validated.
- Returns (dict):
Parsed wrap arguments as a dictionary.
-
start
(self)¶ Sets
start_time
attribute.
-
end
(self)¶ Updates execution time attributes.
- In namespace, sets:
-
trim
(self) → str¶ Statement as a string including only the sql and a single-line wrap name.
Note
The wrap name used here will be the user-pr wrap from the original script or a generated
Name.nm
if a wrap was not provided for a given statement.
- property
is_derived
(self)¶ Indicates whether or not it’s a generic or derived (QA) statement.
-
as_section
(self, incl_sql_tag: Optional[bool] = None, result_wrap: Optional[str] = None) → snowmobile.core.Section¶ Returns current statement as a
Section
object.
-
set_state
(self, index: Optional[int] = None, ctx_id: Optional[int] = None, in_context: Optional[bool] = None, filters: dict = None) → snowmobile.core.statement.Statement¶ Sets current state/context on a statement object.
-
reset
(self, index: bool = False, ctx_id: bool = False, in_context: bool = False, scope: bool = False) → snowmobile.core.statement.Statement¶ Resets attributes on the statement object to reflect as if read from source.
- In its current form, includes:
Resetting the statement/wrap’s index to their original values.
Resetting the
is_included
attribute of the statement’swrap
to True.Populating
error_last
with errors from current context.Caching current context’s timestamp and resetting back to None.
-
process
(self)¶ Used by derived classes for post-processing the returned results.
-
run
(self, as_df: bool = True, lower: bool = True, render: bool = False, on_error: Optional[str] = None, on_exception: Optional[str] = None, on_failure: Optional[str] = None, ctx_id: Optional[int] = None) → snowmobile.core.statement.Statement¶ Run method for all statement objects.
- Parameters
as_df (bool) – Store results of query as
pandas.DataFrame
orSnowflakeCursor
.lower (bool) – Lower case column names in
results
DataFrame if as_df=True.render (bool) – Render the sql executed as markdown.
on_error (str) –
- Behavior if an execution/database error is encountered
None: default behavior, exception will be raised
c: continue with execution
on_exception (str) –
Behavior if an exception is raised in the post-processing of results from a derived class of
Statement
(Empty
andDiff
).None: default behavior, exception will be raised
c: continue with execution
on_failure (str) –
Behavior if no error is encountered in execution or post-processing but the result of the post-processing has turned the statement’s
outcome
attribute to False, indicating the results returned by the statement have failed validation.None: default behavior, exception will be raised
c: continue with execution
- Returns (Statement):
Statement object post-executing query.
-