Snowmobile¶
snowmobile.core.connection
An instance of Snowmobile, sn, represents a distinct session along with the contents of the snowmobile.toml with which it was instantiated.
Its purpose is to provide an entry point that will:
Locate, parse, and instantiate snowmobile.toml as a Configuration object, sn.cfg
Establish connections to Snowflake
Store the SnowflakeConnection, sn.con, and execute commands against the database
Usage¶
Setup
This section assumes the following about the contents of snowmobile.toml:
[connection.credentials.creds1] and [connection.credentials.creds2] are:
Populated with valid credentials
The first and second credentials stored respectively
Aliased as creds1 and creds2 respectively
default-creds has been left blank
Connecting to Snowflake¶
Establishing a connection can be done with:
import snowmobile
sn = snowmobile.connect()
Here’s some basic information on the composition of sn:
print(sn) #> snowmobile.Snowmobile(creds='creds1')
print(sn.cfg) #> snowmobile.Configuration('snowmobile.toml')
print(type(sn.con)) #> <class 'snowflake.connector.connection.SnowflakeConnection'>
sn2 = snowmobile.connect(creds="creds1")
Here’s some context on how to think about these two instances of Snowmobile:
sn.cfg.connection.current == sn2.cfg.connection.current #> True
sn.sql.current("schema") == sn2.sql.current("schema") #> True
sn.sql.current("session") == sn2.sql.current("session") #> False
Executing Raw SQL¶
query() implements pandas.read_sql() for querying results into a pandas.DataFrame.
df = sn.query("select 1") # == pd.read_sql()
type(df) #> pandas.core.frame.DataFrame
# -- pd.read_sql() --
import pandas as pd
df2 = pd.read_sql(sql="select 1", con=sn.con)
print(df2.equals(df)) #> True
ex() implements SnowflakeConnection.cursor().execute() for executing commands within a SnowflakeCursor.
cur = sn.ex("select 1") # == SnowflakeConnection.cursor().execute()
type(cur) #> snowflake.connector.cursor.SnowflakeCursor
# -- SnowflakeConnection.cursor().execute() --
cur2 = sn.con.cursor().execute("select 1")
print(cur.fetchone() == cur2.fetchone()) #> True
exd() implements SnowflakeConnection.cursor(DictCursor).execute() for executing commands within a DictCursor.
dcur = sn.exd("select 1") # == SnowflakeConnection.cursor(DictCursor).execute()
type(dcur) #> snowflake.connector.DictCursor
# -- SnowflakeConnection.cursor(DictCursor).execute() --
from snowflake.connector import DictCursor
dcur2 = sn.con.cursor(cursor_class=DictCursor).execute("select 1")
print(dcur.fetchone() == dcur2.fetchone()) #> True
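The practical difference between ex() and exd() is the shape of the rows that come back: a standard cursor yields tuples, while a dictionary cursor yields mappings keyed by column name. The sketch below shows that difference using the standard library's sqlite3 module as a stand-in, since running against Snowflake requires live credentials; the row_factory approach is sqlite3's own mechanism and is not how snowmobile or the Snowflake connector implement DictCursor.

```python
import sqlite3

# sqlite3 is only a stand-in here; no Snowflake objects are involved.
con = sqlite3.connect(":memory:")

# Default cursor: each row is a tuple, addressed by position.
row = con.cursor().execute("select 1 as a").fetchone()
print(row)  # (1,)

# Dict-style cursor: each row is a mapping, addressed by column name.
dict_cur = con.cursor()
dict_cur.row_factory = lambda cursor, values: {
    col[0]: value for col, value in zip(cursor.description, values)
}
dict_row = dict_cur.execute("select 1 as a").fetchone()
print(dict_row)  # {'a': 1}

con.close()
```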
SnowflakeCursor / DictCursor
The accessors sn.cursor and sn.dictcursor are properties of Snowmobile that return a new instance each time they are accessed. Depending on the intended use of SnowflakeCursor or DictCursor, it could be better to store an instance for re-referencing as opposed to repeatedly instantiating new instances off sn.
The below demonstrates the difference between calling two methods on the cursor property compared to calling them on the same instance of SnowflakeCursor.
import snowmobile
sn = snowmobile.connect()
cur1 = sn.cursor.execute("select 1")
cur2 = sn.cursor.execute("select 2")
cursor = sn.cursor
cur11 = cursor.execute("select 1")
cur22 = cursor.execute("select 2")
id(cur1) == id(cur2) #> False
id(cur11) == id(cur22) #> True
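The same behavior can be reproduced without a Snowflake connection. The following is a minimal sketch in which FakeCursor and FakeSession are hypothetical stand-ins (not snowmobile classes); it only illustrates why a property that builds a new object on each access gives distinct cursors, while a stored instance is reused.

```python
class FakeCursor:
    """Hypothetical stand-in for a SnowflakeCursor."""

    def execute(self, sql: str) -> "FakeCursor":
        # Record the statement and return the cursor itself.
        self.last_sql = sql
        return self


class FakeSession:
    """Hypothetical stand-in for Snowmobile."""

    @property
    def cursor(self) -> FakeCursor:
        # A new cursor object is created on every attribute access.
        return FakeCursor()


sn_like = FakeSession()

# Two accesses of the property -> two distinct cursor objects.
cur1 = sn_like.cursor.execute("select 1")
cur2 = sn_like.cursor.execute("select 2")

# One stored instance -> both calls run on the same cursor object.
cursor = sn_like.cursor
cur11 = cursor.execute("select 1")
cur22 = cursor.execute("select 2")

print(cur1 is cur2)    # False
print(cur11 is cur22)  # True
print(cur11.last_sql)  # select 2
```

Note that the last line prints the second statement: because cur11 and cur22 are the same object, anything held on the cursor reflects the most recent execute() call.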
Naming Convention
A consistent convention of variable/attribute name to associated object is used throughout snowmobile’s documentation and source code, including in method signatures.
For example, see the below attributes of Snowmobile:
import snowmobile
sn = snowmobile.connect()
type(sn) #> snowmobile.core.connection.Snowmobile
type(sn.cfg) #> snowmobile.core.configuration.Configuration
str(sn.cfg) #> snowmobile.Configuration('snowmobile.toml')
type(sn.con) #> snowflake.connector.connection.SnowflakeConnection
type(sn.cursor) #> snowflake.connector.cursor.SnowflakeCursor
Aliasing Credentials¶
The default snowmobile.toml contains scaffolding for two sets of credentials, aliased creds1 and creds2 respectively.
By changing default-creds = '' to default-creds = 'creds2', Snowmobile will use the credentials from creds2 regardless of where it falls relative to all the other credentials stored.
The change can be verified with:
import snowmobile
sn = snowmobile.connect()
assert sn.cfg.connection.default_alias == 'creds2', (
    "Something's not right here; expected default_alias == 'creds2'"
)
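The selection rule described in this section can be sketched in a few lines of plain Python. Everything here is illustrative: stored and pick_alias() are hypothetical names, the credential values are made up, and the assumption that an explicit creds argument outranks default-creds is inferred from the usage shown earlier rather than taken from snowmobile's source.

```python
from typing import Dict

# Hypothetical parsed credentials, in the order they are stored in the file.
stored: Dict[str, Dict[str, str]] = {
    "creds1": {"user": "first_user"},
    "creds2": {"user": "second_user"},
}


def pick_alias(
    stored_creds: Dict[str, Dict[str, str]],
    default_creds: str = "",
    creds: str = "",
) -> str:
    """Explicit alias first, then default-creds, then the first set stored."""
    alias = creds or default_creds or next(iter(stored_creds))
    if alias not in stored_creds:
        raise KeyError(f"no credentials stored under alias {alias!r}")
    return alias


print(pick_alias(stored))                                          # creds1
print(pick_alias(stored, default_creds="creds2"))                  # creds2
print(pick_alias(stored, default_creds="creds2", creds="creds1"))  # creds1
```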
Parameter Resolution¶
Snowmobile will look in the following three places to compile the connection arguments that it passes to snowflake.connector.connect() when establishing a connection:
[connection.default-arguments]
[connection.credentials.alias_name]
Keyword arguments passed to snowmobile.connect()
If the same argument is defined in more than one entry point, the last value found will take precedence; the purpose of this resolution order is to enable:
Embedding connection arguments (e.g. timezone or transaction mode) within an aliased credentials block whose values differ from defaults specified in [connection.default-arguments]
Superseding any connection parameters configured in snowmobile.toml with keyword arguments passed directly to snowmobile.connect()
The way Snowmobile resolves connection parameters from multiple entry points is outlined below.
The [connection.default-arguments] and [connection.credentials.alias_name] are merged as the connect_kwargs property of Connection with:
@property
def connect_kwargs(self) -> Dict:
    """Arguments from snowmobile.toml for `snowflake.connector.connect()`."""
    return {**self.defaults, **self.current.credentials}
connect_kwargs is then combined with keyword arguments passed to snowmobile.connect() within the method itself as the con attribute of Snowmobile is being set:
def connect(self, **kwargs) -> Snowmobile:
    """Establishes connection to Snowflake.
    ...
    """
    try:
        self.con = connect(
            **{
                **self.cfg.connection.connect_kwargs,  # snowmobile.toml
                **kwargs,  # any kwarg over-rides
            }
        )
        self.sql = sql.SQL(sn=self)
        print(f"..connected: {str(self)}")
        return self
    except DatabaseError as e:
        raise e
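The effect of the two dictionary merges above can be seen in isolation with plain dictionaries. This is a minimal sketch: resolve() is a hypothetical helper, and the argument values are made up to stand in for the parsed contents of snowmobile.toml.

```python
from typing import Any, Dict

# Hypothetical values standing in for the parsed contents of snowmobile.toml.
default_arguments: Dict[str, Any] = {"autocommit": True, "timezone": "UTC"}
credentials: Dict[str, Any] = {"user": "analyst", "timezone": "America/New_York"}


def resolve(
    defaults: Dict[str, Any], creds: Dict[str, Any], **kwargs: Any
) -> Dict[str, Any]:
    """Merge left to right so that the last value found wins."""
    return {**defaults, **creds, **kwargs}


resolved = resolve(default_arguments, credentials, autocommit=False)
print(resolved)
# {'autocommit': False, 'timezone': 'America/New_York', 'user': 'analyst'}
```

Here timezone comes from the credentials block because it overrides the default, and autocommit comes from the keyword argument because it overrides both.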
Delaying Connection¶
Sometimes it’s helpful to create a Snowmobile without establishing a connection; this is accomplished with:
import snowmobile
sn = snowmobile.connect(delay=True)
When provided with delay=True, the Snowmobile that’s returned omits connecting to Snowflake upon its instantiation; its con attribute is None, but its cfg attribute is a fully valid Configuration object.
See the tabbed Examples for more info.
When provided with delay=True, the con attribute of Snowmobile will be None until a method is called on it that requires a connection.
If such a method is invoked, Snowmobile makes a call to snowflake.connector.connect(), a connection is established, and the attribute is set.
import snowmobile
sn = snowmobile.connect(delay=True)
type(sn.con) #> NoneType
print(sn.alive) #> False
_ = sn.query("select 1")
type(sn.con) #> snowflake.connector.connection.SnowflakeConnection
print(sn.alive) #> True
In addition to implicitly connecting by executing a query, the connect() method can be called on an existing instance of Snowmobile; this will establish an initial connection if the instance was created with delay=True, or a new session with the existing connection arguments otherwise.
import snowmobile
# -- Delayed Connection --
sn_del = snowmobile.connect(delay=True)
print(type(sn_del.con)) #> <class 'NoneType'>
sn_del.connect()
print(type(sn_del.con)) #> snowflake.connector.connection.SnowflakeConnection
# -- Live Connection --
sn_live = snowmobile.connect()
session1 = sn_live.sql.current('session')
sn_live.connect()
session2 = sn_live.sql.current('session')
print(session1 != session2) #> True
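The delayed-connection pattern shown in these examples can be sketched without Snowflake. FakeConnection and LazySession below are hypothetical stand-ins, not snowmobile classes, and the integer session ids are invented purely to make "a new session" observable.

```python
import itertools
from typing import Optional

_session_ids = itertools.count(1)


class FakeConnection:
    """Hypothetical stand-in for a SnowflakeConnection."""

    def __init__(self) -> None:
        self.session_id = next(_session_ids)  # each connection is a new session


class LazySession:
    """Hypothetical stand-in for Snowmobile."""

    def __init__(self, delay: bool = False) -> None:
        self.con: Optional[FakeConnection] = None
        if not delay:
            self.connect()

    @property
    def alive(self) -> bool:
        return self.con is not None

    def connect(self) -> "LazySession":
        # Always opens a new connection, replacing any existing session.
        self.con = FakeConnection()
        return self

    def query(self, sql: str) -> str:
        # Connect on first use if the connection was delayed.
        if not self.alive:
            self.connect()
        return f"ran {sql!r} on session {self.con.session_id}"


sn_del = LazySession(delay=True)
print(sn_del.alive)  # False
sn_del.query("select 1")
print(sn_del.alive)  # True

sn_live = LazySession()
first_session = sn_live.con.session_id
sn_live.connect()
print(first_session != sn_live.con.session_id)  # True
```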
Specifying snowmobile.toml¶
A full path (pathlib.Path or str) to a snowmobile.toml file can be provided to the from_config parameter to instantiate Snowmobile from a specific configuration file.
In practice, this looks like:
from pathlib import Path
import snowmobile
path = Path.cwd() / 'snowmobile_v2.toml' # any alternate file path
sn = snowmobile.connect(from_config=path)
This will bypass any checks for a cached path and is useful for:
Testing different sets of configuration options without altering the original snowmobile.toml file
Binding a specific configuration with a process for sql-parsing purposes
Hard coding the configuration source in processes that have access to limited file systems (e.g. containers or VMs)
Snowmobile caches locations based on the file name provided to the config_file_nm parameter of snowmobile.connect(), the default value of which is snowmobile.toml.
If an alternate file name is provided, it will be located and its location cached in the same way as the global snowmobile.toml file so that future instances of Snowmobile on the same machine can make use of it upon instantiation without having to re-locate it.
The code below is a contrived example demonstrating this behavior in practice.
Setup
All code blocks in this example are from the same code file, assumed to be executed in full starting with the code directly below, in which a second configuration file called snowmobile2.toml is created in the same folder as the global snowmobile.toml file.
import time
import shutil
import snowmobile
# Instantiate sn from snowmobile.toml; omit unnecessary connection
sn = snowmobile.connect(delay=True)
# Create alternate snowmobile.toml file called 'snowmobile2.toml'
path_cfg_orig = sn.cfg.location
path_cfg2 = path_cfg_orig.parent / 'snowmobile2.toml'
shutil.copy(path_cfg_orig, path_cfg2)
Below, alt_sn() is used to create sn_alt1 and sn_alt2, representing an initial and future instance of Snowmobile respectively:
def alt_sn(n: int) -> snowmobile.Snowmobile:
    """Instantiate sn from snowmobile2.toml and print time elapsed."""
    pre = time.time()
    sn = snowmobile.connect(
        config_file_nm='snowmobile2.toml',
        delay=True  # omit connection - not needed
    )
    print(f"n={n}, time-required: ~{int(time.time() - pre)} seconds")
    return sn
sn_alt1 = alt_sn(n=1) #> n=1, time-required: ~6 seconds -> locates file, caches path
sn_alt2 = alt_sn(n=2) #> n=2, time-required: ~0 seconds -> uses cache from sn_alt1
"""
Note:
The time required for `sn_alt1` to locate 'snowmobile2.toml' is arbitrary and
will vary based on the file's location relative to the current working directory.
"""
Cleanup is done with the following two lines, which remove the snowmobile2.toml file created during the setup for this example:
import os
os.remove(sn_alt1.cfg.location)
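The locate-once, reuse-afterwards behavior described in this section can be illustrated with a small stand-alone sketch. This is not snowmobile's implementation: locate(), the JSON cache file, and the directory search strategy are all assumptions chosen to show the general idea of caching a file's location by its name.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def locate(file_nm: str, root: Path, cache_file: Path) -> Optional[Path]:
    """Return the path of `file_nm`, searching `root` only on a cache miss."""
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    cached = cache.get(file_nm)
    if cached and Path(cached).is_file():
        return Path(cached)  # cache hit: no search needed

    # Cache miss: walk the directory tree, then remember the result.
    found = next(root.rglob(file_nm), None)
    if found is not None:
        cache[file_nm] = str(found)
        cache_file.write_text(json.dumps(cache))
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    (root / "nested" / "snowmobile2.toml").write_text("# placeholder\n")
    cache_file = root / "locations.json"

    first = locate("snowmobile2.toml", root, cache_file)   # searches, then caches
    second = locate("snowmobile2.toml", root, cache_file)  # served from the cache
    cached_names = list(json.loads(cache_file.read_text()))

print(first == second)  # True
print(cached_names)     # ['snowmobile2.toml']
```

Keying the cache on the file name is what lets a second configuration file be tracked alongside the default one without either entry overwriting the other.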
Using ensure_alive¶
Controlling the behavior of Snowmobile when a connection is lost or intentionally killed is done through the ensure_alive parameter.
Its default value is True, meaning that if the alive property evaluates to False and a method is invoked that requires a connection, Snowmobile will re-connect to Snowflake before continuing execution.
Note
A re-established connection will not be on the same session as the original connection.
See this snippet for additional details.
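A minimal sketch of the reconnect-on-demand idea follows. ReconnectingSession is a hypothetical stand-in rather than snowmobile's implementation, the integer session ids are invented for illustration, and raising an error when ensure_alive is False is an assumption, since the behavior in that case is not described here.

```python
import itertools
from typing import Optional

_session_ids = itertools.count(1)


class ReconnectingSession:
    """Hypothetical stand-in for Snowmobile; not the library's implementation."""

    def __init__(self, ensure_alive: bool = True) -> None:
        self.ensure_alive = ensure_alive
        self.session_id: Optional[int] = None
        self.connect()

    @property
    def alive(self) -> bool:
        return self.session_id is not None

    def connect(self) -> None:
        self.session_id = next(_session_ids)  # every connection gets a new session

    def disconnect(self) -> None:
        self.session_id = None

    def current_session(self) -> int:
        if not self.alive:
            if not self.ensure_alive:
                # Assumption: without ensure_alive, the call simply fails.
                raise RuntimeError("connection is closed")
            self.connect()
        return self.session_id


sn_like = ReconnectingSession(ensure_alive=True)
before = sn_like.current_session()
sn_like.disconnect()
after = sn_like.current_session()  # reconnects before continuing
print(before != after)  # True
```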