Snowmobile


snowmobile.core.connection

An instance of Snowmobile, sn, represents a distinct session along with the contents of the snowmobile.toml with which it was instantiated.

Its purpose is to provide an entry point that will:

  1. Locate, parse, and instantiate snowmobile.toml as a Configuration object, sn.cfg

  2. Establish connections to Snowflake

  3. Store the SnowflakeConnection, sn.con, and execute commands against the database




Usage


Setup

This section assumes the following about the contents of snowmobile.toml:

  1. [connection.credentials.creds1] and [connection.credentials.creds2] are:

    1. Populated with valid credentials

    2. The first and second credentials stored respectively

    3. Aliased as creds1 and creds2 respectively

  2. default-creds has been left blank

Connecting to Snowflake


Establishing a connection can be done with:

import snowmobile

sn = snowmobile.connect()

Here’s some basic information on the composition of sn:

print(sn)            #> snowmobile.Snowmobile(creds='creds1')
print(sn.cfg)        #> snowmobile.Configuration('snowmobile.toml')
print(type(sn.con))  #> <class 'snowflake.connector.connection.SnowflakeConnection'>

Given the assumptions in Setup, sn is implicitly using the same connection arguments as:

sn2 = snowmobile.connect(creds="creds1")

Here’s some context on how to think about these two instances of Snowmobile:

sn.cfg.connection.current == sn2.cfg.connection.current  #> True
sn.sql.current("schema") == sn2.sql.current("schema")    #> True
sn.sql.current("session") == sn2.sql.current("session")  #> False

Executing Raw SQL


The following three methods are available for statement execution directly off sn.

query() implements pandas.read_sql() for querying results into a pandas.DataFrame.

df = sn.query("select 1")  #  == pd.read_sql()
type(df)                   #> pandas.core.frame.DataFrame

# -- pd.read_sql() --
import pandas as pd

df2 = pd.read_sql(sql="select 1", con=sn.con)

print(df2.equals(df))  #> True

ex() implements SnowflakeConnection.cursor().execute() for executing commands within a SnowflakeCursor.

cur = sn.ex("select 1")    #  == SnowflakeConnection.cursor().execute()
type(cur)                  #> snowflake.connector.cursor.SnowflakeCursor

# -- SnowflakeConnection.cursor().execute() --
cur2 = sn.con.cursor().execute("select 1")

print(cur.fetchone() == cur2.fetchone())  #> True

exd() implements SnowflakeConnection.cursor(DictCursor).execute() for executing commands within DictCursor.

dcur = sn.exd("select 1")  #  == SnowflakeConnection.cursor(DictCursor).execute()
type(dcur)                 #> snowflake.connector.DictCursor

# -- SnowflakeConnection.cursor(DictCursor).execute() --
from snowflake.connector import DictCursor

dcur2 = sn.con.cursor(cursor_class=DictCursor).execute("select 1")

print(dcur.fetchone() == dcur2.fetchone())  #> True
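
The practical difference between the rows returned by a standard cursor and a dictionary cursor can be illustrated without a Snowflake account. The sketch below swaps in the standard library's sqlite3 module as a stand-in; it does not use snowmobile or the Snowflake connector, and dict_factory is a hypothetical helper written for this illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Standard cursor: each row is a tuple, comparable to SnowflakeCursor
row = con.cursor().execute("select 1 as n").fetchone()
print(row)  # (1,)


def dict_factory(cursor, values):
    """Map column names to values, comparable to rows from a DictCursor."""
    names = [col[0] for col in cursor.description]
    return dict(zip(names, values))


# Cursors created after this point return dictionaries instead of tuples
con.row_factory = dict_factory
drow = con.cursor().execute("select 1 as n").fetchone()
print(drow)  # {'n': 1}

con.close()
```

Tuple rows are positional and compact; dictionary rows trade a little overhead for access by column name.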

SnowflakeCursor / DictCursor

The accessors sn.cursor and sn.dictcursor are properties of Snowmobile that return a new instance each time they are accessed. Depending on the intended use of SnowflakeCursor or DictCursor, it could be better to store an instance for re-referencing as opposed to repeatedly instantiating new instances off sn.

The example below demonstrates the difference between calling two methods on the cursor property and calling them on a single stored instance of SnowflakeCursor.

import snowmobile

sn = snowmobile.connect()

cur1 = sn.cursor.execute("select 1")
cur2 = sn.cursor.execute("select 2")

cursor = sn.cursor
cur11 = cursor.execute("select 1")
cur22 = cursor.execute("select 2")

id(cur1) == id(cur2)    #> False
id(cur11) == id(cur22)  #> True
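
The result above follows from how Python properties work: a property that builds a new object on every access yields distinct objects, while a stored reference always points to the same one. A minimal sketch with stand-in classes (FakeCursor and FakeSession are hypothetical and not part of snowmobile or the Snowflake connector):

```python
class FakeCursor:
    """Stand-in for SnowflakeCursor; execute() returns the cursor itself."""

    def __init__(self) -> None:
        self.last_sql = None

    def execute(self, sql: str) -> "FakeCursor":
        self.last_sql = sql
        return self


class FakeSession:
    """Stand-in for Snowmobile exposing a cursor property."""

    @property
    def cursor(self) -> FakeCursor:
        # A new cursor is created on every attribute access
        return FakeCursor()


session = FakeSession()

# Two accesses of the property give two distinct cursor objects
cur1 = session.cursor.execute("select 1")
cur2 = session.cursor.execute("select 2")

# One stored cursor: both calls return the same object
cursor = session.cursor
cur11 = cursor.execute("select 1")
cur22 = cursor.execute("select 2")

print(cur1 is cur2)    # False
print(cur11 is cur22)  # True
```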


Naming Convention

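The following convention of variable/attribute name to associated object is used throughout snowmobile’s documentation and source code, including in method signatures: sn refers to a Snowmobile, cfg to a Configuration, con to a SnowflakeConnection, and cursor to a SnowflakeCursor.

For example, see the attributes of Snowmobile below: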

import snowmobile

sn = snowmobile.connect()

type(sn)         #> snowmobile.core.connection.Snowmobile

type(sn.cfg)     #> snowmobile.core.configuration.Configuration
str(sn.cfg)      #> snowmobile.Configuration('snowmobile.toml')

type(sn.con)     #> snowflake.connector.connection.SnowflakeConnection
type(sn.cursor)  #> snowflake.connector.cursor.SnowflakeCursor


Aliasing Credentials


The default snowmobile.toml contains scaffolding for two sets of credentials, aliased creds1 and creds2 respectively.

By changing default-creds = '' to default-creds = 'creds2', Snowmobile will use the credentials from creds2 regardless of where it falls relative to all the other credentials stored.

The change can be verified with:

import snowmobile

sn = snowmobile.connect()

assert sn.cfg.connection.default_alias == 'creds2', (
    "Something's not right here; expected default_alias =='creds2'"
)
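
One way to picture the alias selection is sketched below. It is an illustration under the assumptions in Setup, not snowmobile's actual code: resolve_alias is a hypothetical helper, the credential values are placeholders, and raising KeyError for an unknown alias is an assumption.

```python
from typing import Dict


def resolve_alias(credentials: Dict[str, dict], default_creds: str = "") -> str:
    """Return the alias to connect with (hypothetical helper)."""
    if default_creds:
        if default_creds not in credentials:
            # Assumed behavior for an alias that is not stored
            raise KeyError(f"no credentials stored under alias '{default_creds}'")
        return default_creds
    # Blank default-creds: fall back to the first credentials stored
    return next(iter(credentials))


stored = {
    "creds1": {"user": "user1"},  # placeholder values
    "creds2": {"user": "user2"},
}

print(resolve_alias(stored))                          # creds1
print(resolve_alias(stored, default_creds="creds2"))  # creds2
```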

Parameter Resolution


Snowmobile will look in the following three places to compile the connection arguments that it passes to snowflake.connector.connect() when establishing a connection:

  1. [connection.default-arguments]

  2. [connection.credentials.alias_name]

  3. Keyword arguments passed to snowmobile.connect()



If the same argument is defined in more than one entry point, the last value found will take precedence, so keyword arguments override the credentials block, which in turn overrides the default arguments.

The way Snowmobile resolves connection parameters from multiple entry points is outlined below.


    @property
    def connect_kwargs(self) -> Dict:
        """Arguments from snowmobile.toml for `snowflake.connector.connect()`."""
        return {**self.defaults, **self.current.credentials}

connect_kwargs is then combined with keyword arguments passed to snowmobile.connect() within the method itself as the con attribute of Snowmobile is being set:

    def connect(self, **kwargs) -> Snowmobile:
        """Establishes connection to Snowflake.
        ...
        """
        try:
            self.con = connect(
                **{
                    **self.cfg.connection.connect_kwargs,  # snowmobile.toml
                    **kwargs,  # any kwarg over-rides
                }
            )
            self.sql = sql.SQL(sn=self)
            print(f"..connected: {str(self)}")
            return self

        except DatabaseError as e:
            raise e
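
The effect of this resolution order can be reproduced with plain dictionaries. The sketch below uses made-up argument values (none of them come from an actual snowmobile.toml) to show that later sources override earlier ones:

```python
# Placeholder values standing in for the three sources
default_arguments = {"autocommit": True, "timezone": "UTC"}      # [connection.default-arguments]
credentials = {"user": "user1", "timezone": "America/New_York"}  # [connection.credentials.creds1]
keyword_arguments = {"autocommit": False}                        # snowmobile.connect(**kwargs)

# Same unpacking order as connect_kwargs and connect() above
connect_kwargs = {**default_arguments, **credentials}
resolved = {**connect_kwargs, **keyword_arguments}

print(resolved)
# {'autocommit': False, 'timezone': 'America/New_York', 'user': 'user1'}
```

Because dictionary unpacking keeps the last value seen for a key, no explicit precedence logic is needed; the order of the unpacked dictionaries is the precedence.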

Delaying Connection


Sometimes it’s helpful to create a Snowmobile without establishing a connection; this is accomplished with:

import snowmobile

sn = snowmobile.connect(delay=True)

When provided with delay=True, the Snowmobile that’s returned omits connecting to Snowflake upon its instantiation; its con attribute is None, but its cfg attribute is a fully valid Configuration object.

See the examples below for more info.

The con attribute of sn will remain None until a method is called on it that requires a connection.

If such a method is invoked, Snowmobile makes a call to snowflake.connector.connect(), a connection is established, and the con attribute is set.

import snowmobile

sn = snowmobile.connect(delay=True)

type(sn.con)     #> NoneType
print(sn.alive)  #> False

_ = sn.query("select 1")

type(sn.con)     #> snowflake.connector.connection.SnowflakeConnection
print(sn.alive)  #> True

In addition to implicitly connecting by executing a query, the connect() method can be called on an existing instance of Snowmobile; this will establish an initial connection if sn was created with delay=True, or a new session with the existing connection arguments otherwise.

import snowmobile

# -- Delayed Connection --
sn_del = snowmobile.connect(delay=True)

print(type(sn_del.con))  #> <class 'NoneType'>
sn_del.connect()
print(type(sn_del.con))  #> snowflake.connector.connection.SnowflakeConnection


# -- Live Connection --
sn_live = snowmobile.connect()

session1 = sn_live.sql.current('session')
sn_live.connect()
session2 = sn_live.sql.current('session')
print(session1 != session2)  #> True
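
The lazy-connection behavior described above can be illustrated with a small stand-in class. This is a sketch of the general pattern only; LazySession, its n_sessions counter, and its fake connection object are hypothetical and do not reflect snowmobile's internals.

```python
from typing import Optional


class LazySession:
    """Stand-in illustrating delay=True; not snowmobile's implementation."""

    def __init__(self, delay: bool = False) -> None:
        self.con: Optional[object] = None
        self.n_sessions = 0  # number of connections opened so far
        if not delay:
            self.connect()

    @property
    def alive(self) -> bool:
        return self.con is not None

    def connect(self) -> "LazySession":
        # Placeholder for snowflake.connector.connect(); each call is a new session
        self.n_sessions += 1
        self.con = object()
        return self

    def query(self, sql: str) -> str:
        if not self.alive:  # connect on first use
            self.connect()
        return f"executed: {sql}"


lazy = LazySession(delay=True)
print(lazy.alive)       # False

lazy.query("select 1")  # first method needing a connection triggers connect()
print(lazy.alive)       # True

lazy.connect()          # explicit call on a live instance opens a new session
print(lazy.n_sessions)  # 2
```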

Specifying snowmobile.toml


A full path (pathlib.Path or str) to a snowmobile.toml file can be provided to the from_config parameter to instantiate Snowmobile from a specific configuration file.

In practice, this looks like:

from pathlib import Path

import snowmobile

path = Path.cwd() / 'snowmobile_v2.toml'  # any alternate file path

sn = snowmobile.connect(from_config=path)

This will bypass any checks for a cached path and is useful for:

  1. Testing different sets of configuration options without altering the original snowmobile.toml file

  2. Binding a specific configuration with a process for sql-parsing purposes

  3. Hard coding the configuration source in processes that have access to limited file systems (e.g. containers or VMs)

Snowmobile caches locations based on the file name provided to the config_file_nm parameter of snowmobile.connect(), the default value of which is snowmobile.toml.

If an alternate file name is provided, it will be located and its location cached in the same way as the global snowmobile.toml file so that future instances of Snowmobile on the same machine can make use of it upon instantiation without having to re-locate it.



The code below is a contrived example demonstrating this behavior in practice.

Setup

All code blocks in this example are from the same code file, assumed to be executed in full starting with the code directly below, in which a second configuration file called snowmobile2.toml is created in the same folder as the global snowmobile.toml file.

import time
import shutil
import snowmobile

# Instantiate sn from snowmobile.toml; omit unnecessary connection
sn = snowmobile.connect(delay=True)

# Create alternate snowmobile.toml file called 'snowmobile2.toml'
path_cfg_orig = sn.cfg.location
path_cfg2 = path_cfg_orig.parent / 'snowmobile2.toml'
shutil.copy(path_cfg_orig, path_cfg2)

Below, alt_sn() is used to create sn_alt1 and sn_alt2, representing an initial and a future instance of Snowmobile, respectively:

def alt_sn(n: int) -> snowmobile.Snowmobile:
    """Instantiate sn from snowmobile2.toml and print time elapsed."""
    pre = time.time()
    sn = snowmobile.connect(
        config_file_nm='snowmobile2.toml',
        delay=True  # omit connection - not needed
    )
    print(f"n={n}, time-required: ~{int(time.time() - pre)} seconds")
    return sn


sn_alt1 = alt_sn(n=1)  #> n=1, time-required: ~6 seconds  -> locates file, caches path
sn_alt2 = alt_sn(n=2)  #> n=2, time-required: ~0 seconds  -> uses cache from sn_alt1
"""
Note:
    The time required for `sn_alt1` to locate 'snowmobile2.toml' is arbitrary and
    will vary based on the file's location relative to the current working directory.
"""

Cleanup is done with the following two lines, which remove the snowmobile2.toml file created during the setup for this example:

import os
os.remove(sn_alt1.cfg.location)
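
The locate-then-cache behavior timed above can be sketched in general terms as follows. Everything here is an assumption made for illustration: locate() is a hypothetical helper, and the JSON cache file and recursive search do not describe snowmobile's actual cache format or search strategy.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def locate(file_nm: str, root: Path, cache_file: Path) -> Optional[Path]:
    """Return the path to file_nm, searching root only on a cache miss."""
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    cached = cache.get(file_nm)
    if cached and Path(cached).is_file():
        return Path(cached)  # cache hit: no search needed
    found = next(root.rglob(file_nm), None)  # cache miss: walk the file system
    if found is not None:
        cache[file_nm] = str(found)
        cache_file.write_text(json.dumps(cache))
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested" / "dir").mkdir(parents=True)
    (root / "nested" / "dir" / "snowmobile2.toml").write_text("")
    cache_file = root / "cache.json"

    first = locate("snowmobile2.toml", root, cache_file)   # searches, then caches
    second = locate("snowmobile2.toml", root, cache_file)  # served from the cache
    cache_contents = json.loads(cache_file.read_text())

print(first == second)                       # True
print("snowmobile2.toml" in cache_contents)  # True
```

Keying the cache on the file name is what allows several configuration files to be cached side by side.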

Using ensure_alive


Controlling the behavior of Snowmobile when a connection is lost or intentionally killed is done through the ensure_alive parameter.

Its default value is True, meaning that if the alive property evaluates to False and a method is invoked that requires a connection, Snowmobile will re-connect to Snowflake before continuing execution.

Note

A re-established connection will not be on the same session as the original connection.

See this snippet for additional details.
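
A rough sketch of the re-connect pattern is shown below using a stand-in class. ReconnectingSession and its integer session ids are hypothetical, and raising an error when ensure_alive is False is an assumption, since the text above only describes the default case.

```python
import itertools


class ReconnectingSession:
    """Stand-in illustrating ensure_alive; not snowmobile's implementation."""

    _session_ids = itertools.count(1)  # fake, monotonically increasing session ids

    def __init__(self, ensure_alive: bool = True) -> None:
        self.ensure_alive = ensure_alive
        self.session_id = None
        self.connect()

    @property
    def alive(self) -> bool:
        return self.session_id is not None

    def connect(self) -> None:
        self.session_id = next(self._session_ids)  # every connection is a new session

    def disconnect(self) -> None:
        self.session_id = None

    def query(self, sql: str) -> int:
        if not self.alive:
            if not self.ensure_alive:
                # Assumed behavior; only ensure_alive=True is described above
                raise RuntimeError("connection is closed")
            self.connect()
        return self.session_id


session = ReconnectingSession(ensure_alive=True)
before = session.query("select 1")
session.disconnect()               # connection lost or intentionally killed
after = session.query("select 1")  # re-connects before executing

print(before != after)  # True
```

The differing ids mirror the note above: anything tied to the original session is not carried over to the re-established connection.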