:mod:`snowmobile.core.sql` ========================== .. py:module:: snowmobile.core.sql .. autoapi-nested-parse:: :class:`SQL` contains utility methods to generate common SQL commands; :class:`~snowmobile.core.connection.Snowmobile` inherits everything from this object and passes along its :meth:`~snowmobile.core.connection.Snowmobile.query()` method for statement execution. .. note:: The :attr:`~SQL.auto_run` attribute defaults to `True`, meaning that the generated sql will execute when a method is called; if set to `False` the method will return the sql as a string without executing. The :class:`SQL` object is primarily interacted with as a pre-instantiated attribute of :class:`~snowmobile.Snowmobile`; in these instances users can fetch the generated sql as a string either by: 1. Providing *run=False* to any method called; this will override all behavior set by the current value of :attr:`auto_run` 2. Setting the :attr:`auto_run` attribute to `False` on an existing instance of :class:`SQL`, which will replicate the behavior of `(1)` without needing to provide *run=False* to each method called on that instance Module Contents --------------- Classes ~~~~~~~ .. autoapisummary:: snowmobile.core.sql.SQL .. class:: SQL(_query_func: Callable, _cfg: snowmobile.core.configuration.Configuration, nm: Optional[str] = None, schema: Optional[str] = None, obj: Optional[str] = None, auto_run: Optional[bool] = True) Bases: :class:`snowmobile.core.Generic` SQL class for generation & execution of common sql commands. Intended to be interacted with as a parent of :class:`~snowmobile.Snowmobile`. .. note:: * All arguments except for :attr:`sn` are optional. * The benefit of setting the other attributes on an instance of :class:`SQL` is to (optionally) avoid passing the same information to multiple methods when generating a variety of statements around the same object. .. attribute:: nm Object name to use in generated sql (e.g. 'some_table_name') :type: str .. attribute:: obj Object type to use in generated sql (e.g. 'table') :type: str .. attribute:: schema Schema to use when dot-prefixing sql; defaults to the schema with which the :attr:`sn` is connected to. :type: str .. attribute:: auto_run Indicates whether to automatically execute the sql generated by a given method; defaults to *True* :type: bool Initializes a :class:`snowmobile.SQL` object. .. method:: info_schema(self, loc: str, where: Optional[List[str]] = None, fields: Optional[List[str]] = None, order_by: Optional[List] = None, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Generic case of selecting from information schema location. .. method:: table_info(self, nm: Optional[str] = None, fields: List[str] = None, restrictions: Dict[(str, str)] = None, order_by: List[Optional[str, int]] = None, all_schemas: bool = False, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Query ``information_schema.tables`` for a given table or view. :param nm: Table name, including schema if creating a stage outside of the current schema. :type nm: str :param fields: List of fields to include in returned results (e.g. ['table_name', 'table_type', 'last_altered']) :type fields: List[str] :param restrictions: List of conditionals typed as literal components of a `where` clause (e.g. ["table_type = 'base table'", 'last_altered::date = current_date()'] ). :type restrictions: List[str] :param order_by: List of fields or their ordinal positions to order the results by. :type order_by: List[str] :param all_schemas: Include tables/views from all schemas; defaults to `False`. :type all_schemas: bool :param run: Determines whether to run the generated sql or not; defaults to `None` which will reference the current value of the :attr:`auto_run` attribute which defaults to `True`. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: column_info(self, nm: Optional[str] = None, fields: Optional[List] = None, restrictions: Optional[Dict] = None, order_by: Optional[List] = None, all_schemas: bool = False, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Query ``information_schema.columns`` for a given table or view. :param nm: Table name, including schema if creating a stage outside of the current schema. :type nm: str :param fields: List of fields to include in returned results (e.g. ['ordinal_position', 'column_name', 'data_type']) :type fields: List[str] :param restrictions: List of conditionals typed as literal components of a `where` clause (e.g.["regexp_count(lower(column_name), 'tmstmp') = 0"]). :type restrictions: List[str] :param order_by: List of fields or their ordinal positions to order the results by. :type order_by: List[str] :param all_schemas: Include tables/views from all schemas; defaults to `False`. :type all_schemas: bool :param run: Determines whether to run the generated sql or not; defaults to `None` which will reference the current value of the :attr:`auto_run` attribute which defaults to `True`. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: columns(self, nm: Optional[str] = None, from_info_schema: bool = False, lower: bool = False, run: Optional[bool] = None) -> Union[(str, List)] Returns an ordered list of columns for a table or view. .. note:: * Default behavior is to retrieve the columns for a table or view by selecting a single sample record and returning the column index from the DataFrame that's returned which is much faster than selecting the **column_names** from ``information_schema.columns`` pulling column names from the information schema * This can be changed by passing `from_info_schema=True`. :param nm: Name of table or view, including schema if the table or view is outside of the current schema. :type nm: str :param from_info_schema: Indicates whether to retrieve columns via the ``information_schema.columns`` or by selecting a sample record from the table or view; defaults to `False`. :type from_info_schema: bool :param lower: Lower case each column in the list that's returned. :type lower: bool :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (Union[str, List]): Either: 1. An ordered list of columns for the table or view, **or** 2. The query against the table or view as a :class:`str` of sql. .. method:: select(self, nm: Optional[str] = None, fields: Optional[List[str]] = None, apply: Optional[List[Tuple[str, str]]] = None, n: Optional[int] = None, run: Optional[bool] = None, **kwargs) -> Union[(str, pd.DataFrame)] Generic `select` statement. :param nm: Table to select from, including schema if the table is outside of the current schema :type nm: str :param fields: Select these fields (optional). :type fields: Optional[List[str]] :param apply: Select aggregations of these fields. **apply** [ (*this_func, to_this_field, [as_alias]*), (*.., .., [..]*), ] * ``apply`` should be provided as a list of tuples, each containing a minimum of 2 items (respectively) representing the aggregate function to apply and the field to which it should be applied * By default, the aggregated result inherits the name of the field being aggregated, including any qualifier (optionally) provided with the field name or an explicit alias included as a 3rd item within the tuple *The following snippet exhaustively illustrates the functionality described above* .. code-block:: python sn.select( nm='sandbox.sample_table', apply=[ ('count', 'col1'), ('count', 'distinct col1'), ('count', 'distinct col1', 'col1_dst'), ], run=False, ) >>> select count(col1) as col1 ,count(distinct col1) as distinct_col1 ,count(distinct col1) as col1_dst from sandbox.sample_table :type apply: Optional[List[Tuple[str, str]]] :param n: Number of records to return, implemented as a 'limit' clause in the query; defaults to 1. :type n: int :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: exists(self, nm: Optional[str] = None) -> bool Checks the existence of a table or view. :param nm: Name of table or view, including schema if the table or view is outside of the current schema. :type nm: str Returns (bool): Boolean indication of whether or not the table or view exists. .. method:: is_distinct(self, nm: Optional[str] = None, field: Optional[str] = None) -> bool Checks if table `nm` is distinct on column `on_col` :param nm: Table name. :type nm: str :param field: Column name. :type field: str .. method:: count(self, nm: Optional[str] = None, of: Optional[str] = None, dst_of: Optional[str] = None, as_perc: Optional[bool] = None, run: Optional[bool] = None) -> Union[(int, float)] Number of records within a table or view. :param nm: Table name, including schema if querying outside current schema. :type nm: str :param of: Column name (indistinct). :type of: str :param dst_of: Column name (distinct). :type dst_of: str :param as_perc: Option to return distinct count of the `dst_of` column as a percentage of the namespace depth of the table or view. :type as_perc: bool :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: show(self, obj: str, in_loc: Optional[str] = None, names: bool = False, run: Optional[bool] = None, **kwargs) -> Union[(pd.DataFrame, List[str], str)] Show schema objects of typ 'obj', optionally 'in_loc'. :param obj: Schema object type ('tables', 'file formats', etc). :type obj: str :param in_loc: Snowflake location ('in schema sandbox', 'in database prod', etc). :type in_loc: str :param names: Return a list of schema object names only ('name' field). :type names: bool :param run: Execute the generated sql or return it as a string. :type run: bool Returns (Union[pd.DataFrame, str]): Either: 1. The results of the query as a :class:`pandas.DataFrame` 2. The 'names' column of the results returned as a list 3. The generated query as a :class:`str` of sql .. method:: ddl(self, nm: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None) -> str Query the DDL for an schema object. :param nm: Name of the object to get DDL for, including schema if object is outside of the current schema. :type nm: str :param obj: Type of object to get DDL for (e.g. 'table', 'view', 'file-format'). :type obj: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (str): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: comment(self, nm: Optional[str] = None, obj: Optional[str] = None, set_as: Optional[str] = None, from_json: bool = False, as_json: bool = False, run: Optional[bool] = None, **kwargs) -> Union[(str, Dict)] Get or set comment on a schema object. :param nm: Name of the schema object, including schema prefix if object is outside implicit scope of the current connection. :type nm: str :param obj: Type of schema object (e.g. 'table', 'schema', etc). :type obj: str :param set_as: Content to set as comment on schema object. :type set_as: str :param from_json: Parse schema object comment as a string of json and return it as a dictionary. :type from_json: bool :param as_json: Dump contents of 'set_as' to a string of json prior to setting comment. :type as_json: bool :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool :param \*\*kwargs: Keyword argument to pass to `json.loads(comment)` if *from_json=True*. Returns (Union[str, pd.DataFrame]): Either: 1. The schema object comment as a :class:`str` 2. The generated query as a :class:`str` of sql. 3. The schema object comment as a dictionary if *from_json=True* .. method:: last_altered(self, nm: Optional[str] = None, run: Optional[bool] = None) -> Union[(str, pd.Timestamp)] Last altered timestamp for a table or view. :param nm: Table name, including schema if creating a stage outside of the current schema. :type nm: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: truncate(self, nm: Optional[str] = None, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Truncate a table. :param nm: Name of table, including schema if the table is outside of the current schema. :type nm: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: drop(self, nm: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Drop a ``Snowflake`` object. :param nm: Schema object's name. :type nm: str :param obj: Type of schema object (e.g. 'table', 'view', or 'schema') :type obj: str :param run: Execute generated statement; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: clone(self, nm: Optional[str] = None, to: Optional[str] = None, obj: Optional[str] = None, run: Optional[bool] = None, replace: bool = False) -> Union[(str, pd.DataFrame)] Clone a ``Snowflake`` object. .. warning:: * Make sure to read `Snowflake's documentation `_ for restrictions and considerations when cloning objects. .. note:: * In this specific method, the value provided to ``nm`` and ``to`` can be a single object name, a single schema, or both in the form of `obj_schema.obj_name` depending on the desired outcome. * Additionally, **at least one of the** ``nm`` **or** ``to`` **arguments must be pr**. * The defaults for the target object are constructed such that users can **either**: 1. Clone objects to *other* schemas that inherit the source object's *name* without specifying so in the ``to`` argument, **or** 2. Clone objects within the *current* schema that inherit the source object's *schema* without specifying so in the ``to`` argument. * If providing a schema without a name to either argument, prefix the value provided with `__` to signify it's a schema and not a lower-level object to be cloned. * e.g. providing `nm='sample_table'` and `to='__sandbox'` will clone `sample_table` from the current schema to `sandbox.sample_table`. * An assertion error will be raised raised if neither argument is specified as *this would result in a command to clone an object and store it in an object that has the same name & schema as the object being cloned*. :param nm: Name of the object to clone, including schema if cloning an object outside of the current schema. :type nm: str :param to: Target name for cloned object, including schema if cloning an object outside of the current schema. :type to: str :param obj: Type of object to clone (e.g. 'table', 'view', 'file-format'); defaults to `table`. :type obj: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool :param replace: Indicates whether to replace an existing stage if pre-existing; default is `False`. :type replace: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: create_stage(self, nm_stage: str, nm_format: str, replace: bool = False, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Create a staging table. :param nm_stage: Name of stage to create, including schema if creating a stage outside of the current schema. :type nm_stage: str :param nm_format: Name of file format to specify for the stage, including schema if using a format from outside of the current schema. :type nm_format: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool :param replace: Indicates whether to replace an existing stage if pre-existing; default is `False`. :type replace: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: put_file_from_stage(self, path: Union[(Path, str)], nm_stage: str, options: Optional[Dict] = None, ignore_defaults: bool = False, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Generates a 'put' command into a staging table from a local file. :param path: Path to local data file as a :class:`pathlib.Path` or string. :type path: Union[Path, str] :param nm_stage: Name of the staging table to load into. :type nm_stage: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool :param options: Optional arguments to add to `put` statement in addition to the values specified in the ``loading.put`` section of **snowmobile.toml**. :type options: dict :param ignore_defaults: Option to ignore the values specified in **snowmobile.toml**; defaults to `False`. :type ignore_defaults: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: copy_into_table_from_stage(self, nm: str, nm_stage: str, options: Optional[Dict] = None, ignore_defaults: bool = False, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Generates a command to copy data into a table from a staging table. :param nm: Name of the object to drop, including schema if creating a stage outside of the current schema. :type nm: str :param nm_stage: Name of the staging table to load from. :type nm_stage: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool :param options: Optional arguments to add to `put` statement in addition to the values specified in the ``loading.put`` section of **snowmobile.toml**. :type options: dict :param ignore_defaults: Option to ignore the values specified in **snowmobile.toml**; defaults to `False`. :type ignore_defaults: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: current(self, obj: str, run: Optional[bool] = None) -> Union[(str, Union[str, int])] Generic implementation of 'select current' for session-based objects. :param obj: Type of object to retrieve information for (schema, session, ..). :type obj: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: current_session(self, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Select the current session. .. method:: current_schema(self, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Select the current schema. .. method:: current_database(self, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Select the current database. .. method:: current_warehouse(self, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Select the current warehouse. .. method:: current_role(self, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Select the current role. .. method:: use(self, obj: str, nm: str, run: Optional[bool] = None) Generic implementation of 'use' command for schema objects. :param nm: Name of object to use (schema name, warehouse name, role name, ..). :type nm: str :param obj: Type of object to use (schema, warehouse, role, ..). :type obj: str :param run: Execute generated sql; defaults to `True`, otherwise returns sql as a string. :type run: bool Returns (Union[str, pd.DataFrame]): Either: 1. The results of the query as a :class:`pandas.DataFrame`, or 2. The generated query as a :class:`str` of sql. .. method:: use_schema(self, nm: Optional[str] = None, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Use schema command. .. method:: use_database(self, nm: Optional[str] = None, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Use database command. .. method:: use_warehouse(self, nm: Optional[str] = None, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Use warehouse command. .. method:: use_role(self, nm: Optional[str] = None, run: Optional[bool] = None) -> Union[(str, pd.DataFrame)] Use role command. .. method:: order(by: List[Union[int, str]]) -> str :staticmethod: Generates 'order by' clause from a list of fields or field ordinal positions. .. method:: where(restrictions: Dict) -> str :staticmethod: Generates a 'where' clause based on a dictionary of restrictions. :param restrictions: A dictionary of conditionals where each key/value pair respectively represents the left/right side of a condition within a 'where' clause. :type restrictions: dict Returns (str): Formatted where clause. .. method:: fields(fields: Optional[List[str]] = None) -> str :staticmethod: Utility to generate fields within a 'select' statement.