:mod:`snowmobile.core.table` ============================ .. py:module:: snowmobile.core.table .. autoapi-nested-parse:: :class:`snowmobile.Table` is a canned implementation of the :xref:`bulk loading from a local file system` standard and is intended to provide a predictable, no-nonsense method of loading a :xref:`DataFrame`, ``df``, into a ``table`` (:class:`str`). ------ .. note:: **Core functionality includes:** #. Generating and executing generic DDL for ``df`` if the table doesn't yet exist #. Executing DDL for the **file format** being used `if it doesn't yet exist in the current schema`, or (optionally) specifying an alias for a file format in its ``file_format`` argument; **in the case of the latter:** * An absolute ``path`` to an independent, user-defined sql file must be specified within the **external-sources.ddl** field of ``snowmobile.toml`` * Prior to attempting the load of ``df``, :class:`snowmobile.Table` will create a :class:`~snowmobile.core.script.Script` from the configured ``path`` and execute the (file format DDL) statement whose tagged name maps to the value provided to its ``file_format`` argument * An error will be thrown during the creation of the :class:`Table` if the :class:`~snowmobile.core.script.Script` associated with the configured ``path`` does not contain a statement whose tagged name matches the value of ``file_format`` or if an error is raised when the file is parsed * Bypassed by creating the :class:`~snowmobile.Table` with: .. code:: python snowmobile.Table(validate_format=False, **kwargs) #. Dimensional compatibility checks between ``df`` and the table being loaded into * Bypassed by creating the :class:`~snowmobile.Table` with: .. code:: python snowmobile.Table(validate_table=False, **kwargs) #. Coercing column names of ``df`` into a generic database standard prior to loading, including de-duplication of field names when applicable #. Argument or configuration based handling of action to take if table being loaded into already exists (respectively) via the ``if_exists`` argument to :class:`snowmobile.Table()` or its associated section in ``snowmobile.toml``; valid values are **replace**, **truncate**, **append**, **fail** Module Contents --------------- Classes ~~~~~~~ .. autoapisummary:: snowmobile.core.table.Table .. class:: Table(df: pandas.DataFrame, table: str, sn: Optional[Snowmobile] = None, if_exists: Optional[str] = None, as_is: bool = False, path_ddl: Optional[Union[str, Path]] = None, path_output: Optional[(str, Path)] = None, file_format: Optional[str] = None, incl_tmstmp: Optional[bool] = None, tmstmp_col_nm: Optional[str] = None, reformat_cols: Optional[bool] = None, validate_format: Optional[bool] = None, validate_table: Optional[bool] = None, upper_case_cols: Optional[bool] = None, lower_case_table: Optional[bool] = None, keep_local: Optional[bool] = None, on_error: Optional[str] = None, check_dupes: Optional[bool] = None, load_copy: Optional[bool] = None, **kwargs) Bases: :class:`snowmobile.core.Generic` Constructed with a :class:`DataFrame` and a table name to load into. The ``df`` and ``table``'s compatibility can be inspected prior to calling the :meth:`Table.load()` method or by providing `as_is=True`` when instantiating the object; the latter will kick off the loading process invoked by :meth:`.load()` based on the parameters provided to :class:`snowmobile.Table()`. :param df: The :class:`~pandas.DataFrame` to load. :type df: DataFrame :param table: The table name to load ``df`` into. :type table: str :param sn: An instance of :class:`~snowmobile.Snowmobile`; can be used to load a table on a specific connection or from a specific ``snowmobile.toml`` file. :type sn: Optional[Snowmobile] :param if_exists: Action to take if ``table`` already exists - options are `fail`, `replace`, `append`, and `truncate`; defaults to ``append``. :type if_exists: Optional[str] :param as_is: Load ``df`` into ``table`` based on the parameters provided to :class:`Table` without further pre-inspection by the user; defaults to `False`. :type as_is: bool :param path_ddl: Alternate path to file format DDL to use for load. :type path_ddl: Optional[Path] :param keep_local: Keep local file that is written out as part of the bulk loading process; defaults to `False`. :type keep_local: Optional[bool] :param path_output: Path to write output local file to; defaults to a generated file name exported in the current working directory. :type path_output: Optional[str Path] :param file_format: The name of the file_format to use when loading ``df``; defaults to ``snowmobile_default_psv``. :type file_format: Optional[str] :param incl_tmstmp: Include timestamp of load as part of ``table``; defaults to `True`. :type incl_tmstmp: Optional[bool] :param tmstmp_col_nm: Name to use for load timestamp if ``incl_tmstmp=True``; defaults to `loaded_tmstmp`. :type tmstmp_col_nm: Optional[str] :param upper_case_cols: Upper case columns of ``df`` when loading into ``table``; defaults to `True`. :type upper_case_cols: Optional[bool] :param reformat_cols: Reformat applicable columns of ``df`` to be DB-compliant; defaults to `True`. Reformatting primarily entails: - Replacing spaces and special characters with underscores - De-duping consecutive special characters - De-duping repeated column names; adds an ``_i`` suffix to duplicate fields where ``i`` is the nth duplicate name for a field :type reformat_cols: Optional[bool] :param validate_format: Validate the :xref:`file format` being used prior to kicking off the load; defaults to `True`. Validation entails: - Checking if the file format being used already exists based on formats accessible to the current connection - Executing DDL for the file format being used if not, pulled from the ``DDL`` `ext-location` and the statement name ``create file format~{format name}`` .. tip:: Providing `validate_format=False` will speed up loading time when batch-loading into an existing table by skipping this step :type validate_format: Optional[bool] :param validate_table: Perform validations of ``df`` against ``table`` prior to kicking off the loading process; defaults to `True`. Validation entails: - Checking the existence of ``table``; no further validation is performed if it does **not** exist - Compares the columns of ``df`` to the columns of ``table`` and stores results for use during loading process .. note:: Table validation results are used in conjunction with the ``if_exists`` parameter to determine the desired behavior based on the (potential) existence of ``table`` and its compatibility with ``df``. .. tip:: Providing `validate_table=False` will speed up loading time time when batch-loading into an existing table :type validate_table: Optional[bool] :param lower_case_table: Lower case ``table`` name; defaults to `False`. :type lower_case_table: Optional[bool] :param on_error: Action to take if an exception is encountered as part of the validating or loading process - providing ``on_error='c'`` will *continue* past an exception as opposed to raising it; defaults to `None` meaning any exception encountered will be raised :type on_error: Optional[str] :param check_dupes: Check for duplicate field names in ``df``; defaults to `True`. :type check_dupes: Optional[bool] :param load_copy: Alter and load a deep copy of ``df`` as opposed to the ``df`` in-memory as passed to the parameter; defaults to `True`. :type load_copy: Optional[bool] .. attribute:: db_responses Responses from database during loading process. :type: Dict[str, str] .. attribute:: loaded Table was loaded successfully. :type: bool .. method:: load(self, if_exists: Optional[str] = None, from_script: pathlib.Path = None, verbose: bool = True, **kwargs) -> snowmobile.core.table.Table Loads ``df`` into ``table``. :param if_exists: Determines behavior to take if the table being loaded into already exists; defaults to **append**; options are **replace**, **append**, **truncate**, and **fail** :type if_exists: Optional[str] :param from_script: Path to sql file containing custom DDL for ``table``; DDL is assumed to have a valid statement name as is parsed by :class:`~snowmobile.core.script.Script` and following the naming convention of ``create table~TABLE`` where ``TABLE`` is equal to the value provided to the ``table`` keyword argument :type from_script: Optional[Union[Path, str]] :param verbose: Verbose console output; defaults to **True** :type verbose: bool Returns (Table): The :class:`Table` after attempting load of ``df`` into ``table``; a successful load can be verified by inspecting :attr:`loaded` .. method:: exists(self) -> bool :property: Indicates if the target table exists. .. method:: col_diff(self, mismatched: bool = False) -> Dict[(int, Tuple[str, str])] Returns diff detail of local DataFrame to in-warehouse table. .. method:: cols_match(self) -> bool :property: Indicates if columns match between DataFrame and table. .. method:: load_statements(self, from_script: pathlib.Path) -> List[str] Generates exhaustive list of the statements to execute for a given instance of loading a DataFrame. .. method:: to_local(self, quote_all: bool = True) -> None Export to local file via configuration in ``snowmobile.toml``. .. method:: tm_load(self) -> int :property: Seconds elapsed during loading. .. method:: tm_validate_load(self) -> int :property: Seconds elapsed during validation. .. method:: tm_total(self) -> int :property: Total seconds elapsed for load. .. method:: validate(self, if_exists: str) -> None Validates load based on current state through a variety of operations. :param if_exists: Desired behavior if table already exists; intended to be passed in from :meth:`table.load()` by default. :type if_exists: str