SQL Server Import/Export to Excel using Python script

Introduction

In the previous article, we saw how to import Excel data into SQL Server and export SQL Server data to Excel by executing an R script within T-SQL.
In this post, let us see a similar approach that does the same import and export by executing a Python script within T-SQL.

There are several existing methods to do this: BCP, BULK INSERT, the Import and Export Wizard in SSMS, SSIS, Azure Data Factory, a linked server with an OPENROWSET query, and SQLCMD.

These methods have limitations:

  • BULK INSERT, BCP and Azure Data Factory cannot read Excel files directly.
  • BCP: a workaround is needed to include the header row when exporting.
  • SSIS: it supports exporting to Excel, but with a dynamic source and destination, handling the source-to-target mapping increases the complexity of the package.
  • SQLCMD: it cannot export output in Excel file format.


The R and Python language extensions were introduced in SQL Server 2016 and 2017 respectively, as part of machine learning. With R supported in Azure SQL Database and the Java language extension supported in SQL Server 2019, this new approach can be used extensively because it is easy, fast and flexible.
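
As a rough illustration of the export direction, the sketch below builds a T-SQL statement that runs a one-line Python script through sp_execute_external_script. It is only a sketch under assumptions, not the script this article goes on to use: the helper names, the query, the table dbo.Customers and the file path are all hypothetical, and it assumes external scripts are enabled on the instance and that the server's Python installation has an Excel writer package available to pandas.

```python
def tsql_literal(text: str) -> str:
    """Return text as a T-SQL Unicode string literal (single quotes doubled)."""
    return "N'" + text.replace("'", "''") + "'"


def build_export_statement(query: str, excel_path: str) -> str:
    """Build a T-SQL batch that exports the result of `query` to an Excel file.

    Hypothetical helper for illustration only.
    """
    # Python body that runs inside SQL Server. InputDataSet is the default name
    # of the pandas DataFrame that sp_execute_external_script fills from
    # @input_data_1. repr() turns the path into a valid Python string literal.
    script = f"InputDataSet.to_excel({excel_path!r}, index=False)"
    return (
        "EXEC sp_execute_external_script\n"
        "    @language = N'Python',\n"
        f"    @script = {tsql_literal(script)},\n"
        f"    @input_data_1 = {tsql_literal(query)};"
    )


# Hypothetical query and path, used only to show the generated statement.
statement = build_export_statement(
    "SELECT Name FROM dbo.Customers WHERE City = 'Oslo'",
    r"C:\Temp\customers.xlsx",
)
print(statement)
```

The generated text can be pasted into SSMS or sent through any SQL Server client. Doubling the single quotes keeps both the query and the embedded Python script valid inside T-SQL string literals.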


T-SQL : Data profiling in On-premise SQL Server / Azure SQL database

In this post, let us see how to do data profiling on tables in an on-premises SQL Server or Azure SQL Database using a T-SQL script.

Data profiling provides the following high-level information about the data:

  • Number of rows and size of the data in the object, date for the most recent update of the data and the object schema
  • Number of null records, distinct values for columns and data type of column
  • Minimum and maximum values for numeric columns, and checks for special characters and leading/trailing spaces in columns of string data type

Two stored procedures, listed below, are created to generate the data profiling output. Inside each stored procedure, the columns of each table are iterated over using the system catalog views, and MIN, MAX and other functions are applied to each column to produce the profiling output.



1) usp_DataProfiling

2) usp_DataProfiling_Metadata
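
The column-by-column iteration described above can be sketched as follows. This is not the code of either stored procedure: it is a minimal stand-in that uses Python's built-in sqlite3 module instead of SQL Server, with PRAGMA table_info playing the role of the catalog views (such as sys.columns). The function name profile_table and the sample table Orders are hypothetical.

```python
import sqlite3


def profile_table(conn, table):
    """Return per-column profiling stats for one table as {column: stats}."""
    # PRAGMA table_info stands in for SQL Server catalog views here.
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    profile = {}
    for _cid, name, declared_type, *_rest in columns:
        # One aggregate query per column; COUNT(col) skips NULLs, so
        # COUNT(*) - COUNT(col) is the number of NULL values.
        row = conn.execute(
            f'SELECT COUNT(*), COUNT(*) - COUNT("{name}"), '
            f'COUNT(DISTINCT "{name}"), MIN("{name}"), MAX("{name}") '
            f'FROM "{table}"'
        ).fetchone()
        stats = dict(zip(("rows", "nulls", "distinct", "min", "max"), row))
        stats["type"] = declared_type
        profile[name] = stats
    return profile


# Hypothetical sample table, used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Id INTEGER, Amount REAL, Note TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(1, 10.5, "ok"), (2, None, "ok"), (3, 7.25, None)],
)
profile = profile_table(conn, "Orders")
for column, stats in profile.items():
    print(column, stats)
```

Building one aggregate query per column keeps the sketch close to the description above. Table and column names are interpolated into the SQL text, so they should come from the catalog and never from user input.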


T-SQL : Search for string or phrase in SQL Server database

In this post, let us see how to search for a string or phrase in a SQL Server database using a hybrid solution that combines the T-SQL LIKE operator with the R grep function. The options that currently exist in SQL Server to perform a search are:
  1. LIKE operator 
  2. Using string functions CHARINDEX, PATINDEX 
  3. Full text search 
Consider the example below, where the goal is to return only the records that contain the string "VAT" as a whole word. The expected result is records 1, 5 and 6.

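-- Sample data: only rows 1, 5 and 6 contain "VAT" as a separate word
DECLARE @Tmp TABLE (Id INT, Descrip VARCHAR(500))
INSERT @Tmp SELECT 1,'my VAT calculation is incorrect'
INSERT @Tmp SELECT 2,'Private number'
INSERT @Tmp SELECT 3,'Innnovation model'
INSERT @Tmp SELECT 4,'ELEVATE'
INSERT @Tmp SELECT 5,'total VAT'
INSERT @Tmp SELECT 6,'VAT'
-- Results noted below assume a case-insensitive collation
SELECT * FROM @Tmp WHERE Descrip LIKE 'VAT'      -- row 6 only
SELECT * FROM @Tmp WHERE Descrip LIKE '%VAT'     -- rows 5 and 6
SELECT * FROM @Tmp WHERE Descrip LIKE '%VAT%'    -- all six rows ("Private", "Innnovation" and "ELEVATE" also match)
SELECT * FROM @Tmp WHERE Descrip LIKE '% VAT %'  -- row 1 only
SELECT * FROM @Tmp WHERE Descrip LIKE '% VAT'    -- row 5 only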






As the example above shows, the first two options offer no straightforward way to do an exact search on a string: no single LIKE pattern returns exactly records 1, 5 and 6. It is possible with the third option, using the full-text CONTAINS predicate, but a full-text catalog, a unique index and a full-text index have to be created on the table to be searched.
If the exact search needs to be performed across the entire database, creating a full-text catalog, unique index and full-text index on every table is not a viable option.

With the hybrid approach (T-SQL LIKE operator and R grep function), let us see the various search types that can be performed: pattern search, exact search, multi-pattern search and other search scenarios based on collation, case-sensitive or case-insensitive search, and complex wildcard search.
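
To make the idea of the hybrid approach concrete, here is a minimal sketch of the exact-search case on the sample rows above. It swaps in stand-ins for both halves: Python's built-in sqlite3 module runs the broad LIKE filter in place of SQL Server, and Python's re module does the whole-word match in place of R's grep. The function name exact_search is hypothetical, and the search term is assumed to contain no LIKE wildcard characters.

```python
import re
import sqlite3

# Same sample rows as the T-SQL example above.
ROWS = [
    (1, "my VAT calculation is incorrect"),
    (2, "Private number"),
    (3, "Innnovation model"),
    (4, "ELEVATE"),
    (5, "total VAT"),
    (6, "VAT"),
]


def exact_search(rows, term):
    """Return the Ids of rows that contain `term` as a whole word."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Tmp (Id INTEGER, Descrip TEXT)")
    conn.executemany("INSERT INTO Tmp VALUES (?, ?)", rows)
    # Step 1: broad, cheap filter in SQL. LIKE '%term%' over-matches
    # (for example "ELEVATE") but narrows down the candidate rows.
    candidates = conn.execute(
        "SELECT Id, Descrip FROM Tmp WHERE Descrip LIKE ? ORDER BY Id",
        (f"%{term}%",),
    ).fetchall()
    conn.close()
    # Step 2: precise filter with a regular expression. \b marks a word
    # boundary, so the term must appear as a separate word (case-sensitive).
    pattern = re.compile(rf"\b{re.escape(term)}\b")
    return [row_id for row_id, text in candidates if pattern.search(text)]


matches = exact_search(ROWS, "VAT")
print(matches)
```

The LIKE step keeps the amount of data handed to the regular expression small, while the regular expression supplies the whole-word precision that LIKE alone cannot express in a single pattern.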