Introduction

In previous article, we saw how to import/export excel to/from SQL Server by executing R script within T-SQL.

In this post, let us see another similar approach to import excel into SQL Server and export SQL server data to excel by executing Python script within T-SQL.

There are some existing methods to do this using BCP, Bulk Insert, Import & Export wizard from SSMS, SSIS, Azure data factory, Linked server & OPENROWSET query and SQLCMD.

BULK INSERT statement, the BCP tool, or Azure Data Factory can't read Excel files directly
BCP - Workaround has to be done to include the header while exporting
SSIS - Though it supports exporting to excel, with dynamic source & destination, handling mapping between source to target increases the complexity of the package
SQLCMD - Cannot export output in Excel file format

R & Python language extension was introduced in SQL Server 2016 & 2017 as part of machine learning. With support of R in Azure SQL database and Java language extension support in SQL Server 2019 , this new approach can be used extensively as it easy, fast and flexible.

Pre-requisites

We have used SQL Server 2019 evaluation edition on Windows 10 64 bit and WideWorldImporters SQL Server sample database for this example.

Below scripts will work starting from SQL Server 2017 and above (as execution of Python language using T-SQL was introduced in SQL Server 2017).

Only pre-requisite step is to install Python services and then from SSMS enable the external scripting feature. Restart the database engine and then verify the installation as mentioned in MSDN .

Provide folder permission to access the excel files during import / export process, right-click on folder -> Properties -> Security -> (Full Control) to "ALL_APPLICATION_PACKAGES".

pandas.DataFrame.to_excel & pandas.read_excel are used to export and import excel which are installed by default

We can check that by running below statement in SSMS:

EXECUTE sp_execute_external_script

@language =N'Python',

@script=N'import pip

for i in pip.get_installed_distributions():

print(i)';

GO

Exporting SQL Server data (list of tables, views) into Excel file

For this approach, we have created a Stored procedure named "usp_ExportExcel" in "WideWorldImporters" database. Executing the Stored procedure based on input parameters exports SQL Server data to excel files

Stored procedure - Has below three input parameters and writes the output to excel file

Parameter	Description
@ExportPath	Path for exporting excel files
@SchemaName	List of objects under this schema to be exported. Can have multiple values separated by comma
@ObjectlisttoExport	List of tables, views to be exported. Can have multiple values separated by comma

CREATE OR ALTER PROC usp_ExportExcel (@ExportPath NVARCHAR(MAX),

@SchemaName NVARCHAR(MAX),

@ObjectlisttoExport NVARCHAR(MAX)

)

AS

BEGIN

SET NOCOUNT ON;

BEGIN TRY

IF ISNULL(@ExportPath,'') <> ''

BEGIN

SELECT @ExportPath = CASE WHEN RIGHT(@ExportPath,1) = '\' THEN @ExportPath ELSE CONCAT(@ExportPath,'\') END

DECLARE @ValidPath TABLE (ValidPathCheck BIT)

INSERT @ValidPath

EXEC sp_execute_external_script

@language =N'Python',

@script=N'

import pandas as pd

d = os.path.isdir(ExportFilePath)

OutputDataSet = pd.DataFrame(d,columns=["Filename"])'

,@params = N'@ExportFilePath NVARCHAR(MAX)'

,@ExportFilePath = @ExportPath

IF (SELECT ValidPathCheck FROM @ValidPath) = 1

BEGIN

IF ISNULL(@SchemaName,'') <> '' OR ISNULL(@ObjectlisttoExport,'') <> ''

BEGIN

DROP TABLE IF EXISTS #ExportTablesList, #FinalExportList

CREATE TABLE #ExportTablesList(TableCount INT IDENTITY(1,1),Cols NVARCHAR(MAX),TableName NVARCHAR(200))

--Get the list of objects to be exported

INSERT #ExportTablesList (Cols,TableName)

SELECT CASE WHEN TY.name IN ('date','datetime2','datetimeoffset','time','timestamp','decimal','bit','int','bigint')

THEN CONCAT('TRY_CONVERT(','VARCHAR(MAX),',C.name,') AS ',QUOTENAME(C.NAME))

ELSE C.name END Cols -- To cover poor data type conversions b/n Python & SQL Server

,CONCAT(SCHEMA_NAME(T.SCHEMA_ID),'.',T.name) TableName

FROM Sys.tables T

JOIN sys.columns C

ON T.object_id = C.object_id

JOIN sys.types TY

ON C.[user_type_id] = TY.[user_type_id]

WHERE Schema_name(T.schema_id) IN (SELECT value FROM STRING_SPLIT(@SchemaName, ','))

-- Ignore the datatypes that are not required to be exported

AND TY.name NOT IN ('geography','varbinary','binary','text', 'ntext', 'image', 'hierarchyid', 'xml', 'sql_variant')

INSERT #ExportTablesList (Cols,TableName)

SELECT CASE WHEN TY.name IN ('date','datetime2','datetimeoffset','time','timestamp','decimal','bit','int','bigint')

THEN CONCAT('TRY_CONVERT(','VARCHAR(MAX),',C.name,') AS ',QUOTENAME(C.NAME))

ELSE C.name END Cols -- To cover poor data type conversions b/n Python & SQL Server

,CONCAT(SCHEMA_NAME(T.SCHEMA_ID),'.',T.name) TableName

FROM Sys.tables T

JOIN sys.columns C

ON T.object_id = C.object_id

JOIN sys.types TY

ON C.[user_type_id] = TY.[user_type_id]

WHERE CONCAT(SCHEMA_NAME(T.SCHEMA_ID),'.',T.name) IN (SELECT value FROM STRING_SPLIT(@ObjectlisttoExport, ','))

-- Ignore the datatypes that are not required to be exported

AND TY.name NOT IN ('geography','varbinary','binary','text', 'ntext', 'image', 'hierarchyid', 'xml', 'sql_variant')

--Dedup of object list

;WITH dedup

AS

(

SELECT *,ROW_NUMBER()OVER(PARTITION BY TableName,Cols ORDER BY Cols) Rn FROM #ExportTablesList

)

DELETE FROM dedup

WHERE Rn > 1

--Forming columns list as comma separated

SELECT TableName,IDENTITY(INT,1,1) AS TableCount

, STUFF(

(

SELECT ', ' + C.Cols

From #ExportTablesList As C

WHERE C.TableName = T.TableName

FOR XML PATH('')

), 1, 2, '') AS Cols

INTO #FinalExportList

From #ExportTablesList As T

GROUP BY TableName

DECLARE @I INT = 1

,@TableName NVARCHAR(200)

,@SQL NVARCHAR(MAX) = N''

,@PythonScript NVARCHAR(MAX) = N''

,@ExportFilePath NVARCHAR(MAX) = N''

--Loop through the object list to export as excel

WHILE @I <= (SELECT COUNT(TableName) FROM #FinalExportList)

BEGIN

-- Just for testing purpose top 10 records are selected

SELECT @SQL = CONCAT('SELECT TOP 10 ',Cols,' FROM ',TableName,';')

,@TableName = TableName

FROM #FinalExportList WHERE TableCount = @I

SET @PythonScript = N'

FullFilePath = ExcelFilePath+TableName+".xlsx"

InputDataSet.to_excel(FullFilePath,sheet_name=TableName.split(".")[-1],index=False)'

EXEC sp_execute_external_script

@language = N'Python'

,@script = @PythonScript

,@input_data_1 = @SQL

,@params = N'@ExcelFilePath NVARCHAR(MAX), @TableName NVARCHAR(200)'

,@ExcelFilePath = @ExportPath -- file path where Excel files are placed

,@TableName = @TableName

SET @I = @I + 1

END

END ELSE PRINT 'Schema name of objects or list of objects (separated by comma) to be exported need to be mentioned'

END ELSE PRINT 'Invalid folder path'

END ELSE PRINT 'Export folder path need to be mentioned'

END TRY

BEGIN CATCH

PRINT 'Issue while executing this SP, please check whether there is permission to execute the script / to access the folder and input params are valid'

END CATCH

END

Importing Excel file into SQL Server

For this approach, we have created a Stored procedure named "usp_ImportExcel" in "WideWorldImporters" database. Executing the Stored procedure based on input parameters imports excel files to SQL Server table

Stored procedure - Has below six input parameters

Parameter	Description
@ImportPath	Path where excel files are placed for importing into SQL Server
@DBConnectionString	Target SQL Server database connection string where files are imported. Can be Database=DB name; Trusted_Connection=True or Database=DB name;Uid= user name;Pwd=Password
@ImportAll	If set to 1 then all files in the mentioned path are imported. If set to 0 then only mentioned files are imported
@CombineTarget	Flag to decide single target table for each source file (files with same structure) or separate target table for each source file
@ExcelFileName	If @ImportAll = 0 then excel file name needs to be passed to this parameter
@ExcelSheetName	If @ImportAll = 0 then corresponding sheet name of the excel file needs to be passed to this parameter

CREATE OR ALTER PROC usp_ImportExcel (@ImportPath NVARCHAR(MAX),

@DBConnectionString NVARCHAR(MAX),

@ImportAll BIT,

@CombineTarget BIT,

@ExcelFileName NVARCHAR(200),

@ExcelSheetName NVARCHAR(50)

)

AS

BEGIN

SET NOCOUNT ON;

BEGIN TRY

IF ISNULL(@ImportPath,'') <> '' AND ISNULL(@DBConnectionString,'') <> ''

BEGIN

SELECT @ImportPath = CASE WHEN RIGHT(@ImportPath,1) = '\' THEN @ImportPath ELSE CONCAT(@ImportPath,'\') END

DECLARE @Serv NVARCHAR(200) = CONCAT(CHAR(39),CHAR(39),@@SERVERNAME,CHAR(39),CHAR(39))

DECLARE @ValidPath TABLE (ValidPathCheck BIT)

INSERT @ValidPath

EXEC sp_execute_external_script

@language =N'Python',

@script=N'

import pandas as pd

d = os.path.isdir(ImportFilePath)

OutputDataSet = pd.DataFrame(d,columns=["Filename"])'

,@params = N'@ImportFilePath NVARCHAR(MAX)'

,@ImportFilePath = @ImportPath

IF (SELECT ValidPathCheck FROM @ValidPath) = 1

BEGIN

IF (@ImportAll = 0 AND (ISNULL(@ExcelFileName,'') <> '' AND ISNULL(@ExcelSheetName,'') <> ''))

OR (@ImportAll = 1 AND (ISNULL(@ExcelFileName,'') = '' AND ISNULL(@ExcelSheetName,'') = ''))

BEGIN

DECLARE @PythonScript NVARCHAR(MAX) =CONCAT('

import pandas as pd

import os

import glob

from revoscalepy import RxSqlServerData, rx_data_step

sqlConnString = "Driver=SQL Server;Server=Serv; ',@DBConnectionString,'"

Filefolderepath = ImportFilePath+"*.xlsx"

if ImportAll ==1 and CombineTarget==0:

for FullFilePath in glob.glob(Filefolderepath):

Filename = os.path.basename(FullFilePath).replace(".xlsx","")

xl = pd.ExcelFile(FullFilePath)

for sheetname in xl.sheet_names:

Output = pd.read_excel(FullFilePath, sheetname=sheetname, na_filter=False).astype(str)

if not Output.empty:

sqlDS = RxSqlServerData(connection_string = sqlConnString,table = "".join(fl for fl in Filename if fl.isalnum())+"_"+"".join(sh for sh in sheetname if sh.isalnum()))

rx_data_step(input_data = Output, output_file = sqlDS,overwrite = True)

if ImportAll ==1 and CombineTarget==1:

df2=pd.DataFrame()

for FullFilePath in glob.glob(Filefolderepath):

Filename = os.path.basename(FullFilePath).replace(".xlsx","")

xl = pd.ExcelFile(FullFilePath)

for sheetname in xl.sheet_names:

Output = pd.read_excel(FullFilePath, sheetname=sheetname).columns.astype(str)

Output = ",".join(list(Output))

df1 = pd.DataFrame([[Filename,sheetname,FullFilePath,Output]],columns=["Filename","sheetname","FullFilePath","Headers"])

df2=df2.append(df1,ignore_index=True)

sqlDS = RxSqlServerData(connection_string = sqlConnString,table = "Tbl_PyImpExp1")

rx_data_step(input_data = df2, output_file = sqlDS,overwrite = True)

if ImportAll ==0:

Filename =ImportFilePath+ExcelFileName+".xlsx"

exists = os.path.isfile(Filename)

if exists and ExcelSheetName in pd.ExcelFile(Filename).sheet_names:

Output = pd.read_excel(Filename, sheetname=ExcelSheetName, na_filter=False).astype(str)

if not Output.empty:

sqlDS = RxSqlServerData(connection_string = sqlConnString,table = "".join(fl for fl in ExcelFileName if fl.isalnum())+"_"+"".join(sh for sh in ExcelSheetName if sh.isalnum()))

rx_data_step(input_data = Output, output_file = sqlDS,overwrite = True)

else:

print("Invalid Excel file or sheet name")')

EXEC sp_execute_external_script

@language = N'Python'

,@script = @PythonScript

,@params = N'@ImportFilePath NVARCHAR(MAX),@ImportAll BIT,@CombineTarget BIT,@ExcelFileName NVARCHAR(200),@ExcelSheetName NVARCHAR(50),@Serv NVARCHAR(200)'

,@ImportFilePath = @ImportPath

,@ImportAll = @ImportAll

,@CombineTarget = @CombineTarget

,@ExcelFileName = @ExcelFileName

,@ExcelSheetName = @ExcelSheetName

,@Serv = @Serv

IF @ImportAll =1 AND @CombineTarget =1

BEGIN

IF OBJECT_ID('Tbl_PyImpExp1') IS NOT NULL

BEGIN

DROP TABLE IF EXISTS Tbl_PyImpExp2

;WITH FileList

As(

SELECT [Filename]

,[sheetname]

,[Headers]

,[FullFilePath]

,ROW_NUMBER()OVER(ORDER BY (SELECT 1)) Rn

,ROW_NUMBER()OVER(PARTITION BY [Headers] ORDER BY [Headers]) Grp

,DENSE_RANK()OVER(ORDER BY [Headers]) Grp1

FROM [dbo].[Tbl_PyImpExp1]

)

SELECT *,FIRST_VALUE([Filename]) OVER (PARTITION BY Grp1 ORDER BY Grp ASC) AS TableName

INTO Tbl_PyImpExp2

FROM FileList

END

IF EXISTS (SELECT 1 FROM Tbl_PyImpExp2)

BEGIN

DECLARE @I INT = 1

,@SQL NVARCHAR(MAX) =N''

SET @PythonScript = CONCAT('

import pandas as pd

from revoscalepy import RxSqlServerData, rx_data_step

sqlConnString = "Driver=SQL Server;Server=Serv; ',@DBConnectionString,'"

if ImportAll ==1 and CombineTarget==1:

FinalImport=pd.DataFrame()

for index, row in InputDataSet.iterrows():

Tbl = "".join(T for T in row["TableName"] if T.isalnum())

Import = pd.read_excel(row["FullFilePath"], sheetname=row["sheetname"], na_filter=False).astype(str)

Import["ImportKey"] = row["TableName"]+"_"+row["sheetname"]

FinalImport=FinalImport.append(Import,ignore_index=True)

if not FinalImport.empty:

sqlDS = RxSqlServerData(connection_string = sqlConnString,table = Tbl)

rx_data_step(input_data = FinalImport, output_file = sqlDS,overwrite = True)')

WHILE @I <= (SELECT MAX(Grp1) FROM Tbl_PyImpExp2)

BEGIN

SET @SQL = CONCAT('SELECT FullFilePath,sheetname,TableName FROM Tbl_PyImpExp2 WHERE Grp1 = ',@I)

EXEC sp_execute_external_script

@language = N'Python'

,@script = @PythonScript

,@input_data_1 = @SQL

,@params = N'@ImportAll BIT,@CombineTarget BIT,@Serv NVARCHAR(200)'

,@Serv = @Serv

,@ImportAll = @ImportAll

,@CombineTarget = @CombineTarget

SET @I = @I + 1

END

DROP TABLE IF EXISTS Tbl_PyImpExp1,Tbl_PyImpExp2

END

END ELSE PRINT 'Invalid parameters: If ImportAll = 0 then pass Excel file & Sheet Name as input. If ImportAll = 1 then pass Excel file & Sheet Name blank'

END ELSE PRINT 'Invalid folder path'

END ELSE PRINT 'Import folder path or database connection string need to be mentioned'

END TRY

BEGIN CATCH

PRINT 'Issue while executing this SP, please check whether there is permission to execute the script / to access the folder and input params are valid'

END CATCH

END

Please see my technet wiki article for detailed explanation on this topic with examples and screenshots and you can also download the scripts and SSRS report from technet gallery site.

We need to be careful with indentations in python script as formatting them without understanding the code can result in error.

All about SQLServer

SQL Server Import/Export to Excel using Python script

Exporting SQL Server data (list of tables, views) into Excel file

Importing Excel file into SQL Server

See Also:

No comments:

About Me

Followers