SQL Server Data Quality Services

This article is about cleansing and formatting data using SQL Server Data Quality Services .

In this post , I am going to show on how to cleanse & format data in the columns of SQL Server table . To do this , first we have to create Knowledge base and then Data quality project .

Creating Knowledge base is nothing but giving hint of source data to DQS and then setting rules on how data should be formatted by creating Domains for each columns .

After creating Knowledge base , create Data quality project making use of knowledge base on our source data and then after deciding on the recommendations provided by DQS , we can export the cleansed (corrected) & formatted data to new table .


Note : click on images to get enlarged view


Consider below screenshot is my sample data :



Create New Knowledge Base





Enter Knowledge Base name and click on Knowledge Discovery and then click Next







Create a domain



Create a domain for Source Column Name
            





Create a domain for Source Column Gender











Similarly create a domain for Source Column CityCode & State with Format Output to Upper Case



Click Next and then click on Start will Performs data discovery analysis on the selected data source .




Click Next and then set the Type as Invalid and enter the values to be corrected .

For example : For domain CityCodeKB - .>  "Chennai" should be corrected as  "chn"  (city code instead of city name ) .







There are no values to be corrected in NameKB & StateKB , Check the values and then click on Finish and Publish the Knowledge Base :



Create New Data Quality Project :



Click Next



Click on Start will Perform Cleansing on the selected data source




Click on Export and then Finish



Clicking on Activity Monitoring will provide us the information as shown below :






 See Also :




No comments: