Azure Data Factory (ADF) Pipeline failure – found more columns than expected column count (DelimitedTextMoreColumnsThanDefined)

July 29, 2020

 
I was setting up an Azure Data Factory (ADF) pipeline to copy files from Azure Data Lake Storage Gen1 to Gen2, but the pipeline run kept failing with the error below:

Operation on target Copy_sae failed: Failure happened on ‘Sink’ side.
ErrorCode=DelimitedTextMoreColumnsThanDefined,
‘Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,
Message=Error found when processing ‘Csv/Tsv Format Text’ source ‘0_2019_11_09_01_43_32.avro’ with row number 53: found more columns than expected column count 27.,
Source=Microsoft.DataTransfer.Common,’

 

After some research, I figured out that it’s because I had not selected the “Binary Copy” option while creating the Copy Data activity (shown in the image below).

Root Cause: If the folder you are copying contains files with different schemas (a variable number of columns, different delimiters, or different quote character settings), or files with data issues, the ADF pipeline will run into this error.
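To see why a single odd file is enough to break the copy, here is a minimal Python sketch of the kind of check a delimited-text reader performs. It is only an illustration, not ADF's actual implementation: the function name `find_wide_rows`, the sample data, and the 1-based row numbering are my own assumptions. Note also that the error above names a `.avro` file being processed as ‘Csv/Tsv Format Text’, so a non-text file read as delimited text can trip the same check.

```python
import csv
import io


def find_wide_rows(text, expected_columns, delimiter=","):
    """Return (row_number, column_count) for every row that has more
    columns than expected. Row numbers are 1-based (an assumption)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [
        (row_number, len(row))
        for row_number, row in enumerate(reader, start=1)
        if len(row) > expected_columns
    ]


# Hypothetical sample: the third row has one column too many.
sample = "a,b,c\n1,2,3\n4,5,6,7\n"
wide_rows = find_wide_rows(sample, expected_columns=3)
print(wide_rows)  # [(3, 4)]
```

Running a check like this over a sample of the source folder before migrating can show which files do not match the schema you expected.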

So, when bulk copying or migrating your data from one Data Lake to another, try choosing this option. ADF then won’t open the files to read their schema; it simply treats every file as binary and copies it to the other location.
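For reference, this is roughly what a binary Copy activity looks like in the pipeline JSON, built here as a Python dictionary so it can be printed and compared. Treat it as a sketch based on my understanding of ADF's Binary format, not as the exact JSON the Copy Data tool generates: the dataset names `Gen1BinaryDataset` and `Gen2BinaryDataset` are hypothetical, both datasets are assumed to be of type Binary, and the type names should be checked against your own pipeline's JSON view.

```python
import json

# Hypothetical dataset names; both datasets are assumed to be of type "Binary".
copy_activity = {
    "name": "Copy_sae",
    "type": "Copy",
    "inputs": [
        {"referenceName": "Gen1BinaryDataset", "type": "DatasetReference"}
    ],
    "outputs": [
        {"referenceName": "Gen2BinaryDataset", "type": "DatasetReference"}
    ],
    "typeProperties": {
        # Binary source and sink: files are copied as-is, with no schema parsing.
        "source": {
            "type": "BinarySource",
            "storeSettings": {
                "type": "AzureDataLakeStoreReadSettings",  # ADLS Gen1 read side
                "recursive": True,
            },
        },
        "sink": {
            "type": "BinarySink",
            "storeSettings": {
                "type": "AzureBlobFSWriteSettings",  # ADLS Gen2 write side
            },
        },
    },
}

print(json.dumps(copy_activity, indent=2))
```

If your activity shows a delimited-text source or sink instead of the binary ones, the files are being parsed during the copy, which is where the column count error comes from.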


 
Hope this helps!

Migrate ADLS Gen1 to Gen2