Working With Teradata Volatile Tables And ADO.Net Connection

by Prathy Kamasani

While I am in the flow of blogging, I want to write about an issue which troubled me recently. It definitely deserves a blog post.So that I will know where to look when I face the similar problem again. Recently I started working with Teradata, obviously being a Microsoft person, I have not enjoyed it much but luckily my job was majorly involved in writing select query than any other complicated stuff.

However, I came across a scenario where I need to bring few thousands of rows to SQL Server, which involves complex operations with millions of row’s of data on Teradata and then matching/joining/lookups with different data sources using SSIS. It was a case where Teradata Volatile tables come handy as I want to do all complex joins and operations on Teradata box then extract the only small set of data which requires further transformations and lookups. Well, I thought I can easily do this with ADO.NET connection and variables. Unfortunately, it was not that easy with my limited C# skills which ultimately led to this blog post.

So in this post, I am going to cover two topics: How to use Teradata Volatile tables and using SSIS objects variable as a source with managed ADO.NET connections.

Teradata and SSIS:

Attunity Teradata components are popular to extract data from Teradata databases. I read, this component performs better than using ADO.NET source component. I have not made any comparisons, however as per Microsoft Document, “Attunity Teradata Components uses the Teradata Parallel Transporter API and internal buffering with Teradata PT operators directly rather than requiring access through a managed interfaces”. Paper also mentioned in the conclusion saying that “The Microsoft Connector for Teradata by Attunity provides a high-performance means of loading and unloading data to and from Teradata databases”. With my scenario, I want to work with Volatile tables, so I cannot use Attunity Teradata components.Hence, I gone for ADO.Net Connection Type. The very first thing I need to do while I am working with Teradata is setting Query Banding.

Setting The Query Band:

I used Execute SQL Task to set the Query Band with ADO.NET connection type and System DSN. I want to use this query band for all the queries under this session that’s why I updated the RetainSameConnection property to TRUE.

22-03-16 16-09-24

Creating Volatile table:

In the next step, I created a volatile table using another Execute SQL Task. If you want, you can put all the statements under one Execute SQL Task, but I prefer to put them individually for readability. Hola we are done with creating our first Volatile table using SSIS, and we can use this table in any of our further operations…I want to select data from the volatile table I just created and use the selected data as a source in my data flow task for further data transformations or anything related to Data Flow Task.

22-03-16 16-02-54

I selected data using Execute SQL Task and returned the full result set to use as a source in the Data Flow Task. Pretty much same as my previous post which was based on Tim Mitchell blog post. Besides, here comes the problem. When I used the same script as I used in my last blog post, I was getting following error.

2016-03-20_18-09-50

ADO.Net Connection:

I thought I must have typed something wrong, so went back to my script checked few times, changed few things with my minimal C# knowledge but no success. I tried same example with SQL Server and  OLEDEB Connection, which worked fine. I thought something wrong with Teradata (Probably SSIS Execute SQL task was not supporting Teradata). Apparently when I looked at the error in detail, it clearly says Object is not an ADODB recordset.  With a bit of googling, I found that error is common in the .Net world also found an interesting blog post by Valentino Vranken explaining SSIS object package variable and how it depends on the connection type. First time I realised executes SQL Task result set depends on the connection type. As Valentino blog post explains in debug mode, I can clearly see the value type property of two system object variable is different.

                                image2016-03-21_23-06-37

ADO.net provides System.Data.Dataset and OLEDB provide System._ComObject. Basically, with OLEDB, I used Fill Method to extract data from the variable but with ADO.Net, I was getting error. Surprisingly I can use Fill method in the Control Flow with Script Task but not in the Data Flow Task with Script Component. When I use Fill in the Script Component as Source, I was getting following error

‘DataTable’ does not contain a definition for ‘Fill’ and no extension method ‘Fill’ accepting a first argument of type ‘DataTable’ could be found (are you missing a using directive or an assembly reference?)

But it works fine in the Control flow task with Script Task. Frankly, I do not know why but I overcame this situation by creating a Dataset. Here is the C# code I used with two connection types.

OLEDB Connection:

using System;

using System.Data;

using System.Data.OleDb;

using System.Data.SqlClient;

public override void CreateNewOutputRows()
    {
        OleDbDataAdapter oleDA = new OleDbDataAdapter();
        DataTable dt = new DataTable();
        oleDA.Fill(dt, Variables.OLEDB);
        foreach (DataRow dr in dt.Rows)

       {

           OLEDBBuffer.AddRow();
            OLEDBBuffer.Firstname = (dr["Firstname"].ToString());
            OLEDBBuffer.Lastname = (dr["Lastname"].ToString());
        }

   }

ADO.NET Connection:

using System;

using System.Data;

using System.Data.OleDb;

using System.Data.SqlClient;

  public override void CreateNewOutputRows()
     {
         OleDbDataAdapter oleDA = new OleDbDataAdapter();
         DataSet ds = (DataSet)this.Variables.ADO;
         DataTable dt = ds.Tables[0];
         foreach (DataRow dr in dt.Rows)

        {

            ADOBuffer.AddRow();
             ADOBuffer.Firstname = (dr["Firstname"].ToString());
             ADOBuffer.Lastname = (dr["Lastname"].ToString());
         }

    }
}

Finally it worked. This may not be the proper way of doing, but it worked with my scenario. ADO.Net connections usually don’t perform better compared to other connection types like OLEDB but with my scenario there was no impact on the performance. If someone know why Fill method works with Script Task but not with Script Component, please let me know 🙂

You can download SSIS Package from here: http://tinyurl.com/SSISTeradata  (For security reason I did not included Teradata part . But it has two connection types with SQL Server)

Enjoy,
Prathy

You may also like

7 comments

Rob Bane 15 November 2019 - 8:38 pm

Hi,

What is this detail which sets the query band?

Thanks for the blog!

Reply
JIKEN 29 August 2016 - 3:28 pm

Did you add any reference in script task for use ADOBuffer?

Reply
Prathy Kamasani 2 October 2016 - 7:59 am

No Jiken, I was using script as a source and using SysObject variable; which has the data I extracted from Teradata.

Reply
JIKEN 20 March 2018 - 4:00 pm

Did you add any reference in script task for use ADOBuffer?

Reply
Prathy Kamasani 20 March 2018 - 4:00 pm

No Jiken, I was using script as a source and using SysObject variable; which has the data I extracted from Teradata.

Reply
Shaun 21 July 2016 - 2:27 pm

Thanks, exactly what I needed.

Reply
Shaun 20 March 2018 - 4:00 pm

Thanks, exactly what I needed.

Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. Accept Read More