I have a Transfer data task. Source input (non SQL Server) has x columns, SQL Destination has x+1 columns, where the extra column is a non-null smalldatetime column DateLoaded. This will be populated with the Date & time the row was added.
In DTS I do this either by adding a GetDate() equivalent to the source SQL statement (which I can't do in this case) OR by using an ActiveX script to put
DATE()+TIME() in the final column in the Transfer pane of the designer.
This must be easy in SSIS (lots of other complex things are), but I'm hanged if I can find out how to do it! A script as a source? A Union All to combine?
Solutions gratefully received,
Richard R.|||Richard,
The easiest solution is to put a Derived Column component between your source and destination. You just set the desired column name and set the expression to GETDATE(). The new column will then be available in your destination mappings.
You can do quite a few things with the derived column component including complex transformations. I'm sure you'll find it to be one of your most used components.
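For anyone following along, the Derived Column setup looks roughly like this (I'm assuming the new column is named DateLoaded to match the destination column in the original post):

```
Derived Column Name: DateLoaded
Derived Column:      <add as new column>
Expression:          GETDATE()
Data Type:           DT_DBTIMESTAMP
```

DT_DBTIMESTAMP then maps onto the smalldatetime column in the destination's column mappings.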
Larry|||Another option is to use the value of the System::StartTime variable which contains the time that the package started to execute.
This will mean that all rows in the pipeline are guaranteed to have the same value for this column - which may or may not be what you want.
-Jamie|||I chose to go with the System::StartTime solution, but frequently needed it converted to a String. This can be done with the following cast:
(DT_WSTR, 20) @.[System::StartTime]
Regards,
Lars|||Another option again is to use the Audit Transformation, and the "Execution start time" audit type (column).|||Thanks to everyone who replied to my request.
Derived columns are the solution for this one, but the system time variable will be useful for some other problems that I usually solve by setting a package variable to the current time as the first step.
Thanks again,
Richard|||Richard,
The System::StartTime idea that I talked about still requires a Derived Column transformation. When you use the Derived Column it's your choice whether you add GETDATE() or System::StartTime.
Using the Audit transformation, as mentioned by Darren, is the quickest way of doing this, however. And note that it does exactly the same thing, i.e. it adds the value in System::StartTime into the pipeline.
From a best-practices point of view I really recommend you use one of these two approaches rather than GETDATE().
-Jamie|||
Since we are on the topic, I wrote some custom logging functionality using the OnPreExecute and OnPostExecute events. What I do is log the execution time of each task and also use System::StartTime to identify a batch run. I started out using GETDATE() for the start and stop times of the tasks, but soon realized what a bad idea that was, since the clocks were not perfectly synchronized between the server running SSIS and the one hosting the database on which GETDATE() was called. In other words, stick with one timing scheme for everything; in my case that meant using the times available within SSIS, like System::EventHandlerStartTime for events.
Regards,
Lars
Apologies for the digression...
Lars,
Were you able to capture the execution time of each task?
I want to do this using System::ContainerStartTime, but unfortunately if you reference that variable in an event handler you end up referencing the instance scoped to the event handler container rather than the one scoped to the task that raised the event (which is the one you want).
Hence, I can't find a way of determining, in an event handler, how long the task that raised the event has been executing for.
-Jamie|||
I was able to, but via a workaround that involves storing data in a table. Not sure if you want to take that route, but it works, and you get the extra benefit of being able to track progress through the rows in the table. At the top level of the package I have added three event handlers: OnPreExecute, OnPostExecute and OnTaskFailure. The following pseudo-code explains what I do in each handler:
OnPreExecute
I store the following information in a database table:
System::StartTime AS BatchStartTime,
System::SourceID AS TaskID,
System::SourceName AS TaskName,
System::EventHandlerStartTime AS TaskStartTime,
NULL AS TaskStopTime,
'Started' AS TaskStatus
OnTaskFailure
I update the table:
SET
TaskStatus = 'Failed'
WHERE
TaskID = System::SourceID
AND
BatchStartTime = System::StartTime
OnPostExecute
I update the table:
SET
TaskStopTime = System::EventHandlerStartTime,
TaskStatus = CASE TaskStatus
WHEN 'Started' THEN 'Completed'
ELSE TaskStatus
END
WHERE
TaskID = System::SourceID
AND
BatchStartTime = System::StartTime
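To make the pseudo-code concrete, here's a rough T-SQL sketch of the three statements (the table name TaskLog is my invention, and the ? markers stand for the SSIS variables above, which you'd map as parameters in an Execute SQL Task):

```
-- OnPreExecute: record the task start
INSERT INTO TaskLog
    (BatchStartTime, TaskID, TaskName, TaskStartTime, TaskStopTime, TaskStatus)
VALUES
    (?, ?, ?, ?, NULL, 'Started');

-- OnTaskFailure: flag the task as failed
UPDATE TaskLog
SET TaskStatus = 'Failed'
WHERE TaskID = ?
  AND BatchStartTime = ?;

-- OnPostExecute: record the stop time; don't overwrite a 'Failed' status
UPDATE TaskLog
SET TaskStopTime = ?,
    TaskStatus = CASE TaskStatus WHEN 'Started' THEN 'Completed' ELSE TaskStatus END
WHERE TaskID = ?
  AND BatchStartTime = ?;
```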
You could just as easily use the key (TaskID, BatchStartTime) to retrieve TaskStartTime and subtract it from System::EventHandlerStartTime to get the length of execution, in any of the handlers that fire after OnPreExecute.
Regards,
Lars
OK, I see how that could work, although I don't think it needs to be in a table; it could maybe be done inside the package using an Object variable.
Thanks Lars. You've really made me think here. It needs a lot more thought putting into it but you've given me a starting point.
Thanks
-Jamie