cancel
Showing results for 
Search instead for 
Did you mean: 
Highlighted
Community Member

Is there a "data as of" date in Canvas Data?

Jump to solution

Hi, I'm looking for a "data extract date" or "data as of date" so that we know when the data was pulled into Redshift.  I didn't see a status date or table anywhere in the documentation, but maybe I missed it?  We could retrieve the max(timestamp) from one of the fact tables, but is that the best/only way?

12 Replies
Highlighted

a1222252@adelaide.edu.au‌ - Interesting that pseudonym_dim is so up to date compared to requests!  I always thought the whole batch got exported at the same time.  

I could do it on import but as long as I know it's all in UTC I know they all match.  The US time zones and daylight savings time date ranges are kind of difficult to work with.  

0 Kudos
Highlighted

Hi Joni,

This will be due to the fact that the data sources are different. The dims and facts will be extracted from the database, while the requests data is extracted from the web server activity log. I would expect that extracting data from the log is more involved an takes longer, and probably can't be done until the log rotates. This would be why the data always contains data up until midnight UTC.

S.

0 Kudos
Highlighted

fernerj@erau.edu,

Those timestamps in your example code seem to be dependent upon some locale setting. When I tried #10-03-2019#, it interpreted it as October 3, 2019, and not March 10, 2019. People may be better served using the unambiguous YYYY-MM-DD format and #2019-03-10#. That's what Tableau shows in their documentation.

Also, because the time change happens at 2 am local time, which would be 07:00 UTC for EST switching to EDT and 06:00 UTC for EDT switching to EST, you're missing out on 17 hours of DST by making it be > March 10, 2019, instead of >=.

You can include the timestamp as part of your check and not bother chopping it to the date.

IF DATE([Timestamp]) > #10-03-2019# AND DATE([Timestamp]) < #03-11-2019# THEN

could be written as more precisely as

IF [Timestamp] >= #2019-03-10 07:00# AND [Timestamp] < #2019-11-03 06:00# THEN

You could make it a little more readable (arguable) but definitely shorted by combining the checks with an OR instead of doing IF THEN ELSEIF ELSIF ELSIF ... ELSIF END

IF

  ([Timestamp] >= #2019-03-10 07:00# AND [Timestamp] < #2019-11-03 06:00#) OR

  ([Timestamp] >= #2020-03-08 07:00# AND [Timestamp] < #2020-11-01 06:00#) OR

 // fill in the rest of the years here

  ([Timestamp] >= #2025-03-09 07:00# AND [Timestamp] < #2025-11-02 06:00#)

THEN DATEADD('hour',-4,[Timestamp])
ELSE DATEADD('hour',-5,[Timestamp])

END

I don't know that this is clearer, but you could also manipulate that to be

DATEADD('hour',

IF

  ([Timestamp] >= #2019-03-10 07:00# AND [Timestamp] < #2019-11-03 06:00#) OR

  ([Timestamp] >= #2020-03-08 07:00# AND [Timestamp] < #2020-11-01 06:00#) OR

 // fill in the rest of the years here

  ([Timestamp] >= #2025-03-09 07:00# AND [Timestamp] < #2025-11-02 06:00#)

THEN -4 ELSE -5 END,

[Timestamp])

Tableau uses short circuit Boolean analysis with AND and OR logical functions, so it won't evaluate all of the other checks once it finds once that matches. I don't know what kind of performance hit there is to checking all the way out to 2025 when it's still 2019, but that's present in both of our examples.

You might be able to speed up performance by partitioning the data off of the year. Something like this would let you skip a bunch of checks for the later years

DATEADD('hour',
CASE YEAR([Timestamp])
  WHEN 2019 THEN
    IIF([Timestamp] >= #2019-03-10 07:00# AND [Timestamp] < #2019-11-03 06:00#,-4,-5)
  WHEN 2020 THEN
    IIF([Timestamp] >= #2020-03-08 07:00# AND [Timestamp] < #2020-11-01 06:00#,-4,-5)

  // Keep going with additional years

  WHEN 2025 THEN

    IIF([Timestamp] >= #2025-03-09 07:00# AND [Timestamp] < #2025-11-02 06:00#,-4,-5)

  ELSE 0
END,
[Timestamp])

I don't have a real budget for Canvas Data, so I use MySQL. It isn't as high powered as some of the other database systems, but here is what I found a year or so ago when I did some checking.

If I store my values as a DATETIME rather than TIMESTAMP, then the database doesn't do automatic conversions. A TIMESTAMP field in MySQL is converted to UTC when stored and converted back when retrieved. That made it hard to work with timestamps that were already in UTC as it erroneously converted it again. Using DATETIME or DATETIME(3) for the requests table, the timestamps are left in UTC and then I can use the CONVERT_TZ(dt, 'UTC', 'US/Eastern') function to get the local time (although it would be US/Central for me).

Earlier this week, I saw where a many states were pushing to do away with the time split and go with one standard year round. The Illinois Senate just passed a bill on November 13 to make Illinois permanently be on Daylight Saving Time. I think the bill still has to be approved by the Illinois House and it would require a change at the federal level to the Uniform Time Act of 1966. Currently, states can opt out of DST without federal say-so, but we cannot opt into year-round DST without their OK. Some New England states are trying to get into a different timezone so that they can be on permanent Atlantic Standard Time, which would be permanent Eastern Daylight Saving time. I suppose that Illinois could take that approach and requested a move into Eastern Time, but almost none of the substantive requirements are met, so it wouldn't be likely to happen.

From a technology perspective, it would be easier if people stuck to one time year round. Somehow, I don't think the complexity of a Tableau calculation is going to be the deciding factor in that decision.

Top Kudoed Authors