Cloud Access to CD2

IanGoh
Community Contributor

Our data team asked the following:

 

Does CD2 keep the actual tables in S3 containers as Parquet or other file types?

And could they provide a URL to the S3 container?

Our fabric allows for linking directly to s3 containers.

 

That's above my knowledge, but I kind of doubt Instructure will give us any other access than the CD2 API.

Thanks

Ian

 

LeventeHunyadi
Instructure

Does CD2 keep the actual tables in S3 containers as Parquet or other file types?

We keep data in Parquet files, but an extra overlay on top of them ensures transactional safety and makes incremental updates possible. The storage format is therefore not plain Parquet.

And could they provide a URL to the S3 container?

Unfortunately, this is not directly possible. Part of the reason is technical: because of the extra overlay on top of the Parquet files, S3 access alone would not provide the information needed to make sense of the data (e.g. to differentiate between current and stale data).
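
To illustrate the general idea (this is only a sketch, not the actual CD2 storage format, which is not described here): assume a hypothetical transaction log that records which Parquet files were added or removed at each version. The `LogEntry` class, the `live_files` function, and the file names below are all made up for the example. A plain bucket listing shows every file ever written, while only the log says which of them are current.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    """One committed action in a hypothetical transaction log (an assumption)."""
    version: int
    action: str  # "add" or "remove"
    path: str


def live_files(log: list[LogEntry]) -> set[str]:
    """Replay the log in version order and return the files that are current."""
    current: set[str] = set()
    # sorted() is stable, so entries within one version keep their order.
    for entry in sorted(log, key=lambda e: e.version):
        if entry.action == "add":
            current.add(entry.path)
        elif entry.action == "remove":
            current.discard(entry.path)
        else:
            raise ValueError(f"unknown action: {entry.action!r}")
    return current


# What a plain bucket listing would show: every file ever written.
bucket_listing = {
    "accounts/part-0001.parquet",
    "accounts/part-0002.parquet",
    "accounts/part-0003.parquet",
}

# Hypothetical log: an incremental update replaced part-0001 with part-0003.
log = [
    LogEntry(1, "add", "accounts/part-0001.parquet"),
    LogEntry(1, "add", "accounts/part-0002.parquet"),
    LogEntry(2, "remove", "accounts/part-0001.parquet"),
    LogEntry(2, "add", "accounts/part-0003.parquet"),
]

current = live_files(log)
stale = bucket_listing - current
print("current:", sorted(current))
print("stale:  ", sorted(stale))
```

Reading all three files straight from the bucket would mix the replaced file in with the current ones, which is the kind of problem the overlay exists to prevent.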

Our fabric allows for linking directly to s3 containers.

While this could be technically feasible, a number of non-trivial aspects come into play. At least in the short and medium term, we don't plan on opening up any access other than the CD2/DAP API.
