I just happened to be running a request and got back:
{
  "id": "f6173a8b-e377-4f44-9919-3284cac40a88",
  "status": "complete",
  "objects": [
    {
      "id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00000-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"
    },
    {
      "id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00001-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"
    },
    {
      "id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00003-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"
    }
  ],
  "expires_at": "2023-07-12T17:53:50Z",
  "schema_version": 1,
  "at": "2023-07-11T17:01:02Z"
}
I didn't think anything was unusual until I went to request the object URLs and noticed I had part-00000, part-00001, and part-00003. So what happened to part-00002?
The names of the files returned by the API don't bear any special significance, and you should not rely on any pattern in them. A query operation returns a list of object identifiers, which together capture the entire result set. If you process all the objects the API call returns, you won't miss any output data. In particular, our own DAP client library completely ignores file names.
Behind the scenes, these files are generated by independent parallel processes that don't communicate with one another. Occasionally, one of these processes is terminated and must be restarted. When this happens, the new process is assigned the next value in the sequence, which leaves a gap where the terminated process's value would have been. The API call returns when all processes have completed successfully and all data is ready to be returned.
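As a rough illustration of the point above, here is a minimal Python sketch that treats each object id as an opaque string and simply collects whatever the job returns. It is not the DAP client library's code; the `object_ids` helper is a hypothetical name, and the response is the one posted in the question.

```python
import json

# The job response posted in the question, reproduced as a string.
RESPONSE_TEXT = """
{
  "id": "f6173a8b-e377-4f44-9919-3284cac40a88",
  "status": "complete",
  "objects": [
    {"id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00000-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"},
    {"id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00001-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"},
    {"id": "f6173a8b-e377-4f44-9919-3284cac40a88/part-00003-63ae619d-52d5-473d-b7b9-3e03edebe7e1-c000.json.gz"}
  ],
  "expires_at": "2023-07-12T17:53:50Z",
  "schema_version": 1,
  "at": "2023-07-11T17:01:02Z"
}
"""


def object_ids(job: dict) -> list[str]:
    """Hypothetical helper: return every object id of a completed job.

    The ids are treated as opaque strings. Nothing here parses the
    "part-0000N" portion, so a gap in the numbering makes no difference.
    """
    if job.get("status") != "complete":
        raise ValueError(f"job is not complete: {job.get('status')!r}")
    return [obj["id"] for obj in job["objects"]]


job = json.loads(RESPONSE_TEXT)
ids = object_ids(job)
print(f"{len(ids)} objects to download")
```

Every id in the list is then passed on to the download step; the count of ids, not the highest part number, tells you how many files to expect.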
I am led to believe there are some issues with returning multiple files and the processing behind it, which are being addressed somewhere in the backlog. I frequently (almost always on larger tables) get two files back: a part 0 which is empty, and a higher-numbered part (8 is the highest I think I have seen) which has the actual data. This is especially prevalent with delta requests.
Sometimes, with very large sets, there are multiple parts which actually have data in them. I wouldn't worry about the sequence numbers missing - just process all the files (even if they only have a header) and hopefully Instructure will tidy it up so that the overhead of processing effectively empty files, and the confusion of missing sequences can go away.
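For what it's worth, "just process all the files" can be written so that an effectively empty part costs almost nothing. This is only a sketch under assumptions: I'm assuming the `.json.gz` parts hold one JSON object per line, and that the downloaded bytes of each part are already in memory. `iter_records` is a made-up name, and the sample data below is invented purely for the demonstration.

```python
import gzip
import io
import json
from typing import Iterable, Iterator


def iter_records(parts: Iterable[bytes]) -> Iterator[dict]:
    """Yield records from every part, in whatever order the parts arrive.

    Assumption: each part is gzip-compressed, newline-delimited JSON.
    A part with no rows simply contributes nothing.
    """
    for raw in parts:
        with gzip.open(io.BytesIO(raw), mode="rt", encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)


# Invented sample data: an empty part followed by a part with two rows.
empty_part = gzip.compress(b"")
data_part = gzip.compress(b'{"key": 1}\n{"key": 2}\n')

records = list(iter_records([empty_part, data_part]))
print(len(records))
```

The loop never looks at file names or sequence numbers, so missing parts and empty parts are handled the same way as any other.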