Hi --
I'm trying to run my DAP init/sync process in AWS Lambda, which does not have a writable filesystem by default. This worked fine up to DAP Client version 0.3.15, but more recent versions (including the current 0.3.18) give me an error, presumably because the library tries to create an "instructure_dap_temp" temporary directory, which fails:
{
"errorMessage": "[Errno 2] No such file or directory: 'instructure_dap_temp'",
"errorType": "FileNotFoundError",
"requestId": "229de7b3-d0c2-4615-a4cd-5348373e3857",
"stackTrace": [
" File \"/var/task/app.py\", line 55, in lambda_handler\n asyncio.get_event_loop().run_until_complete(\n",
" File \"/var/lang/lib/python3.11/asyncio/base_events.py\", line 653, in run_until_complete\n return future.result()\n",
" File \"/var/task/dap/actions/sync_db.py\", line 16, in sync_db\n await SQLReplicator(session, db_connection).synchronize(\n",
" File \"/var/task/dap/replicator/sql.py\", line 120, in synchronize\n await client.download(\n",
" File \"/var/task/dap/downloader.py\", line 192, in download\n with use_and_remove(DEFAULT_FOLDER) as directory:\n",
" File \"/var/lang/lib/python3.11/contextlib.py\", line 155, in __exit__\n self.gen.throw(typ, value, traceback)\n",
" File \"/var/task/dap/downloader.py\", line 39, in use_and_remove\n rmtree(directory)\n",
" File \"/var/lang/lib/python3.11/shutil.py\", line 722, in rmtree\n onerror(os.lstat, path, sys.exc_info())\n",
" File \"/var/lang/lib/python3.11/shutil.py\", line 720, in rmtree\n orig_st = os.lstat(path, dir_fd=dir_fd)\n"
]
}
In Lambda I can provide a read/write filesystem, but it must be mounted at a path under /mnt/. So -- my request is: in a future version of the DAP Client, could the location of that "instructure_dap_temp" folder be overridable? In theory that would let me place it on the read/write filesystem and things should work as before.
That value is set at line 26 of downloader.py:
DEFAULT_FOLDER: str = "instructure_dap_temp"
I'd like a clean way to be able to set that to "/mnt/instructure_dap_temp" or something similar.
cc: @LeventeHunyadi
Thanks!
--Colin
In newer versions of the DAP client library, the development team changed how data is transferred from server-side DAP to the client-side database. Previously, downloaded result-set data was inserted into the database directly, without extra hops. In newer versions, data is first written to a file, and then read back from the file to be inserted into the database. This shortens the time a network connection has to remain open, even when database insertion is slow, which lowers the chance of a connection being aborted.
The typical pattern for configuring global settings in a Python library is to have a top-level variable in one of the Python source files that acts as the library interface. In other words, DEFAULT_FOLDER would move from downloader.py (an implementation detail of the library) to api.py (the outward-facing interface of the library), and become a documented variable (possibly with a more descriptive name). You could then assign the variable via dap.api.DEFAULT_FOLDER. Since the library would read the value from the global variable, the library would use whatever value you assigned, provided you set it in your application before triggering any interaction with files or databases.
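To make the suggestion above concrete, here is a minimal sketch of that pattern. It is not the DAP client's actual code: the `api` namespace is a hypothetical stand-in for a public settings module such as dap.api, and `use_and_remove` and `download_to_temp` are simplified illustrations. The important detail is that the setting is read when the function is called, not when the module is imported.

```python
import os
from contextlib import contextmanager
from shutil import rmtree
from types import SimpleNamespace

# Hypothetical stand-in for a public settings module such as dap.api.
# This thread only proposes such a setting; it is an assumption here.
api = SimpleNamespace(DEFAULT_FOLDER="instructure_dap_temp")


@contextmanager
def use_and_remove():
    """Create the configured temp folder, yield it, then delete it."""
    # Read the setting at call time, so an assignment made by the
    # application after import is still picked up.
    directory = api.DEFAULT_FOLDER
    os.makedirs(directory, exist_ok=True)
    try:
        yield directory
    finally:
        rmtree(directory, ignore_errors=True)


def download_to_temp(payload: bytes) -> int:
    """Write payload into the temp folder and return the bytes written."""
    with use_and_remove() as directory:
        path = os.path.join(directory, "part-0.bin")
        with open(path, "wb") as f:
            f.write(payload)
        return os.path.getsize(path)
```

With something like this in place, an application would assign `api.DEFAULT_FOLDER = "/tmp/instructure_dap_temp"` once at startup, before the first download.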
I am no longer actively involved in any work related to the DAP client library, but I have passed on your request to the development team for consideration, with a suggested approach they might want to take.
Hi Colin,
Thanks for bringing this to my attention before I eventually update to 0.3.18 (we are also running on AWS Lambda). Out of curiosity, is there a reason you are using /mnt rather than the ephemeral storage at /tmp? From my understanding, your Lambda function should be able to read/write files in /tmp without any additional configuration, although it's not clear that that is where they're actually trying to create "instructure_dap_temp" regardless.
Hi @jwals --
That's a good point; I'd forgotten that /tmp is available for read/write. But yeah -- it's not clear to me exactly where the library is trying to create its temp folder, and being able to explicitly place it in /tmp/ would be nice.
I'd love to compare notes with you on running this on Lambda! FWIW, even after reverting to 0.3.15 I'm still getting pretty frequent errors that seem like they're related to downloading the data files from the service. My very rough/work-in-progress code is here: https://github.com/Harvard-University-iCommons/canvas-data-2/tree/feature/mvp
--Colin
I'd be happy to chat about AWS & CD2 (my very early iterations were based on your post here 😉). I can't publicly share our Git repo for various reasons, but you can reach me at jwals23 [AT] emory [DOT] edu if you want to connect!
Your mention of /tmp/ was very helpful -- I was able to get the Lambda function to work properly with DAP client version 0.3.18 by adding os.chdir("/tmp/") before calling the client function.
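For anyone who wants to copy this workaround, here is a rough sketch of how it might look in a handler. It relies on "instructure_dap_temp" being a relative path, so it resolves against the current working directory (/var/task in the traceback above, which is read-only in Lambda). The `working_directory` helper and the `run_sync` placeholder are my own illustrative names, not part of the DAP client.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path="/tmp"):
    """Temporarily make `path` the current working directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield path
    finally:
        # Restore the old directory even if the body raises.
        os.chdir(previous)


def run_sync():
    # Placeholder for the real DAP init/sync call (hypothetical).
    # Relative paths created in here land under the current directory.
    return os.getcwd()


def lambda_handler(event, context):
    # Run the sync with /tmp as the working directory so the relative
    # temp folder is created on Lambda's writable ephemeral storage.
    with working_directory("/tmp"):
        return run_sync()
```

A plain os.chdir("/tmp/") at the top of the handler does the same job; the context manager just restores the previous directory afterwards. On Python 3.11 and later, contextlib.chdir offers the same behavior out of the box.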
Thanks!
--Colin
FYI for anyone looking for the repo above: we've made that one private and created a new, more generic version of our CD2 code here:
https://github.com/Harvard-University-iCommons/canvas-data-2-aws/tree/develop