There is also a fairly large problem: I can't test profiles because they're dynamic. [...] Somehow I need to include a system where responses from Instagram are saved to disk, and then I can choose to use those responses instead of live data when testing in the future, essentially testing against a static version of Instagram that won't break on me. This will probably be annoying to do.
It wasn't as annoying as I expected, but it could definitely go that way, since responses are keyed only by their full URL, not by headers, timings, protocols, IP addresses, or other metadata. This is fine for now, but I suspect I'll need to make the matching more general or more specific in the future, and then I'll be in for a world of pain, because HTTP caching is really stupid to try to do well.
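To make the URL-keyed approach concrete, here is a minimal sketch of a record/replay store in Node. It is not the code this site uses: the function names, the hashed-filename layout, and the synchronous `fetchLive` callback are all assumptions made for illustration (a real fetch would be asynchronous).

```javascript
const crypto = require("crypto");
const fs = require("fs");
const path = require("path");

// Hypothetical helper: map a full URL to a file path inside `dir`.
// Hashing keeps the filename short and free of characters like "/" and "?".
function pathForUrl(dir, url) {
  const name = crypto.createHash("sha256").update(url).digest("hex");
  return path.join(dir, name + ".json");
}

// Save a response body to disk, keyed only by its full URL.
function saveResponse(dir, url, body) {
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(pathForUrl(dir, url), JSON.stringify({ url, body }));
}

// Return the stored body for this URL, or null if nothing was saved.
function loadResponse(dir, url) {
  const file = pathForUrl(dir, url);
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, "utf8")).body;
}

// In replay mode only stored responses are used. Otherwise the
// caller-supplied `fetchLive` (assumed synchronous here) fetches
// live data, which is recorded for future test runs.
function getResponse(dir, url, fetchLive, replay) {
  if (replay) {
    const stored = loadResponse(dir, url);
    if (stored === null) throw new Error("No stored response for " + url);
    return stored;
  }
  const body = fetchLive(url);
  saveResponse(dir, url, body);
  return body;
}
```

Because the key is the exact URL string, two requests that differ only in query parameter order or in headers are treated as different (or identical) in ways a real server might not agree with, which is where the matching would have to get smarter.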
The problem I'm stuck on now is that contributors will obviously need a copy of the database and of the stored responses in order to run the tests. I don't want to include these in the main repo because that would force people who just want to run this site, and not its tests, to download several megabytes of database data and stored responses anyway, which is definitely not good.
I'm considering either an npm package or git submodules, but I'm not entirely sold on either of those. I think git submodules will be easier to use, but for contributors, remembering to download them in the first place and then update them in the future is annoying. An npm package would be easy for contributors, since I can make it a dev dependency (--save-dev) and it'll be updated whenever someone runs npm install, but it'll be annoying to upload new data to unless I can get clever about it.
Or I could put a full git repo (not a submodule) inside the main repo's directory, one that isn't cloned along with it, and have npm test clone or pull the inner repo first. This sounds like the most convenient option, but also the most janky.
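As a rough sketch of that last option, a small Node script could decide between cloning and pulling based on whether the inner repo already exists. The function names, the directory, and the repo URL below are placeholders, not anything that exists yet. One way to run it before the tests would be npm's "pretest" script, which npm runs automatically before "test".

```javascript
const { execFileSync } = require("child_process");
const fs = require("fs");
const path = require("path");

// Hypothetical helper: choose the git arguments for syncing the test data.
// If `dir` already contains a .git entry, pull; otherwise clone into it.
function gitArgsFor(dir, repoUrl) {
  if (fs.existsSync(path.join(dir, ".git"))) {
    return ["-C", dir, "pull", "--ff-only"];
  }
  return ["clone", repoUrl, dir];
}

// Run git with the chosen arguments, showing its output in the terminal.
function syncTestData(dir, repoUrl) {
  execFileSync("git", gitArgsFor(dir, repoUrl), { stdio: "inherit" });
}

// Example call (both values are placeholders):
// syncTestData("test/data", "https://example.com/test-data.git");
```

The inner directory would presumably also need to be listed in the outer repo's .gitignore so the test data never gets committed to the main repo by accident.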
I may leave this decision alone for a few days and come back to it later.