Byte length discoveries

Previous post in this saga: writev

I've made some discoveries regarding the garbage that was happening last time.

Once the time (or request count, or whatever, I don't know) is up and the server starts acting weird, the problem is not related to certain endpoints. It's related to the byte length of the entire HTTP response (including the protocol line, headers, and body). If the response is between 2817 and 3327 bytes inclusive, it gets stuck and the previously discussed writev/EAGAIN loop happens. 2816 bytes and smaller is fine. 3328 bytes and larger is fine, at least as far as I can tell; there might be regions further up that cause problems, but I don't know.
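For anyone who wants to poke at this, here is a rough Python sketch of how "total response size" can be counted and how a response of an exact size could be built for testing. Only the 2817 and 3327 boundaries come from the observations above. The function names, the padding byte, and the example headers are made up for illustration, and this is not how pinski builds responses.

```python
# Boundaries reported in this post; everything else is illustrative.
STUCK_MIN = 2817  # smallest total response size seen to get stuck
STUCK_MAX = 3327  # largest total response size seen to get stuck


def build_response(body: bytes, headers: dict,
                   status_line: str = "HTTP/1.1 200 OK") -> bytes:
    """Serialize a whole HTTP/1.1 response: status line, headers, blank line, body."""
    head_lines = [status_line] + [f"{k}: {v}" for k, v in headers.items()]
    head = "\r\n".join(head_lines) + "\r\n\r\n"
    return head.encode("latin-1") + body


def in_stuck_range(total_bytes: int) -> bool:
    """True if a response of this total size falls in the reported stuck range."""
    return STUCK_MIN <= total_bytes <= STUCK_MAX


def response_of_size(target_total: int, headers: dict) -> bytes:
    """Build a response that is exactly target_total bytes long (hypothetical helper).

    The Content-Length value changes the header size when its digit count
    changes, so the body length is recomputed until it stops moving.
    """
    body_len = 0
    for _ in range(10):
        hdrs = {**headers, "Content-Length": str(body_len)}
        overhead = len(build_response(b"", hdrs))
        new_len = target_total - overhead
        if new_len < 0:
            raise ValueError("target is smaller than the headers alone")
        if new_len == body_len:
            return build_response(b"x" * body_len, hdrs)
        body_len = new_len
    raise ValueError("no body length gives exactly this total size")


if __name__ == "__main__":
    for size in (2816, 2817, 3327, 3328):
        resp = response_of_size(size, {"Content-Type": "text/plain"})
        print(size, len(resp), in_stuck_range(len(resp)))
```

Serving responses like these from a server in the broken state, one size at a time, would be one way to check whether the boundaries are really exact.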

It does not matter whether, in pinski, I pipe a response stream or write and end the data manually: both have the same problem in the broken state. The only lead I currently have is that 2816 and 3328 are multiples of 256, and 3328 - 2816 = 512. 256 and 512 are powers of 2, which means they're important computer numbers. There's definitely some computer fuckery going on here. This isn't random.
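The arithmetic is easy to double-check. The snippet below only restates the numbers above in Python; the variable names are made up, and the midpoint line is just an extra observation that may or may not mean anything.

```python
LAST_OK_BELOW = 2816   # largest size below the stuck range that works
FIRST_OK_ABOVE = 3328  # smallest size above the stuck range that works

# Both boundaries are exact multiples of 256.
below_quot, below_rem = divmod(LAST_OK_BELOW, 256)
above_quot, above_rem = divmod(FIRST_OK_ABOVE, 256)

# Distance between the two working boundaries.
gap = FIRST_OK_ABOVE - LAST_OK_BELOW

# Number of distinct sizes that get stuck (2817 through 3327 inclusive).
stuck_count = len(range(LAST_OK_BELOW + 1, FIRST_OK_ABOVE))

# The window happens to be centred on 3 * 1024.
midpoint = (LAST_OK_BELOW + FIRST_OK_ABOVE) // 2

print(below_quot, below_rem)   # 11 0
print(above_quot, above_rem)   # 13 0
print(gap)                     # 512
print(stuck_count)             # 511
print(midpoint)                # 3072
print(hex(LAST_OK_BELOW), hex(FIRST_OK_ABOVE))  # 0xb00 0xd00
```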

I have found exactly ONE POST on THE ENTIRE INTERNET from somebody with the same problem, and it's on a mailing list that was last posted to in 2010. I am ACTUALLY LOSING MY MIND.

The size of the HTTP request doesn't seem to matter.

I'm running an experiment at the moment that I don't want to talk too much about, but it should help me tell whether one feature of the code could be causing this problem or not. The experiment will be active for a couple of days. The experiment is not in git. If you already use this instance, please continue to use it, but if for some reason it becomes unusable as a result of the experiment (unlikely!) then remember that you can switch to another instance which doesn't have it.

I'm really at a loss for what could be causing this. If the experiment shows that the suspicious feature of the code is not to blame, I think I'm going to move to a brand new server and see if it persists there, because nobody else seems to be having problems with it, so maybe it's something to do with my environment.

If you have absolutely any idea what could be causing this issue, then please send help ASAP. You know how to talk to me. If you need more information about the problem, I have chat logs where I discuss it that I can send to you if you're interested in helping.

— Cadence
