While at Everything Open, I got to catch up with Rob N and chat about what we had each been working on over the last few years. We got talking about a really interesting issue he was having with a ZFS patch he'd been working on: he couldn't find what was decrementing a reference counter too far and causing issues.
While chatting about this I had the idea: "Why not change the amount the counter increments and decrements?" That way, if one call site incs/decs by 1 and another does the same by 2, then ending up at minus 2 tells you which caller was being triggered too often, and you can follow up with other debugging options to work out why the dec was called too many times.
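To make that concrete, here is a minimal sketch in Python (not the ZFS code, which is C). The call-site names `site_a` and `site_b`, the `WeightedRefcount` class and the `suspects` helper are all made up for illustration, and it assumes each call site always pairs its own inc with its own dec.

```python
# Hypothetical call sites and per-site weights, for illustration only.
WEIGHTS = {"site_a": 1, "site_b": 2}


class WeightedRefcount:
    """A reference counter where each call site incs/decs by its own weight."""

    def __init__(self):
        self.value = 0

    def inc(self, site):
        self.value += WEIGHTS[site]

    def dec(self, site):
        self.value -= WEIGHTS[site]


def suspects(residue):
    """Call sites whose weight matches exactly one extra dec."""
    return [site for site, weight in WEIGHTS.items() if residue == -weight]


rc = WeightedRefcount()
rc.inc("site_a")
rc.dec("site_a")
rc.inc("site_b")
rc.dec("site_b")
rc.dec("site_b")  # the bug: one dec too many at site_b

print(rc.value)            # -2
print(suspects(rc.value))  # ['site_b']
```

With a plain inc/dec by 1 the counter would have ended at minus 1 and told you nothing about who did it.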
Another interesting suggestion was to increment and decrement by prime numbers. With small values this can still fail if one call site is triggered multiple times: two extra decs at the call site that uses 1 look exactly like one extra dec at the call site that uses 2. So some creative shuffling of the values could end up being needed to bisect the offending caller in that case. Generally, large prime numbers avoid this potential issue, and if your counter is 64-bit the likelihood of an overflow or underflow is pretty low.
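Here is a rough sketch of both halves of that, again in Python with made-up call-site names. It assumes only one call site is misbehaving and that the number of extra calls is far smaller than the primes used; the `decode` helper is hypothetical, not something from ZFS.

```python
# Hypothetical call sites. The small weights show the ambiguity, the
# large ones are three well-known primes around 1e9.
SMALL = {"site_a": 1, "site_b": 2}
LARGE = {"site_a": 998_244_353, "site_b": 1_000_000_007, "site_c": 1_000_000_009}


def decode(residue, weights):
    """List (site, imbalance) pairs that could explain a leftover value.

    A negative imbalance means that many extra decs, a positive one
    means that many extra incs.
    """
    if residue == 0:
        return []
    return [
        (site, residue // weight)
        for site, weight in weights.items()
        if residue % weight == 0
    ]


# Small weights: two extra decs at site_a and one extra dec at site_b
# both leave -2, so the residue cannot tell them apart.
print(decode(-2, SMALL))  # [('site_a', -2), ('site_b', -1)]

# Large primes: three extra decs at site_b leave a value only site_b
# can explain.
residue = -3 * LARGE["site_b"]
print(decode(residue, LARGE))  # [('site_b', -3)]
```

The trade-off is headroom: with weights around 1e9, a signed 64-bit counter still has room for roughly nine billion live references before it wraps.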
So I hope that helps someone with a future reference count issue.