I should clarify: I wasn't actually writing my own malloc, I was only macro-switching it so that I could keep a list of everything malloc returned. I was also intercepting free, so I could see which objects had been malloc'd but never free'd. I think I mentioned it was to track down memory leaks.
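For anyone curious, that macro trick can be sketched roughly like this. This is a hypothetical reconstruction, not my actual code: the names (`tracked_malloc`, `alloc_log`, `demo`, etc.) are made up for illustration, and a real leak tracker would want a hash table rather than a fixed array.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 1024

/* One record per live allocation: pointer, size, and call site. */
typedef struct {
    void *ptr;
    size_t size;
    const char *file;
    int line;
} alloc_rec;

static alloc_rec alloc_log[MAX_TRACKED];
static int alloc_count = 0;

static void *tracked_malloc(size_t size, const char *file, int line) {
    void *p = malloc(size);   /* still the real malloc at this point */
    if (p && alloc_count < MAX_TRACKED)
        alloc_log[alloc_count++] = (alloc_rec){p, size, file, line};
    return p;
}

static void tracked_free(void *p) {
    for (int i = 0; i < alloc_count; i++) {
        if (alloc_log[i].ptr == p) {
            /* Drop the record by moving the last one into its slot. */
            alloc_log[i] = alloc_log[--alloc_count];
            break;
        }
    }
    free(p);
}

/* From here on, every call site's malloc/free is silently rerouted
   through the tracking versions, without touching the calling code. */
#define malloc(sz) tracked_malloc((sz), __FILE__, __LINE__)
#define free(p)    tracked_free(p)

/* Anything still in alloc_log at exit was malloc'd but never free'd. */
static void report_leaks(void) {
    for (int i = 0; i < alloc_count; i++)
        printf("leak: %zu bytes allocated at %s:%d\n",
               alloc_log[i].size, alloc_log[i].file, alloc_log[i].line);
}

static void demo(void) {
    char *a = malloc(16);
    char *b = malloc(32);   /* never freed: shows up as a leak */
    free(a);
    (void)b;
}
```

Calling `demo()` and then `report_leaks()` prints one leak record for the 32-byte allocation that was never freed.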
The example I quoted from Python's implementation, though, seems to hold truer: rather than calling malloc whenever it needs more space, it allocates buffers and hands out memory from them. That reduces the frequency of calls to malloc, but it also means the garbage collector sometimes can't return as much RAM as you'd hope (since there are nearly-empty buffers that can't be released). This makes running Python instructions that need a little more RAM much faster, at the expense of complexity when RAM is short, of course. And overall the program should be faster, since it makes fewer system calls (assuming the code that hands out RAM is well optimised, which should be possible since it can be much simpler than the system malloc).
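The buffer-and-hand-out scheme can be sketched like this. To be clear, this is a toy in the spirit of CPython's pymalloc, not its actual code: the names and sizes (`pool_alloc`, `CHUNK_SIZE`, 128 chunks per buffer) are my own assumptions. One system malloc serves many requests, and a freed chunk goes back on a free list for reuse, which is exactly why a mostly-empty buffer still can't be handed back to the OS.

```c
#include <stdlib.h>

#define CHUNK_SIZE      32    /* every request gets a fixed-size chunk */
#define CHUNKS_PER_BUF 128    /* one malloc serves this many requests  */

/* A free chunk stores the free-list link inside itself. */
typedef struct chunk { struct chunk *next; } chunk;

static char  *buffer    = NULL;  /* the one big malloc'd arena */
static chunk *free_list = NULL;
static int    in_use    = 0;

static void *pool_alloc(void) {
    if (!free_list) {
        /* Only here do we touch the system allocator: one call
           provisions the next CHUNKS_PER_BUF allocations. */
        buffer = malloc((size_t)CHUNK_SIZE * CHUNKS_PER_BUF);
        if (!buffer) return NULL;
        for (int i = CHUNKS_PER_BUF - 1; i >= 0; i--) {
            chunk *c = (chunk *)(buffer + i * CHUNK_SIZE);
            c->next = free_list;
            free_list = c;
        }
    }
    chunk *c = free_list;
    free_list = c->next;
    in_use++;
    return c;
}

static void pool_free(void *p) {
    /* The chunk becomes reusable, but the buffer it lives in stays
       allocated even if every other chunk in it is free too. */
    chunk *c = p;
    c->next = free_list;
    free_list = c;
    in_use--;
}
```

A freed chunk is handed straight back out on the next request (the free list is LIFO), so a tight allocate/free loop never goes near the system malloc at all.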