I'm tempted to say that it isn't possible. Anytime you're computing O(n log n) amount of information but have (1) nowhere to stash it (constant space), and (2) nowhere to immediately use it (O(n) moves), there is something weird going on, possibly involving heavy use of the stack (which may not be included in the space analysis, although it should be).
It might be possible if you store temporary information inside many bits of just one integer, or something squirrelly like that. (So O(1) in practice, but O(log n) in theory.)
Radix sorting wouldn't quite do it because you'd have to call the predicate to create the digits, and if you don't memoize the transitivity of comparison then you'll call it O(n^2) times. (But to memoize it takes O(log n) amortized space per item, I think.)
QED - Proof by lack of imagination :)
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…