Peter Gabriel Wants You to Re-Shock the Monkey
The heart of music compression is exploiting masking effects - a loud sound obscures quieter sounds that occur near the same time and frequency. When compressing a mixed-down song, the encoder will not bother to encode the sound of e.g. a clarinet at the moment a cymbal crashes, because you wouldn't be able to hear it anyway. This is one of the ways MP3 saves bits, and encoding the tracks separately would prevent this from happening.
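To make the masking idea concrete, here is a toy sketch in Python. It is not the psychoacoustic model any real MP3 encoder uses; the function name `audible_components` and the thresholds `max_ratio` and `drop_db` are made up for illustration. It treats one instant of audio as a list of (frequency, loudness) pairs and drops any component that sits close in frequency to a much louder one.

```python
def audible_components(components, max_ratio=1.3, drop_db=30.0):
    """Toy masking filter (hypothetical thresholds, not a real codec model).

    components: list of (frequency_hz, level_db) pairs for one instant.
    A component is treated as masked if some other component is at least
    drop_db louder and within a frequency ratio of max_ratio.
    """
    kept = []
    for freq, level in components:
        masked = any(
            other_level - level >= drop_db
            and max(freq, other_freq) / min(freq, other_freq) <= max_ratio
            for other_freq, other_level in components
        )
        if not masked:
            kept.append((freq, level))
    return kept


# Cymbal crash near 8 kHz, a quiet clarinet overtone next to it,
# and a quiet low note far away in frequency.
frame = [(8000.0, 90.0), (8500.0, 50.0), (440.0, 55.0)]
print(audible_components(frame))
```

In the mixed-down frame the overtone at 8500 Hz is dropped because the cymbal sits right next to it, while the 440 Hz note survives. If the clarinet were encoded as its own track, there would be no cymbal in that track to mask it, so the encoder would have to keep it.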
Re: your first point about entropy -- the entropy of a downmixed track is less than or equal to the sum of the entropies of the individual tracks, because the mix is a function of those tracks and cannot carry more information than they do together. So encoding the tracks separately would generally require more space for the same quality.
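A quick way to check the inequality numerically is sketched below. It is a simplification: it uses made-up random integer "tracks" and measures only per-sample empirical entropy, ignoring correlation over time and anything a real codec does. The names `entropy_bits`, `track_a`, and `track_b` are hypothetical.

```python
import math
import random
from collections import Counter


def entropy_bits(symbols):
    """Empirical Shannon entropy of a list of symbols, in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


random.seed(0)
# Two made-up quantized tracks with small integer sample values.
track_a = [random.randint(-8, 8) for _ in range(10000)]
track_b = [random.randint(-8, 8) for _ in range(10000)]
# The downmix is a sample-by-sample sum of the two tracks.
downmix = [a + b for a, b in zip(track_a, track_b)]

h_a = entropy_bits(track_a)
h_b = entropy_bits(track_b)
h_mix = entropy_bits(downmix)

print(f"H(a) + H(b) = {h_a + h_b:.3f} bits/sample")
print(f"H(a + b)    = {h_mix:.3f} bits/sample")
```

The downmix figure should never exceed the sum, since H(a + b) <= H(a, b) <= H(a) + H(b) holds for any joint distribution, including the empirical one measured here.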