I've got a 700MB XML file coming from a Windows provider.
As one might expect, the line endings are '\r\n' (or ^M in vi). What is the most efficient way to deal with this situation, aside from getting the supplier to send over '\n'? :-)
- Use os.linesep
- Use rstrip() (which requires opening the file and reading through it ... which seems crazy)
- Use universal newline support (this is not standard on my Mac running Snow Leopard, so it isn't an option)
I'm open to anything that requires Python 2.6+ but it needs to work on Snow Leopard and Ubuntu 9.10 with minimal external requirements. I don't mind a small performance penalty but I am looking for the standard best way to deal with this.
----edit----
The line endings are in the middle of the tag descriptors, otherwise they wouldn't be such a problem. I know this is bad form and that they shouldn't be sending this to me, but this is how I have the file and the vendor is mostly incompetent.
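To make the goal concrete, here is a minimal sketch of the kind of conversion I have in mind: stream the file in binary chunks and replace each '\r\n' pair with '\n', so the whole 700MB never has to sit in memory and it doesn't matter that the line endings fall inside tags. The function name `convert_crlf_to_lf`, its parameters, and the 1 MB chunk size are my own illustrative choices, not an established recipe, and I'm assuming a stray lone '\r' should be left untouched.

```python
import io


def convert_crlf_to_lf(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path, replacing every CRLF pair with LF.

    Hypothetical helper: works on raw bytes, so the XML encoding is irrelevant.
    """
    carry = b""  # a trailing CR whose matching LF may start the next chunk
    with io.open(src_path, "rb") as src:
        with io.open(dst_path, "wb") as dst:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                chunk = carry + chunk
                if chunk.endswith(b"\r"):
                    # Hold the CR back until we know what follows it.
                    carry = b"\r"
                    chunk = chunk[:-1]
                else:
                    carry = b""
                dst.write(chunk.replace(b"\r\n", b"\n"))
            # A lone CR at end of file is kept as-is (assumption).
            dst.write(carry)
```

Usage would be something like `convert_crlf_to_lf("feed.xml", "feed_unix.xml")` (both file names are placeholders), after which the converted copy can be handed to the parser. Holding back a trailing '\r' is what keeps a pair that straddles two chunks from being missed.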