Monday, 19 August 2013

Python unicode.splitlines() triggers at non-EOL character

I tried this in Python 2.7:
>>> s = u"some\u2028text"
>>> s
u'some\u2028text'
>>> l = s.splitlines(True)
>>> l
[u'some\u2028', u'text']
As I understand it, \u2028 is the Left-To-Right Embedding character, not \r or \n, so the
string should not be split there. Is this a bug, or am I misunderstanding something?
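
One way to check the premise is to ask the unicodedata module for the character's name. The sketch below suggests that U+2028 is LINE SEPARATOR, while Left-To-Right Embedding is U+202A. Python documents U+2028 as one of the line boundaries that splitlines() recognises on Unicode strings, which would explain the split. The helper split_ascii_lines is a hypothetical name, not part of the standard library; it assumes you only want to break at \r\n, \r, or \n while keeping the line endings.

```python
import re
import unicodedata

# Look up the official Unicode names of the two code points.
name_2028 = unicodedata.name(u"\u2028")  # 'LINE SEPARATOR'
name_202a = unicodedata.name(u"\u202a")  # 'LEFT-TO-RIGHT EMBEDDING'

# Hypothetical helper (an assumption, not a standard-library function):
# match either a run of non-newline characters followed by one line ending,
# or a final run of characters that has no line ending.
_ASCII_LINE = re.compile(u"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+")


def split_ascii_lines(text):
    """Split only at \\r\\n, \\r or \\n, keeping the line endings."""
    return _ASCII_LINE.findall(text)


s = u"some\u2028text"
print(name_2028)
print(name_202a)
print(s.splitlines(True))    # two items: split at U+2028
print(split_ascii_lines(s))  # one item: U+2028 is left alone
```

If the regular expression is unwelcome, another option under the same assumption is to split the encoded byte string instead, since byte-string splitlines() only looks at \r and \n.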
