5.1.7 Carriage Return, Line Feed and New Line

New lines are represented on different platforms by carriage return (CR), line feed (LF), CRLF, or new line (NEL).

Unfortunately, not only are new lines represented by different characters on different platforms, they also have ambiguous behaviour even on the same platform.

Especially with the advent of the web, where text on a single machine can arise from many sources, this causes a significant problem.

When text is transcoded from another character set, these characters are usually mapped directly to the corresponding Unicode code points; this means that even programs handling pure Unicode have to deal with the problems.
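As a rough illustration of what "dealing with the problems" can look like in a program that receives Unicode text from mixed sources, the sketch below maps CR, LF, CRLF, NEL, LS and PS to a single convention. The function name `normalize_newlines` and the choice of LF as the target are assumptions made for this example, not something this section prescribes. Treating PS as an ordinary line break is a simplification, and VT and FF are deliberately left untouched.

```python
import re

# Line-break sequences handled by this sketch. CRLF is listed first so that
# the pair is treated as one break rather than two.
_NEWLINES = re.compile("\r\n|[\r\n\u0085\u2028\u2029]")


def normalize_newlines(text: str, newline: str = "\n") -> str:
    """Replace CR, LF, CRLF, NEL, LS and PS with one newline convention.

    Hypothetical helper for illustration; the default target (LF) is an
    assumption and can be changed through the `newline` argument.
    """
    return _NEWLINES.sub(newline, text)


sample = "one\r\ntwo\rthree\nfour\u0085five\u2028six"
print(normalize_newlines(sample).split("\n"))
```

A program that needs to preserve the distinction between line and paragraph breaks would have to handle PS (and possibly LS) separately instead of folding it into the same target.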

 

        Unicode      ASCII    EBCDIC 1   EBCDIC 2
CR      000D         0D       0D         0D
LF      000A         0A       25         15
CRLF    000D 000A    0D 0A    0D 25      0D 15
NEL     0085         85       15         25
VT      000B         0B       0B         0B
FF      000C         0C       0C         0C
LS      2028         n/a      n/a        n/a
PS      2029         n/a      n/a        n/a

 

There are two mappings of LF and NEL used by EBCDIC systems.

The first EBCDIC column shows the MVS Open Edition (including CP1047) mapping of these characters, while the second column shows the CDRA mapping.

This difference arises from the use of the LF character as 'New Line' in ASCII-based Unix environments and in some data transfer protocols that use the Unix assumptions.

The second column is based on the standardized definitions of LF in both ASCII and EBCDIC.

NEL is not actually defined in ASCII: it is defined in ISO 6429 as a C1 control.
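The byte values discussed above can be checked against a real codec. The sketch below uses Python's built-in `cp037` codec, which is an assumption of this example (the text names CP1047, which Python does not ship); `cp037` happens to follow the mapping in which LF is 25 and NEL is 15. The helper name `describe` is hypothetical.

```python
# Assumption: cp037 stands in for "an EBCDIC code page"; it is not the
# CP1047 code page mentioned in the text.
EBCDIC_CODEC = "cp037"


def describe(byte_value: int, codec: str) -> str:
    """Return the code point (as U+XXXX) that a single byte decodes to."""
    char = bytes([byte_value]).decode(codec)
    return f"U+{ord(char):04X}"


# EBCDIC bytes for CR, and for LF and NEL under the LF=25 / NEL=15 mapping.
for label, value in [("0D", 0x0D), ("25", 0x25), ("15", 0x15)]:
    print(label, "->", describe(value, EBCDIC_CODEC))

# NEL lies outside 7-bit ASCII; an 8-bit set such as ISO 8859-1 carries it
# as the C1 control byte 0x85.
print("85 ->", describe(0x85, "latin-1"))
```

With this codec, byte 25 decodes to U+000A and byte 15 to U+0085. Decoding the byte 0x85 with the strict `ascii` codec raises `UnicodeDecodeError`, which is consistent with NEL not being part of ASCII proper.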

For more information refer to:

www.w3.org/TR/newline

www.unicode.org/unicode/reports/tr13/tr13-5.html