A Rust crate to detect, normalize, and convert line endings across platforms. Ensures consistent handling of LF, CRLF, and CR line endings in text processing.
jzombie/rust-line-ending
A Rust crate to detect, normalize, and convert line endings across platforms, including support for character streams. Ensures consistent handling of `LF`, `CRLF`, and `CR` line endings in text processing.
cargo add line-ending
Line endings can be auto-detected for the current platform.

    use line_ending::LineEnding;

    let detected = LineEnding::from_current_platform();

    #[cfg(target_os = "windows")]
    assert_eq!(detected, LineEnding::CRLF, "Windows should detect CRLF");

    #[cfg(target_family = "unix")]
    assert_eq!(detected, LineEnding::LF, "Unix/macOS should detect LF");

    #[cfg(target_family = "wasm")]
    assert_eq!(detected, LineEnding::LF, "WASM should default to LF");

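As a rough illustration of how a platform default can be chosen at compile time, here is a minimal sketch that uses only the standard library. The function name `platform_newline` is hypothetical and is not part of this crate; the crate's own implementation may differ.

```rust
/// Hypothetical helper (not part of the `line-ending` crate): returns the
/// conventional newline sequence for the compilation target.
fn platform_newline() -> &'static str {
    // `cfg!(windows)` is resolved at compile time for the target platform.
    if cfg!(windows) {
        "\r\n"
    } else {
        // Assumption for this sketch: every non-Windows target uses LF.
        "\n"
    }
}
```

Because the check is resolved at compile time, the result reflects the target the binary was built for, not the origin of any particular file.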
Split a string into a vector of strings using the auto-detected line ending parsed from the string.

    use line_ending::LineEnding;

    let crlf = LineEnding::split("first\r\nsecond\r\nthird");
    let cr = LineEnding::split("first\rsecond\rthird");
    let lf = LineEnding::split("first\nsecond\nthird");

    let expected = vec!["first", "second", "third"];

    assert_eq!(crlf, expected);
    assert_eq!(cr, expected);
    assert_eq!(lf, expected);

Join a vector of strings using the specified line ending.

    use line_ending::LineEnding;

    let lines = vec![
        "first".to_string(),
        "second".to_string(),
        "third".to_string(),
    ];

    assert_eq!(LineEnding::CRLF.join(lines.clone()), "first\r\nsecond\r\nthird");
    assert_eq!(LineEnding::CR.join(lines.clone()), "first\rsecond\rthird");
    assert_eq!(LineEnding::LF.join(lines.clone()), "first\nsecond\nthird");

Apply a specific line ending type to an existing string.

    use line_ending::LineEnding;

    let mixed_text = "first line\r\nsecond line\rthird line\nfourth line\n";

    assert_eq!(
        LineEnding::CRLF.apply(mixed_text),
        "first line\r\nsecond line\r\nthird line\r\nfourth line\r\n"
    );
    assert_eq!(
        LineEnding::CR.apply(mixed_text),
        "first line\rsecond line\rthird line\rfourth line\r"
    );
    assert_eq!(
        LineEnding::LF.apply(mixed_text),
        "first line\nsecond line\nthird line\nfourth line\n"
    );

Detect the predominant line ending style used in the input string.

    use line_ending::LineEnding;

    let crlf = "first line\r\nsecond line\r\nthird line";
    let cr = "first line\rsecond line\rthird line";
    let lf = "first line\nsecond line\nthird line";

    assert_eq!(LineEnding::from(crlf), LineEnding::CRLF);
    assert_eq!(LineEnding::from(cr), LineEnding::CR);
    assert_eq!(LineEnding::from(lf), LineEnding::LF);

Convert all line endings in a string to `LF` (`\n`) for consistent processing.

    use line_ending::LineEnding;

    let crlf = "first\r\nsecond\r\nthird";
    let cr = "first\rsecond\rthird";
    let lf = "first\nsecond\nthird";

    assert_eq!(LineEnding::normalize(crlf), lf);
    assert_eq!(LineEnding::normalize(cr), lf);
    assert_eq!(LineEnding::normalize(lf), lf);

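To make the idea behind normalization concrete, the following is a minimal standard-library sketch. The name `normalize_to_lf` is hypothetical, and this is only one way to get the behavior described above, not necessarily how the crate does it.

```rust
/// Hypothetical stand-in for normalization (not the crate's code):
/// converts every CRLF and every lone CR into LF.
fn normalize_to_lf(text: &str) -> String {
    // Replace CRLF first. If lone CRs were replaced first, each CRLF
    // would turn into two LFs instead of one.
    text.replace("\r\n", "\n").replace('\r', "\n")
}
```

The order of the two replacements matters: handling the two-character sequence before the single character keeps the number of line breaks unchanged.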
Restore line endings in a string to the specified type.

    use line_ending::LineEnding;

    let lf = "first\nsecond\nthird";

    let crlf_restored = LineEnding::CRLF.denormalize(lf);
    let cr_restored = LineEnding::CR.denormalize(lf);
    let lf_restored = LineEnding::LF.denormalize(lf);

    assert_eq!(crlf_restored, "first\r\nsecond\r\nthird");
    assert_eq!(cr_restored, "first\rsecond\rthird");
    assert_eq!(lf_restored, "first\nsecond\nthird");

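Restoring line endings can be pictured as a single replacement over LF-only text. The sketch below uses only the standard library; `restore_endings` is a hypothetical name and the crate's actual implementation may differ.

```rust
/// Hypothetical stand-in for denormalization (not the crate's code).
/// Assumes `lf_text` contains only LF line endings.
fn restore_endings(lf_text: &str, target: &str) -> String {
    // Every LF becomes the target sequence, e.g. "\r\n" or "\r".
    lf_text.replace('\n', target)
}
```

In this sketch, input that already contains `\r\n` would end up with doubled carriage returns, so such text would need to be normalized to `LF` first.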
When a string contains multiple types of line endings (`LF`, `CRLF`, and `CR`), the `LineEnding::from` method will detect the most frequent line ending type and return it as the dominant one. This ensures a consistent approach to mixed-line-ending detection.

    use line_ending::LineEnding;

    let mixed_type = "line1\nline2\r\nline3\nline4\nline5\r\n";
    assert_eq!(LineEnding::from(mixed_type), LineEnding::LF); // `LF` is the most common

The detection algorithm works as follows:
- Counts occurrences of each line ending type (`LF`, `CRLF`, `CR`).
- Selects the most frequent one as the detected line ending.
- Defaults to `CRLF` if all are equally present or if the input is empty.
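The steps above can be sketched with the standard library alone. Everything in this block is hypothetical and for illustration only: the `Kind` enum stands in for the crate's `LineEnding`, and `count_endings` and `detect` are not crate functions. The sketch also assumes the `CRLF` > `CR` > `LF` priority applies to partial ties (for example, when only two types are tied), which this README does not state explicitly.

```rust
/// Hypothetical stand-in for the crate's `LineEnding` enum.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Kind {
    Lf,
    Crlf,
    Cr,
}

/// Counts line endings and returns `(crlf, cr, lf)`.
/// A `\r\n` pair counts once as CRLF, never as a CR plus an LF.
fn count_endings(text: &str) -> (usize, usize, usize) {
    // Scanning bytes is safe here: in UTF-8, the bytes for `\r` and `\n`
    // never occur inside a multi-byte character.
    let bytes = text.as_bytes();
    let (mut crlf, mut cr, mut lf) = (0usize, 0usize, 0usize);
    let mut i = 0;
    while i < bytes.len() {
        match bytes[i] {
            b'\r' if bytes.get(i + 1) == Some(&b'\n') => {
                crlf += 1;
                i += 2; // skip both bytes of the pair
            }
            b'\r' => {
                cr += 1;
                i += 1;
            }
            b'\n' => {
                lf += 1;
                i += 1;
            }
            _ => i += 1,
        }
    }
    (crlf, cr, lf)
}

/// Picks the most frequent kind. Ties resolve as CRLF > CR > LF
/// (an assumption for partial ties), so empty input yields CRLF.
fn detect(text: &str) -> Kind {
    let (crlf, cr, lf) = count_endings(text);
    if crlf >= cr && crlf >= lf {
        Kind::Crlf
    } else if cr >= lf {
        Kind::Cr
    } else {
        Kind::Lf
    }
}
```

Under these assumptions, `detect("")` yields `Kind::Crlf`, and a string with three `LF` and two `CRLF` endings yields `Kind::Lf`.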

    use line_ending::LineEnding;

    let mostly_crlf = "line1\r\nline2\r\nline3\nline4\r\nline5\r\n";
    assert_eq!(LineEnding::from(mostly_crlf), LineEnding::CRLF); // `CRLF` is the most common

    let mostly_cr = "line1\rline2\rline3\nline4\rline5\r";
    assert_eq!(LineEnding::from(mostly_cr), LineEnding::CR); // `CR` is the most common

If `LF`, `CRLF`, and `CR` all appear the same number of times, the function will return `CRLF` as a tie-breaker.

    use line_ending::LineEnding;

    let equal_mixed = "line1\r\nline2\nline3\rline4\r\nline5\nline6\r";
    assert_eq!(LineEnding::from(equal_mixed), LineEnding::CRLF); // `CRLF` > `CR` > `LF`

`CRLF` is chosen as a tie-breaker because it represents both `CR` and `LF`, making it the most inclusive option.
If a single line contains different line endings, the function still chooses the most frequent across the entire string.

    use line_ending::LineEnding;

    let mixed_on_one_line = "line1\r\nline2\rline3\r\nline4\r\nline5\r";
    assert_eq!(LineEnding::from(mixed_on_one_line), LineEnding::CRLF); // `CRLF` appears the most overall


    use line_ending::LineEnding;

    let empty_text = "";
    assert_eq!(LineEnding::from(empty_text), LineEnding::CRLF); // Defaults to `CRLF`

Count occurrences of each line ending type in the given string.

    use line_ending::{LineEnding, LineEndingScores};

    // `LineEndingScores` is a hash map that associates each line ending type with
    // its occurrence count.

    let mostly_lf = "line1\nline2\r\nline3\rline4\nline5\nline6\n";
    assert_eq!(LineEnding::from(mostly_lf), LineEnding::LF);

    assert_eq!(
        LineEnding::score_mixed_types(mostly_lf),
        [
            (LineEnding::CRLF, 1),
            (LineEnding::CR, 1),
            (LineEnding::LF, 4),
        ]
        .into_iter()
        .collect::<LineEndingScores>()
    );

To split on one specific line ending type, regardless of which types the string actually contains, use `split_with`.

    use line_ending::LineEnding;

    let mostly_lf = "line1\nline2\r\nline3\rline4\nline5\nline6\n";
    let split_crlf = LineEnding::CRLF.split_with(mostly_lf);

    assert_eq!(split_crlf, vec!["line1\nline2", "line3\rline4\nline5\nline6\n"]);

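For comparison, the same kind of forced split can be reproduced with the standard library's `str::split`. The helper name `split_on` is hypothetical and not part of this crate; the sketch only shows that splitting on one exact separator leaves the other line ending types inside the resulting pieces.

```rust
/// Hypothetical helper (not the crate's code): splits on one exact
/// separator and leaves every other line ending untouched.
fn split_on<'a>(text: &'a str, separator: &str) -> Vec<&'a str> {
    text.split(separator).collect()
}
```

Calling `split_on(mostly_lf, "\r\n")` on the string from the example above returns the same two pieces, with the `\n` and `\r` characters still embedded in them.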
When processing text from a stream (for example, when reading from a file), you often work with a `Peekable` iterator over characters. Manually checking for a newline (such as `'\n'`) isn't enough to handle all platforms, because Windows uses a two-character sequence (`\r\n`) and some older systems use just `\r`.
This crate provides an extension trait, `PeekableLineEndingExt`, that adds a `consume_line_ending()` method to a `Peekable<Chars>` iterator. This method automatically detects and consumes the full line break sequence (whether it's `LF`, `CR`, or `CRLF`) from the stream.
The following example demonstrates how to split a character stream into lines without having to manually handle each line-ending case:

    use line_ending::{LineEnding, PeekableLineEndingExt};

    let text = "line1\r\nline2\nline3\rline4";
    let mut it = text.chars().peekable();
    let mut lines = Vec::new();
    let mut current_line = String::new();

    while it.peek().is_some() {
        // consume_line_ending() will automatically consume the full line break (CR, LF, or CRLF)
        if it.consume_line_ending().is_some() {
            lines.push(current_line);
            current_line = String::new();
        } else {
            current_line.push(it.next().unwrap());
        }
    }

    lines.push(current_line);

    assert_eq!(lines, vec!["line1", "line2", "line3", "line4"]);

Note: Mixed-type line-ending character streams are automatically handled.
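To show what consuming a line break from a peekable character stream involves, here is a minimal standard-library sketch. The free function `consume_newline` and its `Option<&'static str>` return type are assumptions made for illustration; they are not the crate's `consume_line_ending()` API, whose return type is not shown in this README.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical free function (not the crate's trait method): if the
/// stream is positioned at a line break, consumes the whole sequence
/// and returns it; otherwise consumes nothing and returns `None`.
fn consume_newline(it: &mut Peekable<Chars<'_>>) -> Option<&'static str> {
    match it.peek().copied() {
        Some('\n') => {
            it.next();
            Some("\n")
        }
        Some('\r') => {
            it.next();
            // `next_if_eq` consumes the following character only when it
            // is `\n`, which turns a lone CR into a CRLF pair.
            if it.next_if_eq(&'\n').is_some() {
                Some("\r\n")
            } else {
                Some("\r")
            }
        }
        _ => None,
    }
}
```

The one-character lookahead is what lets `\r\n` be treated as a single break rather than two.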
In a Rust string literal, `\\n` is an escaped backslash followed by the letter `n`, not an actual newline. As a result, escaped sequences are not mistakenly interpreted as real line breaks.
For example:

    use line_ending::LineEnding;

    let lf_with_escaped = "First\\nSecond\nThird";
    let result = LineEnding::split(lf_with_escaped);

    assert_eq!(result, vec!["First\\nSecond", "Third"]); // Escaped `\\n` remains intact

    let lf = "First\nSecond\nThird";
    let result_actual = LineEnding::split(lf);

    assert_eq!(result_actual, vec!["First", "Second", "Third"]); // Actual `\n` splits

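The difference between an escaped sequence and a real newline can also be checked without the crate. In this small sketch, `has_real_newline` is a hypothetical helper that looks for an actual LF character (U+000A).

```rust
/// Hypothetical helper: true only if the text contains an actual LF
/// character, not the two characters backslash and `n`.
fn has_real_newline(text: &str) -> bool {
    text.contains('\n')
}
```

The literal `"First\\nSecond"` and the raw string `r"First\nSecond"` both hold a backslash followed by `n`, so the helper returns `false` for them, while `"First\nSecond"` holds a real line break and returns `true`.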
Licensed under MIT. See LICENSE for details.