Fix(html): Handle <br> elements to insert line breaks in text #1950

Open
Dhruv-Maradiya wants to merge 7 commits into dart-lang:main from Dhruv-Maradiya:main

Conversation

@Dhruv-Maradiya
Contributor

Fixes #1090 by updating the DOM parser to handle <br> elements and insert line breaks (\n) when converting HTML content to plain text.

Initially, I thought adding a simple condition might not be a reliable solution. So, I decided to check how HTML-to-text conversion is handled in Chromium and found a similar approach. Here's the link.
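
As a rough illustration of the behaviour described above (and not the package:html change itself, which is written in Dart), the following Python sketch uses the standard library's `html.parser` to collect text and emit `\n` for each `<br>`. The names `BrAwareTextExtractor` and `html_to_text` are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class BrAwareTextExtractor(HTMLParser):
    """Hypothetical helper: gathers character data, mapping <br> to a newline."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names; <br/> is routed here as well,
        # because the default handle_startendtag() calls handle_starttag().
        if tag == "br":
            self._chunks.append("\n")

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def html_to_text(markup):
    parser = BrAwareTextExtractor()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return parser.text()


print(repr(html_to_text("line one<br>line two<br/>line three")))
```

Without the `br` branch, the three fragments would be concatenated with nothing between them, which is the symptom reported in the linked issue.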


  • I’ve reviewed the contributor guide and applied the relevant portions to this PR.
Contribution guidelines:

Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.

Dhruv-Maradiya changed the title from "Fix(html): Handle <br> elements to insert line breaks in text" to "Fix(html): Handle<br> elements to insert line breaks in text" on Dec 27, 2024
@github-actions

github-actions bot commented Dec 30, 2024 (edited)

PR Health

License Headers ⚠️
// Copyright (c) 2025, the Dart project authors. Please see the AUTHORS file
// for details. All rights reserved. Use of this source code is governed by a
// BSD-style license that can be found in the LICENSE file.
Files
pkgs/html/lib/dom.dart
pkgs/html/test/parser_feature_test.dart

All source files should start with a license header.

Unrelated files missing license headers
Files
pkgs/bazel_worker/benchmark/benchmark.dart
pkgs/benchmark_harness/integration_test/perf_benchmark_test.dart
pkgs/boolean_selector/example/example.dart
pkgs/clock/lib/clock.dart
pkgs/clock/lib/src/clock.dart
pkgs/clock/lib/src/default.dart
pkgs/clock/lib/src/stopwatch.dart
pkgs/clock/lib/src/utils.dart
pkgs/clock/test/clock_test.dart
pkgs/clock/test/default_test.dart
pkgs/clock/test/stopwatch_test.dart
pkgs/clock/test/utils.dart
pkgs/coverage/lib/src/coverage_options.dart
pkgs/html/example/main.dart
pkgs/html/lib/dom_parsing.dart
pkgs/html/lib/html_escape.dart
pkgs/html/lib/parser.dart
pkgs/html/lib/src/constants.dart
pkgs/html/lib/src/encoding_parser.dart
pkgs/html/lib/src/html_input_stream.dart
pkgs/html/lib/src/list_proxy.dart
pkgs/html/lib/src/query_selector.dart
pkgs/html/lib/src/token.dart
pkgs/html/lib/src/tokenizer.dart
pkgs/html/lib/src/treebuilder.dart
pkgs/html/lib/src/utils.dart
pkgs/html/test/dom_test.dart
pkgs/html/test/parser_test.dart
pkgs/html/test/query_selector_test.dart
pkgs/html/test/selectors/level1_baseline_test.dart
pkgs/html/test/selectors/level1_lib.dart
pkgs/html/test/selectors/selectors.dart
pkgs/html/test/support.dart
pkgs/html/test/tokenizer_test.dart
pkgs/html/test/trie_test.dart
pkgs/html/tool/generate_trie.dart
pkgs/pubspec_parse/test/git_uri_test.dart
pkgs/stack_trace/example/example.dart
pkgs/watcher/test/custom_watcher_factory_test.dart
pkgs/yaml_edit/example/example.dart

This check can be disabled by tagging the PR with skip-license-check.

API leaks ⚠️

The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.

Package | Leaked API symbol | Leaking sources
html | HtmlTokenizer | html/parser.dart::HtmlParser::tokenizer
html | Token | tokenizer.dart::HtmlTokenizer
tokenizer.dart::HtmlTokenizer::tokenQueue
tokenizer.dart::HtmlTokenizer::currentToken
tokenizer.dart::HtmlTokenizer::currentToken
token.dart::TagToken
token.dart::DoctypeToken
token.dart::StringToken
tokenizer.dart::HtmlTokenizer::current
token.dart::StartTagToken
token.dart::CommentToken
html/parser.dart::Phase::processComment
html/parser.dart::Phase::processDoctype
token.dart::CharactersToken
html/parser.dart::Phase::processCharacters
token.dart::SpaceCharactersToken
html/parser.dart::Phase::processSpaceCharacters
html/parser.dart::Phase::processStartTag
html/parser.dart::Phase::startTagHtml
token.dart::EndTagToken
html/parser.dart::Phase::processEndTag
html/parser.dart::HtmlParser::inForeignContent::token
html/parser.dart::HtmlParser::parseRCDataRawtext::token
html/parser.dart::BeforeHeadPhase::startTagOther
html/parser.dart::BeforeHeadPhase::endTagImplyHead
html/parser.dart::InHeadPhase::startTagOther
html/parser.dart::InHeadPhase::endTagHtmlBodyBr
html/parser.dart::AfterHeadPhase::startTagOther
html/parser.dart::AfterHeadPhase::endTagHtmlBodyBr
html/parser.dart::InBodyPhase::startTagProcessInHead
html/parser.dart::InBodyPhase::startTagButton
html/parser.dart::InBodyPhase::startTagOther
html/parser.dart::InBodyPhase::endTagHtml
html/parser.dart::InTablePhase::startTagCol
html/parser.dart::InTablePhase::startTagImplyTbody
html/parser.dart::InTablePhase::startTagTable
html/parser.dart::InTablePhase::startTagStyleScript
html/parser.dart::InCaptionPhase::startTagTableElement
html/parser.dart::InCaptionPhase::startTagOther
html/parser.dart::InCaptionPhase::endTagTable
html/parser.dart::InCaptionPhase::endTagOther
html/parser.dart::InColumnGroupPhase::startTagOther
html/parser.dart::InColumnGroupPhase::endTagOther
html/parser.dart::InTableBodyPhase::startTagTableCell
html/parser.dart::InTableBodyPhase::startTagTableOther
html/parser.dart::InTableBodyPhase::startTagOther
html/parser.dart::InTableBodyPhase::endTagTable
html/parser.dart::InTableBodyPhase::endTagOther
html/parser.dart::InRowPhase::startTagTableOther
html/parser.dart::InRowPhase::startTagOther
html/parser.dart::InRowPhase::endTagTable
html/parser.dart::InRowPhase::endTagTableRowGroup
html/parser.dart::InRowPhase::endTagOther
html/parser.dart::InCellPhase::startTagTableOther
html/parser.dart::InCellPhase::startTagOther
html/parser.dart::InCellPhase::endTagImply
html/parser.dart::InCellPhase::endTagOther
html/parser.dart::InSelectPhase::startTagInput
html/parser.dart::InSelectPhase::startTagScript
html/parser.dart::InSelectPhase::startTagOther
html/parser.dart::InSelectInTablePhase::startTagTable
html/parser.dart::InSelectInTablePhase::startTagOther
html/parser.dart::InSelectInTablePhase::endTagTable
html/parser.dart::InSelectInTablePhase::endTagOther
html/parser.dart::AfterBodyPhase::startTagOther
html/parser.dart::AfterBodyPhase::endTagHtml::token
html/parser.dart::AfterBodyPhase::endTagOther
html/parser.dart::InFramesetPhase::startTagNoframes
html/parser.dart::InFramesetPhase::startTagOther
html/parser.dart::AfterFramesetPhase::startTagNoframes
html/parser.dart::AfterAfterBodyPhase::startTagOther
html/parser.dart::AfterAfterFramesetPhase::startTagNoFrames
html | HtmlInputStream | tokenizer.dart::HtmlTokenizer::stream
html | TagToken | tokenizer.dart::HtmlTokenizer::currentTagToken
token.dart::StartTagToken
token.dart::EndTagToken
html/parser.dart::InTableBodyPhase::startTagTableOther::token
html/parser.dart::InTableBodyPhase::endTagTable::token
html | DoctypeToken | tokenizer.dart::HtmlTokenizer::currentDoctypeToken
treebuilder.dart::TreeBuilder::insertDoctype::token
html/parser.dart::Phase::processDoctype::token
html | StringToken | tokenizer.dart::HtmlTokenizer::currentStringToken
token.dart::StringToken::add
treebuilder.dart::TreeBuilder::insertComment::token
token.dart::CommentToken
token.dart::CharactersToken
token.dart::SpaceCharactersToken
html/parser.dart::InBodyPhase::processSpaceCharactersDropNewline::token
html/parser.dart::InTableTextPhase::characterTokens
html | TreeBuilder | html/parser.dart::HtmlParser::tree
html/parser.dart::Phase::tree
html/parser.dart::HtmlParser::new::tree
html | ActiveFormattingElements | treebuilder.dart::TreeBuilder::activeFormattingElements
html | StartTagToken | treebuilder.dart::TreeBuilder::insertRoot::token
treebuilder.dart::TreeBuilder::createElement::token
treebuilder.dart::TreeBuilder::insertElement::token
treebuilder.dart::TreeBuilder::insertElementNormal::token
treebuilder.dart::TreeBuilder::insertElementTable::token
html/parser.dart::Phase::processStartTag::token
html/parser.dart::Phase::startTagHtml::token
html/parser.dart::HtmlParser::adjustMathMLAttributes::token
html/parser.dart::HtmlParser::adjustSVGAttributes::token
html/parser.dart::HtmlParser::adjustForeignAttributes::token
html/parser.dart::BeforeHeadPhase::startTagHead::token
html/parser.dart::BeforeHeadPhase::startTagOther::token
html/parser.dart::InHeadPhase::startTagHead::token
html/parser.dart::InHeadPhase::startTagBaseLinkCommand::token
html/parser.dart::InHeadPhase::startTagMeta::token
html/parser.dart::InHeadPhase::startTagTitle::token
html/parser.dart::InHeadPhase::startTagNoScriptNoFramesStyle::token
html/parser.dart::InHeadPhase::startTagScript::token
html/parser.dart::InHeadPhase::startTagOther::token
html/parser.dart::AfterHeadPhase::startTagBody::token
html/parser.dart::AfterHeadPhase::startTagFrameset::token
html/parser.dart::AfterHeadPhase::startTagFromHead::token
html/parser.dart::AfterHeadPhase::startTagHead::token
html/parser.dart::AfterHeadPhase::startTagOther::token
html/parser.dart::InBodyPhase::addFormattingElement::token
html/parser.dart::InBodyPhase::startTagProcessInHead::token
html/parser.dart::InBodyPhase::startTagBody::token
html/parser.dart::InBodyPhase::startTagFrameset::token
html/parser.dart::InBodyPhase::startTagCloseP::token
html/parser.dart::InBodyPhase::startTagPreListing::token
html/parser.dart::InBodyPhase::startTagForm::token
html/parser.dart::InBodyPhase::startTagListItem::token
html/parser.dart::InBodyPhase::startTagPlaintext::token
html/parser.dart::InBodyPhase::startTagHeading::token
html/parser.dart::InBodyPhase::startTagA::token
html/parser.dart::InBodyPhase::startTagFormatting::token
html/parser.dart::InBodyPhase::startTagNobr::token
html/parser.dart::InBodyPhase::startTagButton::token
html/parser.dart::InBodyPhase::startTagAppletMarqueeObject::token
html/parser.dart::InBodyPhase::startTagXmp::token
html/parser.dart::InBodyPhase::startTagTable::token
html/parser.dart::InBodyPhase::startTagVoidFormatting::token
html/parser.dart::InBodyPhase::startTagInput::token
html/parser.dart::InBodyPhase::startTagParamSource::token
html/parser.dart::InBodyPhase::startTagHr::token
html/parser.dart::InBodyPhase::startTagImage::token
html/parser.dart::InBodyPhase::startTagIsIndex::token
html/parser.dart::InBodyPhase::startTagTextarea::token
html/parser.dart::InBodyPhase::startTagIFrame::token
html/parser.dart::InBodyPhase::startTagRawtext::token
html/parser.dart::InBodyPhase::startTagOpt::token
html/parser.dart::InBodyPhase::startTagSelect::token
html/parser.dart::InBodyPhase::startTagRpRt::token
html/parser.dart::InBodyPhase::startTagMath::token
html/parser.dart::InBodyPhase::startTagSvg::token
html/parser.dart::InBodyPhase::startTagMisplaced::token
html/parser.dart::InBodyPhase::startTagOther::token
html/parser.dart::InTablePhase::startTagCaption::token
html/parser.dart::InTablePhase::startTagColgroup::token
html/parser.dart::InTablePhase::startTagCol::token
html/parser.dart::InTablePhase::startTagRowGroup::token
html/parser.dart::InTablePhase::startTagImplyTbody::token
html/parser.dart::InTablePhase::startTagTable::token
html/parser.dart::InTablePhase::startTagStyleScript::token
html/parser.dart::InTablePhase::startTagInput::token
html/parser.dart::InTablePhase::startTagForm::token
html/parser.dart::InTablePhase::startTagOther::token
html/parser.dart::InCaptionPhase::startTagTableElement::token
html/parser.dart::InCaptionPhase::startTagOther::token
html/parser.dart::InColumnGroupPhase::startTagCol::token
html/parser.dart::InColumnGroupPhase::startTagOther::token
html/parser.dart::InTableBodyPhase::startTagTr::token
html/parser.dart::InTableBodyPhase::startTagTableCell::token
html/parser.dart::InTableBodyPhase::startTagOther::token
html/parser.dart::InRowPhase::startTagTableCell::token
html/parser.dart::InRowPhase::startTagTableOther::token
html/parser.dart::InRowPhase::startTagOther::token
html/parser.dart::InCellPhase::startTagTableOther::token
html/parser.dart::InCellPhase::startTagOther::token
html/parser.dart::InSelectPhase::startTagOption::token
html/parser.dart::InSelectPhase::startTagOptgroup::token
html/parser.dart::InSelectPhase::startTagSelect::token
html/parser.dart::InSelectPhase::startTagInput::token
html/parser.dart::InSelectPhase::startTagScript::token
html/parser.dart::InSelectPhase::startTagOther::token
html/parser.dart::InSelectInTablePhase::startTagTable::token
html/parser.dart::InSelectInTablePhase::startTagOther::token
html/parser.dart::InForeignContentPhase::adjustSVGTagNames::token
html/parser.dart::AfterBodyPhase::startTagOther::token
html/parser.dart::InFramesetPhase::startTagFrameset::token
html/parser.dart::InFramesetPhase::startTagFrame::token
html/parser.dart::InFramesetPhase::startTagNoframes::token
html/parser.dart::InFramesetPhase::startTagOther::token
html/parser.dart::AfterFramesetPhase::startTagNoframes::token
html/parser.dart::AfterFramesetPhase::startTagOther::token
html/parser.dart::AfterAfterBodyPhase::startTagOther::token
html/parser.dart::AfterAfterFramesetPhase::startTagNoFrames::token
html/parser.dart::AfterAfterFramesetPhase::startTagOther::token
html | TagAttribute | token.dart::StartTagToken::attributeSpans
html | CommentToken | html/parser.dart::Phase::processComment::token
html | CharactersToken | html/parser.dart::Phase::processCharacters::token
html/parser.dart::InTablePhase::insertText::token
html | SpaceCharactersToken | html/parser.dart::Phase::processSpaceCharacters::token
html | EndTagToken | html/parser.dart::Phase::processEndTag::token
html/parser.dart::Phase::popOpenElementsUntil::token
html/parser.dart::BeforeHeadPhase::endTagImplyHead::token
html/parser.dart::BeforeHeadPhase::endTagOther::token
html/parser.dart::InHeadPhase::endTagHead::token
html/parser.dart::InHeadPhase::endTagHtmlBodyBr::token
html/parser.dart::InHeadPhase::endTagOther::token
html/parser.dart::AfterHeadPhase::endTagHtmlBodyBr::token
html/parser.dart::AfterHeadPhase::endTagOther::token
html/parser.dart::InBodyPhase::endTagP::token
html/parser.dart::InBodyPhase::endTagBody::token
html/parser.dart::InBodyPhase::endTagHtml::token
html/parser.dart::InBodyPhase::endTagBlock::token
html/parser.dart::InBodyPhase::endTagForm::token
html/parser.dart::InBodyPhase::endTagListItem::token
html/parser.dart::InBodyPhase::endTagHeading::token
html/parser.dart::InBodyPhase::endTagFormatting::token
html/parser.dart::InBodyPhase::endTagAppletMarqueeObject::token
html/parser.dart::InBodyPhase::endTagBr::token
html/parser.dart::InBodyPhase::endTagOther::token
html/parser.dart::TextPhase::endTagScript::token
html/parser.dart::TextPhase::endTagOther::token
html/parser.dart::InTablePhase::endTagTable::token
html/parser.dart::InTablePhase::endTagIgnore::token
html/parser.dart::InTablePhase::endTagOther::token
html/parser.dart::InCaptionPhase::endTagCaption::token
html/parser.dart::InCaptionPhase::endTagTable::token
html/parser.dart::InCaptionPhase::endTagIgnore::token
html/parser.dart::InCaptionPhase::endTagOther::token
html/parser.dart::InColumnGroupPhase::endTagColgroup::token
html/parser.dart::InColumnGroupPhase::endTagCol::token
html/parser.dart::InColumnGroupPhase::endTagOther::token
html/parser.dart::InTableBodyPhase::endTagTableRowGroup::token
html/parser.dart::InTableBodyPhase::endTagIgnore::token
html/parser.dart::InTableBodyPhase::endTagOther::token
html/parser.dart::InRowPhase::endTagTr::token
html/parser.dart::InRowPhase::endTagTable::token
html/parser.dart::InRowPhase::endTagTableRowGroup::token
html/parser.dart::InRowPhase::endTagIgnore::token
html/parser.dart::InRowPhase::endTagOther::token
html/parser.dart::InCellPhase::endTagTableCell::token
html/parser.dart::InCellPhase::endTagIgnore::token
html/parser.dart::InCellPhase::endTagImply::token
html/parser.dart::InCellPhase::endTagOther::token
html/parser.dart::InSelectPhase::endTagOption::token
html/parser.dart::InSelectPhase::endTagOptgroup::token
html/parser.dart::InSelectPhase::endTagSelect::token
html/parser.dart::InSelectPhase::endTagOther::token
html/parser.dart::InSelectInTablePhase::endTagTable::token
html/parser.dart::InSelectInTablePhase::endTagOther::token
html/parser.dart::AfterBodyPhase::endTagOther::token
html/parser.dart::InFramesetPhase::endTagFrameset::token
html/parser.dart::InFramesetPhase::endTagOther::token
html/parser.dart::AfterFramesetPhase::endTagHtml::token
html/parser.dart::AfterFramesetPhase::endTagOther::token

This check can be disabled by tagging the PR with skip-leaking-check.

Breaking changes ⚠️
Package | Change | Current Version | New Version | Needed Version | Looking good?
html | Breaking | 0.15.6 | 0.15.7-wip | 0.16.0 | Got "0.15.7-wip" expected >= "0.16.0" (breaking changes) ⚠️

This check can be disabled by tagging the PR with skip-breaking-check.

Coverage ✔️
File | Coverage
pkgs/html/lib/dom.dart | 💚 65 % ⬆️ 1 %

This check for test coverage is informational (issues shown here will not fail the PR).

This check can be disabled by tagging the PR with skip-coverage-check.

Changelog Entry ✔️
Package | Changed Files

Changes to files need to be accounted for in their respective changelogs.

This check can be disabled by tagging the PR with skip-changelog-check.

mosuem requested review from HosseinYousefi and removed request for devoncarew on April 17, 2025 12:38
@Dhruv-Maradiya
Contributor, Author

Hey, thanks for reviewing this! 🙌
It’s been a few months since I worked on it, and I was still getting familiar with the codebase at the time — so I’ll need to refresh myself on the changes.
I'll take a look as soon as I can. Appreciate your feedback!

@mosuem
Member

@Dhruv-Maradiya Just a friendly ping as I am looking through PRs - is there intention to land this?

@Dhruv-Maradiya
Contributor, Author

Dhruv-Maradiya commented Oct 8, 2025 (edited)

Hey @mosuem, sorry for the delay! I’ll try to wrap this up ASAP, most likely today.

mosuem reacted with rocket emoji

@mosuem
Member

Friendly ping :) (No pressure, just happened to walk by this tab in my browser)

Dhruv-Maradiya reacted with thumbs up emoji

Implements DOM spec textContent algorithm with optional convertBRsToNewlines parameter. Adds isElementBr() helper for namespace-aware BR detection. Maintains backward compatibility with existing .text getter.
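
To make the commit summary above more concrete, here is a minimal Python sketch of a tree-order text walk with an opt-in newline conversion. The names `convertBRsToNewlines` and `isElementBr` come from the commit message; their signatures, the `Element`/`Text` classes, and the function bodies below are assumptions made for illustration and are not taken from the Dart source.

```python
from dataclasses import dataclass, field
from typing import List, Union

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


@dataclass
class Text:
    data: str


@dataclass
class Element:
    local_name: str
    namespace: str = HTML_NS
    children: List[Union["Element", Text]] = field(default_factory=list)


def is_element_br(node):
    """True only for a <br> in the HTML namespace (assumed meaning of isElementBr)."""
    return (
        isinstance(node, Element)
        and node.local_name == "br"
        and node.namespace == HTML_NS
    )


def text_content(node, convert_brs_to_newlines=False):
    """Concatenate descendant text data in tree order.

    With the flag left at its default, <br> contributes nothing, so existing
    callers see the same result as before.
    """
    if isinstance(node, Text):
        return node.data
    if convert_brs_to_newlines and is_element_br(node):
        return "\n"
    return "".join(
        text_content(child, convert_brs_to_newlines) for child in node.children
    )


paragraph = Element("p", children=[Text("first"), Element("br"), Text("second")])
print(repr(text_content(paragraph)))
print(repr(text_content(paragraph, True)))
```

Keeping the conversion behind a default-off parameter is one way to preserve the behaviour of an existing getter while letting callers opt in to line breaks.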
@github-actions

github-actions bot commented Oct 31, 2025 (edited)

Package publishing

Package | Version | Status | Publish tag (post-merge)
package:bazel_worker | 1.1.5 | already published at pub.dev
package:benchmark_harness | 2.4.0 | already published at pub.dev
package:boolean_selector | 2.1.2 | already published at pub.dev
package:browser_launcher | 1.1.3 | already published at pub.dev
package:cli_config | 0.2.1-wip | WIP (no publish necessary)
package:cli_util | 0.5.0-wip | WIP (no publish necessary)
package:clock | 1.1.3-wip | WIP (no publish necessary)
package:code_builder | 4.11.1-wip | WIP (no publish necessary)
package:coverage | 1.15.0 | already published at pub.dev
package:csslib | 1.0.2 | already published at pub.dev
package:extension_discovery | 2.1.0 | already published at pub.dev
package:file | 7.0.2-wip | WIP (no publish necessary)
package:file_testing | 3.1.0-wip | WIP (no publish necessary)
package:glob | 2.1.3 | already published at pub.dev
package:graphs | 2.3.3-wip | WIP (no publish necessary)
package:html | 0.15.7-wip | WIP (no publish necessary)
package:io | 1.1.0-wip | WIP (no publish necessary)
package:json_rpc_2 | 4.1.0-wip | WIP (no publish necessary)
package:markdown | 7.3.1-wip | WIP (no publish necessary)
package:mime | 2.1.0-wip | WIP (no publish necessary)
package:oauth2 | 2.0.5 | already published at pub.dev
package:package_config | 2.3.0-wip | WIP (no publish necessary)
package:pool | 1.5.2 | already published at pub.dev
package:process | 5.0.5 | already published at pub.dev
package:pub_semver | 2.2.0 | already published at pub.dev
package:pubspec_parse | 1.5.1-wip | WIP (no publish necessary)
package:source_map_stack_trace | 2.1.3-wip | WIP (no publish necessary)
package:source_maps | 0.10.14-wip | WIP (no publish necessary)
package:source_span | 1.10.1 | already published at pub.dev
package:sse | 4.1.8 | already published at pub.dev
package:stack_trace | 1.12.2-wip | (error) pubspec version (1.12.2-wip) and changelog (1.12.2-dev) don't agree
package:stream_channel | 2.1.4 | already published at pub.dev
package:stream_transform | 2.1.2-wip | WIP (no publish necessary)
package:string_scanner | 1.4.1 | already published at pub.dev
package:term_glyph | 1.2.3-wip | WIP (no publish necessary)
package:test_reflective_loader | 0.5.0 | ready to publish | test_reflective_loader-v0.5.0
package:timing | 1.0.2 | already published at pub.dev
package:unified_analytics | 8.0.10 | already published at pub.dev
package:watcher | 1.2.0 | already published at pub.dev
package:yaml | 3.1.3 | already published at pub.dev
package:yaml_edit | 2.2.3 | already published at pub.dev

Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation.

@mosuem
Member

@HosseinYousefi could you take another look?


Reviewers

HosseinYousefi approved these changes

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

<br/> Tag does not product /n

3 participants

@Dhruv-Maradiya @mosuem @HosseinYousefi
