Fix duplicated subtitle issue--core deduplication logic and screen-display part #1448
TransZAllen wants to merge 4 commits into TeamNewPipe:dev
… URL parameters.
- Add `V`, `LANG`, `TLANG` constants to `YoutubeParsingHelper`
- Implement `extractVideoId()`, `extractLanguageCode()`, `extractTranslationCode()`
- Add `extractQueryParam()` utility in `Utils.java`
- Add core deduplicated logic/method
- Reproduce bug with the YouTube video: https://www.youtube.com/watch?v=b7vmW_5HSpE
- Introduce `SubtitleDeduplicator.java` to check and remove duplicates, storing results in cache.
- Add `SubtitleOrigin` and `SubtitleState` enums to model subtitle type and state.
- Ensure cache directory is recreated if missing.
…ntegrate deduplicated subtitles, calling `checkAndDeduplicate()` to remove duplicates and store results in cache.
…ubtitleDeduplicatorTest.java`.
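The first commit message mentions an `extractQueryParam()` utility and `V`, `LANG`, `TLANG` constants. The PR's actual code is not shown on this page, so the following is only a minimal sketch of what such a helper could look like. The class name `SubtitleUrlParams`, the constant values, the method signature, and the example URL shape are all assumptions made for illustration.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not the code from this PR.
final class SubtitleUrlParams {
    // Assumed values for the constants named in the commit message.
    static final String V = "v";
    static final String LANG = "lang";
    static final String TLANG = "tlang";

    private SubtitleUrlParams() { }

    /**
     * Returns the URL-decoded value of the first query parameter called
     * {@code name}, or null if the URL has no such parameter.
     * Parameter names are compared as-is (they are not decoded).
     */
    static String extractQueryParam(final String url, final String name) {
        final String query = URI.create(url).getRawQuery();
        if (query == null) {
            return null;
        }
        for (final String pair : query.split("&")) {
            final int eq = pair.indexOf('=');
            final String key = eq < 0 ? pair : pair.substring(0, eq);
            if (key.equals(name)) {
                final String raw = eq < 0 ? "" : pair.substring(eq + 1);
                return URLDecoder.decode(raw, StandardCharsets.UTF_8);
            }
        }
        return null;
    }
}
```

With a made-up subtitle URL such as `https://www.youtube.com/api/timedtext?v=b7vmW_5HSpE&lang=en&tlang=fr`, calling `extractQueryParam(url, SubtitleUrlParams.LANG)` would yield `en`. Helpers like `extractVideoId()` could then be thin wrappers around this call.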
TransZAllen commented Jan 30, 2026 • edited
Related issue

Scope of changes
This PR involves two repositories:

Reproduction case
Android device, duplicated subtitles visible during playback
YouTube video used for testing:

Subtitle cache location
Cached subtitle files (
The directory name corresponds to

Cache file naming
Cached subtitle filenames are intentionally descriptive,

Cache lifecycle & storage impact
Do cached subtitle files need to be deleted?
No.
Why keep cached subtitles?
Storage considerations

Unit tests
Tests focus on the core deduplication logic:

Why SubtitleDeduplicator operates on raw TTML text
This design is intended to be practical and simple. At this stage, the goal is only to detect obviously
Difference from
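To illustrate what deduplicating on raw TTML text can mean, here is a small sketch. It is not the `SubtitleDeduplicator` from this PR. It assumes that a "duplicate" is a `<p>` cue that is character-for-character identical (same attributes, same text) to an earlier cue, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only; the duplicate definition used here is an assumption.
final class TtmlDedupSketch {
    // One <p ...>...</p> cue plus trailing whitespace.
    // DOTALL lets the cue text span several lines.
    private static final Pattern CUE =
            Pattern.compile("<p\\b[^>]*>.*?</p>\\s*", Pattern.DOTALL);

    private TtmlDedupSketch() { }

    /** Keeps the first occurrence of each cue and drops verbatim repeats. */
    static String removeExactDuplicateCues(final String ttml) {
        final Set<String> seen = new HashSet<>();
        final Matcher m = CUE.matcher(ttml);
        final StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy everything between the previous cue and this one unchanged.
            out.append(ttml, last, m.start());
            // Compare cues without their trailing whitespace.
            if (seen.add(m.group().trim())) {
                out.append(m.group());
            }
            last = m.end();
        }
        // Copy the remainder of the document (closing tags etc.).
        out.append(ttml, last, ttml.length());
        return out.toString();
    }
}
```

Working on the raw text like this avoids building an XML tree, at the cost of only catching exact repeats. Cues that share text but differ in timing are left untouched.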
TransZAllen commented Jan 30, 2026 • edited
The fix has been tested with a YouTube video link: https://www.youtube.com/watch?v=b7vmW_5HSpE
Before the fix, the subtitle is shown as follows:
After applying the fix, the subtitle is displayed as follows:
AudricV left a comment
I don't think we want NewPipe Extractor to download files directly, so your approach needs to change, especially since you do not delete the files. I would also avoid downloading every subtitle, so that we don't hit rate limits.
The extractor is not an Android library, so Android-specific comments should be removed.
If YouTube provides incorrect subtitles, it should not be up to the extractor to fix them, in my opinion. To me, it makes more sense to fix this on the app side with a custom ExoPlayer component.
TransZAllen commented Jan 31, 2026
Thanks for the feedback, it’s helpful for me to better understand the intended boundaries of NewPipeExtractor. I’m preparing some follow-up comments to explain these commits, especially around subtitle downloading. I’m also taking some time to think about whether this design makes sense. I’ll add more comments soon.
TransZAllen commented Feb 1, 2026 • edited
About
: Just to make sure I understand correctly: currently, the extractor only provides
My original idea was to fix duplicated subtitles as early as possible and
However, I now realize that my changes effectively moved the subtitle downloading
At first, I thought this was acceptable since subtitles are eventually downloaded
So, performing file downloads inside NewPipeExtractor crosses its intended boundary, right?




