Introduce Microsoft.Extensions.DataIngestion.Abstractions #6949


Merged
adamsitnik merged 7 commits into dotnet:main from adamsitnik:dataIngestionAbstractions
Oct 23, 2025

Conversation

adamsitnik (Member) commented Oct 22, 2025 (edited by dotnet-policy-service bot)

The APIs were approved in #6893 (comment) and in #6895 (comment).


adamsitnik self-assigned this Oct 22, 2025
Copilot AI review requested due to automatic review settings October 22, 2025 14:23

Copilot AI (Contributor) left a comment

Pull Request Overview

This PR introduces the Microsoft.Extensions.DataIngestion.Abstractions library, implementing the APIs approved in the referenced GitHub issues. The library provides abstractions for processing documents from various formats into structured chunks suitable for data ingestion scenarios (e.g., RAG pipelines).

Key changes:

  • Core document representation classes (IngestionDocument, IngestionDocumentElement and its specialized types)
  • Processing pipeline abstractions (IngestionDocumentReader, IngestionDocumentProcessor, IngestionChunker, IngestionChunkProcessor, IngestionChunkWriter)
  • Support classes (IngestionChunk) for representing processed content chunks
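
The stage order implied by the list above (read, process the document, chunk, process the chunks, write) can be illustrated with a small language-neutral sketch. It is written in Python purely for illustration: every name and signature below is hypothetical and only mirrors the role of the corresponding abstraction. It is not the API introduced by this PR.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass
class Document:
    """Hypothetical stand-in for the role played by IngestionDocument."""
    identifier: str
    paragraphs: list[str]


@dataclass
class Chunk:
    """Hypothetical stand-in for the role played by IngestionChunk."""
    content: str
    document_id: str
    metadata: dict[str, str] = field(default_factory=dict)


def read_document(identifier: str, text: str) -> Document:
    """Reader stage: turn raw text into a structured document.
    Assumption: paragraphs are separated by blank lines."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return Document(identifier, paragraphs)


def normalize_whitespace(doc: Document) -> Document:
    """Document processor stage: transform the document before chunking."""
    return Document(doc.identifier, [" ".join(p.split()) for p in doc.paragraphs])


def chunk_by_paragraph(doc: Document) -> Iterator[Chunk]:
    """Chunker stage: emit one chunk per paragraph."""
    for paragraph in doc.paragraphs:
        yield Chunk(paragraph, doc.identifier)


def add_length_metadata(chunks: Iterable[Chunk]) -> Iterator[Chunk]:
    """Chunk processor stage: enrich each chunk with metadata."""
    for chunk in chunks:
        chunk.metadata["length"] = str(len(chunk.content))
        yield chunk


class ListWriter:
    """Writer stage: collects chunks in memory instead of a real store."""

    def __init__(self) -> None:
        self.written: list[Chunk] = []

    def write(self, chunks: Iterable[Chunk]) -> None:
        self.written.extend(chunks)


def run_pipeline(identifier: str, text: str, writer: ListWriter) -> None:
    """Compose the stages in order: read, process, chunk, process, write."""
    doc = normalize_whitespace(read_document(identifier, text))
    writer.write(add_length_metadata(chunk_by_paragraph(doc)))
```

Each stage takes the previous stage's output, so any one of them could be swapped independently, which is presumably the motivation for shipping the stages as separate abstractions.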

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| Microsoft.Extensions.DataIngestion.Abstractions.csproj | Project file defining multi-targeting (including netstandard2.0) and conditional package references |
| IngestionDocument.cs | Core document container with section management and content enumeration |
| IngestionDocumentElement.cs | Base class and specialized element types (Section, Paragraph, Header, Footer, Table, Image) |
| IngestionDocumentReader.cs | Abstract reader with file/stream overloads and extensive media type mapping |
| IngestionDocumentProcessor.cs | Abstract processor for document transformation pipeline |
| IngestionChunk.cs | Generic chunk representation with metadata support and validation |
| IngestionChunker.cs | Abstract chunker for splitting documents into chunks |
| IngestionChunkProcessor.cs | Abstract processor for chunk transformation pipeline |
| IngestionChunkWriter.cs | Abstract writer with disposable pattern for chunk output |
| Microsoft.Extensions.DataIngestion.Tests.csproj | Test project configuration with analyzer suppressions |
| IngestionDocumentTests.cs | Unit tests for document enumeration and validation |

adamsitnik force-pushed the dataIngestionAbstractions branch from db5d273 to 7bbb1ef October 23, 2025 10:40
adamsitnik merged commit 72f930a into dotnet:main Oct 23, 2025
6 checks passed
adamsitnik deleted the dataIngestionAbstractions branch October 23, 2025 11:22
This was referenced Nov 19, 2025
github-actions bot locked and limited conversation to collaborators Nov 23, 2025

Reviewers

Copilot code review: Copilot left review comments
stephentoub: approved these changes
roji: awaiting requested review

Assignees

adamsitnik


2 participants

@adamsitnik @stephentoub
