Implement comprehensive multi-format DataFrame Spring integration with Spring Data patterns #1322


Draft
Copilot wants to merge 17 commits into master from copilot/fix-1321

Copilot AI commented Jul 11, 2025 (edited)

This PR extends the DataFrame Spring integration from CSV-only to comprehensive multi-format support, following Spring Data patterns for unified data source management.

Multi-Format Data Source Support

The implementation now supports all major DataFrame formats through dedicated annotations:

CSV Data Sources

    @CsvDataSource(file = "sales.csv", delimiter = ',', header = true)
    lateinit var salesData: DataFrame<*>

    @CsvDataSource(file = "products.tsv", delimiter = '\t')
    lateinit var productData: DataFrame<*>

JSON Data Sources

    @JsonDataSource(file = "users.json")
    lateinit var userData: DataFrame<*>

    @JsonDataSource(
        file = "complex.json",
        typeClashTactic = JSON.TypeClashTactic.ANY_COLUMNS,
        keyValuePaths = ["user.preferences", "config.settings"]
    )
    lateinit var complexData: DataFrame<*>

Arrow/Parquet Data Sources

    @ArrowDataSource(file = "analytics.parquet")
    lateinit var analyticsData: DataFrame<*>

    @ArrowDataSource(file = "timeseries.arrow", format = ArrowFormat.IPC)
    lateinit var timeseriesData: DataFrame<*>

JDBC Data Sources

    @JdbcDataSource(
        connectionBean = "dataSource",
        tableName = "customers"
    )
    lateinit var customerData: DataFrame<*>

    @JdbcDataSource(
        url = "jdbc:h2:mem:testdb",
        username = "sa",
        password = "",
        query = "SELECT * FROM orders WHERE status = 'COMPLETED'"
    )
    lateinit var orders: DataFrame<*>

Spring Data-Inspired Architecture

The design follows established Spring Data patterns:

  • Declarative Annotations: Similar to @Query in Spring Data JPA
  • Strategy Pattern: Format-specific processors handle different data sources
  • Bean Integration: Leverages existing Spring infrastructure for connections
  • Property Placeholders: Support for externalized configuration via ${...}
  • Type Safety: Compile-time validation of format-specific parameters
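
A minimal sketch of how the annotation-plus-strategy dispatch described above could work. It is not the PR's actual code: `Table`, `CsvSource`, `KeyValueSource`, `SourceProcessor` and `inject` are hypothetical stand-ins, inline text replaces real files, and a plain list of maps replaces `DataFrame<*>` so the example runs without Spring or DataFrame on the classpath. In a Spring application the `inject` step would presumably live in a bean post-processor.

```kotlin
import java.lang.reflect.Field

// Hypothetical stand-in for DataFrame<*>: one map per row.
typealias Table = List<Map<String, String>>

// Hypothetical annotations; they carry inline text instead of a file path.
@Target(AnnotationTarget.FIELD)
@Retention(AnnotationRetention.RUNTIME)
annotation class CsvSource(val text: String, val delimiter: Char = ',')

@Target(AnnotationTarget.FIELD)
@Retention(AnnotationRetention.RUNTIME)
annotation class KeyValueSource(val text: String)

// One strategy per source type.
interface SourceProcessor {
    /** Returns the loaded table, or null if the field does not carry this processor's annotation. */
    fun load(field: Field): Table?
}

object CsvProcessor : SourceProcessor {
    override fun load(field: Field): Table? {
        val source = field.getAnnotation(CsvSource::class.java) ?: return null
        val lines = source.text.lines().filter { it.isNotBlank() }
        val header = lines.first().split(source.delimiter)
        // Pair each cell with its column name.
        return lines.drop(1).map { line -> header.zip(line.split(source.delimiter)).toMap() }
    }
}

object KeyValueProcessor : SourceProcessor {
    override fun load(field: Field): Table? {
        val source = field.getAnnotation(KeyValueSource::class.java) ?: return null
        // "a=1;b=2" becomes a single row {a=1, b=2}.
        val row = source.text.split(';').associate { pair ->
            val (key, value) = pair.split('=', limit = 2)
            key to value
        }
        return listOf(row)
    }
}

// Walks the bean's fields and lets the first matching strategy populate each one.
fun inject(bean: Any, processors: List<SourceProcessor> = listOf(CsvProcessor, KeyValueProcessor)) {
    for (field in bean.javaClass.declaredFields) {
        val table = processors.firstNotNullOfOrNull { it.load(field) } ?: continue
        field.isAccessible = true
        field.set(bean, table)
    }
}

class Report {
    @CsvSource(text = "id;name\n1;Ada\n2;Linus", delimiter = ';')
    lateinit var people: Table

    @KeyValueSource(text = "env=test;region=eu")
    lateinit var settings: Table
}

// Usage: build a bean and populate its annotated fields.
fun demo(): Report = Report().also { inject(it) }
```

Adding a new format in this sketch means adding one annotation and one processor; `inject` itself does not change.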

Advanced Parameter Management

Each annotation provides type-safe parameters specific to its format:

  • CSV: Custom delimiters, header configuration
  • JSON: Type clash tactics, key-value path processing, number unification
  • Arrow: Format detection, nullability options
  • JDBC: Connection beans, custom queries, result limits

Real-World Usage

    @Component
    class AnalyticsService {

        @CsvDataSource(file = "exports/customers.csv")
        lateinit var customers: DataFrame<*>

        @JsonDataSource(file = "logs/events.json")
        lateinit var events: DataFrame<*>

        @ArrowDataSource(file = "ml/features.parquet")
        lateinit var features: DataFrame<*>

        @JdbcDataSource(
            connectionBean = "metricsDataSource",
            query = "SELECT * FROM metrics WHERE timestamp >= NOW() - INTERVAL '1 hour'"
        )
        lateinit var realtimeMetrics: DataFrame<*>

        fun generateReport() {
            // All data sources automatically loaded and ready to use
        }
    }

Backward Compatibility

The original @DataSource annotation remains supported but is deprecated in favor of the more explicit @CsvDataSource.
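
One way such a deprecation could be expressed in Kotlin, as a sketch only. The annotation declarations below are simplified stand-ins written for this example: the `csvFile` parameter on `@DataSource`, the `CsvSettings` class and the `csvSettingsOf` helper are assumptions, not taken from the PR.

```kotlin
import java.lang.reflect.Field

// Simplified stand-in for the new annotation (parameters as used in the examples above).
@Target(AnnotationTarget.FIELD)
annotation class CsvDataSource(
    val file: String,
    val delimiter: Char = ',',
    val header: Boolean = true
)

// Simplified stand-in for the legacy annotation; `csvFile` is a hypothetical parameter name.
@Deprecated("Use @CsvDataSource instead", ReplaceWith("CsvDataSource(file = csvFile)"))
@Target(AnnotationTarget.FIELD)
annotation class DataSource(val csvFile: String)

// Hypothetical normalized view, so downstream loading code handles only one shape.
data class CsvSettings(val file: String, val delimiter: Char, val header: Boolean)

// Prefers the new annotation and falls back to the legacy one with default settings.
@Suppress("DEPRECATION")
fun csvSettingsOf(field: Field): CsvSettings? {
    field.getAnnotation(CsvDataSource::class.java)?.let {
        return CsvSettings(it.file, it.delimiter, it.header)
    }
    field.getAnnotation(DataSource::class.java)?.let {
        return CsvSettings(it.csvFile, ',', true)
    }
    return null
}

class LegacyAndNew {
    @Suppress("DEPRECATION")
    @DataSource(csvFile = "old.csv")
    var legacy: String = ""

    @CsvDataSource(file = "new.tsv", delimiter = '\t')
    var current: String = ""
}
```

With this shape, existing `@DataSource` usages keep compiling with a warning, and the `ReplaceWith` hint lets the IDE offer the migration.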

Benefits

  1. Unified Data Access: Consistent annotation-based approach across all formats
  2. Reduced Boilerplate: No manual DataFrame construction required
  3. Spring Integration: Leverages existing Spring beans and configuration
  4. Developer Experience: Follows familiar Spring Data patterns
  5. Type Safety: Format-specific parameters with compile-time validation

The implementation transforms DataFrame Spring integration from a CSV-specific utility into a comprehensive data access layer that rivals Spring Data in functionality while maintaining the power and flexibility of the DataFrame API.

Fixes #1321.



Copilot AI and others added 4 commits July 11, 2025 12:52

  • Co-authored-by: zaleslaw <1198621+zaleslaw@users.noreply.github.com>
  • …g integration. Co-authored-by: zaleslaw <1198621+zaleslaw@users.noreply.github.com>
  • …egration. Co-authored-by: zaleslaw <1198621+zaleslaw@users.noreply.github.com>
  • …fic demo. Co-authored-by: zaleslaw <1198621+zaleslaw@users.noreply.github.com>

Copilot AI changed the title from "[WIP] Explore the possiblity to define schema and constunct object in Spring (DI framework) style" to "Implement Spring DI-style DataFrame initialization with @DataSource annotation" on Jul 11, 2025

Copilot AI requested a review from zaleslaw on July 11, 2025 13:03
@zaleslaw (Collaborator) commented:

@copilot extend this module to support the different file formats (CSV, JSON, Parquet, Arrow) and try to find an optimal solution if we have a lot of parameters for each file type. Explore the Spring Data approach to working with different data sources. We also support JDBC; could it be covered as well?

  • …h Spring Data patterns. Co-authored-by: zaleslaw <1198621+zaleslaw@users.noreply.github.com>
Copilot AI (Author) replied, quoting:

@copilot extend this module to support the different file formats (CSV, JSON, Parquet, Arrow) and try to find an optimal solution if we have a lot of parameters for each file type. Explore the Spring Data...

I've implemented comprehensive multi-format support for the DataFrame Spring integration, extending it far beyond the original CSV-only implementation. The new design follows Spring Data patterns and includes:

🎯 Multi-Format Support:

  • @CsvDataSource - CSV/TSV files with custom delimiters
  • @JsonDataSource - JSON files with type clash handling
  • @ArrowDataSource - Arrow/Parquet/Feather files with format auto-detection
  • @JdbcDataSource - Database tables/queries with connection pooling

🏗️ Spring Data-Inspired Architecture:

  • Declarative annotations similar to Spring Data repositories
  • Strategy pattern for handling different data source types
  • Bean reference support for database connections
  • Property placeholder resolution for externalized config
  • Comprehensive error handling with meaningful messages

⚙️ Advanced Parameter Management:

  • Type-safe parameters specific to each format
  • Flexible configuration supporting complex scenarios
  • Sensible defaults minimizing boilerplate
  • Support for Spring's ${...} property placeholders
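
To illustrate what placeholder resolution means for an annotation value such as `file = "${data.dir}/sales.csv"`, here is a small standalone sketch. It is not Spring's implementation and not the PR's code: `resolvePlaceholders` below is a hypothetical helper backed by a plain map, whereas a Spring application would normally delegate to the `Environment` (for example `resolveRequiredPlaceholders`).

```kotlin
// Matches ${key} and ${key:default}; group 1 is the key, group 2 the optional default.
private val placeholder = Regex("""\$\{([^}:]+)(?::([^}]*))?\}""")

// Hypothetical helper: replaces every placeholder using the given properties,
// falls back to the inline default, and fails loudly if neither is available.
fun resolvePlaceholders(value: String, properties: Map<String, String>): String =
    placeholder.replace(value) { match ->
        val key = match.groupValues[1]
        val default = match.groups[2]?.value
        properties[key]
            ?: default
            ?: error("Could not resolve placeholder '$key' in \"$value\"")
    }
```

For example, with `mapOf("data.dir" to "/tmp/data")` the value `${data.dir}/sales.csv` resolves to `/tmp/data/sales.csv`, and `${missing:fallback}.csv` resolves to `fallback.csv`.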

📊 JDBC Integration:
Uses existing dataframe-jdbc module capabilities:

    @JdbcDataSource(
        connectionBean = "dataSource",
        query = "SELECT * FROM orders WHERE status = 'COMPLETED'",
        limit = 1000
    )
    lateinit var orders: DataFrame<*>

The implementation provides exactly what you requested - a unified approach handling multiple formats with optimal parameter management, closely following Spring Data patterns while maintaining the power and flexibility of the DataFrame API.

Commit: 66b692e

Copilot AI changed the title from "Implement Spring DI-style DataFrame initialization with @DataSource annotation" to "Implement comprehensive multi-format DataFrame Spring integration with Spring Data patterns" on Aug 22, 2025

  • …ame reading methods
    This commit includes a refactor to use more specific annotations (`CsvDataSource` and others), upgrades DataFrame Spring to Java 17, adjusts build scripts to handle Java 17 modules, and replaces deprecated `readJdbc` with new `readSqlQuery` and `readSqlTable` methods in DataFrame processing.
  • …ocessing logic
    This commit introduces a new detailed Spring-style integration example (`SpringIntegrationExample2.kt`), showcasing advanced usage patterns and GitHub issue resolution (#1321). Updates also include improvements in DataFrame field injection logic to handle enhanced annotation processing, robust property checks, and better fallback mechanisms for ApplicationContext. Additionally, minor tweaks enable broader compatibility and extensibility within the Spring ecosystem.
  • …and added new Spring integration demos
    This commit deprecates the legacy `@DataSource` annotation in favor of the more specific `@CsvDataSource`. It removes outdated example files and introduces new detailed Spring integration examples demonstrating annotation-based DataFrame initialization, including `CsvDataSource_with_Application_Context` and `CsvDataSource_with_Configuration`. Adjustments also include sample data reorganization and updates to tests for compatibility.
  • …ration
    Introduce a comprehensive Spring Boot example (`springboot-dataframe-web`) showcasing annotated CSV-based data source initialization, web controllers, Thymeleaf templates, and sample data files. The example includes customer and sales reports with sorting and filtering functionalities, leveraging DataFrame operations and Spring Boot features.
  • … configuration, and sample data
    Added Spring Boot Actuator dependency to `springboot-dataframe-web`, introduced `DataFrameConfiguration` for better DataFrame post-processing, and updated CSV data sources for customers and sales. Adjusted annotations, enhanced lifecycle handling in `DataFramePostProcessor`, and added visual documentation and sample data files. Updated build scripts for Java 17 compatibility.
  • # Conflicts:
    #   settings.gradle.kts

Reviewers: zaleslaw (awaiting requested review)

Development

Successfully merging this pull request may close these issues.

Explore the possiblity to define schema and constunct object in Spring (DI framework) style
