Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Sample Weights introduction#652

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
filippozacchei wants to merge3 commits intodynamicslab:master
base:master
Choose a base branch
Loading
fromfilippozacchei:master

Conversation

@filippozacchei
Copy link

Introduction of Weighted Samples inSINDy.fit()

This pull request introduces support forweighted samples in both standard and weak formulations of SINDy.
It enables users to provide per-sample or per-trajectory weights that influence model fitting, improving control over data importance and uncertainty propagation.

Main Changes

_core._expand_sample_weights

  • Introduces a unified weight-expansion utility to validate and broadcast user-provided sample weights.
  • Instandard SINDy, weights are expanded to match the total number of samples (n_samples × 1).
  • Inweak SINDy, each trajectory’s scalar weight is repeated according to the number of weak test functions (`(
  • (n_trajectories \cdot K) × 1`).
  • Strict validation ensures that:
    • Weak mode acceptsexactly one scalar weight per trajectory.
    • Standard mode allowseither scalars or per-sample arrays.

weighted_weak_pde_library

  • ImplementsGeneralized Least Squares (GLS) whitening for weak-form regression.

  • When non-uniform spatiotemporal weights are provided, the feature matrix (Θ) and right-hand side (V) are prewhitened using theCholesky factor of the covariance matrix:$\mathrm{Cov}[V] = V' \Sigma (V')^\top$

    where:

    • Σ represents the pointwise sample weights.
    • V′ are the weak integration matrices for the right-hand side.

Automated Tests

  • Added two new test suites to validate the weighting behavior.

@Jacob-Stevens-HaasJacob-Stevens-Haas linked an issueOct 14, 2025 that may beclosed by this pull request
@Jacob-Stevens-Haas
Copy link
Member

Jacob-Stevens-Haas commentedOct 14, 2025
edited
Loading

Hey, thanks for this PR! Some quick thoughts:

  • Weak form is going through a big refactor, and the WeakPDELibrary will be deprecated in the next release. In it's place,WeakSINDy will be a subclass of_BaseSINDy. This will allow removing a lot ofif weak: that is sprinkled throughout the package, as well as enable prediction and simulation of weak models and pave the way for people to implement different basis functions. IDK how you want to proceed - wait to see whatweak branch comes up with in a couple weeks, or just continue with strict "1-weight-per-trajectory" for now so it's the same for weak and regular?
  • On a related note, why is GLS whitening desired?
  • I know I steered you away fromSampleConcatter before, but maybe calls to_expand_sample_weights should go there, rather than infit() andscore. It might be able to reduce redundant code. It helps if the logic to handle flattening axes only occurs in one place, since otherwise, developers need to keep multiple axis layout conventions in mind.
  • Is STLSQ the only optimizer that supports sample weight?
  • Add type annotations wherever you touch a function signature

@filippozacchei
Copy link
Author

Thanks for your comment.

I would say then to just use sample weights for only standard Sindy then. The GLS whitening is needed for Weak SINDy if we have samples of the same trajectory of different importance, but I would avoid for the time being.

Will look into SampleConcatter and update!

Copy link
Member

@Jacob-Stevens-HaasJacob-Stevens-Haas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sorry for the long delay! I can finally begin to prioritize this PR. As I mentioned, it's best to leave Weak for another day, and to put the sample array alignment in the same place as trajectories are already being flattened and combined (SampleConcatter). Other than that, there's a few tweaks to_expand... and to the tests

fromsklearn.pipelineimportPipeline
fromsklearn.utils.validationimportcheck_is_fitted

set_config(enable_metadata_routing=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Leave this up to the user. scikit-learn documentation explains that if they're using a pipeline or composite transform, they'll need this.

Comment on lines +299 to +301
self.set_fit_request(sample_weight=True)
self.set_score_request(sample_weight=True)
self.optimizer.set_fit_request(sample_weight=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Again, leave this up to the user

x_dot=None,
u=None,
feature_names:Optional[list[str]]=None,
sample_weight=None,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Provide static type

Names for the input features (e.g. :code:`['x', 'y', 'z']`).
If None, will use :code:`['x0', 'x1', ...]`.
sample_weight : float or array-like of shape (n_samples,), optional

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Why allow a single float?

Also, shouldn't this be (*n_spatial, n_time)? Or a list of those arrays, for when multiple trajectories are set?

x_dot=concat_sample_axis(x_dot)
self.model=Pipeline(steps)
self.model.fit(x,x_dot)
self.model.fit(x,x_dot,sample_weight=sample_weight)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Let's remove the Pipeline and directly call the components. That way, we don't need to rely on metadata routing internally


# --- Fit without explicit sample weights ---
sindy.fit(X_trajs,t=0.1,x_dot=Xdot_trajs)
coef_unweighted=np.copy(sindy.model.named_steps["model"].coef_)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

just usesindy.optimizer.coef_

# --- Fit with sample weights to emphasize trajectory 3 (different system) ---
sample_weight= [np.ones(len(x_a)),np.ones(len(x_a)),10*np.ones(len(x_b))]
sindy.fit(X_trajs,t=0.1,x_dot=Xdot_trajs,sample_weight=sample_weight)
coef_weighted=np.copy(sindy.model.named_steps["model"].coef_)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

same as above

Comment on lines +58 to +64
# 3. Weighted model should bias toward system B coefficients
# since trajectory B had much higher weight
# True systems differ by factor of 2
ratio=np.mean(np.abs(coef_weighted/coef_unweighted))
assert (
ratio>1.05
),"Weighted coefficients should reflect stronger influence from system B"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You don't need both an error message and description (prefer the error message). Also, makeratio > 1.5, since it should be substantially towards system B

Suggested change
# 3. Weighted model should bias toward system B coefficients
# since trajectory B had much higher weight
# True systems differ by factor of 2
ratio=np.mean(np.abs(coef_weighted/coef_unweighted))
assert (
ratio>1.05
),"Weighted coefficients should reflect stronger influence from system B"
# True systems differ by factor of 2
ratio=np.mean(np.abs(coef_weighted/coef_unweighted))
fail_msg="Weighted coefficients should reflect stronger influence from system B"
assertratio>1.5,fail_msg

Comment on lines +51 to +56
# --- Assertions ---
# 1. Shapes are consistent
assertcoef_weighted.shape==coef_unweighted.shape

# 2. The coefficients must differ when weighting is applied
assertnotnp.allclose(coef_weighted,coef_unweighted)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

These are implied by the following tests. That's justified if these are more specific. However, shape errors will show explicitly in the below code. Additionally,not np.allcolse is a less-specific version of the next assertion

Comment on lines +70 to +72
assertnp.linalg.norm(coef_weighted-2*coef_unweighted)<np.linalg.norm(
coef_unweighted
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't see how this is correct? It would make more sense to explicitly formcoef_a andcoef_b, then check the equations that you added as comments.

@Jacob-Stevens-Haas
Copy link
Member

As a heads up, we're changing the default branch frommaster tomain, so you'll need to change the target of the PR tomain

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@Jacob-Stevens-HaasJacob-Stevens-HaasJacob-Stevens-Haas requested changes

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

Introduce sample_weights inside fit function

2 participants

@filippozacchei@Jacob-Stevens-Haas

[8]ページ先頭

©2009-2025 Movatter.jp