In optimizers like Adam and SGD, self.cache was shared among all layers, so the cache keys were simply W and b. As a result, when different layers updated their parameters, they all read and wrote the same cache entries. Because the layers' parameters have different shapes, the values cached for one layer did not match the gradients of another, which caused shape mismatches.

For instance, the cache should have unique keys like layer1-W, layer1-b, layer2-W, etc., but instead all parameters were using the same keys, resulting in conflicts during backpropagation.
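The collision described above can be reproduced with a minimal sketch. The `MomentumSGD` class, its `update` signature, and the layer shapes are hypothetical stand-ins for illustration, not the repository's actual optimizer code; only the idea of a cache keyed by `W` and `b` comes from the description.

```python
import numpy as np


class MomentumSGD:
    """Hypothetical optimizer whose cache is keyed only by the name passed in."""

    def __init__(self, lr=0.1, beta=0.9):
        self.lr = lr
        self.beta = beta
        self.cache = {}  # maps key -> running average of gradients

    def update(self, key, param, grad):
        # Create the cache entry on first use, shaped like the gradient.
        if key not in self.cache:
            self.cache[key] = np.zeros_like(grad)
        # If another layer already stored a differently shaped entry under
        # this key, the arithmetic below raises a ValueError.
        self.cache[key] = self.beta * self.cache[key] + (1 - self.beta) * grad
        return param - self.lr * self.cache[key]


# Shared optimizer, generic keys: layer 1 stores a (4, 3) entry under "W",
# then layer 2 tries to combine it with a (3, 2) gradient.
shared = MomentumSGD()
shared.update("W", np.ones((4, 3)), np.ones((4, 3)))
try:
    shared.update("W", np.ones((3, 2)), np.ones((3, 2)))
    collision_error = None
except ValueError as exc:
    collision_error = exc

# Same shared optimizer design, but with keys that are unique per layer.
namespaced = MomentumSGD()
namespaced.update("layer1-W", np.ones((4, 3)), np.ones((4, 3)))
namespaced.update("layer2-W", np.ones((3, 2)), np.ones((3, 2)))
```

With the generic key, the second call fails because the cached (4, 3) array cannot be combined with a (3, 2) gradient. With per-layer keys, both entries coexist in the same cache.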

Solution

The solution is to ensure that each layer maintains its own cache. Each layer now creates a deep copy of the optimizer during its initialization and uses that copy for its own updates. This way, each layer manages its cache independently.
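A minimal sketch of the per-layer copy, assuming a hypothetical `Dense` layer and `MomentumSGD` optimizer (neither name nor signature is taken from the repository). The only part grounded in the description is the call to `copy.deepcopy` at layer initialization.

```python
import copy

import numpy as np


class MomentumSGD:
    """Hypothetical optimizer with a cache keyed by parameter name."""

    def __init__(self, lr=0.1, beta=0.9):
        self.lr = lr
        self.beta = beta
        self.cache = {}

    def update(self, key, param, grad):
        if key not in self.cache:
            self.cache[key] = np.zeros_like(grad)
        self.cache[key] = self.beta * self.cache[key] + (1 - self.beta) * grad
        return param - self.lr * self.cache[key]


class Dense:
    """Hypothetical layer that keeps a private copy of the optimizer."""

    def __init__(self, n_in, n_out, optimizer):
        self.W = np.zeros((n_in, n_out))
        self.b = np.zeros(n_out)
        # Deep copy: this layer gets its own cache dictionary, so the
        # generic keys "W" and "b" can no longer collide across layers.
        self.optimizer = copy.deepcopy(optimizer)

    def apply_gradients(self, dW, db):
        self.W = self.optimizer.update("W", self.W, dW)
        self.b = self.optimizer.update("b", self.b, db)


template = MomentumSGD()
layer1 = Dense(4, 3, template)
layer2 = Dense(3, 2, template)

# Both layers use the keys "W" and "b", each in its own cache.
layer1.apply_gradients(np.ones((4, 3)), np.ones(3))
layer2.apply_gradients(np.ones((3, 2)), np.ones(2))
```

One consequence of this design, under the assumptions above: the copies are independent of the template, so changing a hyperparameter on `template` after the layers are built would not affect the optimizers the layers already hold.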

All Submissions

Is the code you are submitting your own work?
Have you followed the contributing guidelines?
Have you checked to ensure there aren't other open Pull Requests for the same update/change?

Changes to Existing Models

Have you added an explanation of what your changes do and why you'd like us to include them?
Have you written new tests for your changes, as applicable?
Have you successfully run tests with your changes locally?

Fix shape mismatch error during backpropagation in MLP optimizer

eb6fefc

Fix shape mismatch error during backpropagation in MLP optimizer #96

achal-khanna commented Oct 10, 2024
