Solving pickle error when saving CyclicLR state_dict #110931

ancestor-mithril · 2023-10-10T07:24:56Z

How to reproduce:

import os
import tempfile

import torch
from torch import nn
from torch.optim import SGD
from torch.optim.lr_scheduler import CyclicLR


model = nn.Linear(100, 100)
opt = SGD(model.parameters(), lr=1.)
scheduler = CyclicLR(opt, base_lr=0.1, max_lr=0.2, scale_fn=lambda x: 0.99)

tmp = tempfile.NamedTemporaryFile(delete=False)
try:
    torch.save(scheduler.state_dict(), tmp.name)
    scheduler.load_state_dict(torch.load(tmp.name))
finally:
    tmp.close()
    os.unlink(tmp.name)

Error:

_pickle.PicklingError: Can't pickle <function <lambda> at 0x000001A51DF67600>: attribute lookup <lambda> on __main__ failed

Fix:

Saving scale_fn to the state dict only if it is a callable object and not if it is a function or lambda.

pytorch-bot · 2023-10-10T07:24:59Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110931

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8f4d1a1 with merge base c77a4a4 ():
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

janeyx99

Thanks for the fix, could you add a test case to ensure this behavior won't regress in test_lrschedulers.py?

Also, lint! Oh it looks like they've all been canceled. Maybe repushing will fix it.

janeyx99 · 2023-10-13T20:30:48Z

test/optim/test_lrscheduler.py

Should this be None in both situations? I would expect it to not be none here...

Yes, this should be None in both situations, since scale_fn is a lamba function and not a callable object with an internal state that can be pickled.

Should there be a third case where a custom callable is passed in then?

I've added the third case now.

janeyx99

just one comment!

janeyx99

Nice! Sorry for the delay--I have just gotten back from traveling. Looks good to merge once CI is green (lint seems to be failing)

janeyx99 · 2023-11-15T20:50:01Z

@pytorchbot merge

pytorchmergebot · 2023-11-15T20:51:47Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status
here

pytorchmergebot · 2023-11-15T23:15:21Z

Merge failed

Reason: Command git -C /home/runner/work/pytorch/pytorch fetch origin returned non-zero exit code 1

From https://github.com/pytorch/pytorch
   2164598c405..5d170fce290  main                    -> origin/main
   dbf98c1f602..f4512eaae57  csl/add_gitignore       -> origin/csl/add_gitignore
 * [new branch]              csl/consistent_pytest_cache -> origin/csl/consistent_pytest_cache
 * [new branch]              csl/coverage            -> origin/csl/coverage
 * [new branch]              csl/onnx_no_retry       -> origin/csl/onnx_no_retry
 + fe705c0d795...a81999adc7c csl/test3               -> origin/csl/test3  (forced update)
 * [new branch]              dcp_stateful            -> origin/dcp_stateful
   44ba4e09e82..a09b32b5468  gh/BowenBao/316/base    -> origin/gh/BowenBao/316/base
   6ae6bbbb577..c5f74dd8a83  gh/BowenBao/316/head    -> origin/gh/BowenBao/316/head
 + ca9339d5710...1cde29c9c47 gh/BowenBao/316/orig    -> origin/gh/BowenBao/316/orig  (forced update)
   03a50c87cb7..17f90c18391  gh/BowenBao/317/base    -> origin/gh/BowenBao/317/base
   efc12322824..5ef6e72751d  gh/BowenBao/317/head    -> origin/gh/BowenBao/317/head
 + dc7e6447956...8dfc996f0f5 gh/BowenBao/317/orig    -> origin/gh/BowenBao/317/orig  (forced update)
   efc12322824..1cbce8845f8  gh/BowenBao/318/base    -> origin/gh/BowenBao/318/base
   72ce3e35bfa..33a679b6d55  gh/BowenBao/318/head    -> origin/gh/BowenBao/318/head
 + ede8c9b937d...e6d11e94edf gh/BowenBao/318/orig    -> origin/gh/BowenBao/318/orig  (forced update)
   59f89ee2bcc..1bd184228ba  gh/andrewlee302/2/base  -> origin/gh/andrewlee302/2/base
   a5d2e1f7bbc..787b2b43edd  gh/andrewlee302/2/head  -> origin/gh/andrewlee302/2/head
 + 5efd9596d56...f4f56e87268 gh/andrewlee302/2/orig  -> origin/gh/andrewlee302/2/orig  (forced update)
   3324cedf683..885abd9be93  gh/andrewor14/43/base   -> origin/gh/andrewor14/43/base
   b23eb432edb..89df0e488a9  gh/andrewor14/43/head   -> origin/gh/andrewor14/43/head
 + 43877f7b15d...0eeb1a9caa0 gh/andrewor14/43/orig   -> origin/gh/andrewor14/43/orig  (forced update)
   5dcc2908b70..add4560a427  gh/andrewor14/44/base   -> origin/gh/andrewor14/44/base
   91a2f3c4684..dfa1118c902  gh/andrewor14/44/head   -> origin/gh/andrewor14/44/head
 + 7e205fe1a06...81b631ebe21 gh/andrewor14/44/orig   -> origin/gh/andrewor14/44/orig  (forced update)
   46511a07207..e2016134077  gh/desertfire/267/head  -> origin/gh/desertfire/267/head
 + b72815cb86f...e329d9f83a9 gh/desertfire/267/orig  -> origin/gh/desertfire/267/orig  (forced update)
   e5053b71791..1fe60f729f4  gh/drisspg/11/base      -> origin/gh/drisspg/11/base
   b1660543bd1..613b808b3ed  gh/drisspg/11/head      -> origin/gh/drisspg/11/head
 + 793567b14d7...365e37ff8fb gh/drisspg/11/orig      -> origin/gh/drisspg/11/orig  (forced update)
   b202abb9cc8..77eee3483aa  gh/eellison/543/base    -> origin/gh/eellison/543/base
   a63e40e8c17..ddec675c04a  gh/eellison/543/head    -> origin/gh/eellison/543/head
 + 6af8f59df85...987a030081d gh/eellison/543/orig    -> origin/gh/eellison/543/orig  (forced update)
   a093a6492cc..fce8796c5d3  gh/ezyang/2443/head     -> origin/gh/ezyang/2443/head
 + bef6160b920...c86716323d2 gh/ezyang/2443/orig     -> origin/gh/ezyang/2443/orig  (forced update)
 * [new branch]              gh/ezyang/2447/base     -> origin/gh/ezyang/2447/base
 * [new branch]              gh/ezyang/2447/head     -> origin/gh/ezyang/2447/head
 * [new branch]              gh/ezyang/2447/orig     -> origin/gh/ezyang/2447/orig
   ee484b28275..05b51b3945c  gh/fduwjj/110/base      -> origin/gh/fduwjj/110/base
   590f4537b39..47662a926e4  gh/fduwjj/110/head      -> origin/gh/fduwjj/110/head
 + db504c041ec...4da29ca0aa9 gh/fduwjj/110/orig      -> origin/gh/fduwjj/110/orig  (forced update)
   404dc5d0119..5e31a443151  gh/fduwjj/113/base      -> origin/gh/fduwjj/113/base
   7547f693ec6..f1314660154  gh/fduwjj/113/head      -> origin/gh/fduwjj/113/head
 + b8aa5342d69...55a8fcb7944 gh/fduwjj/113/orig      -> origin/gh/fduwjj/113/orig  (forced update)
   d63789656ae..0c112f844b0  gh/jbschlosser/95/base  -> origin/gh/jbschlosser/95/base
   e5a183ac672..53391651a77  gh/jbschlosser/95/head  -> origin/gh/jbschlosser/95/head
 + e5a13c3ee35...e07dc6c2064 gh/jbschlosser/95/orig  -> origin/gh/jbschlosser/95/orig  (forced update)
   8320f8ca4e6..965e700b54f  gh/jbschlosser/96/base  -> origin/gh/jbschlosser/96/base
   750e5e711dc..577ebf22ff6  gh/jbschlosser/96/head  -> origin/gh/jbschlosser/96/head
 + bfb52cf9dfa...ddaff7b17d3 gh/jbschlosser/96/orig  -> origin/gh/jbschlosser/96/orig  (forced update)
   e5513373242..6e682e1d399  gh/muchulee8/14/base    -> origin/gh/muchulee8/14/base
   d4adce6f54c..a765ea9b8b3  gh/muchulee8/14/head    -> origin/gh/muchulee8/14/head
 + 978d65ffcfd...dc653d65044 gh/muchulee8/14/orig    -> origin/gh/muchulee8/14/orig  (forced update)
 * [new branch]              gh/pearu/134/base       -> origin/gh/pearu/134/base
 * [new branch]              gh/pearu/134/head       -> origin/gh/pearu/134/head
 * [new branch]              gh/pearu/134/orig       -> origin/gh/pearu/134/orig
   1760519357f..254dddf42cb  gh/soulitzer/251/head   -> origin/gh/soulitzer/251/head
 + a23638e8de8...67cc2cf2591 gh/soulitzer/251/orig   -> origin/gh/soulitzer/251/orig  (forced update)
   060f3c7b2dd..76a199c295c  gh/tugsbayasgalan/167/head -> origin/gh/tugsbayasgalan/167/head
 + bedd82dec84...0f655ce4a0b gh/tugsbayasgalan/167/orig -> origin/gh/tugsbayasgalan/167/orig  (forced update)
 * [new branch]              gh/wconstab/216/base    -> origin/gh/wconstab/216/base
 * [new branch]              gh/wconstab/216/head    -> origin/gh/wconstab/216/head
 * [new branch]              gh/ydwu4/42/base        -> origin/gh/ydwu4/42/base
 * [new branch]              gh/ydwu4/42/head        -> origin/gh/ydwu4/42/head
 * [new branch]              gh/ydwu4/42/orig        -> origin/gh/ydwu4/42/orig
error: cannot lock ref 'refs/remotes/origin/hoy/autotune/lanucher': 'refs/remotes/origin/hoy' exists; cannot create 'refs/remotes/origin/hoy/autotune/lanucher'
 ! [new branch]              hoy/autotune/lanucher   -> origin/hoy/autotune/lanucher  (unable to update local ref)
   f9f3cf34ca3..96c8b797d1b  mlazos/mutable          -> origin/mlazos/mutable
 * [new branch]              remove_global_ns        -> origin/remove_global_ns
 + 3963150e74c...a8b33d711bd update_submodule_FBGEMM -> origin/update_submodule_FBGEMM  (forced update)
 + 36317812df4...d43ed6f2b00 update_submodule_kineto -> origin/update_submodule_kineto  (forced update)
   7f9fafed532..cc11c0d11bf  viable/strict           -> origin/viable/strict
 * [new branch]              voz/serde               -> origin/voz/serde
 * [new tag]                 ciflow/inductor/113799  -> ciflow/inductor/113799
 * [new tag]                 ciflow/inductor/113802  -> ciflow/inductor/113802
 * [new tag]                 ciflow/inductor/113814  -> ciflow/inductor/113814
 * [new tag]                 ciflow/inductor/113815  -> ciflow/inductor/113815
 * [new tag]                 ciflow/trunk/113697     -> ciflow/trunk/113697
 * [new tag]                 ciflow/trunk/113774     -> ciflow/trunk/113774
 * [new tag]                 ciflow/trunk/113813     -> ciflow/trunk/113813
error: some local refs could not be updated; try running
 'git remote prune origin' to remove any old, conflicting branches

Details for Dev Infra team

Raised by workflow job

janeyx99 · 2023-11-21T22:34:16Z

Argh, I am sorry. Could you locally fix the merge conflicts and kick off a @pytorchbot merge on the rebased copy?

ancestor-mithril · 2023-11-22T08:16:34Z

@pytorchbot merge

pytorchmergebot · 2023-11-22T08:19:31Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status
here

ancestor-mithril requested review from albanD and janeyx99 as code owners October 10, 2023 07:24

pytorch-bot bot added the release notes: optim label Oct 10, 2023

pytorchbot added the open source label Oct 10, 2023

ancestor-mithril force-pushed the main branch from 090cb14 to 841dd16 Compare October 10, 2023 07:34

janeyx99 reviewed Oct 10, 2023

View reviewed changes

janeyx99 added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 10, 2023

ancestor-mithril requested a review from janeyx99 October 11, 2023 10:14

janeyx99 reviewed Oct 13, 2023

View reviewed changes

ancestor-mithril requested a review from janeyx99 November 7, 2023 09:40

janeyx99 approved these changes Nov 15, 2023

View reviewed changes

ancestor-mithril force-pushed the main branch from d75c88c to 1c5b043 Compare November 15, 2023 20:26

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 15, 2023

pytorchmergebot added the merging label Nov 15, 2023

pytorchmergebot removed the merging label Nov 15, 2023

ancestor-mithril added 3 commits November 22, 2023 10:11

Solving pickle error when saving CyclicLR state_dict

34a2cc0

added tests for pickling CycleLR

85ce732

Added test for callable class

8f4d1a1

ancestor-mithril force-pushed the main branch from 1c5b043 to 8f4d1a1 Compare November 22, 2023 08:16

pytorchmergebot added the merging label Nov 22, 2023

pytorchmergebot added Merged and removed merging labels Nov 22, 2023

pytorchmergebot closed this in 2b72543 Nov 22, 2023

Solving pickle error when saving CyclicLR state_dict #110931

Solving pickle error when saving CyclicLR state_dict #110931

Uh oh!

Conversation

ancestor-mithril commented Oct 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

How to reproduce:

Fix:

Uh oh!

pytorch-bot bot commented Oct 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110931

✅ No Failures

Uh oh!

janeyx99 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

janeyx99 Oct 13, 2023

Choose a reason for hiding this comment

Uh oh!

ancestor-mithril Oct 14, 2023

Choose a reason for hiding this comment

Uh oh!

janeyx99 Oct 19, 2023

Choose a reason for hiding this comment

Uh oh!

ancestor-mithril Oct 23, 2023

Choose a reason for hiding this comment

Uh oh!

janeyx99 left a comment

Choose a reason for hiding this comment

Uh oh!

janeyx99 left a comment

Choose a reason for hiding this comment

Uh oh!

janeyx99 commented Nov 15, 2023

Uh oh!

pytorchmergebot commented Nov 15, 2023

Merge started

Uh oh!

pytorchmergebot commented Nov 15, 2023

Merge failed

Uh oh!

janeyx99 commented Nov 21, 2023

Uh oh!

ancestor-mithril commented Nov 22, 2023

Uh oh!

pytorchmergebot commented Nov 22, 2023

Merge started

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ancestor-mithril commented Oct 10, 2023 •

edited

Loading

pytorch-bot bot commented Oct 10, 2023 •

edited

Loading

janeyx99 left a comment •

edited

Loading