Closed
Labels
oncall: distributed
Description
🐛 Describe the bug
pytorch/torch/distributed/checkpoint/state_dict.py, lines 611 to 614 in 585dbfa:

```python
for param_group in optim.param_groups:
    if "lr" in param_group:
        lrs.append(param_group["lr"])
        param_group["lr"] = 0.0
```
When the original LR is a tensor, _init_optim_state() should preserve its tensor-ness. This probably doesn't matter for built-in PyTorch optimizers, but torchao's low-bit optimizers expect LR to be a tensor and raise an error otherwise. Related to pytorch/ao#1189
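To illustrate, here is a minimal sketch of the problem. The param group below is hypothetical and just stands in for a torchao low-bit optimizer, which keeps "lr" as a tensor:

```python
import torch

# Hypothetical param group standing in for a torchao low-bit optimizer,
# which stores "lr" as a tensor in its param_groups.
param_groups = [{"params": [], "lr": torch.tensor(1e-3)}]

# What the current _init_optim_state() loop effectively does:
lrs = []
for param_group in param_groups:
    if "lr" in param_group:
        lrs.append(param_group["lr"])
        param_group["lr"] = 0.0  # the tensor LR is replaced by a Python float

print(type(param_groups[0]["lr"]))  # <class 'float'>, tensor-ness is lost
```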
I propose to change L614 to

```python
param_group["lr"] = torch.tensor(0.0) if isinstance(param_group["lr"], torch.Tensor) else 0.0
```
cc: @awgu
Versions
2.6.0.dev20241029
cc @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o @ezyang @chauhang @penguinwu