Beginner's Must-Read: A Complete Guide to Environment Setup and Dependency Installation for mobilenetv2_050.lamb_in1k
2026/6/13 22:39:02
y = w x + b
x = 2, y_true = 4
w = 1, b = 0
lr = 0.1
y_hat = w x + b = 1 * 2 + 0 = 2
L = 1/2 * (2 - 4)^2 = 1/2 * 4 = 2
Now let's ask a question:
If I make w just slightly larger, does the loss go up or down?
(w x - y) = (1 * 2 - 4) = -2
x = 2
So, substituting into formula 2, grad = (w x - y) * x:
grad = -2 * 2 = -4
The gradient is negative,
which means:
👉 Increasing w makes the loss go down.
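The question "does the loss go up or down if w grows a little?" can also be checked numerically. The sketch below is only an illustration: the helper names `loss`, `analytic_grad` and `numeric_grad` are made up for this example, and the default values are the ones used in the worked numbers above (x = 2, y_true = 4, b = 0).

```python
def loss(w, x=2.0, y_true=4.0, b=0.0):
    """Squared-error loss L = 1/2 * (w*x + b - y_true)^2 for a single sample."""
    return 0.5 * (w * x + b - y_true) ** 2


def analytic_grad(w, x=2.0, y_true=4.0, b=0.0):
    """dL/dw = (w*x + b - y_true) * x, the formula used in the text."""
    return (w * x + b - y_true) * x


def numeric_grad(w, eps=1e-6):
    """Central finite difference: nudge w both ways and measure the slope."""
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)


w = 1.0
print("loss at w:        ", loss(w))
print("loss at w + 0.01: ", loss(w + 0.01))   # smaller than the line above
print("analytic gradient:", analytic_grad(w))
print("numeric gradient: ", numeric_grad(w))
```

Both gradients should agree closely, and the nudged loss is lower than the original one, which is what a negative gradient predicts.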
w_new = w - lr * grad
w_new = 1 - 0.1 * (-4) = 1 + 0.4 = 1.4
y_hat = 1.4 * 2 = 2.8
L = 1/2 * (2.8 - 4)^2 = 1/2 * 1.44 = 0.72
2 → 0.72: the loss really did go down.
w = 1.4
(w x - y) = (1.4 * 2 - 4) = -1.2
grad = (w x - y) * x = -1.2 * 2 = -2.4
w_new = 1.4 - 0.1 * (-2.4) = 1.64
L = 1/2 * (1.64 * 2 - 4)^2 = 1/2 * 0.5184 ≈ 0.26
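The two hand-computed updates can be replayed in a small loop. This is a minimal sketch in plain Python, not library code: the function name `gradient_descent` is hypothetical, b is held at 0 as in the example, and only w is updated.

```python
def gradient_descent(w=1.0, x=2.0, y_true=4.0, lr=0.1, steps=2):
    """Repeat the update w <- w - lr * grad and record (w, loss) after each step."""
    history = []
    for _ in range(steps):
        grad = (w * x - y_true) * x          # gradient of 1/2 * (w*x - y_true)^2
        w = w - lr * grad                    # step against the gradient
        loss = 0.5 * (w * x - y_true) ** 2   # loss at the updated w
        history.append((w, loss))
    return history


for step, (w, loss) in enumerate(gradient_descent(steps=5), start=1):
    print(f"step {step}: w = {w:.4f}, loss = {loss:.4f}")
```

Running more steps shows w creeping toward 2, the value at which w * x equals y_true and the loss reaches 0.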
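Sign of the gradient: tells you which direction to move
Magnitude of the gradient: tells you how far to move
Learning rate: controls the step size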
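To see the role of the learning rate, the same toy problem can be run with several values of lr. The sketch below is illustrative only: the helper `run` and the learning rates 0.01 and 0.6 are assumptions chosen for the demonstration, while 0.1 is the value from the worked example.

```python
def run(lr, w=1.0, x=2.0, y_true=4.0, steps=5):
    """Return the list of w values, starting with the initial w."""
    ws = [w]
    for _ in range(steps):
        grad = (w * x - y_true) * x   # same gradient formula as in the text
        w = w - lr * grad             # a larger lr means a larger step
        ws.append(w)
    return ws


for lr in (0.01, 0.1, 0.6):
    print(f"lr = {lr}:", [round(w, 3) for w in run(lr)])
```

With a small lr the steps are tiny and w approaches 2 slowly; with lr = 0.1 it gets close within a few steps; with lr = 0.6 every step overshoots 2 by more than the previous one, so w moves further away instead of converging.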
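# 1. Compute the loss - forward pass
loss = criterion(model(inputs), labels)  # difference between predictions and ground truth
# 2. Compute the gradients - backward pass
loss.backward()  # automatically computes the gradient of every parameter and stores it in .grad
# 3. Update the weights - optimizer step
optimizer.step()  # updates the model parameters using the gradients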
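The three steps above can be tied back to the hand calculation. The following is a sketch under stated assumptions rather than the author's setup: a bare tensor `w` stands in for `model`, a hand-written squared-error expression stands in for `criterion`, b is fixed at 0, and plain `torch.optim.SGD` is assumed as the optimizer. It also calls `optimizer.zero_grad()` first, which a real training loop needs because PyTorch accumulates gradients across `backward()` calls.

```python
import torch

x = torch.tensor(2.0)
y_true = torch.tensor(4.0)
w = torch.tensor(1.0, requires_grad=True)   # the only trainable parameter; b is fixed at 0

optimizer = torch.optim.SGD([w], lr=0.1)    # plain SGD: w <- w - lr * w.grad

optimizer.zero_grad()                       # clear any gradient left from a previous step
loss = 0.5 * (w * x - y_true) ** 2          # 1. forward pass: compute the loss
loss.backward()                             # 2. backward pass: fills w.grad
grad_value = w.grad.item()                  # read the gradient before stepping
optimizer.step()                            # 3. update: applies the SGD rule to w
w_value = w.item()

print("loss:", loss.item())
print("grad:", grad_value)
print("w after one step:", w_value)
```

Up to float32 rounding, the printed values match the manual numbers: a loss of 2, a gradient of -4, and an updated w of 1.4.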