2026/6/6 8:31:39
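In machine learning evaluation, Micro and Macro are two fundamentally different averaging strategies: they differ in how TP (true positives), FP (false positives), and FN (false negatives) are counted and aggregated. Understanding the difference is essential for interpreting model performance correctly.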
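```python
# Micro: pool the counts over all samples/instances.
# `samples` is assumed to be an iterable of (pred_set, label_set) pairs;
# `cal_micro` is the per-sample helper used in this article (definition not shown).
total_tp = total_fp = total_fn = 0
for pred_set, label_set in samples:
    # Use cal_micro to get TP/FP/FN for the current sample
    tp, fp, fn = cal_micro(pred_set, label_set)
    total_tp += tp
    total_fp += fp
    total_fn += fn

# Compute the metrics once, from the global totals
precision = total_tp / (total_tp + total_fp) if (total_tp + total_fp) > 0 else 0
recall = total_tp / (total_tp + total_fn) if (total_tp + total_fn) > 0 else 0
f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0
```

```python
# Macro: compute the metrics separately for each class.
# `classes` is assumed to be the collection of all class names.
class_metrics = {}
for cls in classes:
    class_tp = class_fp = class_fn = 0
    for pred_set, label_set in samples:
        # Count TP/FP/FN with respect to the current class only
        if cls in pred_set and cls in label_set:
            class_tp += 1
        elif cls in pred_set:
            class_fp += 1
        elif cls in label_set:
            class_fn += 1

    # Metrics for the current class
    class_precision = class_tp / (class_tp + class_fp) if (class_tp + class_fp) > 0 else 0
    class_recall = class_tp / (class_tp + class_fn) if (class_tp + class_fn) > 0 else 0
    class_f1 = (
        2 * (class_precision * class_recall) / (class_precision + class_recall)
        if (class_precision + class_recall) > 0
        else 0
    )
    class_metrics[cls] = (class_precision, class_recall, class_f1)

# Average the per-class metrics with equal weight
macro_precision = sum(m[0] for m in class_metrics.values()) / len(class_metrics)
macro_recall = sum(m[1] for m in class_metrics.values()) / len(class_metrics)
macro_f1 = sum(m[2] for m in class_metrics.values()) / len(class_metrics)
```

Consider a three-class text classification problem with 115 samples: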
Model prediction results:

```
Total TP = 90 (A) + 2 (B) + 1 (C) = 93
Total FP = 10 (A) + 8 (B) + 4 (C) = 22
Total FN = 10 (A) + 8 (B) + 4 (C) = 22

Micro Precision = 93 / (93 + 22) = 93/115 = 0.809
Micro Recall    = 93 / (93 + 22) = 93/115 = 0.809
Micro F1        = 0.809
```

```
Class A: Precision_A = 90/100 = 0.90   Recall_A = 90/100 = 0.90   F1_A = 0.90
Class B: Precision_B = 2/10   = 0.20   Recall_B = 2/10   = 0.20   F1_B = 0.20
Class C: Precision_C = 1/5    = 0.20   Recall_C = 1/5    = 0.20   F1_C = 0.20

Macro Precision = (0.90 + 0.20 + 0.20)/3 = 0.433
Macro Recall    = (0.90 + 0.20 + 0.20)/3 = 0.433
Macro F1        = (0.90 + 0.20 + 0.20)/3 = 0.433
```

| Metric | Micro | Macro | Reason for the difference |
|---|---|---|---|
| Precision | 0.809 | 0.433 | Micro is dominated by the large class A |
| Recall | 0.809 | 0.433 | Macro treats all classes equally |
| F1 | 0.809 | 0.433 | Poor performance on the small classes pulls Macro down |
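The arithmetic in this example can be reproduced with a short sketch. The per-class counts are taken from the worked example; the `counts` dictionary and the `prf` helper are illustrative names, not part of the article's own code.

```python
# Per-class (TP, FP, FN) counts from the three-class worked example.
counts = {"A": (90, 10, 10), "B": (2, 8, 8), "C": (1, 4, 4)}


def prf(tp, fp, fn):
    """Return (precision, recall, F1), using 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Micro: sum the counts over all classes first, then compute the metrics once.
total_tp, total_fp, total_fn = (sum(column) for column in zip(*counts.values()))
micro = prf(total_tp, total_fp, total_fn)

# Macro: compute the metrics per class, then take the unweighted mean.
per_class = [prf(*c) for c in counts.values()]
macro = tuple(sum(metric) / len(per_class) for metric in zip(*per_class))

print("micro P/R/F1:", [round(x, 3) for x in micro])
print("macro P/R/F1:", [round(x, 3) for x in macro])
```

Rounded to three decimals, the micro values come out at 0.809 and the macro values at 0.433, matching the table. Changing the counts for B or C moves the macro values far more than the micro values, which is the sensitivity the table describes.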
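In multi-label classification (where a sample can belong to several classes), the difference between Micro and Macro is even more pronounced:
Sample 1: true labels = {A, B}, predicted labels = {A, C}
Sample 2: true labels = {B, C}, predicted labels = {B}
Sample 3: true labels = {A}, predicted labels = {A, B}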
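A minimal sketch for checking these three samples. The article's `cal_micro` helper is not shown, so `count_tp_fp_fn` below is a hypothetical stand-in; it assumes predictions and labels are Python sets and that TP/FP/FN are the set intersection and the two set differences.

```python
# The three multi-label samples, as (true_labels, predicted_labels) pairs.
samples = [
    ({"A", "B"}, {"A", "C"}),
    ({"B", "C"}, {"B"}),
    ({"A"}, {"A", "B"}),
]


def count_tp_fp_fn(pred_set, label_set):
    """Hypothetical stand-in for cal_micro: count TP/FP/FN via set operations."""
    tp = len(pred_set & label_set)  # predicted and correct
    fp = len(pred_set - label_set)  # predicted but not in the labels
    fn = len(label_set - pred_set)  # in the labels but not predicted
    return tp, fp, fn


def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Micro: accumulate the counts over samples, then compute the metrics once.
tp = fp = fn = 0
for labels, preds in samples:
    s_tp, s_fp, s_fn = count_tp_fp_fn(preds, labels)
    tp, fp, fn = tp + s_tp, fp + s_fp, fn + s_fn
micro_precision = ratio(tp, tp + fp)
micro_recall = ratio(tp, tp + fn)

# Macro: compute precision/recall per class, then average with equal weight.
classes = sorted(set().union(*(labels | preds for labels, preds in samples)))
precisions, recalls = [], []
for cls in classes:
    c_tp = sum(cls in preds and cls in labels for labels, preds in samples)
    c_fp = sum(cls in preds and cls not in labels for labels, preds in samples)
    c_fn = sum(cls in labels and cls not in preds for labels, preds in samples)
    precisions.append(ratio(c_tp, c_tp + c_fp))
    recalls.append(ratio(c_tp, c_tp + c_fn))
macro_precision = sum(precisions) / len(classes)
macro_recall = sum(recalls) / len(classes)

print(f"micro P={micro_precision:.2f} R={micro_recall:.2f}")
print(f"macro P={macro_precision:.2f} R={macro_recall:.2f}")
```

Class C is never predicted correctly, so it contributes a full zero to the macro average while costing the micro totals only one FP and one FN.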
```
Per-sample counts using the cal_micro function:
Sample 1: tp=1 (A), fp=1 (C), fn=1 (B)
Sample 2: tp=1 (B), fp=0,     fn=1 (C)
Sample 3: tp=1 (A), fp=1 (B), fn=0

Total TP = 1+1+1 = 3
Total FP = 1+0+1 = 2
Total FN = 1+1+0 = 2

Micro Precision = 3/(3+2) = 0.60
Micro Recall    = 3/(3+2) = 0.60
```

```
Per-class computation:
Class A: tp=2 (samples 1, 3), fp=0, fn=0
         Precision_A = 2/2 = 1.0, Recall_A = 2/2 = 1.0
Class B: tp=1 (sample 2), fp=1 (sample 3), fn=1 (sample 1)
         Precision_B = 1/2 = 0.5, Recall_B = 1/2 = 0.5
Class C: tp=0, fp=1 (sample 1), fn=1 (sample 2)
         Precision_C = 0/1 = 0, Recall_C = 0/1 = 0

Macro Precision = (1.0 + 0.5 + 0)/3 = 0.50
Macro Recall    = (1.0 + 0.5 + 0)/3 = 0.50
```

```
weighted_f1 = sum(f1_class × support_class) / total_samples
```

When class imbalance is severe:
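Multi-label classification tasks:
Reporting in academic papers:
When using the cal_micro function:
The fundamental difference between Micro TP/FP/FN and ordinary (Macro) TP/FP/FN lies in the computation strategy and in how weight is distributed:
Understanding this difference lets you:
In practice there is no "best" method, only the one best suited to the task at hand. The sensible approach is to understand the strengths and weaknesses of both, then either choose the evaluation strategy that fits the application or report both results for a fuller picture.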
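As one way to report several averages side by side, the sketch below applies the weighted-F1 formula quoted earlier (`sum(f1_class × support_class) / total_samples`) to the three-class example. The per-class F1 values and the supports (100, 10, 5, i.e. TP + FN per class) come from that example; the variable names are illustrative.

```python
# Per-class F1 and support (number of true samples) from the three-class example.
f1_by_class = {"A": 0.90, "B": 0.20, "C": 0.20}
support = {"A": 100, "B": 10, "C": 5}

total_samples = sum(support.values())

# Macro F1: every class counts equally, regardless of its size.
macro_f1 = sum(f1_by_class.values()) / len(f1_by_class)

# Weighted F1: each class's F1 is weighted by its support.
weighted_f1 = sum(f1_by_class[c] * support[c] for c in f1_by_class) / total_samples

print(f"macro F1    = {macro_f1:.3f}")
print(f"weighted F1 = {weighted_f1:.3f}")
```

In this example the weighted F1 (93/115, about 0.809) lands at the micro value rather than the macro value, because class A supplies most of the support. Reporting it next to macro F1 makes the gap on the small classes visible.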