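import argparse
from functools import partial

# Schedule, sched_onecycle and prune are assumed to be defined or imported earlier in the notebook.
class Args(argparse.Namespace):
    model = 'yolov8l.pt'          # checkpoint to prune
    cfg = 'default.yaml'          # training configuration file
    iterative_steps = 10          # number of prune / fine-tune rounds
    target_prune_rate = 0.15      # pruning ratio targeted by the end of the schedule
    max_map_drop = 0.2            # mAP-drop threshold passed to prune()
    sched = Schedule(partial(sched_onecycle, α=10, β=4))  # shapes how the ratio ramps up per step

args = Args()
prune(args)

Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)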
YOLOv8l summary (fused): 121 layers, 43,668,288 parameters, 0 gradients, 165.2 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4049.2±1625.9 MB/s, size: 51.0 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.731 0.768 0.828 0.66
Speed: 0.7ms preprocess, 3.1ms inference, 0.0ms loss, 2.1ms postprocess per image
Results saved to runs/detect/val8
Before Pruning: MACs= 82.72641 G, #Params= 43.69152 M, mAP= 0.66035
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train7, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/train7, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3969.9±1494.9 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 505.5±201.7 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/train7/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/train7
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 17.6G 0.8369 0.7191 1.072 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.774 0.763 0.839 0.674
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 17.1G 0.8351 0.665 1.061 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.826 0.783 0.85 0.689
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 17.2G 0.8322 0.6222 1.066 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.858 0.794 0.86 0.704
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 17.1G 0.8023 0.5615 1.029 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.896 0.793 0.87 0.717
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 17.3G 0.7755 0.521 1.012 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.879 0.824 0.89 0.731
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 17.3G 0.7552 0.5039 1.011 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.869 0.84 0.892 0.738
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 16.8G 0.7342 0.4821 0.9817 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.885 0.835 0.896 0.749
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 17.2G 0.7389 0.4766 0.9989 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.884 0.855 0.904 0.762
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 17.2G 0.7197 0.4778 0.9785 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.875 0.866 0.909 0.767
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 17.2G 0.7149 0.457 1.007 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.882 0.867 0.911 0.768
10 epochs completed in 0.010 hours.
Optimizer stripped from runs/detect/train7/weights/last.pt, 175.3MB
Optimizer stripped from runs/detect/train7/weights/best.pt, 175.3MB
Validating runs/detect/train7/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 43,668,288 parameters, 0 gradients, 165.2 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.883 0.867 0.911 0.768
Speed: 0.1ms preprocess, 2.7ms inference, 0.0ms loss, 0.3ms postprocess per image
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 43,668,288 parameters, 0 gradients, 165.2 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 5337.5±708.2 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.901 0.849 0.904 0.769
Speed: 0.1ms preprocess, 5.4ms inference, 0.0ms loss, 0.5ms postprocess per image
Results saved to runs/detect/baseline_val4
Before Pruning: MACs= 82.72641 G, #Params= 43.69152 M, mAP= 0.76904
Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 1: progress=0.018, ratio=0.003
After Pruning
Model Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 43,043,386 parameters, 74,176 gradients, 162.6 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 5560.8±1327.8 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.878 0.862 0.904 0.746
Speed: 0.2ms preprocess, 6.9ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_0_pre_val2
After post-pruning Validation
Model Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 1: MACs=81.4709528 G, #Params=43.066447 M, mAP=0.7464191064783372, speed up=1.0154098308274602
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_0_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_0_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3796.3±1197.1 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1819.0±382.7 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_0_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_0_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 17.4G 0.67 0.4225 0.9631 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.907 0.846 0.908 0.755
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 17.4G 0.6359 0.3913 0.947 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.888 0.861 0.914 0.757
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 17.3G 0.6677 0.427 0.9806 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.896 0.861 0.914 0.761
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 17.4G 0.6512 0.3957 0.9469 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.907 0.858 0.916 0.776
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 17.6G 0.6385 0.3909 0.94 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.924 0.852 0.919 0.779
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 17.3G 0.6406 0.4071 0.9522 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.942 0.846 0.917 0.781
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 17.4G 0.6228 0.3905 0.9324 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.877 0.883 0.92 0.788
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 17.4G 0.6583 0.4037 0.9571 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.924 0.866 0.923 0.793
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 17.4G 0.6465 0.4069 0.941 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.92 0.875 0.931 0.798
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 17.4G 0.6573 0.4086 0.9788 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.914 0.884 0.932 0.799
10 epochs completed in 0.010 hours.
Optimizer stripped from runs/detect/step_0_finetune2/weights/last.pt, 172.8MB
Optimizer stripped from runs/detect/step_0_finetune2/weights/best.pt, 172.8MB
Validating runs/detect/step_0_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 43,043,386 parameters, 0 gradients, 162.6 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.914 0.884 0.932 0.799
Speed: 0.1ms preprocess, 3.1ms inference, 0.0ms loss, 0.3ms postprocess per image
After fine-tuning
Model Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 43,043,386 parameters, 0 gradients, 162.6 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4856.6±2095.6 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.916 0.87 0.922 0.791
Speed: 0.1ms preprocess, 7.0ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_0_post_val2
After fine tuning mAP=0.7912829910872162
After post fine-tuning validation
Model Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 2: progress=0.048, ratio=0.007
After Pruning
Model Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 62, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 42,094,706 parameters, 74,160 gradients, 158.8 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 5539.0±1668.1 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.845 0.912 0.769
Speed: 0.1ms preprocess, 6.9ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_1_pre_val2
After post-pruning Validation
Model Conv2d(3, 63, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 62, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 2: MACs=79.5541908 G, #Params=42.117503 M, mAP=0.7685751155559024, speed up=1.0398749024796818
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_1_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_1_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3717.7±1437.5 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1717.9±457.2 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_1_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_1_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 17G 0.6016 0.3791 0.9319 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.924 0.859 0.921 0.783
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 17.1G 0.5765 0.3537 0.9187 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.864 0.918 0.786
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 17.1G 0.5879 0.3755 0.9353 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.915 0.868 0.919 0.791
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 17.2G 0.5637 0.3453 0.9177 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.93 0.87 0.932 0.794
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 17.2G 0.5691 0.3553 0.9072 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.928 0.869 0.928 0.794
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 17.2G 0.5736 0.3496 0.9185 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.924 0.872 0.924 0.796
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 17.4G 0.5726 0.3525 0.9006 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.873 0.924 0.794
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 17.3G 0.6045 0.3704 0.9303 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.927 0.882 0.932 0.8
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 17.3G 0.6179 0.3961 0.9203 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.938 0.883 0.932 0.804
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 17.1G 0.6393 0.416 0.9573 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.941 0.883 0.933 0.804
10 epochs completed in 0.010 hours.
Optimizer stripped from runs/detect/step_1_finetune2/weights/last.pt, 169.0MB
Optimizer stripped from runs/detect/step_1_finetune2/weights/best.pt, 169.0MB
Validating runs/detect/step_1_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 42,094,706 parameters, 0 gradients, 158.8 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.941 0.883 0.933 0.804
Speed: 0.1ms preprocess, 3.1ms inference, 0.0ms loss, 0.3ms postprocess per image
After fine-tuning
Model Conv2d(3, 62, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 62, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 42,094,706 parameters, 0 gradients, 158.8 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 5405.8±1076.0 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.931 0.882 0.931 0.795
Speed: 0.1ms preprocess, 7.0ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_1_post_val2
After fine tuning mAP=0.7950947724666012
After post fine-tuning validation
Model Conv2d(3, 62, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 62, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 3: progress=0.119, ratio=0.018
After Pruning
Model Conv2d(3, 62, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 61, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 40,324,469 parameters, 74,160 gradients, 152.5 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 5472.8±1324.6 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.858 0.851 0.907 0.743
Speed: 0.1ms preprocess, 7.0ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_2_pre_val2
After post-pruning Validation
Model Conv2d(3, 62, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 61, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 3: MACs=76.3708784 G, #Params=40.34678 M, mAP=0.7432541366864699, speed up=1.0832192601833424
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_2_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_2_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4279.1±1583.2 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 975.8±228.6 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_2_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_2_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 16.8G 0.6389 0.3964 0.9263 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.922 0.836 0.912 0.764
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 17.3G 0.5608 0.361 0.903 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.852 0.925 0.78
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 17.3G 0.5679 0.364 0.9166 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.903 0.876 0.931 0.783
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 17.1G 0.549 0.362 0.8975 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.905 0.885 0.934 0.79
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 16.9G 0.5402 0.3396 0.8914 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.932 0.873 0.929 0.793
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 17.1G 0.5511 0.3452 0.9006 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.925 0.872 0.933 0.797
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 17.1G 0.5463 0.3546 0.8866 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.92 0.882 0.932 0.797
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 16.9G 0.5963 0.3718 0.9195 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.924 0.894 0.936 0.801
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 16.9G 0.6017 0.3778 0.9098 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.896 0.937 0.806
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 17.3G 0.6401 0.4083 0.954 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.93 0.892 0.937 0.808
10 epochs completed in 0.010 hours.
Optimizer stripped from runs/detect/step_2_finetune2/weights/last.pt, 161.9MB
Optimizer stripped from runs/detect/step_2_finetune2/weights/best.pt, 161.9MB
Validating runs/detect/step_2_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 40,324,469 parameters, 0 gradients, 152.5 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.93 0.892 0.937 0.808
Speed: 0.1ms preprocess, 3.0ms inference, 0.0ms loss, 0.3ms postprocess per image
After fine-tuning
Model Conv2d(3, 61, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 61, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 40,324,469 parameters, 0 gradients, 152.5 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4201.8±2216.9 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.927 0.891 0.936 0.8
Speed: 0.1ms preprocess, 7.0ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_2_post_val2
After fine tuning mAP=0.7996752102772763
After post fine-tuning validation
Model Conv2d(3, 61, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 61, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 4: progress=0.270, ratio=0.040
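The four `progress`/`ratio` pairs logged so far look consistent with `ratio = progress × target_prune_rate` (0.15 in `Args`). This is inferred from the printed, rounded values, not something the log states. A small check under that assumption, with `expected_ratio` and `max_abs_error` as hypothetical helper names:

```python
# Values copied from the "Pruning step N" lines of this log (rounded to 3 decimals there).
TARGET_PRUNE_RATE = 0.15  # Args.target_prune_rate

logged = [  # (step, progress, ratio)
    (1, 0.018, 0.003),
    (2, 0.048, 0.007),
    (3, 0.119, 0.018),
    (4, 0.270, 0.040),
]

def expected_ratio(progress, target=TARGET_PRUNE_RATE):
    """Assumed relationship: the schedule's progress scales the target rate."""
    return progress * target

def max_abs_error(rows):
    """Largest gap between the assumed ratio and the logged one."""
    return max(abs(expected_ratio(p) - r) for _, p, r in rows)

for step, progress, ratio in logged:
    print(f"step {step}: expected≈{expected_ratio(progress):.4f}, logged={ratio:.3f}")
```

Every gap stays below the rounding precision of the log, so the assumption is plausible, though the rounded figures cannot confirm it exactly.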
After Pruning
Model Conv2d(3, 61, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 59, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 37,708,749 parameters, 74,160 gradients, 143.2 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4881.8±1731.9 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.875 0.805 0.892 0.702
Speed: 0.1ms preprocess, 6.1ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_3_pre_val2
After post-pruning Validation
Model Conv2d(3, 61, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 59, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 4: MACs=71.732976 G, #Params=37.730325 M, mAP=0.7020598286629811, speed up=1.1532549046898597
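The `speed up` figure in these summary lines matches the baseline MACs divided by the current MACs (82.72641 G / 71.732976 G ≈ 1.153 here), so it reads as a theoretical compute ratio rather than a measured latency gain. That interpretation is inferred from the logged numbers, not confirmed by the `prune` source. The sketch below reproduces it for the four iterations so far, with `macs_speedup` as a hypothetical helper:

```python
BASE_MACS_G = 82.72641  # from the "Before Pruning" line (already rounded in the log)

# (iteration, MACs in G, logged speed up), copied from the "After pruning iter N" lines
logged = [
    (1, 81.4709528, 1.0154098308274602),
    (2, 79.5541908, 1.0398749024796818),
    (3, 76.3708784, 1.0832192601833424),
    (4, 71.732976, 1.1532549046898597),
]

def macs_speedup(base_macs, pruned_macs):
    """Theoretical speed-up as a ratio of multiply-accumulate counts."""
    return base_macs / pruned_macs

for it, macs, logged_speedup in logged:
    computed = macs_speedup(BASE_MACS_G, macs)
    print(f"iter {it}: computed={computed:.5f}, logged={logged_speedup:.5f}")
```

The computed and logged values agree to about six decimal places; the remaining difference comes from the baseline MACs being rounded in the log.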
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_3_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_3_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4040.0±1479.4 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 898.4±215.3 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_3_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_3_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 16G 0.6694 0.4281 0.9392 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.884 0.856 0.915 0.743
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 16.3G 0.596 0.378 0.9048 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.899 0.869 0.924 0.762
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 16.2G 0.5892 0.389 0.9174 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.901 0.868 0.922 0.775
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 16.3G 0.5714 0.3688 0.8989 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.923 0.875 0.933 0.779
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 16.3G 0.5768 0.3685 0.8988 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.93 0.875 0.935 0.788
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 16.3G 0.5769 0.3674 0.8972 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.902 0.892 0.937 0.79
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 16.3G 0.5726 0.3653 0.8875 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.929 0.876 0.937 0.796
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 16.4G 0.6152 0.3919 0.9235 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.918 0.883 0.939 0.802
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 16.4G 0.6269 0.3936 0.92 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.922 0.886 0.939 0.805
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 16.2G 0.6646 0.4099 0.9612 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.924 0.886 0.941 0.809
10 epochs completed in 0.010 hours.
Optimizer stripped from runs/detect/step_3_finetune2/weights/last.pt, 151.5MB
Optimizer stripped from runs/detect/step_3_finetune2/weights/best.pt, 151.5MB
Validating runs/detect/step_3_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 37,708,749 parameters, 0 gradients, 143.2 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.924 0.886 0.941 0.809
Speed: 0.1ms preprocess, 2.9ms inference, 0.0ms loss, 0.3ms postprocess per image
After fine-tuning
Model Conv2d(3, 59, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 59, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 37,708,749 parameters, 0 gradients, 143.2 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3800.7±1771.4 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.916 0.888 0.941 0.808
Speed: 0.1ms preprocess, 6.0ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_3_post_val2
After fine tuning mAP=0.8076550755729582
After post fine-tuning validation
Model Conv2d(3, 59, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 59, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 5: progress=0.501, ratio=0.075
After Pruning
Model Conv2d(3, 59, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 56, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 35,132,671 parameters, 74,160 gradients, 133.2 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 5237.1±1333.4 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.786 0.737 0.83 0.664
Speed: 0.1ms preprocess, 6.5ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_4_pre_val2
After post-pruning Validation
Model Conv2d(3, 59, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 56, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 5: MACs=66.7424992 G, #Params=35.153479 M, mAP=0.6635248706814774, speed up=1.2394861953266503
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_4_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_4_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3910.5±1562.2 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 745.1±159.7 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_4_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_4_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 15.6G 0.7302 0.482 0.966 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.844 0.814 0.887 0.718
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 15.5G 0.6487 0.4256 0.9334 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.891 0.835 0.91 0.753
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 15.7G 0.6402 0.4361 0.9413 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.889 0.848 0.919 0.761
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 15.7G 0.6214 0.3973 0.9185 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.893 0.862 0.924 0.776
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 15.7G 0.5974 0.3845 0.9032 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.914 0.86 0.929 0.779
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 15.7G 0.6027 0.3936 0.9126 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.86 0.932 0.786
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 15.8G 0.5974 0.3946 0.8942 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.922 0.872 0.934 0.791
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 15.8G 0.6442 0.4018 0.9322 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.914 0.883 0.935 0.8
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 15.8G 0.659 0.4155 0.9267 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.914 0.886 0.936 0.802
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 16.3G 0.6831 0.4327 0.9751 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.909 0.897 0.937 0.801
10 epochs completed in 0.010 hours.
Optimizer stripped from runs/detect/step_4_finetune2/weights/last.pt, 141.2MB
Optimizer stripped from runs/detect/step_4_finetune2/weights/best.pt, 141.2MB
Validating runs/detect/step_4_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 35,132,671 parameters, 0 gradients, 133.2 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.914 0.886 0.936 0.802
Speed: 0.1ms preprocess, 2.7ms inference, 0.0ms loss, 0.3ms postprocess per image
After fine-tuning
Model Conv2d(3, 56, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 56, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 35,132,671 parameters, 0 gradients, 133.2 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4520.2±1803.1 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.912 0.886 0.938 0.8
Speed: 0.1ms preprocess, 6.5ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_4_post_val2
After fine tuning mAP=0.7996035449784826
After post fine-tuning validation
Model Conv2d(3, 56, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 56, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 6: progress=0.733, ratio=0.110
After Pruning
Model Conv2d(3, 56, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 55, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 33,747,610 parameters, 74,160 gradients, 128.5 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4015.0±1353.0 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.918 0.822 0.908 0.743
Speed: 0.2ms preprocess, 6.0ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_5_pre_val2
After post-pruning Validation
Model Conv2d(3, 56, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 55, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 6: MACs=64.3900056 G, #Params=33.768007 M, mAP=0.7431841358762444, speed up=1.2847709148203583
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_5_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_5_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4044.3±1635.3 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 726.9±171.1 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_5_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_5_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 15.3G 0.6333 0.4011 0.9294 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.828 0.922 0.757
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 15.3G 0.5444 0.3673 0.8873 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.93 0.842 0.925 0.769
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 15.2G 0.5664 0.3835 0.9134 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.917 0.867 0.929 0.771
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 15.4G 0.5632 0.3668 0.8936 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.915 0.867 0.93 0.782
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 15.4G 0.5594 0.3643 0.8994 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.924 0.856 0.929 0.792
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 15.4G 0.5635 0.359 0.8999 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.923 0.857 0.93 0.793
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 15.3G 0.5725 0.3679 0.8946 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.932 0.858 0.933 0.794
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 15.3G 0.6254 0.3951 0.9293 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.925 0.863 0.932 0.796
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 15.3G 0.642 0.4066 0.9224 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.871 0.906 0.932 0.797
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 15.2G 0.6799 0.4366 0.9771 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.896 0.894 0.932 0.797
10 epochs completed in 0.018 hours.
Optimizer stripped from runs/detect/step_5_finetune2/weights/last.pt, 135.6MB
Optimizer stripped from runs/detect/step_5_finetune2/weights/best.pt, 135.6MB
Validating runs/detect/step_5_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 33,747,610 parameters, 0 gradients, 128.5 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.871 0.906 0.932 0.797
Speed: 0.1ms preprocess, 3.4ms inference, 0.0ms loss, 1.5ms postprocess per image
After fine-tuning
Model Conv2d(3, 55, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 55, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 33,747,610 parameters, 0 gradients, 128.5 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3665.2±391.2 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.933 0.857 0.934 0.795
Speed: 1.5ms preprocess, 15.6ms inference, 0.0ms loss, 3.4ms postprocess per image
Results saved to runs/detect/step_5_post_val2
After fine tuning mAP=0.7951189977994946
After post fine-tuning validation
Model Conv2d(3, 55, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 55, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 7: progress=0.883, ratio=0.132
After Pruning
Model Conv2d(3, 55, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,913,682 parameters, 74,160 gradients, 125.2 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1738.3±815.1 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.824 0.863 0.908 0.742
Speed: 1.1ms preprocess, 13.4ms inference, 0.0ms loss, 2.8ms postprocess per image
Results saved to runs/detect/step_6_pre_val2
After post-pruning Validation
Model Conv2d(3, 55, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 7: MACs=62.7046164 G, #Params=32.933815 M, mAP=0.7416030070446816, speed up=1.319303284981104
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_6_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_6_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1812.0±691.5 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 881.6±250.0 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_6_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_6_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 15.1G 0.5837 0.3747 0.9064 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.895 0.851 0.923 0.76
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 15.1G 0.5111 0.3316 0.8736 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.905 0.87 0.929 0.777
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 15G 0.5221 0.3494 0.8919 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.909 0.875 0.932 0.786
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 15G 0.531 0.3318 0.8828 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.911 0.874 0.927 0.785
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 15.1G 0.5416 0.3462 0.8843 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.886 0.888 0.93 0.786
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 15.1G 0.5524 0.354 0.89 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.893 0.877 0.926 0.783
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 15.1G 0.5693 0.3642 0.8824 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.907 0.869 0.926 0.786
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 15.1G 0.6151 0.3821 0.9231 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.906 0.878 0.929 0.788
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 15.1G 0.6315 0.4039 0.9118 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.915 0.883 0.932 0.797
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 15G 0.6669 0.4226 0.9674 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.914 0.884 0.933 0.795
10 epochs completed in 0.015 hours.
Optimizer stripped from runs/detect/step_6_finetune2/weights/last.pt, 132.3MB
Optimizer stripped from runs/detect/step_6_finetune2/weights/best.pt, 132.3MB
Validating runs/detect/step_6_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,913,682 parameters, 0 gradients, 125.2 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.915 0.883 0.932 0.796
Speed: 0.1ms preprocess, 3.0ms inference, 0.0ms loss, 1.1ms postprocess per image
After fine-tuning
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,913,682 parameters, 0 gradients, 125.2 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 3042.0±594.6 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.915 0.877 0.931 0.794
Speed: 0.5ms preprocess, 9.4ms inference, 0.0ms loss, 1.6ms postprocess per image
Results saved to runs/detect/step_6_post_val2
After fine tuning mAP=0.7942548824738299
After post fine-tuning validation
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 8: progress=0.955, ratio=0.143
After Pruning
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,669,140 parameters, 74,160 gradients, 124.6 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1360.5±623.7 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.916 0.867 0.927 0.789
Speed: 0.7ms preprocess, 9.3ms inference, 0.0ms loss, 1.6ms postprocess per image
Results saved to runs/detect/step_7_pre_val2
After post-pruning Validation
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 8: MACs=62.4070664 G, #Params=32.689204 M, mAP=0.7892334700405261, speed up=1.325593577332454
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_7_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_7_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1005.4±526.3 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 626.8±111.3 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_7_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_7_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 14.9G 0.495 0.3205 0.88 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.923 0.874 0.929 0.799
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 15.1G 0.4323 0.2908 0.8485 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.917 0.884 0.936 0.794
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 15G 0.4617 0.3107 0.8715 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.908 0.886 0.933 0.797
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 15G 0.4587 0.3004 0.8575 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.929 0.878 0.93 0.795
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 14.9G 0.4739 0.3132 0.8601 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.929 0.878 0.932 0.796
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 14.9G 0.4952 0.3178 0.8655 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.934 0.864 0.929 0.795
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 15G 0.4984 0.326 0.8585 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.92 0.877 0.93 0.796
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 15.1G 0.5569 0.3589 0.894 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.872 0.929 0.798
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 15.1G 0.5973 0.3794 0.897 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.929 0.873 0.934 0.804
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 15G 0.6558 0.4162 0.9558 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.929 0.878 0.936 0.805
10 epochs completed in 0.012 hours.
Optimizer stripped from runs/detect/step_7_finetune2/weights/last.pt, 131.3MB
Optimizer stripped from runs/detect/step_7_finetune2/weights/best.pt, 131.3MB
Validating runs/detect/step_7_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,669,140 parameters, 0 gradients, 124.6 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.929 0.878 0.936 0.805
Speed: 0.1ms preprocess, 2.5ms inference, 0.0ms loss, 0.3ms postprocess per image
After fine-tuning
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,669,140 parameters, 0 gradients, 124.6 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4798.0±2031.8 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.937 0.883 0.937 0.802
Speed: 0.2ms preprocess, 6.1ms inference, 0.0ms loss, 0.5ms postprocess per image
Results saved to runs/detect/step_7_post_val2
After fine tuning mAP=0.8021166946231042
After post fine-tuning validation
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 9: progress=0.984, ratio=0.148
After Pruning
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,416,863 parameters, 74,160 gradients, 123.4 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4707.7±1115.9 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.931 0.863 0.921 0.768
Speed: 0.1ms preprocess, 6.1ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_8_pre_val2
After post-pruning Validation
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 9: MACs=61.8488912 G, #Params=32.436843 M, mAP=0.7680307574283493, speed up=1.3375568226839933
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_8_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_8_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4035.8±1487.2 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 1631.0±420.7 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_8_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_8_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 14.9G 0.4943 0.3212 0.8714 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.93 0.869 0.926 0.789
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 14.9G 0.4371 0.2908 0.8475 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.935 0.869 0.931 0.802
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 14.9G 0.443 0.2951 0.864 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.933 0.873 0.934 0.801
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 14.9G 0.4433 0.295 0.8514 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.909 0.892 0.933 0.801
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 14.9G 0.4481 0.2939 0.852 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.912 0.896 0.932 0.797
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 14.9G 0.4641 0.3056 0.8523 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.917 0.89 0.935 0.802
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 14.9G 0.4891 0.3107 0.858 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.931 0.886 0.937 0.802
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 14.9G 0.532 0.338 0.8835 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.942 0.887 0.936 0.802
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 15G 0.5758 0.3629 0.8903 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.941 0.893 0.936 0.807
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 14.9G 0.6455 0.3983 0.9429 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.942 0.893 0.938 0.808
10 epochs completed in 0.008 hours.
Optimizer stripped from runs/detect/step_8_finetune2/weights/last.pt, 130.3MB
Optimizer stripped from runs/detect/step_8_finetune2/weights/best.pt, 130.3MB
Validating runs/detect/step_8_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,416,863 parameters, 0 gradients, 123.4 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.942 0.893 0.938 0.808
Speed: 0.1ms preprocess, 2.5ms inference, 0.0ms loss, 0.3ms postprocess per image
After fine-tuning
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,416,863 parameters, 0 gradients, 123.4 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 2456.1±440.7 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.933 0.893 0.935 0.806
Speed: 0.1ms preprocess, 6.2ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_8_post_val2
After fine tuning mAP=0.8062525404490082
After post fine-tuning validation
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruning step 10: progress=0.996, ratio=0.149
After Pruning
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,416,863 parameters, 74,160 gradients, 123.4 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 5303.3±1442.4 MB/s, size: 44.7 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.934 0.895 0.936 0.806
Speed: 0.1ms preprocess, 6.2ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_9_pre_val2
After post-pruning Validation
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
After pruning iter 10: MACs=61.8488912 G, #Params=32.436843 M, mAP=0.8062440401624619, speed up=1.3375568226839933
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
engine/trainer: agnostic_nms=False, amp=False, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=coco128.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8l.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=step_9_finetune2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=12.0, pretrained=True, profile=False, project=None, rect=False, resume=False, retina_masks=False, save=True, save_conf=False, save_crop=False, save_dir=runs/detect/step_9_finetune2, save_frames=False, save_json=False, save_period=-1, save_txt=False, scale=0.5, seed=0, shear=0.0, show=False, show_boxes=True, show_conf=True, show_labels=True, simplify=True, single_cls=False, source=None, split=val, stream_buffer=False, task=detect, time=None, tracker=botsort.yaml, translate=0.1, val=True, verbose=False, vid_stride=1, visualize=False, warmup_bias_lr=0.1, warmup_epochs=3.0, warmup_momentum=0.8, weight_decay=0.0005, workers=8, workspace=None
Freezing layer 'model.22.dfl.conv.weight'
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 4503.1±1040.0 MB/s, size: 50.9 KB)
train: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/lab
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 795.8±192.5 MB/s, size: 52.5 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Plotting labels to runs/detect/step_9_finetune2/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.000119, momentum=0.9) with parameter groups 105 weight(decay=0.0), 112 weight(decay=0.0005), 111 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 8 dataloader workers
Logging results to runs/detect/step_9_finetune2
Starting training for 10 epochs...
Closing dataloader mosaic
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
1/10 15.3G 0.424 0.2844 0.8499 121 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.937 0.892 0.939 0.811
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
2/10 14.9G 0.3993 0.2626 0.8333 113 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.923 0.896 0.942 0.808
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
3/10 14.9G 0.4118 0.2764 0.8534 118 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.914 0.899 0.941 0.808
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
4/10 14.9G 0.4239 0.2808 0.8413 68 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.926 0.892 0.937 0.807
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
5/10 15G 0.4537 0.2909 0.8466 95 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.942 0.891 0.935 0.805
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
6/10 15.1G 0.4596 0.299 0.8484 122 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.947 0.885 0.938 0.807
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
7/10 14.9G 0.4647 0.3001 0.8475 75 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.948 0.887 0.94 0.807
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
8/10 14.9G 0.5177 0.3237 0.8788 142 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.947 0.891 0.942 0.807
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
9/10 14.9G 0.5476 0.3486 0.8788 104 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.946 0.891 0.942 0.811
Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
10/10 15.3G 0.6247 0.3905 0.942 164 640: 100%|██████████
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.944 0.889 0.941 0.811
10 epochs completed in 0.008 hours.
Optimizer stripped from runs/detect/step_9_finetune2/weights/last.pt, 130.3MB
Optimizer stripped from runs/detect/step_9_finetune2/weights/best.pt, 130.3MB
Validating runs/detect/step_9_finetune2/weights/best.pt...
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,416,863 parameters, 0 gradients, 123.4 GFLOPs
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.946 0.891 0.942 0.811
Speed: 0.1ms preprocess, 2.5ms inference, 0.0ms loss, 0.3ms postprocess per image
After fine-tuning
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CUDA:0 (NVIDIA GeForce RTX 5090, 32109MiB)
YOLOv8l summary (fused): 121 layers, 32,416,863 parameters, 0 gradients, 123.4 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 5168.4±959.1 MB/s, size: 53.4 KB)
val: Scanning /home/nathan/Developer/FasterAI-Labs/Projects/ALX Systems/datasets/coco128/label
Class Images Instances Box(P R mAP50 mAP50-95): 100%
all 128 929 0.937 0.892 0.939 0.806
Speed: 0.1ms preprocess, 6.3ms inference, 0.0ms loss, 0.4ms postprocess per image
Results saved to runs/detect/step_9_post_val2
After fine tuning mAP=0.8059532806050649
After post fine-tuning validation
Model Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Pruner Conv2d(3, 54, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
Ultralytics 8.3.162 🚀 Python-3.12.11 torch-2.9.1+cu128 CPU (Intel Core(TM) i9-14900KS)
YOLOv8l summary (fused): 121 layers, 32,416,863 parameters, 0 gradients, 123.4 GFLOPs
PyTorch: starting from 'runs/detect/step_9_finetune2/weights/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 84, 8400) (124.2 MB)
ONNX: starting export with onnx 1.17.0 opset 10...
W0205 17:34:38.183000 260195 site-packages/torch/onnx/_internal/exporter/_compat.py:114] Setting ONNX exporter to use operator set version 18 because the requested opset_version 10 is a lower version than we have implementations for. Automatic version conversion will be performed, which may not be successful at converting to the requested version. If version conversion is unsuccessful, the opset version of the exported model will be kept at 18. Please consider setting opset_version >=18 to leverage latest ONNX features
The model version conversion is not supported by the onnxscript version converter and fallback is enabled. The model will be converted using the onnx C API (target version: 10).
Failed to convert the model to the target version 10 using the ONNX C API. The model was not modified
Traceback (most recent call last):
File "/home/nathan/miniconda3/envs/dev/lib/python3.12/site-packages/onnxscript/version_converter/__init__.py", line 127, in call
converted_proto = _c_api_utils.call_onnx_api(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/dev/lib/python3.12/site-packages/onnxscript/version_converter/_c_api_utils.py", line 65, in call_onnx_api
result = func(proto)
^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/dev/lib/python3.12/site-packages/onnxscript/version_converter/__init__.py", line 122, in _partial_convert_version
return onnx.version_converter.convert_version(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/nathan/miniconda3/envs/dev/lib/python3.12/site-packages/onnx/version_converter.py", line 38, in convert_version
converted_model_str = C.convert_version(model_str, target_version)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: /github/workspace/onnx/version_converter/BaseConverter.h:70: adapter_lookup: Assertion `false` failed: No Adapter To Version $17 for Resize
Applied 1 of general pattern rewrite rules.
ONNX: slimming with onnxslim 0.1.59...
ONNX: export success ✅ 2.9s, saved as 'runs/detect/step_9_finetune2/weights/best.onnx' (123.8 MB)
Export complete (3.4s)
Results saved to /home/nathan/Developer/FasterAI-Labs/gh/fasterai/nbs/tutorials/prune/runs/detect/step_9_finetune2/weights
Predict: yolo predict task=detect model=runs/detect/step_9_finetune2/weights/best.onnx imgsz=640
Validate: yolo val task=detect model=runs/detect/step_9_finetune2/weights/best.onnx imgsz=640 data=/home/nathan/miniconda3/envs/dev/lib/python3.12/site-packages/ultralytics/cfg/datasets/coco128.yaml
Visualize: https://netron.app
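
The `speed up` value printed after each pruning iteration appears to be the ratio of the baseline MACs to the pruned MACs, since the figures logged above reproduce it. The sketch below recomputes that ratio, and the fraction of parameters removed, from numbers copied out of the log. The helper names `macs_speed_up` and `fraction_removed` are hypothetical and are not part of fasterai or Ultralytics.

```python
# Values copied from the log output above.
BASE_MACS_G = 82.72641        # "Before Pruning: MACs= 82.72641 G"
PRUNED_MACS_G = 61.8488912    # "After pruning iter 10: MACs=61.8488912 G"
BASE_PARAMS_M = 43.69152      # "Before Pruning: ... #Params= 43.69152 M"
PRUNED_PARAMS_M = 32.436843   # "After pruning iter 10: ... #Params=32.436843 M"
LOGGED_SPEED_UP = 1.3375568226839933


def macs_speed_up(base_macs: float, pruned_macs: float) -> float:
    """Theoretical speed-up, assumed here to be base MACs / pruned MACs."""
    return base_macs / pruned_macs


def fraction_removed(base: float, pruned: float) -> float:
    """Fraction of a quantity (parameters, MACs) removed by pruning."""
    return 1.0 - pruned / base


speed_up = macs_speed_up(BASE_MACS_G, PRUNED_MACS_G)
params_removed = fraction_removed(BASE_PARAMS_M, PRUNED_PARAMS_M)
macs_removed = fraction_removed(BASE_MACS_G, PRUNED_MACS_G)

print(f"speed up (MACs ratio): {speed_up:.4f}")
print(f"logged speed up:       {LOGGED_SPEED_UP:.4f}")
print(f"parameters removed:    {params_removed:.1%}")
print(f"MACs removed:          {macs_removed:.1%}")
```

This ratio is a count of multiply-accumulate operations, not a latency measurement. The per-image inference times in the `Speed:` lines are the place to look for measured runtime.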