From Labelme to DOTA: A Practical Guide to Rotated-Box Annotation of Remote Sensing Images and Training with mmdetection

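Object detection in remote sensing imagery is a long-standing research topic in computer vision. Unlike conventional horizontal-box detection, oriented bounding box (OBB) detection localizes arbitrarily oriented targets such as buildings, vehicles, and aircraft more tightly. This article walks through converting Labelme annotations of remote sensing images to the DOTA format and then training a rotated-box detector with the mmdetection framework.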

1. Rotated-Box Annotation Basics

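Rotated-box detection describes each target with a rotated rectangle (rotated bounding box) that captures both its position and its orientation. Compared with a horizontal box, a rotated box encloses less background and overlaps less with neighboring objects, which makes it well suited to remote sensing images where targets are densely packed or arbitrarily oriented.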

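Common rotated-box representations:

Five-parameter form: (x, y, w, h, θ), i.e. center coordinates, width, height, and rotation angle
Eight-parameter form: (x1, y1, x2, y2, x3, y3, x4, y4), i.e. the coordinates of the four corners
DOTA format: uses the eight-parameter (four-corner) form, with the corners listed in clockwise order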
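The two representations above are related by a plain 2D rotation of the rectangle's corners about its center. The following is a minimal sketch of that conversion, not code from any particular library: the function name `obb_to_corners` is hypothetical, and the angle convention (radians, zero along the +x axis, positive toward +y in image coordinates) is an assumption, since frameworks differ on how they define θ.

```python
import math


def obb_to_corners(cx, cy, w, h, theta):
    """Convert (cx, cy, w, h, theta) to four corners [(x1, y1), ..., (x4, y4)].

    theta is in radians. The convention used here (zero along +x, positive
    toward +y) is an assumption; check the convention of your framework.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, clockwise in image coordinates (y down).
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset by theta and translate it to the box center.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]
```

With theta set to 0 the result is the axis-aligned box itself, which is a quick way to sanity-check the corner order before feeding real annotations through it.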

# DOTA format example (each object line: x1 y1 x2 y2 x3 y3 x4 y4 category difficult)
imagesource:GoogleEarth
gsd:0.146343
0 0 50 0 50 50 0 50 plane 1

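Note: in DOTA the first vertex (x1, y1) marks the "head" of the rotated box, so the starting point is normally chosen according to the target's main direction.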
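Reading an annotation file back is a useful check that the first vertex and the category end up where expected. The sketch below assumes the DOTA label layout of eight coordinates followed by the category and an optional difficulty flag, with metadata lines such as `imagesource:` and `gsd:` at the top; the function name and the returned dictionary keys are illustrative choices.

```python
def parse_dota_annotation(text):
    """Parse DOTA label text into a list of objects; header lines are skipped."""
    objects = []
    for line in text.splitlines():
        parts = line.split()
        # Object lines carry 8 coordinates plus a category (and usually a
        # difficulty flag); shorter lines are metadata or blank.
        if len(parts) < 9:
            continue
        coords = [float(v) for v in parts[:8]]
        objects.append({
            # points[0] is the starting ("head") vertex
            'points': list(zip(coords[0::2], coords[1::2])),
            'category': parts[8],
            'difficult': int(parts[9]) if len(parts) > 9 else 0,
        })
    return objects
```

Printing `points[0]` for a handful of objects and comparing it with the image is usually enough to spot a starting vertex that landed on the wrong corner.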

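2. Converting Labelme Annotations to DOTA

Labelme is a widely used image annotation tool, but its JSON output is not compatible with DOTA. A conversion script is needed to turn each polygon annotation into a rotated box.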

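Key steps in the conversion:

Convex hull: use OpenCV's convexHull to simplify complex polygons
Minimum-area rotated rectangle: use cv2.minAreaRect to obtain the rectangle parameters
Vertex ordering and formatting: make sure the corners follow DOTA's clockwise order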

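import json

import cv2
import numpy as np


def sort_points_clockwise(points):
    """Order four corners clockwise as seen in image coordinates (y axis down).

    The starting vertex is the one with the smallest polar angle around the
    center; it is not chosen by object heading.
    """
    center = points.mean(axis=0)
    angles = np.arctan2(points[:, 1] - center[1], points[:, 0] - center[0])
    return points[np.argsort(angles)]


def labelme_to_dota(labelme_json, output_txt):
    with open(labelme_json) as f:
        data = json.load(f)

    with open(output_txt, 'w') as f:
        f.write("imagesource:Labelme\n")
        f.write("gsd:None\n")
        for shape in data['shapes']:
            # OpenCV expects float32 (or int32) points; Labelme stores Python floats
            points = np.array(shape['points'], dtype=np.float32)
            if len(points) < 3:
                continue  # two-point shapes (rectangle, line) do not define a polygon
            # Convex hull of the annotated polygon
            hull = cv2.convexHull(points)
            # Minimum-area rotated rectangle and its four corners
            rect = cv2.minAreaRect(hull)
            box = cv2.boxPoints(rect)
            # Sort the corners clockwise
            box = sort_points_clockwise(box)
            # DOTA line: eight coordinates, then category, then difficulty flag
            coords = " ".join(f"{v:.1f}" for v in box.flatten())
            f.write(f"{coords} {shape['label']} 0\n")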

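Tip: real data also contains concave polygons, occluded targets, and similar complications, so the conversion may need to be adjusted to the requirements of your project.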
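One inexpensive way to find the annotations that need such special handling is to compare the area of the annotated polygon with the area of the fitted rectangle. The sketch below is an assumed heuristic rather than part of any library: `fill_ratio` and the idea of a review threshold (for example 0.6) are hypothetical and should be tuned per dataset.

```python
def polygon_area(points):
    """Area of a simple polygon given as [(x, y), ...] (shoelace formula)."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def fill_ratio(polygon, box):
    """Fraction of the rotated box covered by the annotated polygon.

    A low value suggests a strongly concave or partially occluded shape
    for which the minimum-area rectangle is a loose fit.
    """
    box_area = polygon_area(box)
    return polygon_area(polygon) / box_area if box_area > 0 else 0.0
```

An L-shaped building outline, for instance, fills well under half of its enclosing rectangle, so it would be flagged for manual review instead of being converted silently.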

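3. Configuring Rotated-Box Detection in mmdetection

mmdetection is a capable detection framework, and several rotated-box models are available in the codebases built on it. The components used below (DOTADataset, RResize, S2ANet) come from these rotated-detection extensions, for example MMRotate or the S2ANet authors' fork, rather than from mainline mmdetection, so exact field names depend on the codebase and version you install. The key settings are shown here with S2ANet as the example.

Dataset configuration example: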

dataset_type = 'DOTADataset'
data_root = 'data/dota/'

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(1024, 1024)),
    dict(type='RRandomFlip', flip_ratio=0.5),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375]),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]

# test_pipeline is referenced by val/test and must be defined alongside train_pipeline
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'train/labelTxt/',
        img_prefix=data_root + 'train/images/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'val/labelTxt/',
        img_prefix=data_root + 'val/images/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'test/labelTxt/',
        img_prefix=data_root + 'test/images/',
        pipeline=test_pipeline))
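The configuration above assumes that every split has an images/ directory and a labelTxt/ directory whose files share base names. A quick pairing check before training can save a failed run; the helper below is a hypothetical sketch using only the standard library, and the extension list is an assumption to adapt to your data.

```python
import os


def find_unpaired(img_dir, label_dir, img_exts=('.png', '.jpg', '.bmp')):
    """Return (images without a label file, label files without an image)."""
    images = {
        os.path.splitext(name)[0]
        for name in os.listdir(img_dir)
        if name.lower().endswith(img_exts)
    }
    labels = {
        os.path.splitext(name)[0]
        for name in os.listdir(label_dir)
        if name.lower().endswith('.txt')
    }
    return sorted(images - labels), sorted(labels - images)
```

Called as `find_unpaired('data/dota/train/images', 'data/dota/train/labelTxt')`, it returns two lists of base names; both should be empty before training starts.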

Key model configuration parameters:

model = dict(
    type='S2ANet',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    bbox_head=dict(
        type='S2ANetHead',
        num_classes=15,  # adjust to the number of classes in your dataset
        in_channels=256,
        feat_channels=256,
        stacked_convs=2,
        with_orconv=True,
        anchor_ratios=[1.0],
        anchor_strides=[8, 16, 32, 64, 128],
        anchor_scales=[4],
        target_means=[.0, .0, .0, .0, .0],
        target_stds=[1.0, 1.0, 1.0, 1.0, 1.0],
        loss_fam_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_fam_bbox=dict(
            type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0),
        loss_odm_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_odm_bbox=dict(
            type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)))
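The anchor settings in this head are easier to judge once they are translated into pixel sizes. The sketch below assumes RetinaNet-style anchor generation, where the base size of each pyramid level equals its stride and the ratio is height over width at constant area; that is an assumption about the head's implementation, and `base_anchor_sizes` is a hypothetical helper, not a framework function.

```python
def base_anchor_sizes(strides, scales, ratios):
    """Anchor (width, height) per pyramid level, assuming base size == stride."""
    sizes = {}
    for stride in strides:
        level = []
        for scale in scales:
            for ratio in ratios:
                side = stride * scale
                # ratio = height / width with the area kept constant (assumed)
                level.append((side / ratio ** 0.5, side * ratio ** 0.5))
        sizes[stride] = level
    return sizes


sizes = base_anchor_sizes(
    strides=[8, 16, 32, 64, 128], scales=[4], ratios=[1.0])
for stride, anchors in sizes.items():
    print(stride, anchors)
```

Under these assumptions there is a single square anchor per location, ranging from 32 px at stride 8 to 512 px at stride 128, so targets much smaller than 32 px at the training scale may be matched poorly.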

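4. Strategies for Large Remote Sensing Images

Feeding high-resolution remote sensing images (4096×4096, for example) directly into the network runs into GPU memory and compute limits. Common approaches: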

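Method	Advantages	Disadvantages	Best suited for
Direct downsampling	Simple to implement	Loses small-object information	Large, uniformly sized targets
Sliding-window cropping	Preserves detail	Objects on window edges are hard to handle	High-accuracy scenarios
Image pyramid	Multi-scale detection	High computational cost	Widely varying target sizes
Adaptive cropping	Balances accuracy and efficiency	Complex to implement	General-purpose use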
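For sliding-window cropping, the number of crops grows quickly as the stride shrinks, which is worth estimating before choosing an overlap. The arithmetic below is a small sketch that assumes the last window on each axis is shifted to sit flush with the image edge; `tiles_per_axis` is a hypothetical helper name.

```python
import math


def tiles_per_axis(length, window, stride):
    """Number of window positions along one axis, last window flush to the edge."""
    if length <= window:
        return 1
    return math.ceil((length - window) / stride) + 1


# A 4096x4096 image with 1024x1024 windows
no_overlap = tiles_per_axis(4096, 1024, 1024) ** 2
half_overlap = tiles_per_axis(4096, 1024, 512) ** 2
print(no_overlap, half_overlap)  # 16 49
```

Moving from no overlap to 50% overlap roughly triples the number of crops for this image size, which translates directly into inference time.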

A recommended sliding-window implementation:

def window_starts(length, window, stride):
    """Start offsets along one axis; the last window sits flush with the edge."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    # Cover the remaining border strip with one extra, edge-aligned window
    if starts[-1] != length - window:
        starts.append(length - window)
    return starts


def sliding_window(image, window_size, stride):
    """Split an image array into (width, height) windows.

    Returns a list of dicts holding each crop and its top-left offset,
    including the right edge, the bottom edge, and the bottom-right corner.
    """
    height, width = image.shape[:2]
    win_w, win_h = window_size
    windows = []
    for y in window_starts(height, win_h, stride):
        for x in window_starts(width, win_w, stride):
            windows.append({
                'window': image[y:y + win_h, x:x + win_w],
                'x': x,
                'y': y
            })
    return windows

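Tip: in practice, use about 50% overlap between adjacent windows (a stride of half the window size), map the detections of each window back to full-image coordinates, and merge duplicates with NMS.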
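Before NMS can merge duplicates, each window's detections have to be expressed in full-image coordinates by adding the window's top-left offset. The sketch below assumes a hypothetical detection format, a dict with an eight-value `poly` list plus arbitrary extra fields such as `score`; adapt the keys to whatever your inference code returns.

```python
def to_global(detections, x_offset, y_offset):
    """Shift window-local polygons [x1, y1, ..., x4, y4] into image coordinates."""
    shifted = []
    for det in detections:
        moved = [
            # even indices are x values, odd indices are y values
            value + (x_offset if i % 2 == 0 else y_offset)
            for i, value in enumerate(det['poly'])
        ]
        # Copy the other fields (score, label, ...) unchanged
        shifted.append({**det, 'poly': moved})
    return shifted
```

The `x` and `y` values stored with each crop by a sliding-window routine are exactly the offsets to pass in; the shifted detections from all windows can then be concatenated and handed to a rotated NMS implementation.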

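5. Model Deployment and Docker Optimization

When a trained rotated-box detector is deployed to production, Docker is an effective way to deal with environment dependencies. The main points to watch are listed below.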

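Dockerfile optimization points:

Use a lightweight base image (e.g. python:3.8-slim) where possible; GPU inference needs a CUDA runtime image instead, as in the Dockerfile below
Use a multi-stage build to shrink the final image
Order the layers so that the build cache is reused effectively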

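# Build stage: install Python dependencies into a virtual environment
FROM nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04 AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        python3-dev \
        python3-pip \
        python3-venv \
    && rm -rf /var/lib/apt/lists/*
RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Runtime stage: same CUDA base with the interpreter only, no compilers
FROM nvidia/cuda:11.3.1-cudnn8-runtime-ubuntu20.04
RUN apt-get update && apt-get install -y --no-install-recommends python3 \
    && rm -rf /var/lib/apt/lists/*
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
WORKDIR /app
COPY . .
# Load the checkpoint when the container starts (in inference.py), not at
# build time: image builds typically have no GPU access
CMD ["python", "inference.py"]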

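Performance optimization tips:

Accelerate inference with TensorRT
Enable CUDA Graphs to reduce kernel launch overhead
Batch prediction requests to improve GPU utilization
Use half-precision (FP16) computation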

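import torch
from mmdet.apis import init_detector
from torch2trt import torch2trt

# config and checkpoint are the paths to the training config file and the
# trained weights, supplied by the caller
model = init_detector(config, checkpoint)

# Convert the model to TensorRT. torch2trt traces model(x), so a detector
# whose forward also expects img_metas must first be wrapped to accept a
# single image tensor; custom operators may have no converter.
x = torch.ones((1, 3, 1024, 1024)).cuda()
model_trt = torch2trt(model, [x], fp16_mode=True)

# Save the optimized model
torch.save(model_trt.state_dict(), 'model_trt.pth')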
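Batching prediction requests, mentioned in the list above, can be as simple as grouping incoming crops into fixed-size chunks before each forward pass. The generator below is a minimal, framework-independent sketch; `batched` is a hypothetical helper, and the right batch size depends on GPU memory and the input resolution.

```python
def batched(items, batch_size):
    """Yield consecutive lists of up to batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Emit the final, possibly smaller, batch
    if batch:
        yield batch
```

Iterating with `for crops in batched(windows, 4)` then gives the inference code four crops at a time to stack into a single input tensor, with a smaller final group if the count does not divide evenly.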

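In real projects we found that, when using the evaluation tools in DOTA_devkit, the name_list generated for the test set must match exactly the format the evaluation script expects. A common mistake is forgetting to strip the file extension, which makes the evaluation fail. The fix is to process every filename with os.path.splitext: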

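import os


def generate_namelist(img_dir, output_file):
    """Write one image base name (without extension) per line."""
    with open(output_file, 'w') as f:
        # Sort for a stable, reproducible order
        for filename in sorted(os.listdir(img_dir)):
            if filename.lower().endswith(('.jpg', '.png', '.bmp')):
                name = os.path.splitext(filename)[0]
                f.write(name + '\n')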
