Week 4
Following up on last week's work, a total of 20,000 synthetic images were generated. An XML file containing the ground-truth label and the locations of the bounding boxes was also generated for each image.
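Each of these annotation files follows the Pascal VOC layout. A minimal sketch of one such file (the filename, class name, and box coordinates here are illustrative, not taken from the actual dataset):

```xml
<annotation>
  <filename>synthetic_00001.jpg</filename>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>3</depth>
  </size>
  <object>
    <name>ace_of_spades</name>
    <bndbox>
      <xmin>100</xmin>
      <ymin>200</ymin>
      <xmax>300</xmax>
      <ymax>400</ymax>
    </bndbox>
  </object>
</annotation>
```

Note that VOC stores boxes as absolute pixel corners, which is why a conversion step is needed before YOLOv5 can consume them.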
As YOLOv5 uses .txt files in the darknet syntax, the generated XML files containing Pascal VOC annotations cannot be used directly. The script 'convert_voc_yolo.py' was provided in the GitHub repo, but we decided to skip it and use Roboflow instead. Roboflow lets you upload images along with their annotations in any format and converts between formats. It also lets you split the dataset into training, validation, and testing sets without writing a script.
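The conversion Roboflow performs can be sketched in a few lines: VOC stores absolute pixel corners, while the darknet format stores a normalized center point and size. The box values below are illustrative:

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a Pascal VOC box (absolute pixel corners) to the
    darknet/YOLO format (normalized center x, center y, width, height)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height

# Example: a 200x200 px box on a 640x480 image
print(voc_to_yolo(100, 200, 300, 400, 640, 480))
```

The darknet .txt label line then prepends the class index to these four numbers, one line per object.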
14,000 images were used for training, 4,000 for validation, and 2,000 for testing. We decided to train on a Google Colab instance, as Colab provides free GPU cloud computing for up to 12 hours per instance.
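Roboflow handles this split in its UI, but the 70/20/10 split is simple to reproduce locally; a minimal sketch (the image file names are placeholders):

```python
import random

def split_dataset(paths, train_pct=70, val_pct=20, seed=42):
    """Shuffle image paths and split them into train/val/test subsets
    by whole-number percentages; the remainder goes to the test set."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)  # deterministic shuffle for reproducibility
    n_train = len(paths) * train_pct // 100
    n_val = len(paths) * val_pct // 100
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])

images = [f"img_{i:05d}.jpg" for i in range(20000)]  # placeholder names
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))  # 14000 4000 2000
```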
Roboflow provides a template notebook for training a YOLOv5 model on Colab, which we used since we retrieved our training data through the Roboflow API. Please see "cards_YOLOv5.ipynb" for the notebook.
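Pulling the dataset into the notebook follows Roboflow's documented download pattern; a sketch, where the API key, workspace, project name, and version number are all placeholders you would replace with your own:

```python
from roboflow import Roboflow  # pip install roboflow

rf = Roboflow(api_key="YOUR_API_KEY")  # placeholder key
project = rf.workspace("my-workspace").project("playing-cards")  # hypothetical names
dataset = project.version(1).download("yolov5")  # writes images, labels, and data.yaml locally
```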
There are multiple versions of YOLOv5; we chose YOLOv5s because we wanted faster training and faster inference. The drop in accuracy compared to YOLOv5m and YOLOv5l is negligible.
After 100 epochs of training, the model achieved 99.85% training accuracy and 91.46% validation accuracy, which is better than expected.
To run inference in real time, we had to extract the model weights from Colab. The YOLOv5 GitHub repo was cloned, and the model weights were placed in the corresponding folder. A conda environment was created containing all the required packages, such as PyTorch. "python detect.py --source 0" was run to use the webcam. An RTX 3070 was used to accelerate inference, achieving 30+ fps.
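The local setup above can be summarized as a few commands. This is a sketch, assuming the weights downloaded from Colab were saved as best.pt (the default name YOLOv5 gives the best checkpoint); adjust names and paths to your environment:

```shell
# Clone the YOLOv5 repo and install its dependencies in a conda environment
git clone https://github.com/ultralytics/yolov5
cd yolov5
conda create -n yolov5 python=3.8 -y
conda activate yolov5
pip install -r requirements.txt

# Run real-time inference on the webcam (source 0) with the trained weights
python detect.py --weights best.pt --source 0
```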