Week 4


Following up on the previous week's work, a total of 20,000 synthetic images were generated. An XML file containing the ground-truth label and bounding-box locations was also generated for each image.

As YOLOv5 uses .txt files with the Darknet syntax, the generated XML files containing Pascal VOC annotations cannot be used directly. A conversion script, 'convert_voc_yolo.py', was provided in the GitHub repo, but we decided to skip it and use Roboflow instead. Roboflow allows you to upload images along with their annotations in a variety of formats and convert between them. Roboflow also allows you to split the dataset into training, validation and testing sets without writing a script.
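For reference, the conversion Roboflow performed for us boils down to turning the VOC corner coordinates into normalized center/width/height values. A minimal sketch of that conversion (the XML field names follow the standard Pascal VOC layout; the `class_names` list is a hypothetical example):

```python
# Sketch of Pascal VOC -> YOLO (Darknet) annotation conversion.
# Roboflow handled this for us; this just shows the underlying math.
import xml.etree.ElementTree as ET

def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert absolute corner coordinates to normalized center/size."""
    x_c = (xmin + xmax) / 2.0 / img_w
    y_c = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return x_c, y_c, w, h

def convert_voc_file(xml_path, class_names):
    """Parse one VOC XML file and return YOLO-format label lines."""
    root = ET.parse(xml_path).getroot()
    img_w = int(root.find("size/width").text)
    img_h = int(root.find("size/height").text)
    lines = []
    for obj in root.findall("object"):
        cls_id = class_names.index(obj.find("name").text)
        b = obj.find("bndbox")
        box = voc_box_to_yolo(
            float(b.find("xmin").text), float(b.find("ymin").text),
            float(b.find("xmax").text), float(b.find("ymax").text),
            img_w, img_h)
        lines.append(f"{cls_id} " + " ".join(f"{v:.6f}" for v in box))
    return lines
```

Each output line is `class x_center y_center width height`, all normalized to [0, 1], which is the format YOLOv5 expects in its per-image .txt files.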

14,000 images were used for training, 4,000 for validation and 2,000 for testing. We decided to train on a Google Colab instance, as Colab provides a free cloud GPU for up to 12 hours per session.

Roboflow provides a template notebook for training a YOLOv5 model on Colab, which we used since we retrieved our training data through the Roboflow API anyway.

Please see “cards_YOLOv5.ipynb” for the notebook.

There are multiple sizes of YOLOv5; we chose YOLOv5s as we wanted faster training and faster inference. The drop in accuracy compared to YOLOv5m and YOLOv5l is negligible for our purposes.
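For reference, training YOLOv5s in the notebook comes down to one invocation of the repo's `train.py`. A hedged sketch of the command (the image size, batch size, and `data.yaml` path here are assumptions; the notebook fills in the Roboflow-generated values):

```shell
# Train YOLOv5s from the pretrained checkpoint for 100 epochs.
# --data points at the dataset config Roboflow exports.
python train.py --img 640 --batch 16 --epochs 100 \
    --data data.yaml --weights yolov5s.pt
```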

After 100 epochs of training, the model achieved 99.85% training accuracy and 91.46% validation accuracy, which is better than expected.

To run inference in real time, we had to extract the model weights from Colab. The YOLOv5 GitHub repo was cloned and the model weights were placed in the corresponding folder. A conda environment was created containing all the required packages, such as PyTorch. `python detect.py --source 0` was run to use the webcam. An RTX 3070 was used to accelerate inference, achieving 30+ fps.
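The local setup can be sketched as the following command sequence (the environment name, Python version, and weights filename `best.pt` are assumptions; YOLOv5 exports its best checkpoint under `runs/train/.../weights/`):

```shell
# Clone the repo and set up an isolated environment.
git clone https://github.com/ultralytics/yolov5
cd yolov5
conda create -n yolov5 python=3.9
conda activate yolov5
pip install -r requirements.txt
# best.pt is the checkpoint downloaded from Colab.
# --source 0 selects the default webcam.
python detect.py --weights best.pt --source 0
```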


