Deep Learning
BeagleBoard-X15, BeagleBone-AI and BeagleBone-AI64 all have accelerators for running deep learning tasks using TIDL (1, 2). We'd love projects that enable people to build more deep learning applications and end-nodes and to leverage cloud-based training more easily. The goal here is to create tools that make learning about and applying AI and deep learning easier. Contributions to projects like ArduPilot and DonkeyCar (DIY Robocars and BlueDonkey) that introduce autonomous navigation to mobile robots are good candidates.
For some background, be sure to check out the "simplify embedded edge AI development" post from TI.
YOLO models on the X15/AI-64
Port the YOLO model(s) to the X15/AI so the accelerator blocks can be leveraged. Currently, running a frame through YOLOv2-tiny takes anywhere from 15 to 35 seconds depending on how the code is run on the ARM: 35 seconds with a pure brute-force compilation for ARM, 15 seconds utilizing NEON and tweaked algorithms. The goal is to get this down to 1 second or less using the onboard accelerators. Note that there are over 6 different variants of YOLO (YOLOv1, YOLOv2 and YOLOv3, each with a full-size and a tiny version). The main interest is in getting either the YOLOv2 or YOLOv3 versions running. Please discuss the desired approach with potential mentors, as there are many possibilities. Just to name a few: porting the YOLO model into TIDL; using OpenCL directly; OpenCL integration with an acceleration library; integrating TIDL support with an acceleration library.
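As a concrete point of reference for the ARM-only baseline numbers above, here is a minimal sketch of timing a single forward pass through Darknet's C API. It is a sketch only: the function names and signatures (`load_network`, `letterbox_image`, `network_predict`, ...) are assumed from pjreddie/darknet's `darknet.h`, and the cfg/weights arguments would be the stock `yolov2-tiny` files; an accelerated port would replace the work inside `network_predict()` with TIDL- or OpenCL-backed layers.

```c
/* Hedged sketch: time one YOLOv2-tiny forward pass on the ARM cores using
 * Darknet's C API (signatures assumed from pjreddie/darknet's darknet.h).
 * Build against libdarknet; pass the stock cfg/weights/image paths. */
#include <stdio.h>
#include <time.h>
#include "darknet.h"

int main(int argc, char **argv)
{
    if (argc < 4) {
        fprintf(stderr, "usage: %s cfg weights image\n", argv[0]);
        return 1;
    }

    network *net = load_network(argv[1], argv[2], 0);   /* e.g. cfg/yolov2-tiny.cfg */
    set_batch_network(net, 1);

    image im = load_image_color(argv[3], 0, 0);
    image sized = letterbox_image(im, net->w, net->h);  /* resize to network input */

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    network_predict(net, sized.data);                   /* the part to accelerate */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("forward pass: %.2f s\n", secs);

    free_image(sized);
    free_image(im);
    return 0;
}
```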
Goal: Run YOLOv2 or YOLOv3 with the onboard hardware acceleration.
Hardware Skills: None
Software Skills: C, C++, Linux kernel, Understanding of NNs and Convolution.
Possible Mentors: Hunyue Yau (ds2)
Expected Size of Project: 350 hrs
Rating: Medium
Upstream Repository: Numerous
References: https://pjreddie.com/darknet/yolo/
OpenGLES acceleration for DL
Current acceleration on the X15/AI focuses on using the EVE and DSP hardware blocks. The SoC on those boards also features an OpenGLES-enabled GPU. The goal of this project is to utilize GPU shaders to perform the computations. A possible framework to build this on is the Darknet CNN framework.
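To make concrete what a shader implementation has to compute, the sketch below is an unoptimized, CPU-only reference of the direct 2D convolution that dominates Darknet's convolutional layers. Function names and tensor layout are illustrative, not Darknet's internals; a GLES port would typically pack the input and filters into textures and evaluate the inner accumulation loop in a fragment or compute shader, one output element per invocation.

```c
/* Reference direct convolution (stride 1, zero padding), illustrative only.
 * Tensors are laid out as [channel][row][col]; kernels as [out_c][in_c][k][k].
 * This is the per-output-element loop a GLES shader would evaluate. */
#include <stddef.h>

void conv2d_direct(const float *in,   int in_c, int h, int w,
                   const float *kern, int out_c, int k,      /* k x k kernels */
                   float *out)                               /* out_c x h x w */
{
    int pad = k / 2;
    for (int oc = 0; oc < out_c; ++oc) {
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                float acc = 0.f;
                for (int ic = 0; ic < in_c; ++ic) {
                    for (int ky = 0; ky < k; ++ky) {
                        for (int kx = 0; kx < k; ++kx) {
                            int iy = y + ky - pad;
                            int ix = x + kx - pad;
                            if (iy < 0 || iy >= h || ix < 0 || ix >= w)
                                continue;                    /* zero padding */
                            float v = in[(ic * h + iy) * w + ix];
                            float f = kern[((oc * in_c + ic) * k + ky) * k + kx];
                            acc += v * f;
                        }
                    }
                }
                out[(oc * h + y) * w + x] = acc;
            }
        }
    }
}
```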
Goal: Accelerate as many layer types as possible using OpenGLES.
Hardware Skills: None
Software Skills: C, C++, Linux kernel, OpenGLES, Understanding of NNs and Convolution.
Possible Mentors: Hunyue Yau (ds2)
Expected Size of Project: 350 hrs
Rating: Medium
Upstream Repository: Numerous
References: https://pjreddie.com/darknet/