This application runs the object detection algorithm YOLO on Intel Neural Compute Sticks and uses a browser based UI to display the predictions. The architecture of the system is as below.
The client is a React app. The UI allows three types of inputs - image, video and webcam stream. If the input is an image, it is converted to a base64 encoded string and sent to server. The server decodes the image and runs the detection algorithm on it. Server then sends back the prediction results - class, score and bounding box locations - which are then displayed on the UI. For video or video stream captured through UI, the workflow is same, except that in this case, each frame of the video is encoded as base64 string and sent to server.
To run the client, follow the below steps
cd app/
npm install
npm start
The client runs on port 3000.
The server is a simple Flask server.
To run the server:
- pip install Flask (prefered in virtualenv) https://flask.palletsprojects.com/en/1.1.x/installation/
- setup OpenVINO env as instructed by Intel
- run
python3 server_paralel.py
- plugin the device when you see the instruction, one by one
The server supports two APIs
-
POST /detect_objects
This API is used to send the image data to the server.
URL:
http://{device ip}:5000/detect_objects
request body:
{ "image": "Base 64 encoded image, as a string", "mode": "“parallel” or “ensemble”", "models": "one or more model names, as an array" }
The server accepts the request data and immediately responds with 200 response code. The data is processed asynchronously and the prediction results are stored in the server until requested by the client.
-
GET /detect_objects_response
This API is used to retrieve the prediction results from the server. The client repeatedly polls the server with this API until a response is given.
URL:
http://{device ip}:5000/detect_objects_response?models=<model names>
Response:
with 'mode' == 'parallel'
{ "model1": [ { "bbox": [ 1, 0, 200, 200 ], "class": "person", "score": 0.838 } ], "model2": [ { "bbox": [ 1, 0, 200, 200 ], "class": "person", "score": 0.838 } ], "model3": [ { "bbox": [ 1, 0, 200, 200 ], "class": "person", "score": 0.838 } ] }
with 'mode' == 'ensemble'
{ "all": [ { "bbox": [ 1, 0, 200, 200 ], "class": "person", "score": 0.838 } ] }