This Unity project is used for executing 2D image object recognition models on the HoloLens 2 using Unity's inference framework Sentis. As an example, the YOLOv8n and YOLOv10n models were used (the default models are included here). The model is executed as soon as the application is started. Recognized objects are marked with a label that displays the recognized class and the associated detection probability (as you can see in the image). This label is placed in the center of the detected object by firing a sphere cast onto the spatial mesh of the environment.
By combining the 2D position of the objects in the image with the position of the camera and the spatial mesh, the position of an object in 3D space can be estimated. To do this, the image is virtually positioned in front of the camera at a scale that matches the camera's field of view. Referring to the following image, the position is estimated by casting a sphere that starts at the position of the camera A and passes through the position of the object in the image B. The point of collision with the spatial mesh C is the estimated position of the object in 3D space.
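The construction of the A-through-B ray can be sketched as follows. This is a minimal Python sketch for illustration only; the project itself is written in C# (Unity), and all names, parameters, and coordinate conventions here are assumptions, not the project's identifiers.

```python
import math

def detection_ray(cam_pos, forward, right, up, u, v,
                  plane_width, plane_height, plane_distance):
    """Build the ray used for the sphere cast.

    (u, v) is the detection center in normalized image coordinates
    ([0, 1] x [0, 1], origin at the top-left corner). The virtual
    projection plane sits plane_distance in front of the camera and is
    scaled to match the camera's field of view.
    """
    # Point B: the detection's position on the virtual projection plane.
    point_b = tuple(
        cam_pos[i]
        + forward[i] * plane_distance
        + right[i] * (u - 0.5) * plane_width
        + up[i] * (0.5 - v) * plane_height
        for i in range(3)
    )
    # The sphere cast starts at the camera position A and goes through B;
    # the hit point on the spatial mesh (C) is the estimated 3D position.
    d = tuple(point_b[i] - cam_pos[i] for i in range(3))
    length = math.sqrt(sum(c * c for c in d))
    return cam_pos, tuple(c / length for c in d)
```

A detection in the image center yields a ray straight along the camera's forward vector; detections toward the right edge of the image tilt the ray toward the camera's right vector.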
To reduce the rate of false recognitions and increase robustness, an additional filter mechanism is integrated. A new object is only displayed in the user's field of view once it has been recognized in four consecutive model executions. Furthermore, objects are not removed immediately if they are not recognized in a model execution. An object is only removed if it has not been recognized in the user's field of view for three seconds. If the object is not in the user's field of view, it remains at its position.
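The filter behavior can be sketched roughly like this (illustrative Python, not the project's C# implementation; the class, method, and id handling are made up, and the constants mirror the MinTimesSeen and ObjectTimeOut parameters described later):

```python
import time

class DetectionFilter:
    """Debounce filter: show an object only after several consecutive
    detections; remove it only after a timeout in the field of view."""

    MIN_TIMES_SEEN = 4  # consecutive detections before an object is shown
    TIMEOUT = 3.0       # seconds unseen in the field of view before removal

    def __init__(self, now=time.monotonic):
        self.now = now
        self.seen_count = {}  # object id -> consecutive detection count
        self.last_seen = {}   # object id -> time of last detection

    def update(self, detected_ids, in_view_ids):
        """Process one model execution and return the ids to display."""
        t = self.now()
        for obj in detected_ids:
            self.seen_count[obj] = self.seen_count.get(obj, 0) + 1
            self.last_seen[obj] = t
        for obj in list(self.seen_count):
            if obj in detected_ids:
                continue
            if self.seen_count[obj] < self.MIN_TIMES_SEEN:
                # Detection streak broken before the object became visible.
                del self.seen_count[obj], self.last_seen[obj]
            elif obj not in in_view_ids:
                # Outside the field of view: keep the object at its position.
                self.last_seen[obj] = t
            elif t - self.last_seen[obj] > self.TIMEOUT:
                del self.seen_count[obj], self.last_seen[obj]
        return {o for o, n in self.seen_count.items()
                if n >= self.MIN_TIMES_SEEN}
```

In each model execution, detected objects accumulate a streak; an object is displayed once its streak reaches four, and is dropped only after it has been missing from the user's view for longer than the timeout.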
In this example code, a label and optional debug information are displayed for each recognized object.
If you would like to execute your own functionality based on the recognized objects, the method `TriggerDetectionActions` in the `YoloRecognitionHandler` class can be extended.
Various settings can be changed using the hand menu. The parameters of the program can be changed in the `Parameters` file. The following parameters are used in the context of the implementation:
- `ModelImageResolution`: Defines the resolution of the model input image. The YOLO default input size is 640 x 640.
- `ModelVersion`: Distinguishes between the used YOLO versions. Currently v8 and v10 are supported.
- `OverlapThreshold`: Defines when two bounding boxes describe the same object (measured as IoU).
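For reference, IoU (intersection over union) for two axis-aligned boxes can be computed as below. This is an illustrative Python sketch of the metric, not the project's C# code; the box format `(x_min, y_min, x_max, y_max)` is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned bounding boxes,
    each given as (x_min, y_min, x_max, y_max)."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

Two detections whose IoU exceeds `OverlapThreshold` are treated as the same object (non-maximum suppression keeps the higher-confidence one).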
These values define the presets available in the hand menu for the performance of the model execution. Each value specifies the number of model layers that are executed per frame.
- `LayersHigh`: Preset for a high model execution rate, but low framerate.
- `LayersLow`: Preset for a low model execution rate, but higher framerate.
These values define the presets available in the hand menu for the accuracy of the model recognitions. Each value specifies the minimum confidence an individual detection must have in order to be accepted.
- `ThresholdHigh`
- `ThresholdMedium`
- `ThresholdLow`
These options can be activated in the hand menu for debugging purposes.
- `Bounding boxes`: Visualizes the bounding boxes of the detected objects on the virtual projection plane.
- `Model debug image`: Displays the image that is fed into the model and draws bounding boxes for the detections inside the image.
- `Projection cubes`: Visualizes fixed positions in the image (marked by colored pixels in the debug image) as cubes at the projected positions on the virtual projection plane. Can be used to determine parameters for the projection. The aim is to ensure that the positions in the room match those in the image.
- `Debug sphere cast`: Visualizes the fired sphere casts.
This group of parameters deals with the 3D position estimation based on the 2D recognition in the image. The values for these parameters were determined using the debug options.
- `VirtualProjectionPlaneWidth`: Width of the projection plane (denoted as B in the image above).
- `MaxSphereLength`: Maximum distance at which objects are displayed.
- `SphereCastSize`: Size of the cast from the HoloLens to the mesh (a larger value makes it easier to hit smaller objects).
- `HeightOffset`: Offset of the projection plane.
- `SphereCastOffset`: Offset of the start position of the sphere casts. This represents the position of the HoloLens camera relative to the eyes.
- `MinTimesSeen`: Number of model executions in which an object must be detected before it is shown to the user.
- `ObjectTimeOut`: Time in seconds an object must be absent before it is deleted.
- `MaxIdenticalObject`: Maximum distance between two objects of the same class for them to be treated as the same object across detections.
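The role of `MaxIdenticalObject` can be sketched as a simple class-and-distance check (illustrative Python, not the project's C#; the object representation and the threshold value are made up):

```python
import math

def is_same_object(obj_a, obj_b, max_identical_object=0.5):
    """Decide whether two detections describe the same physical object.

    Each object is a (class_name, (x, y, z)) pair; max_identical_object
    corresponds to the MaxIdenticalObject parameter (value here is an
    arbitrary example, in meters).
    """
    cls_a, pos_a = obj_a
    cls_b, pos_b = obj_b
    if cls_a != cls_b:
        # Different classes are never merged.
        return False
    # Merge only if the estimated 3D positions are close enough.
    return math.dist(pos_a, pos_b) <= max_identical_object
```

A new detection that matches an existing object by this test updates that object instead of spawning a second label.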
To use your own model, the following steps must be carried out:
- Train your own YOLOv8n or YOLOv10n model
- Export the trained model to onnx, see Yolov8 / Yolov10
- Copy the trained model to the `Assets/Models` folder
- Update the linked model in the `YoloObjectLabeler` prefab or directly in the scene
- Update the detected classes in the `ObjectClass` script
  - Note: Must be the same order as in config.yaml for training
- Update the `ModelImageResolution` in the `Parameters` script if you changed the input resolution of the model (default: 640 x 640)
Performance recommendations:
- The fewer classes the model has to differentiate between, the better the performance. The default model has quite a few classes and is therefore relatively slow
- Quantization to 16 bit has little to no effect (8 bit has not been tried yet)
In addition to YOLO, other ONNX models should also be usable. To do so, the logic for reading the model output tensor in the `YoloModelOutputProcessor` class has to be adapted accordingly.
To install the latest version of the app on the HoloLens, perform the following steps:
- Install required software:
  - git
  - Unity version 2021.3.22f1 with the `Universal Windows Platform Build Support` module
    - Recommendation: Do not install Visual Studio as a DevTool with Unity, since only Visual Studio 2019 is available
  - Visual Studio 2017 or greater (Recommendation: 2022) with the following workloads and components (can be modified via the Visual Studio Installer):
    - `.NET desktop development` workload
    - `Desktop development with C++` workload
    - `Universal Windows Platform development` workload
      - Make sure that under `installation details` the following components are included for the workload:
        - `USB-Connectivity`
        - `C++ (vNNN) Universal Windows Platform tools` (newest version)
        - any `Windows SDK` (Recommendation: Windows 11)
    - Make sure that the following components are included under the `Game development with Unity` workload:
      - `C++ Universal Windows Platform support for vNNN build tools (ARM64)` component
      - `MSVC vNNN - VS NNNN C++ ARM build tools (Latest)` component
      - `MSVC vNNN - VS NNNN C++ ARM64 build tools (Latest)` component
- Activate `Developer Mode` on your local machine
- In the HoloLens go to: `Settings` -> `Update & Security` -> `For developers` and enable `Use Developer Features` and `Device Discovery`
- Clone the git repo
- Open the project in Unity
- In Unity, select `File` -> `Build Settings`
  - Check that `HoloLensYOLOObjectDetectionScene` is selected as Scenes in Build
  - Check that `Universal Windows Platform` is selected as Platform
  - Check that `ARM-64` is selected as `Build Target Platform`
  - Optional: Change the name of the app by clicking on `Player Settings...` and enter a custom name as `Product name` and as `Package name` (`Universal Windows Platform` -> `Publishing settings`)
- Click on Build and select a target folder (Recommendation: empty folder)
  - Note: If someone else has already installed an app with the same name on the HoloLens, it can sometimes happen that this version is not overwritten. Renaming is then necessary. This is also the case when you get the error code `0x80070057` during deployment
- Open the built solution in Visual Studio
- In Visual Studio select `Release`, `ARM64` and `Device` as the build target (in the top tool bar)
- Connect the HoloLens to the PC via a USB cable
  - Note: The HoloLens should appear in the Windows Explorer
- In Visual Studio click on `Build`
  - Note: When deploying to the HoloLens from a PC for the first time, a PIN prompt appears during the deployment process. This PIN can be found on the HoloLens as follows: `Settings` -> `Update & Security` -> `For developers` -> Click on `Pair` under `Device discovery`
The Windows Device Portal can be used, for example, to stream the HoloLens image to the PC or to analyse the performance of an app on the HoloLens. To open it, carry out the following steps:
- Connect the HoloLens to the PC via a USB cable, or connect the PC and the HoloLens to the same WLAN network
- In the HoloLens go to: `Settings` -> `Update & Security` -> `For developers` and enable the `Device Portal`
- In the HoloLens go to: `Settings` -> `Update & Security` -> `For developers` and enter the first URL displayed under `Device Portal` in the browser on the PC
- The video stream is available at `Views` -> `Mixed Reality Capture` -> `Live preview`
To test the model execution in Unity, the virtual camera of OBS can be used as the camera input of the model. In this case, the objects are projected at a fixed distance from the user, relative to their position in the image.
In order to debug your C# code directly on the HoloLens, follow these steps (closely based on https://stackoverflow.com/a/59990792):
- In the `Project Settings` -> `Player` enable the capabilities `PrivateNetworkClientServer` and `InternetClientServer`
- In the `Build Settings` enable `Development Build`, `ScriptDebugging` and `Wait For Managed Debugger`
  - Note: Changing the `Build configuration` to `Debug` increases the chance that debugging will work successfully. However, this further reduces performance. If this option is selected, `Debug` must also be selected instead of `Release` when deploying from Visual Studio.
- Build and deploy your project as usual.
- Connect the PC and the HoloLens to the same WLAN (the WLAN must support multicast -> Recommendation: create a hotspot with the PC)
- Disconnect the USB cable from the HoloLens and start the app -> a pop-up occurs
- Open the `HoloLens-YOLO-Object-Detection.sln` in Visual Studio and click on `Debug` -> `Attach Unity Debugger`. The HoloLens should appear in the list -> select the HoloLens
  - If the `HoloLens-YOLO-Object-Detection.sln` is not part of your repository folder:
    - In Unity select `Edit` -> `Preferences` -> `External tools`
    - Select Visual Studio as `External Script Editor`
    - Select `Embedded packages` and `Local packages` under `Generate .csproj files for:`
    - Click on `Regenerate project files`
- In the HoloLens close the pop-up
- Set break points in Visual Studio and debug your app.
Note: The app is significantly slower when debugging, and deploying the app also takes longer.
The project was inspired by YoloHolo and parts of the code are based on it.