A low-cost, accessible framework for measuring and comparing the visual perception performance of Video See-Through (VST) head-mounted displays (HMDs) against natural human vision.
Video see-through (VST) technology aims to seamlessly blend virtual and physical worlds by reconstructing reality through cameras. However, it remains unclear how close these systems are to replicating natural human vision across varying environmental conditions. This benchmark provides a standardized method to quantify the perceptual gap between VST headsets and the human eye.
This benchmark is designed to:
- Evaluate VST HMDs: Assess the visual perception capabilities of VST headsets (e.g., Apple Vision Pro, Meta Quest 3, Quest Pro) across different lighting conditions
- Compare with Human Vision: Quantitatively measure how VST systems perform relative to natural, naked-eye vision
- Identify Limitations: Reveal specific areas where current VST technology falls short, particularly in low-light environments
- Guide Development: Provide actionable metrics that can inform the design and optimization of future VST HMDs
- Support Research: Offer a standardized, replicable methodology for visual perception assessment in mixed reality systems
The benchmark adapts psychophysical methods from vision science to evaluate three fundamental aspects of visual perception:
Visual acuity measures the clarity or sharpness of vision—the ability to discern fine details. It is quantified using:
- Visual Angle: The angle an object subtends at the eye:
  Visual Angle = 2 × arctan(Object Size / (2 × Object Distance))

- MAR (Minimum Angle of Resolution): The smallest gap size that can be resolved, measured in arc minutes
- Decimal Visual Acuity: The reciprocal of MAR (normal vision = 1.0)
- logMAR: Logarithm of MAR (normal vision = 0)
- Snellen Fraction: Traditional notation like 20/20 vision
The test uses a Tumbling E Chart with a bisection method to precisely determine the smallest recognizable optotype size.
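The acuity metrics above can be sketched in Python. This is an illustrative helper, not code from the Unity project; the function names and the 20-foot Snellen convention are assumptions:

```python
import math

def visual_angle_arcmin(object_size_m: float, distance_m: float) -> float:
    """Visual angle subtended by an object, in arc minutes."""
    angle_rad = 2 * math.atan(object_size_m / (2 * distance_m))
    return math.degrees(angle_rad) * 60

def acuity_metrics(gap_size_m: float, distance_m: float) -> dict:
    """Convert the smallest resolvable gap into MAR, decimal acuity,
    logMAR, and an equivalent Snellen fraction."""
    mar = visual_angle_arcmin(gap_size_m, distance_m)  # arc minutes
    decimal = 1.0 / mar        # normal vision: MAR = 1' -> 1.0
    logmar = math.log10(mar)   # normal vision: 0.0
    snellen = 20 * mar         # e.g. MAR = 2' -> 20/40
    return {"MAR": mar, "decimal": decimal, "logMAR": logmar,
            "Snellen": f"20/{snellen:.0f}"}
```

For example, a gap of about 0.29 mm viewed from 1 m subtends roughly 1 arc minute, which corresponds to decimal acuity 1.0, logMAR 0, and Snellen 20/20.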
Contrast sensitivity measures the ability to discern differences between light and dark, particularly important in low-light conditions and for detecting fine details. It is quantified using:
- Weber Contrast:

  (Luminance_target − Luminance_background) / Luminance_background

- Contrast Sensitivity: The inverse of the contrast threshold
- logCS: Logarithm of contrast sensitivity (normal = 2.0 for Pelli-Robson)
The test uses a Pelli-Robson Chart that presents letters at fixed size but decreasing contrast levels.
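The contrast definitions above translate directly to code. A minimal Python sketch (illustrative only; these helpers are not part of the project):

```python
import math

def weber_contrast(l_target: float, l_background: float) -> float:
    """Weber contrast of a target letter against its background."""
    return (l_target - l_background) / l_background

def log_contrast_sensitivity(threshold_contrast: float) -> float:
    """Contrast sensitivity is the inverse of the threshold contrast;
    logCS is its base-10 logarithm (Pelli-Robson normal is about 2.0)."""
    sensitivity = 1.0 / abs(threshold_contrast)
    return math.log10(sensitivity)
```

A participant who can still read letters at a contrast threshold of 0.01 has a contrast sensitivity of 100, i.e. logCS = 2.0.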
Color vision assesses the ability to distinguish different hues and perceive accurate color representation. The test uses:
- Farnsworth-Munsell 100 Hue Test: Participants arrange 85 colored caps in correct hue order across 4 groups
- Total Error Score (TES): Calculated by summing placement errors:
  TES = Σ (|O(i−1) − O(i)| + |O(i+1) − O(i)| − 2)

  where O(i) is the hue index of the cap placed at position i.
A lower error score indicates better color discrimination ability.
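The TES formula can be sketched in Python as follows. This is a minimal illustration of the summation above, assuming the input list includes the fixed caps at each end of the row; it is not the project's own scoring code:

```python
def total_error_score(cap_order: list[int]) -> int:
    """Farnsworth-Munsell Total Error Score for one arranged row.
    cap_order lists the true hue index of each cap in the order the
    participant placed them, including the fixed caps at both ends.
    A perfect arrangement scores 0."""
    score = 0
    # Sum the per-cap error over the movable (interior) positions.
    for i in range(1, len(cap_order) - 1):
        score += (abs(cap_order[i - 1] - cap_order[i])
                  + abs(cap_order[i + 1] - cap_order[i]) - 2)
    return score
```

Swapping two adjacent caps raises the score, since each affected position's neighbor differences grow beyond the minimum of 2.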
- Unity 6000.0.39f1 or later
- Platform Support:
  - PC: all tests supported
  - Android: visual acuity and contrast sensitivity tests only
- Clone this repository
- Open the project in Unity
- Load the desired test scene from `Assets/Scenes/`:
  - `Visual acuity test.unity`
  - `Contrast sensitivity test.unity`
  - `Color vision test.unity`
All tests follow a similar workflow:

1. Configure Participant Information
   - Select Participant ID (1-64)
   - Choose Device type: eyes for testing naked-eye vision, or A, B, C for testing different VST HMDs
   - Select Trial type: Both for the standard test, or a custom trial condition
2. Start the Test
   - Click the "Start" button to begin
   - Follow the on-screen instructions for each specific test
3. Complete the Test
   - Results are automatically saved to CSV files in the `Recordings` folder
Platform: PC and Android
Purpose: Measures the smallest visual detail that can be resolved at a given distance.
How to Use:
- Launch the `Visual acuity test` scene
- Calibration (first-time setup):
- Click "Calibration" button
- Adjust the vertical and horizontal sliders to center the E optotype
- Adjust the Size slider on the right until the letter "E" has a physical height of 2.4 cm on the screen
- Click "Save" to store calibration settings
- Testing:
- Maintain a viewing distance of 1 meter from the screen
- Configure participant ID, device, and trial
- Click "Start Test"
- A single letter "E" will appear in one of four orientations (↑↓←→)
- Press the arrow key corresponding to the E's orientation
- The E size will adjust using a bisection algorithm based on your responses
- Continue until the test converges on your visual acuity threshold
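The size-adjustment loop above is a standard bisection search on the optotype scale. A generic sketch (not the project's implementation; variable names are my own):

```python
def bisection_step(responded_correctly: bool,
                   current: float, lo: float, hi: float):
    """One bisection step toward the acuity threshold.
    A correct response means the optotype can be made smaller
    (it becomes the new upper bound); an incorrect response
    means the threshold lies above the current size."""
    if responded_correctly:
        hi = current   # threshold is at or below this size
    else:
        lo = current   # threshold is above this size
    next_size = (lo + hi) / 2
    return next_size, lo, hi
```

The loop repeats until the [lo, hi] interval shrinks below a convergence tolerance; the same scheme drives the contrast sensitivity test, with contrast level in place of optotype size.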
Output: See Output Format section below.
Platform: PC and Android
Purpose: Measures the ability to detect low-contrast patterns, crucial for vision in varying lighting conditions.
How to Use:
- Launch the `Contrast sensitivity test` scene
- Calibration (first-time setup):
- Ensure the display brightness is set to 100%
- Testing:
- Maintain a viewing distance of 1 meter from the screen
- Configure participant ID, device, and trial
- Click "Start Test"
- Two letters will appear with reduced contrast against the background
- Type the two letters you see using your keyboard
- The contrast level will adjust based on your responses using a bisection method
- Continue until the test determines your contrast sensitivity threshold
- Perform Grayscale vs Weber Contrast calibration later to ensure accurate contrast conversion
Output: See Output Format section below.
Platform: PC ONLY (designed for calibrated monitors)
Purpose: Evaluates color discrimination ability and the accuracy of color representation through VST systems.
How to Use:
- Launch the `Color vision test` scene on a PC
- Ensure your monitor is properly calibrated for accurate color reproduction
- Maintain a viewing distance of 50 cm from the monitor
- Configure participant ID, device, and trial
- Click "Start Test"
- You will see 4 rows of colored caps with fixed caps at each end
- Drag and drop the colored caps to arrange them in order of hue
- Each row should transition smoothly from the starting color to the ending color
- Click "Submit" after arranging all rows
- The test calculates your Total Error Score based on placement accuracy
Output: See Output Format section below.
All test results are saved as CSV files in the following locations:
- PC: `Assets/Recordings/`
- Android: `Application.persistentDataPath/Recordings/`
Filename: VA_ID_{ParticipantID}_Device_{DeviceName}_Trail_{TrailType}{Timestamp}.csv
Columns:
- Timestamp: Test execution timestamp (yyyy_MM_dd_hh_mm_ss_ffffff)
- ID: Participant ID (1-64)
- Device: Device type (eyes, A, B, C)
- trail: Trial condition
- time interval: Time between responses (seconds)
- scale: Current E optotype scale
- currentMinScale: Minimum scale in current bisection range
- currentMaxScale: Maximum scale in current bisection range
- Width: Screen width (pixels)
- Height: Screen height (pixels)
- Correct Button: Correct orientation (up/down/left/right)
- Pressed Button: User's response
- Is Correct: Whether the response was correct (True/False)
- VA: Calculated visual acuity (decimal format)
Filename: VCS_ID_{ParticipantID}_Device_{DeviceName}_Trail_{TrailType}{Timestamp}.csv
Columns:
- Timestamp: Test execution timestamp
- ID: Participant ID
- Device: Device type
- trail: Trial condition
- time interval: Time between responses (seconds)
- Virtual contrast: Current contrast level (0-1)
- currentMinContrast: Minimum contrast in bisection range
- currentMaxContrast: Maximum contrast in bisection range
- Correct letter: Correct letter pair
- Pressed letter: User's input
- Is Correct: Whether the response was correct (True/False)
- Virtual contrast sensitivity: Calculated contrast sensitivity (1/contrast)
Filename: VC_ID_{ParticipantID}_Device_{DeviceName}_Trail_{TrailType}{Timestamp}.csv
Columns:
- id: Participant ID
- currentDevice: Device type
- currentTrail: Trial condition
- row1: Semicolon-separated cap indices for row 1
- row2: Semicolon-separated cap indices for row 2
- row3: Semicolon-separated cap indices for row 3
- row4: Semicolon-separated cap indices for row 4
- errorScore: Total Error Score (TES); lower is better
- test_time: Total test duration (seconds)
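Since each test writes plain CSV, results can be loaded with standard tooling. A minimal sketch that reads a visual acuity results file and returns the acuity from the final trial, assuming only the VA column documented above (the helper name is my own):

```python
import csv

def final_visual_acuity(csv_path: str) -> float:
    """Read a VA_*.csv results file and return the visual acuity
    (decimal format, 'VA' column) from the last recorded trial."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    return float(rows[-1]["VA"])
```

The same pattern applies to the contrast sensitivity files, reading the "Virtual contrast sensitivity" column instead.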
Contributions are welcome! Please feel free to submit issues or pull requests to improve the benchmark.
- Eye-Chart-Fonts: Special thanks to Denis Pelli for providing the Sloan and Pelli fonts used in this project. Note that these fonts are licensed under CC BY-NC-SA 4.0.
If you find this benchmark useful for your research, please cite our paper:
```bibtex
@misc{wang2026perceptual,
  title={The perceptual gap between video see-through displays and natural human vision},
  author={Jialin Wang and Songming Ping and Kemu Xu and Yue Li and Hai-Ning Liang},
  year={2026},
  eprint={2601.02805},
  archivePrefix={arXiv},
  primaryClass={cs.HC},
  url={https://arxiv.org/abs/2601.02805},
}
```

The data used for the paper can be found in the Dataset folder.
For questions or support, please open an issue on this repository.
Contact: chaosikaros@outlook.com