NOA Longitudinal Validation On Home OCT for AMD

03/16/2026
Longitudinal home OCT data were used to evaluate whether an artificial intelligence system can distinguish meaningful disease-activity change from stability when patients self-image frequently outside the clinic.
In this report, the authors assessed the Notal OCT Analyzer (NOA) by comparing AI-generated time series of retinal total hyporeflective volume (TRO) against expert graders’ determinations of whether each eye’s 5-week trajectory was stable or changing. Rather than focusing on single scans, the analysis centers on how trajectories behave across repeated measurements and on whether longitudinal patterns align with expert classification. The central question was whether NOA-derived longitudinal signals could discriminate TRO change from stability in the reported dataset.
The underlying prospective home-imaging study was conducted at 7 U.S. retina clinics from June 22, 2021, to December 15, 2022, and enrolled adults (≥55 years) with nAMD in at least one eye; other potentially confounding retinal pathologies, including epiretinal membrane, were not exclusionary. The report describes 198 consented participants, with 180 initiating at-home testing; 317 study eyes were enrolled, and 296 eyes’ trajectories were included after exclusion of 21 eyes without meaningful longitudinal data (defined as at least 4 tests). Across included eyes, participants completed 8242 tests over 9563 monitoring days, with a mean (SD) of 27.8 (7.0) tests per eye and 32.3 (6.2) days per eye. This testing volume and near-daily sampling underpin the trajectory-based comparisons to expert grading.
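The inclusion rule above (an eye needs at least 4 completed tests to contribute a meaningful trajectory) amounts to a simple filter over per-eye test counts. The sketch below is our illustration, not the authors’ code; the eye identifiers and counts are invented.

```python
# Illustrative sketch of the report's inclusion rule: an eye must have
# at least 4 completed home-OCT tests to contribute a trajectory.
from statistics import mean

MIN_TESTS = 4  # "meaningful longitudinal data" threshold per the report

def split_eyes(tests_per_eye):
    """Partition eyes into included/excluded by completed test count.

    tests_per_eye: dict mapping an eye identifier (invented here) to its
    number of completed home-OCT tests over the monitoring period.
    """
    included = {eye: n for eye, n in tests_per_eye.items() if n >= MIN_TESTS}
    excluded = {eye: n for eye, n in tests_per_eye.items() if n < MIN_TESTS}
    return included, excluded

# Hypothetical counts for four eyes (identifiers and values are invented):
counts = {"OD-001": 30, "OS-001": 3, "OD-002": 25, "OS-002": 28}
included, excluded = split_eyes(counts)
print(sorted(included))                    # eyes retained for analysis
print(round(mean(included.values()), 1))   # mean tests per included eye
```

In the study itself this filter left 296 of 317 enrolled eyes, averaging 27.8 tests per eye.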
For personalized longitudinal classification, the authors applied a reference change value (RCV) framework adapted from laboratory medicine, summarizing each eye’s fitted TRO trajectory by a signal-to-noise ratio (SNR) that expresses the amplitude of change relative to within-subject variation. Using expert labels (stable vs changing) as ground truth for the 5-week home-OCT time series, ROC analysis of the SNR yielded a reported AUROC of 0.9811 and an optimal SNR cutoff of 2.42, corresponding to 99.1% sensitivity, 89.4% specificity, and 94.2% accuracy for distinguishing TRO change from stability in this dataset. In this framing, the authors describe the personalized approach as a trajectory-level method intended to separate true change from expected measurement variation.
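A minimal sketch of an SNR-style trajectory classification, under our own simplifying assumptions: the trajectory is summarized by a least-squares linear fit, the “signal” is the fitted change across the window, and the “noise” is the residual standard deviation. The published method’s exact SNR definition may differ in detail; only the 2.42 cutoff is taken from the report, and the example trajectories are invented.

```python
# Sketch (not the authors' code) of an RCV-style signal-to-noise
# classification of one eye's TRO trajectory over a monitoring window.
import math

SNR_CUTOFF = 2.42  # optimal cutoff reported for the study dataset

def snr_classify(days, tro_vu):
    """Return (snr, label) for one eye's TRO series (VU) over time (days)."""
    n = len(days)
    mx, my = sum(days) / n, sum(tro_vu) / n
    sxx = sum((d - mx) ** 2 for d in days)
    slope = sum((d - mx) * (v - my) for d, v in zip(days, tro_vu)) / sxx
    intercept = my - slope * mx
    resid = [v - (intercept + slope * d) for d, v in zip(days, tro_vu)]
    # "Noise": residual SD around the fitted line (n - 2 d.o.f.)
    noise = math.sqrt(sum(r * r for r in resid) / (n - 2))
    # "Signal": fitted TRO change across the whole window
    signal = abs(slope) * (days[-1] - days[0])
    snr = signal / noise if noise > 0 else float("inf")
    return snr, ("changing" if snr > SNR_CUTOFF else "stable")

# Invented example series sampled every 5 days over ~5 weeks:
days = list(range(0, 35, 5))
flat = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.1]          # stable eye
rising = [5.0, 6.5, 8.2, 9.8, 11.4, 13.1, 14.6]     # changing eye
print(snr_classify(days, flat)[1])
print(snr_classify(days, rising)[1])
```

The design mirrors the report’s intent: the cutoff is applied to a per-eye ratio, so an eye with noisy but flat measurements is not flagged merely because individual scans fluctuate.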
The report also presents a population-threshold strategy based on the observed maximum-to-minimum change in TRO over the monitoring period, reporting an AUROC of 0.9687 with an optimal uniform threshold of 3.88 VU and associated sensitivity of 94.4%, specificity of 89.4%, and accuracy of 91.9%. By contrast, the commonly used 10 VU threshold was reported at 76.6% sensitivity, 94.7% specificity, and 85.7% accuracy. In the authors’ presentation, the fixed 10 VU cutoff yields fewer false notifications at the expense of missed change events relative to more sensitive thresholds, while the personalized method is described as favoring sensitivity in this dataset. Together, these results outline the reported trade-offs between individualized and uniform thresholding approaches.
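The population-threshold strategy reduces to flagging an eye when its observed max-to-min TRO change exceeds a uniform cutoff, and then scoring that rule against expert labels. The sketch below is our illustration: the trajectories and labels are invented, and only the 3.88 VU and 10 VU cutoffs come from the report.

```python
# Sketch (not the authors' code) of uniform max-min thresholding and
# its sensitivity/specificity against expert stable/changing labels.

def max_min_change(tro_vu):
    """Observed max-to-min TRO change (VU) over the monitoring period."""
    return max(tro_vu) - min(tro_vu)

def sensitivity_specificity(trajectories, labels, threshold_vu):
    """labels: True = expert-graded 'changing', False = 'stable'."""
    tp = fn = tn = fp = 0
    for tro, changing in zip(trajectories, labels):
        flagged = max_min_change(tro) > threshold_vu
        if changing:
            tp += flagged
            fn += not flagged
        else:
            tn += not flagged
            fp += flagged
    return tp / (tp + fn), tn / (tn + fp)

# Invented trajectories: one stable eye, two with ~6-7 VU of change.
trajs = [[5.0, 5.2, 4.9, 5.1], [5.0, 8.0, 11.0, 12.0], [4.0, 6.0, 9.0, 10.5]]
labels = [False, True, True]
print(sensitivity_specificity(trajs, labels, 3.88))
print(sensitivity_specificity(trajs, labels, 10.0))
```

In this toy set, the 3.88 VU cutoff catches both changing eyes while the 10 VU cutoff catches neither, mirroring the qualitative trade-off the report describes (a stricter uniform cutoff trades sensitivity for specificity).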
Illustrative timing examples in the report show expert-identified hyporeflective spaces preceding a fixed-threshold crossing: in one case, the 10 VU threshold was reached 11 days after initial detection (August 15 to August 26, 2022); in another, the trajectory stood at 8.7 VU at treatment, 8 days after initial detection (August 29 to September 6, 2022). The authors also describe sources of misclassification tied to segmentation and image or trajectory noise. These include epiretinal membrane–related confounding, in which an area under an ERM can be misclassified as hyporeflective space, as well as fixation/centration errors and segmentation inaccuracies that may affect TRO estimation. The report notes that the system includes an interface for setting notification thresholds and discusses potential workflow considerations, such as alert frequency and review burden, in the context of the analyzed 5-week TRO trajectories and their expert-labeled ground truth.
Key Takeaways:
- The authors report a near-daily home OCT trajectory dataset analyzed over a five-week monitoring period, supporting longitudinal comparison of AI-generated TRO trends with expert grading.
- A personalized RCV/SNR method was reported to show strong discrimination between change and stability when evaluated against expert-labeled trajectories.
- A fixed 10 VU cutoff was reported to trade sensitivity for specificity, and the report describes timing examples and failure modes (including ERM-related segmentation confounding) that may influence classification in some cases.
