Benchmarking
To improve the validity, integrity, and accuracy of BP estimation algorithms, we introduce and referee the first open-contribution BP estimation benchmark. This benchmark ensures that the results reported in BP studies are valid (statistically significant across different datasets), reproducible (code, data, experimental protocols, and device details are provided), and accurate (high ED). We list the results of the open-contribution benchmark below in the form of SBP ED : DBP ED so that studies using different datasets can be compared:
| Estimator | MIMIC (filter: Kachuee et al., 2017) | MIMIC (filter: Hasandazeh et al., 2019) | PPG-BP (filter: Liang et al., 2018) | VitalDB (filter: Zhang et al., 2021) |
|---|---|---|---|---|
| Dagamseh et al. (2021) | 1.0:0.99 | 1.4:1.47 | 1.09:0.97 | 1.06:1.04 |
| Hasandazeh et al. (2019) | 1.05:1.02 | 1.45:1.4 | 1.13:1.0 | 1.05:1.08 |
| Jeong et al. (2021) | 1.0:1.0 | n/a | n/a | 1.06:1.03 |
| Huang et al. (2022) | 1.0:1.0 | n/a | n/a | 1.05:1.08 |
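A table cell could be assembled with a small helper along the following lines. This is a minimal sketch, not the benchmark's actual code: the `explained_deviation` function below assumes ED is the ratio of the reference BP standard deviation to the estimation-error standard deviation (higher is better); the official metric definition and implementation in the WearableBP repository take precedence.

```python
import numpy as np

def explained_deviation(bp_ref, bp_est):
    """Hypothetical ED metric: ratio of the reference BP standard deviation
    to the estimation-error standard deviation (an assumption for
    illustration; use the benchmark's official metric when contributing)."""
    bp_ref = np.asarray(bp_ref, dtype=float)
    bp_est = np.asarray(bp_est, dtype=float)
    return np.std(bp_ref) / np.std(bp_ref - bp_est)

def benchmark_entry(sbp_ref, sbp_est, dbp_ref, dbp_est):
    """Format one table cell as 'SBP ED : DBP ED', rounded to two decimals."""
    sbp_ed = explained_deviation(sbp_ref, sbp_est)
    dbp_ed = explained_deviation(dbp_ref, dbp_est)
    return f"{sbp_ed:.2f}:{dbp_ed:.2f}"

# Synthetic example (not real benchmark data):
rng = np.random.default_rng(0)
sbp_ref = rng.normal(120, 15, size=500)
dbp_ref = rng.normal(80, 10, size=500)
sbp_est = sbp_ref + rng.normal(0, 14, size=500)   # noisy estimator
dbp_est = dbp_ref + rng.normal(0, 9, size=500)
print(benchmark_entry(sbp_ref, sbp_est, dbp_ref, dbp_est))  # an 'SBP ED : DBP ED' pair
```

Under this assumed definition, values near 1.0 would mean the estimator's error spread is roughly as large as the dataset's BP spread, i.e., little better than always predicting the mean.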
Although the benchmarks currently cover only PPG-based studies, they will be expanded to other wearable sensing modalities as such sensors become cheaper and more ubiquitous.
Open Contribution
In addition to the implemented benchmarks, we provide an update form so that researchers can improve upon existing benchmarks and add new ones. Every benchmark reported on our website will follow strict guidelines on sharing code and reporting results. We provide examples and templates for feature-extraction machine learning pipelines, deep learning pipelines, and data visualization scripts to allow for more transparent reporting and streamlined testing. This code can be found in our WearableBP GitHub.
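As a rough illustration of what a contributed feature-extraction pipeline might look like, the sketch below extracts a few toy morphological features from PPG segments and fits a regressor to SBP. The feature set, sampling rate, model, and function names are assumptions made here for illustration and do not reflect the actual WearableBP template code; in practice the segments and labels would come from MIMIC, PPG-BP, or VitalDB after the dataset-specific filtering step.

```python
import numpy as np
from scipy.signal import find_peaks
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

FS = 125  # assumed PPG sampling rate in Hz

def extract_features(ppg_segment):
    """Toy morphological features from a single PPG segment."""
    peaks, _ = find_peaks(ppg_segment, distance=FS // 2)
    peak_interval = np.mean(np.diff(peaks)) / FS if len(peaks) > 1 else 0.0
    return np.array([
        np.mean(ppg_segment),
        np.std(ppg_segment),
        np.ptp(ppg_segment),   # pulse amplitude
        peak_interval,         # mean peak-to-peak interval (s)
    ])

# Synthetic stand-in data used only to make the sketch runnable.
rng = np.random.default_rng(0)
segments = rng.normal(size=(200, 10 * FS))
sbp = rng.normal(120, 15, size=200)

X = np.vstack([extract_features(seg) for seg in segments])
X_train, X_test, y_train, y_test = train_test_split(X, sbp, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
errors = y_test - model.predict(X_test)
print(f"SBP error SD: {np.std(errors):.2f} mmHg")
```

A contributed pipeline following the guidelines would additionally document the dataset split, the filtering procedure, and the device details alongside code like the above.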
Forms to contribute to our benchmarks: