Method comparison and Bland-Altman Analysis#

Assume for a specific type of measurement, there are two methods for performing it. A common question in this context is if both methods could replace each other. Therefore, similarity of measurements is investigated. One method for this is Bland-Altman analysis called after Martin Bland and Douglas Altman.

See also Altman and Bland: Measurement in Medicine: the Analysis of Method Comparison Studies

Before running this notebook, install the following packages:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# new measurements
measurement_1 = [130, 132, 138, 145, 148, 150, 155, 160, 161, 170, 175, 178, 182, 182, 188, 195, 195, 200, 200, 204, 210, 210, 215, 220, 200]
measurement_2 = [122, 130, 135, 132, 140, 151, 145, 150, 160, 150, 160, 179, 168, 175, 187, 170, 182, 179, 195, 190, 180, 195, 210, 190, 200]

# scatter plot
plt.plot(measurement_1, measurement_2, "o")
plt.plot([120, 220], [120, 220])
plt.axis([120, 220, 120, 220])
plt.show()
../_images/01_Bland_altman_analysis_2_0.png
# Determining Pearson's correlation coefficient r with a for-loop
import numpy as np

# get the mean of the measurements
mean_1 = np.mean(measurement_1)
mean_2 = np.mean(measurement_2)

# get the number of measurements
n = len(measurement_1)

# get the standard deviation of the measurements
std_dev_1 = np.std(measurement_1)
std_dev_2 = np.std(measurement_2)

# sum the expectation of 
sum = 0
for m_1, m_2 in zip(measurement_1, measurement_2):
    sum = sum + (m_1 - mean_1) * (m_2 - mean_2) / n

r = sum / (std_dev_1 * std_dev_2)

print ("r = " + str(r))
r = 0.9435300113035253
# Determine Pearson's r using scipy
from scipy import stats

stats.pearsonr(measurement_1, measurement_2)[0]
0.9435300113035255

Bland-Altman plots#

Bland-Altman plots are a way to visualize differences between paired measurements specifically. When googling for python code that draws such plots, one can end up with this solution:

# A function for drawing Bland-Altman plots
# source https://stackoverflow.com/questions/16399279/bland-altman-plot-in-python
import matplotlib.pyplot as plt
import numpy as np

def bland_altman_plot(data1, data2, *args, **kwargs):
    data1     = np.asarray(data1)
    data2     = np.asarray(data2)
    mean      = np.mean([data1, data2], axis=0)
    diff      = data1 - data2                   # Difference between data1 and data2
    md        = np.mean(diff)                   # Mean of the difference
    sd        = np.std(diff, axis=0)            # Standard deviation of the difference

    plt.scatter(mean, diff, *args, **kwargs)
    plt.axhline(md,           color='gray', linestyle='--')
    plt.axhline(md + 1.96*sd, color='gray', linestyle='--', c='orange', label=r'+ 1.96$\sigma$')
    plt.axhline(md - 1.96*sd, color='gray', linestyle='--', c='magenta', label=r'- 1.96$\sigma$')
    plt.legend()
    plt.xlabel("Average")
    plt.ylabel("Difference")
# draw a Bland-Altman plot
bland_altman_plot(measurement_1, measurement_2)
plt.show()
../_images/01_Bland_altman_analysis_7_0.png

Alternatively, one can use more advanced visualizations using statsmodels (pip install statsmodels). It needs numpy-arrays to work with and doesn’t work with lists.

from statsmodels.graphics.agreement import mean_diff_plot

m1 = np.asarray(measurement_1)
m2 = np.asarray(measurement_2)

plot = mean_diff_plot(m1, m2)
../_images/01_Bland_altman_analysis_9_0.png

Some simulated situations#

For simulating some characteristic bland altman plots, we start by setting a ground truth set of numbers, e.g. radii in mm of 100 cherries.

radii = np.random.normal(15, scale=1, size=(100))

# for demo purposes: print the first 5
print(radii[:5])
[14.24177821 16.20841584 13.92014335 15.02919871 13.18000739]

We assume these radii are the perfect ground truth. We will now measure the size of these cherries twice. Our measurement device is not perfect and thus, has a little error.

measurement_r1 = radii +  np.random.normal(0, scale=0.5, size=(100))
measurement_r2 = radii +  np.random.normal(0, scale=0.5, size=(100))

print(measurement_r1[:5])
print(measurement_r2[:5])
[14.68285507 15.56632    14.29800755 14.67622638 13.80484713]
[13.99560311 17.07692381 14.1803845  15.02369335 13.02208326]

Plotting these two measurements against each other in a Bland-Altman plot visualizes the agreement between the methods. Note the symmetry of the plot and that the data points are arranged around difference = 0.

bland_altman_plot(measurement_r1, measurement_r2)
plt.show()
../_images/01_Bland_altman_analysis_15_0.png

Additive bias#

If one method tends to measure more than the other method by a fixed offset, we call that an additive bias:

offset = 1

measurement_r1 = radii +  np.random.normal(0, scale=0.5, size=(100))
measurement_r2 = radii +  np.random.normal(0, scale=0.5, size=(100)) + offset

print(measurement_r1[:5])
print(measurement_r2[:5])
[14.59435321 15.35641134 14.14844843 15.51343634 12.68685903]
[15.78157031 17.17752145 15.24547858 16.41070381 14.51073123]

You can observe the offset in the Bland-Altman plot because the measurements are arranged around difference = offset:

bland_altman_plot(measurement_r1, measurement_r2)
plt.show()
../_images/01_Bland_altman_analysis_19_0.png

Multiplicative bias#

If one method always measures more than the other linearily depending on the measurement itself, we call that a multiplicative bias.

factor = 1.1

measurement_r1 = radii +  np.random.normal(0, scale=0.5, size=(100))
measurement_r2 = radii +  np.random.normal(0, scale=0.5, size=(100)) * factor

print(measurement_r1[:5])
print(measurement_r2[:5])
[14.89562962 16.54400292 14.14907105 14.94583762 13.03638946]
[15.29793478 16.16504215 13.21210776 15.03700252 13.72178155]
bland_altman_plot(measurement_r1, measurement_r2)
plt.show()
../_images/01_Bland_altman_analysis_22_0.png

As the effect is harder to see, we use a higher factor.

factor = 5

measurement_r1 = radii +  np.random.normal(0, scale=0.5, size=(100))
measurement_r2 = radii +  np.random.normal(0, scale=0.5, size=(100)) * factor + 5

print(measurement_r1[:5])
print(measurement_r2[:5])

bland_altman_plot(measurement_r1, measurement_r2)
plt.show()
[14.53183782 15.69039334 14.24592791 14.49140603 13.28029871]
[16.90801151 25.25274655 16.87634905 18.05211977 16.36020682]
../_images/01_Bland_altman_analysis_24_1.png

Exercise#

Process the banana dataset again, e.g. using a for-loop that goes through the folder ../data/banana/, and processes all the images. Measure the size of the banana slices using the scikit-image thresholding methods threshold_otsu and threshold_yen. Compare both methods using the techniques you learned above.