The Friends and Family Test (FFT) is a single question used to assess services provided by the NHS:
“How likely are you to recommend our ward/A&E department to friends and family if they needed similar care or treatment?”.
NHS England’s publication guidance says: “The Friends and Family Test is a simple, comparable test which, when combined with follow-up questions, provides a mechanism to identify both good and bad performance and encourage staff to make improvements where services do not live up to expectations”.
Added, 10 May 2016: This blog relates to early research carried out shortly after the FFT was launched in April 2013. Clearly the comments about low levels of response will not apply in many cases today (though, presumably, there will still be certain areas of some trusts where the level of response is too low for the results derived from it to be meaningful). For example, Gloucestershire Hospitals NHS Foundation Trust, which covers two large hospitals in Gloucester and Cheltenham [*1]:
Since the launch of the FFT in 2013, the number of patients leaving feedback has risen to over 25,000 per year with the majority also telling us why they voted the way they did. This information is fed back to every ward and department each month so that staff can see quickly areas that could be improved, as well as those areas that are well-received by the patient.
However, the rest of the article still appears to be current.
I fear there are a number of problems with the test.
Low levels of response
Many hospitals only managed to collect a few responses from patients.
The ‘research’ was carried out in all English acute units, polling both inpatients and A&E users over three months. Over the quarter April-June 2013 the national average response rate was 13.3%.
However, many units fell below this (that is the nature of averages, of course).
Out of 172 responding trusts, 48 delivered an overall response rate of under 10% (one trust turned in 2.8%: one feels their heart can’t have been in it, and I believe they would have had good reason to feel despondent).
As Jo Bibby says in her Health Foundation blog: “To find that some wards have been labelled as ‘failing’ based on as few as three responses in a month or that an A&E department is held up as best practice based on 16 responses (probably less than the footfall in an hour!) is a concern.”
Such small samples are statistically meaningless. But larger samples, had they been collected in the same way (see below), would be little better, not least because of the likelihood that results would be corrupted by special interest groups (on both sides) and others with various axes to grind.
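The point about tiny samples can be made concrete with a standard confidence-interval calculation. The sketch below uses the Wilson score interval (my choice of method, not anything the FFT itself publishes) to show how little a handful of responses actually tells you; the ward and A&E figures echo the examples in Jo Bibby’s quote above.

```python
import math

def wilson_interval(positive, n, z=1.96):
    """95% Wilson score interval for a proportion of 'positive' responses out of n."""
    if n == 0:
        raise ValueError("no responses collected")
    p = positive / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return max(0.0, centre - half), min(1.0, centre + half)

# A ward judged on 3 responses, all of them positive:
lo, hi = wilson_interval(3, 3)
print(f"n=3:  {lo:.0%} to {hi:.0%}")   # roughly 44% to 100%

# An A&E department judged on 16 responses, 12 of them positive
# (the 12 is an illustrative figure, not from the source):
lo, hi = wilson_interval(12, 16)
print(f"n=16: {lo:.0%} to {hi:.0%}")
```

Even with every one of three respondents positive, the true level of satisfaction could plausibly be anywhere from under half to 100% — hardly a basis for labelling a ward as failing or exemplary.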
Even if none of the above applied, respondents were only given five choices of response, plus “don’t know”: these are too few to provide data for a statistically meaningful analysis. Good practice requires at least ten.
Inconsistent data collection techniques
No consistent technique was used to collect data. In fact, trusts were encouraged to do what they liked.
Whilst there was a preponderance of “paper/postcard at point of discharge”, some trusts used only (or almost entirely) paper questionnaires mailed to patients’ homes, thereby relying on respondents returning them in time.
Some used solely text messages: an approach guaranteed to exclude significant, important, large groups—the elderly, the poor, and those not technically proficient.
Some relied solely on “electronic tablet/kiosk at point of discharge”.
Even those that used solely “paper/postcard at point of discharge” thereby excluded anyone who wanted to participate after they got home.
Suitability of respondents
The question presupposes that the respondent is clinically competent to assess whether the treatment they have received would be appropriate for friends and family—which I guess in the vast majority of cases, they aren’t—and capable of understanding and accepting good explanations if their experience didn’t match their expectations or demands.
On the other hand, it ignores the capacity of some respondents to think intelligently about the circumstances surrounding their spell of care. For example, I would recommend an A&E unit, even if it didn’t meet my expectations, if I knew that the alternative for a friend or family member was such a long journey to another unit that they might die en route.
It reduces the assessment of the treatment to no better than the assessment of a hotel or cafe.
Suitability of the question
The wording of the question is flawed. The use of the word “recommend” in “How likely are you to recommend our ward/A&E department…” is a subtle nudge in one direction rather than another. A better question might be, “How do you rate our ward/A&E department?”. And I am certain that, had the question been, “How unlikely are you to recommend…”, it would not have delivered precisely the inverse results.
The publication guidance says: “The Friends and Family Test is a simple, comparable test which, when combined with follow-up questions, provides a mechanism to identify both good and bad performance and encourage staff to make improvements where services do not live up to expectations”.
What is the mechanism that the test provides? How does it identify good and bad performance—it only identifies, as it says, where “services do not live up to expectations”. What are the follow-up questions, who has determined them, how are they defined, how will the results be collected and assessed? It strikes me that the follow up questions will be the only useful part of the survey, if conducted properly, yet they have been ignored in favour of the dumbed down “headline question”.
And what about those people who had indifferent or low expectations of what they were going to get? If the hospital met them, did that count as success?
In summary:
(1) Expectations, unless they are informed by education and experience, are pretty meaningless, even if you were to know what they were;
(2) the people being asked to make a useful, useable assessment are, overwhelmingly, not qualified to do so;
(3) the question asked is not capable of eliciting a meaningful assessment;
(4) the method of collecting the data is so arbitrary and prone to corruption, and the sample sizes so low, that the data are statistically meaningless.
It makes me weep to see yet another example of the NHS colluding with its own abuse by politicians: “How heavy, sir, would you like us to make the stick you can use to beat us with?”
[*1] Involve magazine for members, March/April 2016, p. 4