Evaluation Methods


We identify below six methods of evaluating the implementation of Safewards in your wards, hospitals and Trusts. None are perfect and all of them have flaws or difficulties of one sort or another. It is possible to use more than one of these evaluation methods, but this may be onerous and impractical, depending on the resources available to you.

If you do conduct a careful evaluation, please do seek to publish it, whether the results are positive or negative. It is very important for us all that we accumulate evidence to support future decisions and actions.

No evaluation

Many, if not most, research-based interventions are implemented without further evaluation. The RCT (Randomised Controlled Trial) provides gold standard evidence. On the basis of RCTs, psychiatric services have introduced early intervention teams, clozapine, crisis intervention teams, etc. Once research has been properly conducted, peer reviewed and published, no further collection of local evidence is required. The Safewards interventions work. If wards implement them, overall they will get similar results.

Strengths: No extra work is required, there are no forms to fill in, no data analysis or report writing to be completed. All efforts can be concentrated on actual clinical work with patients and on doing the Safewards interventions in a wholesale and rigorous way.

Weaknesses: There will be no feedback on progress at any level, neither to the staff on the wards nor to the Trust Board. As feedback is often wanted in order to motivate implementation, the potential for enthusiastic implementation or for winning over ambivalent staff might be decreased.

Existing official incident report rates

Define a period of time before, and another period of time after you implement Safewards. These two periods of time should be of the same duration. Then look at the number of officially reported adverse incidents in these two periods to see if they decreased after Safewards was implemented. Different types of conflict events (aggression, absconding, self-harm, drug/alcohol use) can be added together to produce a conflict rate for comparison. Similarly different types of containment (rapid tranquillisation, manual restraint, seclusion, constant special observation) can be added together to produce a containment rate for comparison before and after – if you have reliable information on these sorts of events.
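As an illustration only, the adding-up described above can be sketched in a few lines of Python. The event categories are those listed in the text; the function name, the record layout (a date plus an event type) and the idea of passing a start date and a number of days are assumptions made for this sketch, not part of any Trust's incident reporting system.

```python
from datetime import date, timedelta

# Event types taken from the text above.
CONFLICT = {"aggression", "absconding", "self-harm", "drug/alcohol use"}
CONTAINMENT = {"rapid tranquillisation", "manual restraint",
               "seclusion", "constant special observation"}


def composite_counts(incidents, start, days):
    """Count conflict and containment events in one period.

    incidents: iterable of (date, event_type) pairs (hypothetical layout).
    start:     first day of the period.
    days:      length of the period; use the same value for the
               'before' and 'after' periods so the two are comparable.
    """
    end = start + timedelta(days=days)
    counts = {"conflict": 0, "containment": 0}
    for when, kind in incidents:
        if start <= when < end:
            if kind in CONFLICT:
                counts["conflict"] += 1
            elif kind in CONTAINMENT:
                counts["containment"] += 1
            # Any other event type is ignored.
    return counts
```

Calling the function twice, once with the start of the before period and once with the start of the after period, and with the same number of days each time, gives two pairs of totals to compare.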

In order to be meaningful, this has to be carried out with some care as there are two problems to be overcome:

Reporting bias and missing data. Staff differ in the thresholds at which official incident reports are made. For example some wards report more incidents because perhaps they are more likely to report incidents of verbal abuse than other wards. We therefore suggest you only include in your evaluation the more serious incidents which are reported more rigorously and reliably (physical assault, property damage, successful abscond by a detained patient, self-harm) and exclude less serious incidents (medication refusal, theft, verbal abuse, attempted absconding). If you have reliable data on the use of seclusion and/or manual restraint, you can also use these in a separate evaluation.

Distinguishing between random fluctuations and a real effect of the Safewards interventions. Thankfully incidents of conflict or containment that are sufficiently severe to be officially reported are relatively rare. Their frequency also fluctuates randomly to some degree. This means that if I take a short time period and a small service area, some of the rises and falls in incident rates before and after the Safewards interventions will be random. They won’t represent a real effect. To take an extreme example to illustrate this, if I take one ward and a time period of one week before implementation and one week afterwards, I may have one incident in the before period and none in the after period. This is far more likely to be due to chance rather than the effect of Safewards. Similarly, no incidents before and one afterwards would not mean Safewards hadn’t worked. Very many clinical audits are misleading in this way, and they represent a waste of effort as well as producing false conclusions.

In order to overcome this difficulty, very long time periods are required for individual wards before any conclusion is drawn. We would recommend at least six months, and preferably a year, for a single ward: that is, one year of data before Safewards is implemented and one year of data afterwards. The picture is better when analysis is done and conclusions drawn at the level of multiple wards (e.g. a whole hospital or a Trust). Then shorter periods of time can be used, perhaps three months in the before period and three months after implementation. However, when you analyse your outcomes at the Trust level, you cannot provide individual feedback to wards, because the effect at the individual ward level will be effectively invisible.

The precise time periods required depend on your baseline rate of conflict and containment incidents, and we would advise you to seek the support of someone with some statistical knowledge to advise you on the precise time periods that are required to provide an evaluation at a particular service level (ward, hospital or Trust wide).
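The point about random fluctuation can be illustrated with a small calculation. The sketch below is one simple option, not a recommended analysis plan: it assumes the two periods are of equal length and that incidents occur independently at a steady rate (a Poisson assumption, which may not hold on a real ward). Under those assumptions, each incident is equally likely to fall in either period if Safewards has no effect, so an exact two-sided binomial test applies. The function name is hypothetical.

```python
from math import comb


def exact_two_period_p(before, after):
    """Two-sided exact p-value for two incident counts from periods
    of equal length, under the 'no real change' assumption.

    Given the total n = before + after, each incident lands in the
    before period with probability 0.5, so 'before' is Binomial(n, 0.5).
    """
    n = before + after
    if n == 0:
        return 1.0  # no incidents at all: nothing to compare
    probs = [comb(n, k) * 0.5 ** n for k in range(n + 1)]
    observed = probs[before]
    # Sum the probability of every split at least as unlikely as the
    # observed one (small tolerance guards against rounding).
    total = sum(p for p in probs if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# One incident before and none after gives no evidence of change.
print(exact_two_period_p(1, 0))
# A larger fall, e.g. 40 incidents before and 20 after, is much
# harder to explain by chance alone.
print(exact_two_period_p(40, 20))
```

A large p-value, as in the one-week example in the text, means the difference is entirely compatible with chance. A statistician may well prefer a different test for your data, for example one that allows for changes in bed occupancy between the two periods.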

This evaluation design can be enhanced by implementing Safewards on some wards but not on others, and comparing changes in conflict and containment rates across both. Then if rates go down on the intervention wards but not on others, you will have further evidence on efficacy. However (i) this requires greater statistical sophistication and you will need help to decide how many wards and for how long, and (ii) the wards and patients not receiving the intervention miss out, and may incur harms that could otherwise have been avoided. Such an evaluation design probably takes you outside clinical audit and may require a formal research ethics approval. Unless, of course, this design forms part of a planned sequential roll out of Safewards across your Trust.

Strengths: No additional data collection is required, and if you attend to the reliability and sample issues identified above, the evaluation will be modestly robust. For outcomes at the level of a Trust, this is probably a reasonable design.

Weaknesses: Some statistical advice and planning is required. Help with analysing the outcome using statistical tests may also be required. Obtaining an outcome by ward requires long time periods and therefore this method is probably not good for providing quick feedback to generate commitment and enthusiasm.

Other existing data sources on a before and after basis

It may be that your Trust collects other information in a systematic way from the wards. For example some Trusts collect repeated rounds of inpatient satisfaction surveys. It may be possible to use these types of data to perform a before and after analysis.

Again the numbers are important. In order to make deductions about improvements on individual wards, large numbers are required. The precise number of patient satisfaction questionnaires required in the before and after periods depends on the statistical properties of the questionnaire used. A rough rule of thumb would be at least 30 questionnaires in the before period to compare to at least 30 questionnaires in the after period. We suggest you seek statistical help to get an idea of the precise numbers you would require.

Just as with officially reported incidents, this method will be more powerful at the Trust level because of the larger number of questionnaires that can be tested.
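To give a feel for why the numbers matter, the sketch below uses the standard normal-approximation formula for comparing two group means. It is a rough guide under stated assumptions, not a substitute for statistical advice: it assumes the questionnaire yields a roughly normally distributed total score, that the before and after groups are independent, and that the expected improvement is expressed as a standardised effect size (the difference in means divided by the standard deviation). The function name and default values are chosen for this illustration.

```python
from math import ceil
from statistics import NormalDist


def questionnaires_per_period(effect_size, alpha=0.05, power=0.80):
    """Approximate number of questionnaires needed in EACH period
    to detect a given standardised difference in mean score.

    effect_size: expected difference in means divided by the SD.
    alpha:       two-sided significance level.
    power:       chance of detecting the difference if it is real.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


for d in (0.2, 0.5, 0.8):
    print(d, questionnaires_per_period(d))
```

On these assumptions a difference of half a standard deviation needs about 63 questionnaires in each period, and only a large difference (0.8 standard deviations) can be detected with around 25 to 30. This is why the rule of thumb above should be treated as a minimum.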

Strengths: No additional data collection is required.

Weaknesses: Statistical advice on sample sizes and help with analysis may be required. Any evidence of outcome will be indirect, and will depend on how closely the data you are using is linked to the primary Safewards outcomes, conflict and containment. Whatever the results of such an evaluation approach, this would not be considered strong or robust evidence.

Use of existing official incidents and time between charts

This method uses official incident data before and after the implementation of Safewards, as described previously. However instead of analysing the frequency of incidents in defined periods before and after, it is the time between incidents (number of days) that is the outcome variable. If the number of days between incidents becomes longer, then the Safewards intervention is working.

This method is derived from a type of mathematics called ‘Statistical Process Control’ and is widely used in industry for quality control processes. More recently it has been brought into health care settings by the ‘Institute for Healthcare Improvement’ (IHI) in the US, and is used in quality control processes for MRSA infections, surgical accidents, etc. Using these methods it is possible to construct t-charts or time between charts for individual wards that will allow staff (or their incident report services) to plot incidents and see fairly quickly whether things are improving or not, and even give some indication as to whether those improvements are statistically significant.

Drawing up such charts requires some mathematical sophistication. The formulae can be found in books about ‘Statistical Process Control’, which themselves can easily be found through an Amazon or equivalent search. Some information may also be available on the IHI website and in the following journal publication. Unfortunately, for copyright reasons we cannot supply copies of these items.

Benneyan, J. C., Lloyd, R. C. & Plsek, P. E. (2003) Statistical process control as a tool for research and healthcare improvement. Quality and Safety in Health Care 12: 458-464
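As a rough illustration of what such a chart involves, the sketch below uses one commonly described construction for time between charts. It is not necessarily the method set out in the publication above, and it should be checked against a Statistical Process Control text before use. The days between incidents are raised to the power 1/3.6 to make them roughly symmetrical, limits are set at the mean plus or minus 2.66 times the average moving range (the usual individuals chart limits), and the results are converted back into days. The function name is hypothetical and the baseline figures are invented.

```python
from statistics import mean

POWER = 1 / 3.6  # transform commonly used for time-between data


def t_chart_limits(days_between):
    """Return (lower, centre, upper) limits in days, calculated from
    the gaps between successive incidents in the 'before' period."""
    if len(days_between) < 2:
        raise ValueError("need at least two gaps between incidents")
    y = [d ** POWER for d in days_between]
    centre = mean(y)
    # Average absolute change from one point to the next.
    mr_bar = mean(abs(b - a) for a, b in zip(y, y[1:]))
    lower = max(centre - 2.66 * mr_bar, 0.0)
    upper = centre + 2.66 * mr_bar
    # Convert back from the transformed scale to days.
    return tuple(v ** (1 / POWER) for v in (lower, centre, upper))


# Invented baseline: days between incidents before Safewards.
baseline = [3, 1, 4, 2, 6, 2, 3, 5, 1, 4]
lower, centre, upper = t_chart_limits(baseline)

# After implementation, a gap longer than the upper limit suggests
# that incidents have genuinely become less frequent.
new_gap = 60
print(new_gap > upper)
```

Each new gap between incidents is plotted against these limits. A single point above the upper limit is one signal of improvement; published run rules describe others, such as a sustained series of points above the centre line.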

Strengths: These charts give relatively quick feedback at the ward level and can therefore be very helpful at increasing motivation to change, and rewarding those who have made efforts to implement the Safewards interventions.

Weaknesses: This method rests on the assumption that conflict or containment events in the ‘before’ period are occurring at a constant rate. However, changes in staff and policies are always occurring on acute psychiatric wards and in our hospitals, and a number of these are linked to conflict and containment outcomes (see the Safewards model), so this assumption is unlikely to be completely true. Some caution in the interpretation of these charts is therefore required. The charts are frustratingly complicated to construct, and help or training may be required. Ward staff will need instruction on how these charts are to be completed and what changes mean, as, counterintuitively, a rising line means more days between events and therefore represents success.

Safewards intervention process measures

This is another way of indirectly assessing outcome of the Safewards interventions. Instead of directly measuring outcome in terms of conflict and containment events, using this method the degree of implementation of the Safewards interventions is assessed. In other words, you attempt to measure how much the ward staff are actually doing the ten Safewards interventions. If there is evidence that they are doing them a lot, then you can assume a good outcome. If the ward staff are not doing them very much, you can assume the outcome is worse or negligible.

During the research we used this checklist: Organisation Fidelity. You could also use this or adapt it. The checklist includes all those things that can be seen that evidence the use of the Safewards interventions. You may want to supplement this with ideas of your own. This checklist-based approach can only realistically work if someone other than the ward staff completes it on a regular basis. In addition, you will need someone to input and summarise the data you obtain.

Strengths: No baseline or ‘before’ data is required, this process measure can commence as soon as the ward staff start implementing the Safewards interventions.

Weaknesses: This is an indirect measure, and not all the Safewards interventions can be seen or logged in a systematic way. Someone has to invest the time to complete the checklist on a regular basis, and time is needed to input the data and summarise it into a report on a regular basis.

Careful description of the Safewards interventions in practice

This is a variant of the process measure described above. However instead of an independent visitor to the ward using a checklist to note visible evidence of Safewards implementation, this method depends on workers giving qualitative information about their successful use of the Safewards interventions. In other words this is the structured and systematic collection of staff feedback.

Workers on the ward are asked to fill out a form once a week describing their use of the interventions or how the interventions have changed their practice. They are asked to give specific examples of how the interventions have helped or hindered them in their work. The forms are submitted centrally and typed up (we recommend that staff handwrite them, given the typing speeds of most staff currently, although email could be used if you consider this is not a problem). This feedback is then summarised centrally into a monthly report which is fed back to wards and to the Trust Board or a relevant committee.

Strengths: No baseline or ‘before’ data is required, this process measure can commence as soon as the ward staff start implementing the Safewards interventions. Because the data is qualitative, it can give a really powerful and convincing account of the Safewards interventions in action, and quickly evidence any causes for concern.

Weaknesses: It is reliant on staff having time to complete the feedback and to do so in sufficient detail. The feedback may be biased, in that more positive and enthusiastic staff are more likely to provide the feedback, whereas negative events or lack of success are less likely to be reported. Some central resource is required for typing and for analysis and reporting.

Collect additional incident data using research tools

There are a number of reliable and valid research tools for counting conflict and containment events. During the Safewards RCT and many other previous studies, we have used the PCC (Patient-staff Conflict Checklist). This simple checklist is completed at the end of every shift, taking only a few minutes of staff time, and indicates the numbers of conflict and containment events during the shift. Training is simple and can consist of a thorough read of the PCC handbook.

In order to use this in an evaluation, a ward would need to collect PCC data for three months before starting to implement Safewards, then for at least three months while Safewards was in continuous use.

PCC completion rates would need to average 66% or better during these periods. Data could then be entered onto a computer and statistically analysed to show any change.
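The completion rate is simply the number of checklists returned divided by the number of shifts in the period. A minimal sketch follows; the function names are hypothetical, and the default of three shifts per day is an assumption that should be changed to match your own shift pattern.

```python
def pcc_completion_rate(forms_returned, days, shifts_per_day=3):
    """Proportion of shifts for which a PCC was returned.

    shifts_per_day defaults to 3 (an assumption; adjust as needed).
    """
    expected = days * shifts_per_day
    if expected <= 0:
        raise ValueError("the period must contain at least one shift")
    return forms_returned / expected


def meets_threshold(rate, threshold=0.66):
    """True if the rate reaches the 66% average suggested above."""
    return rate >= threshold


# Example: a 91-day 'before' period with 200 checklists returned.
rate = pcc_completion_rate(200, 91)
print(round(rate, 2), meets_threshold(rate))
```

Checking the rate every few weeks during data collection, not only at the end, leaves time to remind staff before a period falls below the threshold.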

Strengths: The use of a research tool makes the results from this type of analysis much more credible and convincing.

Weaknesses: The collection of extra data is required, and more paperwork is resented by staff. Once collected, a resource is required to enter data, and someone with statistical knowledge is required to analyse it and to present the results. Without comparison to other wards not using the Safewards interventions, any change seen might be due to other local changes over time.