Executive Summary of Response to the Joint Funding Bodies’
Review of Research Assessment from the
Royal Geographical Society (with Institute of British Geographers)
1. Our definition of quality in research is
… intellectual/academic excellence. High quality work is that which
makes an original contribution to knowledge, either conceptual or
substantive, and influences practice – academic or otherwise.
2. This definition leads us to two axioms regarding any research assessment:
a. Judgements about research quality are necessarily subjective; and
b. Systems must deploy transparent, consistent and credible procedures.
3. Of the four approaches to assessment set out in the consultation document, only
expert review will meet our definition of quality and assessment axioms.
4. Assessments must be made by panels of experts who have the confidence of the
Funding Councils and the community assessed, based on the fullest possible range of
evidence. They should be both retrospective and prospective and relate to either
research groups or departments: they should NOT be made for individuals.
5. Of the other systems proposed, various metrics are important pieces of
information that should be available to assessment panels, but algorithms based on
them are not viable alternatives to peer evaluations of quality: many available metrics
(such as citation counts) are very imperfect indicators of what they purport to measure.
We cannot identify any means of self-assessment that would meet our criteria of
transparency, consistency and credibility. Reliance on historical rankings would
significantly reduce (if not remove) incentives from the system.
6. Geography as a discipline crosses the boundaries between the social and the
natural sciences. Interdisciplinary work, or working across discipline boundaries, is the
norm for many geographers. This may be done within an institution or even
between institutions. We feel that there should be an opportunity for this to be
recognised, and at the very least for it not to be penalised.
7. Of the three reasons for conducting research assessments offered, we recognise
the necessity of the first and the political desirability of the second, but strongly
believe that the third – to encourage research improvement – is by far the most
important for the long-term health of the UK’s universities. Assessment exercises
should be so constructed that they do not skew the nature of research undertaken
towards the short-term production of outputs and attraction of inputs.
8. We believe that seven years is the minimum time between reviews – the first
five were at too-frequent intervals. Rolling reviews have potential advantages.
9. We strongly oppose review results being used to determine the distribution of
research money between subjects by the Funding Councils. All institutions should
be assessed in the same way, with allowances for different disciplinary research
practices: within disciplines, consistency of treatment is paramount.
10. Whatever system is put in place must meet our criteria and be: 1) fair to
individuals and institutions; 2) resistant to games-playing; and 3) rigorous.
Ron Johnston, Ray Hudson, Rita Gardner 30 November 2002
Joint funding bodies’ review of research assessment
Invitation to contribute
Executive summary
Purpose
1. This document invites initial contributions to the joint funding bodies’ review of
research assessment in higher education. As a basis for discussion, it identifies a
number of key issues and possible approaches to research assessment. It also outlines
the purpose of the review and its timescale.
Key points
2. The review will be owned by the UK funding councils and the Department for
Employment and Learning in Northern Ireland (DEL NI), and be overseen by a
steering group chaired by Sir Gareth Roberts. There will also be a wider consultative
group.
3. This is only the first part of plans for consulting stakeholders in the course of
the review, which will include meetings, focus groups and a dedicated web-site.
Action requested
4. At this stage we invite interested parties to debate the issues, including the ones
identified in this document, and contribute to the scope of the review. Please send
completed responses by e-mail to Vanessa Conte at rareview@hefce.ac.uk. The
closing date is 29 November 2002.
Why are we reviewing research assessment policy?
5. On current measures of performance, UK research is in excellent health.
Citation evidence confirms the strength of UK research in higher education
institutions (HEIs) illustrated so dramatically by the results of the 2001 Research
Assessment Exercise (RAE).
6. There are, however, good reasons to re-examine the continued fitness for
purpose of the RAE. Concerns include:
• effect of the RAE upon the financial sustainability of research
• an increased risk that as HEIs’ understanding of the system becomes more
  sophisticated, games-playing will undermine the exercise
• administrative burden
• the need to properly recognise collaborations and partnerships across institutions
  and with organisations outside HE
• the need to fully recognise all aspects of excellence in research (such as pure
  intellectual quality, value added to professional practice, applicability, and impact
  within and beyond the research community)
• ability to recognise, or at least not discourage, enterprise activities
• concern over the disciplinary basis of the RAE and its effects upon
  interdisciplinarity and multidisciplinarity
• lack of discrimination in the current rating system, especially at the top end.
7. For these reasons, we have set up a review of research assessment led by Sir
Gareth Roberts, President of Wolfson College, Oxford. The review is expected to
complete its work by March 2003, although there will, of course, be a need for further
detailed work by the administrators of any new system.
Assessment of research quality: the context
8. In conceptual terms, the issue of research assessment might be seen as
straightforward. However, to arrive at the best possible assessment process, we must
first answer the philosophical question, ‘what is meant by quality in research’. Is
quality simply another term for intellectual excellence, or does it have other
dimensions reflecting its likely impact within and beyond the research community?
As we are ultimately using the information to calculate funding entitlements for
institutions, this question has to include an assessment of their contribution to the
development of researchers as well as their research output. Once we have the answer
– or answers – to the philosophical question, we are left with the technical but
non-trivial problem of designing a system that provides a fair and accurate assessment
of quality while minimising burden on all concerned.
Our response to this question is unequivocal: quality in research can only imply
intellectual/scholarly excellence. High quality work is that which makes an original
contribution to knowledge, either conceptual or substantive, and influences practice –
academic or otherwise.
It follows from this response that judgements about quality of research are necessarily
subjective and reliant upon expert knowledge, and that an assessment scheme has to be
devised on that basis. Our responses to the issues raised in this
consultation are all based on that axiom, and on the criteria that we believe follow
from it – assessments, and the procedures by which they are made, should be
transparent, consistent and credible.
9. These two themes, the philosophical and the practical, will run throughout this
review. This document, coming as it does at the beginning of the process, tends to
emphasise the former, although by no means exclusively so. We have also made a
small number of assumptions which the review will not challenge:
a. The dual support system will continue. There will thus be an ongoing
need for a method of allocating funds selectively. Research assessment of some
description will continue to be used for this purpose.
b. The quality of research will continue to be considered in a global context.
It will therefore need to be assessed at a national and international level.
The terms ‘national’ and ‘international’ have always been confusing when used
in RAE terminology and assessments. Whether any piece of work is ‘nationally’
or ‘internationally’ of high quality can be a function of the context in which it is
undertaken, and so can any individual’s or group’s performance. It is not clear
why it is necessary to use the terms in the grade descriptors as relating to a unit’s
research, which is separate from its standing. A department may be widely
recognised internationally as of the highest quality, although the work that it
does has a national impact only, or at least a very restricted international impact.
Clearly, this issue will be more important for some disciplines than others, but it
seems unnecessary to use the descriptors when describing the quality of
research: it is either of the highest standard or it isn’t!
10. We also wish to introduce as context three relevant factors:
a. There is, quite properly, an increasing emphasis upon the ‘people
dimension’ – that is, the contribution made by institutions to the supply and
development of researchers.
b. There are now public funds available to universities and colleges for
knowledge transfer activities. Work is continuing to develop measures of
excellence in those activities, many of which involve research services to
external partners.
c. With the competition for research funding being increasingly fierce and
the costs of research in many subjects increasing, there is a need to consider
whether targeted help is required to enable new subjects and new fields to
develop. It may (or may not) fall to the research assessment process to identify
suitable candidates for any such assistance.
Our plans for the review
11. The review will be owned by the UK funding councils and DEL NI and be
overseen by a steering group chaired by Sir Gareth Roberts. There will also be a wider
consultative group, with access to steering group papers and remote discussions.
12. The steering group will draw up a list of broad approaches to research
assessment. A team of people – led by Siân Thomas, at the HEFCE – will support the
steering group and undertake development work on each approach, showing how it
might work and what behavioural impacts it might have.
13. The output of the review will be a number of detailed models of research
assessment and a covering report. This will be presented to the chairs and chief
executives of the funding councils (and their equivalents at DEL NI) before formal
consultation with stakeholders.
14. In its report, the steering group will either identify a single preferred option or
suggest the circumstances in which particular models would be most appropriate.
15. The membership and terms of reference for the group are at Annex A.
Timetable
16. This is only the first part of the plans for consulting stakeholders in the course
of the review. A series of public meetings and focus groups will be held from October
(dates to be confirmed). In addition, the review team will be meeting individual
stakeholder groups, and records of these meetings will be published. Working
documents produced by the review will be published on a dedicated web-site, as will
responses to this invitation (unless confidentiality has been specifically requested).
17. A formal consultation lasting 12 weeks will be launched once the review has
completed its work. A provisional timetable is given below (some dates to be
confirmed):
Invitation to contribute opens                4th October 2002
Website launched                              11th October 2002
First steering group meeting                  October 2002
Invitation to contribute ends                 29th November 2002
Public meetings                               November-December 2002
Focus groups                                  October-November 2002
Second steering group meeting                 December 2002
Third steering group meeting                  March 2003
Completed report                              April 2003
Publication of report: formal consultation    May 2003
Informing the review
18. We invite interested parties to debate the issues, including the ones we have
identified, and contribute to our review. Respondents may wish to convene focus
groups or workshops and submit the formal record of their discussions. We have
provided what we hope are helpful notes to stimulate debate (see Annex B).
Approaches to assessment
19. We can envisage four distinct approaches to assessment:
• expert review (including peer review)
• algorithm based entirely upon quantitative metrics
• self-assessment
• historical ratings.
Given our response to the issue of defining quality, and our firm belief that this
necessarily depends upon a process of expert judgement (albeit informed by a
range of metrics and other data), it follows that of the four options identified,
only expert review will meet our criterion. The assessments must be made by
panels of experts who have the confidence of both the Funding Councils and the
community they are assessing, based on the fullest possible range of evidence and
with sufficient time to investigate it in depth: the entire process must be
transparent.
Our general comments on the various options are as follows (detailed responses
to the issues raised are added in the relevant parts of the document):
Algorithms based on quantitative metrics
We are totally opposed to these: quality cannot be counted and to reduce its
assessment to a mechanical process would demean the research enterprise.
The benefits of such algorithms lie in transparency and cost, but the disadvantages
totally outweigh them. Performance on many indicators is a function of the type of
research undertaken, and can vary even within disciplines, especially broad-based
ones such as geography: this would make it extremely difficult (if not impossible) to
calibrate an algorithm that could deliver justice within a UoA, let alone across UoAs.
Some of the metrics commonly advanced have many limitations because of their
source material and difficulties of interpretation. This is especially the case with
citation indices, which reflect the policy decisions of a single commercial firm with
regard, for example, to subject matter coverage, journal coverage and the construction
of indices. Some of their material may be informative in particular circumstances, but
not for such an important task as undertaking an assessment of research quality across
an entire university system with major consequences for several years of funding.
Self-assessment
We cannot conceive of a method of self-assessment that would meet the overriding
needs for transparency, consistency and credibility in the procedure, nor of a form of
audit that would ensure that those criteria were met. As far as we are aware, there is
little experience of successful use of this method: to deploy it would be a massive
gamble in the context of the importance of the RAEs, and would be an invitation to
game-playing of the highest order by institutions.
There is already an element of self-assessment in the system that has been in place
since the first RAE. This is valuable information in an expert review system
(especially for prospective evaluations) and should be retained.
Historical ratings
Although many department rankings change only slowly, a significant number do
change. If there are to be incentives in the system and, as we recommend, RAEs are
less frequent than they have been over the last two decades, then the use of
historical rankings would both penalise innovation and improvement and encourage
complacency at the top.
Any system using historical rankings would need to have an improvement/deterioration
component that met our criteria of transparency, consistency and credibility. It would
have to involve peer review, and there would be pressure on many (if not most)
departments to enter for such a review, either in the hope of an upgrade or to ensure
against a downgrade, especially if RAEs were less frequent than now. It could be
almost as extensive an exercise as a full RAE! We would not support such a review
using either algorithms or self-assessment, for the reasons given above.
20. This document invites respondents to explore each of these approaches in turn.
This does not mean that we believe that these methods have to be used in isolation, or
that respondents should feel constrained in proposing systems that employ elements
of two or more of them. It is, for example, clearly possible to use metrics to inform a
self-assessment or an expert review. The 2001 RAE did just that: assessment panels
were obliged to consider some objective data (such as grant income), and allowed to
stipulate that they would consider others (such as bibliometric data). Nevertheless, the
RAE was ultimately an expert review system, because the final decision rested with
the panel.
21. Each approach is discussed in detail in Annex B.
Cross-cutting themes
22. There are also fundamental issues which need to be addressed regardless of the
approach taken to assessment:
a. What should/could an assessment of the research base be used for?
The assessment can have three main uses:
1. To provide public information regarding the quality of research in British
universities, as part of a general exercise in accountability and to aid choice;
2. To inform the Funding Councils’ development of algorithms for allocating
money to institutions;
3. To encourage research improvement, in the status that follows ‘success’, in the
encouragement that recognition brings, and in the resources that are allocated
to stimulate further improvement and realize research potential.
Of these, we recognise the necessity of the first and the political desirability of the
second. However, we believe that the third is by far the most important in the long
term for the academic health of the country’s universities. The current system,
however, is continually concentrating funds in a relatively small number of
departments within each discipline and, overall, a relatively small number of
universities. The long-term goal should be to have a system which ensures that many
institutions are attaining high quality outcomes and, as a consequence, have greater
equality of opportunity fully to realise their potential for research and scholarship.
We do not believe that the RAE judgements should be used to determine Research
Council policies, though the information will, almost of necessity, inform them. To use
them in that way would lead to an over-centralised, monolithic system for stimulating
research, and potentially deny resources for improvements to worthy potential recipients.
There is a strong case for each of the four Funding Councils to treat the outcomes of
the RAE in roughly the same way; otherwise excellence may not only be better
rewarded in some parts of the UK than others but also differentially stimulated.
b. How often should research be assessed? Should it be on a rolling basis?
We consider that the current situation of a review every 4-5 years is much too
frequent: there have now been five reviews since they were established in 1985.
In determining the length of time between reviews, the important consideration must
be that the period of consistent funding between reviews is long enough to give
stability to financial planning but not so long as to remove incentives for improvement
and encourage complacency. In many disciplines, research planning and delivery is
not a short-term process, and departments need sufficient time to have confidence in
their plans and in their ability to realize them within an RAE period.
Any decision as to a desirable time between reviews is arbitrary: we think 7 years is
the minimum that should be aimed for, with anything less being debilitating to
research practice and encouraging short-termism. With that length of time, assessment
of both consistent high quality performance and improvement is much more feasible
than over shorter periods.
Rolling reviews have some attractions, notably with regard to the workload involved
within institutions. One disadvantage concerns institutional financial planning and
investment strategies for disciplines and departments: overall budgets would become
less predictable, which would also militate against cross-disciplinary and
inter-disciplinary research initiatives. If less money were involved, however, this
last point would become less important, so that rolling reviews could be a medium- to
long-term goal.
c. What is excellence in research?
As we have already indicated, excellence in research involves intellectual
innovation: taking forward the frontiers of knowledge (which can be done in a
variety of ways). In most disciplines, excellence can only be identified through
published outputs and its identification must involve expert judgements based on
careful consideration of those materials, assisted by whatever other data those
undertaking the evaluations consider desirable and credible.
It has to be recognised that true research excellence is probably highly skewed:
only a very small proportion of researchers will have impacts that are
long-lasting and wide-ranging, while most will contribute – entirely satisfactorily –
to the slow accumulation of knowledge.
d. Should research assessment determine the proportion of the available
funding directed towards each subject?
NO. That proportion is again a matter of judgement that can only be made subjectively
through informed debate regarding scientific priorities and justice for all disciplines. It must be
transparent, consistent and credible – as with so many other aspects of the RAE exercises –
and cannot be left to an algorithm. To do that would invite expert panels to ‘play games’ in
order to promote their disciplines’ interests: grade drift would be the outcome and the purpose
of the exercise would become lost.
e. Should each institution be assessed in the same way?
YES. Clearly judgements should take account of particular circumstances –
another reason why they have to be subjective – but without consistent
application of criteria the credibility of the exercise would be undermined, even
with transparent and full reporting of decisions and the reasons for them. And if
resources follow the judgements, equal treatment is absolutely necessary.
There is a strong and widely-held belief that the needed information should be
provided about all individuals who are members of the department/group etc. being
assessed. There is now considerable game-playing over this and some grades are not
properly reflective of an entire group.
f. Should each subject or group of cognate subjects be assessed in the same
way?
NOT ENTIRELY. Subjects differ widely in their research practices, and the expert panels
should recognise this (as we believe was the case with the last RAE) and operate the clear
generic criteria accordingly. Consistency within subjects should be paramount.
g. How much discretion should institutions have in putting together their
submissions?
The submissions have two main components: required data; and required commentary. On
the former, there should be no discretion: in order for the expert panels to make proper
comparisons the data have to be presented in a consistent way. On the latter, if the request is
clear as to what issues it wants discussed, then institutions should have discretion (as they
already do) with regard to how it is presented. This is, for example, critical in providing
information on future plans, which in due course can be compared to outcomes.
h. How can a research assessment process be designed to support equality of
treatment for all groups of staff in Higher Education?
If the system is transparent and consistent, with full reporting and feedback,
then equality of treatment – within the criteria set and the established practices –
should be ensured.
As the present system operates, however, it can (and some believe does) discriminate
against institutions which, for a variety of reasons (historical, institutional, current
choice etc.), place a relatively low priority on research relative to other areas of
activity, especially teaching. This is exacerbated because RAE reports get much wider
publicity than teaching reviews, and have funding consequences; furthermore, the
present systems of reviews do not encourage a research-teaching nexus that should be
at the heart of higher education. There is a strong case, therefore, for conducting
reviews of departments as a whole, covering all of their activities (teaching and
research) and presenting a single, though comprehensive, evaluation of their
performance (i.e. not reduced to a single grade). This would undoubtedly involve a
different format from the current RAEs, including visiting panels to departments. Such
comprehensive evaluations could be the basis of a system of rolling reviews.
In order to assist with equality of treatment – on a range of criteria (including the
vexed issue of contract staff) – the RAE should be so constructed that it doesn’t,
almost of necessity, stimulate a ‘speeding up of research’ through the setting of
ever-higher quantitative targets. In the end, this can be destructive not only of individual
careers and work-life balance but also the entire research enterprise and its goal of
improving the quality of life for all: short-termism must not be allowed to take over.
There is a strong case for each of the four Funding Councils to treat the outcomes of
the RAE in roughly the same way; otherwise excellence may not only be better
rewarded in some parts of the UK than others but also differentially stimulated.
Equality of treatment for individuals is an internal matter for institutions.
i. Priorities: what are the most important features of an assessment process?
The system adopted has to make complex subjective decisions in widely-
acceptable ways that are seen as legitimate.
It must be
A. Fair to individuals and institutions;
B. Resistant to games-playing; and
C. Rigorous.
It should also be informative (which it would have to be in order to be fair),
transparent (again, fairness implies this), and not burdensome (or, at least, no more
burdensome than absolutely necessary).
23. Notes elaborating upon each of these issues are also provided in Annex B.
How to respond
24. We will assume that all respondents consent to the publication of their response.
If you wish your response, or any part of it, to remain confidential, this must be
clearly indicated both in the covering e-mail and the front page of the response itself.
25. What we seek at this juncture is the clearest possible sense of what matters to
interested parties, so that we can place those concerns at the heart of the review and
ensure that they inform the development of proposals.
26. The purpose of this invitation, therefore, is to generate ideas and insights rather
than to discriminate between them.
27. The notes in Annex B set out some of the issues around each question. They are
intended to stimulate discussion. We encourage respondents to challenge any
underlying assumptions they discern in our presentation of the issues.
28. Please make clear at the top of the response on whose behalf it has been
submitted. In particular, please indicate whether it represents the corporate view of an
institution, organisation or grouping, or the private view of an individual or group of
individuals.
29. Responses should include the details (name, telephone number and e-mail
address) of someone we can contact if we have any queries about the response.
30. Please send completed responses by e-mail to Vanessa Conte at
rareview@hefce.ac.uk. The closing date is 29 November 2002.
31. We regret that it will not be possible to acknowledge or provide feedback to all
respondents. However, if there are points of particular importance in your response
which you wish to draw to the attention of the review team, please contact Siân
Thomas at or Tom Sastry at rareview@hefce.ac.uk
Annex A
Steering group: membership and terms of reference
Membership
Sir Gareth Roberts (Chairman)   Wolfson College, Oxford
Sir Leszek Borysiewicz          Imperial College
Professor Vicki Bruce           University of Edinburgh
Professor David Eastwood        University of East Anglia
Professor Georgina Follett      Dundee University
Dr John Kemp                    Evotec Neurosciences GmbH
Professor Fabian Monds          Invest Northern Ireland
Professor Terri Rees            Cardiff University
Professor Phil Ruffles          Rolls-Royce plc
Sir David Watson                University of Brighton
Others have been approached. Further names may be confirmed in the near future.
Terms of reference
1. The review will investigate different approaches to the definition and evaluation
of research quality, drawing on the lessons both of the 2001 RAE and of other models
of research assessment,1 and will advise on the future of research quality evaluation.
2. The output will be a number of models of research assessment and a short
covering report to be presented to the chairmen and chief executives of the funding
councils (including DEL NI). The report will either identify one preferred option or
indicate the circumstances under which particular models would be most appropriate.
1 The term ‘assessment’ is used here in its broadest sense to refer to any activity undertaken
with the aim of providing information, assurance or feedback on the quality of research and
associated activities and processes.
Annex B
Notes for facilitators
1. These notes are intended to guide those responsible for producing responses.
2. We have divided the topics for discussion into six groups. We hope this will be
helpful to those organising discussions within their organisations or groupings. Four
of the groups relate to the approaches to assessment outlined in paragraph 19 of the
main document, and the fifth relates to crosscutting issues which will have to be
addressed whichever approach is pursued. Group 6 prompts discussion of any topics
that we have missed.
Group 1: Expert review
3. We have used the term ‘expert review’ to describe a system in which experts
(possibly but not necessarily peers) make a professional judgement on the
performance of individuals or groupings2, over the previous cycle, and/or their likely
performance in the future.
4. In such a system, assessors may make use of metrics, but the ultimate
responsibility for decisions rests with them. Assessment may be undertaken entirely
by peers or may incorporate others (such as representatives of user groups, lay people,
and financial experts). The 2001 RAE was an example of this type of assessment.
5. A variant of this system would be a combined assessment of teaching and
research.
6. Suppose the funding councils have decided that they wish to retain the
judgement of experts as the cornerstone of the research assessment. They are,
however, willing to consider any system, however different from the 2001 RAE, so
long as that condition is met. How would you advise them?
7. In providing your advice, you are asked to consider the following questions:
a. Should the assessments be prospective, retrospective or a combination of the
two?
They have to be a combination of the two, so that the assessment is based on the
department’s current trajectory taking into account starting point, recent
performance, and prospects for the next 3-5 years.
b. What objective data should assessors consider?
The dominant objective ‘data’ to be considered should be research outputs during the years
preceding the assessment: as at present, full details should be made available of each
individual’s performance, and copies of all publications should be both available for
consultation and indeed consulted by panel members (preferably more than one).
2 A grouping might be (for example) a research group, network, department, faculty, institution
or consortium.
Other data, as now, should be collected on research income (in suitably disaggregated form:
there are major differences between and within disciplines in the value-added that can be
gained from relatively small grants and these should be identifiable). There should also be
data on both student higher degree completions and their post-graduation employment:
contribution to the reproduction of the academic/research labour force is a major mark of a
department’s excellence.
c. At what level should assessments be made – individuals, groups,
departments, research institutes, or higher education institutions?
The assessment should definitely NOT be of individuals: that is a task for others
in other contexts.
Research within disciplines is organised in most universities by either department or
subject group and – as at present – these should be the units of assessment. The goal
of the assessment is to evaluate how groups are contributing to the advancement of
knowledge in their discipline(s), and that evaluation (on the assumption that, as we
strongly urge, it involves expert review) should be undertaken by their peers, hence
the need for assessment subject-by-subject.
Evaluation by institution would be much too broad-brush and could well penalise high
quality groups in institutions that perform poorly overall.
d. Is there an alternative to organising the assessment around subjects or
thematic areas? If this is unavoidable, roughly how many should there be?
We have not identified a viable alternative. There is a strong case, however, for a proper
assessment of pedagogic research, perhaps involving a combination of subject and
educational specialists, with departments able to determine the degree to which they are
assessed in this category.
e. What are the major strengths and weaknesses of this approach?
On the assumption that this question refers to the expert review approach and not the
alternative that may have been identified in (d), we believe that its main strength is
that it is most likely to meet our criteria of transparency, consistency, and credibility.
Research evaluation is a subjective judgement process; it can only properly be
undertaken by recognised experts.
That strength needs to be bolstered by ensuring that all aspects of panel selection and
decision-making are as transparent as possible. For example:
a. Although nominations for membership are called for, it is far from clear
how decisions are then made, other than that the nominated chair is involved
and that a ‘balance’ is sought across a number of variables, such as subject
matter, the various countries of the UK, types of institution etc. More external
involvement might enhance credibility, while recognising that the ultimate
decision-making (especially with regard to panel chairs) lies with the Funding
Councils;
b. It should be the norm that the majority of a panel’s members are ‘new’ to
the process in each RAE. Some continuity of membership is needed (in
particular, it is probably desirable that the chair have served on a previous
RAE), but nobody should serve on more than two, since this carries the danger
of concentrating power and allowing a few people potentially to influence a
discipline’s directions for too long;
c. There was no transparency with regard to the use of overseas advisers in
the last RAE – this should be rectified; and
d. Given the importance of the exercise, the feedback to institutions should
be much greater than has been the case so far.
The weaknesses are that any system of ‘subjective’ expert review is only as good as
the individuals concerned, none of whom is omniscient or totally immune from
misinterpretation and error. With sizeable panels this should be avoidable, as long as
they are well chaired and no individual (or small group of individuals) is allowed
either to dominate or to unfairly influence particular decisions.
An allied weakness is that, because the individuals involved are (of necessity) known,
they are open to attempted influence – however indirect. For this, as for so much else,
we have to rely on the integrity of the individuals involved, which should be an
important consideration in their nomination, consideration and appointment.
Group 2: Algorithm
8. Suppose the funding councils have decided to use an algorithm to assess
research quality. The assessment must be ‘automatic’, leaving no room for subjective
assessment. Metrics might include:
measures of reputation based on surveys
This would be disastrous. In any large discipline with many separate specialisms, the majority
of individuals in any department would not be known to the great majority of those asked for
their opinions: the judgements of the departments as a whole could be based on little more
than hearsay and could be considerably biased towards the reputations of a few individuals.
(And popularity contests – e.g. Today’s ‘Person of the Year’ – are open to improper
influence!)
external research income
While the ability to attract income is necessary for high quality research in some fields, it is
not sufficient to guarantee success. Using such an input measure would assume a black-box
consistent relationship between inputs and outputs which would, at best, have dubious
sustainability, especially given the variations both between and within disciplines in the
importance of large sums to different types of research.
bibliometric measures (publications or citations)
These have limited utility because of the coverage of sources (which varies both
within and between disciplines) and the ways in which various indices are
calculated. They do not meet the transparency, consistency and credibility
criteria.
research student numbers (or completions)
Although a good measure of contributions to a discipline’s health, these suffer
drawbacks, such as variations between and within disciplines in the availability of
studentships and a department’s links to current research council priorities in aspects
of their subject.
measures of financial sustainability.
It is not clear what these would be and how they relate to a
discipline/department, which is the only viable level of assessment in our view.
9. Assume the councils have not, however, formed a view on what metrics should be
used or how they could be combined most effectively in an algorithm. How would you advise
them?
Not to bother.
10. You have been asked in providing your advice to consider the following questions:
a. Is it, in principle, acceptable to assess research entirely on the basis of
metrics?
NO
b. What metrics are available?
Several but, in the above context, none that we can commend.
c. Can the available metrics be combined to provide an accurate picture of the
location of research strength?
NO
d. If funding were tied to the available metrics, what effects would this have
upon behaviour? Would the metrics themselves continue to be reliable?
It would encourage game-playing and put immense pressure on, for example,
journal editors and referees. The metrics would not be reliable.
e. What are the major strengths and weaknesses of this approach?
STRENGTHS: ‘cheap and dirty’. WEAKNESSES: complete lack of credibility.
Group 3: Self-assessment
11. Suppose the funding councils have decided to pursue a self-assessment model in
which institutions, departments or individuals assess themselves. A proportion of the
assessments are reviewed in detail. In a self-assessment model, the assessment is
made by the assessed, although its reliability may be challenged by the validators.
12. Assume the councils have not, however, formed a view on how the assessment
should be structured and how self-assessments will be validated. How would you
advise them?
Not to proceed with the method until comprehensive trials (which had no
influence on reputations and funding outcomes) had been conducted and the
academic community was convinced of the method’s ability to meet our criteria
of transparency, consistency and credibility.
13. In providing your advice, you are asked to consider the following questions:
a. What data might we require institutions to include in their self-assessments?
The data that they provide to RAEs now, but if self-assessment is to be used then it
seems logical to allow supplicants to construct their evidence as they see best – even
though this may make consistency of treatment hard to achieve: if they can’t produce
a convincing case, do they deserve to succeed?
b. Should the assessments be prospective, retrospective or a combination of the
two?
A combination, as indicated in an earlier answer.
c. What criteria should institutions be obliged to apply to their own work?
Should these be the same in each institution or each subject?
The criteria would have to be common and commonly applied, relative to the
institution’s mission: we have no suggestions as to what they should be since we
believe the approach non-viable.
d. How might we credibly validate institutions’ own assessment of their own
work?
Rigorous inspection of large samples.
e. Would self-assessment be more or less burdensome than expert review?
LESS for the Funding Councils, though not significantly so given our answer
to (d); probably MORE for the departments and institutions.
f. What are the major strengths and weaknesses of this approach?
STRENGTHS: departments could set their own goals and be judged against
them.
WEAKNESSES: lack of transparency, almost certainly great difficulties in
consistency, and total lack of credibility with those affected. Game-playing of the
highest order would be invited!
Group 4: Historical ratings
14. Suppose the funding councils have decided to pursue a policy that gives each
institution a rating on the basis of its historical performance and/or the value of its
research infrastructure. Research would, in effect, be presumed to be strongest in
those departments or institutions with the strongest track record.
15. The councils recognise that such an approach could only be used in conjunction
with another system: there would need to be some way of identifying institutions
whose performance was sharply improving or declining, even if the presumption was
that the distribution of excellence would remain stable. It would also be possible to
alter the share of the total pot provided for each institution on the basis of what had
been achieved with the investment provided (a ‘value for money’ rating).
16. Assume you have been asked to advise on how such a system might work. In
developing your advice, you have been asked to consider the following questions:
a. Is it acceptable to employ a system that effectively acknowledges that the
distribution of research strength is likely to change very slowly?
NO. Does the evidence show that? Would it show it if, as we recommend, RAEs
only occurred every 7 years or so?
b. What measures should be used to establish each institution’s baseline
ratings?
A full RAE as currently conducted.
c. What mechanism might be used to identify failing institutions or institutions
outperforming expectations? Could it involve a ‘value for money’ element?
Only a full return to an RAE of the current type. Any other information –
performance indicators, self-assessment etc – would be partial and open to game-
playing.
d. What would be the likely effects upon behaviour?
With all such exercises, people learn how to play the game – as the current round
of teaching assessments makes very clear. Transparency and credibility would be
hard to sustain, and the likely result would be greater alienation of academics and
a fall in morale.
e. What are the major strengths and weaknesses of this approach?
STRENGTHS: relatively cheap if not done rigorously; if done rigorously, then
most would want to be fully evaluated and savings would be few.
WEAKNESSES: almost certain to fail to meet our criteria of transparency,
consistency and credibility. Likely to result in ossification of the system.
Group 5: Crosscutting themes
17. You have been asked to provide advice to the funding councils on the following
fundamental issues:
See our answers above
a. What should/could an assessment of the research base be used for?
b. How often should research be assessed? Should it be on a rolling basis?
c. What is excellence in research?
d. Should research assessment determine the proportion of the available
funding directed towards each subject?
e. Should each institution be assessed in the same way?
f. Should each subject or group of cognate subjects be assessed in the same
way?
g. How much discretion should institutions have in putting together their
submissions?
h. How can a research assessment process be designed to support equality of
treatment for all groups of staff in Higher Education?
i. Priorities: what are the most important features of an assessment process?
18. We have elaborated on each of these questions below. Respondents may wish to
use these notes as a basis for discussion.
a. What should/could an assessment of the research base be used for?
For the funding councils the immediate purpose of research assessment is to
provide the information necessary to calculate funding levels. RAE ratings are,
of course, used by others, including institutions themselves, for a variety of
purposes.
What should research assessments be used for and by whom? Should the
funding councils be more explicit about what the information produced by the
exercise means, and what it ought to be used for? Should we look to design a
research assessment process with the explicit aim of providing reliable
management information for academic communities, institutions and other
funding agencies? Is it the responsibility of others if they use ratings for
purposes that may not be appropriate?
Is there scope for the funding councils to work with other funding agencies –
particularly the research councils – to develop complementary assessment
processes which minimise the total assessment burden? Could the funding
councils and research councils make more use of data produced by their
respective processes? If so, how?
b. How often should research be assessed?
How often should research assessment take place? Should all subjects and all
institutions be assessed at the same time or with the same frequency? Should
clusters of subjects be assessed separately?
c. What is excellence in research?
The purpose of research assessment is to provide information about the quality
of research – but what is quality?
Another way of asking this question would be “what is it that distinguishes the
best research”? Some might feel that this begs the question, “Is it helpful to
speak of the ‘best’ research, in a way which implies that there is a magic
ingredient that separates it from the rest”?
Are there different aspects of research activity (for example creativity and
applicability) that each demand recognition? Did the 2001 RAE capture this?
d. Should research assessment determine the proportion of the available
funding directed towards each subject?
In devising a system of research assessment, it is important to know whether it
will be required to inform the distribution of funds between subjects as well as
between institutions.
There are a number of ways in which ‘subject pots’ might be determined. These
include:
the quality of UK research in the subject, benchmarked against international
competition
the volume of research in the subject that meets a given quality threshold
a strategic judgement on the importance of the area to the UK
a metric based upon external funding in the subject
an overtly historical distribution which aims to retain the current balance
a mixture of the above.
If the relative quality of research in different subjects is to be used as the basis
for generating subject pots, how is this to be assessed?
e. Should each institution be assessed in the same way?
The 2001 RAE obliged all institutions to submit to the same assessment. The
research outputs of a large multi-faculty institution with a strong research
tradition were assessed in the same way as those of a small college with no
tradition of large-scale investment in research.
Some would argue that this is an unfair competition; others that it is important
for those with minimal resources to see where they stand in relation to leading
units. A middle position would be that it is sensible not to compare institutions
that are very different but that the system should provide a ladder of
improvement so that all researchers and institutions have the opportunity to
demonstrate potential.
f. Should each subject or group of cognate subjects be assessed in the same
way?
How far should the nature of the assessment be allowed to vary between
subjects? Should each subject community be free to define the sort of
assessment most appropriate to it? Should the funding councils go further in
standardising assessment practice? Or is the current balance about right?
This is not necessarily a simple choice between a greater or lesser degree of
standardisation. One approach might be to define a small number of broad
subject areas, and to make assessment methods within each area as similar as
possible while allowing the broad groups to diverge from one another.
g. How much discretion should institutions have in putting together their
submissions?
At present, institutions have a large degree of control over the content of their
submissions, over who or what is assessed and by whom. This ensures that
planning decisions do not make it impossible for the particular nature of an
institution’s research to be appropriately assessed, but it also brings significant
disadvantages.
There are two alternatives: a more rigid system, or a system in which
submissions are made and controlled by individuals, research groups or
networks rather than by the institutions. The former risks the disadvantages of
any inflexible bureaucratic procedure; the latter would arguably be unfair to
institutions, as their funding would be determined by an assessment into which
they had minimal direct input.
Both, however, would provide more objective results: ratings, scores or shares
of the funding pot would depend entirely upon the quality of research activity as
measured by the exercise, rather than reflecting the willingness of the institution
to trade funding for the prestige of a high rating. They would also close the
question of alleged unfairness to individuals who perceive that the decision not
to include their work in RAE submissions has damaged their careers.
h. How can a research assessment process be designed to support equality of
treatment for all groups of staff in Higher Education?
The funding councils are committed to ensuring that their research assessment
process is non-discriminatory. They are also committed to ensuring that it does
not reinforce a culture, wherever such a culture may exist, in which staff are
disadvantaged on the grounds of sex, sexual orientation, race and ethnic origin,
disability, age, religion or any other irrelevant characteristic.
Are there features of past research assessment processes which discriminate or
which can be abused by those seeking to discriminate against any group? Are
there subtler effects, adversely affecting the legitimate interests of groups of
staff, to which the design of the process contributes? What are the essential
design features of a research assessment process that encourages genuine
equality of opportunity for all?
i. Priorities: what are the most important features of an assessment process?
Most people would agree that a successor to the 2001 RAE ought to strive to be
all of the following (and many other things besides):
not burdensome
rigorous
fair to individuals and institutions
informative
transparent
resistant to games-playing
administratively efficient
flexible (so that changes in policy can be accommodated without redesigning the
entire process)
minimally expensive.
We invite respondents to identify the three most important characteristics of an
assessment process. These need not be taken from the list above but should
reflect characteristics of the process rather than the philosophy underpinning it
(we have asked elsewhere what constitutes excellence in research).
Group 6: Have we missed anything?
We invite respondents to tell us whether there are other issues or options not
considered here. In particular, we would be interested to hear of any approach to
research assessment that could not be described as a variant of the approaches listed
above.