Guidelines for the evaluation of development research
What is “development research”?
In the present text, “development research” is interpreted as “research in a development context”: research that is, or could be, relevant for development, i.e. either “research in a developing country” or “research for development”.
In developing countries, science is relevant if it helps to solve problems in the short or intermediate term, or if it helps us to understand those problems better. Developing countries face immense problems, some of which can be solved with knowledge and technology that is already available elsewhere but has to be adapted to the local context and boundary conditions, e.g. by using local materials and local labour. This is exactly what an engineer should do anytime and anywhere: solve problems using scientific knowledge under external constraints (e.g. drinking water production and distribution, sanitation, better crop productivity, irrigation, ...). There is a need for knowledge transfer and for trained researchers who can adapt the knowledge to site-specific conditions. “Sharing the fruits of scientific and technological progress is one of the most important ways that rich countries can help poor countries fight poverty” (UNDP, 2003).
“If the development community continues to ignore the explosion of technological innovation in food, medicine and information, it risks marginalizing itself and denying developing countries opportunities that, if harnessed effectively, could transform the lives of poor people and offer breakthrough development opportunities to poor countries” (UNDP, 2001). On the other hand, developing countries need their own good, creative and innovative research to solve their problems. One should realize that it is probably more difficult to find solutions (e.g. for health care and adequate food supplies) in a complex environment with limited resources than in a rich, “northern” environment. Some specific topics do not catch the attention of northern researchers and are left out of the research agenda (e.g. some diseases), while others need local field data (e.g. water management, agricultural problems, sociological studies, ...). Developing countries do not escape the trend towards (economic) globalisation, and their industry, agriculture, education, ... should be well supported by research in order to cope with increasing worldwide competition.
Obviously, the methodology for evaluating “development research” will depend on the purpose of the evaluation. Is it
- to evaluate an individual researcher, e.g. for an appointment or a promotion,
- to assess the past performance of a research group, or
- to compare and evaluate research projects, e.g. to select the ones eligible for funding?
Each of these aims calls for a different and specific approach.
Evaluation of an individual researcher
There may be a need to evaluate individual researchers, e.g. for an appointment as a teacher or lecturer, as a researcher at a university or in a public or industrial research centre, or as a research manager, e.g. in a central administration.
The key question is: what is his/her “quality” as a researcher? Quality can be assessed in an absolute or a relative way. If a candidate has to be evaluated for a position in an international organisation or a foreign company, obviously absolute criteria have to be used. When a group leader has to be appointed, when a dean has to be elected, when a teaching assignment has to be given, the “best” researcher from a limited group of candidates has to be chosen, which implies a relative assessment.
Absolute assessment of an individual researcher
The quality of a researcher is best measured by the (number and) quality of his/her publications about that research. The best, but most difficult, assessment of the quality of a scientific publication, and thus of the research that lies behind it, is the intrinsic evaluation by peers. Of course, the main problem is to find one or more capable experts willing to do the job! Aspects that should be considered in this evaluation are:
- Is the author acquainted with the up-to-date knowledge in his/her domain?
- Does the paper start from a relevant non-trivial research question, and is this question clearly formulated?
- Does the author use the appropriate methodology for obtaining a well-founded answer to this question?
- Are the results of his/her investigation sufficiently convincing for justifying the conclusions drawn by the author?
- Is the paper well written, with a clear structure that underlines the problems, the methodology, the results and the conclusions?
- Do the conclusions constitute a valuable step forward in our knowledge? This value can be purely theoretical, deepening our conceptual understanding of the phenomena (which at a later stage may lead to useful applications), or it can be immediately and directly useful. Is there a possibility for special applications in the developing world?
Often, indirect evaluations are performed on the basis of external indicators that try to measure the impact that a paper has (or potentially may have) on the further development of science or on interesting applications. The indicators may be:
- The number of citations that the paper receives in the subsequent literature. This criterion is very seductive because it is quantitative in nature and therefore an easy measure. Disadvantages are:
  - it may be abused (friends citing each other),
  - a citation may contain very negative criticism,
  - the number of citations reflects the popularity of a subject or the size of the specific scientific community more than the intrinsic value of a paper,
  - the main problem: it can take several years before the value of a paper can be rightfully assessed on the basis of the number of citations.
- A surrogate indicator often used is the so-called “impact factor” (IF) of the journal in which the paper is published, which measures the average number of citations received in a given year by the papers that the journal published during the two preceding years. Since this measure of success correlates clearly with the strictness of the journal's peer review process, a high impact factor offers some assurance of a positive intrinsic evaluation. The drawbacks are nevertheless:
  - The IF does not say anything about the quality of an individual paper: individual articles may have citation numbers that deviate strongly from the average for the journal. In 2004, 90 % of the IF of “Nature” was based on only 25 % of its articles!
  - The IF is also not a measure of the quality of the journal, but rather a measure of the size of the scientific community in a particular discipline and of the popularity of the journal. Since a high impact factor gives the publisher a commercial advantage, the selection process of a journal may be biased towards more popular subjects.
  - The current hype around the impact factor is one of the causes of the excessively high subscription prices of some journals, which make them unaffordable for the poorer universities in the developing world. We should not encourage this...
  - Using 2 years as a window for determining the IF of a journal is not justified in all scientific disciplines. One should indeed consider the useful lifetime of knowledge, which differs between scientific fields. The “half-life” of knowledge is the amount of time that has to elapse before half of the knowledge in a particular area is superseded or shown to be untrue. The half-life of knowledge in psychology has been estimated at 5 years and in civil engineering at 7 years; in nuclear engineering it is 2½ years. Knowledge in “traditional fields” lasts long; knowledge in newer fields develops fast and is quickly outdated. It may be expected that the useful lifetime of development research will be rather long, which puts it in an even more disadvantageous position with respect to the IF!
  - Because of the different half-lives of knowledge and the different sizes of the research communities, IFs cannot be compared between scientific areas. Some examples:
    - CA – Cancer Journal for Clinicians: IF = 69
    - Nature: IF = 29
    - Science: IF = 26
    - Estuarine, Coastal and Shelf Science: IF = 1.8
    - ASCE Journal of Hydraulic Engineering: IF = 1
- Whereas publication in a high-quality journal should certainly be appreciated, we should not look down on publication in an Open Access journal. This modern way of freely sharing research results with the whole world is certainly to be preferred to burying a paper in an obscure local journal.
- SCI journal papers should be a measure of the quality of somebody’s research work, not its quantity. In many disciplines, especially the rapidly evolving ones, it might be more appropriate to disseminate research results at international conferences. Some conferences are indeed highly selective. Whilst 1 to 2 years may pass between submission and publication in a journal, a conference paper is published within a few months.
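The averaging behind the impact factor and the notion of a half-life of knowledge discussed above can be illustrated with a short Python sketch. The citation counts are invented, the function names are hypothetical, and reading “half-life” as exponential decay is an assumption made for illustration only; none of this comes from the sources quoted in this text.

```python
from statistics import mean, median


def impact_factor(citations_per_paper):
    """Average number of citations per paper, the quantity an impact factor reports."""
    return mean(citations_per_paper)


def share_from_top(citations_per_paper, top_fraction):
    """Fraction of all citations that comes from the most-cited papers."""
    ranked = sorted(citations_per_paper, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


def fraction_still_valid(years, half_life):
    """Share of knowledge not yet superseded after `years`.

    Assumption: exponential decay, so exactly half remains after one half-life.
    """
    return 0.5 ** (years / half_life)


# Invented citation counts for eight papers of a hypothetical journal.
citations = [40, 30, 5, 3, 1, 1, 0, 0]

print(impact_factor(citations))         # journal average
print(median(citations))                # the "typical" paper
print(share_from_top(citations, 0.25))  # weight of the top quarter
print(fraction_still_valid(2, 7))       # 2-year window, 7-year half-life
```

With these invented numbers the journal average is 10 citations per paper, whereas the median paper has only 2, and the two most-cited papers account for 87.5 % of all citations. Under the exponential-decay assumption, about 82 % of the knowledge in a field with a 7-year half-life is still valid after the 2-year window, against about 57 % in a field with a 2½-year half-life.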
Relative assessment of an individual researcher
To choose which individual from a limited group of candidates qualifies for a position, a relative assessment has to be made. The individual researchers should be assessed on the basis of their whole C.V., in which their list of publications plays an important role. Here not only international journals and conferences count, but also local publications (dissemination of knowledge, impact on society, ...). However, other elements also have to be considered:
Publications in international journals should show the quality of the research, but not its quantity or its impact. To have an impact on society in the South, the aim is not to be cited in international journals, but to be read or heard by those who could use the knowledge and apply it. Really, this should be the very reason why we publish! Researchers from the North working on subjects useful for the South, and researchers from the South, should therefore find appropriate communication channels.
How can the impact of research on society be measured, including, of course, its impact on the scientific community? Up to now an evaluation has very often come down to one single figure: the number of publications (for a researcher) or the rank (for a project, a university, ...). It is becoming increasingly clear that one has to switch to a multi-dimensional evaluation system. For a university the dimensions could be performance in education and research, innovation, community outreach and internationalization; for an individual researcher they should be research output, regional and international prestige (shown by the number of invited lectures or participation in foreign projects, invitations to doctoral committees, reviews), impact on society, collaboration with industry, and the international dimension. Besides, just as is the case for university rankings or accreditation, the performance of an individual should be measured against his/her own “mission statement”. If work in a development context is part of it, it should be evaluated and properly acknowledged.
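As an illustration of what a multi-dimensional evaluation measured against a mission statement could look like, here is a minimal Python sketch. The dimension names, the 0 to 1 scoring scale, the weights and both function names are assumptions introduced for this example, not an established scheme. The point of the design is that the result stays a profile with one entry per dimension and is never collapsed into a single figure.

```python
def evaluate_against_mission(scores, mission):
    """Build a per-dimension profile for the dimensions named in the mission statement.

    scores:  dict mapping dimension -> score on an assumed 0..1 scale
    mission: dict mapping dimension -> relative weight in the mission statement
    """
    total = sum(mission.values())
    if total <= 0:
        raise ValueError("mission statement needs at least one positive weight")
    return {
        dim: {
            "score": scores.get(dim),    # None means no evidence was supplied
            "weight": weight / total,    # normalised share of the mission
        }
        for dim, weight in mission.items()
    }


def missing_evidence(profile):
    """Dimensions that are part of the mission but have not been evaluated."""
    return [dim for dim, entry in profile.items() if entry["score"] is None]


# Hypothetical researcher whose mission statement includes development work.
mission = {"research_output": 2, "impact_on_society": 1, "development_work": 1}
scores = {"research_output": 0.8, "impact_on_society": 0.6}

profile = evaluate_against_mission(scores, mission)
print(profile)
print(missing_evidence(profile))
```

In this hypothetical case the profile shows that development work makes up a quarter of the mission statement but has not been evaluated at all, which is exactly the omission the text warns against.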
Special case: evaluation of a future researcher (candidates for a PhD scholarship)
Input from VLIR UOS/CIUF??
Selecting candidate doctoral students for a scholarship or a research (and teaching) position at a university is difficult. There is little to base an evaluation on apart from previous school records, which are not always a good indicator of the skills a future researcher may need. Probably, a genuine motivation for doing research, and for doing something with the research results, is the best asset of a future researcher.
Quality of a research group
A team or research group should be assessed on the basis of the composition of the group, i.e. the quality of its members and their publications (assessed as described above). Furthermore, one should consider the following:
Quality of a research project
Although the main guarantee for success of a research project is the quality of the researcher or the research group that proposes the project, additional considerations should be taken into account:
It is important to decide beforehand what is going to be evaluated, and how it will be done. Criteria of VLIR UOS and CIUF
- The number of publications relative to the number of years devoted to research, taking into account the person's other professional duties, such as teaching and managerial tasks or other services to the community (“productivity”).
- The scientific output in terms of number of publications and contributions to conferences, relative to the input (funding) and available infrastructure and (human) resources (“efficiency”).
- The scope of the subjects investigated: is there a good spread, or is the same subject treated over and over again? Are the reported studies relevant for increasing our knowledge or solving problems in the development context? What is the impact of the research on society, and what about collaboration with industry and the international dimension (“pertinence”)?
- Is the environment of the researcher stimulating his/her research or does it rather work against it?
- Does the researcher stimulate his environment to do research, to be innovative and creative?
- Does he/she take initiatives to innovate in research or teaching, or to apply his/her own or others' research results in practical projects to the benefit of the (local) society? Is he/she active as a personal consultant or practising engineer? Did he/she establish an SMC? Does he/she have (inter)national prestige, shown by the number of invited lectures or participation in foreign projects, invitations to doctoral committees, reviews? Does he/she show leadership? Management skills? (“outreach”)
- Does the group have a coherent research plan, or is everybody following his/her own favourite programme?
- How good are the prospects for a successful realization of the work plan? Is all necessary expertise present in the group? Is the infrastructure appropriate? How good is the leadership? Can the project leader motivate his/her people? Does he/she possess managerial as well as scientific skills?
- How good are the international contacts of the group?
- Does the project description indicate that the group is sufficiently acquainted with the up-to-date knowledge in this domain? Do they have broad and adequate access to the international scientific literature?
- Does the project start from a relevant non-trivial question, and is this question clearly formulated in the framework of the subject (“relevance”)?
- Would answering this question (or solving this problem) really be an important step forward in the development or progress of our scientific knowledge? Would it be useful for the developing world (“impact”)?
- Is the proposed methodology for obtaining an answer to this problem appropriate?
- Does the group possess the necessary skills for applying the methodology? Or, alternatively, does the project contain an element of upgrading their skills to the required level? Are there enough material and human resources to carry out the project successfully within the proposed time frame and with the requested funding?
A number of things must be agreed upon beforehand:
- Which instruments will be used for the evaluation (e.g. logical framework)?
- Is it an ex-ante, midterm or ex-post evaluation?
- Is it internal within the organization or external?
- Is it carried out with or without the participation of the project team?
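The “productivity” and “efficiency” criteria listed above are simple ratios, and a short Python sketch may make them concrete. The function names, the `research_share` parameter and the choice of funding as the only input are assumptions made for illustration; the text itself does not prescribe how other duties, infrastructure or human resources should be quantified.

```python
def productivity(n_publications, research_years, research_share=1.0):
    """Publications per year of time effectively available for research.

    research_share is the assumed fraction of working time left for research
    after teaching, managerial tasks and other services to the community.
    """
    effective_years = research_years * research_share
    if effective_years <= 0:
        raise ValueError("effective research time must be positive")
    return n_publications / effective_years


def efficiency(n_outputs, funding):
    """Publications and conference contributions per unit of funding received."""
    if funding <= 0:
        raise ValueError("funding must be positive")
    return n_outputs / funding


# Hypothetical lecturer: 10 publications in 10 years, half of the time spent teaching.
print(productivity(10, 10, research_share=0.5))
# Hypothetical group: 6 outputs for 3 funding units (the unit is left open).
print(efficiency(6, 3.0))
```

Correcting for the time actually available for research doubles the productivity of this hypothetical lecturer, from 1 to 2 publications per research year. Such figures remain only one dimension of the evaluation and say nothing about quality or pertinence.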
An evaluation methodology must be elaborated in a generic way so that it can be adapted to the different disciplines: an engineering project will be evaluated on different criteria than a pure exact science project or a proposal from the humanities. Specific criteria will be added whilst others will no longer apply. If needed, the methodology must be adjusted or refined during the evaluation process.
A development project has to be evaluated according to 6 criteria (H. Legros)
Acknowledgments: Extensive use has been made of the annex “scientific quality” by Prof. Raf Dekeyser and of a note by Herman Diels (VLIR – UOS).