Knowledge Exchange Digital Author Identifier Summit
An important milestone meeting on digital identifiers was held earlier this week in the Tower Hill area of London by the Knowledge Exchange, an international information science strategy group representing the UK, Denmark, Germany and the Netherlands. Representatives were also present from a number of other countries, including Australia, Italy, Norway and the United States, as well as from the international publisher Elsevier and the ORCID initiative. The meeting at the former Royal Mint, convened by the JISC on 13-14 March 2010, focussed on Digital Author Identifiers and was primarily concerned with uniquely identifying researchers and other academic staff in a cost-effective, internationally agreed and scalable way, something that has not hitherto been achieved. The first day (see this blog post by Amanda Hill of the Names Project) was devoted to information sharing and consensus building, whereas the second day was productively spent in breakout groups on governance, on interoperability and “supply side” issues, and on added-value services from the perspective of incentivising take-up of identifier schemes amongst users.
Relevance to the UK Researcher ID Task and Finish Group
This meeting follows a series of six meetings of the Researcher ID (ResID) Task and Finish Group, which brings together crucial institutional, high-level, strategic and administrative stakeholders in the UK Higher Education sector. This group has been organised by the JISC, which has been represented on the group by programme managers as well as by Brian Kelly and Talat Chaudhri of the ISC at UKOLN. It aims to meet once more in order to present its findings, having achieved a broad consensus amongst those stakeholders and having funded, agreed and published a series of reports and statements of principle. However, the Knowledge Exchange Digital Author Identifier (KEDAI) summit (tweets archived here and notes in this post by Brian Kelly) represents a wider international group interested in the same issues, and the ResID group has expressed a strong interest in developing UK support for researcher ID schemes firmly within the broader international perspective. The ResID group had, broadly speaking, supported the ORCID identifier scheme, which is in early development, since it is being built on just such an international basis and has buy-in and financial support from governmental organisations, higher education institutions worldwide and international publishers. The KEDAI summit, however, did not unambiguously throw its weight behind ORCID. The ResID group could be seen to have understood the competing International Standard Name Identifier (ISNI) as one of a host of identifiers that would be linked by a single ORCID identifier for each researcher or author. The KEDAI summit, by contrast, after much discussion identified both ORCID and ISNI as potential solutions, while recognising that other possibilities could arise and should not be ruled out at this early stage. Consequently, it will be necessary for the UK members of the ResID group who attended KEDAI to report back and for the group as a whole to re-think some of its findings.
Discussions and Consensus Building
The meeting was extremely successful in clarifying the roles of the possible international players and interest groups in this space, along with the likely sources of conflict that might need to be mitigated in order for any scheme to succeed. In addition to those mentioned above, VIAF, RePEc, CrossRef, TROVE (in Australia) and VIVO (principally in the US and Australia) were factored into the discussions, which were in large part led by Andrew Treloar (Australian National Data Service), Cliff Lynch (CNI), Bas Cordewener (SURF, Knowledge Exchange) and Rachel Bruce (JISC). Amongst the many other participants who deserve an honourable mention here are Paolo Bouquet (University of Trento), Josh Brown (JISC), Nicky Ferguson (Clax Ltd., and author of ResID reports for JISC), Andrew MacEwan (British Library), Mogens Sandfær (DTIC), Chris Shillum (Elsevier) and Maurice Vanderfeesten (SURF).
There were considerable discussions of scope, i.e. who should have an identifier, and of the differences between authors, researchers, academics and others who could in certain contexts require one. A great deal of time was devoted to the benefits and financial motivations for developing such infrastructure, which it was agreed were considerable in all of the countries represented. However, the range of use cases is so broad that it is currently difficult to generalise about financial incentives. Each use case would have its own specific business case, so no single business case can be developed, and it is so early in the development of both ORCID and ISNI (amongst others) that only a broad-brush discussion of benefits could be had. All the same, the attendees agreed unambiguously and unanimously that these benefits, in general terms, were so substantial and of such wide applicability within academia internationally that there is a clear case for a single international identifier scheme, whatever that may end up being. Failure in this process was regarded as a major risk, since the likely result would be a series of commercial identifier solutions lacking interoperability. To some extent these already exist today in Web of Science, Microsoft Academic Search and Google Scholar, none of which identifies authors unambiguously at present.
Issues Arising and Differences of Approach
There were, of course, differences. Most notably, there were issues of control. Some argued that it is academics who should have control over their own identifiers, which is the basis upon which the ORCID development is proceeding, albeit with a dose of realism: the data will need to be bulk-loaded by institutions and curated by them whenever an individual academic does not choose to take control of their identifier and associated data. On the other hand, the ISNI data, via the VIAF database, is collected by institutions on a model more familiar from traditional library and research reporting approaches, although this does not rule out a role, lower down in the process, for individuals to correct their own data and take control of it. There are also international differences in privacy legislation that will need to be taken into account. In Norway, for example, national security numbers are now public information, whereas in the UK they are considered private. The same could be said even of tax returns in different jurisdictions.
Perhaps the greatest area of uncertainty was over the level of semantic information that needs to be attached to an identifier in order for it to be disambiguated, and whether too much information would effectively turn it into yet another silo of information, unconnected to other similar data silos, as Paolo Bouquet convincingly argued. One alternative view in the ORCID group, which Chris Shillum reported although it is not his own, is that semantic information beyond the minimum required for author identification will be needed in order to create added-value services capable of incentivising the take-up and use of the identifiers by academics in practice. Without this, according to this view, the identifier scheme would be an expensive white elephant, unused by the academics whose institutions had registered them. While it was agreed by all that such added-value services were crucial, the opposing view was that they ought to be kept separate from the identifier scheme that they relied on. Paolo Bouquet won considerable support in maintaining that ORCID, for example, should aim at a “thin layer” of interoperability based on a minimum of semantic information attached to each identifier. Institutional affiliations illustrate the problem: they change over time and so require date-stamping. If the full dated history were included, the identifier scheme would quickly be overburdened; if only the registering institution were included, it would frequently give misleading information about earlier or later publications written elsewhere.
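The “thin layer” argument can be made concrete with a small sketch. It is only an illustration under stated assumptions, not a description of how ORCID or ISNI actually store data: the class names, the field names and the `id-0001` identifier are all hypothetical. The identifier record carries only a name, while date-stamped affiliations live in a separate store keyed on the identifier.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ThinIdentifier:
    """Hypothetical identifier record: just enough to tell people apart."""
    identifier: str  # opaque illustrative ID, not a real ORCID or ISNI
    name: str


@dataclass(frozen=True)
class Affiliation:
    """Hypothetical record held by a separate service, keyed on the identifier."""
    identifier: str
    institution: str
    start: date
    end: Optional[date] = None  # None means the affiliation is still current


def affiliation_on(records: List[Affiliation], identifier: str,
                   when: date) -> Optional[str]:
    """Return the institution a person belonged to on a given date, if known."""
    for rec in records:
        if rec.identifier != identifier:
            continue
        if rec.start <= when and (rec.end is None or when <= rec.end):
            return rec.institution
    return None


person = ThinIdentifier("id-0001", "A. Researcher")
history = [
    Affiliation("id-0001", "University A", date(2003, 9, 1), date(2008, 8, 31)),
    Affiliation("id-0001", "University B", date(2008, 9, 1)),
]

# A paper dated 2005 resolves to the earlier employer, not the registering one.
earlier = affiliation_on(history, person.identifier, date(2005, 1, 1))
later = affiliation_on(history, person.identifier, date(2010, 1, 1))
```

In this arrangement the identifier record never changes when someone moves institution; only the external affiliation store grows, which is the separation that the “thin layer” view argues for.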
Future Work on Identifiers
One telling discussion on the first day concerned the broader scope of identifier schemes, specifically organisational identifiers. It was quickly agreed that, while this will be a critically important area in future, it is of little use creating organisational identifier schemes when even individual researchers, academics or other authors cannot be uniquely identified. It remains to be seen whether such organisational identifier schemes will be necessary, although this seems likely, and to what extent it will be possible to keep much of the metadata in dispersed stores across institutions rather than overburden the identifier scheme, as was discussed with regard to identifiers for individuals. Unlike ISNI, which is a “top-down” initiative, ORCID represents a “bottom-up” approach where authors make claims or assertions about themselves. In Phase 1 of ORCID, there will only be self-assertions, whereas Phase 2 is planned to include verification by institutions, publishers, funders and other authorities. It could be said that even this verification, although clearly very useful as an added service, substantially broadens the metadata required to make an identifier scheme function effectively.
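The difference between the two phases can be pictured as a claim that starts as a self-assertion and later gathers endorsements. The following is a minimal sketch only: the `Claim` class, its fields and the example values are hypothetical and are not drawn from any ORCID specification.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Claim:
    """Hypothetical statement attached to an identifier."""
    identifier: str
    statement: str
    asserted_by: str = "self"  # Phase 1: the author asserts it themselves
    verified_by: Set[str] = field(default_factory=set)  # Phase 2 endorsements

    def verify(self, authority: str) -> None:
        """Record that an institution, publisher or funder confirms the claim."""
        self.verified_by.add(authority)

    def is_verified(self) -> bool:
        """A claim counts as verified once at least one authority confirms it."""
        return len(self.verified_by) > 0


claim = Claim("id-0001", "author of 'An Example Paper'")
phase_one_status = claim.is_verified()   # self-assertion only
claim.verify("Example University")
phase_two_status = claim.is_verified()   # now endorsed by an authority
```

Even in this toy form, verification adds a second set of metadata (who confirmed what) on top of the claim itself, which illustrates how the required metadata broadens.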
Overall, there was general agreement that it was very useful, if not critical, for a broad coalition of international partners and national interests to set out broad principles and guidance, as agreed at KEDAI, for developers of author and/or researcher identifier schemes to follow. It was further agreed that, although the technical difficulty of producing such a scheme is in fact low, it is nonetheless far from easy to produce one that will succeed in practice, because of the huge range of stakeholders, international governance organisations and interests, both public sector and commercial, that need to be able to use the scheme effectively. This is why previous schemes have not succeeded. Lastly, and most significantly of all, researchers and academics themselves have to see a reason to use any identifier scheme as a necessary and gainful part of their employment, in a way that substantially benefits research and human knowledge but also helps individuals in their daily workflows. The attendees agreed that this, above all, was the key criterion of success.