How does it work?

This tool lets you upload a CSV file containing article identifiers and other metadata, and checks whether the listed articles meet the Wellcome Trust's compliance criteria.

Simply export a CSV from your favourite spreadsheet program (e.g. Excel), and upload it. We will then go off and inspect all the identifiers and gather together your compliance information. You will be able to monitor the progress from your upload page, and download the results at any point during the run, or at the end when it is finished. We will also email you the completed spreadsheet when it is done.

Please note that results are cached for one day. For example, if the licence information for an article you have already run through this tool changes in Europe PMC or on the publisher's website, it may take up to 24 hours before re-running the same article shows the new licence.

Below is detailed documentation on the input spreadsheet format and the format in which you will receive the results.

Input Spreadsheet Format

Spreadsheets should be uploaded according to the Wellcome Trust Master Spreadsheet Format. This means that any of the following column headers may be present:

  • University
  • PMCID
  • PMID
  • DOI
  • Publisher
  • Journal title
  • Article title
  • Publication Date
  • Title of paper (shortened)
  • Author(s)
  • Grant References
  • Total cost of Article Processing Charge (APC), in £
  • Amount of APC charged to Wellcome OA grant, in £ (see comment)
  • VAT charged
  • COST (£)
  • Wellcome grant
  • Licence info
  • Notes

For your records to be processed successfully, AT LEAST one of the following columns needs to be present:

  • PMCID
  • PMID
  • DOI
  • Article title

You may omit any number of columns from the full list, provided that at least one of the above is present.

Any column that is present that does not correspond exactly to one of the above headers will be ignored.

The system will ignore any blank rows in the input sheet. This is usually the right thing to do, since spreadsheet software often exports excess blank lines at the end of the file. However, it will also ignore blank rows anywhere else in the input, so it is strongly recommended that you avoid having any.
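
The input rules above can be sketched in Python as follows. This is only an illustration of the documented behaviour, not the tool's actual code: the function name read_input_sheet and the sample column "Colour" are made up for the example.

```python
import csv
import io

# Column headers from the Wellcome Trust Master Spreadsheet Format, as listed above.
KNOWN_HEADERS = {
    "University", "PMCID", "PMID", "DOI", "Publisher", "Journal title",
    "Article title", "Publication Date", "Title of paper (shortened)",
    "Author(s)", "Grant References",
    "Total cost of Article Processing Charge (APC), in £",
    "Amount of APC charged to Wellcome OA grant, in £ (see comment)",
    "VAT charged", "COST (£)", "Wellcome grant", "Licence info", "Notes",
}

# At least one of these columns must be present.
IDENTIFYING_HEADERS = {"PMCID", "PMID", "DOI", "Article title"}


def read_input_sheet(text):
    """Parse CSV text into a list of dicts, applying the rules described above."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])

    # Keep only columns whose header matches a known header exactly.
    kept = [(i, name) for i, name in enumerate(header) if name in KNOWN_HEADERS]
    if not any(name in IDENTIFYING_HEADERS for _, name in kept):
        raise ValueError("need at least one of: PMCID, PMID, DOI, Article title")

    records = []
    for cells in reader:
        # Skip rows where every cell is empty, wherever they occur.
        if not any(cell.strip() for cell in cells):
            continue
        records.append({name: (cells[i] if i < len(cells) else "") for i, name in kept})
    return records


sheet = "DOI,Colour,Notes\n10.1000/example,red,first\n,,\n\n10.1000/other,blue,\n"
records = read_input_sheet(sheet)
```

Here the unrecognised "Colour" column is dropped and both blank rows are skipped, leaving two records.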

NOTE - The application works best with CSVs which are UTF-8 encoded. Non-UTF-8 encoded sheets (such as those exported by Excel by default) should still work, though you may notice character encoding issues in the results.
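
If your spreadsheet program cannot export UTF-8 directly, you could re-encode the file yourself before uploading. The sketch below assumes the export is in Windows-1252 ("cp1252"), which is common for Excel on Windows but is only an assumption here; check which encoding your own software uses. The function name is illustrative.

```python
def reencode_csv_bytes(raw, source_encoding="cp1252"):
    """Decode bytes from source_encoding and return the same text as UTF-8 bytes."""
    return raw.decode(source_encoding).encode("utf-8")


# Example: a header containing a pound sign, as exported in Windows-1252.
exported = "COST (£)".encode("cp1252")
utf8_bytes = reencode_csv_bytes(exported)
```

To convert a whole file, read it in binary mode, pass the bytes through this function, and write the result to a new file in binary mode.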

Output Spreadsheet Format

When you download your processed spreadsheet, it will include any columns that you uploaded which are part of the Wellcome Trust Master Spreadsheet Format, plus a number of additional compliance analysis columns. Those additional columns can be interpreted as described below.

Note that if you provide any of the columns below yourself, any data that you provide will be overwritten. The only exception is the Article title column - if you supply a title for an article, your title will be used in EPMC and other lookups. Therefore it is not overwritten, so you can clearly see what title was used in gathering compliance information in all the other columns.

Each entry below gives the field name, followed by its description.

PMCID

The Europe PMC identifier for the article. You may have provided this when you uploaded your data; if you didn't, we will have tried to populate it for you. If we find a different PMCID from the one you provided, we will overwrite the data in this column. In virtually all cases this means the PMCID you provided was incorrect, since a different one was found via our lookups, and an article should only ever have one PMCID.

PMID

The PubMed identifier for the article. You may have provided this when you uploaded your data; if you didn't, we will have tried to populate it for you. If we find a different PMID from the one you provided, we will overwrite the data in this column. In virtually all cases this means the PMID you provided was incorrect, since a different one was found via our lookups, and an article should only ever have one PMID.

DOI

The DOI for the article. You may have provided this when you uploaded your data; if you didn't, we will have tried to populate it for you. If we find a different DOI from the one you provided, we will overwrite the data in this column. In virtually all cases this means the DOI you provided was incorrect, since a different one was found via our lookups, and an article should only ever have one DOI.

Publisher

The publisher of the journal in which the article was published. First we attempt to get this information from Crossref, if the article has a DOI that you uploaded or that we found. If we could not determine the publisher from Crossref, and you uploaded (or we discovered) an ISSN, we then try DOAJ journal metadata. If it is not there, we finally try Sherpa Romeo journal policy metadata before giving up.

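
The order of publisher lookups described above can be sketched as a simple fallback chain. The three lookup functions are placeholders passed in by the caller; they stand in for queries to Crossref, DOAJ and Sherpa Romeo and are not real client APIs.

```python
def find_publisher(doi, issn, crossref_lookup, doaj_lookup, romeo_lookup):
    """Return the first publisher name found, or None if every source fails.

    crossref_lookup takes a DOI; doaj_lookup and romeo_lookup take an ISSN.
    Each is a hypothetical callable returning a publisher name or None.
    """
    # 1. Crossref, which needs a DOI.
    if doi:
        publisher = crossref_lookup(doi)
        if publisher:
            return publisher
    # 2. DOAJ, then 3. Sherpa Romeo, both of which need an ISSN.
    if issn:
        for lookup in (doaj_lookup, romeo_lookup):
            publisher = lookup(issn)
            if publisher:
                return publisher
    return None


# Example with stub lookups: Crossref and DOAJ find nothing, Sherpa Romeo does.
result = find_publisher(
    "10.1000/example", "1234-5678",
    crossref_lookup=lambda doi: None,
    doaj_lookup=lambda issn: None,
    romeo_lookup=lambda issn: "Example Press",
)
```
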
Journal title

The name of the journal in which the article was published.

ISSN

The ISSN(s) of the journal in which the article was published. If more than one ISSN is found (e.g. the Print ISSN, Electronic ISSN or Linking ISSN) then they will be presented as a comma-separated list.

Article title

The title of the article. If you upload an article title we will not overwrite it with titles we may discover in Crossref or CORE (in that order of preference).

Publication Date

The date the publisher made the article available (either in print or electronically). The date may contain both the year and month, or only the year, depending on what data is available. If no publication date can be found this column will contain the text "Unavailable".

Electronic Publication Date

The date the publisher made the article available online. The date will be in the form YYYY-MM-DD (e.g. 2011-03-21). If no electronic publication date can be found this column will contain the text "Unavailable".

Author(s)

List of the article's authors. First, we look for this information in Europe PMC. Then we check Crossref. Finally we check CORE.

Fulltext in EPMC?

Does the fulltext of the article appear in some form in Europe PMC? This is equivalent to the Europe PMC metadata record asserting inEPMC: Y.

XML Fulltext?

Is the XML version of the fulltext of the article freely available in Europe PMC? This will be a subset of those articles where the fulltext appears in some form. All articles in the Open Access subset SHOULD have this content available. We check by attempting to fetch http://www.ebi.ac.uk/europepmc/webservices/rest/INSERT_PMCID_HERE/fullTextXML .

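
As a rough illustration of the XML Fulltext? check, the sketch below builds the URL given above and treats a successful fetch as "available". Treating an HTTP 200 response as success and any error as "not available" is an assumption made for the example, as are the function names and the placeholder PMCID.

```python
from urllib.request import urlopen

FULLTEXT_XML_URL = "http://www.ebi.ac.uk/europepmc/webservices/rest/{pmcid}/fullTextXML"


def fulltext_xml_url(pmcid):
    """Build the fulltext XML URL for a given PMCID."""
    return FULLTEXT_XML_URL.format(pmcid=pmcid)


def has_xml_fulltext(pmcid, opener=urlopen):
    """Return True if the fulltext XML can be fetched, False otherwise.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    check can be exercised without network access.
    """
    try:
        with opener(fulltext_xml_url(pmcid), timeout=30) as response:
            return response.status == 200
    except OSError:  # urllib's URLError and HTTPError are subclasses of OSError
        return False


url = fulltext_xml_url("PMC1234567")  # placeholder PMCID, not a real article
```
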
Author Manuscript?

Is the copy of the paper in Europe PMC the Accepted Author Manuscript (AAM)? If the XML fulltext is present in Europe PMC then this information will be taken from there; otherwise we page-scrape the Europe PMC web page for the article to detect whether it is the author manuscript.

Ahead of Print?

Some journals release articles “ahead of print” before assigning them to an issue and depositing them in Europe PMC. If the full text of an article is not in Europe PMC this field will tell you whether the article has only been published ahead of print (TRUE), or whether it has been assigned to an issue (FALSE). If the full text is in Europe PMC this will be "not applicable". If we are unable to determine the status of the article this will be "unknown". We determine the Ahead of Print? information by looking at PubMed metadata (so the article must have a PMID that you supply to us or that we discover via other identifiers you have supplied). Currently we follow http://www.ncbi.nlm.nih.gov/pubmed/INSERT_PMID_HERE?report=xml and we look for a "PublicationStatus" element with a value "aheadofprint".

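
The Ahead of Print? logic might look something like the sketch below. The sample XML is a heavily simplified, made-up fragment rather than a real PubMed record, and returning "unknown" when no "PublicationStatus" element is found is an assumption about how the undetermined case arises.

```python
import xml.etree.ElementTree as ET


def ahead_of_print(fulltext_in_epmc, pubmed_xml):
    """Return "TRUE", "FALSE", "not applicable" or "unknown" as described above."""
    if fulltext_in_epmc:
        return "not applicable"
    if not pubmed_xml:
        return "unknown"  # no PubMed metadata to inspect
    try:
        root = ET.fromstring(pubmed_xml)
    except ET.ParseError:
        return "unknown"
    statuses = [el.text.strip() for el in root.iter("PublicationStatus") if el.text]
    if not statuses:
        return "unknown"  # assumption: a missing element means we cannot tell
    return "TRUE" if "aheadofprint" in statuses else "FALSE"


# Simplified, illustrative metadata fragment.
sample = (
    "<PubmedArticle><PubmedData>"
    "<PublicationStatus>aheadofprint</PublicationStatus>"
    "</PubmedData></PubmedArticle>"
)
status = ahead_of_print(False, sample)
```
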
Open Access?

Is the article in the Europe PMC Open Access subset? This is equivalent to the Europe PMC metadata record asserting isOpenAccess: Y.

Journal Type

Whether this is a hybrid or pure Open Access journal. Determined by whether the journal is present in the DOAJ.

Correct Article Confidence

What level of confidence does the system have that it has successfully identified and analysed the correct article?

Since there are different ways we might identify the article, we may not always be 100% certain that we have found the correct one. Values in this column are:

  • 1 - we are certain this is the right article for the identifiers provided. This is because we found an exact match to the PMCID, PMID or DOI
  • 0.9 - we are almost certain this is the right article, as we identified it by exact title match, and there was only one result
  • 0.7 - we are pretty sure this is the right article, as we identified it with a title keyword search, and there was only one result

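
The confidence values above amount to a small lookup from the way the article was matched to a score. In this sketch the match-type labels ("identifier", "exact_title", "title_keywords") are invented for illustration; only the three numeric values come from the list above.

```python
# Hypothetical labels for how the article was identified, mapped to the documented scores.
CONFIDENCE_BY_MATCH = {
    "identifier": 1.0,      # exact match on PMCID, PMID or DOI
    "exact_title": 0.9,     # exact title match with a single result
    "title_keywords": 0.7,  # title keyword search with a single result
}


def correct_article_confidence(match_type):
    """Return the confidence score for a match type, or None if it is not recognised."""
    return CONFIDENCE_BY_MATCH.get(match_type)


score = correct_article_confidence("exact_title")
```
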
Standard Compliance?

Does the result of the analysis meet Wellcome's Standard Compliance criteria?

These criteria are:

  • IF full-text is in Europe PMC AND it is an author manuscript THEN compliance = TRUE
  • IF full-text is in Europe PMC AND the licence in Europe PMC is CC BY or CC0 THEN compliance = TRUE
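
The two Standard Compliance rules above can be written as a short function. This is a sketch only: the licence strings "CC BY" and "CC0" are taken from the rules as written and may not match how licences appear in the output columns, and returning False when neither rule applies is an assumption.

```python
ACCEPTED_LICENCES = {"CC BY", "CC0"}


def standard_compliance(in_epmc, is_author_manuscript, epmc_licence):
    """Apply the two Standard Compliance rules; anything else is treated as non-compliant."""
    if in_epmc and is_author_manuscript:
        return True
    if in_epmc and epmc_licence in ACCEPTED_LICENCES:
        return True
    return False


# An article whose full text is in Europe PMC under CC BY, not an author manuscript.
compliant = standard_compliance(True, False, "CC BY")
```
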
Deluxe Compliance?

Does the result of the analysis meet Wellcome's Deluxe Compliance criteria?

These criteria are:

  • IF full-text is in Europe PMC AND it is an author manuscript THEN compliance = TRUE
  • IF full-text is in Europe PMC AND the licence in Europe PMC is CC BY or CC0 AND the article is in the open access subset THEN compliance = TRUE
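
The Deluxe Compliance rules differ from the Standard ones only in the licence rule, which additionally requires the article to be in the open access subset. A sketch, with the same assumptions as before about the licence strings and about returning False when neither rule applies:

```python
ACCEPTED_LICENCES = {"CC BY", "CC0"}


def deluxe_compliance(in_epmc, is_author_manuscript, epmc_licence, in_open_access_subset):
    """Apply the two Deluxe Compliance rules; anything else is treated as non-compliant."""
    if not in_epmc:
        return False  # both rules require the full text to be in Europe PMC
    if is_author_manuscript:
        return True
    return epmc_licence in ACCEPTED_LICENCES and bool(in_open_access_subset)


# CC BY in Europe PMC but outside the open access subset: not Deluxe compliant.
deluxe = deluxe_compliance(True, False, "CC BY", False)
```
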
Publisher Licence

The licence, if any, that we were able to detect for the article on the publisher's web page. This will be "not applicable" if the licence was found elsewhere, or "unknown" if the licence could not be detected on the publisher's site. Otherwise it will contain the name of the licence found.

EPMC Licence

The licence, if any, that we were able to detect for the article in Europe PMC. If this field shows "unknown", you should also check the Publisher Licence column for information.

EPMC Licence Source

Where in Europe PMC did we find the licence information contained in the EPMC Licence column?

  1. epmc_xml_permissions - Detected in the Europe PMC Fulltext XML under the permissions section
  2. epmc_xml_outside_permissions - Detected elsewhere in the Europe PMC Fulltext XML
  3. epmc_html - Detected on the Europe PMC web page (via page scraping)
  4. unknown - We were unable to determine a licence for this article
Grant {X}

Grant Number associated with this article.

This is a repeated column, where {X} is a number used to identify the column, and to associate it with the related "Agency {X}" and "PI {X}" columns.

Agency {X}

Agency responsible for funding this article.

This is a repeated column, where {X} is a number used to identify the column, and to associate it with the related "Grant {X}" and "PI {X}" columns.

PI {X}

The Principal Investigator of the grant associated with this article.

This is a repeated column, where {X} is a number used to identify the column, and to associate it with the related "Grant {X}" and "Agency {X}" columns.
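
To show how the repeated "Grant {X}", "Agency {X}" and "PI {X}" columns fit together, here is a sketch that flattens a list of grants into numbered columns. Numbering from 1, the dictionary keys and the sample values are all assumptions made for the example.

```python
def grant_columns(grants):
    """Flatten a list of grants into numbered Grant/Agency/PI columns.

    Each grant is a dict with the hypothetical keys "grant", "agency" and "pi".
    """
    row = {}
    for x, grant in enumerate(grants, start=1):  # assumption: {X} starts at 1
        row[f"Grant {x}"] = grant.get("grant", "")
        row[f"Agency {x}"] = grant.get("agency", "")
        row[f"PI {x}"] = grant.get("pi", "")
    return row


# Made-up grants for illustration.
columns = grant_columns([
    {"grant": "012345", "agency": "Wellcome Trust", "pi": "Smith A"},
    {"grant": "ABC/1", "agency": "Example Council", "pi": "Jones B"},
])
```

The shared number is what ties each grant to its agency and principal investigator: "Grant 2", "Agency 2" and "PI 2" all describe the same grant.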

Compliance Processing Output

This field contains detailed logging and provenance information for the processing run, allowing you to understand on a case-by-case basis how the decisions and analysis were made.