Frequently Asked Questions
The following lines provide answers to some frequently asked questions about different aspects of the reproducibility checks and our data and code availability policy.
Scope of Reproducibility Checks at EJ
- What is the exact nature of the reproducibility checks carried out at the Economic Journal?
Answer
The purpose of the reproducibility checks carried out at the Economic Journal is to verify three aspects of the replication package: (i) it is complete, in the sense of producing each table, figure, and in-text number in the paper and its appendices, including those online; (ii) it is self-contained, in the sense of not requiring a subprogram or module not included in the package; and (iii) the data and code are adequately documented for other researchers to be able to use them to replicate the results in the paper. When the data are accessible (included in the package or, in case of exemptions, via temporary access by the reproducibility team), the checks ensure that the code exactly reproduces the results in the paper and its appendices. In the case of a data exemption, authors may provide simulated or synthetic data to check that the code runs and produces all output, but the exact results cannot be checked.
Reproducibility checks (not replication checks) are conducted. This means that our checks do not screen for coding errors, discrepancies between what the paper claims the code does and what it actually does, econometric errors, or whether the empirical approach followed in the paper can be reproduced in other environments or other datasets. - Are the reproducibility checks implemented on online appendices?
Answer
Yes, the replication package should produce each table, figure, and in-text number in the paper and its appendices, including those online. All these codes are checked for their ability to produce the results in the paper and appendices. - Why is the Economic Journal running reproducibility checks? Why not replication checks?
Answer
We firmly believe that reproducibility and replicability are the main pillars of science. The nature of replication checks requires time, effort, and resources that journals typically do not have: the publication process should be speedy for science to advance at the right pace. Our reproducibility checks provide a necessary first step: to ensure that authors publish all available data and the codes that generate the results they present in the papers we publish, and, importantly, to check that these codes and data run and produce the published results. The certification that we provide enhances transparency, since it assures that other researchers can reproduce the published research and test it against other datasets, assumptions, methods, etc. It also provides an additional service to the authors, as we often detect small errors that are better amended before publication than in an erratum afterwards.
Data and Code Availability Policy and Exemptions
- My paper uses publicly available data. Is it enough to indicate how to get them or should I provide my datasets as part of the replication package?
Answer
Even publicly available data should be included in the replication package to ensure they remain available in the future for anyone who wants to replicate your results. The only exception is when your exact extract is published in a "trusted" repository (see the following list for guidance) with a permanent DOI. This is important, because datasets are often updated (or removed) by the provider, and your version of the data may no longer be available to researchers in the future. - My paper uses publicly available data. Does it imply that I certainly have the right to re-publish my dataset along with the replication package? If not, how can I obtain permission to publish the data?
Answer
Each provider offers a different policy regarding re-distribution of original and transformed datasets. Some providers, for example, allow re-distribution as long as your extract is deposited in a specific repository. You should check which restrictions apply to publishing your data before the first submission. You should also seek permission from the original owner of the data to publish them, and cite the original source accordingly. - Can I request an exemption to publish my data?
Answer
Yes, you can request an exemption on the grounds that the data are restricted-access. The request should be made at the time of initial submission, in a cover letter addressed to the Editor. The Editor in charge of your submission will determine whether your request is justified before submitting the paper to referees. If the Editor decides against the exemption, the manuscript will not be sent to referees, and you will be asked to accept the data and code availability policy; otherwise, the paper will be rejected. Submission fees will not be returned in that case. When an exemption is needed for a dataset that is incorporated into the analysis during the editorial process, the exemption should be requested at the first iteration in which the new data are incorporated. - Can I request an exemption that affects only a part of my data?
Answer
Yes, provided that the request is made at the time of initial submission. - If my main dataset is available to publish, but there is a small portion of my data that I am not allowed to share, should I request a data exemption?
Answer
Yes. If you do not request a data exemption at the time of your first submission, you will be required to publish all the data used in your paper. - If the data I am not allowed to share are used only in the online appendix, should I request a data exemption?
Answer
Yes. The data to produce all results in the paper and appendices, including those online, should be shared unless an exemption is requested and granted at the time of first submission. - If the data I use are publicly available to everyone, but I do not have permission to re-publish it, should I request a data exemption?
Answer
Yes. Unless you are granted an exemption at the time of first submission, you will be required to publish in the replication package all data needed to produce all results in the paper and appendices, including those online. - Can I request an exemption later than the initial submission?
Answer
In general, no. Later exemptions can only be requested for new data that are incorporated into the analysis during the editorial process. If your data cannot be published and you did not request the exemption at the time of initial submission, your paper may be rejected for publication at the Economic Journal. - If my data are free of charge and available to any researcher who requests them from the data provider, but I don’t have the right to publish them with the replication package, should I request an exemption?
Answer
Yes. Whenever the data used for the analysis in the paper cannot be published with the replication package (or in an open-access "trusted" repository, see the following list for guidance on what constitutes a "trusted" repository), an exemption needs to be requested at the time of first submission. An exemption is not required only if the exact extract that was used in the study is published in the repository and is readily available in the exact format that is called by the code. - Some data providers only allow authors to distribute the data in specific open repositories (for example, the Panel Study of Income Dynamics only allows the data to be distributed through the OpenICPSR Repository). Do I need to request an exemption in such cases?
Answer
No. Data archived in "trusted" open repositories (see the following list for guidance) is acceptable in the replication package provided what is published is the extract that was used in the study and it is readily available in the exact format that is called by your code. The Data Editor will evaluate the suitability of the repository. - If I published my data in an open repository, do I need to include it in the replication package?
Answer
Data archived in "trusted" open repositories (see the following list for guidance) is acceptable in the replication package provided what is published is the extract that was used in the study and it is readily available in the exact format that is called by the code. The Data Editor will evaluate the suitability of the repository and whether or not there is the need of publishing a copy with the package on the journal’s repository. - If I publish my replication package on my website (or similar), do I need to submit a replication package?
Answer
Yes. Personal websites are not considered "trusted" open repositories, because there is no guarantee that the package will be systematically archived. See the following list for guidance on what constitutes a "trusted" repository. - Can I request an exemption to publish my data because I collected these data and I want to keep exclusivity rights for future research?
Answer
No. The goal of our data and code availability policy is to ensure transparency and reproducibility of research, and this requires publishing the data you collected. If others can use your data, your research will gain visibility. - Can I apply for data exemption if my data come from a commercial data provider (Datastream, Orbis, …)?
Answer
Yes. Restricted access data is generally discouraged, but when the nature of your research largely relies on a specific dataset and cannot be conducted on an open alternative, those data are eligible for an exemption. However, you may be requested to provide a certification from the provider indicating that the data will be archived and made available to other users following the same procedure to request access to it. - Can I request an exemption for the experimental data I collected?
Answer
In general, no. Data should be anonymized to ensure that subjects cannot be identified. Only when the nature of the study impedes such anonymization can the authors request a data exemption, which will cover only the minimum required to ensure the anonymity of the experimental subjects. - Can I request an exemption to publish my code?
Answer
No. - Can I use any software, proprietary and open source?
Answer
Yes. Open source software is encouraged, but licensed software is allowed. If the authors use software that is rather uncommon and requires special licenses, we ask for their cooperation to find a solution (which, in extreme cases, might entail giving our replicators remote access to the authors’ machine). - Do I need to publish packages and libraries that are used by my code but not part of the standard distribution of the software used?
Answer
Whenever possible, yes. If these packages or libraries are available in open repositories (e.g. most Stata packages), a clear indication on how to download and use them is sufficient. If the libraries cannot be included in the packages and are not publicly available, the Data Editor will be in contact with the authors to coordinate on a feasible way to implement the checks.
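One possible way to give "a clear indication on how to download and use" external packages is to ship a small check script alongside the code. The following is only a minimal sketch in Python under stated assumptions, not something the journal prescribes: the dictionary REQUIRED, the package name hypothetical_pkg, and the install hints are placeholders to be replaced by the packages your own code needs.

```python
import importlib.util

# Hypothetical list of dependencies: module name -> how to obtain it.
# Replace these placeholder entries with the packages your code actually uses.
REQUIRED = {
    "json": "part of the Python standard library",
    "hypothetical_pkg": "pip install hypothetical_pkg",
}


def missing_dependencies(required):
    """Return the subset of `required` whose top-level modules cannot be found."""
    return {
        name: hint
        for name, hint in required.items()
        if importlib.util.find_spec(name) is None
    }


if __name__ == "__main__":
    # Print one line per missing package, with the install hint from REQUIRED.
    for name, hint in missing_dependencies(REQUIRED).items():
        print(f"Missing package '{name}'. To obtain it: {hint}")
```

Running the script before the main code tells a replicator which packages still need to be installed; the same list can be copied into the ReadMe file.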
Procedures when Exemptions Are Granted
- If I was granted a data exemption, how should I proceed with the replication package?
Answer
If you were granted a data exemption, your paper would still need to go through reproducibility checks before final acceptance. In order to do so, you can either (i) grant temporary (distance or physical) access to the data to the reproducibility team for the sole purpose of the checks (the data will be destroyed or access terminated after the checks), or (ii) supply simulated or synthetic dataset(s) instead of the one(s) used in the analysis. - What is the difference between simulated data and synthetic data?
Answer
A simulated dataset is generated by a model (ideally, your model). A synthetic dataset is a scrambling or perturbation of the actual dataset to ensure anonymity. - Is it better to provide temporary access to the restricted data or to provide a simulated/synthetic dataset?
Answer
Whenever feasible, we strongly recommend providing temporary access to restricted data. There are numerous advantages of this approach: (i) it saves the effort of producing synthetic or simulated datasets; (ii) the certification provided by the journal is stronger in the sense that we certify that we have been able to reproduce the results published in the paper as opposed to only checking that the code is complete, runs, and produces output for all tables, figures, and in-text numbers published in the article and its printed and online appendices; (iii) we can detect if the results cannot be reproduced, which gives the authors a chance to fix any errors before publication. - What is the procedure followed by the Economic Journal when I supply restricted datasets for the sole purpose of the reproducibility checks?
Answer
The reproducibility team will treat the data with the highest ethical standards, preventing any violations of confidentiality, and using them exclusively to run the reproducibility checks. The restricted datasets will be destroyed as soon as the checks are performed and, therefore, they will not be published. - What shall I do if I am not allowed to provide temporary access to the confidential data, but the data provider can run the code to implement the reproducibility checks?
Answer
Even if you cannot provide direct access to the reproducibility team, this option is preferred to the simulated/synthetic dataset alternative as long as the checks can be executed in a reasonable amount of time. In this case, you need to supply the replication package to the journal, together with the contact details of the data provider. The reproducibility team will send the code to the provider and the provider will send the output back to the team, who will check the results. - What can I do if I am not allowed to provide temporary access to the confidential data, but a certification agency (e.g. cascad) can run the code on the original data source?
Answer
This option is still generally preferred to the simulated/synthetic dataset alternative. However, you should seek approval from the Data Editor before making any commitments with the certification agency. Note that the Economic Journal will NOT be able to cover the cost of certification. - If my restricted-access data provider has a public use testing sample (smaller sample, or perturbed dataset), can I provide this sample instead of a simulated/synthetic dataset?
Answer
If this option is available, it is generally preferred to the simulated/synthetic dataset (but less preferred than providing temporary access to the original data) as long as the testing sample can be published with your package. Otherwise, a simulated/synthetic dataset that can be published with the package is preferred. - What is the procedure followed by the Economic Journal if I supply simulated/synthetic datasets?
Answer
The simulated/synthetic dataset will be published with the replication package. Even if these are not the real data, their structure, which by design will largely mimic the actual dataset, will give readers a better sense of your data. Please make sure the manipulations used to produce the synthetic/simulated datasets are described in the ReadMe file. - Why am I requested to supply simulated/synthetic data?
Answer
Our view is that, when reproducibility checks cannot be performed on real data, there is still an advantage of running them on such simulated/synthetic datasets: they are still useful to make sure the code is complete and self-contained, and that it runs without errors. - My article estimates a non-linear model. The algorithm does not converge with randomly generated data. What shall I do?
Answer
In this case, we strongly recommend simulating data using your model as the data generating process. If that is not feasible, please contact the Data Editor explaining in detail why this is the case. The Data Editor will assist you in the process and, if needed, will make a proposal to your original Editor about how to handle the situation. - How do I decide whether to produce a simulated or a synthetic dataset?
Answer
In order to generate a dataset that mimics the same characteristics as the original one, the synthetic option may be easier. There are many open source routines that do it for you. However, there are also two main disadvantages: (i) you need to make sure that your scrambling/perturbation algorithm ensures correct anonymization of the data; and (ii) non-linear estimation routines may not converge on synthetic data, whereas they are more likely to converge in an artificial dataset generated by the model that you are estimating. - How should I produce a synthetic dataset?
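As a rough illustration of the two options discussed above, and not a procedure prescribed by the journal, the sketch below contrasts them in Python. It assumes a small, non-empty, purely numeric dataset held as a list of dictionaries. The function names make_synthetic and make_simulated, the noise_share parameter, and the linear model used as the data generating process are all hypothetical. Shuffling each column independently does not by itself guarantee anonymization, and it destroys the correlations between variables, which is one reason non-linear routines may fail to converge on such data.

```python
import random
import statistics


def make_synthetic(rows, noise_share=0.1, seed=0):
    """Perturb an actual dataset: shuffle each column, then add small noise.

    `rows` is a non-empty list of dicts with numeric values (an assumption of
    this sketch). The result has the same columns and number of rows.
    """
    rng = random.Random(seed)
    names = list(rows[0].keys())
    perturbed = {}
    for name in names:
        values = [row[name] for row in rows]
        shuffled = values[:]
        rng.shuffle(shuffled)  # breaks the link between a value and its subject
        spread = statistics.pstdev(values)
        perturbed[name] = [
            value + rng.gauss(0.0, noise_share * spread) for value in shuffled
        ]
    return [{name: perturbed[name][i] for name in names} for i in range(len(rows))]


def make_simulated(n, beta0, beta1, sigma, seed=0):
    """Simulate n observations from a hypothetical model y = beta0 + beta1*x + e."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = beta0 + beta1 * x + rng.gauss(0.0, sigma)
        rows.append({"x": x, "y": y})
    return rows
```

In this sketch, make_synthetic needs the real data as input, whereas make_simulated needs only parameter values, for instance the estimates reported in the paper. Fixing the seed makes either dataset reproducible, and the steps used should be described in the ReadMe file.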
Implementation of the Reproducibility Checks
- How long do the reproducibility checks take?
Answer
We usually provide the outcome of our reproducibility checks in less than two weeks. If the package is not complete or the code does not run, more than one iteration may be required, in which case the processing time might be increased. Articles that require a relatively long running time may take longer. The processing time also depends on how responsive the authors are to our requests. - How do the reproducibility checks work?
Answer
The reproducibility checks are handled by our Data Editor and our reproducibility team: a team of advanced Ph.D. students who have been hired to carry out the checks under the supervision of the Data Editor. Once an article is conditionally accepted for publication at the Economic Journal, the authors are requested to submit the replication package along with other production files. Upon submission, the Data Editor assigns the package to one or several members of the reproducibility team. The reproducibility team provides the Data Editor with a report summarizing the outcome of the checks. After reviewing it, the Data Editor informs the authors of the outcome of the reproducibility checks and, if needed, asks them to amend the package. Once the reproducibility checks are completed, the article is transferred back to the original Editor, who is in charge of final acceptance. If results in the paper need to be modified as a result of the checks, the original Editor in charge will be responsible for approving these changes before acceptance. If these changes imply a modification of the message of the paper, the original Editor can decide to reject the paper. Final acceptance is conditional on full reproducibility. - Will the Economic Journal run my code?
Answer
Yes. Upon submission, the Data Editor assigns the package to one or several members of the reproducibility team, who will run your code and check the output generated. The reproducibility team provides the Data Editor with a report summarizing the outcome of the checks. In some instances, the code is too demanding to be run in a reasonable amount of time. In such cases, the Data Editor will be in contact with you with a recommendation for supplying a simplified version of the code that allows testing the essential parts of the code. - What happens if my code is highly demanding computationally?
Answer
If the code is too demanding to be run in a reasonable amount of time, the Data Editor will be in contact with you with a recommendation for supplying a simplified version of the code that allows testing the essential parts of the code. For example, this can entail a reduced number of replications of a simulation exercise, the code that solves a structural model for a given set of parameters, a simplified function to test an optimization routine, etc. Such a simplified "testing" version will be published along with the original code in your replication package. This is so because we believe that these testing versions are extremely useful for other researchers that want to understand and use your code for replication or their related research, enhancing transparency and increasing the visibility of your research. - What happens if the results fail to reproduce?
Answer
If the data and code that you provided fail to reproduce the results in the paper, the Data Editor will be in contact with you to identify the source of the discrepancy. Once the reproducibility checks are completed, if the discrepancy implies a change in the results presented in the paper or online appendices, even if minor, the Data Editor will notify the original Editor in charge. The Editor in charge will be responsible for approving these changes before acceptance. If these changes imply a modification of the message of the paper, the original Editor can decide to reject the paper. Final acceptance is conditional on full reproducibility. - What happens if the replication package I provided is not complete?
Answer
The Data Editor will be in contact with you indicating the amendments and additions that need to be done to the replication package to pass the reproducibility checks. Once amended, the revised package will go through the checks again. - What happens if the file I provided is not complete?
Answer
The Data Editor will be in contact with you indicating the amendments and additions that need to be done to the replication package to pass the reproducibility checks. Once amended, the revised package will go through the checks again. - Why do I need to resubmit the entire package (instead of only the revised part of it) when I incorporate the feedback received from the Data Editor and the reproducibility team?
Answer
We need you to submit the entire package again because updating the replication package ourselves increases the potential risk that the files you intend to submit for possible publication may be mishandled.
Content of the Replication Package
- What should be included in the replication package? Please see here.
- How do I provide physical access to the replication team to my restricted-access data when I have been granted a data exemption?
Answer
Whenever possible, the easiest way is to provide a physical copy of your data by including it in a separate folder labeled "4 Confidential data not for publication" outside of the replication package. All replicators and the Data Editor have signed confidentiality agreements that prevent them from using the data for any purpose other than the reproducibility checks. When that option is not feasible, we recommend that you contact our Data Editor to arrange the best way to provide access to the reproducibility team. - Why do I need to submit a signed checklist?
Answer
To ensure that you do not forget any element of the replication package. This avoids repeated iterations and speeds up the process. - Should I respect the folder structure dictated by the checklist, or is it only for orientation?
Answer
Yes, you should, and it is very important to do so. When submitted to production, your package is handled by different people at the Economic Journal and at the publisher, not all of them familiar with data and code. Respecting the folder structure ensures that your package is published correctly. - What information should be included in the ReadMe file? Please see here. - Should I submit the raw data files and the code that generates my final dataset from them?
Answer
Yes, this is requested by our Data and Code Availability Policy. - Why do I need to supply all text documents (ReadMe, IRB, etc.) in PDF format?
Answer
The PDF format is portable, which means that it can be transferred without having to worry about dependencies, fonts, etc. This ensures readability across platforms and users. - Why do I need to include a copy of all datasets in non-proprietary format (ASCII, csv, etc.)?
Answer
Some users of your replication package may not have access to the specific proprietary software that you used for your study. A copy in non-proprietary format ensures that they can access your data without problems. It also minimizes compatibility issues (e.g., old versions of Stata cannot open files saved by newer versions).
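A plain-text copy can usually be produced with a few lines of code. The sketch below is one way to do it in Python with the standard library only, assuming the dataset has already been loaded into memory as a non-empty list of dictionaries; reading the proprietary file itself requires the original software or a suitable reader library and is not shown. The function names rows_to_csv_text and export_to_csv and the example column names are hypothetical.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize a non-empty list of dicts to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def export_to_csv(rows, path):
    """Write the CSV text to `path` as UTF-8, leaving line endings untouched."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_csv_text(rows))


if __name__ == "__main__":
    # Hypothetical two-row dataset with placeholder column names.
    example = [{"id": 1, "wage": 10.5}, {"id": 2, "wage": 12.0}]
    print(rows_to_csv_text(example))
```

Variable labels and value labels stored in a proprietary file are not carried over by a plain CSV export, so it may help to document them in the ReadMe file.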
Data Citations
- What data should I cite?
Answer
All datasets used in the paper (with no exceptions) should be cited both in the paper and in a dedicated section of the ReadMe file. - If I mention my datasets in the Online Appendix or in the ReadMe file, should I cite them?
Answer
Yes, all datasets used in the paper (with no exceptions) should be listed in the references section of the paper in the same way that we cite other papers, and a copy of these citations should appear in a dedicated section of the ReadMe file. - How should I cite my data?
Answer
You should cite all datasets used in the paper (with no exceptions) in the references section of the paper in the same way that we cite other papers, and a copy of these citations should appear in a dedicated section of the ReadMe file. You can find some examples on page 7 of this document. More specific guidance on data citations is available here. - Why should I cite my data?
Answer
Data citations are as fundamental as citations to other papers, if not more. Giving proper credit to data providers is in line with all scientific ethical standards. Moreover, giving proper credit to data providers ensures that they can keep receiving external funding to make their datasets publicly available for research.
Reproducibility Certification, Publication of the Replication Package and Copyright Issues
- What kind of certification do we provide for papers that were checked for reproducibility?
Answer
The empirical/simulation/experimental papers that we checked include the following statement: "The data and codes for this paper are available at […]. They were checked for their ability to reproduce the results presented in the paper." This statement is adjusted accordingly when data exemptions are granted (acknowledging either that the authors provided temporary access to the confidential data or that the checks were implemented on simulated/synthetic data provided by the authors). In particular, we either certify "The authors were granted an exemption to publish their data because access to the data is restricted. However, the authors provided a simulated or synthetic dataset that allowed the Journal to run their codes. The synthetic/simulated data and codes are available at […]. They were checked for their ability to generate all tables and figures in the paper, however, the synthetic/simulated data are not designed to reproduce the same results." or "The authors were granted an exemption to publish their data because access to the data is restricted. However, the authors provided the Journal with temporary access to the data, which allowed the Journal to run their codes. The codes are available at […]. The data and codes were checked for their ability to reproduce the results presented in the paper.", depending on the case that is applicable. These statements are combined accordingly when more than one situation applies. The statements are also adjusted when the nature of the algorithms is highly demanding, and a partial/simplified version of the code has been used for the reproducibility checks: we add the sentence "Given the highly demanding nature of the algorithms, the replication checks were run on a simplified version of the code, which is also available at […]" to the applicable statement. - Where will the replication package be published?
Answer
After all reproducibility checks are completed, you will be requested by the Editorial Office to publish your checked package at the Economic Journal’s community at Zenodo. Zenodo will assign your package a Digital Object Identifier (DOI), which will then be linked with your publication. - Do I keep the copyright of my package?
Answer
Yes. One of the main advantages of publishing the package at the Economic Journal’s community at Zenodo is that you are solely responsible for, and the copyright owner of, that publication. Therefore, it is important that you ensure that you have permission to publish your data before the time of first submission, and that you request an exemption then if you do not. - Why are packages published at the Economic Journal’s community at Zenodo instead of with the article on the Journal’s website?
Answer
There are many advantages from publishing the package at the Economic Journal’s community at Zenodo. Some of them are: (i) the author retains the copyright on the replication package, (ii) by having a specific DOI, it increases the visibility of all packages and, in turn, it increases the visibility of your article, and (iii) it makes it easier to cite. - Can I post my replication material on other sites?
Answer
Yes, as long as one copy is published at the Economic Journal’s community at Zenodo. The only exception is when your replication package is published in a "trusted" repository (see the following list for guidance) with a permanent DOI. In that case, your DOI can be used to link your article with your package, and the Data Editor can waive the requirement to publish the package at the Economic Journal’s community at Zenodo. However, publishing your package at the Economic Journal’s community at Zenodo is recommended, because it increases the visibility of your package. - All data is publicly available. Do I still need to get permission from the website owner to post data on Economic Journal’s publisher website?
Answer
Each provider offers a different policy regarding re-distribution of original and transformed datasets. Some providers, for example, allow re-distribution as long as your extract is deposited in a specific repository. You should check which restrictions apply to publishing your data before the first submission. You should also seek permission from the original owner of the data to publish them, and cite the original source accordingly. You will be responsible for any copyright infringement arising from what you publish with the replication package at the Economic Journal’s community at Zenodo. - Can I retrospectively include a replication package as supplementary online material to my Economic Journal article?
Answer
Yes. Please address your request to the Editorial Office at ej@res.org.uk