RAGE-redcap Data and Sequence Import Instructions
This guide provides step-by-step instructions for preparing and importing metadata and sequence data into RAGE-redcap.
1. Metadata Import
Download Import Template
- Go to Applications > Data Import Tool in REDCap.
- Click Download Your Data Import Template to get the Excel template.
Complete Template
- Fill in the template using the latest data dictionary as reference.
- Adhere to approved choices for dropdown or radio fields; only these will be accepted by the database.
- Not all fields are relevant for every user — complete what applies.
- The latest data dictionary can be obtained:
  - From REDCap: Project Home > Design > Download Data Dictionary
  - From GitHub: RAGE REDCap Data Dictionary
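Checking a column of the completed template against the approved choices can be sketched as below. This is a minimal illustration, not part of the RAGE-redcap tooling: it assumes the choices are written in the usual REDCap data dictionary form (`code, label | code, label`) and that the template holds the coded values. The function names and the example field are hypothetical.

```python
def parse_choices(choices: str) -> dict[str, str]:
    """Parse a choices string such as '1, Yes | 2, No' into {code: label}."""
    parsed = {}
    for item in choices.split("|"):
        item = item.strip()
        if not item:
            continue
        # Split on the first comma only, because labels may contain commas.
        code, _, label = item.partition(",")
        parsed[code.strip()] = label.strip()
    return parsed


def invalid_values(values, choices: str) -> list[str]:
    """Return the non-empty values that are not approved codes for the field."""
    allowed = parse_choices(choices)
    return [v for v in values if v.strip() and v.strip() not in allowed]


# Hypothetical dropdown field with three coded options.
example_choices = "1, Positive | 2, Negative | 3, Inconclusive, repeat needed"
print(invalid_values(["1", "3", "", "Positive", "4"], example_choices))
# prints ['Positive', '4']
```

Blank cells are skipped here because not every field applies to every user.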
Prepare Metadata
- Save the completed template with a suitable filename.
- Use the rabvRedcapProcessing R tool to prepare the data for REDCap import: rabvRedcapProcessing GitHub
- Run the checkScripts.R script to process your imported sheet and generate sequencing and diagnostic forms ready for REDCap.
  - Edit only the input file paths to match your data.
- Visually inspect the outputs.
Important Notes
- Duplicate sequencing events: Carefully check your import list against the latest REDCap dataset to identify any repeat sequencing events. Manually adjust the redcap_repeat_instance for these repeats and remove duplicates from the diagnostic sheet to prevent overwriting existing records. Assign repeat instances in the new data consistently to ensure accurate tracking and integration.
- Review REDCap import warnings carefully; data that may be overwritten is highlighted in red.
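One way to assign repeat instances consistently is to continue numbering from the highest instance already stored for each sample. The sketch below illustrates that idea only; it is not what checkScripts.R does. The function name and sample IDs are hypothetical, and it assumes every row in the new list is a genuinely new sequencing event (true duplicates should be removed first).

```python
def assign_repeat_instances(existing_max, new_sample_ids):
    """Assign the next free redcap_repeat_instance to each new sequencing row.

    existing_max: dict mapping sample ID -> highest instance already in REDCap.
    new_sample_ids: sample IDs of the new rows, in import order.
    Returns a list of (sample_id, instance) tuples.
    """
    next_instance = {sid: highest + 1 for sid, highest in existing_max.items()}
    assigned = []
    for sid in new_sample_ids:
        instance = next_instance.get(sid, 1)  # unseen samples start at 1
        assigned.append((sid, instance))
        next_instance[sid] = instance + 1
    return assigned


# Hypothetical data: S001 and S002 already have sequencing records in REDCap.
existing = {"S001": 1, "S002": 2}
new_rows = ["S001", "S003", "S003", "S002"]
print(assign_repeat_instances(existing, new_rows))
# prints [('S001', 2), ('S003', 1), ('S003', 2), ('S002', 3)]
```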
Import Metadata
- Go to Applications > Data Import Tool.
- Select your import file (one at a time for diagnostic and sequencing forms).
- Leave other settings as default and click Upload File.
- Verify the import summary and confirm.
2. Sequence Data Import
Prepare Environment
Before running the sequence import scripts, set up the conda environment included in this repository.
- Create the environment from the provided environment.yml file: conda env create -f environment.yml
- Activate the environment: conda activate rage-redcap
- Verify installation: python --version
Prepare Sequences
- Ensure metadata has been imported first to associate sequences with records.
- Obtain consensus FASTA sequences from your latest run.
- Concatenate sequences from artic-rabv pipeline output: results/concatenate/concat_genome.fasta.
Split and Rename FASTA Files
- Use multi_to_single_fasta.py to split the multi-FASTA into individual FASTA files: python3 multi_to_single_fasta.py
- Edit input/output paths in the script as required.
- Ensure filenames correspond to sample IDs from FASTA headers.
- Negative controls: Copy and rename manually as negative_runname__runname__instance1.fasta and edit internal FASTA headers to match.
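For orientation, the splitting step can be sketched as follows. This is not the repository's multi_to_single_fasta.py, only an illustration of the idea. It assumes the sample ID is the first whitespace-delimited token of each FASTA header and that IDs are unique and safe to use as filenames.

```python
import tempfile
from pathlib import Path


def split_multi_fasta(multi_fasta, out_dir):
    """Write one <sample_id>.fasta file per record in a multi-FASTA file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    records = {}
    current = None
    for line in Path(multi_fasta).read_text().splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("FASTA header without a sample ID")
            current = tokens[0]
            if current in records:
                raise ValueError(f"duplicate sample ID: {current}")
            records[current] = [line]
        elif current is not None and line.strip():
            records[current].append(line)
    written = []
    for sample_id, lines in records.items():
        path = out_dir / f"{sample_id}.fasta"
        path.write_text("\n".join(lines) + "\n")
        written.append(path)
    return written


# Small demonstration with made-up sequences in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "concat_genome.fasta"
    source.write_text(">sampleA extra info\nACGT\nTTGA\n>sampleB\nGGCC\n")
    files = split_multi_fasta(source, Path(tmp) / "single")
    names = [p.name for p in files]
    first_record = files[0].read_text()
    print(names)
```

Raising an error on duplicate IDs, rather than overwriting, makes repeated samples in a run visible before any renaming happens.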
Match Metadata and Rename for Import
- Run redcap-prepareFASTA.R
  - Edit metadata_file, fasta_dir, and output_dir filepaths.
  - Script matches sequences to REDCap metadata and renames FASTA files as sampleID__runname__instanceX.fasta.
- Verify all renamed FASTA files and negative controls.
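The naming convention can be checked programmatically when verifying the renamed files. The sketch below builds and parses names of the form sampleID__runname__instanceX.fasta. The helper names are hypothetical and it assumes neither the sample ID nor the run name contains a double underscore.

```python
import re

# sampleID__runname__instanceX.fasta, where X is a positive integer.
FILENAME_PATTERN = re.compile(
    r"^(?P<sample_id>.+?)__(?P<run_name>.+?)__instance(?P<instance>\d+)\.fasta$"
)


def build_import_filename(sample_id, run_name, instance):
    """Compose the filename expected for upload."""
    return f"{sample_id}__{run_name}__instance{instance}.fasta"


def parse_import_filename(filename):
    """Return (sample_id, run_name, instance) or raise ValueError."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected FASTA filename: {filename}")
    return match["sample_id"], match["run_name"], int(match["instance"])


print(build_import_filename("S001", "run12", 2))
# prints S001__run12__instance2.fasta
print(parse_import_filename("negative_run12__run12__instance1.fasta"))
# prints ('negative_run12', 'run12', 1)
```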
Upload Sequences to REDCap
- Use bulk_upload_fasta_repeatInstances.py: python3 bulk_upload_fasta_repeatInstances.py
- Edit input folder path to the renamed FASTA directory.
- Ensure the REDCap API URL points to the correct project version (updates may change the endpoint).
- Monitor command-line output for successful uploads.
- If files do not appear, check API version and repeat instances.
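When troubleshooting, it can help to see roughly what a single file upload to a repeating instance looks like. The sketch below is not the repository's bulk_upload_fasta_repeatInstances.py. It assumes the request parameters of REDCap's "Import a File" API method (check the API documentation of your REDCap version), uses the third-party requests library, and uses a hypothetical field name, `consensus_fasta`, for the file upload field. Longitudinal projects would also need an `event` parameter.

```python
def build_file_import_payload(token, record, field, repeat_instance):
    """Build the form fields for uploading a file to one repeat instance."""
    return {
        "token": token,
        "content": "file",
        "action": "import",
        "record": record,
        "field": field,  # name of the file upload field (assumption)
        "repeat_instance": str(repeat_instance),
        "returnFormat": "json",
    }


def upload_fasta(api_url, payload, fasta_path):
    """POST one FASTA file to the REDCap API and fail loudly on HTTP errors."""
    import requests  # imported here so the payload helper works without it

    with open(fasta_path, "rb") as handle:
        response = requests.post(
            api_url, data=payload, files={"file": handle}, timeout=60
        )
    response.raise_for_status()
    return response


# Placeholder values only; never hard-code a real API token.
payload = build_file_import_payload("YOUR_API_TOKEN", "S001", "consensus_fasta", 2)
print(payload["record"], payload["repeat_instance"])
# prints S001 2
```

A mismatch between the `repeat_instance` sent here and the instance created during metadata import is one plausible reason for files not appearing on the expected record.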
Additional Notes
- Always verify sequencing records are not accidentally overwritten.
- These instructions assume you have REDCap API access configured through local environment files; contact the project lead if this is not set up.
- REDCap provides informative error messages during import; use these to correct formatting or repeat instance issues before re-importing.