Description
Description of the bug
I am trying to run pgsc_calc on Google Cloud Batch to score chromosome files that I imputed from an ancestry.com report. Most of the pipeline runs successfully, but it fails on APPLY_SCORE:
Process `PGSCATALOG_PGSCCALC:PGSCCALC:APPLY_SCORE:PLINK2_SCORE (895a1bc3-77e6-4a28-858d-fc5d38c877e9 chromosome 12 effect type additive 0)` terminated with an error exit status (6)

The issue seems to be caused by the .psam files that the pipeline generates not being accessible:

INFO: Error: No samples in GRCh37_895a1bc3-77e6-4a28-858d-fc5d38c877e9_12.psam.

Fusion Info:
  fusion_version: 2.4.11-8ead802
  clone_namespace: false
  kernel_version: 6.6
  disk_cache_size: 368Gb
  max_open_files: 1048576

This is the executed command that causes the error:

INFO: plink2 --threads 2 --memory 8192 --seed 31 --extract 895a1bc3-77e6-4a28-858d-fc5d38c877e9_12_additive_0.scorefile.gz --allow-extra-chr --score 895a1bc3-77e6-4a28-858d-fc5d38c877e9_12_additive_0.scorefile.gz zs header-read cols=+scoresums,+denom,+fid list-variants no-mean-imputation --error-on-freq-calc --score-col-nums 3-6 --pfile vzs GRCh37_895a1bc3-77e6-4a28-858d-fc5d38c877e9_12 --out 895a1bc3-77e6-4a28-858d-fc5d38c877e9_12_additive_0

Some things I have tried:
I thought the issue might be with fusion, so I tried disabling it and rerunning the pipeline. I was met with a similar error:
Error: Failed to open GRCh37_895a1bc3-77e6-4a28-858d-fc5d38c877e9_12.psam : No such file or directory.

However, in this case the command exit status was 3 instead of 6.

Messing around with the format of my original .psam files. I tried these two formats, and neither seems to make a difference:

Format 1:
#IID SEX
SAMPLE1

Format 2:
#FID IID SEX
895a1bc3-77e6-4a28-858d-fc5d38c877e9 SAMPLE1
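For anyone trying to reproduce this, a rough sketch of a check that could be run against a staged .psam file to tell a missing file apart from one that exists but contains no sample rows. It is not part of pgsc_calc or plink2. The function names `count_psam_samples` and `describe_psam` are hypothetical, and the sketch assumes that header and comment lines in a .psam file start with `#` and that every other non-blank line is one sample.

```python
from pathlib import Path


def count_psam_samples(text: str) -> int:
    """Count data rows in .psam content.

    Assumption: lines starting with '#' are header/comment lines and
    every other non-blank line describes one sample.
    """
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        count += 1
    return count


def describe_psam(path: str) -> str:
    """Return a one-line summary: missing, or the number of sample rows."""
    p = Path(path)
    if not p.exists():
        return f"{path}: missing"
    return f"{path}: {count_psam_samples(p.read_text())} sample row(s)"
```

Calling `describe_psam` on the .psam file inside the failing task's work directory, with and without Fusion, might show whether the file is absent (matching the exit status 3 case) or present with zero sample rows (matching the "No samples" message).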
I understand pgsc_calc may run into issues with imputed chromosome files due to lack of WGS support, but I am able to run the pipeline successfully on the same chromosome files on a local Linux machine. So the issue seems to come from the cloud executor rather than from my imputed chromosome files.
Any help would be appreciated, thanks.
Command used and terminal output
nextflow run pgscatalog/pgsc_calc \
    -profile docker \
    -c nextflow.config \
    --input "$samplesheet_path" \
    --target_build GRCh37 \
    --pgs_id "$pgs_ids" \
    -work-dir "$work_dir" \
    --format json \
Relevant files
System information
nextflow.config:
// Google Cloud Batch configuration for Nextflow
process {
    // Define the executor
    executor = 'google-batch'

    // Define the container image using an environment variable
    // Fallback to a generic gcloud image if not set
    container = System.getenv('CONTAINER_IMAGE') ?: 'gcr.io/google-containers/google-cloud-cli:latest'

    cpus = 7
    memory = '28.GB'
    time = '24.h'

    // Error strategy for potential preemptions (exit code 50001 for GCE Spot VM preemption via Batch)
    errorStrategy = { task.exitStatus == 50001 ? 'retry' : 'terminate' }
    maxRetries = 3
}

// Google Cloud specific settings
google {
    // Project ID and Location (Region) obtained from environment variables
    project = System.getenv('PROJECT_ID')
    location = System.getenv('GCP_REGION')
    batch.spot = false
}

// Enable Fusion
fusion.enabled = true

// Enable Wave container service
wave.enabled = true

// Enable Tower
tower.accessToken = System.getenv('TOWER_ACCESS_TOKEN')

// Enable Docker, required for container execution
docker.enabled = true

// Scope for Nextflow execution reports
report.enabled = true
timeline.enabled = true
trace.enabled = true

// Manifest info (optional)
manifest {
    name = 'pgscatalog/pgsc_calc'
    description = 'PGS Catalog Score Calculation pipeline'
    mainScript = 'main.nf'
}