Description
Hi!
I am running pgsc_calc on an HPC, on a large dataset (800 individuals and 5k PGS scores), as follows:
```
nextflow run pgscatalog/pgsc_calc \
    -profile apptainer \
    -resume \
    -c my_nextflow_config.config \
    --input samplesheet_subset.csv \
    --target_build GRCh37 \
    --scorefile '*_hmPOS_GRCh37.txt.gz' \
    --run_ancestry /pgsc_HGDP+1kGP_v1.tar.zst \
    --outdir subset_inclancestry
```
The config file is copied from the tutorial on your website, thank you for providing it!
The pipeline works for a subset of 100 scores. I expect it to use a lot more memory as I increase the number of scores, so as a precaution I have already increased the memory for the MATCH_VARIANTS and MATCH_COMBINE steps in my config file (see the sketch below). However, I noticed that even with 100 scores the first step of the pipeline (INPUT_CHECK_FORMAT_SCOREFILES) needs a lot of memory. Is there a way to increase the memory for this step beforehand? Currently it has to fail at the 16 GB limit before retrying with 32 GB and then 64 GB, which wastes a lot of time when I already know the process will run out of memory.
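For reference, this is roughly what I added to my config for the matching steps, using Nextflow's `withName` process selectors. The selector patterns and memory values here are just my guesses based on the process names I see in the logs; please correct me if the scorefile-formatting step needs a different selector:

```
// Sketch of my_nextflow_config.config (process names assumed from the log output)
process {
    withName: '.*MATCH_VARIANTS' {
        memory = 64.GB
    }
    withName: '.*MATCH_COMBINE' {
        memory = 128.GB
    }
    // This is what I would like to do for the scorefile formatting step,
    // if a selector like this is the right way to target it:
    withName: '.*FORMAT_SCOREFILES' {
        memory = 64.GB
    }
}
```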
I hope you can help me; perhaps I have missed an obvious way to change this setting, but I did not find it!