peaky_run.Rd
Run Peaky's core fine-mapping algorithm for a specified bait. This is the most computationally expensive step in Peaky's pipeline, taking from many seconds to several minutes per bait, so running it in parallel across baits is recommended.
It fits additive models of peaks of varying strengths and locations to the adjusted read counts via RJMCMC, interprets these fits, and stores the results on disk.
The next step of this pipeline is to combine the results for all baits into a single table with peaky_wrapup.
peaky_run(peaky_output_dir, index, omega_power = -4, iterations = 1e+06, min_interactions = 20)
| Argument | Description |
|---|---|
| peaky_output_dir | Directory in which Peaky's intermediate files and results are stored. Should have been created by peaky_prepare_from_chicago. |
| index | Which bait to process, with 1 corresponding to the first bait on the list in peaky_output_dir/baits/baitlist.txt (not its baitID). Tip: to parallelize execution, pass your compute cluster scheduler's array-job element ID here; see the sketch after this table. |
| omega_power | Expected decay of adjusted read counts around a truly interacting prey. |
| iterations | Number of RJMCMC models to parametrize. Greater numbers should lead to increased reproducibility. |
| min_interactions | Minimum number of prey fragments (and thus counts) associated with a bait; baits with fewer are skipped. |
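A minimal sketch of wiring an array job's element ID into index, assuming a SLURM scheduler (which exposes the element ID as the SLURM_ARRAY_TASK_ID environment variable); the output directory path is a placeholder, and other schedulers use different variable names:

# Minimal sketch (assumes SLURM): each array element fine-maps one bait.
library(peaky)

peaky_output_dir = "path/to/peaky_from_chicago"  # placeholder; created earlier by peaky_prepare_from_chicago

index = as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID"))
stopifnot(!is.na(index))  # fail early if not launched as an array job element

peaky_run(peaky_output_dir, index)

Such a script could then be submitted, for example, with sbatch --array=1-N, where N is the number of baits in baitlist.txt, so that every bait is covered.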
A list containing the output directory path and the results table for the analyzed bait.
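A short usage sketch of the return value; the output directory path is a placeholder, and since the element names of the list are not spelled out here, str() is used to inspect it rather than assuming them:

library(peaky)

peaky_output_dir = "path/to/peaky_from_chicago"  # placeholder; created by peaky_prepare_from_chicago
result = peaky_run(peaky_output_dir, 1)          # first bait in baits/baitlist.txt

str(result, max.level = 1)  # inspect the returned path and per-bait results table

The same results are also written to disk under peaky_output_dir, so capturing the return value is optional when peaky_wrapup is used afterwards to gather all baits.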
This function runs peaky_fs and then interpret_peaky_fs. For more control, these two functions can be used individually.
base = system.file("extdata", package = "peaky")
chicago_rds_path = paste0(base, "/chicago_output.rds")
peaky_output_dir = paste0(base, "/peaky_from_chicago")

if (FALSE) {
  peaky_prepare_from_chicago(chicago_rds_path, peaky_output_dir, subsample_size = NA)
  # Big dataset? Consider subsample_size = 10e3 for speed.

  for (i in 1:3) {
    peaky_run(peaky_output_dir, i)
  }
  # Tip: run this in parallel on a cluster by scheduling an array job
  # and passing its elements to i.

  peaky_wrapup(peaky_output_dir)
}
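Without a cluster scheduler, the loop above can also be parallelized across local cores. A minimal sketch, assuming the output directory has already been prepared, that baitlist.txt (mentioned under index) contains one line per bait, and that forking is available (parallel::mclapply is not supported on Windows; the path and core count are placeholders):

library(peaky)
library(parallel)

peaky_output_dir = "path/to/peaky_from_chicago"  # placeholder; created by peaky_prepare_from_chicago

# Assumed: one line per bait in the bait list gives the number of indices to process.
n_baits = length(readLines(paste0(peaky_output_dir, "/baits/baitlist.txt")))

# Each worker fine-maps one bait; results are written to disk as usual.
invisible(mclapply(seq_len(n_baits),
                   function(i) peaky_run(peaky_output_dir, i),
                   mc.cores = 4))

peaky_wrapup(peaky_output_dir)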