split_baits_fs.Rd
Calculates p-values under the negative binomial model. Subsequently regroups putative interactions by bait, rather than by distance, to prepare them for parallel RJMCMC processing. Generates a separate file for each bait. Paths are stored in baitlist.txt, which serves as a to-do list for peaky().
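The two steps described above can be illustrated on toy data. This is a minimal sketch in base R, not peaky's actual implementation: the column names, the fitted means `mu`, the dispersion `size`, the upper-tail definition of the p-value and the `bait_<id>.rds` file names are all assumptions made for illustration.

```r
# Toy stand-in for binned interactions. All column names and values are
# illustrative assumptions, not peaky's actual schema.
toy <- data.frame(
  baitID   = rep(c(101, 102, 103), times = 4),
  preyID   = 1:12,
  dist.bin = rep(1:4, each = 3),
  N        = c(1, 3, 2, 8, 1, 1, 2, 2, 15, 1, 4, 1),  # observed read counts
  mu       = rep(c(2.5, 2.0, 1.5, 1.2), each = 3)     # assumed fitted mean per bin
)
size <- 1.5  # assumed negative binomial dispersion parameter

# Upper-tail p-value P(X >= N) under a negative binomial with mean mu.
# P(X >= N) equals P(X > N - 1), hence the N - 1 with lower.tail = FALSE.
toy$p <- pnbinom(toy$N - 1, size = size, mu = toy$mu, lower.tail = FALSE)

# Regroup by bait rather than by distance bin and write one file per bait.
out_dir <- file.path(tempdir(), "toy_baits")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
by_bait <- split(toy, toy$baitID)
paths <- vapply(names(by_bait), function(id) {
  path <- file.path(out_dir, paste0("bait_", id, ".rds"))  # hypothetical name
  saveRDS(by_bait[[id]], path)
  path
}, character(1))

# Store the per-bait paths, one per line, as a to-do list.
writeLines(paths, file.path(out_dir, "baitlist.txt"))
```

Each per-bait file then holds that bait's interactions across all distance bins, which is the grouping a per-bait sampler needs.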
split_baits_fs(bins_dir, residuals_dir, indices, output_dir, plots = TRUE)
Argument | Description |
---|---|
bins_dir | Directory containing putative interactions that are binned by distance. |
residuals_dir | Directory where the adjusted read counts from each distance bin are stored. |
indices | Indices of the distance bins whose baits are to be processed. A null model must already have been fitted for each of these bins. |
output_dir | Directory where all putative interactions will be stored, one file per bait. Will be created if it does not exist. |
plots | Whether adjusted read counts are to be plotted against distance and stored for each bait. |
The output directory.
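A to-do list of per-bait files lends itself to index-based parallel processing, where each worker handles one line. The sketch below uses toy files only and assumes that the list holds one path per line; the `TASK_ID` environment variable and the file names are hypothetical, and the actual layout written by split_baits_fs() may differ.

```r
# Build a hypothetical to-do list: one per-bait file path per line.
todo_dir <- file.path(tempdir(), "toy_todo")
dir.create(todo_dir, showWarnings = FALSE, recursive = TRUE)
toy_paths <- file.path(todo_dir, paste0("bait_", c(101, 102, 103), ".rds"))
for (p in toy_paths) saveRDS(data.frame(N = 1:3), p)
baitlist <- file.path(todo_dir, "baitlist.txt")
writeLines(toy_paths, baitlist)

# A worker picks its task by index, for example from a job-array variable.
# TASK_ID is an assumed variable name; it defaults to 1 when unset.
task_id <- as.integer(Sys.getenv("TASK_ID", unset = "1"))
todo <- readLines(baitlist)
stopifnot(task_id >= 1, task_id <= length(todo))

# Load only the bait assigned to this worker.
bait_data <- readRDS(todo[task_id])
```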
base = system.file("extdata",package="peaky")
interactions_file = paste0(base,"/counts.tsv")
bins_dir = paste0(base,"/bins")
fragments_file = paste0(base,"/fragments.bed")
bin_interactions_fs(interactions_file, fragments_file, output_dir=bins_dir)
#> 13-09-2020 13:01:56
#> Reading interactions from /tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/counts.tsv
#> 13-09-2020 13:01:56
#> Reading fragment information from /tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/fragments.bed
#> 13-09-2020 13:01:56
#> Calculating fragment characteristics...
#> 13-09-2020 13:01:56
#> Adding fragment characteristics for baits...
#> 13-09-2020 13:01:56
#> Adding fragment characteristics for preys...
#> 13-09-2020 13:01:56
#> Calculating interaction distances...
#> 13-09-2020 13:01:56
#> Calculating total trans-chromosomal read counts for each bait...
#> 13-09-2020 13:01:56
#> Modelling those as a function of bait chromosome...
#> GAMLSS-RS iteration 1: Global Deviance = 363.9654
#> GAMLSS-RS iteration 2: Global Deviance = 363.9654
#> 13-09-2020 13:01:56
#> Adding trans-chromosomal interactivity covariate for preys that were also baited (0 for preys not baited)...
#> 13-09-2020 13:01:56
#> Excluding 2 interactions that are too proximal (distance < 2500 bp)...
#> 13-09-2020 13:01:56
#> Excluding 0 interactions that are too distal (distance > Inf bp)...
#> 13-09-2020 13:01:56
#> Assigning 5 distance bins...
#> 13-09-2020 13:01:56
#> Done.
#> 13-09-2020 13:01:56
#> Saving binned interactions:
#> /tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/bins/bin_1.rds
#> /tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/bins/bin_2.rds
#> /tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/bins/bin_3.rds
#> /tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/bins/bin_4.rds
#> /tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/bins/bin_5.rds
#> 13-09-2020 13:01:56
#> Saving bin details to /tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/bins/bins.txt
#> $output_dir
#> [1] "/tmp/RtmpZbxw1h/temp_libpath38210020903/peaky/extdata/bins"
#>
#> $interactions
#>        baitID preyID N b.chr     b.mid b.length p.chr     p.mid p.length
#>     1:  53559  52096 1     1 212337456     8519     1 207345336     1932
#>     2:  53559  52107 1     1 212337456     8519     1 207380122      891
#>     3:  53559  52112 1     1 212337456     8519     1 207394962     6067
#>     4:  53559  52113 1     1 212337456     8519     1 207398127      260
#>     5:  53559  52121 1     1 212337456     8519     1 207434420     1859
#>    ---
#> 78050: 661722 830195 1     6 170834793     4398     X 155188600      619
#> 78051: 661722 831633 1     6 170834793     4398     Y   7238968     2825
#> 78052: 661722 832798 1     6 170834793     4398     Y  14832472     5436
#> 78053: 661722 833895 1     6 170834793     4398     Y  18220566     2150
#> 78054: 661722 835350 1     6 170834793     4398     Y  22944058     3103
#>            dist b.trans b.trans_res p.trans_res dist.bin
#>     1: -4992120     151   0.1116155           0        5
#>     2: -4957334     151   0.1116155           0        5
#>     3: -4942494     151   0.1116155           0        5
#>     4: -4939329     151   0.1116155           0        5
#>     5: -4903036     151   0.1116155           0        5
#>    ---
#> 78050:       NA    1719   2.2141681           0     <NA>
#> 78051:       NA    1719   2.2141681           0     <NA>
#> 78052:       NA    1719   2.2141681           0     <NA>
#> 78053:       NA    1719   2.2141681           0     <NA>
#> 78054:       NA    1719   2.2141681           0     <NA>
#>
#> $bins
#>    dist.bin dist.abs.min dist.abs.max interactions
#> 1:        1         2907       733944         9064
#> 2:        2       734021      1590586         9063
#> 3:        3      1590733      2558841         9063
#> 4:        4      2559052      3664236         9063
#> 5:        5      3664323      4999963         9064
#> 6:     <NA>           NA           NA        32737
#>
fits_dir = paste0(base,"/fits")
for(bin_index in 1:5){
  if (FALSE) model_bin_fs(bins_dir,bin_index,output_dir=fits_dir,subsample_size=1000)
}
baits_dir = paste0(base,"/baits")
if (FALSE) split_baits_fs(bins_dir,residuals_dir = fits_dir, indices=1:5, output_dir = baits_dir)