Basic usage

fanc uses subcommands to run all of its analyses. The fanc command itself can be used to get an overview of available subcommands, to print the current FAN-C version, or to set logging and notification parameters that affect all subcommands.

Overview

usage: fanc <command> [options]

-- Matrix generation --
auto              Automatically process an entire Hi-C data set
map               Map reads in a FASTQ file to a reference genome
pairs             Process and filter read pairs
hic               Process, filter, and correct Hic files

-- Matrix analysis --
cis-trans         Calculate cis/trans ratio of this Hi-C object
expected          Calculate Hi-C expected values (distance decay)
pca               Do a PCA on multiple Hi-C objects
compartments      Calculate AB compartment matrix
insulation        Calculate insulation scores for Hic object
directionality    Calculate directionality index for Hic object
boundaries        Determine domain boundaries
compare           Create pairwise comparisons of Hi-C comparison maps
loops             Call loops in a Hic object using FAN-C implementation of HICCUPS
aggregate         Make aggregate plots with FAN-C

-- Other helpers --
fragments         In-silico genome digestion
sort-sam          Convenience function to sort a SAM file by name
from-juicer       Import a Hi-C object from juicer (Aiden lab)
from-txt          Import a Hi-C object from a sparse matrix txt format
from-cooler       Convert a Cooler (
to-cooler         Convert a Hic file into cooler format
to-juicer         Convert a ReadPairs file to Juicer 
dump              Dump Hic file to txt file(s)
overlap-peaks     Overlap peaks from multiple samples
subset            Create a new Hic object by subsetting
stats             Get statistics on number of reads used at each step of a pipeline
write-config      Write default config file to specified location
downsample        Downsample contacts from a Hic object
upgrade           Upgrade objects from old FAN-C versions

Positional Arguments

command Subcommand to run

Named Arguments

-V, --version Print version information
--verbose, -v Set verbosity level: Can be chained like “-vvv” to increase verbosity. Default is to show errors, warnings, and info messages (same as “-vv”). “-v” shows only errors and warnings, “-vvv” shows errors, warnings, info, and debug messages.
-s, --silent Do not print log messages to command line.
-l, --log-file Path to file in which to save log.
-m, --email Email address for fanc command summary.
--smtp-server SMTP server in the form smtp.server.com[:port].
--smtp-username SMTP username.
--smtp-password SMTP password.
--smtp-sender-address SMTP sender email address.

Logging

You can set the verbosity level of any fanc subcommand with the -v option. Use more or fewer v’s for more or less logging output. The default is -vv, which shows error, warning, and info messages. -vvv also displays debug messages, which can help to identify issues with an analysis. -v only displays error and warning messages. To disable logging completely, use the -s option.

By default, logging output is sent to stderr. You can redirect log messages to a file using -l <file name>.
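The verbosity rules above can be sketched with Python's argparse and logging modules. This is only an illustration of the documented behaviour, not FAN-C's own code: the parser, the LEVELS table, and the resolve_level function are hypothetical names introduced for this example.

```python
import argparse
import logging

# Assumed mapping from the number of "v" flags to a logging level,
# following the description above (not taken from FAN-C's source).
LEVELS = {1: logging.WARNING, 2: logging.INFO, 3: logging.DEBUG}


def build_parser():
    parser = argparse.ArgumentParser(prog="demo")
    # action="count" makes "-vvv" equivalent to "-v -v -v"
    parser.add_argument("-v", "--verbose", dest="verbosity",
                        action="count", default=0)
    parser.add_argument("-s", "--silent", action="store_true")
    return parser


def resolve_level(argv):
    """Return the logging level for argv, or None if logging is disabled."""
    args = build_parser().parse_args(argv)
    if args.silent:
        return None
    if args.verbosity == 0:
        # Documented default: same as -vv
        return logging.INFO
    # More than three v's are treated like -vvv in this sketch
    return LEVELS[min(args.verbosity, 3)]
```

Here resolve_level(["-v"]) yields logging.WARNING, which in Python's logging module lets through warnings and errors, matching the description of -v.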

Email notifications

Sometimes it is convenient to be notified by email when a fanc command finishes, especially for long-running commands such as fanc pairs or fanc map. You can instruct fanc to send an email when a command finishes using -m <email address>. You must also specify the SMTP server settings using the options --smtp-<server|username|password|sender-address>, or you can pre-configure these using the FAN-C config files.
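The --smtp-server value takes the form smtp.server.com[:port]. As a minimal sketch of how such a string splits into host and optional port, consider the helper below. The function name parse_smtp_server and the default_port parameter are hypothetical and only serve this illustration; how FAN-C itself parses the value is not shown here.

```python
def parse_smtp_server(value, default_port=None):
    """Split a 'host[:port]' string into a (host, port) tuple.

    Hypothetical helper for illustration; default_port is returned
    when the string carries no port.
    """
    host, sep, port = value.partition(":")
    if not host:
        raise ValueError("SMTP server must include a host name")
    if sep and port:
        return host, int(port)
    return host, default_port
```

With this sketch, "smtp.server.com:587" gives the host "smtp.server.com" and the integer port 587, while "smtp.server.com" alone gives the host and the fallback port.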

Temporary files

Many of the more computationally intensive FAN-C commands support the -tmp argument. This instructs the command to copy all input files to a temporary directory before processing. Similarly, output files are initially generated in the temporary directory and only copied to their intended output locations once the command completes.

This can be very effective when your data is located, for example, on a network file system or a slow external HDD, while your local machine or computing node has access to a fast SSD. Using -tmp, and assuming the local machine’s default temporary directory resides on an SSD, files are copied from their original location to the SSD at the start of the command, thus avoiding the slow file system access throughout the remainder of the processing steps.

You can change the default temporary directory by setting the TMPDIR environment variable on your local system to a folder of your choice.
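The copy-in, process, copy-out pattern described above can be sketched in Python as follows. This is an illustration of the general technique under stated assumptions, not FAN-C's implementation: run_with_staging and the process callable are hypothetical names. Python's tempfile module consults the TMPDIR environment variable when choosing its default directory, so the sketch also respects that setting.

```python
import os
import shutil
import tempfile


def run_with_staging(input_path, output_path, process):
    """Copy input to a temp dir, run process there, then copy the result out.

    process(src, dst) is a hypothetical callable that reads src and
    writes dst.
    """
    # The default temporary location honours TMPDIR if it is set
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_in = os.path.join(tmp_dir, "in_" + os.path.basename(input_path))
        tmp_out = os.path.join(tmp_dir, "out_" + os.path.basename(output_path))
        # One read from the slow storage at the start
        shutil.copy(input_path, tmp_in)
        # All intermediate I/O happens on the temporary storage
        process(tmp_in, tmp_out)
        # One write to the final location at the end
        shutil.copy(tmp_out, output_path)
    return output_path
```

The temporary directory is removed when the with block exits, so only the final output file remains.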

FAN-C config files

FAN-C supports configuration files, which can be located (in descending order of priority):

  • in the current directory, named fanc.conf
  • in a path specified by the Unix environment variable FANC_CONF
  • in the user’s home folder (named fanc.conf or .fanc.conf)
  • in the .config folder in a user’s home directory, called fanc.conf
  • in /etc/fanc/fanc.conf

Settings made in one config file are overridden by settings in a file with higher priority.
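The priority rule can be sketched as a lookup list plus a merge step. This is a hedged illustration, not FAN-C's own loader: candidate_config_paths and merge_settings are hypothetical names, and the relative order of fanc.conf and .fanc.conf within the home folder is an assumption, since the list above does not specify it.

```python
import os


def candidate_config_paths(cwd, home, env):
    """List config locations from highest to lowest priority."""
    paths = [os.path.join(cwd, "fanc.conf")]
    if env.get("FANC_CONF"):
        paths.append(env["FANC_CONF"])
    paths += [
        # Assumed order for the two home-folder names
        os.path.join(home, "fanc.conf"),
        os.path.join(home, ".fanc.conf"),
        os.path.join(home, ".config", "fanc.conf"),
        "/etc/fanc/fanc.conf",
    ]
    return paths


def merge_settings(settings_by_priority):
    """Merge dicts that are given from highest to lowest priority."""
    merged = {}
    # Apply the lowest priority first so higher-priority values win
    for settings in reversed(settings_by_priority):
        merged.update(settings)
    return merged
```

For example, merging {"a": 1} (higher priority) with {"a": 2, "b": 3} (lower priority) keeps a as 1 and takes b from the lower-priority file.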

You can write the default config file to a location of your choice using fanc write-config.

usage: fanc write-config [-h] [-f] [config_file]

Positional Arguments

config_file Output file for default configuration.

Named Arguments

-f, --force Force overwrite of existing config file.

An explanation of the different settings can be found as comments in the default config file. The file is written in YAML.

NumExpr ThreadPool configuration

FAN-C uses PyTables for fast querying of most of its storage classes. Condition-based queries in PyTables, which are used, for example, to find regions and pixels in certain matrix subsets, rely on the NumExpr package. NumExpr can be multi-threaded, and FAN-C uses the default NumExpr ThreadPool configuration (typically 8 threads). There is generally no need to change this preset, but if you want to optimise every single aspect of your pipeline, you may want to take a look at the NumExpr documentation on ThreadPool configuration.
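If you do want to experiment with the thread count, one possible approach is to use NumExpr's own controls from Python before running an analysis. The sketch below relies on the NUMEXPR_MAX_THREADS environment variable and the numexpr.set_num_threads function; the values 4 and 2 are arbitrary examples, and whether changing them helps a given FAN-C workload is not something this page establishes.

```python
import os

# Upper bound for the NumExpr thread pool. This only takes effect if it
# is set before numexpr is imported for the first time in the process.
os.environ["NUMEXPR_MAX_THREADS"] = "4"

try:
    import numexpr
except ImportError:
    # NumExpr is not installed here, so there is nothing more to configure
    numexpr = None

if numexpr is not None:
    # set_num_threads returns the previously configured thread count
    previous_threads = numexpr.set_num_threads(2)
```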