RegionBased

FAN-C builds extensively on the genomic_regions package, which provides a unified interface for most types of region-based genomic data. We highly recommend reading the documentation of that package before going into the details of FAN-C, as many of the concepts discussed therein are central to the handling of data in FAN-C.

You can check whether a FAN-C object o supports the RegionBased interface with:

import genomic_regions as gr
isinstance(o, gr.RegionBased)  # True for objects supporting the regions interface

The current list of FAN-C objects supporting the RegionBased interface is: InsulationScore, DirectionalityIndex, Boundaries, InsulationScores, DirectionalityIndexes, FoldChangeScores, DifferenceScores, DifferenceRegions, FoldChangeRegions, CoolerHic, JuicerHic, Hic, ABCompartmentMatrix, DifferenceMatrix, FoldChangeMatrix, PeakInfo, and RaoPeakInfo.

Any object built on that foundation supports, for example, region iterators:

for region in hic.regions:
    print(region)
    print(region.chromosome, region.start, region.end, region.strand)
    print(region.is_forward())  # is_forward is a method, so it must be called
    print(region.center)
    # ...

Range queries:

for region in hic.regions('chr1:3mb-12mb'):
    print(region.chromosome)  # chr1
    # ...

and many more convenient features. All of these queries return GenomicRegion objects, which provide many convenient attributes and methods for working with region properties and operations:

len(region)  # returns the size of the region in base pairs
region.center  # returns the base (or fraction of base) at the center of the region
region.five_prime  # returns the base at the 5' end of the region
region.three_prime  # returns the base at the 3' end of the region
region.is_forward()  # True if strand is '+' or '+1'
region.is_reverse()  # True if strand is '-' or '-1'
region.attributes  # returns all attribute names in this region object
region.copy()  # returns a shallow copy of this region
region.to_string()  # returns a region identifier string describing the region

region = gr.as_region('chr12:12.5Mb-18Mb')
region.overlaps('chr12:11Mb-13Mb')  # True
region.overlaps('chr12:11Mb-11.5Mb')  # False
region.overlaps('chr1:11Mb-13Mb')  # False
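
To make the semantics of these calls easier to follow, here is a minimal, library-free sketch of how a region string such as 'chr12:12.5Mb-18Mb' could be parsed and tested for overlap. It is not the genomic_regions implementation: SimpleRegion, parse_region, and the unit table are hypothetical names introduced for illustration, and the 1-based inclusive coordinate convention used for len() is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Unit suffixes accepted by this sketch (an assumption, not an exhaustive list).
_UNITS = {"": 1, "bp": 1, "kb": 1_000, "mb": 1_000_000, "gb": 1_000_000_000}

# Matches strings of the form '<chromosome>:<start>[unit]-<end>[unit]'.
_REGION_RE = re.compile(
    r"^(?P<chrom>[^:]+):(?P<start>[\d.]+)(?P<su>[a-z]*)-(?P<end>[\d.]+)(?P<eu>[a-z]*)$",
    re.IGNORECASE,
)


def _to_bp(number, unit):
    """Convert a number with an optional unit suffix to base pairs."""
    return int(round(float(number) * _UNITS[unit.lower()]))


@dataclass(frozen=True)
class SimpleRegion:
    """Hypothetical stand-in for a genomic region (illustration only)."""
    chromosome: str
    start: int
    end: int
    strand: Optional[int] = None

    def __len__(self):
        # Assumes 1-based, inclusive coordinates.
        return self.end - self.start + 1

    @property
    def center(self):
        # May be fractional when the region spans an even number of bases.
        return self.start + (self.end - self.start) / 2

    def overlaps(self, other):
        # Accept either a region string or another SimpleRegion.
        if isinstance(other, str):
            other = parse_region(other)
        return (
            self.chromosome == other.chromosome
            and self.start <= other.end
            and other.start <= self.end
        )


def parse_region(text):
    """Parse a region string into a SimpleRegion."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"Cannot parse region string: {text!r}")
    return SimpleRegion(
        chromosome=match["chrom"],
        start=_to_bp(match["start"], match["su"]),
        end=_to_bp(match["end"], match["eu"]),
    )


region = parse_region('chr12:12.5Mb-18Mb')
print(len(region), region.center)
print(region.overlaps('chr12:11Mb-13Mb'))
```

Two regions overlap only if they lie on the same chromosome and each one starts before the other ends, which is why the query on chr1 above returns False despite its matching coordinates.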

Refer to the genomic_regions documentation for all the details.
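
The examples in this section use hic.regions in two ways: as an iterable (for region in hic.regions) and as a callable for range queries (hic.regions('chr1:3mb-12mb')). The following sketch shows one way an accessor can support both patterns in plain Python. RegionAccessor and ToyRegionContainer are hypothetical names, regions are simplified to (chromosome, start, end) tuples, and the range query takes explicit coordinates rather than a region string. The actual genomic_regions classes may be organized differently.

```python
class RegionAccessor:
    """Hypothetical accessor: iterable over all regions, callable for range queries."""

    def __init__(self, regions):
        self._regions = list(regions)

    def __iter__(self):
        # Supports: for region in container.regions
        return iter(self._regions)

    def __len__(self):
        return len(self._regions)

    def __call__(self, chromosome, start, end):
        # Supports: container.regions(chromosome, start, end)
        # Linear scan for clarity; a real implementation would likely use an index.
        for chrom, r_start, r_end in self._regions:
            if chrom == chromosome and r_start <= end and start <= r_end:
                yield (chrom, r_start, r_end)


class ToyRegionContainer:
    """Hypothetical container exposing a 'regions' property."""

    def __init__(self, regions):
        self._regions = list(regions)

    @property
    def regions(self):
        return RegionAccessor(self._regions)


toy = ToyRegionContainer([
    ("chr1", 1, 1_000_000),
    ("chr1", 1_000_001, 2_000_000),
    ("chr2", 1, 1_000_000),
])

# Plain iteration visits every region.
for region in toy.regions:
    print(region)

# A range query yields only the regions overlapping the requested interval.
for region in toy.regions("chr1", 500_000, 1_500_000):
    print(region)
```

Returning a small accessor object from the property keeps the container itself simple, while letting the same attribute name serve for both full iteration and subsetting.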

Similarly to the regions interface for handling collections of genomic regions, FAN-C implements interfaces for working with pairs of genomic regions (edges) and for matrix operations (matrix). These work in exactly the same way for FAN-C, Cooler, and Juicer files. Hence, all of these file types are directly compatible with FAN-C architectural functions such as the insulation score or AB compartment analyses, …

These interfaces will be introduced in the following sections, starting with RegionPairsContainer.