Aggregate module

class fanc.architecture.aggregate.AggregateMatrix(file_name=None, mode='r', tmpdir=None, x=None, y=None)
Bases: fanc.general.FileGroup
Construct and store aggregate matrices from matrix-based objects.
Methods in this class can be used to generate various kinds of aggregate matrices, constructed by averaging the signal from different regions of a Hi-C (or similar) matrix. Particularly useful is the creation of aggregate matrices from observed/expected data.
Class methods control how exactly an aggregate matrix is constructed:
AggregateMatrix.from_center() will aggregate Hi-C matrix regions along the diagonal in a fixed window around the region center. This is useful, for example, to observe the signal around TAD boundaries or other local features, such as the start of genes, enhancer locations, …
AggregateMatrix.from_regions() will extract submatrices using regions of variable size, such as TADs, and interpolate them to the same number of pixels before aggregating them.
AggregateMatrix.from_center_pairs() will extract arbitrary Hi-C submatrices from a list of region pairs (representing row and column of the matrix). Each submatrix is centered on its pair of regions, and a fixed number of pixels around the center is extracted. This is used, for example, to plot aggregate matrices around loops, using the loop anchors as input.

close(copy_tmp=True, remove_tmp=True)
Close this HDF5 file and run exit operations.
If file was opened with tmpdir in read-only mode: close file and delete temporary copy.
If file was opened with tmpdir in write or append mode: replace original file with copy and delete copy.
Parameters:
 copy_tmp – If False, does not overwrite original with modified file.
 remove_tmp – If False, does not delete temporary copy of file.
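The rules above can be illustrated with a small standard-library sketch. It is not fanc code: close_with_tmp is a hypothetical helper, and a plain text file stands in for the HDF5 file. It only mirrors the documented behaviour of copy_tmp and remove_tmp.

```python
import os
import shutil
import tempfile


def close_with_tmp(original, tmp_copy, mode, copy_tmp=True, remove_tmp=True):
    # Hypothetical helper, not part of fanc: mirrors the documented rules only.
    if mode in ("w", "a") and copy_tmp:
        shutil.copyfile(tmp_copy, original)  # replace original with the modified copy
    if remove_tmp:
        os.remove(tmp_copy)  # delete the temporary copy


workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "aggregate.txt")  # stand-in for the HDF5 file
tmp_copy = os.path.join(workdir, "aggregate.tmp")

with open(original, "w") as f:
    f.write("old")
shutil.copyfile(original, tmp_copy)  # "opened with tmpdir": work on a copy
with open(tmp_copy, "a") as f:
    f.write("+new")  # only the copy is modified

close_with_tmp(original, tmp_copy, mode="a")
with open(original) as f:
    final_content = f.read()
tmp_exists = os.path.exists(tmp_copy)
shutil.rmtree(workdir)  # clean up the demo directory
```

In read-only mode the same helper would skip the copy step and only delete the temporary file.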

components(components=None)
Retrieve or store each individual submatrix composing the aggregate matrix.
Parameters:
 components – List of (masked) numpy arrays
Returns: List of (masked) numpy arrays
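As a rough illustration of how stored components relate to an aggregate, the sketch below averages a list of masked numpy arrays element-wise. The element-wise masked mean is an assumption made for illustration, and the toy arrays are invented; the library's own aggregation may differ in detail.

```python
import numpy as np

# Two toy components of equal shape; one pixel of the first is masked.
components = [
    np.ma.masked_array([[1.0, 2.0], [3.0, 4.0]],
                       mask=[[False, False], [False, True]]),
    np.ma.masked_array([[3.0, 2.0], [1.0, 8.0]],
                       mask=[[False, False], [False, False]]),
]

# Stack along a new first axis and average, ignoring masked pixels
# (assumed aggregation rule, not confirmed by the fanc documentation).
stacked = np.ma.stack(components)
aggregate = stacked.mean(axis=0)
```

With this rule, a pixel that is masked in some components is averaged over the remaining ones only.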

classmethod from_center(matrix, regions, window=200000, rescale=False, scaling_exponent=0.25, keep_components=True, file_name=None, tmpdir=None, region_viewpoint='center', **kwargs)
Construct an aggregate matrix from square regions along the diagonal with a fixed window size.
By default, the submatrix that is extracted from matrix is centred on the region centre and has a window size specified by window. You can change where the window will be centered using region_viewpoint, which can be any of “center”, “start”, “end”, “five_prime”, or “three_prime”. The latter two may be particularly useful for genomic features such as genes.
Example for TAD boundaries:
    import fanc
    hic = fanc.load("/path/to/matrix.hic")
    tad_boundaries = fanc.load("/path/to/tad_boundaries.bed")
    # run aggregate analysis
    am = fanc.AggregateMatrix.from_center(hic, tad_boundaries.regions, window=500000)
    # extract matrix when done
    m = am.matrix()
Parameters:
 matrix – An object of type RegionMatrixContainer, such as a Hic matrix
 regions – A list of GenomicRegion objects
 window – A window size in base pairs
 rescale – If True, will use scaling_exponent to artificially rescale the aggregate matrix values using a power law
 scaling_exponent – The power law exponent used if rescale is True
 keep_components – If True (default), will store each submatrix used to generate the aggregate matrix in the AggregateMatrix object, which can be retrieved using AggregateMatrix.components()
 file_name – If provided, stores the aggregate matrix object at this location.
 tmpdir – If True, will work in a temporary directory until the object is closed
 region_viewpoint – Point on which the window is centred. Any of “center”, “start”, “end”, “five_prime”, or “three_prime”
 kwargs – Keyword arguments passed to extract_submatrices()
Returns: aggregate matrix
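To make the fixed-window idea concrete, here is a minimal numpy sketch that works on a plain array instead of a fanc object. The function aggregate_around_centers, the bin_size argument, and the choice to skip windows that run off the matrix are all assumptions for illustration, not the library's implementation.

```python
import numpy as np


def aggregate_around_centers(matrix, center_bins, window_bp, bin_size):
    # Hypothetical helper: convert the window from base pairs to bins.
    half = int(window_bp / bin_size / 2)  # bins on each side of the centre
    subs = []
    for c in center_bins:
        lo, hi = c - half, c + half + 1
        if lo < 0 or hi > matrix.shape[0]:
            continue  # assumption: skip windows that extend past the matrix
        subs.append(matrix[lo:hi, lo:hi])  # square window on the diagonal
    return np.mean(subs, axis=0)


rng = np.random.default_rng(0)
m = rng.random((50, 50))
m = (m + m.T) / 2  # symmetric toy "contact matrix"

# 200 kb window at 50 kb resolution gives 2 bins either side, a 5x5 result.
agg = aggregate_around_centers(m, [10, 25, 40, 1],
                               window_bp=200_000, bin_size=50_000)
```

The centre at bin 1 is dropped here because its window would start before the first bin.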
classmethod from_center_pairs(hic, pair_regions, window=None, pixels=16, keep_components=True, file_name=None, tmpdir=None, region_viewpoint='center', **kwargs)
Construct an aggregate matrix from pairs of regions.
Parameters:
 hic – A compatible Hi-C matrix
 pair_regions – A list of region pairs
 window – A window size in base pairs
 pixels – The dimension (in pixels) of the output matrix
 keep_components – Keep all submatrices that make up the aggregate matrix
 file_name – Optional path to an output file
 tmpdir – Optional. If True, will work in a temporary directory until the file is closed
 region_viewpoint – Location in each region that is used as anchor for the extracted matrix. ‘center’ by default, also valid are ‘start’, ‘end’, ‘five_prime’, and ‘three_prime’
 kwargs – Keyword arguments passed on to extract_submatrices()
Returns: aggregate matrix
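The pair-based extraction can be sketched on a plain numpy array as follows. Each pair gives a row bin and a column bin, and a pixels x pixels window around that off-diagonal position is averaged across pairs. The helper aggregate_around_pairs and the toy loop positions are hypothetical, and skipping out-of-bounds windows is an assumption.

```python
import numpy as np


def aggregate_around_pairs(matrix, bin_pairs, pixels):
    # Hypothetical helper: average fixed-size windows around (row, col) bins.
    half = pixels // 2
    n_rows, n_cols = matrix.shape
    subs = []
    for i, j in bin_pairs:
        r0, c0 = i - half, j - half
        r1, c1 = r0 + pixels, c0 + pixels
        if r0 < 0 or c0 < 0 or r1 > n_rows or c1 > n_cols:
            continue  # assumption: skip windows that leave the matrix
        subs.append(matrix[r0:r1, c0:c1])
    return np.mean(subs, axis=0)


# Toy matrix with three "loops": a single enriched pixel per anchor pair.
loops = [(20, 40), (30, 70), (50, 80)]
m = np.zeros((100, 100))
for i, j in loops:
    m[i, j] = 1.0
    m[j, i] = 1.0  # keep the toy matrix symmetric

agg = aggregate_around_pairs(m, loops, pixels=16)
```

With an even pixels value, the anchor pixel lands at index pixels // 2 in each dimension of the result.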

classmethod from_regions(hic, tad_regions, pixels=90, rescale=False, scaling_exponent=0.25, interpolation=0, boundary_mode='reflect', keep_mask=True, absolute_extension=0, relative_extension=1.0, keep_components=True, anti_aliasing=True, file_name=None, tmpdir=None, **kwargs)
Construct aggregate matrix from variable regions along the diagonal.
For each region in tad_regions, a submatrix is extracted and interpolated so that it is exactly pixels x pixels big. You can expand each region by a relative amount using relative_extension.
Example for aggregate TADs:
    import fanc
    hic = fanc.load("/path/to/matrix.hic")
    tads = fanc.load("/path/to/tads.bed")
    # run aggregate analysis
    am = fanc.AggregateMatrix.from_regions(hic, tads.regions, relative_extension=3.)
    # extract matrix when done
    m = am.matrix()  # 90x90 matrix with aggregate TAD in the centre
Parameters:
 hic – An object of type RegionMatrixContainer, such as a Hic matrix
 tad_regions – A list of GenomicRegion objects
 pixels – Number of pixels along each dimension of the aggregate matrix
 rescale – If True, will use scaling_exponent to artificially rescale the aggregate matrix values using a power law
 scaling_exponent – The power law exponent used if rescale is True
 interpolation – Type of interpolation used on each submatrix, in range 0-5. 0: Nearest-neighbor (default), 1: Bilinear, 2: Biquadratic, 3: Bicubic, 4: Biquartic, 5: Biquintic
 boundary_mode – Points outside the boundaries of the input are filled according to the given mode. Options are constant, edge, symmetric, reflect, and wrap. Affects submatrix interpolation.
 keep_mask – If True (default), masked Hi-C regions will also be interpolated.
 absolute_extension – Absolute number of base pairs by which to expand each region
 relative_extension – Amount by which to expand each region as a fraction of each region. Values smaller than 1 lead to region shrinking
 keep_components – If True (default), will store each submatrix used to generate the aggregate matrix in the AggregateMatrix object, which can be retrieved using AggregateMatrix.components()
 file_name – If provided, stores the aggregate matrix object at this location.
 tmpdir – If True, will work in a temporary directory until the object is closed
 kwargs – Keyword arguments passed to extract_submatrices()
Returns: aggregate matrix
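The following numpy sketch illustrates the two steps described for variable-size regions: expanding each region by a relative factor, then resampling every submatrix to the same number of pixels before averaging. The helpers extend_region and resize_nearest are hypothetical. Symmetric expansion around the region and plain nearest-neighbour index sampling are assumptions that only approximate interpolation=0; the sketch ignores boundary_mode, anti_aliasing, and masking.

```python
import numpy as np


def extend_region(start, end, relative_extension=1.0):
    # Hypothetical: grow (or shrink) a region symmetrically around its centre.
    size = end - start
    pad = int(round(size * (relative_extension - 1.0) / 2))
    return start - pad, end + pad


def resize_nearest(sub, pixels):
    # Hypothetical nearest-neighbour resampling of a square submatrix.
    idx = (np.arange(pixels) * sub.shape[0]) // pixels
    return sub[np.ix_(idx, idx)]


matrix = np.arange(60 * 60, dtype=float).reshape(60, 60)  # toy matrix
regions = [(20, 24), (30, 40)]  # invented TADs as (start_bin, end_bin)

resized = []
for start, end in regions:
    s, e = extend_region(start, end, relative_extension=3.0)
    resized.append(resize_nearest(matrix[s:e, s:e], pixels=9))

aggregate = np.mean(resized, axis=0)  # every component is now 9x9
```

With relative_extension=3.0, each original region occupies the central third of its resampled submatrix, which matches the "aggregate TAD in the centre" comment in the example above.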
matrix(m=None)
Retrieve or set the aggregate matrix in this object.
Parameters:
 m – Numpy matrix
Returns: aggregate matrix

region_pairs(pairs=None)
Retrieve or set the regions used to generate the aggregate matrix.
Parameters:
 pairs – Iterable of region tuples of the form [(region1, region2), (region3, region4), …]. If None, simply return the region pairs in this object.
Returns: List of region pairs [(region1, region2), (region3, region4), …].