Interactive Usage

A detailed notebook tutorial at here.

Example Usage

The interactive API could be used as normal functions or calling as process chain.

The jupyter notebook of following example were provided at github

1. init plot

from trackplot.plot import Plot

# logfile: save progress logs to file, default: None [stdout]
# backend: switch the matplotlib backend, default: agg
# font_family: change the font family in trackplot, default: None [system default font family]
plot = Plot(logfile=None, backend="agg", font_family=None)

2. set annotation

plot.set_annotation(
    "../example/example.sorted.gtf.gz",                                             # path to gtf file
    add_domain=True,                                                                # whether add domain information into annotation track
    interval="../example/PolyASite.chr1.atlas.clusters.2.0.GRCh38.96.bed.gz",       # path to list of interval files in bed format, 1st column is path to file, 2nd column is the label
    interval_label="polyA",                                                         # the label of added interval
    show_gene=True,                                                                 # show gene id
    color="pink",                                                                   # the color of exons
)

3. setup plotting parameters

plot.add_interval(
    interval="../example/PolyASite.chr1.atlas.clusters.2.0.GRCh38.96.simple.bed.gz",          # path to list of interval files in bed format, 1st column is path to file, 2nd column is the label
    interval_label="polyAS"                                                                   # the label of added interval
).set_region(
    chromosome="chr1", start=1270656, end=1284730, strand="+"                                 # chromosome, start site, end site and strand were required
).add_density(
    path="../example/bams/1.bam",                                                             # path to input file
    category="bam",                                                                           # the category of given file
    color="blue",                                                                             # color of this density
    show_site_plot=True,                                                                      # whether to show site plot
).add_density(                                                                                # another density plot
    path="../example/bws/2.bw",
    category="bw",
    color="green"
).add_line(
    path="../example/bams/1.bam",                                                             # path to input file
    category="bam",                                                                           # the category of given file
    group="1",                                                                                # the group of this line, used to control color etc.
    color="blue",                                                                             # color of this line and other lines belong to group 1
    line_attrs={"lw": 3}                                                                      # additional parameters in dict to control the layout of line
).add_line(                                                                                   # another line
    path="../example/bams/2.bam",
    category="bam",
    group="2",
    color="red",
    line_attrs={"linestyle": "dashed"}                                                        # additional parameters in dict to control the layout of line, for instance this changes the line style
).add_heatmap(
    path="../example/bams/1.bam",                                                             # path to input file
    category="bam",                                                                           # the category of given file
    group="1",                                                                                # the group of this file, only files belong to same group will be drawn in same heatmap.
).add_heatmap(                                                                                # another file to heatmap 1
    path="../example/bams/2.bam",
    category="bam",
    group="1"
).add_igv(
    path="../example/bams/3.bam",                                                             # path to input file
    features={
        "m6a": "ma",
        "real_strand": "rs",
        "polya": "pa"
    },
    category="igv",                                                                           # the category of given file
    label="igv"                                                                               # the label of this plot
).add_igv(
    path="../example/SRX9697989.corrected_reads.bed.gz",
    category="igv",
    label="bed12"
).add_sites(
    1270656 + 1000                                                                            # the highlight site
).add_sites(
    1270656 + 1000                                                                            # the repeat highlight site will show in different color with normal highlight site
).add_sites(
    1270656 + 2000
).add_focus(
    f"{1270656 + 2000}-{1270656 + 3000}"                                                      # the focus region in start_site-end_site format
).add_focus(
    f"{1270656 + 5000}-{1270656 + 7000}"
).add_stroke(
    f"{1270656 + 5000}-{1270656 + 7000}:{1270656 + 7200}-{1270656 + 8000}@blue"               # the stroke in start_site-end_site:start_site-end_site@color format, this will add 2 strokes and the last on will be blue
).add_stroke(                                                                                 # add stroke with named parameters
    start=1270656 + 7500,                                                                     # the start site of added stroke
    end=1270656 + 8200,                                                                       # the end site of added stroke
    color="green",                                                                            # the color of added stroke
    label="test"                                                                              # the label of added stroke
)

4. save figures

plot.plot("test_plot.png", width=6, height=1, raster=True)
# save plot into test_plot.png, with given width and height, the raster=True, will reduce layers in pdf or svg format

API documentation

set_region

change the plot region

def set_region(self, chromosome: str, start: int, end: int, strand: str = "+")

chromosome: the chromosome of given region
start: the start site of given region
end: the end site of given region
strand: the strand of given region [optional]

Return

set_annotation

add transcripts into track

def set_annotation(self, gtf: str,
                  add_domain: bool = False,
                  local_domain: Optional[str] = False,
                  domain_include: Optional[str] = False,
                  domain_exclude: Optional[str] = False,
                  interval: Optional[str] = None,
                  interval_label: Optional[str] = None,
                  transcripts: Optional[List[str]] = None,
                  remove_empty_transcripts: bool = False,
                  choose_primary: bool = False,
                  color: Optional[str] = "black",

                  # transcripts related parameters
                  font_size: int = 5,
                  show_gene: bool = False,
                  show_id: bool = False,
                  exon_width: float = .3,
                  show_exon_id: bool = False,
                  theme: str = "blank"
                  )

gtf: path to gtf file
add_domain: whether to add domain
local_domain: whether add domain information into annotation track
domain_include: Which domain will be included in annotation plot
domain_exclude: Which domain will be excluded in annotation plot
interval: path to list of interval files in bed format, 1st column is path to file, 2nd column is the label
interval_label: the label of added interval
transcripts: the list of name or ids of transcripts to draw
remove_empty_transcripts: whether to remove transcripts without any exons
choose_primary: Whether choose primary transcript to plot
color: the color of exons
font_size: the size of transcript id, name
show_gene: whether to show gene name/id
show_id: show gene id or gene name
exon_width: the height of exons
show_exon_id: whether to show exon id
theme: the build-in theme, including blank, ticks, ticks_blank

Returns Plot

add_density

add a density plot into track

 def add_density(self,
                path: str,
                category: str = "bam",
                size_factor=None,

                # file loading parameters
                label: Union[str, List[str]] = "",
                title: str = "",
                barcode: str = "",
                barcode_groups: Dict[str, Set[str]] = None,
                barcode_tag: str = "BC",
                umi_tag: str = "UB",
                library: str = "fru",
                density_by_strand: bool = False,

                # plotting parameters
                color="blue",
                font_size: int = 8,
                show_junction_number: bool = True,
                junction_number_font_size: int = 5,
                n_y_ticks: int = 4,
                show_y_label: bool = True,
                y_label: str = "",
                theme: str = "ticks_blank",
                log_trans: Optional[str] = None,

                # site plot parameters
                show_site_plot: bool = False,
                strand_choice: Optional[str] = None,

                only_customized_junction: bool = False
                )

path: the path to input file
category: the input file type
size_factor: the total reads of bam or total fragments of fragments.tsv
label: the label of input file
title: the title of input file
barcode: key of barcode barcode_groups
barcode_groups: dict contains barcodes by groups; key -> Set[str]
barcode_tag: cell barcode tag
umi_tag: umi barcode tag
library: fru => fr-unstrand; frf => fr-firststrand; frs => fr-secondstrand
density_by_strand: whether to draw density plot in strand-specific manner.
color: color for this density plot
font_size: the font size for ticks, y-axis label and title
show_junction_number: whether to show the number of junctions
junction_number_font_size: the size of numbers in junctions
distance_between_label_axis: distance between y-axis label and y-axis ticks
n_y_ticks: number of y ticks
show_y_label: whether to show y-axis label
y_label: the text of y-axis title
theme: the build-in theme, including blank, ticks, ticks_blank
log_trans: Whether to perform log transformation, 0 -> not log transform;2 -> log2;10 -> log10
show_site_plot: draw the density distribution of reads from different strand
strand_choice: the strand to draw on site plot
only_customized_junction: only work with bam files, only draw customized junctions

Returns Plot

add_heatmap

add a heatmap based on a group of objects into track

def add_heatmap(self,
                path: str,
                group: str = "",
                category: str = "bam",
                size_factor=None,

                # file loading parameters
                label: Union[str, List[str]] = "",
                title: str = "",
                barcode: str = "",
                barcode_groups: Dict[str, Set[str]] = None,
                barcode_tag: str = "BC",
                umi_tag: str = "UB",
                library: str = "fru",

                # plotting parameters
                color="viridis",
                font_size: int = 8,
                show_y_label: bool = True,
                theme: str = "ticks_blank",
                do_scale: bool = False,
                clustering: bool = False,
                clustering_method: str = "ward",
                distance_metric: str = "euclidean",
                show_row_names: bool = False,
                vmin=None, vmax=None,
                log_trans: Optional[str] = None)

path: path to input files
group: the heatmap group
category: file category corresponding to input file
size_factor: Total number of reads for bam file or total number of fragments required by scATAC
label: the label of input file
title: the title of input file
barcode: key of barcode barcode_groups
barcode_groups: dict contains barcodes by groups; key -> Set[str]
barcode_tag: cell barcode tag
umi_tag: umi barcode tag
library: fru => fr-unstrand; frf => fr-firststrand; frs => fr-secondstrand
color: color for this density plot
font_size: the font size in plot
show_y_label: whether to show y-axis label
theme: the build-in theme, including blank, ticks, ticks_blank
do_scale: whether to scale the matrix
clustering: whether reorder matrix by clustering
clustering_method: same as scipy.cluster.hierarchy.linkage
distance_metric: same as scipy.spatial.distance.pdist
color: used for seaborn.heatmap, see: https://matplotlib.org/3.5.1/tutorials/colors/colormaps.html
normally including: 'binary', 'gist_yarg', 'gist_gray', 'gray', 'bone', 'pink', 'spring', 'summer', 'autumn', 'winter', 'cool', 'Wistia', 'hot', 'afmhot', 'gist_heat', 'copper'
show_row_names: show rownames along heatmap
vmin: Values to anchor the colormap, otherwise they are inferred from the data and other keyword arguments.
vmax: Values to anchor the colormap, otherwise they are inferred from the data and other keyword arguments.
log_trans: Whether to perform log transformation, 0 -> not log transform;2 -> log2;10 -> log10

Returns Plot

add_line

add a line plot based on a group of objects into track

def add_line(self,
             path: str,
             group: str = "",
             category: str = "bam",

             # file loading parameters
             label: Union[str, List[str]] = "",
             title: str = "",
             barcode: str = "",
             barcode_groups: Dict[str, Set[str]] = None,
             barcode_tag: str = "BC",
             umi_tag: str = "UB",
             library: str = "fru",

             # plotting parameters
             color="blue",
             font_size: int = 8,
             show_y_label: bool = True,
             line_attrs: Optional[Dict] = None,
             theme: str = "ticks_blank",
             n_y_ticks: int = 4,
             show_legend: bool = False,
             legend_position: str = "upper right",
             legend_ncol: int = 0,
             log_trans: Optional[str] = None,
             )

path: path to input files
group: the line group
category: file category corresponding to input file
label: the label of input file
title: the title of input file
barcode: key of barcode barcode_groups
barcode_groups: dict contains barcodes by groups; key -> Set[str]
barcode_tag: cell barcode tag
umi_tag: umi barcode tag
library: fru => fr-unstrand; frf => fr-firststrand; frs => fr-secondstrand
color: color for this line
font_size: font size in this plot
show_y_label: whether to show y-axis label
line_attrs: the additional attributes to control the line, usd by matpltolib.axes.Axes.plot
theme: the build-in theme, including blank, ticks, ticks_blank
n_y_ticks: number of y ticks
show_legend: whether to show legend
legend_position: the position of legend
legend_ncol: the number of columns in legend
log_trans: Whether to perform log transformation, 0 -> not log transform;2 -> log2;10 -> log10

Returns Plot

add_hic

def add_hic(
        self,
        path: str,
        category: str = "hic",
        label: str = "",
        color: str = "RdYlBu_r",
        log_trans: Optional[str] = None,
        tad: Optional[str] = None,
        show_legend: bool = True,
        depth: int = 30000,
        font_size: int = 8,
        n_y_ticks: int = 4,
        show_y_label: bool = True,
        theme: str = "ticks")

path: path to input files
category: file category corresponding to input file
label: the label of input file
color: color for this line
log_trans: Whether to perform log transformation, 0 -> not log transform;2 -> log2;10 -> log10
tad: The region of topologically associating domains for HiC track
show_legend: whether to show legend
depth: The depth of region of interest in HiC track
font_size: font size in this plot
n_y_ticks: number of y ticks
show_y_label: whether to show y-axis label
theme: the build-in theme, including blank, ticks, ticks_blank

add_igv

add an igv-like plot into track

def add_igv(
            self,
            path: str,
            category: str = "igv",
            label: str = "",
            exon_focus: Optional[str] = None,

            # file loading parameters
            library: str = "fru",
            features: Optional[dict] = None,
            deletion_ignore: Optional[int] = True,
            del_ratio_ignore: float = .5,

            # plotting parameters
            exon_color: Optional[str] = None,
            intron_color: Optional[str] = None,
            feature_color: Optional[str] = None,
            exon_width: float = .3,
            font_size: int = 8,
            n_y_ticks: int = 1,
            show_y_label: bool = True,
            theme: str = "ticks_blank"
)

path: path to input files
category: file category for the input file
label: the y-axis title of igv plot
exon_focus: exon to focus, like start1-end1,start2-end2
library: fru => fr-unstrand; frf => fr-firststrand; frs => fr-secondstrand
features: additional genomic features to show in igv plot, support m6a and polyA length from bam tag. like {"m6a": "ma", "polya": "pa", "real_strand": "rs"}
deletion_ignore: ignore the deletion length
del_ratio_ignore: ignore the deletion length which calculated by mapped length * ratio
exon_color: the color of drawn exons
intron_color: the color of drawn introns
feature_color: the color of additional features
exon_width: the width of exons
font_size: the font size of igv plot
n_y_ticks: the number of y ticks of igv plot
show_y_label: whether to show y-axis title
theme: the build-in theme, including blank, ticks, ticks_blank

Returns Plot

add_sites

add multiple highlight sites into track

def add_sites(self, sites)

highlight specific sites

sites: string in 100,200 format or int

Returns Plot

add_focus

add multiple highlight background into track

def add_focus(self, focus: Optional[str], start: int = 0, end: int = 0)

set focus region - focus: string in 100-200:300-400 - start: start site - end: end site

Returns Plot

add_stroke

add multiple highlight region under transcripts into track

def add_stroke(
                self,
                stroke: Optional[str] = None,
                start: int = 0,
                end: int = 0,
                label: str = "",
                color: str = "black"
)

stroke: string format of stroke, eg: 100-200:200-300@blue
start: start position
end: end position
label: stroke label
color: the color of stroke

Returns Plot

set_sequence

display the corresponding sequence under x-axis

def set_sequence(self, fasta: str)

set sequence info - fasta: path to indexed fasta file

Returns Plot

add_interval

def add_interval(self, interval: str, interval_label: str)

interval: path to local interval file
interval_label: the label of given interval

Returns Plot

add_manual

draw line or density plot based on manually added data.

def add_manual(self, 
               data: np.array, 
               image_type: str = "line", 
               label: str = "",
               group: str = "",
               color: str = "blue", 
               font_size: int = 8, 
               n_y_ticks: int = 1, 
               show_y_label: bool = True, 
               theme: str = "ticks_blank",)

data: the manual data object, should be 1D np.array with same length of target region
image_type: the plotting type, one of line [default], density, heatmap
label: the label of given manual data
group: the group of given data, used for line or heatmap plot
color: the color of given data
font_size: the font size of this plot
n_y_ticks: the number of y ticks to show
show_y_label: whether to show y label
theme: the build-in theme, including blank, ticks, ticks_blank

Returns Plot

add_motif

draw motif based on bedGraph file

def add_motif(self, 
              path: str, 
              category: str = "motif", 
              motif_region: GenomicLoci = None,
              width: float = 0.8, 
              theme: str = "blank")

path: the path to tabix indexed bedGraph file, first 3 columns is chromosome, start and end site, the rest 4 columns is scores for ATCG.
category: file category for the input file
motif_region: to specify the position of motif
width: the width of characters
theme: the build-in theme, including blank, ticks, ticks_blank

Returns Plot

merge_by_cell

def merge_by_cell(self)

This is used to merge input files by label, for instance,

p = Plot()
p.add_density('bam', label="cell1")
p.add_density('bam', label="cell2")
p.add_density('bam1', label="cell1")

# After that, 3 plot obj saved in p.plots

p.merge_by_cell()
# the bam and bam1 with label == 'cell1' merged into one

plot

save/show the final image

def plot(
        self,
        output: Optional[str] = None,
        dpi: int = 300,
        width: Union[int, float] = 0,
        height: Union[int, float] = 0,
        raster: bool = False,
        same_y: bool=False,
        remove_duplicate_umi: bool = False,
        threshold: int = 10,
        included_junctions = ["chr1:1-100"],
        fill_step: str = "post",
        return_image: Optional[str] = None,

        n_jobs: int = 1,
        normalize_format: str = "count",

        intron_scale: float=.5,
        exon_scale: float = 1,

        annotation_scale: float = .25,
        stroke_scale: float = .25,
        sc_height_ratio: Optional[Dict[str, float]] = None,
        distance_between_label_axis: float = .3,
)

output: if output is empty then show this image by plt.showfig
dpi: the dpi of saved plot
width: the width of figure, if width == 0, the let matplotlib decide the size of image
height: the height of figure, if height == 0, the let matplotlib decide the size of image
raster: plot rasterizer site plot
same_y: whether the density plots share same y-axis range
remove_duplicate_umi: drop duplicated UMIs by barcode
threshold: threshold to filter low abundance junctions
included_junctions: the list of junctions to draw, the junction should be chrom:start-end format string
fill_step: define step if the filling should be a step function, i.e. constant in between x. The value determines where the step will occur:
pre: The y value is continued constantly to the left from every x position, i.e. the interval (x[i-1], x[i]] has the value y[i].
post: The y value is continued constantly to the right from every x position, i.e. the interval [x[i], x[i+1]) has the value y[i].
mid: Steps occur half-way between the x positions.
return_image: used for interactive ui, this parameter takes png or pdf, then will return corresponding format of image in bytes array
n_jobs: the number of processes to use while loading data, recommended for huge number or size of input files
normalize_format: used to normalized input data, should be one of count[default], cpm or rpkm, only worked for bam file
intron_cale: used to control the plotting scale of introns, the introns only half size by default
exon_cale: used to control the plotting scale of exons, the introns only half size by default
annotation_scale: to adjust the max size of annotation plot, the annotations only occupy at most 1/4 of figure height by default
stroke_scale: to adjust the max size of stroke plot, the annotations only occupy at most 1/4 of figure height by default
sc_height_ratio: to adjust the relative height of single cell related plots, including single cell density and heatmap
distance_between_label_axis: to adjust the distance between y-axis label and y-axis ticks

Returns None or io.BytesIO

Theme

We have three build-in themes used for different kind of plot

blank: disable axis and ticks, normally used for transcripts, strokes and links at the bottom
ticks: disable top and right axis, used for the last plot under transcripts
ticks_blank: disable the top, right and bottom axis, remove x-axis ticks, used for most plots