Interactive Usage

A detailed notebook tutorial is available here.

Example Usage

The interactive API can be used through ordinary function calls or by chaining the calls into a processing pipeline.

The Jupyter notebook for the following example is provided on GitHub.

1. init plot
from trackplot.plot import Plot

# logfile: save progress logs to file, default: None [stdout]
# backend: switch the matplotlib backend, default: agg
# font_family: change the font family in trackplot, default: None [system default font family]
plot = Plot(logfile=None, backend="agg", font_family=None)
2. set annotation
plot.set_annotation(
    "../example/example.sorted.gtf.gz",                                             # path to gtf file
    add_domain=True,                                                                # whether add domain information into annotation track
    interval="../example/PolyASite.chr1.atlas.clusters.2.0.GRCh38.96.bed.gz",       # path to list of interval files in bed format, 1st column is path to file, 2nd column is the label
    interval_label="polyA",                                                         # the label of added interval
    show_gene=True,                                                                 # show gene id
    color="pink",                                                                   # the color of exons
)
3. setup plotting parameters
plot.add_interval(
    interval="../example/PolyASite.chr1.atlas.clusters.2.0.GRCh38.96.simple.bed.gz",          # path to list of interval files in bed format, 1st column is path to file, 2nd column is the label
    interval_label="polyAS"                                                                   # the label of added interval
).set_region(
    chromosome="chr1", start=1270656, end=1284730, strand="+"                                 # chromosome, start site, end site and strand were required
).add_density(
    path="../example/bams/1.bam",                                                             # path to input file
    category="bam",                                                                           # the category of given file
    color="blue",                                                                             # color of this density
    show_site_plot=True,                                                                      # whether to show site plot
).add_density(                                                                                # another density plot
    path="../example/bws/2.bw",
    category="bw",
    color="green"
).add_line(
    path="../example/bams/1.bam",                                                             # path to input file
    category="bam",                                                                           # the category of given file
    group="1",                                                                                # the group of this line, used to control color etc.
    color="blue",                                                                             # color of this line and of the other lines that belong to group 1
    line_attrs={"lw": 3}                                                                      # additional parameters in dict to control the layout of line
).add_line(                                                                                   # another line
    path="../example/bams/2.bam",
    category="bam",
    group="2",
    color="red",
    line_attrs={"linestyle": "dashed"}                                                        # additional parameters in dict to control the layout of line, for instance this changes the line style
).add_heatmap(
    path="../example/bams/1.bam",                                                             # path to input file
    category="bam",                                                                           # the category of given file
    group="1",                                                                                # the group of this file, only files belonging to the same group are drawn in the same heatmap
).add_heatmap(                                                                                # another file to heatmap 1
    path="../example/bams/2.bam",
    category="bam",
    group="1"
).add_igv(
    path="../example/bams/3.bam",                                                             # path to input file
    features={
        "m6a": "ma",
        "real_strand": "rs",
        "polya": "pa"
    },
    category="igv",                                                                           # the category of given file
    label="igv"                                                                               # the label of this plot
).add_igv(
    path="../example/SRX9697989.corrected_reads.bed.gz",
    category="igv",
    label="bed12"
).add_sites(
    1270656 + 1000                                                                            # the highlight site
).add_sites(
    1270656 + 1000                                                                            # a repeated highlight site is shown in a different color from a normal highlight site
).add_sites(
    1270656 + 2000
).add_focus(
    f"{1270656 + 2000}-{1270656 + 3000}"                                                      # the focus region in start_site-end_site format
).add_focus(
    f"{1270656 + 5000}-{1270656 + 7000}"
).add_stroke(
    f"{1270656 + 5000}-{1270656 + 7000}:{1270656 + 7200}-{1270656 + 8000}@blue"               # the stroke in start_site-end_site:start_site-end_site@color format, this adds 2 strokes and the last one will be blue
).add_stroke(                                                                                 # add stroke with named parameters
    start=1270656 + 7500,                                                                     # the start site of added stroke
    end=1270656 + 8200,                                                                       # the end site of added stroke
    color="green",                                                                            # the color of added stroke
    label="test"                                                                              # the label of added stroke
)
4. save figures
plot.plot("test_plot.png", width=6, height=1, raster=True)
# save the plot to test_plot.png with the given width and height; raster=True reduces the number of layers in pdf or svg output
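
The chain in step 3 is possible because each configuration method returns the Plot object (see the "Returns Plot" notes in the API documentation below). The following is a minimal sketch of that pattern; MiniPlot is a hypothetical stand-in written for illustration, not the trackplot implementation. It shows that the call-by-call style and the chained style record the same steps.

```python
class MiniPlot:
    """Hypothetical stand-in for illustration; not the trackplot implementation."""

    def __init__(self):
        self.steps = []

    def set_region(self, chromosome, start, end, strand="+"):
        self.steps.append(("region", chromosome, start, end, strand))
        return self  # returning self is what makes chaining possible

    def add_sites(self, sites):
        self.steps.append(("sites", sites))
        return self


# Style 1: ordinary function calls, one per statement
a = MiniPlot()
a.set_region("chr1", 1270656, 1284730)
a.add_sites(1270656 + 1000)

# Style 2: the same calls written as a chain
b = MiniPlot().set_region("chr1", 1270656, 1284730).add_sites(1270656 + 1000)

print(a.steps == b.steps)
```

Either style can be mixed freely, for example chaining the track definitions and calling plot() as a separate statement, as the example above does.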

API documentation

set_region

set or change the region to plot

def set_region(self, chromosome: str, start: int, end: int, strand: str = "+")
  • chromosome: the chromosome of given region
  • start: the start site of given region
  • end: the end site of given region
  • strand: the strand of given region [optional]

Returns Plot


set_annotation

add transcripts into track

def set_annotation(self, gtf: str,
                  add_domain: bool = False,
                  local_domain: Optional[str] = False,
                  domain_include: Optional[str] = False,
                  domain_exclude: Optional[str] = False,
                  interval: Optional[str] = None,
                  interval_label: Optional[str] = None,
                  transcripts: Optional[List[str]] = None,
                  remove_empty_transcripts: bool = False,
                  choose_primary: bool = False,
                  color: Optional[str] = "black",

                  # transcripts related parameters
                  font_size: int = 5,
                  show_gene: bool = False,
                  show_id: bool = False,
                  exon_width: float = .3,
                  show_exon_id: bool = False,
                  theme: str = "blank"
                  )
  • gtf: path to gtf file
  • add_domain: whether to add domain
  • local_domain: whether to add domain information to the annotation track
  • domain_include: Which domain will be included in annotation plot
  • domain_exclude: Which domain will be excluded in annotation plot
  • interval: path to list of interval files in bed format, 1st column is path to file, 2nd column is the label
  • interval_label: the label of added interval
  • transcripts: the list of name or ids of transcripts to draw
  • remove_empty_transcripts: whether to remove transcripts without any exons
  • choose_primary: Whether choose primary transcript to plot
  • color: the color of exons
  • font_size: the size of transcript id, name
  • show_gene: whether to show gene name/id
  • show_id: show gene id or gene name
  • exon_width: the height of exons
  • show_exon_id: whether to show exon id
  • theme: the built-in theme, one of blank, ticks, ticks_blank

Returns Plot


add_density

add a density plot into track

 def add_density(self,
                path: str,
                category: str = "bam",
                size_factor=None,

                # file loading parameters
                label: Union[str, List[str]] = "",
                title: str = "",
                barcode: str = "",
                barcode_groups: Dict[str, Set[str]] = None,
                barcode_tag: str = "BC",
                umi_tag: str = "UB",
                library: str = "fru",
                density_by_strand: bool = False,

                # plotting parameters
                color="blue",
                font_size: int = 8,
                show_junction_number: bool = True,
                junction_number_font_size: int = 5,
                n_y_ticks: int = 4,
                show_y_label: bool = True,
                y_label: str = "",
                theme: str = "ticks_blank",
                log_trans: Optional[str] = None,

                # site plot parameters
                show_site_plot: bool = False,
                strand_choice: Optional[str] = None,

                only_customized_junction: bool = False
                )
  • path: the path to input file
  • category: the input file type
  • size_factor: the total reads of bam or total fragments of fragments.tsv
  • label: the label of input file
  • title: the title of input file
  • barcode: the key of barcode_groups to use
  • barcode_groups: dict contains barcodes by groups; key -> Set[str]
  • barcode_tag: cell barcode tag
  • umi_tag: umi barcode tag
  • library: fru => fr-unstranded; frf => fr-firststrand; frs => fr-secondstrand
  • density_by_strand: whether to draw density plot in strand-specific manner.
  • color: color for this density plot
  • font_size: the font size for ticks, y-axis label and title
  • show_junction_number: whether to show the number of junctions
  • junction_number_font_size: the size of numbers in junctions
  • distance_between_label_axis: distance between y-axis label and y-axis ticks
  • n_y_ticks: number of y ticks
  • show_y_label: whether to show y-axis label
  • y_label: the text of y-axis title
  • theme: the built-in theme, one of blank, ticks, ticks_blank
  • log_trans: whether to perform log transformation, 0 -> no log transformation; 2 -> log2; 10 -> log10
  • show_site_plot: draw the density distribution of reads from different strand
  • strand_choice: the strand to draw on site plot
  • only_customized_junction: only work with bam files, only draw customized junctions

Returns Plot
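
For single-cell input, barcode_groups is documented above as a dict that maps a group key to a set of barcodes, and barcode selects one key of that dict. The sketch below shows one way to build such a dict from (barcode, group) pairs; the helper name build_barcode_groups, the barcodes and the group names are all hypothetical and only serve as illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_barcode_groups(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Collect barcodes into one set per group key."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for barcode, group in pairs:
        groups[group].add(barcode)
    return dict(groups)


# Hypothetical barcodes and cluster names, for illustration only
pairs = [("AAACCTGA", "T_cell"), ("AAACGGGT", "B_cell"), ("AAAGATGC", "T_cell")]
barcode_groups = build_barcode_groups(pairs)
print(sorted(barcode_groups))
```

The result could then be passed as barcode_groups=barcode_groups together with barcode="T_cell". How barcode strings are matched against the BAM tag is not covered by this sketch.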


add_heatmap

add a heatmap based on a group of objects into track

def add_heatmap(self,
                path: str,
                group: str = "",
                category: str = "bam",
                size_factor=None,

                # file loading parameters
                label: Union[str, List[str]] = "",
                title: str = "",
                barcode: str = "",
                barcode_groups: Dict[str, Set[str]] = None,
                barcode_tag: str = "BC",
                umi_tag: str = "UB",
                library: str = "fru",

                # plotting parameters
                color="viridis",
                font_size: int = 8,
                show_y_label: bool = True,
                theme: str = "ticks_blank",
                do_scale: bool = False,
                clustering: bool = False,
                clustering_method: str = "ward",
                distance_metric: str = "euclidean",
                show_row_names: bool = False,
                vmin=None, vmax=None,
                log_trans: Optional[str] = None)
  • path: path to input files
  • group: the heatmap group
  • category: file category corresponding to input file
  • size_factor: Total number of reads for bam file or total number of fragments required by scATAC
  • label: the label of input file
  • title: the title of input file
  • barcode: the key of barcode_groups to use
  • barcode_groups: dict contains barcodes by groups; key -> Set[str]
  • barcode_tag: cell barcode tag
  • umi_tag: umi barcode tag
  • library: fru => fr-unstranded; frf => fr-firststrand; frs => fr-secondstrand
  • color: the colormap for this heatmap
  • font_size: the font size in plot
  • show_y_label: whether to show y-axis label
  • theme: the built-in theme, one of blank, ticks, ticks_blank
  • do_scale: whether to scale the matrix
  • clustering: whether to reorder the matrix by clustering
  • clustering_method: same as scipy.cluster.hierarchy.linkage
  • distance_metric: same as scipy.spatial.distance.pdist
  • color: the colormap name passed to seaborn.heatmap, see: https://matplotlib.org/3.5.1/tutorials/colors/colormaps.html
    common choices include: 'binary', 'gist_yarg', 'gist_gray', 'gray', 'bone', 'pink', 'spring', 'summer', 'autumn', 'winter', 'cool', 'Wistia', 'hot', 'afmhot', 'gist_heat', 'copper'
  • show_row_names: whether to show row names along the heatmap
  • vmin: Values to anchor the colormap, otherwise they are inferred from the data and other keyword arguments.
  • vmax: Values to anchor the colormap, otherwise they are inferred from the data and other keyword arguments.
  • log_trans: whether to perform log transformation, 0 -> no log transformation; 2 -> log2; 10 -> log10

Returns Plot


add_line

add a line plot based on a group of objects into track

def add_line(self,
             path: str,
             group: str = "",
             category: str = "bam",

             # file loading parameters
             label: Union[str, List[str]] = "",
             title: str = "",
             barcode: str = "",
             barcode_groups: Dict[str, Set[str]] = None,
             barcode_tag: str = "BC",
             umi_tag: str = "UB",
             library: str = "fru",

             # plotting parameters
             color="blue",
             font_size: int = 8,
             show_y_label: bool = True,
             line_attrs: Optional[Dict] = None,
             theme: str = "ticks_blank",
             n_y_ticks: int = 4,
             show_legend: bool = False,
             legend_position: str = "upper right",
             legend_ncol: int = 0,
             log_trans: Optional[str] = None,
             )
  • path: path to input files
  • group: the line group
  • category: file category corresponding to input file
  • label: the label of input file
  • title: the title of input file
  • barcode: the key of barcode_groups to use
  • barcode_groups: dict contains barcodes by groups; key -> Set[str]
  • barcode_tag: cell barcode tag
  • umi_tag: umi barcode tag
  • library: fru => fr-unstranded; frf => fr-firststrand; frs => fr-secondstrand
  • color: color for this line
  • font_size: font size in this plot
  • show_y_label: whether to show y-axis label
  • line_attrs: additional attributes to control the line, used by matplotlib.axes.Axes.plot
  • theme: the built-in theme, one of blank, ticks, ticks_blank
  • n_y_ticks: number of y ticks
  • show_legend: whether to show legend
  • legend_position: the position of legend
  • legend_ncol: the number of columns in legend
  • log_trans: whether to perform log transformation, 0 -> no log transformation; 2 -> log2; 10 -> log10

Returns Plot


add_hic

def add_hic(
        self,
        path: str,
        category: str = "hic",
        label: str = "",
        color: str = "RdYlBu_r",
        log_trans: Optional[str] = None,
        tad: Optional[str] = None,
        show_legend: bool = True,
        depth: int = 30000,
        font_size: int = 8,
        n_y_ticks: int = 4,
        show_y_label: bool = True,
        theme: str = "ticks")
  • path: path to input files
  • category: file category corresponding to input file
  • label: the label of input file
  • color: the colormap for this HiC track
  • log_trans: whether to perform log transformation, 0 -> no log transformation; 2 -> log2; 10 -> log10
  • tad: The region of topologically associating domains for HiC track
  • show_legend: whether to show legend
  • depth: The depth of region of interest in HiC track
  • font_size: font size in this plot
  • n_y_ticks: number of y ticks
  • show_y_label: whether to show y-axis label
  • theme: the built-in theme, one of blank, ticks, ticks_blank

add_igv

add an igv-like plot into track

def add_igv(
            self,
            path: str,
            category: str = "igv",
            label: str = "",
            exon_focus: Optional[str] = None,

            # file loading parameters
            library: str = "fru",
            features: Optional[dict] = None,
            deletion_ignore: Optional[int] = True,
            del_ratio_ignore: float = .5,

            # plotting parameters
            exon_color: Optional[str] = None,
            intron_color: Optional[str] = None,
            feature_color: Optional[str] = None,
            exon_width: float = .3,
            font_size: int = 8,
            n_y_ticks: int = 1,
            show_y_label: bool = True,
            theme: str = "ticks_blank"
)
  • path: path to input files
  • category: file category for the input file
  • label: the y-axis title of igv plot
  • exon_focus: exon to focus, like start1-end1,start2-end2
  • library: fru => fr-unstranded; frf => fr-firststrand; frs => fr-secondstrand
  • features: additional genomic features to show in the igv plot; m6a and polyA length read from bam tags are supported, e.g. {"m6a": "ma", "polya": "pa", "real_strand": "rs"}
  • deletion_ignore: ignore the deletion length
  • del_ratio_ignore: ignore deletions up to a length calculated as mapped length * ratio
  • exon_color: the color of drawn exons
  • intron_color: the color of drawn introns
  • feature_color: the color of additional features
  • exon_width: the width of exons
  • font_size: the font size of igv plot
  • n_y_ticks: the number of y ticks of igv plot
  • show_y_label: whether to show y-axis title
  • theme: the built-in theme, one of blank, ticks, ticks_blank

Returns Plot


add_sites

add multiple highlight sites into track

def add_sites(self, sites)

highlight specific sites

  • sites: string in 100,200 format or int

Returns Plot


add_focus

add multiple highlight background into track

def add_focus(self, focus: Optional[str], start: int = 0, end: int = 0)

set focus region

  • focus: string in 100-200:300-400 format
  • start: start site
  • end: end site

Returns Plot
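
The focus string joins start-end pairs with ":". As a small sketch, the hypothetical helper format_focus below (not part of trackplot) builds such a string from a list of coordinate pairs and rejects pairs whose start is not smaller than the end; the coordinates reuse the region from the example usage.

```python
from typing import Iterable, Tuple


def format_focus(regions: Iterable[Tuple[int, int]]) -> str:
    """Join (start, end) pairs into the start-end:start-end format."""
    parts = []
    for start, end in regions:
        if start >= end:
            raise ValueError(f"start must be smaller than end: {start}-{end}")
        parts.append(f"{start}-{end}")
    return ":".join(parts)


region_start = 1270656
focus = format_focus([
    (region_start + 2000, region_start + 3000),
    (region_start + 5000, region_start + 7000),
])
print(focus)  # 1272656-1273656:1275656-1277656
```

The resulting string could be passed as the first argument of add_focus, in place of the two separate add_focus calls in the example usage.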


add_stroke

add multiple highlight region under transcripts into track

def add_stroke(
                self,
                stroke: Optional[str] = None,
                start: int = 0,
                end: int = 0,
                label: str = "",
                color: str = "black"
)
  • stroke: string format of stroke, eg: 100-200:200-300@blue
  • start: start position
  • end: end position
  • label: stroke label
  • color: the color of stroke

Returns Plot
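
The stroke string uses the same start-end pairs joined by ":", with an optional @color suffix. The sketch below assumes, based on the comment in the example usage ("the last one will be blue"), that the color suffix applies to the region it is attached to; format_stroke is a hypothetical helper, not a trackplot function.

```python
from typing import Iterable, Optional, Tuple


def format_stroke(regions: Iterable[Tuple[int, int, Optional[str]]]) -> str:
    """Join (start, end, color) triples into start-end[@color] parts separated by ':'."""
    parts = []
    for start, end, color in regions:
        part = f"{start}-{end}"
        if color:  # assumption: the @color suffix belongs to this region only
            part += f"@{color}"
        parts.append(part)
    return ":".join(parts)


stroke = format_stroke([(100, 200, None), (200, 300, "blue")])
print(stroke)  # 100-200:200-300@blue
```

When a label is needed, the named parameters (start, end, color, label) shown in the signature above are the more direct route, since the string format has no documented slot for a label.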


set_sequence

display the corresponding sequence under x-axis

def set_sequence(self, fasta: str)

set sequence info

  • fasta: path to indexed fasta file

Returns Plot


add_interval

def add_interval(self, interval: str, interval_label: str)
  • interval: path to local interval file
  • interval_label: the label of given interval

Returns Plot


add_manual

draw line or density plot based on manually added data.

def add_manual(self, 
               data: np.array, 
               image_type: str = "line", 
               label: str = "",
               group: str = "",
               color: str = "blue", 
               font_size: int = 8, 
               n_y_ticks: int = 1, 
               show_y_label: bool = True, 
               theme: str = "ticks_blank",)
  • data: the manual data object, should be a 1D np.array with the same length as the target region
  • image_type: the plotting type, one of line [default], density, heatmap
  • label: the label of given manual data
  • group: the group of given data, used for line or heatmap plot
  • color: the color of given data
  • font_size: the font size of this plot
  • n_y_ticks: the number of y ticks to show
  • show_y_label: whether to show y label
  • theme: the built-in theme, one of blank, ticks, ticks_blank

Returns Plot
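
Because data must have the same length as the target region, it helps to derive the length from the region coordinates. The sketch below prepares placeholder values with the standard library only; it assumes the region length is counted as end - start + 1 (both ends inclusive), which should be checked against the library, and make_signal is a hypothetical helper.

```python
import math
from typing import List


def make_signal(start: int, end: int) -> List[float]:
    """Return one placeholder value per position of the region."""
    length = end - start + 1  # assumption: start and end are both inclusive
    return [math.sin(i / 500) + 1 for i in range(length)]


values = make_signal(1270656, 1284730)
print(len(values))  # 14075
```

The list would still need to be converted before use, for instance with np.array(values), and then passed as data to add_manual.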


add_motif

draw motif based on bedGraph file

def add_motif(self, 
              path: str, 
              category: str = "motif", 
              motif_region: GenomicLoci = None,
              width: float = 0.8, 
              theme: str = "blank")
  • path: the path to a tabix-indexed bedGraph file; the first 3 columns are chromosome, start and end site, the remaining 4 columns are the scores for ATCG
  • category: file category for the input file
  • motif_region: to specify the position of motif
  • width: the width of characters
  • theme: the built-in theme, one of blank, ticks, ticks_blank

Returns Plot


merge_by_cell

def merge_by_cell(self)

This is used to merge input files by label, for instance,

p = Plot()
p.add_density('bam', label="cell1")
p.add_density('bam', label="cell2")
p.add_density('bam1', label="cell1")

# After that, 3 plot objects are saved in p.plots

p.merge_by_cell()
# bam and bam1, which share label == 'cell1', are merged into one
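
The effect of the merge can be pictured as grouping inputs by label. The sketch below is only an illustration of that grouping with plain Python data; merge_by_label is hypothetical and does not reproduce how trackplot stores or merges its plot objects.

```python
from typing import Dict, Iterable, List, Tuple


def merge_by_label(tracks: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group input paths by label, keeping first-seen label order."""
    merged: Dict[str, List[str]] = {}
    for path, label in tracks:
        merged.setdefault(label, []).append(path)
    return merged


tracks = [("bam", "cell1"), ("bam", "cell2"), ("bam1", "cell1")]
merged = merge_by_label(tracks)
print(merged)  # {'cell1': ['bam', 'bam1'], 'cell2': ['bam']}
```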

plot

save/show the final image

def plot(
        self,
        output: Optional[str] = None,
        dpi: int = 300,
        width: Union[int, float] = 0,
        height: Union[int, float] = 0,
        raster: bool = False,
        same_y: bool=False,
        remove_duplicate_umi: bool = False,
        threshold: int = 10,
        included_junctions = ["chr1:1-100"],
        fill_step: str = "post",
        return_image: Optional[str] = None,

        n_jobs: int = 1,
        normalize_format: str = "count",

        intron_scale: float=.5,
        exon_scale: float = 1,

        annotation_scale: float = .25,
        stroke_scale: float = .25,
        sc_height_ratio: Optional[Dict[str, float]] = None,
        distance_between_label_axis: float = .3,
)
  • output: path to the output file; if output is empty, the image is shown with plt.show
  • dpi: the dpi of the saved plot
  • width: the width of the figure; if width == 0, then let matplotlib decide the size of the image
  • height: the height of the figure; if height == 0, then let matplotlib decide the size of the image
  • raster: whether to rasterize the site plot
  • same_y: whether the density plots share same y-axis range
  • remove_duplicate_umi: drop duplicated UMIs by barcode
  • threshold: threshold to filter low abundance junctions
  • included_junctions: the list of junctions to draw, the junction should be chrom:start-end format string
  • fill_step: define step if the filling should be a step function, i.e. constant in between x. The value determines where the step will occur:
    • pre: The y value is continued constantly to the left from every x position, i.e. the interval (x[i-1], x[i]] has the value y[i].
    • post: The y value is continued constantly to the right from every x position, i.e. the interval [x[i], x[i+1]) has the value y[i].
    • mid: Steps occur half-way between the x positions.
  • return_image: used for the interactive UI; takes png or pdf and returns the image in that format as a bytes array
  • n_jobs: the number of processes to use while loading data, recommended for a large number or size of input files
  • normalize_format: used to normalize input data, should be one of count [default], cpm or rpkm; only works for bam files
  • intron_scale: used to control the plotting scale of introns, introns are drawn at half size by default
  • exon_scale: used to control the plotting scale of exons, exons are drawn at full size by default
  • annotation_scale: to adjust the max size of the annotation plot, the annotations occupy at most 1/4 of the figure height by default
  • stroke_scale: to adjust the max size of the stroke plot, the strokes occupy at most 1/4 of the figure height by default
  • sc_height_ratio: to adjust the relative height of single cell related plots, including single cell density and heatmap
  • distance_between_label_axis: to adjust the distance between y-axis label and y-axis ticks

Returns None or io.BytesIO
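
For normalize_format, cpm and rpkm are standard read-count normalizations. The sketch below writes out the usual textbook formulas; the function names are hypothetical, and whether trackplot computes them per position, per region or with a different total is not stated here, so treat this as a reference for the units rather than as the library's implementation.

```python
def cpm(count: float, total_reads: int) -> float:
    """Counts per million: scale a raw count by the library size."""
    return count * 1_000_000 / total_reads


def rpkm(count: float, total_reads: int, region_length: int) -> float:
    """Reads per kilobase per million: also scale by region length in bases."""
    return count * 1_000_000_000 / (total_reads * region_length)


print(cpm(50, 2_000_000))          # 25.0
print(rpkm(50, 2_000_000, 1_000))  # 25.0
```

In the second call the region is exactly 1 kb long, so rpkm and cpm give the same value.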


Theme

There are three built-in themes, used for different kinds of plots

  • blank: disable axis and ticks, normally used for transcripts, strokes and links at the bottom
  • ticks: disable top and right axis, used for the last plot under transcripts
  • ticks_blank: disable the top, right and bottom axis, remove x-axis ticks, used for most plots
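
The three themes can be read as a table of which axis spines stay visible. The sketch below encodes that reading and applies it to any matplotlib Axes object; THEME_SPINES and apply_theme are hypothetical names, and the mapping is an interpretation of the descriptions above rather than trackplot's own code.

```python
# Which spines remain visible under each theme (interpretation of the list above)
THEME_SPINES = {
    "blank": {"top": False, "right": False, "bottom": False, "left": False},
    "ticks": {"top": False, "right": False, "bottom": True, "left": True},
    "ticks_blank": {"top": False, "right": False, "bottom": False, "left": True},
}


def apply_theme(ax, theme: str) -> None:
    """Hide spines and the matching ticks on a matplotlib Axes."""
    visible = THEME_SPINES[theme]
    for side, shown in visible.items():
        ax.spines[side].set_visible(shown)
    if not visible["bottom"]:
        ax.tick_params(bottom=False, labelbottom=False)  # remove x-axis ticks
    if not visible["left"]:
        ax.tick_params(left=False, labelleft=False)  # remove y-axis ticks
```

With matplotlib installed, this could be tried as fig, ax = plt.subplots() followed by apply_theme(ax, "ticks_blank").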