
categoryCompare2: Alternative Visualization

Authored by: Robert M Flight <rflight79@gmail.com> on 2024-10-31 10:39:14.08975

Introduction

Current high-throughput molecular biology experiments generate ever larger amounts of data. Although there are many methods for analyzing individual experiments, methods that allow the comparison of different data sets are sorely lacking. This matters because many of the experiments carried out on biological systems may be amenable to either fusion or comparison. Most of the currently available tools focus on finding the genes that are reported in more than one experiment, or on showing that a gene's appearance in the results of both experiments is statistically significant.

However, what many of these tools do not do is consider the similarities (and just as importantly, the differences) between experimental results at the categorical level. Categorical data includes any gene annotation, such as Gene Ontology terms, KEGG pathways, chromosome location, etc. categoryCompare has been developed to allow the comparison of high-throughput experiments at a categorical level, and to explore those results in an intuitive fashion.

Sample Data

To make the concept more concrete, we will examine data from the microarray data set estrogen, available from Bioconductor. This data set contains 8 samples, with two levels of estrogen therapy (present vs absent) and two time points (10 and 48 hours). A pre-processed version of the data is available with this package; the commands used to generate it are below. Note: the pre-processed version keeps only the top 100 genes, so if you use it the results will differ slightly from those shown in the vignette.

library("affy")
library("hgu95av2.db")
library("genefilter")
library("estrogen")
library("limma")
datadir <- system.file("extdata", package = "estrogen")
pd <- read.AnnotatedDataFrame(file.path(datadir,"estrogen.txt"), 
    header = TRUE, sep = "", row.names = 1)
pData(pd)
##              estrogen time.h
## low10-1.cel    absent     10
## low10-2.cel    absent     10
## high10-1.cel  present     10
## high10-2.cel  present     10
## low48-1.cel    absent     48
## low48-2.cel    absent     48
## high48-1.cel  present     48
## high48-2.cel  present     48

Here you can see the descriptions for each of the arrays. First, we will read in the CEL files, and then normalize the data using RMA.

currDir <- getwd()
setwd(datadir)
a <- ReadAffy(filenames=rownames(pData(pd)), phenoData = pd, verbose = TRUE)
## 1 reading low10-1.cel ...instantiating an AffyBatch (intensity a 409600x8 matrix)...done.
## Reading in : low10-1.cel
## Reading in : low10-2.cel
## Reading in : high10-1.cel
## Reading in : high10-2.cel
## Reading in : low48-1.cel
## Reading in : low48-2.cel
## Reading in : high48-1.cel
## Reading in : high48-2.cel
setwd(currDir)
eData <- rma(a)
## Warning: replacing previous import 'AnnotationDbi::tail' by 'utils::tail' when
## loading 'hgu95av2cdf'
## Warning: replacing previous import 'AnnotationDbi::head' by 'utils::head' when
## loading 'hgu95av2cdf'
## Background correcting
## Normalizing
## Calculating Expression

To make it easier to conceptualize, we will split the data up into two eSet objects by time, and perform all of the manipulations for calculating significantly differentially expressed genes on each eSet object.

So for the 10 hour samples:

e10 <- eData[, eData$time.h == 10]
e10 <- nsFilter(e10, remove.dupEntrez=TRUE, var.filter=FALSE, 
        feature.exclude="^AFFX")$eset

e10$estrogen <- factor(e10$estrogen)
d10 <- model.matrix(~0 + e10$estrogen)
colnames(d10) <- levels(e10$estrogen)  # design columns follow factor level order
fit10 <- lmFit(e10, d10)
c10 <- makeContrasts(present - absent, levels=d10)
fit10_2 <- contrasts.fit(fit10, c10)
eB10 <- eBayes(fit10_2)
table10 <- topTable(eB10, number=nrow(e10), p.value=1, adjust.method="BH")
table10$Entrez <- unlist(mget(rownames(table10), hgu95av2ENTREZID, ifnotfound=NA))

And for the 48 hour samples we do the same thing:

e48 <- eData[, eData$time.h == 48]
e48 <- nsFilter(e48, remove.dupEntrez=TRUE, var.filter=FALSE, 
        feature.exclude="^AFFX" )$eset

e48$estrogen <- factor(e48$estrogen)
d48 <- model.matrix(~0 + e48$estrogen)
colnames(d48) <- levels(e48$estrogen)  # design columns follow factor level order
fit48 <- lmFit(e48, d48)
c48 <- makeContrasts(present - absent, levels=d48)
fit48_2 <- contrasts.fit(fit48, c48)
eB48 <- eBayes(fit48_2)
table48 <- topTable(eB48, number=nrow(e48), p.value=1, adjust.method="BH")
table48$Entrez <- unlist(mget(rownames(table48), hgu95av2ENTREZID, ifnotfound=NA))

Finally, we take all the genes on the array as the background set.

gUniverse <- unique(union(table10$Entrez, table48$Entrez))

For both time points we have generated a list of genes that are differentially expressed in the present vs absent samples. To compare the time-points, we could find the common and discordant genes from both experiments, and then try to interpret those lists. This is commonly done in many meta-analysis studies that attempt to combine the results of many different experiments.
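
As a minimal sketch of that gene-level comparison, assuming two small made-up vectors of Entrez IDs (`toy_g10` and `toy_g48` are hypothetical and not taken from the estrogen results), base R set operations give the common and discordant genes:

```r
# Hypothetical Entrez IDs standing in for two significant gene lists
toy_g10 <- c("332", "983", "3832", "701")
toy_g48 <- c("332", "983", "991", "5347")

# Genes called significant at both time points
common_genes <- intersect(toy_g10, toy_g48)

# Genes called significant at only one time point
only_10 <- setdiff(toy_g10, toy_g48)
only_48 <- setdiff(toy_g48, toy_g10)

list(common = common_genes, only_10 = only_10, only_48 = only_48)
```

Interpreting these three lists gene by gene is the step that quickly becomes unwieldy as the lists grow.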

An alternative approach, used in categoryCompare, is to compare the significantly enriched categories from the two gene lists. Currently the package supports two category classes, Gene Ontology and KEGG pathways. The Gene Ontology Biological Process terms are used below.

Note: I am not proposing that this is the best way to analyse this particular data; it is a sample data set that merely serves to illustrate the functionality of this package. However, there are many experiments where this type of approach is appropriate, and it is up to the user to determine whether their data fits the analytical paradigm advocated here.

Create Gene List

library("categoryCompare2")
library("GO.db")
library("org.Hs.eg.db")

g10 <- unique(table10$Entrez[table10$adj.P.Val < 0.05])
g48 <- unique(table48$Entrez[table48$adj.P.Val < 0.05])

Create GO Annotation Object

Before we can do our analysis, we need to define the annotation object, which maps the annotations to the features (genes in this case). For a Gene Ontology (GO) based analysis, this would be all the genes annotated to a particular GO term based on inheritance in the GO DAG. We can generate this list using the GOALL column of org.Hs.eg.db, and then either filter to the terms of interest or use them all.

go_all_gene <- AnnotationDbi::select(org.Hs.eg.db, keys = gUniverse, columns = c("GOALL", "ONTOLOGYALL"))
## 'select()' returned 1:many mapping between keys and columns
go_all_gene <- go_all_gene[go_all_gene$ONTOLOGYALL == "BP", ]
bp_2_gene <- split(go_all_gene$ENTREZID, go_all_gene$GOALL)

bp_2_gene <- lapply(bp_2_gene, unique)
bp_desc <- AnnotationDbi::select(GO.db, keys = names(bp_2_gene), columns = "TERM", keytype = "GOID")$TERM
## 'select()' returned 1:1 mapping between keys and columns
names(bp_desc) <- names(bp_2_gene)

bp_annotation <- categoryCompare2::annotation(annotation_features = bp_2_gene,
                            description = bp_desc,
                            annotation_type = "GO.BP")
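
The `split()` and `unique()` steps above can be illustrated on a toy table. This is only a sketch: `toy_map` is a hypothetical data frame that reuses the `ENTREZID` and `GOALL` column names, and the gene-term pairings in it are invented.

```r
# Hypothetical long-format mapping: one row per gene-term pair,
# including a repeated pair to show why unique() is applied
toy_map <- data.frame(
  ENTREZID = c("332", "983", "983", "991", "332"),
  GOALL    = c("GO:0000070", "GO:0000070", "GO:0000070",
               "GO:0007059", "GO:0007059"),
  stringsAsFactors = FALSE
)

# One list element per term, holding the genes annotated to it
toy_term_2_gene <- split(toy_map$ENTREZID, toy_map$GOALL)

# Drop repeated genes within each term
toy_term_2_gene <- lapply(toy_term_2_gene, unique)
toy_term_2_gene
```

The result is a named list keyed by term, which is the shape passed as `annotation_features` above.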

Do Enrichment

Now we can do hypergeometric enrichment with each of the gene lists.

g10_enrich <- hypergeometric_feature_enrichment(
  new("hypergeom_features", significant = g10,
      universe = gUniverse, annotation = bp_annotation),
  p_adjust = "BH"
)

g48_enrich <- hypergeometric_feature_enrichment(
  new("hypergeom_features", significant = g48,
      universe = gUniverse, annotation = bp_annotation),
  p_adjust = "BH"
)
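
To make the hypergeometric test concrete, here is a sketch of the calculation for a single annotation using base R's `phyper()`. The counts are invented, and this is the textbook over-representation test rather than a copy of the package's internal code, which may differ in detail.

```r
# Invented counts for one annotation
n_universe  <- 1000  # genes in the background set
n_annotated <- 50    # background genes annotated to the term
n_signif    <- 100   # genes in the significant list
n_overlap   <- 12    # significant genes annotated to the term

# Expected overlap if the significant genes were drawn at random
expected <- n_signif * n_annotated / n_universe

# P(overlap >= n_overlap): upper tail of the hypergeometric distribution
p_value <- phyper(n_overlap - 1, n_annotated, n_universe - n_annotated,
                  n_signif, lower.tail = FALSE)

# Benjamini-Hochberg adjustment across several (invented) term p-values
p_adjusted <- p.adjust(c(p_value, 0.02, 0.5), method = "BH")
```

Subtracting 1 from `n_overlap` matters: with `lower.tail = FALSE`, `phyper()` returns the probability of strictly more than its first argument.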

Combine and Find Significant

bp_combined <- combine_enrichments(g10 = g10_enrich,
                                  g48 = g48_enrich)
bp_sig <- get_significant_annotations(bp_combined, padjust <= 0.001, counts >= 2)
bp_sig@statistics@significant
## Signficance Cutoffs:
##   padjust <= 0.001
##   counts >= 2
## 
## Counts:
##    g10 g48 counts
## G1   1   1     72
## G2   1   0     53
## G3   0   1     48
## G4   0   0  14118
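
The counts table groups annotations by the combination of gene lists in which they pass the cutoffs. A rough base R sketch of that kind of tabulation, using an invented logical matrix (`toy_sig`) rather than the package's own objects:

```r
# Invented significance calls: rows are annotations, columns are gene lists
toy_sig <- cbind(
  g10 = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  g48 = c(TRUE, FALSE, TRUE, FALSE, TRUE)
)
rownames(toy_sig) <- paste0("term", 1:5)

# Label each annotation by its membership pattern, e.g. "1,0" = g10 only
pattern <- apply(toy_sig * 1, 1, paste, collapse = ",")

# Count how many annotations share each pattern
pattern_counts <- table(pattern)
pattern_counts
```

Each distinct pattern plays the role of one of the G1 to G4 rows in the output above.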

Generate Graph

bp_graph <- generate_annotation_graph(bp_sig)
bp_graph
## A cc_graph with
## Number of Nodes = 135 
## Number of Edges = 7740 
##    g10 g48 counts
## G1   1   1     64
## G2   1   0     26
## G3   0   1     45
bp_graph <- remove_edges(bp_graph, 0.8)
## Removed 7530 edges from graph
bp_graph
## A cc_graph with
## Number of Nodes = 135 
## Number of Edges = 210 
##    g10 g48 counts
## G1   1   1     64
## G2   1   0     26
## G3   0   1     45
bp_assign <- annotation_combinations(bp_graph)
bp_assign <- assign_colors(bp_assign)
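
The edge-pruning step can be pictured as weighting each pair of annotations by how many features they share, then dropping weakly connected pairs. The sketch below assumes a Jaccard coefficient as the weight and invented gene sets; the similarity measure the package actually uses, and how it treats weights exactly at the cutoff, may differ.

```r
# Invented gene sets for three annotations
toy_sets <- list(
  termA = c("1", "2", "3", "4", "5"),
  termB = c("1", "2", "3", "4", "5", "6"),
  termC = c("5", "7", "8")
)

# Jaccard coefficient: shared genes divided by all genes in either set
jaccard <- function(x, y) length(intersect(x, y)) / length(union(x, y))

# Weight every pair of annotations
term_pairs <- t(combn(names(toy_sets), 2))
toy_edges <- data.frame(from = term_pairs[, 1], to = term_pairs[, 2],
                        stringsAsFactors = FALSE)
toy_edges$weight <- mapply(
  function(a, b) jaccard(toy_sets[[a]], toy_sets[[b]]),
  toy_edges$from, toy_edges$to, USE.NAMES = FALSE
)

# Keep only edges at or above the cutoff (0.8 mirrors the value used above)
cutoff <- 0.8
kept_edges <- toy_edges[toy_edges$weight >= cutoff, ]
kept_edges
```

Only the termA-termB edge survives here, which mirrors how a high cutoff leaves a sparse graph of closely related annotations.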

visNetwork Visualization

We can use the DiagrammeR and visNetwork HTML widgets to create interactive visualizations, either in the RStudio viewer or as panes in the HTML report.

Find Communities

It is useful to define the annotations in terms of their communities. To do this we run methods that find and then label the communities, before generating the visualization and table.

bp_communities <- assign_communities(bp_graph)
bp_comm_labels <- label_communities(bp_communities, bp_annotation)

Create Stats Table

To list which GO terms are in each of the communities we found, let's generate a table that includes the community labels, which makes the terms easier to find in the graph if desired.

bp_table <- table_from_graph(bp_graph, bp_assign, bp_comm_labels)
knitr::kable(bp_table)
name description sig_group g10.p g10.odds g10.expected g10.counts g10.padjust g48.p g48.odds g48.expected g48.counts g48.padjust group
spindle organization NA NA NA NA NA NA NA NA NA NA 1
GO:0007051 spindle organization g10,g48 0.0000029 3.207006 9.7765269 26 0.0004061 0.0000000 11.753385 1.3966467 13 0.0000003 1
GO:0007052 mitotic spindle organization g48 0.0000261 3.436829 6.7367665 19 0.0023784 0.0000000 12.910353 0.9623952 10 0.0000067 1
GO:0051225 spindle assembly g48 0.0009736 2.833085 6.1616766 15 0.0476519 0.0000002 12.542390 0.8802395 9 0.0000349 1
GO:1902850 microtubule cytoskeleton organization involved in mitosis g48 0.0000207 3.140964 8.3798802 22 0.0020126 0.0000000 12.654264 1.1971257 12 0.0000005 1
cell cycle checkpoint signaling NA NA NA NA NA NA NA NA NA NA 2
GO:0000077 DNA damage checkpoint signaling g10 0.0000000 5.178452 6.0795210 23 0.0000027 0.0017076 6.376032 0.8685030 5 0.1065655 2
GO:0031570 DNA integrity checkpoint signaling g10 0.0000000 5.773900 6.6546108 27 0.0000000 0.0003576 7.110435 0.9506587 6 0.0309761 2
GO:0044773 mitotic DNA damage checkpoint signaling g10 0.0000013 5.359087 4.1077844 16 0.0001887 0.0027179 7.591119 0.5868263 4 0.1526879 2
GO:0044774 mitotic DNA integrity checkpoint signaling g10 0.0000006 5.384322 4.3542515 17 0.0000912 0.0003693 9.189068 0.6220359 5 0.0314121 2
GO:0000075 cell cycle checkpoint signaling g10,g48 0.0000000 6.446372 9.0371257 39 0.0000000 0.0000000 14.159722 1.2910180 14 0.0000000 2
GO:0007093 mitotic cell cycle checkpoint signaling g10,g48 0.0000000 6.694418 6.7367665 30 0.0000000 0.0000000 18.137937 0.9623952 13 0.0000000 2
chromosome segregation NA NA NA NA NA NA NA NA NA NA 3
GO:0000070 mitotic sister chromatid segregation g10,g48 0.0000000 4.335366 9.0371257 30 0.0000006 0.0000000 16.934094 1.2910180 16 0.0000000 3
GO:0000819 sister chromatid segregation g10,g48 0.0000000 3.861664 10.8445509 33 0.0000011 0.0000000 16.061842 1.5492216 18 0.0000000 3
GO:0007059 chromosome segregation g10,g48 0.0000000 3.004279 18.0742515 45 0.0000016 0.0000000 11.036546 2.5820359 21 0.0000000 3
GO:0098813 nuclear chromosome segregation g10,g48 0.0000000 3.438435 13.3092216 37 0.0000016 0.0000000 15.688588 1.9013174 21 0.0000000 3
cell cycle phase transition NA NA NA NA NA NA NA NA NA NA 4
GO:0007346 regulation of mitotic cell cycle g10,g48 0.0000000 2.773562 25.9611976 60 0.0000001 0.0000000 6.441845 3.7087425 19 0.0000008 4
GO:0044770 cell cycle phase transition g10,g48 0.0000000 3.070092 26.9470659 67 0.0000000 0.0000000 7.990426 3.8495808 23 0.0000000 4
GO:0044772 mitotic cell cycle phase transition g10,g48 0.0000000 3.065021 22.3463473 56 0.0000000 0.0000000 9.265474 3.1923353 22 0.0000000 4
GO:1901987 regulation of cell cycle phase transition g10,g48 0.0000000 3.334450 20.9497006 56 0.0000000 0.0000000 7.609177 2.9928144 18 0.0000002 4
GO:1901990 regulation of mitotic cell cycle phase transition g10,g48 0.0000000 3.313661 16.7597605 45 0.0000002 0.0000000 9.051627 2.3942515 17 0.0000001 4
meiotic cell cycle NA NA NA NA NA NA NA NA NA NA 5
GO:0051321 meiotic cell cycle g10,g48 0.0000007 3.169753 11.4196407 30 0.0001044 0.0000000 9.863492 1.6313772 13 0.0000015 5
GO:0061982 meiosis I cell cycle process g10,g48 0.0000064 4.302375 5.0936527 17 0.0007571 0.0000000 15.643629 0.7276647 9 0.0000070 5
GO:1903046 meiotic cell cycle process g10,g48 0.0000052 3.373903 8.2977246 23 0.0006592 0.0000002 10.191059 1.1853892 10 0.0000416 5
GO:0007127 meiosis I g48 0.0001538 3.692345 4.6828743 14 0.0108774 0.0000003 14.880726 0.6689820 8 0.0000461 5
GO:0140013 meiotic nuclear division g48 0.0007968 2.679455 7.3118563 17 0.0398172 0.0000009 10.329775 1.0445509 9 0.0001326 5
negative regulation of mitotic cell cycle NA NA NA NA NA NA NA NA NA NA 6
GO:0045930 negative regulation of mitotic cell cycle g10,g48 0.0000000 4.347555 11.5017964 38 0.0000000 0.0000000 10.748677 1.6431138 14 0.0000002 6
GO:1901991 negative regulation of mitotic cell cycle phase transition g10,g48 0.0000000 4.506058 8.7906587 30 0.0000003 0.0000000 13.273342 1.2558084 13 0.0000001 6
organelle fission NA NA NA NA NA NA NA NA NA NA 7
GO:0000280 nuclear division g10,g48 0.0000000 2.927900 19.7173653 48 0.0000011 0.0000000 10.003736 2.8167665 21 0.0000000 7
GO:0048285 organelle fission g10,g48 0.0000000 2.823147 21.6069461 51 0.0000011 0.0000000 9.027047 3.0867066 21 0.0000000 7
GO:0140014 mitotic nuclear division g10,g48 0.0000001 3.136102 13.8843114 36 0.0000136 0.0000000 12.071027 1.9834731 18 0.0000000 7
establishment of chromosome localization NA NA NA NA NA NA NA NA NA NA 8
GO:0007080 mitotic metaphase chromosome alignment g48 0.0010316 4.231905 2.7111377 9 0.0498075 0.0000019 19.867150 0.3873054 6 0.0002745 8
GO:0051303 establishment of chromosome localization g48 0.0002278 3.527315 4.8471856 14 0.0150013 0.0000051 12.130178 0.6924551 7 0.0006653 8
GO:0051310 metaphase chromosome alignment g48 0.0000817 3.970833 4.4364072 14 0.0061796 0.0000028 13.428805 0.6337725 7 0.0003795 8
DNA biosynthetic process NA NA NA NA NA NA NA NA NA NA 9
GO:0071897 DNA biosynthetic process g10 0.0000000 5.521034 10.2694611 40 0.0000000 0.0006377 5.302477 1.4670659 7 0.0481780 9
GO:2000278 regulation of DNA biosynthetic process g10 0.0000005 4.129617 6.8189222 22 0.0000880 0.0161796 4.402370 0.9741317 4 0.5231269 9
GO:2000573 positive regulation of DNA biosynthetic process g10 0.0000000 6.545688 4.5185629 20 0.0000013 0.0264931 4.979757 0.6455090 3 0.6799512 9
chromosome separation NA NA NA NA NA NA NA NA NA NA 10
GO:0007094 mitotic spindle assembly checkpoint signaling g10,g48 0.0000041 7.789630 2.2182036 11 0.0005330 0.0000000 38.516959 0.3168862 8 0.0000002 10
GO:0010965 regulation of mitotic sister chromatid separation g10,g48 0.0000027 6.417275 2.9576048 13 0.0003836 0.0000000 35.952797 0.4225150 10 0.0000000 10
GO:0031577 spindle checkpoint signaling g10,g48 0.0000041 7.789630 2.2182036 11 0.0005330 0.0000000 38.516959 0.3168862 8 0.0000002 10
GO:0033046 negative regulation of sister chromatid segregation g10,g48 0.0000062 7.330457 2.3003593 11 0.0007471 0.0000000 36.586667 0.3286228 8 0.0000002 10
GO:0033047 regulation of mitotic sister chromatid segregation g10,g48 0.0000087 6.184516 2.7932934 12 0.0009923 0.0000000 33.277753 0.3990419 9 0.0000001 10
GO:0033048 negative regulation of mitotic sister chromatid segregation g10,g48 0.0000062 7.330457 2.3003593 11 0.0007471 0.0000000 36.586667 0.3286228 8 0.0000002 10
GO:0045839 negative regulation of mitotic nuclear division g10,g48 0.0000087 6.184516 2.7932934 12 0.0009923 0.0000000 28.123077 0.3990419 8 0.0000009 10
GO:0045841 negative regulation of mitotic metaphase/anaphase transition g10,g48 0.0000062 7.330457 2.3003593 11 0.0007471 0.0000000 36.586667 0.3286228 8 0.0000002 10
GO:0051306 mitotic sister chromatid separation g10,g48 0.0000054 5.902348 3.1219162 13 0.0006764 0.0000000 33.376623 0.4459880 10 0.0000000 10
GO:0051784 negative regulation of nuclear division g10,g48 0.0000039 6.149084 3.0397605 13 0.0005330 0.0000000 25.204598 0.4342515 8 0.0000017 10
GO:0071173 spindle assembly checkpoint signaling g10,g48 0.0000041 7.789630 2.2182036 11 0.0005330 0.0000000 38.516959 0.3168862 8 0.0000002 10
GO:0071174 mitotic spindle checkpoint signaling g10,g48 0.0000041 7.789630 2.2182036 11 0.0005330 0.0000000 38.516959 0.3168862 8 0.0000002 10
GO:2000816 negative regulation of mitotic sister chromatid separation g10,g48 0.0000062 7.330457 2.3003593 11 0.0007471 0.0000000 36.586667 0.3286228 8 0.0000002 10
GO:0051304 chromosome separation g48 0.0000322 4.414352 4.1077844 14 0.0027860 0.0000000 23.329545 0.5868263 10 0.0000001 10
GO:0051985 negative regulation of chromosome segregation g48 0.0000093 6.922305 2.3825150 11 0.0010208 0.0000000 34.840212 0.3403593 8 0.0000003 10
GO:1902100 negative regulation of metaphase/anaphase transition of cell cycle g48 0.0000093 6.922305 2.3825150 11 0.0010208 0.0000000 34.840212 0.3403593 8 0.0000003 10
GO:1905818 regulation of chromosome separation g48 0.0000148 4.817551 3.8613174 14 0.0015304 0.0000000 25.230344 0.5516168 10 0.0000001 10
GO:1905819 negative regulation of chromosome separation g48 0.0000093 6.922305 2.3825150 11 0.0010208 0.0000000 34.840212 0.3403593 8 0.0000003 10
negative regulation of cell cycle NA NA NA NA NA NA NA NA NA NA 11
GO:0010948 negative regulation of cell cycle process g10,g48 0.0000000 4.102046 14.9523353 47 0.0000000 0.0000000 8.749369 2.1360479 15 0.0000006 11
GO:0045786 negative regulation of cell cycle g10,g48 0.0000000 3.311475 19.8816766 53 0.0000000 0.0000000 6.929419 2.8402395 16 0.0000035 11
GO:1901988 negative regulation of cell cycle phase transition g10,g48 0.0000000 4.723056 12.3233533 43 0.0000000 0.0000000 9.946078 1.7604790 14 0.0000005 11
regulation of chromosome segregation NA NA NA NA NA NA NA NA NA NA 12
GO:0007091 metaphase/anaphase transition of mitotic cell cycle g48 0.0000301 4.156338 4.6007186 15 0.0026255 0.0000000 23.059259 0.6572455 11 0.0000000 12
GO:0030071 regulation of mitotic metaphase/anaphase transition g48 0.0000238 4.260805 4.5185629 15 0.0022562 0.0000000 23.586207 0.6455090 11 0.0000000 12
GO:0033045 regulation of sister chromatid segregation g48 0.0000339 3.870181 5.1758084 16 0.0028999 0.0000000 22.437756 0.7394012 12 0.0000000 12
GO:0044784 metaphase/anaphase transition of cell cycle g48 0.0000378 4.056845 4.6828743 15 0.0031623 0.0000000 22.555222 0.6689820 11 0.0000000 12
GO:0051983 regulation of chromosome segregation g48 0.0000374 3.473308 6.3259880 18 0.0031614 0.0000000 19.566912 0.9037126 13 0.0000000 12
GO:1902099 regulation of metaphase/anaphase transition of cell cycle g48 0.0000301 4.156338 4.6007186 15 0.0026255 0.0000000 23.059259 0.6572455 11 0.0000000 12
cell cycle G2/M phase transition NA NA NA NA NA NA NA NA NA NA 13
GO:0000086 G2/M transition of mitotic cell cycle g10,g48 0.0000013 3.873448 7.1475449 22 0.0001912 0.0000000 15.213023 1.0210778 12 0.0000001 13
GO:0010389 regulation of G2/M transition of mitotic cell cycle g10,g48 0.0000005 4.822955 5.2579641 19 0.0000798 0.0000000 17.251683 0.7511377 10 0.0000007 13
GO:0044839 cell cycle G2/M phase transition g10,g48 0.0000007 3.769896 7.9691018 24 0.0001039 0.0000000 14.871709 1.1384431 13 0.0000000 13
GO:1902749 regulation of cell cycle G2/M phase transition g10,g48 0.0000022 4.252197 5.7508982 19 0.0003114 0.0000000 15.515151 0.8215569 10 0.0000015 13
negative regulation of cell cycle G2/M phase transition NA NA NA NA NA NA NA NA NA NA 14
GO:0010972 negative regulation of G2/M transition of mitotic cell cycle g10,g48 0.0000054 5.902348 3.1219162 13 0.0006764 0.0000002 20.399504 0.4459880 7 0.0000399 14
GO:0044818 mitotic G2/M transition checkpoint g48 0.0000275 5.931076 2.6289820 11 0.0024601 0.0000016 20.633779 0.3755689 6 0.0002290 14
GO:1902750 negative regulation of cell cycle G2/M phase transition g48 0.0000104 5.463706 3.2862275 13 0.0011223 0.0000003 19.158508 0.4694611 7 0.0000545 14
DNA conformation change NA NA NA NA NA NA NA NA NA NA 15
GO:0032508 DNA duplex unwinding g10 0.0000000 7.162162 4.2720958 20 0.0000005 0.0000295 11.634215 0.6102994 6 0.0031932 15
GO:0032392 DNA geometric change g10,g48 0.0000000 6.691228 4.6828743 21 0.0000005 0.0000040 12.618462 0.6689820 7 0.0005454 15
GO:0071103 DNA conformation change g10,g48 0.0000000 5.730827 5.1758084 21 0.0000028 0.0000080 11.258242 0.7394012 7 0.0009519 15
regulation of spindle checkpoint NA NA NA NA NA NA NA NA NA NA 16
GO:0090231 regulation of spindle checkpoint g48 0.0000135 15.791753 0.9858683 7 0.0014139 0.0000001 63.325653 0.1408383 5 0.0000277 16
GO:0090266 regulation of mitotic cell cycle spindle assembly checkpoint g48 0.0000970 13.515882 0.9037126 6 0.0071487 0.0000055 50.121581 0.1291018 4 0.0006991 16
GO:1903504 regulation of mitotic spindle checkpoint g48 0.0000970 13.515882 0.9037126 6 0.0071487 0.0000055 50.121581 0.1291018 4 0.0006991 16
regulation of nuclear division NA NA NA NA NA NA NA NA NA NA 17
GO:0007088 regulation of mitotic nuclear division g48 0.0001549 3.167218 6.4081437 17 0.0109028 0.0000000 15.446046 0.9154491 11 0.0000004 17
GO:0051783 regulation of nuclear division g48 0.0006249 2.655067 7.8047904 18 0.0339568 0.0000000 12.294472 1.1149701 11 0.0000024 17
positive regulation of chromosome separation NA NA NA NA NA NA NA NA NA NA 18
GO:1901970 positive regulation of mitotic sister chromatid separation g48 0.0130991 5.612903 0.9858683 4 0.3579328 0.0000001 63.325653 0.1408383 5 0.0000277 18
GO:1905820 positive regulation of chromosome separation g48 0.0202334 3.744004 1.6431138 5 0.4518054 0.0000027 29.523298 0.2347305 5 0.0003724 18
meiotic chromosome segregation NA NA NA NA NA NA NA NA NA NA 19
GO:0045132 meiotic chromosome segregation g48 0.0043544 3.273312 3.2862275 9 0.1620543 0.0000003 19.158508 0.4694611 7 0.0000545 19
GO:0045143 homologous chromosome segregation g48 0.0112189 3.748039 1.9717365 6 0.3162607 0.0000072 23.296548 0.2816766 5 0.0008596 19
DNA-templated DNA replication maintenance of fidelity NA NA NA NA NA NA NA NA NA NA 20
GO:0031297 replication fork processing g10 0.0000000 10.555084 2.2182036 13 0.0000095 0.0037380 10.826316 0.3168862 3 0.1935512 20
GO:0045005 DNA-templated DNA replication maintenance of fidelity g10 0.0000000 11.414925 2.6289820 16 0.0000002 0.0060761 8.954265 0.3755689 3 0.2810139 20
telomere organization NA NA NA NA NA NA NA NA NA NA 21
GO:0000723 telomere maintenance g10 0.0000000 5.286843 8.1334132 31 0.0000000 0.0060263 4.665980 1.1619162 5 0.2805285 21
GO:0032200 telomere organization g10 0.0000000 5.768731 9.2014371 37 0.0000000 0.0019702 5.011895 1.3144910 6 0.1181049 21
recombinational repair NA NA NA NA NA NA NA NA NA NA 22
GO:0000724 double-strand break repair via homologous recombination g10,g48 0.0000000 4.865326 7.4761677 27 0.0000006 0.0000010 10.075363 1.0680240 9 0.0001586 22
GO:0000725 recombinational repair g10,g48 0.0000000 4.974795 7.6404790 28 0.0000002 0.0000012 9.833066 1.0914970 9 0.0001831 22
regulation of cyclin-dependent protein kinase activity NA NA NA NA NA NA NA NA NA NA 23
GO:0000079 regulation of cyclin-dependent protein serine/threonine kinase activity g48 0.0003254 3.591454 4.4364072 13 0.0200435 0.0000002 15.857005 0.6337725 8 0.0000330 23
GO:1904029 regulation of cyclin-dependent protein kinase activity g48 0.0003947 3.505484 4.5185629 13 0.0233513 0.0000002 15.517731 0.6455090 8 0.0000368 23
microtubule-based process NA NA NA NA NA NA NA NA NA NA 24
GO:0000226 microtubule cytoskeleton organization g48 0.0000108 2.096531 26.8649102 50 0.0011654 0.0000002 5.376902 3.8378443 17 0.0000360 24
GO:0007017 microtubule-based process g48 0.0001646 1.749803 37.6273054 60 0.0115335 0.0000050 3.994773 5.3753293 18 0.0006576 24
cell cycle DNA replication NA NA NA NA NA NA NA NA NA NA 25
GO:0033260 nuclear DNA replication g10,g48 0.0000000 10.623611 2.3825150 14 0.0000031 0.0000000 34.840212 0.3403593 8 0.0000003 25
GO:0044786 cell cycle DNA replication g10,g48 0.0000000 9.958333 2.4646707 14 0.0000050 0.0000000 33.252525 0.3520958 8 0.0000003 25
regulation of DNA metabolic process NA NA NA NA NA NA NA NA NA NA 26
GO:0051054 positive regulation of DNA metabolic process g10 0.0000000 3.342234 16.2668263 44 0.0000002 0.0022034 3.771696 2.3238323 8 0.1290505 26
GO:0051052 regulation of DNA metabolic process g10,g48 0.0000000 2.707851 26.4541317 60 0.0000003 0.0000045 4.677014 3.7791617 15 0.0005989 26
other NA NA NA NA NA NA NA NA NA NA 27
GO:0000731 DNA synthesis involved in DNA repair g10 0.0000002 12.473185 1.7252695 11 0.0000312 0.0247685 9.027412 0.2464671 2 0.6459254 27
GO:0006270 DNA replication initiation g10 0.0000000 14.254843 2.2182036 15 0.0000001 0.0000132 20.112414 0.3168862 5 0.0014995 27
GO:0006271 DNA strand elongation involved in DNA replication g10 0.0000000 74.001486 1.2323353 13 0.0000000 0.0006439 21.684210 0.1760479 3 0.0481780 27
GO:0006284 base-excision repair g10 0.0000042 6.804748 2.6289820 12 0.0005366 0.0004986 12.498480 0.3755689 4 0.0393665 27
GO:0006298 mismatch repair g10 0.0000005 9.078932 2.2182036 12 0.0000798 0.0395659 6.855833 0.3168862 2 0.8451971 27
GO:0006301 postreplication repair g10 0.0000040 11.307238 1.4788024 9 0.0005330 0.0184606 10.723958 0.2112575 2 0.5710414 27
GO:0006325 chromatin organization g10 0.0000054 1.952228 37.3808383 65 0.0006764 0.0854652 1.769890 5.3401198 9 1.0000000 27
GO:0009411 response to UV g10 0.0000084 3.157412 9.1192814 24 0.0009772 0.0018822 5.060248 1.3027545 6 0.1169500 27
GO:0022616 DNA strand elongation g10 0.0000000 18.278209 2.1360479 16 0.0000000 0.0033510 11.298398 0.3051497 3 0.1820863 27
GO:0034502 protein localization to chromosome g10 0.0000000 4.712993 6.4902994 23 0.0000091 0.0137047 4.639433 0.9271856 4 0.4608316 27
GO:0042770 signal transduction in response to DNA damage g10 0.0000007 3.319588 10.2694611 28 0.0001099 0.0034199 4.457253 1.4670659 6 0.1830496 27
GO:1901293 nucleoside phosphate biosynthetic process g10 0.0000066 2.537702 15.9382036 35 0.0007776 0.0262784 2.797410 2.2768862 6 0.6791041 27
GO:1901976 regulation of cell cycle checkpoint g10 0.0000003 9.728699 2.1360479 12 0.0000487 0.0000109 21.072709 0.3051497 5 0.0012601 27
GO:0000727 double-strand break repair via break-induced replication g10,g48 0.0000002 79.000000 0.6572455 7 0.0000322 0.0000012 87.744681 0.0938922 4 0.0001794 27
GO:0006260 DNA replication g10,g48 0.0000000 7.141972 13.9664671 63 0.0000000 0.0000000 11.109739 1.9952096 17 0.0000000 27
GO:0006261 DNA-templated DNA replication g10,g48 0.0000000 10.562406 8.2155689 47 0.0000000 0.0000002 10.305556 1.1736527 10 0.0000393 27
GO:0006268 DNA unwinding involved in DNA replication g10,g48 0.0000000 27.272404 1.3966467 12 0.0000001 0.0000011 36.917563 0.1995210 5 0.0001673 27
GO:0006275 regulation of DNA replication g10,g48 0.0000000 5.047509 6.7367665 25 0.0000011 0.0000000 14.568723 0.9623952 11 0.0000006 27
GO:0006281 DNA repair g10,g48 0.0000000 3.835524 28.2615569 82 0.0000000 0.0000000 5.866134 4.0373653 19 0.0000027 27
GO:0006302 double-strand break repair g10,g48 0.0000000 3.674703 13.7200000 40 0.0000001 0.0000001 8.042322 1.9600000 13 0.0000124 27
GO:0006310 DNA recombination g10,g48 0.0000000 3.729589 14.6237126 43 0.0000000 0.0000001 7.495971 2.0891018 13 0.0000256 27
GO:0010564 regulation of cell cycle process g10,g48 0.0000000 2.356025 35.2447904 71 0.0000013 0.0000000 5.243315 5.0349701 21 0.0000035 27
GO:0030174 regulation of DNA-templated DNA replication initiation g10,g48 0.0000003 30.131760 0.9037126 8 0.0000459 0.0000055 50.121581 0.1291018 4 0.0006991 27
GO:0033044 regulation of chromosome organization g10,g48 0.0000001 3.306310 12.5698204 34 0.0000110 0.0000000 9.727818 1.7956886 14 0.0000006 27
GO:0051276 chromosome organization g10,g48 0.0000000 4.344828 29.5760479 93 0.0000000 0.0000000 10.057752 4.2251497 29 0.0000000 27
GO:0051301 cell division g10,g48 0.0000000 2.376914 31.3834731 64 0.0000041 0.0000000 7.573577 4.4833533 25 0.0000000 27
GO:0090329 regulation of DNA-templated DNA replication g10,g48 0.0000000 11.399404 2.4646707 15 0.0000006 0.0000011 22.358696 0.3520958 6 0.0001610 27
GO:1902969 mitotic DNA replication g10,g48 0.0000000 22.659763 1.2323353 10 0.0000057 0.0000005 44.311828 0.1760479 5 0.0000864 27
GO:1903047 mitotic cell cycle process g10,g48 0.0000000 3.148061 37.9559281 94 0.0000000 0.0000000 9.257988 5.4222754 33 0.0000000 27
GO:0000910 cytokinesis g48 0.1173880 1.532776 8.2155689 12 1.0000000 0.0000023 9.068897 1.1736527 9 0.0003187 27
GO:0007143 female meiotic nuclear division g48 0.0423042 2.954247 1.9717365 5 0.7235861 0.0000072 23.296548 0.2816766 5 0.0008596 27
GO:0010639 negative regulation of organelle organization g48 0.0006199 1.966320 19.0601198 34 0.0338113 0.0000005 6.142202 2.7228743 14 0.0000728 27
GO:0031100 animal organ regeneration g48 0.0000301 4.156338 4.6007186 15 0.0026255 0.0000000 20.271739 0.6572455 10 0.0000002 27
GO:0051338 regulation of transferase activity g48 0.0002615 1.722460 37.4629940 59 0.0168343 0.0000047 4.014041 5.3518563 18 0.0006242 27
GO:0051984 positive regulation of chromosome segregation g48 0.0128177 4.321134 1.4788024 5 0.3536241 0.0000015 34.073615 0.2112575 5 0.0002225 27
GO:0090068 positive regulation of cell cycle process g48 0.0000760 2.466121 12.9805988 28 0.0057751 0.0000021 6.971225 1.8543713 11 0.0003011 27
GO:0090307 mitotic spindle assembly g48 0.0061271 3.074124 3.4505389 9 0.2099832 0.0000005 18.059341 0.4929341 7 0.0000757 27
GO:0097421 liver regeneration g48 0.0024515 4.637356 1.9717365 7 0.1015479 0.0000072 23.296548 0.2816766 5 0.0008596 27
GO:1904668 positive regulation of ubiquitin protein ligase activity g48 0.1652168 3.198413 0.7394012 2 1.0000000 0.0000021 70.187234 0.1056287 4 0.0003011 27
GO:2001251 negative regulation of chromosome organization g48 0.0000378 4.056845 4.6828743 15 0.0031623 0.0000000 19.838008 0.6689820 10 0.0000003 27

Actually Visualize It!

bp_network <- graph_to_visnetwork(bp_graph, bp_assign, bp_comm_labels)
vis_visnetwork(bp_network)

annotation_table <- annotation_gene_table(bp_combined, graph::nodes(bp_graph), use_db = org.Hs.eg.db)

We will show the table that is generated here for the first 3 GO terms.

This chunk is not run; the resulting table is shown below.

kable_annotation_table(annotation_table, header = 4)
csv_annotation_table(annotation_table, out_file = "bp_annotations.csv")
GO:0000070 - mitotic sister chromatid segregation
ENTREZID SYMBOL GENENAME significant
3832 KIF11 kinesin family member 11 g10
10615 SPAG5 sperm associated antigen 5 g10
701 BUB1B BUB1 mitotic checkpoint serine/threonine kinase B g10
7283 TUBG1 tubulin gamma 1 g10
81620 CDT1 chromatin licensing and DNA replication factor 1 g10
5901 RAN RAN, member RAS oncogene family g10
4085 MAD2L1 mitotic arrest deficient 2 like 1 g10
57405 SPC25 SPC25 component of NDC80 kinetochore complex g10
7517 XRCC3 X-ray repair cross complementing 3 g10
9735 KNTC1 kinetochore associated 1 g10
26065 LSM14A LSM14A mRNA processing body assembly factor g10
10403 NDC80 NDC80 kinetochore complex component g10
10726 NUDC nuclear distribution C, dynein complex regulator g10
3835 KIF22 kinesin family member 22 g10
2801 GOLGA2 golgin A2 g10
23636 NUP62 nucleoporin 62 g10
8243 SMC1A structural maintenance of chromosomes 1A g10
23212 RRS1 ribosome biogenesis regulator 1 homolog g10
23310 NCAPD3 non-SMC condensin II complex subunit D3 g10
126353 MISP mitotic spindle positioning g10
4605 MYBL2 MYB proto-oncogene like 2 g10,g48
11130 ZWINT ZW10 interacting kinetochore protein g10,g48
9319 TRIP13 thyroid hormone receptor interactor 13 g10,g48
332 BIRC5 baculoviral IAP repeat containing 5 g10,g48
983 CDK1 cyclin dependent kinase 1 g10,g48
3833 KIFC1 kinesin family member C1 g10,g48
9212 AURKB aurora kinase B g10,g48
11065 UBE2C ubiquitin conjugating enzyme E2 C g10,g48
9700 ESPL1 extra spindle pole bodies like 1, separase g10,g48
1843 DUSP1 dual specificity phosphatase 1 g10,g48
991 CDC20 cell division cycle 20 g48
22974 TPX2 TPX2 microtubule nucleation factor g48
891 CCNB1 cyclin B1 g48
9928 KIF14 kinesin family member 14 g48
11004 KIF2C kinesin family member 2C g48
5347 PLK1 polo like kinase 1 g48