How to show the diagram contents of a Venn diagram

Question:

I have the working code below.

from matplotlib import pyplot as plt
import numpy as np
from matplotlib_venn import venn3, venn3_circles
Gastric_tumor_promoters = set(['DPEP1', 'CDC42BPA', 'GNG4', 'RAPGEFL1', 'MYH7B', 'SLC13A3', 'PHACTR3', 'SMPX', 'NELL2', 'PNMAL1', 'KRT23', 'PCP4', 'LOX', 'CDC42BPA'])

Ovarian_tumor_promoters = set(['ABLIM1','CDC42BPA','VSNL1','LOX','PCP4','SLC13A3'])

Gastric_tumor_suppressors = set(['PLCB4', 'VSNL1', 'TOX3', 'VAV3'])
#Ovarian_tumor_suppressors = set(['VAV3', 'FREM2', 'MYH7B', 'RAPGEFL1', 'SMPX', 'TOX3'])
venn3([Gastric_tumor_promoters,Ovarian_tumor_promoters, Gastric_tumor_suppressors], ('GCPromoters', 'OCPromoters', 'GCSuppressors'))

plt.show()

How can I show the contents of each set inside these three circles, with the color alpha set to 0.6? The circles must be big enough to accommodate all the symbols.

Asked By: J.A


Answers:

I’m not sure there is a simple way to do this automatically for any possible combination of sets. If you’re willing to do some manual tuning for your particular example, start with something like this:

from matplotlib import pyplot as plt
import numpy as np
from matplotlib_venn import venn3

A = set(['DPEP1', 'CDC42BPA', 'GNG4', 'RAPGEFL1', 'MYH7B', 'SLC13A3', 'PHACTR3', 'SMPX', 'NELL2', 'PNMAL1', 'KRT23', 'PCP4', 'LOX', 'CDC42BPA'])
B = set(['ABLIM1','CDC42BPA','VSNL1','LOX','PCP4','SLC13A3'])
C = set(['PLCB4', 'VSNL1', 'TOX3', 'VAV3'])

v = venn3([A,B,C], ('GCPromoters', 'OCPromoters', 'GCSuppressors'))

# Replace each region's count with the names that fall only in that region
v.get_label_by_id('100').set_text('\n'.join(A-B-C))
v.get_label_by_id('110').set_text('\n'.join(A&B-C))
v.get_label_by_id('011').set_text('\n'.join(B&C-A))
v.get_label_by_id('001').set_text('\n'.join(C-A-B))
# Clear the B-only label and show its names in an annotation box
# placed outside the circle instead
v.get_label_by_id('010').set_text('')
plt.annotate(',\n'.join(B-A-C), xy=v.get_label_by_id('010').get_position() +
             np.array([0, 0.2]), xytext=(-20,40), ha='center',
             textcoords='offset points',
             bbox=dict(boxstyle='round,pad=0.5', fc='gray', alpha=0.1),
             arrowprops=dict(arrowstyle='->',
                             connectionstyle='arc',color='gray'))
plt.show()

Note that methods like v.get_label_by_id('001') return the matplotlib Text objects, and you are free to configure them to your liking (e.g. you can change font size by calling set_fontsize(8), etc).
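
The question also asked for an alpha of 0.6 and for more room in the circles, which the snippet above does not cover. The following is only a sketch of one way to do that, not part of the original answer. It assumes that get_patch_by_id() and get_label_by_id() return None for regions that are not drawn; style_venn, REGION_IDS and demo are hypothetical names introduced here for illustration.

```python
# Region IDs used by venn3: one character per set, '1' meaning "inside that set".
REGION_IDS = ('100', '010', '110', '001', '101', '011', '111')


def style_venn(v, alpha=0.6, fontsize=8):
    """Set one alpha on every region patch and one font size on every region label.

    `v` is assumed to be the object returned by venn3(). Regions for which
    both getters return None are skipped. Returns the IDs that were styled.
    """
    styled = []
    for region_id in REGION_IDS:
        patch = v.get_patch_by_id(region_id)
        label = v.get_label_by_id(region_id)
        if patch is not None:
            patch.set_alpha(alpha)
        if label is not None:
            label.set_fontsize(fontsize)
        if patch is not None or label is not None:
            styled.append(region_id)
    return styled


def demo():
    # Imported lazily so that style_venn() itself has no plotting dependency.
    from matplotlib import pyplot as plt
    from matplotlib_venn import venn3

    # A larger figure gives each circle more room relative to the text.
    plt.figure(figsize=(10, 10))
    v = venn3([{'a', 'b', 'c', 'f'}, {'c', 'd', 'e'}, {'e', 'f', 'a'}],
              ('A', 'B', 'C'))
    style_venn(v, alpha=0.6, fontsize=8)
    plt.show()
```

Calling demo() should open a single figure. The circle sizes are still derived from the set sizes, so enlarging the figure and shrinking the font is a workaround rather than a true resize of the circles.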

Answered By: KT.

Here is an example that automates the whole thing. It builds a temporary dictionary whose keys are the region IDs that venn expects and whose values are the intersections of all sets participating in that ID.

If you don’t want the labels sorted, remove the sorted() call in the second-to-last line.

import math
from matplotlib import pyplot as plt
from matplotlib_venn import venn2, venn3
import numpy as np

# Convert a number into the indices of its set bits, highest first
# e.g. 5 -> '101' -> [2, 0]
def bits2indices(b):
    l = []
    if b == 0:
        return l
    for i in reversed(range(0, int(math.log(b, 2)) + 1)):
        if b & (1 << i):
            l.append(i)
    return l

# Make dictionary containing venn id's and set intersections
# e.g. d = {'100': {'c', 'b', 'a'}, '010': {'c', 'd', 'e'}, ... }
def set2dict(s):
    d = {}
    for i in range(1, 2**len(s)):
        # Make venn id strings
        key = bin(i)[2:].zfill(len(s))
        key = key[::-1]
        ind = bits2indices(i)
        # Get the participating sets for this id
        participating_sets = [s[x] for x in ind] 
        # Get the intersections of those sets
        inter = set.intersection(*participating_sets)
        d[key] = inter
    return d

# Define some sets
a = set(['a', 'b', 'c']) 
b = set(['c', 'd', 'e'])
c = set(['e', 'f', 'a'])
s = [a, b, c]

# Create dictionary from sets
d = set2dict(s)

# Plot it
h = venn3(s, ('A', 'B', 'C'))
for k, v in d.items():
    l = h.get_label_by_id(k)
    if l:
        l.set_text('\n'.join(sorted(v)))
plt.show()

Edit: I’m sorry, I just figured out that the code above does not remove duplicate labels and is therefore wrong: the number of elements shown by venn differed from the number of labels. Here is a new version that removes the duplicates that belong to intersections with more sets. I guess there is a smarter and more functional way to do this than iterating over all intersections twice…

import math, itertools
from matplotlib import pyplot as plt
from matplotlib_venn import venn2, venn3
import numpy as np

# Generate list index for itertools combinations
def gen_index(n):
    x = -1
    while True:       
        while True:
            x = x + 1
            if bin(x).count('1') == n:
                break
        yield x

# Generate all combinations of intersections
def make_intersections(sets):
    l = [None] * 2**len(sets)
    for i in range(1, len(sets) + 1):
        ind = gen_index(i)
        for subset in itertools.combinations(sets, i):
            inter = set.intersection(*subset)
            l[next(ind)] = inter
    return l

# Get weird reversed binary string id for venn
def number2venn_id(x, n_fill):
    id = bin(x)[2:].zfill(n_fill)
    id = id[::-1]
    return id

# Iterate over all combinations and remove duplicates from intersections with
# more sets
def sets2dict(sets):
    l = make_intersections(sets)
    d = {}
    for i in range(1, len(l)):
        d[number2venn_id(i, len(sets))] = l[i]
        for j in range(1, len(l)):
            if bin(j).count('1') < bin(i).count('1'):
                l[j] = l[j] - l[i]
                d[number2venn_id(j, len(sets))] = l[j]
    return d

# Define some sets
a = set(['a', 'b', 'c', 'f']) 
b = set(['c', 'd', 'e'])
c = set(['e', 'f', 'a'])
sets = [a, b, c]

d = sets2dict(sets)

# Plot it
h = venn3(sets, ('A', 'B', 'C'))
for k, v in d.items():
    l = h.get_label_by_id(k)
    if l:
        l.set_fontsize(12)
        l.set_text('\n'.join(sorted(v)))

# Original for comparison
f = plt.figure(2)
venn3(sets, ('A', 'B', 'C'))  
plt.show()
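
A shorter alternative to the double iteration above might be to classify each element by the sets it belongs to, instead of subtracting intersections from one another. This is a sketch that is not from the original answer; regions_by_membership is a hypothetical name. It relies only on the ID convention used above, where the first character of the ID stands for the first set.

```python
from collections import defaultdict


def regions_by_membership(sets):
    """Map venn-style region IDs to the elements that lie exactly in that region.

    Each element gets a key with one character per set: '1' if the element
    is in that set, '0' otherwise. Elements sharing a key share a region.
    Empty regions do not appear in the result.
    """
    regions = defaultdict(set)
    for element in set().union(*sets):
        key = ''.join('1' if element in s else '0' for s in sets)
        regions[key].add(element)
    return dict(regions)


a = set(['a', 'b', 'c', 'f'])
b = set(['c', 'd', 'e'])
c = set(['e', 'f', 'a'])
d = regions_by_membership([a, b, c])
# d['101'] is {'a', 'f'}: the elements in A and C but not in B
```

The resulting dictionary can be passed to the same labelling loop as above, because the keys follow the same ID scheme. Every element lands in exactly one region, so no duplicates need to be removed afterwards.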
Answered By: Vinci

Thanks for the automation, @Vinci! I wonder if you (or somebody else) have written a version that rearranges the content so that the elements stay within the bubble(s) in a random fashion instead of one long list? … Bonus track: resizing the bubbles if the elements do not fit? 😉
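
As a partial take on the first request, a region's names could at least be wrapped into several short rows instead of one tall column. The sketch below only formats the label text; it does not measure the circles or guarantee that the text fits. wrap_names and its parameters are hypothetical names introduced for illustration.

```python
import math


def wrap_names(names, per_line=None, sep=', '):
    """Join names into rows of `per_line` items, one row per text line.

    If per_line is None, a roughly square block is used: the row length
    is the ceiling of the square root of the number of names.
    """
    ordered = sorted(names)
    if not ordered:
        return ''
    if per_line is None:
        per_line = max(1, math.ceil(math.sqrt(len(ordered))))
    rows = [ordered[i:i + per_line] for i in range(0, len(ordered), per_line)]
    return '\n'.join(sep.join(row) for row in rows)


text = wrap_names({'LOX', 'PCP4', 'CDC42BPA', 'SLC13A3'}, per_line=2)
# text == 'CDC42BPA, LOX\nPCP4, SLC13A3'
```

The result can be handed to set_text() on a region label in place of the plain '\n'.join(...) used in the answers above.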


Answered By: brainstorm