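The co-occurrence matrix can be computed efficiently with the **dot product**:

```python
import pandas as pd

df = pd.DataFrame({
    'instance': ['1', '1', '2', '2', '2', '3', '3', '3', '3'],
    'label': ['A', 'B', 'B', 'C', 'D', 'A', 'C', 'D', 'E']
})

# One row per instance, one 0/1 column per label.
# max() collapses the one-hot rows of each instance into a single row.
binary_df = pd.get_dummies(
    df.set_index('instance')['label']
).groupby(level=0).max().astype(int)

# Entry (i, j) counts the instances that contain both label i and label j.
co_occurrence_matrix = binary_df.T.dot(binary_df)
```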
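To see why the dot product gives the right counts, it may help to compare it against a plain loop. The sketch below rebuilds the same matrix, counts pairs by brute force, and checks that the two agree. The names `brute_force` and `pairs_only` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'instance': ['1', '1', '2', '2', '2', '3', '3', '3', '3'],
    'label': ['A', 'B', 'B', 'C', 'D', 'A', 'C', 'D', 'E']
})

# Same construction as above: 0/1 indicator matrix, then B^T . B
binary_df = pd.get_dummies(
    df.set_index('instance')['label']
).groupby(level=0).max().astype(int)
co_occurrence_matrix = binary_df.T.dot(binary_df)

# Brute-force reference: for every instance, add 1 for each ordered
# pair of labels (including a label paired with itself).
labels = binary_df.columns
brute_force = pd.DataFrame(0, index=labels, columns=labels)
for label_set in df.groupby('instance')['label'].agg(set):
    for a in label_set:
        for b in label_set:
            brute_force.loc[a, b] += 1

# The two approaches should produce identical counts.
assert (co_occurrence_matrix.to_numpy() == brute_force.to_numpy()).all()

# The diagonal holds how many instances contain each label. If only
# pairs of distinct labels are of interest, subtract the diagonal.
diagonal = np.diag(np.diag(co_occurrence_matrix.to_numpy()))
pairs_only = co_occurrence_matrix - diagonal

print(co_occurrence_matrix)
print(pairs_only)
```

With this data, `C` and `D` appear together in instances `2` and `3`, so `co_occurrence_matrix.loc['C', 'D']` is 2, while `B` and `E` never share an instance and give 0. The loop is only meant as a check: it scales with the square of the labels per instance, whereas the dot product delegates the work to a single matrix multiplication.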