$$X \rightarrow Y$$
Support of an itemset X:
$$\text{support}(X) = \frac{|\{T \in D \mid X \subseteq T\}|}{|D|}$$
Support of the rule X → Y:
$$\text{support}(X \rightarrow Y) = \text{support}(X \cup Y)$$
$$\text{confidence}(X \rightarrow Y) = \frac{\text{support}(X \cup Y)}{\text{support}(X)}$$
$$\text{lift}(X \rightarrow Y) = \frac{\text{confidence}(X \rightarrow Y)}{\text{support}(Y)} = \frac{\text{support}(X \cup Y)}{\text{support}(X)\,\text{support}(Y)}$$
$$\text{leverage}(X \rightarrow Y) = \text{support}(X \cup Y) - \text{support}(X)\,\text{support}(Y)$$
$$\text{conviction}(X \rightarrow Y) = \frac{1 - \text{support}(Y)}{1 - \text{confidence}(X \rightarrow Y)}$$
$$\text{cosine}(X \rightarrow Y) = \frac{\text{support}(X \cup Y)}{\sqrt{\text{support}(X)\,\text{support}(Y)}}$$
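The formulas above can be checked numerically with a small sketch. It uses the Hot Dogs transactions from the first example table in this document and evaluates the rule {Hot Dogs} → {Buns}. The helper names `support` and `rule_metrics` are illustrative and do not come from any library, and returning infinity for conviction when confidence equals 1 is an assumed convention.

```python
from math import sqrt

# Transactions copied from the Hot Dogs example table in this document.
transactions = [
    {"Hot Dogs", "Buns", "Ketchup"},
    {"Hot Dogs", "Buns"},
    {"Hot Dogs", "Coke", "Chips"},
    {"Chips", "Coke"},
    {"Chips", "Ketchup"},
    {"Hot Dogs", "Coke", "Chips"},
]


def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)


def rule_metrics(x, y, db):
    """Evaluate the rule x -> y with the measures defined above (illustrative helper)."""
    s_x = support(x, db)
    s_y = support(y, db)
    s_xy = support(set(x) | set(y), db)
    conf = s_xy / s_x
    return {
        "support": s_xy,
        "confidence": conf,
        "lift": conf / s_y,
        "leverage": s_xy - s_x * s_y,
        # Assumed convention: the denominator is zero when confidence is 1,
        # so report infinity in that case.
        "conviction": float("inf") if conf == 1 else (1 - s_y) / (1 - conf),
        "cosine": s_xy / sqrt(s_x * s_y),
    }


metrics = rule_metrics({"Hot Dogs"}, {"Buns"}, transactions)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For this rule, support(X ∪ Y) = 2/6 and support(X) = 4/6, so the confidence is 0.5. Dividing by support(Y) = 2/6 gives a lift of 1.5, meaning Buns appear with Hot Dogs more often than independence would predict.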
Min Support = 0.3
| Transaction ID | Items |
|---|---|
| T1 | Hot Dogs, Buns, Ketchup |
| T2 | Hot Dogs, Buns |
| T3 | Hot Dogs, Coke, Chips |
| T4 | Chips, Coke |
| T5 | Chips, Ketchup |
| T6 | Hot Dogs, Coke, Chips |
Buns, Chips, Coke, Hot Dogs, Ketchup
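A minimal pure-Python sketch of the level-wise Apriori search, run on the table above with min support 0.3, shows how the frequent itemsets are built up from single items. The function name `apriori_sketch` is hypothetical. The join and prune steps follow the usual textbook description and are not necessarily how any particular library implements it.

```python
from itertools import combinations

# Transactions from the table above.
transactions = [
    {"Hot Dogs", "Buns", "Ketchup"},
    {"Hot Dogs", "Buns"},
    {"Hot Dogs", "Coke", "Chips"},
    {"Chips", "Coke"},
    {"Chips", "Ketchup"},
    {"Hot Dogs", "Coke", "Chips"},
]


def apriori_sketch(db, min_support):
    """Return {frozenset: support} for every itemset meeting min_support."""
    n = len(db)

    def support(itemset):
        return sum(itemset <= t for t in db) / n

    # Level 1: single items that meet the threshold.
    items = sorted({item for t in db for item in t})
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}

    frequent = {}
    k = 1
    while level:
        frequent.update({s: support(s) for s in level})
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {c for c in candidates if support(c) >= min_support}
    return frequent


frequent = apriori_sketch(transactions, min_support=0.3)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(sup, 3))
```

With six transactions, a support of 0.3 means an itemset must occur in at least two of them. All five single items qualify, four pairs qualify, and {Chips, Coke, Hot Dogs} is the only frequent 3-itemset.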
Min support = 0.3
| Transaction ID | Items Purchased |
|---|---|
| T1 | Rock, Jazz |
| T2 | Jazz, Pop, Classical |
| T3 | Rock, Pop |
| T4 | Jazz, Rock, Pop, Classical |
| T5 | Pop, Classical |
| T6 | Rock, Jazz, Classical |
| T7 | Jazz, Pop, Classical |
| T8 | Rock, Pop, Jazz |
Classical, Jazz, Pop, Rock
Min support = 0.25
| Transaction ID | Items |
|---|---|
| T1 | egg, bread |
| T2 | juice, egg, butter |
| T3 | juice, egg, bread |
| T4 | juice, bread |
| T5 | juice, egg |
| T6 | juice, bread, butter |
| T7 | juice, egg, butter |
| T8 | bread, butter |
| T9 | juice, bread |
| T10 | egg, butter |
| T11 | juice, egg, butter |
bread, butter, egg, juice
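Once the frequent itemsets are known, rules X → Y can be read off by splitting each frequent itemset into an antecedent and a consequent and keeping the splits whose confidence is high enough. The sketch below does this for the table above with min support 0.25. The confidence threshold of 0.7 and the function name `generate_rules` are assumptions made for illustration and are not part of the exercise. It enumerates itemsets by brute force, which is only reasonable because there are four items.

```python
from itertools import combinations

# Transactions from the table above.
transactions = [
    {"egg", "bread"},
    {"juice", "egg", "butter"},
    {"juice", "egg", "bread"},
    {"juice", "bread"},
    {"juice", "egg"},
    {"juice", "bread", "butter"},
    {"juice", "egg", "butter"},
    {"bread", "butter"},
    {"juice", "bread"},
    {"egg", "butter"},
    {"juice", "egg", "butter"},
]

MIN_SUPPORT = 0.25     # from the exercise
MIN_CONFIDENCE = 0.7   # hypothetical threshold, chosen only for this sketch


def count(itemset, db):
    """Number of transactions that contain every item of itemset."""
    return sum(itemset <= t for t in db)


def generate_rules(db, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    n = len(db)
    items = sorted({i for t in db for i in t})
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            c_xy = count(itemset, db)
            if c_xy / n < min_support:
                continue  # not a frequent itemset, so it yields no rules
            # Try every non-empty proper subset as the antecedent X.
            for r in range(1, size):
                for lhs in combinations(combo, r):
                    x = frozenset(lhs)
                    confidence = c_xy / count(x, db)
                    if confidence >= min_confidence:
                        rules.append((sorted(x), sorted(itemset - x), c_xy / n, confidence))
    return rules


rules = generate_rules(transactions, MIN_SUPPORT, MIN_CONFIDENCE)
for x, y, sup, conf in rules:
    print(f"{x} -> {y}: support={sup:.3f}, confidence={conf:.3f}")
```

Under these assumed thresholds three rules survive: {egg} → {juice} with confidence 5/7, and {butter, egg} → {juice} and {butter, juice} → {egg}, each with confidence 3/4.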
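import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

# Transactions from the Hot Dogs example
data = pd.DataFrame([
    ['T1', ['Hot Dogs', 'Buns', 'Ketchup']],
    ['T2', ['Hot Dogs', 'Buns']],
    ['T3', ['Hot Dogs', 'Coke', 'Chips']],
    ['T4', ['Chips', 'Coke']],
    ['T5', ['Chips', 'Ketchup']],
    ['T6', ['Hot Dogs', 'Coke', 'Chips']]
], columns=['Transaction', 'itemset'])

# One-hot encode the item lists: one column per item, one row per transaction
encoder = TransactionEncoder()
encoder.fit(data['itemset'])
df = pd.DataFrame(data=encoder.transform(data['itemset']), columns=encoder.columns_, dtype=int)

# Find every itemset whose support is at least 0.3
apriori(df, min_support=0.3, use_colnames=True)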