Hello everyone,

this is a theory question about finding noisy influences and automatically divorcing parents, and how to do this in Smile.

Nodes can be automatically divorced if they have a noisy definition: build a binary tree from the parents to the child, put the noisy influences into the nodes next to the parents and leave the leak in the child. The other probabilities are deterministic "pass-through" diagonal matrices.

Smile has builtin functionality to convert a general CPT into a noisy definition (i.e. chance nodes into noisy-MAX nodes). This works only properly if the states of the nodes involved are in the right order (parents need their 'false' state at last position, the child's states need to be completely ordered). They can be re-ordered by heuristics or by trying all possible combinations, and then seeing which noisy definition yields the CPT with the smalles KL divergence to the original one.

Another requirement of this to work is that the CPT does represent noisy-OR influences (or noisy-AND which makes no difference thanks to DeMorgan). Take for example the formula X = A or B or C. In the case of AND, we set X = A and B and C --> notX = notA or notB or notC. However, Smile's algorithm does not work if the CPT represents a noisy version of a nested formula like X = A or (B and C). In this case, we would need a method to extract noisy influences partially (one parent at a time). In this case, we could in a first step extract the noisy influence of A on X, and move the rest of the CPT to an auxiliary node Y:

X (noisy definition) = A or Y; Y (CPT definition) = B and C.

In a second step we could find that notY (noisy definition) = notB or notC.

Is there any chance Smile will provide a function for finding the noisy influence of a single parent? (Or, more general, for a subset of the parents, instead of the whole group.) Or maybe it is easier to implement the splitting of nodes? I think this also amounts to separating the parents' influences, so it means the same...

On the other hand, if someone wants to implement this "from the outside", does anyone have hints about how to construct such an algorithm? Any feedback will be appreciated most gratefully

I appended an example BN with the variables discussed above. The CPTs are simply deterministic.