
ZOJ Problem Set  4007
Consider a famous problem in the field of semisupervised learning: Let's say we have a complete graph with \(n = l + u\) vertices. For the first \(l\) vertices, the \(i\)th of them can be represented as \((x_i, y_i)\), where \(x_i\) is the feature vector and \(y_i\) is the label. As for the following \(u\) vertices, the \(j\)th of them can only be represented as \(x_{l+j}\), as its label is unknown. The weight of the edge connecting vertex \(i\) and vertex \(j\) is calculated by the formula \[w_{ij} = \exp(\frac{x_ix_j^2_2}{2\sigma^2})\] where \(x_ix_j_2\) is the L2 norm (or length) of the vector \((x_ix_j)\), and \(\sigma\) is a hyperparameter defined by the user. What we need to do is to assign labels \(y_{l+1}, y_{l+2}, \dots, y_{l+u}\) to the last \(u\) vertices and minimize the following objective function (which is called the energy function): \[E(f) = \frac{1}{2}\sum_{i=1}^n\sum_{j=1}^nw_{ij}(f(i)f(j))^2\] where \(f(i)\) is the \(i\)th element in the vector \(f\), and \[f = \begin{bmatrix} f_l \\ f_u \end{bmatrix} \quad f_l = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_l \end{bmatrix} \quad f_u = \begin{bmatrix} y_{l+1} \\ y_{l+2} \\ \vdots \\ y_{l+u} \end{bmatrix}\] By some mathematical deduction, we can easily discover that the optimal \(f_u\) can be calculated by the following formula: \[f_u = (D_{uu}  W_{uu})^{1}W_{ul}f_l\] where \((D_{uu}  W_{uu})\) is the Laplacian matrix, and \[d_i = \sum_{j=1}^nw_{ij} \quad D_{uu} = \text{diag}(d_{l+1}, d_{l+2}, \dots, d_{l+u})\] \[W_{uu} = \begin{bmatrix} w_{(l+1)(l+1)} & w_{(l+1)(l+2)} & \dots & w_{(l+1)(l+u)} \\ w_{(l+2)(l+1)} & w_{(l+2)(l+2)} & \dots & w_{(l+2)(l+u)} \\ \vdots & \vdots & \ddots & \vdots \\ w_{(l+u)(l+1)} & w_{(l+u)(l+2)} & \dots & w_{(l+u)(l+u)} \end{bmatrix} \quad W_{ul} = \begin{bmatrix} w_{(l+1)1} & w_{(l+1)2} & \dots & w_{(l+1)l} \\ w_{(l+2)1} & w_{(l+2)2} & \dots & w_{(l+2)l} \\ \vdots & \vdots & \ddots & \vdots \\ w_{(l+u)1} & w_{(l+u)2} & \dots & w_{(l+u)l} \end{bmatrix}\] But this problem is too hard for BaoBao to solve, so he comes up with an easier version of the problem: Let's say we are given a tree, not a complete graph, with \(n\) vertices. The \(i\)th vertex can be represented by \((x_i, y_i)\), and what's more, we have \(x_i = x_j\) for all \(1 \le i, j \le n\). The label of the \(i\)th vertex is equal to \(y_i\) if \(y_i = 0\) or \(y_i = 1\), and the label of the \(i\)th vertex is unknown if \(y_i = 1\). The weight of the edge connecting vertex \(i\) and vertex \(j\) (which is \(w_{ij}\)) is also calculated by the formula given above, and \(w_{ij} = 0\) if there is no edge between vertex \(i\) and \(j\). We still need to assign labels to the vertices with \(y_i = 1\) to minimize the objective function given above, but there is a constraint: the label we assign to the vertex must be either 0 or 1. Please calculate the minimum possible value of the objective function after all the vertices with unknown labels are properly assigned. InputThere are multiple test cases. The first line of the input contains an integer \(T\), indicating the number of test cases. For each test case: The first line contains an integer \(n\) (\(2 \le n \le 10^5\)), indicating the size of the tree. The second line contains \(n\) integers \(y_1, y_2, \dots, y_n\) (\(1 \le y_i \le 1\)), indicating the label of each vertex. If \(y_i = 1\) then the label of that vertex is unknown. For the following \((n1)\) lines, the \(i\)th line contains two integers \(u_i\) and \(v_i\) (\(1 \le u_i, v_i \le n\)), indicating an edge between vertex \(u_i\) and \(v_i\) in the tree. It's guaranteed that the given graph is a tree, and the sum of \(n\) over all test cases will not exceed \(10^6\). As the stack of the online judge system is limited to 8MB, the maximum diameter of the tree is limited to about \(6 \times 10^4\). OutputFor each test case output one line containing one integer, indicating the minimum possible value of the objective function after assigning all the vertices with unknown labels properly. Sample Input2 6 1 0 1 1 1 0 1 2 2 4 1 3 3 5 3 6 3 1 1 1 1 2 2 3 Sample Output2 0 HintFor the first sample test case, we can assign \(y_3 = 1\) and \(y_4 = 0\). For the second sample test case, just assign all the labels to the same value. Author: WENG, Caizhi Source: ZOJ Monthly, March 2018 