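Connected Components in Graphs

I. Preface

Connected-component problems are a classic family among graph search problems.

Deciding whether two vertices are connected and counting the number of connected components are tasks that come up again and again in graph problems.

This article focuses on how DFS and union-find (disjoint sets) help solve such problems. It works through several examples and looks at the details that matter in each.

tips: Before reading, you should already be comfortable with graph search (DFS and BFS); they are covered only briefly here.

Now, let's go!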

II. Analysis

Problem One

Given a 2D board containing 'X' and 'O' (the letter O), capture all regions surrounded by 'X'.

A region is captured by flipping all 'O's into 'X's in that surrounded region.

Example:

X X X X
X O O X
X X O X
X O X X

After running your function, the board should be:

X X X X
X X X X
X X X X
X O X X

Explanation:

Surrounded regions shouldn't be on the border, which means that any 'O' on the border of the board is not flipped to 'X'. Any 'O' that is not on the border and is not connected to an 'O' on the border will be flipped to 'X'. Two cells are connected if they are adjacent horizontally or vertically.

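In essence, the board is a regular rectangular grid graph.

1. Node construction

First, we need a way to represent each vertex.

(1) Every cell of the 2D matrix has the form (i, j); mapping it to i * (number of columns) + j gives a unique one-dimensional index.

(2) Create a Node class.

Taking Java as an example, equals and hashCode must be overridden.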

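class Node {
    private final int i;  // row index
    private final int j;  // column index

    public Node(int i, int j) {
        this.i = i;
        this.j = j;
    }

    // Valid but weak hash: every cell on the same anti-diagonal (equal i + j) collides.
    @Override
    public int hashCode() {
        return i + j;
    }

    // Two nodes are equal when they refer to the same cell.
    @Override
    public boolean equals(Object obj) {
        if (obj instanceof Node) {
            return ((Node) obj).getI() == i && ((Node) obj).getJ() == j;
        }
        return false;
    }

    public int getI() {
        return i;
    }

    public int getJ() {
        return j;
    }
}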

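Clearly, approach (1) is simpler; approach (2) is slightly less efficient but more explicit, which suits problems with more complex nodes.

For this example, (1) is the better choice.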
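To make approach (1) concrete, here is a minimal sketch of the index mapping and its inverse. The class and method names (CellIndex, encode, row, col) are hypothetical and only serve the illustration; they do not appear in the solutions in this article.

```java
// Hypothetical helper for illustration only.
final class CellIndex {
    private CellIndex() {}

    // Maps cell (i, j) of a grid with `cols` columns to a unique index.
    static int encode(int i, int j, int cols) {
        return i * cols + j;
    }

    // Recovers the row from an index produced by encode.
    static int row(int index, int cols) {
        return index / cols;
    }

    // Recovers the column from an index produced by encode.
    static int col(int index, int cols) {
        return index % cols;
    }
}
```

The mapping is unique because 0 <= j < cols, so the row and column can always be recovered by division and remainder.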

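2. Edge construction

There are generally two ways to build the edges of a graph.

(1) The classic representations: an adjacency matrix, adjacency lists, and the like.

Take

X O
X X

as an example.

Treating the 'X' cells as vertices, it is represented as the undirected graph NODE(0,0)–NODE(1,0), NODE(1,0)–NODE(1,1).

(2) Because the board here is a regular rectangular grid, a cell can only be adjacent to the cells above, below, left, and right of it, so the description of the edges can be simplified.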
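As a rough sketch of approach (2), the four possible edges of a cell can be generated from a table of offsets instead of being stored. The names GridNeighbors, DIRS, and neighbors are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class GridNeighbors {
    // Offsets for up, down, left, right.
    private static final int[][] DIRS = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};

    private GridNeighbors() {}

    // Returns the in-bounds neighbors of (i, j) as {row, col} pairs.
    static List<int[]> neighbors(int i, int j, int rows, int cols) {
        List<int[]> result = new ArrayList<>();
        for (int[] d : DIRS) {
            int ni = i + d[0];
            int nj = j + d[1];
            if (ni >= 0 && ni < rows && nj >= 0 && nj < cols) {
                result.add(new int[] {ni, nj});
            }
        }
        return result;
    }
}
```

No adjacency matrix or list is ever built; the edges are implied by the grid itself.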

That said, none of this is the main point.

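3. Strategy

Now let's analyze the core strategy for this problem.

Finding the 'O's surrounded by 'X' means deciding, for every 'O', whether it is enclosed by 'X's.

Checking this directly is awkward, though: whether a cell is enclosed depends on its whole region, and the cost of the search grows with the number of 'O's and the size of the matrix.

So let's turn the question around: can we find the 'O's that are not surrounded by 'X' instead?

It is not hard to see that an 'O' that is not surrounded by 'X' must be connected to an 'O' on the border. In other words, we only need to search from the border 'O's, and the problem becomes simple.

BFS and DFS are the natural search strategies, and union-find is a good option too.

1. DFS

Start from each 'O' on the border and search up, down, left, and right. Whenever another 'O' is found, mark it and continue searching from it.

The border 'O's and everything connected to them are marked '#'; the 'O's that remain are the ones surrounded by 'X'.

Finally, turn the remaining 'O's into 'X' and the '#'s back into 'O', and we are done.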

class Solution {
    // The 2D board (the region being searched)
    private char[][] board;

    public void solve(char[][] board) {
        this.board = board;

        if (board == null || board.length == 0) {
            return;
        }

        // DFS from the left and right borders
        for (int i = 0; i < board.length; i++) {
            if (board[i][0] == 'O') {
                search(i, 0);
            }
            if (board[i][board[0].length - 1] == 'O') {
                search(i, board[0].length - 1);
            }
        }

        // DFS from the top and bottom borders
        for (int j = 0; j < board[0].length; j++) {
            if (board[0][j] == 'O') {
                search(0, j);
            }
            if (board[board.length - 1][j] == 'O') {
                search(board.length - 1, j);
            }
        }

        // Conversion: remaining 'O' becomes 'X', marked '#' becomes 'O' again
        for (int i = 0; i < this.board.length; i++) {
            for (int j = 0; j < this.board[0].length; j++) {
                if (this.board[i][j] == 'O') {
                    this.board[i][j] = 'X';
                } else if (this.board[i][j] == '#') {
                    this.board[i][j] = 'O';
                }
            }
        }
    }

    private void search(int i, int j) {
        if (i < board.length && i >= 0 && j >= 0 && j < board[0].length && board[i][j] == 'O') {
            board[i][j] = '#';
            // All four directions are needed: a region can wind back up or to the left
            search(i + 1, j);
            search(i - 1, j);
            search(i, j + 1);
            search(i, j - 1);
        }
    }
}
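BFS was mentioned above as an alternative search strategy. The following is a minimal sketch of the same border-marking idea with a queue instead of recursion, which avoids deep call stacks on large boards. The class name SurroundedRegionsBfs is hypothetical, and this is one possible variant, not the solution used in the rest of this article.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical BFS variant of the border-marking strategy.
final class SurroundedRegionsBfs {
    private static final int[][] DIRS = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};

    private SurroundedRegionsBfs() {}

    static void solve(char[][] board) {
        if (board == null || board.length == 0 || board[0].length == 0) {
            return;
        }
        int rows = board.length;
        int cols = board[0].length;
        Deque<int[]> queue = new ArrayDeque<>();

        // Seed the queue with every 'O' on the border, marking it '#'.
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                boolean onBorder = i == 0 || i == rows - 1 || j == 0 || j == cols - 1;
                if (onBorder && board[i][j] == 'O') {
                    board[i][j] = '#';
                    queue.add(new int[] {i, j});
                }
            }
        }

        // Spread the '#' mark to every 'O' reachable from the border.
        while (!queue.isEmpty()) {
            int[] cell = queue.poll();
            for (int[] d : DIRS) {
                int ni = cell[0] + d[0];
                int nj = cell[1] + d[1];
                if (ni >= 0 && ni < rows && nj >= 0 && nj < cols && board[ni][nj] == 'O') {
                    board[ni][nj] = '#';
                    queue.add(new int[] {ni, nj});
                }
            }
        }

        // '#' cells are restored to 'O'; everything else ends up as 'X'.
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                board[i][j] = board[i][j] == '#' ? 'O' : 'X';
            }
        }
    }
}
```

Cells are marked when they are enqueued rather than when they are dequeued, so each cell enters the queue at most once.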

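2. Union-Find

First, what is union-find?

Union-find (a disjoint-set data structure) is a tree-shaped data structure for handling merge and query operations on disjoint sets. The union-find algorithm defines two operations on it:

Find: determine which subset an element belongs to, by repeatedly following parent links until the root is reached. It can be used to decide whether two elements are in the same subset.
Union: merge two subsets into one set.

For details, see an algorithms textbook or this blog post: https://www.cnblogs.com/MrSaver/p/9607552.html

Here is a Java implementation of union-find. It uses union by size (the smaller tree is attached under the larger one); find simply walks up to the root and does not compress paths.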

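import java.util.HashMap;

// The Node class is the one defined above
class UnionFind {
    // parent.get(x) is the parent of x; a root is its own parent
    HashMap<Node, Node> parent = new HashMap<>();
    // size.get(root) is the number of nodes in the tree rooted at root
    HashMap<Node, Integer> size = new HashMap<>();

    public void union(Node node1, Node node2) {
        // Register nodes seen for the first time as singleton sets
        if (!parent.containsKey(node1)) {
            parent.put(node1, node1);
            size.put(node1, 1);
        }

        if (!parent.containsKey(node2)) {
            parent.put(node2, node2);
            size.put(node2, 1);
        }

        Node root1 = find(node1);
        Node root2 = find(node2);
        if (root1.equals(root2)) {
            return;
        }
        // Union by size: attach the smaller tree under the larger one
        if (size.get(root1) < size.get(root2)) {
            size.put(root2, size.get(root1) + size.get(root2));
            parent.put(root1, root2);
        } else {
            size.put(root1, size.get(root1) + size.get(root2));
            parent.put(root2, root1);
        }
    }

    public Node find(Node node) {
        if (!parent.containsKey(node)) {
            parent.put(node, node);
            // Keep size in sync so a later union on this node does not hit a null size
            size.put(node, 1);
        }
        // Walk up until we reach a node that is its own parent
        while (!node.equals(parent.get(node))) {
            node = parent.get(node);
        }
        return node;
    }

    public boolean connected(Node node1, Node node2) {
        return find(node1).equals(find(node2));
    }
}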
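The find above walks to the root on every call. For comparison, here is a minimal sketch of find with path compression, a standard optimization in which every node on the walked path is re-pointed directly at the root. To stay self-contained it uses Integer keys instead of Node, and the class name CompressingFind is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant for illustration; it is not the UnionFind class used in this article.
final class CompressingFind {
    final Map<Integer, Integer> parent = new HashMap<>();

    int find(int node) {
        parent.putIfAbsent(node, node);
        // First pass: locate the root.
        int root = node;
        while (root != parent.get(root)) {
            root = parent.get(root);
        }
        // Second pass: point every node on the path directly at the root.
        while (node != root) {
            int next = parent.get(node);
            parent.put(node, root);
            node = next;
        }
        return root;
    }
}
```

After one call, later calls on any node of that path reach the root in a single step.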

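With union-find we can find all the 'O's connected to a border 'O'.

The method is simple: a virtual (dummy) node keeps the algorithm cheap.

Every border 'O' is unioned with the virtual node, and every 'O' connected to a border 'O' is unioned with that border cell in turn, so all 'O's that are not surrounded by 'X' end up in the same set as the virtual node.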

class Solution {
    public void solve(char[][] board) {
        if (board == null || board.length == 0) {
            return;
        }
        UnionFind uf = new UnionFind();
        // Virtual node that stands for "connected to the border"
        Node dummy = new Node(-1, -1);
        for (int i = 0; i < board.length; i++) {
            for (int j = 0; j < board[0].length; j++) {
                if (board[i][j] == 'O') {
                    if (i == 0 || i == board.length - 1 || j == 0 || j == board[0].length - 1) {
                        // Border 'O': attach it to the virtual node
                        uf.union(new Node(i, j), dummy);
                    } else {
                        // Interior 'O': union with each adjacent 'O'
                        if (i > 0 && board[i - 1][j] == 'O')
                            uf.union(new Node(i, j), new Node(i - 1, j));
                        if (i < board.length - 1 && board[i + 1][j] == 'O')
                            uf.union(new Node(i, j), new Node(i + 1, j));
                        if (j > 0 && board[i][j - 1] == 'O')
                            uf.union(new Node(i, j), new Node(i, j - 1));
                        // The column bound is board[0].length, not board.length
                        if (j < board[0].length - 1 && board[i][j + 1] == 'O')
                            uf.union(new Node(i, j), new Node(i, j + 1));
                    }
                }
            }
        }
        for (int i = 0; i < board.length; i++) {
            for (int j = 0; j < board[0].length; j++) {
                if (uf.connected(new Node(i, j), dummy)) {
                    // In the same component as the dummy node, so it stays 'O'
                    board[i][j] = 'O';
                } else {
                    board[i][j] = 'X';
                }
            }
        }
    }
}

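As far as I know, however, this approach fails the last test case on LeetCode because it exceeds the time limit.

Going back to the question of node construction, we switch to the simple integer encoding, which passes the tests.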

import java.util.HashMap;

class Solution {
    // Index of the virtual node; real cells use indices >= 0
    private static final int DUMMY = -1;

    public void solve(char[][] board) {
        if (board == null || board.length == 0) {
            return;
        }
        int rows = board.length;
        int cols = board[0].length;
        UnionFind uf = new UnionFind();
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (board[i][j] == 'O') {
                    int cur = i * cols + j;
                    if (i == 0 || i == rows - 1 || j == 0 || j == cols - 1) {
                        // Border 'O': attach it to the virtual node
                        uf.union(cur, DUMMY);
                    } else {
                        // Interior 'O': union with each adjacent 'O'
                        if (i > 0 && board[i - 1][j] == 'O')
                            uf.union(cur, (i - 1) * cols + j);
                        if (i < rows - 1 && board[i + 1][j] == 'O')
                            uf.union(cur, (i + 1) * cols + j);
                        if (j > 0 && board[i][j - 1] == 'O')
                            uf.union(cur, i * cols + j - 1);
                        // The column bound is cols, not the number of rows
                        if (j < cols - 1 && board[i][j + 1] == 'O')
                            uf.union(cur, i * cols + j + 1);
                    }
                }
            }
        }
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (uf.connected(i * cols + j, DUMMY)) {
                    // In the same component as the dummy node, so it stays 'O'
                    board[i][j] = 'O';
                } else {
                    board[i][j] = 'X';
                }
            }
        }
    }
}

class UnionFind {
    // parent.get(x) is the parent of x; a root is its own parent
    HashMap<Integer, Integer> parent = new HashMap<>();
    // size.get(root) is the number of nodes in the tree rooted at root
    HashMap<Integer, Integer> size = new HashMap<>();

    public void union(int node1, int node2) {
        // Register nodes seen for the first time as singleton sets
        if (!parent.containsKey(node1)) {
            parent.put(node1, node1);
            size.put(node1, 1);
        }

        if (!parent.containsKey(node2)) {
            parent.put(node2, node2);
            size.put(node2, 1);
        }

        int root1 = find(node1);
        int root2 = find(node2);
        if (root1 == root2) {
            return;
        }
        // Union by size: attach the smaller tree under the larger one
        if (size.get(root1) < size.get(root2)) {
            size.put(root2, size.get(root1) + size.get(root2));
            parent.put(root1, root2);
        } else {
            size.put(root1, size.get(root1) + size.get(root2));
            parent.put(root2, root1);
        }
    }

    public int find(int node) {
        if (!parent.containsKey(node)) {
            parent.put(node, node);
            // Keep size in sync so a later union on this node does not hit a null size
            size.put(node, 1);
        }
        // Walk up until we reach a node that is its own parent
        while (node != parent.get(node)) {
            node = parent.get(node);
        }
        return node;
    }

    public boolean connected(int node1, int node2) {
        return find(node1) == find(node2);
    }
}

Problem Two

Given a 2d grid map of '1's (land) and '0's (water), count the number of islands. An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically. You may assume all four edges of the grid are all surrounded by water.

Example 1:

Input:
11110
11010
11000
00000

Output: 1

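This problem is very similar to the first one: both are graph problems on a 2D matrix.

Problem One already showed how to search and how to union on such a grid: without building edges explicitly, we simply visit the cells above, below, left, and right of each node. For graphs with this kind of special structure, we can exploit the regularity of the edges to design a better algorithm; there is no need to insist on building a standard adjacency matrix and turn a simple problem into a complicated one.

The node and edge construction here is the same as in Problem One, so it is not repeated.

Let's go straight to the strategy.

Strategy

It is easy to see that the number of islands is the number of connected components formed by the '1' cells.

Union-find is a natural, direct way to count connected components.

Strategy 1: Union-Find

Count how many root ("top") nodes the '1' cells have, that is, how many trees the union-find forest contains; this is the number of islands.

First, we extend the union-find from Problem One with a method that reports the number of trees.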

import java.util.HashMap;
import java.util.HashSet;

class UnionFind {
    HashMap<Integer, Integer> parent = new HashMap<>();
    HashMap<Integer, Integer> size = new HashMap<>();
    // New: the set of current root ("top") nodes
    HashSet<Integer> top = new HashSet<>();

    public UnionFind() {
    }

    public UnionFind(char[][] grid, int row, int col) {
        // New: initially every land cell is its own root
        for (int i = 0; i < row; i++) {
            for (int j = 0; j < col; j++) {
                if (grid[i][j] == '1') {
                    top.add(i * col + j);
                }
            }
        }
    }

    /*
     * When a root is merged under another root, remove it from top
     */
    public void union(int node1, int node2) {
        if (!parent.containsKey(node1)) {
            parent.put(node1, node1);
            size.put(node1, 1);
        }

        if (!parent.containsKey(node2)) {
            parent.put(node2, node2);
            size.put(node2, 1);
        }

        int root1 = find(node1);
        int root2 = find(node2);
        if (root1 == root2) {
            return;
        }
        if (size.get(root1) < size.get(root2)) {
            size.put(root2, size.get(root1) + size.get(root2));
            parent.put(root1, root2);
            top.remove(root1);
        } else {
            size.put(root1, size.get(root1) + size.get(root2));
            parent.put(root2, root1);
            top.remove(root2);
        }
    }

    public int find(int node) {
        if (!parent.containsKey(node)) {
            parent.put(node, node);
            // Keep size in sync so a later union on this node does not hit a null size
            size.put(node, 1);
        }
        while (node != parent.get(node)) {
            node = parent.get(node);
        }
        return node;
    }

    public boolean connected(int node1, int node2) {
        return find(node1) == find(node2);
    }

    // Number of roots, i.e. the number of connected components
    public int getTop() {
        return top.size();
    }
}

Next, applying Union-Find is straightforward: we just union the adjacent land cells.

class Solution {
    public int numIslands(char[][] grid) {
        if (grid == null || grid.length == 0) {
            return 0;
        }

        int row = grid.length;
        int col = grid[0].length;
        // The argument order must match the constructor: (grid, row, col)
        UnionFind uf = new UnionFind(grid, row, col);

        // Union every land cell with its right and lower land neighbors
        for (int i = 0; i < row; i++) {
            for (int j = 0; j < col; j++) {
                if (grid[i][j] == '1') {
                    if (j + 1 < col && grid[i][j + 1] == '1') {
                        uf.union(i * col + j, i * col + j + 1);
                    }

                    if (i + 1 < row && grid[i + 1][j] == '1') {
                        uf.union((i + 1) * col + j, i * col + j);
                    }
                }
            }
        }

        return uf.getTop();
    }
}
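As an alternative to maintaining the top set, the number of trees can also be tracked with a plain counter: start with the number of land cells and subtract one for every union that actually merges two different trees. The sketch below illustrates this under assumed names (IslandCounter and its helpers are hypothetical); it uses an int array for parent and skips union by size for brevity.

```java
// Hypothetical counter-based variant for illustration only.
final class IslandCounter {
    private IslandCounter() {}

    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving keeps the trees shallow
            x = parent[x];
        }
        return x;
    }

    // Returns true only when two different trees were merged.
    private static boolean union(int[] parent, int a, int b) {
        int ra = find(parent, a);
        int rb = find(parent, b);
        if (ra == rb) {
            return false;
        }
        parent[ra] = rb;
        return true;
    }

    static int numIslands(char[][] grid) {
        if (grid == null || grid.length == 0 || grid[0].length == 0) {
            return 0;
        }
        int rows = grid.length;
        int cols = grid[0].length;
        int[] parent = new int[rows * cols];
        int count = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                parent[i * cols + j] = i * cols + j;
                if (grid[i][j] == '1') {
                    count++;  // every land cell starts as its own island
                }
            }
        }
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (grid[i][j] != '1') {
                    continue;
                }
                int cur = i * cols + j;
                if (j + 1 < cols && grid[i][j + 1] == '1' && union(parent, cur, cur + 1)) {
                    count--;
                }
                if (i + 1 < rows && grid[i + 1][j] == '1' && union(parent, cur, cur + cols)) {
                    count--;
                }
            }
        }
        return count;
    }
}
```

Unlike the DFS approach, this variant leaves the input grid unchanged.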

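Of course, we must not forget what DFS has taught us.

Strategy 2: DFS

Here is a small trick.

Recall that in Problem One we marked some of the 'O's during the search. Marking the cells a DFS has passed through, so that they are not visited again, can be called "formatting".

In this problem, we mark the region reached by a DFS from some cell as '0', so those cells are never searched again; that is, we "format" the region we passed through.

Each time a search starts from a cell, the component count increases by 1 and every cell connected to it is formatted; the next search starts from a cell that has not been formatted, the count increases by 1 again, and so on.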

class Solution {
    private char[][] grid;
    private int row;
    private int col;

    public int numIslands(char[][] grid) {
        int count = 0;
        this.grid = grid;
        if (grid == null || grid.length == 0) {
            return count;
        }
        this.row = grid.length;
        this.col = grid[0].length;
        for (int i = 0; i < row; i++) {
            for (int j = 0; j < col; j++) {
                // An unformatted land cell starts a new island
                if (grid[i][j] == '1') {
                    count++;
                    dfs(i, j);
                }
            }
        }
        return count;
    }

    public void dfs(int i, int j) {
        if (i >= row || i < 0 || j >= col || j < 0 || grid[i][j] != '1') {
            return;
        }
        // Format the cell so that it is never searched again
        grid[i][j] = '0';
        dfs(i - 1, j);
        dfs(i + 1, j);
        dfs(i, j + 1);
        dfs(i, j - 1);
    }
}
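On very large grids the recursive DFS can run deep. As a hedged sketch, the same "formatting" idea can be written with an explicit stack instead of recursion; the class name IslandsIterativeDfs is an assumption for this example, and like the recursive version it overwrites the input grid.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical iterative variant for illustration only.
final class IslandsIterativeDfs {
    private static final int[][] DIRS = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};

    private IslandsIterativeDfs() {}

    static int numIslands(char[][] grid) {
        if (grid == null || grid.length == 0 || grid[0].length == 0) {
            return 0;
        }
        int rows = grid.length;
        int cols = grid[0].length;
        int count = 0;
        Deque<int[]> stack = new ArrayDeque<>();
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                if (grid[i][j] != '1') {
                    continue;
                }
                // A new, unformatted land cell starts a new island.
                count++;
                grid[i][j] = '0';
                stack.push(new int[] {i, j});
                while (!stack.isEmpty()) {
                    int[] cell = stack.pop();
                    for (int[] d : DIRS) {
                        int ni = cell[0] + d[0];
                        int nj = cell[1] + d[1];
                        if (ni >= 0 && ni < rows && nj >= 0 && nj < cols && grid[ni][nj] == '1') {
                            grid[ni][nj] = '0';  // format before pushing to avoid duplicates
                            stack.push(new int[] {ni, nj});
                        }
                    }
                }
            }
        }
        return count;
    }
}
```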

Hopefully this problem has strengthened your ability to handle connected-component problems with DFS and union-find.

Problem Three

There are N students in a class. Some of them are friends, while some are not. Their friendship is transitive in nature. For example, if A is a direct friend of B, and B is a direct friend of C, then A is an indirect friend of C. We define a friend circle as a group of students who are direct or indirect friends.

Given a N*N matrix M representing the friend relationship between students in the class. If M[i][j] = 1, then the ith and jth students are direct friends with each other, otherwise not. And you have to output the total number of friend circles among all the students.

Example 1:
Input:
[[1,1,0],
[1,1,0],
[0,0,1]]
Output: 2
Explanation: The 0th and 1st students are direct friends, so they are in a friend circle. The 2nd student himself is in a friend circle. So return 2.

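This time the problem hands us the adjacency matrix directly, so both node construction and edge construction are already done.

The task becomes: given an adjacency matrix, count the connected components.

Strategy 1: Union-Find

Union-find makes this problem very simple, so here is the code directly.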

import java.util.HashMap;
import java.util.HashSet;

class Solution {
    public int findCircleNum(int[][] M) {
        if (M == null || M.length == 0) {
            return 0;
        }
        UnionFind uf = new UnionFind(M);
        return uf.getTop();
    }
}

class UnionFind {
    HashMap<Integer, Integer> parent = new HashMap<>();
    HashMap<Integer, Integer> size = new HashMap<>();
    // The set of current root ("top") nodes
    HashSet<Integer> top = new HashSet<>();

    public UnionFind() {
    }

    // Perform the unions described by the adjacency matrix
    public UnionFind(int[][] matrix) {
        // Initially every student is the root of their own circle
        for (int i = 0; i < matrix.length; i++) {
            top.add(i);
        }
        for (int i = 0; i < matrix.length; i++) {
            for (int j = 0; j < matrix[0].length; j++) {
                if (matrix[i][j] == 1) {
                    union(i, j);
                }
            }
        }
    }

    public void union(int node1, int node2) {
        if (!parent.containsKey(node1)) {
            parent.put(node1, node1);
            size.put(node1, 1);
        }

        if (!parent.containsKey(node2)) {
            parent.put(node2, node2);
            size.put(node2, 1);
        }

        int root1 = find(node1);
        int root2 = find(node2);
        if (root1 == root2) {
            return;
        }
        // Union by size; the root that gets attached is no longer a top node
        if (size.get(root1) < size.get(root2)) {
            size.put(root2, size.get(root1) + size.get(root2));
            parent.put(root1, root2);
            top.remove(root1);
        } else {
            size.put(root1, size.get(root1) + size.get(root2));
            parent.put(root2, root1);
            top.remove(root2);
        }
    }

    public int find(int node) {
        if (!parent.containsKey(node)) {
            parent.put(node, node);
            // Keep size in sync so a later union on this node does not hit a null size
            size.put(node, 1);
        }
        while (node != parent.get(node)) {
            node = parent.get(node);
        }
        return node;
    }

    public boolean connected(int node1, int node2) {
        return find(node1) == find(node2);
    }

    // Number of roots, i.e. the number of friend circles
    public int getTop() {
        return top.size();
    }
}
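Because friendship is mutual, the matrix is symmetric, so scanning only the entries above the diagonal is enough. The following compact sketch relies on that assumption and counts circles with a plain counter and an int array; the class name FriendCircles is hypothetical and this is not the implementation above.

```java
// Hypothetical compact variant; assumes m is square and symmetric.
final class FriendCircles {
    private FriendCircles() {}

    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }

    static int findCircleNum(int[][] m) {
        if (m == null || m.length == 0) {
            return 0;
        }
        int n = m.length;
        int[] parent = new int[n];
        for (int k = 0; k < n; k++) {
            parent[k] = k;
        }
        int circles = n;  // every student starts in their own circle
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {  // upper triangle only
                if (m[i][j] == 1) {
                    int ri = find(parent, i);
                    int rj = find(parent, j);
                    if (ri != rj) {
                        parent[ri] = rj;
                        circles--;  // two circles merged into one
                    }
                }
            }
        }
        return circles;
    }
}
```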

Strategy 2: DFS

We again use the formatting strategy.

Formatting a node means changing grid[node][another] from 1 to 0 and then going on to format another.

class Solution {
    /*
     * Start a DFS from every student i whose row still has an unformatted entry
     */
    public int findCircleNum(int[][] grid) {
        int circleNum = 0;
        int len = grid.length;
        for (int i = 0; i < len; i++) {
            for (int j = 0; j < len; j++) {
                if (grid[i][j] == 1) {
                    circleNum++;
                    dfs(grid, i);
                }
            }
        }
        return circleNum;
    }

    /*
     * Search + format
     */
    public void dfs(int[][] grid, int x) {
        int len = grid.length;
        grid[x][x] = 0;
        // Scan the whole row, including column 0
        for (int i = 0; i < len; i++) {
            if (grid[x][i] == 1) {
                // Format the edge in both directions, then continue from i
                grid[x][i] = 0;
                grid[i][x] = 0;
                dfs(grid, i);
            }
        }
    }
}

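The code above is clear, but it is wasteful: the same node can be searched more than once.

Instead, we can use the visited array that DFS commonly relies on, so that each node is searched exactly once, which lowers the running time.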

class Solution {
    private boolean[] visited;
    private int[][] visitMa;

    public int findCircleNum(int[][] M) {
        if (M == null || M.length == 0) {
            return 0;
        }
        visitMa = M;
        int length = visitMa.length;
        int count = 0;
        visited = new boolean[length];  // visited flags

        // Thanks to the visited array, each student is searched only once
        for (int i = 0; i < length; i++) {
            if (!visited[i]) {   // not visited yet
                DFS(i);          // depth-first search
                count++;         // one more friend circle
            }
        }
        return count;
    }

    // Depth-first search
    public void DFS(int i) {
        visited[i] = true;
        for (int j = 0; j < visitMa.length; j++) {
            if (!visited[j] && visitMa[i][j] == 1) {
                DFS(j);
            }
        }
    }
}
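The visited-array idea carries over to BFS unchanged. Here is a minimal sketch with a queue, under the assumption that the matrix is square; the class name FriendCirclesBfs is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical BFS variant for illustration only.
final class FriendCirclesBfs {
    private FriendCirclesBfs() {}

    static int findCircleNum(int[][] m) {
        if (m == null || m.length == 0) {
            return 0;
        }
        int n = m.length;
        boolean[] visited = new boolean[n];
        Deque<Integer> queue = new ArrayDeque<>();
        int count = 0;
        for (int start = 0; start < n; start++) {
            if (visited[start]) {
                continue;
            }
            // An unvisited student starts a new friend circle.
            count++;
            visited[start] = true;
            queue.add(start);
            while (!queue.isEmpty()) {
                int cur = queue.poll();
                for (int next = 0; next < n; next++) {
                    if (m[cur][next] == 1 && !visited[next]) {
                        visited[next] = true;  // mark on enqueue so nobody is queued twice
                        queue.add(next);
                    }
                }
            }
        }
        return count;
    }
}
```

Each student is dequeued once and each dequeue scans one row, so the whole search reads the matrix once, and the input is left unmodified.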