# Chapter 4: Trees and Graph

## 知识解析

### **4.4 Graph的表述**

**类似于Tree，Graph主要用于描述事物之间的关系，只要每个节点的parent可能有多个（常见），或者有可能存在cycle，就必须用graph而不是tree来解决。Graph由Vertices (点)和 Edges (边) 组成**

{% hint style="info" %}
Graph有4种表述方式  (representation) . 这4种表述方式都是完备的，也就是说给出其中任何一个表述方式，我都可以画出这个图而不会有任何歧义。
{% endhint %}

**4.4.1 用 object `GraphNode`代表vertex，用 array of `GraphNode`代表从vertex出发的edges指向的children node, 即**

```python
class GraphNode:
  def __init__(val):
    self.val = val
    self.children: List[GraphNode] = [] # array of GraphNode
```

**4.4.2 【基准表述】用 2d array 的index （或者hash的key）代表vertex的id，用subarray代表edges指向的children ids， 即**

```python
edge_matrix = [
  [ 1, 2 ], # children ids for node 0
  [ 2 ],    # children ids for node 1
  [ ],      # children ids for node 2
]

# or use a hash
vertice_children_map = {
  0: [1,2],
  1: [2]
}
```

**4.4.3 用 2d array 的index 代表vertex的id，用subarray中value为1的index来代表children id，即**

```python
edge_matrix = [
  [ 0, 1, 1],  # [1,2] are children ids for node 0
  [ 0, 0, 1],  # [2] are children ids for node 1
  [ ],         # [] are children ids for node 2
]
```

**4.4.4 用 0 \~ n 的整数代表vertex的id, 用array of pairs代表所有的edges, 每个pair代表指出和指入的vertex id，一般先转换成 4.4.2 基准表述再进行处理。即**

```python
vertex_ids = range(0, 3)
edges = [
  [0, 1],  # node 0 -> node 1
  [0, 2],  # node 0 -> node 2
  [1, 2],  # node 1
]

# we need to iterate edges in order to get the children for each node
vertex_children_map = defaultdict(list)
for edge in edges:
  vertex_children_map[edge[0]].append(edge[1])

# vertex_children_map will now behave the same as 4.4.2  
```

## 模式识别

### **4.5 BFS 和 DFS**

{% hint style="warning" %}
访问顺序与深度有关的问题，比如**寻找最短路径，最长路径，按层访问记录的问题，用BFS**；**其他问题，用DFS**。对于许多应用题，DFS相当于在backtracking节点的空间，只不过一个节点可能由多个节点访问到而已。
{% endhint %}

### **4.8 Graph Traversal**

{% hint style="warning" %}
几乎所有的Graph问题都是在Graph Traversal的基础之上。与Tree Traversal的不同点在于，Graph 需要用 `visited` 这一collection 来避免重复访问（视具体问题，非严格必须）；特定地，DFS需要用 `visiting` 或  `indegrees`这一collection 来探测 cycle（必须，避免死循环）。一个节点在 `visiting` 代表着这个节点或者这个节点出发的subgraph正在被访问，因此可以被去除；而在 `visited` 中仅代表这个节点曾经被访问过，因此一旦加入就不会被去除。\
\
**注意：如果需要检测cycle，BFS需要用topological sort的方法。(详见4.9)**\
\
另外，因为Graph可能存在多个root，因此可以循环从任意点出发访问，用 `visited` 避免重复地穷尽所有的节点。
{% endhint %}

{% hint style="success" %}
如果需要按顺序记录访问的路径（e.g. topological sort)，DFS可以把`visited` 改成`OrderedDict` 来正向记录当前section或当前路径访问过的节点，而BFS可以把 `visited` 改成 `parent_map` 或者  `depth_map` 这个hash table 来反向记录 parent 或 深度，前者最后通过backtracking 来结算这一路径。
{% endhint %}

{% hint style="success" %}
【TODO】对于问题的潜在解之间存在重复部分的情况，DFS的return值必须体现subproblem的结果，应该将 `visited` 改成 hash 来实现考虑memoization，也就是 `id` 作为hash的key，return值也就是子问题的解，作为hash的value。这一点在Chapter 5中会详细描述。
{% endhint %}

```python
# [Template] (assume we have the standard representation of id => child_ids) 

class Graph:
  def __init__(self, ids, vertex_children_map):
    self.ids = ids
    # could be 2d array or hash of arrays
    self.vertex_children_map = vertext_children_map
    self.visited = set()  # OrderedDict or Dict if we need the path
    self.visiting = set()
    
  def traverse(self):
    try:
      for id in self.ids:
        self.dfs(id)    # or bfs(id)
    except Exception:
      return # optional
    
    return
  
  """
  DFS
  """
  def dfs(self, id: int):
    if id in visited: return
    if id in visiting:
      raise Exception('has cycle')
    
    visiting.add(id)
    
    child_ids = vertext_children_map[id]
    for child_id in child_ids:
      dfs(child_id)
      
    visiting.remove(id)
    visited.add(id)  # visited[id] = id for path using OrderedDict
    return
    
  """
  BFS without cycle detection
  """
  def bfs(self, id: int):
    if id in visited: return
    queue = [id]
    visited.add(id)
    while len(queue) != 0:
      id = queue.pop(0)
      
      child_ids = vertex_children_map[id]
      for child_id in child_ids:
        if id not in visited:
          # use parent[child_id] = id for path
          # use depth[child_id] = depth[child_id] + 1 for depth
          visited.add(child_id)  
          queue.append(child_id)
    
    return    
  
```

> *e.g.1* *There are `N` rooms and you start in room `0`.  Each room may have some keys to access the next room. Return `true` if and only if you can enter every room. (*[*LeetCode: Keys and Rooms*](https://leetcode.com/problems/keys-and-rooms/)*)*

```python
class Solution:
    def canVisitAllRooms(self, rooms: List[List[int]]) -> bool:
        self.visited = set()
        self.visiting = set() # not entirely necessary, just as demo
        self.vertex_children_map = rooms
        self.traverse(0)
        return len(self.visited) == len(rooms)

    def traverse(self, id):
        if id in self.visited:
            return
        if id in self.visiting:
            return

        self.visiting.add(id)
        child_ids = self.vertex_children_map[id]
        for child_id in child_ids:
            self.traverse(child_id)
        self.visiting.remove(id)
        self.visited.add(id)
```

*e.g.2* *There are a total of numCourses courses you have to take, labeled from 0 to numCourses-1.Some courses may have prerequisites, for example to take course 0 you have to first take course 1, which is expressed as a pair: \[0,1]. Given the total number of courses and a list of prerequisite pairs, is it possible for you to finish all courses? (* [*LeetCode: Course Schedule*](https://leetcode.com/problems/course-schedule/) *)*

```python
class Solution:
    def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
        self.ids = range(numCourses)
        self.vertex_children_map = defaultdict(list)
        for pair in prerequisites:
            self.vertex_children_map[pair[1]].append(pair[0])
        self.visited = set()
        self.visiting = set()

        try:
            for id in self.ids:
                self.traverse(id)
        except Exception:
            return False

        return len(self.visited) == len(self.ids)

    def traverse(self, id: int):
        if id in self.visited:
            return
        if id in self.visiting:
            raise Exception('has cycle')

        self.visiting.add(id)
        child_ids = self.vertex_children_map[id]
        for child_id in child_ids:
            self.traverse(child_id)
        self.visiting.remove(id)
        self.visited.add(id)
```

### **4.9 Topological Sort**

{% hint style="info" %}
Topological Sort是指对于一个Graph，找到一个sequence使得任何一条edge的parent，都出现在这条edge的child之前。\
\
一种方法是最后把 `visited` 反向，最后 `visited` 就是一条满足条件的sequence；\* 第二种方法是用 [Kahn's Algorithm for Topological Sorting](https://en.wikipedia.org/wiki/Topological_sorting#Kahn%27s_algorithm) （仅供参考）
{% endhint %}

```python
# *Kahn's Algorithm

L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edge

while S is not empty do
    remove a node x from S
    add x to L
    for each x's children y  do
        decrement y's indegree count
        if y has 0 indegree then
            insert y into S

if graph has edges then
    return error   (graph has at least one cycle)
else 
    return L   (a topologically sorted order)


# Python Code

class Graph:
  def __init__(self, ids, edges: List[List[int]]):
    self.ids = ids
    # could be 2d array or hash of arrays
    self.edges = edges
    
    self.indegrees = Counter()
    self.vertex_children_map = defaultdict(list)
    for edge in edges:
      self.indegrees[edge[1]] += 1
      self.vertex_children_map[edge[0]].append(edge[1])
      
    self.visited = []
    
  def traverse(self):
    queue = []
    for ids in self.ids:
      if self.indegrees[id] == 0:
        queue.append(id)
    
    while len(queue) != 0:
      id = queue.pop(0)
      visited.append(id)
      
      child_ids = vertex_children_map[id]
      for child_id in child_ids:
        indegrees[id] -= 1
        if indegrees[id] == 0:
          queue.append(id)
      
    return visited if visited.length == self.ids.length else None # has cycle
```

> *e.g.1&#x20;*&#x20;*There are a total of n courses you have to take labelled from 0 to n - 1. Some courses may have prerequisites, for example, if prerequisites\[i] = \[ai, bi] this means you must take the course bi before the course ai. Given the total number of courses numCourses and a list of the prerequisite pairs, return the ordering of courses you should take to finish all courses. (* [*LeetCode: Course Schedule II*](https://leetcode.com/problems/course-schedule-ii/) *)*

```python
class Solution:
    def findOrder(self, numCourses: int, prerequisites: List[List[int]]) -> List[int]:
        self.ids = range(numCourses)
        self.vertex_children_map = defaultdict(list)
        for pair in prerequisites:
            self.vertex_children_map[pair[1]].append(pair[0])
        self.visited = OrderedDict() # change from course schedule I
        self.visiting = set()
        
        try:
            for id in self.ids:
                self.traverse(id)
        except Exception:
            return [] # change from course schedule I

        return reversed(self.visited.keys()) # change from course schedule I

    def traverse(self, id: int):
        if id in self.visited:
            return
        if id in self.visiting:
            raise Exception('has cycle')

        self.visiting.add(id)
        child_ids = self.vertex_children_map[id]
        for child_id in child_ids:
            self.traverse(child_id)
        self.visiting.remove(id)
        self.visited[id] = id # change from course schedule I
```

> *e.g.2&#x20;*&#x20;*There is a new alien language which uses the latin alphabet. However, the order among letters are unknown to you. You receive a list of **non-empty** words from the dictionary, where **words are sorted lexicographically by the rules of this new language**. Derive the order of letters in this language. (*[*LeetCode: Alien Dictionary*](https://leetcode.com/problems/alien-dictionary/)*)*

```python
class Solution:
    def get_edge_from_words(self, word_a, word_b) -> Optional[List[str]]:
        length = min(len(word_a), len(word_b))
        for idx in range(length):
            if word_a[idx] != word_b[idx]:
                return [word_a[idx], word_b[idx]]

        if len(word_a) > len(word_b):
            raise Exception(
                '{} should not appear before {}'.format(word_a, word_b))
        return None

    def alienOrder(self, words: List[str]) -> str:
        self.vertex_children_map = defaultdict(list)
        self.ids = set()
        for word in words:
            self.ids.update(list(word))

        try:
            for idx in range(1, len(words)):
                edge = self.get_edge_from_words(words[idx - 1], words[idx])
                if edge:
                    self.vertex_children_map[edge[0]].append(edge[1])

            self.visited = OrderedDict()
            self.visiting = set()
            for id in self.ids:
                self.traverse(id)
        except Exception:
            return ''

        return ''.join(reversed(self.visited.keys()))

    def traverse(self, id):
        if id in self.visited:
            return
        if id in self.visiting:
            raise Exception('cycle ocurrs')

        self.visiting.add(id)
        child_ids = self.vertex_children_map[id]
        for child_id in child_ids:
            self.traverse(child_id)

        self.visiting.remove(id)
        self.visited[id] = id
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://smilence-yu.gitbook.io/coding/chapter-4-trees-and-graph.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
