Chapter 4: Trees and Graph
Tree 和 Graph 本质上是刻画事物之间的关系的data structure,根据关系的特性决定使用Tree还是Graph。无论输入给定了什么数据结构,只要路径空间是二维的,我们就应该使用这两种数据结构
知识解析
4.4 Graph的表述
类似于Tree,Graph主要用于描述事物之间的关系,只要每个节点的parent可能有多个(常见),或者有可能存在cycle,就必须用graph而不是tree来解决。Graph由Vertices (点)和 Edges (边) 组成
4.4.1 用 object GraphNode
代表vertex,用 array of GraphNode
代表从vertex出发的edges指向的children node, 即
class GraphNode:
def __init__(val):
self.val = val
self.children: List[GraphNode] = [] # array of GraphNode
4.4.2 【基准表述】用 2d array 的index (或者hash的key)代表vertex的id,用subarray代表edges指向的children ids, 即
edge_matrix = [
[ 1, 2 ], # children ids for node 0
[ 2 ], # children ids for node 1
[ ], # children ids for node 2
]
# or use a hash
vertice_children_map = {
0: [1,2],
1: [2]
}
4.4.3 用 2d array 的index 代表vertex的id,用subarray中value为1的index来代表children id,即
edge_matrix = [
[ 0, 1, 1], # [1,2] are children ids for node 0
[ 0, 0, 1], # [2] are children ids for node 1
[ ], # [] are children ids for node 2
]
4.4.4 用 0 ~ n 的整数代表vertex的id, 用array of pairs代表所有的edges, 每个pair代表指出和指入的vertex id,一般先转换成 4.4.2 基准表述再进行处理。即
vertex_ids = range(0, 3)
edges = [
[0, 1], # node 0 -> node 1
[0, 2], # node 0 -> node 2
[1, 2], # node 1
]
# we need to iterate edges in order to get the children for each node
vertex_children_map = defaultdict(list)
for edge in edges:
vertex_children_map[edge[0]].append(edge[1])
# vertex_children_map will now behave the same as 4.4.2
模式识别
4.5 BFS 和 DFS
访问顺序与深度有关的问题,比如寻找最短路径,最长路径,按层访问记录的问题,用BFS;其他问题,用DFS。对于许多应用题,DFS相当于在backtracking节点的空间,只不过一个节点可能由多个节点访问到而已。
4.8 Graph Traversal
几乎所有的Graph问题都是在Graph Traversal的基础之上。与Tree Traversal的不同点在于,Graph 需要用 visited
这一collection 来避免重复访问(视具体问题,非严格必须);特定地,DFS需要用 visiting
或 indegrees
这一collection 来探测 cycle(必须,避免死循环)。一个节点在 visiting
代表着这个节点或者这个节点出发的subgraph正在被访问,因此可以被去除;而在 visited
中仅代表这个节点曾经被访问过,因此一旦加入就不会被去除。
注意:如果需要检测cycle,BFS需要用topological sort的方法。(详见4.9)
另外,因为Graph可能存在多个root,因此可以循环从任意点出发访问,用 visited
避免重复地穷尽所有的节点。
如果需要按顺序记录访问的路径(e.g. topological sort),DFS可以把visited
改成OrderedDict
来正向记录当前section或当前路径访问过的节点,而BFS可以把 visited
改成 parent_map
或者 depth_map
这个hash table 来反向记录 parent 或 深度,前者最后通过backtracking 来结算这一路径。
【TODO】对于问题的潜在解之间存在重复部分的情况,DFS的return值必须体现subproblem的结果,应该将 visited
改成 hash 来实现考虑memoization,也就是 id
作为hash的key,return值也就是子问题的解,作为hash的value。这一点在Chapter 5中会详细描述。
# [Template] (assume we have the standard representation of id => child_ids)
class Graph:
def __init__(self, ids, vertex_children_map):
self.ids = ids
# could be 2d array or hash of arrays
self.vertex_children_map = vertext_children_map
self.visited = set() # OrderedDict or Dict if we need the path
self.visiting = set()
def traverse(self):
try:
for id in self.ids:
self.dfs(id) # or bfs(id)
except Exception:
return # optional
return
"""
DFS
"""
def dfs(self, id: int):
if id in visited: return
if id in visiting:
raise Exception('has cycle')
visiting.add(id)
child_ids = vertext_children_map[id]
for child_id in child_ids:
dfs(child_id)
visiting.remove(id)
visited.add(id) # visited[id] = id for path using OrderedDict
return
"""
BFS without cycle detection
"""
def bfs(self, id: int):
if id in visited: return
queue = [id]
visited.add(id)
while len(queue) != 0:
id = queue.pop(0)
child_ids = vertex_children_map[id]
for child_id in child_ids:
if id not in visited:
# use parent[child_id] = id for path
# use depth[child_id] = depth[child_id] + 1 for depth
visited.add(child_id)
queue.append(child_id)
return
e.g.1 There are
N
rooms and you start in room0
. Each room may have some keys to access the next room. Returntrue
if and only if you can enter every room. (LeetCode: Keys and Rooms)
class Solution:
def canVisitAllRooms(self, rooms: List[List[int]]) -> bool:
self.visited = set()
self.visiting = set() # not entirely necessary, just as demo
self.vertex_children_map = rooms
self.traverse(0)
return len(self.visited) == len(rooms)
def traverse(self, id):
if id in self.visited:
return
if id in self.visiting:
return
self.visiting.add(id)
child_ids = self.vertex_children_map[id]
for child_id in child_ids:
self.traverse(child_id)
self.visiting.remove(id)
self.visited.add(id)
e.g.2 There are a total of numCourses courses you have to take, labeled from 0 to numCourses-1.Some courses may have prerequisites, for example to take course 0 you have to first take course 1, which is expressed as a pair: [0,1]. Given the total number of courses and a list of prerequisite pairs, is it possible for you to finish all courses? ( LeetCode: Course Schedule )
class Solution:
def canFinish(self, numCourses: int, prerequisites: List[List[int]]) -> bool:
self.ids = range(numCourses)
self.vertex_children_map = defaultdict(list)
for pair in prerequisites:
self.vertex_children_map[pair[1]].append(pair[0])
self.visited = set()
self.visiting = set()
try:
for id in self.ids:
self.traverse(id)
except Exception:
return False
return len(self.visited) == len(self.ids)
def traverse(self, id: int):
if id in self.visited:
return
if id in self.visiting:
raise Exception('has cycle')
self.visiting.add(id)
child_ids = self.vertex_children_map[id]
for child_id in child_ids:
self.traverse(child_id)
self.visiting.remove(id)
self.visited.add(id)
4.9 Topological Sort
# *Kahn's Algorithm
L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edge
while S is not empty do
remove a node x from S
add x to L
for each x's children y do
decrement y's indegree count
if y has 0 indegree then
insert y into S
if graph has edges then
return error (graph has at least one cycle)
else
return L (a topologically sorted order)
# Python Code
class Graph:
def __init__(self, ids, edges: List[List[int]]):
self.ids = ids
# could be 2d array or hash of arrays
self.edges = edges
self.indegrees = Counter()
self.vertex_children_map = defaultdict(list)
for edge in edges:
self.indegrees[edge[1]] += 1
self.vertex_children_map[edge[0]].append(edge[1])
self.visited = []
def traverse(self):
queue = []
for ids in self.ids:
if self.indegrees[id] == 0:
queue.append(id)
while len(queue) != 0:
id = queue.pop(0)
visited.append(id)
child_ids = vertex_children_map[id]
for child_id in child_ids:
indegrees[id] -= 1
if indegrees[id] == 0:
queue.append(id)
return visited if visited.length == self.ids.length else None # has cycle
e.g.1 There are a total of n courses you have to take labelled from 0 to n - 1. Some courses may have prerequisites, for example, if prerequisites[i] = [ai, bi] this means you must take the course bi before the course ai. Given the total number of courses numCourses and a list of the prerequisite pairs, return the ordering of courses you should take to finish all courses. ( LeetCode: Course Schedule II )
class Solution:
def findOrder(self, numCourses: int, prerequisites: List[List[int]]) -> List[int]:
self.ids = range(numCourses)
self.vertex_children_map = defaultdict(list)
for pair in prerequisites:
self.vertex_children_map[pair[1]].append(pair[0])
self.visited = OrderedDict() # change from course schedule I
self.visiting = set()
try:
for id in self.ids:
self.traverse(id)
except Exception:
return [] # change from course schedule I
return reversed(self.visited.keys()) # change from course schedule I
def traverse(self, id: int):
if id in self.visited:
return
if id in self.visiting:
raise Exception('has cycle')
self.visiting.add(id)
child_ids = self.vertex_children_map[id]
for child_id in child_ids:
self.traverse(child_id)
self.visiting.remove(id)
self.visited[id] = id # change from course schedule I
e.g.2 There is a new alien language which uses the latin alphabet. However, the order among letters are unknown to you. You receive a list of non-empty words from the dictionary, where words are sorted lexicographically by the rules of this new language. Derive the order of letters in this language. (LeetCode: Alien Dictionary)
class Solution:
def get_edge_from_words(self, word_a, word_b) -> Optional[List[str]]:
length = min(len(word_a), len(word_b))
for idx in range(length):
if word_a[idx] != word_b[idx]:
return [word_a[idx], word_b[idx]]
if len(word_a) > len(word_b):
raise Exception(
'{} should not appear before {}'.format(word_a, word_b))
return None
def alienOrder(self, words: List[str]) -> str:
self.vertex_children_map = defaultdict(list)
self.ids = set()
for word in words:
self.ids.update(list(word))
try:
for idx in range(1, len(words)):
edge = self.get_edge_from_words(words[idx - 1], words[idx])
if edge:
self.vertex_children_map[edge[0]].append(edge[1])
self.visited = OrderedDict()
self.visiting = set()
for id in self.ids:
self.traverse(id)
except Exception:
return ''
return ''.join(reversed(self.visited.keys()))
def traverse(self, id):
if id in self.visited:
return
if id in self.visiting:
raise Exception('cycle ocurrs')
self.visiting.add(id)
child_ids = self.vertex_children_map[id]
for child_id in child_ids:
self.traverse(child_id)
self.visiting.remove(id)
self.visited[id] = id
Last updated
Was this helpful?