xiantang/grokking_algorithmsPublic

NotificationsYou must be signed in to change notification settings
Fork5
Star18

grokking algorithms 《图解算法》代码

You must be signed in to change notification settings

Folders and files

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
code		code
README.md		README.md
note.md		note.md

Repository files navigation

图解算法代码

这是本人在18年学习《图解算法》时候学习留下的代码和相应的思考，这本书是一本很不错的算法书籍，如果认真阅读思考可以为你的算法打下不错的基础，在阅读了这本书之后可以去阅读《算法4th》《算法导论》等书，同时也可以辅以 leetcode 练习实践一下。
对于这个仓库的问题和疑问欢迎在 issue 区讨论或者提出 MR 修改代码👏。

二分查找

importmathdefbina_search(list,item):low=0hight=len(list)-1whilelow<=hight:#每次猜测mid=math.floor((low+hight)/2)guess=list[mid]ifguess==item:returnmidifguess>item:hight=mid-1else:low=mid+1returnNonemy_list= [1,3,5,7,8]print(bina_search(my_list,7))

这个思路主要是不断中断数组然后判断是否在中点上

疑问: 为什么判断错误之后 mid 要+1/-1呢?
回答: 画了张图理解了一下主要是把猜错的中点给排除掉

选择排序

deffind_small(arry:list)->int:"""    找到最小的元素    :param arry:    :return:    """smallest=arry[0]smallest_index=0foriinrange(1,len(arry)):ifsmallest>arry[i]:smallest=arry[i]smallest_index=ireturnsmallest_indexdefselectionSort(arry:list)->list:"""    选择排序    :param arry:    :return:    """new_array= []foriinrange(len(arry)):smallest=find_small(arry)new_array.append(arry.pop(smallest))returnnew_arrayprint(selectionSort([5,3,6,2,10]))

疑问:为什么循环len(array)次就可以完成了排序?
回答:因为数组长度是len(array),每次都会弹出一个元素,弹len(array)次就会让原数组为空.

数组链表区别

数组元素是在一起的/列表是分开的,每个元素到都存储了下一个元素的地址
数组读取,随机读取很快/链表插入删除很快.
数组中所有元素都是一个类型.

递归

defcountdown1(i):print(i)countdown1(i-1)# countdown(1)defcountdown2(i):print(i)ifi<=1:#基线条件returnelse:#递归条件countdown2(i-1)a=countdown2(3)

递归主要由两个部分组成:
1. 递归条件:函数调用自己
2. 基线条件:函数不再调用自己

调用栈

当调用另一个函数时候,当前函数处于未完成状态.

deffact(x):ifx==1:return1else:returnx*fact(x-1)#塞入栈中 #fact函数没有结束

递归总结

递归指的是调用自己的函数
1. 递归条件:函数调用自己
2. 基线条件:函数不再调用自己
两种操作1.压入栈中 2.弹出栈
所有函数都进入调用栈
调用栈很长就会很占用内存

D&C算法

defDC(x,y):ifx>y:x=x%yifx==0:#基线条件returnyreturnDC(x,y)ifx<y:y=y%xify==0:returnxreturnDC(x,y)

分治法思路

找出基线条件/尽可能简单
不断分割问题->符合基线条件

求和

#我写的a= [1,2,3,4,5]defsum(array):iflen(array)==0 :return0iflen(array)==1 :returnarray[0]else:first=array[0]array.pop(0)#pop 指的是pop的索引returnfirst+sum(array)print(sum(a))

#标准答案a= [1,2,3,4,5]defsum(array):ifarray== []:return0else:returnarray[0]+sum(array[1:])print(sum(a))

数组求和问题

将数组问题分割成数组第一个元素之后的元素和第一个元素相加
涉及数组的递归函数时候,基线条件通常是数组为空/只包含一个元素

求数目

a= [1,2,3,4,5]defcount(array):ifarray== []:return0else:return1+count(array[1:])print(count(a))

求最大值

a= [1,2,8,4,5]defbiggest(array):ifarray== []:return0else:returnarray[0]if\array[0]>biggest(array[1:])\elsebiggest(array[1:])print(biggest(a))

或许我们接触的是for循环/如果我们用递归呢?使用递归实现二分法

importmatha= [1,2,3,4,5,8,10]defcut(array,item):len_array=len(array)iflen_array==1:return0else:mid=math.floor(len_array/2)ifitem==array[mid]:returnmidifitem>array[mid]:returnlen(array[:mid])+cut(array[mid:],item)else:returncut(array[:mid],item)print(cut(a,8))

基线条件
1. 只剩下一个元素,就是那个需要的元素
2. 中点直接是猜到的元素
递归条件
1. 猜的元素比中点大
2. 猜的元素比中点小

qsort

a= [1,2,6,3,8,5]defqsort(array):iflen(array)<2:returnarraymid=array[0]low= [array[i]foriinrange(1,len(array))ifarray[i]<=mid]up= [array[i]foriinrange(1,len(array))ifarray[i]>mid]returnqsort(low)+[mid]+qsort(up)print(qsort(a))

选择基准值
分别找出比基准大的值和比基准小的值
递归的对子数组进行排序

归纳证明

基线条件
归纳条件证明排序len=1的数组有用 len=2 有用 len=3有用所以之后无需证明都有效

大O表示法

结论:

指明的是算法的增速
指出的是算法的最糟运行时间
不考虑+*-/
通常不考虑常量(快速查找常量比归并小)

qsort 时间复杂度

最糟糕(头)
1. 以头为基准,需要涉及整个列表
2. 因为要遍历两边的数组,而且O(n)不受常量影响
最优(中间)
1. 以中间为基准(类似二分log n)
2. 每层O(n)最佳的就是平均的情况TODO: 随机的选择用作基准的元素

array= [5,7,2,4,3,1,8,6]fromrandomimportrandintdefqsort(array):iflen(array)<=1:returnarrayran=randint(0,len(array)-1)mid=array[ran]array.pop(ran)smaller= [iforiinarrayifi<=mid]bigger= [iforiinarrayifi>mid ]returnqsort(smaller)+ [mid]+qsort(bigger)print(qsort(array))

散列表

DNS 解析域名->ip地址

广度优先遍历

fromcollectionsimportdequegraph= {}graph['you']= ["alice","bob","claire"]graph['bob']= ["anuj","peggy"]graph["alice"]= ["peggy"]graph["claire"]= ["thom","jooy"]graph["anuj"]= []graph["peggy"]= []graph["thom"]= []graph["jonny"]= []defperson_is_seller(name):returnname[-1]=='m'# dequedeffind():search_queue=deque()search_queue+=graph["you"]whilesearch_queue:person=search_queue.popleft()ifperson_is_seller(person):print(person+"is seller")returnTrueelse:search_queue+=graph[person]returnFalsefind()

使用一个队列从右侧加入一个节点的所有的下个节点然后从左侧读取一个节点往返直到队列为空

fromcollectionsimportdequedefperson_is_seller(name):returnname[-1]=='m'graph= {}graph['you']= ["alice","bob","claire"]graph['bob']= ["anuj","peggy"]graph["alice"]= ["peggy"]graph["claire"]= ["thom","jooy"]graph["anuj"]= []graph["peggy"]= []graph["thom"]= []graph["jonny"]= []search_queue=deque()deffind2(search_queue):ifsearch_queue:person=search_queue.popleft()ifperson_is_seller(person):print(person+"is seller")returnTrueelse:search_queue+=graph[person]returnfind2(search_queue)else:returnNonesearch_queue+=graph["you"]print(find2(search_queue))

广度优先遍历是找到是否A-B的
有就可以找到最短路径
有向图可以指定方向
无向图关系双向
按照顺序放入队列就可以找到最短路径,检查过的人需要放入去重列表

迪杰斯特拉算法

graph= {}graph['start']= {}graph['start']['a']=5graph['start']['b']=0graph['a']= {}graph['a']['c']=15graph['a']['d']=20graph['b']= {}graph['b']['c']=30graph['b']['d']=25graph['c']= {}graph['c']['fin']=20graph['d']= {}graph['d']['fin']=10graph['fin']= {}costs= {}costs['a']=5costs['b']=0costs['c']=float("inf")costs['d']=float("inf")costs['fin']=float("inf")parents= {}parents['a']='start'parents['b']='start'parents['c']='start'parents['d']='start'parents['fin']=Noneprocessed= []deffind_lowst(costs):low_cost=float("inf")low_cost_node=Nonefornodeincosts:cost=costs[node]ifcost<low_costandnodenotinprocessed:low_cost=costlow_cost_node=nodereturnlow_cost_nodenode=find_lowst(costs)whilenodeisnotNone:cost=costs[node]friends=graph[node]forfriendinfriends.keys():new_cost=friends[friend]+cost# if new_cost <ifnew_cost<costs[friend]:costs[friend]=new_costparents[friend]=nodeprocessed.append(node)node=find_lowst(costs)print(costs)

简洁版本

graph= {}graph['start']= {}graph['start']['a']=5graph['start']['b']=0graph['a']= {}graph['a']['c']=15graph['a']['d']=20graph['b']= {}graph['b']['c']=30graph['b']['d']=25graph['c']= {}graph['c']['fin']=20graph['d']= {}graph['d']['fin']=10graph['fin']= {}defget_costs_parent(graph):costs= {}parents= {}fornodeingraph.keys():ifnodeingraph['start'].keys():costs[node]=graph['start'][node]parents[node]='start'else:costs[node]=float('inf')parents[node]=Nonereturncosts,parentscosts ,parents=get_costs_parent(graph)processed= []deffind_lowst(costs):low_cost=float("inf")low_cost_node=Nonefornodeincosts:cost=costs[node]ifcost<low_costandnodenotinprocessed:low_cost=costlow_cost_node=nodereturnlow_cost_nodenode=find_lowst(costs)whilenodeisnotNone:cost=costs[node]friends=graph[node]forfriendinfriends.keys():new_cost=friends[friend]+cost# if new_cost <ifnew_cost<costs[friend]:costs[friend]=new_costparents[friend]=nodeprocessed.append(node)node=find_lowst(costs)print(costs['fin'])

graph= {}graph['start']= {}graph['start']['a']=5graph['start']['b']=0graph['a']= {}graph['a']['c']=15graph['a']['d']=20graph['b']= {}graph['b']['c']=30graph['b']['d']=25graph['c']= {}graph['c']['fin']=20graph['d']= {}graph['d']['fin']=10graph['fin']= {}defget_costs_parent(graph):costs= {}parents= {}fornodeingraph.keys():ifnodeingraph['start'].keys():costs[node]=graph['start'][node]parents[node]='start'else:costs[node]=float('inf')parents[node]=Nonereturncosts,parentscosts ,parents=get_costs_parent(graph)deffind_lowst(costs):low_cost=float("inf")low_cost_node=Nonefornodeincosts:cost=costs[node]ifcost<low_costandnodenotinprocessed:low_cost=costlow_cost_node=nodereturnlow_cost_nodedeffind_short_path(costs,processed,parents):node=find_lowst(costs)ifnodeisnotNone:cost=costs[node]neighbors=graph[node]forninneighbors.keys():new_cost=neighbors[n]+costifnew_cost<costs[n]:costs[n]=new_costparents[n]=nodeprocessed.append(node)returnfind_short_path(costs,processed,parents)else:returncosts['fin']processed= []fastst=find_short_path(costs,processed,parents)

递归实现:

基准条件:找到的最小节点为空
递归条件:还有节点没有处理

主要思路:

找到一个节点然后取找他的所有相邻节点
将相邻节点到本节点的距离与本节点到起点的距离相加
判断是否比相邻结点原来的距离短
如果更短直接更新,并且加入去重列表
如何取找一个节点:我们选取的是最小的节点,如果这个节点不在去重队列并且是最小的,就以他为节点更新他的相邻节点,至于我们要选择最小的,是因为要是选择最大的化会遇到无限大的情况(没有找到到达线路)

广度优先搜索可用来在非加权图中查找最短路径
Dijkstra适合在加权图中查找最短路径
加权图为正:Dijkstra/加权图为负:贝尔曼-福德

贪心算法

array= ['mt','wa','or','id','nv','ut','ca','az']states_needed=set(array)# 转换为集合stations= {}stations['kone']=set(['id','nv','ut'])stations['ktwo']=set(['wa','id','mt'])stations['kthree']=set(['or','nv','ca'])stations['kfour']=set(['nv','ut'])stations['kfive']=set(['ca','az'])stations_array= []whilestates_needed:bast_station=Nonestates_covered=set()forstation,state_for_stationinstations.items():# print(station,state_for_station)covered=state_for_station&states_needediflen(covered)>len(states_covered):bast_station=stationstates_covered=coveredstates_needed=states_needed-states_covered# print(states_needed)stations_array.append(bast_station)#最好的stationprint(stations_array)

贪心算法思路:每步都选择最有解,从而达到整体最优解
不适用场景:背包问题/只能找到非常接近最优解的解法

#递归实现deffind_station(states_needed):ifstates_needed:bast_station=Nonestates_covered=set()forstation,state_for_stationinstations.items():# print(station,state_for_station)covered=state_for_station&states_needediflen(covered)>len(states_covered):bast_station=stationstates_covered=coveredstates_needed=states_needed-states_coveredreturn  [bast_station]+find_station(states_needed)else:return []print(find_station(states_needed))

实现思路:1.寻找覆盖未被覆盖区域最多的电台
2.将这个加入队列3.更新未被覆盖区域4.重复1-3

NP完全问题

元素较少运行速度快,元素越来越多速度会变得非常慢.
涉及所有组合的通常是NP完全问题
不能分割成小问题,需要考虑各种情况的
问题涉及旅行商序列等的且难以解决的
问题涉及广播台集合的
问题可以转换为旅行商或者集合覆盖的问题的

graph= {}graph['a']= {}graph['a']['b']=3graph['a']['c']=6graph['a']['d']=1graph['c']= {}graph['c']['a']=6graph['c']['b']=2graph['c']['d']=7graph['b']= {}graph['b']['c']=2graph['b']['d']=5graph['b']['a']=3graph['d']= {}graph['d']['c']=7graph['d']['a']=1graph['d']['b']=5list_city_array=list(graph.keys())deffind_fast(graph,next_array):ifnext_array:finded_array= []city=next_array.pop()list_array=list(graph.keys())cost=0finded_array.append(city)whilelen(finded_array)<len(list_array):low_cost=float('inf')low_cost_city=Noneforneiboingraph[city].keys():ifgraph[city][neibo]<low_costandneibonotinfinded_array:low_cost=graph[city][neibo]low_cost_city=neibo# print(low_cost)cost+=low_costfinded_array.append(low_cost_city)city=low_cost_citymined=min(find_fast(graph,next_array),cost)returnminedelse:returnfloat("inf")print(find_fast(graph,list_city_array))

旅行商问题使用近似算法实现

基准条件:剩余被查找队列为空
递归条件:剩余查找队列不为空1.从城市列表中获得一个城市
2.寻找这个城市最近距离的城市,且不再已查找列表3.更新时间花费,将被查找城市加入已查找队列
4.以这个城市为起点查找最近距离城市5.重复2-4

动态规划

问题:为什么第一排每个单元格都是1500?回答:第一个单元格指的是第一个1磅时候最大价格是1500第二个单元格指的是2磅的时候价格是1500 以此类推

我想应该把他们分开

About

grokking algorithms 《图解算法》代码

Releases

No releases published

Packages

No packages published

Languages

Python100.0%

Movatterモバイル変換

xiantang/grokking_algorithms