syntax from diveintopython

iddwb  iddwb at imap1.asu.edu
Tue Apr 17 22:48:50 EDT 2001


On Tue, 17 Apr 2001, Mark Pilgrim wrote:

> Well, you wouldn't be the first person to tell me that. <0.5 wink>
>

thanks for the expanded reply. However, I still am just not getting
SGMLParser

> For those not familiar with how SGMLParser works, it will call this method
> with an HTML tag ("tag", a string) and the attributes of the tag ("attrs", a

I've tried again with a formulation from Guido's intro to web
programming.  Here's the error..
=====================================
Traceback (most recent call last):
  File "./html3", line 46, in ?
    htmlbuffer.feed(buffer)
  File "/usr/local/lib/python1.6/sgmllib.py", line 82, in feed
    self.rawdata = self.rawdata + data
TypeError: illegal argument type for built-in operation
===================================
I grabbed the rpm for python 1.6.  I'm so new to the language that I
didn't see why 2.x would help.  I'm still trying to overcome years of
Rexx.  anyway, comments appreciated.
====================================
#!/usr/local/bin/python
# first test to open web pages using urlopen2
import sys
from sgmllib import SGMLParser

class HtmlBody(SGMLParser):
    def __init__(self):
        self.links = []
        self.body = ()
        SGMLParser.__init__(self)

    def do_body(self, attrs):
        for (name, value) in attrs:
            if name == "body":
                value = value
                if value:
                    self.body = value
            if name == "href":
                value = cleanlink(value)
                if value:
                    self.links.append(value)

    def getlinks(self):
        return self.links

def cleanlink(link):
    i = string.find(link, '#')
    if i >= 0:
        link = link[:i]
    words = string.split(link)
    string.join(words, "")

if __name__ == '__main__':
    #print sys.argv[1:]
    try:
        f = open("dean.html")
    except IOError:
        print "couldn't open ", sys.argv[1:]
        sys.exit(1)
    buffer = ""
    htmlbuffer = HtmlBody()
    buffer = f.readlines()
    f.close()
    htmlbuffer.feed(buffer)
    htmlbuffer.close()
    body = htmlbuffer.do_body
    links = htmlbuffer.getlinks
    print body
    #print %s %links

>
> - Suppose the original tag is
>   '<a href="index.html" title="Go to home page">'
> - The method will be called with tag='a' and
>   attrs=[('href', 'index.html'), ('title', 'Go to home page')]
> - The list comprehension will produce a list of 2 elements:
>   [' href="index.html"', ' title="Go to home page"']
> - strattrs will be ' href="index.html" title="Go to home page"'
> - The string appended to self.parts will be
>   '<a href="index.html" title="Go to home page">', which is what we want.
>
> Other than using string.join(..., "") instead of "".join(...) -- a topic
> which has been beaten to death recently on this newsgroup and which I
> address explicitly in my book
> (http://diveintopython.org/odbchelper_join.html) -- how would you rewrite
> this?
>
> -M
> You're smart; why haven't you learned Python yet?
> http://diveintopython.org/
> Now in Chinese! http://diveintopython.org/cn/
>

David Bear
College of Public Programs/ASU
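
The walkthrough quoted from Mark Pilgrim above can be reproduced with a short sketch. This is not the code from the thread: it assumes modern Python 3, where sgmllib no longer exists, so html.parser.HTMLParser stands in for SGMLParser and handle_starttag stands in for the method being discussed. The names TagRebuilder and rebuild are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class TagRebuilder(HTMLParser):
    """Hypothetical stand-in for the SGMLParser subclass discussed in the thread."""

    def __init__(self):
        super().__init__()
        self.parts = []  # reconstructed start tags, in document order

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs,
        # e.g. [('href', 'index.html'), ('title', 'Go to home page')]
        pieces = [' %s="%s"' % (name, value) for (name, value) in attrs]
        strattrs = "".join(pieces)
        self.parts.append("<%s%s>" % (tag, strattrs))


def rebuild(html):
    """Feed one string of HTML to the parser and return the rebuilt tags."""
    parser = TagRebuilder()
    parser.feed(html)  # feed() expects a single string, not a list of lines
    parser.close()
    return parser.parts


if __name__ == "__main__":
    print(rebuild('<a href="index.html" title="Go to home page">'))
```

One detail that may bear on the traceback above: feed() concatenates its argument onto an internal string buffer (the `self.rawdata + data` line), so it needs one string. The result of readlines() is a list, which would fail at exactly that line; reading the file with f.read(), or joining the lines first, is one way around it.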


More information about the Python-list mailing list
