Movatterモバイル変換


[0]ホーム

URL:


[Python-Dev] Proof of the pudding: str.partition()

Raymond Hettingerraymond.hettinger at verizon.net
Mon Aug 29 07:48:57 CEST 2005


As promised, here is a full set of real-world comparative codetransformations using str.partition().  The patch isn't intended to beapplied; rather, it is here to test/demonstrate whether the newconstruct offers benefits under a variety of use cases.Overall, I found that partition() usefully encapsulated commonlyoccurring low-level programming patterns.  In most cases, it completelyeliminated the need for slicing and indices.  In several cases, code wassimplified dramatically; in some, the simplification was minor; and in afew cases, the complexity was about the same.  No cases were made worse.Most patterns using str.find() directly translated into an equivalentusing partition.  The only awkwardness that arose was in cases where theoriginal code had a test like, "if s.find(pat) > 0".  That casetranslated to a double-term test, "if found and head".  Also, somepieces of code needed a tail that included the separator.  That need wasmet by inserting a line like "tail = sep + tail".  And that solution ledto a minor naming discomfort for the middle term of the result tuple, itwas being used as both a Boolean found flag and as a string containingthe separator (hence conflicting the choice of names between "found" and"sep").In most cases, there was some increase in efficiency resulting fewertotal steps and tests, and from eliminating double searches.  However,in a few cases, the new code was less efficient because the fragmentonly needed either the head or tail but not both as provided bypartition().In every case, the code was clearer after the transformation.  Also,none of the transformations required str.partition() to be used in atricky way.  In contrast, I found many contortions using str.find()where I had to diagram every possible path to understand what the codewas trying to do or to assure myself that it worked.The new methods excelled at reducing cyclomatic complexity byeliminating conditional paths.  The methods were especially helpful inthe context of multiple finds (i.e. split at the leftmost colon ifpresent within a group following the rightmost forward slash ifpresent).  In several cases, the replaced code exactly matched the purepython version of str.partition() -- this confirms that people areroutinely writing multi-step low-level in-line code that duplicates wasstr.partition() does in a single step.The more complex transformations were handled by first figuring outexactly was the original code did under all possible cases and thenwriting the partition() version to match that spec.  The lesson was thatit is much easier to program from scratch using partition() than it isto code using find().  The new method more naturally expresses a seriesof parsing steps interleaved with other code.With further ado, here are the comparative code fragments:Index: CGIHTTPServer.py===================================================================*** 106,121 ****      def run_cgi(self):          """Execute a CGI script."""          dir, rest = self.cgi_info!         i = rest.rfind('?')!         if i >= 0:!             rest, query = rest[:i], rest[i+1:]!         else:!             query = ''!         i = rest.find('/')!         if i >= 0:!             script, rest = rest[:i], rest[i:]!         else:!             script, rest = rest, ''          scriptname = dir + '/' + script          scriptfile = self.translate_path(scriptname)          if not os.path.exists(scriptfile):--- 106,113 ----      def run_cgi(self):          """Execute a CGI script."""          dir, rest = self.cgi_info!         rest, _, query = rest.rpartition('?')!         script, _, rest = rest.partition('/')          scriptname = dir + '/' + script          scriptfile = self.translate_path(scriptname)          if not os.path.exists(scriptfile):Index: ConfigParser.py===================================================================*** 599,612 ****          if depth > MAX_INTERPOLATION_DEPTH:              raise InterpolationDepthError(option, section, rest)          while rest:!             p = rest.find("%")!             if p < 0:!                 accum.append(rest)                  return!             if p > 0:!                 accum.append(rest[:p])!                 rest = rest[p:]!             # p is no longer used              c = rest[1:2]              if c == "%":                  accum.append("%")--- 599,611 ----          if depth > MAX_INTERPOLATION_DEPTH:              raise InterpolationDepthError(option, section, rest)          while rest:!             head, sep, rest = rest.partition("%")!             if not sep:!                 accum.append(head)                  return!             rest = sep + rest!             if found and head:!                 accum.append(head)              c = rest[1:2]              if c == "%":                  accum.append("%")Index: cgi.py===================================================================*** 337,346 ****      key = plist.pop(0).lower()      pdict = {}      for p in plist:!         i = p.find('=')!         if i >= 0:!             name = p[:i].strip().lower()!             value = p[i+1:].strip()              if len(value) >= 2 and value[0] == value[-1] == '"':                  value = value[1:-1]                  value = value.replace('\\\\', '\\').replace('\\"','"')--- 337,346 ----      key = plist.pop(0).lower()      pdict = {}      for p in plist:!         name, found, value = p.partition('=')!         if found:!             name = name.strip().lower()!             value = value.strip()              if len(value) >= 2 and value[0] == value[-1] == '"':                  value = value[1:-1]                  value = value.replace('\\\\', '\\').replace('\\"','"')Index: cookielib.py===================================================================*** 610,618 ****    def request_port(request):      host = request.get_host()!     i = host.find(':')!     if i >= 0:!         port = host[i+1:]          try:              int(port)          except ValueError:--- 610,617 ----    def request_port(request):      host = request.get_host()!     _, sep, port = host.partition(':')!     if sep:          try:              int(port)          except ValueError:****************** 670,681 ****      '.local'        """!     i = h.find(".")!     if i >= 0:!         #a = h[:i]  # this line is only here to show what a is!         b = h[i+1:]!         i = b.find(".")!         if is_HDN(h) and (i >= 0 or b == "local"):              return "."+b      return h  --- 669,677 ----      '.local'        """!     a, found, b = h.partition('.')!     if found:!         if is_HDN(h) and ('.' in b or b == "local"):              return "."+b      return h  ****************** 1451,1463 ****          else:              path_specified = False              path = request_path(request)!             i = path.rfind("/")!             if i != -1:                  if version == 0:                      # Netscape spec parts company from reality here!                     path = path[:i]                  else:!                     path = path[:i+1]              if len(path) == 0: path = "/"            # set default domain--- 1447,1459 ----          else:              path_specified = False              path = request_path(request)!             head, sep, _ = path.rpartition('/')!             if sep:                  if version == 0:                      # Netscape spec parts company from reality here!                     path = head                  else:!                     path = head + sep              if len(path) == 0: path = "/"            # set default domainIndex: gopherlib.py===================================================================*** 57,65 ****      """Send a selector to a given host and port, return a file withthe reply."""      import socket      if not port:!         i = host.find(':')!         if i >= 0:!             host, port = host[:i], int(host[i+1:])      if not port:          port = DEF_PORT      elif type(port) == type(''):--- 57,65 ----      """Send a selector to a given host and port, return a file withthe reply."""      import socket      if not port:!         head, found, tail = host.partition(':')!         if found:!             host, port = head, int(tail)      if not port:          port = DEF_PORT      elif type(port) == type(''):Index: httplib.py===================================================================*** 490,498 ****          while True:              if chunk_left is None:                  line = self.fp.readline()!                 i = line.find(';')!                 if i >= 0:!                     line = line[:i] # strip chunk-extensions                  chunk_left = int(line, 16)                  if chunk_left == 0:                      break--- 490,496 ----          while True:              if chunk_left is None:                  line = self.fp.readline()!                 line, _, _ = line.partition(';')  # stripchunk-extensions                  chunk_left = int(line, 16)                  if chunk_left == 0:                      break****************** 586,599 ****        def _set_hostport(self, host, port):          if port is None:!             i = host.rfind(':')!             j = host.rfind(']')         # ipv6 addresses have [...]!             if i > j:                  try:!                     port = int(host[i+1:])                  except ValueError:!                     raise InvalidURL("nonnumeric port: '%s'" %host[i+1:])!                 host = host[:i]              else:                  port = self.default_port              if host and host[0] == '[' and host[-1] == ']':--- 584,595 ----        def _set_hostport(self, host, port):          if port is None:!             host, _, port = host.rpartition(':')!             if ']' not in port:         # ipv6 addresses have [...]                  try:!                     port = int(port)                  except ValueError:!                     raise InvalidURL("nonnumeric port: '%s'" % port)              else:                  port = self.default_port              if host and host[0] == '[' and host[-1] == ']':****************** 976,998 ****          L = [self._buf]          self._buf = ''          while 1:!             i = L[-1].find("\n")!             if i >= 0:                  break              s = self._read()              if s == '':                  break              L.append(s)!         if i == -1:              # loop exited because there is no more data              return "".join(L)          else:!             all = "".join(L)!             # XXX could do enough bookkeeping not to do a 2nd search!             i = all.find("\n") + 1!             line = all[:i]!             self._buf = all[i:]!             return line        def readlines(self, sizehint=0):          total = 0--- 972,990 ----          L = [self._buf]          self._buf = ''          while 1:!             head, found, tail = L[-1].partition('\n')!             if found:                  break              s = self._read()              if s == '':                  break              L.append(s)!         if not found:              # loop exited because there is no more data              return "".join(L)          else:!             self._buf = found + tail!             return "".join(L) + head        def readlines(self, sizehint=0):          total = 0Index: ihooks.py===================================================================*** 426,438 ****          return None        def find_head_package(self, parent, name):!         if '.' in name:!             i = name.find('.')!             head = name[:i]!             tail = name[i+1:]!         else:!             head = name!             tail = ""          if parent:              qname = "%s.%s" % (parent.__name__, head)          else:--- 426,432 ----          return None        def find_head_package(self, parent, name):!         head, _, tail = name.partition('.')          if parent:              qname = "%s.%s" % (parent.__name__, head)          else:****************** 449,457 ****      def load_tail(self, q, tail):          m = q          while tail:!             i = tail.find('.')!             if i < 0: i = len(tail)!             head, tail = tail[:i], tail[i+1:]              mname = "%s.%s" % (m.__name__, head)              m = self.import_it(head, mname, m)              if not m:--- 443,449 ----      def load_tail(self, q, tail):          m = q          while tail:!             head, _, tail = tail.partition('.')              mname = "%s.%s" % (m.__name__, head)              m = self.import_it(head, mname, m)              if not m:Index: locale.py===================================================================*** 98,106 ****      seps = 0      spaces = ""      if s[-1] == ' ':!         sp = s.find(' ')!         spaces = s[sp:]!         s = s[:sp]      while s and grouping:          # if grouping is -1, we are done          if grouping[0]==CHAR_MAX:--- 98,105 ----      seps = 0      spaces = ""      if s[-1] == ' ':!         spaces, sep, tail = s.partition(' ')!         s = sep + tail      while s and grouping:          # if grouping is -1, we are done          if grouping[0]==CHAR_MAX:****************** 148,156 ****          # so, kill as much spaces as there where separators.          # Leading zeroes as fillers are not yet dealt with, as it is          # not clear how they should interact with grouping.!         sp = result.find(" ")!         if sp==-1:break!         result = result[:sp]+result[sp+1:]          seps -= 1        return result--- 147,156 ----          # so, kill as much spaces as there where separators.          # Leading zeroes as fillers are not yet dealt with, as it is          # not clear how they should interact with grouping.!         head, found, tail = result.partition(' ')!         if not found:!             break!         result = head + tail          seps -= 1        return resultIndex: mailcap.py===================================================================*** 105,117 ****      key, view, rest = fields[0], fields[1], fields[2:]      fields = {'view': view}      for field in rest:!         i = field.find('=')!         if i < 0:!             fkey = field!             fvalue = ""!         else:!             fkey = field[:i].strip()!             fvalue = field[i+1:].strip()          if fkey in fields:              # Ignore it              pass--- 105,113 ----      key, view, rest = fields[0], fields[1], fields[2:]      fields = {'view': view}      for field in rest:!         fkey, found, fvalue = field.partition('=')!         fkey = fkey.strip()!         fvalue = fvalue.strip()          if fkey in fields:              # Ignore it              passIndex: mhlib.py===================================================================*** 356,364 ****          if seq == 'all':              return all          # Test for X:Y before X-Y because 'seq:-n' matches both!         i = seq.find(':')!         if i >= 0:!             head, dir, tail = seq[:i], '', seq[i+1:]              if tail[:1] in '-+':                  dir, tail = tail[:1], tail[1:]              if not isnumeric(tail):--- 356,364 ----          if seq == 'all':              return all          # Test for X:Y before X-Y because 'seq:-n' matches both!         head, found, tail = seq.partition(':')!         if found:!             dir = ''              if tail[:1] in '-+':                  dir, tail = tail[:1], tail[1:]              if not isnumeric(tail):****************** 394,403 ****                      i = bisect(all, anchor-1)                      return all[i:i+count]          # Test for X-Y next!         i = seq.find('-')!         if i >= 0:!             begin = self._parseindex(seq[:i], all)!             end = self._parseindex(seq[i+1:], all)              i = bisect(all, begin-1)              j = bisect(all, end)              r = all[i:j]--- 394,403 ----                      i = bisect(all, anchor-1)                      return all[i:i+count]          # Test for X-Y next!         head, found, tail = seq.find('-')!         if found:!             begin = self._parseindex(head, all)!             end = self._parseindex(tail, all)              i = bisect(all, begin-1)              j = bisect(all, end)              r = all[i:j]Index: modulefinder.py===================================================================*** 140,148 ****              assert caller is parent              self.msgout(4, "determine_parent ->", parent)              return parent!         if '.' in pname:!             i = pname.rfind('.')!             pname = pname[:i]              parent = self.modules[pname]              assert parent.__name__ == pname              self.msgout(4, "determine_parent ->", parent)--- 140,147 ----              assert caller is parent              self.msgout(4, "determine_parent ->", parent)              return parent!         pname, found, _ = pname.rpartition('.')!         if found:              parent = self.modules[pname]              assert parent.__name__ == pname              self.msgout(4, "determine_parent ->", parent)****************** 152,164 ****        def find_head_package(self, parent, name):          self.msgin(4, "find_head_package", parent, name)!         if '.' in name:!             i = name.find('.')!             head = name[:i]!             tail = name[i+1:]!         else:!             head = name!             tail = ""          if parent:              qname = "%s.%s" % (parent.__name__, head)          else:--- 151,157 ----        def find_head_package(self, parent, name):          self.msgin(4, "find_head_package", parent, name)!         head, _, tail = name.partition('.')          if parent:              qname = "%s.%s" % (parent.__name__, head)          else:Index: pdb.py===================================================================*** 189,200 ****          # split into ';;' separated commands          # unless it's an alias command          if args[0] != 'alias':!             marker = line.find(';;')!             if marker >= 0:!                 # queue up everything after marker!                 next = line[marker+2:].lstrip()                  self.cmdqueue.append(next)!                 line = line[:marker].rstrip()          return line        # Command definitions, called by cmdloop()--- 189,200 ----          # split into ';;' separated commands          # unless it's an alias command          if args[0] != 'alias':!             line, found, next = line.partition(';;')!             if found:!                 # queue up everything after command separator!                 next = next.lstrip()                  self.cmdqueue.append(next)!                 line = line.rstrip()          return line        # Command definitions, called by cmdloop()****************** 217,232 ****          filename = None          lineno = None          cond = None!         comma = arg.find(',')!         if comma > 0:              # parse stuff after comma: "condition"!             cond = arg[comma+1:].lstrip()!             arg = arg[:comma].rstrip()          # parse stuff before comma: [filename:]lineno | function-         colon = arg.rfind(':')          funcname = None!         if colon >= 0:!             filename = arg[:colon].rstrip()              f = self.lookupmodule(filename)              if not f:                  print '*** ', repr(filename),--- 217,232 ----          filename = None          lineno = None          cond = None!         arg, found, cond = arg.partition(',')!         if found and arg:              # parse stuff after comma: "condition"!             arg = arg.rstrip()!             cond = cond.lstrip()          # parse stuff before comma: [filename:]lineno | function          funcname = None!         filename, found, arg = arg.rpartition(':')!         if found:!             filename = filename.rstrip()              f = self.lookupmodule(filename)              if not f:                  print '*** ', repr(filename),****************** 234,240 ****                  return              else:                  filename = f!             arg = arg[colon+1:].lstrip()              try:                  lineno = int(arg)              except ValueError, msg:--- 234,240 ----                  return              else:                  filename = f!             arg = arg.lstrip()              try:                  lineno = int(arg)              except ValueError, msg:****************** 437,445 ****              return          if ':' in arg:              # Make sure it works for "clear C:\foo\bar.py:12"!             i = arg.rfind(':')!             filename = arg[:i]!             arg = arg[i+1:]              try:                  lineno = int(arg)              except:--- 437,443 ----              return          if ':' in arg:              # Make sure it works for "clear C:\foo\bar.py:12"!             filename, _, arg = arg.rpartition(':')              try:                  lineno = int(arg)              except:Index: rfc822.py===================================================================*** 197,205 ****          You may override this method in order to use Message parsingon tagged          data in RFC 2822-like formats with special header formats.          """!         i = line.find(':')!         if i > 0:!             return line[:i].lower()          return None        def islast(self, line):--- 197,205 ----          You may override this method in order to use Message parsingon tagged          data in RFC 2822-like formats with special header formats.          """!         head, found, tail = line.partition(':')!         if found and head:!             return head.lower()          return None        def islast(self, line):****************** 340,348 ****              else:                  if raw:                      raw.append(', ')!                 i = h.find(':')!                 if i > 0:!                     addr = h[i+1:]                  raw.append(addr)          alladdrs = ''.join(raw)          a = AddressList(alladdrs)--- 340,348 ----              else:                  if raw:                      raw.append(', ')!                 head, found, tail = h.partition(':')!                 if found and head:!                     addr = tail                  raw.append(addr)          alladdrs = ''.join(raw)          a = AddressList(alladdrs)****************** 859,867 ****              data = stuff + data[1:]      if len(data) == 4:          s = data[3]!         i = s.find('+')!         if i > 0:!             data[3:] = [s[:i], s[i+1:]]          else:              data.append('') # Dummy tz      if len(data) < 5:--- 859,867 ----              data = stuff + data[1:]      if len(data) == 4:          s = data[3]!         head, found, tail = s.partition('+')!         if found and head:!             data[3:] = [head, tail]          else:              data.append('') # Dummy tz      if len(data) < 5:Index: robotparser.py===================================================================*** 104,112 ****                      entry = Entry()                      state = 0              # remove optional comment and strip line!             i = line.find('#')!             if i>=0:!                 line = line[:i]              line = line.strip()              if not line:                  continue--- 104,110 ----                      entry = Entry()                      state = 0              # remove optional comment and strip line!             line, _, _ = line.partition('#')              line = line.strip()              if not line:                  continueIndex: smtpd.py===================================================================*** 144,156 ****                  self.push('500 Error: bad syntax')                  return              method = None!             i = line.find(' ')!             if i < 0:!                 command = line.upper()                  arg = None              else:!                 command = line[:i].upper()!                 arg = line[i+1:].strip()              method = getattr(self, 'smtp_' + command, None)              if not method:                  self.push('502 Error: command "%s" not implemented' %command)--- 144,155 ----                  self.push('500 Error: bad syntax')                  return              method = None!             command, found, arg = line.partition(' ')!             command = command.upper()            !             if not found:                  arg = None              else:!                 arg = tail.strip()              method = getattr(self, 'smtp_' + command, None)              if not method:                  self.push('502 Error: command "%s" not implemented' %command)****************** 495,514 ****          usage(1, 'Invalid arguments: %s' % COMMASPACE.join(args))        # split into host/port pairs!     i = localspec.find(':')!     if i < 0:          usage(1, 'Bad local spec: %s' % localspec)!     options.localhost = localspec[:i]      try:!         options.localport = int(localspec[i+1:])      except ValueError:          usage(1, 'Bad local port: %s' % localspec)!     i = remotespec.find(':')!     if i < 0:          usage(1, 'Bad remote spec: %s' % remotespec)!     options.remotehost = remotespec[:i]      try:!         options.remoteport = int(remotespec[i+1:])      except ValueError:          usage(1, 'Bad remote port: %s' % remotespec)      return options--- 494,513 ----          usage(1, 'Invalid arguments: %s' % COMMASPACE.join(args))        # split into host/port pairs!     head, found, tail = localspec.partition(':')!     if not found:          usage(1, 'Bad local spec: %s' % localspec)!     options.localhost = head      try:!         options.localport = int(tail)      except ValueError:          usage(1, 'Bad local port: %s' % localspec)!     head, found, tail = remotespec.partition(':')        !     if not found:          usage(1, 'Bad remote spec: %s' % remotespec)!     options.remotehost = head      try:!         options.remoteport = int(tail)      except ValueError:          usage(1, 'Bad remote port: %s' % remotespec)      return optionsIndex: smtplib.py===================================================================*** 276,284 ****            """          if not port and (host.find(':') == host.rfind(':')):!             i = host.rfind(':')!             if i >= 0:!                 host, port = host[:i], host[i+1:]                  try: port = int(port)                  except ValueError:                      raise socket.error, "nonnumeric port"--- 276,283 ----            """          if not port and (host.find(':') == host.rfind(':')):!             host, found, port = host.rpartition(':')!             if found:                  try: port = int(port)                  except ValueError:                      raise socket.error, "nonnumeric port"Index: urllib2.py===================================================================*** 289,301 ****      def add_handler(self, handler):          added = False          for meth in dir(handler):!             i = meth.find("_")!             protocol = meth[:i]!             condition = meth[i+1:]!               if condition.startswith("error"):!                 j = condition.find("_") + i + 1!                 kind = meth[j+1:]                  try:                      kind = int(kind)                  except ValueError:--- 289,297 ----      def add_handler(self, handler):          added = False          for meth in dir(handler):!             protocol, _, condition = meth.partition('_')              if condition.startswith("error"):!                 _, _, kind = condition.partition('_')                  try:                      kind = int(kind)                  except ValueError:Index: zipfile.py===================================================================*** 117,125 ****          self.orig_filename = filename   # Original file name inarchive  # Terminate the file name at the first null byte.  Null bytes in file  # names are used as tricks by viruses in archives.!         null_byte = filename.find(chr(0))!         if null_byte >= 0:!             filename = filename[0:null_byte]  # This is used to ensure paths in generated ZIP files always use  # forward slashes as the directory separator, as required by the  # ZIP format specification.--- 117,123 ----          self.orig_filename = filename   # Original file name inarchive  # Terminate the file name at the first null byte.  Null bytes in file  # names are used as tricks by viruses in archives.!         filename, _, _ = filename.partition(chr(0))  # This is used to ensure paths in generated ZIP files always use  # forward slashes as the directory separator, as required by the  # ZIP format specification.


More information about the Python-Devmailing list

[8]ページ先頭

©2009-2025 Movatter.jp