\$\begingroup\$\$\endgroup\$
I have the following Ruby code, which parses an XML document sax style (it's a very simplified version):
require 'nokogiri'class ParsingError < RuntimeError; endclass Parser def initialize mapper, collector, error_collector @error_collector = error_collector @sax_handler = Handler.new(collector, @error_collector, mapper) end def process io_stream parser = Nokogiri::XML::SAX::Parser.new(@sax_handler) parser.parse_io(io_stream, 'UTF-8') rescue ParsingError => e @error_collector << "ParsingError: #{e.message}" endendclass Handler < Nokogiri::XML::SAX::Document def initialize collector, error_collector, mapper @mapper = mapper @collector = collector @error_collector = error_collector end def start_element name, attrs=[] return if name=='nodes' @current_element = @mapper.new(name) end def end_element name, attrs=[] return if @current_element.nil? data = @current_element.get_errors if data[:errors].empty? @collector << @current_element.result else @error_collector << data end @current_element = nil end def characters string @current_element.record(string) if @current_element end def error message raise ParsingError, message endendclass Mapper def initialize name @data = "" @name = name end def record str @data << str end def result {name: @name, data: @data} # here are more properties end def get_errors arr = [] if @data.index("Error 123") arr << "Error Code 123" elsif @data.index("Error 456") arr << "Error Code 456" elsif @data.index("Error 789") arr << "Error Code 789" # and a lot more validations end {name: @name, errors: arr} endend# Example usagexml = %{<?xml version="1.0" encoding="UTF-8"?> <nodes> <node1>some more data</node1> <node2>data Error 123</node2> <malformed xml </nodes>}collector = []error_collector = []mapper = Mapperparser = Parser.new(Mapper, collector, error_collector)parser.process(StringIO.new(xml))puts "Data:"puts collectorputsputs "Errors:"puts error_collectorMy problem is the data collection and error handling.
- I'm passing two arrays around,
collectoranderror_collector, which doesn't seem right. I want to collect data and errors in one place, but the collector itself should simply implement a<<method (like an Array). Any advice how to improve it? - Generally, the error handling feels weird. I need it in two places: Logging parsing errors and validations on items. To identify the items I also need the name of a node, except when it's a parsing error. So there are two kinds of errors: Parsing and node validation errors. How could I improve this code and remove the code smells?
- Any other advice how to improve the code above?
You mustlog in to answer this question.
Explore related questions
See similar questions with these tags.
