Tip: The name of a function defined with def could be used instead. A lambda expression is sufficient when the handler needs just one expression.
Info: For StartElementHandler, we append the tag name to the list. For CharacterDataHandler, we append the data.
And: This yields a list containing start element names, and the contents of those elements.
Note: Many other handlers, including EndElementHandler and CommentHandler, are available. Please see the Python documentation.
xml.parsers.expat: Python.org
Python program that uses xml.parsers.expat
import xml.parsers.expat
# Will store tag names and char data.
# (Named items so the built-in list type is not shadowed.)
items = []
# Create the parser.
parser = xml.parsers.expat.ParserCreate()
# Specify handlers: each one appends what it receives to items.
parser.StartElementHandler = lambda name, attrs: items.append(name)
parser.CharacterDataHandler = lambda data: items.append(data)
# Parse a string. The second argument, True, marks this as the final chunk.
parser.Parse("""<?xml version="1.0"?>
<item><name>Sam</name>
<name>Mark</name>
</item>""", True)
# Print the items in our list.
print(items)
Output
['item', 'name', 'Sam', '\n', 'name', 'Mark', '\n']
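As a minimal sketch of the def-based alternative mentioned in the Tip, the program below assigns named functions as handlers and also sets EndElementHandler and CommentHandler. The function names, the events list, and the sample XML with a comment are illustrative assumptions, not part of the original program. Whitespace-only character data (such as the '\n' entries in the output above) is skipped, and buffer_text is enabled so adjacent text is delivered in one call.

```python
import xml.parsers.expat

# Collected (kind, value) tuples. The name "events" is hypothetical.
events = []

def start_element(name, attrs):
    # Called for each opening tag.
    events.append(("start", name))

def end_element(name):
    # Called for each closing tag.
    events.append(("end", name))

def char_data(data):
    # Skip whitespace-only data, such as newlines between elements.
    if data.strip():
        events.append(("text", data))

def comment(data):
    # Called with the text inside <!-- ... -->.
    events.append(("comment", data))

parser = xml.parsers.expat.ParserCreate()
# Ask expat to merge adjacent character data before calling the handler.
parser.buffer_text = True
parser.StartElementHandler = start_element
parser.EndElementHandler = end_element
parser.CharacterDataHandler = char_data
parser.CommentHandler = comment

# Sample document (assumed for illustration); True marks the final chunk.
parser.Parse("<item><!-- people --><name>Sam</name>\n<name>Mark</name>\n</item>", True)
print(events)
```

Named functions are easier to extend than lambdas once a handler needs a condition or more than one step, as char_data does here.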
So: For performance, Expat is a good choice. It may be harder to use than other solutions. This is a tradeoff you must evaluate.
The Expat XML Parser: github.io
But: A C-based, optimized XML parser like Expat is likely one of the fastest options. It also requires less testing on your part, because the parser itself is already developed.
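To make the ease-of-use tradeoff concrete, here is a hedged sketch that reads the same document with xml.etree.ElementTree from the standard library. The text above does not name a specific alternative, so choosing ElementTree is an assumption made only for comparison. It builds the whole tree in memory, which is simpler to query but gives up the event-by-event control that Expat handlers provide.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the Expat program.
doc = """<?xml version="1.0"?>
<item><name>Sam</name>
<name>Mark</name>
</item>"""

# Parse the whole string into an element tree.
root = ET.fromstring(doc)

# Collect the text of every name element, in document order.
names = [el.text for el in root.iter("name")]
print(root.tag, names)
```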