I recently needed to programmatically modify some Tomcat context files for Solr, which are encoded as XML. Since Python is quick to script in and more comfortable than Bash for such tasks, I decided to use the module xml.dom.minidom, which is part of the Python standard library.
In the following, I describe some use cases for working with XML using minidom. I will use the following example file /tmp/test_template.xml:
<?xml version="1.0" ?>
<Context docBase="changeme">
  <Environment name="solr/home" value="changeme"/>
  <Environment name="other_env_var" value="123"/>
</Context>
If we want to edit an existing XML file, we first need to read this file:
import xml.dom.minidom
dom = xml.dom.minidom.parse("/tmp/test_template.xml")
Afterwards, I fetch the Context element and update its docBase attribute. I do not check whether the document contains a Context element, as I always work with the same template file:
context_element = dom.getElementsByTagName("Context")[0]  # We assume there exists exactly one Context element
context_element.setAttribute("docBase", "/srv/tomcat/webapps/solr.war")  # Update of attribute docBase
print(context_element.getAttribute("docBase"))
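If the input is not a trusted template, the omitted checks can be added with little effort. The following is a minimal sketch for Python 3, not a definitive implementation: the helper name set_doc_base and the inline constant EXAMPLE_XML are hypothetical, introduced only so the example runs without a file on disk.

```python
import xml.dom.minidom

# Inline copy of the example document (assumption: same content as the template file).
EXAMPLE_XML = """<?xml version="1.0" ?>
<Context docBase="changeme">
  <Environment name="solr/home" value="changeme"/>
  <Environment name="other_env_var" value="123"/>
</Context>"""


def set_doc_base(dom, new_value):
    """Hypothetical helper: set docBase on the single Context element, or raise."""
    contexts = dom.getElementsByTagName("Context")
    if len(contexts) != 1:
        raise ValueError("expected exactly one Context element, found {}".format(len(contexts)))
    context = contexts[0]
    # hasAttribute distinguishes a missing attribute from an empty one;
    # getAttribute alone returns "" in both cases.
    if not context.hasAttribute("docBase"):
        raise ValueError("Context element has no docBase attribute")
    context.setAttribute("docBase", new_value)
    return context


dom = xml.dom.minidom.parseString(EXAMPLE_XML)
context = set_doc_base(dom, "/srv/tomcat/webapps/solr.war")
print(context.getAttribute("docBase"))
```

Raising a ValueError is just one possible choice here; a script that processes many files might prefer to log the problem and skip the file instead.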
In the previous case we assumed that the Context element we (hopefully) got back from getElementsByTagName was the right one. What if we have multiple elements and want to select a specific one? In the following example, we want to select a certain Environment element inside Context whose name attribute has the value solr/home. We use the powerful filter function for finding the correct element. The filter function accepts a function which maps from the element type to True/False and a list of elements, and keeps only those elements for which the function returns True. Note that in Python 3, filter returns an iterator rather than a list, so the result has to be converted with list() before it can be indexed.
env_elements = dom.getElementsByTagName("Environment")
# Again, the check for existence of the home_element is omitted here
home_element = list(filter(lambda e: e.getAttribute("name") == "solr/home", env_elements))[0]
home_element.setAttribute("value", "/src/solr/")
print(home_element.getAttribute("value"))
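When the existence check should not be omitted, one option is to replace the indexing with next() and a default value. This is a sketch for Python 3 under a few assumptions: the helper name find_environment and the inline constant EXAMPLE_XML are hypothetical, and a generator expression stands in for filter plus lambda (both select the same elements).

```python
import xml.dom.minidom

# Inline copy of the example document (assumption: same content as the template file).
EXAMPLE_XML = """<?xml version="1.0" ?>
<Context docBase="changeme">
  <Environment name="solr/home" value="changeme"/>
  <Environment name="other_env_var" value="123"/>
</Context>"""


def find_environment(dom, name):
    """Hypothetical helper: first Environment element with the given name, or None."""
    candidates = dom.getElementsByTagName("Environment")
    # next() with a default avoids the IndexError that [0] raises on an empty result.
    return next((e for e in candidates if e.getAttribute("name") == name), None)


dom = xml.dom.minidom.parseString(EXAMPLE_XML)

home = find_environment(dom, "solr/home")
if home is not None:
    home.setAttribute("value", "/src/solr/")
    print(home.getAttribute("value"))

missing = find_environment(dom, "does/not/exist")
print(missing is None)
```

Returning None keeps the decision about how to handle a missing element with the caller, which seems to fit a small script better than raising inside the helper.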
Finally, we need to write out the file:
# The with statement makes sure the file is flushed and closed after writing
with open("/tmp/test.xml", "w") as f:
    dom.writexml(f)
or we may inspect the generated XML code before writing it to file:
print(dom.toxml())
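To verify that the written file is still well-formed, it can simply be parsed again. The following is a minimal sketch for Python 3, assuming UTF-8 output is wanted; it writes to a temporary directory instead of /tmp/test.xml, and the one-element document is only a stand-in for the edited template.

```python
import os
import tempfile
import xml.dom.minidom

# Stand-in for the edited document from the steps above.
dom = xml.dom.minidom.parseString('<Context docBase="/srv/tomcat/webapps/solr.war"/>')

# Write to a fresh temporary directory so no existing file is overwritten.
out_path = os.path.join(tempfile.mkdtemp(), "test.xml")

# Passing encoding to writexml puts it into the XML declaration;
# the file itself is opened with the same encoding.
with open(out_path, "w", encoding="utf-8") as handle:
    dom.writexml(handle, encoding="utf-8")

# Round trip: parsing raises an exception if the output is not well-formed.
reparsed = xml.dom.minidom.parse(out_path)
root = reparsed.documentElement
print(root.tagName, root.getAttribute("docBase"))
```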
Links
- xml.dom.minidom Reference
- filter function Reference
- lambda calculus in Python (German)