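DOM (Document Object Model)

  • The DOM (Document Object Model) defines a standard way to access and manipulate XML documents. It treats an XML document as a tree, and every element can be read or written through that tree.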

image.png
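
To make the tree view concrete, here is a minimal sketch that uses the DOM parser bundled with the JDK (javax.xml.parsers and org.w3c.dom) rather than Dom4j, so it runs without extra jars. The class name DomTreeSketch and the sample XML are assumptions made for illustration; only the element names echo the hr.xml used later in these notes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTreeSketch {
    // Hypothetical sample document; the values are invented for this sketch.
    static final String SAMPLE_XML =
            "<hr><employee no=\"3301\"><name>Zhang San</name><age>31</age></employee></hr>";

    /** Parses the XML text and returns one indented line per element, in depth-first order. */
    static List<String> outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            walk(doc.getDocumentElement(), 0, lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    // Visits element nodes only; text nodes are skipped.
    private static void walk(Node node, int depth, List<String> lines) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  "); // two spaces per tree level
        }
        lines.add(line.append(node.getNodeName()).toString());
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, lines);
        }
    }

    public static void main(String[] args) {
        for (String line : outline(SAMPLE_XML)) {
            System.out.println(line);
        }
    }
}
```

Each level of indentation in the printed outline corresponds to one level of the DOM tree: hr is the root, employee is its child, and name and age are children of employee.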

Dom4j

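  • Dom4j is an easy-to-use open-source library for parsing XML. It runs on the Java platform and is known for good performance, a rich feature set, and ease of use
  • Dom4j represents an XML file as a Document object
  • Dom4j represents each XML tag as an Element object
  • Download Dom4j.jar: link
  • Install it as shown in the figure below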

image.png

Traversing XML with Dom4j

    package week11.dom4j;

    import java.util.List;

    import org.dom4j.Attribute;
    import org.dom4j.Document;
    import org.dom4j.DocumentException;
    import org.dom4j.Element;
    import org.dom4j.io.SAXReader;

    public class HrReader {

        public void readXml() {
            String file = ".\\week11\\hr.xml";
            // SAXReader is the core class for reading an XML file. It parses the XML
            // and keeps the result in memory as a tree.
            SAXReader reader = new SAXReader();
            try {
                Document document = reader.read(file);
                // Get the root node of the XML document, i.e. the hr tag.
                Element root = document.getRootElement();
                // elements(name) returns the list of child tags with the given name.
                List<Element> employees = root.elements("employee");
                for (Element employee : employees) {
                    // element(name) returns a single child node (the first match).
                    Element name = employee.element("name");
                    // getText() returns the text content of the tag.
                    String empName = name.getText();
                    System.out.println(empName);
                    // elementText(name) is a shortcut for element(name).getText().
                    System.out.println(employee.elementText("age"));
                    System.out.println(employee.elementText("salary"));
                    Element department = employee.element("department");
                    System.out.println(department.element("dname").getText());
                    System.out.println(department.element("address").getText());
                    // attribute(name) returns the attribute object of the tag.
                    Attribute att = employee.attribute("no");
                    System.out.println(att.getText());
                }
            } catch (DocumentException e) {
                // Thrown when the file cannot be read or is not well-formed XML.
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            HrReader reader = new HrReader();
            reader.readXml();
        }
    }
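
These notes never show hr.xml itself. Judging only from the element and attribute names that the traversal code reads (hr, employee, no, name, age, salary, department, dname, address), the file would need roughly the following shape. This is an inferred sketch, and every value in it is invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample: structure inferred from the code, values made up -->
<hr>
    <employee no="3301">
        <name>Zhang San</name>
        <age>31</age>
        <salary>4000</salary>
        <department>
            <dname>Accounting</dname>
            <address>XX Tower-B103</address>
        </department>
    </employee>
</hr>
```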

Updating XML with Dom4j

    package week11.dom4j;

    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.nio.charset.StandardCharsets;

    import org.dom4j.Document;
    import org.dom4j.DocumentException;
    import org.dom4j.Element;
    import org.dom4j.io.SAXReader;

    public class HrWriter {

        public void writeXml() {
            String file = ".\\week11\\hr.xml";
            SAXReader reader = new SAXReader();
            try {
                // Load the existing document so the new employee is appended to it.
                Document document = reader.read(file);
                Element root = document.getRootElement();
                // addElement creates a child tag and returns it.
                Element employee = root.addElement("employee");
                employee.addAttribute("no", "3311");
                Element name = employee.addElement("name");
                name.setText("李铁柱");
                employee.addElement("age").setText("37");
                employee.addElement("salary").setText("3600");
                Element department = employee.addElement("department");
                department.addElement("dname").setText("人事部");
                department.addElement("address").setText("XX大厦-B105");
                // try-with-resources closes the writer even if write() fails.
                try (Writer writer = new OutputStreamWriter(
                        new FileOutputStream(file), StandardCharsets.UTF_8)) {
                    // Write the in-memory tree back over the original file.
                    document.write(writer);
                }
            } catch (DocumentException | IOException e) {
                // DocumentException: the file could not be parsed.
                // IOException: the file could not be written.
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            HrWriter hrWriter = new HrWriter();
            hrWriter.writeXml();
        }
    }
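
For comparison, the same parse, append, and serialize pattern can be sketched with the DOM and transformer classes bundled with the JDK instead of Dom4j, which lets it run without extra jars. The class name DomUpdateSketch, the method addEmployee, and the sample values are assumptions made for illustration. The sketch works on strings rather than on hr.xml so that it does not overwrite any file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomUpdateSketch {
    /** Appends an employee element (with a "no" attribute and a name child) to the root. */
    static String addEmployee(String xml, String no, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Build the new subtree, then attach it under the root element.
            Element employee = doc.createElement("employee");
            employee.setAttribute("no", no);
            Element nameElement = doc.createElement("name");
            nameElement.setTextContent(name);
            employee.appendChild(nameElement);
            doc.getDocumentElement().appendChild(employee);

            // Serialize the in-memory tree back to text, without the XML declaration.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not update the sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: an empty hr document.
        System.out.println(addEmployee("<hr/>", "3311", "Li Tiezhu"));
    }
}
```

The JDK API needs separate createElement and appendChild calls where Dom4j's addElement does both in one step, which is part of why the notes describe Dom4j as easy to use.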

XPath

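Path expressions

  • XPath path expressions are a language for finding data in an XML document
  • Knowing XPath can greatly speed up development work that involves extracting data
  • Learning XPath essentially means mastering how to use its various forms of expression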

Basic XPath expressions

image.png
image.png
image.png
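
A minimal sketch of how a few common expressions behave. It uses the XPath engine bundled with the JDK (javax.xml.xpath) rather than Dom4j with Jaxen, so it runs without extra jars. The class name XPathSketch, the method employeeNumbers, and the sample data are assumptions made for illustration; only the element and attribute names follow the hr.xml examples.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    // Hypothetical sample data, invented for this sketch.
    static final String SAMPLE_XML =
            "<hr>"
            + "<employee no=\"3301\"><name>Alice</name><salary>3500</salary></employee>"
            + "<employee no=\"3302\"><name>Bob</name><salary>4200</salary></employee>"
            + "<employee no=\"3303\"><name>Carol</name><salary>3900</salary></employee>"
            + "</hr>";

    /** Evaluates an expression that selects employee elements and returns their "no" attributes. */
    static List<String> employeeNumbers(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expression,
                            new InputSource(new StringReader(SAMPLE_XML)),
                            XPathConstants.NODESET);
            List<String> numbers = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                numbers.add(((Element) nodes.item(i)).getAttribute("no"));
            }
            return numbers;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String[] expressions = {
            "/hr/employee",              // absolute path from the root
            "//employee",                // employee elements at any depth
            "//employee[salary<4000]",   // filter on a child element's value
            "//employee[name='Bob']",    // filter on a child element's text
            "//employee[@no=3302]",      // filter on an attribute
            "//employee[last()]",        // last employee under its parent
            "//employee[position()<3]"   // first two employees under their parent
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + employeeNumbers(expression));
        }
    }
}
```

One detail worth keeping in mind: positional predicates such as [1] or [last()] are evaluated relative to each parent, so //employee[1] selects every employee that is the first employee child of its parent. In a flat document like this sample that is a single node.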

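Introducing Jaxen

  • Jaxen is an open-source XPath library written in Java. It can work with several different object models, such as:
    • DOM
    • XOM
    • Dom4j
    • JDOM
  • Dom4j relies on Jaxen under the hood to run XPath queries
  • Jaxen download link
  • Install it as shown in the figure below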

image.png

    package week11.dom4j;

    import java.util.List;

    import org.dom4j.Document;
    import org.dom4j.DocumentException;
    import org.dom4j.Element;
    import org.dom4j.Node;
    import org.dom4j.io.SAXReader;

    public class XPathTestor {

        public void xpath(String xpathExp) {
            String file = ".\\week11\\hr.xml";
            SAXReader reader = new SAXReader();
            try {
                Document document = reader.read(file);
                // selectNodes evaluates the XPath expression and returns the matching nodes.
                List<Node> nodes = document.selectNodes(xpathExp);
                for (Node node : nodes) {
                    // Every expression used in main selects employee tags, so the cast is safe here.
                    Element emp = (Element) node;
                    System.out.println(emp.attributeValue("no"));
                    System.out.println(emp.elementText("name"));
                    System.out.println(emp.elementText("age"));
                    System.out.println(emp.elementText("salary"));
                    System.out.println("==============================");
                }
            } catch (DocumentException e) {
                // Thrown when the file cannot be read or is not well-formed XML.
                e.printStackTrace();
            }
        }

        public static void main(String[] args) {
            XPathTestor testor = new XPathTestor();
            // Uncomment one expression at a time to try it.
            // testor.xpath("/hr/employee");
            // testor.xpath("//employee");
            // testor.xpath("//employee[salary<4000]");
            // testor.xpath("//employee[name='李铁柱']");
            // testor.xpath("//employee[@no=3304]");
            // testor.xpath("//employee[1]");
            // testor.xpath("//employee[last()]");
            // testor.xpath("//employee[position()<3]");
            testor.xpath("//employee[3] | //employee[8]");
        }
    }

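Programming Exercise 1

Use Dom4j to work with plan.xml, a file that stores course information:

  1. Add a new course record to plan.xml
  2. Traverse plan.xml and print its nodes, attributes, and other content

Programming Exercise 2

Use XPath to query plan.xml, the document that stores course information, and print the results. The requirements are:

  1. Retrieve all course records
  2. Find the courses with fewer than 50 class hours
  3. Find the course named 高等数学 (Advanced Mathematics)
  4. Find the course whose id attribute is 001
  5. Retrieve the first two course records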