Preface

Java XML parsing classes that may be vulnerable to XXE
1. SAXBuilder
2. SAXParserFactory
3. SAXParser
4. SAXReader
5. SAXTransformerFactory
6. SchemaFactory
7. TransformerFactory
8. ValidatorSample (a demo built on javax.xml.validation.Validator)
9. XMLReader
10. Unmarshaller
11. DocumentBuilder
12. Digester

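There are four common ways to parse XML: DOM, SAX, DOM4J, and JDOM.
The two main ones are DOM and SAX.
1. DOM suits small, simple XML documents
Reason:

  DOM loads the whole document into memory at once and builds an in-memory tree of nodes. If the document is very large, or only part of it is of interest, this becomes a performance problem.

2. SAX suits larger, more complex XML files
It loads, reads, and processes the document a piece at a time, so it needs little memory and is unlikely to cause an out-of-memory error. It offers fewer features than DOM, though: it is read-only and not suited to modifying the document.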
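To make the difference concrete, here is a minimal sketch that counts elements both ways using only JDK classes. The class name DomVsSax and the sample XML are made up for illustration, and the input is a hard-coded trusted string, so the factories are left at their defaults here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class DomVsSax {
    // DOM: the whole document is loaded into an in-memory tree that can be queried afterwards.
    static int countWithDom(String xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(tag).getLength();
    }

    // SAX: the parser streams events to a handler; nothing is kept unless the handler keeps it.
    static int countWithSax(String xml, String tag) throws Exception {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><item>a</item><item>b</item></items>"; // trusted sample input
        System.out.println(countWithDom(xml, "item"));
        System.out.println(countWithSax(xml, "item"));
    }
}
```

Both methods give the same count; the DOM version holds the full tree in memory while the SAX version only keeps a counter.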

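1. SAXBuilder

SAXBuilder comes from the third-party JDOM library. Its usage is simple and easy to spot: if the InputSource is attacker-controlled, any parseable XML can be passed in.

    SAXBuilder builder = new SAXBuilder();
    Document doc = builder.build(InputSource);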

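Fix: passing true to the constructor is often suggested, but in JDOM 1.x that argument only turns on DTD validation and does not stop entity resolution. Forbid DOCTYPE declarations on the builder instead:

    SAXBuilder builder = new SAXBuilder();
    builder.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    builder.setExpandEntities(false);
    Document doc = builder.build(InputSource);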

The InputSource is usually the entire body of the POST request, typically obtained via:
getInputStream()
getReader()
getRequestbody()

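2. SAXParserFactory / SAXParser

These classes ship with the JDK. The parsed content is not echoed back in the response, so use a DNSLog platform to probe out of band for the vulnerability.

    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser parser = spf.newSAXParser();
    parser.parse(InputSource, (HandlerBase) null);

Again, the InputSource is the raw POST body, typically obtained via:
getInputStream()
getReader()
getRequestbody()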
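The snippet above shows no fix for SAXParserFactory, so here is a rough sketch of one common hardening approach, assuming the JDK's built-in parser: forbid DOCTYPE declarations and switch off external entities on the factory. The class name SafeSaxParser, the accepts() helper, and the probe URL are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SafeSaxParser {
    // Build a parser that refuses any document carrying a DOCTYPE declaration.
    static SAXParser newParser() throws ParserConfigurationException, SAXException {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defence in depth in case DOCTYPE has to be allowed again later.
        spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return spf.newSAXParser();
    }

    // Hypothetical helper: true if the XML parses, false if the parser rejects it.
    static boolean accepts(String xml) throws ParserConfigurationException, IOException {
        try {
            newParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder probe host; a real test would use a DNSLog domain.
        String payload = "<?xml version=\"1.0\"?>"
                + "<!DOCTYPE r [<!ENTITY x SYSTEM \"http://probe.example.invalid/\">]>"
                + "<r>&x;</r>";
        System.out.println(accepts("<r>ok</r>"));
        System.out.println(accepts(payload));
    }
}
```

With DOCTYPE declarations forbidden, the payload is rejected before any entity is resolved, so no request should reach the probe host.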

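3. SAXReader

This class comes from a third-party library (dom4j). Again, nothing is echoed back.

    SAXReader saxReader = new SAXReader();
    saxReader.read(InputSource);

As with SAXBuilder, if the InputSource is attacker-controlled, any parseable XML can be passed in.
Demo code:
https://blog.csdn.net/qq_36501591/article/details/80522531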

4. SAXTransformerFactory (the backend service throws an error after parsing)

    SAXTransformerFactory sf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
    StreamSource source = new StreamSource(InputSource);
    sf.newTransformerHandler(source);

5. SchemaFactory (the backend service throws an error after parsing)

    SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
    // The payload, supplied here by the demo's ResourceUtils.getPoc1() helper
    StreamSource source = new StreamSource(ResourceUtils.getPoc1());
    Schema schema = factory.newSchema(source);

6. TransformerFactory

    TransformerFactory tf = TransformerFactory.newInstance();
    StreamSource source = new StreamSource(InputSource);
    tf.newTransformer().transform(source, new DOMResult());
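For the transformer classes, one hedged way to harden things with the JDK's built-in implementation (JAXP 1.5 and later) is to restrict external DTD and stylesheet access on the factory. Since SAXTransformerFactory extends TransformerFactory, the same attributes should apply there too. The class name SafeTransformer and the identity() helper are made up for this sketch.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SafeTransformer {
    // An empty string means "no protocol is allowed" for external DTDs and stylesheets.
    static TransformerFactory newFactory() {
        TransformerFactory tf = TransformerFactory.newInstance();
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return tf;
    }

    // Hypothetical helper: run an identity transform and return the serialized result.
    static String identity(String xml) throws TransformerException {
        StringWriter out = new StringWriter();
        newFactory().newTransformer()
                .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(identity("<a>1</a>"));
    }
}
```

A third-party transformer on the classpath (for example a standalone Xalan) may not recognise these attributes and can throw IllegalArgumentException, so treat this as a starting point rather than a universal fix.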


7. ValidatorSample (probably rarely encountered)

    SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
    Schema schema = factory.newSchema();
    Validator validator = schema.newValidator();
    StreamSource source = new StreamSource(InputSource);
    validator.validate(source);
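SchemaFactory and Validator can be locked down in a similar way. The sketch below assumes the JDK's built-in implementation and sets the external-access properties on both objects; the class name SafeValidator, the tiny inline schema, and the isValid() helper are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SafeValidator {
    // Made-up schema: a single <note> element holding a string.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"note\" type=\"xs:string\"/>"
            + "</xs:schema>";

    static Validator newValidator() throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // Block external DTDs and schemas while the schema itself is being loaded.
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        // Block them again for the instance document being validated.
        validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return validator;
    }

    // Hypothetical helper: true if the document validates, false if it is rejected.
    static boolean isValid(String xml) throws IOException {
        try {
            newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<note>hello</note>"));
    }
}
```

A document whose DOCTYPE declares an external entity should then fail validation instead of having that entity fetched.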

8. Digester

    Digester digester = new Digester();
    digester.parse(new StringReader(xml_con));

xml_con (the value passed in via POST) is parsed as XML directly.

9. DocumentBuilder (common)

DocumentBuilder is one of the most common places where XXE shows up.

    public String DocumentBuilder(HttpServletRequest request) {
        try {
            // Read the body of the HTTP request
            String xml_con = WebUtils.getRequestBody(request);
            System.out.println(xml_con);
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            StringReader sr = new StringReader(xml_con);
            InputSource is = new InputSource(sr);
            Document document = db.parse(is); // parse the XML
            // Walk the XML nodes and collect each name and value
            StringBuilder result = new StringBuilder();
            NodeList rootNodeList = document.getChildNodes();
            for (int i = 0; i < rootNodeList.getLength(); i++) {
                Node rootNode = rootNodeList.item(i);
                NodeList child = rootNode.getChildNodes();
                for (int j = 0; j < child.getLength(); j++) {
                    Node node = child.item(j);
                    // Only append ELEMENT_NODE nodes that have content; without this check
                    // the code may throw or print unrelated content
                    if (node.getNodeType() == Node.ELEMENT_NODE && node.getFirstChild() != null) {
                        result.append(node.getNodeName() + ": " + node.getFirstChild().getNodeValue() + "\n");
                    }
                }
            }
            sr.close();
            System.out.println(result.toString());
            return result.toString();
        } catch (Exception e) {
            System.out.println(e);
            return "except";
        }
    }

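A quick walk through the code:
DocumentBuilderFactory is the parser factory.
It cannot be instantiated directly, but its newInstance() method creates a factory and returns it:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

Next, call the factory's newDocumentBuilder() method to obtain a DOM parser instance, db:
DocumentBuilder db = dbf.newDocumentBuilder();

Wrap the received xml_con string in a character stream:
StringReader sr = new StringReader(xml_con);

Then wrap that character stream, sr, in an InputSource for the parser:
InputSource is = new InputSource(sr);

Call db.parse() to parse the XML document and obtain a Document object:
Document document = db.parse(is);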

Fix:
Call dbf.setFeature() before creating the DocumentBuilder:

    String FEATURE = "http://apache.org/xml/features/disallow-doctype-decl";
    dbf.setFeature(FEATURE, true);
    FEATURE = "http://xml.org/sax/features/external-parameter-entities";
    dbf.setFeature(FEATURE, false);
    FEATURE = "http://xml.org/sax/features/external-general-entities";
    dbf.setFeature(FEATURE, false);
    FEATURE = "http://apache.org/xml/features/nonvalidating/load-external-dtd";
    dbf.setFeature(FEATURE, false);
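Put together, the fix might look like the sketch below. It uses only JDK classes and takes the XML as a plain string rather than an HttpServletRequest so it can run on its own. The class name SafeDocumentBuilder is made up, and the two extra calls (setXIncludeAware and setExpandEntityReferences) are additional hardening not listed above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SafeDocumentBuilder {
    // The features must be set on the factory before newDocumentBuilder() is called.
    static DocumentBuilder newBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        // Extra hardening beyond the four features listed in the fix.
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf.newDocumentBuilder();
    }

    // Throws SAXException if the document contains a DOCTYPE declaration.
    static Document parse(String xml) throws Exception {
        return newBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<user><name>bob</name></user>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Because disallow-doctype-decl is enabled, a payload that declares any DOCTYPE fails with a SAXParseException before entities are looked at; the remaining features only matter if DOCTYPE has to be allowed.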

10. XMLReader

    XMLReader reader = XMLReaderFactory.createXMLReader();
    reader.parse(new InputSource(InputSource));

This likewise parses the POSTed value with the parser's default configuration.
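XMLReader accepts the same feature flags directly on the reader. Below is a small sketch, assuming the JDK's built-in parser; it obtains the reader through SAXParserFactory because XMLReaderFactory has been deprecated since Java 9. The class name SafeXmlReader and the rejects() helper are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SafeXmlReader {
    static XMLReader newReader() throws ParserConfigurationException, SAXException {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Features are set on the reader itself rather than on a factory.
        reader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
        reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        // DefaultHandler rethrows fatal errors, so a rejected document surfaces as an exception.
        reader.setErrorHandler(new DefaultHandler());
        return reader;
    }

    // Hypothetical helper: true if the reader refuses the document.
    static boolean rejects(String xml) throws ParserConfigurationException, IOException {
        try {
            newReader().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rejects("<?xml version=\"1.0\"?><!DOCTYPE r [<!ENTITY x \"y\">]><r>&x;</r>"));
    }
}
```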

11. unmarshal (safe: default usage does not introduce the vulnerability)

    Class tClass = Some.class;
    JAXBContext context = JAXBContext.newInstance(tClass);
    Unmarshaller um = context.createUnmarshaller();
    Object o = um.unmarshal(ResourceUtils.getPoc1());
    tClass.cast(o);

References:
https://blog.spoock.com/2018/10/23/java-xxe/
https://www.cnblogs.com/nice0e3/p/13746076.html#xmlreader