Java Vulnerabilities - Learning Java XXE - 《Code Audit🥺》

Preface

Parsing APIs in Java that may be vulnerable to XXE:
1. SAXBuilder
2. SAXParserFactory
3. SAXParser
4. SAXReader
5. SAXTransformerFactory
6. SchemaFactory
7. TransformerFactory
8. ValidatorSample
9. XMLReader
10. Unmarshaller
11. DocumentBuilder
12. Digester

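There are four common ways to parse XML: DOM, SAX, DOM4J, and JDOM.
The two main ones are DOM and SAX.
1、DOM is suited to relatively simple (small) XML documents
Reason:

DOM loads the entire document into memory at once and builds a tree structure (node tree) that stays resident in memory. If the document is very large, or only part of it is of interest, this causes performance problems.

2、SAX is suited to more complex (larger) XML files
It loads, reads, and processes the document a piece at a time, so memory requirements are low and out-of-memory errors are unlikely. It is less feature-rich than DOM, however: it can only read the document and is not suited to modifying it.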
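
To make the DOM/SAX difference concrete, here is a minimal sketch that reads the same small document both ways using only JDK classes. The class name DomVsSaxDemo, the sample XML, and the helper names are illustrative assumptions, not part of the original notes. The input is a hard-coded constant, so parser hardening is left out here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: reads <user> elements with DOM and with SAX.
class DomVsSaxDemo {
    static final String XML = "<users><user>alice</user><user>bob</user></users>";

    private static ByteArrayInputStream stream(String xml) {
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    // DOM: the whole document becomes an in-memory tree that is queried afterwards.
    static List<String> readWithDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(stream(xml));
            NodeList users = doc.getElementsByTagName("user");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < users.getLength(); i++) {
                names.add(users.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // SAX: the parser streams events to a handler; no tree is kept.
    static List<String> readWithSax(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <user>

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("user".equals(qName)) text = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("user".equals(qName)) {
                    names.add(text.toString());
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(stream(xml), handler);
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithDom(XML));
        System.out.println(readWithSax(XML));
    }
}
```

Both methods return the same list of names; the difference is that the DOM version can navigate and modify the tree afterwards, while the SAX version only sees each element once as it streams past.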

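1、SAXBuilder

Usage is simple and the sink is easy to spot: if the InputSource is attacker-controlled, arbitrary parseable XML can be passed in.

SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(InputSource);

Fix: disallow DOCTYPE declarations and turn off entity expansion. Passing true to the constructor (new SAXBuilder(true)) is sometimes suggested, but that argument only enables DTD validation and does not disable external entities, so it should not be relied on.

SAXBuilder builder = new SAXBuilder();
builder.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
builder.setExpandEntities(false);
Document doc = builder.build(InputSource);

The InputSource is usually the entire body of a POST request, typically obtained via:
getInputStream()
getReader()
getRequestbody()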

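2、SAXParserFactory/SAXParser

This class is built into the JDK. There is no echo (the parsed content is not returned in the response), so a DNSLog platform can be used to probe out of band for the vulnerability.

SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser parser = spf.newSAXParser();
parser.parse(InputSource, (HandlerBase) null);

Here too the InputSource is the content passed directly in the POST request, typically obtained via:
getInputStream()
getReader()
getRequestbody()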
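
The original notes give no fix for this API. One possible hardening, sketched under the assumption that the JDK's built-in parser is in use (it recognizes the Xerces-style feature URIs below), is to configure the factory before creating the parser. The class name SafeSaxParserDemo, the PAYLOAD constant, and the isRejected helper are hypothetical names for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: a SAXParserFactory configured to refuse DOCTYPE declarations.
class SafeSaxParserDemo {
    // A typical XXE probe: an external entity declared in an inline DTD.
    static final String PAYLOAD =
            "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///nonexistent/xxe-probe\">]>"
            + "<foo>&xxe;</foo>";

    // Returns true if the hardened parser refuses the document.
    static boolean isRejected(String xml) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            // Refuse any document that contains a DOCTYPE declaration.
            spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // Defense in depth: do not resolve external entities.
            spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
            spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            SAXParser parser = spf.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return false;
        } catch (SAXParseException e) {
            return true; // the parser raised a fatal error on the document
        } catch (Exception e) {
            throw new IllegalStateException(e); // configuration or I/O problem
        }
    }

    public static void main(String[] args) {
        System.out.println("payload rejected: " + isRejected(PAYLOAD));
        System.out.println("plain XML rejected: " + isRejected("<foo>ok</foo>"));
    }
}
```

With DOCTYPE declarations disallowed, the parser stops at the declaration itself, before any entity is resolved, so ordinary XML without a DTD is unaffected.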

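3、SAXReader

This class comes from a third-party library (dom4j); again there is no echo.

SAXReader saxReader = new SAXReader();
saxReader.read(InputSource);

The sink is again obvious and simple: if the InputSource is attacker-controlled, arbitrary parseable XML can be passed in.
Related demo code:
https://blog.csdn.net/qq_36501591/article/details/80522531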

4、SAXTransformerFactory (the backend service reports an error after parsing)

SAXTransformerFactory sf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();
StreamSource source = new StreamSource(InputSource);
sf.newTransformerHandler(source);

5、SchemaFactory (the backend service reports an error after parsing)

SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
StreamSource source = new StreamSource(InputSource);
Schema schema = factory.newSchema(source);
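
As a possible mitigation for this sink, the JAXP property XMLConstants.ACCESS_EXTERNAL_DTD can be set to an empty string so that no protocol is allowed for external DTDs and external entities. The sketch below assumes the JDK's default SchemaFactory implementation; the class name, the two sample schemas, and isRejected are illustrative assumptions. A related property, XMLConstants.ACCESS_EXTERNAL_SCHEMA, restricts schema imports and includes and is not set here.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class: a SchemaFactory that may not fetch external DTDs or entities.
class SafeSchemaFactoryDemo {
    static final String GOOD_XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"foo\" type=\"xs:string\"/></xs:schema>";

    // Same schema, but with an external entity pulled in through an inline DTD.
    static final String BAD_XSD =
            "<!DOCTYPE xs:schema [<!ENTITY xxe SYSTEM \"file:///nonexistent/xxe-probe\">]>"
            + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"foo\" type=\"xs:string\"/>&xxe;</xs:schema>";

    // Returns true if compiling the schema fails.
    static boolean isRejected(String xsd) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try {
            // Empty string = no protocol may be used for external DTDs and entities.
            factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        } catch (SAXException e) {
            throw new IllegalStateException("property not supported by this implementation", e);
        }
        try {
            factory.newSchema(new StreamSource(new StringReader(xsd)));
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("schema with external entity rejected: " + isRejected(BAD_XSD));
        System.out.println("plain schema rejected: " + isRejected(GOOD_XSD));
    }
}
```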

6、TransformerFactory

TransformerFactory tf = TransformerFactory.newInstance();
StreamSource source = new StreamSource(InputSource);
tf.newTransformer().transform(source, new DOMResult());
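
A possible hardening for TransformerFactory, sketched here as an assumption rather than taken from the original notes, is to set the JAXP properties ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_STYLESHEET to empty strings before creating the transformer. The class name, the PAYLOAD constant, and isRejected are hypothetical; the code assumes the JDK's built-in transformer implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: identity transform with external access switched off.
class SafeTransformerDemo {
    static final String PAYLOAD =
            "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///nonexistent/xxe-probe\">]>"
            + "<foo>&xxe;</foo>";

    // Returns true if the transform fails on the given input.
    static boolean isRejected(String xml) {
        TransformerFactory tf = TransformerFactory.newInstance();
        // Empty string = no protocol allowed for external DTDs/entities or stylesheets.
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        Transformer transformer;
        try {
            transformer = tf.newTransformer(); // identity transformer, no stylesheet
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException(e);
        }
        try {
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(new StringWriter()));
            return false;
        } catch (TransformerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("payload rejected: " + isRejected(PAYLOAD));
        System.out.println("plain XML rejected: " + isRejected("<foo>ok</foo>"));
    }
}
```

Unlike disallowing DOCTYPE outright, this approach still accepts documents that carry an inline DTD; it only refuses to fetch anything external while processing them.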

7、ValidatorSample (probably rarely encountered)

SchemaFactory factory = SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");
Schema schema = factory.newSchema();
Validator validator = schema.newValidator();
StreamSource source = new StreamSource(InputSource);
validator.validate(source);

8、Digester

Digester digester = new Digester();
digester.parse(new StringReader(xml_con));

This parses xml_con (the value passed in via POST) as XML directly.

9、DocumentBuilder (common)

DocumentBuilder is one of the more common places where XXE shows up.

// Requires javax.xml.parsers.*, org.w3c.dom.*, org.xml.sax.InputSource and java.io.StringReader.
// WebUtils is the project's own helper class. This method is intentionally vulnerable.
public String DocumentBuilder(HttpServletRequest request) {
    try {
        // Read the body of the HTTP request
        String xml_con = WebUtils.getRequestBody(request);
        System.out.println(xml_con);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        StringReader sr = new StringReader(xml_con);
        InputSource is = new InputSource(sr);
        Document document = db.parse(is);  // parse the XML (default settings, so XXE is possible)
        // Walk the XML nodes and collect each name and value
        StringBuilder result = new StringBuilder();
        NodeList rootNodeList = document.getChildNodes();
        for (int i = 0; i < rootNodeList.getLength(); i++) {
            Node rootNode = rootNodeList.item(i);
            NodeList child = rootNode.getChildNodes();
            for (int j = 0; j < child.getLength(); j++) {
                Node node = child.item(j);
                // Only append ELEMENT_NODE nodes; without this check the code may
                // throw or print unrelated content (whitespace text nodes, etc.)
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    // An empty element has no first child, so guard against null
                    Node first = node.getFirstChild();
                    String value = (first == null) ? "" : first.getNodeValue();
                    result.append(node.getNodeName()).append(": ").append(value).append("\n");
                }
            }
        }
        sr.close();
        System.out.println(result.toString());
        return result.toString();
    } catch (Exception e) {
        System.out.println(e);
        return "except";
    }
}

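A brief walkthrough of the code:
DocumentBuilderFactory is the parser factory.
It cannot be instantiated directly, but its static newInstance() method creates a factory and returns it:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

Next, the factory's newDocumentBuilder() method is called to obtain a DOM parser instance, db:
DocumentBuilder db = dbf.newDocumentBuilder();

The received xml_con string is wrapped in a character stream:
StringReader sr = new StringReader(xml_con);

The character stream sr holding the XML document is then wrapped in an InputSource, the input type the parser accepts:
InputSource is = new InputSource(sr);

Finally db.parse() is called to parse the XML document, which returns a Document object:
Document document = db.parse(is);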

Fix:
Call dbf.setFeature() on the factory before creating the DocumentBuilder. Disallowing DOCTYPE declarations is the main protection; the remaining features are defense in depth.

String FEATURE = "http://apache.org/xml/features/disallow-doctype-decl";
dbf.setFeature(FEATURE, true);
FEATURE = "http://xml.org/sax/features/external-parameter-entities";
dbf.setFeature(FEATURE, false);
FEATURE = "http://xml.org/sax/features/external-general-entities";
dbf.setFeature(FEATURE, false);
FEATURE = "http://apache.org/xml/features/nonvalidating/load-external-dtd";
dbf.setFeature(FEATURE, false);
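
Put together, the features above might be applied as in the following self-contained sketch. The class name SafeDocumentBuilderDemo, the PAYLOAD constant, and the rootName helper are illustrative assumptions; the two extra setters (setXIncludeAware and setExpandEntityReferences) are additional hardening not mentioned in the original notes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class: DocumentBuilderFactory with the XXE-related features set.
class SafeDocumentBuilderDemo {
    static final String PAYLOAD =
            "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///nonexistent/xxe-probe\">]>"
            + "<foo>&xxe;</foo>";

    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        // Extra hardening beyond the list above (an addition of this sketch).
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    // Returns the root element name, or null if the parser refuses the document.
    static String rootName(String xml) {
        try {
            Document doc = hardenedFactory().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (SAXParseException e) {
            return null; // e.g. the document contains a DOCTYPE declaration
        } catch (Exception e) {
            throw new IllegalStateException(e); // configuration or I/O problem
        }
    }

    public static void main(String[] args) {
        System.out.println("plain XML root: " + rootName("<foo>ok</foo>"));
        System.out.println("payload root: " + rootName(PAYLOAD));
    }
}
```

In the vulnerable method shown earlier, the equivalent change would be to apply these settings to dbf right after DocumentBuilderFactory.newInstance() and before newDocumentBuilder().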

10、XMLReader

XMLReader reader = XMLReaderFactory.createXMLReader();
reader.parse(new InputSource(InputSource));

This also parses the POST body directly with the parser's default settings.
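
For XMLReader, the same features can be set on the reader itself before calling parse(). The sketch below assumes the JDK's default reader, which recognizes these feature URIs; the class name and isRejected helper are hypothetical. XMLReaderFactory is deprecated in recent JDKs but is kept here to match the snippet above.

```java
import java.io.StringReader;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

// Hypothetical demo class: XMLReader configured to refuse DOCTYPE declarations.
@SuppressWarnings("deprecation") // XMLReaderFactory is deprecated since Java 9
class SafeXmlReaderDemo {
    static final String PAYLOAD =
            "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///nonexistent/xxe-probe\">]>"
            + "<foo>&xxe;</foo>";

    // Returns true if the hardened reader refuses the document.
    static boolean isRejected(String xml) {
        try {
            XMLReader reader = XMLReaderFactory.createXMLReader();
            reader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
            reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
            reader.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXParseException e) {
            return true; // fatal parse error, e.g. DOCTYPE is disallowed
        } catch (Exception e) {
            throw new IllegalStateException(e); // configuration or I/O problem
        }
    }

    public static void main(String[] args) {
        System.out.println("payload rejected: " + isRejected(PAYLOAD));
        System.out.println("plain XML rejected: " + isRejected("<foo>ok</foo>"));
    }
}
```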

11、unmarshal (safe; default usage does not introduce the vulnerability)

Class tClass = Some.class;
JAXBContext context = JAXBContext.newInstance(tClass);
Unmarshaller um = context.createUnmarshaller();
Object o = um.unmarshal(ResourceUtils.getPoc1());
tClass.cast(o);

References:
https://blog.spoock.com/2018/10/23/java-xxe/
https://www.cnblogs.com/nice0e3/p/13746076.html#xmlreader