Java XML Parsers

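At work we sometimes need XML, for example in Java configuration files or in interfaces that talk to hardware devices, which usually exchange XML rather than JSON. To make the data easy to work with, we convert the XML into the entity objects we need. So the question is: how do we efficiently parse XML into our entity class objects?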

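Four XML parsing options are currently popular in Java:

1. DOM parser

2. SAX parser

3. StAX parser

4. JAXB (not tried here; it is comparatively more complex to use)

Beyond these four, GitHub and other open-source platforms host many open-source XML parsing libraries. This article uses code to show how to use DOM, SAX, and StAX, and then the third-party JDOM library.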

DOM Parser

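The DOM parser is the easiest Java XML parser to learn. It loads the XML file into memory, and we traverse it node by node to parse the XML. The DOM parser suits small files; as the file grows, it becomes slow and consumes more memory.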

The test code is as follows.

Create a test file named employee.xml:

<?xml version="1.0"?>
<Employees>
    <Employee>
        <name>Pankaj</name>
        <age>544</age>
        <role>Java Developer</role>
        <gender>Male</gender>
    </Employee>
    <Employee>
        <name>Lisa</name>
        <age>35</age>
        <role>CSS Developer</role>
        <gender>Female</gender>
    </Employee>
</Employees>
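The parser examples in this article fill an Employee object through setName, setAge, setGender, and setRole, but the class itself is not shown. A minimal sketch of what such a bean might look like follows; the field types and the toString() format are assumptions inferred from the setter calls and the printed output, not the author's actual class.

```java
// Hypothetical Employee bean, reconstructed from the setters and output used in this article.
public class Employee {
    private String name;
    private int age;
    private String gender;
    private String role;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public String getGender() { return gender; }
    public void setGender(String gender) { this.gender = gender; }

    public String getRole() { return role; }
    public void setRole(String role) { this.role = role; }

    // Mirrors the line format shown in the article's output sections.
    @Override
    public String toString() {
        return "Employee:: Name=" + name + " Age=" + age
                + " Gender=" + gender + " Role=" + role;
    }
}
```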

The DOMParse class is as follows:

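import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DOMParse {
    //The DOM parser suits small XML documents, but because it loads the complete XML file into memory it is a poor fit for large XML files. For large files, use the SAX parser.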
    public static void main(String[] args) throws Exception {
        String filePath = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        File xmlFile = new File(filePath);
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder;
        try {
            dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(xmlFile);
            doc.getDocumentElement().normalize();
            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            NodeList nodeList = doc.getElementsByTagName("Employee");
            //now XML is loaded as Document in memory, lets convert it to Object List
            List<Employee> empList = new ArrayList<Employee>();
            for (int i = 0; i < nodeList.getLength(); i++) {
                empList.add(getEmployee(nodeList.item(i)));
            }
            //lets print Employee list information
            for (Employee emp : empList) {
                System.out.println(emp.toString());
            }
        } catch (SAXException | ParserConfigurationException | IOException e1) {
            e1.printStackTrace();
        }

    }


    private static Employee getEmployee(Node node) {
        //XMLReaderDOM domReader = new XMLReaderDOM();
        Employee emp = new Employee();
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            emp.setName(getTagValue("name", element));
            emp.setAge(Integer.parseInt(getTagValue("age", element)));
            emp.setGender(getTagValue("gender", element));
            emp.setRole(getTagValue("role", element));
        }

        return emp;
    }


    //Returns the text of the first <tag> descendant. getTextContent() yields "" for an empty
    //element, whereas reading the first child node would throw a NullPointerException there.
    private static String getTagValue(String tag, Element element) {
        return element.getElementsByTagName(tag).item(0).getTextContent();
    }

}

Output:

Root element :Employees
Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
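The DOMParse listing depends on a fixed file path, which makes it awkward to try elsewhere. The following is a minimal sketch of the same DOM traversal reading from an in-memory string instead; the class name DomNamesSketch and the helper names childText and employeeNames are illustrative, not part of the original project. It collects only the <name> values to stay short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomNamesSketch {

    // Returns the text of the first <tag> descendant, or null when the tag is absent.
    static String childText(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent();
    }

    // Parses the XML string into a DOM tree and collects the <name> of every <Employee>.
    static List<String> employeeNames(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList employees = doc.getElementsByTagName("Employee");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < employees.getLength(); i++) {
            names.add(childText((Element) employees.item(i), "name"));
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees>"
                + "<Employee><name>Pankaj</name><age>544</age></Employee>"
                + "<Employee><name>Lisa</name><age></age></Employee>"
                + "</Employees>";
        System.out.println(employeeNames(xml));
    }
}
```

Returning null for a missing tag leaves the decision about defaults to the caller, which is one possible design; throwing an exception would be equally reasonable for required fields.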

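SAX Parser

The Java SAX parser provides an API for parsing XML documents. It differs from the DOM parser in that it does not load the complete XML into memory; it reads the document sequentially. It is an event-based parser, so we implement our own handler class to parse the XML file. For large XML files it outperforms the DOM parser in both time and memory use.

javax.xml.parsers.SAXParser provides methods for parsing an XML document with an event handler. The class wraps an org.xml.sax.XMLReader (available through getXMLReader()) and provides overloaded parse() methods that read XML from a File, InputStream, SAX InputSource, or String URI.

The actual parsing work is done by the handler class, which we write ourselves. To do so we implement the org.xml.sax.ContentHandler interface. It contains callback methods that are notified whenever an event occurs, for example startDocument, endDocument, startElement, endElement, and characters.

org.xml.sax.helpers.DefaultHandler provides a default implementation of the ContentHandler interface, and we can extend it to create our own handler. Extending this class is recommended because we usually need to implement only a few of the methods, which keeps the code cleaner and easier to maintain.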
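To make the callback sequence concrete, here is a minimal sketch that extends DefaultHandler and records each event in the order the parser reports it. The class name SaxEventTrace and the trace method are illustrative only. Note that the parser is free to deliver one text node through several characters() calls, so the number of "chars:" entries is not guaranteed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {

    // Parses the XML string and returns the callbacks in the order they fired.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("chars:" + new String(ch, start, length));
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        trace("<Employee><name>Lisa</name></Employee>").forEach(System.out::println);
    }
}
```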

We reuse the same employee.xml file.

Create our own handler class, EmployeeXMLHandler:

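import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class EmployeeXMLHandler extends DefaultHandler {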

    //List to hold Employees object
    private List<Employee> empList = null;
    private Employee emp = null;


    //getter method for employee list
    public List<Employee> getEmpList() {
        return empList;
    }

    boolean bAge = false;
    boolean bName = false;
    boolean bGender = false;
    boolean bRole = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {

        if (qName.equalsIgnoreCase("Employee")) {
            //start of a new record: create a fresh Employee object
            emp = new Employee();
            //initialize list
            if (empList == null)
                empList = new ArrayList<>();
        } else if (qName.equalsIgnoreCase("name")) {
            //set boolean values for fields, will be used in setting Employee variables
            bName = true;
        } else if (qName.equalsIgnoreCase("age")) {
            bAge = true;
        } else if (qName.equalsIgnoreCase("gender")) {
            bGender = true;
        } else if (qName.equalsIgnoreCase("role")) {
            bRole = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equalsIgnoreCase("Employee")) {
            //add Employee object to list
            empList.add(emp);
        }
    }

    @Override
    public void characters(char ch[], int start, int length) throws SAXException {

        if (bAge) {
            //age element, set Employee age
            emp.setAge(Integer.parseInt(new String(ch, start, length)));
            bAge = false;
        } else if (bName) {
            emp.setName(new String(ch, start, length));
            bName = false;
        } else if (bRole) {
            emp.setRole(new String(ch, start, length));
            bRole = false;
        } else if (bGender) {
            emp.setGender(new String(ch, start, length));
            bGender = false;
        }
    }
}

 

The test class XMLParserSAX:

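import java.io.File;
import java.io.IOException;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class XMLParserSAX {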

    public static void main(String[] args) {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxParserFactory.newSAXParser();
            EmployeeXMLHandler handler = new EmployeeXMLHandler();
            saxParser.parse(new File("D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml"), handler);
            //Get Employees list
            List<Employee> empList = handler.getEmpList();
            //print employee information
            for(Employee emp : empList)
                System.out.println(emp);
        } catch (ParserConfigurationException  | IOException | org.xml.sax.SAXException e) {
            e.printStackTrace();
        }
    }
}

Output:

Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer

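SAX Parser Methods to Override

The important methods to override are startElement(), endElement(), and characters().

When SAXParser starts parsing the document, it calls startElement() each time it finds a start tag. We override this method to set the boolean flags that identify which element we are in.

We also use this method to create a new Employee object each time an Employee start tag is found. The handler shown here reads no attributes; if the XML carried an id attribute, it would be read from the Attributes argument in this method.

SAXParser calls characters() when it finds character data inside an element. We use the boolean flags to set the value on the correct field of the Employee object.

endElement() is where we add the Employee object to the list, each time an Employee end tag is found.

SAXParserFactory provides factory methods for obtaining a SAXParser instance. We pass the File object together with an EmployeeXMLHandler instance to the parse() method, which handles the callback events.

SAXParser is a little confusing at first, but when you are dealing with large XML documents it reads XML more efficiently than the DOM parser. That covers the SAX parser in Java.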
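One caveat with the flag-based handler: SAX allows the parser to deliver the text of a single element through several characters() calls, in which case setting the field on the first call would keep only a fragment. A common alternative is to buffer the text and assign it in endElement(). The sketch below illustrates that pattern under some assumptions: the class names are hypothetical, and each employee is collected into a Map instead of the Employee bean so that the example stays self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class BufferingSaxSketch {

    // Collects each <Employee> as a map of child tag name to text.
    static class BufferingHandler extends DefaultHandler {
        final List<Map<String, String>> employees = new ArrayList<>();
        private Map<String, String> current;
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) {
            text.setLength(0); // drop whitespace seen between tags
            if (qName.equals("Employee")) {
                current = new LinkedHashMap<>();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once per text node, so append rather than assign.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("Employee")) {
                employees.add(current);
                current = null;
            } else if (current != null) {
                // The complete text of the child element is available only here.
                current.put(qName, text.toString().trim());
            }
            text.setLength(0);
        }
    }

    static List<Map<String, String>> parse(String xml) throws Exception {
        BufferingHandler handler = new BufferingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.employees;
    }
}
```

With this approach the boolean flags are no longer needed, because the element name is known in endElement() when the value is stored.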

 

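StAX Java XML Parser

The Streaming API for XML (StAX) provides another way to process XML in Java. StAX contains two sets of APIs: a cursor-based API and an iterator-based API.

Iterator-based API

We again use the employee.xml file from above for testing.

Create the StaxXMLReader class: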

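import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class StaxXMLReader {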

    public static void main(String[] args) {
        String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        List<Employee> empList = parseXML(fileName);
        for(Employee emp : empList){
            System.out.println(emp.toString());
        }
    }

    private static List<Employee> parseXML(String fileName) {
        List<Employee> empList = new ArrayList<>();
        Employee emp = null;
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        try {
            XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
            while(xmlEventReader.hasNext()){
                XMLEvent xmlEvent = xmlEventReader.nextEvent();
                if (xmlEvent.isStartElement()){
                    StartElement startElement = xmlEvent.asStartElement();
                    if(startElement.getName().getLocalPart().equals("Employee")){
                        emp = new Employee();
                        //Get the 'id' attribute from Employee element
                        Attribute idAttr = startElement.getAttributeByName(new QName("id"));
                        /*if(idAttr != null){
                            emp.setId(Integer.parseInt(idAttr.getValue()));
                        }*/
                    }
                    //set the other variables from xml elements
                    else if(startElement.getName().getLocalPart().equals("age")){
                        xmlEvent = xmlEventReader.nextEvent();
                        // Note: if <age> may be empty, the next event is its end tag rather than
                        // character data, so check before calling asCharacters(); 1000 is an arbitrary fallback
                        if(xmlEvent.isEndElement()) {
                            emp.setAge(1000);
                        }
                        else
                        {
                            emp.setAge(Integer.parseInt(xmlEvent.asCharacters().getData()));
                        }

                    }else if(startElement.getName().getLocalPart().equals("name")){
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setName(xmlEvent.asCharacters().getData());
                    }else if(startElement.getName().getLocalPart().equals("gender")){
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setGender(xmlEvent.asCharacters().getData());
                    }else if(startElement.getName().getLocalPart().equals("role")){
                        xmlEvent = xmlEventReader.nextEvent();
                        emp.setRole(xmlEvent.asCharacters().getData());
                    }
                }
                //if Employee end element is reached, add employee object to list
                if(xmlEvent.isEndElement()){
                    EndElement endElement = xmlEvent.asEndElement();
                    System.out.println("End tag reached: "+endElement.getName().getLocalPart());
                    if(endElement.getName().getLocalPart().equals("Employee")){
                        empList.add(emp);
                    }
                }
            }

        } catch (FileNotFoundException | XMLStreamException e) {
            e.printStackTrace();
        }
        return empList;
    }
}

 

 

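Cursor-based API

When using the StAX XML parser, we create an XMLInputFactory to read the XML file. We can then choose the cursor-based API by creating an XMLStreamReader object to read the file. The XMLStreamReader next() method fetches the next parsing event and returns an int value that identifies the event type. Common event types are Start Document, Start Element, Characters, End Element, and End Document. XMLStreamConstants contains the int constants used to handle each event by type.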
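As a small illustration of how next() and the XMLStreamConstants values fit together, the sketch below walks a document held in a string and records the name of each event type it encounters. The class name StaxCursorTrace and the describe helper are illustrative; only the five common event types listed above are named, and anything else is reported by its numeric value.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxCursorTrace {

    // Maps the int returned by next() or getEventType() to a readable name.
    static String describe(int eventType) {
        switch (eventType) {
            case XMLStreamConstants.START_DOCUMENT: return "START_DOCUMENT";
            case XMLStreamConstants.START_ELEMENT:  return "START_ELEMENT";
            case XMLStreamConstants.CHARACTERS:     return "CHARACTERS";
            case XMLStreamConstants.END_ELEMENT:    return "END_ELEMENT";
            case XMLStreamConstants.END_DOCUMENT:   return "END_DOCUMENT";
            default:                                return "OTHER(" + eventType + ")";
        }
    }

    // Advances the cursor through the whole document and records each event type.
    static List<String> trace(String xml) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> events = new ArrayList<>();
        events.add(describe(reader.getEventType())); // the cursor starts on START_DOCUMENT
        while (reader.hasNext()) {
            events.add(describe(reader.next()));
        }
        reader.close();
        return events;
    }

    public static void main(String[] args) throws Exception {
        trace("<Employee><name>Lisa</name></Employee>").forEach(System.out::println);
    }
}
```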

The test class StaxXMLReader2:

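import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxXMLReader2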
{
    private static boolean bName;
    private static boolean bAge;
    private static boolean bGender;
    private static boolean bRole;

    public static void main(String[] args) {
        String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        List<Employee> empList = parseXML(fileName);
        for(Employee emp : empList){
            System.out.println(emp.toString());
        }
    }

    private static List<Employee> parseXML(String fileName) {
        List<Employee> empList = new ArrayList<>();
        Employee emp = null;
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileInputStream(fileName));
            int event = xmlStreamReader.getEventType();
            while(true){
                switch(event) {
                    case XMLStreamConstants.START_ELEMENT:
                        if(xmlStreamReader.getLocalName().equals("Employee")){
                            emp = new Employee();
                           // emp.setId(Integer.parseInt(xmlStreamReader.getAttributeValue(0)));
                        }else if(xmlStreamReader.getLocalName().equals("name")){
                            bName=true;
                        }else if(xmlStreamReader.getLocalName().equals("age")){
                            bAge=true;
                        }else if(xmlStreamReader.getLocalName().equals("role")){
                            bRole=true;
                        }else if(xmlStreamReader.getLocalName().equals("gender")){
                            bGender=true;
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if(bName){
                            emp.setName(xmlStreamReader.getText());
                            bName=false;
                        }else if(bAge){
                            emp.setAge(Integer.parseInt(xmlStreamReader.getText()));
                            bAge=false;
                        }else if(bGender){
                            emp.setGender(xmlStreamReader.getText());
                            bGender=false;
                        }else if(bRole){
                            emp.setRole(xmlStreamReader.getText());
                            bRole=false;
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if(xmlStreamReader.getLocalName().equals("Employee")){
                            empList.add(emp);
                        }
                        break;
                }
                if (!xmlStreamReader.hasNext())
                    break;

                event = xmlStreamReader.next();
            }

        } catch (FileNotFoundException | XMLStreamException e) {
            e.printStackTrace();
        }
        return empList;
    }
}

Output:

Employee:: Name=Pankaj Age=544 Gender=Male Role=Java Developer
Employee:: Name=Lisa Age=35 Gender=Female Role=CSS Developer
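The static boolean flags in StaxXMLReader2 can be avoided with XMLStreamReader.getElementText(), which reads the text of a text-only element and leaves the cursor on its end tag. It returns an empty string for an empty element, so the empty-age case needs no special branch. The following is a minimal sketch of that variant; the class name is hypothetical, and each employee is collected into a Map instead of the Employee bean so that the example stays self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxElementTextSketch {

    // Collects each <Employee> as a map of child tag name to text.
    static List<Map<String, String>> parse(String xml) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<Map<String, String>> employees = new ArrayList<>();
        Map<String, String> current = null;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String tag = reader.getLocalName();
                if (tag.equals("Employee")) {
                    current = new LinkedHashMap<>();
                } else if (current != null) {
                    // Only called on text-only children; it would throw on an element
                    // that contains child elements, such as the root <Employees>.
                    current.put(tag, reader.getElementText());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && reader.getLocalName().equals("Employee")) {
                employees.add(current);
                current = null;
            }
        }
        reader.close();
        return employees;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("<Employees><Employee><name>Lisa</name>"
                + "<age>35</age></Employee></Employees>"));
    }
}
```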

 

 

 

 

Java XML Parser - JDOM

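JDOM provides a convenient Java XML API that makes it easy to read, edit, and write XML documents. It provides builder classes that let you choose the underlying implementation: the SAX parser, the DOM parser, the StAX event parser, or the StAX stream parser.

Add the Maven dependency: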

    <!-- https://mvnrepository.com/artifact/org.jdom/jdom2 -->
        <dependency>
            <groupId>org.jdom</groupId>
            <artifactId>jdom2</artifactId>
            <version>2.0.6</version>
        </dependency>

 

The test class JDOMXMLReader:

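import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.DOMBuilder;
import org.jdom2.input.SAXBuilder;
import org.jdom2.input.StAXEventBuilder;
import org.jdom2.input.StAXStreamBuilder;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class JDOMXMLReader {
    //The benefit of JDOM is that you can easily switch between the SAX, DOM, and StAX parsers; you can provide a factory method that lets client applications choose the implementation.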
    public static void main(String[] args) {
        final String fileName = "D:/spring-boot/xml-demo/src/main/java/com/example/xmldemo/XMLHandler/employee.xml";
        org.jdom2.Document jdomDoc;
        try {
            //we can create JDOM Document from DOM, SAX and STAX Parser Builder classes
            jdomDoc = useDOMParser(fileName);
           // jdomDoc = useSAXParser(fileName);
          //  jdomDoc = useSTAXParser(fileName,"stream");
            Element root = jdomDoc.getRootElement();
            List<Element> empListElements = root.getChildren("Employee");
            List<Employee> empList = new ArrayList<>();
            for (Element empElement : empListElements) {
                Employee emp = new Employee();
               // emp.setId(Integer.parseInt(empElement.getAttributeValue("id")));
                emp.setAge(Integer.parseInt(empElement.getChildText("age")));
                emp.setName(empElement.getChildText("name"));
                emp.setRole(empElement.getChildText("role"));
                emp.setGender(empElement.getChildText("gender"));
                empList.add(emp);
            }
            //lets print Employees list information
            for (Employee emp : empList)
                System.out.println(emp);
        } catch (Exception e) {
            e.printStackTrace();
        }

    }


    //Get JDOM document from DOM Parser
    private static org.jdom2.Document useDOMParser(String fileName)
            throws ParserConfigurationException, SAXException, IOException {
        //creating DOM Document
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder;
        dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(new File(fileName));
        DOMBuilder domBuilder = new DOMBuilder();
        return domBuilder.build(doc);

    }

    //Get JDOM document from SAX Parser
    private static org.jdom2.Document useSAXParser(String fileName) throws JDOMException,
            IOException {
        SAXBuilder saxBuilder = new SAXBuilder();
        return saxBuilder.build(new File(fileName));
    }

    //Get JDOM Document from STAX Stream Parser or STAX Event Parser
    private static org.jdom2.Document useSTAXParser(String fileName, String type) throws FileNotFoundException, XMLStreamException, JDOMException{
        if(type.equalsIgnoreCase("stream")){
            StAXStreamBuilder staxBuilder = new StAXStreamBuilder();
            XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
            XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileInputStream(fileName));
            return staxBuilder.build(xmlStreamReader);
        }
        StAXEventBuilder staxBuilder = new StAXEventBuilder();
        XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
        XMLEventReader xmlEventReader = xmlInputFactory.createXMLEventReader(new FileInputStream(fileName));
        return staxBuilder.build(xmlEventReader);

    }
}

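The benefit of JDOM is that you can easily switch among the SAX, DOM, and StAX parsers, and we can expose an interface that lets client applications choose the implementation.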

 

 

Complete test code: https://github.com/bo-zhang-1/Xml-Parser

Reposted from blog.csdn.net/qq_35716892/article/details/82817147