A Detailed Look at How PULL Parsing of XML Works

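PULL parsing is simple and easy to pick up: after one read-through you can basically parse XML with it. Still, I never felt I really understood how PULL parsing runs, so I have summarized below how the event-driven process actually executes.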

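PULL: Android has a built-in PULL parser. The PULL parser is similar to the SAX parser in that it provides similar events, such as start-element and end-element events. Calling parser.next() advances to the next event and returns its code. Each event is delivered as a numeric code, so a switch can be used to handle the events of interest.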

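This is also my favorite approach: simple and easy to use.

The parsing process is described in detail below: how does it actually run?

The official XML Pull Parsing site, http://www.xmlpull.org/, has detailed explanations.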

I. Common XmlPullParser events:

START_DOCUMENT: start of the document

START_TAG: start of a tag

TEXT: text content

END_DOCUMENT: end of the document

END_TAG: end of a tag

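CDSECT: a CDATA section was just read (this token is available only from nextToken())

Inside a CDATA section, all markup and entity references are ignored; the XML processor treats everything uniformly as character data. A CDATA section has the following form:

<![CDATA[text content]]>
The text content of a CDATA section must not contain the string "]]>".

Also, CDATA sections cannot be nested.
COMMENT: a comment
DOCDECL: the document type declaration, i.e.

<!DOCTYPE ...>

IGNORABLE_WHITESPACE: ignorable whitespace. When the document is not constrained by a DTD, IGNORABLE_WHITESPACE only appears outside the root element; for a document constrained by a DTD, whitespace is defined by the DTD. (The DTD is the file named in the DOCTYPE; it specifies which tags may appear in the XML, where they may appear, and so on.)
...
Only five events are commonly used: START_DOCUMENT, START_TAG, TEXT, END_DOCUMENT, and END_TAG.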
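The loop-and-switch pattern over these five events can be sketched as follows. XmlPullParser is not part of the standard JDK, so this sketch assumes the JDK's StAX XMLStreamReader as a stand-in: it is also a pull parser, and its START_ELEMENT, CHARACTERS, and END_ELEMENT events play the roles of START_TAG, TEXT, and END_TAG. The class name PullEventLoopSketch and the method trace are hypothetical names used only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; StAX is used as a stand-in for XmlPullParser.
class PullEventLoopSketch {

    // Pulls events one by one and records the five common ones,
    // labelled with the corresponding XmlPullParser event names.
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Ask for adjacent text to be merged into a single text event.
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));

            int event = reader.getEventType(); // START_DOCUMENT before the first next()
            while (true) {
                switch (event) {
                    case XMLStreamConstants.START_DOCUMENT:
                        events.add("START_DOCUMENT");
                        break;
                    case XMLStreamConstants.START_ELEMENT:
                        events.add("START_TAG " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if (!reader.isWhiteSpace()) { // ignore whitespace-only text
                            events.add("TEXT " + reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        events.add("END_TAG " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.END_DOCUMENT:
                        events.add("END_DOCUMENT");
                        break;
                    default:
                        break; // events we are not interested in
                }
                if (event == XMLStreamConstants.END_DOCUMENT) {
                    break;
                }
                event = reader.next(); // pull the next event
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String line : trace("<a><b>hi</b></a>")) {
            System.out.println(line);
        }
    }
}
```

With XmlPullParser itself the shape is the same: read the current event with getEventType(), switch on it, then call next() until END_DOCUMENT is reached.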

II. Some important, more complex common methods, explained together with the source code

1) int nextTag():

Call next() and return the event if it is START_TAG or END_TAG; otherwise throw an exception. It will skip whitespace TEXT before the actual tag, if any.
In essence, it executes like this:

int eventType = next();
if (eventType == TEXT && isWhitespace()) { // skip whitespace
    eventType = next();
}
if (eventType != START_TAG && eventType != END_TAG) {
    throw new XmlPullParserException("expected start or end tag", this, null);
}
return eventType;
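The whitespace-skipping behaviour can be tried out with a small sketch. It assumes the JDK's StAX XMLStreamReader as a stand-in, because XmlPullParser is not in the standard JDK; StAX's nextTag() follows the same idea, although it also skips comments and processing instructions. NextTagSketch and firstChildName are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; StAX is used as a stand-in for XmlPullParser.
class NextTagSketch {

    // Returns the name of the root element's first child, using nextTag()
    // to hop over any whitespace-only text that sits between the tags.
    static String firstChildName(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag(); // from START_DOCUMENT to the root start tag
            reader.nextTag(); // skips whitespace text, lands on the child's start tag
            String name = reader.getLocalName();
            reader.close();
            return name;
        } catch (XMLStreamException e) {
            // Thrown, for example, when non-whitespace text precedes the next tag.
            throw new IllegalArgumentException("expected start or end tag", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstChildName("<root>\n  <item/>\n</root>"));
    }
}
```

If real text sits between the root's start tag and the child, as in `<root>text<item/></root>`, nextTag() fails, which matches the "otherwise throw an exception" branch above.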

2) String nextText():

If the current event is START_TAG, then: if the next element is TEXT, the element content is returned; if the next event is END_TAG, an empty string is returned; otherwise an exception is thrown. After calling this function successfully, the parser will be positioned on END_TAG.
The motivation for this function is to allow consistent parsing of both empty elements and elements that have non-empty content. For example, for the input:

<tag>foo</tag> and <tag></tag> (which is equivalent to <tag/>), both inputs can be parsed with the same code:

p.nextTag();
p.require(p.START_TAG, "", "tag");
String content = p.nextText();
p.require(p.END_TAG, "", "tag");

This function, together with nextTag(), makes it very easy to parse XML that has no mixed content.
In essence, it executes like this:

if (getEventType() != START_TAG) {
    throw new XmlPullParserException(
        "parser must be on START_TAG to read next text", this, null);
}
int eventType = next();
if (eventType == TEXT) {
    String result = getText();
    eventType = next();
    if (eventType != END_TAG) {
        throw new XmlPullParserException(
            "event TEXT it must be immediately followed by END_TAG", this, null);
    }
    return result;
} else if (eventType == END_TAG) {
    return "";
} else {
    throw new XmlPullParserException(
        "parser must be on START_TAG or TEXT to read text", this, null);
}
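To see that empty and non-empty elements really go through the same code path, here is a minimal sketch. It assumes the JDK's StAX XMLStreamReader as a stand-in for XmlPullParser (which is not in the standard JDK); StAX's getElementText() plays the role of nextText() and likewise leaves the reader on the end tag. ElementTextSketch and readTag are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; StAX is used as a stand-in for XmlPullParser.
class ElementTextSketch {

    // Reads the text of a single <tag> element, whether or not it is empty.
    static String readTag(String xml) {
        try {
            XMLStreamReader p =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            p.nextTag();                                              // move to <tag>
            p.require(XMLStreamConstants.START_ELEMENT, null, "tag"); // null = any namespace
            String content = p.getElementText();                      // counterpart of nextText()
            p.require(XMLStreamConstants.END_ELEMENT, null, "tag");   // now on </tag>
            p.close();
            return content;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("unexpected XML structure", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + readTag("<tag>foo</tag>") + "]");
        System.out.println("[" + readTag("<tag></tag>") + "]");
        System.out.println("[" + readTag("<tag/>") + "]");
    }
}
```

The caller never has to branch on whether a TEXT event was reported, which is the motivation described in the quoted documentation.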

3) int nextToken():

This method works similarly to next() but will expose additional event types (COMMENT, CDSECT, DOCDECL, ENTITY_REF, PROCESSING_INSTRUCTION, or IGNORABLE_WHITESPACE) if they are available in the input.
In other words, it can return every event type: whitespace, comments, CDSECT, and so on.
Its documentation comments are long and the method is not commonly used; if you want to dig into it, read the source code yourself.

4) public int next()

// Returns the next parsing event
Get the next parsing event. Element content will be coalesced, and only one TEXT event must be returned for the whole element content (comments and processing instructions will be ignored, and entity references must be expanded, or an exception must be thrown if an entity reference cannot be expanded). If the element content is empty (content is ""), then no TEXT event will be reported.
Also, the difference between next() and nextToken():
next(): mainly returns the higher-level events, which include START_TAG, TEXT, END_TAG, and END_DOCUMENT.
nextToken(): returns all events.
While next() provides access to high-level parsing events, nextToken() allows access to lower-level tokens.

5) void require(int type, String namespace, String name)

Test if the current event is of the given type and if the namespace and name match. null will match any namespace and any name. If the current event is TEXT with isWhitespace() == true, and the required type is not TEXT, next() is called prior to the test.

if (getEventType() == TEXT && type != TEXT && isWhitespace())
    next();

if (type != getEventType()
        || (namespace != null && !namespace.equals(getNamespace()))
        || (name != null && !name.equals(getName())))
    throw new XmlPullParserException("expected " + TYPES[type] + getPositionDescription());
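As a quick check of the "null matches anything" rule, the sketch below asserts the name of a document's root element. It assumes the JDK's StAX XMLStreamReader as a stand-in for XmlPullParser (not available in the standard JDK); StAX's require(type, namespaceURI, localName) performs the same kind of type, namespace, and name test. RequireSketch and rootIs are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class; StAX is used as a stand-in for XmlPullParser.
class RequireSketch {

    // Returns true when the root start tag passes require() for the expected
    // name; a null name (like a null namespace) matches anything.
    static boolean rootIs(String xml, String expectedName) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            reader.nextTag(); // position on the root start tag
            try {
                reader.require(XMLStreamConstants.START_ELEMENT, null, expectedName);
                return true;
            } catch (XMLStreamException mismatch) {
                return false; // type, namespace, or name did not match
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootIs("<config/>", "config"));
        System.out.println(rootIs("<config/>", "settings"));
    }
}
```

Using require() as an assertion right after nextTag() keeps a parser from silently continuing on an unexpected document shape.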
