Using dominate to Generate Web Pages

2021-10-29

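Dominate Study Notes

1. Background

Algorithm debugging, document output, and similar work all need written records. Common formats for keeping such records include Markdown, Office documents, and HTML. I personally lean toward Markdown (with Typora), but I have run into some problems with it. Markdown is more concise than the other two formats and offers limited customization, so it falls short when a document with a more complex layout is needed. In those cases I choose HTML instead. That raises a new question: how to produce the HTML. Because HTML is a structured format, it can conveniently be generated and modified automatically with code. Python's dominate library is a well-packaged library for this purpose; these notes record what I learned while studying and using it.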

2. Dominate

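2.1 Introduction to dominate

Dominate is a Python library for creating and manipulating HTML (HyperText Markup Language) documents using a DOM API. It lets you write HTML pages in pure Python, without having to learn a separate template language.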

3. Installation

The officially recommended installation method:

$ sudo pip install dominate

4. Examples

Before using dominate, import the HTML tags you need, or import the entire tag set. HTML tags include div, a, p, h1, and so on; see [1] for a detailed introduction to them.

from dominate.tags import *

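4.1 Hello, World

4.1.1 Producing a first web page with dominate

The most basic feature of dominate is that it exposes a class for every HTML element. The constructor of each element class accepts the element's child elements, text, and keyword attributes. A dominate node returns its HTML representation through the __str__(), __unicode__(), and render() methods.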

from dominate.tags import *
mainpage = html(body(h1('Hello, World!')))
htmlRepreStrI = mainpage.__str__()
htmlRepreStrII = mainpage.__unicode__()
htmlRepreStrIII = mainpage.render()
print(mainpage)

<html>
  <body>
    <h1>Hello, World!</h1>
  </body>
</html>

4.1.2 class:html and class:dominate.document

import dominate
from dominate.tags import *

mainpageI = dominate.document()
mainpageII = html()
print('# dominate.document')
print(mainpageI)
print('------------------\n', '# html')
print(mainpageII)

# dominate.document
<!DOCTYPE html>
<html>
  <head>
    <title>Dominate</title>
  </head>
  <body></body>
</html>
------------------
 # html
<html></html>

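Compared with the html class, document produces a standard HTML skeleton. Instantiated with no arguments, document generates an html tag containing a head (with a page title) and a body, whereas html generates only the html tag. This is because html is the parent class of document: document inherits from html, and its constructor adds the extra content.

The constructor in dominate's document.py: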

class document(tags.html):
  tagname = 'html'
  def __init__(self, title='Dominate', doctype='<!DOCTYPE html>', request=None):
    '''
    Creates a new document instance. Accepts `title`, `doctype`, and `request` keyword arguments.
    '''
    super(document, self).__init__()
    self.doctype    = doctype
    self.head       = super(document, self).add(tags.head())
    self.body       = super(document, self).add(tags.body())
    self.title_node = self.head.add(tags.title(title))
    self._entry     = self.body

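5. Attributes

5.1 Setting attributes with keyword arguments

Dominate can attach attributes to a tag through keyword arguments. Most attributes are copied directly from the HTML specification, with only a few differences. Because the class and for attribute names conflict with Python reserved words, the following aliases can be used instead.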

class	for
_class	_for
cls	fr
className	htmlFor
class_name	html_for

For example:

test = label(cls='classname anothername', fr='someinput')
print(test)

<label class="classname anothername" for="someinput"></label>

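5.2 Setting HTML5 data attributes with `data_*`

In HTML5, data attributes are written in the form "data-xx". ~~For example, the background-color attribute.~~ However, in Python "-" is the subtraction operator, so the keyword is written with an underscore and dominate converts it to a hyphen.

test = div(data_employee='101011')
print(test)

<div data-employee="101011"></div>

Todo: how to handle background-color.

Trying the underscore approach gives the following result:
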
test = div(background_color='#ffffff')
print(test)

<div background_color="#ffffff"></div>

5.3 Modifying tag attributes through a dictionary-like interface

header = div()
header['id'] = 'header'
print(header)

<div id="header"></div>

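6. Building complex page structures

6.1 `+=` and `.add()`

Dominate supports the += operator and the .add() method, which make it easy to build more complex page structures.

For example, to create a simple unordered list:

# Named item_list to avoid shadowing Python's built-in list.
item_list = ul()
for item in range(4):
    item_list += li('Item #', item)
print(item_list)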

<ul>
  <li>Item #0</li>
  <li>Item #1</li>
  <li>Item #2</li>
  <li>Item #3</li>
</ul>

The same approach can be used to create multiple <div> elements:

bd = body()
for item in range(4):
    bd += div('Item #', item)
print(bd)

<body>
  <div>Item #0</div>
  <div>Item #1</div>
  <div>Item #2</div>
  <div>Item #3</div>
</body>

Complex pages usually contain repeated modules; using += or .add() keeps the code concise.

Dominate also accepts iterables, which simplifies the code further:

menu_items = (['home', '/home/'], ['about', '/about'])
print(ul(li(a(name, href=link), __pretty=False) for name, link in menu_items))

<ul>
  <li><a href="/home/">home</a></li>
  <li><a href="/about">about</a></li>
</ul>

A simple HTML document tree looks like this:

_html = html()
_head = _html.add(head(title("Simple Document Tree")))
_body = _html.add(body())
header = _body.add(div(id='header'))
content = _body.add(div(id='content'))
footer = _body.add(div(id='footer'))
print(_html)

<html>
  <head>
    <title>Simple Document Tree</title>
  </head>
  <body>
    <div id="header"></div>
    <div id="content"></div>
    <div id="footer"></div>
  </body>
</html>

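The tree above is built with the .add() method, which returns the tags it added; when several tags are passed, they come back as a tuple. The code can therefore be simplified as follows: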

_html = html()
_head, _body = _html.add(head(title('Simple Document Tree')), body())
names = ['header', 'content', 'footer']
header, content, footer = _body.add([div(id=name) for name in names])
print(_html)

<html>
  <head>
    <title>Simple Document Tree</title>
  </head>
  <body>
    <div id="header"></div>
    <div id="content"></div>
    <div id="footer"></div>
  </body>
</html>

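In the code above, _head and _body are both added to _html through .add(), so they are children of _html. When a document tree is built with .add(), dominate does not reorder anything according to tag type; tags are simply appended in the order they are added. A typical HTML file should contain both <head> and <body>, and for readability <head> normally comes first.

If the simple document tree is instead built as follows: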

_html = html()
_body = _html.add(body())
_head = _html.add(head(title("Simple Document Tree")))
header = _body.add(div(id='header'))
content = _body.add(div(id='content'))
footer = _body.add(div(id='footer'))
print(_html)

then the final output is:

<html>
  <body>
    <div id="header"></div>
    <div id="content"></div>
    <div id="footer"></div>
  </body>
  <head>
    <title>Simple Document Tree</title>
  </head>
</html>

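6.2 Modifying a tag's children through an array-like interface

Section 5.3 showed that a tag's attributes can be modified through a dictionary-like interface. Dominate also supports an array-like interface for accessing and modifying a tag's children.

Note: the documentation describes the array-like interface as operating on the children of a tag. A tag's text content is itself stored as a child, which is why index 0 in the example below replaces the text 'Test'.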

header = div('Test')
print(header)
header[0] = 'Hello World'
print(header)

<div>Test</div>
<div>Hello World</div>

6.3 Adding comments to an HTML file with the comment class

print(comment('this is a piece of comment'))
print(comment(p('Upgrade to newer IE!'), condition='lt IE9'))

<!--this is a piece of comment-->
<!--[if lt IE9]>
<p>Upgrade to newer IE!</p>
<![endif]-->

7. Rendering page text (render)

The render() function returns the tag's HTML as a text string, which can then be written out to a local HTML file.

a = div(span('Hello World'))
print(a.render(), '\n', type(a.render()))
with open('test.html',mode='w',encoding='utf-8') as f:
    f.write(a.render())

<div>
  <span>Hello World</span>
</div> 
<class 'str'>

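By default, render() places each HTML element on its own line and indents with two spaces.

How the page text is rendered is determined by the __pretty attribute set when an HTML element is created. The options that can be changed when calling render() are: pretty (default: True, except for certain element types such as pre), indent (default: two spaces), and xhtml (default: False). Rendering options passed to render() also apply to all child nodes.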

a = div(span('Hello World'))
print(a.render())
print(a.render(pretty=False))
print(a.render(indent='\t'))
a = div(span('Hello World'), __pretty=False)
print(a.render())
d = div()
with d:
    hr()
    p("Test")
    br()
print(d.render())
print(d.render(xhtml=True))

--------------
<div>
  <span>Hello World</span>
</div>
--------------
<div><span>Hello World</span></div>
--------------
<div>
	<span>Hello World</span>
</div>
--------------
<div><span>Hello World</span></div>
--------------
<div>
  <hr>
  <p>Test</p><br>
</div>
--------------
<div>
  <hr />
  <p>Test</p><br />
</div>

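8. Context managers for HTML documents

8.1 Adding child tags with with

Use Python's with statement to add child tags:

# Create an unordered list tag
h = ul()
# Use with to add list items to the unordered list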
with h:
    li('One')
    li('Two')
    li('Three')

print(h)

<ul>
  <li>One</li>
  <li>Two</li>
  <li>Three</li>
</ul>

Likewise, with statements can be nested to build more complex pages:

h = html()
with h.add(body()).add(div(id='content')):
    h1('Hello World!')
    p('Lorem ipsum ...')
    with table().add(tbody()):
        l = tr()
        l += td('One')
        l.add(td('Two'))
        with l:
            td('Three')

print(h)

<html>
  <body>
    <div id="content">
      <h1>Hello World!</h1>
      <p>Lorem ipsum ...</p>
      <table>
        <tbody>
          <tr>
            <td>One</td>
            <td>Two</td>
            <td>Three</td>
          </tr>
        </tbody>
      </table>
    </div>
  </body>
</html>

8.2 Adding attributes with attr

Open a node with with, then use the attr function to add attributes to it.

d = div()
with d:
    attr(id='header')

print(d)

<div id="header"></div>

8.3 Adding text to a node with the text function

from dominate.util import text
para = p("This is a paragraph,", __pretty=False)
print(para)
with para:
    text('Have a look at our ')
    a('other products', href='/products')

print(para)

<p>This is a paragraph,</p>
<p>This is a paragraph,Have a look at our <a href="/products">other products</a></p>

As the result shows, dominate.util.text does not replace the existing text of a node; it appends the new text after the existing content.

9. Decorators

Dominate is well suited to building small reusable widgets for parts of a page. One way to do this is:

def greeting(name):
    with div() as d:
        p('Hello, %s' % name)
    return d

print(greeting('Bob'))

This approach can be abstracted into the following template:

def widget(parameters):
    with tag() as t:
        ...
    return t

By using a tag (either the class or an instance) as a decorator, you can avoid repeating that boilerplate in every widget:

@div
def greeting(name):
    p('Hello %s' % name)
print(greeting('Bob'))

<div>
  <p>Hello Bob</p>
</div>

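A function decorated with a tag returns an instance of the tag used to decorate it; for example, a function decorated with div returns a div instance. Because a with statement is applied implicitly inside the function, the returned instance contains the nodes created in the function.

If you need to add attributes or other data to the widget's root node, you can also use a tag instance as the decorator. Each call to the decorated function returns a copy of the node used to decorate it.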

@div(h2('Welcome'), cls='greeting')
def greeting(name):
    p('Hello %s' % name)

print(greeting('Bob'))

<div class="greeting">
  <h2>Welcome</h2>
  <p>Hello Bob</p>
</div>

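10. Creating Documents

Creating the same generic HTML document structure every time is tedious; dominate's document class solves this easily.

A newly created document already contains the basic HTML tags as members.

from dominate import document
d = document()
print(d)

<!DOCTYPE html>
<html>
  <head>
    <title>Dominate</title>
  </head>
  <body></body>
</html>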

The document class accepts the keyword arguments title, doctype, and request, whose default values are Dominate, <!DOCTYPE html>, and None respectively. A document also gives direct access to its title, head, and body nodes.

>>> d = document()
>>> d.head
<dominate.tags.head: 0 attributes, 1 children>
>>> d.body
<dominate.tags.body: 0 attributes, 0 children>
>>> d.title
'Dominate'

Otherwise, a document is used just like any other node.

11. SVG

The dominate.svg module contains the SVG tags. SVG elements automatically convert '_' to '-' in dashed attribute names.

from dominate.svg import *
print(circle(stroke_width=5))

References

1. Getting started with HTML - Learn web development | MDN (mozilla.org)

2. HTML5 Data Attributes - Vegibit

3. Context Managers and Python's with Statement - Real Python
