Commit c044ff4

fix errors in docs
1 parent bb8703c commit c044ff4

4 files changed, +49 -22 lines changed

‎README.rst‎

Lines changed: 3 additions & 3 deletions
@@ -32,10 +32,10 @@ Example
 >>> from selectorlib import Extractor
 >>> yaml_string = """
 title:
-    selector: "h1"
+    css: "h1"
     type: Text
 link:
-    selector: "h2 a"
+    css: "h2 a"
     type: Link
 """
 >>> extractor = Extractor.from_yaml_string(yaml_string)
@@ -45,5 +45,5 @@ Example
     <a class="headerlink" href="http://test">¶</a>
 </h2>
 """
->>> selector.extract(html)
+>>> extractor.extract(html)
 {'title': 'Title', 'link': 'http://test'}
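
The change that recurs throughout this commit is the rename of the `selector` key to `css` in the extractor YAML. Below is a minimal sketch of applying the same rename to an existing YAML string. It assumes the key always starts a line (after indentation); `rename_selector_key` is a hypothetical helper and not part of selectorlib.

```python
import re

# Hypothetical helper (not part of selectorlib).
# Matches a `selector:` key at the start of a line, after optional indentation.
_SELECTOR_KEY = re.compile(r"^([ \t]*)selector:", flags=re.MULTILINE)


def rename_selector_key(yaml_text: str) -> str:
    """Return yaml_text with every line-leading `selector:` key renamed to `css:`.

    Caveat: a field that is itself named `selector` would also be renamed.
    """
    return _SELECTOR_KEY.sub(r"\1css:", yaml_text)


old_yaml = """
title:
    selector: "h1"
    type: Text
link:
    selector: "h2 a"
    type: Link
"""

new_yaml = rename_selector_key(old_yaml)
print(new_yaml)
```

The regex works on plain text so that comments and layout in the YAML file are left alone. For files where `selector` may also be a field name, parsing the YAML and renaming only nested keys would be safer.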

‎docs/usage.rst‎

Lines changed: 42 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1,46 +1,47 @@
11
Usage
22
======
33

4-
To use selectorlib with requests:
4+
Using selectorlib with requests
5+
--------------------------------
56

67
>>>import requests
78
>>>from selectorlibimport Extractor
89
>>>selector_yaml="""
910
name:
10-
selector: h1.product_title
11+
css: h1.product_title
1112
price:
12-
selector: p.price
13+
css: p.price
1314
stock:
14-
selector: p.stock
15+
css: p.stock
1516
tags:
16-
selector: span.tagged_as a
17+
css: span.tagged_as a
1718
short_description:
18-
selector: .woocommerce-product-details__short-description > p
19+
css: .woocommerce-product-details__short-description > p
1920
description:
20-
selector: div#tab-description p
21+
css: div#tab-description p
2122
attributes:
22-
selector: table.shop_attributes
23+
css: table.shop_attributes
2324
multiple: True
2425
children:
2526
name:
26-
selector: th
27+
css: th
2728
value:
28-
selector: td
29+
css: td
2930
related_products:
30-
selector: li.product
31+
css: li.product
3132
multiple: True
3233
children:
3334
name:
34-
selector: h2
35+
css: h2
3536
url:
36-
selector: a[href]
37+
css: a[href]
3738
price:
38-
selector: .price
39+
css: .price
3940
"""
4041
>>>extractor= Extractor.from_yaml_string(selector_yaml)
4142
>>>url='https://scrapeme.live/shop/Bulbasaur/'
4243
>>>response= requests.get(url)
43-
>>>selector.extract(response.text,base_url=response.url)
44+
>>>extractor.extract(response.text,base_url=response.url)
4445
{'attributes': [{'name': 'Weight', 'value': '15.2 kg'}],
4546
'description': 'Bulbasaur can be seen napping in bright sunlight. There is a '
4647
'seed on its back. By soaking up the sun’s rays, the seed '
@@ -61,3 +62,29 @@ related_products:
6162
'the seed grows progressively larger.',
6263
'stock': '45 in stock',
6364
'tags': 'bulbasaur'}
65+
66+
67+
Using formatter with selectors
68+
-------------------------------
69+
70+
>>>from selectorlibimport Extractor, Formatter
71+
>>>classNumber(Formatter):
72+
def format(self, text):
73+
return int(text)
74+
>>>yaml_string="""
75+
title:
76+
css: "h1"
77+
type: Text
78+
num:
79+
css: "h2 span"
80+
format: Number
81+
"""
82+
>>>formatters= Formatter.get_all()
83+
>>>extractor= Extractor.from_yaml_string(yaml_string,formatters=formatters)
84+
>>>html="""
85+
<h1>Title</h1>
86+
<h2>
87+
<span>123</span>
88+
</h2>
89+
"""
90+
>>>extractor.extract(html)
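
The formatter example passes `Formatter.get_all()` to the extractor and then refers to the formatter by its class name (`format: Number`). Below is a minimal sketch of how such a name-based lookup could work. It uses a stand-in `BaseFormatter` class rather than selectorlib's own `Formatter`; the `__subclasses__()` lookup and the `apply_format` helper are assumptions for illustration, not the library's implementation.

```python
class BaseFormatter:
    """Stand-in base class (hypothetical; not selectorlib's Formatter)."""

    @classmethod
    def get_all(cls):
        # Collect every direct subclass defined so far.
        return cls.__subclasses__()

    def format(self, text):
        raise NotImplementedError


class Number(BaseFormatter):
    def format(self, text):
        # Convert the extracted text to an integer.
        return int(text)


def apply_format(name, text, formatters):
    """Look up a formatter class by its class name and apply it to text."""
    by_name = {f.__name__: f for f in formatters}
    return by_name[name]().format(text)


result = apply_format("Number", "123", BaseFormatter.get_all())
print(result)
```

Keying the lookup on the class name is what would let a YAML file refer to a formatter with a plain string, at the cost of requiring unique class names.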

‎selectorlib/selectorlib.py‎

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@ def from_yaml_string(cls, yaml_string: str, formatters=None):
3535
3636
>>> yaml_string = '''
3737
title:
38-
selector: "h1"
38+
css: "h1"
3939
type: Text
4040
'''
4141
>>> extractor = Extractor.from_yaml_string(yaml_string)
@@ -62,7 +62,7 @@ def extract(self, html: str, base_url: str = None):
6262
dict: extracted data from given html string
6363
6464
>>> response = requests.get(url)
65-
>>>selector.extract(response.text, base_url=response.url)
65+
>>>extractor.extract(response.text, base_url=response.url)
6666
"""
6767
sel=parsel.Selector(html,base_url=base_url)
6868
ifbase_url:
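
`extract` takes an optional `base_url`, and the usage example passes `response.url` for it. The hunk is cut off right after `if base_url:`, so what follows is an assumption: a likely purpose is resolving relative links against the page URL. The standard library's `urljoin` illustrates the idea; the `../Ivysaur/` path is a made-up example.

```python
from urllib.parse import urljoin

base_url = "https://scrapeme.live/shop/Bulbasaur/"

# A relative href is resolved against the page URL (hypothetical path).
relative = urljoin(base_url, "../Ivysaur/")

# An absolute href is returned unchanged.
absolute = urljoin(base_url, "http://test")

print(relative)
print(absolute)
```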

‎tests/test_selectorlib.py‎

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -37,8 +37,8 @@ def output_yaml():
3737
deftest_content(html,input_yaml,output_yaml):
3838
base_url="https://scrapeme.live/shop/Bulbasaur/"
3939
formatters=formatter.Formatter.get_all()
40-
selector=selectorlib.Extractor.from_yaml_string(input_yaml,formatters=formatters)
41-
output=selector.extract(html,base_url=base_url)
40+
extractor=selectorlib.Extractor.from_yaml_string(input_yaml,formatters=formatters)
41+
output=extractor.extract(html,base_url=base_url)
4242
assertoutput==yaml.safe_load(output_yaml)
4343

4444
