Fix do all redirects #49


Merged
QuLogic merged 18 commits into matplotlib:master from jklymak:fix-do-all-reditrects
Feb 12, 2021

@jklymak (Member) commented Jan 21, 2021 (edited):

(Update 31 Jan 2021):

closes: matplotlib/matplotlib#12374
closes: #25

Obviously this blows GitHub up, but the script that does this is in the first commit...

Problem 1:

Currently the top level of the website has a copy of every file that has existed on our webpage, even if the file is obsolete and not part of the current matplotlib docs. For instance, the /examples/ directory was removed after 2.0.2 (and replaced by /gallery/) but is still accessible at https://matplotlib.org/examples/. @tacaswell wants this to remain so old links do not die, but it also means that search engines think this is a perfectly acceptable current set of webpages, whereas we would like these versions to not show up in searches.

Proposed solution:

The script here either soft links all top-level files to their newest version in the docs, or makes an html-refresh to do that.

So, for example, gallery/api was moved for 3.0.0, so ls -halt gallery/api gives:

    -rw-r--r--  1 jklymak staff  463 Jan 20 22:24 quad_bezier.html
    lrwxr-xr-x  1 jklymak staff   33 Jan 20 22:24 legend.py -> ../../2.2.5/gallery/api/legend.py
    -rw-r--r--  1 jklymak staff  463 Jan 20 22:24 radar_chart.html
    -rw-r--r--  1 jklymak staff  448 Jan 20 22:24 logos2.html
    lrwxr-xr-x  1 jklymak staff   34 Jan 20 22:24 legend.png -> ../../2.2.5/gallery/api/legend.png

and less quad_bezier.html gives

    <!DOCTYPE HTML>
    <html lang="en">
        <head>
            <meta charset="utf-8">
            <meta http-equiv="refresh" content="0;url=https://matplotlib.org/2.2.5/gallery/api/quad_bezier.html" />
            <link rel="canonical" href="https://matplotlib.org/2.2.5/gallery/api/quad_bezier.html" />
        </head>
        <body>
            <h1>
                The page been moved to <a href="https://matplotlib.org/2.2.5/gallery/api/quad_bezier.html"</a>
            </h1>
        </body>
    </html>
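
The link-or-refresh idea above could be sketched roughly as follows. This is not the script from this PR: the function name `make_redirect`, its parameters, and the rule "HTML files get a refresh page, everything else gets a relative symlink" are assumptions inferred from the listing above. The template here also closes the anchor tag, which the pasted output does not.

```python
import os
from pathlib import Path

# Hypothetical template, modelled on the quad_bezier.html output shown above.
HTML_REDIRECT = """<!DOCTYPE HTML>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <meta http-equiv="refresh" content="0;url={url}" />
        <link rel="canonical" href="{url}" />
    </head>
    <body>
        <h1>
            The page has been moved to <a href="{url}">{url}</a>
        </h1>
    </body>
</html>
"""


def make_redirect(top, version, relpath, base_url="https://matplotlib.org"):
    """Make top/relpath point at top/version/relpath.

    Assumed rule: .html files become an HTML-refresh page, any other
    file becomes a relative symlink into the versioned directory.
    """
    top = Path(top)
    dest = top / relpath
    dest.parent.mkdir(parents=True, exist_ok=True)
    if dest.is_symlink() or dest.exists():
        dest.unlink()  # replace whatever stale copy sits at the top level
    if dest.suffix == ".html":
        url = f"{base_url}/{version}/{relpath}"
        dest.write_text(HTML_REDIRECT.format(url=url), encoding="utf-8")
    else:
        # Relative target, e.g. ../../2.2.5/gallery/api/legend.py
        target = os.path.relpath(top / version / relpath, start=dest.parent)
        dest.symlink_to(target)
    return dest
```

Calling `make_redirect(top, "2.2.5", "gallery/api/legend.py")` would produce a symlink like the `legend.py` entry in the listing, while an `.html` path would produce a small refresh page instead.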

Problem 2:

Similarly, our canonical links go to the top level or the level they were introduced in. So https://matplotlib.org/stable/gallery/showcase/mandelbrot.html has <link rel="canonical" href="https://matplotlib.org/3.3.4/gallery/showcase/mandelbrot.html"/> as its canonical link. Older versions of the docs would link to <link rel="canonical" href="https://matplotlib.org/gallery/showcase/mandelbrot.html"/>.

Solution:

The script goes through each html file in all versions (including old versions) and changes the canonical link to the newest version. So for quad_bezier.html:

less 2.2.5/gallery/api/quad_bezier.html gives
<link rel="canonical" href="https://matplotlib.org/2.2.5/gallery/api/quad_bezier.html" />

less 2.2.4/gallery/api/quad_bezier.html gives the same link (because 2.2.5 is the newest).

For files that exist in stable:

less 2.2.4/tutorials/intermediate/artists.html gives the canonical version in stable.
<link rel="canonical" href="https://matplotlib.org/stable/tutorials/intermediate/artists.html" />
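
A minimal sketch of the canonical-link rewrite described above, under stated assumptions. The helper names `find_newest` and `set_canonical`, the newest-first ordering of `versions`, and the regular expression are all hypothetical and are not taken from the script in this PR.

```python
import re
from pathlib import Path

# Matches an existing canonical link tag, with or without a space before "/>".
CANONICAL_RE = re.compile(r'<link rel="canonical" href="[^"]*"\s*/?>')


def find_newest(top, relpath, versions):
    """Return the first version (ordered newest first) that contains relpath."""
    for version in versions:
        if (Path(top) / version / relpath).exists():
            return version
    return None


def set_canonical(html_file, version, relpath, base_url="https://matplotlib.org"):
    """Rewrite the canonical link in html_file to point at version/relpath."""
    html_file = Path(html_file)
    text = html_file.read_text(encoding="utf-8")
    new_tag = f'<link rel="canonical" href="{base_url}/{version}/{relpath}" />'
    # A function replacement avoids backslash-escape surprises in the URL.
    html_file.write_text(CANONICAL_RE.sub(lambda m: new_tag, text), encoding="utf-8")
```

With `versions = ("stable", "2.2.5", "2.2.4")`, a page that only exists up to 2.2.5 would resolve to `2.2.5`, and a page still present in `stable` would resolve to `stable`, matching the two cases above.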

Maintenance Burden

  • the script will need updating if a new top-level subdirectory is added and we don't want it included in the linking
  • the script needs to be run when a new version of the docs is released; the script is slow, but I added multi-processing to the canonical links part to move it along.

@jklymak (Member, Author) commented:

Procedure for a release:

As before

    rsync -a 2.0.0/* ./
    rm stable
    ln -s 2.0.0 stable
    python make_redirects_links.py

If a file is removed from 2.0.0, then make_redirects_links.py will link it to 1.9.9 (or whatever the last version was). A new file will then just be linked back to stable, which is a bit silly, but it keeps the top level consistent.

Conversely, we could leave out the rsync step from now on, so new docs are never installed at the top level (the newest is always in stable).

@lgtm-com commented:

This pull request introduces 3 alerts and fixes 1414 when merging ddbe02b into ae49ba9 - view on LGTM.com

new alerts:

  • 2 for Unused import
  • 1 for Unnecessary 'else' clause in loop

fixed alerts:

  • 801 for Variable defined multiple times
  • 323 for Unused import
  • 155 for Unused local variable
  • 53 for Constant in conditional expression or statement
  • 38 for Unreachable code
  • 11 for Implicit string concatenation in a list
  • 8 for Unhashable object hashed
  • 8 for Suspicious unused loop iteration variable
  • 5 for Module is imported more than once
  • 4 for Except block handles 'BaseException'
  • 3 for First parameter of a method is not named 'self'
  • 2 for Redundant assignment
  • 1 for __iter__ method returns a non-iterator
  • 1 for Wrong number of arguments for format
  • 1 for Module is imported with 'import' and 'import from'

@jklymak (Member, Author) commented:

So I think this should also go back through all the html files and change the canonical link. The canonical would be the latest available version, usually stable if the page still exists. If the page doesn't exist any longer, the canonical would be the newest version that still has it.

@jklymak force-pushed the fix-do-all-reditrects branch 3 times, most recently from d4f4528 to 7a1c948 (January 31, 2021 00:33)

@lgtm-com commented:

This pull request introduces 5 alerts and fixes 1414 when merging 83007d9 into 512a813 - view on LGTM.com

new alerts:

  • 3 for Unused import
  • 1 for Unnecessary 'else' clause in loop
  • 1 for Implicit string concatenation in a list

fixed alerts:

  • 801 for Variable defined multiple times
  • 323 for Unused import
  • 155 for Unused local variable
  • 53 for Constant in conditional expression or statement
  • 38 for Unreachable code
  • 11 for Implicit string concatenation in a list
  • 8 for Unhashable object hashed
  • 8 for Suspicious unused loop iteration variable
  • 5 for Module is imported more than once
  • 4 for Except block handles 'BaseException'
  • 3 for First parameter of a method is not named 'self'
  • 2 for Redundant assignment
  • 1 for __iter__ method returns a non-iterator
  • 1 for Wrong number of arguments for format
  • 1 for Module is imported with 'import' and 'import from'

@lgtm-com commented:

This pull request introduces 5 alerts and fixes 1414 when merging 694bc49 into 512a813 - view on LGTM.com


@jklymak linked an issue Feb 1, 2021 that may be closed by this pull request
@jklymak changed the title from "Fix do all reditrects" to "Fix do all redirects" Feb 2, 2021

@jklymak (Member, Author) commented:

A possible improvement of this script might be to put a banner after <body> on every old version of the webpages so they are marked as not current. Of course the most recent page would not get this, and we'd have to make sure we do not add it twice.

@jklymak (Member, Author) commented:

Note I've dropped the second commit because there is no reason to upload it here until it's ready to go. Let me know if you'd like me to regenerate it, or if one of you would like to do it.

@tacaswell (Member) commented:

Ah, I have been independently working on the script and have some ways to make it faster.

I think it is possible to make the redirects relative as a kindness to anyone who wants to host these files locally / on an airgapped network.

@lgtm-com (bot) commented Feb 3, 2021:

This pull request introduces 4 alerts when merging 67572bf into aa7c836 - view on LGTM.com

new alerts:

  • 3 for Unused import
  • 1 for Unnecessary 'else' clause in loop

os.rename does not work across filesystems
This is helpful to people who want to host off-line versions of the docs.
@jklymak (Member, Author) commented:

Does a relative redirect work so that the new address looks correct? We don't want https://matplotlib.org/boo/who/../../stable/boo/who/index.html!
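
As a rough check on the question above, the sketch below builds a relative redirect target and resolves it against the page URL with `urllib.parse.urljoin`, which removes `..` segments in the way RFC 3986 describes for relative references. The helper `relative_target` and the `boo/who/index.html` path are illustrative only, and this does not test what any particular browser displays.

```python
import posixpath
from urllib.parse import urljoin


def relative_target(relpath, version="stable"):
    """Relative URL from a top-level page to the same page under `version`."""
    start = posixpath.dirname(relpath) or "."
    return posixpath.relpath(posixpath.join(version, relpath), start=start)


page = "boo/who/index.html"
rel = relative_target(page)
# Resolve the relative reference against the URL of the redirecting page.
resolved = urljoin("https://matplotlib.org/" + page, rel)
print(rel)
print(resolved)
```

Here `rel` is `../../stable/boo/who/index.html` and `resolved` is `https://matplotlib.org/stable/boo/who/index.html`, so the dot segments disappear during resolution rather than surviving into the final address.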

@jklymak (Member, Author) commented:

Added the banner logic.

It somewhat fragilely assumes that the <body> tag is on a line of its own. If we wanted to use BeautifulSoup and were willing to prettify the output, we could do this much more robustly. But I wasn't sure if that was too invasive.
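
A line-based banner insertion of the kind described could look like the sketch below. It shares the stated fragility (it looks for a line that starts with `<body`) and guards against adding the banner twice. The function name `add_banner`, its parameters, and the exact banner wording are assumptions, not the code in this PR.

```python
def add_banner(html, version, latest_url, div_id="olddocs-message"):
    """Insert an 'old docs' banner after the first line starting with <body.

    Returns the text unchanged if a banner with the same div id is
    already present, so running it twice is harmless.
    """
    marker = f'<div id="{div_id}">'
    if marker in html:
        return html
    banner = (
        f"{marker} You are reading an old version of the documentation "
        f"(v{version}). For the latest version see "
        f'<a href="{latest_url}">{latest_url}</a></div>\n'
    )
    out = []
    inserted = False
    for line in html.splitlines(keepends=True):
        out.append(line)
        # Fragile by design: only a line that begins with <body is recognised.
        if not inserted and line.strip().startswith("<body"):
            out.append(banner)
            inserted = True
    return "".join(out)
```

The `div_id` parameter is there so the id can be swapped without touching the logic.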

Also removed the double recursion under do_canonical! It's quite fast now and I'm 90% sure it hits everything.

@lgtm-com (bot) commented Feb 4, 2021:

This pull request introduces 4 alerts when merging 4fd177e into aa7c836 - view on LGTM.com

new alerts:

  • 3 for Unused import
  • 1 for Unnecessary 'else' clause in loop

@lgtm-com (bot) commented Feb 4, 2021:

This pull request introduces 1 alert when merging 1e9977f into aa7c836 - view on LGTM.com

new alerts:

  • 1 for Unnecessary 'else' clause in loop

@jklymak (Member, Author) commented:

Note I don't think this needs to wait for matplotlib/matplotlib#19456.

@jklymak (Member, Author) commented:

This is working so far as I can tell. Header and start of body now look like:

    ...
    <link rel="canonical" href="https://matplotlib.org/stable/index.html" />
    <link rel="stylesheet" href="_static/custom.css" type="text/css" />
    <meta name="viewport" content="width=device-width, initial-scale=0.9, maximum-scale=0.9" />
    </head>
    <body>
    <div id="olddocs-message"> You are reading an old version of the documentation (v3.3.2).  For the latest version see <a href="https://matplotlib.org/stable/index.html">https://matplotlib.org/stable/index.html</a>
    </div>

@lgtm-com (bot) commented Feb 7, 2021:

This pull request introduces 1 alert when merging 2f144a8 into aa7c836 - view on LGTM.com

new alerts:

  • 1 for Unnecessary 'else' clause in loop

@jklymak (Member, Author) commented:

@tacaswell @QuLogic I don't see any reason to not move forward with this. If you do, happy to chat, but if it's OK, I think implementing it sooner rather than later is preferable...

last = findlast(basename, tocheck)
if last is not None:
update_canonical(fullname, last, dname == tocheck[1])
for fullname in dname.rglob("*.html"):
Member (review comment):

Note, Path.rglob doesn't seem to support multiple patterns, but we do not have any .htm files.
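
If more than one pattern were ever needed, one option is to chain several `rglob` calls, since each call takes a single pattern. This is only a sketch of that workaround; the helper `rglob_many` is hypothetical and is not part of the script.

```python
from itertools import chain
from pathlib import Path


def rglob_many(root, patterns):
    """Recursively collect files matching any of the given glob patterns."""
    matches = chain.from_iterable(Path(root).rglob(p) for p in patterns)
    # A set removes files matched by more than one pattern; sort for stable order.
    return sorted(set(matches))
```

For example, `rglob_many(dname, ["*.html", "*.htm"])` would cover both extensions in one pass over the results.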

@QuLogic (Member) commented:

I pushed a few cleanups and improvements.

@jklymak (Member, Author) commented:

Hmmm, is functools.cache 3.9 only?

@QuLogic (Member) commented:

Oh, yes, but it's basically a lighter version of lru_cache; we can switch to that if you prefer.

@jklymak (Member, Author) commented:

I don't mind, I just need to upgrade my env
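
For reference, a version-tolerant fallback is straightforward, because `functools.cache` (added in Python 3.9) behaves like `functools.lru_cache(maxsize=None)`. The sketch below is illustrative; the `square` function and the `calls` list exist only to show that the second call is served from the cache.

```python
import functools

try:
    cache = functools.cache  # available from Python 3.9
except AttributeError:
    # Older interpreters: an unbounded lru_cache gives the same behaviour.
    cache = functools.lru_cache(maxsize=None)

calls = []


@cache
def square(n):
    calls.append(n)  # records only real (uncached) invocations
    return n * n


square(3)
square(3)  # served from the cache, so `calls` is not extended
```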

if not args.no_canonicals:
    if np is not None:
        with multiprocessing.Pool(np) as pool:
            pool.map(do_canonicals, tocheck[1:])
Member (Author):

This option now fails:

    Traceback (most recent call last):
      File "/Users/jklymak/anaconda3/envs/matplotlibdev/lib/python3.9/multiprocessing/pool.py", line 125, in worker
        result = (True, func(*args, **kwds))
      File "/Users/jklymak/anaconda3/envs/matplotlibdev/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
        return list(map(*args))
      File "/Users/jklymak/matplotlib.github.com/_websiteutils/make_redirects_links.py", line 142, in do_canonicals
        last = findlast(basename, tocheck)
    TypeError: unhashable type: 'list'

Member (Author):

Perhaps just remove it?

Member:

Ah, map must convert it from a tuple to a list.

Member (Author):

So it's the cache that is causing the problem? Happy to remove my optimization in favour of your optimization ;-)

Member:

Actually no, this works fine for me; do you have some stashed changes? tocheck should be a tuple after df8ed61 (which was before 6fcc3b7).
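
The error in the traceback is what a cache-decorated function raises when any of its arguments is a list, because the arguments form the cache key and must be hashable. The sketch below reproduces that with a stand-in function; `first_version` and `try_call` are hypothetical names and say nothing about where the list came from in the real run.

```python
import functools


@functools.lru_cache(maxsize=None)
def first_version(basename, versions):
    # Both arguments are part of the cache key, so each must be hashable.
    return versions[0] if versions else None


def try_call(versions):
    """Call the cached function and report a TypeError instead of raising."""
    try:
        return first_version("index.html", versions)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_call(["stable", "3.3.4"]))  # list argument: reports a TypeError
print(try_call(("stable", "3.3.4")))  # tuple argument: cached lookup works
```

Passing the version sequence as a tuple, as the reply above says the script does after df8ed61, keeps the cached call hashable.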

Member (Author):

That aside, I think this is working great....

@jklymak (Member, Author) commented:

Changed the banner tag div id from "olddocs-message" which doesn't exist, to "unreleased-message" which while not quite accurate, already exists in many of the old versions. Gives a banner that is sticky at the top of the viewport:

[image: Banner]

Reviewers

  • @QuLogic left review comments
  • @tacaswell: awaiting requested review

Successfully merging this pull request may close these issues:

  • root pages should redirect to versioned pages
  • Examples in docs should be redirected to latest version number

3 participants: @jklymak, @tacaswell, @QuLogic
