Jordan Hansen

Originally published at javascriptwebscrapingguy.com

     

Jordan Plays With Playwright

Demo code here

Much to my surprise, Playwright has entered the scene. I follow Andrey Lushnikov on Twitter, and on January 22nd he posted this tweet:

Folks! I'm happy to share what we've been working on:

📣 https://t.co/ABrJpvbwSy

Playwright is like Puppeteer, but cross-browser. pic.twitter.com/PiMjqwr7uF

- Andrey Lushnikov (@aslushnikov), January 22, 2020

It turns out that the whole Puppeteer team has moved over to Microsoft to build Playwright. As far as I can tell, Playwright uses almost exactly the same API as Puppeteer. One big drawback for a TypeScript guy like me is that there isn't a type definition file for it yet, like there is for Puppeteer. Maybe it's time for me to learn how to create a definition file.

Check out the documentation for Playwright here.

For learning to web scrape with Puppeteer, check here.

Different devices


Playwright and Puppeteer were both built largely for automated web testing, and they do a great job at it. While I mostly use them for web scraping and automating tedious tasks, a large part of these tools exists to help with testing.

One of the opening examples in the docs shows how easy it is to test with different devices. Here is how the code looks:

    const { chromium, devices } = require('playwright');

    // Excerpt from inside an async function; latitude and longitude are defined earlier
    const pixel2 = devices['Pixel 2'];
    const browser = await chromium.launch({ headless: false });
    // Emulate the Pixel 2 and grant the geolocation permission to Google
    const context = await browser.newContext({
        viewport: pixel2.viewport,
        userAgent: pixel2.userAgent,
        geolocation: { longitude: longitude, latitude: latitude },
        permissions: { 'https://www.google.com': ['geolocation'] }
    });
    const page = await context.newPage();
    await page.goto('https://maps.google.com');
    await page.click('text="Your location"');
    // Wait for the maps script request before taking the screenshot
    await page.waitForRequest(/.*pwa\/net.js.*/);
    await page.screenshot({ path: `${longitude}, ${latitude}-android.png` });
    await browser.close();

pixel2 comes from the devices object that Playwright exports (const playwright = require('playwright');), and from there you can just use all the stats that come with that device. Pretty amazing and very simple.
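Since a device descriptor is just a plain object, it can be spread into the context options instead of copying fields one by one. Here is a minimal sketch of that idea. The names buildContextOptions and screenshotAsDevice are hypothetical helpers of mine, not part of Playwright, and the sketch assumes a recent Playwright release where permissions is passed as a plain list rather than keyed by origin as in the example above.

```javascript
// Hypothetical helper (not part of Playwright): merge a device descriptor
// with a location so the result can be passed to browser.newContext().
function buildContextOptions(device, latitude, longitude) {
    return {
        ...device, // viewport, userAgent and any other fields on the descriptor
        geolocation: { latitude, longitude },
        // Assumption: recent Playwright releases take permissions as a list.
        permissions: ['geolocation'],
    };
}

// Sketch of how the helper might be used. Playwright is required lazily so
// the helper above can be exercised without a browser installed.
async function screenshotAsDevice(deviceName, url, latitude, longitude, path) {
    const { chromium, devices } = require('playwright');
    const browser = await chromium.launch();
    try {
        const context = await browser.newContext(
            buildContextOptions(devices[deviceName], latitude, longitude)
        );
        const page = await context.newPage();
        await page.goto(url);
        await page.screenshot({ path });
    } finally {
        // Close the browser even if navigation or the screenshot fails
        await browser.close();
    }
}
```

Spreading the descriptor also carries over any extra fields it has beyond viewport and userAgent, which the field-by-field version would miss.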

I wanted to mess around a little with geolocation since I'd never used it with Puppeteer. I built a function that returns a random longitude and latitude, then hit Google Maps from each of these random positions to see whether that would affect Google blocking me. After 20 attempts Google hadn't flagged anything. This example only loops five times.

    const { chromium, devices } = require('playwright');

    async function tryDevices() {
        // Loop five times with random locations
        for (let i = 0; i < 5; i++) {
            const latitude = getRandomInRange(-90, 90, 3);
            // Longitude runs from -180 to 180
            const longitude = getRandomInRange(-180, 180, 3);
            const pixel2 = devices['Pixel 2'];
            const browser = await chromium.launch({ headless: false });
            const context = await browser.newContext({
                viewport: pixel2.viewport,
                userAgent: pixel2.userAgent,
                geolocation: { longitude: longitude, latitude: latitude },
                permissions: { 'https://www.google.com': ['geolocation'] }
            });
            const page = await context.newPage();
            await page.goto('https://maps.google.com');
            await page.click('text="Your location"');
            await page.waitForRequest(/.*pwa\/net.js.*/);
            await page.screenshot({ path: `${longitude}, ${latitude}-android.png` });
            await browser.close();
        }
    }

    // Longitude and latitude function
    function getRandomInRange(from, to, fixed) {
        return Number((Math.random() * (to - from) + from).toFixed(fixed));
    }
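Launching a fresh browser for every location works, but a browser context is the cheaper unit of isolation. The sketch below reuses one browser and opens a new context per random location. randomCoordinate and visitFromRandomLocations are hypothetical names of mine, the browser argument is assumed to be whatever chromium.launch() returns, and permissions is written in the list form that recent Playwright releases use.

```javascript
// Pick a random point on the globe: latitude in [-90, 90], longitude in [-180, 180].
function randomCoordinate(decimals = 3) {
    const pick = (min, max) =>
        Number((Math.random() * (max - min) + min).toFixed(decimals));
    return { latitude: pick(-90, 90), longitude: pick(-180, 180) };
}

// Visit `url` once per random location, sharing a single browser.
// `browser` is assumed to expose Playwright's newContext() method.
async function visitFromRandomLocations(browser, url, count) {
    const visited = [];
    for (let i = 0; i < count; i++) {
        const geolocation = randomCoordinate();
        const context = await browser.newContext({
            geolocation,
            permissions: ['geolocation'],
        });
        try {
            const page = await context.newPage();
            await page.goto(url);
            visited.push(geolocation);
        } finally {
            // Each context is isolated, so closing it discards its cookies and storage
            await context.close();
        }
    }
    return visited;
}
```

Whether separate contexts look any different to Google than separate browser launches is something I have not tested.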

I also learned that there is a lot of ocean on Earth. Surprise.


Varying the geolocation could be a neat trick, but I still think puppeteer stealth and the items I discussed in the how to avoid being blocked with puppeteer post are better for simply avoiding being blocked.

Different browsers


Unlike Puppeteer, Playwright lets you launch a browser either from a browser type directly or from a property of the playwright object. As we saw above with the different devices, we call the launch function directly on a browser type with const browser = await chromium.launch({ headless: false });. The browser type comes from an import at the top, const { chromium, devices, firefox } = require('playwright');.

The docs also show it’s simple to just loop through the available browsers like so:

    const playwright = require('playwright');

    for (const browserType of ['chromium', 'firefox', 'webkit']) {
        const browser = await playwright[browserType].launch({ headless: false });
        // do your stuff here
        await browser.close();
    }
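If the work inside that loop throws, the browser never gets closed. Here is a rough sketch of one way to wrap the loop so each browser is always cleaned up. runInEachBrowser is a hypothetical helper of mine, and the launchers argument is assumed to be the object returned by require('playwright'), or anything else whose properties have a launch() method.

```javascript
// Run `task(browser, name)` once per browser type and collect the results.
// `launchers` maps a browser name to an object with a launch() method.
async function runInEachBrowser(launchers, names, task) {
    const results = {};
    for (const name of names) {
        const browser = await launchers[name].launch();
        try {
            results[name] = await task(browser, name);
        } finally {
            // Always close the browser, even when the task throws
            await browser.close();
        }
    }
    return results;
}

// Possible usage with Playwright (not executed here):
// const titles = await runInEachBrowser(
//     require('playwright'),
//     ['chromium', 'firefox', 'webkit'],
//     async (browser) => {
//         const page = await browser.newPage();
//         await page.goto('https://example.com');
//         return page.title();
//     }
// );
```

Passing the launchers in as an argument also makes the loop easy to try out with stand-in objects before pointing it at real browsers.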

Conclusion

At this point, it looks to be superior to Puppeteer. Handling multiple browsers easily is clearly a major goal for the team, and that is awesome, but it's probably not that impactful for web scraping.

An important point, however, is that with the whole amazing team that created Puppeteer now working on Playwright, this is where the updates will be. In fact, I found a cool feature that wasn't even explicitly mentioned: the ability to select based on text content. I searched high and low and couldn't find any way to do this in Puppeteer, so I'm fairly certain it's specific to Playwright.

This is how I would have done it before, in a case where I had a list of header items with the same selectors and only wanted to select the one that said pricing.


    // Search through content and find pricing
    const headerElementHandles = await page.$$('.hometop-btn .mat-button-wrapper');
    for (let elementHandle of headerElementHandles) {
        const text: string = await elementHandle.$eval('strong', element => element.textContent);
        console.log('text', text);
        if (text && text.toLocaleLowerCase().includes('pricing')) {
            await elementHandle.click();
        }
    }

I’d just get the list of all of them and then loop through them and click the one that has the text content I’m looking for.

And…with this new playwright way?

    // Click based on text content
    await page.click('text="Pricing"');
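One detail worth knowing: as I understand the Playwright selector docs, the quoted form text="Pricing" asks for an exact, case-sensitive match, while the unquoted form text=pricing does a case-insensitive substring match, which is closer to what my old loop with toLocaleLowerCase().includes() did. A tiny sketch of building either form follows. textSelector is a hypothetical helper of mine, and it assumes the text contains no quote characters.

```javascript
// Build a Playwright text selector string.
// exact = true  -> text="Pricing" (quoted form, exact match)
// exact = false -> text=Pricing   (unquoted form, looser match)
// Assumption: `text` contains no quote characters, so no escaping is attempted.
function textSelector(text, exact = false) {
    return exact ? `text="${text}"` : `text=${text}`;
}

// Possible usage inside an async function with a Playwright page (not executed here):
// await page.click(textSelector('pricing'));        // loose match
// await page.click(textSelector('Pricing', true));  // exact match
```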

That’s it. A lot simpler. Love it. Good job, playwright team!

Demo code here

Looking for business leads?

Using the techniques talked about here at javascriptwebscrapingguy.com, we've been able to launch a way to access awesome business leads. Learn more at Cobalt Intelligence!

The post Jordan Plays With Playwright appeared first on JavaScript Web Scraping Guy.

