Instrument headless chrome/chromium instances from PHP


chrome-php/chrome


This library lets you start playing with chrome/chromium in headless mode from PHP.

Can be used synchronously and asynchronously!

Features

  • Open Chrome or Chromium browser from PHP
  • Create pages and navigate to pages
  • Take screenshots
  • Evaluate JavaScript on the page
  • Make PDF
  • Emulate mouse
  • Emulate keyboard
  • Always IDE friendly

Happy browsing!

Requirements

Requires PHP 7.4-8.4 and a Chrome/Chromium 65+ executable.

Note that the library is only tested on Linux but is compatible with macOS and Windows.

Installation

The library can be installed with Composer and is available on Packagist under chrome-php/chrome:

$ composer require chrome-php/chrome

Usage

It uses a simple and understandable API to start Chrome, to open pages, take screenshots, crawl websites... and almost everything that you can do with Chrome as a human.

use HeadlessChromium\BrowserFactory;
$browserFactory = new BrowserFactory();
// starts headless Chrome
$browser = $browserFactory->createBrowser();
try {
    // creates a new page and navigate to an URL
    $page = $browser->createPage();
    $page->navigate('http://example.com')->waitForNavigation();
    // get page title
    $pageTitle = $page->evaluate('document.title')->getReturnValue();
    // screenshot - Say "Cheese"! 😄
    $page->screenshot()->saveToFile('/foo/bar.png');
    // pdf
    $page->pdf(['printBackground' => false])->saveToFile('/foo/bar.pdf');
} finally {
    // bye
    $browser->close();
}

Using different Chrome executable

When starting, the factory will look for the environment variable "CHROME_PATH" to use as the Chrome executable. If the variable is not found, it will try to guess the correct executable path according to your OS or use "chrome" as the default.

You are also able to explicitly set up any executable of your choice when creating a new object. For instance "chromium-browser":

use HeadlessChromium\BrowserFactory;
// replace default 'chrome' with 'chromium-browser'
$browserFactory = new BrowserFactory('chromium-browser');
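If you prefer to make the lookup order explicit in your own code, it could be sketched as below. This is only an illustration: resolveChromeBinary is a hypothetical helper, not part of the library, and it simply mirrors the "environment variable first, default name second" rule described above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of chrome-php): pick the Chrome binary from
 * an environment array, falling back to a default executable name.
 */
function resolveChromeBinary(array $env, string $fallback = 'chrome'): string
{
    $path = $env['CHROME_PATH'] ?? '';

    // Ignore missing, non-string, or blank values.
    return \is_string($path) && '' !== \trim($path) ? \trim($path) : $fallback;
}

// The result can be handed to the factory constructor shown above, e.g.
// $browserFactory = new BrowserFactory(resolveChromeBinary($_SERVER, 'chromium-browser'));
echo resolveChromeBinary(['CHROME_PATH' => '/usr/bin/chromium']), PHP_EOL; // prints "/usr/bin/chromium"
echo resolveChromeBinary([], 'chromium-browser'), PHP_EOL;                 // prints "chromium-browser"
```

Taking the environment as a parameter rather than reading a global keeps the helper easy to test.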

Debugging

The following example disables headless mode to ease debugging:

use HeadlessChromium\BrowserFactory;
$browserFactory = new BrowserFactory();
$browser = $browserFactory->createBrowser([
    'headless' => false, // disable headless mode
]);

Other debug options:

[
    'connectionDelay' => 0.8,            // add 0.8 seconds of delay between each instruction sent to Chrome
    'debugLogger'     => 'php://stdout', // will enable verbose mode
]

About debugLogger: this can be any of a resource string, a resource, or an object implementing LoggerInterface from Psr\Log (such as monolog or apix/log).
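As a small sketch of the "resource" variant, the options can be assembled around an already opened stream. The helper name debugOptions is hypothetical, and php://temp is used here only so the snippet runs anywhere; a real script would more likely open a log file or php://stderr.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of chrome-php): build debug options that
 * send verbose output to an already opened stream resource.
 *
 * @param resource $stream
 */
function debugOptions($stream, float $delaySeconds = 0.0): array
{
    if (!\is_resource($stream)) {
        throw new \InvalidArgumentException('Expected an open stream resource.');
    }

    return [
        'headless'        => false,         // show the browser window
        'connectionDelay' => $delaySeconds, // slow down each instruction
        'debugLogger'     => $stream,       // resource form of debugLogger
    ];
}

$log = \fopen('php://temp', 'w+');
$options = debugOptions($log, 0.8);
// $browser = (new BrowserFactory())->createBrowser($options);
echo \implode(', ', \array_keys($options)), PHP_EOL;
```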

API

Browser Factory

Options set directly in the createBrowser method will be used only for a single browser creation. The default options will be ignored.

use HeadlessChromium\BrowserFactory;
$browserFactory = new BrowserFactory();
$browser = $browserFactory->createBrowser([
    'windowSize'   => [1920, 1000],
    'enableImages' => false,
]);
// this browser will be created without any options
$browser2 = $browserFactory->createBrowser();

Options set using the setOptions and addOptions methods will persist.

$browserFactory->setOptions([
    'windowSize' => [1920, 1000],
]);
// both browser will have the same 'windowSize' option
$browser1 = $browserFactory->createBrowser();
$browser2 = $browserFactory->createBrowser();
$browserFactory->addOptions(['enableImages' => false]);
// this browser will have both the 'windowSize' and 'enableImages' options
$browser3 = $browserFactory->createBrowser();
$browserFactory->addOptions(['enableImages' => true]);
// this browser will have the previous 'windowSize', but 'enableImages' will be true
$browser4 = $browserFactory->createBrowser();

Available options

Here are the options available for the browser factory:

| Option name             | Default | Description |
|-------------------------|---------|-------------|
| connectionDelay         | 0       | Delay to apply between each operation for debugging purposes |
| customFlags             | none    | An array of flags to pass to the command line. Eg: ['--option1', '--option2=someValue'] |
| debugLogger             | null    | A string (e.g "php://stdout"), or resource, or PSR-3 logger instance to print debug messages |
| disableNotifications    | false   | Disable browser notifications |
| enableImages            | true    | Toggles loading of images |
| envVariables            | none    | An array of environment variables to pass to the process (example DISPLAY variable) |
| headers                 | none    | An array of custom HTTP headers |
| headless                | true    | Enable or disable headless mode |
| ignoreCertificateErrors | false   | Set Chrome to ignore SSL errors |
| keepAlive               | false   | Set to true to keep alive the Chrome instance when the script terminates |
| noSandbox               | false   | Enable no sandbox mode, useful to run in a docker container |
| noProxyServer           | false   | Don't use a proxy server, always make direct connections. Overrides other proxy settings. |
| proxyBypassList         | none    | Specifies a list of hosts for whom we bypass proxy settings and use direct connections |
| proxyServer             | none    | Proxy server to use. usage: 127.0.0.1:8080 (authorisation with credentials does not work) |
| sendSyncDefaultTimeout  | 5000    | Default timeout (ms) for sending sync messages |
| startupTimeout          | 30      | Maximum time in seconds to wait for Chrome to start |
| userAgent               | none    | User agent to use for the whole browser (see page API for alternative) |
| userDataDir             | none    | Chrome user data dir (default: a new empty dir is generated temporarily) |
| userCrashDumpsDir       | none    | The directory crashpad should store dumps in (crash reporter will be enabled automatically) |
| windowSize              | none    | Size of the window. usage: $width, $height - see also Page::setViewport |
| excludedSwitches        | none    | An array of Chrome flags that should be removed from the default set (example --enable-automation) |
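To show how several of these options combine, here is a minimal sketch of an option set for a crawler running in a container. The helper name containerOptions, the chosen values, and the --disable-gpu flag are illustrative assumptions; only the option names come from the table above.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of chrome-php): assemble factory options
 * for a containerised crawler from the option names listed above.
 */
function containerOptions(?string $proxy = null): array
{
    $options = [
        'headless'       => true,
        'noSandbox'      => true,              // commonly needed inside Docker
        'enableImages'   => false,             // skip image loading
        'windowSize'     => [1280, 800],       // width, height
        'startupTimeout' => 60,                // seconds
        'customFlags'    => ['--disable-gpu'], // illustrative Chrome flag
    ];

    if (null !== $proxy && '' !== $proxy) {
        $options['proxyServer'] = $proxy;      // e.g. "127.0.0.1:8080"
    }

    return $options;
}

$options = containerOptions('127.0.0.1:8080');
// $browser = (new BrowserFactory())->createBrowser($options);
echo \implode(', ', \array_keys($options)), PHP_EOL;
```

The same array could instead be passed to setOptions if every browser created by the factory should share it.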

Persistent Browser

This example shows how to share a single instance of Chrome for multiple scripts.

The first time the script is started we use the browser factory in order to start Chrome, afterwards we save the uri to connect to this browser in the file system.

The next calls to the script will read the uri from that file in order to connect to the Chrome instance instead of creating a new one. If Chrome was closed or crashed, a new instance is started again.

use HeadlessChromium\BrowserFactory;
use HeadlessChromium\Exception\BrowserConnectionFailed;
// path to the file to store websocket's uri
$socketFile = '/tmp/chrome-php-demo-socket';
$socket = \file_get_contents($socketFile);
try {
    $browser = BrowserFactory::connectToBrowser($socket);
} catch (BrowserConnectionFailed $e) {
    // The browser was probably closed, start it again
    $factory = new BrowserFactory();
    $browser = $factory->createBrowser([
        'keepAlive' => true,
    ]);
    // save the uri to be able to connect again to browser
    \file_put_contents($socketFile, $browser->getSocketUri(), LOCK_EX);
}

Browser API

Create a new page (tab)

$page = $browser->createPage();

Get opened pages (tabs)

$pages = $browser->getPages();

Close the browser

$browser->close();

Set a script to evaluate before every page created by this browser navigates

$browser->setPagePreScript('// Simulate navigator permissions;
const originalQuery = window.navigator.permissions.query;
window.navigator.permissions.query = (parameters) => (
    parameters.name === "notifications" ?
        Promise.resolve({ state: Notification.permission }) :
        originalQuery(parameters)
);');

Page API

Navigate to an URL

// navigate
$navigation = $page->navigate('http://example.com');
// wait for the page to be loaded
$navigation->waitForNavigation();

By default, $navigation->waitForNavigation() waits up to 30 seconds for the page "load" event to be triggered. You can change the timeout or the event to listen for:

use HeadlessChromium\Page;
// wait 10secs for the event "DOMContentLoaded" to be triggered
$navigation->waitForNavigation(Page::DOM_CONTENT_LOADED, 10000);

Available events (in the order they trigger):

  • Page::DOM_CONTENT_LOADED: dom has completely loaded
  • Page::FIRST_CONTENTFUL_PAINT: triggered when the first non-white content element is painted on the screen
  • Page::FIRST_IMAGE_PAINT: triggered when the first image is painted on the screen
  • Page::FIRST_MEANINGFUL_PAINT: triggered when the primary content of a page is visible to the user
  • Page::FIRST_PAINT: triggered when any pixel on the screen is painted, including the browser's default background color
  • Page::INIT: connection to DevTools protocol is initialized
  • Page::INTERACTIVE_TIME: scripts have finished loading and the main thread is no longer blocked by rendering or other tasks
  • Page::LOAD: (default) page and all resources are loaded
  • Page::NETWORK_IDLE: page has loaded, and no network activity has occurred for at least 500ms

Two main issues may occur while you wait for a navigation. First, the page may take too long to load; second, the page you were waiting for may have been replaced by another navigation. The good news is that you can handle both using a good old try-catch:

use HeadlessChromium\Exception\OperationTimedOut;
use HeadlessChromium\Exception\NavigationExpired;
try {
    $navigation->waitForNavigation();
} catch (OperationTimedOut $e) {
    // too long to load
} catch (NavigationExpired $e) {
    // Another page was loaded
}
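Building on the try-catch pattern, a timed-out navigation could be retried a few times before giving up. The sketch below is a generic retry helper, not something the library provides; retryOn is a hypothetical name, and the demonstration uses a plain RuntimeException so it runs without Chrome.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: run $operation, retrying while it throws an instance
 * of $retryOn, up to $attempts tries in total. Other exceptions propagate.
 */
function retryOn(string $retryOn, int $attempts, callable $operation)
{
    if ($attempts < 1) {
        throw new \InvalidArgumentException('$attempts must be at least 1.');
    }

    for ($try = 1; ; ++$try) {
        try {
            return $operation($try);
        } catch (\Throwable $e) {
            // Rethrow unexpected exceptions, or the last allowed failure.
            if (!($e instanceof $retryOn) || $try >= $attempts) {
                throw $e;
            }
        }
    }
}

// With the library this could wrap a navigation, for example:
// retryOn(OperationTimedOut::class, 3, function () use ($page) {
//     $page->navigate('http://example.com')->waitForNavigation();
// });

// Stand-alone demonstration with a plain exception type:
$result = retryOn(\RuntimeException::class, 3, function (int $try): string {
    if ($try < 3) {
        throw new \RuntimeException('simulated timeout');
    }

    return "loaded on try {$try}";
});
echo $result, PHP_EOL; // prints "loaded on try 3"
```

Retrying only on the timeout exception keeps NavigationExpired visible to the caller, since repeating the wait would not bring the replaced page back.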

Evaluate script on the page

Once the page has completed the navigation you can evaluate arbitrary script on this page:

// navigate
$navigation = $page->navigate('http://example.com');
// wait for the page to be loaded
$navigation->waitForNavigation();
// evaluate script in the browser
$evaluation = $page->evaluate('document.documentElement.innerHTML');
// wait for the value to return and get it
$value = $evaluation->getReturnValue();

Sometimes the script you evaluate will click a link or submit a form. In this case the page will reload, and you will want to wait for the new page to load.

You can achieve this by using $page->evaluate('some js that will reload the page')->waitForPageReload().
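As a sketch of that pattern, the script passed to evaluate can be built from a CSS selector and the reload awaited afterwards. Both helper names are hypothetical, and the catch clause assumes that waitForPageReload() signals a timeout with OperationTimedOut in the same way waitForNavigation() does.

```php
<?php
declare(strict_types=1);

use HeadlessChromium\Exception\OperationTimedOut;
use HeadlessChromium\Page;

/**
 * Hypothetical helper: build a JavaScript snippet that submits the first
 * form matching $selector. json_encode() quotes the selector safely.
 */
function buildSubmitScript(string $selector): string
{
    return \sprintf('document.querySelector(%s).submit()', \json_encode($selector));
}

/**
 * Hypothetical helper: submit a form and wait for the resulting page load.
 * Returns false if the reload did not finish before the timeout (assumed
 * to surface as OperationTimedOut).
 */
function submitAndWait(Page $page, string $selector): bool
{
    try {
        $page->evaluate(buildSubmitScript($selector))->waitForPageReload();

        return true;
    } catch (OperationTimedOut $e) {
        return false;
    }
}

echo buildSubmitScript('#login-form'), PHP_EOL;
// prints: document.querySelector("#login-form").submit()
```

Encoding the selector with json_encode avoids broken scripts when the selector itself contains quotes.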

Call a function

This is an alternative to evaluate that allows calling a given function with the given arguments in the page context:

$evaluation = $page->callFunction(
    "function(a, b) {\n    window.foo = a + b;\n}",
    [1, 2]
);
$value = $evaluation->getReturnValue();

Add a script tag

That's useful if you want to add jQuery (or anything else) to the page:

$page->addScriptTag([
    'content' => file_get_contents('path/to/jquery.js')
])->waitForResponse();
$page->evaluate('$(".my.element").html()');

You can also use a URL to feed the src attribute:

$page->addScriptTag([
    'url' => 'https://code.jquery.com/jquery-3.3.1.min.js'
])->waitForResponse();
$page->evaluate('$(".my.element").html()');

Set the page HTML

You can manually inject HTML into a page using the setHtml method.

// Basic
$page->setHtml('<p>text</p>');
// Specific timeout & event
$page->setHtml('<p>text</p>', 10000, Page::NETWORK_IDLE);

When a page's HTML is updated, we'll wait for the page to unload. You can specify how long to wait and which event to wait for through two optional parameters. This defaults to 3000ms and the "load" event.

Note that this method will not append to the current page HTML, it will completely replace it.

Get the page HTML

You can get the page HTML as a string using the getHtml method.

$html = $page->getHtml();

Add a script to evaluate upon page navigation

$page->addPreScript('// Simulate navigator permissions;
const originalQuery = window.navigator.permissions.query;
window.navigator.permissions.query = (parameters) => (
    parameters.name === "notifications" ?
        Promise.resolve({ state: Notification.permission }) :
        originalQuery(parameters)
);');

If your script needs the DOM to be fully populated before it runs then you can use the option "onLoad":

$page->addPreScript($script, ['onLoad' => true]);

Set viewport size

This feature allows changing the size of the viewport (emulation) for the current page without affecting the size of all the browser's pages (see also option "windowSize" of BrowserFactory::createBrowser).

$width = 600;
$height = 300;
$page->setViewport($width, $height)
    ->await(); // wait for the operation to complete

Make a screenshot

// navigate
$navigation = $page->navigate('http://example.com');
// wait for the page to be loaded
$navigation->waitForNavigation();
// take a screenshot
$screenshot = $page->screenshot([
    'format'  => 'jpeg',       // default to 'png' - possible values: 'png', 'jpeg', 'webp'
    'quality' => 80,           // only when format is 'jpeg' or 'webp' - default 100
    'optimizeForSpeed' => true // default to 'false' - Optimize image encoding for speed, not for resulting size
]);
// save the screenshot
$screenshot->saveToFile('/some/place/file.jpg');

Screenshot an area on a page

You can use the option "clip" to choose an area on a page for the screenshot:

use HeadlessChromium\Clip;
// navigate
$navigation = $page->navigate('http://example.com');
// wait for the page to be loaded
$navigation->waitForNavigation();
// create a rectangle by specifying the top-left corner coordinates + width and height
$x = 10;
$y = 10;
$width = 100;
$height = 100;
$clip = new Clip($x, $y, $width, $height);
// take the screenshot (in memory binaries)
$screenshot = $page->screenshot([
    'clip' => $clip,
]);
// save the screenshot
$screenshot->saveToFile('/some/place/file.jpg');

Full-page screenshot

You can also take a screenshot for the full-page layout (not only the viewport) using $page->getFullPageClip with attribute captureBeyondViewport = true

// navigate
$navigation = $page->navigate('https://example.com');
// wait for the page to be loaded
$navigation->waitForNavigation();
$screenshot = $page->screenshot([
    'captureBeyondViewport' => true,
    'clip' => $page->getFullPageClip(),
    'format' => 'jpeg', // default to 'png' - possible values: 'png', 'jpeg', 'webp'
]);
// save the screenshot
$screenshot->saveToFile('/some/place/file.jpg');

Print as PDF

// navigate
$navigation = $page->navigate('http://example.com');
// wait for the page to be loaded
$navigation->waitForNavigation();
$options = [
    'landscape'           => true,             // default to false
    'printBackground'     => true,             // default to false
    'displayHeaderFooter' => true,             // default to false
    'preferCSSPageSize'   => true,             // default to false (reads parameters directly from @page)
    'marginTop'           => 0.0,              // defaults to ~0.4 (must be a float, value in inches)
    'marginBottom'        => 1.4,              // defaults to ~0.4 (must be a float, value in inches)
    'marginLeft'          => 5.0,              // defaults to ~0.4 (must be a float, value in inches)
    'marginRight'         => 1.0,              // defaults to ~0.4 (must be a float, value in inches)
    'paperWidth'          => 6.0,              // defaults to 8.5 (must be a float, value in inches)
    'paperHeight'         => 6.0,              // defaults to 11.0 (must be a float, value in inches)
    'headerTemplate'      => '<div>foo</div>', // see details below
    'footerTemplate'      => '<div>foo</div>', // see details below
    'scale'               => 1.2,              // defaults to 1.0 (must be a float)
];
// print as pdf (in memory binaries)
$pdf = $page->pdf($options);
// save the pdf
$pdf->saveToFile('/some/place/file.pdf');
// or directly output pdf without saving
header('Content-Description: File Transfer');
header('Content-Type: application/pdf');
header('Content-Disposition: inline; filename=filename.pdf');
header('Content-Transfer-Encoding: binary');
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
echo base64_decode($pdf->getBase64());

Options headerTemplate and footerTemplate:

Should be valid HTML markup with the following classes used to inject printing values into them:

  • date: formatted print date
  • title: document title
  • url: document location
  • pageNumber: current page number
  • totalPages: total pages in the document
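For instance, a footer showing "label page / total" could be assembled as sketched below. The helper name footerTemplate is hypothetical, and the inline font size is an assumption: Chrome tends to render these templates very small unless a size is set explicitly, so you may need to adjust it.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: build a footer template showing "label  n / total".
 * The pageNumber and totalPages classes come from the list above.
 */
function footerTemplate(string $label): string
{
    $safeLabel = \htmlspecialchars($label, ENT_QUOTES, 'UTF-8');

    return '<div style="font-size: 8px; width: 100%; text-align: center;">'
        . $safeLabel
        . ' <span class="pageNumber"></span> / <span class="totalPages"></span>'
        . '</div>';
}

$options = [
    'displayHeaderFooter' => true,
    'headerTemplate'      => '<div></div>',            // empty header
    'footerTemplate'      => footerTemplate('Report'),
    'marginBottom'        => 0.6,                      // inches, leaves room for the footer
];
// $page->pdf($options)->saveToFile('/some/place/file.pdf');
echo $options['footerTemplate'], PHP_EOL;
```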

Save downloads

You can set the path to save downloaded files.

// After creating a page.
$page->setDownloadPath('/path/to/save/downloaded/files');

Mouse API

The mouse API is dependent on the page instance and allows you to control the mouse's moves, clicks and scroll.

use HeadlessChromium\Input\Mouse;
$page->mouse()
    ->move(10, 20)                              // Moves mouse to position x=10; y=20
    ->click()                                   // left-click on position set above
    ->move(100, 200, ['steps' => 5])            // move mouse to x=100; y=200 in 5 equal steps
    ->click(['button' => Mouse::BUTTON_RIGHT]); // right-click on position set above
// given the last click was on a link, the next step will wait
// for the page to load after the link was clicked
$page->waitForReload();

You can emulate the mouse wheel to scroll up and down in a page, frame, or element.

$page->mouse()
    ->scrollDown(100) // scroll down 100px
    ->scrollUp(50);   // scroll up 50px

Finding elements

The find method will search for elements using querySelector and move the cursor to a random position over the matched element.

use HeadlessChromium\Exception\ElementNotFoundException;
try {
    $page->mouse()->find('#a')->click(); // find and click at an element with id "a"
    $page->mouse()->find('.a', 10);      // find the 10th or last element with class "a"
} catch (ElementNotFoundException $exception) {
    // element not found
}

This method will attempt to scroll right and down to bring the element to the visible screen. If the element is inside an internal scrollable section, try moving the mouse to inside that section first.

Keyboard API

The keyboard API is dependent on the page instance and allows you to type like a real user.

$page->keyboard()
    ->typeRawKey('Tab') // type a raw key, such as Tab
    ->typeText('bar');  // type the text "bar"

To impersonate a real user you may want to add a delay between each keystroke using the setKeyInterval method:

$page->keyboard()->setKeyInterval(10); // sets a delay of 10 milliseconds between keystrokes

Key combinations

The methods press, type, and release can be used to send key combinations such as ctrl + v.

// ctrl + a to select all text
$page->keyboard()
    ->press('control') // key names are case insensitive and trimmed
        ->type('a')    // press and release
    ->release('Control');
// ctrl + c to copy and ctrl + v to paste
$page->keyboard()
    ->press('Ctrl') // alias for Control
        ->type('c')
        ->type('V') // upper and lower cases should behave the same way
    ->release();    // release all

You can press the same key several times in sequence; this is the equivalent of a user pressing and holding the key. The release event, however, will be sent only once per key.

Key aliases

| Key     | Aliases             |
|---------|---------------------|
| Control | Control, Ctrl, Ctr  |
| Alt     | Alt, AltGr, Alt Gr  |
| Meta    | Meta, Command, Cmd  |
| Shift   | Shift               |
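To make the alias rules concrete, the lookup could be pictured as the small function below. It is purely illustrative and is not the library's own implementation; canonicalKey is a hypothetical name, and the mapping is copied from the table above together with the "case insensitive and trimmed" rule.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative only: map a key name or alias to its canonical modifier name,
 * following the alias table above. Not the library's own implementation.
 */
function canonicalKey(string $key): string
{
    static $aliases = [
        'control' => 'Control', 'ctrl'    => 'Control', 'ctr'    => 'Control',
        'alt'     => 'Alt',     'altgr'   => 'Alt',     'alt gr' => 'Alt',
        'meta'    => 'Meta',    'command' => 'Meta',    'cmd'    => 'Meta',
        'shift'   => 'Shift',
    ];
    $normalized = \strtolower(\trim($key));

    // Unknown names are returned trimmed but otherwise unchanged.
    return $aliases[$normalized] ?? \trim($key);
}

echo canonicalKey(' CTRL '), PHP_EOL; // prints "Control"
echo canonicalKey('Cmd'), PHP_EOL;    // prints "Meta"
echo canonicalKey('a'), PHP_EOL;      // prints "a"
```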

Cookie API

You can set and get cookies for a page:

Set Cookie

use HeadlessChromium\Cookies\Cookie;
$page = $browser->createPage();
// example 1: set cookies for a given domain
$page->setCookies([
    Cookie::create('name', 'value', [
        'domain'  => 'example.com',
        'expires' => time() + 3600 // expires in 1 hour
    ])
])->await();
// example 2: set cookies for the current page
$page->navigate('http://example.com')->waitForNavigation();
$page->setCookies([
    Cookie::create('name', 'value', [
        'expires' => time() + 3600 // expires in 1 hour
    ])
])->await();

Get Cookies

use HeadlessChromium\Cookies\Cookie;
$page = $browser->createPage();
// example 1: get all cookies for the browser
$cookies = $page->getAllCookies();
// example 2: get cookies for the current page
$page->navigate('http://example.com')->waitForNavigation();
$cookies = $page->getCookies();
// filter cookies with name == 'foo'
$cookiesFoo = $cookies->filterBy('name', 'foo');
// find first cookie with name == 'bar'
$cookieBar = $cookies->findOneBy('name', 'bar');
if ($cookieBar) {
    // do something
}

Set user agent

You can set up a user-agent per page:

$page->setUserAgent('my user-agent');

See also the BrowserFactory option userAgent to set it for the whole browser.

Advanced usage

The library ships with tools that hide all the communication logic, but you can use the tools used internally to communicate directly with the Chrome debug protocol.

Example:

use HeadlessChromium\Communication\Connection;
use HeadlessChromium\Communication\Message;
// Chrome devtools URI
$webSocketUri = 'ws://127.0.0.1:9222/devtools/browser/xxx';
// create a connection
$connection = new Connection($webSocketUri);
$connection->connect();
// send method "Target.activateTarget"
$responseReader = $connection->sendMessage(new Message('Target.activateTarget', ['targetId' => 'xxx']));
// wait up to 1000ms for a response
$response = $responseReader->waitForResponse(1000);

Create a session and send a message to the target

// given a target id
$targetId = 'yyy';
// create a session for this target (attachToTarget)
$session = $connection->createSession($targetId);
// send message to this target (Target.sendMessageToTarget)
$response = $session->sendMessageSync(new Message('Page.reload'));

Debugging

You can ease the debugging by setting a delay before each operation is made:

$connection->setConnectionDelay(500); // wait for 500ms between each operation to ease debugging

Browser (standalone)

use HeadlessChromium\Communication\Connection;
use HeadlessChromium\Browser;
// Chrome devtools URI
$webSocketUri = 'ws://127.0.0.1:9222/devtools/browser/xxx';
// create connection given a WebSocket URI
$connection = new Connection($webSocketUri);
$connection->connect();
// create browser
$browser = new Browser($connection);

Interacting with DOM

Find one element on a page by CSS selector:

$page = $browser->createPage();
$page->navigate('http://example.com')->waitForNavigation();
$elem = $page->dom()->querySelector('#index_email');

Find all elements inside another element by CSS selector:

$elem = $page->dom()->querySelector('#index_email');
$elem->querySelectorAll('a.link');

Find all elements on a page by XPath selector:

$page = $browser->createPage();
$page->navigate('http://example.com')->waitForNavigation();
$elem = $page->dom()->search('//div/*/a');

Wait for an element by CSS selector:

$page = $browser->createPage();
$page->navigate('http://example.com')->waitForNavigation();
$page->waitUntilContainsElement('div[data-name="el"]');

If a string is passed to Page::waitUntilContainsElement, an instance of CSSSelector is created for you by Page::waitForElement. To use other selectors, you can pass an instance of the required Selector.

Wait for element by XPath selector:

use HeadlessChromium\Dom\Selector\XPathSelector;
$page = $browser->createPage();
$page->navigate('http://example.com')->waitForNavigation();
$page->waitUntilContainsElement(new XPathSelector('//div[contains(text(), "content")]'));

You can type text into an element or click on it:

$elem->click();
$elem->sendKeys('Sample text');

You can upload a file through a file input:

$elem->sendFile('/path/to/file');

You can get element text or attribute:

$text = $elem->getText();
$attr = $elem->getAttribute('class');

Contributing

See CONTRIBUTING.md for contribution details.

License

This project is licensed under the MIT License (MIT).

