We talked about the beginners’ rules; now let’s do something for real.

Warming up

Install Node.js 10.x or higher

Install Yarn Package Manager

In the terminal, execute the following commands:

mkdir hello_puppeteer

cd hello_puppeteer

yarn init

yarn add puppeteer

The last command takes a while; it installs Puppeteer alongside all dependencies (like Chromium).

Create a new JavaScript file named google.js, then fill it with the following code:

const puppeteer = require("puppeteer");

(async () => {

// launch the browser

const browser = await puppeteer.launch({

headless: false,

// devtools: false,

// args: ["--start-maximized"]

});

// create a new tab

const page = await browser.newPage();

})();

puppeteer.launch launches a new instance of the Chromium browser. If you look at the code, you see that I wrote headless: false, because I’d like to see what’s happening under the hood.

browser.newPage creates a new tab inside the browser.

I wrapped the whole code inside an async function because most Puppeteer methods are asynchronous: they return Promises, so you need await to get their results.
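The script above never closes the browser, so the Chromium window stays open after it finishes. Here is a minimal sketch of one way to handle that. The helper name withBrowser is hypothetical (it is not part of Puppeteer), and it receives the puppeteer module as an argument:

```javascript
// Hypothetical helper (not part of Puppeteer): launch a browser, hand a new
// tab to `task`, and close the browser even if `task` throws.
async function withBrowser(puppeteer, task, launchOptions = { headless: false }) {
  const browser = await puppeteer.launch(launchOptions);
  try {
    const page = await browser.newPage();
    // Whatever `task` resolves to becomes the result of withBrowser.
    return await task(page);
  } finally {
    // Runs on success and on failure, so no Chromium process is left behind.
    await browser.close();
  }
}
```

It could be called as withBrowser(require("puppeteer"), async (page) => { /* work with the page */ }).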

You can run the script with node google.js in the terminal, and you should see this:

Result of the script

Now we want to navigate to Google.com. To do this, add the following code right after the const page = ... line.

// navigates to google.com

await page.goto('https://google.com');

Now we want to search for something on Google.com. First, inspect the search input with the Web Developer Tools (right-click the search input and click Inspect Element).

Inspect element in Web Developer Tools

You need to find a stable CSS selector for the element to work with it in Puppeteer. This input element has several attributes, but I considered two:

class: Unfortunately, it’s dynamic, so we cannot rely on it.

name: The value of the name attribute is unique on the whole page. It’s the right candidate!

Try your CSS selector in the Console tab of the Web Developer Tools. You can test it there with document.querySelector or document.querySelectorAll.
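The same check can also be done from the Puppeteer script itself. This is a minimal sketch, assuming page is a Puppeteer page object; the helper names countMatches and isUniqueSelector are hypothetical. It relies on page.$$, which is Puppeteer's counterpart of document.querySelectorAll:

```javascript
// Hypothetical helper: resolves to the number of elements matching `selector`.
// `page` is expected to be a Puppeteer Page.
async function countMatches(page, selector) {
  const handles = await page.$$(selector);
  return handles.length;
}

// A selector is a safe target when it matches exactly one element.
async function isUniqueSelector(page, selector) {
  return (await countMatches(page, selector)) === 1;
}
```

For example, await isUniqueSelector(page, 'input[name=q]') should resolve to true when the name attribute really is unique on the page.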

My CSS selector is input[name=q] and it works well. Now it’s time to type something.

await page.focus('input[name=q]');

/* or you can first find the element, then focus it:

const searchInput = await page.$('input[name=q]');

await searchInput.focus();

*/

💡 In the page API, $ is equivalent to document.querySelector and $$ is equivalent to document.querySelectorAll.

To make Puppeteer type inside an input, we need to focus it first. There are two APIs to do it. They both have the same result; which one you use depends on the situation.

Typing inside the webpage is easy with the keyboard API.

await page.keyboard.type('Puppeteer');

Type in action
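The focus and type steps can be bundled together. Here is a minimal sketch, assuming page is a Puppeteer page object; the helper name typeInto is hypothetical. It also waits for the input to exist first, which is my own addition and not something the steps above require:

```javascript
// Hypothetical helper: wait for the input, focus it, then type `text`.
// `delay` is the pause in milliseconds between key presses (0 = no pause).
async function typeInto(page, selector, text, delay = 0) {
  await page.waitForSelector(selector); // make sure the input is rendered
  await page.focus(selector);           // keyboard events go to the focused element
  await page.keyboard.type(text, { delay });
}
```

With it, the search box could be filled with await typeInto(page, 'input[name=q]', 'Puppeteer').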

Now, how to get the search results?

People usually press Enter or click on the Google Search button. To emulate them you can do:

// press the keyboard Enter button

await page.keyboard.press("Enter");

// or you can first focus on the search button, then click it

// const searchButton = await page.$('input[name=btnK]');

// await searchButton.click();
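Submitting the search loads a new page, so the script usually has to wait for that page before reading any results. This is a minimal sketch of a common pattern, assuming page is a Puppeteer page object; the helper name submitWithEnter is hypothetical. The wait is registered before the key press so the navigation cannot be missed:

```javascript
// Hypothetical helper: press Enter and wait for the navigation it triggers.
// Promise.all starts listening for the navigation before the key is pressed.
async function submitWithEnter(page) {
  const [response] = await Promise.all([
    page.waitForNavigation(),
    page.keyboard.press("Enter"),
  ]);
  // `response` is the main response of the page that was loaded.
  return response;
}
```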

We learned the basics, but there are more things you need to know.

Ask Puppeteer to wait for something

Consider these scenarios:

An Ajax request after clicking on a button

Example of uploading a file for file-input:

const fileInput = await page.$('input[type=file]');

const uploadButton = await page.$('#upload-btn');

await fileInput.uploadFile("d:/image.jpg");

await uploadButton.click();

// now wait for the upload response

await page.waitForResponse('https://example.com/services/upload/');

// find the dynamically generated link after the Ajax upload

const fileUrlElement = await page.$('a.file-url');

Waiting for a couple of seconds

await page.waitFor(1000); // in milliseconds

Waiting for page navigation

await page.waitForNavigation();

Waiting for a specific CSS selector to be rendered

await page.waitForSelector('css selector');
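A few of these waits can be wrapped in small helpers. The sketch below assumes page is a Puppeteer page object, and the helper names sleep, clickAndWaitForResponse, and waitAndReadHref are hypothetical. Two caveats motivated it: page.waitFor was deprecated in later Puppeteer releases, so a plain-JavaScript pause is a safer fallback, and when a wait starts only after the click, a very fast response could arrive before Puppeteer begins listening.

```javascript
// Plain-JavaScript pause that does not depend on any Puppeteer API.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: click `selector` and wait for a response whose URL
// contains `urlPart`. The wait is registered before the click, so a fast
// response cannot slip through.
async function clickAndWaitForResponse(page, selector, urlPart) {
  const [response] = await Promise.all([
    page.waitForResponse((res) => res.url().includes(urlPart)),
    page.click(selector),
  ]);
  return response;
}

// Hypothetical helper: wait until `selector` is rendered, then read its href.
// `timeout` is in milliseconds; waitForSelector rejects when it is exceeded.
async function waitAndReadHref(page, selector, timeout = 5000) {
  await page.waitForSelector(selector, { timeout });
  // The callback runs inside the browser, on the matched element.
  return page.$eval(selector, (el) => el.href);
}
```

In the upload scenario, this could look like await clickAndWaitForResponse(page, '#upload-btn', '/services/upload/') followed by await waitAndReadHref(page, 'a.file-url').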

Evaluate scripts in the browser

I faced a scenario where, after uploading the file with the uploadFile method, the page wouldn’t start the Ajax upload process. In that case, I needed to trigger a jQuery event myself to get the job done.

const input = await page.$('input[type=file]');

await input.uploadFile(fileToUpload);

await page.evaluate(

element => $(element).trigger("custom_event"),

input

);

In this example, $(element).trigger("...") is the JS code that executes inside the browser (not inside the Puppeteer script), so the page itself must have jQuery loaded.
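When the page does not use jQuery, a native DOM event may do the same job. This is a minimal sketch of that swapped-in technique, not the jQuery approach shown above; the helper name fireEvent is hypothetical, and whether the page reacts to a "change" event depends on how the page was written:

```javascript
// Hypothetical helper: fire a native DOM event on an element handle.
// The arrow function runs inside the browser page, where `element` is the
// real DOM node behind the handle; extra arguments are passed along to it.
async function fireEvent(page, elementHandle, eventName = "change") {
  return page.evaluate(
    (element, name) => element.dispatchEvent(new Event(name, { bubbles: true })),
    elementHandle,
    eventName
  );
}
```

After uploadFile, it could be called as await fireEvent(page, input).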

What else can you do?

In the first example, I used headless: false to see what happens under the hood. Usually, it’s better to run in headless mode, because the script and the browser then run in the background and won’t interrupt you with any open windows.

But how can you see the results if you run in headless mode? How about taking a screenshot!