daveb2

Pest V4 browser testing - unreliable

Hi all,

Browser testing is awesome, and very powerful. I love being able to script actions and save screenshots.

I'd love to be able to use browser tests, maybe even in some basic CI like GitHub workflows, but for some reason they are just incredibly unreliable for me.

I've worked through everything I can find in online guides, with the LLMs, etc. I pare down my browser test scripts and test by adding one line at a time. I'll get some 20-line script working and then it will randomly start failing, and I'll try reducing it back to a few lines and it's still not working... very frustrating. I see things like the screenshots being correct but the script exiting with auth errors, or timeouts, etc.

How are people finding browser testing - is it as flaky as my experience, or is it generally reliable?

Tray2

I would say that it's generally reliable. Most likely, things sometimes take a little longer to run or render than you would expect. It could be a DB query or an API hit that causes the delay, and if something is missing when you try to access it, the result can look like random errors.

I've been using Cypress and Playwright to run browser tests, and I had to add extra delays in multiple places to get the tests green every time.
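
To illustrate the idea of waiting for slow content, here is a minimal sketch in plain Python rather than Pest, Cypress, or Playwright code. It polls for a condition up to a timeout instead of sleeping for a fixed time, which is one common way to add the kind of delay described above. The helper name `wait_until` and its parameters are hypothetical and not part of any of those tools.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Returns True if the condition was met in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Simulate content that only becomes available after 0.3 seconds.
    started = time.monotonic()
    ready = wait_until(lambda: time.monotonic() - started > 0.3, timeout=2.0)
    print("ready" if ready else "timed out")
```

Compared with a fixed pause, a polling wait returns as soon as the condition holds and fails with a clear timeout when it never does, so a slow query or API call is less likely to look like a random error.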

ghabriel25

For each action that requires some time to finish, you should always add a pause before doing anything else.

But always remember, the purpose of browser testing or unit testing is to check that something you expect to happen actually happens.

daveb2

I set up a script to run tests repeatedly and catch success/failure, and I get results like:

Success: 70  Failures: 3752  Total: 3822  Error Rate: 98.168%
Success: 3  Failures: 9  Total: 12  Error Rate: 75.000%
Success: 8  Failures: 8  Total: 16  Error Rate: 50.000%
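
The original runner script is not shown in this thread, so the following is only a rough Python sketch of how such a repeat-and-tally loop could work. The function names are hypothetical, and the `./vendor/bin/pest` path is an assumption about a typical Composer install; the `--filter` value is the one quoted later in this post. The sketch does not re-import the database between runs.

```python
import subprocess
from typing import Callable, Sequence, Tuple


def run_command(cmd: Sequence[str], timeout: float = 120.0) -> bool:
    """Run one test invocation; True only if it exits with code 0 in time."""
    try:
        # On timeout, subprocess.run kills the direct child process.
        # Grandchild processes may still survive and need separate cleanup.
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def repeat(run_once: Callable[[], bool], iterations: int) -> Tuple[int, int]:
    """Call `run_once` repeatedly and return (successes, failures)."""
    successes = failures = 0
    for _ in range(iterations):
        if run_once():
            successes += 1
        else:
            failures += 1
    return successes, failures


def format_summary(successes: int, failures: int) -> str:
    """Format the tally in the same shape as the results shown above."""
    total = successes + failures
    rate = 100.0 * failures / total if total else 0.0
    return (f"Success: {successes}  Failures: {failures}  "
            f"Total: {total}  Error Rate: {rate:.3f}%")


if __name__ == "__main__":
    # Assumed command line; adjust to the project's actual Pest binary.
    cmd = ["./vendor/bin/pest", "--filter=provider can create first report"]
    ok, bad = repeat(lambda: run_command(cmd), iterations=10)
    print(format_summary(ok, bad))
```

Putting a timeout on each run is a deliberate choice here: a hung run is counted as a failure instead of blocking the loop, which may help when processes start piling up.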

That big one with the 70 successes I left running for a couple of hours. When I came back, the failure count was ticking up by about 1 per second and there were a ton of child php processes in ps -ef f; I had to restart the machine.

After a while the MariaDB process gets overloaded or something, and SQL connections appear to start dropping.

I try to isolate as many variables as possible and to make the tests as repeatable as possible: before each test run, outside of any Pest scripts, I drop all tables from the test database and re-import the same SQL dump.
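
As a hedged illustration of that reset step, the Python sketch below recreates the test database and re-imports a dump through the `mysql` command-line client. It drops and recreates the whole database rather than dropping individual tables, which is a swapped-in simplification. The database name `app_testing`, the dump path, and the function names are hypothetical, and the sketch assumes the client reads its credentials from an option file such as `~/.my.cnf`.

```python
import re
import subprocess
from typing import List


def reset_commands(database: str) -> List[List[str]]:
    """Build the two client invocations: recreate the database, then import."""
    # Only allow simple names so the value is safe to embed in SQL.
    if not re.fullmatch(r"[A-Za-z0-9_]+", database):
        raise ValueError(f"unsafe database name: {database!r}")
    recreate = (f"DROP DATABASE IF EXISTS `{database}`; "
                f"CREATE DATABASE `{database}`;")
    return [
        ["mysql", f"--execute={recreate}"],
        ["mysql", database],  # the dump is fed to this command on stdin
    ]


def reset_database(database: str, dump_path: str) -> None:
    """Recreate `database` and load the SQL dump found at `dump_path`."""
    recreate_cmd, import_cmd = reset_commands(database)
    subprocess.run(recreate_cmd, check=True)
    with open(dump_path, "rb") as dump:
        subprocess.run(import_cmd, stdin=dump, check=True)


if __name__ == "__main__":
    # Hypothetical names; replace with the real test database and dump file.
    reset_database("app_testing", "tests/fixtures/dump.sql")
```

Because `check=True` raises on a non-zero exit code, a failed reset stops the run immediately instead of letting a test start against a half-imported database.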

Here's what my test script looks like:

By using --filter="provider can create first report" I'm limiting the test runs to just the first test, which modifies the database (hence the need to re-import each time).

Here is the "provider can create first report" test:

Nothing overly complex. But I have tried trimming this back to just the first 1 or 2 screenshots, and I get similar results - it either completes or doesn't complete with about the same frequency.

I'm stumped with this one. I don't think browser tests are going to work for me - I'll probably have to go back to ordinary feature tests and perform the same testing manually before I push updates.

ghabriel25

You're using one test for too many features. I suggest splitting it into separate tests within one test file:

  • show login page
  • user can login
  • user can access the specific page
  • user can create client
  • user can create report
  • etc
daveb2

Thank you for the feedback. In that case I think this form of testing is not for me, because I would need many dozens of highly coupled functions (i.e., the order of execution matters because many of the tests modify the state of the system).
