Provide a way to stringify your tree, then test that stringification against some $expected. If this means you have to write a stringify function, that's not a bad thing. If your tests want to do it, then your users will, as well.
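To make that concrete, here is a minimal sketch of the idea. The tree layout (hash nodes with a `value` and an optional `children` array), the `stringify` function, and its output format are all assumptions made up for illustration; your tree will likely look different.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical tree layout: each node is a hash ref with a 'value'
# and an optional array ref of 'children'.
sub stringify {
    my ($node) = @_;

    # Recursively stringify the children, if any.
    my @kids = map { stringify($_) } @{ $node->{children} || [] };

    # A leaf is just its value; an inner node appends its children in parens.
    return @kids
        ? $node->{value} . '(' . join( ',', @kids ) . ')'
        : $node->{value};
}

my $tree = {
    value    => 'a',
    children => [
        { value => 'b', children => [ { value => 'd' } ] },
        { value => 'c' },
    ],
};

my $expected = 'a(b(d),c)';

# One comparison checks the whole shape of the tree at once.
is( stringify($tree), $expected, 'tree stringifies as expected' );
is( stringify( { value => 'x' } ), 'x', 'single leaf stringifies to its value' );
```

A failing test then shows the got/expected strings side by side, which tends to be easier to read than a long run of per-node checks.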
Your answer seems to imply that you suggest sticking with the "abbreviated" test flavour. Is that correct? And if so, why? I'm looking for reasons to take one path or the other, so forgive me for all these questions.
Are you wondering how many ok() calls to make or how many scenarios to test? It doesn't matter how many ok() calls you make per test scenario so long as you test enough scenarios.
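One way to keep the focus on scenarios rather than on individual calls is a table of cases with a single check each. This is only a sketch: `count_nodes` and the hash-based tree layout are hypothetical stand-ins for whatever your code actually does.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test: counts the nodes in a tree whose
# nodes are hash refs with an optional array ref of 'children'.
sub count_nodes {
    my ($node) = @_;
    my $total = 1;
    $total += count_nodes($_) for @{ $node->{children} || [] };
    return $total;
}

# Each scenario: [ description, input tree, expected node count ].
my @scenarios = (
    [ 'single leaf', { value => 'a' }, 1 ],
    [ 'one child', { value => 'a', children => [ { value => 'b' } ] }, 2 ],
    [
        'nested children',
        {
            value    => 'a',
            children => [
                { value => 'b', children => [ { value => 'd' } ] },
                { value => 'c' },
            ],
        },
        4,
    ],
);

# The plan follows the number of scenarios, so adding a case is one new row.
plan tests => scalar @scenarios;

for my $scenario (@scenarios) {
    my ( $name, $tree, $expected ) = @$scenario;
    is( count_nodes($tree), $expected, $name );
}
```

Whether each row ends up as one ok() call or five is a detail; what matters is that the table covers the cases you care about.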