It seems like identical text/instructions are cached, so you get the same result each time. It's just a hack, but adding a data box with the current date and time fixes it, as expected, since the inputs are no longer the same. I wouldn't rely on this for true randomness, though.
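To show why the hack works, here's a rough sketch of the kind of caching I'm guessing at. I have no idea how the tool actually caches things, so everything here (`cache_key`, `run_step`, `with_timestamp`, the in-memory dict) is made up for illustration. The idea is just that the cache is keyed on the instructions plus the inputs, so any change to the inputs forces a fresh run.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical in-memory cache: key -> generated result.
# This is an assumption about how such a tool *might* cache, not its real behavior.
_cache = {}


def cache_key(instructions, inputs):
    """Hash the instructions together with all input values."""
    payload = json.dumps(
        {"instructions": instructions, "inputs": inputs}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(instructions, inputs, generate):
    """Call generate() only when this exact instructions+inputs pair is new."""
    key = cache_key(instructions, inputs)
    if key not in _cache:
        _cache[key] = generate(instructions, inputs)
    return _cache[key]


def with_timestamp(inputs):
    """The hack: add the current date+time so the inputs differ between runs."""
    stamped = dict(inputs)
    stamped["now"] = datetime.now(timezone.utc).isoformat()
    return stamped
```

Calling `run_step(instructions, inputs, generate)` twice with the same arguments returns the stored result the second time, while `run_step(instructions, with_timestamp(inputs), generate)` gets a new key and so a fresh generation. Note the timestamp only busts the cache; whatever randomness you get still comes from the model itself.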
It's interesting that the instructions sometimes specifically dictate JSON output and other times don't. Even without changing the prompt, this part (the generated series of steps for each instruction) seems to be pretty random each time you run it. Or maybe it's cached for a while. It would be really nice to have a checkbox to request structured or unstructured output from instructions, or better yet, a way to lock in a set of derived steps you're happy with.