Smoke-testing scenes, using Jenkins and Unity Test Framework

You may be familiar with Unity’s Test Runner window, where you can execute tests and see results. This is the user-facing part of the Unity Test Framework, which is a very extensible system for running tests of any kind. At Roll7 I recently set up the test runner to automatically run simple smoketests on every level of our (unannounced) game, and made jenkins report on failures. In this post I’ll outline how I did the former, and in part two I’ll cover the later.

Play mode tests, automatically generated for every level
Some of our [redacted] playmode and editmode tests, running on Jenkins

(I’m going to assume you have passing knowledge of how to write tests for the test runner)

(a lot of this post is based on this interesting talk about UTF from its creators at Unite 2019)

The UTF is built upon NUnit, a .net testing framework. That’s what provides all those [TestFixture] and [Test] attributes. One feature of NUnit that UTF also supports is [TestFixtureSource]. This attribute allows you to make a sort of “meta testfixture”, a template for how to make test fixtures for specific resources. If you’re familiar with parameterized tests, it’s like that but on a fixture level.

We’re going to make a TestFixtureSource provider that finds all level scenes in our project, and then the TestFixutreSource itself that loads a specific level and runs some generic smoke tests on it. The end result is that adding a new level will automatically add an entry for it to the play mode tests list.

There’s a few options for different source providers (see the NUnit docs for more), but we’re going to make an IEnumerable that finds all our level scenes. The results of this IEnumerable are what gets passed to our constructor – you could use any type here.

class AllRequiredLevelsProvider : IEnumerable<string>
{
    IEnumerator<string> IEnumerable<string>.GetEnumerator()
    {
        var allLevelGUIDs = AssetDatabase.FindAssets("t:Scene", new[] {"Assets/Scenes/Levels"} );
        foreach(var levelGUID in allLevelGUIDs)
        {
            var levelPath = AssetDatabase.GUIDToAssetPath(levelGUID);
            yield return levelPath;
        }
    }
    public IEnumerator GetEnumerator() => (this as IEnumerable<string>).GetEnumerator();
}

Our TestFixture looks like a regular fixture, except also with the source attribute linking to our provider. Its constructor takes a string that defines which level to load.

[TestFixtureSource(typeof(AllRequiredLevelsProvider))]
public class LevelSmokeTests
{
    private string m_levelToSmoke;
    public LevelSmokeTests(string levelToSmoke)
    {
        m_levelToSmoke = levelToSmoke;
    }

Now our fixture knows which level to test, but not how to load it. TestFixtures have a [SetUp] attribute which runs before each test, but loading the level fresh for each test would be slow and wasteful. Instead let’s use [OneTimeSetup] (đź‘€ at the inconsistent capitalisation) and to load and unload our level for each fixture. This depends somewhat on your game implementation, but for now let’s go with UnityEngine.SceneManagement:

// class LevelSmokeTests {
    [OneTimeSetUp]
    public void LoadScene()
    {
        SceneManager.LoadScene(m_levelToSmoke);
    }

Finally, we need some tests that would work on any level we throw at it. The simplest approach is probably to just watch the console for errors as we load in, sit in the level, and then as we load out. Any console errors at any of these stages should fail the test.

UTF provides LogAsset to validate the output of the log, but at this time it only lets you prescribe what should appear. We don’t care about Debug.Log() output, but want to know if there was anything worse than that. Particularly, in our case, we’d like to fail for warnings as well as errors. Too many “benign” warnings can hide serious issues! So, here’s a little utility class called LogSeverityTracker, that helps check for clean consoles. Check the comments for usage.

Our tests can use the [Order] attribute to ensure they happen in sequence:

// class LevelSmokeTests {
    [Test, Order(1)]
    public void LoadsCleanly()
    {
        m_logTracker.AssertCleanLog();
    }

    [UnityTest, Order(2)]
    public IEnumerator RunsCleanly()
    {
        // wait some arbitrary time
        yield return new WaitForSeconds(5);
        m_logTracker.AssertCleanLog();
    }

    [UnityTest, Order(3)]
    public IEnumerator UnloadsCleanly()
    {
        // how you unload is game-dependent 
        yield return SceneManager.LoadSceneAsync("mainmenu");
        m_logTracker.AssertCleanLog();
    }

Now we’re at the point where you can hit Run All in the Test Runner and see each of your levels load in turn, wait a while, then unload. You’ll get failed tests for console warnings or errors, and newly-added levels will get automatically-generated test fixtures.

More tests are undoubtedly more useful than less. Depending on the complexity and setup of your game, the next steps might be to get the player to dumbly walk around for a little bit. You can get a surprising amount of info from a dumb walk!

In part 2, I’ll outline how I added all this to jenkins. It’s not egregiously hard, but it can be a bit cryptic at times.

Advertisement

DecisionFlex

DecisionFlex considerations

For a while I’ve been working on an AI plugin for Unity3D called DecisionFlex. It’s a decision-making tool for your games, based on Utility Theory. It’s great for when you have multiple actions to choose between, and lots of factors to take into consideration. For instance:

  • A soldier choosing its next target, based on target range, type and the soldier’s ammo count
  • A Sims-like human choosing to eat, drink, work or exercise based on hunger, thirst, wealth and fitness
  • A bot choosing to prioritise picking up a health pack, based on the bot’s HP and distance to the nearest pack
  • Any time you might find yourself writing highly nested or cascading if-then statements

DecisionFlex editor-based and data-driven, so you can construct new decisions and actions, and tweak considerations, without diving into code. The code to hook your game up to DecisionFlex is minimal and you don’t need to understand complex equations.

DecisionFlex isn’t quite ready to ship yet, but I’m ready to start talking about it. You can find more information and a web demo here:

http://www.tenpn.com/decisionflex.html

How to build Unity3D scripts from the command line

I use emacs for day-to-day programming, including writing unity scripts. For a while I made changes then tabbed over to the unity editor to watch them compile and check for errors. This was, imaginably, tedious. Anyone not coding in MonoDevelop has the same issue, and even the MonoDevelop folks (god have mercy upon their souls) have issues.

It turns out there are multiple ways to compile your unity scripts from the command line, in such a way that most editors can be told to do it and parse the output for errors, greatly speeding up iteration time. I don’t think there’s anything new here, I’m just collating information that seems spread out across the net like horcruxes.

The first way to build your script files is the easiest to get working, but is cumbersome and slow. We’ll just pass a range of cryptic command line arguments to our Unity executable, it’ll launch without an editor front end, compile the scripts, dump any errors to the console, and quit. Cue up some regex and your editor can pull in the errors. Here is the command:

    path/to/unity -batchmode -logFile -projectPath path/to/project/dir -quit

On OSX you can find the unity executable at /Applications/Unity/Unity.app/Contents/MacOS/Unity.

You can read more about unity command line arguments in the unity docs, but let’s go through the options here:

  • -batchmode: stops any front end from loading
  • -logFile: tells unity what to do with any output. Without this parameter it’s all thrown away, which is of no use to our text editors. The docs above will tell you this argument needs a file parameter, but it’s undocumented that if you omit the parameter, all output, including errors, gets spat out to the terminal. That’s just what we want!
  • -projectPath: the path to the root of your project, where you find the sln file and the Assets directory
  • -quit: tells unity to exit straight after compiling.

This works on every platform, and has the nice feature of pulling in new files before compiling, so we don’t need a separate step for that. However it is slow as a milk cart going uphill, because it has to load and boot a lot of the unity runtime before it can do the simple thing of finding mono and asking it to compile a solution file for us.

It is also annoying because it doesn’t work if the unity editor is currently open. Even when in a heavy coding session there’s normally a need to run the game or tweak inspector values at regular intervals, and opening and closing the editor is not going to help you stay in a nice flow. Having said that, this drawback might not matter on a continuous integration server, so it might be the way to go.

So since this is just asking mono to compile for us in a round-about way, how do we cut out the middle man? The mono command-line compilation tool is called mdtool, and it ships with Unity. On OSX you can find it at /Applications/Unity/MonoDevelop.app/Contents/MacOS/mdtool. Here’s the command to give it:

    path/to/mdtool build path/to/project/project.sln

Here we just need to give mdtool the command build, and the path to the sln file. This is super-quick compared to the Unity approach, and works with the editor still open, but unfortunately won’t pick up newly added files. You’ll still have to tab to the editor for those to be picked up. However that’s relatively uncommon, so isn’t too much of a bother.

However when I tried this with the mdtool that ships with unity, I got very strange errors. They used to be relatively compact, but nowadays there are segfaults and all kinds of stuff going on. I did some casual googling over a few weeks, and couldn’t find a solution. But there was a workaround: install the latest Xamarin Studio (a free mono dev environment based on MonoDevelop), and use its mdtool. On OSX that’s at /Applications/Xamarin\ Studio.app/Contents/MacOS/mdtool.

So there you go: with one of these two approaches, you should be able to compile your unity scripts on the command line, find the errors and connect them to your editor of choice. These should all work on windows as well, but if someone can confirm in the comments that would be great.

If you’re interested, my emacs unity helper functions, including some flycheck compilers using both mdtool and unity, can be found on github.

Anatomy of a unit test

In my spare time I maintain a unit testing library built for Unity3D. It’s called UnTest, it’s open-sourced under the MIT license, and you can download it here.

In the Unity3D community forums announcing UnTest, _Shockwave asked for an example of some real-life unit tests written with this framework. UnTest is very xUnit-flavoured, so they follow a standard pattern, but I thought it would be a good excuse to talk about good unit testing practice.

Much of my unit testing approach is from Roy Osherove’s Art of Unit Testing, which is a very readable and practical book on unit testing. It’s aimed at .Net, so highly applicable for Unity3D development. The Art of Unit Testing website also has some recorded conference lectures from Osherove that are also worth watching. If you want to get better at writing unit tests, these are great resources.

The unit test I’m going to dissect is below. It’s a real-life test from a production behaviour tree system. It’s not really important here to understand what a behaviour tree or a selection node is, as much as the patterns and conventions I followed. Good unit tests are readable, maintainable and trustworthy. As we walk through the test, I’ll explain how these qualities apply, and how to maximise them.

    [Test]
    void UpdatePath_OneActionChild_AddsSelAndChildToPath() {

        var actionNode = FakeAction.CreateStub();
        m_selNode.AddChild(actionNode);

        m_selNode.UpdatePath(m_path, NodeResult.Fails, null);

        var selectionThenAction = new ITreeNode[] {
            m_selNode, actionNode };
        Assert.IsEqualSequence(m_path, selectionThenAction);
    }

To increase readability, the first thing to note is the context of the file you can’t see. It’s in a file called SelectionNodeTests.cs, so I instantly know this test applies to the SelectionNode class. There’s only one class in this file, with the same name, so there’s no chance of confusion.

The name of the function follows a consistent convention throughout the codebase: FunctionUndertest_Context_ExpectedResult. There are many naming conventions you could follow, this is the one I do. Context is how we set up the world before running the function. In this case, we’re adding a single action node to the selection node. ExpectedResult is how we want the function to behave; here we want the selection node and the action node to be added to the path.

It’s not important how long the name of this function is, since it’s never called from anywhere. The more readable and informative you can make the function name, the easier it will be to figure out what went wrong when it fails.

The unit test is split into three sections following the AAA pattern: Arrange, Act, Assert.

Arrange, where I set up the selection for testing:

    var actionNode = FakeAction.CreateStub();
    m_selNode.AddChild(actionNode);

All I need to do is create an action node and add it to the selection node.

Act, where I execute the function we want to test:

    m_selNode.UpdatePath(m_path, NodeResult.Fails, null);

Assert, where I check to see that end condition we expected has been fulfilled:

    var selectionThenAction = new ITreeNode[] {
        m_selNode, actionNode };
    Assert.IsEqualSequence(m_path, selectionThenAction);

Here I construct what I expect the path to be, and assert that it matches the actual path using an Assert library provided by UnTest.

By following this layout, all tests can be easily scanned. When time comes to fix or update a test, developers can dive in quickly.

You’ll notice I could compact the Assert section into one line:

    Assert.IsEqualSequence(m_path, new ITreeNode[] { m_selNode, actionNode });

The reason I keep it separate is to avoid “magic numbers”. It’s too easy to write code like, Assert.IsEqual(result, 5). The writer may know what this 5 means, but it would be much better for future readers to put it in a named variable and write Assert.IsEqual(result, hypotenuseLength).

Now this test is as readable as possible, how did I make it maintainable too? You’ll notice that by improving readability I’ve gone some way to also helping maintainability, as something that’s easier to read is also easier to understand, and therefore is easier to maintain. But there are other things I do as well.

Check out the first line:

    var actionNode = FakeAction.CreateStub();

I need an action to put into the selection node. I could use an existing IAction concrete class, but then any bugs in that concrete class might cause this test to fail. I’ll cover more why that’s bad later, but just pretend it sucks.

I could derive a new class from IAction, which I could keep simple enough to avoid bugs, but then I’d have to maintain that whenever the Action class interface changed. It’s much easier to use a “mocking framework” to do most of the hard work for me.

A mocking framework is a library that can be used to construct a new type at runtime that derives from Action and just does the right thing (among many other things). Then any changes are picked up for me automatically, and I have less code to maintain. If that sounds like magic, that’s because it is.

There’s a mocking framework behind that FakeAction.CreateStub() call, but since it’s such a common use case in this test suite I’ve wrapped it up in a helper function.

Any mocking framework that works with mono will work with Unity3D. I use Moq. The latest version is here. I’ve mirrored this in a unitypackage here for easy importing to Unity.

To further isolate myself from changes, I’m constructing the member variables m_selNode and m_path in a setup function (not shown). This function is run automatically before every test, and makes new SelectionNode and Path objects. This is not only handy, because they’re used in every test in the class, but also isolates the unit tests from changes to the constructor signatures. Other commonly-used functions can also be hidden behind helper functions, but it’s best not to hide your function-under-test for readability reasons.

The final thing I need to do is make the test “trustworthy”.

By going through the maintainable and readable steps, I’ve made sure this test depends on the minimum amount of game code. When this test fails, hopefully it will only be because the function under test, UpdatePath(), had an error.

The more game code you depend on, the closer your test slips along the spectrum from unit to integration test. Integration tests check how systems connect together, rather than single assumptions. They have their place in a testing plan, but here I’m trying to write a unit test. A great rule of thumb is that a single line of buggy code should cause failures in the minimum of unit tests, and ideally only one. If lots fail, that’s because the code isn’t isolated properly and you’ve ended up with some integration tests.

Some of my early unit tests, from F1 2011, created a whole world for the AI to move in and recorded the results, rather than mocking surrounding code like we have here. The end result was that a single bug in the world code could cause many many tests to fail. That makes it hard to track down the root cause of the bug, and meant I had probably written integration tests instead of unit tests.

When this test does fail, it will be deterministic. There’s no dependency here on databases, network services, or random number generators. There’s nothing worse than unit tests that fail once in a blue moon, because they erode developer trust in the test suite. That’s how you end up with swathes of tests commented out, and wasted engineering time.

Now you understand why I’ve written this real-life unit test in this way, and why it’s important your unit tests are readable, maintainable and trustworthy. Like any type of programming, writing good unit tests takes practice and perseverance. They’re truly the foundation of your project, giving you the freedom to restructure at will and the confidence that your game code is high quality. But like any foundation, if they’re not well engineered the whole edifice comes rapidly crumbling down. Take the time to follow up with the resources I linked above, and you will hopefully avoid that situation.