I've occasionally seen this common bug in several games where a character gets stuck in what seems to be the center of the map, doing a T-pose. Is this a common output of several different types of errors? If so, have you ever hidden the origin of the map under a hill, or otherwise made it invisible to the player, so bugs like that are not spotted even if they occur?
You're right - the T-pose (or its cousin the A-pose) is usually the default pose we put a model in when a particular pose or animation isn't working. The T-pose exists mostly because artists can see all parts of the model by rotating, moving, and zooming the camera, so they can fix up any odd texturing issues on the model.
We use the T-pose as the default because it is instantly and visibly recognizable to just about anyone as something that doesn't belong there. We often use things like a checkerboard pattern or bright magenta/cyan colors on a cube to indicate a missing model or texture so that the game doesn't crash when an asset is missing, but QA can immediately flag it as not working properly or using a placeholder asset. As such, we don't want to hide these markers - we want them to be as visible as we can so that we can fix them.
How often are playtesters used in game development? I know Valve uses them a lot, as they say so during the developer commentaries on some of their games. But what about everyone else? Do they rely on playtesters as much?
There are a lot of different kinds of playtesters. On my current project, we have multiple internal playtests every week so that the different teams across the studio can get an opportunity to play the latest version of the game and gather feedback for the game's dev team. The thing we test each week varies - sometimes it is a specific game mode, others we're gathering feedback for a particular map, and others we're looking to test a particular feature.
We also hold focus group playtests on a regular basis. The User Research team usually brings in a new group of Kleenex testers every week or two. Kleenex testers are only ever brought in once to try the game fresh: they are not told what they will be testing beforehand, they sign an NDA, and they are never asked back. They also often only get to test a specific aspect of the game - the UI, the tutorial, a specific feature, etc. - and do not get to play the full game. Focus playtests are there to prove out the user experience and the intuitiveness of the game's design.
Most AAA studios will also run some form of beta playtest. Invitations to closed betas are often limited to company employees and their friends and family. Notable player community members often also receive invites. Closed betas exist for the dual purpose of testing the full game flow and soliciting participants' opinions on the user experience. Open betas are similar, but we essentially let anyone in who wants to play. The goal of an open beta is not really to solicit feedback (though we will take it), but to stress test the game to make sure it works at scale in real-world conditions (e.g. lag, server capacity, network infrastructure, game stability, etc.).
Generally speaking, we try to get as much playtesting as we have the resources to support, provided that we have something specific we want to test (e.g. a new patch) and the thing we want to test is in a sufficiently stable state to be tested (it doesn't crash). Playtesting isn't free: we need to find and organize the testers, make sure they sign the NDAs, provide the necessary setups for them, hold the test, collect the feedback, then parse and analyze that feedback to figure out what we can feasibly change to improve the game. It takes a lot of time, money, and effort on our part to hold these tests.
The quick answer is "all kinds", but QA usually has a collection of [test plans] that provide a road map for QA to validate specific content, systems, or features. A test plan is a list of specific tests they run in order to make sure that a specific chunk of the game is functioning as expected. This test plan might be (broadly) to make sure each of the class abilities meets expectations, or to cover all of the level geometry and object placement in the sewer dungeon, or to check that the game meets all of the known platform certification requirements. The totality of all of these test plans together should encompass testing the entire game.
These test plans come from a variety of sources. Certification test plans are built from the platform holders' requirements and from notes on past failed certification submissions. Game system test plans are written by the designers who created the rules for the game system. Level design test plans are usually written by the level designers who create the levels. There are also tech test plans, where an engineer might write a list of tests for graphics settings or online matchmaking. Whenever a specific test fails and the tester observes unexpected behavior, they create and log a new bug.
Hello, I am a student game designer researching QA. What does a test plan look like for mechanics/abilities/powers?
Let's take a step back and consider what a test actually is. When you test something, you are checking what happens in a game against expected behavior. If the results in game match the expected behavior, then the test passes. If the results in game do not match the expected behavior, then the test is failed. Each test failure then gets written up by QA with the steps taken to conduct the test, the behavior expected from the game, and the behavior of the game that was observed. This is also known as a bug report - it is a report about the game behaving in an unexpected way, along with steps to reproduce this unexpected behavior.
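In miniature, that expected-versus-observed loop looks something like this. This is a hedged sketch in Python; the jump-height values are made up purely for illustration:

```python
# A test in its simplest form: perform the steps, then compare the
# behavior you observe against the behavior you expect. The jump
# height here is a hypothetical example value.

EXPECTED_JUMP_HEIGHT = 2.0

def observe_jump_height():
    """Stand-in for actually performing the test steps in the game."""
    return 2.0

observed = observe_jump_height()
if observed == EXPECTED_JUMP_HEIGHT:
    print("PASS")
else:
    # A failure becomes a bug report: steps taken, expected
    # behavior, and observed behavior.
    print(f"FAIL: expected {EXPECTED_JUMP_HEIGHT}, observed {observed}")
```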
If you consider this as the definition of a test, then a test plan is a collection of tests used to determine whether a larger thing - a level, a mechanic, a game system, etc. - is working as intended. Let's consider an ability in a game like World of Warcraft as an example. An ability in a game might have:
A resource cost
Some kind of target
Some effect(s) applied when the ability is used
An animation that plays when the ability is used
Visual effects that play when the ability is used
A cooldown
An in-game description/tooltip
An icon
An effective range
We can turn each of these requirements into its own test. First, we test to see if using the ability consumes the correct amount of the right resource. Can the ability be used when I lack enough of the resource? Does it consume too much or too little? Then we can test targeting: does the ability select the proper targets correctly? And so on and so forth. The collection of all the tests we've created to make sure each aspect of a given ability exhibits the expected behavior is the ability test plan. We can then use this test plan to test each ability. This enables us to have more than one tester perform the same consistent, repeatable tests to validate the behavior of each new ability as it gets added to the game. As requirements for the expected behavior change, so must the tests and test plans.
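To make the idea concrete, here's a minimal sketch of such a test plan as data-driven checks. Everything here - the ability names, costs, and the toy `use_ability` implementation - is hypothetical, not from any real game:

```python
# Sketch of a data-driven ability test plan. All ability names, numbers,
# and the toy use_ability implementation are hypothetical.

ABILITIES = {
    "Fireball": {"cost": 30, "resource": "mana"},
    "Backstab": {"cost": 60, "resource": "energy"},
}

class Caster:
    def __init__(self, resources):
        self.resources = dict(resources)

def use_ability(caster, ability):
    """System under test: spend the resource if the caster can afford it."""
    pool = ability["resource"]
    if caster.resources[pool] < ability["cost"]:
        return False  # not enough resource: the ability should not fire
    caster.resources[pool] -= ability["cost"]
    return True

def run_test_plan():
    """Run every test in the plan against every ability; return failures."""
    failures = []
    for name, ability in ABILITIES.items():
        # Test: using the ability consumes exactly its listed cost.
        rich = Caster({ability["resource"]: 100})
        use_ability(rich, ability)
        spent = 100 - rich.resources[ability["resource"]]
        if spent != ability["cost"]:
            failures.append(f"{name}: consumed {spent}, expected {ability['cost']}")
        # Test: the ability must not fire without enough resource.
        poor = Caster({ability["resource"]: ability["cost"] - 1})
        if use_ability(poor, ability):
            failures.append(f"{name}: fired without enough resource")
    return failures

print(run_test_plan())  # an empty list means every test in the plan passed
```

Because the checks are driven by the ability data, the same plan can be rerun against every new ability as it's added - which is exactly what makes a test plan repeatable across testers.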
How do exploits get unnoticed by the dev team even after testing? Are exploits sometimes intentional? Thank you for answering!
Have you ever misplaced something very personal like your phone or your keys in your house? Like... you put it down somewhere for some reason and then forgot where you put it. Or have you (or anyone else) ever left used dishes somewhere in your home and forgotten about them? It's a lot like that. Even though the people most familiar with your home are the people living in it (like you), accidental misplacements and mistakes still happen because of miscommunication, forgetfulness, distractions, or a variety of other circumstances. These kinds of mistakes get amplified when things are really busy and a lot of people are all working or living in the same space (e.g. when family is visiting for the holidays).
When you have a group of people working in the same space with the same common materials and tools, it's not uncommon that small things get accidentally misplaced or forgotten about. Home inspections can catch some (often the most egregious) of these issues, but won't be able to catch everything. For example, they’ll probably notice if a wall is missing or a window is broken, but it’s a lot tougher for them to realize that your little sister flushed a whole barbie doll down the toilet last week just by a cursory glance. This is why sometimes exploits ship unnoticed by the devs, especially because we’re usually scrambling to fix the hundreds or thousands of the problems we do know about during the last stretch before shipping the game. If we didn't realize the basement window was left unlocked because it looked locked last we checked, we probably didn't think to lock it before the raccoons snuck in and made a mess.
PS. Exploits aren’t intentional. That’s why they’re exploits.
When it comes to difficulty in a game, do you think a dev should be able to clear every challenge they put in their game at the highest difficulty? Is it 'fair' if, say, they use debugging tools to use extra checkpoints to clear a gauntlet challenge?
That’s not really a reasonable ask because it isn’t broadly applicable. When I build gameplay systems, I usually know how they work and I can usually handle it myself under controlled conditions at maximum difficulty, sure. However, that doesn’t (and can’t) always apply. The skills needed to design content don’t necessarily match the skills needed to execute on that content at the highest difficulty level. As long as the iterative process involves people on the team who can perform the actions consistently, it’s largely fine if the designer or programmer working on the feature can’t - as long as some people can do it within acceptable limits, we’re good.
Let’s take an example from my own history. One of the gameplay systems I’ve worked on in the past was a QTE system - press this button during this canned animation sequence within this timing window to succeed. The creative leads decided that the timing window to press the button at higher difficulties should be smaller. I adjusted the timing window based on difficulty, and then pulled in various other team members to test out how the different difficulties felt. I wasn’t always able to hit the timing consistently at the highest difficulty (I gave a 500ms window), but a few of the QA testers were able to do it. That was good enough for the leads to give it their stamp of approval.
As another example, let’s say that I’m designing new multiplayer content. There is no way for me to test it on my own because I have only two arms and two eyes. There’s no way for me to test group content like raids in Destiny or zombie mode in Call of Duty by myself. I can verify that the monster abilities work, but I need enough people to field a group to test it properly and tune it. So I have to organize playtests - developers, QA testers, etc. For high execution content like WoW’s Mythic raid difficulty, it generally involves scheduling a lot of highly skilled testers (almost always QA) to test.
Overall, given the breadth of content and features we have to design, it’s unreasonable to expect a developer to be able to clear all content at all difficulties. Skills needed to perform at that execution level aren’t the same as skills needed to create that content in the first place, especially for things like multiplayer content. As long as we have somebody involved who can perform at the necessary level of execution during the difficulty tuning process, it’s fine. That somebody might be the designer, but it doesn’t have to be.
How much truth is there to QA being a common entry point for developers? And how could that type of career path look? What should I put forward or focus on if I was looking to apply for such a position?
QA is definitely a common entry point for designers and producers, but is also a difficult path to take. That might seem counterintuitive at first glance, but bear with me. First off… QA teams often outnumber dev teams, especially in AAA games. This means that, by necessity, the conversion rate of people switching career tracks from QA to production or design is not high. Many QA testers who want to become designers or producers never get that chance, simply because there aren’t that many opportunities to switch compared to the number of people in the QA field. That said, a significant number of designers and producers I know came from QA roots, which means it is definitely possible. My current employer purposely tries to keep job openings for junior positions in those tracks open for our own QA and customer service techs to enter and generally only recruits mid-level and above from outside the studio.
The career path for QA, like most careers, is generally a branching structure. It often looks something like this:
As QA, you’d have more opportunities to grow into other QA roles than into a designer or producer role. Staying as a generalist tester could lead to managing other testers. Alternatively, you could specialize in a specific type of testing for a particular project and/or large game system rather than remaining a generalist - major gameplay systems within a game, certification testing, server testing, and so on. Test engineers are a special breed (and probably the best paid) of QA who write scripts and programs to automate software testing rather than testing by hand. Finally, there’s the option of applying for junior design or production roles as they open up, typically when somebody on the team leaves and/or gets promoted, or a new project enters the production phase and needs more people to build game content.
If you want to make the jump from QA to design or production, you need to build the skills those roles require while taking care of all of your duties as a QA tester. This means a strong ability to write clear and concise reports and documents, a methodical approach to your daily work, and a knack for reverse-engineering how game systems work. The good news is that a lot of that crosses over - designers need excellent writing skills in order to convey important ideas and concepts to other developers, and producers need a bigger-picture view so they can maintain the production schedule and keep everybody working. Learning that stuff on top of your QA duties usually happens through self-driven study and/or mentorship from the people already on the team. It’s generally easier to land a junior position from inside the studio than from outside, mostly because you’re already working there and usually get first crack at any positions that open up. However, keep in mind that there are usually a lot of QA testers and not many openings, so the conversion rate is not going to be high.
What does the process look like when removing a bug from your game? And how does the team decide which bugs to prioritize fixing?
The overall process of bug fixing in game dev is done in five steps. In order these are:
Identify the bug
Prioritize the bug
Assign the bug
Fix the bug
Verify the bug is fixed
Each step usually has its own specialists handling that aspect of it. We’ll go over each of these briefly.
1. Identify the bug
This is usually handled by testers to some degree. Usually it starts by observing some kind of unexpected behavior in the game - a move does too much or too little damage, the game crashes, the wrong animation plays, etc. The issue is initially reported and then investigated further to isolate the problem and its exact cause. In some cases, like crashing, the development version of the game can provide useful information like a callstack or memory dump to investigate. Overall, the goal in this step is to figure out what exactly is causing the bug and write up a detailed report on how to reproduce it, so that the right developer knows what to look for in order to fix it.
2. Prioritize the bug
Once the bug has been identified and the cause of the problem ascertained, the production team has to decide how important the bug is. The most important bugs are usually those that stop other developers from working, like unavoidable crashes in critical parts of the game. If Neelo and Desmal are both held up by this bug that I’m fixing, that’s three people’s worth of working hours unable to do other work while I’m trying to fix the bug. That’s a multiplicative productivity loss! As you might guess, this means bugs that break the entire team’s ability to work are the absolute highest priority. Overall, bugs generally follow this basic hierarchy:
Bug that blocks other developers from working
Bug that blocks certification from passing
Bug that crashes/freezes the game
Bug that blocks the game’s critical path
Bug that hurts performance beyond acceptable bounds
Bug that stops major gameplay system from working
Bug that stops minor gameplay system from working
Bug that is annoying to player (but game is otherwise playable)
Bug that looks weird (but game is otherwise playable)
This prioritization process is called [Triage]. As you can see, it’s a prioritization list that makes a lot of sense if you think about it - the goal is to keep as many developers able to do their work as possible, and then to keep as many major elements of the game working as possible.
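One common way to implement such a triage queue is to encode the hierarchy as an ordered severity scale and sort the backlog by it. Here's a minimal sketch; the severity names and bug titles are invented for illustration, not from any real bug tracker:

```python
from enum import IntEnum

# A hypothetical severity scale mirroring the hierarchy above;
# a lower value means the bug gets fixed first.
class Severity(IntEnum):
    BLOCKS_DEVELOPERS = 0
    BLOCKS_CERTIFICATION = 1
    CRASH_OR_FREEZE = 2
    BLOCKS_CRITICAL_PATH = 3
    PERFORMANCE = 4
    MAJOR_SYSTEM_BROKEN = 5
    MINOR_SYSTEM_BROKEN = 6
    ANNOYANCE = 7
    COSMETIC = 8

# Invented bug backlog for illustration.
bugs = [
    ("Grass texture flickers at dusk", Severity.COSMETIC),
    ("Level editor crashes on load", Severity.BLOCKS_DEVELOPERS),
    ("Save menu freezes the game", Severity.CRASH_OR_FREEZE),
]

# Triage: sort the backlog so the most blocking bugs surface first.
triaged = sorted(bugs, key=lambda bug: bug[1])
for title, severity in triaged:
    print(f"{severity.name}: {title}")
```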
3. Assign the bug
Different developers are responsible for different things. Once the bug has been prioritized, production assigns the bug to the appropriate developer to fix it. The bugs I fix as a technical designer are very different than those an animator or an environment artist would fix. Assigning a design tools problem to an environment artist wouldn’t make any sense. Producers need to know what each person is working on and their rough ability to take on new bugs. I’ve been assigned bugs outside of my expertise in the past because the dev who would normally fix the bug was unavailable and I had the room in my schedule to look at it, even though the producers knew it would take more time for me to fix it than it would for the other dev. The bug was important, I had lower priority tasks, so I got the assignment.
4. Fix the bug
After I’ve had a bug assigned to me, I still need to figure out where it fits into my task list. I’m not idle - I have other stuff to work on too. Sometimes the bug is high enough priority that I have to shelve what I was working on in order to focus on fixing the bug. Sometimes the bug isn’t a bug at all - it’s intended behavior that the QA department just didn’t know about. Sometimes the bug is lower priority, so I set it aside to work on later. It is entirely possible that a logged bug can take months for me to look at because it just isn’t a high enough priority and other things require my attention first.
By that token, sometimes the bug is such a low priority that I never get around to fixing it. Sometimes there’s just a steady stream of more-important tasks and bugs that keep showing up that I need to fix. These low priority bugs are often called “wishlist” bugs because we wish we could fix them, but there just aren’t enough hours in the day. On any AAA project, the wishlist is usually miles and miles long.
Overall, this is probably the most straightforward part of the process. I figure out what’s causing the aberrant behavior, figure out what it should do instead, and make the game do that. Maybe it’s because I mistyped the name of the action as GAME_ACTOIN_JUMP instead of GAME_ACTION_JUMP. Maybe it’s because the multiplier was missing a decimal point, so instead of the bonus being +2.5%, it was +250%. Or maybe it’s a super intricate bug that’s dependent on a lot of other factors [like an unexplained crash that happens if the player looks in one direction for several hours on a specific map]. This sort of work is usually handled through our tools, data, or code, and it’s probably what you’re picturing when you think “bug fixing”.
5. Verify the bug is fixed
The final step after I submit my bug fix is that QA needs to get my changes and verify that the bug is fixed. Sometimes the thing I thought would fix the bug doesn’t fix it. Sometimes the cause of the problem was something that I didn’t think about. Sometimes the fix I submitted fixes the problem but also causes a whole new bug (or multiple bugs) that I will need to fix. The bug fix needs to be verified by the QA testers who also do full regressions of the game (test all major elements of the game in order to make sure it’s all still working properly) on a regular basis. During these regressions it is entirely possible that a previously fixed bug is discovered to be broken again, in which case the bug is reopened, reprioritized, reassigned, and the process begins again.
Off of the last question, do you think large QA teams will eventually be cut from game development? It seems like the position most likely to be replaced by bots, like how many cashier positions are being lost to self-checkout machines. Also, if you make reusable public beta tools and improve automated testing, could you offload almost all of the more tedious map-scraping parts of QA to the public via open betas? Or would that risk cannibalizing too many sales, since open betas are basically demos?
Nah. We haven’t yet reached the point where good and experienced QA can be automated. Game-playing humans carry a ton of context with them that currently isn’t feasible to unit test or automate.
Let’s go through an example system to help illustrate what I mean. Let’s say that you’re working on a new game system - a socketing gem system that lets players insert jewels into their gear to provide various bonuses. Let’s also say that there are requirements on the jewels - level requirements and such - to keep players within the power curve we expect at each stage of the game. Automating tests to check the numbers on stats and such isn’t that difficult; the bots just need to check the resulting stats against the expected values. Simple, right?
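Here's roughly what that "easy" automatable half looks like - socket a jewel, then compare the resulting stats against expected values. All item names, stats, and rules in this sketch are hypothetical:

```python
# Sketch of the automatable half of gem testing: apply a jewel and
# compare resulting stats against expected values. All names, stats,
# and rules here are hypothetical.

def socket_jewel(gear_stats, jewel_bonus, player_level, required_level):
    """Apply a jewel's bonuses if the player meets the level requirement."""
    if player_level < required_level:
        return dict(gear_stats)  # requirement unmet: stats unchanged
    result = dict(gear_stats)
    for stat, bonus in jewel_bonus.items():
        result[stat] = result.get(stat, 0) + bonus
    return result

# Bot-style checks against expected values:
base = {"strength": 50, "stamina": 40}
ruby = {"strength": 10}

assert socket_jewel(base, ruby, player_level=30, required_level=25) == {
    "strength": 60, "stamina": 40}
assert socket_jewel(base, ruby, player_level=20, required_level=25) == base
print("numeric checks pass - but nothing here verified the lock-icon UI")
```

Note that these checks say nothing about whether the lock icon and red level text actually appear on screen - which is exactly the gap described next.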
However, there are a whole lot of other elements that need testing alongside relative power levels. For example, you need UI to actually convey these different constraints to the player. If this jewel requires level 25 and the player is level 20, you need to display the unmet requirement somehow. Maybe you display a little lock symbol over the jewel’s icon and the level requirement in red, so the player understands that they need to level up before using the jewel. How do you get an automated test to make sure that this exists? Image recognition can be incredibly complex software, but a human can do this in seconds.
“But dev, how much UI testing actually needs to go into a game?” you might ask.
There is a metric ton of UI in every game. Every screen, every menu, every popup, every life bar, every HUD element, every save game screen, every load game screen, every DLC on the store, every map icon - everything touches UI. This also includes everything that the UI describes, like requirements. We can’t really automate that stuff. We generally need a human with contextual understanding to validate these elements. We need them during the process while we construct these systems, and we need them at the end to make sure we dot every “i” and cross every “t” for the monster that is certification.
Beyond UI, there are also many other “fuzzy” elements that need testing but are hard to automate. Is this challenge too difficult? Did this thing not spawn when it was supposed to? Is there a leg of this quest that is uncompletable? Is something missing? Is something unintuitive?
A lot of what needs testing isn’t particularly glamorous, fun, or engaging on its own. That means that crowdsourcing isn’t going to work so well - volunteers are looking for a good time, and spending hours testing menus and quests that might be too hard or too easy isn’t something we can get people to do for an extended period of time. The dev team’s content creators need test results in a timely fashion - we need to incorporate feedback into our dev process to improve the way the game plays, but getting volunteer report results may take days instead of minutes or hours. That’s a lot of wasted dev time waiting for results, especially if the reports are unclear or not well-written.
As you can probably guess, this is why we hire people to do these tasks. Game testing is a job and it’s a very important one. We need testers who can communicate, who have an eye for gameplay and game context, who can work with the rest of the dev team on specific features, and who are willing to put in the work testing the less glamorous or fun parts of the game in order to make sure they work. While we can do some automation and we do get benefits from hosting public tests, we need dedicated professional testers to cover other elements that automation and public testing can’t reliably cover.
The FANTa Project is currently on hiatus while I am crunching at work.
How is being a certification tester different from being a tester at a game company (besides the fact that you work for different people)? How do you become a certification tester?
When you’re QA at a game development studio or publisher, your goal is to find bugs, write them up, and pass them on to the dev team to fix. You work on games in development nearly from start to finish, see things come online as you go, and may offer suggestions and feedback to help with the game’s design and direction. A lot of what you’ll be testing will be unfinished, because you’ll be finding the bugs that creep in during development.
When you’re a [certification] tester, you test the submission candidates from game publishers. You test (nearly) completed games. Your entire goal is to check whether the submitted game meets the platform requirements to sell at retail. You have zero ability to affect the game’s direction or development; you only get to test whether the final game actually works as it should. Any bugs you log get sent back to the developer to fix, but it’s smaller stuff like “this button texture is wrong” or “the game crashes when you leave it on for two hours after unplugging the controller from port one and plugging it into port two”. Instead of working for months or more on a single project, you’d be working for a week or two on each game.
You become a cert tester the same way you become a QA tester - you apply to platform companies like Sony/Microsoft/Nintendo, pass an interview, and they offer you the job. The sort of work you’d be doing is similar, but the stuff you actually test is different.
This week we continue the Design Phase of the FANTa Project!
I am learning about Unit Testing, and it seems like if game companies used this approach we'd have games with fewer bugs, they'd be more stable, and they could produce features faster because there would be less time spent on QA. Do they do this? If not, why not?
For those who aren’t familiar with unit testing, it is basically the concept that you can test a software system by giving it all kinds of expected inputs, and then making sure that the system’s outputs are all correct for that set of inputs. By testing each individual system this way, you can ensure that larger systems built from smaller ones are (or should be) less buggy. Most unit tests can be automated because engineers know what goes into the system and what should come out, so they can just write a script that calls the system’s interface with the different inputs and then see if the system spits out the correct output for each input.
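As a concrete illustration, here's what a tiny automated suite might look like for a simple vector math function. This is a sketch of the concept, not code from any studio codebase:

```python
import unittest

def dot(a, b):
    """Dot product of two equal-length vectors - the kind of stable,
    known-quantity function that is easy to unit test."""
    if len(a) != len(b):
        raise ValueError("vectors must be the same length")
    return sum(x * y for x, y in zip(a, b))

class TestDot(unittest.TestCase):
    def test_known_values(self):
        self.assertEqual(dot([1, 2, 3], [4, 5, 6]), 32)

    def test_orthogonal_vectors(self):
        self.assertEqual(dot([1, 0], [0, 1]), 0)

    def test_mismatched_lengths_rejected(self):
        with self.assertRaises(ValueError):
            dot([1, 2], [1, 2, 3])

# Run the suite programmatically, as an automated build step might:
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDot)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("all tests passed:", result.wasSuccessful())
```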
In theory, this is great. As you say, it can catch a lot of bugs at the early stages. In fact, most development studios already do this. Most projects I’ve worked on have automated test suites that continuously build and validate the code in the depot to make sure it compiles, runs, and works. Granularity isn’t always as fine as testing each and every function or class, but we generally try to catch as much as we can. We do test everything we put in. Every studio I’ve worked at has code and content reviews to double check before allowing people to submit changes. On top of that, we also have human QA to make sure that we have sanity checks on things. Many of us learned about software construction in school too and that includes concepts like unit testing.
In practice, however, unit testing really only works if you can devote the time to keep maintaining it. For stuff where you don’t have to go back and do a lot of revision or iteration, it’s pretty cut and dry. A math library that will calculate the dot product of two vectors or the matrix inverse is a known quantity. But it’s not always so easy when the existing spell system isn’t giving the designers enough granular control over the variables, so the designers want a new way of tweaking specific damage values from the database. When this happens, the engineers would have to go back to do what the designers want, as well as update all of the unit tests with the updated set of inputs and outputs to check for. Imagine that, when designer Neelo wrote the original document, she wanted to account for inputs of values 1 through 10. Months later, designer Desmal decides to put in a value of 0 instead, causing the system to break. That wouldn’t have been caught in unit testing, but such things can happen in real situations. If the design keeps changing, or we don’t get the inputs and/or outputs right for the test, or we just fall behind because we absolutely need this done by the end of the sprint and a functional system is more important than functional unit tests for that system…
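That Neelo-and-Desmal scenario can be shown in miniature. The function and numbers here are hypothetical, invented purely to illustrate the gap:

```python
# Hypothetical damage lookup written against Neelo's original spec,
# which only promised inputs of 1 through 10.

def damage_multiplier(tier):
    # The implicit assumption baked in: tier >= 1.
    return 100 / tier  # blows up when Desmal later passes 0

# The unit tests faithfully cover the documented range 1..10 and pass...
for tier in range(1, 11):
    assert damage_multiplier(tier) > 0

# ...but nothing ever exercises 0, so the break ships unnoticed:
try:
    damage_multiplier(0)
    print("no error")
except ZeroDivisionError:
    print("the unit tests never covered this input")
```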
You’re probably starting to see the cracks now, right? The problem isn’t necessarily unit testing as a concept, but the human factor. We do use unit testing when we can, and it does help us catch bugs we missed… but it is far from the catch-all panacea that it seems like it should be. It has a very real overhead/maintenance cost and has all of the vulnerabilities of the humans trying to create the test criteria properly, especially when we’re working in an area that undergoes a lot of revision.
How do devs give their game an appropriate difficulty curve when their audience has vastly different levels of skill? A small indie game I play recently received a difficulty patch that proved very unpopular in the community. The devs actually apologized and said they'd designed the patch because their testers felt the game lacked challenge - but said testers had thousands of hours of play experience. So how do devs maintain a perspective that keeps inexperienced players in mind?
The first pass is usually developed for the people who we think will be buying our game. Usually we take things into account like familiarity with the genre and mechanics of other popular games, the brand or franchise, the platform it’s on (e.g. a dedicated game console audience vs a mobile gaming audience), and so forth. We’ll try it ourselves and pretend we are new players, then we’ll step up by inviting other members of the team or company who aren’t working on the stuff we’re doing to try it. Then we hold feedback sessions to gather opinions on the content. This can be both good and bad - most game developers are pretty hard core gamers and have a much more technical understanding of how games work, so it isn’t a guarantee that the ordinary customer would have the same concerns or issues with the gameplay. Usually we just use internal playtests as a rough starting point. After we’ve built something polished enough to show outsiders, we gather some playtesters and have them sign their NDAs.
Usually in the AAA sphere, we use what we call “kleenex playtesters” for the new player experience. Kleenex testers are just players who have never played the game before. They are basically single-use. We try to get a good spread of different types of players in our playtest batches - various backgrounds, seriousness about gaming, familiarity with the franchise or genre, comfort with technology, etc., then we record them playing - their body language, their controller inputs, and what they see on the screen. Then we can go back and review what they did, where they got stuck, what they picked up on, what they missed, where they had fun, and where they got frustrated. Once we get some good feedback, we can go back to the introductory content and improve it. We gather some more kleenex testers and the cycle repeats for as long as the budget and schedule allows for it.
In the software world, Unit Testing is the Holy Grail. Departments always talk about how great it would be to have, but I've rarely seen any company achieve much coverage. I could see it being feasible for libraries, tools, or server code - but have you ever worked at a studio that really had it in gameplay code? It seems like the nature of games (needing a loaded state, the UI/UX, and the fact that they're applications with a timeline) makes Unit Tests lofty but unreasonable.
For those who aren’t aware, Unit Testing is the concept of taking the smallest possible parts of a software system and testing each of them against a suite of tests that covers all possible inputs for that individual part. If the part produces the correct output for every input, it passes. By making sure all of the component parts work, the theory goes, everything built from those parts will work too. Most of the time unit testing is automated, though it is occasionally done by hand.
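A minimal sketch of the concept, using the kind of “known quantity” math function mentioned earlier (the function name is illustrative, not from any particular engine): a small, pure unit plus a suite of known inputs checked against known-correct outputs.

```python
# Minimal unit-testing sketch: a pure function with no game state attached,
# verified against hand-computed expected values.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# The "suite of tests" for this unit: known inputs, known outputs.
assert dot([1, 0, 0], [0, 1, 0]) == 0    # orthogonal vectors
assert dot([1, 2, 3], [4, 5, 6]) == 32   # 4 + 10 + 18
assert dot([2], [3]) == 6                # trivial one-dimensional case
```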
It’s often difficult to write unit tests for gameplay because game systems are so delicately intertwined with each other, and measuring the output from various systems can be subjective, requiring human eyes to verify. For example, the PlayStation 2 required all mentions of the memory card to read “Memory Card (PS2)” - those words and that punctuation exactly. However, that’s part of UI testing, and only a human would be able to tell whether the test passes, since the text might not be a string but a texture or part of some UI art.
Not that we didn’t try automating our testing. While I was working on a AAA MMOG, we had an automated test that would:
Log into a specified account
Create a randomly named character
Test a suite of basic animations
Talk to an NPC
Accept a quest
Fight and kill an enemy
Loot items from the corpse, including quest items
Use a taxi/fast travel route
Sell looted items to a vendor
Complete the accepted quest
Record any failed tests and log out
There were also other testing scripts that we could load to automate things, like attack skill sequences/rotations and such for different classes so that the combat designers could test gear loadouts and stat changes.
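An illustrative sketch (not the studio’s actual harness) of how a scripted test like the one above might be structured: each step is a small function, and failures are recorded so the run can continue and log everything at the end, matching the “record any failed tests” step. All names here are invented.

```python
# Hypothetical smoke-test harness: runs each (label, fn) step in order,
# collecting failures instead of aborting on the first one.

import random
import string

def random_name(length: int = 8) -> str:
    """Randomly named character, as in the second step of the list above."""
    return "".join(random.choices(string.ascii_lowercase, k=length))

def run_smoke_test(steps):
    """Run every step, returning a list of (label, error) for failed ones."""
    failures = []
    for label, fn in steps:
        try:
            fn()
        except Exception as exc:
            failures.append((label, repr(exc)))
    return failures

# Stand-in steps - a real harness would drive the actual game client here.
def accept_quest():
    raise RuntimeError("quest NPC missing")  # simulated failure for the demo

steps = [
    ("login", lambda: None),
    ("create_character", lambda: random_name()),
    ("accept_quest", accept_quest),
]
report = run_smoke_test(steps)
# report: [("accept_quest", "RuntimeError('quest NPC missing')")]
```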
The real issue is a question of what we stand to gain by dedicating engineering time to building unit tests. That sort of thing can take a long time to put together (especially for more subjective systems like validating visual effects and such), and that time could potentially be better spent working on new content or features.
Are cheat codes no longer in games because people making games realized that they can charge people for what cheats used to do, or are cheats gone because of achievements?
Uh… neither. First off, nobody sells what cheats used to do. I have no idea where you got that idea, but I can’t think of a single game that sells “god mode”, “no clipping mode”, or “infinite ammo” DLC. Second, cheats aren’t gone because of achievements either. Cheats still exist because the reason for cheats still exists: cheat codes are there so that developers can test their own stuff quickly and efficiently.
Imagine that you’re a gameplay programmer, and you’re working on the combat system. You’ve been tasked with adding support for critical hits to spells - fire spells will need to activate an additional burning damage over time effect after a critical hit, frost spells will freeze the enemy, lightning spells will arc to another nearby enemy, etc. Suppose that you write some initial code, and you think you’re ready to test it. It’s not reasonable to ask that you go through the entire class selection process or leveling up of your character just to get the spells needed to test your code changes. That would take a lot of time. So, instead, you use some cheat codes - a combination of console commands and internal debugging menus to set your character to an appropriate level, grant your character the appropriate spells, and probably add god mode so that you don’t run the risk of accidentally dying to the enemy while you’re testing - in order to set up your test scenario in a matter of seconds, rather than minutes or hours. You do this because you will need to be doing this many times before you are certain your code works.
Besides you, all of the other engineers - graphics, gameplay, network, etc. will need to test their stuff. The designers will need to test their stuff. The sound guys need to test their stuff. The artists need to test their stuff. If you multiply the time savings across every developer that needs to make sure their stuff works in the game, you start seeing why cheat codes (or perhaps the actual name - debug commands) become necessary. They exist to debug issues and can save the collective equivalent of months or even years of development time.
As for why debug commands aren’t always available in the finished product, it’s primarily because we now have better tools with which to develop games. Rather than building in a secret button combination, we just remotely connect to the development console and input commands from there. We just lock those codes off for the release version, and make it so that achievements won’t fire when debug commands are enabled. PC games almost always have ways to enable the developer console, granting access to the debug commands. Console games don’t because players usually lack the hardware necessary to remotely connect to the game with a PC. That’s really all there is to it.
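A sketch of the gating described above - the flag and class names are invented, but the shape is common: debug commands live behind a build flag that is locked off at release, and using any of them disables achievements for that session.

```python
# Invented illustration of debug-command gating: commands only work in a
# debug build, and any use of them taints the session so achievements
# won't fire.

DEBUG_BUILD = True  # forced to False (or stripped entirely) in release builds

class Session:
    def __init__(self):
        self.god_mode = False
        self.cheats_used = False

    def run_command(self, cmd: str) -> bool:
        if not DEBUG_BUILD:
            return False               # commands locked off at release
        self.cheats_used = True        # any debug command taints the session
        if cmd == "god":
            self.god_mode = True
        return True

    def achievements_enabled(self) -> bool:
        return not self.cheats_used    # achievements won't fire after cheats

session = Session()
session.run_command("god")
# session.god_mode is now True, and session.achievements_enabled() is False
```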
How do games with online components test different connection qualities? How do they simulate high ping or low upstream speeds? How is testing for things like MMOs done before the central servers are up? I'd suppose engineering work for most of this stuff has to come before real work can be done on designing and testing the actual game systems, right?
It’s usually a combination of a number of options, depending on what it is we need tested.
Licensed middleware to simulate connections and lag
In-house test suites that engineering creates
Telemetry and data logging built into the client to capture as much information as possible
Mandatory scheduled time for as many developers in the office as we can get all playing at once
Distributing the client to all willing devs and employees of the publisher - optimally across several studios and offices located around the country, possibly the world
Closed and open beta tests
The MMO I worked on also maintained a “clean room” that had computers of various different builds, each set to a different ISP for test purposes. This way QA could connect to the game from the outside.
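A toy sketch of the “simulate connections and lag” idea from the list above. Real middleware operates at the socket or driver level; this just shows the core logic - a link that delays each packet by a configurable latency and drops some fraction of them. All names are invented.

```python
# Invented lag-simulation sketch: packets are queued with a delivery time
# and some are dropped, so client code can be exercised under bad networks.

import random
from collections import deque

class LossyLink:
    def __init__(self, latency_ms: int, drop_rate: float, seed: int = 0):
        self.latency_ms = latency_ms
        self.drop_rate = drop_rate
        self.rng = random.Random(seed)   # seeded for reproducible tests
        self.queue = deque()             # (deliver_at_ms, packet), FIFO

    def send(self, packet, now_ms: int):
        if self.rng.random() < self.drop_rate:
            return                       # packet silently dropped
        self.queue.append((now_ms + self.latency_ms, packet))

    def receive(self, now_ms: int):
        """Return all packets whose delivery time has arrived."""
        out = []
        while self.queue and self.queue[0][0] <= now_ms:
            out.append(self.queue.popleft()[1])
        return out

link = LossyLink(latency_ms=150, drop_rate=0.0)
link.send("ping", now_ms=0)
assert link.receive(now_ms=100) == []        # still "in flight"
assert link.receive(now_ms=150) == ["ping"]  # delivered after 150 ms
```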
I've heard some people say that when balancing a multiplayer game, if a character is too powerful, rather than nerfing them, all other characters should be buffed to compensate. I feel like that's the wrong approach, but I can't put my finger on why. Regardless of what I think, though, you almost never see a game dev take that approach. In fact, I don't think I've ever seen that at all. Any theories/explanations as to why?
There are two main reasons for this - testing complexity and power creep. Allow me to explain.
This is Street Fighter 2: The World Warrior. There are eight selectable characters in it, for a total of 36 possible matchups (including mirror matches) between them. Now let’s say, over the course of the game’s first few months in the wild, we discover that Dhalsim is overpowered compared to the rest of the cast. Now what?
If we nerf Dhalsim down to where we expect him to be, we only need to test Dhalsim’s eight matchups against other players at varying skill levels. The rest of the cast hasn’t been touched, so there’s no reason to assume there are any changes to the Ken vs Guile matchup or the Chun Li vs E. Honda matchup. The only balance testing we need to do here is all of Dhalsim’s matchups.
If we buff everybody else and leave Dhalsim where he is, we now have to test all 36 matchups. Not only do we need to see whether the character buffs worked to even up the matchups against Dhalsim, but we also need to see whether the buffs skewed the matchups against each other. If we strengthened Ken’s fireball as part of this round of adjustments, how does that affect the Zangief matchup that was carefully balanced before? Does the buff to Blanka’s electricity completely hose Guile now? The number of test cases grows quadratically as we increase the number of selectable characters. For eight characters, we’d have to test 36 matchups instead of eight. For 16 characters, we’d have to test 136 matchups instead of 16. For 32 characters, QA would need to test 528 matchups instead of 32.
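The growth here is simple combinatorics. Counting each unordered pairing once and including mirror matches, n selectable characters give n(n+1)/2 distinct matchups, while changing a single character only touches that character’s n matchups:

```python
# Matchup counting for a roster of n characters, mirror matches included.

def total_matchups(n: int) -> int:
    """Unordered character pairings, mirror matches included."""
    return n * (n + 1) // 2

def matchups_touched_by_one_change(n: int) -> int:
    """The changed character vs. every character, including itself."""
    return n

assert matchups_touched_by_one_change(8) == 8  # nerf one character
assert total_matchups(8) == 36                 # buff everyone else instead
assert total_matchups(16) == 136
assert total_matchups(32) == 528
```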
There’s also the issue of power creep. The game of Street Fighter 2 is built on several assumptions going in - a single round will take around X seconds, the characters will deal about this much damage with an average combo, meaning the players will need to land a successful hit this many times per round, and so on and so forth. If we buff all of the characters each time, we either need to account for this in the game design (more testing!) or we get power creep.
Let’s say that, due to the buffing of the characters’ damage output, the average character can deal enough damage to win a round in two combos instead of three. This means the rounds will end faster and that the pressure will be higher to avoid being hit by a combo. It also makes it a lot harder for defensive characters to win. If there’s another patch in a couple of months that brings another round of buffs, the situation might get even more skewed. If we keep buffing everybody in terms of damage, the average round length will continue to drop and the game won’t feel as satisfying to play.
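The back-of-the-envelope arithmetic behind that example (the health and damage numbers here are invented): if a buff raises average combo damage, the number of combos needed to win a round drops, and round length shrinks with it.

```python
# Invented numbers illustrating how a damage buff shortens rounds.

import math

def combos_to_win(health: float, combo_damage: float) -> int:
    """How many average combos it takes to empty a health bar."""
    return math.ceil(health / combo_damage)

HEALTH = 100.0
assert combos_to_win(HEALTH, 35.0) == 3  # pre-buff: three combos per round
assert combos_to_win(HEALTH, 50.0) == 2  # post-buff: rounds end a third sooner
```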
Power creep is a very real thing. World of Warcraft saw this problem with their leveling experience in Cataclysm (especially in non-Cataclysm content) after cumulative changes to talent trees and abilities. Core game design can be put in jeopardy because of ever-increasing buffs. This is why we have to throttle things back, rather than keep adding gas all the time.
Is there a difference between Playtesting and Quality Assurance?
So the thing with game development is that titles will vary from studio to studio (or publisher to publisher). At my current studio, the people in charge of maintaining the schedule, setting up meetings, and tracking progress are called “Development Directors”. At other studios, they were called “Producers”. So mileage may vary because, at some studio somewhere, their QA actually might have the title “Playtester”. I’ve heard that Nintendo used that particular title at some point.
That said, most of the time “playtesting” refers to gathering feedback from players to see what they like and don’t like. This is considered part of market research rather than an actual development position - somebody who participates in playtesting probably won’t get his or her name in the credits or work directly with engineers or content creators.
Quality Assurance (or Game Testers) are actual employees who test specific parts of the game, write up bug reports, and help the developers reproduce bugs in order to fix them. This is very different from playing the game for fun and giving one’s opinion on it - it’s actual work. Oftentimes, it is “try to break the quest by going off in this other direction”, or “note how and where the UI is broken”. Once they’ve identified a bug, they have to write a detailed report on it in order to give the engineers, designers, and artists the context they need to make the bug happen again.
More senior QA often moves into different specializations too. For example, you’ll often have tools QA who specialize in using content creator tools and making sure they work properly. There’s also automation QA, who focus on creating automated tests so that you can establish some baseline for the project’s stability. There’s gameplay QA, who focus on understanding specific game systems and testing the various features associated with those systems.