Safe Hacking

Or how to practice Safe Hex

Ah, the world of computers. Thanks to the wonderful world of bits and bytes, we can experiment with any application, file, driver, or even the core operating system. Rip them apart, change things, put them together, and if it doesn’t work, just try again. At worst, you’ll have to wipe your hard drive and start over. If you somehow manage to destroy a computer purely through bad software, that’s considered a design problem and a true feat to pull off. Just think about it: what other profession or hobby lets you experiment as much as you want and make as many mistakes as you want without having to spend a cent if you do something wrong?

Unfortunately, things have changed. Ever since the advent of embedded devices with upgradable firmware, people have been trying to modify and hack them. These devices are usually a lot less resilient than their bigger, older siblings. Many of the new shiny gadgets that we use every day are internally fragile and a slight software mishap can render them non-functional, a “brick”.

This is a guide for developers and hackers who work on system firmware for embedded devices.


Care About Your Users

The first step towards safe hacking is to develop a deep appreciation for your users and, especially, their hardware. Most users are clueless and entirely dependent on you to guide them towards a safe result. While everyone releases hacks with no warranty, express or implied, that’s just to cover their ass. Remember, users are usually completely lost if they make their devices inoperable, and, unlike you, probably won’t have a backup plan.

One great way to start developing an appreciation for each individual piece of hardware out there is to deeply care for your own. If you’re a hacker with a very low budget, you probably have this already, as you’ll want to keep your device functional for as long as possible to avoid having to spend your hard-earned cash on a new one. If you have to perform emergency repairs on it or reflash it, take notice every time you do so. You might have had to spend a few hours wiring up a flasher to your device. An average user probably doesn’t have a chance of being able to do that in a week. And if you have the resources to purchase a few devices for testing purposes, keep in mind that most users don’t have that luxury. It might be tempting to run wild and experiment once you have a recovery plan in place, but remember, every mistake that you make is a mistake that might slip through and end up affecting your users instead. If you take great pains to avoid bricking your own hardware, you’ll greatly decrease the chances of a critical mistake making its way into the release version.

If you still decide not to care for your users, make it plainly clear in the product documentation that they are entirely on their own, and that you don’t care about what happens when they run your tool, nor have you made any attempt to make it safe for everyone. They deserve to know.

Understand the System

Before you start working on software that makes permanent changes to a device, you should have a deep enough understanding of its operation. Reverse engineer the boot process. Understand what parts of the firmware depend on what. Know what components are vital for boot, and what recovery modes are available, if any. If you’re the hacker responsible for performing most of the reverse engineering work on the device, you probably already know a good deal about it. If you aren’t, read documentation, try to understand everything, and talk to the person who is. Explain your idea. They will probably have many useful safety tips for you. Work on less intrusive hacks that will deepen your understanding of the system before moving on to riskier hacks that might end up in a brick. Above all, work with other people who also work on that device. Every extra knowledgeable person working on a firmware hack multiplies its chances of being safe.

Program Defensively

Usually, when a program crashes, at worst users get annoyed or lose some data. However, when unstable firmware hacks can mean that devices are irreparably destroyed, entirely different standards apply. Check all error codes. Handle out-of-memory errors. Make sure there’s enough free disk space. Make sure headers are sane. If you’ve never written a stable app before, one that can gracefully handle most exceptional conditions without crashing or doing the wrong thing, you should seriously reconsider working on critical device firmware hacks until you have. Learn about what kinds of problems to expect on safe ground first, before you move on to shakier terrain.
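As a rough illustration of the kind of checks listed above, here is a minimal Python sketch. The image layout (a 4-byte magic followed by a 4-byte big-endian payload length), the `MAGIC` value, and the `preflight` function are all hypothetical and invented for this example; a real device will have its own header format.

```python
import shutil

MAGIC = b"FWIM"  # hypothetical magic value, not taken from any real format


def preflight(image: bytes, workdir: str, needed_bytes: int) -> list:
    """Return a list of problems found; an empty list means it looks sane."""
    problems = []
    # Assumed layout: 4-byte magic, 4-byte big-endian payload length, payload.
    if len(image) < 8 or image[:4] != MAGIC:
        problems.append("header too short or bad magic")
    else:
        declared = int.from_bytes(image[4:8], "big")
        if declared != len(image) - 8:
            problems.append("declared length does not match file size")
    # Check free space up front instead of failing halfway through.
    if shutil.disk_usage(workdir).free < needed_bytes:
        problems.append("not enough free disk space")
    return problems
```

Collecting every problem into a list, rather than stopping at the first one, makes it easier to show the user a complete report before anything dangerous happens.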

One great technique to use is to do as much as possible in advance. Gather all required information about the system, read any required data files, prepare any modifications, and only at the very end actually commit the changes to the device. If anything goes wrong during the preparations, you can just abort the entire operation and be certain that the device is still safe. If you don’t want to (or can’t) architect your program like this, you can still tack it on as an underlying layer. Make the low-level functions that perform the actual changes (e.g. write to Flash) actually write to a temporary buffer instead, and bulk write everything at the very end. This also gives you a chance to check the result of the operation virtually, before it’s actually committed. It might even speed up your program as a side effect (bulk writes are faster than scattered ones).
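The buffering layer described above could look something like the following Python sketch. `StagedFlash` and the `write_device` callback are hypothetical names; the callback stands in for whatever routine really writes to the device.

```python
class StagedFlash:
    """Collects writes in memory and applies them in one pass at commit time."""

    def __init__(self, image: bytes):
        # Work on a private copy of the current flash contents.
        self._staged = bytearray(image)

    def write(self, offset: int, data: bytes) -> None:
        # Nothing touches the device here; only the in-memory copy changes.
        if offset < 0 or offset + len(data) > len(self._staged):
            raise ValueError("write outside of flash image")
        self._staged[offset:offset + len(data)] = data

    def preview(self) -> bytes:
        # Lets the caller inspect or hash the final image before committing.
        return bytes(self._staged)

    def commit(self, write_device) -> None:
        # The only place where the real device is modified, in one bulk write.
        write_device(bytes(self._staged))
```

Any exception raised before `commit` is called leaves the device exactly as it was, and `preview` gives a natural place to run a final check on the result.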

Fail Intelligently

If you’ve followed the prior advice, you’ll have already minimized the amount of code that can fail and cause catastrophic damage to firmware. However, there’s almost always something that might go wrong at just the wrong time. If a critical operation fails, the worst possible thing you can do is panic the application or otherwise halt! Then you’re guaranteed to brick the device. Instead, drop the user into some kind of failsafe mode, shell, or launcher, and direct them to keep the device powered on and seek immediate attention (e.g. on an IRC channel). If there’s a chance of saving the device, even if you have to work together with the user to develop an improvised fix, take it. He or she will be eternally grateful to you.
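In code, the idea might be sketched as below. Both `step` and `enter_failsafe` are hypothetical callbacks: the first performs the critical operation, the second is whatever failsafe mode, shell, or launcher the tool provides.

```python
def run_critical(step, enter_failsafe) -> bool:
    """Run a critical step; on failure hand control to a failsafe, never halt."""
    try:
        step()
        return True
    except Exception as exc:  # deliberately broad: any failure here is serious
        # Keep the program alive and tell the user what to do next.
        enter_failsafe(
            f"Critical step failed: {exc!r}. "
            "Keep the device powered on and ask for help."
        )
        return False
```

The important property is that no failure path ends in an exit or a hang while the device is in a half-written state.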

Sanity Check

Don’t assume anything about the user’s environment. Manufacturers often release dozens of firmware updates, and the number balloons to hundreds or even thousands if you start to consider the possible combinations of hacks that users might have already applied. Profile the system and ensure that everything is sane before you start. If you need to read any data from a user-supplied file or from the network, make sure it is exactly what you expect it to be. You can’t possibly have too many sanity checks.

Cryptographic hash algorithms (such as SHA-1) are a great tool here. Build a database of known-good firmware hashes. Include the hash of the expected result after running your program, so you can check against it before actually writing it out. If you miss an existing firmware that would’ve just worked, that only means you have to add it in and release a new version. If you don’t perform the check and that firmware turns out to be incompatible, you’ve just created a whole class of users that will be bricked by your tool. Blind patching is a recipe for disaster.
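A minimal sketch of such a check in Python follows. The `known` dictionary, which maps the hash of each supported input firmware to the hash of the expected patched output, and the `patch` callback are assumptions made for this example. SHA-256 is used here instead of the SHA-1 mentioned above; the choice of hash function is incidental to the technique.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def patch_if_known(firmware: bytes, known: dict, patch) -> bytes:
    """Patch only recognized firmware, and verify the result before returning it."""
    digest = sha256_hex(firmware)
    if digest not in known:
        # Unknown input: refuse rather than patch blindly.
        raise ValueError("unrecognized firmware; refusing to patch")
    patched = patch(firmware)
    if sha256_hex(patched) != known[digest]:
        # The patch ran but produced something unexpected; do not write it.
        raise ValueError("patched image does not match expected hash")
    return patched
```

Nothing is written to the device in this function. The caller only gets an image back once both the input and the output have matched the database.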

You should also make your application check itself, to make sure it hasn’t been corrupted (due to a bad download, bad media, or even bad memory), including any auxiliary files that it needs. Hashes also work great here. You can make this as simple or as complex as you want. You can have the executable check its readonly sections against built-in hashes in memory. Or you can just have a .txt file with hashes of all your files (including the main executable), and check them at runtime before anything else. Sometimes just packing your executable with an executable packer will give you this feature for free (but make sure the packer does, in fact, offer integrity protection).
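The simple text-file variant could be sketched like this in Python. The manifest format (one `hash  filename` pair per line, with paths relative to the manifest) and the function name `verify_manifest` are assumptions for this example, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib
from pathlib import Path


def verify_manifest(manifest_path: Path) -> list:
    """Return the names of files that are missing or do not match the manifest."""
    base = manifest_path.parent
    bad = []
    for line in manifest_path.read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.strip()
        target = base / name
        if not target.is_file():
            bad.append(name)
            continue
        actual = hashlib.sha256(target.read_bytes()).hexdigest()
        if actual != expected.lower():
            bad.append(name)
    return bad
```

Run it before anything else and refuse to continue if the returned list is not empty. Note that a manifest stored next to the files only guards against accidental corruption, which is the case discussed here.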

Protect Users From Themselves

Users will do completely stupid things. It’s not just that they will click on things without understanding what the outcome will be; if you include a big red button that says “Brick Me!”, someone will click it too. That’s why you should at least make it hard for users to destroy their system. Sure, you can just blame them for their own incompetence, but it’s worth covering for the obvious cases. If there’s an option in your hack that will undoubtedly brick a user’s system under a conceivable set of circumstances, check for them and disable the option in that case. There’s no excuse for having a button that deletes critical system firmware, even if it’s marked “delete critical system firmware”. If such a feature makes sense as part of a longer process that will again result in a functional device, automate the process. If there’s a reason why a power user or developer might have a use for such a dangerous option, hide it behind a warning or two and make getting to it more annoying. Your software should pass the cat test: if a cat walks all over the keyboard (or touch panel, or game pad, or Kinect), it shouldn’t be able to cause permanent harm to the system.

This doesn’t mean that you have to try to envision every single possible situation. Users are extremely good at creatively breaking programs. But, at least, make sure they can’t accidentally destroy their systems without putting a moderate amount of effort into it.

Back Up

You should strongly consider offering a backup option to your users, or even automatically backing up critical information. Sometimes, having a backup can mean the difference between a device that can be fixed with a reasonable amount of effort (say, a hardware flasher), and a device that is forever toast and not even the manufacturer could hope to repair. If the amount of critical information is small, it’s worth putting the effort in and making sure it is automatically backed up whenever any dangerous operation is about to happen. Test it, to make sure the correct information is saved. And don’t forget to tell your users that they should keep the backup file in a safe place!

Even if your device is generally “brick-proof”, because it has a ROM bootloader that allows flashing, backing up can still be very important. Many devices store unique per-device data alongside the firmware, and the loss of that information can cause a messy repair process involving lots of manual guesswork, or worse, a device that, though technically alive, will never work again as intended (e.g. if critical calibration data or device private keys are lost). This information is usually very small. Back it up! You never know when a silly mistake will end up scribbling all over it.
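An automatic backup of a small critical region might be sketched as follows. `read_region` is a hypothetical callback that returns the raw bytes of the region (keys, calibration data, and so on); reading it twice and re-reading the written file are illustrative safeguards, not a complete verification scheme.

```python
import hashlib
from pathlib import Path


def backup_region(read_region, dest: Path) -> str:
    """Save a critical region to dest and return its SHA-256 for later checks."""
    data = read_region()
    # A second read catches flaky reads before they end up in the backup.
    if read_region() != data:
        raise OSError("two reads of the region disagree; refusing to continue")
    dest.write_bytes(data)
    # Read the file back to confirm that what was saved is what was read.
    if dest.read_bytes() != data:
        raise OSError("backup file does not match what was read")
    return hashlib.sha256(data).hexdigest()
```

The dangerous operation should only start after this function has returned successfully.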

Test

Ideally, you’ve put enough effort into making sure your application is safe. However, the unexpected can and does happen, and sometimes you will not have the resources to perform a comprehensive enough test. So gather up a few people that you can trust and who are willing to risk it, and perform a closed test. Do not release a public beta! People are way too impatient, and public betas are essentially synonymous with a release; people will ignore any warnings attached. You’ll want trustworthy people, preferably with the technical knowledge and skill to spot a problem before it is fatal and to have a chance of being able to fix it, if it is. Look for people with hardware experience who can put together some kind of flasher (JTAG, NOR, whatever) if things go terribly wrong.

If your application errors out on some devices, but does not cause any harm, give yourself a pat on the back: congratulations, you’ve saved your tester from a brick. If it does cause harm, and you brick your tester’s device, give yourself a pat on the back: you’ve saved dozens, hundreds, or thousands of potential end-users from a brick.

You should also make sure you’ve covered all bases with your testing. If your device has gone through multiple hardware revisions, especially if those changes are at all related to the firmware of the device (e.g. different flash chip vendors, or an entirely different firmware storage device even), you should test on all of them. If you don’t know, look around and ask. There are plenty of people out there willing to provide you with PCB pictures and chip part numbers that will help you identify any important changes. If you didn’t consider an entire hardware revision and your hack doesn’t work as expected on it, you’re guaranteeing that a huge percentage of your users, potentially thousands or tens of thousands, will brick their devices.


I hope this article convinced you to be careful when you write firmware hacks for embedded devices. If you follow the guidelines in it, you’ll save money, save your users money, and build a reputation for robust and dependable hacks. These are the previously unwritten principles that Team Twiizers followed when we developed the HackMii Installer, and we haven’t heard of a single brick out of 1.2 million installs to date. Ultimately, though, whether you follow them is entirely up to you. Is it worth it? You decide.

26 Responses to “Safe Hacking”

  1. Juan Castro Says:

    That was some good stuff dude. Keep it up

  2. misters42 Says:

    Well written and informative. Thank you.

  3. cmptrblder Says:

    Good job on this, very well put together… AAA+

  4. Ricardo Says:

    Nice Post man, Why don’t you help him? Talking about Wanin, He tried something that any dev besides flukes and others want to even try. If it’s that hard why don’t you give him a hand? we are all in the same side

  5. trap15 Says:

    I’d imagine this is a response to Wanky’s magical bricking PS3 PUP? heheheh

  6. Slayer21495 Says:

    Why do I feel like this rant was targeted at one specific individual? Don’t get me wrong, I agree with everything that was said. At least when that person bricked all of those Wiis there was still a chance to recover if you had BootMii installed with a NAND backup. Releasing a custom firmware for the PS3 without some sort of recovery option is just plain foolish and irresponsible. Ideally one would build an app to redirect the flash to a USB device and test from there. That way an error that would have bricked the device would only require a simple reboot to fix.

  7. CC Says:

    The timing makes it seem like you wrote this to school Waninkoko, haha ;)

    Anyway, a great read and a thorough look at safe practice on behalf of the user base as much as for oneself, it’s a shame it’s not just the user base that will dive in head first, especially with the recent developments on the PS3 scene with some devs and users being so eager.

  8. Says:

    Excellent article, covers a lot of points that may not be obvious to everyone.

    One thing kinda bugged me:

    “While everyone releases hacks with no warranty, express or implied, that’s just to cover their ass.”
    [...]
    “If you still decide not to care for your users, make it plainly clear in the product documentation that they are entirely on their own, and that you don’t care about what happens when they run your tool, nor have you made any attempt to make it safe for everyone. They deserve to know.”
    I can imagine the response. “Well I did make it clear. See right here: no warranty, use at your own risk.” I got the idea you were trying to explain that such a warning needs to be BIG and in your face, but it sounds more like any “cover your ass” type of message would pass.

    You’re right too about giving good instructions. Even as an experienced coder/hacker I like to have precise, clear instructions, and get annoyed if following the steps exactly doesn’t produce the intended result. (Even something trivial like a menu item being in a different place than stated.) Especially when doing something like installing a hack, where doing something wrong could cause a brick – it’s good to not have to guess, and be confident that nothing will go wrong if you just follow these steps. (And then of course when the steps have just some trivial error, you destroy that confidence – who knows what more severe errors lurk?)

  9. Cameron Says:

    8===D~~(x_0)?

  10. CM Says:

    This is great information. Much of it is applicable to the original firmware authors as well. Look at how many unmodified Wii’s Nintendo bricked when they updated boot2.

  11. JM Says:

    “Or how to practice Safe Hex” – That cracked me up! As did all your rants/tweets earlier today.
    Anyway, great article, keep up the good work.

  12. BoBo Says:

    “Or how to practice Safe Hex” I see what you did there. lol

  13. mojobojo Says:

    This is great advice, just remember to always include a Readme.txt file :)

  14. adhs Says:

    nice article!

    just one tip:

    if you brick the device of one of your testers, hang your head in shame.
    pad *his* back! he may actually get another one for testing ;)

  15. Lefteris Says:

    is it possible to flash the nand externally on a bricked console, or should the user have already a backup of it before it got bricked?

  16. Dancinninja Says:

    Wow. Really insightful. I loved the “protect people from themselves” portions. It’s so true and necessary. Thank you.

  17. Simon Rogers Says:

    That was a very well written article. It’s quite clear who you are targeting, however it needs to be known, and common practices and tests are a must. People should stop trying to be the first to release stuff and instead concentrate on sticking to a project plan i.e build a nand backup and recovery tool first.

    All the best,
    Simon
    http://www.BeautyAdviceCenter.com

  18. anon Says:

    can someone do a tldr version this for me?

  19. Simon Rogers Says:

    PS3’s answer to BootMii… PS3-SAFEMODE… http://www.how-do-you.info/backup-the-ps3-nand

  20. Skizo Says:

    The reason I believe marcan will never (and rightfully) help waninkoko, is that wanin is solely devoted to piracy, which has NOTHING to do with hacking devices per se.

    Also, props for the title, it reminds me of the quote: “Hacking is like sex: you get in, you get out, and you hope you didn’t leave something behind that can be traced back to you.”

    Cheers

  21. Lefteris Says:

    well, that is a good reason. never thought of that!

  22. Cabeza de Weso Says:

    Hector,

    I wish you devs would use endian(3) functions instead of the ugly kludges in the current code.

    man 3 endian

    That is all.

  23. Matias Wilkman Says:

    Good post, but I’m still left slightly disappointed by the fact that it wasn’t about opening the big hunks of steel…

  24. joel Says:

    um lets try to hack the 3ds by making nds games in 3d

  25. Shamerman Says:

    I’m not a hacker and I don’t know anything about hacking. I’m just one of the stupid users that will click anything that says “don’t click” haha
    I stumbled upon this post through Magiclantern.fm, makers of nice firmware hacks for Canon DSLR’s.
    Just wanted to say that I really like this post. Well written!
    And a BIG thanks to all thoughtful hackers that do/try to do lots of good work for us stupid users (and of course the smart ones too. hehe).

    Thanks…

  26. bigshape Says:

    it really helps a lot not only on EOS hacking
