Towards more secure client-side data storage

Last year, I started digging into the various client-side data storage alternatives that had popped up as a result of HTML5. These included sessionStorage, localStorage, and client-side databases. Though I was glad to see movement in this direction, I was unhappy with some of the results. I started my exploration around sessionStorage, and even though it is severely limited, I thought it was very useful and nicely wrapped up some of the security issues related to storing data client-side (see the full blog post for more). I was not, and still am not, a fan of SQL on the web as the solution for structured data storage, and I’m glad to see the folks at Microsoft and Mozilla moving in another direction.

That being said, I started looking at localStorage. Truly, this is a grand idea: a persistent storage area shared by all browser windows (or tabs) and tied to a specific domain. I know there’s a lot of dislike amidst browser vendors around this feature due to the complexities of cross-process data management, but my problems with the API have to do with how little control one has over the data.

The problems

There are two major problems with the data storage mechanism in localStorage:

  1. The data is stored unencrypted on disk. That means anyone with access to the computer can potentially get access to that data.
  2. The data remains on disk until either the site removes it or until the user explicitly tells the browser to remove it. That means the data may remain on disk permanently otherwise.

These are problems because they both increase the likelihood that the data can be examined by those for whom it is not intended.

Suppose I’m running one of the major webmail clients and would like to improve the site’s performance by storing information about the customer emails in localStorage. That way, you can speed up the site’s startup time and only download the new email information. (By the way, this is a really bad idea, please don’t do this.) Now suppose you log off and close the browser. Your email data is still saved on disk because the webmail client didn’t delete it when you left. Not a big deal if it’s your personal laptop; huge deal if it’s a computer in a cybercafe. Imagine if, in that cybercafe, twenty other people end up using the same computer to access the same webmail client and all of their data ends up stored on disk when they leave. Big problem.

You may be asking yourself, “wouldn’t encrypting the data solve that problem?” Yes and no. You could suggest that localStorage always encrypt data when it writes to disk, but then it would end up using a standard encryption algorithm and a standard key. While this would provide a bit of a moat around the data, it would also be easy to figure out the browser’s choice of cipher and key, forcing browser vendors either to be incredibly clever in how they encrypt data to disk or to change the data storage method frequently. (Imagine if someone figured it out and posted the details on the web: there would have to be a mad rush to update the affected browser to ensure secure data.)

Don’t get me wrong, for publicly available data, there’s no reason not to use localStorage. But for anything even remotely personal to the user, you’re placing personal data into an area that is too easily accessed.
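For sites that already keep per-user data in localStorage, the least they can do is delete it when the user logs off, which the webmail example above never did. The following is a minimal sketch of that cleanup, assuming the application writes all of its keys under a common prefix. The function name clearOnLogout and the "mail:" prefix are hypothetical, and nothing here helps when the user simply closes the window without logging off.

```javascript
// Remove every entry whose key starts with the given prefix and
// return how many entries were removed.
// `storage` is any object with the Storage interface (length, key(),
// removeItem()), such as window.localStorage in a browser.
function clearOnLogout(storage, prefix) {
    var doomed = [];

    // Collect the matching keys first: removing items while iterating
    // would shift the numeric positions returned by key().
    for (var i = 0; i < storage.length; i++) {
        var name = storage.key(i);
        if (name !== null && name.indexOf(prefix) === 0) {
            doomed.push(name);
        }
    }

    doomed.forEach(function (name) {
        storage.removeItem(name);
    });

    return doomed.length;
}

// In a browser, a logout handler might call:
//   clearOnLogout(window.localStorage, "mail:");
```

This only narrows the window in which the data sits on disk; it does not address the data being stored unencrypted in the meantime.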

The solution

I don’t believe that there’s a clear path forward for localStorage to make it more secure. It’s out there, people are using it, and changing the API now would be a huge problem. When I brought these issues up at the Mozilla Summit on data storage, what I heard most frequently was, “if you can think of some way to solve this, write it up and we’ll talk.” And so I sat down and wrote a proposal for secure key-value storage in browsers called SecureStore.

The proposal is based on a few simple concepts that are shared amongst security-conscious companies:

  1. User data should not be stored on disk unencrypted.
  2. Even when user data is stored encrypted, the company must control the encryption algorithm and key.
  3. User data, even when encrypted, should not persist on disk forever.

These rules have traditionally applied to servers and server-side caches, but it seems logical enough to extend them to client-side data storage in browsers.

I tried to keep most of the API similar to the already existing client-side data storage APIs so as to not introduce something totally different. One big difference, though, is the way in which you access a storage object. To do so, you must call the openSecureStorage() method and pass in a name for the storage area, an encryption cipher, a base64-encoded key, and a callback function that will receive the storage object:

window.openSecureStorage("mystorage", window.AES_128, key, function(storage){
   //use storage object
});

This code will do one of two things. If the storage area named “mystorage” doesn’t exist, it will be created and the given cipher and key will be used whenever data is written to it. An empty SecureStorage object is then passed into the callback function. If the storage area does exist, then it is opened, the contents decrypted, and the data is made available on the SecureStorage object. Note that the storage areas are tied to a domain, and there is no limit on the number of storage areas for a particular domain (only a limit on the total amount of space a domain can use).

Once you have a SecureStorage object, you can use the length property to determine how many key-value pairs are available, and all of the standard storage methods are also there:

  • getItem(key) – retrieves the value for the given key or null if the key doesn’t exist.
  • setItem(key, value) – sets the value for the given key.
  • removeItem(key) – removes the key completely.
  • key(position) – returns the key for the value in the given numeric position.
  • clear() – removes all key-value pairs.

Note that you must use getItem(), setItem(), and removeItem() for manipulating keys; keys don’t automatically become properties on a SecureStorage object. Other than that difference, you use a SecureStorage object the same as you would sessionStorage or localStorage. Also, both the keys and the values are encrypted on disk.
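Because keys aren’t exposed as properties, enumerating the contents means walking the numeric positions with length and key(). The sketch below shows that pattern in a helper called dumpStorage (a hypothetical name, not part of the proposal). It relies only on the methods listed above, so the same code would work with sessionStorage or localStorage today.

```javascript
// Collect every key-value pair from a Storage-like object using only
// length, key(), and getItem(). Works with any object exposing those
// members; the SecureStorage object itself is still only a proposal.
function dumpStorage(storage) {
    var result = {};

    for (var i = 0; i < storage.length; i++) {
        var name = storage.key(i);
        result[name] = storage.getItem(name);
    }

    return result;
}

// In a browser, for example:
//   var contents = dumpStorage(window.sessionStorage);
```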

An additional method called setExpiration() is present on the SecureStorage object as well. This method allows you to pass in a Date object indicating when the data should be deleted. For example:

window.openSecureStorage("mystorage", window.AES_128, key, function(storage){

    storage.setItem("username", "Nicholas");
    storage.setItem("super_secret_value", "unicorn");

    //set expiration for a year from now
    var expires = new Date();
    expires.setFullYear(expires.getFullYear() + 1);

    storage.setExpiration(expires);
});

You can set the expiration date any number of times to extend the life of the data.
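For comparison, getting similar behavior out of an existing Storage object means doing the bookkeeping by hand. The following is a rough sketch under stated assumptions: the expiration time is kept in the storage itself under a reserved key (the name "__expires__" is made up for this example), and the page calls purgeIfExpired() on load. Both function names are illustrative only.

```javascript
// Reserved key that holds the expiration timestamp (hypothetical name).
var EXPIRES_KEY = "__expires__";

// Record when the stored data should be considered expired.
// Calling this again simply overwrites the previous date.
function setExpiration(storage, date) {
    storage.setItem(EXPIRES_KEY, String(date.getTime()));
}

// Wipe the storage if its expiration date has passed.
// `now` is passed in so the check is easy to reason about and test.
// Returns true when the data was removed.
function purgeIfExpired(storage, now) {
    var stored = storage.getItem(EXPIRES_KEY);

    if (stored !== null && Number(stored) <= now.getTime()) {
        // Note: on localStorage this clears everything for the domain,
        // since there is only one storage area per domain.
        storage.clear();
        return true;
    }

    return false;
}

// In a browser, for example:
//   purgeIfExpired(window.localStorage, new Date());
```

Unlike a native expiration, this only takes effect the next time the site’s script runs; until then the data stays on disk.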

The API is purposely made a bit generic, so that it’s possible to add additional encryption ciphers easily and to allow the developer to control from where the encryption key is generated. This may be done by the server in some cases, or potentially from some as-yet-undefined API that browser vendors will create in the future. The point is to allow easy extension as web technology continues to evolve.

Why?

One of the most frequent questions I get about this proposal is whether it would be better to create a general JavaScript crypto API that could be used in conjunction with localStorage rather than creating an entirely new data storage solution. First, I’ll say that I think a native JavaScript crypto API would be great and I’m all for it. What I’m looking to avoid, however, is needing to write code like this:

//write name and value so they're both encrypted
localStorage.setItem(AES.encrypt("username", key), AES.encrypt("Nicholas", key));

//retrieve the encrypted username
var username = AES.decrypt(localStorage.getItem(AES.encrypt("username", key)), key);

I’m not sure if this looks as messy to you as it does to me, but it seems like this is a common enough pattern that having a native implementation that prevents us from writing such horrid code is a good idea.
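Short of a native API, the repetition can at least be hidden behind a small wrapper. The sketch below assumes a cipher object with encrypt(text, key) and decrypt(text, key) methods, shaped like the hypothetical AES object in the snippet above; no such object is built into browsers. It also assumes the encryption is deterministic, since the encrypted key name is what gets looked up. The name wrapStorage is illustrative.

```javascript
// Wrap a Storage-like object so that key names and values are encrypted
// on the way in and decrypted on the way out.
// `cipher` is an assumed object with encrypt(text, key) and
// decrypt(text, key) methods; it is not a built-in browser API.
function wrapStorage(storage, cipher, key) {
    return {
        setItem: function (name, value) {
            storage.setItem(
                cipher.encrypt(name, key),
                cipher.encrypt(String(value), key)
            );
        },
        getItem: function (name) {
            var stored = storage.getItem(cipher.encrypt(name, key));
            // A missing entry comes back as null; don't try to decrypt it.
            return stored === null ? null : cipher.decrypt(stored, key);
        },
        removeItem: function (name) {
            storage.removeItem(cipher.encrypt(name, key));
        }
    };
}

// In a browser, given some cipher object:
//   var secure = wrapStorage(window.localStorage, cipher, key);
//   secure.setItem("username", "Nicholas");
```

This tidies up the call sites, but the cipher still runs in script and the data still lives in the single, non-expiring localStorage area.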

Let’s make this real

There are a lot more details in the full proposal, but I wanted to give some highlights in this post. I’ve received favorable feedback from at least one browser vendor on this proposal, and now I need help to make this real. What I really need is more feedback from people. I’ve already picked the brains of my coworkers, and now I’d like to open it up to the public. What I’m interested in:

  • Implementers: is there anything about this API that makes it too difficult to implement?
  • Web developers: Do you have a use case that this would address?
  • Web developers: Is there anything you’d change about the API?
  • Everyone: Anything else?

If you’re a contributor to an open source browser, I’m also looking for someone that’s interested in prototyping this API for use in WebKit and/or Gecko. Feel free to contact me if you’re interested or have other feedback that you don’t want to post publicly.

Comments

  1. stoimen

    Hi,

    it's really important to talk about security and localStorage. Recently I wrote a localStorage wrapper/plugin for jQuery with no encryption at all. However, you're completely right: localStorage is very useful for storing non-personal data. On the other hand, this HTML5 feature is used worldwide every day, and it's good to think about security! Very good article!

  2. Jesse Pate

    Interesting that you should write another article on client side storage, as I have just been doing a lot of work on a storage API of my own -- I'm sure there are some well written interfaces out there, but I have yet to find one that really impresses me. Most I've seen have either been buggy or are very inflexible in terms of which storage engine you can access (localStorage is available? You get to use that and nothing else).

    The general concept for my API, aside from providing a standard interface to localStorage, sessionStorage, userData, and cookies (for those who really want to use them as a fallback), is that developers should be able to create an instance of the interface that gives them access to any combination of engines in any order of preference. Internally, the system will drop references to any engines the browser doesn't support, and on write requests will put your key into the engine that you gave highest precedence to. If that engine is out of space, it will re-arrange the data already stored based on expiration dates (set by the dev) and relative priority of the keys (also set by the dev). Basically, if you don't have any room left in local storage, it will move the least important/closest to expiration key to session storage. In IE8, if that fails, it will move stuff over to userData. It keeps track of everything in a meta entry that it stores in the best persistent engine available to the browser (local > userdata > session). Read requests will look up where the key lives and then fetch it for you.

    I think you're definitely right that client side storage should use some form of encryption and this is something I'll probably add to my API. I would love to hear any other thoughts you may have on what could prove useful to developers in a system like this.

    As far as things I would find useful, IE8's remainingSpace property is very handy. Especially since browsers throw exceptions when they cannot write to the Storage Object. How one would reliably be able to implement it in browsers that don't have native support for it I haven't a clue, unfortunately.

  3. Andrey

    Why not just implement it in JavaScript?
    Something like this:
    window.openSecureStorage = function(storageName, cipher, secretKey, callback) {
        function crypt(secretKey, value) {
            // Implementation of cipher goes here and produces encryptedValue
            return encryptedValue;
        }
        function decrypt(secretKey, encryptedValue) {
            // Implementation of cipher goes here and produces value
            return value;
        }
        callback({
            setItem: function(key, value) {
                key = crypt(secretKey, key);
                value = crypt(secretKey, value);
                // setItem() is required: localStorage.key = value would
                // write to the literal property name "key"
                localStorage.setItem(key, value);
            },
            getItem: function(key) {
                key = crypt(secretKey, key);
                var value = localStorage.getItem(key);
                // a missing key yields null, which should not be decrypted
                return value === null ? null : decrypt(secretKey, value);
            }
        });
    };

    My point is: why do we need browser vendors to implement a new API if we can implement it in pure JavaScript?

  4. Nicholas C. Zakas

    @Jesse - sounds like you're working on something very similar to the YUI Storage utility (http://developer.yahoo.com/...). Have you looked at that?

    @Andrey - there are a few problems with implementing on top of localStorage. First is that the read/write operations are synchronous, which is problematic when you're dealing with multiple tabs. Second, this is one storage location for an entire domain, whereas my proposal offers multiple named storage locations for each domain, allowing you to specify an expiration for a particular area while leaving others untouched. You also don't want to implement the encrypt/decrypt functions in pure JavaScript as 1) you're leaving them open to snooping and 2) most good encryption algorithms would take a non-trivial amount of time to complete when implemented in pure JavaScript. To make this practical, you definitely need native encryption/decryption functionality.

  5. HRJ

    @Jesse
    There is an opensource library called persist.js which abstracts out the various client-side storage mechanisms. We at tDash.org have been using and extending it for our use.

    You can find our fork here:
    http://github.com/mayanks/p...

  6. Marcel Duran

    Life with unencrypted localStorage: http://www.cbsnews.com/stor...

  7. Kevin Decker

    Not being a security expert this may be an obvious answer, but I'm curious as to what methods can be used to protect the secret key as view source could circumvent this encryption if done incorrectly.

    Would the key have to be generated per-user using a NONCE system or similar system? Are there other methods?

  8. Nicholas C. Zakas

    @Kevin - I'm not concerned with securing the key at this point. I purposely left that out of the proposal as I think there will likely be several different solutions forthcoming. The easiest is to request the key via Ajax from the server via an uncached response, so it's not in the page at all.

  9. Kevin Decker

    Re: Spec and multiple solutions, makes sense.

    It seems like it would be a good idea to document somewhere that the developer is not safe unless they use this API in a safe manner, but this can be done through impl docs/tutorials/blogs/other associated documentation. I say this only because my initial reaction was to create something similar to a site-wide private key, which will only provide a false sense of security.

    Awesome work BTW :)

  10. Robin

    "That way, you can speed up the site’s startup time and only download the new email information. (By the way, this is a really bad idea, please don’t do this.)"

    Assuming everything stored in localStorage was encrypted with a key generated by the server, would you be able to elaborate on why this is a bad idea?

    Thanks!
