Pack and unpack bytes to strings

  • I need to write a function that "packs" an array of bytes (integers between 0 and 255) into a string. I also need to be able to perform the reverse operation, to get my byte array from the string that it was packed into. This needs to be done as fast as possible. Seeing as JavaScript has 16-bit strings, I packed two bytes per character. Here is my code and tests:



    function pack(bytes) {
        var str = "";
        for(var i = 0; i < bytes.length; i += 2) {
            var char = bytes[i] << 8;
            if (bytes[i + 1])
                char |= bytes[i + 1];
            str += String.fromCharCode(char);
        }
        return str;
    }

    function unpack(str) {
        var bytes = [];
        for(var i = 0; i < str.length; i++) {
            var char = str.charCodeAt(i);
            bytes.push(char >>> 8);
            bytes.push(char & 0xFF);
        }
        return bytes;
    }

    var tests = [
        [],
        [126, 0],
        [0, 65],
        [12, 34, 56],
        [0, 50, 100, 150, 200, 250]
    ];

    console.log("starting tests");
    tests.forEach(function(v) {
        var p = pack(v);
        console.log(v, p, unpack(p));
    });


    And the output to that is:



    starting tests
    [] "" []
    [126, 0] "縀" [126, 0]
    [0, 65] "A" [0, 65]
    [12, 34, 56] "ఢ㠀" [12, 34, 56, 0]
    [0, 50, 100, 150, 200, 250] "2撖죺" [0, 50, 100, 150, 200, 250]


    I have a few things I'd like feedback on:




    1. This was the first time I used bitwise operators. Is this how it should be done?

    2. Are there any speed improvements that could be made?

    3. Can you guys think of any way to discard that last 0 byte when encoding then decoding an array with an odd number of bytes? (see test #4)


    Regarding 3., just add a sentinel to the end of your packed array indicating whether the final byte should be included or not.

    Good idea. I was gonna have 2 bytes at the beginning indicating the length, but that would have limited the length to 65535. Using your method I can have more. Not that I need it, but it's always nice to break barriers.
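
    A minimal sketch of the sentinel idea from the comment above, assuming the flag is stored as one extra trailing character (1 when the input length was odd, 0 otherwise). The names `packWithFlag` and `unpackWithFlag` are hypothetical and do not appear in the original code:

```javascript
// Hypothetical helpers illustrating the sentinel suggestion.
// Assumption: one extra trailing character records whether the input length was odd.
function packWithFlag(bytes) {
    var str = "";
    for (var i = 0; i < bytes.length; i += 2) {
        // For an odd-length input the last bytes[i + 1] is undefined,
        // and (undefined & 0xFF) evaluates to 0, so the low byte is padded with 0.
        str += String.fromCharCode(((bytes[i] & 0xFF) << 8) | (bytes[i + 1] & 0xFF));
    }
    // Sentinel: char code 1 if the final low byte is padding, 0 otherwise.
    return str + String.fromCharCode(bytes.length % 2);
}

function unpackWithFlag(str) {
    var bytes = [];
    var last = str.length - 1; // index of the sentinel character
    for (var i = 0; i < last; i++) {
        var code = str.charCodeAt(i);
        bytes.push(code >>> 8, code & 0xFF);
    }
    // Drop the padding byte when the sentinel says the input length was odd.
    if (str.charCodeAt(last) === 1) {
        bytes.pop();
    }
    return bytes;
}

console.log(unpackWithFlag(packWithFlag([12, 34, 56]))); // [12, 34, 56]
```

    This costs one extra character per string no matter how long the input is, so it does not cap the length the way a 2-byte length prefix would.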

    Why on earth would you want to encode bytes on the client side? Seems like it should be something to do on the server.

    Who said that was client side?

    I have used this code, and there seems to be a problem when passing 0 values to it: they get converted to '' and omitted, making it unreliable for packing 32-bit floats into 4 chars. Is there any solution?

    @user1711738, this is the defined behavior and is not a problem. 0 gets converted to the `null` character, which most of the time does not have a visible representation. However, rest assured that something is there, and that you will be able to convert it back (just try `unpack(pack([0, 1, 2]))`).
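
    A small check of the point made in the reply above: a packed pair of zero bytes is a real character (U+0000), not an empty string. This sketch calls `String.fromCharCode` directly instead of `pack` so that it runs on its own:

```javascript
// Two zero bytes pack into the single character U+0000.
var packed = String.fromCharCode((0 << 8) | 0);

console.log(packed.length);        // 1: the character exists even though nothing visible is drawn
console.log(packed.charCodeAt(0)); // 0
console.log(packed === "");        // false
```

    If zeros really do disappear in practice, one possible cause (an assumption, not something established in this thread) is a later step, such as a C-based API or a text field, that treats U+0000 as a terminator or strips it.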

    Client-side byte encoding can be powerful, but I make sure it's usable on the server too. But how do you pack/parse bytes using bitwise ops?

    your code is beautiful :')

  • function pack(bytes) {
        var str = "";
        // You could make it faster by reading bytes.length once.
        for(var i = 0; i < bytes.length; i += 2) {
            // If you're using signed bytes, you probably need to mask here.
            var char = bytes[i] << 8;
            // (undefined | 0) === 0 so you can save a test here by doing
            // var char = (bytes[i] << 8) | (bytes[i + 1] & 0xff);
            if (bytes[i + 1])
                char |= bytes[i + 1];
            // Instead of using string += you could push char onto an array
            // and take advantage of the fact that String.fromCharCode can
            // take any number of arguments to do
            // String.fromCharCode.apply(null, chars);
            str += String.fromCharCode(char);
        }
        return str;
    }

    function unpack(str) {
        var bytes = [];
        for(var i = 0; i < str.length; i++) {
            var char = str.charCodeAt(i);
            // You can combine both these calls into one,
            // bytes.push(char >>> 8, char & 0xff);
            bytes.push(char >>> 8);
            bytes.push(char & 0xFF);
        }
        return bytes;
    }


    So, to put it all together:



    function pack(bytes) {
        var chars = [];
        for(var i = 0, n = bytes.length; i < n;) {
            chars.push(((bytes[i++] & 0xff) << 8) | (bytes[i++] & 0xff));
        }
        return String.fromCharCode.apply(null, chars);
    }

    function unpack(str) {
        var bytes = [];
        for(var i = 0, n = str.length; i < n; i++) {
            var char = str.charCodeAt(i);
            bytes.push(char >>> 8, char & 0xFF);
        }
        return bytes;
    }

    Note that `String.fromCharCode.apply(null, chars);` will throw a "RangeError: Maximum call stack size exceeded" exception in browsers using JavaScriptCore (i.e. Safari) if `chars` has a length greater than 65536, and will lock up completely on arrays slightly smaller than that. A complete solution would either push the string directly into the array or use a chunked version of `fromCharCode`. I suspect the former is faster in Safari. See https://bugs.webkit.org/show_bug.cgi?id=80797
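
    One way the chunked variant mentioned above might look. This is a sketch, not a tested fix for the WebKit bug: the name `packChunked` is hypothetical, and the chunk size of 8192 is an assumed conservative value rather than a figure taken from the bug report.

```javascript
// Hypothetical chunked pack: never passes more than CHUNK arguments
// to String.fromCharCode in a single call.
function packChunked(bytes) {
    var CHUNK = 8192; // assumed safe argument count, chosen well below 65536
    var parts = [];   // one string per completed chunk
    var chars = [];   // char codes collected for the current chunk
    for (var i = 0, n = bytes.length; i < n; i += 2) {
        // (undefined & 0xFF) is 0, so an odd trailing byte is padded with 0.
        chars.push(((bytes[i] & 0xFF) << 8) | (bytes[i + 1] & 0xFF));
        if (chars.length === CHUNK) {
            parts.push(String.fromCharCode.apply(null, chars));
            chars = [];
        }
    }
    // Flush whatever is left in the final, partially filled chunk.
    if (chars.length > 0) {
        parts.push(String.fromCharCode.apply(null, chars));
    }
    return parts.join("");
}
```

    The output should match the unchunked `pack` for any input; only the number of arguments per `fromCharCode` call changes.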

License under CC-BY-SA with attribution


Content dated before 7/24/2021 11:53 AM