Difference between revisions of "Compression"

From RogueBasin
Jump to navigation Jump to search
(→‎Bitwise Storage: cleaned up typo.)
m (GeSHi-ized the code example)
Line 7: Line 7:
Here is example code for using run-length encoding (using Visual Basic.net):
Here is example code for using run-length encoding (using Visual Basic.net):


<source lang="vbnet">
     Function RLEEncode(ByVal Bytes As Byte()) As Byte()
     Function RLEEncode(ByVal Bytes As Byte()) As Byte()
         Dim PreviousByte As Byte = Bytes(0)
         Dim PreviousByte As Byte = Bytes(0)
Line 35: Line 36:
         Return OutputBytes.ToArray
         Return OutputBytes.ToArray
     End Function
     End Function
 
</source>


A more efficient way of RLE is to use some kind of token for identifying repetitive data.
A more efficient way of RLE is to use some kind of token for identifying repetitive data.

Revision as of 09:50, 17 January 2010

Roguelike developers often need to save data. Sometimes the data takes up a large amount of space and contains repetitive data. There are different methods to solve this problem. Some of them are listed below.

Run-length Encoding

A solution to that is run-length encoding. It compresses "runs" of data, such as "aaaaaabbbb". To do that, it stores a "count" byte and a "data" byte. The count byte stores the number of times the data byte is repeated. Because the count can never by 0, we can subtract one from the count byte to compress extremely long runs of data even more.

Here is example code for using run-length encoding (using Visual Basic.net):

    Function RLEEncode(ByVal Bytes As Byte()) As Byte()
        Dim PreviousByte As Byte = Bytes(0)
        Dim ByteCount As Byte = 0
        Dim OutputBytes As New System.Collections.Generic.Queue(Of Byte)
        For CurrentByte As Integer = 1 To Bytes.Length - 1
            If Bytes(CurrentByte) = PreviousByte And ByteCount < 255 Then
                ByteCount = CByte(ByteCount + 1)
            Else
                OutputBytes.Enqueue(ByteCount)
                OutputBytes.Enqueue(PreviousByte)
                PreviousByte = Bytes(CurrentByte)
                ByteCount = 0
            End If
        Next
        OutputBytes.Enqueue(ByteCount)
        OutputBytes.Enqueue(PreviousByte)
        Return OutputBytes.ToArray
    End Function
    
    Function RLEDecode(ByVal Bytes As Byte()) As Byte()
        Dim OutputBytes As New System.Collections.Generic.Queue(Of Byte)
        For CurrentByte As Integer = 0 To Bytes.Length - 1 Step 2
            For I As Integer = 0 To Bytes(CurrentByte)
                OutputBytes.Enqueue(Bytes(CurrentByte + 1))
            Next
        Next
        Return OutputBytes.ToArray
    End Function

A more efficient way of RLE is to use some kind of token for identifying repetitive data.

   abcaaacba => abc<token>a3cba

To bypass the use of a token simply double the repetetive character and add a "countbyte".

   abcaaacba => abcaa3cba

a complete delphi-class can be found here http://svn.berlios.de/wsvn/declacol/trunk/classes/class_rle.pas

Bitwise Storage

Most roguelikes also have a lot of different monsters, or other adversaries which have different states and abilities. Some can become hungry, other freeze on touch, are immune to almost everything, or are just a little scared of light.

A lot of these states are either on or off. So a NPC, is either hungry or not. Normally, you would use a Boolean for this. Sadly, in most languages a large list of booleans can be a bit costly. You could save these states as bits in a larger byte, or an integer.

For example: Consider the list of abilities, IMMUNE_INSTANT_DEATH, IMMUNE_STONING, IMMUNE_POISON, ALIVE, THIEF, AMORPHOUS, LOCKPICKER, SCARY, HORRIBLE, MADNESS_INDUCING. These could all be stored in a single integer. Just give the different abilities the value of a bit. So the first ability would get 1, the second 2, the third 4. All the way up to MAX_INT.

Ability Value Byte
IMMUNE_INSTANT_DEATH 1 0000000001
IMMUNE_STONING 2 0000000010
IMMUNE_POISON 4 0000000100
ALIVE 8 0000001000
THIEF 16 0000010000
AMORPHOUS 32 0000100000
LOCKPICKER 64 0001000000
SCARY 128 0010000000
HORRIBLE 256 0100000000
MADNESS_INDUCING 512 1000000000

So, if we would have a Kobold Thief, it would be Alive, be a Thief, and have the Lockpicker ability. 8 + 16 + 64 = 88, or 0001011000. While a Lich would have, 1 + 2 + 4 + 128 = 135, or 0010000111. As he is not alive, and is immune to almost everything, and certainly scary. And a Shoggoth would be 4 + 8 + 32 + 512 = 556, or 1000101100.

And all these different abilities would only take the space of a single integer. And there is even space to spare.

If you implement this kind of bitwise ability storage, you can use bitwise operators to efficiently get the different abilities. For more information on bitwise operators, please look at the excellent tutorial made by Stefan Bird

Words of Warning

While optimizing is a nice game to play, and can be very interesting to do. Remember that optimizing isn't something you should focus to much on. Unless your platforms include mobile phones and other small devices.

And sometimes all the invested energy doesn't really improve your roguelike. Compression for example, is a trade-off between speed and storage space. (When you compress, you trade speed and clock cycles for storage space). If you are really worried about speed and data usage, use some profilers made for this purpose.

If you start to make your code unreadable without extensive documentation just to save a few more bytes of runtime memory you have probably gone to far. (Or making a 1kB roguelike).