You need to use bitwise operations. Note that the dictionary will be fairly limited if you do not allow for characters other than A to Z. You can use a mask to specify which character in the integer you are referring to.
For example, these are the expressions that extract each character from its bit range:
Code:
first character -> ((integer & (0x1f << 0)) >> 0) + 'A'
second character -> ((integer & (0x1f << 5)) >> 5) + 'A'
third character -> ((integer & (0x1f << 10)) >> 10) + 'A'
fourth character -> ((integer & (0x1f << 15)) >> 15) + 'A'
fifth character -> ((integer & (0x1f << 20)) >> 20) + 'A'
sixth character -> ((integer & (0x1f << 25)) >> 25) + 'A'
I am sure you can notice the relation between the shift constant and the character's position: each character occupies five bits, so the shift is five times the zero-based character index. Basically, we are just masking out the rest of the integer, moving the field down to the 5 least significant bits, then scaling it back into the character range. Please note that you may wish to use some of your 6 spare encoding values (5 bits give 32 values, and A to Z only needs 26) to represent other things, such as an end-of-string marker that tells the parser to stop after this character, for example. I'll leave that as an exercise to you. It will not make much sense unless you check the string first to be sure it does not contain characters outside of [A-Z].

Handling words longer than six characters is trivial: you can chain multiple integers so that you encode 6 characters at a time. Since six characters only use 30 of the 32 bits, you can use the two most significant bits to help with indications like this. For example, setting 0x80000000 could signify whether the string is complete or partial. The constraint is that you have to use length-managed buffers instead of null-terminated strings, but that is not difficult, especially if you use those 2 remaining bits well.
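To make that concrete, here is a minimal C sketch of one way to put the pieces together. The names pack6, unpack6, CH_END, and PARTIAL_FLAG are my own, and I have arbitrarily picked spare value 26 as the end-of-string marker and the top bit as the complete/partial flag; adjust to taste.
Code:
#include <stdint.h>
#include <stdio.h>

/* Spare value 26 marks end-of-string; 0..25 map to 'A'..'Z'. */
#define CH_END       26u
/* Top bit signals "more characters follow in the next integer". */
#define PARTIAL_FLAG 0x80000000u

/* Pack up to six characters of s (assumed to be in [A-Z]) into one
 * 32-bit word, five bits per character. Sets PARTIAL_FLAG if more
 * characters remain beyond the first six. */
static uint32_t pack6(const char *s, size_t len)
{
    uint32_t word = 0;
    for (int i = 0; i < 6; i++) {
        uint32_t v = ((size_t)i < len) ? (uint32_t)(s[i] - 'A') : CH_END;
        word |= v << (5 * i);
    }
    if (len > 6)
        word |= PARTIAL_FLAG;
    return word;
}

/* Unpack one word into out (at least 7 bytes). Returns the number of
 * characters written; *partial is set if the flag bit was on. */
static int unpack6(uint32_t word, char *out, int *partial)
{
    int n = 0;
    for (int i = 0; i < 6; i++) {
        uint32_t v = (word >> (5 * i)) & 0x1f;
        if (v == CH_END)
            break;
        out[n++] = (char)('A' + v);
    }
    out[n] = '\0';
    *partial = (word & PARTIAL_FLAG) != 0;
    return n;
}

int main(void)
{
    char buf[7];
    int partial;
    uint32_t w = pack6("HELLO", 5);
    int n = unpack6(w, buf, &partial);
    printf("0x%08X -> %s (%d chars, partial=%d)\n",
           (unsigned)w, buf, n, partial);
    return 0;
}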
Anyway, I think that works. I'd check it on paper or something first. Have fun.