Small talk about modern video game compression

The reason for this blog post is, I should say, very naive: I was pretty angry, to be honest, to see people on video game enthusiast forums (NeoGAF, for example, and some domestic forums) being, repeatedly over the course of multiple years, so ignorant about compression and game sizes. They would routinely bring up Nintendo and the size of their games (which usually are smaller, barring some notable exceptions like the Xenoblade Chronicles series) and then criticize third-party developers for ‘not compressing’ their games.

I do assume some of them are just Nintendo fanboys, and I have seen Nintendo fanboys who are really fanatical. But that’s beside the point. The point is that this is a good opportunity to finally talk about something I am very passionate about. Although my recent work has mostly been sidetracked into rendering, my real experience lies in low-level grunt work, the kind users usually won’t notice but will surely complain about if it fails: save/load, loading time optimizations, CPU optimizations, file size optimizations, and so on. So I am pretty sure I have some relevant experience on the topic at hand.

This is a big topic, so I don’t expect to explain everything in detail here. But it can also be discussed in fairly non-technical terms, since every computer user relies on compression software and/or algorithms in one way or another.


As a matter of fact, all modern games use compression in one way or another. Still, a game’s design choices and content types largely determine its size.

The problem is that AAA games usually have a lot of assets, and I really mean it when I say a lot. Hundreds of maps or a massive open world, hundreds to thousands of PCs/NPCs, voice-over for the entire main cast, an hour or two of high-quality FMVs. These are all compressed, but sometimes even more aggressive compression is needed, at the cost of reducing asset quality. I read somewhere that Nintendo still uses 32kHz audio in some strategic places. Compression is therefore always a tug-of-war between content creators, who want to preserve as much of the quality of their original creation as possible, and programmers, who just want to fit the game onto the storage medium.

Texture compression

This has been a hot research area for video gaming ever since S3 (RIP, one of my favorites) created S3TC, which is a kind of Block Truncation Coding. Most texture compression follows similar methodologies, and each generation delivers better results at the same or even smaller size. So we have the following families of encoding methods:

DXTC (successor of S3TC): This is the standard used mostly in PC and console gaming (since the 7th generation). It has 7 variants, BC1 through BC7, each with its specific use (or now obsolete). Nowadays developers mostly use BC5 for normal maps and BC7 for other textures. Normal maps are unique in that

a) normal maps usually tolerate even fewer compression artifacts, and

b) only two channels are necessary for them; the third channel can be reconstructed on the fly.
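That on-the-fly reconstruction is just Pythagoras: since a normal is a unit vector, the third component follows from the other two. A minimal sketch in Python (assuming the two stored channels are already decoded into [-1, 1]):

```python
import math

def reconstruct_z(x, y):
    """Rebuild the third component of a unit normal from the two stored
    channels; clamp below zero to guard against rounding error."""
    return math.sqrt(max(0.0, 1.0 - x * x - y * y))

# A normal pointing straight out of the surface stores (0, 0):
print(reconstruct_z(0.0, 0.0))  # 1.0
```

(Tangent-space normal maps can get away with this because the z component always points away from the surface, so its sign is known.)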

The compression rates are roughly as follows.

BC1: 3 x 8bit -> 4bit

BC3: 4 x 8bit -> 8bit

BC5: 2 x 16bit -> 8bit

BC7: 4 x 8bit -> 8bit (but with much better results than BC3; BC3 suffers badly from block artifacts, so its modern-day use is limited.)
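To put these rates in perspective, here is a quick back-of-the-envelope calculation for a 1024x1024 texture (plain Python; sizes exclude mipmaps):

```python
def texture_size_bytes(width, height, bits_per_pixel):
    """Size of a width x height texture at a given bits-per-pixel rate."""
    return width * height * bits_per_pixel // 8

# Uncompressed RGBA8 is 32 bits per pixel.
uncompressed = texture_size_bytes(1024, 1024, 32)  # 4 MiB
bc1 = texture_size_bytes(1024, 1024, 4)            # 512 KiB
bc7 = texture_size_bytes(1024, 1024, 8)            # 1 MiB

print(uncompressed // 1024, bc1 // 1024, bc7 // 1024)  # 4096 512 1024 (KiB)
```

So even the least aggressive BC format cuts a texture to a quarter of its raw size before any generic compression is applied on top.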

PVRTC: This is the format used on PowerVR GPUs, which became the de facto standard on iOS devices. It was the first format to offer much more aggressive compression levels: you can compress your textures down to 1-2 bits per pixel. At that level image quality is greatly sacrificed, but the trade-off is understandable because on mobile devices the quality loss is less visible.

ETC: This is the format we used the most, because every Android device supports it (at least the ETC1 variant), making it the de facto standard on Android and WebGL. The compression rate is the same as BC3, but ETC1 has no alpha support at all. We usually split a 32-bit texture into two (RGB and alpha) and sample them separately.
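The split itself is trivial; a sketch in plain Python, with a small hypothetical pixel list standing in for real texture data:

```python
def split_rgba(pixels):
    """Split a list of (r, g, b, a) pixels into an RGB texture and a
    single-channel alpha texture, so each can be ETC1-compressed separately."""
    rgb = [(r, g, b) for r, g, b, a in pixels]
    alpha = [a for r, g, b, a in pixels]
    return rgb, alpha

pixels = [(255, 0, 0, 128), (0, 255, 0, 255)]
rgb, alpha = split_rgba(pixels)
print(rgb, alpha)
```

At runtime the shader then samples both textures and recombines them into one RGBA value.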

ASTC: This is an advanced texture format supported on Intel and Mali GPUs. It is also worth mentioning here because it is supported by Tegra, and consequently by the Nintendo Switch. Switch games can and should benefit from it, seeing that a 5-6 bits-per-pixel rate can usually match or even exceed the quality of BC3.

Video/Audio compression

This part of video gaming is pretty straightforward, as most manufacturers use industry standards like H.264 or AAC to encode their video and audio. There are far too many articles on this already, probably more detailed than I could be. It is not like the olden days on the Nintendo 64, where there was no video decoding hardware and you had to write custom algorithms to encode your videos (à la Resident Evil 2).

Generic compression

Usually, console game data nowadays is compressed like a custom-built zip file, on top of other forms of compression. You can have your texture block-compressed, and then compress it again with zlib or lz4. You can still net a 3:1 ratio on BC3 textures, for example, though the result varies across different kinds of data.

Some hardware even supports compressed file systems and uses dedicated hardware to decode the data. This ensures maximum speed and a lean file size at the same time.

Compression algorithms in this area are usually your zlib (deflate) or LZMA. Zlib is more widely used, seeing that LZMA’s memory usage and decode time grow steeply with compression level, making it ill-suited for gaming. There are also variants like lz4 that trade a marginally worse compression ratio for very high decode speed, as well as commercial middleware like Oodle from RAD Game Tools.
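For a rough feel of the trade-off, here is a quick comparison using Python’s stdlib zlib and lzma modules (actual ratios and speeds depend heavily on the data; lz4 is omitted since it is not in the stdlib):

```python
import lzma
import zlib

# A repetitive payload standing in for serialized game data.
data = b"position=0.0,0.0,0.0;rotation=0.0,0.0,0.0,1.0;" * 1000

z = zlib.compress(data, level=9)
x = lzma.compress(data, preset=9)

# Both round-trip losslessly; LZMA usually compresses tighter,
# but its decoder needs more memory and time.
assert zlib.decompress(z) == data
assert lzma.decompress(x) == data
print(len(data), len(z), len(x))
```

For a game that decompresses assets during loading screens (or while streaming an open world), decode speed often matters more than the last few percent of ratio, which is exactly the niche lz4 and Oodle target.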

Other general and specific tricks

You may ask why I still talk about tricks when the compression methods above already exist. Well, most of the time it’s just removing stuff you don’t need, but this is where individual developers and their experience shine. Here are some examples I know of:

  1. When porting games, it is sometimes difficult to alter the content structure in a packed file, even when you want to delete some of its data, because the file is converted to a C++ structure on the fly and in place, and that structure is complex and was written by predecessors who left no documentation at all. But you can still remove the data by filling its space with zeros; zlib/lz4/etc. will pick that up and compress it away.
  2. Pack-file-level redundancy checking: it automatically removes duplicate data without any human input, saving development time.
  3. Use integer polar coordinates so that the data type is a better fit for compression.
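The first trick can be demonstrated in a few lines, with Python’s zlib standing in for whatever codec the engine actually uses: zeroed-out regions cost almost nothing after compression, even though the file layout is untouched.

```python
import os
import zlib

payload = os.urandom(64 * 1024)  # incompressible stand-in for asset data
packed = bytearray(payload)

# "Delete" the middle half of the asset by zero-filling it in place;
# every offset inside the pack file stays valid.
start, end = len(packed) // 4, 3 * len(packed) // 4
packed[start:end] = bytes(end - start)

before = len(zlib.compress(payload))
after = len(zlib.compress(bytes(packed)))
print(before, after)  # the zero-filled pack compresses much smaller
```

The file is logically the same size, but the distributed (compressed) size shrinks, which is what actually matters for download and disc budgets.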

Of course, the conflict here is the usual ‘dev time vs. result’, weighed against the need to compress. If there is a hard limit on your platform that you will soon reach, then you have to spend the time.


How PCSX2 poisons framerate discussion

This is a collection of my tweets about PCSX2 FPS counter and its effect on framerate discussion. The title is obviously hyperbolic.

A lot of people mistake PCSX2’s FPS counter for showing the FPS of the game, which is understandable but can be highly misleading.

A lot of PS2 games run at 30fps or even lower, but they still show as 60fps in PCSX2. Why? Because the FPS counter actually shows the refresh rate it tries to emulate. For a 30fps game, the frame buffer updates every 2 V-Sync periods. PCSX2 emulates this by retaining the frame buffer from the last frame if it hasn’t been updated, but still presenting it to the device.
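The counter behavior can be sketched with a toy simulation (plain Python; a simplified model, not PCSX2’s actual code):

```python
def simulate(vsyncs, game_updates_every_n_vsyncs):
    """Count presents (what the FPS counter shows) versus unique frames
    (what the game actually renders)."""
    presents = 0
    unique_frames = 0
    for vsync in range(vsyncs):
        if vsync % game_updates_every_n_vsyncs == 0:
            unique_frames += 1  # the game rendered a new frame
        presents += 1           # the retained buffer is presented regardless
    return presents, unique_frames

# One second at 60 Hz for a game that updates every 2nd vsync:
print(simulate(60, 2))  # (60, 30): the counter says 60, the game runs at 30
```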

I found this problem myself when trying to use Nsight to see how a frame is rendered; sometimes it is just a simple copy. I understand that using V-Sync to keep time is the most accurate approach (we have tried other timing methods for frame updates on PC, and none of them are very satisfactory; invariably they introduce frame pacing problems), and the mechanism is best for an emulator where the FPS needs to closely match the original, which any PS2 emulator must do owing to the console’s exotic graphics architecture.

OTOH, if someone can’t discern that a game labeled as 60fps is actually running at 30fps, why does framerate matter so much to them? This is the really perplexing thing to me: people arguing about numbers they literally can’t see.
Video game art is about perception and how good it actually looks, not a pixel counter or a framerate counter. I always smirk when I see people clamoring for the FPS counter feature on Xbox One X devkits (which in itself is quite helpful to developers). Judging from PCSX2, it seems clear that we could use the same technique to ‘fake’ a higher framerate, and a lot of self-proclaimed hardcore players would be perfectly happy. Happy and oblivious.

My Childhood Dream Job and Social Mobility

I was born in a city in the heart of the Pearl River Delta. There are numerous bridges in the region now, but this was not always the case. There were practically no roads or highways connecting it to other cities before the start of modern China’s economic reform. People needed to ride ferries to cross the rivers, and it took half a day or even more to get to Guangzhou.

In my childhood, I really loved roads and cars and trucks and bridges because of that. With them, it took only 2 hours to get to Guangzhou (nowadays just 1 hour). Naturally, when parents and teachers asked me what kind of job I wanted when I grew up, I often said I wanted to become a truck driver. My love for vehicles even extended to Winnebagos (because I saw them a few times on TV and they looked awesome!).

It was not until I grew up that I realized how hard it is to work as a truck driver, and that living in a Winnebago is not very ideal. They are also pretty far from my current job, computer programming. But I always knew my job would have something to do with maths, ever since teachers helped me discover my natural talent for it in first grade.

I was picked to train for the maths olympiad from fourth grade. Not the International one, of course; that is for high school students. I showed my talent, eventually participated in national competitions, and got some trophies too (not that they are worth discussing now, of course…).

But even back then, I noticed something different about my ‘teammates’. Most of them had some kind of background: their parents were teachers, professors, or rich businesspeople. I was something of an exception, since my parents both worked in a furniture factory (now a piano factory, which is essentially still the same kind of work). My teachers even praised my parents for me, ‘despite’ them being common workers.

Which reminds me of the topic at hand: education and social mobility.

With real estate near the best schools skyrocketing, education is obviously important now. But even back then, parents knew the importance of education for social mobility and for climbing the social ladder. My dad had to bribe (a little bit) to get me enrolled in the best primary school in town. But money matters much more now than it did 20 years ago, and the gap is growing.

When I was young, I thought the US education system was much better than ours and that we should follow it. We shouldn’t have so many tests! We should have more extracurricular activities! But in recent years, after my cousin migrated to the US and I heard from her mother, I realized that the US system (without any form of affirmative action), if applied here, would result in even less social mobility. Extracurricular activities cost money. If you are rich and/or you know people, a lot of things can be done easily.

At the very least we (along with the rest of East Asia) have a fairly equal university entrance system: scary-sounding, but still fair. Even that is fast changing, though, and the best schools are increasingly inclined to admit the children of economic and social elites.

But still, thanks to it, I was able to attend university in Shanghai and then find a reasonably well-paid engineering job here. I can work normal hours (most of the year), go home, and play American Truck Simulator in my comfy chair, with an i5 + GTX 1060 rig running at a good framerate.

I still love trucks as much as I did in my childhood, though. That hasn’t changed a bit.

Emulating palette textures in modern shaders

I am going to resurrect the site after 6 years! Ha-ha! I’ve changed a LOT during these years… Anyway, here are some insights from my work.

I have worked on porting one of the older PS2 games to modern platforms, and one of the urgent topics was how to deal with palette textures.

The PS2 is unique hardware compared to its contemporaries: while other consoles utilize texture compression (PVRTC on Dreamcast, DXT1/DXT5 on GameCube and Xbox), the PS2 still uses palettized textures to conserve texture memory.

Anyone who has worked with palettes knows the kind of magic you can do with them: just change the palette and you get a totally different texture. There are tons of palette effects, and a lot of them cannot be expressed with arithmetic calculations. If it is just a change in hue or saturation, for example, you can easily replicate it in a programmable shader. But what about more complicated effects like palette shifting/color cycling? A lot of games use shifting to create fantastic effects. For example:

Canvas Cycle

These effects are simply not possible to replicate without sending the palette from the CPU to the GPU and rendering the textures with their palettes. At first, I wrote this simple code segment to sample from an 8-bit palettized texture:

float index = indexTexture.Sample(texSampler, uv).r * 255.0f;
return lerp(palette[(int)index], palette[(int)index + 1], frac(index));

BUT! The result looks very poor under texture filtering. There is banding in the resulting textures, especially in color-rich ones like lens flares. Actually, this is not the way the hardware handles sampling, AT ALL.

When sampling for a pixel, the hardware blends 4 texels from the texture. When we sample the index texture, we actually get [the average of these 4 indices]. However, this is not the same as [the average of these 4 texels’ colors], because the colors represented by these 4 indices may differ wildly and will most likely not be linearly related to the indices themselves.

The correct way is to

  1. get the 4 indices separately
  2. get the 4 colors of these 4 indices
  3. linear blend between these 4 colors
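A tiny numeric example (Python, with a made-up three-entry palette) shows why blending indices goes wrong:

```python
# Hypothetical palette where index order has nothing to do with color order.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]  # black, white, red

def lerp(a, b, t):
    """Linear interpolation between two RGB tuples."""
    return tuple(round(x + (y - x) * t) for x, y in zip(a, b))

# Two neighboring texels hold indices 0 (black) and 2 (red), filtered at t=0.5.
wrong = palette[(0 + 2) // 2]              # averages the indices: white!
right = lerp(palette[0], palette[2], 0.5)  # averages the colors: dark red

print(wrong, right)  # (255, 255, 255) (128, 0, 0)
```

Averaging the indices lands on whatever color happens to sit between the two entries, which is why the naive shader produces banding and garbage colors on filtered edges.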

There is an example on the Xbox LIVE Indie Games forums that does the job. However, there is still a half-texel problem in the code. My fixes are as follows:

float4 PixelShaderFunction(float2 pos_from_vs : TEXCOORD0) : COLOR0
{
    float2 pos = pos_from_vs;
    float2 size = { 1.0f / TexW, 1.0f / TexH };
    // Shift by half a texel so the four taps land on texel centers.
    pos.xy -= 0.5f * size;
    float2 lerps = { frac(TexW * pos.x), frac(TexH * pos.y) };
    // Fetch the four neighboring indices and look each one up in the palette.
    float4 val[4] = {
        tex1D(PalSampler, tex2D(TexSampler, pos).r),
        tex1D(PalSampler, tex2D(TexSampler, pos + float2(size.x, 0)).r),
        tex1D(PalSampler, tex2D(TexSampler, pos + float2(0, size.y)).r),
        tex1D(PalSampler, tex2D(TexSampler, pos + size).r),
    };
    // Bilinearly blend the four palette colors (not the four indices).
    return lerp(
        lerp(val[0], val[1], lerps.x),
        lerp(val[2], val[3], lerps.x),
        lerps.y);
}

The reason is that the origin of texture coordinates is not strictly the top-left corner of the texture, but the center of the top-left texel, so we need to account for the half-texel difference.

The disadvantage of this method is that we need to sample the index texture 4 times. However, on later hardware (Shader Model 4.1 or later), we can use Gather() to fetch the 4 texels at once, reducing the GPU workload.

Video game philosophy: Omphalos hypothesis

I just read an old post on Pharyngula where someone mentions this hypothesis.

The term “Omphalos hypothesis” may be a bit unfamiliar, but I trust many have heard of “Last Thursdayism”: the idea that the world was created last Thursday, and any memories and traces we have were artificially planted to make the world look older. According to Wikipedia,

“The concept is both unverifiable and unfalsifiable through any conceivable scientific method—in other words, it is impossible even in principle to subject it to any form of test by reference to any empirical data because the empirical data themselves are considered to have been arbitrarily created to look the way they do at every observable level of detail.”

In my opinion, such a creator would be a massive liar to its subjects, and I (using my “lowly” moral standard) strongly despise this kind of behavior. It is one of those unfalsifiable and mentally twisted theories that creationists turn to when they abandon any remaining logic.

However, examples of the Omphalos hypothesis are everywhere in the realm of video games, or (to use a more formal word) simulacra. Video games are the product of humanity’s wildest imaginations, and as The Matrix puts it, “Anything is possible.”

Sam Harris has recently posited the possibility of creating a virtual world conforming to the worldview of some religion, populated with artificial intelligences. This is not merely theoretically possible; we have already partially done it with MMO games such as World of Warcraft, which has its whole set of myths and legends, backed up with “real” locations and aesthetics. If those NPCs were sentient, they would never realize their history is an artificial creation when they listen to their bards, perform their religious rites, and so on.

Another example of the Omphalos hypothesis appears in the helpful mechanism known as “saving/loading”. A save file records the current state of the virtual world, and by loading it, the game restores to a certain state. For example, in a session of Persona 3, it can be argued that the world is destroyed on Wednesday night, in-game time, when we exit the game after saving its state to a file. Then the world is re-created on that same Wednesday night, also in-game time, when we load the save file and continue. In the game code, this actually involves a lot of destroying and recreating of classes and structures.

If the NPCs in the game were sentient, they would never notice this process. But now imagine a situation where one of the NPCs has the power (supervisor rights) to save and load. Isn’t this story familiar? Ah, yes, it is the Endless Eight.

Another dead Santa Claus…

I just found this on Wikipedia:

In 1907 Dr Duncan MacDougall made weight measurements of patients as they died. He claimed that there was weight loss of varying amounts at the time of death.[80] His results have never been reproduced, and are generally regarded either as meaningless or considered to have had little if any scientific merit.[81] The 2003 film 21 Grams takes its title from the approximate weight loss measured in one of MacDougall’s tests.[82]

80. ^ MacDougall, Duncan (May 1907). “Hypothesis Concerning Soul Substance Together with Experimental Evidence of The Existence of Such Substance”. Journal of the American Society for Psychical Research 1 (5): 237–244. Retrieved 19 February 2011.
81. ^ Park, Robert Ezra (2010). Superstition: Belief in the Age of Science. Princeton, N.J: Princeton University Press. p. 90. ISBN 0-691-14597-0.
82. ^ “Soul Man” – a summary of Duncan MacDougall’s research at

I must say I am pretty disappointed to see that the paper comes from “fringe science” literature, but being a scientific skeptic (and a self-proclaimed debugging expert) I would really like to see other attempts at reproducing the results.

P.S. Today at work, I encountered a bug where one particular Wiimote, just lying on the desk, always had wrong accelerometer readings (making the program send audio to the Wiimote speaker all the time). Do you believe it has a soul, or just that it malfunctioned? 🙂

Interesting tidbits in Bioshock Infinite preview

I was reading Bioshock Infinite previews on major gaming websites when I encountered this on Eurogamer:

Columbia was something of a travelling American emissary, a means of presenting its power unto the rest of the world. After it was involved in an international incident involving the destruction of a city in China during the Boxer rebellion, it seceded from the United States and became its own sovereign nation.

This sounds pretty interesting! While the one thing Columbia and the Boxers share is xenophobia, the reasons are quite different: Columbia’s purpose was to showcase the strength of the US, while the Boxers went to extremes fighting off the foreign bad guys. Older people have told me that the influence of the Boxers was downplayed in my history books compared to theirs. And I think that’s what it deserves: simple and utter xenophobia against anything foreign will never help the country.

But xenophobia is still xenophobia, regardless of its source. It is not without reason that I count ultra-nationalists among my worst enemies.

The Founders, a group of ultra-nationalist religious fundamentalists, want her to stay in the tower, while the internationalist anarchist movement, Vox Populi, want her dead.

I’m intrigued to see some “real” fundamentalism discussed in video game media (and, of course, not in the Left Behind way), and this seems like the opportunity to make it shine, although there are still hints of the almost-cliché “lawful vs. chaotic” framing. Hopefully Ken Levine can do the material justice without making it look too stereotypical.

Anyway, consider me hyped!