The block decomposition (BD) approach subdivides an image into small blocks,
each with its own local colors. For example, a block may have 2 basic colors,
with each texel of the block encoded as the index of the nearest basic color.
This approach allows compression to 3 bits per texel for true color textures.
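
The two-color scheme can be sketched as follows. This is a minimal
illustration, not the method of any particular codec: the luminance-threshold
split used to pick the two basic colors is just one simple heuristic, and the
function names are hypothetical. If the two colors were stored as 16-bit
values for a 4 x 4 block (an assumption), the cost would be 2*16 + 16*1 = 48
bits, which is consistent with 3 bits per texel.

```python
def encode_block_two_color(texels):
    """Encode (r, g, b) texels as two basic colors plus one index bit each."""
    # Split the block around its mean luminance (a simple heuristic).
    luma = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in texels]
    mean = sum(luma) / len(luma)
    low = [t for t, y in zip(texels, luma) if y <= mean]
    high = [t for t, y in zip(texels, luma) if y > mean]

    def average(group, fallback):
        # Per-channel average of a group; fallback covers an empty group.
        if not group:
            return fallback
        n = len(group)
        return tuple(round(sum(c[i] for c in group) / n) for i in range(3))

    color0 = average(low, texels[0])
    color1 = average(high, color0)  # flat block: both colors coincide

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Each texel stores the index of the nearest basic color.
    indices = [0 if dist2(t, color0) <= dist2(t, color1) else 1
               for t in texels]
    return color0, color1, indices


def decode_block_two_color(color0, color1, indices):
    """Rebuild the texels from the two basic colors and the index bits."""
    return [color1 if i else color0 for i in indices]
```

A block that really contains only two colors round-trips exactly; any other
block is approximated by its two averages.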
S3TC is another implementation of the block decomposition technique. It breaks
a texture map into 4 x 4 blocks of texels. For opaque texture maps, each of
these texels is represented by two bits in a bitmap, for a total of 32 bits.
In addition to the bitmap, each block also has two representative 16-bit
colors in RGB565 format associated with it. These two explicitly encoded
colors, plus two additional colors that are derived by uniformly interpolating
the explicitly encoded colors, form a four color lookup table. This lookup
table is used to determine the actual color at any texel in the block. In
total, the 16 texels are encoded using 64 bits, or an average of 4 bits per
texel.
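
The four-color lookup table described above can be sketched as follows. The
function names are hypothetical, and two details are assumptions not stated in
the text: the RGB565 values are expanded to 8 bits per channel by bit
replication, and the interpolated colors sit at 1/3 and 2/3 between the
encoded ones, computed with integer division.

```python
def rgb565_to_rgb888(c):
    """Expand a 16-bit RGB565 value to an (r, g, b) tuple of 8-bit channels."""
    r5 = (c >> 11) & 0x1F
    g6 = (c >> 5) & 0x3F
    b5 = c & 0x1F
    # Replicate the high bits into the low bits so 31 maps to 255.
    return ((r5 << 3) | (r5 >> 2),
            (g6 << 2) | (g6 >> 4),
            (b5 << 3) | (b5 >> 2))


def s3tc_opaque_palette(color0, color1):
    """Build the four-entry lookup table for an opaque block."""
    c0 = rgb565_to_rgb888(color0)
    c1 = rgb565_to_rgb888(color1)
    # Two extra colors, uniformly spaced between the encoded ones.
    c2 = tuple((2 * a + b) // 3 for a, b in zip(c0, c1))
    c3 = tuple((a + 2 * b) // 3 for a, b in zip(c0, c1))
    return [c0, c1, c2, c3]


# Storage cost of one block: two 16-bit colors plus sixteen 2-bit indices.
BITS_PER_BLOCK = 2 * 16 + 16 * 2
BITS_PER_TEXEL = BITS_PER_BLOCK / 16
```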

Decoding blocks compressed in S3TC format is straightforward. A two-bit index
is assigned to each of the 16 texels. A four color lookup table is then used to
determine which 16-bit color value should be used for each texel. The decoder
requires relatively little logic, which can be operated at very high speeds
and replicated to allow parallel decoding for very high performance solutions.
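
A software decoder for one opaque block might look like the sketch below. The
byte layout is an assumption taken from the common DXT1 convention
(little-endian color0, color1, then 32 index bits with texel 0 in the lowest
two bits); the text above does not specify it. Function names are
hypothetical.

```python
import struct


def _rgb565_to_rgb888(c):
    """Expand RGB565 to 8-bit channels by bit replication (an assumption)."""
    r5, g6, b5 = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    return ((r5 << 3) | (r5 >> 2),
            (g6 << 2) | (g6 >> 4),
            (b5 << 3) | (b5 >> 2))


def decode_s3tc_opaque_block(block):
    """Decode an 8-byte block into 16 (r, g, b) texels in row-major order."""
    if len(block) != 8:
        raise ValueError("an S3TC opaque block is exactly 64 bits")
    color0, color1, bits = struct.unpack("<HHI", block)
    c0 = _rgb565_to_rgb888(color0)
    c1 = _rgb565_to_rgb888(color1)
    # Four color lookup table: two encoded colors and two interpolated ones.
    palette = [
        c0,
        c1,
        tuple((2 * a + b) // 3 for a, b in zip(c0, c1)),
        tuple((a + 2 * b) // 3 for a, b in zip(c0, c1)),
    ]
    # Each texel is a single table lookup on its two-bit index.
    return [palette[(bits >> (2 * i)) & 0x3] for i in range(16)]
```

Because every texel is an independent shift, mask and table lookup, the 16
lookups can be done in parallel, which is what makes the hardware decoder
cheap.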
Analysis
The BD texture compression ratio is 6x for 24-bit sources (24/4 bits per
texel) and depends on the image color depth.
BD allows fast random access to texels. Transferring one texture block costs
64 bits. The main drawback of BD is its very high per-block overhead.
On average only about 4 texels are accessed sequentially within one block, so
all 64 transferred bits serve just these 4 texels. The effective transfer
compression ratio is therefore only 1.5x (64/4 = 16 bits per texel).
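
The arithmetic behind these two ratios can be checked with a few lines. The
24-bit source depth is an assumption inferred from the 6x figure, and the
function names are hypothetical.

```python
def effective_bits_per_texel(block_bits, texels_used):
    """Bits transferred per texel that is actually used from one block."""
    return block_bits / texels_used


def transfer_compression_ratio(source_bits_per_texel, block_bits, texels_used):
    """Ratio of uncompressed cost to effective compressed cost per texel."""
    return source_bits_per_texel / effective_bits_per_texel(
        block_bits, texels_used)


# All 16 texels of a fetched block are used: the storage ratio.
storage_ratio = transfer_compression_ratio(24, 64, 16)

# Only about 4 texels of a fetched block are used: the transfer ratio.
transfer_ratio = transfer_compression_ratio(24, 64, 4)
```

Here `storage_ratio` evaluates to 6.0 and `transfer_ratio` to 1.5.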
To avoid these losses and to reuse block information, smart caching of
fetched blocks should be implemented. That leads to a more sophisticated
decompression algorithm. First we check whether the requested block is in the
cache; if so we use it, otherwise we transfer it from RAM. To avoid high
memory costs we should implement a fixed-size cache with a simple block
replacement algorithm, but in any case it complicates the decompression
algorithm and leads to additional memory accesses.
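
One possible fixed-size cache is sketched below. It is direct-mapped, which is
only one choice of "simple replacement" policy; the class name, the slot count
and the `fetch_block` callback standing in for the RAM transfer are all
hypothetical.

```python
class BlockCache:
    """Fixed-size, direct-mapped cache of compressed texture blocks."""

    def __init__(self, fetch_block, num_slots=64):
        self.fetch_block = fetch_block      # called on a miss (RAM transfer)
        self.num_slots = num_slots
        self.tags = [None] * num_slots      # which block each slot holds
        self.data = [None] * num_slots
        self.hits = 0
        self.misses = 0

    def get(self, block_index):
        """Return the block, fetching it only if it is not cached."""
        slot = block_index % self.num_slots
        if self.tags[slot] == block_index:
            self.hits += 1
        else:
            # Miss: the new block simply replaces whatever was in the slot.
            self.misses += 1
            self.data[slot] = self.fetch_block(block_index)
            self.tags[slot] = block_index
        return self.data[slot]
```

The tag comparison on every access is the extra work that caching adds to the
decompression path, and blocks that map to the same slot evict each other.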
Block Decomposition handles color gradations very well. Since it
uses 2 basic colors and gradations between them, it is very easy to create
realistic color gradations, and as many as you want. However, sharp edges are
a bigger problem, especially if more than 2 colors are needed: if a 4x4 block
contains 3 different colors, the algorithm runs into real trouble since it
cannot produce the third color. This means that colors at sharp edges can be
changed and can even be incorrect, resulting in weird artifacts at the edges.
See Figures: left - no compression, middle S3TC, right - zoomed bad zone.
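
The three-color problem can be shown numerically. The sketch below assumes
full 8-bit endpoints (ignoring RGB565 quantization) and picks red and green as
the two encoded colors; the names are hypothetical. Every palette entry lies
on the line between the two endpoints, so a blue texel has no close match.

```python
def interpolated_palette(c0, c1):
    """Two endpoint colors plus two colors interpolated between them."""
    c2 = tuple((2 * a + b) // 3 for a, b in zip(c0, c1))
    c3 = tuple((a + 2 * b) // 3 for a, b in zip(c0, c1))
    return [c0, c1, c2, c3]


def nearest_error(texel, palette):
    """Squared RGB distance from a texel to its nearest palette entry."""
    return min(sum((x - y) ** 2 for x, y in zip(texel, p)) for p in palette)


RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

palette = interpolated_palette(RED, GREEN)
errors = [nearest_error(c, palette) for c in (RED, GREEN, BLUE)]
```

Red and green are reproduced exactly, while the blue texel is replaced by a
red-green mixture with a large error. Choosing a different endpoint pair only
moves the error to another of the three colors.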

BD has very poor quality control: only the number of colors in the block
palettes can be changed.
Study
We have implemented several Block Decomposition algorithms.
Our goals were to find an algorithm for selecting and approximating the "best"
basic colors, and also to estimate the average quality of this
approach. Our studies found that this approach gives a good representation for
a very large set of textures and images (average PSNR is above 30 dB).
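
For reference, PSNR between an original and a decoded image can be computed as
sketched below. The exact variant used in the study is not stated, so this
assumes the usual definition over all 8-bit RGB channels with a peak value of
255; the function name is hypothetical.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equal-length lists of (r, g, b) texels."""
    if len(original) != len(decoded) or not original:
        raise ValueError("images must be non-empty and the same size")
    squared_error = sum(
        (a - b) ** 2
        for p, q in zip(original, decoded)
        for a, b in zip(p, q)
    )
    # Mean squared error over every channel of every texel.
    mse = squared_error / (3 * len(original))
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

With this definition, 30 dB corresponds to a mean squared error of about 65
per channel.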
We considered 3 general types of image areas: completely smooth, containing
several smooth subareas, and noisy. Block decomposition is ideally suited to
smooth and noisy areas. It also works well on 2-color areas with a large color
difference. For smooth images with a small color difference and for very noisy
images, the S3 version does not improve much on the classical version.
Moreover, for smooth parts of textures any Block Decomposition approach is
very expensive, and its compression ratio is very low for such areas.
Block Decomposition could easily be combined with palettization. That could
increase the compression ratio up to 8x for true color images.
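
One configuration that would reach 8x is sketched below. It is an assumption,
not something the study specifies: the two 16-bit block colors are replaced by
two 8-bit indices into a shared 256-entry palette, the 2-bit texel indices are
kept, the source is 24-bit, and the storage of the shared palette itself is
ignored. The function names are hypothetical.

```python
def block_bits(color_bits, index_bits, texels=16, colors=2):
    """Bits in one block: the block colors plus one index per texel."""
    return colors * color_bits + texels * index_bits


def compression_ratio(source_bits_per_texel, bits_per_block, texels=16):
    """Uncompressed block size divided by compressed block size."""
    return source_bits_per_texel * texels / bits_per_block


# Two RGB565 colors per block, as in S3TC.
direct_colors = block_bits(color_bits=16, index_bits=2)

# Two 8-bit palette indices per block instead (assumed configuration).
palettized_colors = block_bits(color_bits=8, index_bits=2)

ratio_direct = compression_ratio(24, direct_colors)
ratio_palettized = compression_ratio(24, palettized_colors)
```

Under these assumptions a block shrinks from 64 to 48 bits, and the ratio
rises from 6x to 8x.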