[HN Gopher] Image Compression with Singular Value Decomposition
___________________________________________________________________
Image Compression with Singular Value Decomposition
Author : egorpv
Score : 30 points
Date : 2023-02-09 21:22 UTC (1 hours ago)
HTML web link (timbaumann.info)
TEXT w3m dump (timbaumann.info)
| bob1029 wrote:
| Very interesting. I will be digging into this one. The first
| thing that popped out at me:
|
| > We can decompose a given image into the three color channels
| red, green and blue. Each channel can be represented as a (m x
| n)-matrix with values ranging from 0 to 255. We will now compress
| the matrix A representing one of the channels.
|
| I wonder if the author considered converting to YCbCr colorspace
| first. The luminance component (Y) is substantially more
| important to the human visual system than the Cb/Cr components.
| Some subsampling of the chrominance components would probably
| work well in these schemes.
| tomrod wrote:
| The representation by SVD would work almost identically. SVD is
| one of those absolutely magical, amazing algorithms that power
| a ton of things we do every day.
| nwallin wrote:
| This leaped out at me too. Doing lossy compression on RGB is
| not setting yourself up for success. Of course, then you'd need
| two sliders in the UI; one for how much to compress the Y and
| one for the CbCr.
|
| SVD also works on complex matrices. I imagine there's value in
| compressing the subsampled Cb/Cr channels as real/imaginary
| components in a complex matrix.
| ant6n wrote:
| Mmh, compression ratio doesnt seem super high. I wonder how the
| values are stored, perhaps floats? Some thoughts:
|
| Maybe using yuv would be better than rgb?
|
| Maybe values could be represented as 0..1, and the values itself
| could be stored as 0.16 fixed point numbers.
|
| Use some generic compression on top.
|
| Is there a smart way to store the basis vectors? Something about
| angles somehow (analogue to quaternions)? Also, once U is known,
| isnt the inverse implied? Also, if U is approximated and fixed,
| perhaps the singular values can be adjusted again to minimize
| errors.
| klodolph wrote:
| I'm guessing there's just a lot left on the table--
|
| > Maybe values could be represented as 0..1, and the values
| itself could be stored as 0.16 fixed point numbers. Use some
| generic compression on top.
|
| Codecs like JPEG use a variable amount of quantization. You
| quantize by scaling the 0..1 float by some scalar, putting it
| in the range 0..k, and then truncating it to an integer, and
| encoding the integer. The integer value is encoded using an
| entropy coder like Huffman. The parameter k must also be
| encoded somehow, or fixed.
|
| Look up "JPEG coefficient quantization" for how JPEG does it.
|
| Codecs for audio and images are often made up of understandable
| parts that fit together: transformations, quantization, entropy
| coding. If you come up with a new transformation, you can make
| a whole codec by putting together the remaining pieces--but
| something like a new transformation is itself interesting,
| because somebody else can always assemble it into a codec if it
| shows promise.
| bob1029 wrote:
| > Codecs like JPEG use a variable amount of quantization
|
| The reason quantization works so well in JPEG is because of
| the DCT step and its energy compaction properties. This gets
| most of the coefficients near zero. I think without this
| transform you would be introducing a lot more noise in the
| final result.
|
| At some point, we are going to end up re-implementing a thing
| approximating jpeg here. Colorspace convert, subsampling &
| DCT+quantization is most of the magic sauce.
| derf_ wrote:
| > Is there a smart way to store the basis vectors?
|
| Consider, instead, compressing small blocks of the image
| instead of a whole image plane at once. Let's say 8x8.
|
| Over such a small region, pixels are usually very highly
| correlated. You could model them statistically as an order-1
| autoregressive process with a high correlation coefficient
| (e.g., x[i + 1] = x[i]*rho + noise, with rho >= 0.95).
|
| Then, computing the eigendecomposition of the 8x8 cross-
| correlation matrix A_{ij} = rho^|i-j| produces a set of common
| eigenvectors that you can use for all blocks. So now instead of
| several very large vectors, you only need a single 8x8 matrix
| of basis vectors (because it's square, the inverse is just the
| transpose).
|
| But then, let's take the limit as rho -> 1. It just so happens
| that the resulting eigenvectors converge to the Type 2 Discrete
| Cosine Transform [0]. So, in fact, you don't need to transmit
| the basis at all.
|
| You can store the output of the transform as fixed point
| numbers with some arbitrary base (the choice controls the
| amount of compression), maybe use run-length encoding as your
| generic compression, because you expect to have a lot of
| zeroes, with some Huffman coding of the actual symbols, and
| maybe a simple predictor for the (0,0) coefficient of each
| block (it winds up being the average of the pixel values, so it
| should be pretty similar from block to block).
|
| Congratulations, you just invented JPEG.
|
| [0]
| http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1672...
| abetusk wrote:
| For those that don't know about it already, Steve Brunton has
| some good videos on many math topics, including image compression
| using SVD [0].
|
| [0] https://www.youtube.com/watch?v=QQ8vxj-9OfQ
| kelseyfrog wrote:
| Fun fact, you can also do it with various flavors of tensor
| decomposition[1].
|
| 1.
| http://tensorly.org/stable/auto_examples/decomposition/plot_...
| antegamisou wrote:
| A great primer to SVD from AMS:
|
| https://www.ams.org/publicoutreach/feature-column/fcarc-svd
| rullelito wrote:
| I remember doing this in uni 14 years ago. Can't remember it
| being very efficient.
| daturkel wrote:
| Funny enough, I did this same project (minus the fancy web
| interface) for a numerical linear algebra course in college--
| except I had to do it in Matlab.
|
| It's worse than just about any "real" image compression
| algorithm, but it works! (Plus you get lossless compression if
| your image is low-rank.)
___________________________________________________________________
(page generated 2023-02-09 23:00 UTC)