hngopher.com/1/archive/2023/2023-02-09/items/34731679

  URI:

       [HN Gopher] Image Compression with Singular Value Decomposition
       ___________________________________________________________________
        
       Image Compression with Singular Value Decomposition
        
       Author : egorpv
       Score  : 30 points
       Date   : 2023-02-09 21:22 UTC (1 hours ago)
        
  HTML web link (timbaumann.info)
  TEXT w3m dump (timbaumann.info)
        
       | bob1029 wrote:
       | Very interesting. I will be digging into this one. The first
       | thing that popped out at me:
       | 
       | > We can decompose a given image into the three color channels
       | red, green and blue. Each channel can be represented as a (m x
       | n)-matrix with values ranging from 0 to 255. We will now compress
       | the matrix A representing one of the channels.
       | 
       | I wonder if the author considered converting to YCbCr colorspace
       | first. The luminance component (Y) is substantially more
       | important to the human visual system than the Cb/Cr components.
       | Some subsampling of the chrominance components would probably
       | work well in these schemes.
        
         | tomrod wrote:
         | The representation by SVD would work almost identically. SVD is
         | one of those absolutely magical, amazing algorithms that power
         | a ton of things we do every day.
        
         | nwallin wrote:
         | This leaped out at me too. Doing lossy compression on RGB is
         | not setting yourself up for success. Of course, then you'd need
         | two sliders in the UI; one for how much to compress the Y and
         | one for the CbCr.
         | 
         | SVD also works on complex matrices. I imagine there's value in
         | compressing the subsampled Cb/Cr channels as real/imaginary
         | components in a complex matrix.
        
       | ant6n wrote:
       | Mmh, compression ratio doesnt seem super high. I wonder how the
       | values are stored, perhaps floats? Some thoughts:
       | 
       | Maybe using yuv would be better than rgb?
       | 
       | Maybe values could be represented as 0..1, and the values itself
       | could be stored as 0.16 fixed point numbers.
       | 
       | Use some generic compression on top.
       | 
       | Is there a smart way to store the basis vectors? Something about
       | angles somehow (analogue to quaternions)? Also, once U is known,
       | isnt the inverse implied? Also, if U is approximated and fixed,
       | perhaps the singular values can be adjusted again to minimize
       | errors.
        
         | klodolph wrote:
         | I'm guessing there's just a lot left on the table--
         | 
         | > Maybe values could be represented as 0..1, and the values
         | itself could be stored as 0.16 fixed point numbers. Use some
         | generic compression on top.
         | 
         | Codecs like JPEG use a variable amount of quantization. You
         | quantize by scaling the 0..1 float by some scalar, putting it
         | in the range 0..k, and then truncating it to an integer, and
         | encoding the integer. The integer value is encoded using an
         | entropy coder like Huffman. The parameter k must also be
         | encoded somehow, or fixed.
         | 
         | Look up "JPEG coefficient quantization" for how JPEG does it.
         | 
         | Codecs for audio and images are often made up of understandable
         | parts that fit together: transformations, quantization, entropy
         | coding. If you come up with a new transformation, you can make
         | a whole codec by putting together the remaining pieces--but
         | something like a new transformation is itself interesting,
         | because somebody else can always assemble it into a codec if it
         | shows promise.
        
           | bob1029 wrote:
           | > Codecs like JPEG use a variable amount of quantization
           | 
           | The reason quantization works so well in JPEG is because of
           | the DCT step and its energy compaction properties. This gets
           | most of the coefficients near zero. I think without this
           | transform you would be introducing a lot more noise in the
           | final result.
           | 
           | At some point, we are going to end up re-implementing a thing
           | approximating jpeg here. Colorspace convert, subsampling &
           | DCT+quantization is most of the magic sauce.
        
         | derf_ wrote:
         | > Is there a smart way to store the basis vectors?
         | 
         | Consider, instead, compressing small blocks of the image
         | instead of a whole image plane at once. Let's say 8x8.
         | 
         | Over such a small region, pixels are usually very highly
         | correlated. You could model them statistically as an order-1
         | autoregressive process with a high correlation coefficient
         | (e.g., x[i + 1] = x[i]*rho + noise, with rho >= 0.95).
         | 
         | Then, computing the eigendecomposition of the 8x8 cross-
         | correlation matrix A_{ij} = rho^|i-j| produces a set of common
         | eigenvectors that you can use for all blocks. So now instead of
         | several very large vectors, you only need a single 8x8 matrix
         | of basis vectors (because it's square, the inverse is just the
         | transpose).
         | 
         | But then, let's take the limit as rho -> 1. It just so happens
         | that the resulting eigenvectors converge to the Type 2 Discrete
         | Cosine Transform [0]. So, in fact, you don't need to transmit
         | the basis at all.
         | 
         | You can store the output of the transform as fixed point
         | numbers with some arbitrary base (the choice controls the
         | amount of compression), maybe use run-length encoding as your
         | generic compression, because you expect to have a lot of
         | zeroes, with some Huffman coding of the actual symbols, and
         | maybe a simple predictor for the (0,0) coefficient of each
         | block (it winds up being the average of the pixel values, so it
         | should be pretty similar from block to block).
         | 
         | Congratulations, you just invented JPEG.
         | 
         | [0]
         | http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1672...
        
       | abetusk wrote:
       | For those that don't know about it already, Steve Brunton has
       | some good videos on many math topics, including image compression
       | using SVD [0].
       | 
       | [0] https://www.youtube.com/watch?v=QQ8vxj-9OfQ
        
       | kelseyfrog wrote:
       | Fun fact, you can also do it with various flavors of tensor
       | decomposition[1].
       | 
       | 1.
       | http://tensorly.org/stable/auto_examples/decomposition/plot_...
        
       | antegamisou wrote:
       | A great primer to SVD from AMS:
       | 
       | https://www.ams.org/publicoutreach/feature-column/fcarc-svd
        
       | rullelito wrote:
       | I remember doing this in uni 14 years ago. Can't remember it
       | being very efficient.
        
       | daturkel wrote:
       | Funny enough, I did this same project (minus the fancy web
       | interface) for a numerical linear algebra course in college--
       | except I had to do it in Matlab.
       | 
       | It's worse than just about any "real" image compression
       | algorithm, but it works! (Plus you get lossless compression if
       | your image is low-rank.)
        
       ___________________________________________________________________
       (page generated 2023-02-09 23:00 UTC)