       add readme; remove obsoleteness - amprolla - devuan's apt repo merger
       git clone git://parazyd.org/amprolla.git
       ---
       commit 1d9670ade4cc7c28dfd1c6de9bc14ca099be0c9d
       parent 0454dba27c9b281b9eaca4b75184a8bc1f54cf15
       Author: parazyd <parazyd@dyne.org>
       Date:   Mon,  5 Jun 2017 21:47:59 +0200
       
       add readme; remove obsoleteness
       
       Diffstat:
         A README.md                           |      23 +++++++++++++++++++++++
         D doc/dan-notes                       |     109 -------------------------------
         M orchestrate.py                      |       6 ++----
       
       3 files changed, 25 insertions(+), 113 deletions(-)
       ---
       diff --git a/README.md b/README.md
       @@ -0,0 +1,23 @@
       +amprolla
       +========
       +
       +amprolla is an apt repository merger originally intended for use with
       +the [Devuan](https://devuan.org) infrastructure. This version is the
       +third iteration of the software. The original version of amprolla was
       +not performing well in terms of speed, and the second version was never
       +finished - therefore this version has emerged.
       +
       +Dependencies
       +------------
       +
       +### Devuan
       +
       +```
       +gnupg2 python3-requests python3-gnupg
       +```
       +
       +### Gentoo
       +
       +```
       +app-crypt/gnupg dev-python/requests dev-python/python-gnupg
       +```
       diff --git a/doc/dan-notes b/doc/dan-notes
       @@ -1,109 +0,0 @@
       -Ok... so the debian repo is essentially a directory heirarchy...
       -
       -Ok.. Do you understand the repo heirarchy?  ie the main folder (in
       -amprolla case /merged) with sub folders 'dist' (for repo metadata) and
       -'pool' (where the actual binary and source packages go)??
       -forget about the "pool" folder, amprolla doesn't touch it...
       -
       -in "dists/" you have all the suites ie: jessie, ascii, ceres and all
       -the and stable, unstable  and version symlinks.
       -
       -in the suite folder, you find the section folders: main contrib non-free
       -and files InRelease, Release and Release.gpg
       -
       -InRelease is just the pgp/smime version of the Release file - the gpg
       -sig is the same as Release.gpg
       -
       -Anyway the Release file basically is a dictionary of most of the files
       -in the subdirectory with size and checksums (SHA256, SHA512 etc) in what
       -is essentially RFC822 format, with a bunch of headers at the top that
       -specify details about the Release of that suite.
       -
       -In the suite subdirectories you have a bunch of folders, binary-<arch>
       -which contains the Packages file, and compressed copies of that, and a
       -Release Stanza, and similar for the source folder with Sources file and
       -compressed copies etc.
       -
       -the Contents files (currently not processed) are their too.
       -(They contain a list of all the files in each package)
       -
       -their is also the i8n - folder which contains the processed files.
       -oops s/processed files/translation files/
       -
       -
       -Amprolla takes several mirrors and merges them in order of priority
       -starting with the highest priority.  It firsts iterates over the structure
       -to create it's repo structure, ie dists/<suite>/<section>/ etc and then first
       -copies the highest priority mirror Packages and Sources files in and then for
       -the othermirrors iterates over the Packages and Sources files and compares
       -each package stanza for a match, and if there is a match on name then the highest
       -priority mirror version is kept, if not then the package is added in.
       -(This is where the inefficient model really shows up)
       -
       -
       -After all the new Source and Packages files are processed then the Release and
       -InRelease files are generated by walking the hierarchy and adding those files in.
       -
       -There is a lot of complexities, part of which is in the design of amprolla.
       -What I had started to do, and in describing it now, it seems obvious to me
       -I should probably have started pretty much from scratch is instead of this
       -iterative approach of compare and add or skip is keep a cache of each mirrors
       -last state, and then on each run create a delta between the last state and
       -current state.
       -
       -
       -* and how does dak integrate in all of this?
       -it doesn't.  Dak is a standalone repository which just deals with the packages built by our CI
       -* so it's the same as any debian repo
       -Yup, slightly modified to handle our CI and some other tweaks
       -and I checked and our version is in gdo too.
       -
       -
       -anyway as I was saying about my approach re delta's:
       -There are big efficiencies in this approach.  For starters, we only download the InRelease or
       -Release and Release.gpg file and after verifying it, compare to the previous state, and we
       -can use the delta generated to pick what files are new, changed or removed from the repo.
       -This means we only download the changed files in the repo for a start.  And for the
       -Packages and Sources files we create a delta list of changed stanza's to apply.
       -
       -Instead of building the entire repo from scratch, we apply the delta
       -to a copy of our merged repo with handling for priority etc...
       -
       -What stumped me in the end is we actually should verify that we only have packages go in that
       -have a matching source stanza and we really need to process the contents and translations
       -at the same time.
       -
       -I suspect that nextime realised this which is why he started on amprolla2 which essentially
       -replicates dak + amprolla function...
       -
       -I just realised, I forgot to mention the overrides processing in amprolla.  In the very
       -top of the dir in "merged/" is the "indices" folder that contains overrides.  These
       -files specify for each Packages files, any metadata changes that need to be applied to
       -package stanza's
       -
       -In debian their is a entry for every single deb package/source in the archive making
       -them very large.  We did away with that to reduce the overhead of processing it created. 
       -
       -So we only have entries for those that need changing, usually to change priorities of
       -systemd packages and remove recommends and suggests for systemd related packages.
       -
       -* are indices a part of the repo or only needed by amprolla?
       -both.  In debian, dak generates them and they are hand modified by the repo masters to
       -apply needed fixes.  With amprolla, we only create them for applying our own changes as needed.
       -Technically they don't need to be in the repo, as they're not used by apt, but practically
       -it's good to have them there.
       -
       -hmmm,  I think I've cracked my problem...
       -If I use the Sources delta to identify changed packages, I can use that to pick and apply
       -the changed Packages stanza's Contents and Translations.  This would save lot's of
       -iterations, and I only need the delta Processing to be done on the Sources files.
       -Wow that would really speed things up
       -
       -The other benefit, is we can side load packages this way too and use it to replace dak
       -as well as either a standalone repo or directly into the merged repo.
       -And all without a hefty database. or the writeup
       -
       -your welcome.  It has helped me probably as much as you.  I think it's
       -turning into a full rewrite, but seems better design and possibly far easier to
       -write from scratch.
       -Anyway, it's nearly 3:30am here, so better get a couple hours sleep!
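
The removed notes describe the merge as: take the mirrors in priority order, and when two mirrors carry a package with the same name, keep the stanza from the higher-priority mirror. A minimal sketch of that rule follows. It is not amprolla's code: `parse_stanzas` and `merge_by_priority` are hypothetical names, the parser only handles simple `Field: value` stanzas with continuation lines, and a dictionary lookup stands in for the stanza-by-stanza comparison the notes call inefficient.

```python
def parse_stanzas(text):
    """Split Packages-style text into a list of {field: value} dicts."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line: append it to the previous field.
                fields[key] += "\n" + line
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas


def merge_by_priority(mirrors):
    """Merge stanza lists given highest priority first.

    The first mirror to provide a package name wins; later mirrors
    only contribute packages that are not present yet.
    """
    merged = {}
    for stanzas in mirrors:
        for stanza in stanzas:
            name = stanza.get("Package")
            if name is None:
                continue  # skip stanzas without a package name
            merged.setdefault(name, stanza)
    return merged


# Made-up sample data, for illustration only.
devuan = parse_stanzas("Package: init\nVersion: 1.0+devuan1\n")
debian = parse_stanzas(
    "Package: init\nVersion: 1.0\n\nPackage: bash\nVersion: 4.4\n"
)
merged = merge_by_priority([devuan, debian])
```

Here `merged["init"]` comes from the first (higher-priority) list, while `bash` is taken from the second because no earlier mirror provides it.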
       diff --git a/orchestrate.py b/orchestrate.py
       @@ -2,7 +2,7 @@
        # see LICENSE file for copyright and license details
        
        """
       -Module used to orchestrace the entire amprolla merge
       +Module used to orchestrate the entire amprolla merge
        """
        
        from os.path import join
       @@ -12,8 +12,6 @@ from lib.config import (arches, categories, suites, mergedir, mergesubdir,
                                pkgfiles, srcfiles, spooldir, repos)
        from lib.release import write_release
        
       -# from pprint import pprint
       -
        
        def do_merge():
            """
       @@ -33,7 +31,7 @@ def do_merge():
        
            am = __import__('amprolla_merge')
        
       -    p = Pool(4)
       +    p = Pool(4)  # Set it to the number of CPUs you want to use
            p.map(am.main, pkg)
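
The comment added to `Pool(4)` asks the user to set the worker count by hand. One way to pick that number, sketched here under assumptions: `pool_size` and `square` are hypothetical helpers that are not part of amprolla, and capping the request at the CPU count Python reports is a design choice of this sketch, not something the commit does.

```python
import os
from multiprocessing import Pool


def pool_size(requested=None):
    """Return a worker count between 1 and the CPU count Python reports."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    if requested is None:
        return available
    return max(1, min(requested, available))


def square(n):
    """Stand-in for the real per-item work (am.main in orchestrate.py)."""
    return n * n


if __name__ == "__main__":
    # Ask for 4 workers, but never more than the machine has.
    with Pool(pool_size(4)) as p:
        print(p.map(square, range(8)))
```

Calling `pool_size()` with no argument uses every CPU that Python reports, while `pool_size(4)` keeps the commit's value of 4 on machines that have at least four.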