hbb - hyperbitblock
===================
(Right now this is only a concept / work in progress.)
Hyperbitblock provides a simple service: You can ask for information
about the reputation of an IP address. Hyperbitblock will answer with the
status 1 or 0, which tells whether the IP is considered malicious (1) or
friendly (0).
While there are similar services that provide information about the
reputation of an IP, Hyperbitblock's specialty is being hyper-fast.
This speed is achieved by keeping the information in RAM, paired
with the ability to access it directly at a low level.
This process does not perform a search.
"Imagine a librarian who exactly knows the position of
every book in the library by heart without even thinking."
How is knowledge stored in RAM?
-------------------------------
0.0.0.1         --> 00000000 00000000 00000000 00000001 --> 1
0.0.0.2         --> 00000000 00000000 00000000 00000010 --> 2
0.0.0.3         --> 00000000 00000000 00000000 00000011 --> 3
[...]
0.16.72.171     --> 00000000 00010000 01001000 10101011 --> 1067179
18.119.9.193    --> 00010010 01110111 00001001 11000001 --> 309791169
218.92.0.208    --> 11011010 01011100 00000000 11010000 --> 3663462608
[...]
255.255.255.253 --> 11111111 11111111 11111111 11111101 --> 4294967293
255.255.255.254 --> 11111111 11111111 11111111 11111110 --> 4294967294
255.255.255.255 --> 11111111 11111111 11111111 11111111 --> 4294967295
^                                                           ^
|                                                           |
user queries                                      info is stored at
an IP address                                     this exact index in array
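The mapping from address to index can be sketched in C as follows. This is
only an illustration, not the project's actual code: the function names
hbb_index_from_octets and hbb_index_from_text are hypothetical, and the
text parser assumes a POSIX system that provides inet_pton().

```c
#include <arpa/inet.h>   /* inet_pton, ntohl (POSIX, assumed available) */
#include <netinet/in.h>  /* struct in_addr */
#include <stdint.h>
#include <sys/socket.h>  /* AF_INET */

/* Hypothetical helper: combine the four octets a.b.c.d into the
 * 32-bit number that serves as the array index. */
uint32_t hbb_index_from_octets(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16)
         | ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Hypothetical helper: parse dotted-quad text such as "218.92.0.208".
 * Returns the index (0 .. 4294967295), or -1 if the text is not a
 * valid IPv4 address. */
int64_t hbb_index_from_text(const char *text)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;

    /* s_addr is in network byte order; convert to a host integer. */
    return (int64_t)ntohl(addr.s_addr);
}
```

With these helpers, hbb_index_from_octets(0, 16, 72, 171) yields 1067179,
matching the table above. No lookup structure is involved: the shifts and
ORs are the whole "calculation".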
What knowledge do we store?
---------------------------
We store one byte per IP address in RAM. That leaves us with very little
space for information. We have 8 bits of space:
1. [ ]  | <-- is blocked

2. [ ]  |
        | <-- confidence score: 0%, 25%, 50% or 100%
3. [ ]  |

4. [ ]  |
        |
5. [ ]  | <-- reason why blocked: 3 bits, so we can use an
        |     item of a catalogue of 8
6. [ ]  |

7. [ ]  |
        | <-- not used yet
8. [ ]  |
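One possible way to pack and unpack such a byte in C is sketched below.
Several things here are assumptions, since the concept does not fix them
yet: that bit 1 of the diagram is the least significant bit, that the
2-bit confidence code maps in order to 0%, 25%, 50% and 100%, and all
names (hbb_pack, hbb_reason and so on), which are hypothetical.

```c
#include <stdint.h>

/* Assumed layout (bit 1 of the diagram = least significant bit):
 *   bit  0     blocked flag
 *   bits 1-2   confidence code 0..3
 *   bits 3-5   reason code 0..7 (index into a catalogue of 8)
 *   bits 6-7   not used yet, kept at 0 */
#define HBB_BLOCKED_MASK     0x01u
#define HBB_CONFIDENCE_SHIFT 1
#define HBB_CONFIDENCE_MASK  0x03u
#define HBB_REASON_SHIFT     3
#define HBB_REASON_MASK      0x07u

/* Build one table entry; out-of-range codes are masked to fit. */
uint8_t hbb_pack(int blocked, unsigned confidence, unsigned reason)
{
    return (uint8_t)((blocked ? HBB_BLOCKED_MASK : 0u)
        | ((confidence & HBB_CONFIDENCE_MASK) << HBB_CONFIDENCE_SHIFT)
        | ((reason & HBB_REASON_MASK) << HBB_REASON_SHIFT));
}

int hbb_is_blocked(uint8_t entry)
{
    return (entry & HBB_BLOCKED_MASK) != 0;
}

unsigned hbb_confidence_code(uint8_t entry)
{
    return (entry >> HBB_CONFIDENCE_SHIFT) & HBB_CONFIDENCE_MASK;
}

unsigned hbb_reason(uint8_t entry)
{
    return (entry >> HBB_REASON_SHIFT) & HBB_REASON_MASK;
}

/* Map the 2-bit code to the percentages listed in the text. */
unsigned hbb_confidence_percent(uint8_t entry)
{
    static const unsigned percent[4] = { 0, 25, 50, 100 };
    return percent[hbb_confidence_code(entry)];
}
```

For example, hbb_pack(1, 3, 5) describes an address that is blocked, with
confidence 100% and reason number 5 of the catalogue.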
What is the advantage of this structure?
----------------------------------------
In theory there are 2^32 (about 4.3 billion) IPv4 addresses. Therefore, if
we store one byte for each, the program needs to allocate 4 GiB (about
4.3 GB) of RAM.
For example, in C we can easily hold this information in RAM by using an
array.
With minimal calculation we get the array index directly from the IP
address in a request for information.
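A minimal sketch of such an array in C might look like this. It is not a
finished design: the names (hbb_table_create, hbb_status and so on) are
hypothetical, it assumes a 64-bit system that can actually hand out 4 GiB,
and hbb_status assumes that the "is blocked" flag sits in the lowest bit
of each byte.

```c
#include <stdint.h>
#include <stdlib.h>

/* One byte per IPv4 address: 2^32 entries. */
#define HBB_ENTRIES (UINT64_C(1) << 32)

/* Allocate the zero-filled table (all addresses start as "friendly").
 * Returns NULL if the allocation is impossible or fails. */
uint8_t *hbb_table_create(void)
{
    if (SIZE_MAX < HBB_ENTRIES)
        return NULL;  /* address space too small, e.g. a 32-bit build */
    return calloc((size_t)HBB_ENTRIES, sizeof(uint8_t));
}

void hbb_table_destroy(uint8_t *table)
{
    free(table);
}

/* Direct access: the address, read as a 32-bit integer, is the index.
 * Any uint32_t value is a valid index, so no bounds check is needed. */
void hbb_table_set(uint8_t *table, uint32_t index, uint8_t entry)
{
    table[index] = entry;
}

uint8_t hbb_table_get(const uint8_t *table, uint32_t index)
{
    return table[index];
}

/* Answer 1 (malicious) or 0 (friendly).
 * Assumption: the blocked flag is the least significant bit. */
int hbb_status(const uint8_t *table, uint32_t index)
{
    return table[index] & 1u;
}
```

A query then costs a single memory read, for example
hbb_status(table, 3663462608u) for the address 218.92.0.208. On many
systems calloc() only reserves the pages and the operating system backs
them with physical memory once they are first written, but that behaviour
is platform-dependent and should not be relied on.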
Therefore, I assume this will be much more efficient than similar
solutions that use a database in order to obtain the information.
What is the disadvantage of this structure?
-------------------------------------------
In order to have the IP address match the array index, we have to allocate
a lot of bytes for IPs we will never have any information about (for
example, reserved address ranges).
We willingly take that loss in RAM efficiency in order to get our
information hyper-fast.