EDKAich

From MLDonkey

Feature description

AICH (Advanced Intelligent Corruption Handling) hashes are based on SHA1.
Detailed feature description

Savannah tracker

http://savannah.nongnu.org/task/index.php?6396

Code snippets

opcodes.h

#define OP_AICHREQUEST			0x9B	// <HASH 16><uint16><HASH aichhashlen>
#define OP_AICHANSWER			0x9C	// <HASH 16><uint16><HASH aichhashlen> <data>
#define OP_AICHFILEHASHANS		0x9D	  
#define OP_AICHFILEHASHREQ		0x9E
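
The layout comment on OP_AICHREQUEST can be illustrated with a minimal sketch. It assumes that the uint16 field is a part number written little-endian and that aichhashlen is 20 bytes (one SHA1 digest); neither is stated in the snippet above. BuildAichRequestPayload and the k* constants are hypothetical names, not eMule or MLDonkey functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t OP_AICHREQUEST = 0x9B;  // value taken from opcodes.h
constexpr std::size_t  kMd4Len  = 16;          // <HASH 16>: MD4 file hash
constexpr std::size_t  kAichLen = 20;          // assumed aichhashlen (SHA1 digest size)

// Hypothetical helper: serializes <HASH 16><uint16><HASH aichhashlen>.
// Assumption: the uint16 is a part number, sent low byte first.
std::vector<std::uint8_t> BuildAichRequestPayload(
    const std::array<std::uint8_t, kMd4Len>& fileHash,
    std::uint16_t part,
    const std::array<std::uint8_t, kAichLen>& aichHash)
{
    std::vector<std::uint8_t> out;
    out.reserve(kMd4Len + 2 + kAichLen);
    out.insert(out.end(), fileHash.begin(), fileHash.end());
    out.push_back(static_cast<std::uint8_t>(part & 0xFF));  // low byte
    out.push_back(static_cast<std::uint8_t>(part >> 8));    // high byte
    out.insert(out.end(), aichHash.begin(), aichHash.end());
    assert(out.size() == kMd4Len + 2 + kAichLen);
    return out;
}
```

Under these assumptions the payload is 38 bytes long; the opcode byte and the packet header would be added by the surrounding protocol code.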

SHAHashSet.cpp

// for this version the limits are set very high, they might be lowered later
// to make a hash trustworthy, at least 10 unique IPs (compared under netmask 255.255.128.0) must have sent it
// and if we have received more than one hash for the file, one hash has to be sent by at least 92% of all unique IPs
#define MINUNIQUEIPS_TOTRUST		10	// how many unique IPs must have sent us a hash to make it trustworthy
#define	MINPERCENTAGE_TOTRUST		92  // what percentage of clients must have sent the same hash to make it trustworthy
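
How the two limits might combine can be sketched as follows. This is an illustration, not the eMule implementation: the HashVotes class is hypothetical, and it assumes that IPs are held in host byte order, that each masked address counts once across all hashes, and that the percentage is compared with integer division and ">=".

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

constexpr unsigned MINUNIQUEIPS_TOTRUST  = 10;
constexpr unsigned MINPERCENTAGE_TOTRUST = 92;
constexpr std::uint32_t kIpMask = 0xFFFF8000u;  // 255.255.128.0

// Hypothetical vote tracker for the AICH hashes received for one file.
class HashVotes {
public:
    // Records that 'ip' announced 'hashHex'. Assumption: a masked IP that
    // has already voted (for any hash) is ignored.
    void Add(const std::string& hashHex, std::uint32_t ip) {
        if (seen_.insert(ip & kIpMask).second)
            ++votes_[hashHex];
    }

    // True if enough unique IPs sent this hash and they form a large
    // enough share of all unique IPs seen so far.
    bool IsTrusted(const std::string& hashHex) const {
        auto it = votes_.find(hashHex);
        if (it == votes_.end())
            return false;
        const unsigned n = it->second;
        const unsigned total = static_cast<unsigned>(seen_.size());
        assert(n >= 1 && n <= total);
        return n >= MINUNIQUEIPS_TOTRUST &&
               (100 * n) / total >= MINPERCENTAGE_TOTRUST;
    }

private:
    std::set<std::uint32_t> seen_;           // masked IPs that have voted
    std::map<std::string, unsigned> votes_;  // hash -> number of unique voters
};
```

With these limits, ten unique voters for one hash are enough only while no other hash has been reported: a single dissenting voter drops the share to 10 of 11, which is below 92%.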

SHAHashSet.h

/* 
 The SHA hashset basically consists of 1 tree for all parts (9.28MB) + n trees
 for all blocks (180KB), where n is the number of parts.
 This means it is NOT a complete hash tree, since the 9.28MB part size is a fixed level, in order
 to be able to create a hashset format similar to the MD4 one.

 If the number of elements for the next level is odd (for example 21 blocks to spread into 2 hashes),
 the majority of elements will go into the left branch if the parent node was a left branch
 and into the right branch if the parent node was a right branch. The first node is always
 taken as a left branch.

Example tree:
	FileSize: 19506000 Bytes = 18,6 MB

								X (18,6)                                   MasterHash
							 /     \
						 X (18,55)   \
					/		\	       \
                   X(9,28)  x(9,28)   X (0,05MB)						   PartHashs
			   /      \    /       \        \
		X(4,75)   X(4,57) X(4,57)  X(4,75)   \

						[...............]
X(180KB)   X(180KB)  [...] X(140KB) | X(180KB) X(180KB) [...]			   BlockHashs
									v
						 Border between first and second Part (9.28MB)

Hash identifier:
When hashes are sent, they are sent with a 16bit identifier which specifies their position in the
tree (so StartPosition + HashDataSize would lead to the same hash).
The identifier basically describes the path from the top of the tree to the hash. A set bit (1)
means follow the left branch, a 0 means follow the right. The highest bit which is set is seen as the
start position (since the first node is always seen as left).

Example

								x                   0000000000000001
							 /     \		
						 x		    \				0000000000000011
					  /		\	       \
                    x       _X_          x 	        0000000000000110


Version 2 of AICH also supports 32bit identifiers to support large files, check CAICHHashSet::CreatePartRecoveryData


*/
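
The two rules described in the comment above (how an odd number of elements is split, and how the 16bit identifier encodes the path) can be sketched like this. HashIdentifier and SplitElements are hypothetical helper names chosen for illustration; the path string of 'L' and 'R' characters is an assumed input format, not something eMule uses.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical helper: 'path' lists the branches taken below the root,
// 'L' for left and anything else for right. The root contributes the
// leading 1 bit because the first node is always seen as left.
std::uint16_t HashIdentifier(const std::string& path)
{
    assert(path.size() <= 15);  // root bit plus 15 steps fill 16 bits
    std::uint16_t id = 1;
    for (char step : path)
        id = static_cast<std::uint16_t>((id << 1) | (step == 'L' ? 1 : 0));
    return id;
}

// Hypothetical helper: splits 'count' elements between the left and right
// child of a node. With an odd count, the extra element goes to the left
// child if the node itself is a left branch, otherwise to the right child.
// Returns {left, right}.
std::pair<unsigned, unsigned> SplitElements(unsigned count, bool nodeIsLeft)
{
    const unsigned smaller = count / 2;
    const unsigned larger = count - smaller;
    return nodeIsLeft ? std::make_pair(larger, smaller)
                      : std::make_pair(smaller, larger);
}
```

For the example tree in the comment, HashIdentifier("") gives 0000000000000001 for the root, HashIdentifier("L") gives 0000000000000011, and HashIdentifier("LR") gives 0000000000000110 for the node marked _X_. For the 21-block example, SplitElements(21, true) yields 11 left and 10 right, while SplitElements(21, false) yields 10 left and 11 right.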

Some design thoughts

[16:40:40] <spiralvoice> pango: or create a new subdir in $MLDONKEY_DIR and keep ini files with aich hashes there named after the md4 hash
[16:40:52] <spiralvoice> pango: one ini file per md4 hash
[16:40:59] <pango> yes, that's a possibility
[16:41:32] <pango> not too efficient, but should at least work
[16:48:29] <pango> parsing, the packet computation, then garbage collection of the hashes will take some time and cpu... maybe we'll need some kind of throttling to avoid DoSes