[ADC-Ext 1.0.6] Group search extensions (EX in SCH)

=== GR - Group search extensions
In BASE, clients add EX fields to SCH to specify which extensions matching files should have. This can lead to a situation where the bulk of the extensions are of a similar ‘type’, e.g. audio files or documents. This extension intends to add a field GR, which groups multiple extensions. In addition, the field RX shall be used for exclusion within a group: if all extensions in a group but one are desired, RX is used to exclude that one extension.

Field GR values:
[options="autowidth"]
|=====
|1 |Audio |MP3, FLAC, OGG, MPC, APE, WMA, WAV, M4A, MP2, MID, AU, AIFF, RA
|2 |Compressed |RAR, 7z, ZIP, TAR, TZ, BZ2, Z, ACE, LHA, LZH, ARJ
|4 |Document |DOC, XLS, PPT, DOCX, XLSX, PPTX, ODF, ODT, ODS, ODP, PDF, XPS, HTM, HTML, XML, TXT, NFO, RTF
|8 |Executable |EXE, COM, BAT, CMD, DLL, VBS, PS1, MSI
|16 |Picture |BMP, ICO, JPG, JPEG, PNG, GIF, TGA, AI, PS, PICT, EPS, IMG, PCT, PSP, TIF, RLE, PCX, SFW, PSD, CDR
|32 |Video |MPG, AVI, MKV, WMV, MOV, MP4, 3GP, QT, ASX, DIVX, ASF, PXP, OGM, FLV, RM, RMVB, WEBM, MPEG
|=====
Multiple groups are specified by adding the numbers together.
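
As a minimal sketch of this additive encoding: each group value in the table is a distinct power of two, so adding distinct groups gives the same result as a bitwise OR, and the receiver can test membership with a bitwise AND. The constant and function names below are illustrative only and do not come from the spec.

```python
# Group values from the table above; the names are illustrative, not from the spec.
AUDIO, COMPRESSED, DOCUMENT, EXECUTABLE, PICTURE, VIDEO = 1, 2, 4, 8, 16, 32


def combine(*groups: int) -> int:
    """Build a GR value; for distinct groups, OR-ing equals adding the numbers."""
    mask = 0
    for group in groups:
        mask |= group
    return mask


def contains(gr: int, group: int) -> bool:
    """Return True if the GR value includes the given group."""
    return gr & group != 0


gr = combine(AUDIO, VIDEO)  # 1 + 32
print(f"GR{gr}", contains(gr, VIDEO), contains(gr, DOCUMENT))
```

With these values, a search for audio and video files would carry `GR33`.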

Field RX
[options="autowidth"]
|=====
|RX |Extensions that are in a group but are not desired. E.g., “GR1 RXMP3 RXWAV” would include all extensions in the ‘audio’ group except ‘MP3’ and ‘WAV’.
|=====
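
The following is a rough sketch of how a receiving client might turn GR and RX fields into the set of wanted extensions. It uses only the Audio and Compressed rows from the GR table above; the `GROUPS` mapping and the `wanted_extensions` helper are hypothetical names, and the interaction with plain EX fields is not covered here because the text above does not specify it.

```python
# Subset of the GR table above (Audio and Compressed only), for illustration.
GROUPS = {
    1: {"MP3", "FLAC", "OGG", "MPC", "APE", "WMA", "WAV",
        "M4A", "MP2", "MID", "AU", "AIFF", "RA"},
    2: {"RAR", "7Z", "ZIP", "TAR", "TZ", "BZ2", "Z",
        "ACE", "LHA", "LZH", "ARJ"},
}


def wanted_extensions(fields: list[str]) -> set[str]:
    """Expand GR groups and drop RX exclusions (hypothetical helper).

    Each field is a two-letter name followed by its value, e.g. "GR1" or "RXMP3".
    """
    gr = 0
    excluded: set[str] = set()
    for field in fields:
        name, value = field[:2], field[2:]
        if name == "GR":
            gr = int(value)
        elif name == "RX":
            excluded.add(value.upper())

    wanted: set[str] = set()
    for bit, extensions in GROUPS.items():
        if gr & bit:
            wanted |= extensions
    return wanted - excluded


print(sorted(wanted_extensions(["GR1", "RXMP3", "RXWAV"])))
```

For the example from the RX table, the result is the audio group without MP3 and WAV.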

Note that I used DC++ for listing all of those extensions. Feel free to suggest additions.

A set for common programming file extensions? I suppose if anyone was bothered you could have sets relevant to a number of areas e.g. CAD could be another. Just because they’re in the protocol doesn’t mean the client has to display them to the user… just return search results for them.

Write down a list of extensions and I’ll add it. :slight_smile: (The question is rather whether those things are commonly searched for…)

this is now in DC++ and will appear in the version following 0.770.

rev 2297: http://bazaar.launchpad.net/~dcplusplus-team/dcplusplus/trunk/revision/2297

now that this is in the dev version of DC++, ADC hub owners seem to be concerned about how well searches will work in the near future.

it is likely that several mods of DC++ that don’t have the GR code changes are going to be sticking around for a while; some of them are even in the middle of important overhauls.
these mods (or old DC++ versions as well) will return expected results on NMDC searches but more random results on ADC. we knew it from the start and thought the compromise was ok, but ADC hub owners think otherwise. even if just for searches, this decreases the predictability and trust in ADC.

there are a couple of ways to address this concern, and one must be chosen before a DC++ release:

  • make this a named extension so it can be detected whether the peer supports it or not. i propose “SEGA” (Search Extension Grouping - version A).
  • release the parser part first, and wait a few months before releasing the actual extension grouping.

needless to say, i strongly prefer the first solution.
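
to illustrate the first option, here is a small sketch of feature detection with a fallback. it assumes the peer advertises supported features as a comma-separated list (as in the SU field of INF) and that “SEGA” appears there; the function names are hypothetical, and whether the choice is made per peer or per feature-filtered broadcast is left out.

```python
def parse_features(su_value: str) -> set[str]:
    """Split a comma-separated feature list (assumed format) into a set."""
    return {feature for feature in su_value.split(",") if feature}


def extension_fields(peer_su: str, gr: int, excluded: list[str],
                     expanded: list[str]) -> list[str]:
    """Choose the search fields for one peer (hypothetical helper).

    Peers that advertise SEGA get the compact GR/RX form; others get the
    group expanded into plain EX fields, which BASE already understands.
    """
    if "SEGA" in parse_features(peer_su):
        return [f"GR{gr}"] + [f"RX{ext}" for ext in excluded]
    return [f"EX{ext}" for ext in expanded if ext not in excluded]


print(extension_fields("TCP4,SEGA", 1, ["MP3"], ["MP3", "FLAC"]))
print(extension_fields("TCP4", 1, ["MP3"], ["MP3", "FLAC"]))
```

the point of the fallback is that clients without the GR code still receive EX fields they can parse, so their results stay predictable.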

there is one more issue: the default extensions that have been in DC++ were only there to match NMDC perfectly. we have now decided it will be safe to remove deprecated extensions from that list; so here is a new, updated list (by eMTee):

[options="autowidth"]
|=====
|1 |Audio |APE, FLAC, M4A, MID, MP3, MPC, OGG, RA, WAV, WMA
|2 |Compressed |7Z, ACE, ARJ, BZ2, LHA, LZH, RAR, TAR, TZ, Z, ZIP
|4 |Document |DOC, DOCX, HTM, HTML, NFO, ODF, ODP, ODS, ODT, PDF, PPT, PPTX, RTF, TXT, XLS, XLSX, XML, XPS
|8 |Executable |APP, BAT, CMD, COM, DLL, EXE, JAR, MSI, PS1, VBS, WSF
|16 |Picture |BMP, CDR, EPS, GIF, ICO, IMG, JPEG, JPG, PNG, PS, PSD, SFW, TGA, TIF, WEBP
|32 |Video |3GP, ASF, ASX, AVI, DIVX, FLV, MKV, MOV, MP4, MPEG, MPG, OGM, PXP, QT, RM, RMVB, SWF, VOB, WEBM, WMV
|=====

SEGA will be fine, updating accordingly.

SEGA should now be supported by DC++ (rev 2311). note that i have added “gz” to the list above.

Pushed in official ADC-Ext 1.0.6.

Now we are only missing a gui implementation :wink:

How does this perform?

In NMDC, the filetype flag is saved for each folder, so only the appropriate folders need to be explored for files. Extension comparison is also very fast, because it just compares 4 bytes, which is a single operation on a 32-bit system.

In ADC, on the other hand, the whole shared folder structure needs to be explored, which can be very slow. Extension comparison is slow too, because it compares strings character by character, which takes 4 operations in the normal case.

I suggested merging the NMDC and ADC code. Currently, ADC extension checking is in ADCHub and NMDC extension checking is in ShareManager, so the extension lists are duplicated. If everything is merged into ShareManager and filetype flags are used for ADC too (at least for the standard filetypes), it should fix the first performance problem (folder exploration).
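
The folder-flag idea can be sketched as follows. This is not the DC++ ShareManager code; the `Folder` class, `refresh_mask`, and `search` are hypothetical names, and the sketch assumes each folder caches a bitmask covering its own files and all of its subfolders, so that a GR search can skip whole subtrees.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    # file name -> group bit of its extension (0 if it belongs to no group)
    files: dict[str, int] = field(default_factory=dict)
    subfolders: list["Folder"] = field(default_factory=list)
    mask: int = 0  # cached OR of all group bits in this subtree


def refresh_mask(folder: Folder) -> int:
    """Recompute the cached group mask bottom-up, e.g. after a share refresh."""
    mask = 0
    for bit in folder.files.values():
        mask |= bit
    for sub in folder.subfolders:
        mask |= refresh_mask(sub)
    folder.mask = mask
    return mask


def search(folder: Folder, gr: int, visited: list[Folder]) -> list[str]:
    """Collect files whose group is in gr, skipping subtrees with no match."""
    if folder.mask & gr == 0:
        return []  # nothing below this folder can match
    visited.append(folder)  # recorded only to show how many folders were explored
    hits = [name for name, bit in folder.files.items() if bit & gr]
    for sub in folder.subfolders:
        hits += search(sub, gr, visited)
    return hits


root = Folder(files={"readme.txt": 4},
              subfolders=[Folder(files={"a.mp3": 1}), Folder(files={"b.zip": 2})])
refresh_mask(root)
explored: list[Folder] = []
print(search(root, 1, explored), len(explored))
```

In this toy tree, an audio search explores the root and the folder holding `a.mp3`, and never enters the folder that contains only archives.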

This is in ADC-Ext 1.0.6, closing.