                              UKC_SIDX.ARJ
                              ~~~~~~~~~~~~
      Two Special Surname/Soundex indexes to the 2% Census Sample
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This archive contains:-

UKC_SIDX.TXT       The file you are now reading.

UKC_NI0.TXT    )   Two name indexes, derived from the original name
UKC_SDX.TXT    )   index to the 2% Census Sample.

These files have been constructed specially for use with the program
XTRACT, written by Ron MacRae and Rosemary Lockie, to help with
extraction of households with specified surnames from the UK 2% Census
Sample files, UKC_ccc.ARJ.  Our program will look up the surname(s) you
specify for your search and generate the search request automatically,
selecting the UKC_ccc.ARJ files for the counties in which each surname
occurs.

Both files contain a list of surnames found in the various county files,
and are derived from the original name index, UKC_NIDX.TXT.  UKC_NI0.TXT
is a straight alphabetical surname listing.  UKC_SDX.TXT has the soundex
code for the surname added, and is sorted in order of soundex code.
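
For anyone wishing to reproduce the codes, here is a minimal Python
sketch of the standard (American) Soundex rules.  This file does not say
which Soundex variant was used to build UKC_SDX.TXT, so the function
below is an assumption and not necessarily the method used here; the
name soundex() and the handling of spaces and question marks (they are
simply skipped) are likewise illustrative.

```python
# Letter groups for standard Soundex; vowels, H, W and Y carry no digit.
CODES = {}
for digit, letters in enumerate(["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name):
    """Return a 4-character standard Soundex code (assumed variant)."""
    # Keep only the letters A-Z; spaces, '?' and other marks are skipped.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            out += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (out + "000")[:4]  # pad with zeros, truncate to 4 characters


print(soundex("SINCLAIR"), soundex("MCKELLAR"))
```

The result is always 4 characters long, which agrees with the length
given for the codes in UKC_SDX.TXT.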

UKC_NI0.TXT began as a straight copy of UKC_NIDX.TXT.  However, for ease
of use within XTRACT, and to keep the overall size of the index to a
minimum, the following changes were made.

1.   All counties for one surname have been combined onto one line,
     separated by commas.  The county trigraphs have been replaced with
     two-digit dinomes, 01 to 92, representing the UK counties (see the
     table at the end of this file).

2.   Trailing question marks on surnames have been ignored, so that
     entries for BROWN and BROWN? or BROWN?? have been combined together
     in the resultant index.

     N.B. Question marks elsewhere in the surnames have been retained.

3.   Some of the entries in the original surname index have been split,
     if there appears to be more than one choice of surname.  So for
     instance, two entries have been made for "SINCLAIR OR MCKELLAR"
     (found in BUT5101.TXT): "SINCLAIR" and "MCKELLAR".  However, "DE LA
     MOTTE" (DOR5106.TXT), "VAN DEN HONERT" (WAR5117.TXT) and similar
     have been retained as single names.  In these two examples that is
     because the first word is less than 4 characters long, although the
     overall algorithm used for splitting is rather more complicated
     than that.
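
Changes 1 and 2 above can be sketched in Python as follows.  This is an
illustration only, not the program actually used to build the index: the
function name merge_index() and the input shape (pairs of surname and
dinome) are assumptions, and the splitting of change 3 is left out
because its full rules are not given here.

```python
from collections import defaultdict


def merge_index(pairs):
    """Combine (surname, dinome) pairs into one entry per surname.

    Hypothetical input shape: one pair per surname/county occurrence.
    """
    merged = defaultdict(set)
    for surname, dinome in pairs:
        # Change 2: ignore trailing question marks only; a '?' elsewhere
        # in the surname is retained.
        key = surname.rstrip("?")
        # Change 1: gather every county for the surname in one place.
        merged[key].add(dinome)
    return {name: sorted(codes) for name, codes in sorted(merged.items())}


# Illustrative data, not taken from the real index.
sample = [("BROWN", "24"), ("BROWN?", "81"), ("BROWN??", "24"), ("BR?WN", "11")]
for name, codes in merge_index(sample).items():
    print(name + "\t" + ",".join(codes))
```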


Together, these changes have resulted in a saving of about 29% in the
size of the overall straight name index file:- 553,680 bytes, compared
with 783,438 bytes in the original.  UKC_SDX in its raw state adds an
additional 733,290 bytes (229,625 bytes compressed).

The format of the two files is as follows:-

     UKC_NI0.TXT format           surname{tab}dd,dd,dd...

     UKC_SDX.TXT format   sndx{sp}surname{tab}dd,dd,dd...

In UKC_SDX, a single space separates the soundex code from the surname.
A {tab} character (ASCII value 09) is used to separate the surname
(variable length) from the list of dinomes.  The soundex code is always 4
characters.

Either of these indexes may be imported into a database file if desired.
If so, you will need to know that the maximum line length is 236
characters, and the maximum length of a surname within those 236
characters is 19.

The way to do this would be to create a database with the following
structure:-

Soundex       5    May be reduced to 4 after importing; 5 characters
                   allows for the space on import.

Data        236    Surname, and list of county dinomes.

Surname      19    To be filled in after import.

Please note that if you wish to split the surname out into its own
field, you can do so with the following dBase command, or similar in your
own database language:-

replace all surname with left(data,at(chr(9),data)-1)
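
Outside dBase, the same split at the {tab} character can be sketched in
Python.  The function names parse_ni0() and parse_sdx() are made up for
illustration, and the sample lines are invented to match the two formats
described above; they are not copied from the index files.

```python
def parse_ni0(line):
    """Split a UKC_NI0.TXT line: surname{tab}dd,dd,dd..."""
    surname, _, codes = line.rstrip("\r\n").partition("\t")
    dinomes = codes.split(",") if codes else []
    return surname, dinomes


def parse_sdx(line):
    """Split a UKC_SDX.TXT line: sndx{sp}surname{tab}dd,dd,dd..."""
    head, _, codes = line.rstrip("\r\n").partition("\t")
    # The soundex code is always 4 characters, followed by one space.
    sndx, surname = head[:4], head[5:]
    dinomes = codes.split(",") if codes else []
    return sndx, surname, dinomes


print(parse_ni0("DE LA MOTTE\t24\n"))
print(parse_sdx("B650 BROWN\t24,81\n"))
```

Taking the first 4 characters as the code, and not splitting on the
first space, keeps surnames such as "DE LA MOTTE" intact.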



A table of the counties and the dinomes chosen follows:-

No.   Code  County                  No.   Code  County
01    ABD   Aberdeen                47    LKS   Lanarkshire
02    AGY   Anglesey                48    LAN   Lancashire
03    ARL   Argyll                  49    LEC   Leicestershire
04    AYR   Ayrshire                50    LIN   Lincolnshire
05    BAN   Banff                   51    LLS   Linlithgow
06    BDF   Bedfordshire            52    MER   Merioneth
07    BRK   Berkshire               53    MDX   Middlesex
08    BEW   Berwick                 54    MLN   Midlothian
09    BRE   Brecknockshire          55    MON   Monmouth
10    BKM   Buckingham              56    MGY   Montgomery
11    BUT   Bute                    57    MOR   Moray
12    CAI   Caithness               58    NAI   Nairn
13    CAM   Cambridgeshire          59    NFK   Norfolk
14    CGN   Cardiganshire           60    NTH   Northamptonshire
15    CMN   Carmarthenshire         61    NBL   Northumberland
16    CAE   Carnarvonshire          62    NTT   Nottinghamshire
17    CHS   Cheshire                63    ORK   Orkney
18    CLK   Clackmannan             64    OXF   Oxfordshire
19    CON   Cornwall                65    PEE   Peebles
20    CUL   Cumberland              66    PEM   Pembroke
21    DEN   Denbighshire            67    PER   Perthshire
22    DBY   Derbyshire              68    RAD   Radnor
23    DEV   Devon                   69    RFW   Renfrew
24    DOR   Dorset                  70    ROC   Ross
25    DNB   Dumbartonshire          71    ROX   Roxburgh
26    DFS   Dumfries                72    SEL   Selkirk
27    DUR   Durham                  73    SAL   Shropshire
28    EDN   Edinburgh               74    SOM   Somerset
29    ELG   Elgin                   75    STS   Staffordshire
30    ESS   Essex                   76    STI   Stirling
31    FIF   Fife                    77    SFK   Suffolk
32    FLN   Flint                   78    SRY   Surrey
33    ANS   Forfar (Angus)          79    SSX   Sussex
34    GLA   Glamorgan               80    SUT   Sutherland
35    GLS   Gloucestershire         81    WAR   Warwickshire
36    HAD   Haddingtonshire         82    WES   Westmorland
37    HAM   Hampshire               83    WIG   Wigtown
38    HEF   Hereford                84    WIL   Wiltshire
39    HRT   Hertfordshire           85    WOR   Worcestershire
40    HUN   Huntingdon              86    ERY   Yorkshire East Riding
41    INV   Inverness               87    NRY   Yorkshire North Riding
42    IOW   Isle of Wight           88    WRY   Yorkshire West Riding
43    KEN   Kent                    89    YKS   Yorkshire County
44    KCD   Kincardine              90    ZET   Shetland
45    KRS   Kinross                 91    ANT   Antrim
46    KKD   Kirkcudbright           92    RUT   Rutland
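
The table above can be turned into a lookup from dinome back to county
trigraph, sketched here in Python.  The function archives_for() is
hypothetical, and it assumes that "ccc" in UKC_ccc.ARJ stands for the
county trigraph; that naming is inferred from this file and is not
confirmed by it.

```python
# Trigraphs in dinome order, 01 to 92, copied from the table above.
TRIGRAPHS = """
ABD AGY ARL AYR BAN BDF BRK BEW BRE BKM
BUT CAI CAM CGN CMN CAE CHS CLK CON CUL
DEN DBY DEV DOR DNB DFS DUR EDN ELG ESS
FIF FLN ANS GLA GLS HAD HAM HEF HRT HUN
INV IOW KEN KCD KRS KKD LKS LAN LEC LIN
LLS MER MDX MLN MON MGY MOR NAI NFK NTH
NBL NTT ORK OXF PEE PEM PER RAD RFW ROC
ROX SEL SAL SOM STS STI SFK SRY SSX SUT
WAR WES WIG WIL WOR ERY NRY WRY YKS ZET
ANT RUT
""".split()

# "01" -> "ABD", "02" -> "AGY", and so on.
DINOME_TO_TRIGRAPH = {f"{i:02d}": t for i, t in enumerate(TRIGRAPHS, start=1)}


def archives_for(dinomes):
    """Name the county archives to search (assumed UKC_ccc.ARJ naming)."""
    return [f"UKC_{DINOME_TO_TRIGRAPH[d]}.ARJ" for d in dinomes]


print(archives_for(["11", "24", "81"]))
```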


This information has been prepared by Rosemary Lockie, 2:253/188 in
FidoNet, 2nd September 1993.          
