Simulation of a large DNA Profile Database, full alleles
Simulation of a large DNA Profile Database, full alleles
Previous simulation ( file dnas.htm ) used only a maximum of 10 alleles,
wheras the average for UK caucasians published in journal
articles have an average of 11.3 alleles per locus.
The method disclosed here would be expandable to
any number of loci (eg 13 for the USA ) and more than
60 alleles per locus if ever required.
Visual basic macros and Word macros are
between the horizontal rules.
Before going into Visual Basic Editor go into ordinary
Word and call up anything in the directory you want
the VB files to go into as this is not designated in the
following code.
Using plain text handling Notepad with no line wrap
copy to Run/Create and place between Sub / End Sub ,Reset and Run.
Some long conditional statements may break
and would need reconcattenating before running.
If using straight VB6 then designate the directory
for files by "replace all" occurances of dec14 to
c:\vb\dec14 or whatever, also add a sound progress
indicator before the [ next x ] line
If x/1000 = Int (x/1000 ) Then Beep
before highlighting and copying.
In VB6 open New Project
In Form1 open up a Command1 button
Double click this button to open command
code window and copy and paste the 'DNA' VB code
between the Private Sub Command1_click ()
and the End Sub
Then Run/Start
Press command1
Wait until Beep/ clicks cease
Despite official government sites linking to these files there are
still corrupt persons knocking out my sites, so for the
purposes of searchengines cross-linking them, files no longer
available on the original web hosting sites were on
http://www.nutteing.50megs.com/dnas5.htm , http://www.nutteing.freeisp.co.uk/dnas5.htm , http://www.nutteing.batcave.net/dnas5.htm , http://home.graffiti.net/nutteing/dnas5.htm ,
http://nutteing.no-frills.net/dnas5.htm and http://nutteing3.no-frills.net/dnas5.htm (last 2 due now to host failure)
' Generating 10 loci x2 profiles
' directing pairs and first divider
Dim ph(20)
Dim pb(20)
' initialising Random Number Generator - RNG
count9 = 0
count8 = 0
Randomize
a = 214013
c = 2531011
x0 = Timer
z = 2 ^ 24
' 1 file 'dec14-g' for original, un-directed pairs, source data.
' This file is necessary to check on the performance of the RNG
' when a matched pair is found then it is highly unlikely that
' both sequences as generated, before pair directing, would
' be the same - more likely a manifest of repeat within the RNG
' (reason for adopting the 214013 / 2531011 RNG )
' Use 'Word' find function on part of the sequences, including pair reversals,
' with luck would include a 'homozygotic' pair eg (3,3) say ,so no reversal
' on that pair
Open "dec14-g" For Output As #1
' outputs directed and divided by first digit
Open "dec14-0" For Output As #10
Open "dec14-1" For Output As #11
Open "dec14-2" For Output As #12
Open "dec14-3" For Output As #13
Open "dec14-4" For Output As #14
Open "dec14-5" For Output As #15
Open "dec14-6" For Output As #16
Open "dec14-7" For Output As #17
Open "dec14-8" For Output As #18
Open "dec14-9" For Output As #19
' change for different total size eg 199999 for 200,000
For x = 0 To 4999
For j = 0 To 1
' vWA ,first locus
' RNG random number generator
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.001 Then ph(j) = 11
If ph(j) < 0.106 Then ph(j) = 1
If ph(j) < 0.186 Then ph(j) = 2
If ph(j) < 0.402 Then ph(j) = 3
If ph(j) < 0.672 Then ph(j) = 4
If ph(j) < 0.891 Then ph(j) = 5
If ph(j) < 0.984 Then ph(j) = 6
If ph(j) < 0.998 Then ph(j) = 7
If ph(j) < 1 Then ph(j) = 8
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 2 To 3
' THO1
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.243 Then ph(j) = 1
If ph(j) < 0.437 Then ph(j) = 2
If ph(j) < 0.545 Then ph(j) = 3
If ph(j) < 0.546 Then ph(j) = 4
If ph(j) < 0.686 Then ph(j) = 5
If ph(j) < 0.99 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 4 To 5
' D8
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.018 Then ph(j) = 11
If ph(j) < 0.031 Then ph(j) = 1
If ph(j) < 0.125 Then ph(j) = 2
If ph(j) < 0.191 Then ph(j) = 3
If ph(j) < 0.334 Then ph(j) = 4
If ph(j) < 0.667 Then ph(j) = 5
If ph(j) < 0.876 Then ph(j) = 6
If ph(j) < 0.964 Then ph(j) = 7
If ph(j) < 0.995 Then ph(j) = 8
If ph(j) < 1 Then ph(j) = 9
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 6 To 7
' FGA
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.025 Then ph(j) = 11
If ph(j) < 0.081 Then ph(j) = 1
If ph(j) < 0.224 Then ph(j) = 2
If ph(j) < 0.226 And ph(j) >= 0.224 Then pb(j) = "A"
If ph(j) < 0.413 Then ph(j) = 3
If ph(j) < 0.415 And ph(j) >= 0.413 Then pb(j) = "B"
If ph(j) < 0.58 Then ph(j) = 4
If ph(j) < 0.591 And ph(j) >= 0.58 Then pb(j) = "C"
If ph(j) < 0.73 Then ph(j) = 5
If ph(j) < 0.734 And ph(j) >= 0.73 Then pb(j) = "D"
If ph(j) < 0.88 Then ph(j) = 6
If ph(j) < 0.882 And ph(j) >= 0.88 Then pb(j) = "E"
If ph(j) < 0.957 Then ph(j) = 7
If ph(j) < 0.992 Then ph(j) = 8
If ph(j) < 0.999 Then ph(j) = 9
If ph(j) < 1 And ph(j) >= 0.999 Then pb(j) = "F"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
Next j
For j = 8 To 9
' D21
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.001 Then pb(j) = "A"
If ph(j) < 0.002 And ph(j) >= 0.001 Then pb(j) = "B"
If ph(j) < 0.033 Then ph(j) = 11
If ph(j) < 0.193 Then ph(j) = 1
If ph(j) < 0.419 Then ph(j) = 2
If ph(j) < 0.677 Then ph(j) = 3
If ph(j) < 0.704 Then ph(j) = 4
If ph(j) < 0.773 Then ph(j) = 5
If ph(j) < 0.866 Then ph(j) = 6
If ph(j) < 0.884 Then ph(j) = 7
If ph(j) < 0.974 Then ph(j) = 8
If ph(j) < 0.975 And ph(j) >= 0.974 Then pb(j) = "C"
If ph(j) < 0.997 Then ph(j) = 9
If ph(j) < 1 And ph(j) >= 0.997 Then pb(j) = "D"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
Next j
For j = 10 To 11
' D18
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.001 Then pb(j) = "A"
If ph(j) < 0.009 And ph(j) >= 0.001 Then pb(j) = "B"
If ph(j) < 0.021 Then ph(j) = 11
If ph(j) < 0.16 Then ph(j) = 1
If ph(j) < 0.285 Then ph(j) = 2
If ph(j) < 0.449 Then ph(j) = 3
If ph(j) < 0.594 Then ph(j) = 4
If ph(j) < 0.731 Then ph(j) = 5
If ph(j) < 0.846 Then ph(j) = 6
If ph(j) < 0.926 Then ph(j) = 7
If ph(j) < 0.967 Then ph(j) = 8
If ph(j) < 0.982 Then ph(j) = 9
If ph(j) < 0.992 And ph(j) >= 0.982 Then pb(j) = "C"
If ph(j) < 0.997 And ph(j) >= 0.992 Then pb(j) = "D"
If ph(j) < 0.998 And ph(j) >= 0.997 Then pb(j) = "E"
If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "F"
' allele 20 (C) reduced from .017 to .015 as allele
' frequencies summed to 1.002
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
Next j
For j = 12 To 13
' D2S1338
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.037 Then ph(j) = 11
If ph(j) < 0.222 Then ph(j) = 1
If ph(j) < 0.309 Then ph(j) = 2
If ph(j) < 0.419 Then ph(j) = 3
If ph(j) < 0.557 Then ph(j) = 4
If ph(j) < 0.589 Then ph(j) = 5
If ph(j) < 0.613 Then ph(j) = 6
If ph(j) < 0.725 Then ph(j) = 7
If ph(j) < 0.867 Then ph(j) = 8
If ph(j) < 0.978 Then ph(j) = 9
If ph(j) < 0.997 And ph(j) >= 0.978 Then pb(j) = "A"
If ph(j) < 1 And ph(j) >= 0.997 Then pb(j) = "B"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
Next j
For j = 14 To 15
' D16
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.019 Then ph(j) = 11
If ph(j) < 0.148 Then ph(j) = 1
If ph(j) < 0.202 Then ph(j) = 2
If ph(j) < 0.491 Then ph(j) = 3
If ph(j) < 0.779 Then ph(j) = 4
If ph(j) < 0.965 Then ph(j) = 5
If ph(j) < 0.994 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 16 To 17
' D19
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.087 Then ph(j) = 11
If ph(j) < 0.309 Then ph(j) = 1
If ph(j) < 0.322 Then ph(j) = 2
If ph(j) < 0.704 Then ph(j) = 3
If ph(j) < 0.719 Then ph(j) = 4
If ph(j) < 0.896 Then ph(j) = 5
If ph(j) < 0.934 Then ph(j) = 6
If ph(j) < 0.975 Then ph(j) = 7
If ph(j) < 0.992 Then ph(j) = 8
If ph(j) < 0.997 Then ph(j) = 9
If ph(j) < 0.999 And ph(j) >= 0.997 Then pb(j) = "A"
If ph(j) < 1 And ph(j) >= 0.999 Then pb(j) = "B"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
Next j
For j = 18 To 19
' D3
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.001 Then ph(j) = 11
If ph(j) < 0.007 Then ph(j) = 1
If ph(j) < 0.139 Then ph(j) = 2
If ph(j) < 0.404 Then ph(j) = 3
If ph(j) < 0.651 Then ph(j) = 4
If ph(j) < 0.846 Then ph(j) = 5
If ph(j) < 0.987 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
Next j
' output the original generated file
Write #1, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
' Because in real DNA profiles without further info ,no one
' knows which allele in each pair came from the mother or father
' by convention they are written smaller ,larger (or equal).
' The following directs each pair
For j = 0 To 18 Step 2
If ph(j + 1) < ph(j) Then
jjj = ph(j)
ph(j) = ph(j + 1)
ph(j + 1) = jjj
End If
Next j
' put extra conditional statements here to reduce
' the number of files or just delete some of the following
'
' dividing on first column, file by file
If ph(0) = 0 Then
Write #10, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count0 = count0 + 1
End If
If ph(0) = 1 Then
Write #11, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count1 = count1 + 1
End If
If ph(0) = 2 Then
Write #12, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count2 = count2 + 1
End If
If ph(0) = 3 Then
Write #13, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count3 = count3 + 1
End If
If ph(0) = 4 Then
Write #14, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count4 = count4 + 1
End If
If ph(0) = 5 Then
Write #15, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count5 = count5 + 1
End If
If ph(0) = 6 Then
Write #16, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count6 = count6 + 1
End If
If ph(0) = 7 Then
Write #17, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count7 = count7 + 1
End If
If ph(0) = 8 Then
Write #18, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count8 = count8 + 1
End If
If ph(0) = 9 Then
Write #19, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19)
count9 = count9 + 1
End If
Next x
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Close #1
' count file for data to fix for - next loops in sucessive dividings
Open "dec14-c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
' Dividing file into 10 by second digit
Dim ph(20)
dim ps as string
' xxxx = count size from count file
xxxx =
' input file
Open "dec14-1" For Input As #1
' 10 divided files
Open "dec14-10" For Output As #10
Open "dec14-11" For Output As #11
Open "dec14-12" For Output As #12
Open "dec14-13" For Output As #13
Open "dec14-14" For Output As #14
Open "dec14-15" For Output As #15
Open "dec14-16" For Output As #16
Open "dec14-17" For Output As #17
Open "dec14-18" For Output As #18
Open "dec14-19" For Output As #19
count9 = 0
count8 = 0
xxxx = xxxx - 1
For x = 0 To xxxx
Input #1, ps
a2$ = Mid(ps, 2, 1)
ph(1) = Val(a2$)
If ph(1) = 0 Then
Write #10, ps
count0 = count0 + 1
End If
If ph(1) = 1 Then
Write #11, ps
count1 = count1 + 1
End If
If ph(1) = 2 Then
Write #12, ps
count2 = count2 + 1
End If
If ph(1) = 3 Then
Write #13, ps
count3 = count3 + 1
End If
If ph(1) = 4 Then
Write #14, ps
count4 = count4 + 1
End If
If ph(1) = 5 Then
Write #15, ps
count5 = count5 + 1
End If
If ph(1) = 6 Then
Write #16, ps
count6 = count6 + 1
End If
If ph(1) = 7 Then
Write #17, ps
count7 = count7 + 1
End If
If ph(1) = 8 Then
Write #18, ps
count8 = count8 + 1
End If
If ph(1) = 9 Then
Write #19, ps
count9 = count9 + 1
End If
Next x
Close #1
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
' output counts
Open "dec14-1c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
' Dividing file into 10 by third digit
Dim ph(20)
dim ps as string
' enter count in xxxx
xxxx =
Open "dec14-11" For Input As #1
Open "dec14-110" For Output As #10
Open "dec14-111" For Output As #11
Open "dec14-112" For Output As #12
Open "dec14-113" For Output As #13
Open "dec14-114" For Output As #14
Open "dec14-115" For Output As #15
Open "dec14-116" For Output As #16
Open "dec14-117" For Output As #17
Open "dec14-118" For Output As #18
Open "dec14-119" For Output As #19
count9 = 0
count8 = 0
xxxx=xxxx - 1
For x = 0 To xxxx
Input #1, ps
a3$ = Mid(ps, 3, 1)
ph(2) = Val(a3$)
If ph(2) = 0 Then
Write #10, ps
count0 = count0 + 1
End If
If ph(2) = 1 Then
Write #11, ps
count1 = count1 + 1
End If
If ph(2) = 2 Then
Write #12, ps
count2 = count2 + 1
End If
If ph(2) = 3 Then
Write #13, ps
count3 = count3 + 1
End If
If ph(2) = 4 Then
Write #14, ps
count4 = count4 + 1
End If
If ph(2) = 5 Then
Write #15, ps
count5 = count5 + 1
End If
If ph(2) = 6 Then
Write #16, ps
count6 = count6 + 1
End If
If ph(2) = 7 Then
Write #17, ps
count7 = count7 + 1
End If
If ph(2) = 8 Then
Write #18, ps
count8 = count8 + 1
End If
If ph(2) = 9 Then
Write #19, ps
count9 = count9 + 1
End If
Next x
Close #1
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Open "dec14-11c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
' Dividing file into 10 by fourth digit
Dim ph(20)
dim ps as string
' enter count in xxxx
xxxx =
Open "dec14-131" For Input As #1
Open "dec14-1310" For Output As #10
Open "dec14-1311" For Output As #11
Open "dec14-1312" For Output As #12
Open "dec14-1313" For Output As #13
Open "dec14-1314" For Output As #14
Open "dec14-1315" For Output As #15
Open "dec14-1316" For Output As #16
Open "dec14-1317" For Output As #17
Open "dec14-1318" For Output As #18
Open "dec14-1319" For Output As #19
count9 = 0
count8 = 0
xxxx=xxxx - 1
For x = 0 To xxxx
Input #1, ps
a4$ = Mid(ps, 4, 1)
ph(3) = Val(a4$)
If ph(3) = 0 Then
Write #10, ps
count0 = count0 + 1
End If
If ph(3) = 1 Then
Write #11, ps
count1 = count1 + 1
End If
If ph(3) = 2 Then
Write #12, ps
count2 = count2 + 1
End If
If ph(3) = 3 Then
Write #13, ps
count3 = count3 + 1
End If
If ph(3) = 4 Then
Write #14, ps
count4 = count4 + 1
End If
If ph(3) = 5 Then
Write #15, ps
count5 = count5 + 1
End If
If ph(3) = 6 Then
Write #16, ps
count6 = count6 + 1
End If
If ph(3) = 7 Then
Write #17, ps
count7 = count7 + 1
End If
If ph(3) = 8 Then
Write #18, ps
count8 = count8 + 1
End If
If ph(3) = 9 Then
Write #19, ps
count9 = count9 + 1
End If
Next x
Close #1
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Open "dec14-131c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
' Dividing file into 10 by fifth digit
Dim ph(20)
Dim ps As String
' enter count in xxxx
xxxx =
Open "dec14-3412" For Input As #1
Open "dec14-34120" For Output As #10
Open "dec14-34121" For Output As #11
Open "dec14-34122" For Output As #12
Open "dec14-34123" For Output As #13
Open "dec14-34124" For Output As #14
Open "dec14-34125" For Output As #15
Open "dec14-34126" For Output As #16
Open "dec14-34127" For Output As #17
Open "dec14-34128" For Output As #18
Open "dec14-34129" For Output As #19
count9 = 0
count8 = 0
xxxx = xxxx - 1
For x = 0 To xxxx
Input #1, ps
a5$ = Mid(ps, 5, 1)
ph(4) = Val(a5$)
If ph(4) = 0 Then
Write #10, ps
count0 = count0 + 1
End If
If ph(4) = 1 Then
Write #11, ps
count1 = count1 + 1
End If
If ph(4) = 2 Then
Write #12, ps
count2 = count2 + 1
End If
If ph(4) = 3 Then
Write #13, ps
count3 = count3 + 1
End If
If ph(4) = 4 Then
Write #14, ps
count4 = count4 + 1
End If
If ph(4) = 5 Then
Write #15, ps
count5 = count5 + 1
End If
If ph(4) = 6 Then
Write #16, ps
count6 = count6 + 1
End If
If ph(4) = 7 Then
Write #17, ps
count7 = count7 + 1
End If
If ph(4) = 8 Then
Write #18, ps
count8 = count8 + 1
End If
If ph(4) = 9 Then
Write #19, ps
count9 = count9 + 1
End If
Next x
Close #1
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Open "dec14-3412c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
The next is sorting using Word Tables/ Sort
Before using ,make a test batch of numbers as there are various
Sort outcomes. Now I'm using string data, Text sort gave the
right form on my machine. Use Ctrl+shift+Home(or End) to
highlight text up or down .
After sort and before saving to disk press up or down
arrow to select which way the text is returned to you.
My set-up was limited to no more than 15,000. To sort
say 28,000 sort upper half ,then lower half then cut and
paste say 0 to 2 section of lower half into the top of the
top half. Re-sort the expanded 0 to 2 section then
re-sort the remainder. If say selecting 2 to 3 section then
cut and paste at the juncture of 2 and 3 in the other block
to save some repeated sorting. Other times it is quicker
to oversort then backtrack / overlap on the next sort.
Many of the subdivision files are empty because
of the directing. They consist of eg 4,4.. 4,5.... etc
never 4,0.., 4,1.. etc and a number of 8 and 9 sections
are absent back to the generator characteristics eg
only first 8 of 10 are used. When you know all files are less than
15,000, or whatever Sort limit, use the next (simply a recorded macro)
to sort 10 related files. An empty file will stop the macro so edit
out empty files before running.
'Sort 10 related files in one go
'
Documents.Open FileName:="dec14-130", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-131", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-132", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-133", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-134", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-135", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-136", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-137", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-138", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="dec14-139", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
' empty files will append spurious carriage returns at
' head or tail of files so check for this before final match routine
' otherwise use Insert / File to merge files
' merge 10 related files back to one
' for convenience I named these re-concattenated
' files as .txt so they were obvious in listing
' compared to no suffix ones
' in windows explorer the file sizes of these .txt files
' should have the same file-size as the original
' unexpanded and unsorted file
' use "Word" find for any repeated repeatedend of line "^p^p"
'
Documents.Add Template:="", NewTemplate:=False
Selection.InsertFile FileName:="dec14-130", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-131", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-132", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-133", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-134", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-135", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-136", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-137", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-138", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="dec14-139", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
ActiveDocument.SaveAs FileName:="dec14-13.txt", FileFormat:=wdFormatText, _
LockComments:=False, Password:="", AddToRecentFiles:=True, WritePassword _
:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False
Copy and paste all these subfiles together to
submit to the next section. The final match finding,
initially for 12 digits ,then change to 14,16,18
and finally 20 if 18 shows something. This routine
after hours of dividing/sorting/re-merging takes only seconds to complete.
' Find matching pairs in 12 digits
' xxxx is count = ????
xxxx =
b$ = "0"
Count = 0
Dim ps As String
Open "dec14-24.txt" For Input As #1
Open "dec14-24m12.txt" For Output As #2
' change the 12 in the #2 file name above and
' the Left function below to suit number of matches
xxxx = xxxx - 1
For x = 0 To xxxx
Input #1, ps
a$ = Left(ps, 12)
If a$ = b$ Then
Write #2, ps
Count = Count + 1
End If
b$ = a$
Next x
Write #2, "Count ", Count
close #1
Close #2
' Find matching triples in 12 digits
' xxxx is count from the count files
xxxx =
b$ = "0"
c$ = "0"
Count = 0
Dim ps As String
xxxx = xxxx - 1
Open "dec14-1.txt" For Input As #1
Open "dec14-1trip.txt" For Output As #2
' change the 12 in the #2 file name above and
' the Left function below to suit number of matches
For x = 0 To xxxx
Input #1, ps
a$ = Left(ps, 12)
a2$ = ps
If a$ = c$ Then
Write #2, a2$, b2$, c2$
Count = Count + 1
End If
If a$ = b$ Then
c$ = b$
c2$ = b2$
End If
b$ = a$
b2$ = a2$
Next x
Write #2, "Count ", Count
Close #1
Close #2
' Find matching quadruples in 12 digits
' xxxx is from the count files
xxxx =
b$ = "0"
c$ = "0"
Count = 0
Dim ps As String
xxxx = xxxx - 1
Open "dec14-3.txt" For Input As #1
Open "dec14-3quad.txt" For Output As #2
' change the 12 in the #2 file name above and
' the Left function below to suit number of matches
For x = 0 To xxxx
Input #1, ps
a$ = Left(ps, 12)
a2$ = ps
If a$ = d$ Then
Write #2, a2$, b2$, c2$, d2$
Count = Count + 1
End If
If a$ = c$ Then
d$ = c$
d2$ = c2$
End If
If a$ = b$ Then
c$ = b$
c2$ = b2$
End If
b$ = a$
b2$ = a2$
Next x
Write #2, "Count ", Count
Close #1
Close #2
' Find matching quintuples in 12 digits
' xxxx is from the count files
xxxx =
b$ = "0"
c$ = "0"
Count = 0
Dim ps As String
xxxx = xxxx - 1
Open "dec14-4.txt" For Input As #1
Open "dec14-4quin.txt" For Output As #2
' change the 12 in the #2 file name above and
' the Left function below to suit number of matches
For x = 0 To xxxx
Input #1, ps
a$ = Left(ps, 12)
a2$ = ps
If a$ = e$ Then
Write #2, a2$, b2$, c2$, d2$, e2$
Count = Count + 1
End If
If a$ = d$ Then
e$ = d$
e2$ = d2$
End If
If a$ = c$ Then
d$ = c$
d2$ = c2$
End If
If a$ = b$ Then
c$ = b$
c2$ = b2$
End If
b$ = a$
b2$ = a2$
Next x
Write #2, "Count ", Count
Close #1
Close #2
' converting integre values back to DNA loci,alleles, UK 10 loci
xxxx =
' xxxx is number of profiles to be converted
Dim ph(20)
Dim pj(20)
Dim ps As String
Open "dec14-aaa.txt" For Input As #1
Open "dec14-rrr.txt" For Output As #2
For x = 1 To xxxx
Input #1, ps
a1$ = Mid(ps, 1, 1)
a2$ = Mid(ps, 2, 1)
a3$ = Mid(ps, 3, 1)
a4$ = Mid(ps, 4, 1)
a5$ = Mid(ps, 5, 1)
a6$ = Mid(ps, 6, 1)
a7$ = Mid(ps, 7, 1)
a8$ = Mid(ps, 8, 1)
a9$ = Mid(ps, 9, 1)
a10$ = Mid(ps, 10, 1)
a11$ = Mid(ps, 11, 1)
a12$ = Mid(ps, 12, 1)
a13$ = Mid(ps, 13, 1)
a14$ = Mid(ps, 14, 1)
a15$ = Mid(ps, 15, 1)
a16$ = Mid(ps, 16, 1)
a17$ = Mid(ps, 17, 1)
a18$ = Mid(ps, 18, 1)
a19$ = Mid(ps, 19, 1)
a20$ = Mid(ps, 20, 1)
ph(0) = a1$
ph(1) = a2$
ph(2) = a3$
ph(3) = a4$
ph(4) = a5$
ph(5) = a6$
ph(6) = a7$
ph(7) = a8$
ph(8) = a9$
ph(9) = a10$
ph(10) = a11$
ph(11) = a12$
ph(12) = a13$
ph(13) = a14$
ph(14) = a15$
ph(15) = a16$
ph(16) = a17$
ph(17) = a18$
ph(18) = a19$
ph(19) = a20$
For j = 0 To 1
' vWA
If ph(j) = "0" Then pj(j) = 13
If ph(j) = "1" Then pj(j) = 14
If ph(j) = "2" Then pj(j) = 15
If ph(j) = "3" Then pj(j) = 16
If ph(j) = "4" Then pj(j) = 17
If ph(j) = "5" Then pj(j) = 18
If ph(j) = "6" Then pj(j) = 19
If ph(j) = "7" Then pj(j) = 20
If ph(j) = "8" Then pj(j) = 21
If ph(j) = "9" Then pj(j) = 0
Next j
For j = 2 To 3
' THO1
If ph(j) = "0" Then pj(j) = 5
If ph(j) = "1" Then pj(j) = 6
If ph(j) = "2" Then pj(j) = 7
If ph(j) = "3" Then pj(j) = 8
If ph(j) = "4" Then pj(j) = 8.3
If ph(j) = "5" Then pj(j) = 9
If ph(j) = "6" Then pj(j) = 9.3
If ph(j) = "7" Then pj(j) = 10
If ph(j) = "8" Then pj(j) = 0
If ph(j) = "9" Then pj(j) = 0
Next j
For j = 4 To 5
' D8
If ph(j) = "0" Then pj(j) = 8
If ph(j) = "1" Then pj(j) = 9
If ph(j) = "2" Then pj(j) = 10
If ph(j) = "3" Then pj(j) = 11
If ph(j) = "4" Then pj(j) = 12
If ph(j) = "5" Then pj(j) = 13
If ph(j) = "6" Then pj(j) = 14
If ph(j) = "7" Then pj(j) = 15
If ph(j) = "8" Then pj(j) = 16
If ph(j) = "9" Then pj(j) = 17
Next j
For j = 6 To 7
' FGA
If ph(j) = "0" Then pj(j) = 18
If ph(j) = "1" Then pj(j) = 19
If ph(j) = "2" Then pj(j) = 20
If ph(j) = "A" Then pj(j) = 20.2
If ph(j) = "3" Then pj(j) = 21
If ph(j) = "B" Then pj(j) = 21.2
If ph(j) = "4" Then pj(j) = 22
If ph(j) = "C" Then pj(j) = 22.2
If ph(j) = "5" Then pj(j) = 23
If ph(j) = "D" Then pj(j) = 23.2
If ph(j) = "6" Then pj(j) = 24
If ph(j) = "E" Then pj(j) = 24.2
If ph(j) = "7" Then pj(j) = 25
If ph(j) = "8" Then pj(j) = 26
If ph(j) = "9" Then pj(j) = 27
If ph(j) = "F" Then pj(j) = 30
Next j
For j = 8 To 9
' D21
If ph(j) = "A" Then pj(j) = 25
If ph(j) = "B" Then pj(j) = 26
If ph(j) = "0" Then pj(j) = 27
If ph(j) = "1" Then pj(j) = 28
If ph(j) = "2" Then pj(j) = 29
If ph(j) = "3" Then pj(j) = 30
If ph(j) = "4" Then pj(j) = 30.2
If ph(j) = "5" Then pj(j) = 31
If ph(j) = "6" Then pj(j) = 31.2
If ph(j) = "7" Then pj(j) = 32
If ph(j) = "8" Then pj(j) = 32.2
If ph(j) = "C" Then pj(j) = 33
If ph(j) = "9" Then pj(j) = 33.2
If ph(j) = "D" Then pj(j) = 34.2
Next j
For j = 10 To 11
' D18
If ph(j) = "A" Then pj(j) = 9.2
If ph(j) = "B" Then pj(j) = 10
If ph(j) = "0" Then pj(j) = 11
If ph(j) = "1" Then pj(j) = 12
If ph(j) = "2" Then pj(j) = 13
If ph(j) = "3" Then pj(j) = 14
If ph(j) = "4" Then pj(j) = 15
If ph(j) = "5" Then pj(j) = 16
If ph(j) = "6" Then pj(j) = 17
If ph(j) = "7" Then pj(j) = 18
If ph(j) = "8" Then pj(j) = 19
If ph(j) = "9" Then pj(j) = 20
If ph(j) = "C" Then pj(j) = 21
If ph(j) = "D" Then pj(j) = 22
If ph(j) = "E" Then pj(j) = 23
If ph(j) = "F" Then pj(j) = 24
Next j
For j = 12 To 13
' D2S1338
If ph(j) = "0" Then pj(j) = 16
If ph(j) = "1" Then pj(j) = 17
If ph(j) = "2" Then pj(j) = 18
If ph(j) = "3" Then pj(j) = 19
If ph(j) = "4" Then pj(j) = 20
If ph(j) = "5" Then pj(j) = 21
If ph(j) = "6" Then pj(j) = 22
If ph(j) = "7" Then pj(j) = 23
If ph(j) = "8" Then pj(j) = 24
If ph(j) = "9" Then pj(j) = 25
If ph(j) = "A" Then pj(j) = 26
If ph(j) = "B" Then pj(j) = 27
Next j
For j = 14 To 15
' D16
If ph(j) = "0" Then pj(j) = 8
If ph(j) = "1" Then pj(j) = 9
If ph(j) = "2" Then pj(j) = 10
If ph(j) = "3" Then pj(j) = 11
If ph(j) = "4" Then pj(j) = 12
If ph(j) = "5" Then pj(j) = 13
If ph(j) = "6" Then pj(j) = 14
If ph(j) = "7" Then pj(j) = 15
If ph(j) = "8" Then pj(j) = 0
If ph(j) = "9" Then pj(j) = 0
Next j
For j = 16 To 17
' D19
If ph(j) = "0" Then pj(j) = 12
If ph(j) = "1" Then pj(j) = 13
If ph(j) = "2" Then pj(j) = 13.2
If ph(j) = "3" Then pj(j) = 14
If ph(j) = "4" Then pj(j) = 14.2
If ph(j) = "5" Then pj(j) = 15
If ph(j) = "6" Then pj(j) = 15.2
If ph(j) = "7" Then pj(j) = 16
If ph(j) = "8" Then pj(j) = 16.2
If ph(j) = "9" Then pj(j) = 17
If ph(j) = "A" Then pj(j) = 18.2
If ph(j) = "B" Then pj(j) = 19.2
Next j
For j = 18 To 19
' D3
If ph(j) = "0" Then pj(j) = 12
If ph(j) = "1" Then pj(j) = 13
If ph(j) = "2" Then pj(j) = 14
If ph(j) = "3" Then pj(j) = 15
If ph(j) = "4" Then pj(j) = 16
If ph(j) = "5" Then pj(j) = 17
If ph(j) = "6" Then pj(j) = 18
If ph(j) = "7" Then pj(j) = 19
If ph(j) = "8" Then pj(j) = 0
If ph(j) = "9" Then pj(j) = 0
Next j
Write #2, ""; pj(0), pj(1); ""; pj(2), pj(3); ""; pj(4), pj(5); ""; pj(6), pj(7); ""; pj(8), pj(9); ""; pj(10), pj(11); ""; pj(12), pj(13); ""; pj(14), pj(15); ""; pj(16), pj(17); ""; pj(18), pj(19); ""
Next x
Close #1
Close #2
Cross-check on generator functioning comparing
output ' allele frequencies ' to published original
for UK caucasian. From 5000 'profiles' and
10,000 'alleles' for each locus
Allele / Published / Generated
Locus VWA
13 0.001 0.0010
14 0.105 0.1063
15 0.080 0.0794
16 0.216 0.2159
17 0.270 0.2727
18 0.219 0.2184
19 0.093 0.0919
20 0.014 0.0124
21 0.002 0.0020
THO1
5 0.002 0.0016
6 0.241 0.2513
7 0.194 0.1976
8 0.108 0.1041
8.3 0.001 0.0006
9 0.140 0.1407
9.3 0.304 0.2938
10 0.012 0.0103
D8 D8S1179 / D6S50
8 0.018 0.0195
9 0.013 0.0133
10 0.094 0.0929
11 0.066 0.0661
12 0.143 0.1363
13 0.333 0.3395
14 0.209 0.2067
15 0.088 0.0910
16 0.031 0.0302
17 0.004 0.0045
FGA
18 0.025 0.0239
19 0.056 0.0561
20 0.143 0.1474
20.2 0.002 0.0021
21 0.187 0.1831
21.2 0.002 0.0022
22 0.165 0.1609
22.2 0.011 0.0110
23 0.139 0.1394
23.2 0.004 0.0038
24 0.146 0.1494
24.2 0.002 0.0019
25 0.075 0.0764
26 0.035 0.0347
27 0.007 0.0069
30 0.001 0.0008
D21 D21S11
with Urquhart to Moller conversion
54 0.001 0.0013
57 (26) 0.001 0.0013
59 (27) 0.031 0.0330
61 (28) 0.160 0.1589
63 (29) 0.226 0.2210
65 (30) 0.258 0.2653
66 0.027 0.0267
67 (31) 0.069 0.0653
68 0.093 0.0913
69 (32) 0.018 0.0203
70 0.090 0.0926
71 (33) 0.001 0.0006
72 0.022 0.0200
74 0.002 0.0024
D18 D18S51
9.2 0.001 0.0005
10 0.008 0.0083
11 0.012 0.0111
12 0.139 0.1379
13 0.125 0.1273
14 0.164 0.1678
15 0.145 0.1405
16 0.137 0.1402
17 0.115 0.1116
18 0.080 0.0808
19 0.041 0.0406
20 0.017 0.0147
21 0.010 0.0102
22 0.005 0.0039
23 0.001 0.0021
24 0.002 0.0025
D2 D2S1338
16 0.037 0.0355
17 0.185 0.1875
18 0.087 0.841
19 0.110 0.1162
20 0.138 0.1369
21 0.032 0.0326
22 0.024 0.0244
23 0.112 0.1098
24 0.142 0.1409
25 0.111 0.1121
26 0.019 0.0174
27 0.002 0.0026
D16 D16S539
8 0.019 0.0221
9 0.129 0.1255
10 0.054 0.0567
11 0.289 0.2882
12 0.288 0.2819
13 0.186 0.1895
14 0.029 0.0210
15 0.005 0.0051
D19 D19S433
12 0.087 0.0834
13.0 0.222 0.2197
13.2 0.013 0.0132
14 0.382 0.3786
14.2 0.015 0.0184
15 0.177 0.1773
15.2 0.038 0.0385
16 0.041 0.0422
16.2 0.017 0.0194
17 0.005 0.0052
18.2 0.002 0.0025
19.2 0.001 0.0016
D3 D3S1358
12 0.001 0.0012
13 0.006 0.0063
14 0.132 0.1308
15 0.265 0.2645
16 0.247 0.2451
17 0.195 0.1998
18 0.141 0.1393
19 0.014 0.0130
As a further check
for a 4 million run the combined 2 loci count for
E 'allele' was 23,822 ,desired was 24,000
and F 'allele' was 24,181 for desired 24,000
Exploring further
the 7 x 19 digit matches in one subset.
Including the other 19 digit matches
and shifting columns and checking all then there are
at least 38 x 19 digit matches given
a free choice. That is ANY 9 pairs matching
and 1 pair matching in the 4 remaining alleles.
This time with 4 million and have included all
rare alleles for UK caucasians.
Simulating all loci and alleles and in agreement
with the published allele frequency tables.
All generated profiles are parthenogenic in effect
as generated purely randomly in each of the
4 million cases. With no allowance made for parentage,
co-ancestry, in-breeding etc which will increase
the chance of matches above the base-level
determined here.
So far I've fully sorted and match-checked
the majority starting 1...,2.... and 3....
Remaining subsets 0..., 4... , 5.... , 6.... ,7.... 8...
results over the next few days.
Note the 12 digit matches as pairs,triples,quads
and quins was enough to falsely convict people
in the UK up till 1999.
19 datapoint matches are these days considered
enough of a match to convict the likes of Colin
Waite , with no other corroborative evidence.
A single mismatch is
stated in court to be an anomaly with biology
eg 'spontaneous mutation' without any further
biological testing to confirm this conjecture.
Wheras a single datapoint mismatch should
be evidence of exclusion unless there is biological
confirmation of single allele mismatch within
an individual ,say between semen and buccal cells.
Results so far for 2,562,221 of the 4 million 'profiles'
Subset 1................... ( 795,224 profiles)
12 digit matches 14,190
14 digit 429
16 digit 36
18 digit 3
20 digit 0
12 digit triples 806
12 digit quads 60
12 digit quins 4
subset 2................... ( 546,224 ' profiles')
12 digit matches 7,863
14 digit 241
16 digit 25
18 digit 2
20 digit match 1
which was
23165525234649133524
starting as
32165552326494315342 and
32615552324649313542
which before processing are directed into
ordered pairs, as structured in DNA profiles.
So no problem with the random number generator
spuriously repeating the same sequence of numbers.
converted to standard loci pairs on
vWA,THO1,D8,FGA,D21,D18,D2,D16,D19,D3
(15,16)(6,9.3)(13,13)(20,23)(29,30)(15,17)(20,25)(9,11)(14,15)(14,16)
12 digit triple matches 355
qudruples 34
quintuples 7
subset 3................... ( 1,220,712 'profiles' )
12 digit matches 42,679
14 digit 1,302
16 digit 118
18 digit 11
19 digit 7
20 digit match 1
One of those annoying matters of chance.
So nearly 2x 20 digit matches but only one.
match was
34235726232658443555
converted to standard
(16,17)(7,8)(13,15)(20,24)(29,30)(13,17)(21,24)(12,12)(14,15)(17,17)
started as
34237526326285445355 and
43325726326258445355
so random number generator ok.
As a further check the generator function
for the first 3 pairs has not changed
with the addition of rare alleles.
Just considering the first 6 digits
then these numbers have not appeared
in any of the previous outings of this
simulation.
Subset 4................... (999635 'profiles' )
12 digit matches 23,166
14 digit 657
16 digit 49
18 digit 3
19 'real' ,not first 19, 2
20 digit match 1
44264525134528441322
which converts to
vWA,THO1,D8,FGA,D21,D18,D2,D16,D19,D3
(17,17)(7,9.3)(12,13)(20,23)(28,30)(15,16)(18,24)(12,12)(13,14)(14,14)
Started as
44625452314582441322 and
44625452134528441322
so RNG ok
12 digit triple matches 2,097
qudruples 263
quintuples 45
The 19 digit matches were from checking the 18 point
matches for any match in either 19 or 20 position.
There is of course many more 19 point matches
possible if given free choce from any 9 pairs
rather than in this simulation the first 9 pairs.
If any statistician can tell me what this result
implies for general case, I would be interested
to hear.
That is 7 matches on 9 designated pairs
plus one match in either position of the tenth pair.
Compared to how many matches on
any 9 pairs plus one in either position of
the remaining pairs. 7 x 10 ? = 70 ?
12 digit triple matches 4,364
12 digit quadruple 630
12 digit quintuple 116
A few of the added rare alleles do come through
in 12,14 and 16 digit matches but none so far in
18 or 20 point matches.
A final piece of analysis will be the quantitive change
of allele frequencies in matches compared to
the original profiles. Qualitively - increase
in frequency of common alleles and decrease
in frequency of rare alleles ( multi-modal matches).
One more '10 locus' match in the remainder.
0....., and 5........ , 6.......... , 7........ 8.......
12 digit matches 7,082
14 digit 208
16 digit 15
18 digit 2
19 digit (real) 1
20 1
55164623232349351335
converts to
(18,18)(6,9.3)(12,14)(20,21)(29,30)(13,14)(20,25)(11,13)(13,14)(15,17)
started as
55614623232394531353 and
55614623322349353153
so RNG ok
12 digit triple matches 396
quad 37
quin 4
Summary of results of matches on first x of 10 loci matches
rather than any x of 4 million 10 loci.
12 digit = 6 loci, 94,980
7 loci , 2,837
8 loci , 243
9 loci , 21
10 loci , 4 matches
Aggrgated 6 loci higher order matches in 4 million
Triples, 3,654
Quadruples , 394
Quins , 60
-----------------------------
I also checked whether there were any triple
matches on 18 digits but none.
Multi-modal match analysis
First 7 loci data based on 2837 profiles from the
total 14 digit match data and full alleles for UK Caucasian
Allele / UK Caucasian allele frequency / match allele frequency
Locus VWA
13 0.001 0
14 0.105 0.0767
15 0.080 0.0458
16 0.216 0.2843
17 0.270 0.3255
18 0.219 0.2249
19 0.093 0.0419
20 0.014 0.0009
21 0.002 0
THO1
5 0.002 0
6 0.241 0.2608
7 0.194 0.1791
8 0.108 0.0730
8.3 0.001 0
9 0.140 0.1078
9.3 0.304 0.3791
10 0.012 0.0001
D8 D8S1179 / D6S50
8 0.018 0.0025
9 0.013 0.0016
10 0.094 0.0597
11 0.066 0.0261
12 0.143 0.1251
13 0.333 0.4977
14 0.209 0.2346
15 0.088 0.0476
16 0.031 0.0051
17 0.004 0
FGA
18 0.025 0.0002
19 0.056 0.0263
20 0.143 0.1581
20.2 0.002 0
21 0.187 0.2545
21.2 0.002 0
22 0.165 0.1942
22.2 0.011 0.0005
23 0.139 0.1442
23.2 0.004 0
24 0.146 0.1576
24.2 0.002 0
25 0.075 0.0483
26 0.035 0.0120
27 0.007 0.0002
30 0.001 0
D21, D21S11
with Urquhart to Moller conversion in brackets
54 0.001 0
57 (26) 0.001 0
59 (27) 0.031 0.0070
61 (28) 0.160 0.1641
63 (29) 0.226 0.3086
65 (30) 0.258 0.3611
66 0.027 0.0062
67 (31) 0.069 0.0312
68 0.093 0.0627
69 (32) 0.018 0.0019
70 0.090 0.0543
71 (33) 0.001 0
72 0.022 0.0028
74 0.002 0
D18, D18S51
9.2 0.001 0
10 0.008 0.0007
11 0.012 0.0018
12 0.139 0.1581
13 0.125 0.1255
14 0.164 0.2201
15 0.145 0.1660
16 0.137 0.1532
17 0.115 0.1027
18 0.080 0.0555
19 0.041 0.0137
20 0.017 0.0016
21 0.010 0.0007
22 0.005 0.0002
23 0.001 0
24 0.002 0.0002
D2, D2S1338
16 0.037 0.0134
17 0.185 0.2624
18 0.087 0.0587
19 0.110 0.1059
20 0.138 0.1556
21 0.032 0.0085
22 0.024 0.0058
23 0.112 0.1110
24 0.142 0.1674
25 0.111 0.1066
26 0.019 0.0044
27 0.002 0.0002
Next based on 243 profiles only from 16 digit match data
D16, D16S539
8 0.019 0.0041
9 0.129 0.0638
10 0.054 0.0144
11 0.289 0.3560
12 0.288 0.3724
13 0.186 0.1893
14 0.029 0
15 0.005 0
D19 D19S433 and D3 D3S1358
not analysed because only 23 profiles from 18 digit matches
and 4 on 20 digit
Unmistakable conclusion - the commonest alleles increase
in frequency in matches, rare alleles decrease in frequency ,
and rarest alleles are practically absent from matches. At
least at the level of 4 million pool.
Giving 'Average Joe' profile on first 7 loci of
VWA,THO1,D8,FGA,D21,D18,D2,D16
(16,17)(6,9.3)(13,13)(21,21)(29,30)(14,14)(17,17)(11,12)
So for 4million profiles ,fully sorted and match
checked there were 4 matches. Because each of these
4 is a minimum ie greater than 1 and less than 2
and as so near a pair of matches in the 3..... section
I will declare result as 4.8 matches in 4 million as
reasonable conclusion.
For an interesting exploration and derrivation of this "Square Law" look
in the Usenet 2003 archives for group "uk.legal" or
"alt.sci.math.probability"
and thread titled "Prosecutors fallacy revisited "
Then by square law implies 1.2 matches in 2 million or
1 in 1.8 million.
Whether the square law applies scaling upward
I've no idea but assuming it does then it implies:-
4.8 matches in 4 million or 1 in 830,000
30 matches in 10 million or 1 in 330,000
750 matches in 50 million or 1 in 67,000
75,000 in 500 million or 1 in 6,700
7.5 million in 5000 million or 1 in 670 if
whole earth was profiled with SGM+
The next determination is how co-ancestry
factored in reduces these figures for
real world representation.
The other result for 4 million population is for matches of 19 out of 20.
For matches in first 9 pairs and 1 allele match in 10th pair is 12.
Giving about 120 matches of any 19 out of 20 in a 4 million population
or 1 in 33,000
To show how wrong forensic 'scientists' are.
This was the first discoverd randomly generated 10 loci matching profile.
vWA,THO1,D8,FGA,D21,D18,D2,D16,D19,D3
(15,16)(6,9.3)(13,13)(20,23)(29,30)(15,17)(20,25)(9,11)(14,15)(14,16)
Multiply the allele frequencies together gives
0.08x.216x.241x..304x.333x.333x.143x.139x
.226x.258x.145x.115x.138x.111x
.129x.289x.382x.177x.132x.247
=3.42 *10^-15
reciprocal = 290 million million or 290 trillion.
A forensic 'scientist' given the above profile is
likely to say that the chance of finding a pair of
such profiles would be 1 in 290 trillion.