Simulation of a large DNA Profile Database, full alleles, 13 loci ,CODIS
Simulation of a large DNA Profile Database, full alleles , 13 loci ,CODIS
13 loci CODIS simulation using RCMP Toronto Caucasian data
http://www.csfs.ca/databases/cfs_CC_ProfilerPlus_freq.htm
http://www.csfs.ca/databases/cfs_CC_Cofiler_freq.htm
' Generating 13 loci x2 profiles
' directing pairs and first divider
Dim ph(26)
Dim pb(26)
' initialising Random Number Generator - RNG
count9 = 0
count8 = 0
Randomize
a = 214013
c = 2531011
x0 = Timer
z = 2 ^ 24
' 1 file 'jun27-g' for original, un-directed pairs, source data.
' This file is necessary to check on the performance of the RNG
' when a matched pair is found then it is highly unlikely that
' both sequences as generated, before pair directing, would
' be the same - more likely a manifest of repeat within the RNG
' (reason for adopting the 214013 / 2531011 RNG )
' Use 'Word' find function on part of the sequences, including pair reversals,
' with luck would include a 'homozygotic' pair eg (3,3) say ,so no reversal
' on that pair
Open "jun27-g" For Output As #1
' outputs directed and divided by first digit
Open "jun27-0" For Output As #10
Open "jun27-1" For Output As #11
Open "jun27-2" For Output As #12
Open "jun27-3" For Output As #13
Open "jun27-4" For Output As #14
Open "jun27-5" For Output As #15
Open "jun27-6" For Output As #16
Open "jun27-7" For Output As #17
Open "jun27-8" For Output As #18
Open "jun27-9" For Output As #19
' change xxxx for different total size
' for xxxx = 10000000 my computer took 5 hours to generate over-night
xxxx = 5000
xxxx = xxxx - 1
For x = 0 To xxxx
For j = 0 To 1
' D3 , locus 1
' RNG random number generator
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.005 Then ph(j) = 11
If ph(j) < 0.01 Then ph(j) = 1
If ph(j) < 0.124 Then ph(j) = 2
If ph(j) < 0.382 Then ph(j) = 3
If ph(j) < 0.623 Then ph(j) = 4
If ph(j) < 0.84 Then ph(j) = 5
If ph(j) < 0.988 Then ph(j) = 6
If ph(j) < 0.998 Then ph(j) = 7
If ph(j) < 1 Then ph(j) = 8
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 2 To 3
' vWA locus 2
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.084 Then ph(j) = 1
If ph(j) < 0.193 Then ph(j) = 2
If ph(j) < 0.42 Then ph(j) = 3
If ph(j) < 0.691 Then ph(j) = 4
If ph(j) < 0.903 Then ph(j) = 5
If ph(j) < 0.987 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 4 To 5
' D8 , locus 3
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.012 Then ph(j) = 11
If ph(j) < 0.022 Then ph(j) = 1
If ph(j) < 0.107 Then ph(j) = 2
If ph(j) < 0.187 Then ph(j) = 3
If ph(j) < 0.324 Then ph(j) = 4
If ph(j) < 0.64 Then ph(j) = 5
If ph(j) < 0.865 Then ph(j) = 6
If ph(j) < 0.976 Then ph(j) = 7
If ph(j) < 1 Then ph(j) = 8
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 6 To 7
' D5 , locus 4
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.004 Then ph(j) = 1
If ph(j) < 0.031 Then ph(j) = 2
If ph(j) < 0.099 Then ph(j) = 3
If ph(j) < 0.435 Then ph(j) = 4
If ph(j) < 0.824 Then ph(j) = 5
If ph(j) < 0.985 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 8 To 9
' D13 , locus 5
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.111 Then ph(j) = 11
If ph(j) < 0.198 Then ph(j) = 1
If ph(j) < 0.268 Then ph(j) = 2
If ph(j) < 0.543 Then ph(j) = 3
If ph(j) < 0.862 Then ph(j) = 4
If ph(j) < 0.944 Then ph(j) = 5
If ph(j) < 1 Then ph(j) = 6
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 10 To 11
' D7, locus 6
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.02 Then ph(j) = 11
If ph(j) < 0.162 Then ph(j) = 1
If ph(j) < 0.302 Then ph(j) = 2
If ph(j) < 0.589 Then ph(j) = 3
If ph(j) < 0.816 Then ph(j) = 4
If ph(j) < 0.955 Then ph(j) = 5
If ph(j) < 0.993 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 12 To 13
' D16, locus 7
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.017 Then ph(j) = 11
If ph(j) < 0.133 Then ph(j) = 1
If ph(j) < 0.182 Then ph(j) = 2
If ph(j) < 0.472 Then ph(j) = 3
If ph(j) < 0.824 Then ph(j) = 4
If ph(j) < 0.972 Then ph(j) = 5
If ph(j) < 1 Then ph(j) = 6
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 14 To 15
' FGA, locus 8
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.011 Then ph(j) = 1
If ph(j) < 0.091 Then ph(j) = 2
If ph(j) < 0.245 Then ph(j) = 3
If ph(j) < 0.429 Then ph(j) = 4
If ph(j) < 0.431 Then ph(j) = 5
If ph(j) < 0.605 Then ph(j) = 6
If ph(j) < 0.612 Then ph(j) = 7
If ph(j) < 0.755 Then ph(j) = 8
If ph(j) < 0.897 Then ph(j) = 9
If ph(j) < 0.963 And ph(j) >= 0.897 Then pb(j) = "A"
If ph(j) < 0.993 And ph(j) >= 0.963 Then pb(j) = "B"
If ph(j) < 1 And ph(j) >= 0.993 Then pb(j) = "C"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
Next j
For j = 16 To 17
' D21, locus 9
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.005 Then ph(j) = 11
If ph(j) < 0.041 Then ph(j) = 1
If ph(j) < 0.201 Then ph(j) = 2
If ph(j) < 0.452 Then ph(j) = 3
If ph(j) < 0.454 Then ph(j) = 4
If ph(j) < 0.683 Then ph(j) = 5
If ph(j) < 0.7 Then ph(j) = 6
If ph(j) < 0.785 Then ph(j) = 7
If ph(j) < 0.874 Then ph(j) = 8
If ph(j) < 0.886 Then ph(j) = 9
If ph(j) < 0.964 And ph(j) >= 0.886 Then pb(j) = "A"
If ph(j) < 0.996 And ph(j) >= 0.964 Then pb(j) = "B"
If ph(j) < 0.998 And ph(j) >= 0.996 Then pb(j) = "C"
If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "D"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
Next j
For j = 18 To 19
' D18, locus 10
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.005 Then ph(j) = 11
If ph(j) < 0.017 Then ph(j) = 1
If ph(j) < 0.137 Then ph(j) = 2
If ph(j) < 0.256 Then ph(j) = 3
If ph(j) < 0.435 Then ph(j) = 4
If ph(j) < 0.602 Then ph(j) = 5
If ph(j) < 0.742 Then ph(j) = 6
If ph(j) < 0.863 Then ph(j) = 7
If ph(j) < 0.924 Then ph(j) = 8
If ph(j) < 0.955 Then ph(j) = 9
If ph(j) < 0.974 And ph(j) >= 0.955 Then pb(j) = "A"
If ph(j) < 0.986 And ph(j) >= 0.974 Then pb(j) = "B"
If ph(j) < 0.991 And ph(j) >= 0.986 Then pb(j) = "C"
If ph(j) < 0.998 And ph(j) >= 0.991 Then pb(j) = "D"
If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "E"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
Next j
For j = 20 To 21
' THO1 , locus 11
' RNG random number generator
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.239 Then ph(j) = 11
If ph(j) < 0.403 Then ph(j) = 1
If ph(j) < 0.512 Then ph(j) = 2
If ph(j) < 0.679 Then ph(j) = 3
If ph(j) < 0.988 Then ph(j) = 4
If ph(j) < 1 Then ph(j) = 5
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 22 To 23
' TPOX, locus 12
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.543 Then ph(j) = 11
If ph(j) < 0.632 Then ph(j) = 1
If ph(j) < 0.7 Then ph(j) = 2
If ph(j) < 0.968 Then ph(j) = 3
If ph(j) < 1 Then ph(j) = 4
If ph(j) > 10 Then ph(j) = 0
Next j
For j = 24 To 25
' CSF1PO , locus 13
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.031 Then ph(j) = 1
If ph(j) < 0.301 Then ph(j) = 2
If ph(j) < 0.62 Then ph(j) = 3
If ph(j) < 0.928 Then ph(j) = 4
If ph(j) < 0.991 Then ph(j) = 5
If ph(j) < 0.998 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
Next j
' output the original generated file
Write #1, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
' Because in real DNA profiles without further info ,no one
' knows which allele in each pair came from the mother or father
' by convention they are written smaller ,larger (or equal).
' The following directs each pair
For j = 0 To 24 Step 2
If ph(j + 1) < ph(j) Then
jjj = ph(j)
ph(j) = ph(j + 1)
ph(j + 1) = jjj
End If
Next j
' put extra conditional statements here to reduce
' the number of files or just delete some of the following
'
' dividing on first column, file by file
If ph(0) = 0 Then
Write #10, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count0 = count0 + 1
End If
If ph(0) = 1 Then
Write #11, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count1 = count1 + 1
End If
If ph(0) = 2 Then
Write #12, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count2 = count2 + 1
End If
If ph(0) = 3 Then
Write #13, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count3 = count3 + 1
End If
If ph(0) = 4 Then
Write #14, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count4 = count4 + 1
End If
If ph(0) = 5 Then
Write #15, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count5 = count5 + 1
End If
If ph(0) = 6 Then
Write #16, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count6 = count6 + 1
End If
If ph(0) = 7 Then
Write #17, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count7 = count7 + 1
End If
If ph(0) = 8 Then
Write #18, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count8 = count8 + 1
End If
If ph(0) = 9 Then
Write #19, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count9 = count9 + 1
End If
Next x
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Close #1
' count file for data to fix for - next loops in sucessive dividings
Open "jun27-c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
' Dividing file into 10 by second digit
Dim ph(26)
dim ps as string
' xxxx = count size from count file
xxxx =
' input file
Open "jun27-1" For Input As #1
' 10 divided files
Open "jun27-10" For Output As #10
Open "jun27-11" For Output As #11
Open "jun27-12" For Output As #12
Open "jun27-13" For Output As #13
Open "jun27-14" For Output As #14
Open "jun27-15" For Output As #15
Open "jun27-16" For Output As #16
Open "jun27-17" For Output As #17
Open "jun27-18" For Output As #18
Open "jun27-19" For Output As #19
count9 = 0
count8 = 0
xxxx = xxxx - 1
For x = 0 To xxxx
Input #1, ps
a2$ = Mid(ps, 2, 1)
ph(1) = Val(a2$)
If ph(1) = 0 Then
Write #10, ps
count0 = count0 + 1
End If
If ph(1) = 1 Then
Write #11, ps
count1 = count1 + 1
End If
If ph(1) = 2 Then
Write #12, ps
count2 = count2 + 1
End If
If ph(1) = 3 Then
Write #13, ps
count3 = count3 + 1
End If
If ph(1) = 4 Then
Write #14, ps
count4 = count4 + 1
End If
If ph(1) = 5 Then
Write #15, ps
count5 = count5 + 1
End If
If ph(1) = 6 Then
Write #16, ps
count6 = count6 + 1
End If
If ph(1) = 7 Then
Write #17, ps
count7 = count7 + 1
End If
If ph(1) = 8 Then
Write #18, ps
count8 = count8 + 1
End If
If ph(1) = 9 Then
Write #19, ps
count9 = count9 + 1
End If
Next x
Close #1
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
' output counts
Open "jun27-1c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
' Dividing file into 10 by third digit
Dim ph(26)
dim ps as string
' enter count in xxxx
xxxx =
Open "jun27-11" For Input As #1
Open "jun27-110" For Output As #10
Open "jun27-111" For Output As #11
Open "jun27-112" For Output As #12
Open "jun27-113" For Output As #13
Open "jun27-114" For Output As #14
Open "jun27-115" For Output As #15
Open "jun27-116" For Output As #16
Open "jun27-117" For Output As #17
Open "jun27-118" For Output As #18
Open "jun27-119" For Output As #19
count9 = 0
count8 = 0
xxxx=xxxx - 1
For x = 0 To xxxx
Input #1, ps
a3$ = Mid(ps, 3, 1)
ph(2) = Val(a3$)
If ph(2) = 0 Then
Write #10, ps
count0 = count0 + 1
End If
If ph(2) = 1 Then
Write #11, ps
count1 = count1 + 1
End If
If ph(2) = 2 Then
Write #12, ps
count2 = count2 + 1
End If
If ph(2) = 3 Then
Write #13, ps
count3 = count3 + 1
End If
If ph(2) = 4 Then
Write #14, ps
count4 = count4 + 1
End If
If ph(2) = 5 Then
Write #15, ps
count5 = count5 + 1
End If
If ph(2) = 6 Then
Write #16, ps
count6 = count6 + 1
End If
If ph(2) = 7 Then
Write #17, ps
count7 = count7 + 1
End If
If ph(2) = 8 Then
Write #18, ps
count8 = count8 + 1
End If
If ph(2) = 9 Then
Write #19, ps
count9 = count9 + 1
End If
Next x
Close #1
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Open "jun27-11c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
' Dividing file into 10 by fourth digit
Dim ph(26)
dim ps as string
' enter count in xxxx
xxxx =
Open "jun27-131" For Input As #1
Open "jun27-1310" For Output As #10
Open "jun27-1311" For Output As #11
Open "jun27-1312" For Output As #12
Open "jun27-1313" For Output As #13
Open "jun27-1314" For Output As #14
Open "jun27-1315" For Output As #15
Open "jun27-1316" For Output As #16
Open "jun27-1317" For Output As #17
Open "jun27-1318" For Output As #18
Open "jun27-1319" For Output As #19
count9 = 0
count8 = 0
xxxx=xxxx - 1
For x = 0 To xxxx
Input #1, ps
a4$ = Mid(ps, 4, 1)
ph(3) = Val(a4$)
If ph(3) = 0 Then
Write #10, ps
count0 = count0 + 1
End If
If ph(3) = 1 Then
Write #11, ps
count1 = count1 + 1
End If
If ph(3) = 2 Then
Write #12, ps
count2 = count2 + 1
End If
If ph(3) = 3 Then
Write #13, ps
count3 = count3 + 1
End If
If ph(3) = 4 Then
Write #14, ps
count4 = count4 + 1
End If
If ph(3) = 5 Then
Write #15, ps
count5 = count5 + 1
End If
If ph(3) = 6 Then
Write #16, ps
count6 = count6 + 1
End If
If ph(3) = 7 Then
Write #17, ps
count7 = count7 + 1
End If
If ph(3) = 8 Then
Write #18, ps
count8 = count8 + 1
End If
If ph(3) = 9 Then
Write #19, ps
count9 = count9 + 1
End If
Next x
Close #1
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Open "jun27-131c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
' Dividing file into 10 by fifth digit
Dim ph(26)
Dim ps As String
' enter count in xxxx
xxxx =
Open "jun27-3412" For Input As #1
Open "jun27-34120" For Output As #10
Open "jun27-34121" For Output As #11
Open "jun27-34122" For Output As #12
Open "jun27-34123" For Output As #13
Open "jun27-34124" For Output As #14
Open "jun27-34125" For Output As #15
Open "jun27-34126" For Output As #16
Open "jun27-34127" For Output As #17
Open "jun27-34128" For Output As #18
Open "jun27-34129" For Output As #19
count9 = 0
count8 = 0
xxxx = xxxx - 1
For x = 0 To xxxx
Input #1, ps
a5$ = Mid(ps, 5, 1)
ph(4) = Val(a5$)
If ph(4) = 0 Then
Write #10, ps
count0 = count0 + 1
End If
If ph(4) = 1 Then
Write #11, ps
count1 = count1 + 1
End If
If ph(4) = 2 Then
Write #12, ps
count2 = count2 + 1
End If
If ph(4) = 3 Then
Write #13, ps
count3 = count3 + 1
End If
If ph(4) = 4 Then
Write #14, ps
count4 = count4 + 1
End If
If ph(4) = 5 Then
Write #15, ps
count5 = count5 + 1
End If
If ph(4) = 6 Then
Write #16, ps
count6 = count6 + 1
End If
If ph(4) = 7 Then
Write #17, ps
count7 = count7 + 1
End If
If ph(4) = 8 Then
Write #18, ps
count8 = count8 + 1
End If
If ph(4) = 9 Then
Write #19, ps
count9 = count9 + 1
End If
Next x
Close #1
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Open "jun27-3412c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9
Close #20
The next is sorting using Word Tables/ Sort
Before using ,make a test batch of numbers as there are various
Sort outcomes. Now I'm using string data, Text sort gave the
right form on my machine. Use Ctrl+shift+Home(or End) to
highlight text up or down .
After sort and before saving to disk press up or down
arrow to select which way the text is returned to you.
My set-up was limited to no more than 15,000. To sort
say 28,000 sort upper half ,then lower half then cut and
paste say 0 to 2 section of lower half into the top of the
top half. Re-sort the expanded 0 to 2 section then
re-sort the remainder. If say selecting 2 to 3 section then
cut and paste at the juncture of 2 and 3 in the other block
to save some repeated sorting. Other times it is quicker
to oversort then backtrack / overlap on the next sort.
Many of the subdivision files are empty because
of the directing. They consist of eg 4,4.. 4,5.... etc
never 4,0.., 4,1.. etc and a number of 8 and 9 sections
are absent back to the generator characteristics eg
only first 8 of 10 are used. When you know all files are less than
15,000, or whatever Sort limit, use the next (simply a recorded macro)
to sort 10 related files. An empty file will stop the macro so edit
out empty files before running.
'Sort 10 related files in one go
'
Documents.Open FileName:="jun27-130", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-131", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-132", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-133", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-134", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-135", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-136", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-137", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-138", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
'
Documents.Open FileName:="jun27-139", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto
Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending, _
FieldNumber2:="", SortFieldType2:=wdSortFieldAlphanumeric, SortOrder2:= _
wdSortOrderAscending, FieldNumber3:="", SortFieldType3:= _
wdSortFieldAlphanumeric, SortOrder3:=wdSortOrderAscending, Separator:= _
wdSortSeparateByTabs, SortColumn:=False, CaseSensitive:=False, LanguageID _
:=wdLanguageNone
ActiveDocument.Save
' It is possible to sometimes join the above sort macro
' with this re-concattenation macro
'
' empty files will append spurious carriage returns at
' head or tail of files so check for this before final match routine
' otherwise use Insert / File to merge files
' merge 10 related files back to one
' for convenience I named these re-concattenated
' files as .txt so they were obvious in listing
' compared to no suffix ones
' in windows explorer the file sizes of these .txt files
' should have the same file-size as the original
' unexpanded and unsorted file
' use "Word" find for any repeated repeatedend of line "^p^p"
'
Documents.Add Template:="", NewTemplate:=False
Selection.InsertFile FileName:="jun27-130", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-131", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-132", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-133", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-134", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-135", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-136", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-137", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-138", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
Selection.InsertFile FileName:="jun27-139", Range:="", ConfirmConversions _
:=False, Link:=False, Attachment:=False
ActiveDocument.SaveAs FileName:="jun27-13.txt", FileFormat:=wdFormatText, _
LockComments:=False, Password:="", AddToRecentFiles:=True, WritePassword _
:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts:=False, _
SaveNativePictureFormat:=False, SaveFormsData:=False, SaveAsAOCELetter:= _
False
Copy and paste all these subfiles together to
submit to the next section. The final match finding,
initially for 12 digits ,then change to 14,16,18
and finally 20 if 18 shows something. This routine
after hours of dividing/sorting/re-merging takes only seconds to complete.
' Find matching pairs in 12 digits
' ie not including triples ( there was an error in previous
' versions of these macros that over-inflated the number
' of pair matches by the number of triple situations )
' xxxx is count from the count files
xxxx =
b$ = "0"
c$ = "0"
Count = 0
Dim ps As String
xxxx = xxxx - 1
Open "jun27-1.txt" For Input As #1
Open "jun27-0pair12.txt" For Output As #2
' change the 12 in the #2 file name above and
' the Left function below to suit number of matches
For x = 0 To xxxx
Input #1, ps
a$ = Left(ps, 12)
a2$ = ps
If a$ = b$ Then
Write #2, a2$, b2$
Count = Count + 1
End If
If a$ = c$ Then Count = Count - 1
If a$ = b$ Then
c$ = b$
c2$ = b2$
End If
b$ = a$
b2$ = a2$
Next x
Write #2, "Count ", Count
Close #1
Close #2
' Find matching triples in 12 digits
' xxxx is count
xxxx =
b$ = "0"
c$ = "0"
Count = 0
Dim ps As String
xxxx = xxxx - 1
Open "jun27-1.txt" For Input As #1
Open "jun27-1trip12.txt" For Output As #2
' change the 12 in the #2 file name above and
' the Left function below to suit number of matches
For x = 0 To xxxx
Input #1, ps
a$ = Left(ps, 12)
a2$ = ps
If a$ = c$ Then
Write #2, a2$, b2$, c2$
Count = Count + 1
End If
If a$ = d$ Then Count = Count - 1
If a$ = b$ Then
d$ = c$
d2$ = c2$
End If
If a$ = b$ Then
c$ = b$
c2$ = b2$
End If
b$ = a$
b2$ = a2$
Next x
Write #2, "Count ", Count
Close #1
Close #2
' Find matching quadruples in 12 digits
' xxxx is the count
xxxx =
b$ = "0"
c$ = "0"
Count = 0
Dim ps As String
xxxx = xxxx - 1
Open "jun27-3.txt" For Input As #1
Open "jun27-3quad12.txt" For Output As #2
' change the 12 in the #2 file name above and
' the Left function below to suit number of matches
For x = 0 To xxxx
Input #1, ps
a$ = Left(ps, 12)
a2$ = ps
If a$ = d$ Then
Write #2, a2$, b2$, c2$, d2$
Count = Count + 1
End If
If a$ = e$ Then Count = Count - 1
If a$ = b$ Then
e$ = d$
e2$ = d2$
End If
If a$ = c$ Then
d$ = c$
d2$ = c2$
End If
If a$ = b$ Then
c$ = b$
c2$ = b2$
End If
b$ = a$
b2$ = a2$
Next x
Write #2, "Count ", Count
Close #1
Close #2
' Find matching quintuples in 12 digits
' xxxx is the count
xxxx =
b$ = "0"
c$ = "0"
Count = 0
Dim ps As String
xxxx = xxxx - 1
Open "jun27-4.txt" For Input As #1
Open "jun27-4quin12.txt" For Output As #2
' change the 12 in the #2 file name above and
' the Left function below to suit number of matches
For x = 0 To xxxx
Input #1, ps
a$ = Left(ps, 12)
a2$ = ps
If a$ = e$ Then
Write #2, a2$, b2$, c2$, d2$, e2$
Count = Count + 1
End If
If a$ = f$ Then Count = Count - 1
If a$ = b$ Then
f$ = e$
f2$ = e2$
End If
If a$ = d$ Then
e$ = d$
e2$ = d2$
End If
If a$ = c$ Then
d$ = c$
d2$ = c2$
End If
If a$ = b$ Then
c$ = b$
c2$ = b2$
End If
b$ = a$
b2$ = a2$
Next x
Write #2, "Count ", Count
Close #1
Close #2
Cross-check for generator comparing output allele frequencies against
intended allele frequencies for 5,000 profiles
D3 Intended Generated (per cent )
0 0.5 0.62
1 0.5 0.48
2 11.4 10.9
3 25.8 25.5
4 24.1 23.96
5 21.7 21.57
6 14.8 15.69
7 1.0 1.06
8 0.2 0.22
VWA
0 .2 .25
1 8.2 7.82
2 10.9 10.4
3 22.7 22.81
4 27.1 27.91
5 21.2 21.04
6 8.4 8.32
7 1.4 1.45
D8
0 1.2 1.1
1 1.0 .97
2 8.5 8.37
3 8.0 7.76
4 13.7 14.39
5 31.6 31.55
6 22.5 22.43
7 11.1 11.07
8 2.4 2.36
D5
0 .2 .22
1 .2 .19
2 2.7 2.54
3 6.8 6.9
4 33.6 33.45
5 38.9 39.09
6 16.0 15.92
7 1.5 1.59
D13
0 11.1 10.87
1 8.7 8.54
2 7.0 6.8
3 27.5 27.7
4 31.9 32.92
5 8.2 8.22
6 5.6 4.95
D7
0 2.0 2.17
1 14.2 14.25
2 14.0 14.45
3 28.7 28.33
4 22.7 22.43
5 14.0 13.93
6 3.8 3.61
7 0.7 0.83
D16
0 17.0 16.9
1 11.6 12.28
2 4.9 4.86
3 29.0 28.72
4 35.2 34.6
5 14.8 15.11
6 2.7 2.74
FGA
0 0.2 0.23
1 0.9 0.89
2 8.0 8.85
3 15.4 15.43
4 18.4 17.66
5 0.2 0.2
6 17.4 17.48
7 0.7 0.65
8 14.3 14.39
9 14.2 14.57
A 6.7 6.36
B 3.1 2.7
C 0.7 0.59
D21
0 0.5 0.57
1 3.6 3.63
2 16.0 16.59
3 25.1 24.81
4 0.2 0.15
5 22.9 22.79
6 1.7 15.2
7 8.5 8.33
8 8.9 8.61
9 1.2 1.0
A 7.8 7.99
B 3.2 3.53
C 0.2 0.25
D 0.2 0.23
D18
0 0.5 0.51
1 1.2 11.1
2 11.9 12.54
3 11.9 11.67
4 17.9 17.47
5 16.7 16.67
6 14.0 13.81
7 12.1 12.1
8 6.1 6.16
9 3.1 3.22
A 19.0 19.1
B 1.2 1.4
C 0.5 0.59
D 0.7 0.66
E 0.2 0.18
THO1
0 23.9 23.4
1 16.4 16.32
2 10.9 11.39
3 16.7 17.2
4 30.9 30.58
5 1.2 1.11
TPOX
0 54.3 53.77
1 8.9 8.93
2 6.8 6.82
3 26.8 26.98
4 3.2 3.5
CSF1PO
0 0.2 0.18
1 2.9 3.13
2 27.0 26.72
3 31.9 31.79
4 30.9 31.33
5 6.3 5.86
6 0.7 0.77
7 0.2 0.22
Results
For 13 loci in order D3,vWA,D8,D5,D13,D7,D16,FGA,D21,D18,THO1,TPOX,CSF1PO
and using RCMP data for Toronto caucasian allele frequencies
http://www.csfs.ca/databases/cfs_CC_ProfilerPlus_freq.htm
http://www.csfs.ca/databases/cfs_CC_Cofiler_freq.htm
and 10 million profiles - no pair mtches on 12 or 13 loci
The random number generator worked fine for 260 million calls in sequence
as no 13 loci false matches and I checked back to the generator file for
each of the following 4 and all started as various undirected sequences.
4 separate 11 loci pairs of matches converted back to standard representation
D3,vWA,D8,D5,D13,D7,D16,FGA,D21,D18,THO1
(15,16)(17,18)(13,14)(11,12)(8,11)(10,11)(11,11)(20,24)(29,31)(15,16)(6,9.3)
(15,16)(14,17)(13,14)(11,12)(11,12)(10,12)(12,13)(20,21)(29,29)(13,15)(6,9.3)
(15,18)(16,18)(10,14)(12,12)(9,12)(11,12)(12,12)(20,22)(29,29)(13,15)(6,6)
(18,18)(16,17)(14,15)(11,11)(11,12)(9,9)(12,12)(20,23)(28,32.2)(16,17)(9.3,9.3)
The minimum allele frequencies for these 4 matches was
8.5 per cent, 8.2 , 8.5 and 7.8
Then each case one allele of each of the locus 12 and 13 matching.
Any 11 loci from 13 would produce many more matches of course.
That is despite 2 chances of 54 percent for each 'person' to inherit allele 8 on TPOX
or 79 percent for at least one TPOX 8 allele.
I would conjecture that to obtain one CODIS match would require
about 26 million profiles. Summary of results so far would
suggest this
Table for totally random (no inter-relatedness ) 'populations' of 4 and 10 million
6 loci : 1 in 14,000
7 loci : 1 in 78,000
8 loci : 1 in 0.26 million
9 loci : 1 in 0.87 million
10 loci : 1 in 1.8 million
11 loci : 1 in 4.5 million
12 loci : 1 in 11 million (projected)
13 loci : 1 in 26 million (projected)
The fractional discontinuity at 9 to 10 loci probably due
to swapping from UK loci to US loci.
The most remarkable result is the effect on triple
matches and higher n-tuple matches.
The following is a bit apples and pears because the 4 million
results were for UK and 10 million for US loci , 7 of which are the same.
Loci / Pairs in 4 million / 10 million
6- 88,000/ 1.46 million
7- 2,600/ 535,000
8- 243/ 33,000
9- 21/ 1,570
10- 4/ 44
11- -/ 4
So a 2.5 factor increase in population increases the number
of reduced loci matches phenomenally - well above any
square law or even cubes - even as much as to the power of 5
For the 10 loci situation despite change from UK
to USA loci the reults are consistent with square law.
4.8 matches in 4 million UK would suggest 4.8 * 2.5^2
for 10 million or 30 predicted. US 10 million gioves 44
Loci / Triples in 4 million / 10m
6- 8,018/ 626,000
Then 10 million only
7- 105,000
8- 739
9- 2
Loci / Quadruples in 4 million / 10m
6- 1024/ 338,000
Then 10 million only
7- 32,000
8- 35
Loci / Quintuples in 4 million / 10m
6- 176/ 207,000
Then for 10 million only
7- 12,700
8- 2
Remember the forensic statisticians in the 1990s quoting in court
the probability for an unrelated 6 locus profile match to this
defendent being 37 million to 1 against.
37 million 6 locus profiles would be totally swamped with
n-tuple repeats let alone pair matches
To go to a 30 million profile simulation I would have to automate
a lot of the processing to be done overnight as the generation
would take 16 hours and just the sorting,
on my pc, would take perhaps 100 hours and all require 10G Byte of free disk space.
' Generating 13 loci x2 profiles excluding rare alleles
' as set here for all allele frequencies
' less than 8% 0.08
' To change from 8% to 5% or whatever delete the appropriate
' flagged allele conditionals eg "If ph(j) = "2" Then flag = 1"
' directing pairs and first divider
Dim ph(26)
Dim pb(26)
' initialising Random Number Generator - RNG
count9 = 0
count8 = 0
countf = 0
Randomize
a = 214013
c = 2531011
x0 = Timer
z = 2 ^ 24
' 1 file 'jul06-g' for original, un-directed pairs, source data.
' This file is necessary to check on the performance of the RNG
' when a matched pair is found then it is highly unlikely that
' both sequences as generated, before pair directing, would
' be the same - more likely a manifest of repeat within the RNG
' (reason for adopting the 214013 / 2531011 RNG )
' Use 'Word' find function on part of the sequences, including pair reversals,
' with luck would include a 'homozygotic' pair eg (3,3) say ,so no reversal
' on that pair
Open "jul06-g" For Output As #1
' outputs directed and divided by first digit
Open "jul06-0" For Output As #10
Open "jul06-1" For Output As #11
Open "jul06-2" For Output As #12
Open "jul06-3" For Output As #13
Open "jul06-4" For Output As #14
Open "jul06-5" For Output As #15
Open "jul06-6" For Output As #16
Open "jul06-7" For Output As #17
Open "jul06-8" For Output As #18
Open "jul06-9" For Output As #19
' change xxxx for different total size, no large numbers are required
' as only obtaining a scaling factor
xxxx = 26000
xxxx = xxxx - 1
For x = 0 To xxxx
flag = 0
For j = 0 To 1
' D3 , locus 1
' RNG random number generator
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.005 Then ph(j) = 11
If ph(j) < 0.01 Then ph(j) = 1
If ph(j) < 0.124 Then ph(j) = 2
If ph(j) < 0.382 Then ph(j) = 3
If ph(j) < 0.623 Then ph(j) = 4
If ph(j) < 0.84 Then ph(j) = 5
If ph(j) < 0.988 Then ph(j) = 6
If ph(j) < 0.998 Then ph(j) = 7
If ph(j) < 1 Then ph(j) = 8
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "0" Then flag = 1
If ph(j) = "1" Then flag = 1
If ph(j) = "7" Then flag = 1
If ph(j) = "8" Then flag = 1
Next j
For j = 2 To 3
' vWA locus 2
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.084 Then ph(j) = 1
If ph(j) < 0.193 Then ph(j) = 2
If ph(j) < 0.42 Then ph(j) = 3
If ph(j) < 0.691 Then ph(j) = 4
If ph(j) < 0.903 Then ph(j) = 5
If ph(j) < 0.987 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "0" Then flag = 1
If ph(j) = "7" Then flag = 1
Next j
For j = 4 To 5
' D8 , locus 3
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.012 Then ph(j) = 11
If ph(j) < 0.022 Then ph(j) = 1
If ph(j) < 0.107 Then ph(j) = 2
If ph(j) < 0.187 Then ph(j) = 3
If ph(j) < 0.324 Then ph(j) = 4
If ph(j) < 0.64 Then ph(j) = 5
If ph(j) < 0.865 Then ph(j) = 6
If ph(j) < 0.976 Then ph(j) = 7
If ph(j) < 1 Then ph(j) = 8
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "0" Then flag = 1
If ph(j) = "1" Then flag = 1
If ph(j) = "3" Then flag = 1
If ph(j) = "8" Then flag = 1
Next j
For j = 6 To 7
' D5 , locus 4
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.004 Then ph(j) = 1
If ph(j) < 0.031 Then ph(j) = 2
If ph(j) < 0.099 Then ph(j) = 3
If ph(j) < 0.435 Then ph(j) = 4
If ph(j) < 0.824 Then ph(j) = 5
If ph(j) < 0.985 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "0" Then flag = 1
If ph(j) = "1" Then flag = 1
If ph(j) = "2" Then flag = 1
If ph(j) = "3" Then flag = 1
If ph(j) = "7" Then flag = 1
Next j
For j = 8 To 9
' D13 , locus 5
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.111 Then ph(j) = 11
If ph(j) < 0.198 Then ph(j) = 1
If ph(j) < 0.268 Then ph(j) = 2
If ph(j) < 0.543 Then ph(j) = 3
If ph(j) < 0.862 Then ph(j) = 4
If ph(j) < 0.944 Then ph(j) = 5
If ph(j) < 1 Then ph(j) = 6
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "2" Then flag = 1
If ph(j) = "6" Then flag = 1
Next j
For j = 10 To 11
' D7, locus 6
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.02 Then ph(j) = 11
If ph(j) < 0.162 Then ph(j) = 1
If ph(j) < 0.302 Then ph(j) = 2
If ph(j) < 0.589 Then ph(j) = 3
If ph(j) < 0.816 Then ph(j) = 4
If ph(j) < 0.955 Then ph(j) = 5
If ph(j) < 0.993 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "0" Then flag = 1
If ph(j) = "6" Then flag = 1
If ph(j) = "7" Then flag = 1
Next j
For j = 12 To 13
' D16, locus 7
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.017 Then ph(j) = 11
If ph(j) < 0.133 Then ph(j) = 1
If ph(j) < 0.182 Then ph(j) = 2
If ph(j) < 0.472 Then ph(j) = 3
If ph(j) < 0.824 Then ph(j) = 4
If ph(j) < 0.972 Then ph(j) = 5
If ph(j) < 1 Then ph(j) = 6
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "0" Then flag = 1
If ph(j) = "2" Then flag = 1
If ph(j) = "6" Then flag = 1
Next j
For j = 14 To 15
' FGA, locus 8
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.011 Then ph(j) = 1
If ph(j) < 0.091 Then ph(j) = 2
If ph(j) < 0.245 Then ph(j) = 3
If ph(j) < 0.429 Then ph(j) = 4
If ph(j) < 0.431 Then ph(j) = 5
If ph(j) < 0.605 Then ph(j) = 6
If ph(j) < 0.612 Then ph(j) = 7
If ph(j) < 0.755 Then ph(j) = 8
If ph(j) < 0.897 Then ph(j) = 9
If ph(j) < 0.963 And ph(j) >= 0.897 Then pb(j) = "A"
If ph(j) < 0.993 And ph(j) >= 0.963 Then pb(j) = "B"
If ph(j) < 1 And ph(j) >= 0.993 Then pb(j) = "C"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
If ph(j) = "0" Then flag = 1
If ph(j) = "1" Then flag = 1
If ph(j) = "2" Then flag = 1
If ph(j) = "5" Then flag = 1
If ph(j) = "7" Then flag = 1
If pb(j) = "A" Then flag = 1
If pb(j) = "B" Then flag = 1
If pb(j) = "C" Then flag = 1
Next j
For j = 16 To 17
' D21, locus 9
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.005 Then ph(j) = 11
If ph(j) < 0.041 Then ph(j) = 1
If ph(j) < 0.201 Then ph(j) = 2
If ph(j) < 0.452 Then ph(j) = 3
If ph(j) < 0.454 Then ph(j) = 4
If ph(j) < 0.683 Then ph(j) = 5
If ph(j) < 0.7 Then ph(j) = 6
If ph(j) < 0.785 Then ph(j) = 7
If ph(j) < 0.874 Then ph(j) = 8
If ph(j) < 0.886 Then ph(j) = 9
If ph(j) < 0.964 And ph(j) >= 0.886 Then pb(j) = "A"
If ph(j) < 0.996 And ph(j) >= 0.964 Then pb(j) = "B"
If ph(j) < 0.998 And ph(j) >= 0.996 Then pb(j) = "C"
If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "D"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
If ph(j) = "0" Then flag = 1
If ph(j) = "1" Then flag = 1
If ph(j) = "4" Then flag = 1
If ph(j) = "6" Then flag = 1
If ph(j) = "9" Then flag = 1
If pb(j) = "A" Then flag = 1
If pb(j) = "B" Then flag = 1
If pb(j) = "C" Then flag = 1
If pb(j) = "D" Then flag = 1
Next j
For j = 18 To 19
' D18, locus 10
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
pb(j) = "Z"
If ph(j) < 0.005 Then ph(j) = 11
If ph(j) < 0.017 Then ph(j) = 1
If ph(j) < 0.137 Then ph(j) = 2
If ph(j) < 0.256 Then ph(j) = 3
If ph(j) < 0.435 Then ph(j) = 4
If ph(j) < 0.602 Then ph(j) = 5
If ph(j) < 0.742 Then ph(j) = 6
If ph(j) < 0.863 Then ph(j) = 7
If ph(j) < 0.924 Then ph(j) = 8
If ph(j) < 0.955 Then ph(j) = 9
If ph(j) < 0.974 And ph(j) >= 0.955 Then pb(j) = "A"
If ph(j) < 0.986 And ph(j) >= 0.974 Then pb(j) = "B"
If ph(j) < 0.991 And ph(j) >= 0.986 Then pb(j) = "C"
If ph(j) < 0.998 And ph(j) >= 0.991 Then pb(j) = "D"
If ph(j) < 1 And ph(j) >= 0.998 Then pb(j) = "E"
If ph(j) > 10 Then ph(j) = 0
If pb(j) <> "Z" Then ph(j) = pb(j)
If ph(j) = "0" Then flag = 1
If ph(j) = "1" Then flag = 1
If ph(j) = "8" Then flag = 1
If ph(j) = "9" Then flag = 1
If pb(j) = "A" Then flag = 1
If pb(j) = "B" Then flag = 1
If pb(j) = "C" Then flag = 1
If pb(j) = "D" Then flag = 1
If pb(j) = "E" Then flag = 1
Next j
For j = 20 To 21
' THO1 , locus 11
' RNG random number generator
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.239 Then ph(j) = 11
If ph(j) < 0.403 Then ph(j) = 1
If ph(j) < 0.512 Then ph(j) = 2
If ph(j) < 0.679 Then ph(j) = 3
If ph(j) < 0.988 Then ph(j) = 4
If ph(j) < 1 Then ph(j) = 5
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "5" Then flag = 1
Next j
For j = 22 To 23
' TPOX, locus 12
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.543 Then ph(j) = 11
If ph(j) < 0.632 Then ph(j) = 1
If ph(j) < 0.7 Then ph(j) = 2
If ph(j) < 0.968 Then ph(j) = 3
If ph(j) < 1 Then ph(j) = 4
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "2" Then flag = 1
If ph(j) = "4" Then flag = 1
Next j
For j = 24 To 25
' CSF1PO , locus 13
' RNG
temp = x0 * a + c
temp = temp / z
x1 = (temp - Fix(temp)) * z
x0 = x1
phj = x1 / z
ph(j) = phj
If ph(j) < 0.002 Then ph(j) = 11
If ph(j) < 0.031 Then ph(j) = 1
If ph(j) < 0.301 Then ph(j) = 2
If ph(j) < 0.62 Then ph(j) = 3
If ph(j) < 0.928 Then ph(j) = 4
If ph(j) < 0.991 Then ph(j) = 5
If ph(j) < 0.998 Then ph(j) = 6
If ph(j) < 1 Then ph(j) = 7
If ph(j) > 10 Then ph(j) = 0
If ph(j) = "0" Then flag = 1
If ph(j) = "1" Then flag = 1
If ph(j) = "5" Then flag = 1
If ph(j) = "6" Then flag = 1
If ph(j) = "7" Then flag = 1
Next j
If flag = 1 Then countf = countf + 1
If flag = 0 Then
' output the original generated file
Write #1, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
' output the original generated file
For j = 0 To 24 Step 2
If ph(j + 1) < ph(j) Then
jjj = ph(j)
ph(j) = ph(j + 1)
ph(j + 1) = jjj
End If
Next j
' put extra conditional statements here to reduce
' the number of files or just delete some of the following
'
' dividing on first column, file by file
If ph(0) = 0 Then
Write #10, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count0 = count0 + 1
End If
If ph(0) = 1 Then
Write #11, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count1 = count1 + 1
End If
If ph(0) = 2 Then
Write #12, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count2 = count2 + 1
End If
If ph(0) = 3 Then
Write #13, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count3 = count3 + 1
End If
If ph(0) = 4 Then
Write #14, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count4 = count4 + 1
End If
If ph(0) = 5 Then
Write #15, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count5 = count5 + 1
End If
If ph(0) = 6 Then
Write #16, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count6 = count6 + 1
End If
If ph(0) = 7 Then
Write #17, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count7 = count7 + 1
End If
If ph(0) = 8 Then
Write #18, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count8 = count8 + 1
End If
If ph(0) = 9 Then
Write #19, ph(0) & ph(1) & ph(2) & ph(3) & ph(4) & ph(5) & ph(6) & ph(7) & ph(8) & ph(9) & ph(10) & ph(11) & ph(12) & ph(13) & ph(14) & ph(15) & ph(16) & ph(17) & ph(18) & ph(19) & ph(20) & ph(21) & ph(22) & ph(23) & ph(24) & ph(25)
count9 = count9 + 1
End If
End If
Next x
Close #10
Close #11
Close #12
Close #13
Close #14
Close #15
Close #16
Close #17
Close #18
Close #19
Close #1
' count file for data to fix for - next loops in sucessive dividings
' count for >= 8.0% allele frequencies is xxxx minus countf
Open "jul06-c" For Output As #20
Write #20, 0, count0, 1, count1, 2, count2, 3, count3, 4, count4, 5, count5, 6, count6, 7, count7, 8, count8, 9, count9, "count flag ", xxxx - countf
Close #20
So for the moment assuming 26 million totally random profiles to produce
one 13 locus match then the above simulation produces the following results
for that subset of the population with 13 locus profiles with no rare alleles and all allele frequencies
greater than
1 percent then 1 in 17.5 million
3 per cent then 1 in 7 million
5 per cent then 1 in 5.9 million
8 per cent then 1 in 1.6 million
I woul be interested to hear from anyone who is aware of processed
large swathes of data showing non- Hardy-Weinberg equilibrium.
That is cross-linking of loci/alleles showing up in real populations
(similar to inheritance of blonde hair / blue eyes versus dark hair /
dark eyes sort of effect). I've only, so far, seen references
to Non H-WE effects in certain populations and specific loci but not
data precise to the loci and alleles involved. As it is not across the board
effect with all loci then it is hard to say it is a manifest of an inbred population.
To go higher than 10 million I will have to
automate a lot of this processing to do it
hands-off over night or for hours at a time.
First to determine the how the successive file dividings
break down to give all file sizes below the maximum
sort size under Word97.
For locus L(h) with alleles i and j and allele frequency a.
The sum of all allele frequencies is 1 so for S read sigma
S(a(i,j)) = 1 for all h
For X profiles in total
For the locus L(h) the fraction of total in first column
and allele i is
F(h,i) = F(h-1)*[1+S(R(>i)) -S(R(i)) + S(R(i)) is the sum of the allele frequencies
more than the ith and R(i)) S(R(