Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Added three test cases for the kernel density estimation. Resulted in a few small corrections (deal with missing values) |
---|---|
Downloads: | Tarball | ZIP archive |
Timelines: | family | ancestors | descendants | both | trunk |
Files: | files | file ages | folders |
SHA1: |
2456e0d41399a53b8460aed292c54a4c |
User & Date: | markus 2014-01-18 14:20:22.935 |
Context
2014-01-20
| ||
05:17 | Corrected use of bandwidth, adjusted a test to take care of pecularities of the kernels check-in: 991cdf14aa user: markus tags: trunk | |
2014-01-18
| ||
14:20 | Added three test cases for the kernel density estimation. Resulted in a few small corrections (deal with missing values) check-in: 2456e0d413 user: markus tags: trunk | |
12:13 | Add a procedure for estimating probability density functions by means of the kernel density estimation method check-in: bfc68668cd user: markus tags: trunk | |
Changes
Changes to modules/math/ChangeLog.
1 2 3 4 5 6 7 8 9 10 11 12 | 2014-01-18 Arjen Markus <[email protected]> * statistics.tcl: Added stat_kernel.tcl * stat_kernel.tcl: Implements a straightforward kernel density estimation procedure * statistics.man: Describe the kernel denstity estimation procedure, moved the description of several tests to the general section * pkgIndex.tcl: Bumped version of statistics package to 0.9 2013-12-20 Arjen Markus <[email protected]> * interpolate.tcl: [Ticket 843c2257d2] Added special case for points coincident with the data points * interpolate.test: [Ticket 843c2257d2] Added test case for coincident points 2013-12-17 Andreas Kupries <[email protected]> | > | 1 2 3 4 5 6 7 8 9 10 11 12 13 | 2014-01-18 Arjen Markus <[email protected]> * statistics.tcl: Added stat_kernel.tcl * stat_kernel.tcl: Implements a straightforward kernel density estimation procedure * statistics.man: Describe the kernel denstity estimation procedure, moved the description of several tests to the general section * statistics.test: Added three tests for the kernel density estimation (note: one result is a bit troublesome) * pkgIndex.tcl: Bumped version of statistics package to 0.9 2013-12-20 Arjen Markus <[email protected]> * interpolate.tcl: [Ticket 843c2257d2] Added special case for points coincident with the data points * interpolate.test: [Ticket 843c2257d2] Added test case for coincident points 2013-12-17 Andreas Kupries <[email protected]> |
︙ | ︙ |
Changes to modules/math/TODO.
1 2 3 4 5 | This file records outstanding actions for the math module dd. 26 october 2005, Arjen Markus qcomplex.test: extend the tests for cos/sin .. to include | > > > > > | | | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | This file records outstanding actions for the math module dd. 18 january 2014, Arjen Markus test cases for kernel-density: One test case is troublesome - uniform kernel, checking the total density dd. 26 october 2005, Arjen Markus qcomplex.test: extend the tests for cos/sin .. to include non-real results. dd. 28 september 2005, Arjen Markus optimize.tcl: linear programming algorithm ignores certain constraints (of type x > 0). Needs to be fixed dd. 22 june 2004, Arjen Markus interpolate.man: add examples interpolate.tcl: more consistency in the calling convention checks on arguments (add tests for them) |
︙ | ︙ |
Changes to modules/math/stat_kernel.tcl.
︙ | ︙ | |||
87 88 89 90 91 92 93 94 95 96 97 98 99 100 | if { $opt(-bandwidth) <= 0.0 } { return -code error -errorcode ARG -errorinfo "The bandwidth must be positive: $opt(-bandwidth)" } if { $opt(-number) <= 0.0 } { return -code error -errorcode ARG -errorinfo "The number of bins must be positive: $opt(-number)" } if { [llength [info proc $opt(-kernel)]] == 0 } { return -code error -errorcode ARG -errorinfo "Unknown kernel function: $opt(-kernel)" } # # Construct the weights | > > > > | 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 | if { $opt(-bandwidth) <= 0.0 } { return -code error -errorcode ARG -errorinfo "The bandwidth must be positive: $opt(-bandwidth)" } if { $opt(-number) <= 0.0 } { return -code error -errorcode ARG -errorinfo "The number of bins must be positive: $opt(-number)" } if { [lindex $opt(-interval) 0] == [lindex $opt(-interval) 1] } { return -code error -errorcode ARG -errorinfo "The interval has length zero: $opt(-interval)" } if { [llength [info proc $opt(-kernel)]] == 0 } { return -code error -errorcode ARG -errorinfo "Unknown kernel function: $opt(-kernel)" } # # Construct the weights |
︙ | ︙ | |||
117 118 119 120 121 122 123 | if { $d != {} } { lappend weight [expr {$w / $scale}] } else { lappend weight {} } } } else { | | | | 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 | if { $d != {} } { lappend weight [expr {$w / $scale}] } else { lappend weight {} } } } else { set weight [lrepeat [llength $data] [expr {1.0/$ndata}]] ;# Note: missing values have weight zero } # # Construct the centres of the bins # set xbegin [lindex $opt(-interval) 0] set xend [lindex $opt(-interval) 1] set dx [expr {($xend - $xbegin) / double($opt(-number))}] set xb [expr {$xbegin + 0.5 * $dx}] set xvalue {} for {set i 0} {$i < $opt(-number)} {incr i} { lappend xvalue [expr {$xb + $i * $dx}] } # |
︙ | ︙ | |||
170 171 172 173 174 175 176 | # Note: # The standard deviation is 1. # proc ::math::statistics::gaussian {x} { return [expr {exp(-0.5*$x*$x) / sqrt(2.0*acos(-1.0))}] } proc ::math::statistics::uniform {x} { | | | 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 | # Note: # The standard deviation is 1. # proc ::math::statistics::gaussian {x} { return [expr {exp(-0.5*$x*$x) / sqrt(2.0*acos(-1.0))}] } proc ::math::statistics::uniform {x} { if { abs($x) <= 1.0 } { return 0.5 } else { return 0.0 } } proc ::math::statistics::triangular {x} { if { abs($x) < 1.0 } { |
︙ | ︙ |
Changes to modules/math/statistics.test.
︙ | ︙ | |||
595 596 597 598 599 600 601 602 603 604 | } -result -0.175757575758 test "spearman-rank-extended-1.0" "Test extended Spearman rank correlation procedure" -match tolerant -body { ::math::statistics::spearman-rank-extended {106 86 100 101 99 103 97 113 112 110} \ { 7 0 27 50 28 29 20 12 6 17} } -result {-0.175757575758 10 -0.456397284} # End of test cases testsuiteCleanup | > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > | 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 | } -result -0.175757575758 test "spearman-rank-extended-1.0" "Test extended Spearman rank correlation procedure" -match tolerant -body { ::math::statistics::spearman-rank-extended {106 86 100 101 99 103 97 113 112 110} \ { 7 0 27 50 28 29 20 12 6 17} } -result {-0.175757575758 10 -0.456397284} # # Note: for the uniform kernel the sum is roughly 0.92. It would appear from a visual # investigation that the sides may influence this, as well as the default bandwidth. # Leave it for now. # test "kernel-density-1.0" "Test various kernel functions" -body { set data {1 2 3 4 5 6 7 8 9 10} set roughlyOne {} foreach kernel {gaussian uniform triangular epanechnikov biweight cosine logistic} { set result [::math::statistics::kernel-density $data -kernel $kernel] set sum 0.0 set xbegin [lindex $result 2 0] set xend [lindex $result 2 1] set number [llength [lindex $result 0]] set dx [expr {($xend-$xbegin) / $number}] # # Integral should be roughly one # set sum 0.0 foreach v [lindex $result 1] { set sum [expr {$sum + $dx * $v}] } lappend roughlyOne [expr {abs($sum-1.0) < 0.01}] } return $roughlyOne } -result {1 0 1 1 1 1 1} test "kernel-density-1.1" "Test various options - just that they have effect" -body { set subResults {} set data {1 2 3 4 5 6 7 8 9 10} set result [::math::statistics::kernel-density $data -number 20] lappend subResults [llength [lindex $result 0]] ;# Number of bins lappend subResults [llength [lindex $result 1]] ;# Number of density values set result [::math::statistics::kernel-density $data -interval {0 20}] lappend subResults [lindex $result 2 0] ;# Beginning of interval lappend subResults [lindex $result 2 1] ;# End of interval lappend subResults [expr {[lindex $result 0 0] > [lindex $result 2 0]}] ;# First bin -- beginning of interval lappend subResults [expr {[lindex $result 0 0] < [lindex $result 2 1]}] ;# First bin -- end of interval lappend subResults [expr {[lindex $result 0 end] > [lindex $result 2 0]}] ;# Last bin -- beginning of interval lappend subResults [expr {[lindex $result 0 end] < [lindex $result 2 1]}] ;# Last bin -- end of interval set result [::math::statistics::kernel-density $data -bandwidth 2] lappend subResults [lindex $result 2 end] ;# Bandwidth return $subResults } -result {20 20 0 20 1 1 1 1 2} test "kernel-density-1.2" "Dealing with missing values" -body { set subResults {} set data {1 2 3 4 {} 6 7 8 9 10} set result [::math::statistics::kernel-density $data] set sum 0.0 set xbegin [lindex $result 2 0] set xend [lindex $result 2 1] set number [llength [lindex $result 0]] set dx [expr {($xend-$xbegin) / $number}] # # Integral should be roughly one # set sum 0.0 foreach v [lindex $result 1] { set sum [expr {$sum + $dx * $v}] } return [expr {abs($sum-1.0) < 0.01}] } -result 1 # End of test cases testsuiteCleanup |