To: perfctr-devel@lists.sourceforge.net, perfapi-devel@nacse.org
Subject: perfctr-2.0 released (finally!)

perfctr-2.0 is now (finally!) available at the usual place: .

The most important change since 2.0-pre6 is that interrupt-mode virtual perfctrs now work again. There are also some API changes due to internal data-structure rearrangements and cleanups. These were originally part of an attempt to optimise cache-line usage. That didn't give any improvements in my testing, but I kept the core changes as they lead to a cleaner API.

Since so many people seem unwilling to use standard kernels as opposed to vendor-hacked kernels, I now include a kernel patch for the 2.4.3-12 (update) RedHat 7.1 kernel.

Version 2.0, 2001-08-08
- Resurrected partial support for interrupt-mode virtual perfctrs.
  virtual.c permits a single i-mode perfctr, in addition to TSC
  and a number of a-mode perfctrs.
  BUG: The i-mode PMC must be last, which constrains CPUs like the
  P6 where we currently restrict the pmc_map[] to be the identity
  mapping. (Not a problem for K7 since it is symmetric, or P4 since
  it is expected to use a non-identity pmc_map[].)
- Bug fix in perfctr_cpu_update_control(), to prevent a failed
  attempt to update the control from leaving the object in an
  inconsistent state.

Version 2.0-pre7, 2001-08-07
- Updated user-space library:
  * Coding tweaks to attempt to make gcc (various versions) generate
    better code. (Not entirely successful. May have to resort to
    hand-written assembly code.)
  * New vperfctr_read_ctrs() sampling procedure.
  * New perfctr_print_info() helper procedure.
- Updated example applications:
  * Use the library's perfctr_print_info() for consistent output.
  * Counts are now printed in decimal, not hex.
  * 'perfex' now checks for data layout mismatch when the child
    process' virtual perfctr is mmap:ed into user space.
  * 'self' uses the new vperfctr_read_ctrs() sampling procedure.
  * 'signal' compiles again.
- Cleaned up the driver's debugging code.
- Internal driver rearrangements. The low-level driver (x86) now
  handles sampling/suspending/resuming counters. Merged counter state
  (sums and start values) and CPU control data to a single "CPU state"
  object. This simplifies the high-level drivers, and permits some
  optimisations in the low-level driver by avoiding the need to buffer
  tsc/pmc samples in memory before updating the accumulated sums
  (not yet implemented).
- Removed WinChip "fake TSC" support. The user-space library can now
  sample with slightly less overhead on sane processors.

/ Mikael Pettersson
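
For readers unfamiliar with the "CPU state" idea in the 2.0-pre7 notes, the following is a rough sketch of how sums, start values and control data might live in one object, and how a sample could be folded straight into the accumulated sums. It is not the driver's code. The struct layout, the field widths, NR_PMCS_SKETCH, the fake_hw[] array standing in for real counter reads, and every *_sketch name are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_PMCS_SKETCH 2   /* hypothetical number of counters */

/* Hypothetical merged "CPU state": control data plus, per counter,
 * an accumulated sum and the start value seen at the last resume. */
struct cpu_state_sketch {
    uint32_t evntsel[NR_PMCS_SKETCH];  /* control data (unused in this sketch) */
    uint64_t sum[NR_PMCS_SKETCH];      /* accumulated counts */
    uint32_t start[NR_PMCS_SKETCH];    /* counter value at last resume/sample */
};

/* Stand-in for the hardware counters, so the sketch runs in user space. */
static uint32_t fake_hw[NR_PMCS_SKETCH];

static uint32_t read_pmc_sketch(int i)
{
    return fake_hw[i];
}

/* Resume: remember where each counter currently stands. */
static void resume_sketch(struct cpu_state_sketch *st)
{
    assert(st != NULL);
    for (int i = 0; i < NR_PMCS_SKETCH; i++)
        st->start[i] = read_pmc_sketch(i);
}

/* Suspend: add the distance travelled since the last start value.
 * The unsigned 32-bit subtraction stays correct across one wraparound. */
static void suspend_sketch(struct cpu_state_sketch *st)
{
    assert(st != NULL);
    for (int i = 0; i < NR_PMCS_SKETCH; i++) {
        uint32_t now = read_pmc_sketch(i);
        st->sum[i] += (uint32_t)(now - st->start[i]);
    }
}

/* Sample: update each sum directly from the fresh reading and restart
 * from it, without first buffering all readings in a separate array. */
static void sample_sketch(struct cpu_state_sketch *st)
{
    assert(st != NULL);
    for (int i = 0; i < NR_PMCS_SKETCH; i++) {
        uint32_t now = read_pmc_sketch(i);
        st->sum[i] += (uint32_t)(now - st->start[i]);
        st->start[i] = now;
    }
}
```

Because the state is a single object, a high-level driver in this sketch only has to call resume_sketch(), sample_sketch() and suspend_sketch() on it and never touches the start values itself.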