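I placed the perf binary under /data/bin on the device.
adb shell /data/bin/perf list lists all available performance events, which fall into four classes: hardware events, software events, hardware cache events, and tracepoint events. adb shell /data/bin/perf stat ls prints the statistics of every performance counter collected while ls runs. adb shell /data/bin/perf stat -e event ls prints the statistics for the single event named event.
Running adb shell /data/bin/perf stat ls produces the following output: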
Error: open_counter returned with 19 (No such device). /bin/dmesg may provide additional information.
Fatal: Not all events could be opened.
Following the hint, I ran adb shell dmesg and found this error message:
hw perfevents: unable to reserve pmu
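Running adb shell /data/bin/perf stat -e event ls for each event in turn shows a pattern: whenever the event is a hardware event or a hardware cache event, the command fails with exactly the error above, while software events and tracepoint events succeed. What does this mean? It means the PMU hardware is not being used at all, so none of the hardware performance counters can be read.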
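The per-event check above can also be made without the perf tool, since perf opens its counters through the perf_event_open system call. The following is a minimal sketch, assuming a Linux system with the kernel headers installed; the helper name try_open_counter is made up for illustration and is not part of perf.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: try to open one counter on the calling thread.
 * Returns 0 on success, or the errno value reported by the kernel
 * (for example ENODEV when no PMU device is available).
 */
int try_open_counter(unsigned int type, unsigned long long config)
{
    struct perf_event_attr attr;
    long fd;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = type;          /* e.g. PERF_TYPE_HARDWARE or PERF_TYPE_SOFTWARE */
    attr.config = config;      /* e.g. PERF_COUNT_HW_CPU_CYCLES */
    attr.disabled = 1;         /* only probing, never start counting */
    attr.exclude_kernel = 1;   /* user-space only, needs fewer privileges */
    attr.exclude_hv = 1;

    /* pid = 0: calling thread, cpu = -1: any CPU, no group, no flags */
    fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return errno;

    close((int)fd);
    return 0;
}
```

Calling try_open_counter(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES) and try_open_counter(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK) side by side would presumably reproduce the split seen here: the first returning 19 and the second 0. On other systems the result may also reflect permission settings rather than the PMU.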
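The Galaxy Nexus uses an OMAP CPU (ARM Cortex-A9). I had already downloaded the matching kernel source into the omap directory:
git clone
cd omap
git checkout remotes/origin/android-omap-tuna-3.0 -b tuna
Searching for the error message above turns up many discussions of perf stat failures on OMAP. Some people claim that this chip has a design flaw that prevents the PMU interrupt from being raised. Whether the hardware is really at fault can be checked with gator, provided by ARM. The gator-driver chooses its profiling method according to the kernel version: below 3.0.0 it uses ARM's own PMU code, otherwise it goes through the Linux perf subsystem. The phone currently runs kernel 3.0.31, so gator uses Linux perf. Experimenting with DS-5 Streamline, the dmesg output does indeed show: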
hw perfevents: unable to reserve pmu
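If the kernel version check in gator-driver is changed so that the perf subsystem is used only from version 3.1.0 onwards, then on this phone the gator module reads the counters with its own PMU code instead of relying on the device driver shipped with the Linux kernel. In that experiment, "hw perfevents: unable to reserve pmu" disappears from dmesg and the hardware performance counter values are read back. This proves that the hardware is fine and that attention should turn to the kernel code.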
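The effect of moving that threshold can be illustrated with a small user-space model. This is a sketch, not gator's actual source: choose_backend and the enum are hypothetical names, and MODEL_KERNEL_VERSION is assumed to pack the version the same way the kernel's KERNEL_VERSION macro did in this era.

```c
#include <assert.h>

/* Assumed to mirror KERNEL_VERSION from <linux/version.h>:
 * major, minor and patch level packed into one integer. */
#define MODEL_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

enum pmu_backend {
    BACKEND_GATOR_PMU,   /* gator programs the PMU registers itself */
    BACKEND_LINUX_PERF   /* gator goes through the kernel perf subsystem */
};

/* Hypothetical helper: pick the backend from the running kernel version
 * and the threshold compiled into the driver. */
enum pmu_backend choose_backend(int running, int perf_threshold)
{
    assert(running > 0 && perf_threshold > 0);
    return running >= perf_threshold ? BACKEND_LINUX_PERF : BACKEND_GATOR_PMU;
}
```

With the phone's 3.0.31 kernel, a threshold of 3.0.0 selects BACKEND_LINUX_PERF, while a threshold of 3.1.0 selects BACKEND_GATOR_PMU, which is the configuration used in the experiment.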
Searching the kernel source for "unable to reserve pmu" shows that it appears only in the armpmu_reserve_hardware() function in omap/arch/arm/kernel/perf_event.c. The warning is printed when reserve_pmu(ARM_PMU_DEVICE_CPU) returns an error code.

    static int
    armpmu_reserve_hardware(void)
    {
        struct arm_pmu_platdata *plat;
        irq_handler_t handle_irq;
        int i, err = -ENODEV, irq;

        pmu_device = reserve_pmu(ARM_PMU_DEVICE_CPU);
        if (IS_ERR(pmu_device)) {
            pr_warning("unable to reserve pmu\n");
            return PTR_ERR(pmu_device);
        }
        ……

reserve_pmu(enum arm_pmu_type device) is defined in omap/arch/arm/kernel/pmu.c:

    struct platform_device *
    reserve_pmu(enum arm_pmu_type device)
    {
        struct platform_device *pdev;

        if (test_and_set_bit_lock(device, &pmu_lock)) {
            pdev = ERR_PTR(-EBUSY);
        } else if (pmu_devices[device] == NULL) {
            clear_bit_unlock(device, &pmu_lock);
            pdev = ERR_PTR(-ENODEV);
        } else {
            pdev = pmu_devices[device];
        }

        return pdev;
    }

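As the code shows, when no device has been registered it returns -ENODEV (errno 19, "No such device"), which matches the error reported by perf stat ls exactly.
The PMU device is initialized and registered in omap/arch/arm/mach-omap2/devices.c: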
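The three outcomes of reserve_pmu() can be traced with a simplified user-space model. This is only a sketch: the types are stand-ins, the bit operations are plain non-atomic C rather than test_and_set_bit_lock(), and the model_ functions are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types. */
enum arm_pmu_type { ARM_PMU_DEVICE_CPU = 0, ARM_NUM_PMU_DEVICES };
struct platform_device { const char *name; };

static struct platform_device *pmu_devices[ARM_NUM_PMU_DEVICES];
static unsigned long pmu_lock;

/* Model of platform registration: remember the device for later lookup. */
void model_register_pmu(enum arm_pmu_type device, struct platform_device *pdev)
{
    pmu_devices[device] = pdev;
}

/* Model of reserve_pmu(): returns 0 and fills *out, or a negative errno. */
int model_reserve_pmu(enum arm_pmu_type device, struct platform_device **out)
{
    unsigned long mask = 1UL << device;

    assert(out != NULL);
    if (pmu_lock & mask)
        return -EBUSY;          /* someone else already holds the PMU */
    pmu_lock |= mask;
    if (pmu_devices[device] == NULL) {
        pmu_lock &= ~mask;      /* release the lock again */
        return -ENODEV;         /* nothing was ever registered */
    }
    *out = pmu_devices[device];
    return 0;
}
```

Calling model_reserve_pmu() before any model_register_pmu() call yields -ENODEV, which is the situation on this phone. After a registration the first reservation succeeds and a second one yields -EBUSY.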

    static void omap_init_pmu(void)
    {
        if (cpu_is_omap24xx())
            omap_pmu_device.resource = &omap2_pmu_resource;
        else if (cpu_is_omap34xx())
            omap_pmu_device.resource = &omap3_pmu_resource;
        else
            return;

        platform_device_register(&omap_pmu_device);
    }

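The dmesg output shows that the CPU in the Galaxy Nexus is an OMAP 4460. According to the source, when the CPU is a 4460 no resource is assigned and no device is registered at all. The resources for omap24xx and omap34xx are defined as follows: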

    static struct resource omap2_pmu_resource = {
        .start = 3,
        .end   = 3,
        .flags = IORESOURCE_IRQ,
    };

    static struct resource omap3_pmu_resource = {
        .start = INT_34XX_BENCH_MPU_EMUL,
        .end   = INT_34XX_BENCH_MPU_EMUL,
        .flags = IORESOURCE_IRQ,
    };

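As can be seen, the resource is an interrupt number. So what is the PMU interrupt number on the OMAP 4460? The OMAP 4460 has two cores (Cortex-A9 MPCore), each core has its own PMU, and each PMU has its own interrupt, so there should be two interrupt numbers. Searching online for OMAP 4460 and PMU gives these two interrupt numbers:
54 + OMAP44XX_IRQ_GIC_START
55 + OMAP44XX_IRQ_GIC_START
In omap/arch/arm/mach-omap2/omap_hwmod_44xx_data.c: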
#define OMAP44XX_IRQ_GIC_START 32
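The arithmetic behind the two interrupt numbers is simple enough to write down as a short sketch. The helper omap4_pmu_irq and the MODEL_ constants are hypothetical names introduced here for illustration; only OMAP44XX_IRQ_GIC_START and the offsets 54 and 55 come from the text above.

```c
#include <assert.h>

#define OMAP44XX_IRQ_GIC_START 32  /* value quoted from omap_hwmod_44xx_data.c */
#define MODEL_PMU_IRQ_OFFSET   54  /* offset of the first core's PMU interrupt */
#define MODEL_NUM_CORES        2   /* dual-core Cortex-A9 MPCore */

/* Hypothetical helper: Linux IRQ number of the PMU belonging to one core. */
int omap4_pmu_irq(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NUM_CORES);
    return MODEL_PMU_IRQ_OFFSET + cpu + OMAP44XX_IRQ_GIC_START;
}
```

Here omap4_pmu_irq(0) evaluates to 54 + 32 = 86 and omap4_pmu_irq(1) to 55 + 32 = 87.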
So the two interrupt numbers are 86 and 87. The modified omap/arch/arm/mach-omap2/devices.c then looks like this:

    static struct resource omap2_pmu_resource = {
        .start = 3,
        .end   = 3,
        .flags = IORESOURCE_IRQ,
    };

    static struct resource omap3_pmu_resource = {
        .start = INT_34XX_BENCH_MPU_EMUL,
        .end   = INT_34XX_BENCH_MPU_EMUL,
        .flags = IORESOURCE_IRQ,
    };

    static struct resource omap446x_pmu_resource = {
        .start = 86,
        .end   = 87,
        .flags = IORESOURCE_IRQ,
    };

    static struct platform_device omap_pmu_device = {
        .name          = "arm-pmu",
        .id            = ARM_PMU_DEVICE_CPU,
        .num_resources = 1,
    };

    static void omap_init_pmu(void)
    {
        if (cpu_is_omap24xx())
            omap_pmu_device.resource = &omap2_pmu_resource;
        else if (cpu_is_omap34xx())
            omap_pmu_device.resource = &omap3_pmu_resource;
        else if (cpu_is_omap446x())
            omap_pmu_device.resource = &omap446x_pmu_resource;
        else
            return;

        platform_device_register(&omap_pmu_device);
    }

Or, as git diff output:

    diff --git a/arch/arm/mach-omap2/devices.c b/arch/arm/mach-omap2/devices.c
    index cf7a0ba..fce5cbc 100644
    --- a/arch/arm/mach-omap2/devices.c
    +++ b/arch/arm/mach-omap2/devices.c
    @@ -577,6 +577,12 @@ static struct resource omap3_pmu_resource = {
         .flags = IORESOURCE_IRQ,
     };
     
    +static struct resource omap446x_pmu_resource = {
    +    .start = 86,
    +    .end = 87,
    +    .flags = IORESOURCE_IRQ,
    +};
    +
     static struct platform_device omap_pmu_device = {
         .name = "arm-pmu",
         .id = ARM_PMU_DEVICE_CPU,
    @@ -589,6 +595,8 @@ static void omap_init_pmu(void)
             omap_pmu_device.resource = &omap2_pmu_resource;
         else if (cpu_is_omap34xx())
             omap_pmu_device.resource = &omap3_pmu_resource;
    +    else if (cpu_is_omap446x())
    +        omap_pmu_device.resource = &omap446x_pmu_resource;
         else
             return;

After rebuilding the kernel and flashing it to the phone, run perf stat ls:

    hzh@fangtian:~/android/omap$ adb shell /data/bin/perf stat ls /sdcard
    /sdcard

     Performance counter stats for 'ls /sdcard':

              9.735107 task-clock                #    0.761 CPUs utilized
                     7 context-switches          #    0.001 M/sec
                     0 CPU-migrations            #    0.000 M/sec
                   127 page-faults               #    0.013 M/sec
               3351924 cycles                    #    0.344 GHz
                     0 stalled-cycles-frontend   #    0.00% frontend cycles idle
                     0 stalled-cycles-backend    #    0.00% backend cycles idle
                     0 instructions              #    0.00  insns per cycle
                     0 branches                  #    0.000 M/sec
                     0 branch-misses             #    0.00% of all branches

           0.012786865 seconds time elapsed
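The figures in the right-hand column can be recomputed from the raw counts, which is a quick sanity check that the cycles counter now returns plausible data. The sketch below assumes task-clock is reported in milliseconds; the function names are made up for illustration and are not taken from perf.

```c
#include <assert.h>

/* Clock rate in GHz: cycles divided by the task-clock time. */
double cycles_to_ghz(unsigned long long cycles, double task_clock_ms)
{
    assert(task_clock_ms > 0.0);
    return (double)cycles / (task_clock_ms * 1e6);
}

/* Event rate in millions of events per second of task-clock time. */
double events_to_mega_per_sec(unsigned long long count, double task_clock_ms)
{
    assert(task_clock_ms > 0.0);
    return (double)count / (task_clock_ms * 1e3);
}

/* Fraction of the elapsed wall-clock time the task spent on a CPU. */
double cpus_utilized(double task_clock_ms, double elapsed_sec)
{
    assert(elapsed_sec > 0.0);
    return task_clock_ms / (elapsed_sec * 1e3);
}
```

With the values above, cycles_to_ghz(3351924, 9.735107) comes out at roughly 0.344, events_to_mega_per_sec(127, 9.735107) at roughly 0.013, and cpus_utilized(9.735107, 0.012786865) at roughly 0.761, in line with the reported output.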