oprofiling ffh264

Recently I got some inquiries about h264 and AltiVec: just testing the decode time had been disappointing for one user.

I ran the same test, and on my G4 1.6 I got about double the speed he experienced on his G5 2.4.

time nice --20 ./ffmpeg -i ~ryan/bluesky_HD_CAVLC_JM93_217f.264 -f rawvideo - > /dev/null
real 0m47.685s
user 0m44.304s
sys 0m3.220s

cat /proc/cpuinfo
processor : 0
cpu : ppc970, altivec supported
clock : 2400.000000MHz
revision : 4.0 (pvr 0070 0400)

time nice --20 ./ffmpeg -i /tmp/bluesky_HD_CAVLC_JM93_217f.264 -f rawvideo - > /dev/null
real 0m25.877s
user 0m23.768s
sys 0m1.904s

cat /proc/cpuinfo
processor : 0
cpu : 7447A, altivec supported
clock : 1666.666000MHz
revision : 0.5 (pvr 8003 0105)

The ffmpeg code is the same, and I hadn’t used anything but the stock CFLAGS; same for him.
I was expecting quite a different result. Time to hunt for the slow gear!
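To put a number on "about double", here is a small sketch that divides the two "real" timings above and then scales them by the clock speeds reported in /proc/cpuinfo. The variable names are mine, and treating wall time times clock as "cycles spent" is only a rough approximation.

```shell
# Timings ("real", in seconds) and clocks (MHz) copied from the two runs above.
g5_real=47.685
g4_real=25.877
g5_mhz=2400.000000
g4_mhz=1666.666000

# How many times longer the G5 run took in wall-clock time.
wall_ratio=$(awk -v a="$g5_real" -v b="$g4_real" \
    'BEGIN { printf "%.2f", a / b }')

# Same comparison in clock cycles (wall time * clock), a rough estimate only.
cycle_ratio=$(awk -v a="$g5_real" -v b="$g4_real" -v fa="$g5_mhz" -v fb="$g4_mhz" \
    'BEGIN { printf "%.2f", (a * fa) / (b * fb) }')

echo "G5 wall time / G4 wall time: $wall_ratio"
echo "G5 cycles / G4 cycles:       $cycle_ratio"
```

So the G5 needed roughly 1.84 times the wall time, and, given its higher clock, about 2.65 times the cycles for the same file.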

I used oprofile.

I just started it before the ffmpeg call and stopped it afterwards, then asked opreport to compute some statistics about symbols.
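A session like that could look roughly like the sketch below. It assumes the legacy opcontrol front end (newer oprofile releases use operf instead), that the daemon has already been configured once (kernel image or no-vmlinux setting), and that you have the privileges to run it. The helper name profile_run and the OPCONTROL/OPREPORT override variables are made up for this example.

```shell
# Hypothetical helper: profile one command with the legacy opcontrol daemon.
# OPCONTROL / OPREPORT may be overridden, e.g. OPCONTROL="sudo opcontrol".
profile_run() {
    ${OPCONTROL:-opcontrol} --reset   # discard samples from earlier sessions
    ${OPCONTROL:-opcontrol} --start   # start the daemon and begin sampling
    "$@" > /dev/null                  # the workload; its stdout (raw frames) is discarded
    status=$?
    ${OPCONTROL:-opcontrol} --stop    # stop sampling
    ${OPREPORT:-opreport} --symbols   # per-symbol breakdown of the samples
    return $status
}
```

Usage would be something like `profile_run ./ffmpeg -i /tmp/bluesky_HD_CAVLC_JM93_217f.264 -f rawvideo -`. Since oprofile samples the whole system, the report also includes libc and kernel symbols, not just ffmpeg.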

An excerpt:

CPU: PowerPC G4, speed 1666.67 MHz (estimated)
Counted CYCLES events (Cycles) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name   symbol name
60355    23.2602  libc-2.4.so  _wordcopy_fwd_aligned
13572    5.2305   ffmpeg_g     put_h264_chroma_mc8_altivec
13417    5.1708   ffmpeg_g     filter_mb
11379    4.3853   ffmpeg_g     put_h264_qpel16_h_lowpass_altivec
9700     3.7383   ffmpeg_g     fill_caches
9332     3.5965   ffmpeg_g     hl_decode_mb
8201     3.1606   vmlinux      __flush_dcache_icache
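To get a feel for what those sample counts mean in time, here is a rough conversion. With "count 100000" each sample stands for about 100000 cycles, so seconds are roughly samples * count / clock. The helper name est_seconds is mine, and the result is only an estimate since the reported CPU speed is itself estimated.

```shell
# Values copied from the opreport header above.
mhz=1666.67      # estimated CPU speed in MHz
count=100000     # cycles between two samples

# Rough conversion: seconds ~= samples * count / (MHz * 1e6)
est_seconds() {
    awk -v s="$1" -v c="$count" -v f="$mhz" \
        'BEGIN { printf "%.2f", s * c / (f * 1e6) }'
}

copy_seconds=$(est_seconds 60355)     # _wordcopy_fwd_aligned
chroma_seconds=$(est_seconds 13572)   # put_h264_chroma_mc8_altivec

echo "_wordcopy_fwd_aligned:       ~${copy_seconds}s"
echo "put_h264_chroma_mc8_altivec: ~${chroma_seconds}s"
```

That puts the libc copy routine at about 3.6 seconds, against about 0.8 seconds for the hottest ffmpeg function.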

Looks like I’ll have to replace something… or start thinking about an optimized glibc…
(Mine is built targeting my CPU and is pretty recent; I wonder whether the G5 is running an older or generically built glibc…)
