In >=portage-2.2_rc2 there are a few new emerge options that many Gentoo users will probably be interested in:
--jobs JOBS
    Specifies the number of packages to build simultaneously. Also see the related --load-average option.

--keep-going
    Continue as much as possible after an error. When an error occurs, dependencies are recalculated for remaining packages and any with unsatisfied dependencies are automatically dropped. Also see the related --skipfirst option.

--load-average LOAD
    Specifies that no new builds should be started if there are other builds running and the load average is at least LOAD (a floating-point number). This option is recommended for use in combination with --jobs in order to avoid excess load. See make(1) for information about analogous options that should be configured via MAKEOPTS in make.conf(5).
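To make the --jobs / --load-average rule concrete, here is a rough shell sketch of the decision as the option descriptions word it. This is not emerge's actual code; the function name may_start_job and its arguments are made up for illustration, and it assumes a Linux-style /proc/loadavg.

```shell
# Hypothetical helper, not part of portage.
# Usage: may_start_job RUNNING JOBS LOAD [LOADAVG_FILE]
# Returns 0 if another build may be started, 1 otherwise.
may_start_job() {
    running=$1
    max_jobs=$2
    max_load=$3
    loadavg_file=${4:-/proc/loadavg}

    # Never exceed the --jobs limit.
    [ "$running" -lt "$max_jobs" ] || return 1

    # The load limit only applies while other builds are running,
    # so an idle scheduler can always start one build.
    [ "$running" -eq 0 ] && return 0

    # The first field of /proc/loadavg is the 1-minute load average.
    read -r load1 rest < "$loadavg_file"

    # Load averages are floating-point, so compare them with awk.
    awk -v l="$load1" -v m="$max_load" 'BEGIN { exit !(l < m) }'
}
```

For example, `may_start_job 2 4 4.0` succeeds only while fewer than four builds are running and the 1-minute load average is below 4.0.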
Here is some sample parallel build output from a catalyst stage2 build, with emerge’s new --jobs option enabled:
>>> Building (1 of 10) sys-devel/gettext-0.17 for /
>>> Building (2 of 10) sys-libs/zlib-1.2.3-r1 for /
>>> Building (3 of 10) virtual/libintl-0 for /
>>> Building (4 of 10) dev-util/unifdef-1.20 for /
>>> Installing virtual/libintl-0 to /
>>> Installing dev-util/unifdef-1.20 to /
>>> Building (5 of 10) sys-kernel/linux-headers-2.6.23-r3 for /
>>> Installing sys-libs/zlib-1.2.3-r1 to /
>>> Jobs: 3 of 10 complete, 2 running            Load avg: 3.44, 1.46, 0.69
Niiiiiiiiiiice.
Cool stuff, especially the load average option.
(Btw, the email checking seems to be broken)
Thank you! I haven’t done any extensive testing, but when using the parallel building feature, system resources actually seemed to be used more evenly on multicore machines, resulting in better responsiveness of the system…
One thing I ran into with rc2 (before rc3 came out) was rather high memory consumption (it peaked at hundreds of MBs) when compiling inkscape. I didn’t notice this with any other package. Is this common?
If you’re referring to the memory consumption of emerge itself then see bug 229069. It should be better now than it was in rc1. If one of the ebuild subprocesses is doing it then it depends on the specific process. You can use pstree to find the ebuild subprocesses, and ps aux to view the memory consumption of those processes.
This is great news, just parallelizing src_unpack() and configure in src_compile().econf() is awesome on multicore, especially combined with raid storage.
Just a question: this stuff is totally separate from MAKEOPTS="-j?", right?
Yes, MAKEOPTS is completely separate. The --jobs, --keep-going, and --load-average emerge options are analogous to the ones provided by make. They have the same meaning for both tools, but they are controlled separately. The make options are set via MAKEOPTS, while the emerge options are controlled via the emerge command line or via an EMERGE_DEFAULT_OPTS setting in /etc/make.conf.
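As a rough illustration of the two separate knobs, a make.conf fragment could look something like this. The numbers are only assumptions for a hypothetical 4-core box, not recommendations, so adjust them for your own hardware.

```shell
# /etc/make.conf (fragment); all values assume a hypothetical 4-core machine

# Passed to make inside each individual package build.
MAKEOPTS="-j4 -l4"

# Added to every emerge invocation: build up to 2 packages at once,
# start no new builds while the load average is 4.0 or higher,
# and carry on with the remaining packages after a failure.
EMERGE_DEFAULT_OPTS="--jobs=2 --load-average=4.0 --keep-going"
```

The same emerge options can also be given per run on the command line instead of being set globally.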
I used
watch -n 5 'ps aux | grep emerge'
watch -n 5 'ps aux | grep cc1plus'
to find out what’s going on. When comparing the previous rcs to rc3, emerge uses less memory. Rc3 started at about 130M when resolving deps and fell to about 20M during compilation. The culprit is cc1plus … I’ve seen memory usage of up to 450M (wow) … is that normal or is something wrong with gcc4.1.2 on amd64?
nice, thanks a lot !
Well, gcc is known to consume lots of memory in some cases. It depends on which package is being built, the combination of USE flags, and the profile. You can try asking people on the forums or the gentoo-user mailing list to see what results they get for specific combinations of packages, USE flags, and profiles.
Wow! I’ve been waiting for THIS one for a while! =8^)
I’ve been handling it manually, using --pretend to detect and manage sequence dependencies, for a while (up to 9-ish konsole windows open and merging at a time), with MAKEOPTS="-j -l15" or so, keeping load average ~16. Given a dual-dual-core Opteron 290, with 8 gigs RAM and PORTAGE_TMPDIR pointed at a tmpfs to keep I/O down and compile speeds up, with PORTDIR on a 4-spindle kernel/mdp RAID-0 and the system on the same 4-spindles as kernel/mdp RAID-6, and with swap similarly striped, 4 gigs to each spindle so 16 gigs total, it works great! =8^)
In fact, with a recent kernel and per-user scheduling active, I can even keep streaming media going, with a small (800×600 or so) visualization window, and get reasonable system responsiveness, no audio skips, and very tolerable visualization skips. Further, I can do that up to a load average of hundreds (the largest I’ve been able to get, it could surely handle more), as long as the memory usage doesn’t roll over into swap more than a gig or so — the -l15 has thus been more an indirect way of controlling memory usage, than the more direct load average it actually controls.
Anyway, I’m seriously looking forward to this as it’ll seriously reduce the manual hassle. Also, I’m looking at an Acer Aspire One (Atom CPU), on which I expect I’ll load up Gentoo as well, but doing all the compiling in a 32-bit chroot on my main system, then using emerge --usepkgonly on the Acer. So I’ll soon be doing twice the maintenance compiling (except I probably won’t update it as often, maybe every couple weeks or every month) and will have the initial setup and rebuild to do for it as well. And back on my main system, I’ll probably try KDE4 again with 4.1 (4.0 wasn’t just beta, it was very early half developed concept demo/preview, 4.1 will hopefully be beta, and just my thing), so there’s some more compiling to try it out on! =8^)
So indeed, this automated parallel emerge handling couldn’t come at a better time! =8^)
@Pavel: It’s normal for a few packages, particularly C++ packages, while using certain CFLAGS. KDE’s kmail package (or kdepim if you’re using monolithic) is infamous for this if USE=kdeenablefinal is set — it uses, or used to anyway, 1.3 GB of memory for a single compile thread (plus the tmpfs I’m compiling it in) at one point!
Thus, while as I already posted I’ve been running parallel merges manually for some time, I always make it a point to do kmail alone or close to it.
@ zmedico: Is there a package.noparallel file or other way to tell portage not to parallel-emerge particular packages? Is it possible to do something for memory usage similar to what is done with load average — that is, a switch to limit it, not starting more jobs if usage is greater than X megs/gigs, say?
Wow, these are great new features!!! I especially love the keep-going option
I am afraid though that it may take a while to reach stable (x86)
@Duncan: thanks for the info, i wasn’t aware since i usually build things overnight … it was somewhat scary to see the box swapping only due to compiling something…
@zmedico: thanks for the info … actually Duncan’s suggested ‘package.noparallel’ might be a meaningful feature after 2.2 stabilizes (i.e. as a list of notorious packages as part of the Gentoo profiles?). Does rc3 still have the issues with preserved-rebuild that Diego described in his blog? Tomorrow I’ll be setting up a production box and am really itching to use 2.2
Nothing like that has been implemented yet. It seems like a more dynamic and self maintaining approach, similar to –load-average, is a lot more desirable than some static setting that would require maintenance.
That seems like a good possibility. We’ll need some kind of memory load average measurement that we can use in an analogous fashion to the cpu utilization load average. If you see anything like that then please let me know.
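Just to sketch what a check like that might look like outside of portage, here is a hypothetical shell helper (the name mem_ok is made up, and nothing like it exists in emerge). It assumes a Linux kernel that reports a MemAvailable line in /proc/meminfo, which older kernels do not have.

```shell
# Hypothetical helper, not part of portage.
# Usage: mem_ok MIN_MIB [MEMINFO_FILE]
# Returns 0 only if at least MIN_MIB MiB of memory is reported as available.
mem_ok() {
    min_mib=$1
    meminfo=${2:-/proc/meminfo}

    # MemAvailable is reported in kB; the field is missing on older kernels.
    avail_kib=$(awk '/^MemAvailable:/ { print $2 }' "$meminfo")

    # Fail closed if the field could not be read.
    [ -n "$avail_kib" ] && [ $((avail_kib / 1024)) -ge "$min_mib" ]
}
```

A wrapper script could call something like `mem_ok 2048` before kicking off the next build and wait otherwise. Note that this is a single point-in-time reading, not an average over time.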
Yes, preserve-libs still has lots of remaining issues to solve. However, you can set FEATURES=-preserve-libs in /etc/make.conf and then you won’t have to worry about any of those issues. With the preserve-libs feature disabled, it will behave just like older versions of portage and you’ll have to use revdep-rebuild as usual.
i love this damn portage..
rocks!
Nice work, Zac! My only complaint/comment is that there are too many spaces in the window title, thus making the Konsole tab much wider than it needs to be and limiting the number of tabs that can be displayed.
ok, used this for a few days and i like it ( the ionice bits that went in look interesting too !). A few comments :
– would it be possible to have a more verbose output when running with --jobs enabled ? Especially if there is a build failure, it would be nice to have the error message.
– would it be possible that portage organizes the “build-queue” so that you don
It’s already supposed to display the die message. If you don’t happen to have PORTAGE_ELOG_SYSTEM=”echo” enabled then it displays just the eerror messages which should include the die message. The die message includes the path to the log file so that it’s easy to locate.
In >=portage-2.2_rc4, the entire build log will also be displayed if there is only a single build failure (displaying the whole log isn’t really useful if there happen to be multiple failures).
thanks for your input !
1- ok, will check this, i was quite sure there was already such feature but didn
well, yes, but my point is that if you schedule qt to build in parallel with other smaller package(s) and you build ooo once qt is done there will probably not be such big overload.
will try, thanks.
Don
When the goal is to maximize the amount of work done in a given amount of time, it may be advantageous to have a slight over-abundance of jobs that are ready and waiting to execute as soon as the load average drops low enough for a new job to be scheduled.
Not having a lot of time and not having interest are two different things. We get lots of feature requests and the list is really quite long. We might get to that eventually but there are lots of other things with higher priority at this time. Obviously we don’t always have time to implement every person’s request in a timely manner.
i know and obviously i don
I have been using these great new features on a multi-core machine over the last few days, but it seems that portage has problems
a) after --resume, and
b) even after --keep-going resumes
since I do not see any parallel emerges anymore.
In fact, IIRC after --resume the compressed output of parallel emerges would not even appear, so the new options probably did not survive the restart.
But it’s a great feature; in the past I have always tried to mimic that manually, but that would be a lot of work, far from perfect, and sometimes compile some common dependencies more than once if not done carefully.
Thus, I am looking forward to seeing this feature become more mature.
I can confirm that --resume does not properly preserve --jobs and --load-average options, so I will have that fixed in the next release (that will be 2.2_rc7).
I suspect that the --keep-going behavior that you’ve observed is due to the topology of the dependency graph, and the following algorithm:
We might consider implementing more aggressive parallelization algorithms in the future. However, those who would consider choosing a more aggressive algorithm should ask themselves why they are in such a hurry. A more aggressive algorithm is inherently less optimal and therefore it only makes sense when there is some kind of deadline to meet. Even with a deadline, there are other techniques such as ccache and distcc which can help improve performance without sacrificing optimal build order.
Now I’ve found a way to make the parallelization algorithm more aggressive in some cases, without making the build order any less optimal. The patch is in svn r11405 and it will be included in >=sys-apps/portage-2.2_rc9.