ulatencyd and linux audio

ulatencyd and linux audio

Daniel Poelzleithner-3
Hi,

I'm working on a userspace scheduling optimizing daemon called ulatencyd.

https://github.com/poelzi/ulatencyd

I have seen some postings to the kernel list about the problems you are
having with realtime attributes, cgroups etc.

As I'm not an audio guy, nor do I use RT, I don't have much experience
with that, but I just added an API to set RT classes in ulatencyd.

If someone is interested, ulatencyd could easily be taught to set realtime
attributes on jackd and all the needed subprocesses.

Please CC me, I'm not on the jack-devel list.

kind regards
 daniel






_______________________________________________
Jack-Devel mailing list
[hidden email]
http://lists.jackaudio.org/listinfo.cgi/jack-devel-jackaudio.org


Re: ulatencyd and linux audio

Paul Davis
On Sat, Jan 8, 2011 at 4:20 PM, Daniel Poelzleithner <[hidden email]> wrote:
> Hi,
>
> I'm working on a userspace scheduling optimizing daemon called ulatencyd.
>
> https://github.com/poelzi/ulatencyd

from reading the README, this seems as if it has rather broad goals.
our needs tend to be "this app is running with SCHED_FIFO, so put it
on the CPU as soon as it's ready, and don't even think of taking it off
until it blocks". we generally don't believe that any conventional
time-sharing scheduling class (i.e. anything without reservation or
deadline scheduling) can do the job required, and reservation or
deadline scheduling is problematic for the types of applications we
tend to use (not impossible, just more difficult to provide the right
information to the scheduler).

is there anything in your work that would apply to this sort of thing?

Re: ulatencyd and linux audio

Daniel Poelzleithner-3
On 01/08/2011 10:25 PM, Paul Davis wrote:

> from reading the README, this seems as if it has rather broad goals.
> our needs tend to be "this app is running with SCHED_FIFO, so put it
> on the CPU as soon as it's ready, and don't even think of taking it off
> until it blocks". we generally don't believe that any conventional
> time-sharing scheduling class (i.e. anything without reservation or
> deadline scheduling) can do the job required, and reservation or
> deadline scheduling is problematic for the types of applications we
> tend to use (not impossible, just more difficult to provide the right
> information to the scheduler).
>
> is there anything in your work that would apply to this sort of thing?
If Linux has any parameters that can be set on jackd and the required
processes so that Linux handles them the way you need, ulatencyd can
set them for you ;-)
That's what it does. It is not a scheduler itself, but rather a Linux
scheduler optimizer. If none of the current realtime classes fits
your needs, it can't do anything for you.

The only idea I have is to build a cgroup that has an exclusive CPU core
for its work, and maybe assign it one of the available realtime classes.
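
A rough sketch of that idea in Python, for illustration only. It pins a
process to one core using CPU affinity (a simpler stand-in for a cgroup
cpuset, and not an exclusive one) and then requests SCHED_FIFO. The helper
name pin_and_make_fifo and its arguments are made up for this example; it
is not ulatencyd's API. Setting a realtime policy normally needs root or
CAP_SYS_NICE, so the helper reports failure instead of raising.

```python
import os


def pin_and_make_fifo(pid, cpu, priority):
    """Pin `pid` to one CPU and request SCHED_FIFO (pid 0 means the caller).

    Hypothetical helper for illustration. Returns (ok, message).
    """
    try:
        # Restrict the target to a single core.
        os.sched_setaffinity(pid, {cpu})
        # Ask for the FIFO realtime class at the given static priority.
        os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
    except (OSError, AttributeError) as exc:
        # OSError: missing privileges or an invalid CPU number.
        # AttributeError: the platform lacks these Linux scheduling calls.
        return False, str(exc)
    return True, "ok"
```

On Linux these calls act on a single thread ID, so a multi-threaded
program such as jackd would presumably need the same treatment for each
of its realtime threads.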

As I said, I don't have much experience in that field :-)

kind regards
 Daniel






Re: ulatencyd and linux audio

Tommaso Cucinotta-2
In reply to this post by Paul Davis
On 08/01/2011 22:25, Paul Davis wrote:
> and reservation or
> deadline scheduling is problematic for the types of applications we
> tend to use (not impossible, just more difficult to provide the right
> information to the scheduler).
Hi Paul,

can you elaborate a bit on this difficulty ?

I don't know whether you noticed my post on jack-devel about
our attempt to use deadline scheduling with Jack (I'm forwarding
that message). I have seen a few comments on it so far
(mostly concerning RT scheduling on Mac OS X), and I'd
still be interested in getting tough technical comments on the
usefulness (or wastefulness) of the approach.

Also, if you were looking for better kernel support for CPU
scheduling than what you have now with SCHED_FIFO and/or
what we suggested with deadline scheduling, what would
you be looking for?

Thanks, bye,

     T.

--
Tommaso Cucinotta, Computer Engineering PhD, Researcher
ReTiS Lab, Scuola Superiore Sant'Anna, Pisa, Italy
Tel +39 050 882 024, Fax +39 050 882 003
http://retis.sssup.it/people/tommaso


Hi all,

I'd like to mention that we made some experiments with Jack running
under a deadline-based (CPU) scheduling policy we developed for the
Linux kernel at our lab. The key points of this work are the following:

Briefly, the scheduler capabilities (http://lwn.net/Articles/398470/)
are the following:
1) the scheduler allows for specifying exactly what a task (or set of
tasks) needs in terms of CPU time and the time granularity over which it
is granted: for example, I need 5ms of execution time (a.k.a. budget)
in every 20ms time window (a.k.a. period) on this CPU;
2) each real-time task (set) may specify its own requirements
independently of each other (periods may be heterogeneous); the kernel
performs an admission control check in order to guarantee the new
request does not disrupt the guarantees committed to the already
admitted requests;
3) you can attach to a given reservation one or more *tasks*, i.e.,
either multiple processes, or multiple threads belonging to a process,
or even multiple threads taken arbitrarily from different processes;
4) the CPU reservation parameters may be varied dynamically during
execution of the (re)served tasks.
5) last, but not least, we are trying to push deadline scheduling into
the Linux kernel (and we presented it during the last Kernel Summit and
Linux Plumbers Conference in Boston - http://lwn.net/Articles/412745/),
but of course this is a
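
As an illustration of the admission control check in point 2, here is a
minimal Python sketch. It assumes the textbook single-CPU utilization
test (the sum of budget/period must not exceed the CPU capacity); the
message above does not say which test the kernel patch actually uses, and
the function name admit and its parameters are invented for this example.

```python
def admit(reservations, new, capacity=1.0):
    """Return True if the `new` reservation fits next to the admitted ones.

    Each reservation is a (budget_ms, period_ms) pair. Assumption: a plain
    utilization bound on one CPU, which may differ from the real scheduler.
    """
    used = sum(budget / period for budget, period in reservations)
    budget, period = new
    return used + budget / period <= capacity


# One task already holds 5ms every 20ms (25% of the CPU).
admitted = [(5, 20)]
print(admit(admitted, (10, 20)))  # asks for 50% more, total 75%
print(admit(admitted, (16, 20)))  # asks for 80% more, total 105%
```

The first request is accepted and the second is rejected, because
accepting it would break the guarantee already given to the first task.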

Briefly, the modifications we performed to Jack are the following:
A) we modified the Jack framework so that jackd creates a reservation on
the CPU for the whole Jack processing pipeline; the reservation period
is the Jack audio processing period, while for the budget see below;
B) we encapsulated into the real-time reservation all the Jack client
threads belonging to the various Jack client processes;
C) we added to Jack a "feedback-scheduling" control loop, by which Jack
measures the CPU time needed at each cycle, and it dynamically adjusts
the CPU reservation into the kernel accordingly;
D) we modified the Jack daemon and Jack library, but no changes are
required to the applications so far, in order to have them exploit the
new scheduling policy.

For whoever might be interested, this work was carried out as the master's
thesis of Giacomo Bagnoli under my supervision, and it can be
found here (in English):

   
http://etd.adm.unipi.it/theses/available/etd-01252010-121333/unrestricted/realtime_low_latency_audio_on_linux.pdf

Also, any comments from developers on this list are very welcome and
appreciated, whether on the possibility that such work might turn out to
be concretely useful and usable, or on any issues you would like
to point out. Let me stress that the key point here is the
following: Jack knows what its CPU requirements are, so it can
submit them to the kernel and obtain strong computing guarantees. Other
real-time applications with the same awareness may do the same, and all
of them can be scheduled without any need for the (final) user to
follow recipes for fine-tuning priorities etc.

Should you have any doubts, comments, suggestions, objections, strong
critiques and whatever else on the topic, don't hesitate to write me.

Thanks and regards. Bye,

     Tommaso

--
Tommaso Cucinotta, Computer Engineering PhD, Researcher
ReTiS Lab, Scuola Superiore Sant'Anna, Pisa, Italy
Tel +39 050 882 024, Fax +39 050 882 003
http://retis.sssup.it/people/tommaso


Re: ulatencyd and linux audio

Paul Davis
On Mon, Jan 10, 2011 at 8:52 AM, Tommaso Cucinotta
<[hidden email]> wrote:

> On 08/01/2011 22:25, Paul Davis wrote:
>>
>> and reservation or
>> deadline scheduling is problematic for the types of applications we
>> tend to use (not impossible, just more difficult to provide the right
>> information to the scheduler).
>
> Hi Paul,
>
> can you elaborate a bit on this difficulty ?

there are a couple of issues. the first is that a typical JACK graph
is always composed of more than one process, and so any scheduling
that is aimed at helping JACK has to work for a set of processes -
somewhat like a cgroup in fact.

deadline scheduling is relatively easy in the abstract - we have a
fixed time interval until the next time the audio interface needs more
data, and we just need to make sure that everything happens by that
time. but ... what is everything? if the clients include processing in
other threads, it's not possible for a kernel scheduler to know if the
work done by those threads is part of the deadline or not, and even if
it is, the other threads may see significant fluctuations in workload
from time to time (e.g. caused by buffering data in order to fill an
FFT window).

resource reservation is also easy in the abstract - JACK can estimate
how long clients are currently taking to process all audio for a given
"cycle", and ask the kernel to ensure that it and the clients (by
thread ID) get the required time on the CPU. but things can get a bit
weird when anything out of the ordinary happens. denormals are one
example. the FFT windows just mentioned are another. you could also
get situations where the sudden addition of a very processor-hungry
node to the JACK graph (e.g. a very CPU-intensive plugin gets added to
ardour, or is started as a standalone JACK client) would make the
current reservation request (presumably low-pass filtered) inaccurate
for a short time as the heuristic catches up, possibly leading to
degradation in audio quality.
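
To illustrate the lag described in the last point, here is a small Python
sketch. It assumes the reservation follows a simple first-order low-pass
filter (an exponential moving average) of the measured per-cycle CPU
time; the filter form, the alpha value and the numbers are assumptions
chosen for the example, not anything taken from JACK.

```python
def low_pass(samples, alpha):
    """Exponential moving average of per-cycle CPU times (ms)."""
    estimate = samples[0]
    history = []
    for sample in samples:
        # Move a fraction `alpha` of the way towards the new measurement.
        estimate += alpha * (sample - estimate)
        history.append(estimate)
    return history


# Five quiet cycles at 2ms, then a CPU-hungry client pushes the load to 6ms.
load = [2.0] * 5 + [6.0] * 5
estimates = low_pass(load, alpha=0.5)

# Cycles in which the filtered estimate is below what was actually needed.
short = [i for i, (e, s) in enumerate(zip(estimates, load)) if e < s]
print(estimates)
print(short)
```

With these numbers, every cycle after the step is under-reserved while the
estimate climbs towards the new load.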

> I don't know whether you noticed my post on jack-devel about
> our attempt to use deadline scheduling with Jack (I'm forwarding
> that message). I have to say I saw a few comments on that as
> of now (mostly concerning RT scheduling on Mac OS-X), and I'd
> be still interested in getting tough technical comments on the
> (usefulness or wastefulness of the) approach.
>
> Also, should you search for any better support to CPU scheduling
> by the kernel (than what you have now with SCHED_FIFO, and/or
> what we suggested with deadline scheduling), then what would
> you be looking for ?

I did work on scheduling about 16 years ago, and as time has elapsed
since then, I've become more and more convinced that all I want from a
kernel are the following:

   1) properly implemented SCHED_FIFO (i.e. absolute, unalterable
priority over any other scheduling class)
   2) scheduler activations
   3) a way to hand over the CPU to a designated other process and
block on a timer

if you don't know what scheduler activations are, a quick google
should reveal all. i'd be happy to answer any questions on it.

Re: ulatencyd and linux audio

Tommaso Cucinotta-2
On 10/01/2011 15:52, Paul Davis wrote:
> there are a couple of issues. the first is that a typical JACK graph
> is always composed of more than one process, and so any scheduling
> that is aimed at helping JACK has to work for a set of processes -
> somewhat like a cgroup in fact.
In fact, in the implementation we proposed, we picked only the
Jack threads from the various client processes. So only these
threads get scheduled by the deadline scheduler, while the
other threads of the client processes (e.g., the GUI parts etc.)
keep being scheduled by the default Linux scheduler.
And with our IRMOS RT scheduler, this selection of threads
is handled precisely via the cgroups subsystem :-)
>   deadline scheduling is relatively easy
> in the abstract  - we have a fixed time interval until the next time
> the audio interface needs more data, and we just need to make sure
> that everything happens by that time. but ... what is everything? if
> the clients include processing in other threads, it's not possible for
> a kernel scheduler to know if that work done by those threads is part
> of the deadline or not, and even if it is,

If Jack is aware of these other threads, then they can be added
to the group of threads to be scheduled within the next cycle. The
kernel then just adds them to the group of tasks that receive
precise and stable scheduling guarantees within a given audio
cycle. If Jack is unaware of them, i.e., there is a dependency between a
Jack client RT thread and other parts of the Jack client application
which are not running RT at all, then a sort of mixed Priority
Inheritance (PI) may be useful, for example when the RT thread is
blocked waiting for a mutex to be released by the non-RT thread.
In that case, the inheritance
would allow the SCHED_OTHER task to temporarily inherit the
properties of the deadline-scheduled task (e.g., its scheduling
guarantees) and so, for example, run before any other SCHED_OTHER
task in the system. However, the budget guaranteed
every period to the whole Jack pipeline would need to account for
the further processing time that may be needed in such cases.
That is a matter of proper monitoring... see also right below.
> the other threads may see
> significant fluctuations in workload from time to time (e.g. caused by
> buffering data in order to fill an FFT window).

Sure, multimedia processing is well known to have large
fluctuations in computation times. What we did in the modified Jack
was to observe the CPU time spent by the whole Jack pipeline in the
last period(s), then build a percentile estimator out of that, which
can also simply output the maximum computation time observed
in the last N periods.
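
A minimal Python sketch of such an estimator, under stated assumptions:
it keeps the last N observed CPU times and returns a nearest-rank
percentile, where 100 gives the maximum. The class name BudgetEstimator
and the nearest-rank rule are choices made for this example; the thesis
may compute the percentile differently.

```python
import math
from collections import deque


class BudgetEstimator:
    """Sliding-window percentile of per-cycle CPU times (hypothetical)."""

    def __init__(self, window, percentile=100.0):
        self.samples = deque(maxlen=window)  # old samples fall off the left
        self.percentile = percentile

    def observe(self, cpu_ms):
        self.samples.append(cpu_ms)

    def estimate(self):
        if not self.samples:
            raise ValueError("no samples observed yet")
        ordered = sorted(self.samples)
        # Nearest-rank percentile; percentile=100 selects the maximum.
        rank = max(1, math.ceil(self.percentile / 100.0 * len(ordered)))
        return ordered[rank - 1]


est = BudgetEstimator(window=4)
for cpu_ms in [1.0, 3.0, 2.0, 5.0, 1.5]:
    est.observe(cpu_ms)
print(est.estimate())  # maximum over the last 4 cycles
```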
However, I'm perfectly aware that there are always workload peaks
which exceed any history-based estimate. So, what is very useful
in these cases is a *soft reservation* mechanism, one in which the
Jack pipeline may run for more than the allocated budget in a given
cycle if there are still threads ready to run and there is CPU time
available. This is also a policy we support in our deadline-based
scheduler implementation on Linux (for example, in
Giacomo's thesis, the SHRUB policy realizes soft reservations).
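
The difference between a hard and a soft reservation can be sketched in a
few lines of Python. This is only a toy model of the idea described
above, not the SHRUB algorithm: the function cpu_granted and its
parameters are invented, and spare_ms stands for whatever CPU time happens
to be unused in that cycle.

```python
def cpu_granted(demand_ms, budget_ms, spare_ms, soft=True):
    """CPU time a reservation receives in one cycle (toy model)."""
    # The reserved budget is always available, up to what is needed.
    guaranteed = min(demand_ms, budget_ms)
    if not soft:
        return guaranteed
    # A soft reservation may also use idle CPU time beyond its budget.
    extra = min(demand_ms - guaranteed, spare_ms)
    return guaranteed + extra


print(cpu_granted(7, 5, 3))              # peak fully absorbed by spare time
print(cpu_granted(7, 5, 1))              # only part of the peak is covered
print(cpu_granted(7, 5, 3, soft=False))  # hard reservation stops at budget
```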

> resource reservation
> is also easy in the abstract - JACK can estimate how long clients are
> currently taking to process all audio for a given "cycle", and ask the
> kernel to ensure that it and the clients (by thread ID) get the
> required time on the CPU.
exactly
> but things can get a bit weird when anything
> out of the ordinary happens. denormals are one example. the FFT
> windows just mentioned are another. you could also get situations
> where the sudden addition of a very processor-hungry node to the JACK
> graph (e.g. a very CPU-intensive plugin gets added to ardour, or is
> started as a standalone JACK client) would make the current
> reservation request (presumably low-pass filtered) inaccurate for a
> short time as the heuristic catches up, possibly leading to
> degradation in audio quality.

You're right, we experienced all of these issues; however, we worked around
them with suitable heuristics. Let me summarize:
1) we allocate the reservation with a safety threshold over the
     history-based max-estimator
2) whenever a new Jack client is added to the graph, we react by
     preemptively bumping up the reserved budget by a configurable quantity.
     However, we plan to store somewhere the last observed computation time
     of clients, so that the second time one activates the same Jack client,
     the framework can use the previous execution(s) to work out how much
     more budget is required at the beginning. After a few cycles, the system
     adapts by observing the budget spent by the new client...

> I've become more and more convinced that all I want from a
> kernel are the following:
>
>     1) properly implemented SCHED_FIFO (i.e. absolute, unalterable
> priority over any other scheduling class)

Surely it is the most important point as long as jackd remains
the single important RT application running in the system. Think, however,
of more involved scenarios where you might also need to give scheduling
guarantees to device driver kernel threads (e.g., PREEMPT_RT) or to video
with real-time capture or, in a different context, where you might have
2 jackd instances controlling 2 independent soundcards (e.g., a workstation
being used by 2 users) with independently configured audio parameters.
>     2) scheduler activations
>     3) a way to hand over the CPU to a designated other process and
> block on a timer
>
> if you don't know what scheduler activations are, a quick google
> should reveal all. i'd be happy to answer any questions on it.

That sounds like you want more control from user space over how the
CPU is scheduled across threads. This reminds me of the efforts I saw in the
past to realize a (guess?) deadline scheduler completely at the user-space
level in Linux (by adding a small kernel-level notification mechanism). Of
course this is possible, but it may not be worth the additional overhead.
Or perhaps I didn't get exactly what Scheduler Activations are; I'm
going to read about them. In the meantime, it would be interesting to hear
from you how you plan to use such a mechanism and how you imagine it
could solve or simplify your problem.

Last, but not least, what do you think about support for SMP scheduling ?

Thanks again, I really appreciate your feedback on this.

     T.

--
Tommaso Cucinotta, Computer Engineering PhD, Researcher
ReTiS Lab, Scuola Superiore Sant'Anna, Pisa, Italy
Tel +39 050 882 024, Fax +39 050 882 003
http://retis.sssup.it/people/tommaso


Re: ulatencyd and linux audio

Paul Davis
On Mon, Jan 10, 2011 at 12:08 PM, Tommaso Cucinotta
<[hidden email]> wrote:

Just a couple of quick comments for now:

>>    1) properly implemented SCHED_FIFO (i.e. absolute, unalterable
>> priority over any other scheduling class)
>
> surely it is the most important point as far as you keep having only jackd
> running as the single important RT application in the system. Think however
> of more involved scenarios where you might have also have scheduling
> guarantees to device driver kernel threads (e.g., PREEMPT_RT),

the entire point of PREEMPT_RT from our perspective is *precisely* to
stop IRQ bottom half handlers from preempting user-space threads with
higher RT priority.

> video with
> real-time capture or,

we're audio guys. we don't care about video :))

> in a different context, you might have
> 2 jackd instances controlling 2 independent soundcards (e.g., a workstation
> being used by 2 users) with independently configured audio parameters.

i personally don't believe that this is a use case that is of any real
world interest. hardware is (relatively) cheap, and this is just not
how people do this sort of thing.

>>
>>    2) scheduler activations
>>    3) a way to hand over the CPU to a designated other process and
>> block on a timer
>>
>> if you don't know what scheduler activations are, a quick google
>> should reveal all. i'd be happy to answer any questions on it.
>
> that seems like you want more control from the user-space about how the
> CPU is scheduled across threads. This reminds me of the efforts I saw in the
> past to realize a (guess ?) deadline scheduler completely at the user-space
> level, in Linux (by adding a small kernel-level notification mechanism). Of
> course this is possible, but the additional overhead may not be worth doing
> it. Or, perhaps I didn't get exactly what Scheduler Activations are,

what you described is precisely what scheduler activations are. the
key idea is that neither the kernel nor user space has all the
information required to get scheduling done "right". so you can either
(a) move information from user space into the kernel and allow the
kernel to try to get things right, across all applications and
workloads or you can (b) move information from the kernel into user
space (e.g. "this thread just blocked on I/O") and allow userspace to
use its allocated CPU time in the best possible way given its
understanding of the app design.
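
To make option (b) a little more concrete, here is a toy simulation in
Python. It only models the flow of information: the "kernel" reports that
the running thread blocked or that a thread became runnable again, and
the user-space scheduler decides what runs next. The class and method
names are invented for this example, and nothing here reflects the actual
Mach implementation.

```python
from collections import deque


class UserScheduler:
    """Toy user-space scheduler driven by kernel notifications."""

    def __init__(self, threads):
        self.ready = deque(threads)  # runnable user threads, in app order
        self.blocked = set()
        self.running = None

    def dispatch(self):
        # Application policy lives here; this toy just takes the next one.
        self.running = self.ready.popleft() if self.ready else None
        return self.running

    def upcall_blocked(self):
        """Kernel notification: the running thread just blocked on I/O."""
        if self.running is not None:
            self.blocked.add(self.running)
        return self.dispatch()

    def upcall_unblocked(self, thread):
        """Kernel notification: `thread` can run again."""
        self.blocked.discard(thread)
        self.ready.append(thread)
        if self.running is None:
            self.dispatch()
        return self.running


sched = UserScheduler(["audio", "disk", "gui"])
print(sched.dispatch())                 # first thread gets the CPU
print(sched.upcall_blocked())           # it blocks, the app picks the next
print(sched.upcall_unblocked("audio"))  # requeued, current thread keeps on
```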

i was one of a trio that implemented SA for Mach back in the early-mid
1990s and although you'd sort of think that it's all a bit quaint at
this point, and although it definitely was not my idea, i continue to
think that it's the right answer for a lot of scheduling problems. the
rather odd name comes from its inventor, tom anderson, who imagined
the notifications from the kernel as "activating the user space
scheduler".

--p

Re: ulatencyd and linux audio

Fons Adriaensen-2
In reply to this post by Paul Davis
On Mon, Jan 10, 2011 at 09:52:50AM -0500, Paul Davis wrote:

>    2) scheduler activations

I remember the original Solaris threads library having this.
It was removed in later releases in favour of the much simpler
solution of mapping  each user level thread to a kernel entity.
Similar things happened in other systems IIRC.

It could help a bit performance-wise as long as such user threads
don't make blocking system calls. If they do, and the system starts
moving user threads to allow them to continue, things become in
practice more complicated than for a simple 1:1 model.

Without the automatic transfer of non-blocked user threads to
non-blocked kernel threads all of this just amounts to a userland
coroutine library that doesn't require any system support.

Ciao,

--
FA

There are three of them, and Alleline.


Re: ulatencyd and linux audio

Paul Davis
On Tue, Jan 11, 2011 at 4:44 AM,  <[hidden email]> wrote:
> On Mon, Jan 10, 2011 at 09:52:50AM -0500, Paul Davis wrote:
>
>>    2) scheduler activations
>
> I remember the original Solaris threads library having this.

yeah, we knew the people who added it after our usenix paper. i was a
little sad when i heard that they removed it, but hey, sometimes when
you build it, nobody shows up :)

> It was removed in later releases in favour of the much simpler
> solution of mapping  each user level thread to a kernel entity.

this doesn't actually accomplish the same thing. scheduling decisions
are still being made with less information than is available and/or
needed.

> It could help a bit performance-wise as long as such user threads
> don't make blocking system calls. If they do and the system starts
> moving user threads to allow them to continue things become in
> practice more complicated than for a simple 1:1 model.

sure. but the complexity was mostly in the code that we implemented,
not in the user-space scheduler. the user-space scheduler was about as
complex as it had to be - that is, it reflected the complexity of the
scheduling decisions needed by the app, but not the thread mapping
imposed by the OS.

> Without the automatic transfer of non-blocked user threads to

but that's effectively precisely what scheduler activations did do.

> non-blocked kernel threads all of this just amounts to a userland
> coroutine library that doesn't require any system support.

if you read tom anderson's original paper, it lays out a pretty clear
justification for SA. the scheduling needs of modern applications are
fairly complex. large systems, like databases, can really benefit a
great deal from being able to use their own application-specific needs
and scope when making scheduling decisions.

also keep in mind that anderson's work came out of a group at UW CS&E
that was (at least then) quite focused on the basic question: if
computers are getting so much faster, how come OSes are getting so
much slower? one of the most significant chunks of the slowdown that
they measured on a variety of (admittedly early 90s) systems was
mis-scheduling due to inadequate information in the kernel. i don't
know if it's true that OSes are still proportionally slower than apps
given CPU speed, but the logic in both anderson's paper and others
from that group was, i thought, fairly convincing. the need for the
kinds of ideas they were discussing (SA, separating address spaces
from memory protection, and more) may have become less obvious only
because CPUs have gotten *so* much faster.