Final SSE/3DNow [PATCH]

Jussi Laako
Hi,

This is the "final" version of the SSE/3DNow patch.

I'll wait a few days for comments before committing this to CVS.

Changes since last version:

        - Added memcpy() replacement
        - Fixed coding style
        - Added config option for the feature, defaults to off


--
Jussi Laako <[hidden email]>

Attachment: jack-simd.patch (13K)

Re: Final SSE/3DNow [PATCH]

Lee Revell
On Mon, 2005-08-29 at 21:26 +0300, Jussi Laako wrote:

> Hi,
>
> This is the "final" version of the SSE/3DNow patch.
>
> I'll wait a few days for comments before committing this to CVS.
>
> Changes since last version:
>
> - Added memcpy() replacement
> - Fixed coding style
> - Added config option for the feature, defaults to off

This is a great feature.

What about MMX?

Some processors like my VIA C3 support both MMX and 3DNow!.  How do you
decide which to use?  X tests both of them on startup and chooses the
fastest.  The performance is quite close; interestingly, sometimes it
picks MMX, sometimes 3DNow.

(II) VIA(0): Benchmarking video copy. Less is better.
(--) VIA(0): Timed   libc YUV420 copy... 4430020. Throughput: 80.4 MiB/s.
(--) VIA(0): Timed kernel YUV420 copy... 4404505. Throughput: 80.8 MiB/s.
(--) VIA(0): Ditch    SSE YUV420 copy... Not supported by CPU.
(--) VIA(0): Timed    MMX YUV420 copy... 2991777. Throughput: 119.0 MiB/s.
(--) VIA(0): Timed 3DNow! YUV420 copy... 2976313. Throughput: 119.6 MiB/s.
(--) VIA(0): Ditch   MMX2 YUV420 copy... Not supported by CPU.
Freed 4194304 (pool 2)
(--) VIA(0): Using 3DNow! YUV42X copy for video.
(II) VIA(0): [XvMC] Initialized XvMC extension.
(II) VIA(0): - Done
(==) RandR enabled
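
The benchmark-at-startup approach in the log above can be sketched roughly as follows. This is only an illustration and not the X driver's or JACK's actual code: the names copy_fn, copy_bytewise, copy_libc, struct candidate, time_copy and pick_fastest are all made up here, and the plain byte loop merely stands in for a real MMX or 3DNow! copy routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by all candidate copy routines. */
typedef void *(*copy_fn)(void *dst, const void *src, size_t n);

/* Stand-in for a hand-optimized variant (e.g. MMX or 3DNow!). */
static void *copy_bytewise(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Baseline candidate: the C library's memcpy(). */
static void *copy_libc(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

struct candidate {
    const char *name;
    copy_fn fn;
};

/* CPU time in seconds for `reps` copies of n bytes with one candidate. */
static double time_copy(copy_fn fn, void *dst, const void *src,
                        size_t n, int reps)
{
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        fn(dst, src, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Time every candidate once and return the one with the lowest time. */
static const struct candidate *pick_fastest(const struct candidate *c,
                                            size_t count, void *dst,
                                            const void *src, size_t n,
                                            int reps)
{
    assert(count > 0);
    const struct candidate *best = &c[0];
    double best_t = time_copy(c[0].fn, dst, src, n, reps);
    for (size_t i = 1; i < count; i++) {
        double t = time_copy(c[i].fn, dst, src, n, reps);
        if (t < best_t) {
            best_t = t;
            best = &c[i];
        }
    }
    return best;
}
```

When two candidates perform almost identically, timing noise decides the winner, which would explain why the choice differs from one startup to the next.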

Lee



-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
Jackit-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/jackit-devel

Re: Final SSE/3DNow [PATCH]

Phil Frost
On Mon, Aug 29, 2005 at 04:01:37PM -0400, Lee Revell wrote:
> What about MMX?
>
> Some processors like my VIA C3 support both MMX and 3DNow!.  How do you
> decide which to use?  X tests both of them on startup and chooses the
> fastest.  The performance is quite close; interestingly, sometimes it
> picks MMX, sometimes 3DNow.

MMX operates on integers, while SSE and 3DNow! operate on floats (SSE2
also adds integer operations on the XMM registers). For some operations
like memcpy() either can do the trick, but for arithmetic, the type of
data determines which one you need.
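
To make the float case concrete, here is a minimal sketch of mixing one buffer into another with SSE intrinsics, assuming 32-bit float samples. It is not taken from the patch: mix_add is a hypothetical name, unaligned loads are used for simplicity, and a scalar loop handles the tail (and the whole buffer when the compiler does not define __SSE__).

```c
#include <assert.h>
#include <stddef.h>
#ifdef __SSE__
#include <xmmintrin.h>
#endif

/* Hypothetical helper: dst[i] += src[i] for n float samples. */
static void mix_add(float *dst, const float *src, size_t n)
{
    size_t i = 0;

    assert(dst != NULL && src != NULL);
#ifdef __SSE__
    /* Four floats per iteration; loadu/storeu tolerate any alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128 d = _mm_loadu_ps(dst + i);
        __m128 s = _mm_loadu_ps(src + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(d, s));
    }
#endif
    /* Scalar tail, or the whole buffer when SSE is unavailable. */
    for (; i < n; i++)
        dst[i] += src[i];
}
```

An MMX version of the same loop would have to work on integer samples instead, which is why MMX only comes into question for the plain copy.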



Re: Final SSE/3DNow [PATCH]

Jussi Laako
On Mon, 2005-08-29 at 16:01 -0400, Lee Revell wrote:

> What about MMX?

MMX would be good only for a memcpy() replacement, as it's integer-only.
The current implementation is an optimized mixing solution in which
memcpy() is only one part. Of course an MMX memcpy() replacement could be
added, but it may not be worth it because of the slower "emms"
instruction. However, the 3DNow memcpy() code should also be
MMX-compatible except for the last asm part.

> Some processors like my VIA C3 support both MMX and 3DNow!.  How do you
> decide which to use?  X tests both of them on startup and chooses the
> fastest.  The performance is quite close; interestingly, sometimes it
> picks MMX, sometimes 3DNow.

In practice it should be almost the same code, but 3DNow supports the
"Fast EMMS" (femms) instruction, while MMX has only "EMMS".


--
Jussi Laako <[hidden email]>


