
BLU Discuss list archive



Parallel video encoding



On 8/30/07, Matthew Gillen <me-5yx05kfkO/aqeI1yJSURBw at public.gmane.org> wrote:
> Check the "threads" option (it's codec-specific, ie -<codec>opts threads=8)

Thanks :-)

> What would be the point of having it be multi-threaded then?

I just wanted to know the difference between running say two threads
on one physical CPU, versus two threads with one thread on each CPU.
Let's say the machine is a dual quad-core?  So, it has two physical
chips, and 8 total cores.

> Shouldn't affect that at all.  In Intel's hyperthreading it would, but the
> true dual core chips nowadays have independent caches (a lesson learned from
> hyperthreading).

How about Intel Xeons?  The last of the 8 cores is shown below, i.e.
core #4 on CPU #2, or in computery terms, core 3 on CPU 1 (zero-based):

processor       : 7
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU           E5320  @ 1.86GHz
stepping        : 7
cpu MHz         : 1869.000
cache size      : 4096 KB
physical id     : 1
siblings        : 4
core id         : 3
cpu cores       : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr dca lahf_lm
bogomips        : 3724.05
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
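
For anyone who wants that mapping for every core at once, here is a
minimal sketch in Python that reads the "processor", "physical id" and
"core id" fields out of /proc/cpuinfo-style text.  The function name
parse_cpuinfo and the SAMPLE string are made up for illustration, and
it assumes the x86 Linux field names shown in the dump above.

```python
def parse_cpuinfo(text):
    """Map each logical processor to (physical id, core id).

    Assumes the x86 Linux /proc/cpuinfo layout: "key : value" lines,
    with a blank line between processor records.
    """
    topology = {}
    current = {}
    # The extra "" guarantees the last record is flushed.
    for line in text.splitlines() + [""]:
        if not line.strip():
            if "processor" in current:
                topology[int(current["processor"])] = (
                    int(current.get("physical id", 0)),
                    int(current.get("core id", 0)),
                )
            current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return topology


# Hypothetical sample, trimmed from the dump above.
SAMPLE = """\
processor       : 7
physical id     : 1
siblings        : 4
core id         : 3
cpu cores       : 4
"""

for cpu, (package, core) in sorted(parse_cpuinfo(SAMPLE).items()):
    print(f"logical CPU {cpu}: package {package}, core {core}")
```

On a real box you would pass it open("/proc/cpuinfo").read() instead
of SAMPLE.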


> Knowing a little bit about how mpeg encoding works, one way to do it would be
> to break up the input into "work units" that a single cpu would do.  Each work
> unit would start with an I frame, and do all the images leading up to the next
> I frame.  So each work unit is completely independent from the others, thus
> allowing each CPU to do the bulk of its work independently.
>
> Then the "master" thread just concatenates all the results in the correct order.

Yes, that would make the most sense, since splitting the work any
finer than at I-frame boundaries would be wasteful: you would need to
coordinate all those intermediate frames between threads.  However,
thinking in terms of I-frames is probably the "easiest" way to
approach the problem, not necessarily the most efficient.  Are there
any hardcore video devs here who know about G/B-frames, ?-frames, etc.
and obscure encoding tricks?
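
Just to make the work-unit idea quoted above concrete, here is a rough
Python sketch.  It is not how mencoder or any real codec does it; the
names split_into_work_units, encode_unit and encode_parallel are
invented for illustration, and the "encoding" is a placeholder that
only tags frame indices.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(frame_types):
    """Group frame indices so each unit starts at an I-frame."""
    units = []
    for index, frame_type in enumerate(frame_types):
        # Start a new unit at every I-frame (and at the very first frame).
        if frame_type == "I" or not units:
            units.append([])
        units[-1].append(index)
    return units


def encode_unit(unit):
    """Placeholder for real encoding: just tags each frame index."""
    return [f"enc{index}" for index in unit]


def encode_parallel(frame_types, workers=4):
    """Encode units independently, then concatenate in original order."""
    units = split_into_work_units(frame_types)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so the "master"
        # step is a plain in-order concatenation.
        results = pool.map(encode_unit, units)
        return [frame for chunk in results for frame in chunk]


print(split_into_work_units(list("IPBBIPBIP")))
print(encode_parallel(list("IPBBIPBIP")))
```

Threads are used here only to keep the sketch short; for real
CPU-bound work in Python you would want separate processes instead.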

In any event, it doesn't really matter, since I wasn't planning on
writing a custom app.  I just wanted to know which would be more
efficient: (1) spawning multiple mencoder processes at the same time,
with one thread allocated to each process (up to MAX_CORES or
MAX_CORES - 1 processes); (2) spawning a single mencoder with
MAX_CORES - 1 threads, one file at a time (that is the suggested way,
right?); or (3) just invoking mencoder MAX_FILES_TO_ENCODE times at
once, with one thread each.

I bet option #2 works the best, since spawning multiple mencoder
processes could be wasteful in terms of caching and memory usage.  I
bet one mencoder process working on one file with MAX_CORES - 1
threads would be ideal.  If you go higher than MAX_CORES threads, then
the scheduler has to context switch them in and out, since they are
sharing cores, right?  Tell me if my logic is flawed here...
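
Here is roughly what I have in mind for the one-process, many-threads
approach, sketched in Python.  It only builds the command lines and
does not run anything.  The function name plan_single_process, the
mpeg4 codec choice, the "-oac copy" audio setting and the ".avi"
output naming are my own assumptions for illustration; the only part
taken from this thread is the codec-specific threads= option.

```python
import os


def plan_single_process(files, cores=None, reserve=1):
    """Build one mencoder command per file, to be run one at a time.

    Each command asks for (cores - reserve) encoder threads, so at
    least one core stays free for everything else on the box.
    """
    cores = cores or os.cpu_count() or 1
    threads = max(1, cores - reserve)
    commands = []
    for src in files:
        commands.append([
            "mencoder", src,
            "-ovc", "lavc",
            # threads= is the codec-specific option mentioned above;
            # vcodec=mpeg4 is just an example codec.
            "-lavcopts", f"vcodec=mpeg4:threads={threads}",
            "-oac", "copy",
            "-o", src + ".avi",  # hypothetical output naming
        ])
    return commands


for command in plan_single_process(["a.mpg", "b.mpg"], cores=8):
    print(" ".join(command))
```

The commands could then be fed to subprocess.run() in a loop.
Whether this actually beats several single-threaded processes is
something I would have to measure.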

> Matt
>


-- 
Kristian Erik Hermansen








BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org