eggstoastbacon edited this page Feb 18, 2020 · 32 revisions

Invoke this function to process records concurrently. You specify the number of jobs, the records to use, and the command to run on each record. The function automatically balances the records across the number of jobs specified. To keep track of a specific job's output, use the job number variable $x. Jobs are cleaned up when they finish. The function is designed to use a cache directory to carry data in and out of jobs (see the cache directory example below). A stopwatch automatically times your jobs and displays their duration.

$myjobvar or $_ represents a single item in your record array.

$x holds the number of the job that is currently running.
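To make the idea concrete, here is a minimal sketch of the general pattern using only the built-in Start-Job, Wait-Job, Receive-Job and Remove-Job cmdlets. It is an illustration under assumptions, not createEggJob's actual implementation: the names $records, $jobCount, $chunkSize and $chunk are hypothetical, and $x is reused here only to mirror the job number variable described above.

```powershell
# Sketch only: splits records into contiguous chunks and runs one background job per chunk.
# This is NOT how createEggJob is implemented; all variable names are illustrative.
$records   = 1..20
$jobCount  = 4
$chunkSize = [int][math]::Ceiling($records.Count / [double]$jobCount)

$jobs = for ($x = 1; $x -le $jobCount; $x++) {
    # Work out which slice of the records belongs to job number $x.
    $start = ($x - 1) * $chunkSize
    $end   = [math]::Min($start + $chunkSize, $records.Count) - 1
    $chunk = $records[$start..$end]

    # Pass the job number and its slice into the job as arguments.
    Start-Job -ArgumentList $x, $chunk -ScriptBlock {
        param($x, $chunk)
        foreach ($item in $chunk) { "job $x processed $item" }
    }
}

# Wait for every job, collect the output, then clean the jobs up.
$output = $jobs | Wait-Job | Receive-Job
$jobs | Remove-Job
$output
```

Each line of $output names the job that handled the record, which is the same role $x plays in the file names used in the examples on this page.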

Params:

-jobs (int) (how many jobs to create)

-int_records (numeric value) ex: (1..100)

-exp_records (invoke a command to retrieve your records) ex: 'get-content c:\temp\list.txt'

-scriptblock (command to use on your records) ex: '$_ | out-file c:\temp\results_$x.txt -append'

-cache_dir (string) specify a directory to save processed data. Useful for bringing back in a variable for further processing.

-replace (string) replace a variable or string in your scriptblock

ex:

$mycustomcommand = {

$items = get-childitem -path $path\$_

foreach($it in $items){

$listitem = [string]$listitem.replace("9","3")

$listitem | out-file $cachedir\results_$x.txt -append}

}

createEggJob -jobs 5 -int_records $listitems -scriptblock $mycustomcommand -cache_dir "c:\temp" -replace "listitem"

-path (string) variable path to your data, useful if you already declared your data path earlier.

Example:

createEggJob -jobs 5 -exp_records 'get-content $path' -scriptblock $mycustomcommand -path $path

-errorlog (string) path to save errors.txt ex: d:\logs

-skipnth (int) divides job records by skipping instead of assigning in order (can speed up some jobs)

Example:

With 20 records (0 to 19), the default assignment for -jobs 4 would be

job1 assigned records 0,1,2,3,4

job2 assigned records 5,6,7,8,9

job3 assigned records 10,11,12,13,14

job4 assigned records 15,16,17,18,19

If you choose -jobs 4 and -skipnth 4, each job takes every fourth record

job1 assigned records 0,4,8,12,16

job2 assigned records 1,5,9,13,17

job3 assigned records 2,6,10,14,18

job4 assigned records 3,7,11,15,19
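The difference between the two assignment schemes can be sketched in a few lines. The helper functions Split-Contiguous and Split-ByStride below are hypothetical names made up for this illustration; they are not part of createEggJob, and the sketch assumes the skip value equals the number of jobs.

```powershell
# Hypothetical helpers for illustration only; they are not part of createEggJob.

# In-order assignment: each job gets one contiguous block of records.
function Split-Contiguous {
    param([object[]]$Records, [int]$Jobs)
    $size   = [int][math]::Ceiling($Records.Count / [double]$Jobs)
    $result = @{}
    for ($x = 0; $x -lt $Jobs; $x++) {
        $start = $x * $size
        $end   = [math]::Min($start + $size, $Records.Count) - 1
        if ($start -le $end) { $result[$x + 1] = @($Records[$start..$end]) }
        else                 { $result[$x + 1] = @() }
    }
    $result
}

# Skipping assignment: job x takes record x-1, then every Jobs-th record after it.
function Split-ByStride {
    param([object[]]$Records, [int]$Jobs)
    $result = @{}
    for ($x = 0; $x -lt $Jobs; $x++) {
        $picked = for ($i = $x; $i -lt $Records.Count; $i += $Jobs) { $Records[$i] }
        $result[$x + 1] = @($picked)
    }
    $result
}

$records = 0..19
$inOrder = Split-Contiguous -Records $records -Jobs 4
$strided = Split-ByStride   -Records $records -Jobs 4

foreach ($job in 1..4) {
    "job$job in order: $($inOrder[$job] -join ',')   skipping: $($strided[$job] -join ',')"
}
```

Skipping can help when expensive records are clustered together in the input, because it spreads neighbouring records across different jobs.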

Usage Examples:

int_records:

createEggJob -jobs 6 -int_records (1..450) -scriptblock '$_ | out-file c:\temp\results_$x.txt -append'

The resulting output is 6 text files, written concurrently, with the numbers from 1 to 450 distributed evenly between them.

exp_records:

createEggJob -jobs 15 -exp_records 'get-content c:\temp\list.txt' -scriptblock '$_ | out-file c:\temp\results_$x.txt -append'

The data lines in list.txt are distributed evenly across 15 text files written concurrently.

Is your command more than one line? Store it in a variable, either as a string in single quotes or as a script block in braces, and pass that variable to -scriptblock. Just make sure to include $_ where necessary. Your custom commands can be as long and as complicated as you want.

example:

$mycustomcommand = {

$_ | $line1

$_ | $line2

$_ | $line3

}

createEggJob -jobs 4 -exp_records 'get-content c:\temp\list.txt' -scriptblock $mycustomcommand

Use a cache directory to store data that has been processed. In the example below, I fetch the output of all the jobs and bring it into one variable. Because my custom command names the result file and appends the job number variable, I can reasonably assume that every output file is called results_<job_number>.txt

example:

$cachedir = "c:\temp"

$mycommand = '

$_ | out-file $cachedir\results_$x.txt -append

'

createEggJob -jobs 5 -exp_records 'get-content c:\temp\list.txt' -scriptblock $mycommand -cache_dir "c:\temp"

$results = Get-ChildItem $cachedir | where-object {$_.Name -like "*results*"} | foreach{get-content -path $_.FullName}

Benchmark on an Intel i7-2620M @ 2.7 GHz (2 cores, 4 threads)

Benchmarking script ($j is the number of jobs used for each run):

$paths = Get-ChildItem -Path C:\ -Recurse -Directory -Force -ErrorAction SilentlyContinue | Select-Object FullName

$command = {

$measure = (Get-ChildItem c:\users\public\$_ -Force -ErrorAction SilentlyContinue | Measure-Object -Property Length -Sum)

$sum = $measure.Sum

$count = $measure.Count

[pscustomobject]@{

Name = $Measure.Name

FileCount = $measure.Count

SizeMB = ([math]::round(($sum/1MB),2))

}| export-csv $path\dir_$x.csv -append }

createEggJob -jobs $j -int_records $paths[0..25000] -scriptblock $command -errorlog "c:\temp" -path "c:\temp"

Benchmarking results: Time elapsed processing 25,000 items

1 job: Time elapsed: 00:06:09.3622023

2 jobs: Time elapsed: 00:03:06.4221603

4 jobs: Time elapsed: 00:02:06.8947915

8 jobs: Time elapsed: 00:01:56.3173195

16 jobs: Time elapsed: 00:01:52.4445865
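To read these timings as speedups over a single job, a short calculation like the following can help. It is only a sketch: the elapsed times are copied from the results above, and the names $elapsed, $baseline and $speedups are made up for this example.

```powershell
# Elapsed times copied from the benchmark results above, keyed by number of jobs.
$elapsed = [ordered]@{
    '1'  = '00:06:09.3622023'
    '2'  = '00:03:06.4221603'
    '4'  = '00:02:06.8947915'
    '8'  = '00:01:56.3173195'
    '16' = '00:01:52.4445865'
}

# Parse with the invariant culture so the decimal point is read the same on every system.
$culture  = [cultureinfo]::InvariantCulture
$baseline = [timespan]::Parse($elapsed['1'], $culture).TotalSeconds

$speedups = foreach ($jobs in $elapsed.Keys) {
    $seconds = [timespan]::Parse($elapsed[$jobs], $culture).TotalSeconds
    [pscustomobject]@{
        Jobs    = [int]$jobs
        Seconds = [math]::Round($seconds, 1)
        Speedup = [math]::Round($baseline / $seconds, 2)   # 1-job time divided by this run's time
    }
}

$speedups | Format-Table -AutoSize
```

On these numbers the speedup is close to 2x with 2 jobs and only a little over 3x with 16 jobs, so most of the gain on this machine is already reached at around 4 jobs.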
