Get rid of all cores without mem in the database #73
Here's the script I used to check usage-to-allocation efficiency. I'll try to get this converted to gxadmin so we can dump to influx for monitoring:

```python
#!/usr/bin/env python3
import argparse
import statistics

import psycopg2

INTERVAL = '1 week'
PG_DBNAME = 'galaxy_main'
MEM_FLOOR = 16
RUNTIME_FLOOR = 300

SQL = """
    SELECT
        t.id,
        t.tool_id,
        t.mem_allocated,
        t.mem_used
    FROM (
        SELECT
            j.id,
            j.tool_id,
            (SELECT metric_value FROM job_metric_numeric WHERE job_id = j.id AND metric_name = 'galaxy_memory_mb') * pow(1024, 2) AS mem_allocated,
            (SELECT metric_value FROM job_metric_numeric WHERE job_id = j.id AND metric_name = 'memory.peak') AS mem_used,
            (SELECT metric_value FROM job_metric_numeric WHERE job_id = j.id AND metric_name = 'runtime_seconds') AS runtime
        FROM
            job j
        WHERE
            j.update_time > timezone('UTC', now()) - %(interval)s::INTERVAL
            AND j.state = 'ok'
    ) AS t
    WHERE
        t.mem_allocated >= (%(mem_floor)s * pow(1024, 3))
        AND t.runtime > %(runtime_floor)s
"""


def get_runtime_data(args):
    # Avoid shadowing the argparse namespace with the query parameter dict
    params = {
        "interval": args.interval,
        "mem_floor": args.mem_floor,
        "runtime_floor": args.runtime_floor,
    }
    conn = psycopg2.connect(dbname=PG_DBNAME)
    cur = conn.cursor()
    cur.execute(SQL, params)
    return cur.fetchall()


def calculate_ratios(rows):
    total = len(rows)
    missing = 0
    tools = {}
    for row in rows:
        job_id, tool_id, mem_allocated, mem_used = row
        if not mem_allocated or not mem_used:
            missing += 1
            continue
        if tool_id not in tools:
            tools[tool_id] = []
        tools[tool_id].append(float(mem_used) / mem_allocated)
    means = {}
    for tool_id, ratios in tools.items():
        means[tool_id] = statistics.mean(ratios)
    for tool_id in sorted(means, key=means.get, reverse=True):
        print(f"{tool_id}: {means[tool_id] * 100:0.2f}%")
    print("")
    print(f"{(missing / total) * 100:0.2f}% missing")


parser = argparse.ArgumentParser()
parser.add_argument("--interval", "-i", default=INTERVAL, help="Age of jobs to check")
parser.add_argument("--mem-floor", "-m", default=MEM_FLOOR, type=int, help="Memory floor (GB), ignore jobs that used less")
parser.add_argument("--runtime-floor", "-r", default=RUNTIME_FLOOR, type=int, help="Runtime floor (seconds), ignore jobs that ran for less")
args = parser.parse_args()
rows = get_runtime_data(args)
calculate_ratios(rows)
```
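As a quick sanity check, the per-tool averaging logic in `calculate_ratios` can be exercised on synthetic rows without a database connection. The tool IDs and memory values below are invented for illustration, not actual data:

```python
import statistics

# Synthetic (job_id, tool_id, mem_allocated, mem_used) rows, in bytes; all values made up.
rows = [
    (1, "raxml", 16 * 1024**3, 4 * 1024**3),     # used 25% of allocation
    (2, "raxml", 16 * 1024**3, 8 * 1024**3),     # used 50%
    (3, "trinity", 32 * 1024**3, 24 * 1024**3),  # used 75%
    (4, "trinity", 32 * 1024**3, None),          # missing metric, counted and skipped
]

tools = {}
missing = 0
for _job_id, tool_id, mem_allocated, mem_used in rows:
    if not mem_allocated or not mem_used:
        missing += 1
        continue
    tools.setdefault(tool_id, []).append(float(mem_used) / mem_allocated)

# Mean used/allocated ratio per tool, mirroring the script's aggregation
means = {tool_id: statistics.mean(ratios) for tool_id, ratios in tools.items()}
print(means)                                   # raxml averages 0.375, trinity 0.75
print(f"{missing / len(rows) * 100:0.2f}% missing")  # 25.00% missing
```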
Latest results from EU:
Thank you! I am putting together another PR with a lot more tools based on my data and this helps a lot.
Rules like this for raxml allocate additional cores, but because most people set `mem: cores * SOME_FACTOR` in their default tool, it also scales memory, which is unnecessary for tools that parallelize well but don't use much memory. We should probably run down the list of tools in the DB that only set `cores` and set `mem` as well, unless they should actually scale memory linearly with cores (and it should be roughly 4 GB/core, which is what most of us use, I believe).