Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flatten is very slow #17

Open
zingmane opened this issue Aug 16, 2021 · 1 comment
Open

Flatten is very slow #17

zingmane opened this issue Aug 16, 2021 · 1 comment

Comments

@zingmane
Copy link

This code has problems with large arrays:

import { getValueOr } from "./main";
const doFlatten = (elems: any): any[] =>
elems instanceof Array
? elems.reduce(
(accum: any[], elem: any) =>
elem instanceof Array
? [...accum, ...doFlatten(elem)]
: [...accum, elem],
[]
)
: elems;
export function flatten<A = any>(coll: any[]): A[] {
return doFlatten(getValueOr([], coll));
}

Reduce with spread is an anti-pattern because it implicitly makes nested iterations -> see https://www.richsnapp.com/article/2019/06-09-reduce-spread-anti-pattern which is very bad for performance.

Also it would be an advantage to distinguish between a single flatten and a recursive flatten function, like lodash does it with flatten and flattenDeep?

A more performant version of flatten/flattenDeep could be that one:

const doFlattenDeep = (elems: any): any[] =>
  elems instanceof Array
    ? elems.reduce((accum: any[], elem: any) => {
      elem instanceof Array ? accum.push(...doFlattenDeep(elem)) : accum.push(elem);
        return accum;
      }, [])
    : elems;

const doFlatten = (elems: any): any[] =>
  elems instanceof Array
    ? elems.reduce((accum: any[], elem: any) => {
      accum.push(elem);
        return accum;
      }, [])
    : elems;

function flattenDeep<A = any>(coll: any[]): A[] {
  return doFlattenDeep(l.getValueOr([], coll));
}

function flatten<A = any>(coll: any[]): A[] {
  return doFlatten(l.getValueOr([], coll));
}

These are the perf results with the new functions compared against lodash and the old one (nested array with 1000 entries):
flattenDeep is still slower than the lodash versions (but much faster than the old one), maybe because of not be an tail recursion?

image

@hermann-p
Copy link
Owner

Your observations are correct, but I think your doFlatten function does not do what you think it does, what would explain your extreme performance gains. Still, implementing flatten for example as

const flatten = (arr: unknown[]) => {
  const x: any[] = [];
  arr.forEach((y) =>
    Array.isArray(y) ? y.forEach((z) => x.push(z)) : x.push(y)
  );
  return x;
};

Provides significant gains over lodash/flatten:

Flatten arrays

----------------------
  sample size: 16667, repeats: 50
----------------------


lodash:
  min 1.973, max 14.881 (7.544),
  median 2.084 (sd 3.238), avg 3.887 (sd 2.690)
lodash/fp:
  min 2.258, max 14.954 (6.624),
  median 2.318 (sd 4.020), avg 4.342 (sd 3.473)
manual:
  min 0.899, max 6.672 (7.419),
  median 0.949 (sd 1.619), avg 1.747 (sd 1.409)

I also strongly support splitting the behaviour into flatten and flattenDeep. This would break the existing API, but then we never used it anywhere... move fast and break things?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants