Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG]: Phalcon\Support classes leak memory round 2 #16593

Open
maxgalbu opened this issue May 21, 2024 · 10 comments
Open

[BUG]: Phalcon\Support classes leak memory round 2 #16593

maxgalbu opened this issue May 21, 2024 · 10 comments
Assignees
Labels
5.0 The issues we want to solve in the 5.0 release bug A bug report status: medium Medium

Comments

@maxgalbu
Copy link
Contributor

maxgalbu commented May 21, 2024

ref: #16571

Describe the bug
Instantiating and invoking a Phalcon/Support class many times make memory grow too much.
I'm using it in a forever-running CLI script.

To Reproduce

Script to reproduce the behavior:

<?php
use Phalcon\Support\HelperFactory;

echo "beginning: ".memory_get_peak_usage() / 1024 / 1024;
echo "\n";

$helper = new HelperFactory();
foreach (range(1,100000) as $num) {
    $helper->camelize("transaction_id");
}

echo "end: ".memory_get_peak_usage() / 1024 / 1024;
echo "\n";

Output:

beginning: 0.40937042236328
end: 42.000595092773

Expected behavior
Memory should not increase so much. Same camelized output using userland HelperFactory that uses ucwords():

Expand to see code
<?php

echo "beginning: ".memory_get_peak_usage() / 1024 / 1024;
echo "\n";

class Camelize
{
    public function __invoke(
        string $text,
        string $delimiters = "",
        bool $lowerFirst = false
    ): string {
        return str_replace(str_split($delimiters), "", ucwords($text));
    }
}

class HelperFactory
{
    private array $services = [];

    public function __call(string $name, array $arguments)
    {
        $helper = $this->newInstance($name);
        return call_user_func_array([$helper, "__invoke"], $arguments);
    }

    public function newInstance(string $name)
    {
        if (!isset($this->services[$name])) {
            $this->services[$name] = new ($this->getServices($name));
        }

        return $this->services[$name];
    }

    protected function getServices($name)
    {
        return [
            "camelize"      => "Camelize",
        ][$name];
    }
}

$helper = new HelperFactory();
foreach (range(1,100000) as $num) {
    $helper->camelize("transaction_id");
}

echo "end: ".memory_get_peak_usage() / 1024 / 1024;
echo "\n";
beginning: 0.41297149658203
end: 4.3821716308594

Details

Phalcon version: 5.7.0
PHP Version: 8.1.27
Operating System: Debian
Installation type: installing via package manager
Zephir version (if any):
Server: N/A (cli script)
@maxgalbu maxgalbu added bug A bug report status: unverified Unverified labels May 21, 2024
@apoca
Copy link

apoca commented May 23, 2024

This issue is a very serious matter, we have schedulers taking 40 minutes longer than before. We downgraded to version 5.3.1 and it's super fast... We also tried 5.6.1 and it has the same problem. We are currently reverting all of our sites to version 5.3.1

@niden I think this is very serious...

@noone-silent
Copy link
Contributor

noone-silent commented May 24, 2024

I could detect the cause for the memory leak in camelize but ucwords I did not have any memory increase at all.

// Run with 100.000 iterations
root@cphalcon-83:/srv# php -d"extension=phalcon.so" ./.local/helper-test.php
beginning: 0.40730285644531
end: 2.3891067504883
// Run with 1.000.000 iterations
root@cphalcon-83:/srv# php -d"extension=phalcon.so" ./.local/helper-test.php
beginning: 0.40730285644531
end: 16.389106750488

@apoca
Copy link

apoca commented May 24, 2024

Well, guys... It seen to be the PHP version, we've changed the PHP version and it was fast again.. Not sure if is Phalcon version... I'll put here more information.

Anyway, @noone-silent what is your PHP version?

@noone-silent
Copy link
Contributor

Well, guys... It seen to be the PHP version, we've changed the PHP version and it was fast again.. Not sure if is Phalcon version... I'll put here more information.

Anyway, @noone-silent what is your PHP version?

I came on it, because of this php bug https://bugs.php.net/bug.php?id=76982
They introduced some bugs with closures and anonymous functions. I think they mentioned it started with php 8.1

niden added a commit that referenced this issue May 24, 2024
…-anonymous-function

[#16593] - fix: fixed memory leak by anonymous function in PascalCase.zep
@niden niden self-assigned this May 24, 2024
@niden niden added status: medium Medium 5.0 The issues we want to solve in the 5.0 release and removed status: unverified Unverified labels May 24, 2024
@maxgalbu
Copy link
Contributor Author

maxgalbu commented May 27, 2024

This issue is a very serious matter, we have schedulers taking 40 minutes longer than before. We downgraded to version 5.3.1 and it's super fast... We also tried 5.6.1 and it has the same problem. We are currently reverting all of our sites to version 5.3.1

@apoca I don't think this is either fast or slow, it's a memory leak... the issue was present in 5.0.0 all the way up to 5.7.0.
Anyway, what were the php versions you used before and after?

I could detect the cause for the memory leak in camelize but ucwords I did not have any memory increase at all.

@noone-silent Is this the same as what I wrote in the issue description (camelize in phalcon VS ucwords in userland code), or are you talking about something else?

They introduced some bugs with closures and anonymous functions. I think they mentioned it started with php 8.1

@noone-silent It's weird that your commit fixes the bug... so every part of phalcon that uses array_map() or any other function that uses anonymous functions is leaking? 🤯

@noone-silent
Copy link
Contributor

I could detect the cause for the memory leak in camelize but ucwords I did not have any memory increase at all.

@noone-silent Is this the same as what I wrote in the issue description (camelize in phalcon VS ucwords in userland code), or are you talking about something else?

Maybe I understood you wrong. I understood you, that camelize and ucwords in Phalcon have the same memory leak. So I checked camelize and ucwords, but could only find the bug in camelize.

They introduced some bugs with closures and anonymous functions. I think they mentioned it started with php 8.1

@noone-silent It's weird that your commit fixes the bug... so every part of phalcon that uses array_map() or any other function that uses anonymous functions is leaking? 🤯

It looks like that.

@maxgalbu
Copy link
Contributor Author

maxgalbu commented May 28, 2024

Maybe I understood you wrong. I understood you, that camelize and ucwords in Phalcon have the same memory leak. So I checked camelize and ucwords, but could only find the bug in camelize.

Not at all, my example with ucwords() was to show that camelize() leaks a lot:

Camelize with Phalcon HelperFactory has this memory impact:

beginning: 0.40937042236328
end: 42.000595092773

Same output (not memory, output, eg echo "somethingCamelized") using userland HelperFactory that uses ucwords() has this memory impact:

beginning: 0.41297149658203
end: 4.3821716308594

@apoca
Copy link

apoca commented May 28, 2024

Well, we just tested something else the php cli is much slower in versions of PHP 8.1.26 or higher with Phalcon 5.3.1 or higher.

@maxgalbu could you test with PHP version 8.1.23?

@kenyeung826
Copy link

<?php

use Phalcon\Support\Helper\Str\PascalCase as PhPascalCase;

class PascalCase {

   public function __invoke(
        $text,
        $delimiters = null
    ) {

        $exploded = $this->processArray($text, $delimiters);

        $output = array_map(
            function ($element) {
                return ucfirst(mb_strtolower($element));
            },
            $exploded
        );

        return implode("", $output);
        }

        protected function processArray($text, $delimiters = null)
    {

        if ($delimiters === null) {
            $delimiters = "-_";
        }

        /**
         * Escape the `-` if it exists so that it does not get interpreted
         * as a range. First remove any escaping for the `-` if present and then
         * add it again - just to be on the safe side
         */
        if (strpos($delimiters, "\\-") !== false || strpos($delimiters, "-") !== false) {
            $delimiters = str_replace(["\\-", "-"], ["", ""], $delimiters);
            $delimiters = "-" .$delimiters;
        }

        $result = preg_split(
            "/[" . $delimiters . "]+/",
            $text,
            -1,
            PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
        );

        return (false === $result) ? [] : $result;
    }
}

echo "beginning: ".memory_get_peak_usage() / 1024 / 1024;
echo "\n";
foreach (range(1,100000) as $num) {
    $helper = (new PascalCase)("transaction_id");
}
echo "end: ".memory_get_peak_usage() / 1024 / 1024;
echo "\n";
memory_reset_peak_usage();

echo "beginning: ".memory_get_peak_usage() / 1024 / 1024;
echo "\n";
foreach (range(1,100000) as $num) {
    $helper = (new PhPascalCase)("transaction_id");
}

echo "end: ".memory_get_peak_usage() / 1024 / 1024;
echo "\n";

some insight (phalcon-5.6.2, php8.3)
I have copy the code from cphalcon/phalcon/Support/Helper/Str/PascalCase.zep and run the result

beginning: 0.8388671875
end: 2.8052978515625

beginning: 0.79993438720703
end: 40.419456481934

@noone-silent
Copy link
Contributor

There is already a fix coming. Try your code with the following changes and it should work:

Replace:

$output = array_map(
    function ($element) {
        return ucfirst(mb_strtolower($element));
    },
    $exploded
);

return implode("", $output);

With

$output = '';
foreach ($exploded as $element) {
    $output .=  ucfirst(mb_strtolower($element));
}

return $output;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5.0 The issues we want to solve in the 5.0 release bug A bug report status: medium Medium
Projects
None yet
Development

No branches or pull requests

5 participants